title: network position and health care worker infections
authors: tassier, troy; polgreen, philip; segre, alberto
journal: j econ interact coord

we use a newly collected data set coupled with an agent-based model to study the spread of infectious disease in hospitals. we estimate the average and marginal infections created by various worker groups in a hospital as a function of their network position in order to identify groups most crucial in a hospital-based epidemic. surprisingly, we find that many groups with primary patient care responsibilities play a small role in spreading an infectious disease within our hospital data set. we also demonstrate that the effect of different network positions can be as important as the effect of different transmission rates for some categories of workers.

vaccines are a primary way to stop or slow the spread of many infectious diseases, perhaps most notably influenza. the lack of appropriate vaccination levels is a major health problem. for instance, influenza is a major cause of morbidity and mortality throughout the world despite the availability of a highly effective and inexpensive vaccine. in the us alone, influenza causes an estimated , deaths and , hospitalizations annually yet only around / of healthcare workers are vaccinated each year (thompson et al. ). efficient provision of vaccinations poses a difficult problem in that the positive externality associated with a vaccination is the product of the probability of infection, the cost of the infection, and the marginal infections generated by an agent if infected (all of which may vary across agents). there is great concern over the spread of infectious diseases in hospitals, but little knowledge is available to identify healthcare workers who are most likely to acquire and transmit infectious diseases in hospitals.
the problem is especially difficult because the transmission of many infectious diseases is not observable. for instance if someone in your household acquires influenza, you likely do not know which of the potentially hundreds of people you come in contact with each day that may have caused the infection. further if a vaccine is available for an infectious disease and the vaccine is in short supply or is expensive, it is imperative to know which individuals should have the highest priority in vaccine campaigns. in this paper we use a newly collected data set on hospital worker contacts in order to identify hospital worker groups that have the potential to create the largest number of infections based on their location in a hospital contact network. to achieve this goal, we have collected person-to-person contact information on individuals belonging to one of types of healthcare workers at the university of iowa hospitals and clinics (uihc). the data contain information on the contacts between healthcare workers and patients and between healthcare workers and other healthcare workers at the hospital. with this information we develop a network model describing the spread of an infectious disease in a hospital. we estimate, using an agent-based model, the effect of network position of different hospital worker groups on the spread of infectious diseases in a hospital. through this model we are able to identify the hospital worker groups that create the largest externality if removed from the network (perhaps through a vaccination or a quarantine). we argue that methods such as those used in this paper can help hospitals, health care professionals, and epidemiologists to design efficient programs for healthcare worker vaccinations. of importance, we note that we only study the externality in terms of network position within the hospital. 
in this paper we do not consider other potential heterogeneity among agents, such as differences in transmission rates across workers or differences in behavior outside the hospital, that may play a role. while these effects are also important, the large differences across classes of workers shown below are worthy of independent study. this is the first paper to use specific micro-level contact data within a hospital that can help guide policy makers and public health officials in the problem of efficiently allocating vaccines within hospitals. the data used in this paper are unique and detailed in comparison to other studies. they consist of shadow data (where a research assistant follows a specific, randomly chosen hospital worker for an entire shift) for the major groups of healthcare workers at the uihc, a -bed major medical center. this results in over h of direct hospital worker observations and the recording of over specific worker-to-worker or worker-to-patient contacts throughout the hospital. to the best of our knowledge, the data that we have collected comprise the most detailed micro-level healthcare worker contact data set in existence. as a comparison, ueno and masuda ( ) collect and model data on contacts between doctors, nurses, and patients. their data are based on two calendar days from a small, room, community hospital in tokyo and do not consider contacts with and between other healthcare worker groups (other than nurses and doctors). based on our data at the uihc, these assumptions would ignore over % of hospital staff, including most of the groups we identify as most crucial to the spread of an infectious disease. we begin by discussing the background and motivation for our study and then move to develop a simple model of infectious disease transmission. in the model, we initially assume homogeneous contacts as in traditional epidemiological models.
we then discuss a similar model with heterogeneous contacts and discuss the difficulties of achieving efficient vaccination levels. following the theoretical discussion, we use our newly collected data on healthcare worker and patient contacts to model the spread of an infectious disease in a hospital setting. the model allows us to identify the healthcare worker groups that would be expected to play the largest role in the spread of infectious diseases, in terms of network position, in this hospital setting. traditionally, epidemiology research has focused on well-mixed (randomly mixed) populations where agent contacts are homogeneous. in these models, every agent in a population may "bump into" any other agent with equal probability, much as a gas molecule may bump into any other gas molecule with an equal probability over a fixed time period. only recently have epidemiologists and other researchers begun to study the heterogeneous contact structures between people over which infectious diseases spread (early studies include comins et al. ; grenfell and harwood ; wallinga et al. ) . we focus this study on healthcare workers and a particular class of infectious disease, of which influenza is an example. healthcare workers are at especially high risk of contracting influenza. one study of healthcare workers with a low rate of influenza vaccination demonstrated that % of healthcare workers had evidence of influenza infection during a single influenza season (elder et al. ) . two features of influenza make its spread difficult to control in hospitals. first, people with influenza are infectious - days before the onset of symptoms. thus, they can spread the virus when they are still feeling well and are unaware of their own infectious state. 
second, only about % of infected persons develop classic influenza symptoms (cdc ). consequently, restricting healthcare workers with influenza-like symptoms from the workplace will not completely prevent transmission of influenza because healthcare workers with atypical symptoms could continue working and spreading the virus. furthermore, studies show that healthcare workers are more likely than other workers to return to work early or to keep working when they have influenza-related symptoms (weingarten et al. ). because of the ease with which influenza may be contracted and spread by healthcare workers, the centers for disease control and prevention (cdc) has, for the past two decades, recommended influenza vaccination for all healthcare workers. yet, in the us, only % of workers who have direct patient contact are immunized against influenza annually (smith and bresee ). outside of concerns about traditional influenza, there are additional reasons to study the spread of infectious diseases in hospitals. first, healthcare-associated infections affect about million patients in us hospitals each year (jarvis ). second, there is a growing fear that hospitals could become a breeding ground for new strains of influenza such as the recent h n influenza outbreak, the potential emergence of person-to-person transmission of avian flu, or other "new viruses." much as sars spread widely in hospitals to begin the sars epidemic in toronto (chowell et al. ), person-to-person transmission of avian flu may start in hospitals as well, and, if a more lethal version of h n were to develop, hospitals again could be a breeding ground for new infections. this last point is of particular salience. with the recent h n outbreak and the subsequent work to develop a vaccine, controversies arose concerning which individuals to vaccinate first. healthcare workers were high on the list.
but, as we show below, not all healthcare workers are equal in terms of their importance in spreading infectious diseases. thus, one primary focus of this paper is identifying the individual hospital workers who are most important to vaccinate should a similar outbreak occur again. there is a growing literature in economics on the vaccination choices of individuals and on the externalities associated with vaccinations. but scant attention is paid to the network effects determined by heterogeneous contacts that we focus on in this paper. for example, francis ( ) solves for the optimal tax/subsidy policy for influenza in an sir model with a constant contact rate and random mixing among the population. geoffard and philipson ( ) examine how the individual incentives for vaccination decrease as disease incidence decreases and thereby argue that relying exclusively on private markets is unlikely to lead to disease eradication. boulier et al. ( ), the most similar paper to ours, investigate the magnitude of the externality associated with a vaccination as a function of the number of vaccinations in the population, the transmission rate of the disease, and the efficacy of the vaccination. they find nonmonotonic relationships between each of these items and the magnitude of the vaccine externality. more specifically, the externality and the number of infections prevented per vaccination initially increase before eventually decreasing. however, like francis, they do not consider the case of heterogeneous contact numbers or heterogeneous sorting among the population. finally, much of the recent literature on the economics of infectious disease is summarized in philipson et al. ( ). we begin by describing a simple model where agents in a population have contacts with each other with a uniform probability. this is the traditional random mixing model in epidemiology pioneered by kermack and mckendrick ( ).
the important results in this paper describe exceptions to this homogeneous contact assumption, but we use the simplified model to develop intuition before describing a richer model with heterogeneous contacts. in this simple model, we assume that all agents are homogeneous in that all agents have the same number of contacts with other agents and that these contacts are randomly drawn with uniform probability from the population at large. suppose that agents are assigned to one of three states: susceptible (s), infected (i), or recovered (r). a susceptible agent can transition to being infected with probability α if she is in contact with an infected agent. once infected, an agent transitions to the recovered state at rate κ. once recovered, the agent is immune to the possibility of future infection. this is a classic susceptible-infected-recovered (sir) model for infectious diseases such as influenza (kermack and mckendrick ). the description and parameters yield the following differential equations describing the flows of agents among the various states, assuming a constant population of size $N$ and contact rate of γ:

$$\frac{dS_t}{dt} = -\alpha\gamma S_t \frac{I_t}{N}$$

$$\frac{dI_t}{dt} = \alpha\gamma S_t \frac{I_t}{N} - \kappa I_t$$

$$\frac{dR_t}{dt} = \kappa I_t$$

each equation describes the rate of growth for one of the three populations in the sir model. the first equation describes how susceptible agents contact γ other agents in the population, of which a share $I_t/N$ are infected, and how, of these contacts with infected agents, a fraction α cause the susceptible agent to transition to being infected. the second equation describes the previously mentioned flows from susceptible into infected and that each infected agent moves to being recovered at rate κ. finally, the third equation describes the flows from infected to recovered.
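the homogeneous-mixing sir flows described above can be traced numerically. the following is a minimal sketch, assuming a forward euler discretization with a small step dt; the function name simulate_sir, the step size, the time horizon, and the parameter values in the usage note are illustrative assumptions and do not come from the paper.

```python
def simulate_sir(n, i0, alpha, gamma, kappa, dt=0.01, t_max=200.0):
    """Integrate the SIR flows with forward Euler; return a list of (t, S, I, R).

    n: constant population size, i0: initial number infected,
    alpha: transmission probability per contact, gamma: contact rate,
    kappa: recovery rate. dt and t_max are assumptions of this sketch.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    t = 0.0
    path = [(t, s, i, r)]
    for _ in range(round(t_max / dt)):
        new_infections = alpha * gamma * s * i / n  # flow from S to I
        new_recoveries = kappa * i                  # flow from I to R
        s -= new_infections * dt
        i += (new_infections - new_recoveries) * dt
        r += new_recoveries * dt
        t += dt
        path.append((t, s, i, r))
    return path
```

for example, simulate_sir(1000, 1, alpha=0.1, gamma=10, kappa=0.5) starts above the epidemic threshold, so the infected count first rises and then falls as susceptible agents are used up. the three flows sum to zero at every step, which keeps the total population constant up to rounding error.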
we can write these equations in terms of population shares by dividing each of the above equations by the population size $N$ and using lower case letters to denote these population shares, $s_t = S_t/N$, $i_t = I_t/N$, and $r_t = R_t/N$, yielding the following population share equations:

$$\frac{ds_t}{dt} = -\alpha\gamma s_t i_t$$

$$\frac{di_t}{dt} = \alpha\gamma s_t i_t - \kappa i_t$$

$$\frac{dr_t}{dt} = \kappa i_t$$

the number of infected agents will increase in the population if the flows into the infected state from the susceptible state exceed the flows out of the infected state into the recovered state, $di_t/dt > 0$. this condition is equivalent to $\alpha\gamma s_t > \kappa$ or $s_t > \kappa/(\alpha\gamma)$. if this inequality holds then we say that the population is above the epidemic threshold. note that we cannot remain above the epidemic threshold forever without an introduction of new susceptible agents since $ds_t/dt < 0$: eventually the population will run out of susceptible agents to infect unless the susceptible population is replenished at a sufficient rate. one goal of healthcare policy is, subject to some cost constraint, to place a population below the epidemic threshold so that the number of infected agents in the population does not grow. a population is most vulnerable to being above the epidemic threshold when the infectious disease first enters a population because $s_t \approx 1$. this implies that each infected agent infects approximately $\alpha\gamma/\kappa$ new agents in the population. this ratio is sometimes referred to as the initial reproduction number in the population and is commonly denoted as $R_0 \equiv \alpha\gamma/\kappa$. without new individuals entering a population in the susceptible state this reproduction number can only decline as the infectious disease spreads. a successful vaccination moves an agent from state s to state r without incurring the costs of infection. if we reduce the initial population share of susceptible individuals, $s_0$, by enough we can push the population below the epidemic threshold. specifically, if $s_0 < \kappa/(\alpha\gamma)$ then the infectious disease dies out of its own accord without further action.
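the threshold condition above reduces to a few lines of arithmetic. the sketch below is illustrative only: the function names and the numerical values in the usage note are assumptions, and the vaccination share is obtained by assuming that every agent is initially either susceptible or successfully vaccinated, so that the susceptible share must fall from 1 to below κ/(αγ).

```python
def reproduction_number(alpha, gamma, kappa):
    """Initial reproduction number: new infections per infected agent when s is near 1."""
    return alpha * gamma / kappa


def above_epidemic_threshold(s_share, alpha, gamma, kappa):
    """True when infections grow, i.e. alpha * gamma * s > kappa."""
    return alpha * gamma * s_share > kappa


def vaccination_share_needed(alpha, gamma, kappa):
    """Smallest population share to vaccinate so that the susceptible share
    reaches kappa / (alpha * gamma); zero if the disease cannot spread anyway."""
    return max(0.0, 1.0 - kappa / (alpha * gamma))
```

with the hypothetical values alpha = 0.1, gamma = 10, and kappa = 0.5, each infected agent initially infects about two others, a susceptible share of 0.6 is above the threshold while a share of 0.4 is below it, and vaccinating half of the population is enough to reach the threshold.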
thus an epidemic is prevented whenever $s_0 < \kappa/(\alpha\gamma)$, which occurs when $(1 - \kappa/(\alpha\gamma))N$ agents are successfully vaccinated. the protection produced by vaccinating this many agents is called herd immunity (smith ); once enough people are vaccinated, the entire population (herd) is effectively protected without everyone being vaccinated. the question then becomes, given a cost of vaccination, $c(v)$, what is the efficient level of vaccinations to provide in a population and how do we obtain this efficient level? we begin to approach this question by introducing standard value function notation. we also introduce the possibility of heterogeneous contacts at this stage by indexing by agent j the contacts ($\gamma_j$) and other terms that we allow to vary across agents. in this initial model, once an agent enters state r she remains there forever. thus the value to agent j of being in state r, $V_j(R)$, is simply the lifetime discounted utility received in state r, where $u_j(\cdot)$ is the utility of agent j from the specified state and β is the discount rate. if an agent is in state i, she will remain in state i for $1/\kappa$ periods, on average, until recovered and then enter state r. if an agent is in state s, she receives the same utility as she would if she were recovered, unless she becomes infected. the value to an agent of being in state s is the value of being in state r less the product of the probability that the agent becomes infected and the cost of the infected period:

$$V_j(S) = V_j(R) - \pi(\gamma_j, \alpha, I)\, c_j$$

where $\pi(\gamma_j, \alpha, I)$ is the probability of becoming infected over the course of the epidemic as a function of the contacts of the agent and the transmission rate of the infectious disease. the cost of being infected, $c_j$, is the difference in utility between states s and i during the time spent in state i. with the value functions specified, we can now specify the vaccination problem for the individual and the social planner. to simplify the vaccination choice of individuals we assume that an agent can only be vaccinated in the initial period.
at this time the agent will choose to be vaccinated if the value of being in the recovered state less the cost of the vaccination is greater than the value of being in the susceptible state. writing $V_j(R)$ and $V_j(S)$ for the values to agent j of being in the recovered and susceptible states, the agent will choose to be vaccinated if

$$V_j(R) - c(v) > V_j(S)$$

which implies the agent will choose to be vaccinated if $c(v) < \pi(\gamma_j, \alpha, I)\, c_j$. the social planner's vaccination problem is more difficult than the individual vaccination problem. essentially, the social planner's problem is to vaccinate agent j if the cost of the vaccination is less than the expected disutility of the increase in infections created by agent j if agent j is infected, weighted by the probability that agent j is infected. we define the "marginal infections" of agent j to be the additional infections that occur if j is infected that would not occur if j were not infected. note that this is not simply the number of infections that agent j creates. as an example, agent k may be infected by agent j. but, if k would have been infected by another agent had she not been infected by j, then this would not be a "marginal infection" of agent j: k will be infected regardless of the vaccination choice of agent j. along these lines of thinking, measuring marginal infections is a difficult problem for epidemiologists for at least three reasons:

1. as mentioned earlier, many infectious disease transmissions are not observable. thus it is not easily known how many infections a given agent causes.

2. even if transmissions are observable, the marginal infections created by agent j are not simply the number of other agents that j infects. this is because some agents that j infects may get infected by agents other than j even if j does not infect them herself. further, one needs to know how many additional agents are infected by those that j infects, any additional infections created by these agents, and so forth. thus one needs information on the dynamics of the entire epidemic to measure the true marginal infections of a given agent.

3. the marginal value of vaccinating an agent depends on the behavior and vaccination choices of other agents. it eventually must be decreasing in the number of other vaccinations that are performed. in the extreme, if there are enough vaccinations in the population to produce herd immunity, the marginal value of vaccinating an additional agent only involves the probability that the agent is infected from outside the agent population. in effect, the only value is preventing a single agent from infection because she will not, on average, infect anyone else.

because of these difficulties we use a simulation approach to help us measure the average and marginal effects of individuals belonging to different worker groups in our hospital contact data. with simulations one can monitor the various infections that occur and also perform controlled experiments to sort out the effects of various groups on potential hospital epidemics. define $m_j(\gamma_j, \alpha, \kappa, I, v)$ as the true marginal infections created by agent j if infected, where v is the number of agents vaccinated in the population. for the majority of the paper we will suppress the notation that does not differ across agents and simply refer to marginal infections as $m_j(\gamma_j)$ since the primary focus of the paper will be on the effect of heterogeneous contacts on the spread of infectious diseases. as shown in boulier et al., marginal infections may be increasing in v for sufficiently small v. but marginal infections must eventually decrease in v; at the extreme, marginal infections are zero for any level of v above the point at which herd immunity is reached. thus marginal infections may be increasing or decreasing in the number of vaccinations depending on the specifics of disease transmission, contact patterns, and the number of vaccinations performed.
we can now state the social planner's vaccination problem. vaccinate agent j if:

$$c(v) < \pi(\gamma_j, \alpha, I)\, c_j(I) + \pi(\gamma_j, \alpha, I)\, m_j(\gamma_j)\, c_j(I)$$

where $c_j(I)$ denotes the cost of an infection. note that the individual and social planner's vaccination problems differ by the term $\pi(\gamma_j, \alpha, I)\, m_j(\gamma_j)\, c_j(I)$. this is the positive externality created by a vaccination when a vaccinated agent j protects the other agents whom he contacts from acquiring the disease from him. the key terms to investigate in this externality are the probability that agent j gets infected and the marginal infections created by agent j if infected. note that if these marginal infections, $m_j(\gamma_j)$, go to zero, the social planner's problem and the individual problem converge, and the externality is removed. similarly the externality is removed if enough vaccinations are performed to reach herd immunity, again because $m_j(\gamma_j) = 0$ when herd immunity is achieved and the epidemic never occurs. the main focus of this paper concerns the network position of hospital workers. as such, we assume throughout the paper that the cost of an infection is equal across all agents, $c_j(I) = c_k(I)$ for all j and k. further, without loss of generality, we can normalize $c_j(I) = 1$. thus the externality above is the product of the probability of infection and the marginal infections produced by the agent if infected. one question then emerges: how are $\pi(\gamma_j)$ and $m(\gamma_j)$ related? at a simple level, if contacts only vary in degree, that is, if the only difference in contacts between two agents is the number of contacts and not other, qualitative, aspects of the contacts, then one would expect $\pi(\gamma_j)$ and $m(\gamma_j)$ to be highly correlated. if an agent has a high likelihood of being infected because he has many contacts then he also has many contacts to whom he can pass on the infection. as a first example, suppose that each agent pair is connected with a fixed probability. then, by chance, some agents will have a higher than average number of contacts and some a lower than average number of contacts.
in this case, any agent who has a large number of contacts will also generate a large number of secondary infections since there is a lack of structure within the network population. thus any agent with a high probability of infection, $\pi(\gamma_j)$, will also be expected to generate a high level of marginal infections, $m(\gamma_j)$. example one is fairly direct. however, various relationships are possible, as the next two examples show. as a second example, suppose that each agent in a population is directly connected to every other agent in the population. if this is the case, then any agent that becomes infected is directly tied to all other agents and can infect anyone in the population. thus once someone is infected each agent has a high probability of becoming infected (either from the original agent or from secondary infections). but, since the first agent contacts everyone in the population, and many agents will be infected from him, the other agents in the population may have a low $m(\gamma_j)$. thus it is possible to have a high probability of infection $\pi(\gamma_j)$ and low marginal infections generated $m(\gamma_j)$ from the same agent. as a third example, imagine that there are three groups of agents in a population. two of these groups, call them group a and group b, are separate fully connected graphs containing equal numbers of agents, who do not have any connections to the other group. in other words, an agent in group a is connected to every other agent in group a, but no agent in group a is connected to any agent in group b. suppose that group b is formed in a similar manner. the third group is composed of one agent, j. agent j has only two contacts: one to an agent x in group a and one to an agent y in group b. in this example it may be unlikely that agent j gets infected, especially if there is a low transmission rate, because he only has two contacts in the population. but, if agent j is infected, he may be integral to spreading the disease to the second fully-connected group.
suppose an agent in a becomes infected and subsequently infects agent x (or any of the other agents in a) as well as several other agents in a. agents in group b are safe from infection as long as agent j is not infected. but, if j becomes infected, then it is possible that a large fraction of agents in b may become infected as well. thus agent j may have a low probability of being infected, π(γ j ), but create a large number of marginal infections, m(γ j ), if he does become infected. note that each of these three examples offer different implications for public policy approaches to encouraging vaccinations. in the first example, each agent has a probability of being infected that is in line with the number of marginal infections generated. in the other two examples, the infection rate and the number of marginal infections generated may not have a simple relationship with each other. this is an important observation if one considers using subsidies or other means to encourage increased vaccination rates. in the remaining portion of the paper we examine a data set on contacts of and between healthcare workers and patients in a large hospital on the university of iowa campus. we discuss the data and use agent-based models to identify the healthcare workers whose position in the hospital contact network has the potential to create large numbers of infections in the hospital. observational data on contacts between healthcare workers and patients was collected during the winter and early spring of - (the - "flu season") at the university of iowa hospitals and clinics (uihc). the uihc is a -bed comprehensive academic medical center and regional referral center in iowa city. data were collected by randomly selecting uihc employees from each of job classifications (specified below) and then using research assistants to "shadow" the selected employees. 
the research assistants manually recorded every human contact of the subjects (within approximately three feet) over a work shift. note that these observed contacts include anyone contacted within the hospital, not just with the other workers in the shadow sample. this resulted in a total of recorded contacts over h of observation. additionally, the ra recorded the worker or patient group category for each observed contact (patient or category of healthcare worker) in our data set, and the location in the hospital where the contact occurred. the job categories and number of observed subjects in the data set are as follows: floor nurse ( ), food service ( ), housekeeper ( ), intensive care nurse ( ), nurse assistant ( ), pharmacist ( ), phlebotomist ( ), physical/occupational therapist ( ), resident/fellow/ medical student ( ), respiratory therapist ( ), social worker ( ), staff physician ( ), transporter ( ), unit clerk ( ), and x-ray technician ( ). the data for each group contain approximately h of shadowing. the data were summarized into tallies of contacts over -min intervals and then aggregated into contacts per h shift by the authors. table lists the average number of non-repeated contacts per h that occur between the various worker (and patient) categories. note that we were not allowed to choose patients as subjects in our shadow data directly because of privacy concerns. we were only able to observe patient contacts as a result of shadowing other members of the hospital. thus they do not appear as a row in the table. we use this contact data to model the spread of an infectious disease across the uihc hospital. with this data we create a probabilistic contact network for the hospital worker groups. the network is constructed to match the distribution of worker groups at the university of iowa hospitals and clinics. this totals employees. the distribution of workers across the categories is given in table . we create a contact network among these agents. 
in the model, each worker in a given group connects to other workers according to the rates observed in our shadowed subjects given in table . as an example, all floor nurses in the model create contacts to other randomly selected floor nurses on average, contacts to food service workers, etc. we assume that the contacts are symmetric in our model, that is, a contact from a given floor nurse to a given housekeeper is also a contact from the housekeeper to the floor nurse. there are at least two reasons for this assumption. first, if our subject is in close enough proximity to pass on the influenza virus to a second agent, the second agent is also within close enough proximity to pass on the influenza virus to our subject. thus the ability to acquire or to pass on virus is a symmetric relationship. second, the reader may note in the table that the matrix of observed contacts is not symmetric because of randomness in the observation of subjects. for instance one notices that the subject floor nurses were not observed to contact food service workers, but a small number of food service worker to floor nurse contacts were observed. thus by assuming that all contacts that occur in the matrix are undirected, we create a symmetric contact matrix where the total number of contacts from a member of group x to group y (and from group y to group x) is one-half the sum of the observed average contacts from group x to group y and from group y to group x. we create the contacts in a uniform random manner within groups. let ρ i j be the ratio of the average contacts between a member of group i and j (taken from table ) to the total number of group j employees (taken from table ). we then take each pair of employees across each group and create a contact with probability ρ i j . specifically, let agent a be a member of group i and agent b be a member of group j. 
then the probability that a and b are connected is $\rho_{ij}$, the average number of contacts observed between members of groups i and j divided by $n_j$, the number of employees belonging to group j. before moving to the computational model, we mention two limitations of our data set. human contact networks frequently have a small number of individuals with a much larger than average number of contacts, perhaps differing by orders of magnitude. these individuals are often called hubs. hubs have the potential to significantly influence disease transmission because they are highly likely to be infected and, if so, to pass on infections to a large number of individuals. because our sample includes approximately . % of the hospital worker population, we may be missing hubs in our sample if they exist in this setting. however, we note two things in relation to this. first, the relatively homogeneous workday responsibilities of workers within categories likely limit the variation of contacts within a category. for instance, two physicians or two floor nurses are likely constrained to see a relatively similar number of patients each day. this is unlike many other social network data sets (such as general friendship or online networks) where there are no such work responsibility constraints on individual contacts. thus the possibility of hubs with orders of magnitude differences in numbers of contacts is more limited in our data context. second, our data set is more comprehensive than any other within-hospital contact data set in existence in terms of the worker categories included. recall from earlier in the paper that the ueno and masuda data set only includes physicians, nurses, and patients in a hospital much smaller than the one studied here. our results below suggest that many of the most important groups in the hospital are not included in their study.
thus, at a minimum, our study highlights the importance of funding for studies that aim to collect even more comprehensive data sets that include individuals with nonpatient care responsibilities and more comprehensively cover a larger share of hospital workers. in addition, we note that we consider all contacts in our data set to be equivalent, or "equally weighted." there may be some concern that not all contacts are created equal in our context. for instance, a contact between a physician and a patient that occurs during a physical examination, may be more likely to spread an infectious disease than other contacts in our data set. we attempt to control for this possibility later in the paper when we consider heterogeneous transmission rates and repeated contacts. as mentioned above, transmissions of infectious disease are not usually observable. thus studying infectious disease transmission using an agent-based model can be a useful tool. in the remainder of the paper we model the spread of an infectious disease across the simulated hospital contact networks described above. once created, we use the contact network in a model of the spread of an infectious disease in the hospital as follows: agents can be in one of three states, susceptible to being infected (s), infected and able to infect others (i), or recovered (and therefore immune) (r). we assume that each infected agent recovers after periods which is in-line with the infectious period for influenza. once recovered the agent enters state r and is therefore immune to further transitions to the infected state. initially, all agents in the model are in state s. agents may be vaccinated against infection. vaccinations occur only in the initial period of the model. once vaccinated, an agent moves immediately from state s to state r and is thus immune for the remainder of the model. 
in the initial period of the model, each agent in state s (all agents that have not been vaccinated) is subject to infection with probability α = . . these are the agents of our model that seed the potential epidemic. once these initial infections occur we assume that each contact in our network occurs once in each subsequent period of the model. if a contact occurs between a state i and a state s agent, the state s agent transitions to state i with probability α, which we vary across experiments. we continue the model until no agents remain in state i. in each period of the model we calculate the fraction of agents in each worker category in each state (s, i, or r). for each of the results reported below we run replications of each parameter set or computational experiment reported. the results reported are averages over these replications. in addition to the results reported below, we have studied a wide range of parameters for our model and find the results reported below to be robust to changes in all of the parameters within reasonable bounds. the purpose of the model is to estimate m(γ j ) and the externality generated for the network of contacts in the uihc shadow data and, in turn, to identify the classes of workers most important to vaccinate. this is a two step process. first we perform a series of base-line models as described above with none of the healthcare workers and patients vaccinated. from this baseline, we observe the rate of infection for each class of agents in the hospital population ( worker groups and patients for a total of groups). we denote the infection rate of group k in the base model as a function of the transmission rate α as π k b (α) and the overall infection rate in the entire population as a function of α as π b (α). second, we want to calculate the average and marginal infections generated by each group. to do so, we change the vaccination rate for each group, one group at a time, and re-run the model. 
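the epidemic process described above (three states, vaccination at the start, random seeding, one transmission attempt per contact per period, and a fixed infectious period) can be sketched as a short simulation. this is a hedged illustration rather than the authors' code: the function name, the `initial_infected` override used to make runs reproducible, and the synchronous update order are assumptions, and the infectious period and seeding probability are left as parameters because their values are not recoverable from the text.

```python
import random

def run_sir(neighbors, alpha, seed_prob, infectious_periods,
            vaccinated=(), initial_infected=None, seed=0):
    """Simulate one SIR epidemic on a fixed contact network.

    neighbors: dict mapping agent id -> iterable of contact ids (undirected).
    Returns the set of agents that were ever infected.
    """
    rng = random.Random(seed)
    state = {a: "S" for a in neighbors}
    for a in vaccinated:              # vaccinated agents start out immune
        state[a] = "R"
    if initial_infected is None:      # random seeding of susceptible agents
        seeds = [a for a in neighbors
                 if state[a] == "S" and rng.random() < seed_prob]
    else:                             # hypothetical override for reproducible runs
        seeds = [a for a in initial_infected if state[a] == "S"]
    clock = {}                        # remaining infectious periods per infected agent
    ever_infected = set()
    for a in seeds:
        state[a] = "I"
        clock[a] = infectious_periods
        ever_infected.add(a)
    while clock:                      # stop when no agent remains in state I
        newly = set()
        for a in clock:               # every I-S contact occurs once per period
            for b in neighbors[a]:
                if state[b] == "S" and rng.random() < alpha:
                    newly.add(b)
        for a in list(clock):         # advance recovery clocks
            clock[a] -= 1
            if clock[a] == 0:
                state[a] = "R"
                del clock[a]
        for b in newly:               # new infections take effect next period
            state[b] = "I"
            clock[b] = infectious_periods
            ever_infected.add(b)
    return ever_infected
```

for example, on a three-agent chain `{0: [1], 1: [0, 2], 2: [1]}` with `alpha=1.0` and agent 0 as the only seed, the infection reaches every agent, whereas vaccinating agent 1 confines it to agent 0. averaging the size of the returned set over many seeds gives the infection rates compared across vaccination scenarios.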
as an example, we run the model with all floor nurses vaccinated and no other vaccinations and observe the overall infection rate $\pi_k(\alpha)$ with k indexing floor nurses. then we run the model with all housekeepers vaccinated and no one else and observe the corresponding $\pi_k(\alpha)$, and so on for each group. we then compare the change in the average number of infections between the models, $\delta(b,k) = (\pi_b(\alpha) - \pi_k(\alpha))\,n$, which is the difference between the overall infection rate in the base model with no vaccinations and the overall infection rate in the model with all of group k vaccinated, multiplied by the total population size $n$. now, using the notation described above, the change in the number of infections $\delta(b,k)$ is equal to the change in number of people vaccinated, $n_k$, multiplied by the probability that each of these agents becomes infected if not vaccinated, multiplied by the number of additional infections each infected agent would generate. simplifying, if we assume each agent infects the same number of individuals, we can write the average number of secondary infections generated per infected person in group k as $a_k$ and write:

$$\delta(b,k) = n_k \, \pi_b^k(\alpha) \, a_k .$$

one can then find an estimate of the average secondary infections per infected person as:

$$\hat{a}_k = \frac{\delta(b,k)}{n_k \, \pi_b^k(\alpha)} .$$

effectively, this process removes each group from the hospital, one at a time. we then can observe the effect of each individual worker group on the size of the modeled epidemic. instead of vaccinating all agents of a group at once, we can vaccinate a fraction of a group. as we change this fraction at intervals $\Delta n_k$ we can view the effect of increases in vaccination rates for each group (one at a time). we then have an estimate of the marginal infections prevented per vaccination as:

$$\frac{\Delta \delta(b,k)}{\Delta n_k} ,$$

where $\Delta \delta(b,k)$ is the change in infections prevented between successive vaccination levels. we now proceed in two steps. first we investigate the effect of each group in total on the epidemic process by vaccinating an entire group one at a time and calculating $\hat{a}_k$ for each group.
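the accounting described above can be made concrete with a small calculation. the sketch below reads the text as saying that the infections prevented by vaccinating group k equal the group size times the group's baseline infection probability times the average secondary infections per infected member, and it solves that identity for the last term. the function names are hypothetical and the numbers in the usage note are purely illustrative, not results from the paper.

```python
def infections_prevented(pi_base, pi_group_vaccinated, n_total):
    """delta(b, k): drop in expected infections when all of group k is vaccinated.

    pi_base:             overall infection rate with no vaccinations.
    pi_group_vaccinated: overall infection rate with group k fully vaccinated.
    n_total:             total population size.
    """
    return (pi_base - pi_group_vaccinated) * n_total

def average_secondary_infections(delta, n_k, pi_k_base):
    """Estimate of a_k, obtained by solving delta = n_k * pi_k_base * a_k."""
    return delta / (n_k * pi_k_base)

def decrease_per_vaccination(delta, n_k):
    """Infections prevented per vaccinated member of group k."""
    return delta / n_k
```

with illustrative inputs (a population of 1000, an overall infection rate that falls from 0.40 to 0.37 when a group of 20 is vaccinated, and a baseline infection probability of 0.5 for that group), the prevented infections are about 30, the decrease per vaccination is about 1.5, and the estimated average secondary infections are about 3.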
we then choose a sample of interesting groups and study the details of the epidemic process as we vary the number of vaccinations performed in each of these groups, at specified intervals between and %. interestingly, as we vary the percent of each group vaccinated, we will see that there are different outcomes across these different groups in terms of marginal infections generated, probability of infection, and the overall effect of a vaccination (in terms of reducing the number of infections). we begin by varying the transmission rate, α, over the range [ . , . ] and observing the base infection rate $\pi_b(\alpha)$. the results are displayed in fig. . as one can see in the figure, a sufficiently large transmission rate is needed to generate an epidemic of reasonable size. further, as expected, the number of infections generated monotonically increases as a function of the transmission rate, α. our primary interest is in intermediate ranges of epidemic outbreaks. if the transmission rate is too high then almost everyone in a population needs to be vaccinated in order to reach herd immunity. and, if the transmission rate is too low, then there is not a large need to worry about vaccine priority. thus, we concentrate on two intermediate levels of the transmission rate α = . and α = . . with no vaccinations, these levels yield an epidemic where between one-third and one-half of the population is infected over the course of the epidemic. we now find the average effect of vaccinations across the hospital worker groups using the procedure described above for α = . and α = . . we present results for the average "secondary infections" generated per infected person, $\hat{a}_k$, and the product of average infections and probability of infection, which yields the "decrease in infections per vaccination," $\delta(b,k)/n_k$, in tables and .
from the decrease in infections per vaccination we have an indication of how much the vaccination of an individual group member is contributing to preventing the spread of an epidemic. the results of this experiment suggest where efforts should be directed in the event of an influenza vaccine shortage or in the event of the development of new disease for which a vaccine may be developed (e.g., avian flu, swine flu, etc.) but is initially in short supply until mass quantities may be made available. note that some of the groups have vaccinations that prevent less than one infection per vaccination. this occurs because these groups have a low probability of infection and sufficiently low number of average infections that each member generates if infected. groups with large decreases in infections per vaccination are the ones to prioritize in times of a vaccine shortage, assuming equal costs of infection. in these experiments we see three clear groups that stand above the others in terms of the effect of vaccinations. for the parameters of the experiments, each vaccination of a unit clerk, social worker, and phlebotomist, results in a decrease of . infections or more on average for α = . and of . or more for α = . . in addition vaccinating unit clerks is extremely effective; each unit clerk vaccination results in a decrease of over infections for α = . and over infections for α = . . somewhat surprisingly, some of the groups that are seen as central to the functioning of a hospital play a very small or moderate role in spreading an infectious disease. vaccinating staff physicians results in a lower than average decrease in infections. we revisit this result in our discussion of transmission rates later in the paper. also of note, as one would expect, as the transmission rate increases, the probability of infection increases. but, this has the effect of making individual vaccinations less beneficial on average. 
note that $\hat{a}_k$ is smaller for each group for a higher transmission rate. this has the effect of lowering the variance of average infections. for the α = . case above the standard deviation is . , and for the α = . case, the standard deviation is . . as the transmission rate increases, a larger fraction of individuals are infected throughout the population. thus there are more opportunities for each individual to be infected if she has not already been infected. vaccinating a given person in the population will only prevent one of these multiple channels for infection. so, as the infection rate increases, the effectiveness of a vaccination becomes more uniform across the groups. this has direct policy applications. an infectious disease that is highly contagious could best be met with a uniform vaccination strategy since each individual in the population will create a similar level of infections on average. but an infectious disease with a low level of contagiousness could most effectively be met with a targeted vaccination campaign (bansal and pourbohloul ) . we next look at the marginal effect of a vaccination as the number of vaccinations increases. we present results in figs. , , , , and for five interesting worker group categories for the same two transmission rates discussed above. unit clerks, social workers and phlebotomists are chosen because of their large number of secondary infections generated. we also choose floor nurses and staff physicians because of interest in the effect of worker groups with primary patient care responsibilities. in fig. we plot the marginal infections prevented per vaccination as a function of the number of vaccinations performed for the five groups. recall that the number of marginal infections is the additional number of infections that an agent generates if the agent becomes infected. in fig. we plot the probability of infection for the five groups. and, in fig.
we plot the product of marginal infections and probability of infection which yields the total number of infections prevented per vaccination. these figures all consider a transmission rate of . . figures , , and plot the same relationships for a transmission rate of . . we begin with marginal infections. recall that marginal infections may be increasing or decreasing in the number of vaccinations performed for small numbers of vaccinations (boulier et al. ). (here the number of vaccinations performed is small relative to the entire population as we are only vaccinating some members of one of the groups. so, even if we vaccinate an entire group, this is a small number relative to the entire population.) here, we see two interesting outcomes. first, we see that marginal infections for both unit clerks and floor nurses increase as more vaccinations are performed. for these two groups, each additional vaccination prevents a larger and larger number of infections. this is particularly extreme for a transmission rate of . and the case of unit clerks where a small number of vaccinations results the product of the two previously discussed statistics yields the number of infections prevented per vaccination. again, unit clerks display a unique relationship in that the number of infections prevented per vaccination dramatically increases in the number of vaccinations performed. the other four groups result in much flatter plots. thus again there is little difference between the marginal and the average for these groups. there are two interesting points to be made from these results. the first is that the effect of vaccinating unit clerks in our data is most important both from a marginal and an average perspective regardless of how many vaccinations have been performed. particularly, the marginal effect of vaccinating a unit clerk increases at a greater rate than the probability of infection for a non-vaccinated unit clerk falls. 
thus, it is always more beneficial to vaccinate one more unit clerk as opposed to a worker from another group (assuming transmission rates are equal). the second is that there is little difference between the average and the marginal effect of a vaccination for the other groups considered here. this second point can be interpreted as good news from a policy making perspective in the sense that the optimal allocation of vaccinations does not switch as more of a group is vaccinated. in other words it is not the case that group a is the optimal group to target up to some vaccination percentage, after which group b should be targeted. switching such as that would indicate a much more complicated solution to the optimal vaccine allocation problem. here, because there is little difference between the marginal and the average effect of a vaccination for most groups, one can pragmatically target the groups with the largest average effect of a vaccination. we now move to discuss the important features of the contact network that creates the externality. as we will see below, it is not just the number of contacts that an agent has but also which specific agents and groups the agent contacts, as well as who the agent's contacts connect to in turn. we begin by looking at some basic statistics of the contacts in our data in table . for each group, the table displays the total number of contacts, the percentage of total contacts that are with members of an agent's own group, and the number of patient contacts. total contacts and contacts with patients could be directly correlated with the likelihood of being infected and with passing on infections. the percentage of contacts within one's own group can indicate how varied one's network is and how widespread one's connections are. for instance having few contacts within one's own group provides the possibility of introducing an infection to other groups within the hospital. 
in addition the table also contains a common network characteristic measure, betweenness centrality, which we discuss below. on a network, the geodesic distance, $g_{a,b}$, between nodes a and b is the length of the shortest path between the two nodes that traverses connections on the network, measured as the number of connections traversed (or "hops" required) to reach node b from node a. as one measure of the centrality of a node on a graph one can calculate the average distance to all other nodes on the graph. thus if a graph $G$ is composed of $n$ nodes, average distance for node a is calculated as:

$$\bar{g}_a = \frac{1}{n-1} \sum_{b \neq a} g_{a,b} .$$

a short average distance indicates that a node is close to other nodes on average and thus may be likely to be infected and to pass on infections. thus it is sometimes considered a measure of the centrality or importance of a node in a network. betweenness is another measure of the centrality of a node in a network. let $c^{i}_{jk}$ be the proportion of all geodesics linking node j and node k which pass through node i. let $c^{i}$ be the sum of all $c^{i}_{jk}$ for $i \neq j \neq k$. let $\bar{c}^{i}$ be the maximum possible value for $c^{i}$. (normalized) betweenness for node i, $b_i$, is then:

$$b_i = \frac{c^{i}}{\bar{c}^{i}} .$$

betweenness for node i is therefore a measure of the proportion of shortest paths between nodes that go through node i. in the table below we report group level values for betweenness centrality. the values are created in the following manner. first we create a network using the same methods as described above. second, we then calculate the betweenness centrality measurement for each agent in the simulated network. third, we calculate the average value for each hospital worker group and report the result in the table below. this network variable is likely to be an indicator of importance for disease transmission in the network.
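the three-step procedure above (build a network, compute betweenness for every agent, average within each worker group) can be sketched without external libraries. the example below uses brandes' algorithm for unweighted graphs, which is a standard way to compute betweenness and is a choice made here, since the text does not say which algorithm was used. normalization divides by (n-1)(n-2)/2, the largest value a node can attain in an undirected graph (reached by the center of a star). the function names are hypothetical.

```python
from collections import deque

def betweenness(neighbors):
    """Normalized betweenness centrality for an undirected, unweighted graph.

    neighbors: dict mapping node -> iterable of adjacent nodes.
    """
    raw = {v: 0.0 for v in neighbors}
    for s in neighbors:
        # Breadth-first search from s, counting shortest paths to each node.
        order = []
        preds = {v: [] for v in neighbors}
        sigma = {v: 0 for v in neighbors}   # number of geodesics from s
        dist = {v: -1 for v in neighbors}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a geodesic to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back toward s.
        dep = {v: 0.0 for v in neighbors}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += sigma[v] / sigma[w] * (1.0 + dep[w])
            if w != s:
                raw[w] += dep[w]
    n = len(neighbors)
    max_value = (n - 1) * (n - 2) / 2
    # Each unordered pair is counted from both endpoints, hence the division by 2.
    return {v: (raw[v] / 2) / max_value if max_value > 0 else 0.0 for v in raw}

def group_average(values, group_of):
    """Average a per-agent measure within each worker group."""
    totals, counts = {}, {}
    for v, x in values.items():
        g = group_of[v]
        totals[g] = totals.get(g, 0.0) + x
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}
```

on a three-node chain the middle node lies on the only geodesic between the two end nodes, so its normalized betweenness is 1 and the end nodes score 0. on a four-node ring each node carries half of the two geodesics between its neighbors, giving 1/6.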
if a hospital worker group has a high average betweenness value, then the nodes in this group are potentially important in passing infections on to other nodes because they lie along many of the shortest paths between nodes in the network. as such, this measure should be closely related to the marginal infections that a group generates. what is most interesting in table is the lack of a clear relationship between any of the variables in the first three columns and our previously-listed most important groups (highlighted in bold). the only measure that consistently aligns well is the betweenness measure. for a moment concentrate on the values in the first three columns. each of these three plausibly important characteristics fails to display a meaningful relationship with the average or marginal infections generated. if we concentrate on the top three most important groups, some have relatively large numbers of contacts (unit clerks), although not the largest, while others have contacts significantly below the average (phlebotomists). some have large numbers of patient contacts (phlebotomists) while others have some of the smallest numbers of patient contacts (unit clerks and social workers). one interesting thing that appears in the table is that phlebotomists have almost all of their contacts with patients and other phlebotomists. thus the network position of phlebotomists is, in some sense, very similar to that of patients. overall, what these observations imply is that there is not likely to be a simple relationship (or a small set of simple relationships) indicating which individuals are most important to vaccinate by looking at easily observed worker interaction patterns. instead the relationship depends on the intricate and complex web of relationships that make up the entire contact network of the hospital. this is what is captured in the betweenness centrality measure.
betweenness measures the percentage of shortest paths in the entire network on which an agent is located. if we remove agents with high betweenness measures (by vaccinations) from the disease propagation network, we disrupt the flow of an epidemic. for concreteness, the correlation between the betweenness measure and the average infections generated is about . for both α = . and α = . . as mentioned above, the primary focus of this paper concerns the effect of network position on the marginal infections generated within a hospital. here, we consider two robustness checks to the results presented above. first, we discuss a comparison of the magnitude of the above described network effects and the effect of transmission rates on marginal infections for an interesting example group, staff physicians. second, we run a further set of experiments using an additional data set that includes observed repeated contacts. we perform these robustness checks for two main reasons: first, because of the job responsibilities of different worker groups, the different worker groups may have different transmission rates, durations of contacts, or frequency of contacts, creating another source of heterogeneity in infections. as a few examples, a staff physician may be more likely to transmit an infection over the course of a patient exam that includes a series of physical contacts, compared to a nurse who has a brief arm's-length conversation with a hospital transporter. a floor nurse may have multiple interactions with the same patient during the course of the day. we begin this analysis by varying the transmission rate of staff physicians. (recall that staff physicians created a lower than average number of infections in our earlier model.) again, this is primarily to account for the fact that physicians may have longer duration contacts with patients and the contacts may more frequently involve physical touch.
we now assign a special transmission rate to staff physicians that we denote as α p . this will be the transmission rate for any contact between a physician and another agent where one of the agents is infected. we use α = . for all other contacts in the population and vary α p from α p = . and α p = . . as we do this we again measure average infections as described earlier and show the resulting average infections for unit clerks and staff physicians in fig. . before we present the results we note that there are alternative ways to model this scenario. for instance, we could re-weight our contact matrix. if we knew, for instance, that physician to patient contacts lasted twice as long as other pairs of contacts in the hospital worker population, we could re-weight each physician to patient contact by a factor of two. but note that, on average, this is equivalent to increasing the transmission probability by a factor of two because the expected number of new infections is the number of susceptible to infected contacts in a period multiplied by the transmission rate. a doubling of the number of contacts is equivalent to a doubling of the transmission rate. recall from our results above that when the transmission rates are equal the average secondary infections created by unit clerks are slightly greater than and slightly greater than . for staff physicians. as we increase the transmission rate for staff physicians we see several things: first, as you would expect, the average infections for staff physicians increases. but the change is not large. for α p = . the average secondary infections created by staff physicians is . , slightly less than a two-fold increase and still well below the level of average infections noted for unit clerks when the transmission rates are equal. for unit clerks, as α p increases, the average infections of unit clerks drops rapidly. 
this occurs for at least two reasons: one is that unit clerks become less important relative to staff physicians as α p increases. another is that, as α p increases, the overall infection rates increase, and as we reported earlier, this causes average infections to become more uniform across groups. overall, the level of average infections between these two groups does not become similar until the α p increases to about . , a five-fold increase in the staff physician transmission rate. and, the average infections of staff physicians does not become greater than that of unit clerks until α p = . . most interesting about these results is the magnitude of the network effects relative to the magnitude of the transmission rates. in the case of staff physicians and unit clerks it takes a - times increase in the transmission rate of staff physicians to "make up" the difference in network position. this suggests that the network effect differences in average infections are very important in understanding overall transmission patterns. as a second robustness check we use an additional cut of our data that includes observed repeated contacts. in some instances members of the hospital population were observed to have come in contact more than once during the observation period. we use this data as one way to include the weighting of contacts mentioned above into the analysis. table displays the observed contacts that were observed and had occurred previously in our data. as you see in the table many of these contacts involved repeated interactions with patients (often by members of the nursing staff). we re-perform the analysis above with these additional contacts added into the data. the only additional change is that we modify the transmission rate to α = . so that the total number of infections in the population remains nearly constant compared to the non-repeat contact data. 
(recall from earlier in the paper that a larger epidemic smooths out differences in the population groups and makes the average and marginal values more similar across groups. thus we control for epidemic size by varying the transmission rate.) we present the results in table . you will notice in the table that most of the groups that previously had the largest effect still do. unit clerks are still the most important group to vaccinate, but the difference in magnitude between unit clerks and other important groups is less than in the previous model. further, as one would expect, groups that have more repeated contacts, such as all types of nurses, become more important. the important point, though, is that the relative ranking of most of the groups changes very little. of course this is partially due to there being relatively few repeated contacts in the data set. taking these two robustness checks in combination, they demonstrate two important things. first, network structure is at least as important as transmission rate in determining the course of an epidemic in our data set. second, while the data we have collected is not perfect in terms of comprehensiveness, the relative ranking of group importance appears relatively robust to alternative measures of network structure. at a minimum, taken in combination, these results suggest a need for a greater emphasis on network based data collection in order to better understand both micro level and macro level epidemiology. we now make a small shift in focus. generally, a hospital's primary goal is to restore or improve patient health. thus prioritizing healthcare worker vaccinations so as to best protect patients may be a legitimate goal of a hospital. in other words hospital administrators may care about protecting patients from infection as much as, or more than, they do about protecting the entire hospital population from infection. of course these may be closely related goals.
in addition, in a large scale epidemic, it may be of great importance to have a healthy staff of physicians to treat patients. thus protecting physicians may be another important goal in vaccine priority within a hospital. in tables and we display the same relationships as displayed in our initial results section above but this time only with regard to patient infections generated and staff physician infections generated, not infections in the entire hospital. in this analysis we see very similar results to the overall population results. beginning with patients, the four groups (unit clerk, social worker, physical and occupational therapist, and phlebotomist) that played the most important role in transmitting to the hospital population as a whole also play an important role in their effect on patients specifically. however, we see some difference in the two groups of models. first, groups that have more direct patient contacts increase in importance. for instance, phlebotomists replace unit clerks as the most important group. second new groups emerge as important for transmitting to patients. for instance, hospital transporters are among the top four groups in transmitting to patients but are significantly below the average in terms of transmissions to the general population. with this in mind it seems that giving vaccination priority to health care workers with direct patient contacts is more important for protecting patients than it is for protecting the general population, as one would expect. but still, some of the groups with the largest impact on infecting patients have few direct patient contacts (unit clerks and social workers, for example). for staff physicians we see similar results. again, the four most important groups remain important, but other groups such as floor nurses increase in importance when considering staff physicians specifically. 
to summarize the results of this section, the same groups that create infections in the general population also create infections in the patient and staff physician population. but, groups that have direct patient or staff physician contacts have increased importance. still, one should not ignore other groups central to the network of the hospital that have only a few direct patient or staff physician contacts (e.g., social workers and unit clerks). we utilize a newly collected data set on contacts of health care workers at a large university hospital to estimate network effects for infectious disease transmission. interestingly, the most important groups to vaccinate tend to have heterogeneous contacts throughout the hospital. groups such as social workers and unit clerks are very important to vaccinate even though they have been given low priority in past vaccine campaigns because of their relatively limited number of patient contacts. for instance, the cdc recommended in their interim influenza vaccination recommendations in - that influenza vaccine priority be given to "health-care workers involved in direct patient care" and further stated, "persons who are not included in one of the priority groups described above should be informed about the urgent vaccine supply situation and asked to forego or defer vaccination." this mismatch of scientific results and past policy decisions suggests that future research in this area is warranted, especially when one considers the public health dangers associated with the emergence of avian flu, a more lethal version of swine flu, or recent dangers such as sars. with that stated, we want to be careful to recognize that one reason to vaccinate primary care providers is to assure individuals are available to care for the sick. this important incentive is outside the scope of our model. the results of this paper lead to important public policy considerations.
specifically, hospital workers with a low probability of infection may be likely to ignore recommendations for vaccination even if they are central to the spread of an infectious disease. one way to increase the overall vaccination level is with a subsidy program. but, as the results in this paper show, not all hospital workers are equal in terms of the positive externality generated by a vaccination. because of the heterogeneous contacts throughout the hospital, some workers are more important to the spread of an infectious disease than others. thus if hospitals and other public health organizations want to efficiently distribute vaccines they need to target specific worker groups, perhaps by allocating subsidies, on the basis of discrepancies in probability of infection and marginal infections generated. this paper is the first to use specific micro-level contact data within a hospital to guide policy makers and public health officials in this endeavor. to be clear, these results are not meant to be specifically calibrated to measure the exact effect of vaccinations in these groups. instead our hope is that the orderings of the hospital worker groups (which are robust across the parameters that we have explored) indicate where public health officials can effectively intervene in order to prevent widespread epidemics within hospitals. and these experiments reveal interesting and surprising groupings. prior to this study, as quoted above, it had been argued that groups like unit clerks be excluded from influenza vaccine campaigns, in times of vaccine shortages, because of their minimal patient contacts. the results of this study suggest that decisions such as these need to be more fully explored. a comparative analysis of influenza vaccination programs article cdc ( ) epidemiology and prevention of vaccine-preventable diseases, th edn. 
public health foundation prevention and control of influenza: recommendations of the advisory committee on immunization practice (acip) model parameters and outbreak control for sars the spatial dynamics of host parasitoid systems incidence and recall of influenza in a cohort of glasgow healthcare workers during the - epidemic: results of serum testing and questionnaire optimal tax/subsidy combinations for flu season disease eradication: private versus public vaccination (meta)population dynamics of infectious diseases selected aspects of the socioeconomic impact of nosocomial infections: morbidity, mortality, cost, and prevention a contribution to the mathematical theory of epidemics economic epidemiology and infectious diseases prospects of the control of disease prevention and control of influenza: recommendations of the advisory committee on immunization practices, morbidity and mortality weekly report mortality associated with influenza and respiratory syncytial virus in the united states controlling nosocomial infection based on the structure of hospital social networks perspective: human contact patterns and the spread of airborne infectious diseases barriers to influenza vaccination acceptance. a survey of physicians and nurses key: cord- -bjx td authors: vanhems, philippe; barrat, alain; cattuto, ciro; pinton, jean-françois; khanafer, nagham; régis, corinne; kim, byeul-a; comte, brigitte; voirin, nicolas title: estimating potential infection transmission routes in hospital wards using wearable proximity sensors date: - - journal: plos one doi: . /journal.pone. sha: doc_id: cord_uid: bjx td background: contacts between patients, patients and health care workers (hcws) and among hcws represent one of the important routes of transmission of hospital-acquired infections (hai). a detailed description and quantification of contacts in hospitals provides key information for hais epidemiology and for the design and validation of control measures. 
methods and findings: we used wearable sensors to detect close-range interactions (“contacts”) between individuals in the geriatric unit of a university hospital. contact events were measured with a spatial resolution of about . meters and a temporal resolution of seconds. the study included hcws and patients and lasted for days and nights. , contacts were recorded overall, . % of which during daytime. the number and duration of contacts varied between mornings, afternoons and nights, and contact matrices describing the mixing patterns between hcw and patients were built for each time period. contact patterns were qualitatively similar from one day to the next. % of the contacts occurred between pairs of hcws and hcws accounted for % of all the contacts including at least one patient, suggesting a population of individuals who could potentially act as super-spreaders. conclusions: wearable sensors represent a novel tool for the measurement of contact patterns in hospitals. the collected data can provide information on important aspects that impact the spreading patterns of infectious diseases, such as the strong heterogeneity of contact numbers and durations across individuals, the variability in the number of contacts during a day, and the fraction of repeated contacts across days. this variability is however associated with a marked statistical stability of contact and mixing patterns across days. our results highlight the need for such measurement efforts in order to correctly inform mathematical models of hais and use them to inform the design and evaluation of prevention strategies. the control of hospital-acquired infections (hai) is largely based on preventive procedures derived from the best available knowledge of potential transmission routes. the accurate description of contact patterns between individuals is crucial to this end, as it can help to understand the possible transmission dynamics and the design principles for appropriate control measures. 
in particular, the mutual exposures between patients and health-care workers (hcws) have been documented for bacterial and viral transmission for decades [ , , ] . transmission might be the result of effective contact, as in the cases of s. aureus [ , ] , k. pneumoniae [ ] or rotavirus [ ] , of exposure to contaminated aerosols, as for m. tuberculosis [ ] , or the result of exposure to droplets, as for influenza [ ] . some pathogens such as influenza can also be transmitted by different routes. although close-range proximity and contacts between individuals are strong determinants for potential transmissions, obtaining reliable data on these behaviors remains a challenge [ ] . data on contacts between individuals in specific settings or in the general population are most often obtained from diaries and surveys [ , , , ] and from time-use records [ ] . these approaches provide essential information to describe contact patterns and inform models of infectious disease spread. the gathered data, however, often lack the longitudinal dimension [ , , ] and the high spatial and temporal resolution needed to accurately characterize the interactions among individuals in specific environments such as hospitals. moreover, they are subject to potential biases arising from behavioral modifications in the presence of observers, from short periods of observation, and especially from missing information and recall biases. evaluating biases and understanding the accuracy of the collected data is therefore a difficult task [ ] . in this context, the use of electronic devices has recently emerged as an interesting complement to more traditional methods [ ] . in particular, wearable sensors based on active radio-frequency identification (rfid) technology have been used to measure face-to-face proximity relations between individuals with a high spatio-temporal resolution in various contexts [ ] that include social gatherings [ , ] , schools [ , ] and hospitals [ , ] . 
the amount of available data, however, is still very limited, high-resolution contact data relevant for the epidemiology of infectious diseases are scarce, and the longitudinal aspects of contact patterns have not been investigated in detail, prompting further investigation. in this paper we report on the use of wearable proximity sensors [ ] to measure the numbers and durations of contacts between individuals in an acute care geriatric unit of a university hospital. we investigate the variability of contact patterns as a function of time, as well as the differences in contact patterns between individuals with different roles in the ward. we document the presence of individuals with a high number of contacts, who could be considered as potential super-spreaders of infections. some implications of our results regarding prevention and control of hospital-acquired infections are discussed. the measurement system, developed by the sociopatterns collaboration [ ] , is based on small active rfid devices (''tags'') that are embedded in unobtrusive wearable badges and exchange ultra-low-power radio packets [ , , , ] . the power level is tuned so that devices can exchange packets only when located within - . meters of one another, i.e., packet exchange is used as a proxy for distance (the tags do not directly measure distances). individuals were asked to wear the devices on their chests using lanyards, ensuring that the rfid devices of two individuals can only exchange radio packets when the persons are facing each other, as the human body acts as an rf shield at the frequency used for communication. in summary the system is tuned so that it detects and records close-range encounters during which a communicable disease infection could be transmitted, for example, by cough, sneeze or hand contact. the information on face-to-face proximity events detected by the wearable sensors is relayed to radio receivers installed throughout the hospital ward (bedrooms, offices and hall). 
the system was tuned so that whenever two individuals wearing the rfid tags were in face-to-face proximity the probability to detect such a proximity event over a time interval of seconds was larger than %. we therefore define two individuals to be in ''contact'' during a -second interval if and only if their sensors exchanged at least one packet during that interval. a contact is therefore symmetric by definition, and in case of contacts involving three or more individuals in the same -second interval, all the contact pairs were considered. after the contact is established, it is considered ongoing as long as the devices continue to exchange at least one packet for every subsequent s interval. conversely, a contact is considered broken if a -second interval elapses with no exchange of packets. we emphasize that this is an operational definition of the human proximity behavior that we choose to quantify, and that all the results we present correspond to this precise and specific definition of ''contact''. we make the raw data we collected available to the public as datasets s -s in file s and on the website of the sociopatterns collaboration (www.sociopatterns.org). data were collected in a short stay geriatric unit ( beds) of a university hospital of almost beds [ ] in lyon, france, from monday, december , at : pm to friday, december , at : pm. during that time, professional staff worked in the unit and patients were admitted. we collected data on the contacts between staff members ( % participation rate) and patients ( % participation rate). the participating staff members were nurses or nurses' aides, medical doctors and administrative staff. in the ward, all rooms but were single-bed rooms. each day teams of nurses and nurses' aides worked in the ward: one of the teams was present from : am to : pm and the other from : pm to : pm. an additional nurse and an additional nurses' aide were moreover present from : am to : pm. 
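The operational definition of a ''contact'' given above (at least one packet exchanged in an interval, with the contact ongoing while consecutive intervals each contain an exchange and broken by the first empty interval) can be illustrated with a minimal sketch. This is not the processing code of the sociopatterns collaboration: the function name `build_contacts`, the input layout and the interval length `INTERVAL_S` are assumptions made for illustration, and the value used below is a placeholder, not the interval used in the study.

```python
from collections import defaultdict


def build_contacts(detections, interval_s):
    """Merge per-interval packet detections into contact events.

    detections: iterable of (interval_index, tag_a, tag_b), one entry for each
    interval in which the two tags exchanged at least one packet.
    Returns {(tag_a, tag_b): [(start_s, end_s), ...]} with each pair sorted,
    so that a contact is symmetric by construction.
    """
    slots = defaultdict(set)
    for idx, a, b in detections:
        if a != b:
            slots[tuple(sorted((a, b)))].add(idx)

    contacts = {}
    for pair, indices in slots.items():
        ordered = sorted(indices)
        events = []
        start = prev = ordered[0]
        for idx in ordered[1:]:
            if idx != prev + 1:
                # At least one interval elapsed with no packet: contact broken.
                events.append((start * interval_s, (prev + 1) * interval_s))
                start = idx
            prev = idx
        events.append((start * interval_s, (prev + 1) * interval_s))
        contacts[pair] = events
    return contacts


INTERVAL_S = 10  # placeholder interval length in seconds (assumption)
detections = [
    (0, "n1", "p1"),
    (1, "p1", "n1"),  # same pair, reversed order: still the same contact
    (3, "n1", "p1"),  # interval 2 was empty, so a new contact starts here
    (0, "n1", "n2"),
]
contacts = build_contacts(detections, INTERVAL_S)
```

With these toy detections the pair ("n1", "p1") yields two separate contact events, because one interval elapsed without any packet exchange between them.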
two nurses were present during the nights from : pm to : am. in addition, a physiotherapist and a nutritionist were present each day at various points in time, with no fixed schedule, and a social counselor and a physical therapist visited on demand (in our analysis they are considered as nurses). two physicians and interns were present from : am to : pm each day. visits were allowed from : am to : pm but visitors were not included in the study. in advance of the study, staff members and patients were informed on the details and aims of the study. a signed informed consent was obtained for each participating patient and staff member. all participants were given an rfid tag and asked to wear it properly at all times. no personal information was associated with the tag: only the professional category of each hcw and the age of the patients were collected. the study was approved by the french national bodies responsible for ethics and privacy, the ''commission nationale de l'informatique et des libertés'' (cnil, http://www.cnil.fr) and the ''comité de protection des personnes'' (http://www.cppsudest .com/) of the hospital. individuals were categorized in four classes according to their activity in the ward: patients (pat), medical doctors (physicians and interns, med), paramedical staff (nurses and nurses' aides, nur) and administrative staff (adm). med and nur professionals form a group named hcw. the contact patterns were analyzed using both the numbers and the durations of contacts between individuals. for each individual we measured the number of other distinct individuals with whom she/he had been in contact, as well as the total number of contact events she/he was involved in, and the total time spent in contact with other individuals. these quantities were aggregated for each class and for each pair of role classes in order to define contact matrices that describe the mixing patterns between classes of individuals. 
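As a rough illustration of how per-class contact matrices of this kind can be assembled, the sketch below aggregates a list of contact events by the role classes of the two individuals involved. The event layout, the tag names and the function name `contact_matrix` are hypothetical; the class labels follow the four classes defined above.

```python
from collections import Counter


def contact_matrix(events, role):
    """Aggregate contact events by pair of role classes.

    events: iterable of (tag_a, tag_b, duration_s).
    role: dict mapping each tag to its class label (PAT, MED, NUR or ADM).
    Returns two Counters keyed by the sorted pair of class labels:
    the number of contacts and their cumulative duration in seconds.
    """
    counts, durations = Counter(), Counter()
    for a, b, duration_s in events:
        key = tuple(sorted((role[a], role[b])))  # matrix is symmetric
        counts[key] += 1
        durations[key] += duration_s
    return counts, durations


# Toy data, not taken from the study.
role = {"n1": "NUR", "n2": "NUR", "p1": "PAT", "m1": "MED"}
events = [
    ("n1", "n2", 40),
    ("n1", "p1", 20),
    ("p1", "n2", 60),
    ("m1", "n1", 20),
]
counts, durations = contact_matrix(events, role)
```

Dividing each entry by the number of individuals (or of possible pairs) in the corresponding classes would give a matrix that accounts for class sizes; the exact normalization used in the study is not restated here, so it is left out of the sketch.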
the longitudinal evolution of the contact patterns was studied by considering, in addition to the entire study duration, several shorter time intervals: we divided the study duration into daytime periods ( : am to : pm) and nights ( : pm to : am); daytime periods were divided in morning ( : am to : pm) and afternoon ( : pm to : pm) shifts, and we also considered data aggregated on a -hour timescale. we finally considered the similarity of contact patterns between successive days, by measuring the fraction of contacts that were repeated from one day to the next, as such information is particularly relevant when modeling spreading phenomena [ , , ] . overall, , contacts occurred during the study, with a cumulative duration of , s (approx. , minutes or hours). , contacts ( . %) included at least one nur, , ( . %) included at least one med, and , ( . %) at least one patient. table reports the average number and duration of contacts of individuals in each class over the whole study duration. most contacts involve at least one nur and/or one med, and nurs and meds have on average the largest number of contacts, as well as the largest cumulative duration in contact. large standard deviations are however observed: the distributions of the contact durations and of the numbers and cumulative durations of contacts are broad, as also observed in many other contexts [ , , , ] . important variations are observed even within each role class. in particular, contacts of much larger duration than the average are observed with a non-negligible frequency. the total number of contacts between individuals belonging to specific classes is reported in table and the corresponding contact matrices are shown in figure . we report contact matrices giving the total numbers and cumulative durations of contacts between individuals of given classes, as well as contact matrices taking into account the different numbers of individuals in each class. 
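The day-to-day similarity mentioned above, the fraction of contacts repeated from one day to the next, can be sketched as follows. The precise estimator is not spelled out in this section, so the sketch assumes one plausible reading: for each individual, the share of day-one neighbours that are contacted again on day two. The function names and the toy edge lists are illustrative only.

```python
def neighbours(edges):
    """Return {person: set of distinct persons contacted} from an edge list."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    return nbrs


def repeated_fraction(day1_edges, day2_edges):
    """Per-individual fraction of day-one neighbours seen again on day two."""
    n1, n2 = neighbours(day1_edges), neighbours(day2_edges)
    return {
        person: len(seen & n2.get(person, set())) / len(seen)
        for person, seen in n1.items()
    }


# Toy daily networks, not taken from the study.
day1 = [("n1", "p1"), ("n1", "p2"), ("n2", "p1")]
day2 = [("n1", "p1"), ("n2", "p2")]
fractions = repeated_fraction(day1, day2)
mean_fraction = sum(fractions.values()) / len(fractions)
```

A low average value of such a fraction, together with stable aggregate statistics, corresponds to the situation in which the daily network changes substantially while the mixing patterns remain similar.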
contacts were most frequent between two nurs ( , contacts, %), followed by nur-pat contacts ( , contacts, %), and by contacts between two meds ( , contacts, %). very few contacts between pats or between members of the adm group were observed. as reported in table , among the , contacts detected, , ( . %) occurred during daytime, for a total duration of , s (approx. , min or h). contacts ( . %) occurred during nights (lasting , s, approx. min or h). on average we recorded , contacts per morning, , per afternoon, and per night. the evolution of the number of contacts at the more detailed resolution of one-hour time windows is reported in figure . the number of contacts varied strongly over the course of a day, but the evolution was similar from one day to another (for day and day , contacts were recorded after : pm and before : pm respectively, see methods), with very few contacts at night and a maximum around - am. the number of contacts between individuals of specific classes also depends on the period of the day. contacts between nurs, and between nurs and pats, were predominant in the morning while contacts between meds remained similar between mornings and afternoons. overall, . % of contacts between nurs and pats occurred in the morning, . % in the afternoon and . % during the night. figure reports the contact matrices giving the numbers of contacts between individuals of specific classes for each morning, afternoon and night. the absolute numbers of contacts varied from one morning (resp., afternoon or night) to the next, but the mixing patterns remained very similar. differences were observed between morning, afternoon and night patterns. the main difference between morning and afternoon periods came from larger numbers of contacts involving nurs in the morning. almost only contacts involving nurs and pats were observed at night. 
although the aggregated observables reported above are very similar from day to day, the precise structure of the daily contact network varied strongly: the fraction of common neighbors of an individual between two different days is on average just of %. this value is smaller than the one observed in a school [ ] , but much larger than the one measured for the attendees of a conference [ ] . the cumulative number and duration of contacts of each individual, as identified by his/her badge identification number, are reported in figure and table . a small number of hcws accounted for most of the contacts observed between hcws and pats, both in terms of number and cumulative duration. for instance, nurs (representing % of all hcws) accounted for . % of the number of contacts and . % of the cumulated duration of the contacts with pats (number of contacts and cumulative duration of contacts of a given individual are strongly correlated, r = . ). the number of distinct individuals contacted by a given individual was also correlated with the total number of contacts of the same individual (r = . ). these hcws had a much larger number and duration of contacts than average, as shown in table . the objective of the present study was to describe in detail the contacts between individuals in a healthcare setting. such data can help to accurately inform computational models of the propagation of infectious diseases and, as a consequence, to improve the design and implementation of prevention or control measures based on the frequency and duration of contacts. numbers and duration of contacts were characterized for each class of individuals and for individuals belonging to given class pairs, yielding contact matrices that represent important inputs for realistic computational models of nosocomial infections. 
as also measured in other contexts [ , , , , ] , the numbers and durations of contacts display large variations even across individuals of the same class: the resulting distributions were broad, with no characteristic time scale. as a consequence, even though the average durations of contacts were rather short, contacts of much longer durations than average occurred with non-negligible frequency. contacts between two nurs or between nurs and pats accounted for the majority of contacts, both in terms of numbers and of global durations. very few contacts occurred between pats: this might be a specificity of wards with mostly single rooms, and other wards in which patients are not alone in a room or in which they move around more might yield more numerous contacts between pats. these results are consistent with previous studies [ , ] carried out in pediatrics, surgery and intensive care units, and provide additional evidence that nurses and assistants may be the most essential target group for prevention measures [ , ] . the detailed information about the number and duration of contacts also allowed us to highlight the presence of a limited number of ''super-contactors'' among hcws who account for a large part of all contacts. a large number of contacts could correspond to different situations, namely to contacts with many different patients, or to many contacts with few patients. our results show that the cumulated number of contacts and the number of distinct persons contacted are correlated; this indicates that in the hospital context under study the super-contactors have contacts with many different patients. they could therefore potentially play the role of super-spreaders, whose importance in the spread of infectious agents has been highlighted both theoretically [ , ] and empirically [ ] . this suggests that their role class should be targeted for prevention measures. 
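The two quantities behind the ''super-contactor'' argument above, the share of contacts accounted for by a few individuals and the correlation between the number of contacts and the number of distinct persons contacted, can be computed as in the sketch below. The helper names and all numbers are made up for illustration and do not reproduce the study's values; the correlation is a plain Pearson coefficient.

```python
from math import sqrt


def top_share(totals, k):
    """Share of all contacts accounted for by the k most connected individuals."""
    ranked = sorted(totals.values(), reverse=True)
    return sum(ranked[:k]) / sum(ranked)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy per-individual totals (hypothetical badge ids and values).
n_contacts = {"n1": 120, "n2": 40, "n3": 30, "n4": 10}
n_distinct = {"n1": 12, "n2": 4, "n3": 3, "n4": 1}

share_top1 = top_share(n_contacts, 1)
ids = sorted(n_contacts)
r = pearson([n_contacts[i] for i in ids], [n_distinct[i] for i in ids])
```

A high correlation in such data indicates that individuals with many contact events also reach many distinct persons, which is the situation in which super-contactors could act as super-spreaders.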
these results are in concordance with the central role of hcws in hospital wards, as repeated contacts with patients are often necessary for the quality of healthcare. however, since outbreaks of measles and influenza involving this population have been observed [ , ] , the possibility for hcws to be super-contactors emphasizes the need to reduce their exposure to infection and to limit the risk of transmission to patients. this should stimulate the strict implementation of preventive measures including hand washing, vaccination, or wearing of masks [ ] . in addition, hcws could be warned against the risk brought forth by unnecessarily large numbers or long durations of contacts, especially with patients. limiting the contacts of hcws (either with pats or with other hcws) might however not be feasible without altering the quality of care. in this respect, the investigation of the temporal evolution of the numbers of contacts may help envision and discuss changes in the organization of care during epidemic or pandemic periods. the numbers of contacts indeed varied greatly over the course of each day, clearly highlighting the periods of the day (here, the mornings) during which transmission could occur with higher probability. the high numbers of contacts during mornings may indicate a potential overexposure to infection for pats and nurs, and one may imagine a different organization toward a smoothing of the number of contacts throughout mornings and afternoons. this would decrease the density of contacts, in particular between nurs, at each specific moment, while maintaining the daily number and duration of contacts between nurs and pats, and overall tend to limit their overexposure [ ] . the potential efficacy of such or other changes in the healthcare organization should of course be tested through numerical simulations of spreading phenomena, and their feasibility would moreover need to be assessed through discussions with the staff. 
the measurement of contact patterns by means of wearable sensors presents strengths and limitations that are worth discussing. strong advantages are the versatility of the sensing strategy (i.e., the unobtrusiveness of wearable sensors and the prompt deployability of receivers) and the fact that it neither requires the constant presence of external observers nor interferes with the delivery of care in the ward. another strength lies in the high spatial and temporal resolution: behavioral differences across role classes can be detected, and longitudinal studies are possible. high participation ratios are also crucial: similarly to a previous study in another hospital [ ] , the rate of acceptance among hcws and patients turned out to be very high ( %). the information meetings held before the study, providing a clear exposition of the scientific objectives and of the privacy aspects, most probably played an important role in achieving such a high participation rate. the versatility of a system based on wearable sensors and easily deployable data receivers makes it possible to repeat similar studies in different environments and to compare results across contexts [ ] . in particular, several of the reported findings are very similar to those described in [ ] in a different hospital, situated in a different country, and in a different type of ward (paediatric): large variability in the cumulative duration of contacts, small number of contacts between patients, and large numbers and durations of contacts between nurs. repeating measurements in the same ward and in other wards represents an important step towards understanding the similarities and differences of contact patterns in hospital settings, and makes it possible to generalize the observations so as to inform models more correctly. the measurement approach we used here also has several limitations. contacts were defined as face-to-face proximity, without any information on physical contact between individuals. 
therefore, the assumption that the number of contacts reflects disease exposure can be appropriate for respiratory infections such as influenza, or for similar diseases that can be transmitted by various routes at a distance of meter around an index case [ ] . the use of close-range proximity as a proxy for the transmission of bacterial infection acquired by cross transmission, such as s. aureus or enterobacteriaceae, is more questionable. other factors related to specific attributes of individuals (e.g., vaccination or immunosuppression), of the microbial agent (e.g., resistance or virulence) or of the environment (e.g., specialty of ward) may also alter the relationship between contact frequency/duration and transmission. in this respect, a validation with simultaneous direct observation and human annotation of the contacts would be of particular interest. finally, it is difficult to assess whether individuals modified their behavior in response to wearing rfid badges, but direct observation indicated that hcws were focusing on their daily activities and most probably were not influenced by the presence of the badges. badges were not proposed to visitors and this potential external source of infection was not studied. this study complements previous work [ , , , , , ] and provides data that can be used to explore the spread of infection in confined settings through mathematical and computational modeling. models of transmission within hospitals might be based on contact matrices such as those presented here, and used to better understand the epidemiology of different types of microbial agents, to assess the impact of control measures, and to help improve the delivery of care during emergency epidemic situations. in our study, specific mixing patterns were observed between different classes of individuals, showing a clear departure from homogeneous mixing, as is expected in a hospital setting, and highlighting the relevance of correctly informed contact matrices. 
moreover, although an important turnover between the persons in contact with a given individual was observed across different days, and although the average contact durations between different classes of individuals varied between mornings, afternoons and nights, the contact patterns remained statistically very similar across successive days. these results suggest that, in order to correctly inform computational models, data collected over just a few hours might be insufficient, but that measures lasting hours would be sufficient to evaluate the statistical properties of contact patterns as well as the mixing patterns between individual classes, and to estimate the similarity between the contacts of an individual across days. the statistical features of the gathered data could then be used to model contact patterns over longer time scales. the scarcity of contact data [ , ] calls for further measurement campaigns to validate and consolidate the results across other hospital units, other contexts, and over longer periods of time. additional data sets would also be useful to build and test proxies that could replace systematic detailed measurement of contact patterns, such as the ones put forward in [ , , ] . in order to explore the relationship between complex contact networks and the spreading of infections, it would be particularly interesting to collect simultaneously high-resolution contact data and microbiological data describing the infection status of participating individuals. combining these heterogeneous sources of information within appropriate statistical models would allow elucidating the relation between the risk of disease transmission and contact patterns, in order to disentangle transmission likelihood from contact frequency. finally, feedback of the results to hcws could be an innovative pedagogical tool in health care settings. file s dataset s . time-resolved contact network for day , in gexf format. 
each node corresponds to one rfid tag and has an attribute ''role'' that indicates the role of the individual wearing the tag: patient (pat), medical doctor (med), paramedical staff (nur) or administrative staff (adm). each edge has attributes: ''ncontacts'', the number of contact events between the corresponding rfid tags; ''cumulativeduration'', the total duration of these contacts, and ''list_contacts'', the explicit list of time intervals during which the individuals were in contact. dataset s . time-resolved contact network for day , in gexf format. each node corresponds to one rfid tag and has an attribute ''role'' that indicates the role of the individual wearing the tag: patient (pat), medical doctor (med), paramedical staff (nur) or administrative staff (adm). each edge has attributes: ''ncontacts'', the number of contact events between the corresponding rfid tags; ''cumulativeduration'', the total duration of these contacts, and ''list_contacts'', the explicit list of time intervals during which the individuals were in contact. dataset s . time-resolved contact network for day , in gexf format. each node corresponds to one rfid tag and has an attribute ''role'' that indicates the role of the individual wearing the tag: patient (pat), medical doctor (med), paramedical staff (nur) or administrative staff (adm). each edge has attributes: ''ncontacts'', the number of contact events between the corresponding rfid tags; ''cumulativeduration'', the total duration of these contacts, and ''list_contacts'', the explicit list of time intervals during which the individuals were in contact. dataset s . time-resolved contact network for day , in gexf format. each node corresponds to one rfid tag and has an attribute ''role'' that indicates the role of the individual wearing the tag: patient (pat), medical doctor (med), paramedical staff (nur) or administrative staff (adm). 
each edge has attributes: ''ncontacts'', the number of contact events between the corresponding rfid tags; ''cumulativeduration'', the total duration of these contacts, and ''list_contacts'', the explicit list of time intervals during which the individuals were in contact. dataset s . time-resolved contact network for day , in gexf format. each node corresponds to one rfid tag and has an attribute ''role'' that indicates the role of the individual wearing the tag: patient (pat), medical doctor (med), paramedical staff (nur) or administrative staff (adm). each edge has attributes: ''ncontacts'', the number of contact events between the corresponding rfid tags; ''cumulativeduration'', the total duration of these contacts, and ''list_contacts'', the explicit list of time intervals during which the individuals were in contact. (zip) health-care workers: source, vector, or victim of mrsa? tuberculosis exposure of patients and staff in an outpatient hemodialysis unit risk of influenza-like illness in an acute health care setting during community influenza epidemics in modeling the spread of methicillin-resistant staphylococcus aureus in nursing homes for elderly community and nosocomial transmission of panton-valentine leucocidin-positive community-associated meticillin-resistant staphylococcus aureus: implications for healthcare klebsiella pneumoniae bloodstream infections among neonates in a high-risk nursery in cali, colombia outbreak of rotavirus gastroenteritis in a nursing home transmission of drug-susceptible and drug-resistant tuberculosis and the critical importance of airborne infection control in the era of hiv infection and highly active antiretroviral therapy rollouts transmission of pandemic a/h n influenza on passenger aircraft: retrospective cohort study close encounters of the infectious kind: methods to measure social mixing behaviour social mixing patterns for transmission models of close contact infections: exploring self-evaluation and diary-based data 
collection through a web-based interface comparison of three methods for ascertainment of contact information relevant to respiratory pathogen transmission in encounter networks social contacts of school children and the transmission of respiratory-spread pathogens social contacts and mixing patterns relevant to the spread of infectious diseases using time-use data to parameterize models for the spread of close-contact infectious diseases collecting close-contact social mixing data with contact diaries: reporting errors and biases dynamics of person-to-person interactions from distributed rfid sensor networks simulation of an seir infectious disease model on the dynamic contact network of conference attendees what's in a crowd? analysis of face-to-face behavioral networks a high-resolution human contact network for infectious disease transmission high-resolution measurements of face-to-face contact patterns in a primary school using sensor networks to study the effect of peripatetic healthcare workers on the spread of hospital-associated infections close encounters in a pediatric ward: measuring face-to-face proximity and mixing patterns with wearable sensors modelling disease spread through random and regular contacts in clustered populations models of epidemics: when contact repetition and clustering should be included prioritizing healthcare worker vaccinations on the basis of social network analysis nurses' contacts and potential for infectious disease transmission superspreading and the effect of individual variation on disease emergence peripatetic health-care workers as potential superspreaders severe acute respiratory syndrome-singapore nosocomial transmission of measles: an updated review hospital-acquired influenza: a synthesis using the outbreak reports and intervention studies of nosocomial infection (orion) statement monitoring hand hygiene via human observers: how should we be sampling? 
we are particularly grateful to all patients and the hospital staff who volunteered to participate in the data collection.
key: cord- -fugb l authors: klepac, petra; kucharski, adam j; conlan, andrew jk; kissler, stephen; tang, maria; fry, hannah; gog, julia r title: contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: fugb l social mixing patterns are crucial in driving transmission of infectious diseases and informing public health interventions to contain their spread. age-specific social mixing is often inferred from surveys of self-recorded contacts, which by design often have a very limited number of participants. in addition, such surveys are rare, so public health interventions are often evaluated by considering only one such study. here we report detailed population contact patterns for the united kingdom based on self-reported contact data from over , volunteers that participated in the massive citizen science project bbc pandemic. the amount of data collected allows us to generate fine-scale age-specific population contact matrices by context (home, work, school, other) and type (conversational or physical) of contact that took place. these matrices are highly relevant for informing prevention and control of new outbreaks, and evaluating strategies that reduce the amount of mixing in the population (such as school closures, social distancing, or working from home).
in addition, they make it possible to use multiple sources of social mixing data to evaluate the uncertainty that stems from social mixing when designing public health interventions. for directly transmitted respiratory pathogens such as influenza, measles and coronaviruses, social mixing patterns shape the risk of individual-level infection [ ] and population-level transmission dynamics [ , ] , as well as the effectiveness of control measures targeted at specific age groups [ ] . typically two main approaches have been used to measure social mixing patterns relevant for the spread of disease: inference of contacts based on wearable devices such as proximity sensors [ , ] , or self-recording of contacts [ ] . as well as being able to capture age-specific patterns of infection [ ] , self-recording also captures details of the type and setting of social contacts, and demographic information about the contacts themselves. a landmark dataset of self-reported contacts was the polymod study [ ] , which collected social mixing data for , participants across eight european countries. these data have been widely used to understand the epidemiology of infectious diseases [ ] and inform policy-relevant disease modelling [ , ] . however, the sample size for each country (e.g. , participants for great britain) limits the ability to stratify by multiple demographic factors and still obtain precise estimates of social mixing within those groups. the polymod data also lack details about the location of participants, so social contacts cannot be compared across spatial covariates such as urban and rural settings. moreover, the polymod study was conducted in - , and so patterns may have changed since then.
to generate a more contemporary large-scale dataset on social mixing and movement patterns in the united kingdom, the bbc pandemic project recruited over , participants between september and december as part of a public science project linked to a bbc documentary [ ] . here we present high-resolution age-specific social mixing matrices based on data from over , participants, stratified by key characteristics such as contact type and setting. there were two components to the bbc pandemic study, one focused on the town of haslemere [ ] , and another focused on the wider uk population [ ] . here we present data from the uk national study. upon using the bbc pandemic app to join this study, users first entered their basic demographic information, including age, household size, gender and occupation. the app then recorded their approximate location at hourly intervals for a hour period. at the end of this period, users recorded each social contact they had made during this period, including information on: the contact's age; the type of interaction (conversational contact, defined as face-to-face conversation of three or more words, or physical contact); the setting of the contact (home, work, school, other); and whether the participant had spoken to that person before. overall, over , participants started the survey and filled out their profile. participants with no encounter or location data were excluded, as were users whose location recordings were all outside the uk. this leaves a rich dataset of around , participants. of those, , completed the study and reported their social contacts at the end of the survey; these participants are the focus of this paper. the data collection process had some limitations. first, the initial version of the app set the default age of a contact to years old. participants were free to change the value on the slider, but if they just clicked through, it would record that contact's age as .
as a result, the early data had more contacts of this age than was plausible. in our analysis, we therefore remove users with or more contacts of age exactly ( , users dropped). these users together reported , contacts, of which , were aged exactly years old. second, the initial version did not allow users to record precisely zero contacts, so these users may be missing from our denominators. we do not expect this effect to be large and, for simplicity, have not attempted to correct for it here. information was provided and consent obtained from all participants in the study before the app recorded any data. the study was approved by the london school of hygiene & tropical medicine observational research ethics committee (ref ). we follow [ ] to infer mixing matrices from the self-reported contact data. we group the study participants and their contacts by age into the following age groups: - , - , - , - , - , - , - , - , then year age bands from to , with a single category for those aged and over; the finer structure was chosen to provide higher resolution for school and university ages. for each pair of age groups, we find $t_{ij}$: the total number of reported contacts over the course of hours between participants in age group j and their reported contacts of estimated age group i. to find the mean number of contacts ($m_{ij}$) in age group i reported by participants in age group j, we divide $t_{ij}$ by $n_j$, the total number of participants in age group j. this results in the 'social contact matrix' $m = (m_{ij})$, where $m_{ij} = t_{ij}/n_j$. this is the raw contact matrix as derived from our study data. we can deduce more from our reported contacts because contacts are reciprocal (if person a was in contact with person b, then person b was also in contact with person a). on a population level, this means that the total number of contacts from age group j to age group i must be equal to the total number of contacts from age group i to age group j.
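the raw matrix described above can be sketched in a few lines of python with numpy. this is a minimal illustration rather than the authors' code: the function name, the input layout (one age-group index per participant, plus pairs of participant index and contact age group) and the toy data are all assumptions made for the example.

```python
import numpy as np

def raw_contact_matrix(participant_groups, contacts, n_groups):
    """Return m with m[i, j] = t[i, j] / n[j].

    participant_groups: age-group index of each participant (hypothetical layout).
    contacts: (participant index, age group of the reported contact) pairs.
    """
    groups = np.asarray(participant_groups)
    # n[j]: number of participants in age group j
    n = np.bincount(groups, minlength=n_groups).astype(float)
    # t[i, j]: contacts of age group i reported by participants in age group j
    t = np.zeros((n_groups, n_groups))
    for participant, contact_group in contacts:
        t[contact_group, groups[participant]] += 1
    # Divide column j by n[j]; columns with no participants are left as NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(n > 0, t / n, np.nan)

# Toy data: two participants in group 0, one participant in group 1.
m = raw_contact_matrix([0, 0, 1], [(0, 0), (0, 1), (1, 1), (2, 0)], n_groups=2)
```

columns with no participants are returned as nan rather than zero, so that a missing age group is not mistaken for a group that reported no contacts.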
as the sample of participants might have a different population structure than the wider population, this step depends on the country-specific population structure: we define $w_i$ as the total population size of age group i. in our case, volunteers needed to be in the uk to participate in the study, so we use the mid-year estimate of the population structure of the uk (available from ons [ ] ). the reciprocal matrix $c = (c_{ij})$ gives the population contact matrix, where $c_{ij} = \frac{1}{2}\left(m_{ij} + m_{ji}\,\frac{w_i}{w_j}\right)$ [ ] . the population matrix c is of particular importance for infectious disease dynamics because it is related to the next generation matrix [ , ] . the next generation matrix n captures how the infection spreads when a pathogen is first introduced in a population, and its (i, j) entry gives the expected number of new infections in compartment i produced by an infected individual introduced into compartment j. as a result, the dominant eigenvalue of n is equal to the basic reproduction number $r_0$, the expected number of secondary infections caused by a single individual introduced to a completely susceptible population. the relationship between the two matrices is $n = \frac{r_0}{\rho(c)}\,c$, where $\rho(c)$ is the dominant eigenvalue of c (its spectral radius). analogous to the eigenvector representing stage-specific contributions to overall population reproduction in demographic theory [ ] , the dominant eigenvector here gives an indication of which age-groups most contribute to transmission in the population, assuming no age-specific differences in susceptibility or infectiousness (fig ) . we repeated these calculations stratified by type of contact (physical or conversational) and by context (home, work, school, other), resulting in matrices each for the raw contact matrix shown in fig s and population contact matrix in fig . we further stratify these matrices by contacts made during the week ( fig s ) and during the weekend ( fig s ) .
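a short numerical sketch may help to make the reciprocity correction and the eigen-analysis concrete. it assumes the reciprocal matrix is the average of the two directional estimates, c_ij = (m_ij + m_ji w_i / w_j) / 2, and that the next generation matrix is proportional to c with dominant eigenvalue r_0. the function names and the toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def reciprocal_matrix(m, w):
    """c[i, j] = (m[i, j] + m[j, i] * w[i] / w[j]) / 2."""
    m = np.asarray(m, dtype=float)
    w = np.asarray(w, dtype=float)
    return 0.5 * (m + m.T * w[:, None] / w[None, :])

def dominant_eigenpair(c):
    """Spectral radius of a non-negative matrix and its eigenvector (sums to 1)."""
    values, vectors = np.linalg.eig(c)
    k = np.argmax(values.real)
    v = np.abs(vectors[:, k].real)
    return values[k].real, v / v.sum()

def next_generation_matrix(c, r0):
    """Rescale c so that its dominant eigenvalue equals r0 (assumed relationship)."""
    rho, _ = dominant_eigenpair(c)
    return (r0 / rho) * c

# Toy raw matrix and population sizes for two age groups.
m = np.array([[2.0, 1.0], [3.0, 4.0]])
w = np.array([100.0, 200.0])
c = reciprocal_matrix(m, w)
rho, shares = dominant_eigenpair(c)
```

after the correction, c[i, j] * w[j] equals c[j, i] * w[i], which is the population-level reciprocity condition; the normalised eigenvector `shares` plays the role of the age-specific contributions to transmission.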
as participants filled out the contact survey at the end of the hour period, we define the weekend based on the time they activated the app, in the following way. to avoid weekend/week overlap, we consider participants who activated the app after : on fridays and before noon on sundays as those reporting their weekend contacts, and those who activated the app after : on sundays and before noon on fridays as those reporting contacts made during the week. this excludes , users with , contacts whose reported contacts fall both during the week and during the weekend. we also analysed the relationship between the population density of the locations participants typically visited and their social contacts. we focused on a subset of the data with participants that had at least recorded gps locations (n= , ). we used data from lower layer super output areas for england and wales, small areas for northern ireland, and data zones for scotland to estimate the population density per square km for each recorded gps location in our dataset, with density layers matched to gps location based on whichever had the nearest centroid. we then calculated the mean density across all gps locations recorded for each participant. finally, we compared reported household size with social contacts for participants that had at least one reported physical or conversational contact (n= , ). to explore the relationship between household size, average density of gps tracks and social contacts, we used a generalized additive model [ ] . the model was defined as $g(\mathrm{E}(y)) = b + f(x) + a$, where y was the binary outcome variable (i.e. reported contacts made), x was the predictor (i.e. household size or average density of gps tracks on log scale), g was the link function, b was the intercept, a was age (to adjust for possible confounding), and f was a smooth function represented by a penalized regression spline. results for fitted gams are shown for a = .
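the week/weekend rule described at the start of this passage can be sketched as a small classifier. the evening cutoff on fridays and sundays is not recoverable from the text here, so it is passed in as a hypothetical parameter `cutoff_hour`; the function name and the labels are also assumptions made for illustration.

```python
from datetime import datetime

def classify_recording(start, cutoff_hour):
    """Label a recording by the time the app was activated.

    cutoff_hour is a hypothetical parameter standing in for the friday/sunday
    evening cutoff, which is not given in the text.
    Returns "weekend", "week", or "mixed" (overlaps both, so it is excluded).
    """
    day, hour = start.weekday(), start.hour  # Monday is 0, Sunday is 6
    if (day == 4 and hour >= cutoff_hour) or day == 5 or (day == 6 and hour < 12):
        return "weekend"
    if (day == 6 and hour >= cutoff_hour) or day in (0, 1, 2, 3) or (day == 4 and hour < 12):
        return "week"
    return "mixed"

# Example with an illustrative cutoff of 18 (an assumption, not the study's value).
label = classify_recording(datetime(2018, 3, 23, 20), cutoff_hour=18)
```

recordings started between noon and the cutoff on a friday or a sunday are labelled "mixed", matching the users who are excluded because their contacts span both the week and the weekend.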
the copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license.
the main dataset used in this study consisted of , participants reporting , contacts. for ethical reasons, participants in the bbc study had to be at least years of age, although the number of younger participants tails off gradually rather than showing a hard cutoff at this age ( fig a) . total reported contacts (i.e. the sum of physical and conversational contacts) were distributed across a wide range of age groups, with a peak reflecting the peak in age of participants, and spikes likely representing a bias towards choosing round numbers as estimated ages of casual contacts. the participant population in the study under-sampled the youngest and oldest groups relative to the underlying uk population ( fig c) . the total conversational and physical contacts varied greatly across different age groups (fig a,b ). on average, participants reported over three times more conversational contacts than physical contacts, and conversational contacts spanned a larger age range. the very strong diagonal density of contacts ( fig b) is characteristic of strong age-assortative mixing, and the sub-diagonal density captures interactions between children and their parents. the dominant eigenvectors of the matrices, which indicate the age groups that would drive transmission during the exponential phase of an epidemic simulated using these data, are highest for the - age group ( fig c) . in general, eigenvectors based on physical-only contacts and all contacts (both conversational and physical) are very similar, except for the ages over , where relative dominance is higher for physical contacts. the measured age-specific social mixing matrices also varied considerably between different types and settings.
reported contacts at home in the population contact matrix followed a strong age-assortative pattern (strong diagonal band), with inter-generational mixing shown by the off-diagonal bands in contacts (fig a-b) , which are especially pronounced in physical contacts at home. contacts at work showed less age-assortativity than contacts at home (fig c-d) , and were predominantly non-physical ( fig d) . within school-aged groups, more contacts were reported on average at school than in other settings (fig e-f ), but only for a very narrow age band. overall contacts in other settings (i.e. not home, work or school) were age-assortative for younger groups, but less assortative for older groups, with an off-diagonal peak in contact intensity between older participants and other adults (fig g-h) . physical contacts in other settings were less common, but also exhibited the transition from age-assortativity to less structured mixing in older age groups ( fig g) . further stratifying these contacts by whether they were made during the week or during the weekend shows temporal changes in the average number of contacts in different settings (fig ) . both physical and all contacts are higher at home and at 'other' locations (not home, school or work) during the weekend, while there is a marked decrease in contacts at work and at school, as might be expected. we found that participants whose recorded gps tracks were typically in lower-density areas had fewer contacts than those who spent most of their time in denser areas ( a). participants who spent most time in areas with density between and , people per square km reported total contacts on average, whereas participants in areas with fewer than people per square km recorded fewer than contacts on average. there was a positive association between contacts within the home and household size for participants who lived with up to additional people ( b).
however, conversational contacts within the home saturated at around in households of size or larger, and physical contacts saturated at in households of size - , before declining again. we also found many contacts reported on average within the home for the % of participants who lived alone (self-reported household size of ): although % of participants who lived alone reported no contacts within the home, the remaining % of these participants reported . contacts within the home. to date, the gold standard for modelling age-specific contact patterns in many settings has been the polymod dataset collected in / [ ] . in great britain, the polymod study had , participants reporting a total of , contacts ( . contacts on average). this is higher than the average daily number of contacts reported in the bbc contact dataset, with , participants reporting , contacts ( . contacts reported on average). for most age groups, the total number of contacts per day for an average person (assuming population structure [ ] ) is remarkably similar between the bbc and polymod datasets, especially for ages over (fig a) . in the bbc contact data, a smaller proportion of those contacts are physical (and correspondingly a higher proportion of contacts are conversational only). the reduction in the average number of contacts for ages - between the polymod and bbc datasets is striking. while for the - age group this may be affected by the fact that we are only sampling a subset of this age group and assuming that -and -year-olds are representative of the entire age group, for -to -year-olds this reduction is real. this could be the effect of our sampled population, or it might be a signature of a real change in social contacts of teenagers since polymod. a survey of over , -to -year-olds in the us in showed that, compared to , teens' preference for direct face-to-face communication with friends has declined substantially (from a half to a third), while interactions over social media have increased [ ] .
a similar trend in digitisation of teenager interactions is likely to be present also in the rest of the world with comparable teenage mobile phone use, and it is possible that the reduction in both conversational and physical contacts in our dataset is a reflection of teenage interactions moving away from face-to-face to social media. out of all reported contacts in the polymod dataset, . % were physical (involved touch), . % didn't involve touch, and the rest were unspecified, resulting in comparable levels of reported conversational and physical contacts.
figure : population contact matrices. mean number of contacts reported by participants of given age groups adjusted for reciprocity of contacts, c, assuming population age structure given by the mid-year estimate from ons [ ] . matrices are by different encounter type (physical only or all contact, in respective columns) and by different encounter context (home, work, school or other, in respective rows).
gam fits of number of household occupants (not including participants) and total contacts within the home (red), conversational contacts within the home (gold) and physical contacts within the home (cyan), adjusting for age (results shown for year olds).
in the bbc dataset, participants reported about three times more conversational contacts than physical ( . % of reported contacts were conversational and . % were physical). compared to the general population, polymod oversampled younger groups by design (i.e. ages under ) and the bbc data ended up oversampling adults (in particular ages - ); both studies undersample ages over (fig a) . the distribution of the number of contacts reported across all participants followed a negative-binomial distribution with mean . for bbc data and . for polymod data (fig c and d, respectively) . the polymod data was right censored at contacts, which is evident in the density plot ( fig d) . as the contact matrices are related to next generation matrices, the differences in structure between the polymod and bbc matrices have direct consequences for disease dynamics. here, we are particularly interested in the age-groups that are most responsible for transmission, which is described by the dominant eigenvector of the next generation matrix. we compare polymod matrices for great britain consisting of the average number of contacts recorded per day per survey participant that are available from mossong et al. [ ] table s . even though the raw data from polymod is available, these matrices are most commonly used, which is why we choose them for the comparison with bbc matrices. ages are grouped in -year bins until the age of , and a single + age group (total of age-groups). for the comparison of matrices we follow the same grouping (fig ) , and make both bbc and polymod matrices reciprocal using the population vector for england and wales [ ] . 
we address the missing data in bbc matrices in two ways: (1) we take the ages and for which we have data to be representative of the entire age-group - , and (2) we fill out the information for the age-groups - and - from polymod by scaling the missing square with the ratio of dominant eigenvalues of the symmetric subset of the bbc matrix without missing values, and the same subset of the polymod matrix. the scaling factor $q = \rho(\mathrm{bbc}_s)/\rho(\mathrm{polymod}_s)$, where s designates this symmetric subset of the matrix and $\rho(\cdot)$ is the dominant eigenvalue (spectral radius), ensures that the dominant eigenvalue of the filled-in bbc matrix stays intact. the largest eigenvalue of the reciprocal contact matrix is the average number of different people contacted during one day by someone who has just been contacted [ ] (this value is proportional to $r_0$). the relevant type of contact for transmission (physical or all contact) will depend on the type of pathogen. for pathogens that require close sustained contact for transmission (such as commensal skin colonisers associated with nosocomial infections like staphylococcus aureus [ ] ), physical contacts might be more relevant. for very easily transmissible pathogens like measles, some combination of physical and conversational contacts would better represent transmission. figure a illustrates the difference between the average number of people contacted for a whole range of matrices ranging from physical contacts only to all contacts (by gradually adding conversational contacts in increments of . ). the average number of physical contacts in the polymod dataset is more than twice that of the bbc dataset. however, on average participants in the bbc dataset speak to two more persons a day than polymod participants ( . compared to . ).
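the rescaled fill of the missing square block described at the start of this passage can be sketched as follows. this is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, `missing` lists the indices of the age groups without bbc participants, and the toy matrices are invented for the example.

```python
import numpy as np

def spectral_radius(a):
    """Largest eigenvalue modulus of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(a)))

def fill_missing_block(bbc, polymod, missing):
    """Fill the square block of `bbc` indexed by `missing` with q * polymod.

    q is the ratio of spectral radii of the two matrices restricted to the
    age groups that are observed in both (the subset s in the text).
    """
    bbc = np.asarray(bbc, dtype=float)
    polymod = np.asarray(polymod, dtype=float)
    observed = np.setdiff1d(np.arange(bbc.shape[0]), missing)
    sub = np.ix_(observed, observed)
    q = spectral_radius(bbc[sub]) / spectral_radius(polymod[sub])
    filled = bbc.copy()
    block = np.ix_(missing, missing)
    filled[block] = q * polymod[block]
    return filled, q

# Toy matrices: group 0 has no bbc participants, so bbc[0, 0] is unknown.
bbc = np.array([[np.nan, 1.0, 2.0], [1.0, 4.0, 2.0], [2.0, 2.0, 6.0]])
polymod = np.array([[5.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 3.0]])
filled, q = fill_missing_block(bbc, polymod, missing=[0])
```

only the block among the missing age groups is replaced; entries linking observed and missing groups are assumed to have been recovered already by the reciprocity step.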
the age-specific contributions to those overall contacts (or overall transmission during the early stages of an outbreak) are given by the magnitude of the dominant eigenvector. fig b-d shows the relative contribution of different age-groups to overall transmission for physical contacts only (b), physical contacts plus conversational contacts scaled by a half (c) and all contacts (d) for both the bbc and polymod studies. except for physical contacts, using bbc mixing matrices generally leads to more transmission in adult age-groups (particularly in ages over ), whereas with the polymod dataset school-children are largely responsible for transmission regardless of how we construct the overall matrix. the bbc pandemic study has the potential to provide extremely detailed insights into patterns of movements and social mixing in the uk, which will be valuable for understanding the dynamics of circulating infectious diseases as well as informing the prevention and control of new outbreaks. analysis of the full study dataset is ongoing to ensure that information relevant for epidemiological research can be made widely available, while also protecting participant privacy and anonymity. however, the emergence of the novel coronavirus disease covid- [ ] has created an urgent need for the best possible social mixing data to be made available to support the outbreak response, as well as for the possibility to use multiple sources of social mixing data to evaluate the uncertainty that stems from social mixing in evaluating public health interventions.
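the comparison of age-specific contributions across contact definitions, described at the start of this passage, amounts to taking the dominant eigenvector of a weighted sum of the physical and conversational matrices. the sketch below assumes that "all contacts" is the sum of the two matrices and that intermediate cases scale the conversational matrix by a weight between 0 and 1; the function name and the toy matrices are hypothetical.

```python
import numpy as np

def age_contributions(physical, conversational, weight):
    """Dominant eigenvalue and normalised eigenvector of physical + weight * conversational."""
    combined = np.asarray(physical, dtype=float) + weight * np.asarray(conversational, dtype=float)
    values, vectors = np.linalg.eig(combined)
    k = np.argmax(values.real)
    v = np.abs(vectors[:, k].real)
    return values[k].real, v / v.sum()

# Toy matrices for two age groups (illustrative values only).
physical = np.array([[2.0, 0.0], [0.0, 1.0]])
conversational = np.array([[0.0, 1.0], [1.0, 0.0]])

rho_physical, share_physical = age_contributions(physical, conversational, weight=0.0)
rho_half, share_half = age_contributions(physical, conversational, weight=0.5)
rho_all, share_all = age_contributions(physical, conversational, weight=1.0)
```

weights of 0, 0.5 and 1 correspond to physical contacts only, conversational contacts scaled by a half, and all contacts, respectively.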
it is therefore our hope that this detailed contemporary picture of age-specific mixing patterns will be of value to those modelling covid- to improve the evidence base for decisions on potential control measures in the uk, as well as suggesting broader insights into social mixing that may be relevant to other countries. the matrices we present can be directly incorporated into mathematical models of transmission to predict the dynamics of infection within and between key demographic groups and settings [ , , , ] . the large-scale nature of the bbc data presented, with over , users and over , contacts, made it possible to generate fine-scale age-specific social contact matrices across different contexts, and by week and weekend. these could be used to explore the impact of different intervention strategies that rely on social distancing to reduce the amount of mixing in the population (such as school closures and working from home) on flattening and postponing the peak of an outbreak. there were some notable differences between these matrices and those presented in the polymod study, which surveyed , people in great britain. in particular, polymod participants reported a higher overall mean number of contacts than bbc participants ( . compared to . people on average), although this difference is reduced when both datasets are calibrated to the same population structure. more than a decade has passed since the polymod study, and it is possible that the average number of contacts in the uk may have dropped. this is particularly evident in the reduction of teenage contacts, which could be the signature of a real change in how teenagers interact with one another, with their preferred way of interacting with friends shifting from face-to-face to social media over the last several years [ ] . while the mean number of conversational contacts reported by participants in the bbc study was higher than in polymod ( . and .
respectively), the mean number of physical contacts was significantly lower, and mostly limited to contacts at home. moreover, polymod by design oversampled children, whereas the bbc data oversampled adults, and hence may have captured more of the tail of the contact distribution in older age groups. incorporating social mixing patterns in different contexts and at different times of the week (weekend vs weekday) into mathematical models, it is possible to evaluate the potential effectiveness of a range of control measures targeting respiratory infections, including school closures [ , ] and social distancing [ ] . however, the precise combination of setting and type of contact that will be important for transmission will depend on the infection being considered. there is evidence that both physical contacts [ , ] and conversational contacts [ ] may be relevant for capturing the transmission dynamics of acute respiratory infections such as streptococcus pneumoniae and influenza a/h n p, and for influenza, there can be substantial transmission in households [ ] and schools [ , ] . how to weight the respective contributions of conversational and physical contacts to overall population transmission will depend on the pathogen. there are some additional limitations to the dataset presented here. first, children under the age of were excluded for ethical reasons, which meant there was a gap in the matrices for participants in this group. given the role of school-age children in transmission of many respiratory infections, we are missing important information on mixing in school-aged children. for a flu-like pathogen, this core group will be responsible for driving transmission in the wider population [ ] which can be seen from indirect effects observed in other age-groups by targeting the vaccination of children [ ] . 
in our previous work [ ] we filled the missing square sub-matrix (whose dimensions depend on the size of the age-groups chosen in the model), after making contacts reciprocal, with appropriate values from polymod; here we take the additional step of rescaling the missing sub-matrix so that the overall dominant eigenvalue of the matrix doesn't change. the missing data could also be interpolated from surrounding regions, assuming a log-binomial distribution of contacts. there are other possible biases as well: the day on which participants took part was not randomly assigned. instead, participants could choose for themselves when to run the app, which might have led some to choose a particularly 'interesting' day when they were going to meet a lot of people or travel somewhere unusual. in addition, the participants themselves were not sampled at random from the population but instead chose to take part. how they heard about the study might have varied: they may have been reached through social media in the drive to recruit participants before broadcast, after watching the bbc programme, or through hearing about it from friends, all of which could lead to selection bias. given the big social media exposure around the citizen science project, our participants were recruited largely in two time periods: in october after the launch of the app, and in march after the airing of the documentary 'contagion! the bbc pandemic', but the uneven recruitment of participants over time should not have much impact. there is evidence, at least among school-aged children, that social contact structure during term-time is relatively consistent over a period of several months [ ] . contacts can also change between term-time and school holidays [ ] and with the health status of participants; individuals typically make fewer social contacts when they have ili compared to a normal day [ ] .
it may therefore be necessary to combine the matrices presented here with other datasets to fully explore transmission dynamics over long periods and account for changes in behaviour according to health status. finally, by comparing the bbc mixing matrices to ones from polymod we show that there are important differences in age-specific contributions to transmission with school-children driving transmission in polymod, while in the bbc dataset adults are more responsible. the exception here is if mixing is driven by purely physical contacts when ages to are most responsible for transmission. these results have strong implications for control strategies (such as informing school closures) and using different underlying mixing patterns could lead to different policy recommendations. this illustrates the importance of using several sources of data for informing the age-specific mixing of the population to account for the uncertainty that stems from population mixing. ajk was supported by a sir henry dale fellowship jointly funded by the wellcome trust and the royal society (grant number /z/ /z). mlt was supported by the uk engineering and physical sciences research council (epsrc), grant number ep/n / . we are grateful to edwin van leeuwen for helpful discussions on polymod matrices. we would like to thank production, and in particular danielle peck and cressida kinnear, for helping to make the collection of this dataset possible, and all our study participants for giving up their time to contribute to this public science project. we are grateful to anne alexander and hugo leal for helpful discussions regarding the ethics considerations and data privacy.
doi: medrxiv preprint figure s : contact matrices during the week adjusted for reciprocal contacts. matrices are by physical contact only and by all contact (conversational and physical), in respective columns) and by different encounter context (home, work, school or other in respective rows). a) physical home, b) all home, c) physical work, d) all work, e) physical school, f) all school, g) physical other, h) all other. figure s : contact matrices during the weekend adjusted for reciprocal contacts. matrices are by physical contact only and by all contact (conversational and physical), in respective columns) and by different encounter context (home, work, school or other in respective rows). a) physical home, b) all home, c) physical work, d) all work, e) physical school, f) all school, g) physical other, h) all other. ons mid-year population estimates assessing optimal target populations for influenza vaccination programmes: an evidence synthesis and modelling study matrix population models : construction, analysis, and interpretation estimating the impact of school closure on influenza transmission from sentinel data role of social networks in shaping disease transmission during a community outbreak of h n pandemic influenza identifying human encounters that shape the transmission of streptococcus pneumoniae and other respiratory infections.
biorxiv on the definition and the computation of the basic reproduction ratio r in models for infectious diseases in heterogeneous populations measured dynamic social contact patterns explain the spread of h n v influenza who mixes with whom? a method to determine the contact patterns of adults that may lead to the spread of airborne infections fine-scale family structure shapes influenza transmission risk in households: insights from primary schools in matsumoto city targeted social distancing designs for pandemic influenza the transmission of staphylococcus aureus effect of mass paediatric influenza vaccination on existing influenza vaccination programmes in england and wales: a modelling and cost-effectiveness analysis. the lancet public health modelling the impact of local reactive school closures on critical care provision during an influenza pandemic clinical features of patients infected with novel coronavirus in wuhan, china. the lancet sparking" the bbc four pandemic": leveraging citizen science and mobile phones to model the spread of disease. biorxiv contagion! the bbc four pandemic-the model behind the documentary the contribution of social behaviour to the transmission of influenza a in a human population conlan. 
structure and consistency of self-reported social contact networks in british secondary schools contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys social contacts of school children and the transmission of respiratory-spread pathogens social contacts and mixing patterns relevant to the spread of infectious diseases comparative review of three cost-effectiveness models for rotavirus vaccines in national immunization programs; a generic approach applied to various regions in the world social media, social life: teens reveal their experiences contact network structure explains the changing epidemiology of pertussis a high-resolution human contact network for infectious disease transmission reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission the impact of illness on social networks: implications for transmission and control of influenza using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents generalized additive models: an introduction with r key: cord- -c sbqiyy authors: ivers, louise c; weitzner, daniel j title: can digital contact tracing make up for lost time? date: - - journal: lancet public health doi: . /s - ( ) - sha: doc_id: cord_uid: c sbqiyy can digital contact tracing make up for lost time? contact tracing is a fundamental public health intervention, and a mainstay in efforts to control and contain severe acute respiratory syndrome coronavirus (sars-cov- ), the virus responsible for the covid- pandemic. at the time of writing, the pandemic has caused more than million cases and more than deaths.
regions with the most successful containment to date have approached the pandemic with integrated measures that include cohesive leadership, effective communication, physical distancing, wearing of face coverings, improvements in the built environment, promotion of hand hygiene, and support for the staff, supplies, and systems needed to care for patients-with testing and contact tracing as cornerstones of the approach. despite the emergence of some promising therapies and work towards a future vaccine, basic public health approaches remain the best available prevention and control interventions at this time. along with efforts to expand conventional contact tracing programmes, there has been an ongoing debate about the value of digital contact tracing, ranging from issues of privacy, questions about efficacy, lower user adoption rates, and concern from some public health experts that mobile apps might distract resources from the core work of conventional contact tracing. yet, in the face of ongoing challenges in disease control, the question of whether digital technologies can supplement existing efforts is one that we cannot afford to ignore. in the lancet public health, mirjam kretzschmar and colleagues model key steps in sars-cov- testing and contact tracing across a spectrum of scenarios and identify opportunities to maximise the effectiveness of the process in reducing the effective reproductive number of covid- . the study is important as initial large-scale physical distancing policies are relaxed and movement of people increases. research into how and where to best invest in improving systems of contact tracing is essential, as even those areas with low case burdens will face ongoing transmission events and must be prepared to quell outbreaks as they occur. 
not surprisingly, the authors conclude that speed is of the essence in testing and isolating: the study finds that keeping the time between symptom onset and testing and isolation of an index case at days or less is imperative for success in reducing the reproductive number, and that rapid testing of symptomatic people is at least as important as the efficiency of contact tracing. this study adds to the literature on the role of contact tracing in covid- and highlights the need for adequate testing capacity. the authors also suggest a meaningful contribution to contact tracing from mobile apps, which might minimise notification and tracing delays, although they do not consider a hybrid approach combining conventional and mobile app-based contact tracing. the authors make several assumptions that might blunt the impact of their findings: they assume that index cases are isolated with no further transmission, yet household transmission has been reported as important even when contact tracing was in place; that all traced contacts, regardless of symptoms, are offered testing, yet capacity to test remains an important challenge in many areas; and that those testing negative (once) do not spread infection, yet this might be an over-simplification of the sensitivity of tests and the dynamics of infectiousness. the importance of these assumptions could be tested in future research and modelling efforts, as could an analysis of a hybrid approach where exposure notification is used to support conventional contact tracing rather than replace it, which seems more likely in practice. a limitation of the study is the lack of detail on the mobile app technology in the model. 
while the researchers focus on uptake and speed of notifications, two important parameters, there is a lack of discussion of the efficacy of an app in terms of its detector function (ie, the sensitivity and specificity of an app to determine if a contact event has occurred between two users) and its effector function (ie, the effectiveness in contributing to the desired public health actions by the user, such as entering self-quarantine). the conclusion of the researchers, therefore, that "app-based tracing on its own remains more effective than conventional tracing alone, even with % coverage, due to its inherent speed" seems premature without a more nuanced discussion of efficacy and of the potential challenges and harms of digital approaches. this is not to claim that mobile apps are lacking in promise, but they do remain unproven as a public health intervention. therefore, as jurisdictions around the world roll out exposure notification apps, there are crucial questions that must be investigated to understand the efficacy of these apps and to make adjustments necessary to build user trust and adoption, if they are to make a contribution to pandemic response. first, how well do smartphones measure proximity? in other words, what is the effectiveness of the detector function and how many false alarms might be expected for each true contact detected? second, how will mobile apps integrate with overall contact tracing programmes? good contact tracing offers not just an epidemiological intervention (quarantining enough individuals to reduce the reproductive number) but also a human one, that investigates outbreaks and understands linkages, and that recognises and addresses the challenges inherent in quarantine and isolation by providing a variety of supports. in our experience, success at this challenging endeavour requires public health workers as human beings to connect with a person, to build trust on a human level.
these vital dynamics are not captured in epidemiological models, nor can we expect that notifications provided by a mobile app will fill the place of the detective work and supportive human interventions at the core of contact tracing. third, what factors will encourage users to trust the privacy and security properties of mobile apps? current adoption rates are low in every jurisdiction where apps have been deployed, with most peaking at download rates of about % of the population, and little data available about actual usage levels, which are likely to be lower. mobile app user behaviour depends on a subtle trust-benefit ratio calculation by users that is challenging to predict in advance. what is behind the public's decision to use or avoid these apps? do they have privacy or security concerns or question the benefit of the service? do they trust public health authorities with their data and do they trust the authorities' pandemic response? fourth, how will mobile apps affect health equity? to be successful in addressing the pandemic, any contact tracing system-conventional or digital-should be evaluated within a health equity framework to avoid perpetuating the deep disparities that the global pandemic has so glaringly exposed. as contact tracing remains a crucial component of the covid- response, mobile apps offer promise, especially when considering the speed and scale required for tracing to be effective-as highlighted in kretzschmar and colleagues' study. however, understanding the potential impact of apps as part of a comprehensive integrated approach requires more evaluation of their use in real life and multidisciplinary engagement of technologists, epidemiologists, public health experts, and the public. djw reports grants from the mit-ibm watson artificial intelligence laboratory and the us centers for disease control and prevention. 
lci reports grants from the bill and melinda gates foundation and the national institutes of allergy and infectious diseases, outside of the submitted work. djw is co-principal investigator and lci is senior medical advisor of the private automated contact tracing initiative, which is a collaboration led by the mit computer science and artificial intelligence laboratory, the mit internet policy research initiative, massachusetts general hospital center for global health, and mit lincoln pharmacologic treatments for coronavirus disease (covid- ): a review developing covid- vaccines at pandemic speed impact of delays on effectiveness of contact tracing strategies for covid- : a modelling study epidemiology and transmission of covid- in cases and of their close contacts in shenzhen, china: a retrospective cohort study months into virus crisis, us cities still lack testing capacity variation in falsenegative rate of reverse transcriptase polymerase chain reaction-based sars-cov- tests by time since exposure using bluetooth low energy (ble) signal strength estimation to facilitate contact tracing for virus-tracing apps are rife with problems. governments are rushing to fix them privacy tipping points in smartphones privacy preferences key: cord- -dx bbeqm authors: simmhan, yogesh; rambha, tarun; khochare, aakash; ramesh, shriram; baranawal, animesh; george, john varghese; bhope, rahul atul; namtirtha, amrita; sundararajan, amritha; bhargav, sharath suresh; thakkar, nihar; kiran, raj title: gocoronago: privacy respecting contact tracing for covid- management date: - - journal: j indian inst sci doi: . /s - - - sha: doc_id: cord_uid: dx bbeqm the covid- pandemic is imposing enormous global challenges in managing the spread of the virus. a key pillar to mitigation is contact tracing, which complements testing and isolation. digital apps for contact tracing using bluetooth technology available in smartphones have gained prevalence globally. 
in this article, we discuss various capabilities of such digital contact tracing, and its implications for community safety and individual privacy, among others. we further describe the gocoronago institutional contact tracing app that we have developed, and the conscious and sometimes contrarian design choices we have made. we offer a detailed overview of the app, backend platform and analytics, and our early experiences with deploying the app to over users within the indian institute of science campus in bangalore. we also highlight research opportunities and open challenges for digital contact tracing and analytics over temporal networks constructed from them. social distancing: social distancing is the practice of maintaining physical distance between individuals to prevent the spread of face-to-face communicable diseases. a . - m distance is recommended for covid- . contact tracing: contact tracing is the process of identifying people who might be at risk due to physical interactions with a disease carrier. contagious viral diseases such as the sars-cov ( ), h n ( ), mers-cov ( ), and sars-cov- ( ) have resulted in global epidemic outbreaks and placed a massive burden on public health systems around the world. these pandemics have cascading effects that result in irreparable consequences to economies and quality of life. the recent sars-cov- or covid- pandemic has triggered national and regional lockdowns across the world to curb the spread of the virus. with incubation periods that last days and with a significant fraction of asymptomatic carriers, the proliferation of the disease has been hard to detect and localize. further, testing of populations at a large scale has proved challenging due to the limited availability of testing kits, well-trained health-care professionals, and funds in emerging economies. to tackle this problem, governments and health workers use contact tracing of infected individuals to identify those who may have come in contact with them, also called primary contacts.
these primary contacts are then quarantined and/or tested depending on their symptoms. testing, tracing, and isolation form essential components of covid- management, besides preventive measures like wearing masks, practising social distancing, and washing hands. traditional methods of contact tracing are often laborious and may be erroneous due to recall biases. also, human activity patterns often involve interactions with strangers, especially when travelling, which makes it difficult to identify contacts using traditional methods. as a large fraction of the population owns smartphones, countries around the world, including india, have attempted to use digital contact tracing. mobile apps that use bluetooth technology are deployed to record close interactions between users. these bluetooth low-energy (ble) apps typically advertise a unique device id, which can be recognized by other nearby devices with the app that scan for and save these advertised ids, also called contacts. this information is typically stored on the local device; if a user tests positive, their bluetooth contacts are uploaded to a central database and their contacts are alerted. this can dramatically reduce the time required for contact tracing from days to potentially hours, thereby mitigating the spread of the virus. examples of such national-scale apps include aarogya setu in india, tracetogether in singapore, covidsafe in australia, covid alert in canada, corona-warn-app in germany, etc. j. indian inst. sci. | vol xxx:x | xxx-xxx | journal.iisc.ernet.in however, there are limitations to digital contact tracing.
these constraints include the low reliability and asymmetry of bluetooth technology in detecting nearby users; the low accuracy of the estimated proximity distance between users, which makes it hard to distinguish nearby users from those farther off; the high degree of adoption required for digital contact tracing to be effective; and the inability to locate secondary and tertiary contacts until the primary and secondary contacts test positive, respectively. it is hence still important to complement digital contact tracing with manual methods. in this article, we describe gocoronago (gcg), a digital contact tracing app for institutions, which attempts to address these limitations. a key distinction of our approach is to collect the contact trace data of devices into a centralized database, continuously, irrespective of if or when a person is diagnosed as covid positive. this proximity data of all app users is used to build a temporal contact graph, where vertices are devices, and edges indicate proximity between devices for a certain time period and with a certain bluetooth signal strength. this approach has several benefits. when a gcg user tests positive for covid- , we use graph algorithms to rapidly identify primary, secondary, and other higher-order contacts, based on who guidelines. further, even if the bluetooth scans were missed by the infected user, successful scans by other proximate devices can be used to alert the relevant contacts, increasing the reliability of detection. in addition, centralized digital contact tracing has the potential to estimate the state of the population using network-based seir models, which can be used to assign risk scores and prioritize testing. of course, centralized contact data collection has its downsides, primarily the privacy implications of tracking the interactions between a large number of individuals. we take several precautions to mitigate this.
one, the app is designed for deployment only within institutions and closed campuses, and not at a city, regional, or national scale. the data collected are owned by the host institution and not by a central authority. two, users do not have to share any personal information, and devices are identified using a randomly generated id. sharing gps location or their phone number is voluntary and through opt-in. last, deanonymization of data is limited to covid- contact tracing and, by design, requires multiple entities to cooperate, and is overseen by an advisory board with a broad representation from the institution. we discuss these pros and cons in more detail later. besides a centralized data collection approach, we also conduct experiments to understand the impact of various smartphone devices and the environment on the bluetooth signal strength to better ascertain the proximity between devices. we also send proactive messages for users to enable custom bluetooth settings in their smartphones to improve reliability. the use of the gcg app within an institutional setting, with data collection and usage governed by the organization, may lead to higher adoption of the app and enhance its effectiveness in contact tracing. this article examines the design rationale, architecture, and our experience in deploying the gocoronago digital contact tracing app as part of a pilot at the indian institute of science (iisc). it also discusses the challenges and opportunities in improving the utility of digital contact tracing. the rest of the article is organized as follows: in sect. , we review digital contact tracing and provide an overview of a few popular covid- apps. section provides details of the app design and the backend architecture. in sect. , we describe various analytics, including temporal contact network algorithms, for contact tracing, and for providing feedback to app users. finally, sect. 
summarizes our experience with deploying the app at iisc and highlights some of the opportunities and challenges of digital contact tracing. background and related work . contact tracing infectious diseases that spread through person-to-person interactions can be contained by tracking their sources and quarantining the individuals who are or may be affected. this is typically done using physical interviews, which try to determine the places visited and the people met by the patient. in some cases, the location history of the patients is shared by cities and public health agencies on websites and mobile apps to allow others who were in the vicinity at that time to take precautions. this form of contact tracing relies heavily on one's memory, and collecting such data manually is cumbersome. contact tracing is crucial, especially for viruses such as the sars-cov- that exhibit high transmission rates, low testing rates, long incubation times, and a significant fraction of asymptomatic carriers, who could infect other susceptible individuals. digital contact tracing, on the other hand, involves the use of technology to keep track of the individuals who came in close proximity with each other. it has been shown to be effective in preventing the spread of communicable diseases in livestock, but experiments involving human populations have been limited. the scale at which covid- has spread has led to the use of bluetooth and gps-based contact tracing applications on mobile phones. such apps help individuals automatically keep a record of the places they visited and the people they met, along with the timestamps. this permits us to build contact neighborhoods that can be used to alert or quarantine the concerned individuals and identify potentially risky interactions. most digital contact tracing (dct) apps for covid- rely on bluetooth technology available on smartphones.
in addition, a few apps collect the gps location of users. the rapid spread of the covid- virus has led to the development of a variety of smartphone apps around the world, which are variants on this theme. examples include both national apps like aarogya setu (india), nhsx (uk), and covid safe (australia), as well as those proposed by institutions, like novid (cmu) and safepaths (mit). a review of contact tracing apps can be found in , , , , and their features are contrasted in table . at a broad level, these apps scan and advertise for bluetooth signals and record the timestamp, along with the signal strength or the received signal strength indicator (rssi), reported in decibel-milliwatts (dbm) in android. the rssi values are negative and higher when the devices are close to each other. translating the bluetooth rssi to proximity distances for contact tracing is not straightforward since it depends on numerous factors such as the phone hardware, drivers, operating system, ability to run continuously in the background, and interference due to surfaces. yet, they have been widely attempted and deployed because of its potential advantages over manual contact tracing. in fact, to address some of the interoperability issues across android phones and iphones, google and apple have even introduced an exposure notifications (gaen) protocol into their os as part of their covid- response . the bluetrace protocol used by apps in singapore and australia is another popular standard. europe has two competing contact tracing standards that are being refined, decentralized privacy-preserving proximity tracing ( dp t) and pan-european privacy-preserving proximity tracing (pepp-pt) . the bluetooth special interest group (sig) is also working on a contact tracing standard for wearables . such protocols help with mobility across national boundaries, avoid having to install multiple apps, and in the development of custom, yet interoperable, apps. 
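to make the difficulty of translating rssi into distance concrete, the following is a minimal sketch using the standard log-distance path-loss model. it is not the method used by gcg or by any of the apps named above; the function names, the reference rssi at 1 m, the path-loss exponent, and the 2 m threshold are all hypothetical values chosen for illustration. in practice the first two parameters vary with phone hardware and environment, which is exactly why the mapping is unreliable.

```python
def estimate_distance_m(rssi_dbm: float,
                        rssi_at_1m_dbm: float = -60.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Estimate distance in metres from an RSSI reading in dBm.

    Log-distance path-loss model:
        rssi = rssi_at_1m - 10 * n * log10(d)
    solved for d. Both calibration parameters are assumptions; real
    values depend on the phone model, its orientation and the room.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def is_close_contact(rssi_dbm: float, threshold_m: float = 2.0, **calibration) -> bool:
    """Return True if the estimated distance is within threshold_m.

    threshold_m is a hypothetical cut-off, not a value from the article.
    """
    return estimate_distance_m(rssi_dbm, **calibration) <= threshold_m


if __name__ == "__main__":
    for rssi in (-55.0, -66.0, -80.0):
        d = estimate_distance_m(rssi)
        print(f"rssi={rssi} dBm -> ~{d:.2f} m, close={is_close_contact(rssi)}")
    # the same reading maps to a much shorter distance in a cluttered room
    # (higher path-loss exponent), illustrating the calibration problem
    print(estimate_distance_m(-80.0, path_loss_exponent=3.0))
```

since rssi values are negative and closer to zero when devices are near each other, a stronger (higher) reading yields a smaller estimated distance. changing the path-loss exponent from 2 to 3 moves the estimate for the same reading considerably, which is one way to see why a fixed rssi threshold cannot be trusted across devices and environments.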
besides smartphone-based apps, others have also developed hardware devices such as the tracetogether token that uses bluetooth, but operates independently of a phone, or wearables like wristwatches that can track the location using gps. in addition to bluetooth, a few apps like novid also broadcast ultrasound signals using a phone's speakers, and other apps in the vicinity detect them using their microphone. there have also been other digital apps such as the nz covid tracer that use qr codes for users to check in when they enter specific locations. besides contact tracing, digital tools have also been used to track symptoms among populations to identify emerging "hotspots" and for health professionals and volunteers to coordinate their response. however, the global adoption of contact tracing apps is low. the percentage of the population who have installed such apps has struggled to go past %, even among developed countries where a majority of the individuals have smartphones. while there is debate on the minimal adoption rate required for contact tracing apps to have a tangible effect, some use is better than none and more is better. in particular, higher adoption rates in dense neighborhoods can highlight the effectiveness of tracing since the risk of spreading the infection is greater in closely-knit communities. table : comparing gcg features with other covid contact tracing apps, as on sep . there are a number of ways in which one can design such digital contact tracing apps. these offer different trade-offs in terms of individual privacy and the health and safety of the community. the target of the app may be for national/regional use or institutional use. while national-scale contact tracing apps potentially offer greater ability to manage the pandemic, they also carry greater risks of data leaks and misuse.
further, a high degree of adoption at such large scales is challenging, limiting the usefulness of the app for contact tracing. apps deployed at an institutional scale can be better targeted to the audience and offer better uptake due to the fact that the data are managed at the organizational level. institutions can also respond more rapidly based on insights provided by the app. but they are less effective when users are moving outside the confines of campuses and interacting with the broader population. for example, aarogya setu and tracetogether are national apps, while gocoronago, novid, and covid watch are designed for institutions. the use of the app may be voluntary or mandatory. some countries like china have made such apps mandatory for all residents, or for those meeting certain requirements such as travelers. even organizations may make such national or institutional apps mandatory within their premises. but most countries and institutions tend to keep the use of such apps voluntary. further, the use of the collected data for contact tracing may also be voluntary or mandatory. if voluntary, there is an explicit opt-in by the individual who is tested covid positive or is quarantined, before contact tracing using their data can be initiated. alternatively, there may be rules in place that allow the government or institutions to use any proximity data that are available with them, without additional consent from infected users. an explicit consent helps address concerns of social stigma around covid patients. the use of gcg is strictly voluntary, and an additional consent is required from a user who is infected with covid- before their data can be used for contact tracing, even though their data are already available centrally in the backend. apps may collect identifiable, strictly anonymous, or pseudo-anonymous information as part of contact tracing.
some apps like singapore's tracetogether compulsorily require the contact details and/or a national identification number to be shared when installing the app. this makes it quicker to reach out to users during contact tracing, but also heightens the risk of misusing the data for the surveillance of specific individuals and can lead to a significant loss of privacy if the data are breached. in a strictly anonymous setting, no personal information of the user is collected, and they are only identified by a random id, which itself may also be changed (or "rotated") periodically. a set of such ids may be provided by a central server (tracetogether) or generated locally by the app. during contact tracing, the user's app is alerted and they have the option of voluntarily responding by contacting the health center or a government agency. if the user uninstalls the app, it may be impossible to do contact tracing. a hybrid approach of pseudo-anonymization ensures that the contact trace data themselves are anonymous, but the information required for de-anonymization is available with a trusted independent authority whose consent is required (optionally, with a consent from the infected individual) to identify the users relevant for contact tracing. gcg adopts this hybrid model, which balances the privacy of users while also enabling rapid and reliable outreach during contact tracing. the contact tracing data may be kept de-centralized, semi-centralized, or centralized. if de-centralized, the bluetooth device ids observed by a user's app are stored locally on the device. when a user tests positive for covid- , they can inform a backend service of their device id (potentially, multiple ids, in case of id rotation) and their status. the backend periodically relays a list of device ids associated with covid positive individuals to all apps, which is then used by each app to verify if its user came in contact with a covid positive person.
this is used by pact and google-apple exposure notification (gaen) framework . in a semi-centralized approach, a mapping between an app and its device id is maintained centrally, but the contact trace data remains locally on the device. on testing positive, a user may choose to (or be required to) upload the contact trace data for the recent past to a backend service, which then sends notifications to these primary contact devices asking them to quarantine or get tested. examples of this approach include bluetrace and aarogya setu . however, aarogya setu also allows users to voluntarily upload their bluetooth contact data to central servers at any time to get an estimate of other high-risk users in the vicinity. last, in a centralized approach, both the mapping of apps to device ids as well as their contacts are sent to a backend service periodically. when a user reports themselves as covid positive, contact tracing can be initiated on the centralized data already available, optionally after an additional consent. gcg adopts this model. this variant is relatively intrusive, but arguably has advantages that may justify its use. one, contact data from both the infected and the proximate users can be combined to increase the reliability of contact tracing. two, even if users uninstall the app, if the data collected are personalized or is de-anonymizable, then contact tracing can still happen over the backend data for the period during which the app was kept installed. three, not just primary but even secondary and tertiary contact tracing, can be performed rapidly. and four, having a centralized model allows us to perform temporal analytics on a global contact network. this can help identify high-risk individuals for prioritizing preventive, testing and (future) vaccination strategies, and infer the health of the user population. bluetooth data provide the relative interaction between proximate users but in itself does not reveal the spatial location of users. 
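the advantages of the centralized model listed above (combining scans from either device, and tracing secondary contacts without waiting for primary contacts to test positive) can be illustrated with a small sketch over a temporal contact graph. this is an illustrative assumption of how such a traversal might look, not the algorithm used in the gcg backend; the `Contact` record, the `trace_contacts` function, and the duration and rssi thresholds are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Contact(NamedTuple):
    """One proximity record between two devices (hypothetical schema)."""
    a: str        # anonymous device id that reported or was seen in the scan
    b: str        # the other device id
    start: int    # contact start, epoch seconds
    end: int      # contact end, epoch seconds
    rssi: float   # representative signal strength in dBm


def trace_contacts(contacts: Iterable[Contact], infected: str, since: int,
                   max_level: int = 2, min_duration: int = 300,
                   min_rssi: float = -80.0) -> Dict[str, int]:
    """Return {device_id: contact level} for devices reachable from `infected`.

    Level 1 = primary contact, level 2 = secondary, and so on. The walk is
    time-respecting: a device can only pass exposure on through contacts
    that end after the moment it could first have been exposed.
    min_duration and min_rssi are illustrative thresholds only.
    """
    # undirected adjacency: a scan recorded by either device creates the edge,
    # so a scan missed by one phone is covered by the other phone's record
    adj = defaultdict(list)
    for c in contacts:
        if c.end - c.start >= min_duration and c.rssi >= min_rssi:
            adj[c.a].append((c.b, c.start, c.end))
            adj[c.b].append((c.a, c.start, c.end))

    level = {infected: 0}
    earliest = {infected: since}   # earliest possible exposure time per device
    frontier = [infected]
    for depth in range(1, max_level + 1):
        reached = {}
        for u in frontier:
            for v, start, end in adj[u]:
                if v in level or end < earliest[u]:
                    continue       # already classified, or contact was too early
                t = max(start, earliest[u])
                if v not in reached or t < reached[v]:
                    reached[v] = t
        for v, t in reached.items():
            level[v] = depth
            earliest[v] = t
        frontier = list(reached)

    del level[infected]
    return level


if __name__ == "__main__":
    log = [
        Contact("A", "B", 1100, 1500, -60.0),  # A meets B after A became infectious
        Contact("B", "C", 1600, 2000, -70.0),  # B meets C afterwards
        Contact("B", "D", 600, 1000, -65.0),   # B met D before B could be exposed
        Contact("A", "E", 1200, 1600, -95.0),  # signal too weak, ignored
    ]
    print(trace_contacts(log, infected="A", since=1000))
```

in this toy log, device d is excluded even though it met a primary contact, because that meeting ended before the primary contact could have been exposed. a de-centralized design cannot run this kind of multi-hop query, since no single party holds the edges between devices b, c, and d.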
while this may disclose interaction patterns between (anonymous) users, which is necessary for contact tracing, correlating these patterns with particular individuals is not possible without additional out-of-band knowledge about them. some contact tracing apps may also collect gps data (covid safepaths) and data from beacons or qr codes (novid) that may reveal the absolute spatial location of the users. collecting spatial location has some benefits. the coronavirus may be transmitted through surfaces or remain suspended in the air, and thereby be passed on to others who were never near an infected user but visited the same location soon after. bluetooth-based proximity will miss such users. also, gps data collection may be more reliable than bluetooth. however, gps is not precise enough to be useful for identifying proximity between users. furthermore, tracking the spatial movements of users continuously can have serious privacy consequences. bluetooth beacons and qr codes placed at well-known locations can also provide such spatial information, but are limited to places where the beacons or codes are deployed. gcg allows users to optionally share their gps data through an explicit opt-in and also allows the selective use of beacons deployed by institutions.
last, we need to consider the duration for which the collected data, whether centralized or decentralized, are retained. this needs to be explicitly stated by the apps for transparency. the more data that are collected and the more personalized they are, the greater the consequences of retaining them longer, especially in a centralized or semi-centralized setting. typically, the contact trace data themselves are useful only for roughly days after they are collected, since this duration is typically the outer time-window of transmission of the virus. also, there should be clarity on how long the data are retained after a user uninstalls the app.
gcg deletes a user's phone number, the only personal data they may share, from its backend within months of them uninstalling the app. the anonymized contact trace data are retained for future research purposes, as per the rules set out by the institute human ethics committee (ihec).
the gocoronago (gcg) contact tracing platform consists of a smartphone app and backend services for data collection, management, and analysis. the app is designed for covid- operations and management within an institution and is also proposed as a research project governed by the institute human ethics committee (ihec). the design and technical details of the app and the backend services are described in this section. a high-level design is illustrated in fig. .
qr code: a quick response (qr) code is a -d barcode standard which serves as a machine or device readable label that encodes information. smartphones can use their cameras to take a picture of the qr code, and apps or libraries can extract the information present in it. examples of such information include an identifier, a physical location, or a url to a website.
beacon: a beacon is a compact device that can be configured to continuously broadcast an identifier and some custom data as part of a bluetooth signal. other bluetooth-enabled devices can detect these signals to get information, typically specific to the location of the beacon.
the gcg app is limited for use by authorized institutions. since not all institutions may have a private/enterprise app store for their organizations, hosting the app in the public google play or apple app store is convenient. users at authorized institutions are provided with individual invitation codes by a separate entity within the institution, typically the information technology (it) office. the it office also maintains a mapping from the user's unique invite code to the actual individual to whom the code was provided, along with their contact details, as shown in fig. .
this mapping from the individual to their invitation code is later used by the it office during contact tracing, as described in sect. . . the user can download the gcg app from the google play store or from an institutional download link. during installation, users enter this invite code into the app, which submits and validates it with the gcg backend servers and is returned a unique id, a device id, and a pin. the gcg backend maintains the mapping from the invite code to the unique id for the installed device. the invitation code can only be used once by the user for the first installation. to allow future re-installations, a pin is generated for this invitation code and is shared with the user. optionally, the user may provide their one-time password (otp)-verified phone number during installation, which is recorded in the backend. this number can be used along with the pin to reinstall the app in the future, in place of the one-time-use invite code. last, a device id in the form of a random bit uuid is generated by the backend for each re/installation on a phone, and a mapping is maintained from the unique id to the device id, along with the creation timestamp. this device id will be broadcast as part of the bluetooth advertisement (fig. ) . both the invite code to unique id and unique id to device id mappings are used during contact tracing (sect. . ) . a final piece of information collected from the app during re/installation is the make and model of the phone. as we discuss later, this is vital for interpreting the bluetooth signal strength and translating it into a distance estimate. these identifiers are designed to maintain the anonymity of users from the gcg app and backend, enable de-anonymization of contact users upon an authorized request for contact tracing, and ensure that the app can be re/installed by authorized users. 
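the chain of identifiers described above (invitation code, unique id, device id) can be illustrated with a minimal python sketch. this is not the gcg implementation: the class and function names, the in-memory dictionaries standing in for database tables, and the six-digit pin are all assumptions made for illustration, and the creation timestamp, phone number, and reinstallation path are omitted.

```python
import secrets
import uuid
from dataclasses import dataclass, field


@dataclass
class Backend:
    """In-memory stand-in for the backend mappings (hypothetical structure)."""
    valid_invites: set                                     # codes issued via the IT office
    invite_to_unique: dict = field(default_factory=dict)   # invite code -> unique id
    invite_to_pin: dict = field(default_factory=dict)      # invite code -> pin
    unique_to_devices: dict = field(default_factory=dict)  # unique id -> [device ids]

    def register(self, invite_code):
        """First installation: an invitation code is accepted only once."""
        if invite_code not in self.valid_invites:
            raise ValueError("unknown invitation code")
        if invite_code in self.invite_to_unique:
            raise ValueError("invitation code already used")
        unique_id = secrets.token_hex(8)          # assumed format
        device_id = str(uuid.uuid4())             # random UUID broadcast over Bluetooth
        pin = f"{secrets.randbelow(10**6):06d}"   # assumed six-digit pin
        self.invite_to_unique[invite_code] = unique_id
        self.invite_to_pin[invite_code] = pin
        self.unique_to_devices[unique_id] = [device_id]
        return unique_id, device_id, pin


def deanonymize(device_id, backend, it_office):
    """Resolve a device id to a person; needs BOTH the backend and IT-office maps."""
    for unique_id, devices in backend.unique_to_devices.items():
        if device_id in devices:
            for invite, uid in backend.invite_to_unique.items():
                if uid == unique_id:
                    return it_office.get(invite)
    return None
```

in this sketch the backend never sees who holds an invitation code, and the it office never sees a device id, so resolving a device id to a person requires data held by both parties.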
the sandboxing and indirection of these identifiers ensure that no single entity (the it office, a gcg user, or the gcg backend) can independently find the identity of any (other) user and their trace.
a key tenet of gcg is transparency. the installation process in the gcg app has disclosures on the legal terms and conditions for the use of the app, and on how the data collected will be used. in addition, there is also a multi-lingual informed consent, as required by ihec, which clearly documents the scope of the research project, potential benefits and downsides, voluntary participation, etc.
the gcg app uses bluetooth low energy (ble) signals to detect other proximate phones running the app. the ble wireless protocol is ubiquitous among smartphones sold within the last years. it enables low-power, short-range wireless communication and is intended for applications in fitness, smart homes, healthcare, beacons, etc. its maximum range is < m, though this is affected by environmental conditions and transmitting power, and ≈ m is the typical range. ble devices use an advertising and scanning protocol to discover each other and establish a connection. when acting as a server, a device advertises one or more services that it supports, which are identified by service assigned numbers; when acting as a client, it finds servers to connect to based on the advertised service assigned numbers. a single device may advertise multiple services, and it can include a custom payload such as a service name. also, the ble advertisement is broadcast in an open channel, which any nearby ble client can discover. besides standard bit service numbers that are registered and pre-defined for specific types of services, applications can also generate and use bit uuids for custom services they provide. once discovered, clients can establish a network connection with the service to perform additional operations such as data exchange.
the gcg app acts as both a client and a server when using the scanning and advertising capabilities of ble, respectively. specifically, it advertises two service assigned numbers, x , which represents a generic access service, and another custom service whose assigned number is the unique device id for a particular app installation. this advertisement is broadcast continuously. as a client, the gcg app scans for secs every minute for advertisements that contain the service number x . if found, it extracts and records the device id that is sent as a secondary service number in the same advertisement. piggy-backing the device id as a service assigned number rather than a custom payload takes fewer bytes, which in turn can reduce the power consumption for the advertisement. as part of the scanning, the gcg app also retrieves the received signal strength indicator (rssi), which is the strength of the ble signal that is received by the app. as we discuss later, this can be used to estimate the proximity distance. the gcg android app uses the default ble settings for broadcasting its advertisements, which translates to ble broadcasts every sec at a medium transmission power level. also, the app consciously does not establish a connection with apps on another device; the device id is broadcast to any ble device that is in the vicinity. in fact, we explicitly set the connectable flag in the advertisement to false. this enhances security by avoiding malicious content from being transferred. while such proximity tracking is helpful for contact tracing of individuals who were spatiotemporally co-located, this does not address situations where two users shared the same space, such as an atm, mess dining hall, or campus grocery, but for a short time apart. 
since covid- can be transmitted through surfaces and can linger in the air for some time, it is beneficial to identify users who were in the same location but not at the same time, especially for locations with a lot of footfall. the gcg app allows users to voluntarily share their gps location information with the backend. this is disabled by default. if enabled by the user, the gps location is retrieved and uploaded to the backend every mins, and buffered for retries. since the sharing of gps location is strictly voluntary, gcg also supports the selective use of beacons installed by institutions at such high-risk spaces. these beacons behave like a gcg app that passively advertises its device id, and the smartphone app can scan for and record the beacon's id, just as it would detect another gcg smartphone's device id. specifically, we use the ibeacon protocol from apple. the beacon transmits a static gcg uuid as its service number, x c as the manufacturer id for the protocol, and a major and minor version number to uniquely identify itself. the gcg app scans for the static service number, filters results based on the manufacturer id, and retrieves the major and minor version numbers. the app encodes these version numbers into a template uuid to form a unique device id for that beacon and adds it to its proximity trace.
during each scan, the proximity data collected consist of zero or more device id(s) and the corresponding rssi values that were discovered at that timestamp. performing a service call to send these data to the backend servers consumes power and bandwidth on the phone. instead of sending these data after each scan, we buffer them in a sqlite database on the phone and periodically send the buffered data to the backend in a single batch. this transmission interval is set to mins. this type of batching amortizes the power and network costs across scans, while ensuring the freshness of the data available at the backend.
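the buffer-then-batch pattern described above can be sketched in python using the standard sqlite3 module. this is only an illustration under stated assumptions: the table layout, the `ScanBuffer` class, and the `send` callback are hypothetical, and the real app runs on android rather than in python. rows are deleted only after a successful send, so a failed transmission leaves the data in place for the next interval.

```python
import sqlite3


class ScanBuffer:
    """Local buffer for scan results, flushed to the backend in batches."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS scans ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "ts INTEGER, device_id TEXT, rssi INTEGER)"
        )

    def add_scan(self, ts, contacts):
        """Buffer one scan; an empty scan is kept as a row with NULL device."""
        rows = [(ts, dev, rssi) for dev, rssi in contacts] or [(ts, None, None)]
        self.db.executemany(
            "INSERT INTO scans (ts, device_id, rssi) VALUES (?, ?, ?)", rows
        )
        self.db.commit()

    def flush(self, send):
        """Send every buffered row in one call; return the number of rows sent."""
        rows = self.db.execute(
            "SELECT id, ts, device_id, rssi FROM scans ORDER BY id"
        ).fetchall()
        if not rows:
            return 0
        try:
            send([row[1:] for row in rows])   # one service call for the batch
        except OSError:
            return 0                          # offline: keep rows for a later retry
        self.db.execute("DELETE FROM scans WHERE id <= ?", (rows[-1][0],))
        self.db.commit()
        return len(rows)
```

a caller would invoke `add_scan` after each bluetooth scan and `flush` once per transmission interval.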
buffering is also beneficial when internet connectivity is intermittent. if the proximity data cannot be sent to the backend, the buffered data are retained on the device and a resend attempt is made in the next transmission interval. given that this is the most frequent service call to the backend, we use a compact binary serialization to represent the proximity data sent to the backend, unlike the other services which use json. the gcg app needs to run in the background all the time for effective bluetooth advertising, scanning, and proximity data collection. however, the heterogeneity of smartphone models and the limitations of their os means that this advertising and scanning may not be reliable. to identify issues with specific device models and app installations, and verify if the app is running, we collect and report liveliness telemetry statistics to the backend every hour. these include a count of ble scans performed, ble scans failed, gps scans, gcg users and beacons detected, and contact buffer size; bluetooth and gps enabled status, bluetooth and gps permission flags, battery level, app version, etc. these statistics also help us in understanding the aggregate usage of the gcg app within an institution. besides tracking bluetooth contact data, the gcg app offers several features to inform the users about covid- and engage them in preventing its spread. screenshots of these ui elements are shown in fig. . key among these is a proximity alert, wherein a notification is triggered on the smartphone if or more users (configurable) were detected within a ≈ m distance during the last bluetooth scan. this acts as a warning to users in case they inadvertently overlook social distancing. as discussed later, the m distance threshold is just an estimate based on the rssi. the alert is also triggered only once an hour (configurable) to avoid saturating the user. 
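the proximity alert rule above (enough nearby users in the last scan, and at most one alert per hour) can be written as a small python sketch. the `ProximityAlerter` name and the rssi cut-off passed in by the caller are assumptions; the source states only that the distance threshold is estimated from rssi and that both the user count and the alert interval are configurable.

```python
from dataclasses import dataclass


@dataclass
class ProximityAlerter:
    rssi_threshold: int                    # assumed RSSI cut-off standing in for the distance
    min_users: int = 3                     # alert needs at least this many nearby users
    cooldown_s: int = 3600                 # at most one alert per hour
    last_alert_ts: float = float("-inf")   # time of the previous alert, if any

    def on_scan(self, ts, contacts):
        """Return True if this scan should raise an alert; contacts = [(device, rssi)]."""
        near = sum(1 for _dev, rssi in contacts if rssi >= self.rssi_threshold)
        if near >= self.min_users and ts - self.last_alert_ts >= self.cooldown_s:
            self.last_alert_ts = ts
            return True
        return False
```

requiring several nearby users before alerting also limits how much a single alert reveals about any one neighbor.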
in addition, users can visualize a plot of the hourly count of contacts segregated by the duration of contact within the hour, e.g., < mins , − mins and > mins (fig. b) . this gives them a sense of their interaction pattern for the past hours. similarly, we also display the number of scans performed each hour for the past h (fig. c) . this can help identify issues with bluetooth scanning on specific phones, and prompts the user to take corrective measures. a summary of the number of scans completed per day is also shown as a progress bar to motivate users to hit or more of the possible min scans (fig. a) . j. indian inst. sci. | vol xxx:x | xxx-xxx | journal.iisc.ernet.in these local analytics within the app are complemented by aggregate analytics performed in the backend and are shared through the app each day. these include the social distancing score, user density heatmap for neighboring locations, and a visualization of the contact network neighborhood. these are described later in sect. . a unique aspect of the app is that the set of remote analytics available can be dynamically changed without having to update the app. in the future, this can also be used to push forms and conduct surveys from within the app. importantly, none of the analytics provided to users reveals the identity of other users or even their device ids, to protect their privacy. for example, the hourly contact bars only report the aggregate counts of nearby devices and cumulative duration of interaction at different distances, while the proximity alert is triggered only if at least three users are nearby to prevent fine-grained estimates of the number of gcg users from being revealed. last, we also provide helpful information to educate users about covid- . these include a plot of the positive, recovered, and deceased cases across time in india, and in the local state, and a map of the current positive cases at the state and district level. 
in addition, we also share "let's control covid" and "curious about covid?" infographics as app alerts each day, which suggest precautions, debunk myths, and offer scientific information (fig. f). these are sourced from public health and science resources such as who, the covid gyan initiative from iisc-tifr, and indian scientists' response to covid- .
the features described here are largely applicable to gocoronago v . on android smartphones. gocoronago v . is a lighter version available for ios with features limited to advertising, scanning, and receiving alerts. this is due to the limited number of iphone users on the academic campus. there are other os and device-specific issues as well that we encountered and addressed in various iterations of the app. while we were initially using wildcard filters when performing bluetooth scans for service numbers on the android app, we noticed that certain phone models, such as those from samsung, did not reliably perform such scans. this led us to adopt the x approach. continuous bluetooth advertisement and scanning in the background is challenging in android, and virtually impossible in ios. smartphone brands with custom android builds, such as xiaomi, oppo, vivo, etc., do not always support the recommended practice of executing such applications as a foreground service with a persistent, ongoing notification. as a result, users are forced to change the android battery usage settings and/or autostart permissions for the gcg app, which are brand and even model specific. the absence of reliable scanning and advertising defeats the key purpose of the app. we provide local analytics and alerts to help users address such issues. further, android requires users to enable gps to even perform continuous bluetooth scanning, as a way to indicate to users that their location may be revealed indirectly, say, through beacons at well-known locations.
but requiring gps to be on, even though the app does not collect the gps location without opt-in, confuses users and may lead to privacy concerns. on ios, the problems with background bluetooth advertisement and scanning are well documented and stem from apple's restrictive policies. the ios gcg app is effective when it is in the foreground and the user is viewing it. however, when the user is not actively using the app or the phone is locked, the app can scan for other devices that are advertising, but it cannot advertise. as a result, there need to be other android or active ios gcg devices nearby for contacts to be recorded, colloquially referred to as "android herd immunity".
besides technical challenges, there are also policy challenges in deploying covid- related android and ios apps to the google play and apple app stores. certification from an official government of india agency with specific verbiage was required before the gcg android app would even be reviewed for hosting on the play store, and subsequent reviews of the app's updates take weeks. given the restrictions that apple imposes on apps posted on its app store, the ios gcg app is only viable for an ad hoc or enterprise license deployment.
gcg web services, data management, and analytics are hosted on the microsoft azure public cloud. as shown in fig. , these are present on different virtual machines (vms) that are segregated based on their workload (service endpoint, data management, analytics) and their security zone (internet, intranet, and internal). we describe these backend capabilities next.
virtual machines (vms): a virtual machine (vm) is a computing environment that provides all the functionalities of a full computer, but executes within another computer. a vm is the typical unit of renting a computer in public clouds. vms help divide a single large computer or server in the cloud into multiple smaller computers, and the vms are independently rented to different users.
public cloud: a public cloud is an internet-based service that allows users to rent and access remote computation, storage, and software capabilities that are hosted at large data centers managed by service providers like microsoft, amazon, and google. it reduces the cost and effort of managing physical computing infrastructure at an organization, and offers higher reliability and scalability.
a suite of rest service application programming interfaces (apis) is defined for the gcg app to interface with the backend, to upload data, and to download analytics and alerts. the rest services are implemented using java servlets running on the apache tomcat web server, and their service endpoints are accessible on the internet. these apis include register device, add proximity contacts, add gps, add liveliness, get notifications, and fetch analytics. most use json as the rest body, except add contacts, which uses a binary protocol. the register device api accepts an invitation code from the app, checks a mariadb table to confirm that the code is present, not expired, and not yet used, and if so, generates a random device uuid, a random pin, and a unique id for the user, which are returned to the app. these mappings, as described earlier, are maintained in mariadb. the phone number, if provided, is salted, hashed, and stored in the database for comparison in the future if a user reinstalls the app. the number is also asymmetrically encrypted and stored in the database, so that it can be decrypted upon authorization by the institution's advisory board, if needed. the decryption key is stored securely off-cloud to prevent accidental breaches. the add contact api is the most frequently invoked, once every mins by potentially 's of users. to avoid the power, compute, and network overheads of de/serializing json, we use an alternative binary format.
it starts with bytes of the source device id, followed by a series of scan records, one per scan. each record starts with bytes of unix epoch time in seconds with the scan record's timestamp. the next byte indicates the number of device contacts 'n' in that scan, followed by × n bytes having the byte device id and byte rssi value for the n proximate devices. if more than n = devices are found in one scan, the app creates multiple scan records. records are created and sent by the app even if there are no proximate devices, since this information is also useful. as mentioned before, beacons are also encoded as device ids following a standard uuid template. intuitively, each record forms an adjacency list for the contact graph. the binary records from service calls from all users are appended to a file and every h, a pre-processing service fetches these binary files and generates a corresponding csv file with an edge list consisting of the timestamp, source device id, sink device id, and rssi. this csv file is backed up to azure blob store and, as discussed later, stored on hdfs for further analytics. add gps is the next frequently called api, every mins, for users who choose to share their gps location. these data are used to generate a device density heatmap of the user's neighborhood for the recent past, and potentially for contact tracing. to support such spatio-temporal queries, we use the influxdb temporal database to store the gps data. one copy of the latitude and longitude is asymmetrically encrypted and stored in influxdb, along with the timestamp, to support authorized contact tracing. another copy is transformed using a geohash of characters, which reduces the precision of the location to a m × m grid. when generating the heatmap for the app user's current location, we query over this geocode. the app communicates hourly device health data using the add liveliness api, as a set of keyvalue pairs that has evolved over app versions. 
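the binary layout used by the add contact api, described above, can be sketched with python's struct module. the field widths in this sketch are assumptions, since the byte counts are not legible in the text: a 16-byte uuid for each device id, a 4-byte big-endian unix timestamp, a 1-byte contact count, and a 1-byte signed rssi. the cap of 255 contacts per record follows from the assumed 1-byte count and is not a value taken from the source. `encode` and `decode_edges` are hypothetical names.

```python
import struct
import uuid

MAX_CONTACTS = 255  # assumed cap implied by a one-byte contact count


def encode(source, scans):
    """Serialize scans [(ts, [(device_uuid, rssi), ...]), ...] for one source device."""
    out = bytearray(source.bytes)                      # 16-byte source device id
    for ts, contacts in scans:
        # Split oversized scans; an empty scan still yields one record.
        chunks = [contacts[i:i + MAX_CONTACTS]
                  for i in range(0, len(contacts), MAX_CONTACTS)] or [[]]
        for chunk in chunks:
            out += struct.pack(">IB", ts, len(chunk))  # timestamp + contact count
            for dev, rssi in chunk:
                out += dev.bytes + struct.pack(">b", rssi)
    return bytes(out)


def decode_edges(payload):
    """Expand a payload into edge-list rows (ts, source, sink, rssi)."""
    source = uuid.UUID(bytes=payload[:16])
    pos, edges = 16, []
    while pos < len(payload):
        ts, n = struct.unpack_from(">IB", payload, pos)
        pos += 5
        for _ in range(n):
            dev = uuid.UUID(bytes=payload[pos:pos + 16])
            (rssi,) = struct.unpack_from(">b", payload, pos + 16)
            edges.append((ts, str(source), str(dev), rssi))
            pos += 17
    return edges
```

each decoded row corresponds to one line of the csv edge list that the pre-processing service generates; records with no contacts carry a timestamp but contribute no edges.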
since the set of liveliness fields varies across app versions, we store these data in azure cosmos db, which is a nosql database. these data are later queried to identify devices that are not reporting bluetooth data reliably, so that alerts with possible fixes can be sent, and also to monitor the overall status of the gcg deployment at an institution. alerts are sent to the app using a custom notification service in the backend that the app polls every mins. this approach was initially chosen over google or apple's push notifications to reduce the dependence on external services. alerts that are generated by various analytics are inserted into a mariadb table with the device id, title, content, type, and validity time range. when an app polls the service, any pending alerts for that device are returned. besides being displayed to the user, alerts may also carry a special payload that triggers changes to the ui, such as updating the social distancing score on the main screen. user-level analytics such as the user's contact network, and other analytics such as the user density, are sent to the app as html that is rendered locally. the app invokes a get analytics api, which returns a json containing a list of current endpoints that serve the analytics. the plots and maps are served off an apache instance. separately, we also run our own open street maps tileserver for serving the map tiles.
these external-facing services are hosted on a separate set of vms, over which the services are distributed based on their workload to avoid performance interference. these vms are shown in orange in fig. . we use one azure d s v vm to host the register device, add gps, and add liveliness endpoints, a second one that exclusively runs the add contact endpoint, and another to run the get notifications service; the latter two see a higher load. the tileserver for displaying open street maps, which is only occasionally used, runs off an azure b s vm, while the analytics are served from an azure d s v vm. a separate azure d s v vm hosts mariadb and influxdb used by these services.
geohash: a geohash is a mechanism to encode a location as a compact sequence of letters and numbers that is easier to remember than a latitude and longitude. typically, longer hashes offer a higher precision of the location.
api: an application programming interface (api) is a description of the input and output parameters that are received and returned when accessing a capability offered by an application.
besides the internet-facing services, there are internal services to support the gcg platform. these host an operations portal used to oversee the health of the system, track the on-boarding of devices, and visualize the contact network. the portal does not directly access any user database or files, to prevent accidental access to or modification of the raw data. instead, a separate routing service offers a limited set of well-defined services to access authorized data. these apis are periodically called and the results are cached in a separate mariadb instance used by the portal. the portal and its database are also hosted on separate vms, shown in yellow in fig. . this sandboxing also extends to the analytics services, which likewise do not directly access the user databases for sending alerts or generating visualizations, but operate through this routing api. for example, the liveliness data are fetched every mins through this routing service from cosmos db into mariadb for the portal to visualize the number of scan records received and scans failed among the apps, while the device registration summary is fetched through the api to plot the users on-boarded over time, the distribution of their device makes and models, etc.
ensuring the security of the services and the data collected by the gcg platform is of paramount importance and is intrinsic to various design and deployment choices.
all the rest endpoints use http/ with http strict transport security (hsts), which forces the use of a transport layer security (tls . /ssl) encrypted channel between the gcg app and the backend and prevents man-in-the-middle attacks. further, all service calls are authenticated based on a device key that is returned to the app during registration. to ensure that this service call authentication is light-weight, we use a digital signature protocol, which ensures that each call can be locally validated without the need for any database access (fig. ). specifically, the device key is generated by the backend service as key = base (sha (device id, salt)), where salt is a secret phrase known only to the service. the gcg app encrypts and stores this device key on the phone. subsequently, when invoking any backend service, the app sends its device id, the current timestamp, and a signature, sign = base (sha (device id, timestamp, device key)), as part of its https header or body. the service then uses the received device id to regenerate the device key on the fly, and combines it with the received timestamp to regenerate the signature. it also verifies that the timestamp passed is recent, to mitigate replay attacks. if the generated signature matches the received signature, the request is valid and is executed. note that all of these flow over an encrypted https channel.
rest: representational state transfer (rest) is a software architecture that allows desktop and mobile clients to interact with internet services by passing requests and receiving responses, using web standards such as http and data models like json.
various other best security practices are used. the register device service takes measures to mitigate brute-force attacks using random invitation codes and pins by limiting the number of daily attempts.
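the stateless request-signing scheme described above can be sketched in python as follows. several details are assumptions, because the exact variants are not legible in the text: sha-256 as the hash, base64 as the encoding, a zero byte as the separator between hashed fields, and a five-minute freshness window. the salt value and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import time

SALT = b"server-side-secret"   # hypothetical secret known only to the service
MAX_SKEW_S = 300               # assumed freshness window against replays


def _digest(*parts):
    """Hash the parts in order and return the base64 text of the digest."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode() if isinstance(part, str) else part)
        h.update(b"\x00")      # assumed field separator
    return base64.b64encode(h.digest()).decode()


def device_key(device_id):
    """Derived by the backend at registration; never stored server-side."""
    return _digest(device_id, SALT)


def sign(device_id, timestamp, key):
    """Computed by the app for every service call."""
    return _digest(device_id, str(timestamp), key)


def verify(device_id, timestamp, signature, now=None):
    """Validate a call using only the request fields and the secret salt."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_S:
        return False           # stale timestamp: possible replay
    expected = sign(device_id, timestamp, device_key(device_id))
    return hmac.compare_digest(expected, signature)
```

because the device key can be recomputed from the device id and the salt, verification needs no database lookup, which is the property the text highlights.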
internal services such as the portal are only accessible from the institution's private network, over vpn, and are additionally secured using authentication. firewall rules are used to restrict access to unused ports. direct ssh access is not available to any vms running services or the database. on azure, the internet-facing vms are in a separate subnet from the ones hosting the databases and internal services, to keep the networks in different security domains. data flows between the services and databases/storage are tightly controlled, and a routing service is used for internal services. we run the latest stable release of all software with the latest security patches to protect against known security flaws. the mariadb sql database follows the principle of least privilege for access, and only minimal permissions for select or select/insert are given to user accounts. user-defined functions are disabled. all queries are templatized to avoid sql code injection. sensitive data such as phone number and location are kept hashed and/or encrypted when stored. this prevents privacy from being compromised even if there is a cloud security breach and the data are leaked. we use asymmetric public-private keys so that only public keys are hosted on the vm for encryption, and private keys for decryption are kept securely offline. contact data are backed up to azure encrypted blob storage. the backend services have undergone professional vulnerability and penetration testing by crossbow labs.
the gcg app is designed to provide feedback to users on their daily interactions using simple metrics and contact neighborhoods. additionally, to improve user engagement, the app also provides heatmaps of user density and charts and maps that show the covid- situation in various states and districts around the country. in this section, we describe these features along with the contact tracing protocols that are in place if an app user tests positive.
we receive contact records from various devices that contain the contact timestamp and associated bluetooth signal value. for efficient primary and secondary contact tracing, we periodically stitch these contact records to create a global contact network graph. further, we annotate the edges with the contact timestamps and signal values to create a temporal contact network, or temporal graph. we use apache spark to perform this stitching from the csv edge file, as a pre-processing step. specifically, we create an interval graph for scans received during a specific time interval. the spark application takes a start and end time for the interval, and retains all the edge list entries in the input csv file whose timestamp falls within this time interval. it then groups all edges by their source and sink vertices to create an adjacency list for each vertex that includes all scan entries from either source or sink edges. every edge is characterised by a time interval $[t_s, t_e)$, where $t_s$ is the earliest scan timestamp and $t_e$ is the latest scan timestamp between the connecting devices during that interval. scans on an edge that fall on adjacent time points with the same rssi value are combined to form longer intervals on the edge annotations. this gives a set of disjoint sub-intervals on the edge, each with an associated bluetooth signal strength. the output is stored in hdfs for future analysis.
temporal graph: like a regular graph, a temporal graph (or temporal network) is a collection of vertices and edges between vertices that indicate a relationship between them. but the vertices and edges that exist at different points in time may vary, and their attributes may also change over time. e.g., temporal graphs model interactions in a social network, traffic flow in a road network, and proximity contacts in a contact tracing network.
the social distancing score provides users with a measure of their extent of social distancing, on a daily basis.
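since this score is computed over the interval graph described above, a sketch of the stitching step may be useful first. the following uses plain python rather than apache spark, and it makes several assumptions that the text does not state: timestamps are aligned to one-minute scan slots, an edge is undirected, the stronger rssi is kept when both devices report the same slot, and each sub-interval is half-open and ends one scan period after its last scan. `build_interval_graph` is a hypothetical name.

```python
from collections import defaultdict

SCAN_PERIOD_S = 60  # the app scans once a minute


def build_interval_graph(edges, start, end, period=SCAN_PERIOD_S):
    """Merge edge-list rows (ts, source, sink, rssi) into per-edge sub-intervals.

    Returns {(u, v): [(t_s, t_e, rssi), ...]} with disjoint, sorted sub-intervals.
    """
    by_edge = defaultdict(dict)
    for ts, src, dst, rssi in edges:
        if not (start <= ts < end):
            continue                          # outside the requested interval
        key = tuple(sorted((src, dst)))       # treat the edge as undirected
        slot = ts - ts % period               # align to the scan period
        prev = by_edge[key].get(slot)
        by_edge[key][slot] = rssi if prev is None else max(prev, rssi)

    graph = {}
    for key, slots in by_edge.items():
        intervals = []
        for slot in sorted(slots):
            rssi = slots[slot]
            if intervals and intervals[-1][1] == slot and intervals[-1][2] == rssi:
                # Adjacent slot with the same RSSI: extend the previous interval.
                intervals[-1] = (intervals[-1][0], slot + period, rssi)
            else:
                intervals.append((slot, slot + period, rssi))
        graph[key] = intervals
    return graph
```

for example, two consecutive one-minute scans at the same rssi collapse into a single two-minute sub-interval, while a change in rssi or a gap in time starts a new one.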
unlike the local bluetooth data used to plot the contact counts on an hourly basis within the app, the social distancing score uses more global knowledge from a device and its neighbors. in particular, it accounts for "background devices" that are often or always in the vicinity, such as family members or hostel room neighbors, and which are subtracted from this score as their sustained presence does not pose any additional risk. these scores are calculated using apache giraph once a day, over the interval graph created for the preceding -h period. the score calculation depends on three parameters: signal threshold (δ) , minimum contact duration (φ m ) , and background contact duration (φ b ) . for each device id, we first identify those neighboring devices that could detect each other for at least φ b mins , cumulatively, during the -h period. these neighbors form the background devices and are eliminated from further analysis. currently, we use φ b = mins. next, from the remaining neighbors, we retain only the rssi entries which exceed a value of δ on their edge sub-intervals. this helps identify the duration of nearby contacts with them. based on experiments described in the next section, we set δ = − , which approximates a distance of m. we sum up the duration of nearby contacts for each edge, and those whose duration is greater than φ m mins form the proximate contacts, p. we set φ m = mins by default. intuitively, this means that the user has interacted with p other devices in close physical proximity of about ≤ m for a cumulative of mins or more in the past h, but who are not part of the sustained background presence. from this, the social distancing score for a device is calculated as max{ , − p} . this normalization offers a higher score for users who practise social distancing and a lower score for the others. in the example snapshot, assume that δ = − , φ m = mins and φ b = min . 
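the score calculation described above can be sketched as follows. this is a minimal illustration, not the giraph implementation. it assumes that each neighbor's edge carries disjoint sub-intervals of the form (t_s, t_e, rssi) with times in minutes. the two constants of the max{·, · − p} normalization are passed in as the hypothetical parameters `floor` and `ceiling`, and δ, φ_m and φ_b are likewise left as arguments rather than given default values.

```python
def social_distancing_score(edges, delta, phi_m, phi_b, floor, ceiling):
    """Compute a social distancing score for one device.

    edges: {neighbor_id: [(t_s, t_e, rssi), ...]} for the scoring window,
           with times in minutes and disjoint sub-intervals per neighbor.
    delta: RSSI threshold; readings above it count as nearby.
    phi_m: minimum cumulative nearby duration for a proximate contact.
    phi_b: cumulative detection duration that marks a background device.
    floor, ceiling: hypothetical names for the constants in max{floor, ceiling - p}.
    """
    proximate = 0
    for intervals in edges.values():
        detected = sum(t_e - t_s for t_s, t_e, _ in intervals)
        if detected >= phi_b:
            continue  # sustained presence: background device, ignored
        nearby = sum(t_e - t_s for t_s, t_e, rssi in intervals if rssi > delta)
        if nearby > phi_m:
            proximate += 1
    return max(floor, ceiling - proximate)


# Illustrative values only; none of these numbers come from the deployment.
example_edges = {
    "b": [(0, 120, -60)],                  # nearby for 120 min
    "d": [(0, 60, -60), (100, 160, -80)],  # nearby for 60 min, then weak signal
    "a": [(0, 600, -55)],                  # detected for 600 min: background
    "e": [(0, 5, -60)],                    # nearby, but too briefly
}
example_score = social_distancing_score(
    example_edges, delta=-70, phi_m=30, phi_b=300, floor=0, ceiling=10
)
```

background devices are filtered before the rssi threshold is applied, matching the order of the steps in the text, so a strong signal from a room neighbor never counts against the user.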
for the device c, devices b and d are proximate contacts since their close contact durations are h and h, respectively. however, a is not a proximate neighbor of c since it is a part of its background, having been detected for a total of h. so the social distancing score of c is .

the sars-cov- virus is currently assumed to spread by 'contact and droplet' as well as airborne transmission. who and various countries have provided social distancing advisories that emphasize a minimum spacing of - m for curbing the spread of the virus. being able to nudge users to maintain such distancing is one of the goals of the gcg app. however, inferring distances accurately from bluetooth rssi values is non-trivial. factors such as smartphone hardware variations, body interference, and multi-path interference lead to both false positives and false negatives while estimating the distance from rssi values. researchers elsewhere have conducted experiments to understand whether contact tracing apps can estimate if two users are close to each other, i.e., within a distance of m for mins or longer. these were performed with google pixel and samsung galaxy a devices using the open-trace app, an open-source version of singapore's tracetogether app. they tested different environmental conditions, such as signal attenuation by the human body, a handbag, walls, etc., and also enacted real-world scenarios. the measured rssi and the distance are plotted over time to understand the variability for different configurations and their relationship to the ground truth. another system, smart contact tracing (sct), uses machine learning classifiers to classify the contacts as high/low risk using the bluetooth rssi values. its authors perform experiments to collect rssi from a nokia . with android and htc m with android . for distances ranging from . - m, for random device orientations, and at different locations such as hand, pocket, and backpack.
the collected data are labeled as + (high-risk, ≤ m ) or − (low-risk) according to the ground truth. they filter the data using a moving average filter before training machine learning classifiers such as decision tree, linear discriminant analysis, naïve bayes, k nearest neighbors, and support vector machine. the google-apple exposure notification api in android also applies ble calibration corrections based on manual measurement of the signal strength under standard conditions. given the hardware diversity we observe among our campus population, we conduct similar lab-scale experiments, as described, using a more diverse set of smartphones and beacons. we evaluate the rssi at , , and m distances to help us determine whether two phones are within m. in our experiments, we use a debug version of the gocoronago android and ios apps that log the bluetooth scan information to a local file on the smartphone. the experiment was performed in an open room measuring about × m with little furniture, mimicking a real-world environment. our experiment uses android devices, iphones, and all the devices were used at a high battery level, with power-saving modes disabled and screen set to stay on for as long as possible while performing the bluetooth scans. each experiment configuration was run for a period of mins to give ≈ rssi measurements per device pair in that configuration. given the technical limitations of ios, android devices can detect other android devices and the beacons, and iphones can detect the android devices. considering these factors, two experimental setups were designed to collect the rssi data as illustrated in fig. . for the distance a = m , we use a hexagonal placement, as shown in fig. a , with pairs of devices at the vertices, a, b, c, d, e, f, and the center, g.
these give us devices at distances of m (same vertex); m, between adjacent vertices, e.g., a-b; m, between vertices at diagonal corners, e.g., a-d; and √ m for vertices that are two hops away, e.g., a-c. three runs with the hexagonal setup are required to ensure that every pair of devices is measured at a m distance. for distances a = m and m the devices were arranged in three clusters, a, b, c, at the corners of an equilateral triangle with a side of length a (fig. b) . in each cluster, the devices are placed vertically and adjacent to each other, in a row. devices across clusters are separated by a distance a while those within a cluster have a distance of ≈ m . three runs of the triangular setup with different clusters are performed to ensure that we get the rssi for each pair of devices at m and m. a key rationale for this study is to understand if two devices are within m of each other or not, as we use the m distance as the proximity threshold in our platform. a total of rssi data points at m, data points at m, and data points for m are collected. we focus our analysis on just the android phones, which form the bulk of our deployment. there are , , and data points for , , and m between the android devices, respectively. for each distance and device pair, we drop the maximum and minimum rssi values to eliminate outliers. the empirical cumulative distribution functions (cdfs) of the rssi values at , , and m are shown in fig. a . the x-axis shows the rssi values, while the y-axis lists the corresponding percentiles for the different distance configurations. we see that there is a substantial overlap between data points at the three different distances for a given rssi. for example, % of the m data points, % of the m data points, and % of the m data points have an rssi of ≤ − . so, using any single threshold value of rssi as an estimate for a m distance is liable to result in both false positives and false negatives.
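the overlap between the distributions can be quantified per candidate threshold. the sketch below is a minimal illustration on made-up readings, not the analysis code used for the figures. it assumes that a contact is called "near" when its rssi is at or above a threshold, and it reports, for every observed rssi value, the gap between the fraction of near samples and the fraction of far samples that would be classified as near. all function names and sample values are hypothetical.

```python
def fraction_at_or_above(values, threshold):
    """Fraction of readings whose RSSI is at or above the threshold."""
    return sum(v >= threshold for v in values) / len(values)


def threshold_gaps(rssi_near, rssi_far):
    """Map each candidate threshold to (true-positive rate - false-positive rate).

    rssi_near: readings taken at the distance that should count as proximate.
    rssi_far:  readings taken at a larger distance.
    """
    candidates = sorted(set(rssi_near) | set(rssi_far))
    return {
        t: fraction_at_or_above(rssi_near, t) - fraction_at_or_above(rssi_far, t)
        for t in candidates
    }


def most_discriminating_threshold(rssi_near, rssi_far):
    """Return the candidate threshold with the largest gap."""
    gaps = threshold_gaps(rssi_near, rssi_far)
    return max(gaps, key=gaps.get)


# Made-up readings for illustration only.
near_readings = [-60, -62, -65, -70, -72]
far_readings = [-68, -75, -78, -80, -85]
best = most_discriminating_threshold(near_readings, far_readings)
best_gap = threshold_gaps(near_readings, far_readings)[best]
```

because the two sample sets overlap, no threshold reaches a gap of 1; the best candidate only balances missed near contacts against far contacts that are wrongly flagged.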
for this preliminary study, we wish to determine an rssi value that is the most discriminating with regard to the ≤ m and > m proximity. so for each rssi value, we plot the difference in the percentile of data points that are at m and at m distances, and this is shown in fig. b . the peak difference is observed at an rssi value of − , i.e., the difference between the true positive of m ( %) and false positive of m ( %) is the highest. hence, we use an rssi of − as the proximity threshold in our gcg app and the backend analytics. in the future, we propose to study the effect on rssi from different pairs of phone models and in different environmental conditions in order to develop a more customized proximity threshold, instead of using a single global value that is currently adopted. when an app user tests positive for covid or is under mandatory quarantine, the current protocol at iisc requires the campus health center to check if the user is willing to share their contact data for tracing. if so, they are asked to enter their phone number within the gcg app, if not done so. the health center collects and enters the gcg unique id, device id suffix, and phone number from the user into a portal. this initiates a call to the gcg backend and triggers an otp to the user's phone number, if the details match with an existing user. the user may share this otp with the health center and this serves as their informed consent for contact tracing. the health center enters the otp and any additional details about the subject, such as symptoms, start and end dates for contact tracing, and test information. the gcg backend confirms if the otp is accurate, and if so, the request is forwarded to the advisory board to get the primary and secondary contacts for this user. the advisory board has representatives from the institute, including faculty, staff, students, doctors, and a bio-ethicist. 
if the board approves the request through their portal, the gcg backend is notified and performs a time-respecting breadth first search (t-bfs), which is a variant of breadth first search (bfs) performed over the temporal contact graph. the t-bfs is initiated from the device id corresponding to the given user's unique id and for the time duration in the past indicated by the health center. if the user's unique id is associated with multiple devices during this period, the search is initiated from each of these ids. the output is a list of device ids for the primary and secondary contacts. we then use the invitation code, unique id and device id mappings maintained in the gcg backend to get the list of invitation codes used by the primary and secondary contacts. these invitation codes are shared with the it staff, who then use their mapping table to deanonymize them and provide the health center with a list of email ids and/or phone numbers of these contacts. the gcg backend also provides the duration of contacts for each of the invitation codes. the health center can then choose to initiate their relevant protocols for reaching out to these contacts, and quarantine or test them. if mandated by law, the health center may share the contact trace data with the local government agency responsible for covid- surveillance.

besides the local analytics within the app, we also provide additional analytics to the gcg user based on aggregation in the backend. figure d shows a heatmap of gcg user count in a . × . km area around the current location of an app user, if they share their gps location. it is aggregated over the past h from users who share their gps data. these data are queried from the timestamp and geohashes present in the influxdb backend. in order to respect privacy, the location data are spatially coarsened into tiles of approximately m × m , and temporally coarsened over h, and only the aggregate count of users in each tile is shown.
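the spatial and temporal coarsening behind the heatmap can be sketched as follows. this is a minimal illustration that buckets coordinates on a plain latitude/longitude grid as a stand-in for geohash truncation; it does not query influxdb. the cell size `cell_deg`, the bucket length `bucket_secs`, the function name and the sample coordinates are all hypothetical.

```python
from collections import defaultdict


def tile_user_counts(points, cell_deg, bucket_secs):
    """Count distinct users per (tile, time bucket).

    points: iterable of (user_id, lat, lon, timestamp_secs).
    cell_deg: tile edge length in degrees (assumed stand-in for a geohash cell).
    bucket_secs: length of the temporal bucket in seconds.
    Only aggregate counts are returned; user ids never leave the function.
    """
    users = defaultdict(set)
    for user_id, lat, lon, ts in points:
        tile = (int(lat // cell_deg), int(lon // cell_deg))
        bucket = int(ts // bucket_secs)
        users[(tile, bucket)].add(user_id)
    return {key: len(ids) for key, ids in users.items()}


# Made-up locations for illustration only.
example_points = [
    ("u1", 10.0211, 20.5671, 10),
    ("u1", 10.0212, 20.5672, 20),   # same user, same tile: counted once
    ("u2", 10.0213, 20.5673, 30),
    ("u3", 10.0311, 20.5671, 40),   # a different tile
]
example_counts = tile_user_counts(example_points, cell_deg=0.001, bucket_secs=3600)
```

counting distinct users per tile, rather than raw location reports, keeps a single frequently reporting phone from inflating the density shown on the map.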
also, when few users are present in a tile, we display these data in a categorical manner, e.g., < . the contact graphs that are constructed in the backend can be visualized using tools such as gephi. figure shows a subset of the temporal graph generated for a single day. here, the size of a node depends on its degree centrality measure across the entire time duration. the thickness of the links depends on the duration of their contact. while such a graph is instructive for backend analytics, we use it to generate a neighbourhood tree for each user, as shown in fig. e . the tree is based on the last h of data and contains contacts up to two hops. importantly, this is a tree and not a neighborhood sub-graph to preserve privacy, i.e., edges between the -hop and -hop neighbors are not shown to avoid revealing contact patterns between them. these trees are generated on a daily basis. they help the users get a sense of not just their primary contacts, but also their secondary contacts, which could be much more numerous, and in turn motivate users to take greater precautions by socially distancing.

centrality measure: a centrality measure is a graph-theoretic score that measures the relative importance of vertices in their ability to spread to or influence other vertices in the network. examples of these measures include degree, betweenness, eigenvector, and closeness centrality, pagerank, etc. they are used to identify important or critical vertices in contact networks, social networks, www graphs, road networks, etc.

the gcg app is currently deployed at the indian institute of science (iisc), bangalore. the iisc campus is an access-controlled residential campus with close to students, over faculty, and over research and administrative staff. a majority of the students and faculty live on campus. however, iisc entered a full shutdown in march, , a few days ahead of a nation-wide lockdown in india, and the students on campus were instructed to leave for their homes.
initial versions of the app were tested among faculty volunteers during the lockdown period. the gcg app was first rolled out to students in june, after a subset of them were allowed to re-enter campus, and subsequently to other faculty and staff. at the time of writing this paper, the gcg app has been installed by over users at iisc. a plot of the number of installations of the gcg app over time is shown in fig. . sharp jumps in installations correspond to new invitations or reminders sent to students, faculty, and staff for installing the app. the app is yet to be rolled out to essential workers such as hostel cooks, cleaning staff, and security personnel, and noticeably, some of the early cases of covid- on campus have been initiated through them. this is understandable since many of them stay off-campus and possibly have a larger mobility footprint, increasing their risk of acquiring the coronavirus. while the gcg android app was initially hosted on the iisc website due to restrictions by google and apple in hosting covid-related apps on their online app stores, it has recently received approval to be hosted on the google play store, with v . currently available there since early august, . an ad hoc ios version is also being tested since the last week of august, . while gcg is designed for institutional use, contact tracing for users from the same institutions who interact outside the campus is also captured. this benefit can be further enhanced through a federated deployment for institutions that are spatially close to each other, such as a cluster of college campuses and software tech-parks in the same neighborhood. here, the chances of physical interaction between users from different organizations are high, e.g., visiting the same local cafeteria or grocery store. in this federated deployment (fig. ) , individual institutions would maintain their independent gcg deployments. 
but in addition, they would share the strictly anonymized contact graph for their institution with a trusted data broker, such as a non-profit agency or a neutral university. this data broker would then stitch these graphs together based on contacts between unique device ids that span graphs from different institutions. this can then be used to trigger "glocal" analytics (a global combination of local clusters that are near each other) and share more accurate proximity scores with the users of individual institutions, as well as perform more effective contact tracing across institutions in the same community. a key requirement for preserving privacy is that no personal data should be shared with this trusted broker, and any de-anonymization for contact tracing should strictly be handled at the local institution. this can further be complemented through the use of national or regional-scale contact tracing apps, even if used by a smaller fraction of users who are mobile. this can help link clusters of gcg contacts within institutions, and allow contact tracing beyond the institutional premises as well. however, care should be taken to sandbox the regional and institutional datasets to avoid privacy loss. the availability of fine-grained contact tracing data has opened opportunities for new research on infection spreading. classic epidemiological models are compartmentalized formulations that classify the population into different states such as s (susceptible), e (exposed), i (infected), and r (removed/recovered). based on the progression patterns of a disease, different models such as si, sis, sir, and seir models have been proposed. these models are applicable to large populations and can estimate the evolution of the fraction of individuals in different states over time and can identify the peak number of infections for different reproduction numbers. the assumptions in these models are, however, coarse and their utility is hence limited.
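as a concrete instance of such a compartmentalized formulation, the sketch below integrates the standard sir equations with a fixed-step euler scheme. it is a minimal illustration under stated assumptions: a closed, well-mixed population, a transmission rate `beta` and a recovery rate `gamma` whose values here are purely illustrative, and a hypothetical function name. it is not a model fitted to any data in this article.

```python
def simulate_sir(beta, gamma, s0, i0, removed0, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I.

    beta: transmission rate per day (illustrative parameter).
    gamma: recovery/removal rate per day (illustrative parameter).
    s0, i0, removed0: initial compartment sizes.
    Returns a list of (time, S, I, R) tuples, one per Euler step.
    """
    s, i, r = float(s0), float(i0), float(removed0)
    n = s + i + r  # closed population: N stays constant
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((k * dt, s, i, r))
    return trajectory


# beta/gamma = 3 > 1: infections grow to a peak and then decline.
growing = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, removed0=0, days=200)
# beta/gamma = 0.5 < 1: infections only decline from their initial value.
fading = simulate_sir(beta=0.05, gamma=0.1, s0=990, i0=10, removed0=0, days=200)
peak_infected = max(i for _, _, i, _ in growing)
```

the ratio beta/gamma plays the role of the reproduction number in this model, which is why the two example runs behave so differently.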
they can be used to take higher-level policy decisions such as deciding the duration of lockdowns, planning hospital bed-capacity over time, etc. however, the input data for these models are tightly related to the testing rates, which in the case of covid- were very low during the initial few months. research in the past two decades has extended such compartmentalized models to static or time-varying contact networks. in a static network, a node, if infected, can potentially infect any other nodes that it comes in contact with, regardless of the time of contact. but in dynamic networks, temporal ordering is preserved. that is, if an individual a comes in contact with a person b before b and c interacted, then a faces no risk from c. this can correct for the over-prediction of infection rates from static models. with bluetooth-based mobile contact tracing, it is possible to include both the duration of contact and the signal strength, which is a proxy for the distance between the phone users during their interaction, to make better predictions of the transmission rates. results from simulated experiments by kretzschmar et al. indicate reduced reproduction numbers when contact tracing is performed using mobile apps, as the delay in alerting vulnerable individuals is reduced to a minimum. apart from identifying primary and higher-order contacts quickly, contact data allow us to identify the most vulnerable users through either simulations of network models assuming hypothetical initial conditions or centrality measures. most centrality scores from network science are defined on static graphs, and it would be interesting to develop better centrality measures that can be used to find the nodes with higher spreading capabilities in a temporal network.
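the temporal-ordering argument above, which is also the idea behind a time-respecting breadth first search, can be sketched as follows. this is a minimal illustration and not the backend's t-bfs: contacts are reduced to single timestamps instead of intervals, the graph is undirected, the search is limited to a fixed number of hops, and each device keeps the arrival time of the hop at which it was first reached. the function name and the data layout are hypothetical.

```python
from collections import defaultdict


def time_respecting_bfs(contacts, source, t_start, t_end, max_hops=2):
    """Find devices reachable from `source` along time-respecting paths.

    contacts: iterable of (u, v, t) undirected contact events.
    A contact (u, v, t) can extend a path through u only if u was reached
    at a time <= t, so earlier contacts of a later-exposed device are ignored.
    Returns {device: (hop, arrival_time)} for all reached devices except the source.
    """
    adjacency = defaultdict(list)
    for u, v, t in contacts:
        if t_start <= t <= t_end:
            adjacency[u].append((v, t))
            adjacency[v].append((u, t))

    arrival = {source: t_start}
    hops = {source: 0}
    frontier = [source]
    for hop in range(1, max_hops + 1):
        reached = {}
        for u in frontier:
            for v, t in adjacency[u]:
                if v in arrival:
                    continue  # already classified at an earlier hop
                if t >= arrival[u] and t < reached.get(v, float("inf")):
                    reached[v] = t  # keep the earliest valid contact time
        for v, t in reached.items():
            arrival[v] = t
            hops[v] = hop
        frontier = list(reached)
    return {v: (hops[v], arrival[v]) for v in arrival if v != source}


# a met b at t=1, before b met c at t=2, so a is not at risk from c.
example_contacts = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("d", "e", 5), ("d", "f", 2)]
traced = time_respecting_bfs(example_contacts, source="c", t_start=0, t_end=10)
```

in the example, b and d are primary contacts of c and e is a secondary contact, while a and f are excluded because their only contacts with b and d happened before those devices met c.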
identifying such individuals can in turn be used to devise adaptive testing and vaccination strategies, which can help improve the estimates of the health states of the population, especially when testing is expensive, or its availability is limited. another major opportunity with centralized contact tracing is the ability to influence social distancing behavior using alerts and scores. creating control groups, providing such information to one of them, and observing their contact patterns for a limited subsequent period can throw light on the effect of such scores. such randomized control trials can help quantify the effectiveness of contact tracing apps even in the absence of covid- case data. one of the key challenges with digital contact tracing is user adoption. as highlighted in sect. , digital contact tracing requires a large fraction of users within the community to use it before it becomes effective. having only a small sample of individuals use the app makes it difficult to identify the true sources of infection, because paths between infected individuals and their primary and higher-order contacts may go undetected. however, our experience with institutional-level contact tracing appears more promising than that employed by governments at a national level in terms of the fraction of users installing an app and the duration for which they had it installed on their phones. in fact, recent reports indicate that even % of user adoption of contact tracing apps can have a meaningful impact of - % reduction in covid infections and deaths. that said, not all workplaces are captive environments. in such cases, neighborhood or regional deployments of contact tracing apps may be required since users are more likely to interact with people outside their cluster. further, people may also interact during activities outside workplaces, and their institutional contact tracing app can be ineffective during these periods.
we frequently observe app users turn off their bluetooth or gps, because of which the contact trace data collected are curtailed. users may do so to save battery (even though our experience shows that the android app consumes less than % of battery in an entire day) or when they perceive a lower risk based on their current activity and environmental conditions. these factors can dramatically offset the promises offered by network-based epidemiological models in identifying risk-prone individuals and in contact tracing to contain the spread of infection. it is also extremely difficult to impute such missing data and no assumption can be confidently justified. although digital contact tracing apps have several potential advantages, validating their usefulness is tough. the difference between digital and manual contact tracing can be best demonstrated when there are covid positive app users who have shared data for continuous periods. in practice, it is wise to use data from such tools in conjunction with manual contact tracing since there would be gaps in data due to user behavior or technology limitations. building robust epidemiological models is all the more challenging because they contain several parameters that have to be calibrated from sparse and missing data. heavy reliance on digital contact tracing apps can also exclude fractions of the community who use feature phones. visitors to institutions, such as delivery providers, can also be missed but can contribute to virus spreading. digital contact tracing is still in its infancy. it is important that individuals understand the data shared, risks, and benefits before fully using such apps. communicating these details to a lay audience can be challenging, and misconceptions about what such apps collect and can do are not uncommon. in this article, we have described the various dimensions of digital contact tracing for managing the covid- pandemic.
we have highlighted the approaches taken by diverse apps globally and their pros and cons. we have proposed gocoronago as an institutional contact tracing app, whose design choices attempt to balance the privacy of individuals with the safety of the community in performing rapid multi-hop contact tracing. we have offered a detailed technical description of the gcg app, its backend services, and analytics. this platform is currently being validated at the iisc university campus, with additional campus deployments underway. we have shared our early experiences with the deployment over the past few months, in the midst of the covid- epidemic, and the opportunities and challenges that lie ahead. given the evolving nature of covid- , our continued experience with this contact tracing platform at iisc and other campuses can serve as a role model, or a cautionary tale, in managing the pandemic in the ensuing months and years.

the authors acknowledge a research grant from the department of science and technology (dst), government of india, to partly sponsor this work (grant no. dst/icps/ rakshak/ ). they also recognize the support offered by the rakshak review committee. yogesh simmhan was supported by the swarna jayanti fellowship (grant no. dst/sjf/ eta- / - ). the authors thank the administration of iisc for assistance with the development and deployment of gcg, the members of the institute who volunteered to test early versions of the app, and prof. y. narahari who offered valuable guidance to the project. the authors are grateful for the detailed feedback offered by the institute human ethics committee (ihec) at iisc in designing the operations and the research study. a proposal for research using the data collected from the app is currently under review by ihec. the authors are also glad for valuable inputs from dr. olinda timms from st. johns research institute and prof. mukund thattai from ncbs on the design of the contact tracing protocol to balance safety and privacy. a special thanks to crossbow labs for their pro bono security testing services.

received: august accepted: september

key: cord- -maclu gh authors: gunther, christoph; gunther, michael; gunther, daniel title: tracing contacts to control the covid- pandemic date: - - journal: nan doi: nan sha: doc_id: cord_uid: maclu gh the control of the covid- pandemic requires a considerable reduction of contacts mostly achieved by imposing movement control up to the level of enforced quarantine. this has led to a collapse of substantial parts of the economy. carriers of the disease are infectious roughly days after exposure to the virus.
first symptoms occur later or not at all. as a consequence tracing the contacts of people identified as carriers is essential for controlling the pandemic. this tracing must work everywhere, in particular indoors, where people are closest to each other. furthermore, it should respect people's privacy. the present paper presents a method to enable a thorough traceability with very little risk on privacy. in our opinion, the latter capabilities are necessary to control the pandemic during a future relaunch of our economy. t he covid- pandemic has spread all over the world. it has already lead to a very large number of fatalities, more than ' as of end of march . the first priority of humanity is to take all possible actions to prevent more people from dying. in some places, this lead to enforcing a quarantine on large portions of the population. the economic damage is substantial. the us alone is investing usd ' billions to alleviate the consequences of the pandemic. thus, limiting the economical damage by restarting the economy as soon as possible, while at the same time protecting people, is of immense importance. the present document aims at contributing specific suggestions on how to achieve this. three important properties of the covid- pandemic are that • the sickness is limited to roughly three weeks in time. after this period, people are either healthy again, hopefully without impairments, or dead. all evidence expressed publicly, so far, indicates that former carriers of the disease are not contagious anymore after that time. a strictly observed quarantine of three weeks has thus the potential to basically eliminate all carriers of the disease. a quarantine is never perfect, e.g. due to the need to restock food supplies. as a consequence some chains of infection will persist. • the spreading of the disease in the population is characterized by an exponential growth. 
the characteristic parameter R_0, which describes the number of people infected by a single carrier, is estimated to be around - . . any value above leads to an exponential growth, as long as there is no substantial immunity. more detailed epidemiological models are more differentiated but show a similar threshold behavior [ ] . the value of R_0 mentioned above is determined by the period during which a carrier is contagious, the probability of transmitting the disease, and the number of contacts that the carrier had during that time. there is no means to control the first factor. the second may be somewhat influenced by wearing masks but not to a level considered sufficient. thus, the most important option for controlling R_0 is to reduce the contacts between carriers and other citizens.
• the diagnosis of sick people is a critical element. some people do not show symptoms that they associate with the sickness but are nevertheless infectious. they may be a cause for requiring a longer quarantine than described above. in addition and most importantly, no one shows symptoms before being infectious, which means that as long as there are no tests that everyone can apply at regular intervals, there will always be a delay before the spreading by a particular individual can be discontinued. furthermore, extensive testing as practiced and further expanded in germany will be most effective if the most likely carriers are being tested.
christoph günther is with the german aerospace center, weßling, and with technische universität münchen, munich, germany, e-mail: kn-covid@dlr.de. michael and daniel günther are students at technische universität münchen, munich, germany, e-mail: m.guenther@tum.de, d.guenther@tum.de.
currently, there is a variety of attempts to contain the pandemic, which should all be followed in parallel. the development of vaccines and of medications is essential, but these may not be available in the near future.
this has led to an enforced reduction of contacts by various levels of quarantine. the concept of achieving immunity by letting the epidemic spread has rightly been abandoned, due to the heavy toll in human lives. bill gates formulates what most of us think: "but bringing the economy back ... that's more of a reversible thing than bringing people back to life. so we're going to take the pain in the economic dimension - huge pain - in order to minimize the pain in the diseases-and-death dimension."
the "how" of restarting the economy remains. some authors studied the effect of relaxing the quarantine at the cost of a regrowth of infected people before shutting down again [ ] . this leads to an increasing level of immunity in a series of waves. in view of the small percentage of people that are immunized at each step and in view of the risk of an unmanageable growth, the number of waves needs to be substantial. furthermore, each wave costs lives. china, south korea and, on a much smaller scale, webasto in germany have shown an alternative, which consists in a careful tracing of contacts, associated with testing, and quarantining positively tested people. we will call these people "carriers" throughout the paper.
tracing contacts is a rather natural concept for containing the pandemic. it aims at identifying and subsequently isolating people who might potentially be carriers. since the incubation time until an infected person becomes infectious herself is around days and since first symptoms only occur after days at the earliest, with a diagnosis available at an even later time, there is a lag during which infectious carriers continue spreading the disease. thus, knowing the contacts of people who have been identified as carriers allows isolating unidentified potential carriers. the frequent absence of clear symptoms is a second critical cause for the spreading of the infection.
in this case contact tracing allows identifying carriers without symptoms through their contact to people with symptoms. in that case, the carrier with symptoms is not the originator but rather helps to discover the originator. independent of who is the originator, contact tracing and subsequent isolation eliminates sources of disease spreading. an immediate testing and determination of contacts allows further contacts to be identified whenever the outcome of the test was positive. in the case of a negative outcome, testing is repeated after an incubation time, with isolation being lifted in the case of a second negative outcome.
at my institute ( people), we traced a number of contacts and noticed that the complexity of a manual process quickly becomes unmanageable. due to the exponential character of the network of relations, there are simply too many contacts to be traced. we ended up isolating everyone first at the institute level, shortly after that and independently of us at dlr level and finally at national level. this observed complexity led us to the conclusion that automatic means of tracing are essential.
raskar et al. [ ] have analyzed an approach based on locating people with a particular focus on privacy-protection and self-protection against the disease. we follow a somewhat different approach. it is primarily based on contacts, rather than on locations, although locations may be used in addition. furthermore, it focuses on the control of the pandemic as a whole. the protection of the individual turns out to reach a similar level as in the approach by raskar et al. [ ] .
the present exposition is developed against the background of german regulations. the public authority responsible for health is the "gesundheitsamt." the gesundheitsämter (many of them, distributed all over the country) register every person affected by the pandemic and organize the testing of people.
thus the identity of any person who either has symptoms, is tested positively or is affected by the disease is currently known to the local gesundheitsamt. we shall subsequently just speak of the gesundheitsamt as if it were a single entity. that gesundheitsamt is a trusted authority independent of any use of electronic means to trace contacts. it shall thus also be the trusted authority in our approach, which will be responsible for operating the server needed to manage the list of carriers. it furthermore manages people in quarantine, who have to follow strict rules in germany. not doing so may lead to fines and imprisonment [ ] . additionally, germany has imposed limitations on the movement of people, which should not be confused with the stricter quarantine.
in our view, it should be acceptable that regaining new degrees of freedom may be associated with certain restrictions, which ensure the traceability of contacts, without unduly exposing privacy. recent polls in germany show a high level of acceptance of restrictions to combat the pandemic. it may well be acceptable to enforce the use of tracing, although this is not the focus of the paper.
the precondition for traceability is the use of a smartphone running a covid- tracing app (the app) or alternatively the use of a low cost device. for simplicity, the focus of the exposition will be on an app running on a smartphone. every person leaving their home shall be requested to carry such a device, with the app installed and active. this might be an expectation, which people are free to follow or not. whatever solution is preferred is a political decision. the main elements of its implementation are
• the automated creation of a list of contact instances my_ctc, maintained in the personal device of the user. the number of such entries could be up to a few thousand per day as soon as big events take place again.
• the maintenance of a list of infectious carriers of the disease ga_icd on the server of the gesundheitsamt, currently around ' entries in total with a growth rate of less than per day.
• the search for entries from the personal list my_ctc in the list ga_icd retrieved from the gesundheitsamt.
• in the case of a hit, the app informs the server of the gesundheitsamt about the identifier found.
• the server and the app cooperate in classifying the category of the contact (category or , see below). the associated contact persons might be involved in this classification process.
• based on the result, the gesundheitsamt decides about the quarantining and testing of the device's owner.
the best possible cooperation of the contacts and the gesundheitsamt in assessing the category of the contact reduces both the test load and the necessity of a quarantine. in an initial phase, this may include the indication of the seat used on a joint train ride, the confirmation of a joint lunch or the like. clearly, further technical developments in sensing both the mutual placement and orientation of people will be of great help in automating this process but are not needed in an initial phase. such developments could follow similar lines as the work on indoor positioning, which achieves high levels of accuracy [ ] .
the above description identified a number of actors. before entering into this discussion, it is useful to differentiate three categories of contacts [ ] :
• category contacts are those to which a face-to-face contact accumulated to more than minutes.
• category contacts are those to which a face-to-face contact accumulated to less than minutes.
• uncritical contacts are all others.
the consequences of being a category or contact are defined by the gesundheitsamt and may be changed over time. both categories are quarantined. currently, the main difference is in the level of testing. the category defines the sampling time of our contact monitoring.
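the three contact categories above can be sketched as a small classification rule. this is only an illustration under stated assumptions: the names ContactCategory and categorize_contact are hypothetical, the threshold is passed in as a parameter because its value is set by the gesundheitsamt, and a contact exactly at the threshold is placed in the shorter category, which the text does not specify.

```python
from enum import Enum


class ContactCategory(Enum):
    """Hypothetical labels for the three categories of contacts."""
    LONGER = "face-to-face, accumulated time above the threshold"
    SHORTER = "face-to-face, accumulated time below the threshold"
    UNCRITICAL = "uncritical"


def categorize_contact(face_to_face_minutes: float,
                       threshold_minutes: float) -> ContactCategory:
    """Classify a contact by its accumulated face-to-face time.

    threshold_minutes is an input, not a constant: it is defined by the
    health authority and may change over time.
    """
    if face_to_face_minutes <= 0:
        # No face-to-face time at all: treated as uncritical.
        return ContactCategory.UNCRITICAL
    if face_to_face_minutes > threshold_minutes:
        return ContactCategory.LONGER
    # Assumption: a contact exactly at the threshold counts as shorter.
    return ContactCategory.SHORTER
```

for example, categorize_contact(20, threshold_minutes=15) returns ContactCategory.LONGER; the value 15 is purely illustrative.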
with this preparation, we have the following actors:
• the gesundheitsamt (trusted authority): it tests people for covid- infection, it publishes an anonymized list of carriers and it facilitates the categorization of contacts.
• roaming users: their devices monitor contacts at regular intervals ( second) and store the list of contacts my_ctc as well as a list with location and orientation information my_loc. their devices check whether there was a contact to an infected person (at least once per day), and provide support to the categorization of the contacts, potentially using location and orientation information. note that all information is kept locally with the exception of information exchanged in the categorization of a contact.
• users tested positively: their devices provide their lists my_ctc to the gesundheitsamt, as far back as their owner's infection may have been contagious. they go into treatment or at least quarantine, and cooperate in determining the category of contacts that they had. the device uses the list my_loc to support the classification of contacts to other people. although the position information is kept locally, it is partially disclosed to the gesundheitsamt in the assessment of contacts' categories.
• users with a critical contact (category or ): they also go into quarantine and are subject to an immediate test. in the case of a positive outcome, they change category. otherwise, they are tested again after an incubation time. after a second negative test, they are freed from quarantine obligations.
there are a number of options to detect the proximity of people. we propose to use bluetooth transceivers to send beacons and monitor for such beacons at regular intervals. the benefit of using bluetooth is that corresponding interfaces are included in nearly every smartphone and that they are furthermore available on cheap platforms.
in addition, bluetooth creates a direct relationship between the potential contact persons, which works everywhere, including shopping malls or the underground metro station. although not too reliable, the power level can be used as an indication of the distance between the transmitter and the receiver, and could thus be used as a filter. the details of this aspect need further assessment. furthermore, the use of bluetooth is associated with a low power consumption. the proposal made in section v uses functions available in the application programming interfaces (api) of android and apple ios. more refined solutions may be implemented by google and apple themselves, providing improved power management, relative contact positioning, safety against manipulation and the like.
tracing may either be performed on a voluntary basis or enforced. the knowledge of being a carrier (positive testing) does not provide benefits to people without or with marginal symptoms. it rather puts them into quarantine and thus reduces their freedom of movement. quarantining carriers has a huge benefit for society, however. thus, the incentives to individuals are purely ethical, which seems to be sufficient at the time. we therefore focus on the voluntary approach but provide some hints for enforcement as well.
• in a preparatory phase, the user installs the tracing app. in the case of enforcement, the app creates a connection of data from an official id-card and the device and then registers the user with that data. this creates a permission to roam and is communicated to the mobile operator. it can furthermore be used to prevent a number of manipulations to evade quarantining, for example. in the case of a voluntary roll out, this registration does not exist, and even in the mandatory case, it is only used to prevent manipulations and does in particular not create any additional means of tracking.
• every day, the app chooses a random daily identifier my_rdi, which it broadcasts at regular intervals using a bluetooth protocol (see section v). the identifier provided by the device is c f d |my_dri. the randomness of the my_rdi prevents any correspondence with a particular device or user. it is changed daily to prevent tracking by any fixed monitoring stations.
• in parallel, the device searches for the beacons of other devices. this monitoring is performed every seconds. whenever the device detects an identifier of the form c f d |fg_dri for the first time, it adds fg_dri to its list my_ctc and stores the current time (in second units). if it sees the identifier again, it updates the duration of the contact. in total, there are two-minute intervals in hours. assuming that someone is surrounded by up to people during hours would lead us to entries. there is no difficulty in storing that number, but this exposes the importance of applying simple filtering to control the complexity of later processing steps.
• whenever the gesundheitsamt updates its list ga_ctc, which is signed using its private key, the device checks for matches between ga_ctc and my_ctc. the increase in carriers is around per day in germany. the list shall include these entries as well as those of the day before, which is perfectly manageable. the random device identifier and the date must both match, since the identifier is changed every day. note that a very high level of anonymity is preserved up to this point.
• if there are matches in the device's list my_ctc and in the list of the gesundheitsamt ga_ctc, there are two different options:
- the device notifies its owner and asks him about his preferences. if the preference is to enter quarantine without further checking, no further action is needed and no information is ever exchanged.
- in all other cases, the gesundheitsamt and the device aim at categorizing the contact.
this requires a negotiation, which can be handled by a mailbox to prevent the disclosure of the person's identity. in advanced negotiations, the information from my_loc will typically be used in the process of categorization.
• once the category of a contact is determined, the gesundheitsamt either asks the person to quarantine herself and organizes testing, or just drops the alert if the contact was uncritical. in the latter case, no further data is exchanged and the data associated with the inquiry is erased.
• in the case of a critical contact, the gesundheitsamt invites the person for testing. all exchanges can again be handled through a mailbox. this again does not require the disclosure of the identity of the person. if the testing is twice negative, the person leaves the quarantine and the data is erased.
• in the case of a positive testing, the app provides the contact history my_ctc from the beginning of the estimated infection period to the gesundheitsamt. the disclosure of the identity of the person is not needed for pandemic control. the app maintains the list of locations from the estimated infection onward in order to respond to further inquiries from the gesundheitsamt.
• the device continues comparing its list my_ctc with later provisions of ga_ctc. this is necessary due to the significant delay before some carriers are found, and since it is the last contact that determines the end of the quarantine period.
• whenever the gesundheitsamt receives a list of my_ctc including the timing and the duration, it will add the random identifiers to its list ga_ctc. depending on the evolution of the pandemic and future experience, it may decide to only trace contacts to category or to both categories. it will add these contacts to its list and publish a signed copy of ga_ctc at regular intervals. as a consequence, listed identifiers will trigger an inquiry of the associated devices with the gesundheitsamt to ask for categorization.
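the device-side steps of the list above (daily random identifier, contact logging, matching against the published list) can be sketched as follows. this is a minimal sketch under stated assumptions, not the authors' implementation: the names new_daily_identifier, ContactLog, observe and matches are hypothetical, the identifier length is arbitrary, the contact duration is approximated by adding one scan interval per repeated sighting, and the verification of the gesundheitsamt's signature on the published list is omitted.

```python
import secrets
from dataclasses import dataclass, field
from datetime import date


def new_daily_identifier(n_bytes: int = 16) -> str:
    """Draw a fresh random identifier as a hex string; meant to be called once per day."""
    return secrets.token_hex(n_bytes)


@dataclass
class ContactLog:
    """Local contact list (the role of my_ctc).

    Maps (day, foreign identifier) -> [first_seen_s, duration_s].
    """
    entries: dict = field(default_factory=dict)

    def observe(self, day: date, identifier: str,
                now_s: int, scan_interval_s: int) -> None:
        """Record one sighting of a foreign beacon."""
        key = (day, identifier)
        if key not in self.entries:
            # First sighting: store the time, duration starts at zero.
            self.entries[key] = [now_s, 0]
        else:
            # Assumption: each repeated sighting adds one scan interval.
            self.entries[key][1] += scan_interval_s

    def matches(self, published: set) -> list:
        """Return local entries whose day AND identifier both appear in the
        published carrier list (the role of ga_ctc)."""
        return sorted(key for key in self.entries if key in published)
```

matching on the pair (day, identifier) reflects the requirement that the random identifier and the date must both match. nothing leaves the device in this sketch; only a non-empty result of matches would start the categorization exchange.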
once every user device has performed its matching, there will be no unidentified hits in the past. thus, the gesundheitsamt can erase all non-public information associated with the published list. since some devices may not have contact to the gesundheitsamt for a few hours, there should be a margin in erasing this data, e.g. one extra day.
from an epidemiological perspective, users that are quarantined would ideally be tracked. the procedure is straightforward: whoever leaves the location of the quarantine is warned. in the case of a continued breach of rules, the gesundheitsamt is informed and takes action. from this time onward, the person could be continuously tracked to support her repatriation into her quarantine zone. this is certainly controversial and not too compatible with a voluntary tracing. it may be activated if enforcement of tracing turns out to be necessary. currently, this seems not to be the case.
the tracing described above is meant to control the pandemic and to enable a restart of the economy, while keeping citizens as protected as possible. in the case of a voluntary use of the system, the main threats are attacks on the privacy of users. they are not only serious but may additionally jeopardize the acceptance of tracing as a method to control the pandemic. in the case of enforced tracing, there are additionally options for evading tracing or tracking. this is mentioned but not discussed in any depth.
the primary line of attack to access the personal profile of a particular person is through the app. thus, the app needs a thoughtful design and implementation. this is, however, a requirement, which it shares with any other software using personal data and localization. a similar statement holds for the software run on the server of the gesundheitsamt. it should avoid any deficiencies but is still exposed to exploits of the operating systems and the like. we also assume that the public key cryptosystem is secure in the relevant time.
the database of the gesundheitsamt is only of limited interest, since it contains very little information and since the data is not personalized. the bigger threat is the impersonation of the gesundheitsamt. it may lead to a number of options, which mostly don't have a clear benefit, like
• the removal or addition of contacts.
• the false categorization of contacts.
• the undue convocation of people to testing.
• the quarantining of healthy people.
the most influential possibility is to add a carrier to ga_ctc and thus to retrieve the list of his contacts. this, however, requires finding a valid random daily identifier, e.g. by creating an explicit contact to a person, as well as a major software bug at the gesundheitsamt, e.g. one exposing its private key. other sophisticated attacks are conceivable, e.g. using a network of cooperating bluetooth units to profile users by tracking their passage near those units. this is not particular to the present system, however. otherwise, we did not find any other obvious critical attack so far. in the end, the usefulness of tracing carriers of covid- and of restoring normality to our daily life has to be balanced against fears of potential attacks.
the consequence of having been in contact with an infected fellow citizen is to become quarantined. some people may want to avoid that, even in the case of enforcement. most options such as roaming without an active device, breaking quarantine rules, using different devices, uninstalling and reinstalling the app or cheating during the categorization can all be handled by appropriate measures. they will have to be addressed if enforcement is really desired. this is currently not the case.
an implementation of the above system could easily be performed by the companies google for android devices and apple for ios devices.
a more detailed design will need a further specification of the protocols, which should be done jointly to achieve the fastest possible availability of a fully inter-operable system.
we studied different mechanisms provided by bluetooth in application programming interfaces (api). ibeacons, which is a protocol used for indoor location services, became our initial candidate. this protocol allows devices to broadcast identifiers, which are received by other devices in the neighborhood. the received signal strength can be used as an indicator of the transmitter to receiver distance. the concentration of transmission and monitoring around seconds intervals of the time of the day can be used to implement a simple form of power management.
the focus of our testing was on verifying the possibility of using a mechanism provided by an api. thus, we implemented an app on ios to transmit ibeacons and used the nrf connect for mobile app to monitor these beacons. this worked whenever the app was in the foreground of the ios device. the transmission was, however, discontinued whenever the app was sent into the background. as a consequence, we implemented an alternative approach using the standard bluetooth low energy (ble) protocol. a corresponding app was written for ios and another one for android. both apps implement the beacon transmission and beacon monitoring. the source code can be downloaded from https://github.com/danielgnt. the subdirectories bletrack-android and bletrack-ios contain the associated code. these apps could successfully monitor beacons between android phones as well as between ios and android phones. all associated trials worked with the apps in the background on both phones. however, we could not get the ios to ios scenario working with both apps in the background. it only works when one of the apps is in the foreground, which is not sufficient.
if this could be solved, a large community of programmers could implement the tracing system described above.
the present paper presents an automated, privacy preserving tracing method based on bluetooth radio contacts, which consequently works indoors, where people come closest to each other. the approach uses random daily identifiers to trace contacts. the randomness and daily updates prevent most attacks on privacy. the information needed to trace contacts is maintained locally in the personal device. the health agency "gesundheitsamt" is a trusted authority, which only stores contact profiles of positively tested people. this data does not have to include any means of identification of a physical person.
the next step in bringing this approach to reality would be to set up a task force designing the details of the protocol, as well as implementing and testing the mobile and server components. the aim should be a quick and stable initial operational system. the outcome should be further optimized in a second phase to improve contact classification in order to reduce unnecessary testing and quarantining.
modellierung von beispielszenarien der sars-cov- -ausbreitung und schwere in deutschland
impact of non-pharmaceutical interventions (npis) to reduce covid- mortality and healthcare demand
apps gone rogue: maintaining personal privacy in an epidemic
covid- und häusliche quarantäne: flyer für gesundheitsämter
indoor localization accuracy of major smartphone location apps
kontaktpersonennachverfolgung bei respiratorischen erkrankungen durch das coronavirus sars-cov-
key: cord- - ledrw j authors: majumdar, arnab; mehta, anita title: heterogeneous contact networks in covid- spreading: the role of social deprivation date: - - journal: nan doi: nan sha: doc_id: cord_uid: ledrw j
we have two main aims in this paper.
first we use theories of disease spreading on networks to look at the covid- epidemic on the basis of individual contacts; these give rise to predictions which are often rather different from those of the homogeneous mixing approaches usually used. our second aim is to look at the role of social deprivation, again using networks as our basis, in the spread of this epidemic. we choose the city of kolkata as a case study, but assert that the insights so obtained are applicable to a wide variety of urban environments which are densely populated and where social inequalities are rampant. our predictions of hotspots are found to be in good agreement with those currently being identified empirically as containment zones and provide a useful guide for identifying potential areas of concern.
the global crisis caused by the onset of the novel coronavirus (covid- ) pandemic has caused a flurry of academic activity across many disciplines, ranging from epidemiology to statistical physics. most ongoing statistical physics research has been geared towards getting concrete answers in terms of infected populations, deaths, and preventive measures as quickly and simply as possible. this is, of course, an extremely useful approach to take, given that policy-makers need simple models to craft broad and easily understandable solutions. in our opinion, however, a balance needs to be struck between simplicity and accuracy in the interests of efficiency of outcome. it is for this reason that we focus on an approach that is both more rigorous and more intuitive, which asserts that it is not sufficient to treat this problem within the homogeneous mixing approaches so far used by physicists; we need in fact to focus on the contact networks of individuals, so that we can account for the stark difference in impact of those who have a larger number of contacts and are much more likely to contract the infection and subsequently propagate the disease, than those who live relatively isolated existences.
* corresponding author: anita.mehta@ling-phil.ox.ac.uk
the susceptible-infected-removed (sir) model is one of the first to have been used for disease propagation [ , ] , and consists of a population that is susceptible, some of whom can be infected, while others are removed (recover or die). this has been widely used in the current pandemic and has given rise to the popular use of the parameter R_0, which is the number of individuals that are on average infected by a person who is infected. the simple idea behind this is that for values of this parameter greater than , the disease spreads and will eventually become an epidemic (at a rate proportional to R_0), while if the number of people infected by an infected person is kept well below , the epidemic will die out. this reasoning overlooks the fact that such a model assumes that the parameter R_0 is universal across a population, i.e. everyone has an equal likelihood of transmitting the infection to the same number of people. this is based on an assumption that the population is homogeneous and well mixed, and that everyone is capable of transmitting the infection to everyone else. such an "on average" assumption, however, fails dangerously in the case of epidemics. newman [ ] was the first to take into account that individuals needed to be resolved in terms of their 'degree distribution', i.e. the number of people that they were in contact with, and his pioneering solutions to the disease propagation network have since been widely used [ ] [ ] [ ] [ ] for epidemics ranging from hiv to sars- . in the following section (materials and methods), we review some of the formalism that is relevant to our modelling.
one of the important fallouts of the networks approach is the natural occurrence of hotspots, i.e. of regions of high connectivity, which are particularly vulnerable to the spread of disease.
given the lack of widely available contact network data for such situations, we focus on the conditions of strict lockdown and postulate that the size of an individual's contact network is primarily determined by the size of their household. regions where household sizes are large are often those which are socially deprived, e.g. when low-income groups live in cramped conditions. these poorer areas provide flashpoints for the propagation of disease even in cities where most of the population live in more privileged conditions. we use available data on the city of kolkata in india as an example of a city with a high population density which contains many areas of social deprivation, and show how both factors contribute strongly to epidemic propagation. we begin with an analysis of the city as a whole, specialising next to the wards (local areas) that comprise it, so that the heterogeneities of contact networks are probed on smaller scales; we find that this more microscopic analysis has the effect of enhancing outbreaks, and speeding up transitions to epidemics locally. this tendency is even further amplified when we extend our analysis to the population of slum dwellers, where social deprivation adds to cramped living conditions to create even larger contact networks among people who are forced to access basic facilities together. one of the consequences of the above analysis is that we can outline a geographical map of hotspots where quarantining and testing should, in fact, be focused; our predictions are in good agreement with empirical estimates by the government and will, we hope, provide guidelines for future planning in the case of areas not yet identified empirically. we first provide a brief review of newman's seminal work [ ] on disease propagation on networks. 
individuals on a network are linked by disease-causing contacts, which newman uses to define the probability of transmission of the disease as the transmissibility T, in terms of which relevant quantities are defined. the contact network is defined by the number of contacts, or 'degree', k_i of each individual i. the degree k_i is a random variable drawn from a degree distribution p_k. here we follow newman in considering a network with degree distribution defined by
$$p_k = C\, k^{-\alpha}\, e^{-k/\kappa}, \qquad k \geq 1,$$
where C is a normalizing constant, α is the power-law exponent, and κ defines the cut-off. such a distribution is both flexible in expressing various real-world networks as well as stable [ ] . the importance of this in our case is to do with the fact that we would expect a lot of our degree distributions to have long tails, which are characterised by power-law distributions; these will be critically important for the eventual evolution to epidemics. however, power-law distributions are an idealisation to infinite systems and are, in reality, cut off by exponential tails [ ] . newman [ ] obtained closed analytical expressions for various entities based on this form of p_k. the normalization constant C, the mean connectivity ⟨k⟩ and the mean squared connectivity ⟨k^2⟩ can be computed as
$$C = \left[\mathrm{Li}_{\alpha}\!\left(e^{-1/\kappa}\right)\right]^{-1}, \qquad \langle k \rangle = \frac{\mathrm{Li}_{\alpha-1}\!\left(e^{-1/\kappa}\right)}{\mathrm{Li}_{\alpha}\!\left(e^{-1/\kappa}\right)}, \qquad \langle k^2 \rangle = \frac{\mathrm{Li}_{\alpha-2}\!\left(e^{-1/\kappa}\right)}{\mathrm{Li}_{\alpha}\!\left(e^{-1/\kappa}\right)},$$
where Li_n(x) is the n-th polylogarithm of x.
in the context of epidemics, we are concerned with the size of an infected cluster beginning with a single infected individual. by virtue of his or her connectivity, an infected individual could potentially infect a subset of k connected individuals who were initially susceptible but uninfected. the probability that an infected individual transmits the infection to a connected uninfected contact is defined as the transmissibility T. for low values of T, the number of transmissions does not reach epidemic proportions so that a relatively small cluster of people is infected.
beyond a critical threshold $t_c$ a transition occurs and a significant proportion of the population is infected. the threshold $t_c$ can be computed [ , , , ] as

$$t_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle} = \frac{\mathrm{Li}_{\alpha-1}(e^{-1/\kappa})}{\mathrm{Li}_{\alpha-2}(e^{-1/\kappa}) - \mathrm{Li}_{\alpha-1}(e^{-1/\kappa})}.$$

this is akin [ ] to the percolation threshold on the underlying contact network, defined purely by topological parameters. since $\kappa$ determines the cut-off for the power-law domain of the distribution $p_k$, the limit $\kappa \to \infty$ corresponds to $p_k \sim k^{-\alpha}$ and the results obtained for pure power-law distributions hold. equation ( ) takes the form

$$t_c = \frac{\zeta(\alpha-1)}{\zeta(\alpha-2) - \zeta(\alpha-1)},$$

where $\zeta(\cdot)$ is the riemann zeta function. as a consequence, $t_c < 0$ for $\alpha < 3$. however, for finite $\kappa$, $t_c$ can be finite even for these smaller values of $\alpha$. for $t < t_c$, the mean cluster size of infected individuals $\langle s \rangle$ can be computed as

$$\langle s \rangle = 1 + \frac{t\,\langle k \rangle}{1 - t/t_c}.$$

similar to percolation, this mean cluster size diverges at the transition as

$$\langle s \rangle \sim |t - t_c|^{-\gamma},$$

where $\gamma$ is the critical exponent. for $t > t_c$, the cluster of infected individuals $s(t)$ is a finite fraction of the population and can be computed by solving a self-consistent relation as shown in [ ] . again, analogous to percolation, $s(t)$ is the order parameter, which grows as $s \sim |t - t_c|^{\beta}$ for $t \to t_c^{+}$. in the limit $\kappa \to \infty$, we have a pure power-law distribution, where the scaling exponent $\beta$ is determined by the distribution exponent $\alpha$ [ ] . we note that the basic reproduction number $r_0$, which is the average number of people to whom an infected individual transmits the disease in homogeneous approaches, is related to the topological parameters $\alpha$ and $\kappa$, as well as the transmissibility $t$, via

$$r_0 = t\,\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle} = \frac{t}{t_c}.$$

the above relationship ensures that at the epidemic threshold $t = t_c$, $r_0 = 1$, as it should be [ , , ] . to estimate the parameters of our model we make use of publicly available data from the census of india, [ ] . we use the hh- city dataset which contains the number of households with sizes , , , , , , - , - , and + for the entire city of kolkata. from this distribution, we compute the distribution of contact degree $k$.
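the threshold quantities in the review above can be sketched directly from the first two moments of any degree distribution. the block below assumes the standard bond-percolation results from newman's framework: threshold $\langle k \rangle / (\langle k^2 \rangle - \langle k \rangle)$, reproduction number equal to the transmissibility divided by that threshold, and sub-threshold mean cluster size $1 + t\langle k \rangle / (1 - t/t_c)$. the function names and the toy distribution in the usage note are illustrative assumptions, not taken from the paper.

```python
def moments(p):
    """Mean and mean-square degree of a distribution given as {k: p_k}."""
    mean_k = sum(k * pk for k, pk in p.items())
    mean_k2 = sum(k * k * pk for k, pk in p.items())
    return mean_k, mean_k2


def critical_transmissibility(mean_k, mean_k2):
    """Epidemic threshold t_c = <k> / (<k^2> - <k>)."""
    return mean_k / (mean_k2 - mean_k)


def reproduction_number(t, mean_k, mean_k2):
    """r_0 = t * (<k^2> - <k>) / <k>, which equals t / t_c."""
    return t * (mean_k2 - mean_k) / mean_k


def mean_cluster_size(t, mean_k, mean_k2):
    """Mean outbreak size below threshold; diverges as t approaches t_c."""
    t_c = critical_transmissibility(mean_k, mean_k2)
    if t >= t_c:
        raise ValueError("mean cluster size is only defined for t < t_c")
    return 1.0 + t * mean_k / (1.0 - t / t_c)
```

for a toy distribution with half the individuals at degree 1 and half at degree 3, `moments({1: 0.5, 3: 0.5})` gives a mean of 2 and a mean square of 5, so the threshold is 2/3 and the reproduction number is exactly 1 there.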
we assume that for a household of size $h$, each of the $h$ members has a degree $k = h - 1$ [ ] . combining this with the number $n_h$ of households obtained from the data, we construct the cumulative distribution $p_k$. the cumulative distribution is matched to our model from eq. ( ) by fitting the parameters $\alpha$ and $\kappa$. since $\alpha$ is connected to the fundamental structure of the contact network, we fix its value for all subsequent estimations and adjust only the cut-off parameter $\kappa$. we use the primary census abstract ddw-pca, which is granular to the level of the wards within the kolkata municipal corporation (kmc), to obtain the heterogeneity of the contact networks within the city. for this purpose, the contact network is adjusted to match the mean household size for the ward, while the computation of the mean cluster size above $t_c$ is scaled to the population of the ward. some of the information about slum populations is extracted from the "kolkata municipal corporation percentage of slum population to total population" published for the census of india [ ] . each ward is then segregated into "slum" and "non-slum" sub-populations. for the non-slum population, the computation in the paragraph above is retained. for the slum population, we rely on more recent data [ ] which suggest that % of the households do not have in-house sanitation facilities or water supply and are thus forced to be in contact with at least one other household, thereby increasing the size of the contact network of each member living there. the network formalism is, as mentioned above, the most appropriate one to examine the transmission of covid- since all transmission takes place through human contacts. individual contact networks completely determine the spread of the infection: infected people who live secluded existences have few, if any, people to infect in turn, while those with large familial and social networks are capable of infecting many people once they are themselves infected.
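the household-to-degree construction described above can be sketched as follows. we assume that every member of a household of size $h$ has degree $h - 1$, and that the degree distribution is taken over individuals, so a household of size $h$ contributes $h$ people; the second point is our reading rather than something the text states. the household counts in the usage note are invented, and the binned census categories are not handled. the cumulative distribution is computed here in its complementary form, the probability of degree at least $k$, which is also an assumption.

```python
def household_degree_distribution(household_counts):
    """Map {household size h: number of households n_h} to {degree k: p_k}.

    Each of the h members of a household of size h gets degree k = h - 1,
    and households are weighted by their number of members.
    """
    population = sum(h * n for h, n in household_counts.items())
    return {h - 1: h * n / population for h, n in sorted(household_counts.items())}


def complementary_cumulative(p):
    """Return {k: P(degree >= k)} for a degree distribution {k: p_k}."""
    ccdf = {}
    tail = 0.0
    for k in sorted(p, reverse=True):
        tail += p[k]
        ccdf[k] = tail
    return dict(sorted(ccdf.items()))
```

with hypothetical counts `{1: 10, 2: 20, 3: 10}` the population is 80, so the degrees 0, 1 and 2 carry weights 0.125, 0.5 and 0.375.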
as policy-makers realise this, the importance of tools that provide data on contact networks is increasingly recognised both at the levels of academia [ ] and government [ ] . however, the difficulty of getting accurate data, as well as the fact that people are mobile in general and tend to infect people even without knowing them (e.g. in public places), makes this a difficult enterprise. although efforts are currently in place to estimate global data on the way people move, at a macroscopic level (i.e. without reference to individual infected people) [ ] , it may be a while before accurate data on contact networks, relevant to the spread of infection, are publicly available in most democratic countries. this inherent complexity has had the consequence that much of the discussion among scientists and policymakers has centred around 'homogeneous' sir theory-related approaches, which are easier to understand and which, via the somewhat sweeping assumption that every individual is equal on average to every other from the point of view of infection-spreading, are much more tractable. while there are situations [ ] where such assumptions of homogeneous mixing may well hold, there are many more situations where local and heterogeneous aspects are critical, and where predictions from homogeneous models can be somewhat misleading. we demonstrate this here in the context of the city of kolkata, which captures two aspects critical to our thesis: strong heterogeneity in terms of personal contact networks, as well as areas of great social deprivation, both of which, as will be seen, can lead to the rapid spread of epidemics. while this is a specific choice made by our access to publicly available data [ , ] , its relevance is global.

fig. . the cumulative household size distribution for the city of kolkata from census data [ ] . the total population size is , , , the number of households is , , , and the mean household size is . .

social deprivation and high population densities among migrant workers in singapore have recently been held responsible for a second wave of covid- spreading [ , ] , and similar conditions among migrant workers in the uae are a portent of similar trends. we assert therefore that our highlighting of these issues in the context of kolkata has global and urgent relevance to the current pandemic. in the first subsection we focus on the role of heterogeneity, where infections spread via the contact networks of individuals, while in the second, we focus on the role of social deprivation in infection spreading. in most democracies such as india, it would be considered a violation of privacy to have extensive lists of individual contacts made publicly available; additionally, the surveillance required to gather details of where people move and thus whom they might infect in public places, would be even more a violation of democratic rights. we, therefore, focus on conditions of strict lockdown, which are (at least theoretically) valid in india as this paper is being written. under these conditions, we postulate that the contact network of an individual is limited to contacts within his or her household. for kolkata, publicly available data [ ] leads to the plot in fig. of the cumulative probability distribution of the household size. this leads to the plot in fig. , which is the cumulative distribution of the number of contacts a person has (blue dots) which was subsequently fitted (red line) by a power-law/exponential form [ ] . the fit preserves the mean contact network size of k = . to a very good approximation, and yields the parameters α = . , κ = . , which will be used in the analysis that follows. using the analysis of the previous section, eq. ( ), we obtain the curve of infected people vs transmissibility shown in fig. . the prediction for a transition to an epidemic is at a critical value of t c = . , which is much lower than t c = . 
for a homogeneous model with the same k ; the homogeneous model thus clearly underestimates the risk of the epidemic here and in general. the mean cluster size of infected people diverges with an exponent γ of (see eq. ), as shown in fig. . the above city-wide prescription is strongly modified when we look at granularity at the ward-level -a ward in kolkata is defined as a locality for which census data are available [ ] in terms of various demographics. wards are a first indicator of heterogeneity since some are more densely populated than others, as shown in fig. a . the household size distribution per ward, which is independently available from the data, is given in fig. b . as we might expect, there is a reasonable correlation between the two (fig. ) , i.e. areas where there is a high density of population are also those where the household sizes are large. these seem to be clustered to the north and the west of the city (figs. a and b ). the heterogeneity introduced by wards in terms of their household sizes introduces important heterogeneities into the contact networks of the individuals in them. using the formalism above, this in turn introduces important differences in the number of individuals infected, since the populations of the wards have distinct characteristics. as a consequence, the transition to an epidemic sets in much faster if we use this level of description, as is seen in fig. . since, under strict lockdown, it is reasonable to assume that people will interact more within their wards than city-wide, we believe that the orange curve is a better representation for the spread of the epidemic than the blue (city-wide) curve. another way of seeing this is to say that if we assume that the wards are self-contained, each one is associated with a specific critical transmissibility t c (fig. ) . the epidemic spreads as soon as t c is attained within a ward, with a corresponding explosion in the number of people infected. 
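the curve of infected fraction against transmissibility can be sketched numerically from a degree distribution. the block below assumes the self-consistent relation of newman's generating-function framework: with $g_0(x) = \sum_k p_k x^k$ and $g_1(x) = g_0'(x)/g_0'(1)$, one solves $u = g_1(1 - t + t u)$ for the smallest root in $[0, 1]$ and takes the outbreak fraction as $1 - g_0(1 - t + t u)$. the function name and the fixed-point iteration are our choices for illustration; the result is a fraction, which would still need to be scaled by the city or ward population.

```python
def outbreak_fraction(p, t, tol=1e-12, max_iter=100_000):
    """Fraction of the population infected in an epidemic, for {k: p_k} and transmissibility t."""
    mean_k = sum(k * pk for k, pk in p.items())

    def g0(x):
        return sum(pk * x**k for k, pk in p.items())

    def g1(x):
        return sum(k * pk * x ** (k - 1) for k, pk in p.items() if k > 0) / mean_k

    # Iterating upward from u = 0 converges to the smallest fixed point in [0, 1].
    u = 0.0
    for _ in range(max_iter):
        u_next = g1(1.0 - t + t * u)
        converged = abs(u_next - u) < tol
        u = u_next
        if converged:
            break
    return 1.0 - g0(1.0 - t + t * u)
```

below the threshold the iteration converges to $u = 1$ and the returned fraction is numerically zero; above it, the fraction rises continuously from zero, which is the transition seen in the curves discussed above.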
note that the higher the $t_c$, the slower the spread of infection. the most vulnerable areas in terms of epidemic spread are to the north and west of the city (lightest colours), which correlate well with the areas of high population density and household size shown in figs. a and b. we have designed a web application [ ] to demonstrate the spread of the epidemic. fig. a demonstrates this for a single value of the transmissibility $t$ = . which is the threshold for the city taken as a whole. the heterogeneous nature of outbreaks is apparent, as some vulnerable wards (mainly to the north, centre and west) have an infected population numbering several thousands, while others (mainly in the south) are yet to see any significant infections at all. as $t$ increases (e.g. by easing the lockdown and increasing the frequency of contacts between people), more wards would cross their local critical threshold and begin to see significant infections.

fig. . a linear scatter plot of mean household size vs. population density in kolkata. the histogram of population density is shown on top of the main scatter plot, while that for mean household size is on the right of the main scatter plot.

while some of the hotspots predicted in fig. a have not yet been empirically observed, we assert that these are areas of potential risk. we compare these results of our model to the situation in the city as of late april . since the locations of infectious clusters have not yet been published, we look at the number and locations of those regions which have been designated as containment zones. containment zones are set up by the government to contain the disease within a defined geographical area (usually a city block) following multiple confirmed infections in the area, with a view to breaking the chain of transmission and preventing the spread of the infection to new areas. each zone is geographically quarantined with enhanced active surveillance [ ] .
as of april , , out of such containment zones in the city of kolkata, have been identified by ward [ ] . figure b shows the number of such zones within each ward. although most containment zones in fig. b are geographically aligned with the areas we have identified as being at risk (fig. a), there are a few areas of local outbreaks which our model did not predict in the south-east of the city. this area was urbanized relatively recently and became densely populated in the period after , the date of the last census [ ] , which is the source of our data for this part of our research. before leaving this subsection, we summarise the nature of our findings. heterogeneity of contact networks plays a crucial role in the transmission of disease, even if we take a macroscopic viewpoint on the population of a city like kolkata. when we take a look at individual wards and use available data to construct more realistic contact networks of individuals, we note that areas of high population density are strongly correlated with large household sizes, and so, within our present approximation, with extended contact networks. these are in turn of crucial importance in the spread of disease, as our predictions demonstrate, predictions which in fact are well correlated (fig. ) with existing governmental preventive approaches [ ] .

fig. . the number of people infected as a function of transmissibility $t$ using a) a ward-based picture (orange line) and b) a city-based picture (blue line). the granularity of the ward-based picture results in strong heterogeneities of contacts, which allow for a faster spreading of the epidemic than if one assumes the more macroscopic picture of contact networks following the same degree distributions across the city. while the city-based picture corresponds to a single $t_c$ = . (fig. ), the ward-based picture allows a distribution of $t_c$ from . to . .

in the next subsection, we will focus on areas of high social deprivation.
these could, in general, be migrant housing in singapore or dubai, but in the present instance, are based on data on the slums in kolkata. it will be seen that such areas are particularly vulnerable to becoming hotspots for disease transmission. we have used existing data [ ] on slums in kolkata as well as publicly available census data [ ] for our analysis in this section. in addition to dismal living conditions wherever overcrowding is the norm (and a major mechanism for the forced enhancement of human contacts), a major indicator of social deprivation is the lack of literacy that usually obtains in slums. the latter is particularly important in the context of covid- spreading since it translates into a lack of awareness for the very necessary preventive measures at an individual and collective level that would help fight the virus. we first use the census data to look at the lack of literacy in the slum population. fig. a shows the ward-by- ward fraction of slum dwellers in kolkata; this appears to be the complement of fig. b , which shows the fraction of literate people computed ward-by-ward for the city. in fig. , we quantify the above picture by scatter plots showing the correlation between (a) literacy and household size, and (b) the fraction of population living in slums and literacy. all the data clearly show there is a strong negative correlation between literacy and household size, and that in slums, in particular, literacy rates tend to be low. we now use recently collected data specific to slums in kolkata [ ] that, in addition to providing a household size distribution for them, also provides a measure of deprivation, in this case, due to overcrowding. many slum families share toilet facilities and often depend on public borewells to get their water. from the point of view of our research, this, even under strict lockdown, forcibly extends their contact networks. in fig. 
a we show that the household size distribution for slums [ ] is only slightly larger overall than that for the overall population computed based on census data [ ] . however, the fact that % of the slum population in kolkata share toilet facilities, as opposed to % who have private toilets, causes a dramatic change in the degree distribution of contacts as shown in fig. b . the orange curve corresponds to the degree distribution for slum dwellers with private toilet facilities, while the blue one corresponds to that where facilities are shared with at least one other family, so that with the percentages mentioned above, the resulting degree distribution is given by the green curve. as we will see, this sharing of facilities has a dramatic effect first on the degree distribution (fig. a), and next on the critical transmissibility $t_c$ (fig. b) for the slums. for both figs. a and b, the green distribution represents the entire population, while the blue one is specifically for the slums. we note that the sharing of facilities such as toilets leads to a large effective shift in the mean degree distribution for the slums vis-à-vis that of the general population (cf. the shift of the blue curve from the green curve in fig. a), despite our rather conservative estimate of this (see the discussion for further details). this has an even more dramatic effect on the critical transmissibility $t_c$ for slums, as will be seen in fig. b . the sharp peak for the slum population sets in at a much lower value of $t_c$, so that the slum population is much more vulnerable to epidemics than the general population taken as a whole. in fig. a , we take into account the effect of the slum population obtained from the data of [ ] to compute the number of people infected ward by ward (orange curve), to be compared with the city-wide computation (blue curve). we notice a much sharper increase in the number of people infected, relative even to the ward-by-ward graph of fig. . as a result, the effective mean reproduction number $r_0$, obtained from eq. ( ), is greater for the slum population than the general population for the same value of $t_c$, as will be seen from fig. b.

fig. . the probability density function of ward-by-ward (a) mean degrees, and (b) transmissibility $t_c$, for the general (green) and the slum populations (blue) of kolkata. mean degrees are shifted to the right for the slum population [ , ] . the $t_c$ curves indicate that the critical transmissibility $t_c$ sets in much earlier for the slum population than for the general population.

fig. . (a) comparison of city-wide transmissibility curve which shows a transition at . with the ward-by-ward transmissibility curve, now including the effect of slums, which shows that the transition to an epidemic occurs even earlier than what is predicted by the ward-by-ward graphs in fig. . (b) the probability density function of the ward-by-ward basic reproduction number $r_0$ for the general (green) and slum (blue) populations in the city.

a. key findings of our study, in the context of past approaches

most approaches to date on the covid- spread have involved homogeneous mixing (see e.g. [ ] ), whereas ours is based on heterogeneous mixing, depending on the degree of the nodes of the underlying contact network. the main effect of this heterogeneity is to reduce the threshold to the epidemic, so that the epidemic spreads faster than it would with homogeneous mixing. our aims in this paper are, therefore: first, to insert the heterogeneity of contact networks in theoretical approaches to model the spread of covid- and second, to use this formalism to provide an insight into the role of social deprivation. we have looked at slums as a source of high-degree connectivity, which, combined with low economic development (e.g. the anti-correlation with literacy), provides explosive ingredients for a transition to an epidemic.
both of these ingredients have the effect of considerably advancing this transition in the areas where they are prevalent, and hence to the entire population of the city of which they are part. while our data analysis is specific to kolkata, it is very much more general in its applicability to cities with areas of high connectivity and social deprivation, of which the recent example of singapore [ , ] is only one example [ ] . our approach is also able to identify hotspots on the basis of individual contact networks in a scientific and objective way, thus allowing for an impartial way of identifying areas where containment should be enforced as an overall means of prevention. our predictions are in good agreement with empirically obtained government data [ ] , where the latter identify containment zones as areas where infections have already occurred. on the other hand, our approach goes further, allowing for such identification even before infections have occurred, based on household sizes and individual contact networks obtained therefrom; this would form a good basis for preventive measures. without access to data on individual contact networks, we are limited to household sizes as measures of degree distributions, which they are only under (difficult to implement) conditions of strict lockdown. from this point of view, the level of agreement between our predictions and empirical data on containment zones is indeed remarkable. however, we assert that our results should be interpreted qualitatively rather than quantitatively as a means of highlighting the difference between predictions arising from heterogeneous and homogeneous contact networks; for example, our predictions for t c show clearly that contact network heterogeneity causes infections to spread much more rapidly than might be imagined on the basis of homogeneous theories. another limitation arises from the availability of data on slums. 
in [ ] , slums in kolkata were studied via sampling methods, without any information on their exact location. we have therefore had to study the slums collectively, without any information on where exactly, and how extensive, they are in any particular area. the best that we have been able to do is, via census statistics on the percentage of slums per ward [ ] , compute the ward-based slum statistics to give a qualitative estimate of the ward-based t c . obviously, the approximation involved in doing this, i.e. treating the slums per ward as a contiguous unit when they are in fact distributed, is considerable, but in the absence of data, this is an approximation that we are forced to make. another approximation involves the way in which we compute the revised degree distribution for slums due to shared toilet facilities. in the absence of spatial information on how many families share these facilities, we have taken a conservative approach, assuming that a family of size n shares facilities with only one other family of size n , and using this idea to construct the relevant distributions. of course in actuality, many families may well share the same toilet, (or indeed the same access to drinking water via public borewells, which we have not even considered), so we are, if anything, underestimating the severity of the resulting overcrowding and epidemic spread. another point in this context concerns what we mean by the "general" population. the census data [ ] that we have based this on includes people of all classes, including slum dwellers, in the city of kolkata; the specific data on slums [ ] is both more recent and focuses only on the slums considered in kolkata. we reiterate therefore that, as mentioned above, our estimates should be taken as a conservative qualitative indicator of the effects of social deprivation, rather than an accurate quantitative estimate. 
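the conservative shared-facility construction described above can be sketched as a mixture of two degree distributions. we assume, as one reading of the text, that a member of a family of size $n$ with private facilities has degree $n - 1$, while a member of a family that shares with exactly one other family of the same size $n$ is in contact with $2n - 1$ people, i.e. degree $2k + 1$ in terms of the household degree $k$. the function name, the input distribution and the sharing fraction in the usage note are hypothetical.

```python
def shared_facility_distribution(p, shared_fraction):
    """Mix private and shared-facility degrees for a household-based {k: p_k}.

    A fraction `shared_fraction` of individuals is assumed to share facilities
    with one other family of equal size, moving degree k = n - 1 to 2n - 1 = 2k + 1.
    """
    mixed = {}
    for k, pk in p.items():
        # Individuals with private facilities keep their household degree.
        mixed[k] = mixed.get(k, 0.0) + (1.0 - shared_fraction) * pk
        # Individuals sharing facilities also contact the second family.
        k_shared = 2 * k + 1
        mixed[k_shared] = mixed.get(k_shared, 0.0) + shared_fraction * pk
    return dict(sorted(mixed.items()))
```

for an invented distribution `{1: 0.5, 2: 0.5}` with half of the individuals sharing, the mean degree rises from 1.5 to 2.75, which illustrates why even this conservative assumption shifts the threshold noticeably.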
on a more positive note, our methods can be applied, if the requisite data become available in a rigorous and detailed format, to areas of social deprivation worldwide with a view to estimating their vulnerability to epidemics. last but not least, while our predictions on hotspots in fig. a are quite well correlated with the empirical data on containment clusters in fig. b , we mention here that our predictions are based on census data, when some of the areas in the south-east of the city were far less developed and populated than they now are. had we had access to recent data, this new demography would have been reflected in revised contact network data, and consequent vulnerability to hotspots. there are several ways in which our approach will integrate into current developments. first, there is an emphasis in many countries such as india on containment zones in hotspots -this is done by looking in hindsight at available data and deciding that areas should be declared as containment zones based on existing infection and death rates, rather than by using network theory. on the other hand, our network-based theoretical approach is predictive; i.e. we can, on the basis of degree-based data, predict where hotspots could occur, and take a preventive rather than curative approach. also, and importantly, our approach provides a non-controversial way of suggesting hotspots, without touching on sensitive ethnic or religious characterisations. another way is for our methods to be used in conjunction with contact tracing apps that are being developed in several countries. an issue that some of these have is that the identification of contacts is done via telephonic links, rather than individual and intimate contact, which is typically how infections spread. 
if, for example, people in hotspot areas and their contacts within the same neighbourhood are identified, this would be a way of extending the contact network from beyond the household to a larger range; this would reflect conditions where lockdowns are gradually relaxed so that people can move around within a given radius of where they live. most importantly, policy-makers in several countries are implementing many empirical measures relating to the gradual relaxation of lockdown, at the time of writing of this paper. these usually involve maintaining strict lockdown in areas that continue to be at risk, while relaxing restrictions in those where there have been few, or no, infections within the recent past. our formalism, with its detailed predictions of levels of risk in the component areas of a large city, would be invaluable in assisting this process by identifying regions that could be, or should not be, opened up. this is of special importance as economic contingencies compel the world to exit strict lockdowns where possible, while public health contingencies demand that the covid- epidemic is contained to the extent possible. from a scientific point of view, we would like to look at the temporal evolution of the covid- epidemic, in different national contexts, from a network-based point of view. more generally, given the threat that humankind faces from this pandemic, we would welcome collaborations with governmental or other agencies in the design of realistic counter-measures. declaration the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. 
the mathematical theory of infectious diseases and its applications
infectious diseases of humans
spread of epidemic disease on networks
network theory and sars: predicting outbreak diversity
epidemic processes in complex networks
predicting epidemics on directed contact networks
contact network epidemiology: bond percolation applied to infectious disease prediction and control
on the critical behavior of the general epidemic process and dynamical percolation
office of the registrar general & census commissioner
household secondary attack rate of covid- and associated determinants
a comparative study of living conditions in slums of three metro cities in india
contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project. preprint, medrxiv
south korea is reporting intimate details of covid- cases: has it helped? nature news
oxford covid- impact monitor
epidemics with containment measures
singapore's cramped migrant worker dorms hide covid- surge risk
singapore had a coveted coronavirus plan. what went wrong?
covid- networks
here's a list of all containment zones in west bengal's kolkata
sick, stranded and broke: covid- crisis hits gulf's migrant workers

anita mehta is grateful to the max planck institute for discrete mathematics, leipzig and the centre for linguistics and philology, university of oxford for their support.

key: cord- - dm asen authors: joo, jaehun; shin, matthew minsuk title: resolving the tension between full utilization of contact tracing app services and user stress as an effort to control the covid- pandemic date: - - journal: serv bus doi: . /s - - - sha: doc_id: cord_uid: dm asen

although contact tracing apps can be effective for controlling covid- , the app usage can be stressful for users. this study identifies countermeasures for users’ stress while maximizing full utilization of the apps.
this study presents the relationships among the stress factors, users’ appraisal, users’ emotion-focused coping, and the infusion to exert the full potential of the app through a structural equation model. the research model is validated by surveying health code app users. given the results of the study, the contact tracing apps could become a valuable tool to control covid- by removing app users’ privacy concerns. overcoming coronavirus disease is the largest pending global issue. despite worldwide efforts, there currently is no specific vaccine against covid- (cdc ) . meanwhile, health authorities are suggesting preventative measures such as tracing and isolating early-stage confirmed covid- patients and those who contacted the patients, along with social distancing, mask wearing, and hand washing (un ) . authorities around the world are temporarily permitting implementations of digital contact tracing apps on smartphones to find confirmed cases and trace their travel logs (kelion ; servick ) . recent research showed the effectiveness of digital contact tracing apps for epidemic control using mathematical simulations (ferretti et al. ) . the simulation, based on a mathematical model proposed by ferretti et al. ( ), provides evidence that digital contact tracing apps could stop the covid- epidemic, using the reproductive number, r = . calculated from data collected from the early covid- stages in china. some of the successful examples of contact tracing apps in use are the self-quarantine safety protection app required for south korean residents by its government and health qr code apps (hereafter called health code) mandated by the chinese government (cdschq a; gan and culver ) . there are two types of contact tracing technologies: a centralized system and a distributed system (ferretti et al. ; servick ).
the centralized system stores all contact tracing information on a central server using the global positioning system (gps) to trace travel logs of smartphone users (servick ). the distributed system uses bluetooth low energy (ble) technology to trace travel information and store data in individuals' smartphones (ferretti et al. ; servick ) . in the centralized system, collected data include information on when, where, and with whom an individual met, what the individual bought, and the activities the individual conducted. thus, while the centralized system is efficient as the quarantine authority can integrate and manage all relevant data, users are concerned about the authority's invasive surveillance powers (gallagher ) . in other words, while users do understand that contact tracing apps are needed during the pandemic, they also exhibit increased stress and anxiety over security issues related to their personal information (mozur et al. ) . in general, technology users feel more stressed under mandatory settings of technology acceptance than under voluntary acceptance (marakhimov and joo ) . despite users' stress, the mandatory acceptance of contact tracing apps is more effective at reducing the spread of covid- than voluntary acceptance. thus, because both are required by their governments, the self-quarantine safety protection app of south korea and the health code app of china are representative technologies for examining the relationship between the effectiveness of mandatory centralized contact tracing apps and user stress. technology-related stress (called technostress) results in a negative emotional state and a severe state of anxiety (la paglia et al. ) . coping theory refers to a process of conscious and unconscious efforts to overcome stress (lazarus and folkman ) . it is necessary to examine the tension between the diffusion of contact tracing apps and user stress by applying coping theory.
to exert full potential of contact tracing technology to control the spread of covid- , it is critical to resolve the tension between benefits of the contact tracing apps and users' stress. moreover, it is necessary to find the causes of contact tracing app user stress which restricts the utility of the app. first, this study aims to examine users' accuracy concerns arising from potential problems of using contact tracing apps and privacy concerns from privacy infringement as the potential causes of user stress. based on the coping theory (beaudry and pinsonneault ; lazarus and folkman ) , this study proposes a structural equation model that shows the relationships between contact tracing app users' stress and how they accept such stress through a process called challenge appraisal. once users appraise the stress as an opportunity they emotionally cope with the stress and they may engage in the infusion behavior of using the contact tracing app to its fullest potential (jones et al. ) . to test the research hypotheses derived from the proposed structural equation model, survey data were collected from the users of health code, which is the mandatory and centralized contact tracing app with the largest user base. contact tracing apps can be used for a variety of purposes even when the vaccine for covid- is developed. vaccines as a means of disease prevention have limitations that new vaccines should be developed when a new infectious disease emerges, and their development takes a long time. however, the contact tracing apps have the advantage that they can be applied quickly without major modification even for new infectious diseases. since this study can shed insight on maximum benefits of the contact tracing apps while protecting individual privacy, it can contribute to preventing the spread of new infectious diseases including covid- . south korea is one of the few countries that are successfully dealing with covid- . 
as shown in table , diverse information communication technologies (icts) have been applied to prevent the spread of covid- in south korea. icts such as artificial intelligence and big data are applied to support the treatment of covid- . the covid- epidemiological investigation support system (eiss), combined with physical interviews, plays a major role in tracking confirmed cases and contacts of covid- (the government of the republic of korea ). this system displays on a map the travel logs of only those patients who have been determined to be confirmed cases by the korea centers for disease control (kcdc), and it supports quick responses by covid- control teams using the relevant statistical information (park et al. ). the eiss integrates data in conjunction with the smart city data hub, which was developed by the ministry of land, infrastructure and transport and collects and processes data from large cities. the eiss analyzes data, including location information and credit card usage details of confirmed cases, in real time with the support of diverse statistical methods to automatically identify travel logs and points of stay by time zones, and provides routes of infection and hot spots to identify the source of infection in each area (park et al. ). by using the eiss, the travel routes of confirmed cases can be identified and analyzed within ten minutes (the government of the republic of korea ). contact tracing is critical in epidemiological investigation (the government of the republic of korea ). identifying those who have been in contact with a confirmed case at an early stage plays an important role in preventing the spread of covid- (park et al. ). therefore, many countries have adopted mitigation and suppression strategies that trace the travel routes of confirmed cases to identify and isolate contacts, thereby reducing the overall scale of incidence (walker et al. ).
diverse apps for digital contact tracing have been developed and used in various countries (kelion ; servick ). successful examples include self-quarantine safety protection app (south korea) and health code app (china), both of which are based on the centralized system (cdschq a; gan and culver ). the south korean self-quarantine safety protection app offers services such as self-diagnosis of the health conditions of individuals in self-quarantine, guidance for living rules, and emergency contact networks to effectively control individuals in self-quarantine (cdschq a). individuals in self-quarantine are required to self-report their health conditions, such as fever, cough, sore throat, and dyspnea symptoms and the report results are automatically sent to the kcdc twice a day. in addition, the location information of individuals in self-quarantine is automatically reported to the kcdc in real time. when the individual leaves his/her designated quarantine location, an alarm notification is sent to both the quarantined individual and the kcdc. then, the kcdc official who is responsible for the location immediately takes necessary actions (cdschq a; the government of the republic of korea ).

icts for contact tracing
self-diagnosis app: an application that supports the self-diagnosis of entrants from overseas countries with fever, cough, sore throat, and breathing difficulties, etc.
self-quarantine safety protection app: an application for persons in self-quarantine to enter their health condition twice a day, and for notification of breakaway from the quarantine area
an app for self-quarantine managers and for notification of breakaway: an application for management of persons in self-quarantine and for notification of breakaway from the designated place
epidemiological investigation support system: kcdc's confirmed cases' travel route tracking system linked with smart city's data server

health code is a contact tracing app with the largest user base in china since early february (mozur et al. ) . the app displays a green, yellow, or red qr code according to each user's health status thereby acting as a pass permit. users with a green code are allowed to visit others, but those with a yellow code should undergo self-quarantine for days, and those with a red code must self-isolate for days. for digital contact tracing, south korea uses mobile phone location tracking (location information at the communication base station), credit card usage details, and cctv records (cdschq a). china uses mobile phone location tracking, facial recognition, cctv, drones, and qr codes (gan and culver ) . users of these contact tracing apps report significant stress over the invasive surveillance functions that they are required to abide by (davidson ; mozur et al. ). coping theory explains individuals' conscious or unconscious endeavors to solve problems and reduce stress. lazarus and folkman ( , p. ) defined coping as "constantly changing cognitive and behavioral efforts to manage specific external and/or internal demands that are appraised as taxing or exceeding the resources of the person." in other words, it is the process of actively adapting to events that are happening and may happen in the future. coping theory has been applied in the fields of psychology, sociology, medicine, and social welfare. coping theory is also used in ict research. beaudry and pinsonneault ( ) proposed the coping model of user adaptation (cmua), which is a model describing users' adaptation to icts based on coping theory. cmua includes the process that users experience when using a new ict, which consists of awareness, appraisal, adaptation, and outcome (beaudry and pinsonneault ) . while contact tracing apps can be an effective covid- control system, the app as a new ict is causing users stress because they must unwillingly adapt and accept the novel invasive surveillant technology. 
the european parliament (ep) insists on preventing the abuse of personal information by legislating that it not be stored in a central database and by applying sunset clauses. the ep recommends that personal information be deleted as soon as covid- is no longer a threat, and that a decentralized system applying ble technology be used (ep ). in the cases of south korea and china, user stress may be higher than in europe since these two countries have adopted the centralized system for their contact tracing apps. in the technological diffusion approach, new icts become standard means as individuals or organizations undertake the processes of initiation, adoption, adaptation, acceptance, routinization, and infusion (cooper and zmud ; jones et al. ; zmud and apple ). from an organizational perspective, zmud and apple ( ) defined infusion as "the extent to which the full potential of the innovation has been embedded within an organization's operational or managerial work systems." jones et al. ( ) defined infusion at the individual level as "the extent to which a person uses technology to its fullest extent to enhance his or her productivity." in other words, infusion is the process through which technology is accepted by individuals or organizations and used so that its full potential is realized. contact tracing apps such as the self-quarantine safety protection app and the health code app are initiated and accepted by users in mandatory settings for controlling the covid- pandemic. however, the progression from adaptation to infusion partially depends on users' emotions and abilities because a variety of factors, such as their stress and attitude, affect adaptive efforts. mason ( ) categorized the ethical issues of the information age; among them are ( ) accuracy and ( ) privacy. accuracy is associated with the authenticity, fidelity, and precision of information (mason ).
privacy concerns the fundamental question of where the borderline lies between information that should and should not be shared with others (mason ). concerns about inaccuracy and privacy infringement have been raised with regard to contact tracing apps used to control the spread of covid- (davidson ; mozur et al. ). health code users in china reported concerns about the lack of transparency in the app's operations, the scope of data storage, the inability to change an erroneous "red" code, excessive dependence on the internet, and reliance on private companies such as alipay and wechat monitoring their travel routes (davidson ; mozur et al. ). as previously reviewed, users of contact tracing apps may experience stress mainly due to concerns over two issues: being mistakenly subjected to self-quarantine due to incorrect data input or technical errors in the contact tracing app, and privacy infringement due to their personal information being stored in the centralized system. in this context, this study proposes accuracy and privacy concerns as the main causes of user stress for contact tracing apps. users who perceive stress when using the contact tracing apps appraise each stressful situation as a threat or a challenge (fadel and brown ; lazarus and folkman ). based on the above discussion of concerns and challenge appraisal, this study proposes the following hypotheses:
hypothesis accuracy concerns about contact tracing apps affect the challenge appraisal.
hypothesis privacy concerns about contact tracing apps affect the challenge appraisal.
in the appraisal stage of contact tracing apps, users assess whether contact tracing apps are an opportunity to prevent the spread of covid- or a threat to their individual liberty. in the cases of the self-quarantine safety protection app in south korea and the health code app in china, users cannot refuse to use the apps since they are mandatory.
thus, users try to appraise the perceived consequences of the apps as a new opportunity and then undertake adaptation efforts to realize these expected benefits, which were termed emotion-focused behaviors by beaudry and pinsonneault ( ). fadel ( b) conducted an empirical study using a survey of electronic medical systems at university health departments to validate the cmua (beaudry and pinsonneault ). according to fadel ( b)'s study, appraisal of electronic medical systems as a challenge resulted in increased engagement in adaptation behaviors. similarly, marakhimov and joo ( ) reported a positive relationship between users' challenge appraisal of wearable devices and their emotion-focused coping behaviors toward the devices. moreover, according to a study by joo ( ) regarding infusion of smart grid technology, challenge appraisal of technology significantly influences positive reappraisal. positive reappraisal, as a kind of emotion-focused coping behavior, refers to efforts to create or ascribe positive meaning to the technology. thus, the present study posits the following hypothesis:
hypothesis challenge appraisal of contact tracing apps affects the emotion-focused coping behavior.
there have been a few studies on the relationship between emotion-focused coping behaviors and infusion of information systems. at the individual level, emotion-focused coping behaviors positively influence work efficiency and effectiveness (beaudry and pinsonneault ). emotion-focused coping behaviors are significantly associated with infusion of information systems at the individual level (fadel a). joo ( ) reported that users of smart grid technology achieved the fullest potential of the smart system by utilizing positive reappraisal based on emotion-focused coping behaviors. thus, the following hypothesis is proposed:
hypothesis emotion-focused coping behavior toward contact tracing apps affects user infusion.
figure shows the relationships among accuracy and privacy concerns as factors affecting stress, challenge appraisal, emotion-focused coping behavior, and infusion as a structural equation model. an individual who feels stress due to concerns about contact tracing apps appraises the stressful situation (fadel and brown ; lazarus and folkman ). when users experience a new it service, the new features constitute a challenge that they must evaluate (fadel and brown ). during challenge appraisal, users of apps undertake emotion-focused coping behavior in an attempt to identify positives and strengths or to avoid or tolerate negative aspects and risks. through emotion-focused coping behavior, users of contact tracing apps adapt to attain the full potential benefits of the app. the research model in fig. is based on the stress-coping-adaptation model of lazarus and folkman ( ), the cmua of beaudry and pinsonneault ( ), and the stress-coping model of joo ( ). table shows the measurement items for the five constructs in the proposed research model. the measurement items were adopted from the studies conducted by fadel ( b), marakhimov and joo ( ), and joo ( ) and modified to fit the purpose of the current study. each of the questions for the five constructs was measured on a five-point likert scale. the questionnaire was developed in korean, and three graduate students who were bilingual in korean and chinese translated the questionnaire into chinese and mutually reviewed the translations. finally, editing of the translation was commissioned to an agency specializing in korean and chinese. the survey targeted health code app users in china. although both the self-quarantine safety protection app of south korea and the health code app of china are good examples of centralized contact tracing apps, the korean sample is impractical for the data collection purpose of this research.
as kcdc requires those who are infected or had contact with the infected to install the self-quarantine safety protection app, only kcdc has the full list of those who have installed the app. however, the health code of china is required for all its residents regardless of their infection status.

accuracy concern: the degree of concern about the possibility of errors in data input or unintended data usage on the contact tracing app
i am concerned about getting red or yellow code by mistakenly inputting health status in the health code (health qr code) app developed
i am concerned about getting red or yellow code due to operational error by the health code app service provider
i am concerned about being penalized for wrongful data entry even though i enter accurate health status in the health code app
privacy concern: the degree of concern about the possibility of abuse of personal information of contact tracing app users and consequential risks
i am concerned about the possibility of my health

in order to reach respondents with experience using the app, the current study employed snowball sampling using wechat. excluding missing data, error responses, and inadequate answers, a total of valid responses were used for the analyses. characteristics of the sample are organized in table . male respondents outnumbered female respondents, as the percentage of male respondents was %. the proportion of respondents in their s and s was high at %, and % of respondents had been using the health code for more than two months. alipay's app was shown to be the most widely used, followed by wechat and local government apps, in that order. the reliability, validity, and research hypotheses of the research model were tested using smartpls (version . . ). common method bias (cmb) may occur in cases where independent and dependent variables are measured in the same way during data collection (kock ). the harman single factor test and the variance inflation factor (vif) were used to check for cmb.
in the exploratory factor analysis of the harman single factor test, it is unlikely that cmb is present when the total variance explained by the unrotated first factor is less than % (podsakoff et al. ). in the case of this study, since the total variance of the first factor was . %, cmb was determined to be unlikely. in a structural equation model, cmb may exist when the vif of a latent variable is . or higher (kock ). in the present structural equation model, the vifs of all latent variables were found to be between . and . , which demonstrates that the possibility of cmb is very low. cronbach's alpha, an indicator of the internal consistency of variables, was below the standard of . (hair et al. ) for challenge appraisal and emotion-focused coping behavior but was found to be reliable at a significance level of . as a result of instances of bootstrapping. therefore, there is no conflict regarding the reliability of the variables from the perspective of internal consistency (table ). composite reliability (cr) and average variance extracted (ave) are used for the evaluation of convergent validity. as shown in table , the cr values of all variables at least satisfied the reference value of . and the ave values exceeded the reference value of . (fornell and larcker ). therefore, each variable in this research model shows convergent validity. variables have discriminant validity when the square root of the ave is greater than the correlation coefficients of the relevant variables (fornell and larcker ). in table , the values on the diagonal are the square roots of the ave, and since they are larger than the correlation coefficients of the individual variables, the variables have discriminant validity. in general, multicollinearity, which refers to correlations among independent variables, is absent when the vif is below the reference value of . (hair et al. ). since all vifs were found to be . or less, as shown in table , multicollinearity is unlikely to exist.
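the convergent and discriminant validity checks described above can be sketched in a few lines. this is a minimal illustration, not the smartpls implementation: it assumes standardized outer loadings, uses the usual fornell and larcker formulas for cr and ave, and all function names and numeric values are hypothetical. the study's own reference values are not reproduced here.

```python
import math


def composite_reliability(loadings):
    """Composite reliability from standardized outer loadings."""
    total = sum(loadings)
    error = sum(1.0 - l ** 2 for l in loadings)  # indicator error variances
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def fornell_larcker_holds(ave, correlations):
    """True if sqrt(AVE) of both constructs exceeds |r| for every listed pair."""
    return all(
        math.sqrt(ave[a]) > abs(r) and math.sqrt(ave[b]) > abs(r)
        for (a, b), r in correlations.items()
    )


# Hypothetical loadings for one construct, for illustration only
# (these are not the study's results).
example_loadings = [0.8, 0.7, 0.9]
example_cr = composite_reliability(example_loadings)
example_ave = average_variance_extracted(example_loadings)
```

in practice, the cr and ave of each construct would be compared against the reference values cited in the text, and `fornell_larcker_holds` would be given the ave of every construct together with the full set of inter-construct correlations.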
the standardized root mean square residual (srmr) is used to assess the goodness-of-fit of the structural equation model using pls (garson ). the goodness-of-fit is regarded as high when the srmr is not greater than the reference value of . (hu and bentler ). the srmr of this research model was . , which is not far beyond the reference value. path coefficients are used to test the research hypotheses using smartpls. table shows the results of the tests of the research hypotheses. the hypothesis (h ) that accuracy concern for the contact tracing app affects the challenge appraisal was not supported. the hypothesis (h ) that privacy concern affects the challenge appraisal was supported at a significance level of . . the two hypotheses (h and h ) that challenge appraisal affects emotion-focused coping behavior and that emotion-focused coping behavior affects infusion were each supported at a significance level of . . table shows the results of the path analysis. the path through challenge appraisal, emotion-focused coping behavior, and infusion (ca → ec → in) demonstrates the significant impact of challenge appraisal on infusion. on the other hand, the path through privacy concern, challenge appraisal, emotion-focused coping behavior, and infusion (pc → ca → ec → in) shows the significant negative impact of privacy concern on infusion. consequently, if concerns about privacy infringement are resolved, app users can more actively engage with apps and maximize their potential benefits through emotion-focused coping behavior (table ). the r-squared values of emotion-focused coping behavior (ec) and infusion (in) were shown to be satisfactory at . and . , respectively (garson ). in particular, the emotion-focused coping behavior of users of contact tracing apps accounted for . % of the variance in infusion. according to the results of the current study conducted on mandatory centralized contact tracing app users, accuracy concerns about the apps did not significantly affect challenge appraisal.
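in path analysis, the indirect effect along a chain such as pc → ca → ec → in is the product of the path coefficients on that chain. the sketch below illustrates only this arithmetic. the coefficient values are hypothetical and are not the study's estimates, and the function name is an assumption for illustration. significance of such effects is usually assessed by bootstrapping, which is not shown.

```python
from math import prod


def indirect_effect(coefficients, path):
    """Product of the path coefficients along consecutive constructs in `path`."""
    return prod(coefficients[(src, dst)] for src, dst in zip(path, path[1:]))


# Hypothetical standardized path coefficients, for illustration only.
hypothetical = {("pc", "ca"): -0.3, ("ca", "ec"): 0.5, ("ec", "in"): 0.6}

# A negative first link carried through two positive links gives a
# negative indirect effect of privacy concern on infusion.
effect = indirect_effect(hypothetical, ["pc", "ca", "ec", "in"])
```

with these illustrative numbers the sign of the result matches the pattern reported in the text: a negative effect of privacy concern on challenge appraisal propagates as a negative indirect effect on infusion.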
on the other hand, concerns about privacy infringement by contact tracing apps had significant negative effects on challenge appraisal. in addition, challenge appraisal had positive effects on emotion-focused coping behavior, through which app users effectively transitioned to the stage of infusion to maximize the potential benefits of the app. with regard to research hypothesis h , users who appraised contact tracing apps as a challenge more actively engaged in emotion-focused coping behavior when using the app. that is, users who appraised the contact tracing app as providing a new opportunity to prevent and end the spread of covid- made efforts to highlight and magnify its strengths and benefits. with regard to research hypothesis h , users attempted to enhance the app's strengths and benefits through emotion-focused coping behaviors, even if they recognized negative aspects in the early stages of use. in a study conducted by marakhimov and joo ( ) on users of wearable devices for healthcare, challenge appraisal was shown to positively affect the extended usage of the wearable devices through emotion-focused coping behavior. in a study conducted by joo ( ) on users of smart grid technologies, the more users engaged in challenge appraisal of the technology, the more they engaged in emotion-focused coping behavior to actively maximize the potential of smart grid technologies. these studies support the results of the current study on the significant relationships among challenge appraisal, emotion-focused coping behavior, and infusion. with regard to research hypothesis h , the more concerned users are about privacy infringement by the contact tracing app, the less they engage in challenge appraisal of the app. therefore, reducing concerns about privacy infringement may help users to reframe issues related to the app as challenges to overcome and to recognize the strengths and benefits of the app.
a previous study also reported that users' concerns about inputting personal health information into wearable devices had a negative impact on users' challenge appraisal of the devices (marakhimov and joo ). with regard to research hypothesis h , concerns about problems related to inaccuracy, such as input errors or incorrect results, did not significantly affect app users' challenge appraisal. the reason why research hypothesis h was not supported is related to the sociopolitical and cultural systems of china. although private companies and local governments provide health code services, in reality the central government requires people to install and use the app. users of health code apps have a strong belief that they should trust and conform to government orders during the extraordinary circumstances of the pandemic. in such a crisis, users tend not to think about errors in data operations or to be skeptical about the possibility of wrongful government operations. in addition, an online discussion was conducted with five graduate students in china who had responded to the questionnaire in order to determine why research hypothesis h was not supported. three of the five students argued that accuracy concerns about incorrect or wrongful information usage did not affect their challenge appraisal of the apps, and they further stated that the reason is that most chinese people trust app services in which the government is involved. given the results of the path analysis, relieving concerns about privacy infringement regarding the use of apps will enable users to realize the full potential benefits of contact tracing apps and make a greater contribution to preventing the spread of covid- . therefore, if the distributed system is used for contact tracing apps rather than the current centralized system, more effective covid- control can be expected.
in collaboration with each other, google and apple have decided to provide bluetooth-based distributed contact tracing technology to all quarantine authorities (dumbrava ; apple ). quarantine authorities in each country will be able to use these open apis (application programming interfaces) to develop customized contact tracing apps that reduce concerns about privacy infringement to some extent. however, even for the distributed system, users' active participation and trust in operating authorities are paramount. in cases in which the centralized contact tracing system is used, it is necessary to be transparent and disclose how personal information will be safely and expediently deleted when the threat of covid- is over. south korea also decided to introduce an electronic entry and exit registration (qr code) system (called korea internet-pass) for facilities at risk of mass infection from june (cdschq b). users who visit designated facilities must present a personalized encrypted one-time qr code to the facility manager. the facility manager then scans the user's qr code, which is automatically transmitted to the social security information service (ssis), a public institution. the ssis manages facility information and qr code visit records, whereas the qr code-issuing company manages personal information such as the name and phone number of the person to whom the qr code has been issued. when a confirmed case of covid- occurs and the kcdc needs information, the quarantine authority can request information from the ssis, which keeps records of visits to the facility, and from the qr code-issuing company, which keeps personal identifying information, to find out who (names and contact information) visited where and when. because visit records and personal information are separated and encrypted before storage, only quarantine authorities are able to view personal information, which serves as a preventative measure to reduce privacy infringement.
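the separation of visit records from personal information described above can be illustrated with a small sketch. this is a simplified model under stated assumptions, not the actual korea internet-pass design: the class and function names are hypothetical, the qr code is reduced to an opaque token, and encryption and one-time token handling are omitted. the point shown is only that neither store alone links a person to a place, and that the join happens when the quarantine authority queries both.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Visit:
    qr_token: str      # opaque token; carries no name or phone number
    facility: str
    time: datetime


class VisitStore:
    """Stand-in for the ssis role: visit records keyed by qr token only."""

    def __init__(self):
        self._visits = []

    def record(self, visit):
        self._visits.append(visit)

    def visits_to(self, facility, start, end):
        return [v for v in self._visits
                if v.facility == facility and start <= v.time <= end]


class IdentityStore:
    """Stand-in for the qr code-issuing company: token -> (name, phone)."""

    def __init__(self):
        self._people = {}

    def issue(self, qr_token, name, phone):
        self._people[qr_token] = (name, phone)

    def lookup(self, qr_token):
        return self._people[qr_token]


def trace_visitors(visit_store, identity_store, facility, start, end):
    """Join both stores; in this model only the quarantine authority does so."""
    return [(identity_store.lookup(v.qr_token), v.time)
            for v in visit_store.visits_to(facility, start, end)]
```

a design like this limits what each operator can learn on its own, but, as with any centralized arrangement, it relies on the two stores being linked only under the conditions the authority has committed to.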
in addition, the electronic entry and exit registrations are stored only for four weeks, after which the facility visit records are automatically deleted. however, since this is also a centralized system, personal information is not protected if the government forcibly links related systems to track information. therefore, trust and transparency of the government operations are important. based on the coping theory, this study proposed a research model that shows the path from contact tracing app users' stress to their full utilization of the app. this research model can be applied to various fields of technology diffusion and expands the scope of coping theory applications. to date, no vaccine or treatment for covid- has been developed. the findings of this study can be used as a guide to maximize the potential benefits of contact tracing apps with the goal of preventing the spread of covid- . quarantine authorities in each country can improve the utilization of contact tracing apps by reducing the possibility of privacy infringement through establishing transparency and trust. even once a vaccine or treatment for covid- is developed, the findings of this study can be used to prevent the spread of future infectious diseases. given the findings of this study, the distributed system may be more effective than the centralized system for adoption of contact tracing apps, as the distributed system can relieve concerns about privacy infringement. therefore, the findings of this study can provide insights to quarantine authorities or app developers who want to deeply understand the tensions that arise when applying icts to prevent spread of infectious diseases and find innovative solutions. this study verifies the important role of icts in this current pandemic climate of covid- and contributes to maximization of the potential benefits of contact tracing apps. 
in this context, the specific implications of the current study are as follows. first, the current study shows that the more privacy concerns users have over contact tracing apps, the less likely they are to appraise the strengths and benefits of the app as challenging opportunities. thus, countries that utilize mandatory centralized contact tracing apps should implement policies that place more emphasis on relieving users' privacy concerns. second, the authorities should offer promotions and training programs that explain the benefits and strengths of contact tracing apps in controlling the spread of covid- . for example, untraceable covid- cases increased from . % (may ) to % (august ) in south korea, where only the infected or those who were in contact with the infected are required to install the app. on the other hand, the chinese case shows a significantly lower rate of untraceable cases, as the chinese government requires all residents, regardless of their infection status, to install the contact tracing app. this case can serve as circumstantial evidence for the strength of contact tracing apps in ending the spread of covid- . therefore, the south korean government should also implement the internet-pass qr code not only in facilities at risk of mass infection but also more universally, as in the case of china. as another possible solution, if there is high public concern about such mandatory installation of centralized apps, decentralized apps could be required for all residents of south korea. finally, authorities and social activities should focus on helping users form a positive conviction toward contact tracing apps so that users understand the strengths of the apps and benefit from them. this study has the following limitations. first, this study did not account for the differences in values as well as cultural and political aspects of the countries in which users of contact tracing apps live.
second, the survey conducted on chinese app users cannot be said to represent all contact tracing apps. third, an actual comparison study between the central and distributed systems regarding users' concerns is recommended.

privacy-preserving contact tracing
understanding user responses to information technology: a coping model of user adaptation
how to protect yourself & others
guide on the installation of self-quarantine safety protection app
mandatory introduction of a new qr code in high-risk entertainment facilities for korea internet-pass
information technology implementation research: a technological diffusion approach
china's coronavirus health code apps raise concerns over privacy
tracking mobile devices to fight coronavirus
covid- tracing apps: ensuring privacy and data protection
the role of appraisal in adapting to information systems
user adaptation and infusion of information systems
information systems appraisal and coping: the role of user perceptions
quantifying sars-cov- transmission suggests epidemic control with digital contact tracing
evaluating structural equation models with unobservable variables and measurement error
surveillance technology will only get more intense after covid
china is fighting the coronavirus with a digital qr code. here's how it works
partial least squares: regression & structural equation models
thousand oaks
hu l, bentler pm ( ) cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives
factors leading to sales force automation use: a longitudinal analysis
infusion process of smart grid-related technology based on coping theory
coronavirus: german contact-tracing app takes different path to nhs
common method bias in pls-sem: a full collinearity assessment approach
technostress: a research about computer self-efficacy, internet attitude and computer anxiety
consumer adaptation and infusion of wearable devices for healthcare
four ethical issues of the information age
a ( ) in coronavirus fight, china gives citizens a color code
development and utilization of a rapid and accurate epidemic investigation support system for covid-
common method biases in behavioral research: a critical review of the literature and recommended remedies
covid- contact tracing apps are coming to a phone near you. how will we know whether they work?
the government of the republic of korea ( ) flattening the curve on covid- : how korea responded to a pandemic using ict
covid- : guide on home-based care, screening & isolation ward set up
the global impact of covid- and strategies for mitigation and suppression
measuring technology incorporation/infusion

acknowledgement this work was supported by the dongguk university research fund. this paper was written as part of konkuk university's research support program for its faculty on sabbatical leave in .
conflict of interest the authors declare that they have no conflicts of interest.

key: cord- - pidolqb authors: maghdid, halgurd s.; ghafoor, kayhan zrar title: a smartphone enabled approach to manage covid- lockdown and economic crisis date: - - journal: sn comput doi: . /s - - - sha: doc_id: cord_uid: pidolqb
the emergence of novel covid- causes an over-load in health system and high mortality rate.
the key priority is to contain the epidemic and prevent the infection rate. in this context, many countries are now in some degree of lockdown to ensure extreme social distancing of entire population and hence slowing down the epidemic spread. furthermore, authorities use case quarantine strategy and manual second/third contact-tracing to contain the covid- disease. however, manual contact-tracing is time-consuming and labor-intensive task which tremendously over-load public health systems. in this paper, we developed a smartphone-based approach to automatically and widely trace the contacts for confirmed covid- cases. particularly, contact-tracing approach creates a list of individuals in the vicinity and notifying contacts or officials of confirmed covid- cases. this approach is not only providing awareness to individuals they are in the proximity to the infected area, but also tracks the incidental contacts that the covid- carrier might not recall. thereafter, we developed a dashboard to provide a plan for policymakers on how lockdown/mass quarantine can be safely lifted, and hence tackling the economic crisis. the dashboard used to predict the level of lockdown area based on collected positions and distance measurements of the registered users in the vicinity. the prediction model uses k-means algorithm as an unsupervised machine learning technique for lockdown management. in an unprecedented move, china locks down the megacity named wuhan, in which the novel coronavirus was first reported, in the hopes stopping the spread of deadly coronavirus. during the lockdown, all railway, port, and road transportation were suspended in wuhan city. with the increasing number of infections and fast person-to-person spreading, hospitals are overwhelmed with patients. later, the disease has been identified in many other countries around the globe [ , ] . 
subsequently, the world health organization (who) announced that the virus can cause a respiratory disease with a clinical presentation of cough, fever, and lung inflammation. as more countries experienced dozens of cases or community transmission, who characterized covid- as a pandemic. in such an unprecedented situation, doctors and health care workers are putting their lives at risk to contain the disease. furthermore, to isolate infected people and combat the outbreak, many hospitals have been converted into covid- quarantine wards. moreover, a surge of covid- patients has led to long queues at hospitals for isolation and treatment [ ] . with such a high number of infections, emergency responders have been working non-stop to send patients to hospital, and overcrowded hospitals have refused to take in more patients. for instance, recently, in italy, medical resources are in short supply, and hospitals have had to give priority to people with a significant fever and shortness of breath over others with less severe symptoms [ ] . researchers can access the implementation and programming code at https://github.com/halgurd /lockdown_covid . as covid- continues to spread, countries around the globe are implementing strict measures to intensify the lockdown, from mass quarantine to city shutdown, to slow down the fast transmission of the coronavirus [ , ] . during the lockdown, people are only allowed to go out for essential purposes such as purchasing food or medicine. ceremonies and gatherings of more than two people are not permitted. these strict quarantine rules allow only a few people to move around the city, including delivery drivers who provide a vital lifeline. on the other hand, a few countries, such as japan, have declared a state of emergency in many cities in an attempt to tackle the spread of the virus. although covid- started as a health crisis, it possibly poses the gravest threat to the world economy since the global financial crisis [ ] .
covid- epidemic affects all sectors of the economy from manufacturing and supply chains to universities. it is also affect businesses and daily lives especially in countries where the covid- has hit the hardest. the shortage of supply chain has knock-on effects on economic sector and the demand side (such as trade and tourism). this makes a supply constraint of the producer and causing a restraint in consumer's demand, this may lead to demand shock due to psychological contagion. to prevent such widespread fallout, central banks and policymakers have been rolling out emergency measures to reassure businesses and stabilize financial markets to support economy in the phase of covid- . currently, most countries are in the same boat with leading responsibility of group twenty and international organizations [ ] . to meet the responsibility, many companies and academic institutions around the world made efforts to produce covid- vaccine. however, health experts state that it may take time to produce an effective vaccine. as an effective vaccine for covid- is not probably to be in market until the begin of next year, management of lockdown is an imperative need. thus, public health officials combat the virus by manual tracking of recent contacts history of positive covid- cases. this manual contacttracing is very useful at the early spreading stage of the virus. however, when the number of confirmed cases was increased tremendously in some countries, manual contacttracing of each individual is labor-intensive and requires huge resources [ ] . for example, an outbreak of the covid- at a funeral ceremony in an avenue in erbil, kurdistan region left regional government with hundred of potential contacts. this situation or many other scenarios of massive number of cases burden the government on trying to manual tracking all contacts [ ] . 
it is risky when health authorities cannot easily trace recent covid- carriers, since the probability of occurrence and the impact can then hardly be measured. technology can potentially be useful for digital contact-tracing of positive coronavirus cases [ ] . a smartphone can use wireless technology data to track people when they are near each other. in particular, when someone is confirmed positive for covid- , the status of their smartphone is updated and the app then notifies all phones in the vicinity. for example, suppose someone tests positive for covid- and stood near a person in the mall earlier that week. the covid- carrier would not be able to recall that person's name for manual contact-tracing. in this scenario, a smartphone contact-tracing app is a very promising way to notify that person [ ] . this automated virus tracking approach could really transform the ability of policymakers and health authorities to contain and control the epidemic. in this situation, a dashboard is required to assist governments and health authorities in predicting when lockdown and self-quarantine will end. this study first reviews the current solutions to combat covid- . then, we develop a smartphone-based approach to automatically and widely trace the contacts of confirmed covid- cases. particularly, the contact-tracing approach creates a list of individuals in the vicinity and notifies contacts or officials of confirmed covid- cases. this approach not only makes individuals aware that they are in proximity to an infected area, but also tracks incidental contacts that the covid- carrier might not recall. thereafter, we develop a dashboard to provide a plan for policymakers on how lockdown/mass quarantine can be safely lifted, hence tackling the economic crisis.
applying mass quarantine to people who might be exposed to contagious covid- in specific areas, without any plan or information about infected people in those areas, will lead to economic collapse. for example, if there are few confirmed covid- cases in some areas, restrictions on mass gatherings should be eased and social distancing relaxed, so that people can do necessary shopping and use transportation. from a technical standpoint, we summarize the most important contributions of this paper as follows: i) we build a tracking model based on positional information of registered users to conduct contact-tracing of confirmed covid- cases; ii) we propose a smart lockdown management approach to predict the level of mass quarantine in those areas; iii) to notify contacts of confirmed cases, we also develop a notification model to cluster lockdown regions. the rest of this paper is organized as follows. the section "related work" provides a literature review of recent advances in ai systems developed for covid- detection. this is followed by an overview of the proposed approach and details of the designed algorithm in the section "proposed smartphone-based contact-tracing". the section "experiments and deployment" presents the experiments conducted in the paper. finally, the section "conclusion" concludes the paper. countries have adopted many restrictions in response to the fast transmission of the covid- pandemic, including quarantining people under the toughest level of social restrictions, closing public and private sectors, and diagnosing infected people early via recent technologies. however, none of these solutions can be considered a permanent cure, because of their adverse effects on daily life. such solutions have a dramatic and chronic impact on social and economic dimensions. therefore, there is a need for digital contact-tracing to tackle the aforesaid issues.
in this section, recent trends on contact-tracing are investigated and compared with the proposed approach. several solutions ranging from company's products to an academic research studies have been proposed in mitigate the negative consequences of covid- . in particular, an application in singapore named smartphone-based contact-tracing is developed, aarogya setu [ ] is also used in india to support the difficulty of covid- situations. furthermore, some solutions are under development in united kingdom in collaboration with giant companies including google and apple [ ] . in [ ] , a new system has been implemented using onboard smartphone bluetooth technology to track people who exposed confirmed cases. the system can notify the nearby users in the public area when the infected users are approaching and the area will be quarantined to control the spreading of the virus in the vicinity. however, such study will not provide a comprehensive solution to predict the lockdown area and will not updating the prediction issue, periodically. in another attempt, authors in [ ] proposed a new decentralized approach to track the contact-tracing, which is named caudht (contact tracing application using a distributed hash table) . the approach is trying to preserve the privacy issue of the users (including public health users and infected users), since the system is exchanging data in a blind signature mechanism [ , ] . furthermore, the approach uses the distributed hash table method to messaging between the users. however, if such approach is implemented on the distributed server, it needs huge computation and incurs huge cost. in comparison to their proposed system, the proposed approach is working on temporal tracked information of the registered users, which is not required a large space on the server. furthermore, most of the computations of the system including notifying users can be run on the smartphone users. 
in [ ] , the authors modeled on how covid- spreads over populations [ ] in countries in terms of the transmission speed and containing its spreading. in the model, r is representing the reproduction number, which is defined the ability of the virus in infecting other people as a chain of contagious infection. infected individuals rapidly infect a group of people over very short period of time, which then yields an outbreak. on the contrary, the infection would be in control if the probability gets closer of one person to infect less than one other person [ ] . this is exactly happening in fig. ; when people (black color) who have come into contact with an infected person (red color), the infection would be spread rapidly. one important aspect is how the number of infected people looks like depends on several factors, such as the number of vulnerable people in the communities, the time takes to recover a person without symptoms, the social contacts and possibility of infecting them with coronavirus. furthermore, another factor will affect fast spreading of coronavirus is the frequency of visiting crowded places such as malls and minimarkets [ ] . thus, policymakers and public health authorities are responsible to manage and plan a convenient way to contain the epidemic. moreover, countries at the early stage of virus spreading need to control the epidemic by typically isolating and testing suspected cases tracing their contact and quarantine those people in case they are infected. testing and contact-tracing at wide scale, the better the chance of containment. in the case of covid- , research studies have been conducted for containment or controlling the fast spreading, and hence helping policymakers and societies in ending this epidemic [ ] . in [ ] , the authors have investigated the importance of confirmed covid- case isolation that could play a key role in controlling the disease. 
they have utilized a mathematical model to measure the effectiveness of this strategy in controlling the transmission speed of covid- . to achieve this goal, a stochastic transmission model is developed to overcome the fast person-to-person transmission of covid- . according to their research study, controlling virus transmission is within weeks or by a threshold of accumulative cases. however, controlling the spread of the virus using this mathematical approach is highly correlated to other factors like pathogen and the reaction of people. one key role to track infected people and predict ending lockdown is contact-tracing. when a patient is diagnosed with infectious disease like covid- , contact-tracing is an important step to slowing down the transmission [ ] . this technique seeks to identify people who have had close contact with infected individuals and who, therefore, may be infect themselves. this targeted strategy reduces the need for stay at home periods. however, manual contact-tracing is subject to a person's ability to recall everyone they have come in contact over a week period. in [ ] , the authors exploited the cellphone's bluetooth to constantly advertise the presence of people. these anonymous advertisements, named chirps in bluetooth, are not containing positional or personally identifiable information. every phone stores all the chirps that it has sent and overheard from nearby phones. their system uses these lists to enable contact-tracing for people diagnosed with covid- . this system not only traces infected individuals, but also estimates distance between individuals and amount of time which they spent in close proximity to each other. when a person is diagnosed with covid- , doctors would coordinate with the patient to upload all the chirps sent out by their phone to the public database. 
meanwhile, people who have not been diagnosed can have their phones do a daily scan of the public database to see whether their phones have overheard any of the chirps used by people later diagnosed with covid- . a match indicates that they were in close, prolonged contact with that anonymous individual. figure shows the procedure of exchanging anonymous ids among users for contact-tracing. as stated in the previous section, manual contact-tracing is a labor-intensive task. in this section, we detail each part of the proposed smartphone-based digital contact-tracing, as shown in fig. . the main idea of the proposed framework in fig. is to enable digital contact-tracing to end lockdown while at the same time preventing the virus from spreading. the best course seems to be to let people go about their business; if anybody tests positive for covid- , we would be able, through the proposed framework, to trace everybody in contact with the confirmed case and to manage the lockdown and mass quarantine. this would help prevent the spread of the virus to the rest of the population. fig. a framework of contact-tracing using a smartphone-based approach. the first step of the proposed contact-tracing model is the registration of users. there is no doubt that registration and coverage of a high percentage of the population is very significant for effective pandemic control. users provide information such as name, phone number, post code, and status of the covid- disease (positive, negative, or recovered). the effectiveness of the application and of digital contact-tracing depends on two factors: speed and coverage. for the proposed framework, we utilize a global navigation satellite system (gnss) receiver in outdoor environments, whereas bluetooth low energy is used indoors.
in our proposed model, bluetooth technology does not need make a connection setup between the users, while the system only requires the discovery process to retrieve the mac address of the nearby users and then performs the process of matching the infected users with their mac addresses. speed depends on how to reduce the time required for contact-tracing from few days to hours or minutes. the more people register in the system, the better performance of the system in terms of both speed and coverage of contact-tracing. in the second step, global positioning system (gps) receiver is used by the proposed model to track either individuals or a group of people visiting to a common place. the gps service class updates user coordinates to the database in every few seconds. once a registered user reports gets infected with covid- , his test result would be send to the public database in central computer server. other registered users will regularly check those central server provider for possible positive covid- cases they were in contact in the past weeks. server is responsible to compare the infected id with its list of stored ids. a push notification will be send, by the server, to those who were in contact with a person tests positive. it is important to note that the information would be revealed to the central server is an id of the phone. in another scenario, tracking users' position information could be periodically stored on the server for the purpose of exchanging notifications. furthermore, this means that only the infected users' information would need to be stored on the server. certainly, the records of infected area should be updated periodically. therefore, the system does not need huge computations on the server because of the issue of tracking infected users would be run on the smartphone. the only function that should be run of the server is the lockdown area prediction function. 
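the server-side comparison of an infected id against stored ids, described above, can be sketched as follows. this is only a minimal illustration under stated assumptions, not the authors' implementation: the `Encounter` record, the function name `contacts_to_notify`, and the look-back parameter `window_seconds` are all hypothetical, and the actual look-back period is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """Hypothetical record: two phone ids seen near each other at a given time."""
    phone_a: str
    phone_b: str
    timestamp: float  # seconds since epoch


def contacts_to_notify(encounters, infected_id, now, window_seconds):
    """Return the sorted ids that met `infected_id` within the look-back window.

    `window_seconds` is an assumed parameter; the look-back period used by
    the real system is not given in the text.
    """
    cutoff = now - window_seconds
    exposed = set()
    for e in encounters:
        if e.timestamp < cutoff:
            continue  # encounter is older than the look-back window
        if e.phone_a == infected_id:
            exposed.add(e.phone_b)
        elif e.phone_b == infected_id:
            exposed.add(e.phone_a)
    exposed.discard(infected_id)  # never notify the infected phone itself
    return sorted(exposed)
```

the returned ids would then be handed to whatever push-notification service the deployment uses; only phone ids are involved, in line with the statement that the id of the phone is the information revealed to the central server.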
firebase cloud messaging is used to send push notifications to multiple devices even when the app is paused or running in the background. many apps send push notifications to alert their users. here, a notification is sent when a person is approaching someone who is infected with covid- or is near a lockdown area. to protect the privacy of those who have the coronavirus, we only include an alerting message in the push notification. this would be very useful for the entire population to make informed decisions about not getting close to a covid- area. however, this notification would help public health professionals rather than replace them. the proposal also includes a lockdown prediction model. the model works on the collected geographic information and crowding level of the registered users in the system. there are many algorithms to perform clustering on collected data, including k-means clustering, mean-shift clustering, density-based spatial clustering of applications with noise (dbscan), expectation-maximization (em) clustering using gaussian mixture models (gmm), and agglomerative hierarchical clustering. however, k-means clustering is the fastest of these algorithms at finding and allocating points with respect to the discovered clusters or groups of points [ ] . for reasons of time and space complexity, the k-means clustering algorithm is used and implemented for the prediction process in this study. k-means, as an unsupervised machine learning algorithm, is used to cluster the users' position information and to predict whether an area should be locked down or not, based on empirical thresholds. this section presents the details of how the proposed approach is implemented. the proposal includes two main parts. the first is an application deployed on android-based smartphones, which is used by the users and tracks and sends their mobility information to the system.
while the second side is a web portal (including a comprehensive dashboard) to monitor and predict the visited area that should be locked down or not. a. smartphone application . an android application is implemented on the smartphone. the application lets the users to register their information into the proposed system including name, postcode or zip code, phone number, age, bluetooth mac address, gender, and covid- status. the bluetooth mac address is automatically captured through the application without user interaction. the covid- status includes three options which might be covid- , none covid- , and recovered. figure a shows a snapshot of the application form for the registration process. . once the users have completed the registration process, they can enter into the position tracking model. the tracking model is to send user's position information into the database of the system as well as shows the google map regarding to their positions, as shown in fig. b . . beside this, the users are also can receive the notification or alert about the areas which have been visited by infected users. the notification is working in the background, i.e., the user may be paused the application and uses other application on the smartphone. however, when the user opens the application and enters the infected area will receive the alert dialog. figure c and fig. d show an example of the notification and alert dialog. the notification and dialog alert models are also configure both outdoors and indoors. for example, for outdoors, the gnss position information of the users is used to measure the distance between any two users' positions, and then, if the distance is less than m, then the notification or the alert dialog would be raised. however, for indoors, the application scans for bluetooth devices in the vicinity, and then, the result of the scan is matching with pre-registered mac addressed in the system. 
if the matched mac addresses have covid- or recovered cases, then the notification model and the alert dialog will notify the users about having covid- or recovered users in the scan area. a web portal for the system's administrators is designed and implemented using html , php, javascript, and google map api. this part of the system is to monitoring and tracing the registered users only in terms of how the areas (which have been visited by users) should be lockdown or not? to this end, an unsupervised machine learning (uml) algorithm has been implemented in the system. there are several uml algorithms including neural networks, anomaly detection, clustering, etc. however, for this system, k-means clustering algorithm is used to predict the lockdown approach for the visited area. the k-means algorithm, first, reads the tracked users' position information and their status covid- . then, in the next step, it will calculate the centroid position of the areas based on the dasv seeding method. the dasv method is an efficient algorithm to select the best centroid position among a set of nearest positions in the vicinity. in this study, two different spaces have been selected via dasv method, since only two crowded area are tested. then, the centroid positions will be updated based on how the positions are nearest to each them. the pseudo code of the k-means clustering algorithm is shown in algorithm . once, the process of the clustering of the tracked users' positions information has completed, a set of clusters will be produced. then, for each cluster, the distances between the positions of the different users are calculated. this is to calculate how many times the users, in the vicinity, are approaching to each other (from now called aeo). 
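the clustering step described above can be sketched as follows. this is a minimal illustration rather than the authors' code: the dasv seeding method is not specified in enough detail here to reproduce, so the sketch substitutes a simple farthest-point seeding, and it treats positions as planar (x, y) coordinates with euclidean distance, which is only an approximation for latitude/longitude over small areas. the function name `kmeans` and its parameters are assumptions.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain k-means on 2-d points; returns (labels, centroids).

    Seeding is farthest-point (a stand-in, NOT the DASV method from the text):
    start from the first point, then repeatedly add the point that is farthest
    from all centroids chosen so far.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(max_iter):
        # assignment step: each point goes to its nearest centroid
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # update step: each centroid moves to the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids
```

with k set to two, as in the two crowded areas tested in this study, each tracked position receives a cluster label, and the per-cluster distance calculations can then be run on the positions sharing a label.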
for this study, five users (user a with marker-yellow color, user b with marker-orange color, user c with marker-pink color, user d with marker-green color, and user e with markerblue color) are participated into the system in two different areas in usa. therefore, two different scenarios via the five users are conducted for the k-means algorithm, as shown in fig. . in the first scenario, the users are walking and they are located in denver area in colorado, usa, while in the second scenario, they are located in aspen area in colorado, usa. each user, at every s sends their location information (including latitude and longitude), and duration of each scenario is min of walking. therefore, approximately each user sends records of location information to the server. a threshold for the approaching distance has been initialized to m, i.e., if user a has been approached around m to user b, or c, or d, or e, it means that the users are too near to other users. for the two scenarios, if aeo is greater than , the system assumes that this area is too crowed and the system will predict that the area should be locked down. however, if the value of aeo is less than ten times, it means that the area should not be locked down. for ten trial experiments, the model predicts that the denver area in the first scenario should be locked down, since the five users during the walking in the area are approaching to each other for times and they passed the threshold (i.e., m). however, in the second scenario, the same trials have been tested parallel with the second scenario, and the model predicted that the aspen area does not need to be locked down, since the users are walked far to each other. both scenarios results are shown in fig. . as an initial study, only two different scenarios in two different areas are analyzed. however, more complex scenarios and hypothesis in the future could be conducted. 
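the aeo count and the lockdown rule described above can be sketched as follows. this is a hedged illustration, not the authors' implementation: it assumes that all users report positions at the same time steps and that aeo is incremented once per pair of users per time step whenever the pair is closer than the distance threshold. the exact counting rule is not spelled out in the text, and `haversine_m`, `count_aeo`, `should_lock_down`, and both threshold parameters are hypothetical names with values to be supplied by the caller.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6371000.0  # mean earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_aeo(tracks, distance_threshold_m):
    """Count close approaches: one per user pair per time step (assumed rule).

    `tracks` maps a user id to a list of (lat, lon) samples taken at the
    same time steps for every user.
    """
    steps = min(len(samples) for samples in tracks.values())
    aeo = 0
    for t in range(steps):
        for u, v in combinations(sorted(tracks), 2):
            if haversine_m(*tracks[u][t], *tracks[v][t]) < distance_threshold_m:
                aeo += 1
    return aeo


def should_lock_down(aeo, aeo_threshold):
    """Predict lockdown when the approach count exceeds the threshold."""
    return aeo > aeo_threshold
```

both thresholds are left as parameters because they are empirical choices; a deployment covering a larger city would likely need to calibrate them against local density.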
lockdown area prediction using recent technologies, especially onboard smartphone technologies, is a necessity for most countries. such management is very important for the economic sector and the demand side, including trade and tourism. this practical research has shown that whether an intended area should be locked down can be predicted using machine learning algorithms such as the k-means clustering algorithm. the algorithm is implemented on a server, and the server receives the tracked location information of the smartphone users. this is followed by notifications sent back from the server to the users to notify them of crowded areas and to help control the spread of the coronavirus covid- . fig. the results of the prediction model for both scenarios. furthermore, this management also gives feedback to policymakers on whether or not an area should be locked down. the time and space complexity of the algorithm implemented on the server depends on the number of participating users. to this end, the proposed approach uses the tracked location information only temporarily, and only for the purpose of lockdown prediction. therefore, the approach preserves the privacy of the participating smartphone users. a set of experiments and trials has been conducted to demonstrate the validity of the proposed approach. however, further study on setting up the server, managing the implemented algorithm, and addressing security and privacy issues robustly is needed. at the emergence of covid- , many countries worldwide commonly practiced social distancing, mass quarantine, and even strict lockdown measures. smart lockdown management is a pressing need to ease lockdown measures in places where people are practicing social distancing. in this paper, we developed a smartphone-based approach to inform people when they are in proximity to an area infected with covid- .
we also developed a dashboard to advise health authorities on how specific areas can safely get people back to their normal life. the proposed prediction model uses positional information and distance measurements of the registered users in the proximity. policymakers and public health authorities would be able to benefit from the proposed dashboard to get the latest statistics on covid- cases and lockdown recommendations in different areas. one weak point of this study is the privacy issue of tracking the position information of the users. this issue could be addressed by applying encryption algorithms in the near future. another weak point is that more experiments and more complex scenarios are needed in further study to verify the validity of the proposed system, for example, a larger number of tracked-user records, applying the prediction model to a bigger city such as new york city in the united states or london in the united kingdom, and using deep learning algorithms rather than only the k-means algorithm. hopefully, in the near future, these requirements will be considered for the public health care system. deep learning-based model for detecting novel coronavirus pneumonia on high-resolution computed tomography: a prospective study novel coronavirus in the united states covid- pneumonia level detection using deep learning algorithm lockdowns can't end until covid- vaccine found, study says diagnosing covid- pneumonia from x-ray and ct images using deep learning and transfer learning algorithms can we compare the covid- and crises? what is contact tracing?
number of covid- cases reaches in kurdistan region; iraq's total now skin melanoma assessment using kapur's entropy and level set-a study with bat algorithm apple and google partner on covid- contact tracing technology role of telecom network to manage covid- in india: aarogya setu covid- contact tracing and data protection can go together decentralized contact tracing using a dht and blind signatures a trustworthy system with mobile services facilitating the everyday life of a museum an efficient blockchain-based approach for cooperative decision making in swarm robotics what we scientists have discovered about how each age group spreads covid- covid- optimizer algorithm, modeling and controlling of coronavirus distribution process artificial intelligence for coronavirus outbreak a novel ai-enabled framework to diagnose coronavirus covid using smartphone embedded sensors: design study ai-driven tools for coronavirus outbreak: need of active learning and cross-population train/test models on multitudinal/multi-modal data feasibility of controlling covid- outbreaks by isolation of cases and contacts safe paths: a privacy-first approach to contact tracing a k-mean-directions algorithm for fast clus-tering of data on the sphere conflict of interest the authors declare that they have no conflict of interest. moreover, this research was not funded by any funding agency. key: cord- -tc pueow authors: aleta, alberto; ferraz de arruda, guilherme; moreno, yamir title: data-driven contact structures: from homogeneous mixing to multilayer networks date: - - journal: plos comput biol doi: . /journal.pcbi. sha: doc_id: cord_uid: tc pueow the modeling of the spreading of communicable diseases has experienced significant advances in the last two decades or so. this has been possible due to the proliferation of data and the development of new methods to gather, mine and analyze it. a key role has also been played by the latest advances in new disciplines like network science. 
nonetheless, current models still lack a faithful representation of all possible heterogeneities and features that can be extracted from data. here, we bridge a current gap in the mathematical modeling of infectious diseases and develop a framework that accounts simultaneously for both the connectivity of individuals and the age-structure of the population. we compare different scenarios, namely, i) the homogeneous mixing setting, ii) one in which only the social mixing is taken into account, iii) a setting that considers the connectivity of individuals alone, and finally, iv) a multilayer representation in which both the social mixing and the number of contacts are included in the model. we analytically show that the thresholds obtained for these four scenarios are different. in addition, we conduct extensive numerical simulations and conclude that heterogeneities in the contact network are important for a proper determination of the epidemic threshold, whereas the age-structure plays a bigger role beyond the onset of the outbreak. altogether, when it comes to evaluating interventions such as vaccination, both sources of individual heterogeneity are important and should be concurrently considered. our results also provide an indication of the errors incurred in situations in which one cannot access all needed information in terms of connectivity and age of the population.

the average. within a network perspective, this is just a consequence of the higher number of contacts, or degree, that some individuals have in the network [ , , ]. this individual heterogeneity also signaled that outbreaks could be really large if key individuals become infected and, at the same time, gave a new target for efficient control strategies such as vaccinating highly connected individuals [ , ].
however, despite the many advantages of this approach, determining the complete contact network of a large population is almost infeasible, especially for infections transmitted by respiratory droplets or close contacts. hence, it is common to use idealized networks built using some empirical data of the population, such as the degree distribution [ ]. lastly, there are high-resolution approaches that rely on lots of statistical data to build agent-based models in which the behavior of every single individual is taken into account [ ] [ ] [ ] [ ] [ ] [ ]. note, however, that in agent-based models, individuals are usually assigned to certain mixing groups (i.e., their household, school, or workplace), and that inside those groups homogeneous mixing is used, due to the lack of data for all these settings at a country scale [ ]. an important step to create more realistic models in this direction is to collect high-resolution data on individual contacts using wearable sensors [ ], which can be used to build time-varying networks that contain not only the information about who contacts whom but also the duration and frequency of contacts [ ]. several settings have been monitored, such as schools and workplaces [ , ], or even conferences and museums [ , ]. although the data is still too rare to be used in large scale simulations, it has already been shown that the heterogeneity induced by the time-varying networks inside each mixing group produces a different outcome than the one obtained assuming homogeneous mixing within each group [ ].

our goal in this paper is to analyze the role of one particular type of heterogeneity in disease dynamics, namely, the age structure of the population. originally, age was introduced into the models to study childhood diseases [ ]. the classical approach consists of dividing the population into different groups, one for each age bracket under consideration, and establishing an age-dependent transmission rate.
this transmission rate can be arranged in a matrix in which each element encodes the transmission probability between groups i and j (this matrix is also known as the who acquired infection from whom matrix [ , ] ). it is also possible to separate the effect of the transmission itself in a common parameter and encode the number of contacts between each group in the matrix [ ] . note that this procedure falls into the second category described previously. that is, it takes into account the heterogeneity induced by having different classes of individuals but hides the individual variability under a homogeneous mixing approach within each group, as in models of sexually transmitted diseases with groups with different activity levels. nevertheless, this approach is widely used today and has yielded outstanding results for many diseases such as chickenpox [ ] , herpes zoster [ ] , measles [ ] [ ] [ ] , pertussis [ ] and tuberculosis [ ] . in fact, even though the theoretical basis of this method is relatively old, data on the contact patterns of the general population as a function of their age have been available only recently. the first large-scale study on the contact patterns between and within groups in the context of infections spread by respiratory droplets or close contact took place in and was focused in europe [ ] . since then, a number of studies covering different countries have appeared, although data on africa and asia are still scarce [ ] . various methods have been developed to infer the contact patterns in the absence of direct data [ ] [ ] [ ] , and to project them into the future [ ] . and yet, most studies that use this data disregard the whole distribution of contacts and use only the average number of contacts between groups, completely neglecting the individual heterogeneity (with few exceptions [ ] ). 
as a consequence, in these studies, superspreading events cannot occur naturally, unless the model is modified, contrary to network models in which the large connectivity of some individuals can result in the appearance of such events. similarly, the virtual absence of an epidemic threshold for certain types of contact networks cannot be observed with these simplified contact patterns [ ] . to bridge this gap, in this paper, we focus on analyzing the role that disease-independent heterogeneity in host contact rates plays in the spreading of epidemics in large populations under several scenarios, both numerically and analytically. furthermore, in contrast to previous approaches to this problem [ ] [ ] [ ] [ ] , we use a data-driven approach to highlight not only the role of those heterogeneities but also to explore the validity of the conclusions that one can derive when only limited information about the population is available. there are multiple ways of modeling the contact patterns of the population, depending on the availability of data and the characteristics of the disease. in this work, we consider that diseases have the same outcome on all individuals regardless of their condition and that individuals do not change their behavior as a consequence of the disease. this way, we can focus on the effect of adding different characteristics to the population contact patterns. to be more specific, we use the information from the survey that was carried out in italy for the polymod project [ ] . in this project, over , participants from eight european countries were asked to record the characteristics of their contacts with different individuals during one day, including age, sex, location, etc. since that pioneering work, the number of countries where this type of study has been conducted has been increasing steadily, but data on africa and asia are still scarce. besides, the resolution and amount of information vary from study to study [ ] . 
as such, we build four different models of interaction, assuming that only partial information about the population is available, see fig . the simplest formulation is the homogeneous mixing approach (model h), suitable when very limited information about the population is available. in this model, all individuals are able to contact each other with equal probability. the number of such interactions, $\langle k \rangle$, can be extracted from contact surveys simply by calculating the average number of contacts per individual. note, however, that this formulation is very simplistic since all individuals are completely equivalent. a slightly better approximation is to divide the population into age groups, given the demographic structure of the population, fig b, and establish a different number of contacts between and within them (model m), which is the common approach currently used in the epidemic literature to model age-mixing patterns. in this case, the necessary information includes knowing the age of both individuals participating in each contact, although this information can be easily summarized in an age-contact matrix, $M$, where each entry $M_{\alpha\beta}$ represents the average number of contacts from an individual in age group α to individuals in age group β. note that in both models only the average number of contacts is used, in one case the average over the whole population and in the other over each age group. another possibility is to use the whole contact distribution, fig d, to build the contact network of the population. this formulation is commonly found in the network science literature since it highlights the role that the disproportionate number of contacts of some individuals has in the dynamics of the disease. a simple way of creating these networks is to represent each individual $i$ as a node and extract its degree (number of contacts) from the distribution. then, the expected number of edges between nodes $i$ and $j$ is $\langle a_{ij} \rangle = k_i k_j / \sum_l k_l$ (model c).
to obtain this expression, we can consider that each node $i$ has $k_i$ stubs associated. next, if these stubs are matched together randomly, the probability that each stub from node $i$ ends up at one of the $k_j$ stubs of node $j$ is $k_j$ over the total number of stubs, $\sum_l k_l$. this method is known as the configuration model. lastly, we can combine both ingredients, the mixing patterns and the contact distribution of the population, in a network representation. to do so, we propose to arrange nodes in a multilayer network, in which each layer represents an age group. as such, the first step to create this network is to extract the age associated to each node from the demographic structure of the population, fig b, and assign them to their corresponding layer (since we are working with age-groups, our system is composed by that same amount of layers). then, the degree of each node should be extracted from the desired distribution. to incorporate the mixing patterns into the configuration model, we propose the following scheme:

1. given a node $i$ located in layer α (where the layer represents the age group associated with $i$), the probability that any of its stubs ends up at a node in any layer β (including the same layer) is $p_{\alpha\beta}$. this probability can be extracted from the mixing matrix as $p_{\alpha\beta} = M_{\alpha\beta} / \sum_{\gamma} M_{\alpha\gamma}$.
2. the stub from node $i$ will match the stub of node $j$, situated in layer β, with probability $k_j / \sum_{l \in \beta} k_l$, where the denominator indicates the addition over the degree of all nodes present in layer β.

hence, the expected number of edges between nodes $i$ and $j$ will be given by $\langle a_{ij} \rangle = p_{\alpha\beta} \, k_i k_j / \sum_{l \in \beta} k_l$. yet, note that incorporating the mixing patterns introduces a restriction in the degree distribution. indeed, one of the important properties of the mixing patterns matrix is that it has to verify reciprocity, that is, the number of contacts going from group α to group β has to be the same as the ones from β to α (if the populations of each group were equal, this would lead to a symmetric matrix). it is easy to see that eq ( ) only fulfills this property if, for every layer, $\langle k \rangle_\alpha = \sum_\beta M_{\alpha\beta}$, where $\langle k \rangle_\alpha$ represents the average degree in layer α. hence, even though the shape of the distribution can be chosen freely, the mixing matrix fixes the average degree of each layer. eqs ( ) and ( )

fig caption: on the other hand, if the full contact distribution of the population is known, regardless of their age, it is possible to build the contact network of the population (c). lastly, when both the contact distribution and the interaction patterns between different age groups are known, the individual heterogeneity and the global mixing patterns can be combined to create a multilayer network in which each layer represents a different age group (c+m). panel b: demographic structure of italy in [ ]. panel c: age-contact patterns in italy obtained in the polymod study [ ]. panel d: contact distribution in italy obtained in the polymod study [ ]. the x axis represents the number of daily contacts and the y axis the fraction of individuals that have reported such amount of contacts. the distribution is fitted to a right-censored negative binomial distribution since the maximum number of contacts that could be reported was . https://doi.org/ . /journal.pcbi. .g

to determine the consequences of each of the previous assumptions, we first consider a general susceptible-infected-susceptible (sis) markovian model [ , ]. in this model, the recovery rate of each infected individual is modeled by a poisson process with rate δ. in turn, each successful contact emanating from an infected individual (i.e., a contact that transmits the disease) is modeled as a poisson process with rate λ. we denote by $Y_i$ the bernoulli random variables that are equal to one if individual $i$ is infected or zero otherwise. in addition, the only ingredient left to be defined is how the contact process between individuals actually takes place.
in general, in its exact formulation, we can do so by introducing the matrix $A$, which denotes whether two individuals can contact each other or not [ , ]. with this formulation we can already study the spreading of an epidemic on any network, models c and cm. indeed, assuming that the states are independent, i.e., $\langle Y_i Y_j \rangle = \langle Y_i \rangle \langle Y_j \rangle$ for all pairs of nodes, and considering that the nodes with the same degree are statistically equivalent, we can obtain the epidemic threshold using the heterogeneous mean field approximation [ ], $\tau = \langle k \rangle / \langle k^2 \rangle$. this well-known result from network science clearly shows the importance of the heterogeneity of the contacts, since it depends on the second moment of the distribution. in the case of italy, using this expression we obtain a theoretical threshold of $\tau_{cm}$ = . and $\tau_c$ = . for the cm and c models, respectively. for the m model, since individuals are indistinguishable, eq ( ) is rewritten in terms of $M_{\alpha\beta}$, the matrix depicted in fig c, and $y_\alpha$, the fraction of infected individuals in layer α. in this case, using the next generation approach [ , ], the epidemic threshold is $\tau_m = 1/\rho(M)$. regarding italy, the spectral radius of $M$ is $\rho(M)$ = . , resulting in an epidemic threshold of $\tau_m$ = . . lastly, in the h model the epidemic threshold is $\tau_h = 1/\langle k \rangle$. according to fig d, the epidemic threshold in our system is thus $\tau_h$ = . . thus, in this case, the thresholds of the four models can be ranked, and some observations are in order. first, even though the average number of contacts is the same in all models, the epidemic threshold is completely different. besides, increasingly adding heterogeneity to the model lowers the epidemic threshold. this is especially relevant when going from classical mixing models to network models. indeed, when we introduce the whole contact distribution, we are indirectly adding the possibility of having super-spreading events, which, as noted before, is missing in the classical approaches.
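the threshold expressions discussed above can be evaluated numerically once a degree sample and an age-contact matrix are available. the sketch below is illustrative only: it assumes the standard mean-field forms (1/⟨k⟩ for homogeneous mixing, ⟨k⟩/⟨k²⟩ for the heterogeneous mean field, and 1/ρ(M) for the age-mixing model), the function names are hypothetical, and the toy inputs are not the polymod data.

```python
import numpy as np


def threshold_homogeneous(degrees):
    """Homogeneous mixing (model h), assumed form: tau = 1 / <k>."""
    k = np.asarray(degrees, dtype=float)
    return 1.0 / k.mean()


def threshold_hmf(degrees):
    """Heterogeneous mean field (model c), assumed form: tau = <k> / <k^2>."""
    k = np.asarray(degrees, dtype=float)
    return k.mean() / np.mean(k ** 2)


def threshold_age_mixing(mixing_matrix):
    """Age mixing (model m), assumed form: tau = 1 / rho(M)."""
    eigenvalues = np.linalg.eigvals(np.asarray(mixing_matrix, dtype=float))
    spectral_radius = np.max(np.abs(eigenvalues))
    return 1.0 / spectral_radius


# Toy inputs chosen for illustration; they are NOT survey data.
toy_degrees = [1, 1, 2, 2, 10]
toy_matrix = [[2.0, 1.0], [1.0, 2.0]]

print("tau_h:", threshold_homogeneous(toy_degrees))
print("tau_c:", threshold_hmf(toy_degrees))
print("tau_m:", threshold_age_mixing(toy_matrix))
```

since ⟨k²⟩ ≥ ⟨k⟩² for any degree sample, the heterogeneous mean field value can never exceed the homogeneous one, which is in line with the observation that adding contact heterogeneity lowers the threshold.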
on the other hand, as expected, the difference between both network models is relatively small ($\tau_{cm}/\tau_c$ = . ) since the main driver of the epidemic threshold is the contact distribution. nonetheless, as we shall see next, for other scenarios, the multilayer framework will yield quite different results from model c.

to assess the quality of our theoretical analysis, our first step is to obtain the epidemic threshold for each configuration numerically. to do so, we create an artificial population of individuals and assign them an age according to the demographic structure of the italian population [ ]. then, we simulate a stochastic sis markov model, with δ = and multiple values of λ for each of the four contact models under consideration (see materials and methods). in fig a, we show the attack rate (total number of cases over the whole population) as a function of λ. the overall behavior of the four scenarios is qualitatively similar, although large differences are observed in the value of the epidemic threshold (see inset), as predicted. to properly characterize the value of the epidemic threshold and compare it with the theoretical expectations, we use the quasistationary state (qs) method [ , ]. this technique allows computing the susceptibility of the system, which presents a peak at the epidemic threshold (see materials and methods). the caveat is that it is highly dependent on the system size since the epidemic threshold is only properly defined for infinite systems. nevertheless, in fig b we compute the susceptibility, χ, for the four configurations with system sizes ranging from to individuals, and we can see that for the latter the peak of the susceptibility is already quite close to the predicted value of the epidemic threshold, validating our theoretical approach. next, we focus on studying the impact that the disease has on each age group under the different configurations, fig c. we set the value of λ in each case so that the attack rate is equal to .
, since the four scenarios converge to that value for similar values of λ (see fig a) . using the homogeneous mixing approximation, we obtain a distribution of infected individuals across ages proportional to the demographic structure of the population (fig b) , as one would expect given that all individuals are virtually indistinguishable for the dynamics. the same result is obtained for the c model, in which the age of the nodes is completely independent of the network structure. at variance with these results, if we incorporate the heterogeneous mixing patterns of the population either in the age-mixing (m) model or in the multilayer network (cm) setting, the incidence in each age group would be quite different, see fig c. note that we have again set λ so that the overall incidence is . in all cases −this assures that the total number of infected individuals is the same, only its distribution across age classes is different. results show that in both scenarios the prevalence is much higher for teenagers and smaller for the older cohorts than in the homogeneous mixing model. although the sis model facilitates the theoretical and numerical analysis of the system, especially near the epidemic threshold, it is too simplistic to model real diseases such as ili. thus, to highlight the impact of these observations on a more realistic scenario, we slightly modify the model by incorporating the removed compartment so that the dynamics are governed by a susceptible-infected-removed (sir) model, which is better suited for studying ili [ ] . it has been recently shown that using a constant and group-independent basic reproduction number, r , might not describe well key features of the disease dynamics in realistic scenarios [ ] . for this reason, we first explore the dependency of this parameter with the age of the individual in the two networked scenarios. 
to do so, we simply count the total number of newly infected individuals that a single seeded infectious subject would produce in a fully susceptible population over simulations, with the value of λ set so that the average value of $R_0$ is . , in line with typical values for influenza [ ]. fig a shows the value of $R_0$ as a function of the age of the seed node in the network in which all nodes have the same degree distribution. clearly, the same $R_0$ value is obtained regardless of the age of the nodes, as it should be given that both their degree and their connections are independent of their age. conversely, in the multilayer network where the mixing patterns of the population are incorporated, fig b, the situation changes completely. the value of $R_0$ is above the average for teenagers and adults but below the average for the elderly, highlighting the importance of the underlying structure in the value of $R_0$.

lastly, we study the effect of vaccinating a fraction of the nodes before the epidemic begins. containment measures of this sort are among those that can benefit the most from knowledge about the structure of the population, since such knowledge allows devising more efficient vaccination strategies. first, we set the baseline scenario to values compatible with the - ili epidemic in italy. according to the world health organization, the total attack rate was . %. besides, an important fraction of the population was vaccinated preemptively. in italy, vaccination is recommended for several groups of people, such as those with chronic medical conditions, firefighters, health care workers, or the elderly [ ]. of these groups, the only one that we can distinguish in our model is the elderly, but it is also the one with the largest vaccination rates. unfortunately, the uptake of the vaccine has been decreasing for the past few years, and now is close to % [ ].
even more, the effectiveness of the vaccine is estimated to be around % yielding an effective vaccination rate of % in the elderly [ ] . hence, to obtain the baseline values in our model, we set % of the elderly in the recovered state initially and set the value of λ so that the attack rate is . %, fig a. our first observation is that in the c scheme, we trivially obtain a reduction in the attack rate among the elderly due to their vaccination, but otherwise, the incidence is the same in all age groups. on the other hand, both in the m and cm models, the attack rate depends highly on the age of the individual. to gauge the effect of increasing vaccination rates, we vaccinate % of the total population (assuming that the effectiveness is % for all age-groups). note that since the elderly group represents % of the population, the initial vaccination rate was roughly % of the total population. if these new vaccines are administered randomly, we can see that the effect is just a homogeneous reduction of - % in all age groups, independently of the model, fig b. conversely, if that same amount of new vaccinations is targeted, the situation changes completely. in the m model, we vaccinate individuals belonging to the group with - years old since it is the one with the largest number of contacts and the highest attack rates. we can see that the overall reduction is much larger than in the previous case, and especially so in this particular group, see fig c. in the c and cm models, instead, we apply the vaccines to individuals with the largest degrees. we can see that the reduction is larger in the c setting than in the cm one. this result might seem counter-intuitive since the same measure is applied to both systems. however, note that while in the c model the largest degrees are homogeneously distributed across the population, in the cm model they are concentrated in specific age groups, or layers. 
furthermore, since nodes in the same layer tend to be connected together, the previous observation implies that the effect of removing hubs will be lower. to verify this, we have rewired the connections of the cm model while preserving the age, degree and vaccination status of each node. as we can see, in such case we recover the same value as in the c model. in other terms, the correlations induced by the age mixing patterns lower the effectivity of this vaccination strategy. note also that in both the random and the targeted vaccination schemes, the number of new vaccines introduced in the system is exactly the same, only who is vaccinated changes. models can range from simple homogeneous mixing models to high-resolution approaches. the latter, even though it might provide better insights, is also much more data demanding. as a compromise between the two, network models can capture the heterogeneity of the population while keeping the amount of data necessary low. nevertheless, most network approaches focus only on determining the role that the difference in the number of contacts of the population has on the impact of disease dynamics but ignore other types of heterogeneities such as the age mixing patterns. we have shown that to determine the epidemic threshold of the population properly, the heterogeneity in the number of contacts cannot be neglected, making the simple homogeneous approach and the homogeneous approach with age mixing patterns ill-suited for it. in fact, a description that ignores the age mixing patterns of the population can capture much better the value of the epidemic threshold. furthermore, we observe two different regimes in the attack rate as a function of the spreading rate. for low values of the spreading rate, individual heterogeneity plays a more important role, yielding larger attack rates than the homogeneous counterparts. 
however, after a certain value, the phenomenology reverses, i.e., larger attack rates are obtained for the homogeneous approaches rather than for the networked versions. the reason is that, in homogeneous models, an infected agent can contact everyone in the population, and thus it can keep infecting individuals even if the attack rate is high. when the network is taken into consideration, it is possible that nodes run out of susceptible individuals within their vicinity, virtually preventing them from spreading the disease any further. on the other hand, if we study the distribution of infected individuals across age cohorts, we can see that the c scheme is no longer valid, yielding the same results as the simple homogeneous mixing approach. if the age mixing patterns are added into the model, either in the m or cm schemes, a larger fraction of young individuals will be infected, while the incidence in elder cohorts is reduced. hence, even though the c approach can predict fairly well the value of the epidemic threshold, it cannot be used to study the spreading of diseases in which taking into account the age of the individuals is important beyond the epidemic threshold. conversely, the multilayer network of the cm model can describe both the epidemic threshold and the distribution of the disease across age groups correctly. in other words, it combines both the importance that individual heterogeneity has with the inherent assortativity present in human interactions. individual heterogeneity also introduces important variations in the measured value of r . this observation is quite important since it shows that for the proper evaluation of r during emerging diseases, the sampling of the population has to be done carefully. biases in the sampled individuals, such as having too many young individuals, could lead to estimations of r much larger than its actual value. 
even more, this is not limited to the age of the individuals since we have also seen the importance of individual heterogeneity in the dynamics. of utmost relevance, if in the sample, there are individuals with an average number of contacts higher than the normal population, the estimations of r would also be higher. lastly, we have also observed the crucial role that heterogeneity plays if we want to devise efficient vaccination strategies. the role of networks in this regard is known to be important not only because there are tools that allow identifying the most important individuals, but because it provides a clear way to study herd immunity. yet, if we do not take into account the contact distribution of the population the effectivity of vaccination campaigns will be lower. conversely, if we rely simply on the contact distribution of the population and disregard their mixing patterns, we would overestimate the effect of vaccination. as the current covid- pandemic has shown, accounting for both the age and the contact heterogeneity of individuals is crucial to control the epidemic. it is yet unknown the exact role that age plays in this disease, although preliminary results show that children are less susceptible and that the case fatality rate for older individuals is much higher. similarly, large super-spreading events are possible such as the ones detected in south korea, boston or spain [ , ] . the latter country is also among the ones most affected by the current epidemic, but empirical information about the age mixing patterns of the population is not available [ , ] . thus, to the inherent problems of forecasting the evolution of an emerging disease [ , ] we have to add our ignorance about these factors which, as we have shown in this article, can substantially modify the predictions. this highlights once again the importance of obtaining precise information about the behavior of the population, enhancing our preparedness for this type of event. 
to sum up, we have shown the importance that individual heterogeneities have on the spreading of infectious diseases. yet, although in general the more details in the model the better, it is also important to take into account the inherent limitations about data that currently exist. therefore, it is crucial to correctly gauge what can and cannot be done, given the information available to us. in particular, we have shown that to predict the epidemic threshold, it is indispensable to know the degree distribution of the population. nonetheless, this is not strictly needed to evaluate the impact of a disease away from the threshold. yet, adding this information, even though it does not dramatically change the predicted outcomes of the epidemic under normal conditions, could be pivotal to devise efficient vaccination strategies. furthermore, we have seen that the underlying information of the system also has an impact on quantities that are commonly measured and used in real settings, such as $R_0$, implying that care must be taken when extrapolating the results from one study to the other.

in all cases, we consider populations of individuals. in the h model, since individuals are indistinguishable, the impact of the disease over the age groups is computed by randomly extracting values from the demographic distribution of italy in [ ]. in the m model, the size of each age group is computed using the same procedure. besides, the age-mixing matrix was corrected so that reciprocity is fulfilled, and the average connectivity is exactly . [ ]. in the c model, we randomly extract the degree of each node from a right-censored negative binomial distribution adjusted to the survey data from polymod [ ]. then, links are sampled performing a bernoulli trial over each pair of nodes respecting that $\langle a_{ij} \rangle = k_i k_j / \sum_l k_l$.
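the c-model construction described above (one bernoulli trial per pair of nodes, with expected number of edges $k_i k_j / \sum_l k_l$) can be sketched as follows. this is a minimal illustration rather than the authors' code: the function name is hypothetical, the negative binomial parameters and the censoring cap are made-up placeholders rather than fitted values, and capping the pair probability at 1 is an added assumption so that every trial is well defined.

```python
import numpy as np


def sample_configuration_network(degrees, rng):
    """Sample a simple undirected graph with <a_ij> = k_i k_j / sum_l k_l.

    Returns a boolean adjacency matrix (symmetric, no self-loops).
    Assumes at least one degree is positive.
    """
    k = np.asarray(degrees, dtype=float)
    n = k.size
    # Expected number of edges between every pair of nodes.
    prob = np.outer(k, k) / k.sum()
    # Assumption: cap at 1 so the value can be used as a probability.
    prob = np.clip(prob, 0.0, 1.0)
    # One Bernoulli trial per unordered pair (strict upper triangle only),
    # so multiple edges and self-loops cannot appear.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return upper | upper.T


rng = np.random.default_rng(0)
# Placeholder degree sample: negative binomial draws, right-censored at 50.
toy_degrees = np.minimum(rng.negative_binomial(2, 0.1, size=500), 50)
adjacency = sample_configuration_network(toy_degrees, rng)
print("target mean degree:", toy_degrees.mean())
print("realized mean degree:", adjacency.sum(axis=1).mean())
```

sampling each pair independently yields a simple graph directly, at the cost of an n-by-n probability matrix; for large populations a sparse or stub-matching variant would be preferable.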
a similar procedure is followed to create the multilayer with age mixing patterns, but in this case, each layer has its own values for the negative binomial distribution, according to the data (see fig d), and the probability of establishing a link respects $\langle a_{ij} \rangle = p_{\alpha(i),\beta(j)} \, k_i k_j / \sum_{l \in \alpha(j)} k_l$, where $p_{\alpha(i),\beta(j)}$ is the probability that a link from a node with the same age as node $i$ ends up at a node with the same age as node $j$, and $\alpha(j)$ is the layer to which $j$ belongs. we remark that the network is simplified, removing multiple edges.

close to the critical point, the fluctuations of the system are often high, driving the system to the absorbing state [ , ]. to avoid this problem, the quasistationary state (qs) method stores $m$ active configurations previously visited by the dynamics. at each step, with probability $p_r$, the current configuration (as long as it is active) replaces one of the $m$ stored ones. then, if the system tries to visit an absorbing state, the whole configuration is substituted by one of the stored ones. the system evolves for a relaxation time, $t_r$, and then the distribution of the number of infected individuals, $p_n$, is obtained during a sampling time $t_a$. lastly, the threshold is estimated by locating the peak of the modified susceptibility $\chi = N(\langle \rho^2 \rangle - \langle \rho \rangle^2)/\langle \rho \rangle$, where $\langle \rho^k \rangle$ is the k-th moment of the distribution of the number of infected individuals, $p_n$ (note that $\langle \rho^k \rangle = \sum_n n^k p_n$). in our analysis, the number of stored configurations and the probability of replacing one of them is fixed to $m$ = and $p_r$ = . , while the relaxation and sampling times vary in a range depending on the size of the system, $t_r$ = − and $t_a$ = − .
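the last step of the qs procedure, locating the peak of the susceptibility, can be illustrated with a short sketch. it assumes that the numbers of infected individuals recorded during the sampling window are already available for each value of λ, and it takes ρ to be the infected density n/N (with this choice χ equals the variance of n divided by its mean). the function names and the toy samples are hypothetical.

```python
import numpy as np


def qs_susceptibility(infected_counts, n_nodes):
    """chi = N (<rho^2> - <rho>^2) / <rho>, with rho = n / N (assumed)."""
    rho = np.asarray(infected_counts, dtype=float) / n_nodes
    mean_rho = rho.mean()
    return n_nodes * (np.mean(rho ** 2) - mean_rho ** 2) / mean_rho


def locate_threshold(lambdas, samples_per_lambda, n_nodes):
    """Return the lambda at which the susceptibility peaks, plus all chi."""
    chi = [qs_susceptibility(s, n_nodes) for s in samples_per_lambda]
    return lambdas[int(np.argmax(chi))], chi


# Toy samples of the number of infected nodes (active states only).
toy_lambdas = [0.1, 0.2, 0.3]
toy_samples = [[10, 10], [10, 30], [20, 22]]
peak_lambda, chi_values = locate_threshold(toy_lambdas, toy_samples, 100)
print("estimated threshold:", peak_lambda)
```

in practice the samples would come from long runs after the relaxation time, and the peak position would be tracked as the system size grows, since the threshold is only sharply defined in the infinite-size limit.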
heterogeneity in pathogen transmission: mechanisms and methodology the role of population heterogeneity and human mobility in the spread of pandemic influenza an infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices a review of simulation modelling approaches used for the spread of zoonotic influenza viruses in animal and human populations. zoonoses and public health modeling infectious diseases in humans and animals sexual practices and risk of infection by the human immunodeficiency virus gonorrhea transmission dynamics and control heterogeneities in the transmission of infectious agents: implications for the design of control programs transmission dynamics of hiv infection models for vector-borne parasitic diseases age-related changes in the rate of disease transmission: implications for the design of vaccination programmes sexual lifestyles and hiv risk aids and sexual behaviour in france social contacts and mixing patterns relevant to the spread of infectious diseases complex networks: structure and dynamics superspreading and the effect of individual variation on disease emergence network theory and sars: predicting outbreak diversity stochasticity and heterogeneity in the transmissiondynamics of sars-cov- infectious disease transmission and contact networks in wildlife and livestock a high-resolution human contact network for infectious disease transmission statistical physics of vaccination networks and epidemic models strategies for containing an emerging influenza pandemic in southeast asia strategies for mitigating an influenza pandemic mitigation strategies for pandemic influenza in the united states spread of zika virus in the americas measurability of the epidemic reproduction number in data-driven contact networks reactive school closure weakens the network of social interactions and reduces the spread of influenza recalibrating disease parameters for increasing realism in 
modeling epidemics in closed settings robust modeling of human contact networks across different scales and proximity-sensing techniques contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys data on face-to-face contacts in an office building suggest a low-cost vaccination strategy based on community linkers what's in a crowd? analysis of faceto-face behavioral networks empirical temporal networks of face-to-face human interactions mixing patterns between age groups in social networks disease control in age structure population infectious diseases of humans: dynamics and control using empirical social contact data to model person to person infectious disease transmission: an illustration for varicella the impact of demographic changes on the epidemiology of herpes zoster: spain as a case study age-structure and transient dynamics in epidemiological systems global dynamics of a discrete age-structured sir epidemic model with applications to measles vaccination strategies parental vaccination to reduce measles immunity gaps in italy. elife contact network structure explains the changing epidemiology of pertussis data-driven model for the assessment of mycobacterium tuberculosis transmission in evolving demographic structures. 
proceedings of the national academy of sciences a systematic review of social contact surveys to inform transmission models of close-contact infections inferring the structure of social contacts from demographic data in the analysis of infectious diseases spread projecting social contact matrices in countries using contact surveys and demographic data inferring high-resolution human mixing patterns for disease modeling projecting social contact matrices to different demographic structures high-resolution epidemic simulation using withinhost infection and contact data epidemic spreading in scale-free networks transmission dynamics of an sis model with age structure on heterogeneous networks effects of vaccination and population structure on influenza epidemic spread in the presence of two circulating strains incorporating disease and population structure into models of sir disease in contact networks multiple lattice model for influenza spreading world population prospects , custom data acquired via website virus spread in networks fundamentals of spreading processes in single and multilayer complex networks dynamical processes on complex networks the construction of next-generation matrices for compartmental epidemic models the impact of regular school closure on seasonal influenza epidemics: a data-driven spatial transmission model for belgium epidemic thresholds of the susceptible-infected-susceptible model on networks: a comparison of numerical and theoretical results age-specific contacts and travel patterns in the spatial spread of h n influenza pandemic estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. bmc infectious diseases seasonal influenza vaccination and antiviral use in eu/eea member states. 
european centre for disease prevention and control association between vaccination coverage decline and influenza incidence rise among italian elderly adjuvanted influenza vaccine for the italian elderly in the / season: an updated health technology assessment evaluation of the potential incidence of covid- and effectiveness of containment measures in spain: a data-driven approach forecasting covid- . front phys. predictability: can the turning point and end of an expanding epidemic be precisely forecast? arxiv key: cord- -jqh authors: nan title: next generation technology for epidemic prevention and control: data-driven contact tracking date: - - journal: ieee access doi: . /access. . sha: doc_id: cord_uid: jqh contact tracking is one of the key technologies in prevention and control of infectious diseases. in the face of a sudden infectious disease outbreak, contact tracking systems can help medical professionals quickly locate and isolate infected persons and high-risk individuals, preventing further spread and a large-scale outbreak of infectious disease. furthermore, the transmission networks of infectious diseases established using contact tracking technology can aid in the visualization of actual virus transmission paths, which enables simulations and predictions of the transmission process, assessment of the outbreak trend, and further development and deployment of more effective prevention and control strategies. exploring effective contact tracking methods will be significant. governments, academics, and industries have all given extensive attention to this goal. in this paper, we review the developments and challenges of current contact tracing technologies regarding individual and group contact from both static and dynamic perspectives, including static individual contact tracing, dynamic individual contact tracing, static group contact tracing, and dynamic group contact tracing. 
with the purpose of providing useful reference and inspiration for researchers and practitioners in related fields, directions in multi-view contact tracing, multi-scale contact tracing, and ai-based contact tracing are provided for next-generation technologies for epidemic prevention and control.

outbreaks of infectious diseases can cause huge losses of human life. the spanish pandemic in led to over million deaths [ ] . as of , approximately . billion people worldwide are at risk of malaria, and every seconds one patient dies due to malaria infection [ ] . the death rate of tuberculosis has exceeded that of aids, making tuberculosis the most deadly infectious disease in the world. about % of people in south africa have latent tuberculosis and there were , cases of tuberculosis in alone [ ] . along with the serious threat to human lives, infectious diseases also bring huge economic losses. statistically, malaria causes an economic loss of billion us dollars every year in african countries [ ] . seasonal influenza in the us causes an annual economic burden of billion us dollars [ ] . the developments in vaccines and drugs have enabled us to combat infectious diseases and have greatly reduced the harm brought to human society. however, suddenly emerging infectious diseases, driven by drug resistance and the inherent variability of viruses, remain a serious global problem that often leaves us unprepared and vulnerable. for example, the h n flu in mutated to become the h n flu in and the h n flu in . the h n virus began to spread through contact networks in hong kong in january , and lives were taken in just one month [ ] . the death toll rose to after three months. thus, in the fight against various kinds of infectious diseases, relying solely on vaccine development is far from enough. a more effective ''active prevention and control'' method is desperately needed so as to rapidly detect and block the transmission paths of new infectious diseases, confining the disease to a minimum spread until its eradication.

figure . the spatial distribution of h n cumulative cases in the early stage of the outbreak in mainland china in [ ] . in this case, infected cases are recorded with the information of location and time, while the contact information dominating transmission remains unknown.

many infectious diseases are transmitted through person-to-person ''contact''. in computational epidemiology, a contact is simply defined either as a direct physical interaction (e.g. sexual contact) or a proximity interaction (e.g. two people being within m of each other, or being in the same room) [ ] . human contact interactions constitute a ''contact network'' of virus transmission. in this network, nodes represent individuals, and links represent contact relationships. the structure of the contact network significantly affects the spatiotemporal patterns of virus spread. for example, in the case of respiratory infections that spread through droplets, interactions like face-to-face communication, shaking hands, crowd gathering, and sharing vehicles enable the spread of diseases and increase the possibilities of transmission from infected to susceptible persons. tracking the contact interactions of individuals can effectively recover the ''invisible'' virus transmission paths, quickly locate and isolate high-risk individuals who were in contact with infected persons, and aid in quantitative analysis of the transmission paths, processes, and trends of infectious diseases, all leading to the development of corresponding effective epidemic control strategies. the biggest obstacle in contact tracking is obtaining data that directly describe contact behaviors. because contact interactions between individuals are diverse and often subtle, they are difficult to observe and record directly. in other words, it is hard to obtain first-hand high-quality data for contact tracking.
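as a small illustration of the proximity definition of a contact and of the resulting contact network (nodes are individuals, links are contacts), the sketch below links any two people whose positions lie within a chosen radius. the names, coordinates, and the radius value are hypothetical; the distance threshold used in the literature is not restated here.

```python
from itertools import combinations
from math import dist

def contact_network(positions, radius):
    """Return adjacency sets; a link joins two people no farther apart than `radius`."""
    links = {person: set() for person in positions}
    for a, b in combinations(positions, 2):
        if dist(positions[a], positions[b]) <= radius:
            links[a].add(b)
            links[b].add(a)
    return links

# hypothetical snapshot of three people (coordinates in metres)
positions = {"ann": (0.0, 0.0), "bob": (1.0, 0.5), "cat": (8.0, 8.0)}
network = contact_network(positions, radius=2.0)
print(network)
```

a single snapshot like this gives a static network; repeating the computation over time-stamped positions would give the dynamic contact records discussed later in the review.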
when a disease is spreading, only its impact can be observed, not the underlying direct interactions between individuals. for example, during the outbreak of h n bird flu, it is difficult to identify who was infected through contact with which infected people. as shown in fig. , only new h n cases and the number of deaths at different times and places can be observed. many epidemiology scholars and computer scientists have conducted research on how to accurately capture individuals' contact behavior data as well as how to indirectly infer the contact network from other data sources. many methods have been proposed, most of which utilize intelligent data analytics technologies, such as intelligent sensing, network modeling and analysis, data visualization, multi-source heterogeneous data mining, data-driven reverse engineering, machine learning, and multi-agent simulation, among others. based on the granularity of contact modeling, the existing methods can be classified into four categories: static individual contact tracking, dynamic individual contact tracking, static group contact tracking, and dynamic group contact tracking. each of these methods is described and discussed separately in the following sections.

individual contact tracking records fine-grained ''individual-to-individual'' contact information, such as contact time, location, frequency, duration, etc. the most common ways to gather contact information are non-automatic methods, e.g., offline and online questionnaires [ ] , [ ] , [ ] , and automatic methods, e.g., mobile phone, wireless sensors, rfid, and gps [ ] , [ ] , [ ] , [ ] . offline questionnaires have been used for many years in some countries to trace sexual contacts of sexually transmissible infections (stis), particularly for hiv [ ] . in recent years, hiv has killed more than million people. currently, there are still million newly infected hiv individuals and half of them will die every year [ ] , [ ] .
to reveal the spread patterns of hiv infection and aids in the u.s., fay et al. analyzed the patterns of same-gender sexual contact among men using data from national sample surveys conducted in and . they found that at least . percent of adult men have had sexual contact with another man at some point in life, and that never-married men are more likely to have same-gender sexual contacts [ ] . similarly, merino et al. [ ] sampled homosexual men from colombia as volunteers to answer questionnaires on sexual practices. analysis of these questionnaires suggests two significant risk factors for hiv- infection: ) having sexual contact with foreign visitors; ) having more than ten homosexual partners. they suggest that the spread of hiv- infections should be monitored at the international level and that more attention should be paid to these subgroups with high transmission rates. it is commonly assumed that unsafe sexual behavior is more prevalent among individuals with optimal viral suppression. in the swiss cohort study on april , , wolf et al. [ ] investigated unsafe behavior among hiv-infected individuals using self-reported questionnaires. however, after adjustment for covariates, they reported that unsafe sex is associated with other factors, e.g., gender, age, ethnicity, status of partner, having occasional partners, living alone, etc. in recent years, researchers have also assessed the validity and reliability of questionnaires on sexual abstinence and avoidance of high-risk situations. for example, najarkolaei et al. [ ] sampled female undergraduate students from tehran university, iran, and assessed the validity and reliability of a questionnaire designed to measure sexual and behavioral abstinence and avoidance of high-risk situations. zhang et al. [ ] surveyed hiv-positive persons on their socio-demographic characteristics and sexual behavior, and traced the hiv infection status of persons who had heterosexual contact with the hiv carriers.
among these persons, were hiv-positive, i.e., the secondary attack rate was . %. they therefore call for improving knowledge about hiv/aids, enhancing psychological education, and promoting condom use, so as to suppress the transmission of hiv. in addition to hiv, offline questionnaires have also been used to trace the contact between individuals to investigate other infectious diseases such as chlamydia trachomatis infection, zika, and flu. to seek the view of patients with chlamydia trachomatis infection on legislation impinging on their sexual behavior, an investigation was performed on patients at std clinics in stockholm, sweden in . during the past months, men ( %) were more likely to have sexual intercourse with occasional partners than women ( %), and the mean number of men and women was . and . , respectively [ ] . the zika virus is primarily transmitted by aedes species mosquitoes. however, by reviewing the travel experience and sexual intercourse of infected individuals in the us, researchers confirmed that cases of zika virus infection were transmitted by sexual contact [ ] . for instance, a person in texas became infected with the zika virus after sexual contact with someone who had acquired the infection while travelling abroad [ ] . in , molinari et al. [ ] investigated the contact interactions of , students in a high school through questionnaires. a local campus contact network was established based on information such as the length of contact time and contact frequency. the outbreak process of flu was then simulated based on this established contact network. they found that the classroom is the location with the most campus contact and that class break and lunch break are the times with the most campus contact. offline questionnaires are an effective way to trace private contact interactions such as sexual practice between individuals.
however, target participants must be found one by one within a specific region, which is time-consuming and labor-intensive. moreover, data collected by this method are usually delayed and incomplete. to collect more timely and lower-cost data on various kinds of contact behaviors, online questionnaires such as online surveys and web-based surveys have been extensively applied. in , a national online survey was conducted among adolescent males, using audio computer-assisted self-interviewing (audio-casi) technology. compared with a traditional self-administered questionnaire, the prevalence of male-male sex with intravenous drug users estimated by audio-casi was higher by more than % [ ] . influenza-like illness (ili) breaks out on a large scale every year in many countries, so recording and detecting ili is an important public health problem. flutracking, a weekly web-based survey of ili in australia, has been used to record the past and current influenza immunization status of participants in winter influenza seasons for many years [ ] . it takes participants less than seconds to complete the survey, including documenting symptoms, influenza vaccination status, and mobility behaviors such as time off work or normal duties. in , the peak week detected by flutracking was august, which was contemporaneous with that in other influenza surveillance systems [ ] . in its first three years of operation, the number of participants increased from to and , in , , and , respectively, owing to its convenience in completing the survey and its accuracy in detecting the peak week of influenza activity. flutracking also provides vaccine effectiveness analysis by investigating the status of vaccinated and unvaccinated participants. in , the ili weekly incidence peaked in mid-july in the unvaccinated group, month earlier than in the vaccinated group, as confirmed by the national influenza laboratory [ ] .
in recent years, by cooperating with the health department, organizational email systems, and social media, flutracking gained over new participants each year through invitations sent by existing participants. as a result, the number of online participants in flutracking has exceeded , in [ ] . contact information collected through an online questionnaire is more timely and cheaper than that collected through an offline questionnaire. however, it still cannot record real-time contact information, and, moreover, contact information collected online is sometimes inaccurate or even false [ ] , because people on the internet are usually anonymous, which makes it impossible to verify their real name, age, place of residence, etc.

individual contact information obtained through offline or online questionnaires is usually time delayed, incomplete, and inaccurate. to collect dynamic, complete, and accurate individual contact information, some researchers began to use mobile phones, wireless sensors, rfid, and gps devices to track individual contact behaviors. in recent years, the use of mobile phones has become increasingly universal, providing a convenient way to record real-time location information [ ] , [ ] . in , yoneki [ ] developed the first contact pattern discovery software, fluphone, which could automatically detect and record the contact behavior between users by mobile phone. the researchers collected the contact information of users on the cambridge university campus with this software and established the contact network between different users at different times. then, they simulated an influenza outbreak on this network using a seir model. because the large power consumption of gps and bluetooth shortens the standby time of mobile phones, yoneki and crowcroft [ ] further developed a new contact tracking application, epimap, using wearable sensors, which had lower power consumption and longer standby time.
epimap transmits and stores data by satellite, because many high-risk areas are in developing countries, where wireless communication facilities are often insufficient to support contact tracking. wearable wireless sensors can record individuals' contact events, including time, location, and duration, continuously and accurately, and are gradually becoming a useful tool for collecting high-precision contact data in small areas [ ] . they have been applied to discover contact patterns in various kinds of social settings such as hospitals and campuses. for example, mit media lab researchers nathan eagle and alex pentland proposed the reality mining method as early as . this method suggests the use of wearable wireless sensors to record people's daily activities [ ] , [ ] . they developed an experimental system to record the activities of several mit students in a teaching building over time, and then established a small social network describing their contact relationships [ ] . salathé et al. [ ] collected the contact interactions of students in a high school in the united states for one day using wireless sensors, and they established an individual-based campus contact network. it was found that the campus contact network had high-density connectivity and a small-world structure [ ] . however, it is costly to trace contact interactions using wearable wireless sensors, especially when the number of individuals being monitored is large. moreover, people wearing such sensors are very conspicuous, and participants, especially patients, are often unwilling to wear them because of privacy concerns. radio frequency identification (rfid) is a non-contact automatic identification technology that records contact behavior when individuals carrying small non-contact chips come close to each other. in , olguin et al.
[ ] collected , contacts among people (including medical staff, children in critical condition, and nursing staff) in a children's hospital in the united states using radiofrequency identification devices (rfid), and established a contact structure for the hospital. similarly, yoneki [ ] collected students' contact data from a french primary school using radiofrequency identification devices. more recently, in october , ibm researchers kurien and shingirirai from africa labs invented a radio tag designed to extend tracker working distance, and implemented it in tracking tuberculosis ( fig. ) [ ] . each tag contains a tiny sensor, a storage device, and a battery. radio tags can communicate with each other, allowing individuals' contact interactions to be recorded when two tags are in close proximity. the contact data collected by the radio tags are presented in a three-dimensional visualization system. using the intelligent data analysis methods provided by the system, medical staff can view the spatiotemporal distribution of tuberculosis patients in real time, track the transmission paths of tuberculosis, and find high-risk populations. because of the high cost of tuberculosis vaccines, contact data can also aid in the determination of high-priority vaccinations. however, the traditionally used radiofrequency tracker has a limited transmission and receiving range and only works within a small area.

figure . an ibm researcher holds a micro-radio tuberculosis tracker [ ] . in october , ibm researchers from johannesburg, south africa, released their latest research update: using cheap radio tags to anonymously track the contact transmission paths of tuberculosis. this study is an important step for ibm in helping who eliminate tuberculosis.

gps (global positioning system) has the capability of long-distance positioning and has been widely used for tracing indoor and outdoor mobility behaviors and physical activities [ ] , [ ] .
as populations age, tracing mobility behavior is critical for measuring, describing, and comparing mobility patterns in older adults. for example, hirsch et al. [ ] investigated mobility patterns using gps tracing data collected from older adults in vancouver, canada, with the goal of understanding neighborhood influences on older adults' health. they found that participants who were younger tended to drive more frequently and live far from their neighborhoods. gps devices have also been used for tracing physical activities of adolescents in school and other social settings [ ] - [ ] . for instance, rodriguez et al. [ ] sampled adolescent females in minneapolis and san diego, usa, and traced their physical activity and sedentary behaviors by gps every s in different settings. physical activities were more likely to occur in parks, schools, and places with high population density during weekends, and less likely to occur in places with roads and food outlets. in addition, tracing animals in the sea or on land using gps devices can provide detailed spatiotemporal data on their movement patterns. for instance, dujon et al. [ ] traced a green turtle travelling more than km across the indian ocean and obtained more than , locations. moreover, by tracing the whole-body motion dynamics of a cheetah using gps devices, patel et al. [ ] illuminated the factors that influence performance in legged animals. although detailed individual contact information can be collected through non-automatic methods, e.g., offline and online questionnaires, and automatic methods, e.g., mobile phones, wearable wireless sensors, rfid, and gps devices, these methods are mostly limited to small-scale population experiments because of their high cost and short collection range. they have not been applied to large areas or large-scale contact behavior studies.
group contact tracing captures contact interactions of human beings with similar characteristics (e.g., age, occupation, hobbies) in different social settings at the macroscopic level. static group contact behavior can be traced by large-scale questionnaires and simulated by multi-agent models. dynamic group contact behavior can be inferred by data mining methods such as tensor deconvolution. in recent years, a composite group model that can characterize population heterogeneity and model epidemic spreading dynamics, while overcoming the difficulty of obtaining fine-grained data on individuals, has attracted much attention. such models not only simulate the transmission process, but also depict the contact structure of a larger population. the composite group model divides the population into several meta-populations by age or spatial location, so that individuals within a meta-population have similar biological characteristics (such as susceptibility, infectivity, latent period, and recovery period). then, the process of epidemic transmission can be modeled using group contact interactions among meta-populations instead of individuals' contact interactions [ ] . based on this model, the infection and spread of epidemics can be described as a reaction-diffusion process. ''reaction'' characterizes the process of individual infection within a meta-population. ''diffusion'' characterizes the transfer process of epidemic diseases between different meta-populations through the group contact structure (fig. ). in addition, establishing contact networks for composite groups has practical significance because control strategies for epidemic diseases are usually oriented towards composite populations; for example, vaccination groups are usually divided by age when planning vaccine allocation strategies. [ ] .
the diffusion process is illustrated from a macroscopic perspective, i.e., the transmission among different meta-populations, whereas the reaction process is illustrated from a microscopic perspective, i.e., the individual infection within a meta-population. the composite group contact network was first established using questionnaires. in , mossong et al. [ ] conducted the polymod research project in europe, in which they organized a wide-range survey on contact behaviors, involving volume , , participants from eight european countries. a total of , contact records were collected. they found that contact interactions have significant spatial heterogeneity, with most individual contacts occurring at home ( %), work ( %), school ( %), places of entertainment ( %), and while using transportation ( %). further, contact structures under different scenarios have obvious differences. there are some age-related contact patterns: in many scenarios (such as in schools), individuals are more likely to contact people of similar age; most of the contact between children and their parents occurs at home, while most contacts for adults occur in workplaces. the researchers thus divided the population into several meta-populations, establishing a composite group model based on age. interaction probabilities between different age groups were estimated according to questionnaire data (fig. ) , and a contact network based on composite groups was established. the simulation method based on multi-agent models is also applied to the establishment of contact networks. this generally involves combining the questionnaire survey with population census data to establish the contact structure of composite groups. iozzi et al. [ ] modeled a virtual society with the characteristics of italian society based on questionnaire data from , people. human daily migration behaviors were simulated by a virtual community, and a contact matrix of the composite group was obtained. 
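to make the composite group model concrete, the sketch below runs a deterministic sir model over two meta-populations coupled by a contact matrix. it is an illustrative simplification, not the model of any study cited here: the function name `group_sir`, the two-group contact matrix, the group sizes, and the rates `beta` and `gamma` are all assumed values. the matrix is chosen so that reciprocity holds, i.e. `contact[a, b] * pop[a] == contact[b, a] * pop[b]`.

```python
import numpy as np

def group_sir(contact, pop, beta, gamma, i0, steps, dt=0.1):
    """Deterministic SIR over meta-populations.

    contact[a, b]: mean number of contacts per unit time that one member
    of group a has with members of group b.
    """
    pop = np.asarray(pop, dtype=float)
    i = np.asarray(i0, dtype=float)
    s = pop - i
    r = np.zeros_like(pop)
    history = [i.copy()]
    for _ in range(steps):
        # force of infection on each group: contacts weighted by prevalence in the contacted group
        force = (beta * contact) @ (i / pop)
        new_inf = np.minimum(s, force * s * dt)   # never infect more than the susceptibles left
        new_rec = gamma * i * dt
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append(i.copy())
    return s, i, r, np.array(history)

# hypothetical two-group example (e.g. children and adults)
contact = np.array([[10.0, 3.0],
                    [1.0, 4.0]])
pop = np.array([1000.0, 3000.0])
s, i, r, history = group_sir(contact, pop, beta=0.05, gamma=0.2,
                             i0=[1.0, 0.0], steps=1000)
print("final attack rate per group:", r / pop)
```

replacing `contact` with a survey-based age-mixing matrix, or making it depend on time, is how the static and dynamic group models discussed in this section differ.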
based on this matrix, the outbreak process of italian b (human parvovirus) was successfully simulated. similarly, eubank et al. [ ] simulated the movement of individuals within a city using a large-scale agent system, and then modeled a group contact structure based on their simulation. the data they used included population census data, land usage, population migration, and other daily behavior data. constructing a contact matrix for meta-populations requires large-scale, even nationwide, questionnaire surveys, which are costly and slow. the multi-agent method simulates human mobility behaviors in a virtual world based on a contact matrix constructed from past real-world data [ ] . it does not consider changes to existing contact patterns caused by human self-awareness and by future epidemic-control strategies.

most of the above studies focus on static properties of contact behaviors, such as the contact object (who is contacting), scene (where this contact happens), frequency, and duration. in other words, the aforementioned studies assume that the contact patterns of the individual remain stationary. however, contact interactions usually change with time, and show different temporal and spatial patterns. for example, contact interactions can change periodically with the weather and season, vary significantly between workdays, weekends, and holidays, and may be adjusted in response to the threat of an epidemic disease and during the outbreak by reducing travel or wearing face masks [ ] , [ ] . additionally, government-imposed epidemic-control strategies can significantly change individuals' contact patterns [ ] , [ ] , [ ] , [ ] . for example, during the outbreak of h n flu in hong kong in , interventions, such as flight reductions, school closures, and vaccination efforts, significantly altered individuals' contact interactions [ ] - [ ] .
dynamic contacts between individuals are more difficult to observe and record than static contacts because of the limitations of existing contact tracing methods. offline and online questionnaires are incapable of recording real-time contact information, and feedback from participants usually arrives with a delay. automatic contact tracing methods such as mobile phones, wearable wireless sensors, rfid, and gps devices can collect continuous mobility information [ ], [ ]. however, all these methods are limited to monitoring the mobility behaviors of small-scale populations, due to high power consumption, short positioning range, high cost, etc. besides, most people cannot be expected to agree to have their dynamic contact interactions monitored in real time because of privacy concerns. for example, wearing a tracker can be equated to declaring oneself an infectious disease patient. tuberculosis patients in african countries face social prejudice, making wearing an identifier a sensitive issue [ ]. in light of these obstacles, a new path must be found that does not ''directly'' capture and record individuals' dynamic contact behaviors, but ''indirectly'' infers the dynamic contact model of a large-scale population from other readily available data sources. infectious disease surveillance, like that depicted in fig. , expands every day with the widespread application of information technology in the medical field. surveillance data record spatiotemporal information related to the spread of infectious diseases, which is the result of the spread model acting on the real contact network, as shown in fig. (a). such surveillance data can be regarded as an external manifestation of the implicit contact network, suggesting that the dynamic contact network could be ''inversely'' inferred from infectious disease surveillance data, as shown in fig. (b).
essentially, this is a complex inverse engineering problem: using the observed dynamic phenomenon to determine the dynamic structure that leads to it, in other words, determining time-dependent contact interactions from the time-dependent spread trend of infectious diseases. based on the idea of inverse engineering, yang et al. [ ] proposed a novel modeling and inference method for constructing a dynamic contact network based on a tensor model. they described the spatiotemporal patterns of composite group contacts as a tensor, modeled the inference of the dynamic contact network as a low-rank tensor reconstruction problem, and proposed a tensor deconvolution based inference method by fusing compressed sensing, sparse tensor decomposition and epidemic propagation models. this method makes it possible to determine the dynamic contact network of a large-scale composite group from population census data and surveillance data of many epidemic diseases. using this method, composite group dynamic contact networks for hong kong and taiwan were established using population census data and surveillance data of a variety of infectious diseases (such as h n , influenza, measles, mumps, etc.) for these two areas. the temporal and spatial evolution patterns of individuals' dynamic contact interactions were analyzed. based on the established dynamic contact network, they further studied the spread dynamics and the prevention and control strategies of the h n epidemic. they arrived at two important conclusions: (1) in the h n outbreak in hong kong in , if the beginning of the new semester had been delayed by two to six weeks, the total number of infections would have been reduced by % to %; (2) the best strategy for prevention and control of h n spread is vaccination of school-age children in the first few days of the new semester.
contact tracking based on intelligent information processing technology represents an active prevention and control strategy for infectious diseases. its main functions are to achieve early detection and timely intervention of infectious diseases. research on contact tracking methods not only expands the options for preventing and controlling infectious diseases, but also further improves people's understanding of their own contact behaviors. contact tracking has become an increasingly mature data-driven technology for disease prevention and control, evolving from individual tracking to group tracking. individual tracking attempts to capture more detailed contact interactions in order to accurately locate infected patients and high-risk susceptible populations. the traditional offline questionnaire is a practical method for tracing private contact interactions between individuals, such as sexual practices. however, finding target participants and receiving their feedback is costly and slow. by comparison, the online questionnaire is a low-cost way to collect timely feedback from participants. however, it cannot record the exact time at which a contact occurs. moreover, offline and online questionnaires sometimes provide inaccurate information about human mobility. for example, klous et al. [ ] surveyed participants in a rural area in the netherlands using a questionnaire and a gps logger, respectively. investigations of walking, biking, and motorized transport duration showed that the time spent walking and biking was strongly overestimated in the questionnaire. automatic contact tracing methods enable researchers to obtain continuous and accurate individual contact information, e.g., time, location, duration, etc. mobile phones and wireless sensors have been widely used to trace the mobility behaviors of students on campus and patients in hospitals.
then, a small-scale contact network within the traced regions can be constructed and the diffusion process of an infectious disease such as influenza can be simulated and analyzed in detail. however, the use of mobile phones is limited to tracing short-term contact behavior because of the high power consumption of gps and bluetooth. wearable wireless sensors can only be applied to small-scale populations due to their high cost and privacy concerns. rfid devices are convenient to carry and address privacy concerns well, but can only be used for short-range collection. gps devices have the advantage of long-distance positioning. however, capturing indoor mobility behaviors with them is costly because communication stations are required [ ]. so far, none of these automatic contact tracing methods has been used for studies of large-scale individual contact. group tracking replaces individual contacts with the contact probability of meta-populations, which, to some extent, overcomes the obstacles of individual tracking. using a contact matrix of meta-populations, contact patterns of people with similar features can be depicted at the macroscopic level. however, the contact matrix is usually constructed using data collected from a nationwide questionnaire, which is static and can only represent the contact patterns of the past. to explore dynamic contact patterns of meta-populations, a data-driven ai (artificial intelligence) method was adopted, i.e., tensor deconvolution [ ]. based on this method, a dynamic evolutionary model of group contact was constructed and dynamic contact patterns were inferred inversely from the time-dependent nature of the infectious disease surveillance data. nevertheless, it should be noted that although this method can characterize a wider range of dynamic contact behaviors, it cannot be used to accurately locate unique contact events because of the coarse granularity of the captured contact behaviors.
exploring social contact patterns for epidemic prevention and control is a very promising research direction, and some potential future development directions are outlined as follows. a. multi-view contact tracing data obtained from different views can express different patterns of mobility behaviors. for example, offline and online questionnaires can accurately record contact events that occurred in places individuals frequently visit [ ]. gps devices can record indoor and outdoor contact events that happen occasionally [ ]. a heterogeneous contact network constructed from various kinds of information can provide a new way of analyzing and simulating the spread of epidemics. therefore, tracing mobility behaviors and analyzing contact patterns from multiple views, to gain new insight into what heterogeneous contact patterns look like, will be a new direction in the future. existing studies focus on either individual-level or group-level contact tracing, presenting independent contact patterns at microscopic and macroscopic scales, respectively. however, group contact patterns are formed by the collaborative behaviors of individual mobility, while individual mobility behaviors can be influenced by others in the same group. revealing hidden interactions between individual contact and group contact will help to identify influential individuals as sentries for disease monitoring. therefore, discovering hidden interactions from multi-scale contact patterns that bridge individual contact and group contact will be a new opportunity for early epidemic detection. dynamic mobility behaviors lead to complex contact patterns, which are usually hidden and cannot be directly traced by non-automatic or automatic methods. a better way to infer dynamic contact patterns is to adopt ai-based methods using heterogeneous real-world data.
existing studies such as tensor deconvolution treat the combination of contact probabilities within real-world social settings like school, home, and workplace as linear [ ]. however, hidden dynamic contact patterns within these social settings could be more complicated than linear models can characterize. therefore, exploring advanced ai-based contact tracing methods, e.g., multi-view learning [ ] - [ ], deep learning [ ], broad learning [ ], etc., will lead to the next generation of technology for epidemic prevention and control. in this paper, we introduced current studies on contact tracing and its applications in epidemic prevention and control. this paper covered research directions, i.e., individual contact and group contact, which were introduced from both static and dynamic aspects. non-automatic tracing methods like offline and online questionnaires record static individual contact information, while automatic tracing methods like mobile phones, wearable wireless sensors, rfid, and gps devices collect dynamic contact events. static group contact patterns can be depicted by a coarse-grained contact matrix constructed from large-scale questionnaire data; dynamic contact patterns, however, can only be inversely inferred using data-driven ai technologies. both individual and group contact tracing are promising research directions filled with challenges, especially for dynamic contact tracing. collecting contact data from multiple views and analyzing contact patterns from multi-scale mobility interactions will be new directions in the future. moreover, exploring advanced ai-based contact tracing methods using heterogeneous and multi-source data will provide new opportunities for epidemic prevention and control. hechang chen received the m.s. degree from the college of computer science and technology, jilin university, in , where he is currently pursuing the ph.d. degree. he was enrolled in the university of illinois at chicago as a joint training ph.d.
student from to . his current research interests include heterogenous data mining and complex network modeling with applications to computational epidemiology. bo yang received the b.s., m.s., and ph.d. degrees in computer science from jilin university in , , and , respectively. he is currently a professor with the college of computer science and technology, jilin university. he is currently the director of the key laboratory of symbolic computation and knowledge engineer, ministry of education, china. his current research interests include data mining, complex/social network modeling and analysis, and multi-agent systems. he is currently working on the topics of discovering structural and dynamical patterns from large-scale and time-evolving networks with applications to web intelligence, recommender systems, and early detection/control of infectious events. he has published over articles in international journals, including ieee tkde, ieee tpami, acm tweb, dke, jaamas, and kbs, and international conferences, including ijcai, aaai, icdm, wi, pakdd, and asonam. he has served as an associated editor and a peer reviewer for international journals, including {web intelligence} and served as the pc chair and an spc or pc member for international conferences, including ksem, ijcai, and aamas. 
updating the accounts: global mortality of the - 'spanish' influenza pandemic plasmodium ovale: a case of notso-benign tertian malaria the global burden of respiratory disease impact of the large-scale deployment of artemether/lumefantrine on the malaria disease burden in africa: case studies of south africa, zambia and ethiopia high-resolution measurements of face-to-face contact patterns in a primary school tracking tuberculosis in south africa the annual impact of seasonal influenza in the us: measuring disease burden and costs updated situation of influenza activity in hong kong little italy: an agent-based approach to the estimation of contact patterns-fitting predicted matrices to serological data the tencent news what types of contacts are important for the spread of infections? using contact survey data to explore european mixing patterns estimating within-school contact networks to understand influenza transmission reality mining: sensing complex social systems time-critical social mobilization capturing individual and group behavior with wearable sensor fluphone study: virtual disease spread using haggle epimap: towards quantifying contact networks and modelling the spread of infections in developing countries a high-resolution human contact network for infectious disease transmission collective dynamics of small world networks close encounters in a pediatric ward: measuring face-toface proximity and mixing patterns with wearable sensors computing urban traffic congestions by incorporating sparse gps probe data and social media data modeling human mobility responses to the large-scale spreading of infectious diseases modelling dynamical processes in complex socio-technical systems social contacts and mixing patterns relevant to the spread of infectious diseases characterizing and discovering spatiotemporal social contact patterns for healthcare outbreaks in realistic urban social networks skip the trip: air travelers' behavioral responses to pandemic 
influenza the effect of risk perception on the h n pandemic influenza dynamics quantifying social distancing arising from pandemic influenza behavioral responses to epidemics in an online experiment: using virtual diseases to study human behavior an evaluation of an express testing service for sexually transmissible infections in low-risk clients without complications hiv/aids: years of progress and future challenges reflections on years of aids prevalence and patterns of same-gender sexual contact among men hiv- , sexual practices, and contact with foreigners in homosexual men in colombia, south america prevalence of unsafe sexual behavior among hiv-infected individuals: the swiss hiv cohort study sexual behavioral abstine hiv/aids questionnaire: validation study of an iranian questionnaire study on the risk of hiv transmission by heterosexual contact and the correlation factors a survey of patients with chlamydia trachomatis infection: sexual behaviour and perceptions about contact tracing transmission of zika virus through sexual contact with travelers to areas of ongoing transmission-continental united states zika virus was transmitted by sexual contact in texas, health officials report adolescent sexual behavior, drug use, and violence: increased reporting with computer survey technology insights from flutracking: thirteen tips to growing a web-based participatory surveillance system flutracking: a weekly australian community online survey of influenza-like illness in flutracking weekly online community survey of influenza-like illness annual report mobility assessment of a rural population in the netherlands using gps measurements generating gps activity spaces that shed light upon the mobility habits of older adults: a descriptive analysis what can global positioning systems tell us about the contribution of different types of urban greenspace to children's physical activity? 
out and about: association of the built environment with physical activity behaviors of adolescent females a study of community design, greenness, and physical activity in children using satellite, gps and accelerometer data the accuracy of fastloc-gps locations and implications for animal tracking tracking the cheetah tail using animal-borne cameras, gps, and an imu examining the spatial congruence between data obtained with a novel activity location questionnaire, continuous gps tracking, and prompted recall surveys using global positioning systems in health research: a practical approach to data collection and processing mobile sensing in environmental health and neighborhood research feasibility and acceptability of global positioning system (gps) methods to study the spatial contexts of substance use and sexual risk behaviors among young men who have sex with men in new york city: a p cohort sub-study strengths and weaknesses of global positioning system (gps) data-loggers and semi-structured interviews for capturing fine-scale human mobility: findings from iquitos using mobile phone data to study dynamics of rural-urban mobility inferencing human spatiotemporal mobility in greater maputo via mobile phone big data mining gps tracking in neighborhood and health studies: a step forward for environmental exposure assessment, a step backward for causal inference? multi-view clustering with graph embedding for connectome analysis a self-organizing tensor architecture for multi-view clustering mmrate: inferring multi-aspect diffusion networks with multi-pattern cascades inferring diffusion networks with sparse cascades by structure transfer partially observable reinforcement learning for sustainable active surveillance broad learning: an emerging area in social network analysis key: cord- -k rs dql authors: doerre, a.; doblhammer, g. title: age- and sex-specific modelling of the covid- epidemic date: - - journal: nan doi: . / . . . 
sha: doc_id: cord_uid: k rs dql background: recent research points towards age- and sex-specific transmission of covid- infections and their outcomes. the effect of sex, however, has been overlooked in past modelling approaches of covid- infections. aim: the aim of our study is to develop an age- and sex-specific model of covid- transmission and to explore how contact changes affect covid- infection and death rates. method: we consider a compartment model to establish forecasts of the covid- epidemic, in which the compartments are subdivided into different age groups and genders. estimated contact patterns, based on other studies, are incorporated to account for age- and sex-specific social behaviour. the model is fitted to real data and used for assessing hypothetical scenarios with regard to lockdown measures. results: under current mitigation measures as of mid-august, active covid- cases will double by the end of october . infection rates will be highest among the young and working ages, but will also rise among the old. sex ratios reveal higher infection risks among women than men at working ages; the opposite holds true at old age. death rates in all age groups are twice as high among men as among women. small changes in contact rates at working and young ages may have a considerable effect on infections and mortality at old age, with elderly men always being at higher risk of infection and mortality. discussion: our results underline the high importance of non-pharmaceutical mitigation measures in the current phase of the pandemic to prevent an increase in contact rates from leading to higher mortality among the elderly. gender differences in contact rates, in addition to biological mechanisms related to the immune system, may contribute to sex-specific infection rates and their mortality outcome. to further explore possible pathways, more data on covid- transmission that include socio-demographic information are needed.
right from the start of the covid- pandemic, the importance of age for covid- contraction and fatality has been recognised (among others, esteve et al. ( ), dudel et al. ( ), kulu and dorey ( ), wu and mcgoogan ( ), karagiannidis et al. ( )), as has that of coresidence patterns (esteve et al. ( )). compartment and agent-based models aiming at projecting the spread of the disease have incorporated age as an important variable of transmission (e.g. davies et al. ( ), deforche ( ), colombo et al. ( ), blyuss and kyrychko ( ), balabdaoui and mohr ( )), in addition to other characteristics such as space (colombo et al. ( )) or contact patterns. an important determinant, which appears to have been largely overlooked in modelling exercises, is sex. in the following, we will refer to sex when discussing technical details and biological factors, and to gender when referring to social factors. while studies generally notice that infection and in particular fatality rates were higher among elderly men than women, the reverse appears to be true for infections at working ages (sobotka et al. ( )). in germany, during the first wave of the pandemic through mid-may, infection rates were higher among women than men at working ages (figure ), while they were higher for men thereafter. one reason for this difference, in addition to biological factors (see discussion below), may lie in gender-specific contact rates. estimates of contact rates (van de kassteele et al. ( )) based on the polymod study (mossong et al. ( )) showed that household, workplace and school structures strongly shape age- and gender-specific contacts made by individuals. using the contact matrices from the latter study and calculating the ratio of the age-specific number of contacts for men and women (contacts men/contacts women), a clear pattern emerges (figure ): among ages - , contacts are between %- % higher among women, while among ages to , they are %- % higher among men.
at the highest ages, the pattern reverses again, with women having slightly more contacts. the aim of our study is to model covid- transmission taking into account the two crucial demographic factors age and sex. we develop an seird model that incorporates age- and sex-specific contacts, which shape transmission rates. the model may be used for short- and long-term projections; our example explores short-term effects up to two and a half months of hypothetical changes in contact rates. the model can be used to develop scenarios which address the effects of age- and gender-specific changes in contacts due to the closing of schools, kindergartens and shops, or work from home, as well as to explore the effect of lifting these measures. while we are not able to address these effects separately, we translate them into hypothetical changes in age- and sex-specific contact rates by developing three scenarios. the first scenario reflects a continuation of the situation of mid-august ; the second assumes a lifting of measures mainly at working ages, and the third extends this to children, adolescents, and young adults. the manuscript is structured as follows: first we introduce the basic seird model and discuss how age- and sex-specific contact modelling was incorporated. we present the numerical implementation of the model, model fitting and the development of uncertainty intervals. then we introduce our scenarios and present the projection results in terms of number of active infections (prevalence), and cumulated number of deaths by october . we also explore how increasing contacts affect sex ratios in infections and deaths. we close with a discussion of the results, the strengths and limitations of our model, as well as policy implications. figure : seird compartment model with transitions.
(s → e: susceptible person becomes exposed to the virus, e → i: exposed person becomes infectious, e → r: exposed person is removed due to recovery, i → r: infectious person is removed due to recovery, i → d: infectious person is removed due to death) the core of the epidemiological model is an seird compartment model (see hethcote ( )) consisting of the epidemiological states s (susceptible, i.e. not yet exposed to the virus), e (exposed, but not infectious), i (infectious), r (recovered), and d (dead). the compartments represent individual states with respect to contagious diseases, i.e. covid- in this case, and the transitions between them are considered on a population level (see figure ). in this sense, the compartment model is used to describe a population process, but is not intended to model individual processes with respect to covid- . the following essential rate and fraction parameters are involved in the model: β (contact rate): the average number of individual contacts per specified timespan that are potentially sufficient to transmit the virus (see below for detailed specification); ρ (manifestation index, fraction): the fraction of people who become infectious at some time after being exposed to the virus; incubation rate: the mean rate at which exposed people become infectious, whose reciprocal is the average incubation time; γ (recovery rate): the mean rate of exiting the infectious state, either to recovery or death, so that 1/γ is the average duration of the disease; τ (infection fatality rate): the fraction of people who die due to covid- . the contact model is considered for a population of $N$ individuals, which is decomposed into $A$ disjoint groups. for each group $a = 1, \dots, A$, the proportion of individuals with regard to the whole population is $N_a/N$, where $N_a$ denotes the number of individuals in group $a$.
for any $a \in \{1, \dots, A\}$ and $b \in \{1, \dots, A\}$, let $\lambda_{ab}$ be the average number of contacts of an arbitrary individual from group $a$ with individuals in group $b$ during a fixed base time unit $\delta$, e.g. hours. more specifically, define $\eta_{ab}(t_1, t_2)$ as the random number of contacts of an individual in group $a$ with any individual from group $b$ over the timespan $[t_1, t_2]$ and $\eta_{a*}(t_1, t_2) := \sum_{b=1}^{A} \eta_{ab}(t_1, t_2)$ as the (random) overall number of contacts of an individual from group $a$. it is assumed that $\eta_{ab}(t_1, t_2)$ is poisson distributed as $\eta_{ab}(t_1, t_2) \sim \mathrm{Poi}\left(\int_{t_1}^{t_2} \mu_{ab}(s)\, ds\right)$ via the contact intensity $\mu_{ab}(t)$. by assuming independence of contacts to different groups, it follows that $\eta_{a*}(t_1, t_2)$ is also poisson distributed having intensity $\mu_{a*}(t) = \sum_{b=1}^{A} \mu_{ab}(t)$. the average rate of contact of any individual from group $a$ with group $b$ is then obtained as $\lambda_{ab} = \int_{0}^{\delta} \mu_{ab}(s)\, ds$, where for the sake of simplicity we assume that $\mu_{ab}(t)$ is periodic in the sense that $\mu_{ab}(t + \delta) = \mu_{ab}(t)$ for all $t$. deviations from these assumptions can be incorporated by appropriate modifications to the contact model and parameter set. in the compartment modeling approach, individuals within each group are generally assumed to be homogeneous with respect to contact behaviour and no individual effects are considered. in order to address the potential impact of the implementation and easing of lockdown measures, we expand the model structure to group-specific compartments. below, we define groups according to sex and age group, but the following reasoning is valid for any specification of disjoint groups, given that the resulting groups are sufficiently large. specifically, for given groups $a = 1, \dots, A$ and any time $t$, set $S_a(t)$ as the number of susceptible people in group $a$ at time $t$, $E_a(t)$ as the number of exposed people in group $a$ at time $t$, and so on. the group-specific compartment model is characterised by the ode system for all groups $a = 1, \dots, A$, which is a direct extension of the ode system of the basic compartment model for the special case $A = 1$. we define $\beta_{ab} = w\,(1 - m_{ab})(1 - r)(1 - h_b)\,\lambda_{ab}$ as the effective contact rate between groups $a$ and $b$, where $w$ is the secondary attack rate, $m_{ab}$ is the specific mitigation effect by lockdown measures with regard to contacts between groups $a$ and $b$, $r$ is a general factor that accounts for compliance with distancing, isolation and quarantine orders, $h_b$ is the proportion of infectious people in group $b$ in need of hospitalisation and $\lambda_{ab}$ is the basic contact rate between groups $a$ and $b$ when no lockdown measures are in place. as we are primarily interested in short-term prediction, we do not model biological aging, i.e. transitions between demographic groups. therefore, for any time $t$, compartment-specific additivity is assumed, i.e. the system is closed, meaning that the sum of all odes is 0 at each time $t$. in the absence of any lockdown measures, the general contact patterns are characterised by the basic contact rates $\lambda_{ab}$, which represent how intensively or often group $a$ has any contact with group $b$ sufficient for potential virus transmission. in the polymod study (mossong et al. ( )), , participants from countries including germany reported the number and extent of their social contacts during a randomly assigned hour period, using a written diary. the age and gender of the contacted persons were recorded, among other information. overall, the study contains information on , contacts, distributed across the participating countries. the overall contact pattern for germany is displayed in figure . (medrxiv preprint, not certified by peer review; this version posted october , ; https://doi.org/ . / . . . ; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity; it is made available under a cc-by-nc-nd . international license.)
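the effective contact rates and the group-specific compartment dynamics described above can be sketched in code as follows. this is a hedged illustration, not the authors' implementation (which was written in r): the name `eps` for the incubation rate, the force-of-infection term $\sum_b \beta_{ab} I_b / N_b$, and the function names are assumptions chosen to be consistent with the transitions s → e, e → i, e → r, i → r and i → d listed for the seird model.

```python
def effective_contact_rates(lam, w, m, r, h):
    """beta[a][b] = w * (1 - m[a][b]) * (1 - r) * (1 - h[b]) * lam[a][b].

    lam: basic contact rates, w: secondary attack rate,
    m: mitigation effects, r: compliance factor,
    h: proportion of infectious people needing hospitalisation per group.
    """
    groups = range(len(lam))
    return [[w * (1 - m[a][b]) * (1 - r) * (1 - h[b]) * lam[a][b]
             for b in groups] for a in groups]


def seird_derivatives(S, E, I, N, beta, eps, rho, gamma, tau):
    """Return one tuple (dS, dE, dI, dR, dD) per group.

    Assumed form (not confirmed by the source): group a is infected at
    rate sum_b beta[a][b] * I[b] / N[b]; exposed people leave E at rate
    eps, a fraction rho becoming infectious; infectious people leave I
    at rate gamma, a fraction tau dying.
    """
    groups = range(len(S))
    out = []
    for a in groups:
        force = sum(beta[a][b] * I[b] / N[b] for b in groups)
        new_exposed = force * S[a]
        leave_e = eps * E[a]
        leave_i = gamma * I[a]
        out.append((
            -new_exposed,                               # dS/dt
            new_exposed - leave_e,                      # dE/dt
            rho * leave_e - leave_i,                    # dI/dt
            (1 - rho) * leave_e + (1 - tau) * leave_i,  # dR/dt
            tau * leave_i,                              # dD/dt
        ))
    return out
```

in this sketch the five derivatives of each group sum to zero, which mirrors the closedness of the system stated in the text: people only move between compartments of their own group.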
the behaviour of the epidemiological model is primarily governed by the effective contact rates $\beta_{ab}$, which result from the basic contact rates $\lambda_{ab}$ by accounting for the secondary attack rate and lockdown measures. it is implicitly assumed here that hospitalised cases are effectively isolated from the remaining population and cannot spread the disease. note that the product $(1 - m_{ab})(1 - r)(1 - h_b)$ represents the proportion of potential virus transmissions that are not prevented. based on the compartment model, derivative states such as the demand for hospitalisation and the demand for intensive care units can be modeled separately by imposing estimated proportions on the compartment i. more precisely, $H_a(t)$ and $C_a(t)$, i.e. the number of hospitalised persons and of patients in intensive care in group $a$ at time $t$, are calculated as $H_a(t) = h_a\, I_a(t)$ and $C_a(t) = c_a\, H_a(t)$, where $h_a$ is the age-specific proportion of infectious people in need of hospitalisation and $c_a$ is the age-specific proportion of hospitalised cases that need intensive care. for these parameters, estimates are available from imperial college covid- response team ( ) and verity et al. ( ); see table . we have implemented the suggested model in r using a discrete approximation of the ode system via the forward euler method (see butcher ( )). the step size $\Delta t$ is chosen as a quarter of one day. accordingly, the transition rates between the compartments need to be adjusted, whereas the fraction parameters remain unchanged. for instance, for a given average incubation time in days and $\Delta t = 1/4$ (days), the per-step transition parameter is the daily incubation rate, i.e. the reciprocal of the incubation time, multiplied by $1/4$, whereas the manifestation index ρ, as the relative proportion of exposed people developing symptoms, is the same for any $\Delta t$.
the time-discrete approximation of the system of odes is therefore described as follows. we suggest fitting the model in the following consecutive steps: (1) determine a timespan { , ..., T} during which no lockdown measures had been in place, and determine the cumulative number of infections during this time. (2) based on plausible ranges for the involved compartment parameters and the initial state of the compartment model, fit the contact intensity model with regard to the cumulative number of infections during { , ..., T}. in order to derive the secondary attack rate w from the contact rates $\lambda_{ab}$ given in van de kassteele et al. ( ), we fit the proposed compartment model to the reported cases during a timespan { , ..., T} of no lockdown. this step is necessary because the social contact rates $\lambda_{ab}$ do not incorporate the specific transmission characteristics of sars-cov-2, such as the average length of the infectious period and the average infection probability per contact. we assume that w is not specific to age or sex. we employ $Q(w) = \sum_{t} \left( I^{cum}(t) - \hat{I}^{cum}(t \mid w) \right)^2$, with the sum running over the time points of the fitting timespan, as a least-squares criterion function in order to determine the optimal value $\hat{w} := \arg\min_{w > 0} Q(w)$, where $I^{cum}(t)$ are the observed cumulative infections, and $\hat{I}^{cum}(t \mid w)$ are the estimated cumulative infections based on the epidemiological model given w. hence, $\hat{w}$ is the scalar parameter for which the cumulative infections are best predicted retrospectively. note that the observed cumulative number of infections is usually recorded for each day, while the step size ∆t in the model may be different. thus, appropriate matching of observed and estimated values is necessary.
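the least-squares fit of the secondary attack rate can be sketched as a one-dimensional search. the python example below is an illustration under stated assumptions: it uses a plain grid search rather than whatever optimiser the authors used, and `toy_simulate` is a hypothetical stand-in for the compartment model that only mimics unrestrained growth. `daily_values` shows one way to match quarter-day model steps to daily observations.

```python
def daily_values(trajectory, steps_per_day=4):
    """Keep one model value per day so it lines up with daily case counts."""
    return trajectory[::steps_per_day]


def q(w, observed, simulate):
    """Least-squares criterion: squared gaps between observed and predicted
    cumulative infections for a given secondary attack rate w."""
    predicted = simulate(w)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_w(observed, simulate, candidates):
    """Return the candidate w > 0 that minimises the criterion."""
    return min(candidates, key=lambda w: q(w, observed, simulate))


def toy_simulate(w, days=10, steps_per_day=4, contacts=10.0, initial=5.0):
    """Hypothetical stand-in for the compartment model (illustration only):
    cumulative infections grow in proportion to w times a contact rate."""
    dt = 1.0 / steps_per_day
    cumulative = initial
    trajectory = [cumulative]
    for _ in range(days * steps_per_day):
        cumulative += dt * w * contacts * cumulative
        trajectory.append(cumulative)
    return daily_values(trajectory, steps_per_day)


# synthetic "observations" generated with a known w, then recovered by the fit
observed = toy_simulate(0.2)
candidates = [k / 100 for k in range(1, 51)]
w_hat = fit_w(observed, toy_simulate, candidates)
```

in practice the stand-in would be replaced by the full compartment model run from the chosen initial state, and the grid could be refined around the minimiser.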
this fitting method requires that the number of infections for the geographical region considered is sufficiently large, such that the mechanics of the compartment model are plausible. note that potential under-ascertainment may not substantially change the optimal value of w as long as the proportion of detected cases does not strongly vary over time. furthermore, the suggested fitting method is based on the assumption that the probability of virus transmission is independent of age and sex, given that a contact has occurred. if different propensities of virus transmission are allowed for, the contact matrix may be correspondingly adjusted by introducing parameters $w_{11}, ..., w_{AB}$ for each group combination, or $w_1, ..., w_A$ if the probability of transmission only depends on the contact group. the criterion function is likewise extended as $(w_{11}, ..., w_{AB}) \mapsto Q(w_{11}, ..., w_{AB})$. however, optimisation in this extended model requires a sufficiently large number of transmissions and detailed information on the recorded infections, and may lead to impractically vague estimates otherwise. therefore, we suggest employing the simpler model with univariate w first. in order to account for parameter uncertainty, we develop uncertainty intervals for the number of people in each compartment. as a cautionary remark, note that these intervals are not to be equated with confidence intervals in the classical sense. though the resulting intervals are conceptually comparable to bayesian credibility intervals, they are to be distinguished in that no prior distribution is explicitly assumed here. note that these intervals do not reflect uncertainty in terms of the underlying infection data. we predict the number of cases in each age-specific compartment using a monte carlo simulation method. for each simulated run, all parameters are independently drawn from their respective range, yielding an instance of a hypothetical parameter setup. given these parameters, the seird ode model is approximated using the forward euler method and known initial states, as described above. after $N_R$ such simulated runs, the prediction intervals for all relevant values are constructed from the pseudo-empirical trajectories of the compartment model. specifically, prediction intervals are derived as point-wise quantile ranges for each t. for instance, an % prediction interval for the number of infectious people in group a at time t is [$I_{a, \%}(t)$, $I_{a, \%}(t)$]. first, we fitted the model to observed covid-19 infections using transition rates from the literature as described under section for the period february to march . we estimated the model parameter w, also termed secondary attack rate, which reflects the probability of infection per contact, by least squares between observed and predicted values, as described in section . . second, we developed three scenarios starting our projections on august and, using quarter-days as base time, ending on october . the first scenario, which is our baseline scenario, assumed that the age- and sex-specific contacts are down by %, i.e. only % of the contacts estimated by van de kassteele et al. ( ) were realized between start and end of the projection. this applied to all age groups and to both sexes. this scenario should reflect continuous distancing measures as were present in mid-august. the second scenario assumed that contacts at working ages - were increased by percentage points (pp), and among those aged - by pp, equaling a decline of % and % respectively. all other ages remained at % contact reduction. this should reflect the return from home office settings, the opening of shops, cafes, restaurants, etc.
the third scenario considers an additional increase in contact rates among ages - by pp, which should reflect the opening of schools and venues mainly visited by young individuals. we explored the following age-specific outcomes: fitting our model to covid- infections observed during our fitting period ( feb - march ) results in an estimate of the secondary attack rate w ≈ %. we started with , active infections on august and under scenario this figure increased to approximately , ( figure ) (men: , ; women: , ). the number of active infections was highest at age - (men: , ; women: , ), followed by age - (men: , ; women: , ), and age - (men: , ; women: , ). the cumulative number of deaths increased from , to , with , men and , women. by october , infection rates (table ) were highest among the - -year old (men . and women . per individuals) followed by ages to ( . - . ), and ages - ( . - . ). at ages above , infection rates declined rapidly, almost halving from individuals in their fifties ( . - . ) to those in their sixties ( . - . ), while at older ages the decline followed at a much lower pace (ages - : . - . ; ages +: . - . ). sex ratios of infections were below in the age interval to , indicating a higher risk of infections among women. from age onwards they were generally above , thus turning the disadvantage towards men. as expected, death rates (table ) increased with age with a decline at the oldest ages probably reflecting health selection or better protection of the oldest old. they were more than twice as high among men than women, again with the exception of the oldest age group, where men might be positively selected by health. scenario assumed increased contacts at working ages and arrived at , active infections by october and therefore , active infections more than in scenario (men , ; women , ). these additional infections stemmed from all ages, even if the risk of infections increased most among the working ages. 
sex ratios of infection rates remained unchanged, because we increased contact rates at the same proportion for both genders. the additional infections translated into an additional deaths (men: ; women ); among women, three quarters of these deaths resulted at ages and above; among men, %, reflecting their higher mortality already at younger ages. sex ratios of death rates remained unchanged as compared to scenario , reflecting our model assumption of parallel increase in contact rates for both genders. scenario with increased contacts at young and working ages resulted in , active infections and thus , more than in scenario (men , ; women , ) which translated into an additional deaths with the majority resulting from ages and above (women %; men %). there was little change in sex ratios as compared to the other two scenarios. incorporating age- and sex-specific contact rates in a covid-19 compartment model permits exploration of the effects of changes in mitigation measures on the two genders. we developed three scenarios which assumed ongoing distancing measures versus easing of contact restrictions in working ages, and among adolescents and young adults. our projections do not set out to forecast the actual number of covid-19 infections in a time span of about two months; rather, they assess the effect of increased contacts on the infection and mortality risks of the two genders and the various age groups. the fit of our model to the baseline period in february and march results in an estimated secondary attack rate w ≈ %, putting our findings in close agreement with the rates reported in guangzhou, where the household w varied between % and %, and the non-household w between % and % (jing et al.
( ) ), although higher attack rates of up to % have been reported e.g. for meals and holiday visits (liu et al. ( b) ). three important lessons can be learned from our scenarios. first, even a small change in contact rates has a large impact on infections and deaths. in our projections we assumed an increase ranging from to pp. this reflects the fact that without non-pharmaceutical mitigation measures (npmm) such as masks, physical distance between individuals, better air ventilation and hygiene, and without contact tracing, the infection rates would return to the initial exponential increase. this was reflected in a reproduction rate of . to . , as observed at the beginning of the pandemic (lin et al. ( ) , and alimohamadi et al. ( ) , rki ( )). however, the presence of npmm also mitigates the effect of the increase in contacts due to the return to office, opening of shops, restaurants, as well as schools, and venues visited by young adults, leaving it far from the initial impact. in our present scenarios, both effects, the change of contact rates and the change of their impact, are captured in the reduction matrix (m ab ), which is multiplied with the matrix of the contact rates. one alternative approach would be to develop separate scenarios for changes in the secondary attack rate w due to npmm and changes in the contact rates (m ab ), which is one possibility to modify this analysis further. at any rate, our scenarios show that small changes already have large impacts on infections and deaths. this implies that the impact of contacts must be diminished considerably to allow increases in contacts without returning to exponential growth of infections, hence underlining the high importance of the npmm in the current phase of the pandemic. second, due to intergenerational contacts, any easing of measures in working and young ages will inevitably lead to an increase in infections and deaths, the latter mainly at old ages. 
over all ages, deaths will increase by % when contacts increase at working ages, and increase by % when contacts also rise among the young. the vast majority of these increases occur at old ages, with % among women and % among men, whereby the fatality among men is more than twice as high as among women. thus, elderly men are at a particular risk of death due to increased contacts. however, our model assumptions are based on fatality rates at the beginning of the pandemic, which may have changed because of better treatment options for critically severe covid-19 cases using, e.g., dexamethasone (cain and cidlowski ( )). thus, we might overestimate mortality under current knowledge and treatment options. still, increases in contacts need to be accompanied by special measures protecting the elderly from death, without negative physical and mental health consequences due to quarantine and isolation measures (galea et al. ( )). contrary to deaths, infections will mainly increase at young and middle ages, which carry a lower risk of severe covid-19 symptoms or even asymptomatic disease courses. third, small changes in contact rates will not change the sex ratios in infections and deaths. at all ages, men will have more than twice the mortality risk from covid-19, while infections are more frequent among working-age women than men. at old ages, men have a higher infection risk. note that, in absolute numbers, more women are diagnosed with covid-19 at old age due to their higher life expectancy. here a more substantial question arises, namely whether covid-19 infection rates are indeed gender-specific. german covid-19 infection rates, as in any other country, are biased by the
time-lag of reporting and by differential availability of pcr tests over time and to subgroups of the population (rki ( )). gender-specific diagnoses in favour of women may reflect that higher contact intensities of women may have led to a higher rate of pcr tests and therefore to a smaller number of undiagnosed cases. in addition, women are more health-conscious than men (oksuzyan et al. ( )) and may have sought pcr testing to a higher degree even when presenting with weaker symptoms. on the other hand, takahashi et al. ( ) found sex-specific differences in immune response to covid-19 infections. for a further discussion of potential sex-specific mechanisms modulating the course of disease, see also gebhard et al. ( ). thus, we can conclude that both biological and social factors contribute to sex- and gender-specific infection and mortality rates and that they are stable given small changes in contact rates. we focused on the practical emulation of the dynamic behaviour and process of the spreading of covid-19 while incorporating specific epidemiological information on the virus and disease. to achieve this aim we used a compartment modeling framework, which has become a standard approach in epidemiology due to its flexibility and accessibility. the main advantage of this modeling framework is that a considerable amount of demographic and epidemiological information can be incorporated while the essential model structure and implementation remain relatively simple. similarly, it is possible to extend the model to incorporate parameter uncertainty, as described above. furthermore, we want to emphasize the markov-like property of compartment modeling in the sense that current compartment sizes on a specific date are sufficient for deducing the subsequent behaviour of the epidemiological process, which makes the framework particularly attractive for forecasting and investigating hypothetical scenarios.
however, one drawback of compartment modelling is that it is inherently based on an averaging rationale that treats population groups homogeneously, with the average number of contacts in each group as a determining parameter. in contrast to truly stochastic models (such as agent-based models), no random or systematic individual deviations from the fundamental contact patterns are taken into consideration. in addition, geographical and spatial information are not explicitly considered in compartment modeling, and this further limits the scope of the forecasting results. in general, assessing the impact of introducing or easing different lockdown measures is remarkably difficult, especially because several aspects are usually changed simultaneously and the general behaviour of the population may change dynamically at the same time. some efforts have been made to address these issues in the literature; however, we advise against using the proposed model for such purposes. one main reason is that the initial state for forecasting and fitting of the model relies primarily on available data sources, which are in the form of reported count data. in addition to the generally limited validity of observational data, there is still insufficient knowledge on the specific characteristics of covid-19 and the actual current spread of the virus. naturally, other modeling approaches face the same issues of data quality. in our covid-19 forecasts, the number of infections and the number of deaths differ only slightly from models which do not differentiate by sex (data not shown). however, age- and sex-specific models provide better insight into the risk populations of infections and mortality. this helps to target health policy measures under scarce resources, such as who should be tested and vaccinated first. both biological sex and social gender appear to affect covid-19 infection rates and their outcomes; this needs to be acknowledged in health policy decisions and medical treatment.
to further explore social factors on covid-19 transmission, more information that includes socio-demographic data is needed.

estimate of the basic reproduction number for covid-19: a systematic review and meta-analysis
age-stratified model of the covid-19 epidemic to analyze the impact of relaxing lockdown measures: nowcasting and forecasting for switzerland
effects of latency and age structure on the dynamics and containment of covid-19
numerical methods for ordinary differential equations, rd edition
after years of regulating immunity, dexamethasone meets covid-19
an age and space structured sir model describing the covid-19 pandemic
age-dependent effects in the transmission and control of covid-19 epidemics
an age-structured epidemiological model of the belgian covid-19 epidemic
mathematical models of contact patterns between age groups for predicting the spread of infectious diseases
monitoring trends and differences in covid-19 case fatality rates using decomposition methods: contributions of age structure and age-specific fatality
national age and coresidence patterns shape covid-19 vulnerability
the mental health consequences of covid-19 and physical distancing: the need for prevention and early intervention
impact of sex and gender on covid-19 outcomes in europe
the mathematics of infectious diseases
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
household secondary attack rate of covid-19 and associated determinants in guangzhou, china: a retrospective cohort study
case characteristics, resource use, and outcomes of patients with covid-19 admitted to german hospitals: an observational study
efficient estimation of age-specific social contact rates between men and women
the contribution of age structure to the number of deaths from covid-19 in the uk by geographical units
spread and impact of covid-19 in china: a systematic review and synthesis of predictions from transmission-dynamic models
secondary attack rate and superspreading events for sars-cov-2
the reproductive number of covid-19 is higher compared to sars coronavirus
social contacts and mixing patterns relevant to the spread of infectious diseases
a demographic perspective on gender, family and health in europe
projecting social contact matrices in countries using contact surveys and demographic data
coronavirus disease (covid-19) daily situation report of the robert koch institute
age-structured impact of social distancing on the covid-19 epidemic in india
age, gender and covid-19 infections
sex differences in immune responses that underlie covid-19 disease outcomes
estimates of the severity of covid-19 disease: a model-based analysis
characteristics of and important lessons from the coronavirus disease (covid-19) outbreak in china, summary of a report of cases from the chinese center for disease control and prevention
changes in contact patterns shape the dynamics of the covid-19 outbreak in china

key: cord- - kz z authors: quaife, matthew; van zandvoort, kevin; gimma, amy; shah, kashvi; mccreesh, nicky; prem, kiesha; barasa, edwine; mwanga, daniel; kangwana, beth; pinchoff, jessie; edmunds, w. john; jarvis, christopher i.; austrian, karen title: the impact of covid-19 control measures on social contacts and transmission in kenyan informal settlements date: - - journal: bmc med doi: . /s - - - sha: doc_id: cord_uid: kz z

background: many low- and middle-income countries have implemented control measures against coronavirus disease (covid-19). however, it is not clear to what extent these measures explain the low numbers of recorded covid-19 cases and deaths in africa. one of the main aims of control measures is to reduce respiratory pathogen transmission through direct contact with others. in this study, we collect contact data from residents of informal settlements around nairobi, kenya, to assess if control measures have changed contact patterns, and estimate the impact of changes on the basic reproduction number (r0).
methods: we conducted a social contact survey with residents of five informal settlements around nairobi in early may , weeks after the kenyan government introduced enhanced physical distancing measures and a curfew between pm and am. respondents were asked to report all direct physical and non-physical contacts made the previous day, alongside a questionnaire asking about the social and economic impact of covid-19 and control measures. we examined contact patterns by demographic factors, including socioeconomic status. we described the impact of covid-19 and control measures on income and food security. we compared contact patterns during control measures to patterns from non-pandemic periods to estimate the change in r0. results: we estimate that control measures reduced physical contacts by % and non-physical contacts by either % or %, depending on the pre-covid-19 comparison matrix used. masks were worn by at least one person in % of contacts. respondents in the poorest socioeconomic quintile reported . times more contacts than those in the richest. eighty-six percent of respondents reported a total or partial loss of income due to covid-19, and % reported eating less or skipping meals due to having too little money for food. conclusion: covid-19 control measures have had a large impact on direct contacts and therefore transmission, but have also caused considerable economic and food insecurity. reductions in r0 are consistent with the comparatively low epidemic growth in kenya and other sub-saharan african countries that implemented similar, early control measures. however, negative and inequitable impacts on economic and food security may mean control measures are not sustainable in the longer term.

keywords: covid-19, sars-cov-2, social contacts, physical distancing

background

over . million cases and , deaths from covid-19 have been recorded worldwide as of august [ ] . most recorded cases and deaths have occurred in high-income countries in europe and north america. many countries introduced extreme physical distancing measures to control sars-cov-2 transmission [ ] . modelling studies suggest that without substantial mitigation measures, most low- and middle-income (lmic) settings, including sub-saharan africa, will experience a delayed, but severe epidemic [ , ] . yet to date, the numbers of recorded cases and deaths in africa are much lower than predictions, prompting speculation on why many african countries have so far avoided a severe uncontrolled epidemic. a range of reasons has been proposed, including differences between settings in case and death detection capacity, demographic factors such as population age distribution, and the role of temperature and aridity in transmission [ ] [ ] [ ] [ ] [ ] [ ] . however, many sub-saharan african countries implemented lockdown and curfew measures far earlier in their country's epidemic trajectories than most higher-income settings in europe and north america. for example, kenya, the focus of the current study, implemented a partial lockdown on april when the country had recorded just cases and deaths. in contrast, although case detection rates may differ between settings, the uk implemented its own lockdown on march after recording cases and deaths [ , ] . the first reported case in kenya was on march , and schools closed on march .
suspension of international flights, including mandatory quarantine of incoming residents; closure of bars and restrictions on restaurant opening hours; and a ban on large gatherings were imposed on march , soon followed by the enactment of a nationwide curfew from pm to am. on april , the kenyan government made wearing face masks mandatory in any public place. recently, cessation of movement was imposed in informal settlements in mombasa and nairobi, following a rise in cases in nairobi's kibera informal settlement. consequently, the government has indicated additional physical distancing measures may be authorised. physical distancing control measures seek to reduce the number of contacts between people where transmission could occur. to predict the impact of control measures accurately, quantitative data on the number and type of contacts between people are required. to date, only a few empirical studies have been published to assess the impact of covid-19 control measures on contacts; these have been conducted in china [ ] , the usa [ ] , and europe [ ] ; but none were undertaken in sub-saharan africa. in fact, prior to the current pandemic, a systematic review [ ] reported that just four social contact surveys out of had been conducted in sub-saharan africa, including one in kenya [ ] [ ] [ ] . to our knowledge, just one lmic study has been published since this review [ ] . this lack of evidence means that many sars-cov-2 transmission models primarily use synthetic contact matrices for lmic settings, which use demographic, household composition, classroom size, and other data to adjust social contact data from primarily high-income settings [ , ] .
although one social mixing study was conducted in kilifi, a coastal area of kenya [ ] , outside of one study which collected data from a south african township [ ] , no published contact data exist from informal settlements, which may be particularly vulnerable to covid-19 due to high levels of population density, indoor crowding, and household sizes, alongside intergenerational mixing within the household. between-person contacts drive the transmission of respiratory pathogens, such as sars-cov-2. understanding how contact patterns change under different control measures is important to inform decisions on whether and how to implement them. in this study, we describe a survey of contact patterns conducted among a sample of adults from five informal settlements in urban and peri-urban areas around nairobi. we explore how direct contacts vary across respondent characteristics, including by socioeconomic status. we estimate the impact of current control measures on the reproduction number, r0, to evaluate whether these measures might be sufficient to control the epidemic. we also describe income losses and food insecurity that respondents attribute to covid-19 and control measures. participation in the study was voluntary, and analyses were conducted on anonymised data. the study was approved by the internal review board of the population council (study number ), the ethics committee of the london school of hygiene and tropical medicine (reference number ), and the amref health africa ethics and scientific review committee in kenya (p / ). adult respondents were recruited from two existing population council cohorts in five informal settlements around nairobi (kibera, huruma, kariobangi, dandora, and mathare). the existing cohorts were part of the adolescent girls initiative kenya (agi-k) and nisikilize tujengane (nisitu: listen to me, let us grow together) studies.
the cohorts were in place to study the impacts of multi-sectoral interventions on adolescents, and consisted of randomly selected households from informal settlements which contained at least one adolescent in january (agi-k) or january (nisitu). in may , respondents from the agi-k and nisitu cohorts completed a telephone survey on covid-19 knowledge, attitudes, and perceptions (kap). of these, an age- and sex-stratified random sample of respondents completed a contact survey. stratification was based on kenya census data for nairobi county, with a target sample size of and % oversampling to account for refusal. this was based on the sample sizes of similar contact surveys [ ] , alongside feasibility of phone interviewing during lockdown. background data, including household ownership of assets, were merged from previous survey rounds. respondents were first asked a range of questions on covid-19 including knowledge and experience of testing and symptoms, economic impacts on the household, and food availability and cost. then, respondents were asked to report all direct physical and non-physical contacts made between am the day preceding the survey and am the day of the survey. a direct contact was defined as someone respondents met in person and with whom they had either (i) "physical contact (any sort of skin-to-skin contact e.g. a handshake, embracing, kissing, sleeping on the same bed/mat/blanket, sharing a meal together out of the same bowl, playing football or other contact sports, sitting next to someone while touching shoulder to shoulder, etc.)", or (ii) "non-physical contact (you did not touch the person, but exchanged at least a few words, face-to-face within metres; for example, someone you bought something from in the market, or rode with on a minibus, or worked with in the same area)". all respondents were over the age of , so no contact data were collected from children; however, respondents were able to list contacts under the age of .
we made pragmatic adaptations to existing contact measurement tools to allow them to be conducted over the phone, primarily to reduce respondent burden and to ensure that aggregate contact data were not biased downwards by respondent fatigue. respondents were first asked about contacts with members of their household the previous day, recording the contact age, gender, and whether contacts were physical or non-physical. then, respondents were asked how many non-household contacts they had had in the same timeframe. those who reported nine or fewer outside-household contacts were asked to describe each contact's age, gender, whether the contact was physical or non-physical, the duration of the contact, and whether a mask was worn by the respondent or contact. those who reported ten or more outside-household contacts were asked how many of these contacts were physical/non-physical, in the age ranges under , - , and over . the contact tool is shown in additional file . r version . . and stata were used for analyses; the code and data are publicly available at https://github.com/mquaife/kenya_mixing. the age and gender of respondents were compared to the full sample from which they were drawn, alongside census data to assess the representativeness of the sample. data on household assets were used to classify respondents into wealth quintiles using principal component analysis; additional file gives information on this, alongside methods used to estimate economic and food security. we calculated the mean number of social contacts per person per day, stratified by respondent age, sex, household size, and education level. we then calculated social contact matrices for the age category-specific daily frequency of direct contacts, adjusting for contact reciprocity and the age distribution using census data from informal settlement sub-counties. 
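The reciprocity adjustment mentioned above can be illustrated with a short sketch. It assumes one common approach, in which the total number of contacts from age group i to group j is forced to equal the total from j to i by averaging the two survey estimates; the study's own code may differ in detail. The function name `adjust_for_reciprocity` and the example numbers are hypothetical.

```python
import numpy as np

def adjust_for_reciprocity(mean_contacts, population):
    """Return a reciprocity-adjusted contact matrix.

    mean_contacts[i, j]: mean daily contacts a person in age group i
                         reports with people in age group j (survey data).
    population[i]:       number of people in age group i (e.g. from a census).

    Assumption: the two estimates of total i<->j contacts are averaged.
    """
    m = np.asarray(mean_contacts, dtype=float)
    n = np.asarray(population, dtype=float)
    total = m * n[:, None]             # total contacts made by group i with group j
    balanced = (total + total.T) / 2   # average the i->j and j->i totals
    return balanced / n[:, None]       # convert back to per-person means

# Hypothetical two-group example
reported = [[2.0, 1.0],
            [3.0, 4.0]]
pop = [100, 50]
adjusted = adjust_for_reciprocity(reported, pop)
```

After adjustment, `adjusted[i, j] * pop[i]` equals `adjusted[j, i] * pop[j]` for every pair of groups, which is the reciprocity condition.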
we then compared the mean total number of daily contacts by age group to the only empirical dataset available from kenya in kiti et al. [ ] , alongside synthetic matrices from [ ] and [ ] . kiti et al. collected data on physical contacts only, so we restrict our sample to physical contacts when comparing with this study. we adjusted both matrices to match the age structure of the informal settlement setting, using the kenyan population and housing census to adjust from kilifi and nationally representative populations, respectively [ ] . additional file provides more detail. because kiti et al. collected data on the age of contacts in categories (< , - , - , - , - , +) which were different to those in this survey, we restructured both age matrices and used bootstrapped samples of both datasets to impute the number of contacts for matching age ranges. we adjusted for symmetry after bootstrapping because one age range in our data ( +) had fewer than five respondents. bootstrapping was not possible with prem et al. matrices as they do not relate to individual level data. as respondents under the age of were not included as survey respondents, we imputed child contacts using methods developed by klepac et al. [ ] , and implemented for the same purpose in a uk study [ ] . this involved taking the ratio of the dominant eigenvalues between our matrices and the comparable setting-adjusted matrices to scale missing matrix elements. finally, we estimated the impact of control measures on the basic reproduction number (r ) in this population. because there are no baseline contact data from this population without control measures, we assume that contact patterns in this sample prior to control measures were similar to those estimated by kiti et al. or prem et al. 
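The ratio of dominant eigenvalues referred to above can be computed as in the following sketch. It assumes, as the paper does, that the reproduction number is proportional to the dominant eigenvalue of the contact matrix, so that a baseline value can be rescaled by the eigenvalue ratio. The function names and the example matrices are hypothetical and are not taken from the study data.

```python
import numpy as np

def dominant_eigenvalue(matrix):
    """Largest eigenvalue (in absolute value) of a square contact matrix."""
    values = np.linalg.eigvals(np.asarray(matrix, dtype=float))
    return float(np.max(np.abs(values)))

def rescale_reproduction_number(r_baseline, baseline_matrix, current_matrix):
    """Scale a baseline reproduction number by the ratio of dominant eigenvalues.

    Assumption: transmission probability per contact and duration of
    infectiousness are unchanged, so only contact patterns differ.
    """
    ratio = dominant_eigenvalue(current_matrix) / dominant_eigenvalue(baseline_matrix)
    return r_baseline * ratio

# Hypothetical matrices: contacts fall to a quarter of their baseline level
before = [[4.0, 2.0],
          [2.0, 4.0]]
during = [[1.0, 0.5],
          [0.5, 1.0]]
r_during = rescale_reproduction_number(2.0, before, during)
```

In this toy case the dominant eigenvalue falls from 6 to 1.5, so a hypothetical baseline value of 2.0 is scaled to 0.5.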
we make the common assumption for respiratory infections that the next-generation matrix is a function of the age-specific number of contacts, the per-contact transmission probability, and the duration of infectiousness, and that r is therefore proportional to the dominant eigenvalue of the contact matrix [ , ] . we assume that existing matrices are comparable to the informal settlement setting of this study after adjusting for age distribution, that there were no changes in the duration of infectiousness during the study period, that per-contact transmission probability also remained constant, and that all age groups have the same per-contact transmission probability, given infection. with these assumptions, the relative reduction in r can be estimated as the reduction in the dominant eigenvalue of the contact matrices. our central estimate of the r of sars-cov- is . (sd = . ), as estimated in a meta-analysis of published estimates of r prior to the introduction of control measures [ ] . because studies in this meta-analysis were predominantly based on european and asian countries, we explore a lower bound of . (sd = . ) based on the earliest estimate of the time-varying reproduction number in kenya [ ] . we also use a higher bound r of . (sd = . ) based on modelling analyses from european countries [ ] . finally, although there is limited evidence of age-specific variation in infectiousness or symptomatic rate given infection, there is some evidence that children are around half as susceptible to sars-cov- infection compared to adults [ ] . in a sensitivity analysis, we explore whether this impacts r estimates. out of the people sampled for the kap survey, interviews were completed. of the initial sampled, were sampled to complete the additional contacts module. in total, were successfully interviewed and recorded contacts. eight hundred thirty ( %) of these were household contacts, and ( %) were non-household contacts on which we have detailed information. 
the remaining ( %) were non-household contacts of respondents who reported ten or more such contacts. the mean age of respondents was (sd . , max ), and % were female ( / ). table shows that the age and gender distribution of respondents broadly matched that of (a) the sample from which respondents were randomly chosen and (b) the kenyan adult population. compared to both groups, there is some indication that our sample has more - year olds and fewer + year olds than national data, whilst our sample is substantially older than that of kiti et al. eight respondents ( %) reported two or more covid- symptoms in the previous days. forty-two percent of respondents ( / ) thought they had a high chance of acquiring sars-cov- , and % ( / ) thought the implications would be "severe" or "very severe" if they caught the virus. when asked an open-ended question without prompting what they would do if they developed covid- symptoms, % ( / ) thought they would take a test, and % ( / ) said they would stay at home or avoid social gatherings. just % ( / ) of respondents knew someone either who was suspected of having covid- or who had tested positive. respondents reported substantial food and economic insecurity due to covid- and control measures. around a third ( %, / ) reported the pandemic had caused a complete loss of income, and an additional % ( / ) reported partial income losses. eighty-three percent ( ) reported experiencing increases in food prices, and three quarters of respondents reported eating less or skipping meals due to having too little money for food ( %, / ); all but one ( / ) reported that this was due to the situation with covid- . just % ( / ) reported receiving monetary or non-monetary assistance in the previous days; % ( ) reported that food was one of the biggest needs currently unmet. covid- control measures meant % ( / ) of respondents reported seeing friends less, and % ( ) seeing family less. 
twenty-five percent of respondents ( / ) reported leaving the settlement where the interview was conducted in the previous h. at the time of data collection, mask wearing was required by the kenyan government in public places and was very common: % ( / ) of respondents reported "always" wearing a mask outside of their house. the mean number of contacts reported was (median , iqr - ), household contacts (median , iqr - ) and non-household contacts (median , iqr - ). as shown in fig. , respondents in the poorest quintile reported . times as many contacts as those in the richest quintile and we find evidence of a downwards trend in contacts as socioeconomic status increases (non-parametric test for trend p = . ). there was weak evidence that men had more contacts than women ( . − . = . , t test p = . ) and contacts increased with age (non-parametric test for trend p = . ). just % ( / ) of contacts were reported within the household, and total contacts did not vary substantially by household size or by respondent education level. this lack of variation by household size is consistent with most contacts being outside of the household. figure summarises the characteristics of contacts for which we have detailed information ( household contacts and non-household contacts where a respondent reported fewer than ten non-household contacts). most physical contacts were household contacts, and the proportion of female contacts was higher among household than non-household contacts. just % ( / ) of non-household contacts took place without a mask being worn by either the respondent or the contact. most reported non-household contacts were brief: % ( / ) were under min, and a further % ( / ) between and min. finally, % ( / ) of non-household contacts took place in an outside location, and % ( / ) of non-household contacts were in the home of the respondent or contact. 
figure shows age-specific contact matrices disaggregated by contact location and type; these are asymmetric and not adjusted for demography. matrices are consistent with the majority of contacts occurring outside of the household and being non-physical. figure uses the two existing contact matrices for kenya to impute contact patterns for under s, adjusting for age-distribution and symmetry. the two pre-covid- data sources differ substantially in their methods, and the differences are propagated in these adjusted matrices. we find a % reduction in physical contacts, and a - % reduction in all contacts compared to before the epidemic. we estimate r under control measures, shown in fig. . all comparisons to pre-covid- matrices assuming r = . suggest that control measures reduced r to below one, to . (iqr . , . ) for physical contacts and to either . (iqr . , . ) or . (iqr . , . ) depending on the synthetic matrix used as comparator, based on prem et al. [ ] and [ ] , respectively. using the lower r estimate of . , we estimate reductions to . [ ] or either . or . assuming all contacts are equally risky. as shown in additional file , assuming that children are half as susceptible to sars-cov- infection compared to adults has little impact on r estimates. covid- control measures in informal settlements appear to have led to a large reduction in social contacts. we find a - % reduction in eigenvalues of contact matrices depending on the pre-covid- matrix used; assuming an r of . , this would translate to an r of between . and . at the time of data collection. by contrast, simulation estimates of the r in an unmitigated covid- epidemic in kenya were between . ( % ci . - . ) and . ( % ci . - . ) [ ] . the r we estimate here is consistent with the slow growth of the kenyan epidemic to date compared to epidemics in china and europe. 
the large reductions in contacts we estimate are of similar magnitude to those seen in the uk [ ] ( % reduction in contacts), wuhan and shanghai [ ] ( % reduction), and the usa ( % reduction) [ ] . we are not aware of any comparable post-lockdown studies from low- or middle-income settings to date, including sub-saharan africa. considerable food and economic vulnerability was reported due to covid- control measures. over % of respondents reported a partial or complete loss of income, and three quarters reported eating less or skipping meals due to covid- . households reported they were receiving some assistance, but that their biggest remaining unmet need was food. although the prevalence of covid- was low, and these factors can largely be attributed to control measures rather than illness from covid- itself, it is important to recognise that the counterfactual of no control measures is an unmitigated epidemic, and not an absence of these harms. the socioeconomic situation of informal settlements means that respondents may face greater economic precarity than residents of formal urban areas. even within this sample, the poorest quintile of respondents reported . times as many contacts as the richest, suggesting an inequitable impact of covid- transmission. this inequity would be exacerbated if socially patterned financial and access barriers inhibit the poor from seeking care for covid- [ , ] . stringent control measures which cause economic and food insecurity are not likely to be sustainable in the long term if not accompanied by social protection mechanisms. 
fig. characteristics of (a) household and (b) non-household contacts for which full information was gathered
fig. age-stratified mean number of reported contacts from survey respondents recruited from five informal settlements around nairobi. (a) the aggregate mixing matrix. (b) household contacts only. (c) non-household contacts only. (d) physical contacts only. (e) non-physical contacts only
these estimates of r are lower than those suggested by the linear growth of the epidemic in kenya under control measures [ ] which implies an r of around , suggesting that there are other factors which influence transmission which we do not consider here. contact patterns measured here only reflect community transmission, and if proportionately more infections occur due to contacts in non-community or clinical settings, then these estimates will overestimate the impact of control measures. as seen in many other settings, the number of reported cases is likely to be a significant underestimate of true cases given constraints in case finding and laboratory testing capacity: estimates suggest that during the study period, kenya was detecting around % of symptomatic cases, compared to around % in the uk and the usa [ ] . at present, evidence on how sars-cov- is transmitted is inconclusive [ , ] ; however, if fomites are a substantive cause of transmission in kenyan informal settlements, then the current analysis will likely overstate the impact of control measures on r . we conjecture that fomite transmission may be more likely in this setting due to high population density, and low and unequal access to water, sanitation, and hygiene amenities. furthermore, the next-generation matrix approach of calculating r which we use assumes uniform susceptibility and infectivity by age. in reality, younger people are less likely to acquire and transmit sars-cov- [ ] . because our contact data are collected in wide age ranges, if younger people have reduced contacts proportionately more than older people, our results may overestimate the impact on r . we found that assuming reduced susceptibility among children did not substantively change results. since data were collected for this study, the case numbers have continued to increase in kenya. 
as of late july , a progressive re-opening was occurring, including the lifting of movement restrictions in areas considered hotspots including parts of nairobi and coastal counties, moving the start of nightly curfew from pm to pm, and allowing the opening of places of worship, restaurants, and other places of communal gathering. local air travel resumed on july. restrictions on the number of people allowed in such places remain, for example gatherings in places of worship are limited to people for h, only for those over or under years of age. schools remain closed until january . this study has a number of limitations. in the absence of baseline contact data (i.e. before control measures were put in place), we use empirical matrices from a different area of kenya and synthetic matrices based on adjusting contact surveys from higher income countries to household and other characteristics in kenya. although we adjust these datasets by the age structure of the kenyan population, other factors such as household size were not reported and may influence number of contacts and therefore pathogen transmission. the pre-covid- setting of kiti et al. is very different to this sample, not least as estimates place population density around times greater in informal settlements (kibera, , persons/km ) compared to urban kilifi ( persons/km ) [ ] . because we would expect contacts to be greater in more densely populated areas, the true reduction in contacts may be more than we estimate here. although we have a range of background data on respondents from using existing sampling frames, households in the agi-k and nisitu cohorts were initially selected as having an adolescent residing there in and , respectively. 
finally, although face mask use was reported by almost all participants, because of uncertainty in the effectiveness of masks in reducing sars-cov- transmission, the impact of different types of face masks, and real-world adherence of mask users, r calculations do not assume any protective effect from mask use. other social contact surveys have used a prospective study design, asking respondents to record contacts in a daily diary [ ] . because we asked respondents to recall contacts from the previous day, these data may be subject to recall bias, although it is not clear in which direction this may act. furthermore, we impute adjusted child contacts using the comparison studies. an alternative approach, such as that taken by kiti et al., would have been for respondents to record contacts for children in their household; arranging this was not possible during covid- restrictions. to make the contact survey feasible for phone-based data collection, we simplified the tool for respondents who reported ten or more outside-household contacts. we are therefore limited to knowing these contacts' age and whether the contact was physical or non-physical. contacts reported in this way were a substantial proportion ( %) of the total sample. the main risk of bias from this may stem from respondents rounding up or down to anchor numbers (e.g. units of ten); fig. e shows a few respondents cluster around and contacts. overall, the loss of granularity was beneficial to reducing respondent burden. we do not calculate the net reproduction number, r, but because reported case numbers in kenya are low, the proportion of the population that is no longer susceptible is likely minimal. we assume that direct contacts are a proxy for effective contacts and therefore transmission, and that transmissibility does not vary by age. in addition, we do not account for the very high proportion of respondents who report that they or their direct contacts wore face masks. 
considering these factors would mean r is below the r estimated here. kenya has implemented strict control measures in response to the covid- pandemic. this study highlights the difficult decisions policymakers face as we find that control measures are likely to have substantially reduced covid- transmission, but also negatively impacted food and economic security of informal settlement residents. this is the first study to measure social contact patterns after covid- control measures have been implemented in sub-saharan africa. there is evidence that impacts are inequitable, as the poorest quintile report . times more contacts than the richest quintile, and % of respondents reported complete or partial income losses. negative and inequitable impacts on economic and food security may mean control measures are not sustainable in the longer term without social protection. an interactive web-based dashboard to track covid- in real time. the lancet infectious diseases oxford covid- government response tracker, blavatnik school of government the potential effects of widespread community transmission of sars-cov- infection in the world health organization african region: a predictive model projected early spread of covid- in africa through the relatively young and rural population may limit the spread and severity of covid- in africa: a modelling study preparedness and vulnerability of african countries against importations of covid- : a modelling study looming threat of covid- infection in africa: act collectively, and fast managing covid- in low-and middleincome countries covid- pandemic in west africa effective transmission across the globe: the role of climate in covid- mitigation strategies. the lancet planetary health changes in contact patterns shape the dynamics of the covid- outbreak in china: science quantifying interpersonal contact in the united states during the spread of covid- : first results from the berkeley interpersonal contact study. 
medrxiv quantifying the impact of physical distance measures on the transmission of covid- in the uk a systematic review of social contact surveys to inform transmission models of close-contact infections age-and sex-specific social contact patterns and incidence of mycobacterium tuberculosis infection social mixing patterns within a south african township community: implications for respiratory disease transmission and control social contact structures and time use patterns in the manicaland province of zimbabwe effect of acute illness on contact patterns projecting social contact matrices in countries using contact surveys and demographic data updated social contact matrices in countries using contact surveys and demographic data quantifying age-related rates of social contact using diaries in a rural coastal population of kenya kenya population and housing census volume iii: distribution of population by age and sex contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project. medrxiv temporal variation in transmission during the covid- outbreak. cmmid repository estimating the effects of non-pharmaceutical interventions on covid- in europe age-dependent effects in the transmission and control of covid- epidemics forecasting the scale of the covid- epidemic in kenya. medrxiv examining levels, distribution and correlates of health insurance coverage in kenya socio-economic inequality and inequity in use of health care services in kenya: evidence from the fourth kenya household health expenditure and utilization survey using a delay-adjusted case fatality ratio to estimate under-reporting. london: centre for mathematical modeling of infectious diseases repository persistence of coronaviruses on inanimate surfaces and their inactivation with biocidal agents exaggerated risk of transmission of covid- by fomites. 
the lancet infectious diseases kenya population and housing census volume ii: distribution of population by administrative units social contacts and mixing patterns relevant to the spread of infectious diseases publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations we are grateful for the excellent research assistance of roseline oguta, hellen collete ochola, emmanuel mukabi, carol olela adhiambo, catherine nduku mwangi, wesely onsongo, omachi shawn ambunya, james joseph okwogo, esther kariuki, juliet nduta mwangi, kadija mohamed ali, vidah achieng oloo, and lucy kerubo nyamwaro. we also acknowledge the contribution of timothy abuya, faith mbushi, eva muluve, james b. tidwell, and thoai d. ngo. availability of data and materials: data and code are fully available at https://github.com/mquaife/kenya_mixing. the study was approved by the internal review board of the population council (study number ), the ethics committee of the london school of hygiene and tropical medicine (reference number ), and the amref health africa ethics and scientific review committee in kenya (p / ). verbal informed consent was obtained from all participants because written consent was not possible for phone-based interviews. the verbal consent process was approved by the three ethics committees named above. competing interests: all authors declare no conflict of interest. received: june accepted: september supplementary information accompanies this paper at https://doi.org/ . /s - - - . additional file . mixing data collection tool. additional file . measurement of socioeconomic status, and food and economic security. additional file . age adjustment. additional file . 
age-susceptibility adjustment. author details key: cord- -g pt yn authors: mclachlan, scott; lucas, peter; dube, kudakwashe; hitman, graham a; osman, magda; kyrimi, evangelia; neil, martin; fenton, norman e title: bluetooth smartphone apps: are they the most private and effective solution for covid- contact tracing? date: - - journal: nan doi: nan sha: doc_id: cord_uid: g pt yn many digital solutions mainly involving bluetooth technology are being proposed for contact tracing apps (cta) to reduce the spread of covid- . concerns have been raised regarding privacy, consent, uptake required in a given population, and the degree to which use of ctas can impact individual behaviours. however, very few groups have taken a holistic approach and presented a combined solution. none has presented their cta in such a way as to ensure that even the most suggestible member of our community does not become complacent and assume that cta operates as an invisible shield, making us and our families impenetrable or immune to the disease. we propose to build on some of the digital solutions already under development which, with the addition of a bayesian model that predicts likelihood of infection, supplemented by traditional symptom and contact tracing, can enable us to reach % of a population. when combined with an effective communication strategy and social distancing, we believe solutions like the one proposed here can have a very beneficial effect on containing the spread of this pandemic. at the time of writing many of us are in our fifth or sixth week of social distancing and lockdown in an effort, we were told, that would flatten the curve and curtail the spread of covid- . 
as considerations move from dealing with the worst of the disease to containment of any remaining pockets of infection, much noise is being made in the media concerning the need to implement contact tracing apps (cta) before the world can return ostensibly to normal (mathews, ; scott, ; whittaker, ; drew ) . while the claimed benefits for cta of being able to leave our homes, reopen workplaces and revive crippled economies are significant, cta are not without some controversy volk, ) . questions regarding transmission dynamics and optimal intervention strategies for the disease, and the risk cta pose to individual privacy and efficacy are repeatedly raised, and many feel these have not been adequately answered (crocker et al, ; sun & viboud, ) . some describe cta as the trojan horse: reminding us that many governments and corporations already operate population-wide electronic surveillance and the likelihood that they do not, and once they also get access to our cta data, will not, act in good faith . however, what everyone fails to ask is whether this personal information is being provided in support of the most effective, or even an effective, method, and at what uptake rate in the general population we can be sure that it will be worthwhile. is a bluetooth radio beacon paired to a smartphone app the most effective method for digital contact tracing? in this paper we address these key questions for smartphone-based contact tracing solutions. proposed more than years ago for the control of syphilis (paran, ) , contact tracing is a surveillance and containment strategy for infectious disease (vazquez-prokopec et al, ) . rather than managing only isolated cases as they seek medical attention, contact tracing follows the path of infection from diagnosed patients to those with whom they have been in close physical contact (armbruster & brandeau, ; eames, ; vazquez-prokopec et al, ) . 
several approaches for contact tracing have been described in the literature, including: first-order, single-step, iterative and retrospective (eames, ; klinkenberg et al, ) . first-order tracing only identifies those people the patient immediately came into contact with, and advises them of potential exposure and the need to seek medical advice or self-isolate. it does not concern itself with tracing the contacts of contacts, leaving that second-order process to occur as and when the first-order contact seeks medical care. single-step contact tracing identifies all people that the infected person came into contact with, and as any of those are also identified as infected, their contacts are identified and the process continues. one issue with single-step contact tracing is that asymptomatic infecteds can spread the disease until they are detected and isolated. in contrast, iterative contact tracing continues to track and re-apply the relevant diagnostic test to contacts iteratively before their infection may even be detected through symptom screening. the process continues until no further infecteds are identified. the final type, retrospective contact tracing, follows the same process as either single-step or iterative with the addition that it also operates in reverse by considering the people with whom the infected patient had been in contact in their recent past, with the goal of identifying who it was that infected the patient. each approach is demonstrated in figure . contact tracing has traditionally been conducted as a manual multi-stage process that begins when a patient is diagnosed with an infection that is usually also subject to notification rules that require the clinician to apprise the health authority (ha) of the infected's status. any likely contacts of the infected patient are determined, identified, advised of their exposure status and encouraged to seek medical advice (armbruster & brandeau, ; eames, ) . 
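The difference between first-order tracing and tracing that follows chains of infection can be sketched on a small contact graph. This is a simplified illustration under stated assumptions, not a method from the cited literature: contacts are stored as a dictionary of lists, every traced contact is tested immediately, and tracing continues only from contacts who test positive. The function names and the example graph are hypothetical.

```python
from collections import deque

def first_order_trace(contacts, index_case):
    """Return only the direct contacts of the index case."""
    return set(contacts.get(index_case, ()))

def chain_trace(contacts, infected, index_case):
    """Trace contacts, then contacts of any contact found to be infected.

    Returns (traced, cases): everyone who was notified, and everyone
    identified as infected, including the index case.
    """
    traced = set()
    cases = {index_case}
    queue = deque([index_case])
    while queue:
        case = queue.popleft()
        for person in contacts.get(case, ()):
            if person in traced or person in cases:
                continue              # already handled
            traced.add(person)        # notified and tested
            if person in infected:    # positive test: keep following the chain
                cases.add(person)
                queue.append(person)
    return traced, cases

# Hypothetical contact graph: a-b, a-c, b-d, d-e
contacts = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}
infected = {"a", "b"}

direct = first_order_trace(contacts, "a")
traced, cases = chain_trace(contacts, infected, "a")
```

Here first-order tracing from `a` reaches only `b` and `c`, whereas following the chain through the infected contact `b` also reaches `d`. Person `e` is never traced because `d` is not infected.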
generally, contact tracing has only been used for diseases with low prevalence: meaning diseases where there is only a small number of cases in the community at any given time (armbruster & brandeau, ) . examples of diseases where contact tracing has been applied include: tuberculosis, hiv/aids, ebola and sexually transmitted diseases (armbruster & brandeau, ; danquah et al, ; eames, ; yasaka et al, ) . on review, many of these examples show the efficacy and reliability of contact tracing to be uncertain and contentious issues. with our vastly increased global population, international airline travel, megacities and mass transit, it is unlikely that traditional contact tracing alone could contain even a minimally contagious disease (niehus et al., ) . traditional contact tracing was used early on during the sars epidemic (fidler, ; huat, ) . however, it failed to contain the infection, which quickly spread through the wider community, with health authorities (has) around the world realising that new approaches were now required (fidler, ; huat, ) . modern contact tracing approaches have been proposed using ubiquitous and pervasive smartphones and the wireless technologies they contain to record and report when we have come into close physical contact with others. it is believed this automated contact tracing will overcome situations when we either are not aware of, or don't recall, every contact incident (maghdid & ghafoor, ) . the proposed approaches shown in figure incorporate these technologies to more efficiently and effectively provide: (a) movement-focused mobile-assisted automatic contact recording; (b) contact identification; (c) contact notification; and, (d) narrowcast messaging (maghdid & ghafoor, ; vazquez-prokopec et al, ; yasaka et al, ) . 
proponents of cta claim, possibly disingenuously given their extensive and publicly-funded investment in development of the app, that installing the app will significantly reduce the chance of you passing on the infection to your family and friends (covidsafe app, ) , and is essential to keeping your family safe from covid- (hamilton, ) . 
(c) all cta users who have been in close physical contact with the infected are advised that they should seek medical advice. (d) the central server can also be used to send narrowcast messages, for example: alerting people who cta location tracing identified near a particular infection hotspot during a defined period (in green) that they may have been exposed and to seek medical advice. 
while solutions using wifi mac address sniffing (lu et al, ) , gps (finazzi, ; klopfenstein et al, ; maghdid & ghafoor, ) and cellular network geolocating (dp t, ; pepp-pt, ) have all been proposed, many believe bluetooth tracing to be the most suitable for use in cta (berke et al, ; brack et al, ) . authors point to the fact that bluetooth has already been demonstrated effective for proximity detection (berke et al, ; brack et al, ) . it is also claimed that while bluetooth has an effective range of around - metres, signal strength can be used to effectively identify whether another device is within the -metre rule promoted as a component of social distancing (berke et al, ; xia & lee, ) . most attention to privacy in the literature focuses on the interactions and data passing between users of the cta when they come into close physical contact and their devices handshake. 
a smaller focus is given to interactions between the cta and ha server, whose privacy exposure is mitigated, it is claimed, by decentralised solutions: that is, solutions where most data remains on the user's device and only small push or pull transactions occur to the ha server to either advise the system of the user's covid- diagnosis, or verify that the user has not already been in contact with another who has since been diagnosed. what is clear is that while labelling their solutions as privacy-preserving, most authors seek to mitigate one form of data or privacy loss while ignoring, intentionally or not, every other possible disclosure vector (kuhn et al, ) . to the best of our knowledge, no author considered the issue of metadata and its effect in nullifying their often complicated and expensive privacy solutions. metadata is the most common and easily accessible form of personal information being collected (mclachlan, ) . metadata is defined as information about a communication: the who, when, where, and how but not the what. metadata contains sufficient information to know when you made a call, texted, emailed or accessed a web page, who your communication or web request was made to, how and whether the person or system at the other end received the communication. the only thing metadata does not contain is the actual content of the message (maurushat et al, ) . for more than a decade metadata has been used by law enforcement and others to draw inferences about our state of mind, intentions, previous travel, personal associations and interactions (maurushat et al, ; mclachlan, ) . in many countries metadata may be accessed without a warrant by authorised organisations and agents, and laws exist requiring telecommunications, internet service provider companies and web hosts to maintain large stores of metadata collected as a result of the activities of individual subscribers (maurushat et al, ; mclachlan, ; shamsi et al, ) . 
let us consider the data that is generated while using a cta. figure presents the typical cta use-case described by many authors, in which: (a) the primary cta user and others install and register the app on their smartphones; (b) as they move around and come into close physical contact with each other, their smartphones identify other smartphones and a contact trace is recorded; (c) an upload of some information passes from the cta on the user's device, via their provider's core network (cellular or isp); (d) from their provider, via the internet, to the ha servers; and (e) alerts and updates can also be sent from the ha server to individuals, or every user. some variation is observed in the literature claiming to present privacy-preserving methods regarding: (i) the type of information passed from the cta to the health authority server; and (ii) whether the data passes directly to the ha server or, as with the singapore (tracetogether), australian (covidsafe) and proposed apple/google collaboration examples, into a third-party supplier's international datacentre cloud network (i.e. google, apple or amazon web services) before being received by the ha server (maddocks, ). metadata are generated at every step of the typical cta scenario. every communication or request sent to cellular, internet service provider or web host organisations results in metadata that must be stored in logs in their network that identify you from your subscriber identity module (sim) record matched to the details of your device, with a record of what you requested or sent, to or from whom, and when (de carli et al, ; mclachlan, ; shamsi et al, ). all digital traffic passing from your provider's network via the internet to the ha results in metadata being captured in the systems of every network provider between the two, but more importantly, in the ha's network systems and servers.
believed by many to be non-sensitive, metadata often remains overlooked in smartphone and internet-facing solutions even though it can be a trivial matter to re-identify an individual and their actions and interactions with others from the metadata, or digital breadcrumbs, they create (ho et al, ; maurushat et al, ; perez et al, ; shamsi et al, ) . while singapore and australia's health departments have already commenced rollout of cta solutions for covid- , the united kingdom (uk), north america and most of europe will only commence their trial deployments in the coming week (hern & sabbagh, ) . taiwan, south korea and israel were even more proactive, with increased testing, quarantines and mandated cta of recent travellers and the infected that has resulted in lower rates of secondary infections and significantly fewer deaths, with alarms being raised, similar to home detention systems for criminals, informing police if those in quarantine left the building in which they were being housed (lee, ; lomas, b) . most literature proposing cta and being used by academics and governments to support efforts, efficacy and expenditure of public funds for covid- contact tracing with smartphones, are theoretical solutions in hurriedly prepared preprints that are yet to undergo rigorous testing or peer review. examples include: (berke et al, ; brack et al, ; de carli et al, ; hekmati et al, ; klopfenstein et al, ; maghdid & ghafoor, ; reichart et al, ; xia et al, ) . while acknowledging that privacy is not a design goal for any cta, many propose solutions that they claim are privacy-preserving: both between app users generally, and between individuals and the health authority and technology suppliers who maintain the central servers . only one paper was identified in this work that acknowledged no privacy could exist where there was a central authority, and that users should only expect solutions to keep them blinded from each other (berke et al, ) . 
some solutions present as a confusing array of seemingly random technology, thrust together (reichert et al, ). apps proposing id hashing or public/private key encryption between central server and end-user claim these additions ensure complete user privacy; and while authors acknowledge that the central server will have recorded your current and all previous hashids and will be used to distribute alerts to other users, they also disingenuously claim that the health authority are entirely unable to learn anything at all about users, the infected, or their contact history from this vast collection of data. many proclaim cta ineffective because it relies on willing individuals who must provide identifying information about themselves and those they come into contact with, and self-report their infected status via the app for storage on a central server (hekmati, ; yasaka et al, ), usually while simultaneously claiming to provide a decentralised or privacy-protecting solution that still uses user ids and other information such as location or contact lists that are uploaded or shared via the central server (hekmati et al, ; reichert et al, ; yasaka et al, ). however, decentralisation adds complexity (berke et al, ), often without a significant improvement in privacy. in one case an infected still self-reports, except that they are instead required to provide a signed medical certificate, exposing even more personal information to whomever runs the central server so that an alert can be broadcast to others who the infected has previously been in contact with (hekmati et al, ). researchers and epidemiologists have sought, somewhat unsuccessfully, to understand the efficacy and overall value of disease contact tracing for many years, with heightened interest often observed in the aftermath of disease outbreaks.
many issues limit contact tracing efficacy, the most significant being the need to understand transmission, susceptibility, prevalence, and latency for the target disease (kiss et al, ). before deciding on an effective control strategy, it is essential to understand the course of the disease. in epidemiology, many compartmental models have been developed for modelling infectious diseases (roddam, ; hethcote, ). one commonly used model, which computes the theoretical number of people infected with a contagious disease in a closed population over time, is the susceptible-infected-recovered (sir) model (anderson, ; rodrigues, ). these mathematical models are considered an important source of knowledge for global governments making life-or-death decisions regarding management of covid- . the susceptible-exposed-infected-recovered (seir) model has been used to focus on transmission of covid- in wuhan, china (lin et al, ), and to compare outcomes for different containment policies (casella, ). the susceptible-infectious-recovered-dead (sird) model has been used to provide estimations of the basic reproduction number (r ), per day infection mortality and recovery rates, and attempts to forecast the evolution of an outbreak at the epicentre three weeks in advance (anastassopoulou et al., ). susceptible-infected-diagnosed-ailing-recognized-threatened-healed-extinct (sidarthe) was proposed as an extension to sir in an effort to model the covid- epidemic in italy (giordano et al., ). their model showed that enforced lockdowns could be mitigated in the presence of widespread testing (peto, ) and contact tracing, strongly contributing to rapid resolution of the epidemic. similar findings were also reported by hellewell et al. ( ).
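to make the compartmental idea concrete, the following is a minimal sketch of the basic sir model, integrated with a simple forward-euler step. it is an illustration only and not the model used by any of the cited studies; the function name, the transmission rate beta, the recovery rate gamma and all numeric values are hypothetical.

```python
def simulate_sir(beta, gamma, susceptible0, infected0, recovered0, days, dt=0.1):
    """Forward-Euler integration of the classic SIR equations in a closed population:

        dS/dt = -beta * S * I / N
        dI/dt =  beta * S * I / N - gamma * I
        dR/dt =  gamma * I

    beta and gamma are illustrative (hypothetical) rates per day.
    Returns a list of (time, S, I, R) tuples.
    """
    n = susceptible0 + infected0 + recovered0
    s, i, r = float(susceptible0), float(infected0), float(recovered0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical outbreak: one index case in a closed population of 1000.
    run = simulate_sir(beta=0.3, gamma=0.1, susceptible0=999,
                       infected0=1, recovered0=0, days=160)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_infected:.0f} infected around day {peak_time:.0f}")
```

the extended models mentioned above (seir, sird, sidarthe) follow the same pattern with additional compartments and transition rates.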
while some believed contact tracing was effective during the sars outbreaks of the early 's (kiss et al, ), we have already discussed singapore's reliance on contact tracing during that period and how on review it was found to have failed (fidler, ; huat, ). other examples where contact tracing failed, in some cases even with the use of smartphone technology and apps, include an audit of contact tracing use for tuberculosis (hussain et al, ; mwongela, ); the foot and mouth outbreak in the uk in (kiss et al, ; kao, ); and the - ebola epidemic in west africa (danquah et al, ). while many claim suitability, viability and effectiveness for cta, in most cases the cta solution they propose has yet to be prototyped or, for those that were, trialled in anything approaching a real-world situation (de carli et al, ; hekmati et al, ; klopfenstein et al, ; mwongela, ; yasaka et al, ). we sought to understand how effective cta might be as a containment approach for covid- in highly populous locations like london or birmingham in the uk, or sydney and melbourne in australia. we observed that most papers presenting a cta appeared to silently apply best-case assumptions when discussing or evaluating their models in order to paint their solution in the best light. for consistency, we chose to continue this practice albeit with the novel addition of transparency. with respect to how many people an infected person may come into contact with, we rely on the calculations provided in the uk that have come to be known as the oxford figures and have been used by those developing and promoting the need for an nhs-specific app, and in the media, to support efficacy, funding and deployment of the nhs app (merrick, ).
the authors used an seir model to suggest that in a -day period post-lockdown the average person comes into contact with people, of which are considered to be close contacts sufficient for disease transmission, and of those would be individuals in a cta scenario who are potentially traceable (keeling et al, ). while we could have worked from the number of close contacts, which would have made our numbers significantly larger and more dramatic, in order to demonstrate the fallacy of claims made in support of cta even as a component in disease containment for covid- , we chose again to work from a best-case position and elected to use the latter and much lower figure for total transmissions. the oxford figures also provide that the average latent period, usually defined as the period between when a person is exposed to the virus and when they begin exhibiting symptoms, is days (keeling et al, ). other authors using larger datasets reported that this incubation period was days, with % of patients showing symptoms at day (lauer et al, ; qi et al, ). younger infected patients tend to be asymptomatic, and for longer periods, while the mean serial interval (the time between when symptoms appear in infector and infectee) varies between and . days (qi et al, ). it should be noted that our best-case assumptions are similar to those of dr hannah fry's group (kucharski et al, ) except that our mean delay from symptoms to isolation was reduced to day, the effect of which would be to reduce the number of secondary infecteds created by each primary in our scenarios. in spite of this, our results were statistically similar to those of kucharski et al ( ). the assumptions used in our calculations include that: a) the infection clock starts from exposure; b) from day the infected begins to shed the virus; c) patients may become symptomatic between days . and .
; d) at day every infected is considered to be symptomatic; e) each infected comes into close contact with people in a -day period, pro rata for the period between day and when they become symptomatic; f) every infected has self-isolated from day . we searched pubmed, medrxiv, biorxiv, arxiv and doaj for peer-reviewed articles and preprints that mentioned the terms "contact tracing" and "covid- ". our initial search revealed more than articles published since december . we narrowed our search to those articles published since march and selected only those that proposed a cta solution. this identified a collection of papers whose solutions were reviewed. of these papers, ( %) used the term privacy in either the title, abstract or introduction, and ( %) proposed solutions claimed to be privacy-preserving. solutions intended to reduce or eliminate data passing to a central server, described as decentralised solutions, were proposed in ( %). only ( %) solutions described production of a prototype, with ( %) solution having been tested with simulated data. for the o'clock path shown in figure , we present the absolute best-case scenario where % of the population have smartphones, install the cta, are tested, immediately self-report and self-isolate. this scenario, whilst being quite impossible, would actually contain the disease in only two cycles, or days. the and o'clock paths present the uk and australian scenarios for the claimed % (merrick, ) and % (woodley, ) adoption that we are told would deliver cta success in their respective populations. in each scenario every infected spreads covid- to only a small number of infecteds, and while a percentage of secondary infecteds are alerted through the cta and self-isolated, the remaining percentage, those without the app, continue to spread the infection to a significantly larger number of people. figure provides a visual representation of the progress at each stage for the % adoption nhs cta scenario.
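the generation-by-generation accounting behind these adoption scenarios can be illustrated with a minimal sketch. it assumes, as a simplification of ours rather than a restatement of the exact calculation, that every actively spreading infected creates a fixed number of secondary infecteds and that a fraction of them equal to the adoption rate is alerted and self-isolates before spreading further. the function name, its parameters and the numeric values are hypothetical.

```python
def untraced_spreaders(adoption, secondary_per_case, generations, index_cases=1.0):
    """Expected number of infecteds per generation who are NOT alerted by the app.

    adoption           -- fraction of the population using the app (0..1), hypothetical
    secondary_per_case -- secondary infecteds created by each active spreader, hypothetical
    generations        -- number of infection cycles to follow
    """
    if not 0.0 <= adoption <= 1.0:
        raise ValueError("adoption must lie between 0 and 1")
    spreaders = float(index_cases)
    per_generation = [spreaders]
    for _ in range(generations):
        secondary = spreaders * secondary_per_case
        # Only secondary cases without the app carry on spreading.
        spreaders = secondary * (1.0 - adoption)
        per_generation.append(spreaders)
    return per_generation


if __name__ == "__main__":
    for adoption in (1.0, 0.75, 0.5):
        print(adoption, untraced_spreaders(adoption, secondary_per_case=4, generations=3))
```

under these assumptions the pool of untraced spreaders shrinks only when secondary_per_case * (1 - adoption) is below one, which is why partial adoption still leaves a growing number of untraced cases.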
smartphone penetration for adults in the uk has only achieved %, reducing to % in the key covid- demographic, the over- s. australian figures are similar. to get % penetration in the overall uk population, more than three quarters ( %) of all smartphone owners must install, register and use the app. this assumes absolutely no loss to follow-up, which occurs where a user either stops using or removes the app from their device for any reason. when the average loss to follow-up in a clinical trial is % (akl et al, ) , the nhs app would actually require more than % of the smartphone-owning population to initially install and register the app to increase the probability that % will use their cta to completion. for the % (australian) and % (uk) scenarios we begin from the position that % and % of the population respectively have installed the app and immediately self-report and/or self-isolate when alerted. as these scenarios played out, we calculated under an absolute best-case wherein people who were alerted by the app or who reached day all immediately self-isolated. the issue with this is that we know some people's symptoms will not be severe enough at first for them to believe they have the disease and seek medical advice. this is human nature. studies report that around % of all exposed people remain asymptomatic but recover from the virus in a timeframe similar to that of people who do become symptomatic (mizumoto et al, ; day, ) . further, - % of patients will be asymptomatic but remain contagious and continue to shed the virus from - months after their initial exposure, with or without a symptomatic period (bengali, ) . in keeping with our best-case model we have not incorporated additional potential exposures that would arise from these groups of people in our calculations. 
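the arithmetic behind the required install share can be sketched as follows. the share of smartphone owners who must install the app is the target population adoption divided by smartphone penetration, and allowing for loss to follow-up inflates it further by a factor of 1 / (1 - loss). the function name is hypothetical and the numbers in the example are placeholders chosen for illustration, not the figures reported above.

```python
def required_install_share(target_adoption, smartphone_penetration, loss_to_follow_up=0.0):
    """Share of smartphone owners who must initially install the app.

    target_adoption        -- desired share of the whole population using the app
    smartphone_penetration -- share of the population that owns a smartphone
    loss_to_follow_up      -- share of installers expected to stop using the app

    A result above 1.0 means the target cannot be reached even if every
    smartphone owner installs the app.
    """
    if not 0.0 < smartphone_penetration <= 1.0:
        raise ValueError("smartphone_penetration must be in (0, 1]")
    if not 0.0 <= loss_to_follow_up < 1.0:
        raise ValueError("loss_to_follow_up must be in [0, 1)")
    share_of_owners = target_adoption / smartphone_penetration
    return share_of_owners / (1.0 - loss_to_follow_up)


if __name__ == "__main__":
    # Placeholder inputs: 50% target adoption, 80% smartphone ownership.
    print(round(required_install_share(0.5, 0.8), 3))        # no attrition
    print(round(required_install_share(0.5, 0.8, 0.2), 3))   # with 20% attrition
```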
a final set of calculations was performed seeking the sweet spot: that number below absolute for cta adoption in the overall population where the number of secondary cases was manageable by manual contact tracing and other containment methods, and the nhs generally. table presents the results of those calculations and, similar to figures proposed by other groups who have evaluated this issue (bulchandani et al, ) , we find the sweet spot for cta uptake in order to control covid- lies somewhere between and %. as discussed, such high uptake is simply not credible or possible. this paper has considered many of the barriers that continue to impede success for contact tracing, even when it is automated with a smartphone app. we now turn to consider the current or proposed solutions and how, even if not completely successful, they might be better designed and promoted in order to produce a lasting benefit for the average individual and wider community. the united kingdom breakout box describes the app being rolled out by the uk government and nhs. we refer to it as the oxford/nhs app since its development was led by academics at oxford. the government are pinning their hopes on this app being a key enabler for relaxing the current lockdown policy. appendix discusses the common properties and data being collected by cta reviewed during this research. we believe the statistics and overall proposal to support development of the app and promote its uptake in the community are based on best-case scenarios. however, we do perceive that the strength of government and nhs support comes from the perception of trust they seek to engender. the openness and degree of transparency that the nhs and oxford teams have been upselling in the media, if delivered, far exceed that of any other. we found no other state-developed or operated solution that suggested a willingness to allow the media, technologists and general public access to the source code. 
however, early non-published results of a pilot trial on the isle of wight are less encouraging with a major limiting factor being the variation in smartphone operating systems, especially those of older phones (duell, ) . the level of transparency underpinning the nhs solution needs to also be adopted in any use of the apis provided by the apple/google collaboration. (drew et al, ) . they currently have . million users and report symptoms data gathered from around . million, of whom only a tiny fraction of , ( . %) had undergone some form of pcr-based diagnostic test (drew et al, ) . many issues may present with this type of study. these issues include the subjective nature of the endeavor, the bias that comes from the fact that the app was initially promoted to and installed by clinical staff and their families, and that many in the wider community who voluntarily install such apps are the worried well who, when prompted with questions suggesting the symptoms that go with a condition, are more likely to identify as having some of them. unless carefully managed, suggestibility unintentionally induces conditioned associations between symptoms, leading patients to report more intense or additional flu-like symptoms (skelton et al, ) . leaving these issues aside we believe a good solution might have been to incorporate cta into this symptom tracker app, and allow the existing user-base to either consent or decline providing that additional information. that a high number of existing users would consent to the addition is far more likely than believing that almost million people will install a second covid- related app. we also believe that any proposed cta solution should contemplate capture of many of the same symptom-based data-points, whether used directly in contact tracing or not. we suggest this in order to enable future anonymous aggregation and data mining/knowledge engineering on covid- from what could be a considerably much larger and richer dataset. 
our proposed solution focuses on enabling users to diagnose the possible presence of covid- themselves. this is done through a causal probabilistic model (a bayesian network, that we describe in section . ) that is made available in a smartphone app based on the architectural framework (that we describe in section . ). the app provides the user with information about how likely it is that they have mild or severe covid- , or neither. when this probabilistic information is combined with data about the gps location of the smartphone, together with information about the age group of the person, the triple (prob. user has covid- , gps-location, age-group) can be used to provide information about the distribution of mild and severe covid- . for example, using colour shades on the map of a country, the data can be used to present a dynamic visualization of the probability distribution on the location where that information was collected (hay et al, ). this solution option involves providing diagnostic-oriented feedback to citizens with real time covid- surveillance and minimal privacy infringement as quickly as possible in the face of all the limitations of the current constantly changing situation. response measures from the information collected from this option operate mainly at the population location level, such as intensified lockdown, social/physical distancing and self-isolation campaigns rather than more granular contact tracing and individual isolation measures requiring massive resource deployment. this option is dramatically different from the many trace and contact app solutions provided elsewhere. a bayesian network (bn) (cowell et al, ; neil, ; koller & friedman, ; pearl, ) is a graphical model consisting of nodes and arcs as shown in figure (this is the draft model we propose for our app). some of the variables (such as those representing symptom nodes) may be directly observable while others (such as the covid- node) are not.
there is an arc between two nodes if the corresponding variables are causally linked in a probabilistic sense. the strength of the link, as well as the uncertainty associated with these, is captured using probabilities and statistical distributions. when data are entered into the model for specific variables that are observed, all of the probabilities for, as yet, unknown variables are updated using an ai algorithm called bayesian inference. hence, in the model here, the bn algorithm computes the probability of having none, mild, or severe covid- , based on present signs and symptoms and other relevant background information entered by the user. the model makes a number of simplifying, but rational assumptions. for example, it assumes: that a person can only become infected if they have been in recent contact with an infected person (or some biological matter from an infected person); that a positive test result from a perfectly accurate covid test procedure would mean that the person has covid (even if they were asymptomatic); that there may be other conditions such as copd or flu that have some symptoms in common with covid- . the probability distributions in the model for the symptoms given the disease status (i.e. the status of the covid- variable) are based on the statistics provided in the paper by huang et al. ( ) . all the assumptions are described in appendix . covid- bayesian network model structure. the probabilities shown for the covid status node represent the prior probabilities when no observations are entered. figure shows the updated predicted probabilities with some user entered observations; in this example a user has many of the covid symptoms and has had multiple recent interactions with other people. although this user has not entered their background or risk factors, the model infers there is a % probability the person has covid ( % probability severe and % probability mild). 
note that the model also updates the probabilities for the unknown risk factors and background nodes. for example, this person is more likely to be male than female ( %) and is likely to be over ( % probability). the probability of obesity is % (up from a prior of %). these backward inferences are simply the application of bayes. appendix illustrates the power of the model through other scenarios. depending on the value of the 'alert threshold' that is set the model will trigger an alert (it will trigger a separate hospitalization alert depending on the length of time the symptoms have been present and whether or not they are improving). so those people with the app who have come into contact with the person will be alerted that they have been in contact with a person most likely to be covid positive, while this person could be given appropriate instructions for contacting the health authorities. this model is still an incomplete attempt at developing a bn for the prediction of the presence of covid- (we are in the process of gathering the relevant data required to complete all of the probability tables; currently those for which we do not have relevant data, or are not logically determined, are simply estimated). it is possible to add other signs and symptoms (for example dizziness seems useful) and also comorbidities and immunodeficiency could be added, as the literature provides the relevant information. the advantage of a bn is that it can still generate predictions with incomplete information. thus, if certain evidence is not entered by the user, the model is able to use prior probabilistic information rather than make particular assumptions. so, although body temperature and oxygen saturation are key measurements, the user decides whether or not these measurements are actually done. 
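the way such a model updates its belief from whatever evidence the user chooses to enter, and then compares the result with an alert threshold, can be illustrated with a deliberately tiny sketch. it is not the bn proposed here: it assumes a single hidden status node with symptoms that are conditionally independent given that status (a naive bayes simplification), and every probability, name and threshold in it is hypothetical.

```python
# Hypothetical prior over the hidden status node.
PRIOR = {"none": 0.90, "mild": 0.08, "severe": 0.02}

# Hypothetical P(symptom present | status) tables.
P_SYMPTOM = {
    "fever": {"none": 0.05, "mild": 0.60, "severe": 0.90},
    "cough": {"none": 0.10, "mild": 0.50, "severe": 0.70},
}


def posterior(evidence):
    """Bayes' theorem over the status node.

    evidence maps symptom name -> True/False. Symptoms the user did not
    enter are simply left out, so the prior information is used for them.
    """
    scores = {}
    for status, prior in PRIOR.items():
        likelihood = 1.0
        for symptom, present in evidence.items():
            p = P_SYMPTOM[symptom][status]
            likelihood *= p if present else 1.0 - p
        scores[status] = prior * likelihood
    total = sum(scores.values())
    return {status: score / total for status, score in scores.items()}


def should_alert(evidence, threshold=0.5):
    """Trigger an alert when P(mild or severe) reaches the chosen threshold."""
    post = posterior(evidence)
    return post["mild"] + post["severe"] >= threshold


if __name__ == "__main__":
    print(posterior({}))                      # no evidence: the prior is returned
    print(posterior({"fever": True}))         # partial evidence is allowed
    print(should_alert({"fever": True, "cough": True}))
```

a full bn generalises this by allowing arbitrary dependencies between nodes, which is also what makes the backward inferences about risk factors described above possible.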
using the bn it is also possible to predict which feature will be the most informative one in contributing to the diagnosis, and this feature can be used to request additional information from the app's user after some initial input. the envisioned use of such a probabilistic bn model is as a foundation of population surveillance of the geographical outbreak and spread of covid- . the proposed infrastructure for personalised covid- status feedback and collecting geographical data is shown in figure , and is inspired by related research of the authors' research groups (van der heijden et al, ; velikova et al, ) . as figure illustrates, the bn is embedded or integrated into an app meant to run on a person's smartphone. the presentation of the feedback is expected to be attractive and easily understood by the smartphone user with additional advice whether or not it is wise to contact a gp. this solution operates within the cardipro environment using the web/pwa front-end and agena cloudapi (mclachlan et al, ) . our research group has the means now to demonstrate both the elements and the entire solution presented in figure . the minimalist data transmitted to the server, even if coupled with collecting a similar anonymous symptom set as used for the chan/spector app, might be more palatable to people who may be concerned about privacy in both the uk and netherlands. in summary, in this proposed solution, it is assumed that a citizen of a country obtains feedback about the likelihood of the presence of mild or severe covid- from a smartphone app, but the main purpose of making an app with the bn embedded is to monitor the population for detecting new outbreaks and the locations at which this occurs as early as possible. for this purpose, it is only needed that the minimalistic data triple is collected centrally. the age information might be useful to get information about required protection of particular groups. 
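a minimal sketch of how the centrally collected data triple might be represented and aggregated for a map view is given below. the class and function names are hypothetical, and the idea of binning gps coordinates into coarse grid cells before averaging is our illustrative assumption rather than a description of the implemented server.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """The minimal triple sent to the server; no user identifier is included."""
    prob_covid: float   # model output: probability the user has covid
    latitude: float     # gps location of the smartphone
    longitude: float
    age_group: str      # coarse band, e.g. "60+" (labels are hypothetical)


def mean_probability_by_cell(reports, cell_degrees=0.1):
    """Average the reported probability within each grid cell.

    Cells are squares of cell_degrees on each side; the returned dict maps
    (row, column) cell indices to the mean probability, ready for shading.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for report in reports:
        cell = (math.floor(report.latitude / cell_degrees),
                math.floor(report.longitude / cell_degrees))
        totals[cell][0] += report.prob_covid
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


if __name__ == "__main__":
    sample = [
        Report(0.2, 51.51, -0.12, "18-39"),
        Report(0.6, 51.52, -0.13, "60+"),
        Report(0.1, 52.48, -1.89, "40-59"),
    ]
    for cell, mean_prob in mean_probability_by_cell(sample).items():
        print(cell, round(mean_prob, 2))
```

grouping by age_group as well as by cell would follow the same pattern, at the cost of smaller counts per cell.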
in addition it might be useful to also add an app-specific unique identifier so that it is possible to follow the progress of covid- in the individual (possibly until hospital admission). however, collecting only the above-mentioned data triple has the advantage of minimal infringement of privacy. on installing cta some personally identifiable information is always captured as a result of downloading the app from the app store, and for some apps, like the australian and uk ones, when registering on first use (maddocks, ) . at a minimum, this is information which when combined with the metadata being generated makes every user and their associations identifiable. we contend that many claims regarding privacy and efficacy of cta for covid- in these papers may not be justified, and in some cases are misleading. we are not the first to identify the falsity of attempts at cta privacy (berke et al, ; kuhn et al, ) , nor to raise concern regarding the efficacy and applicability of cta for covid- contact tracing. however, we are the first to consider both issues together, and as a result to demonstrate that bluetooth cta are not the covid- panacea we all seek. the uk and several other countries including australia, singapore and germany, propose a centralised approach whereby data will be collected on smartphones and some component of that data is forwarded to a central server, enabling contact alerting and tracing of the epidemic. some countries favour use of the solution presented by the apple and google partnership, which is claimed to be a 'local' solution under development that will not breach data security and will not lead to any centralisation of data. their proposal for privacy-safe contact tracing using bluetooth would, they say, require explicit user consent, which is another issue that needs greater consideration. 
the apple/google solution apis on first blush don't appear to collect personally identifiable information or user location data, and suggest a list of people you've been in contact with never leaves your phone (detected via bluetooth le). we are also told that people who test positive are not identified to other users, google or apple. that the information would only be used for contact tracing by public health authorities for covid- pandemic management which in itself, like every other proposed decentralised system would necessitate communication with and storage of data in some form of central server. however, with regard to metadata and privacy, we are circumspect that apple/google or the various ha using their apis will not be collecting at least part of the data being generated for secondary use purposes. it should also be noted that the apple/google apis are simply an interface for has to expedite development of cta solutions: they are not a cta. apis act as a standardised intermediary, in this case between the user interface and a data backend, both of which will still require has to engage software architects and developers to create. there is no guarantee that without engaging far more experienced technologists and serious reconsideration, any app the nhs develop using the apple/google apis will not fare as badly as the first hours of real-world testing of the nhsx cta on the isle of wight (duell, ) . if we are to use these apis, a better solution might be a progressive web app (pwa). a single pwa could be developed to be compatible with both android and apple architectures, and engineered to avoid the main issue seen with the nhs trial app: incompatibility with variants of the smartphone's operating system. we have already developed and demonstrated an example of this approach, called cardipro (mclachlan et al, ) . 
it can be inferred from the literature, mass media and download pages of those developing and promoting cta, that to at least some degree they seek to create the belief that implementation of contact tracing makes containment of covid- a fait accompli. each presents a solution couched in words suggesting that, for successful eradication of covid- , we need only to install the cta, and in doing so we will have identified everyone who, symptomatic or asymptomatic, might have the disease. however, this assumes the data collected by the cta will be clean, accurate and sufficiently complete, and rather than develop their own app, the australian government licensed rights to rebrand the tracetogether app developed by the singaporean government, and deploy it as covidsafe. as is common, emergency legislation was hurriedly drafted and enacted under the catchy title: biosecurity (human biosecurity emergency)(human coronavirus with pandemic potential)(emergency requirements -public health contact information) determination act (phcia, ). while making it an offence for a person outside those employed by a state or federal health authority to collect, use or disclose covid app data except for the purposes of contact tracing (section ( ) & ( )), this determination explicitly limits the same provision to data generated within the app or by the commonwealth and stored on the user's mobile device. phcia also excludes from all provisions, privacy or otherwise, information arising from any source other than the national covidsafe data store (section ( )). the effect of provisions of the phcia make it unlawful for an app user or member of the general public to decrypt, view or disseminate any data from their device, or even knowledge about data that the app collects or stores, while leaving government organisations able to interact with this data more freely. 
the phcia contracts itself out of provisions of the privacy act that may be found inconsistent under power of section ( ) of the biosecurity act , but does not exclude itself from the operation of others, including the telecommunications (interception and access) act (tiaa, ) which invokes data retention provisions on telecommunications providers, including your telephony and internet service providers, and amazon web services who will be the web host of the central server, to store records of all forms of electronic communication for at least two years. the tiaa also makes metadata available without warrant to a broad range of organisations that include law enforcement, local, state and federal government bodies, the rspca, the australian navy and border protection services, the thoroughbred horse and greyhound racing associations, workplace safety investigators, the clean energy regulator, national measurement institute, building and construction commission, taxi services commission and in some cases it has been demonstrated, private investigators (farrell, ; guy, ) . will fully support their containment efforts which, despite best intentions, is extremely unlikely (senga et al, ) . we accept that solutions operating at the front end of contact tracing, like the cta, will produce more data. more contact information will require time-consuming and labour-intensive follow-up, and consumption of considerable resources in order to identify and weed out the true cases from the spurious chatter (senga et al, ) . but it should be noted that previous work has failed to consider: a) the effect of people simply leaving their smartphone at home, or in the car; b) how to effectively deal with people who might have two or more devices; or c) how to identify the owners of prepaid devices that in some countries can be registered without identification, or anonymously. 
d) the effect of a cta user coming into close physical contact with others who eschew, or cannot afford, smartphones (all previous work assumed that the adjectives pervasive and ubiquitous meant complete coverage). we believe that care should be taken when deploying cta in any community, not just because of privacy or consent issues but, rather, to ensure that even the most suggestible member of our community does not become complacent and assume that the cta operates, as claims like those provided with the australian government covidsafe app would seem to suggest, as an invisible shield making us and our families impenetrable or immune to the disease. even if all potential privacy issues were resolved, the decision to install and register the cta in most western countries would remain voluntary. this raises the question: how can high-level uptake of the cta be assured? to answer this question we propose that at least three related matters must be considered: (i) public compliance with existing social distancing measures; (ii) media narrative of cta; and, (iii) ongoing changes in people's subjective estimate of severity and susceptibility to the virus. opinion polls in recent weeks are finding that the majority in each country are in favour of existing social distancing measures, irrespective of how strictly they are maintained and for how long they remain (ipsos-mori, a) . when compared to other countries, people in the uk are displaying a higher degree of support for continued social distancing. similarly, the uk population has shown overwhelming support for lockdown measures, again irrespective of severity and duration of the lockdown. it may be that this public acceptance will extend to other state-operated measures including the suggested test, track and trace strategy that includes the cta, as it is being promoted as a way to end the lockdown and reduce the possibility of additional and more severe lockdown measures. 
it is possible that, irrespective of how privacy-invading cta methods may be, or the potential negative impact of third-party use of metadata resulting from individual engagement with the app, the public may accept these impositions in return for the benefits of a lifted lockdown and lighter social distancing measures. certainly, the polling conducted between the th to th of april in the uk suggests this holds true, with % showing support for the cta (ipsos-mori, b). however, public opinion elsewhere is somewhat mixed. in other countries the trade-off is not the same: the protection of privacy outweighs relaxed social distancing through use of a cta. for example: (i) in france, where % of respondents are opposed to the cta (hughes hubbard, ) ; and, (ii) the us, where % of respondents are opposed to the cta (kirzinger et al, ) . the us poll also showed that opinion changes somewhat when benefits such as going back to work are more prominently presented, in which case % would agree to download the cta. however, from % of the us total sample % indicated that a cta would make them feel less safe, while % said the cta would make no difference to their feelings of safety at all. 
while drawing significant criticism, the uk national health service (nhs) has rejected the apple/google apis and decentralised model, expressly favouring a centralised approach that they say will allow for collection of more granular data and broader analysis to study and track the pandemic (hamilton, ) . the key difference to be noted between the nhs approach and all others is upfront acknowledgement of the intention to maintain this central collection of data while also making substantial claims regarding the privacy strength of the userland app and ethics of their approach. unlike descriptions of all other claimed privacy-preserving apps seen in the covid- literature, and in stark contrast to the australian approach of denying the public any real knowledge of the data being collected and transmitted by their device (phcia, ), the nhs are making encouraging noises regarding allowing researchers, security analysts and the general public access to the source code, to see behind the curtain and verify what data the app is collecting and transmitting (gould & lewis, ) . unlike any other, and if taken at face value, this could allow uk citizens to consider that data's existence and potential uses when deciding whether to download and activate the app on our personal devices. 
the current media narrative and an individual's subjective estimates of severity and susceptibility are two broad factors that, whilst not independent of each other, account for the observed differences in opinion and behaviour both between countries and over time (abeysinghe & white, ; leppin & aro, ; slovic, ; wagner-egger et al, ; wheaton et al, ) . the contentious issue of privacy presents as a far more salient and palatable target than dealing with the overwhelming lack of evidence for efficacy. on these issues there are now several open letters from scientists that are being communicated to the general public. while the sustained focus on data privacy concerns remains strong in the mainstream media, this negative issue will dominate public understanding of ctas and significantly restrain uptake. if the narrative can be drawn towards the potential benefits for everyone that come from a general loosening of restrictions to open schools and workplaces, then the success we have seen in compliance with the current lockdown may allow these people to accept the tradeoff and come out in favour of the cta. naturally, this won't be isolated from individuals' estimates of severity and susceptibility to the virus, and by extension, for those close to them. 
but if there is a sufficiently strong belief that severity and susceptibility is high in those close to oneself, even if the severity and/or susceptibility is low for themselves, then, just as we have seen with compliance to lockdown measures, compliance with state messaging on voluntarily using a cta may also be high. many proposed solutions, even the google/apple collaboration, focus very heavily on privacy and app distribution and make almost no mention regarding accuracy. despite best intentions, the levels of inaccuracy that arise in any data recording mean that any contact tracing, manual or digital, will always be incomplete (senga et al, ) . even when we have a significant proportion that do comply with contact tracing, we often still have poor data arising out of the methods employed to collect the data. the normal inaccuracies that occur in data recording and data entry are amplified with contact tracing because some people simply don't want to be traced, while others have limited socio-cultural understanding for why we are wanting to trace them (senga et al, ) . contact tracing represents an expensive win/lose situation. a very small group of university researchers and technology companies receive a large funding boost to develop the cta and deal with the data that it collects, and a large number of people involved in manual contact tracing win jobs. however, the overall community suffers more significant risks, and losses, when they choose to re-engage in normal behaviours under the false sense of hope that most cta are promoted as giving, and risk becoming infected and infecting their family, potentially leading to death. we are sceptical that any standalone contact tracing approach, manual or automated, could contain a high-prevalence highly contagious disease like covid- . this is primarily because the cta acts retrospectively. 
it advises the user they were previously in close contact with an infected, and in the case of covid- , this advice often comes only after they have already begun asymptomatically shedding the disease. the primary (third) solution we propose integrates the retrospective cta with symptom tracking and a bn, providing the user with a prospective view of the probability that they may have contracted covid- . in this way we increase cta utility for users. we believe that with increased utility uptake may be improved, as is the opportunity to collect useful data and identify actionable clinical knowledge to improve the response in future disease outbreaks. solutions like the one proposed here can have a very beneficial effect on containing the spread of infection. one final point that returns to the issue of consent that was raised earlier. for many proposing cta, the idea of using an app instead of just network tracing via the cellular network or other means is not drawn from many of the cited papers in this work, most apps will collect and transmit some subset of the following data fields: • mac address of your device's bluetooth or wi-fi chip • your phone number (or imei number if the device does not easily report the subscriber phone number) • the mac address of other people your phone sees (bluetooth handshakes with everything it sees that is also bluetooth, even when it doesn't know the device and has never been paired with it) • the time, date and in some cases, location data from your gps for each new interaction with another in-range device (accurate to about meters). a new interaction is when your device sees another device move into its broadcast area. note that in a corporate office the app might see the device of someone in the next room move into and out of range tens or hundreds of times over the course of a working day. 
• the bluetooth or device name of the smartphone that is running the app, and every other bluetooth device that crosses into its broadcast range. this last point can more easily enable re-identification as people often name their smartphone 'tim's iphone' or similar. as much about bluetooth being more accurate, it is about the idea of claiming to have informed consent: that by downloading the app and clicking through a privacy agreement they have received 'informed consent' to access and monitor an individual through their device. in studies evaluating the impact and effect of privacy policies and user agreements it was found that % are written in language unapproachable by most people (jensen et al, ) , % of participants do not even recall seeing the agreement while clicking through to install the app (good et al, ) , and only . % of more than , actually clicked or scrolled to view the policy (jensen et al, ) . most users have no idea what they have agreed to, and given that organisations change their policies and agreements regularly, whether the current version of the agreement is consistent with that which the media may have discussed when the cta was being rolled out. given these findings, is it ethical to consider that when users install and register the cta, the inclusion of a long privacy policy and user agreement that potentially more than half of the population will be unable to comprehend constitutes informed consent? in writing this paper we reviewed a large collection of topical works on covid- . most works were recent preprints proposing cta solutions for containment of the disease, while others presented the latest research and evaluation of the spread of the disease in our communities. 
while there is a focus in the literature on two issues, privacy and efficacy, the current media narrative for cta in many countries strongly emphasises perceived privacy risks, and in the uk especially, the risks some attach to the nhs decision to eschew the presumed leading solution: the apple/google collaboration. we also sought to simulate the operation of cta, and the results of our calculations appear largely in agreement with those of other groups just published. introduction of a new cta alone would not contain the disease, and the best-case sweet spot for uptake is beyond that which could conceivably be achieved. however, by providing people with an understanding not just retrospectively for whether they have been in contact with an infected previously, but also using a bayesian approach to proactively provide the probability that they might have the disease, we can increase the ctas utility to users while potentially improving the uptake and knowledge to be learned from use of the app. when combined with an effective communication strategy and sensible social distancing, we believe solutions like the one proposed here can have a very beneficial effect on containing the spread of this pandemic and reducing the need for draconian lockdown procedures. the avian influenza pandemic: discourses of risk, contagion and preparation in australia potential impact on estimated treatment effects of information lost to follow-up in randomised controlled trials (lost-it): systematic review data-based analysis, modelling and forecasting of the covid- outbreak discussion: the kermack-mckendrick epidemic threshold theorem contact tracing to control infectious disease: when enough is enough but the coronavirus stayed in his body for days assessing disease exposure risk with location histories and protecting privacy: a cryptographic approach in response to a global pandemic decentralized contact tracing using a dht and blind signatures. 
last accessed digital herd immunity and covid- can the covid- epidemic be controlled on the basis of daily test reports? arxiv australian government department of health: covidsafe app. last accessed probabilistic networks and expert systems the challenge of proximity apps for covid- contact tracing. electronic frontier foundation. last accessed use of a mobile application for ebola contact tracing and monitoring in northern sierra leone: a proof-of-concept study covid- : identifying and isolating asymptomatic people helped eliminate virus in italian village wetrace: a privacy preserving mobile covid- tracing approach and application decentralised privacy-preserving proximity tracing. last accessed rapid implementation of mobile technology for real-time epidemiology of covid- the serial interval of covid- from publicly reported confirmed cases new nhsx covid- contact tracing app doesn't work on two-year-old phones say isle of wight residents using it in trial. mail online, last accessed: th contact tracing strategies in heterogeneous populations lamb chop weight enforcers want access to australians' metadata. the guardian, last accessed risk assessment and decision analysis with bayesian networks sars, governance and the globalization of disease earthquake network -pilot investigation covid- in val seriana. last accessed th modelling the covid- epidemic and implementation of population-wide interventions in italy stopping spyware at the gate: a user study of privacy, notice and spyware digital contact tracing: protecting the nhs and saving lives clinical characteristics of coronavirus disease in china requests for access to telecommunications metadata under a of the tia. right to know. last accessed nd the uk won't use apple and google's coronavirus contact-tracing technology for its app, sparking privacy worries about how people's data will be used. 
business insider, last accessed mathematics of infectious diseases contain: privacy-oriented contact tracing protocols for epidemics feasibility of controlling covid- outbreaks by isolation of cases and contacts critical mass of android users crucial for nhs contact-tracing app. the guardian, last accessed following the breadcrumbs: timestamp pattern identification for cloud forensics clinical characteristics of asymptomatic infections with covid- screened among close contacts in nanjing, china clinical features of patients infected with novel coronavirus in wuhan sars epidemic and the disclosure of singapore nation guidance from the edpb and the cnil for gdpr-compliant covid- contact tracing. last accessed audit of a tuberculosis contact tracing clinic one month in: british public opinion on covid- majority of britons support government using mobile data for surveillance to tackle coronavirus crisis privacy policies as decision-making tools: an evaluation of online privacy notices covid- dashboard probabilistic graphical models: principles and techniques the impact of local heterogeneity on alternative control strategies for foot-and-mouth disease the effect of network mixing patterns on epidemic dynamics and the efficacy of disease contact tracing the effectiveness of contact tracing in emerging epidemics digital ariadne: citizen empowerment for epidemic control effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov- in different settings covid notions: towards formal definitions -and documented understandingof privacy goals and claimed protection in proximity-tracing services kff health tracking poll -late taiwan's carrot-and-stick approach to virus fight wins praise, but strains showing. 
reuters, last accessed risk perceptions related to sars and avian influenza: theoretical foundations of current empirical research a conceptual model for the coronavirus disease (covid- ) outbreak in wuhan, china with individual reaction and governmental action europe's pepp-pt covid- contacts tracing standard push could be squaring for a fight with apple and google israel passes emergency law to use mobile data for covid- contact tracing department of health: the covidsafe application privacy impact assessment a smartphone enabled approach to manage covid- lockdown and economic crisis private contact-tracing apps could ease coronavirus lockdown and get major businesses back to work by monitoring covid- spread in offices and alerting staff if they have been in contact with an infected colleague. daily mail. last accessed using 'big' metadata for criminal intelligence: understanding limitations and appropriate safeguards predicted by orwell: a discourse on the gradual shift in electronic surveillance law real-time online probabilistic medical computation using bayesian networks (no. ) coronavirus: nhs contact tracing app needs % take-up to be successful, expert warns. the independant. last accessed estimating the asymptomatic proportion of coronavirus disease (covid- ) cases on board the diamond princess cruise ship a mobile based tuberculosis contact tracing and screening system (doctoral dissertation using observational data to quantify bias of travellerderived covid- prevalence estimates in wuhan, china. the lancet. 
infectious diseases covid- evidence service, centre for evidence-based medicine (cebm) shadow on the land: syphilis probabilistic reasoning in intelligent systems: networks of plausible inference you are your metadata: identification and obfuscation of social media users using metadata information covid- mass testing facilities could end the epidemic rapidly biosecurity (human biosecurity emergency)(human coronavirus with pandemic potential)(emergency requirements -public health contact information) determination act. last accessed early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia privacy-preserving contact tracing of covid- patients mathematical epidemiology of infectious diseases: model building, analysis and interpretation application of sir epidemiological model: new trends what good digital contact tracing might look like. vox. last accessed contact tracing performance during the ebola virus disease outbreak in kenema district understanding privacy violations in big data systems recalling symptom episodes affects reports of immediatelyexperienced symptoms: inducing symptom suggestibility the perception of risk impact of contact tracing on sars-cov- transmission telecommunications (interception and access) act . last accessed an autonomous mobile system for the management of copd combining contact tracing with targeted indoor residual spraying significantly reduces dengue transmission exploiting causal functional relationships in bayesian network modelling for personalised healthcare coronavirus contact-tracing apps: most of us won't cooperate unless everyone does. the conversation. last accessed lay perceptions of collectives at the outbreak of the h n epidemic: heroes, villains and victims psychological predictors of anxiety in response to the h n (swine flu) pandemic hundreds of academics back privacy-friendly coronavirus contact tracing apps racgp releases covidsafe factsheet. 
royal australian college of general practitioners, last accessed how to return to normalcy: fast and comprehensive contact tracing of covid- through proximity sensing using mobile devices peer-to-peer contact tracing: development of a privacy-preserving smartphone app the authors acknowledge support from the epsrc under project ep/p / : pambayesian: patient managed decision-support using bayes networks key: cord- -ickp n authors: latsuzbaia, ardashel; herold, malte; bertemes, jean-paul; mossong, joël title: evolving social contact patterns during the covid- crisis in luxembourg date: - - journal: plos one doi: . /journal.pone. sha: doc_id: cord_uid: ickp n we conducted an internet survey using survey monkey over six weeks to evaluate the impact of the government interventions on social contact patterns in luxembourg. participants were recruited via the science.lu website on march , april , april , may during lockdown, and june and june after the lockdown to provide an estimate of their number of contacts within the previous hours. during the lockdown, a total of , survey participants with a mean age of . years reported , contacts (mean = . , iqr – ). the average number of contacts per day increased by % from . to . over the lockdown period. the average number of contacts decreased with age: . (iqr – ) for participants below years and . (iqr – ) for participants above years. residents of portuguese nationality reported a higher number of contacts (mean = . , iqr – ) than luxembourgish (mean = . , iqr – ) or other foreign residents, respectively. after lockdown, , participants reported , contacts with . (iqr – ) contacts per day on average, of which . % ( , / , ) occurred without a facemask (mean = . , iqr – ). while the number of social contacts was substantially lower during the lockdown by more than % compared to the pre-pandemic period, we observed a more recent % increase during the post lockdown period showing an increased potential for covid- spread. 
monitoring social contacts is an important indicator to estimate the possible impact of government interventions on social contacts and the covid- spread in the coming months. covid- has become a global public health emergency affecting more than countries and territories resulting in more than million reported cases by june and over , deaths [ ] . as of june , luxembourg reported , cases and deaths (fig ) [ ] . following the closure of schools, sports facilities, non-food shops, bars and restaurants on march , the luxembourg government declared a state of emergency on march implementing strict social distancing measures and instructing the local population to stay at home except for essential work and to avoid all unnecessary social interactions. although social gatherings were prohibited, people were free to go outside while maintaining a physical distance of two meters or more. five surgical masks were distributed to every resident on april - followed by the first easing of lockdown phase on april with reopening of construction sites and recycling centres. the second phase started with reopening of final grades of secondary schools on may and secondary schools one week later with reduced class size. wearing a facemask became mandatory in a public area if a two-metre distance could not be maintained. fifty additional surgical masks were distributed to every resident during the second easing of lockdown phase and gatherings of up to six people (plus household members) indoors and people outdoors were authorised. the third phase was initiated on th of may reopening elementary schools with reduced class size, restaurants (maximum of persons per table) and cafes (only table service) including mandatory face masks for staff and guests when not sitting at the table. covid- spreads via the respiratory route to close contacts and social contact patterns are therefore a key factor shaping the spread of covid- and other infectious agents in a population [ , ] . 
contact surveys are an important methodological approach to assess social mixing as well as the impact of control measures such as quarantine, travel restrictions or social distancing measures, or lockdown in general [ ] [ ] [ ] . previous work from the uk suggests that lockdown measures may have decreased the reproduction number from . to . [ ] . similarly, researchers from china have shown a significant decrease of the reproduction number below one following physical restriction measures [ ] . social contact patterns differ across european countries. according to the polymod study, luxembourg residents reported . social contacts per day before the pandemic, similar to numbers reported for italy ( . ). belgian, british and german residents, for example, reported . , . and . social contacts per day, respectively [ ] . additionally, luxembourg has a unique demographic structure with a foreign population of nearly %, hence this population requires specific communication strategies targeting local and foreign communities. we repeatedly conducted an internet survey to follow up the impact of the local government interventions on social contact patterns in luxembourg shortly after the lockdown was implemented due to the rapid local spread of covid- . in addition, our study provides insights on social contact patterns by age group and nationality, which can be important for identifying groups less compliant with imposed restrictions. recruitment of participants occurred via sharing of a survey link on the social media platforms facebook and twitter to followers and readers of the science.lu website following the publication of a general interest article on covid- [ ] . individuals were requested to fill in an online questionnaire to self-report their daily number of contacts. 
in the first survey wave, the survey collected the age category, the number of individuals living in the household other than the respondent, and the number of contacts within the last hours excluding members of the household. from april , we expanded the questionnaire by recording nationality and the location where most contacts occurred (s file). the post lockdown survey included an additional question to identify the number of contacts without wearing a facemask (s file). two more categories were added to identify the place of contact with a multiple-choice answer. a social contact was defined as a face-to-face conversation with more than three words at a distance of less than two meters. the total number of contacts was estimated by adding the reported number of contacts outside the household to the number of individuals living in the household. similarly, contacts without a facemask were calculated by adding the number of contacts without a facemask to the number of individuals living in the household (assuming participants from the same household do not wear a mask at home). the survey was purposely designed to have only a small number of questions (duration less than a minute) with available translations in three languages (german, french and english) to ensure high participation and completion. ethical approval for this study was waived by the luxembourg ministry of health and the national ethics committee for research. study participants were informed on how collected data was processed and utilized. the mean number of social contacts per person was calculated and stratified by age category, nationality, household size, location of most contacts and sampling week. the questionnaire answer " or more" was counted as contacts and " - " was averaged to . . similarly, in the post lockdown follow up survey, the questionnaire answer " or more" was counted as contacts. 
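the contact totals defined above can be sketched in a few lines of python. this is only an illustration of the stated rule (reported contacts outside the household plus the other household members); the class and field names are hypothetical and do not come from the survey files, and the recoding of open-ended answer categories is left out because the cut-off values are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SurveyResponse:
    """One questionnaire answer (field names are illustrative assumptions)."""
    household_others: int                      # people in the household other than the respondent
    contacts_outside: int                      # reported contacts outside the household
    contacts_outside_no_mask: Optional[int] = None  # only asked in the post lockdown survey


def total_contacts(r: SurveyResponse) -> int:
    """Total contacts = contacts outside the household + other household members."""
    return r.contacts_outside + r.household_others


def total_contacts_without_mask(r: SurveyResponse) -> int:
    """Contacts without a facemask, assuming household members are unmasked at home."""
    if r.contacts_outside_no_mask is None:
        raise ValueError("facemask question was not asked for this response")
    return r.contacts_outside_no_mask + r.household_others


if __name__ == "__main__":
    example = SurveyResponse(household_others=2, contacts_outside=3,
                             contacts_outside_no_mask=1)
    print(total_contacts(example), total_contacts_without_mask(example))
```

keeping the facemask field optional mirrors the fact that the question was only added in the post lockdown questionnaire.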
to estimate the number of contacts related to the place of occurrence, the total number of contacts was divided by the total number of places where contacts occurred, assuming that participants would have a similar number of contacts in each indicated place of contact. we compared the mean number of daily contacts reported by participants to the number of contacts from a large contact survey conducted in luxembourg before the pandemic between may and september [ ] . the total number of social contacts was adjusted to be representative of the population age structure (s table) . the age adjusted average number of reported contacts was calculated by multiplying the average number of contacts in each age group by the actual proportion of that age group from national population data [ ] . a poisson regression was performed to evaluate factors influencing the number of contacts. variables significantly associated with the number of social contacts in univariate regression were selected for the multivariable model. the language variable was excluded from the model due to significant correlation with nationality (r = . , p< . ). the statistical analysis was performed in stata (college station, texas usa). the effective reproduction number estimates and graph were downloaded from the epiforecasts platform (https://epiforecasts.io/) [ , ] . between march and may , a total of , (mean age . years) respondents participated in the online survey, of which . % were under years of age and . % of participants were over years of age (table ) . of , respondents reporting nationality, . % ( , / , ) were luxembourgish and . % ( / , ) were foreign residents ( table ) . the total number of reported contacts was , , while the average number of daily contacts was . ( % ci . - . , iqr - ). after adjusting for age structure, the average number of daily contacts was . . the average number of contacts reported by luxembourg residents in a study before the pandemic was . 
[ ] , suggesting that contacts during lockdown had decreased by . %. we observed a consistent decline across all age groups and household sizes. the mean number of reported contacts was significantly higher (p< . ) in young participants: . ( % ci . - . , iqr - ) reported by participants below years compared to . ( % ci . - . , iqr - ) for participants above years ( table ) . residents of portuguese nationality reported a significantly higher (p< . ) number of contacts (mean = . [ % ci . - . , iqr - ]) than luxembourgish residents (mean = . [ % ci . - . , iqr - ] or other foreign residents ( table ). the mean number of contacts was significantly higher (p< . ) for the survey when conducted in french language (mean = . [ %ci . - . , iqr - ]) compared to german (mean = . [ % ci . - . , iqr - ]) and english (mean = . [ %ci . - . , iqr - ]) language. we observed a significant variation of the average number contacts depending on the place where most contacts occurred. the highest number of contacts was reported for most contacts at work (mean = . [ % ci . - . , iqr - ]), while the lowest number of contacts was reported for most contacts during leisure (mean = . [ %ci . - . , ) and at the supermarket (mean = . [ %ci . - . , iqr - ]) ( table ). the average number of contacts reported at work increased by . % from . ( % ci . - . , iqr - ) to . ( % ci . - . , iqr - ), while average number of contacts during leisure activities increased by . % from . ( % ci . - . , iqr - ) to . ( % ci . - . , iqr - ) ( table and fig ) . the average number of contacts per day significantly increased (p< . ) by . % over the lockdown period from . ( %ci . - . , iqr - ) to . ( %ci . - . , iqr - ) ( table and fig ) . in the post lockdown period, , participants filled in the survey (mean age = . ) reporting , contacts. the average number of daily contacts significantly increased from . during the lockdown to . after lockdown ( % ci . - . , iqr - ) (fig ) (p< . ) . 
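The age adjustment described in the methods (each age group's mean weighted by that group's share of the national population) can be sketched as follows. The function name and dictionary layout are assumptions made for illustration. Dividing by the summed shares is an added safeguard for shares that do not sum exactly to one, and is not something the text specifies.

```python
from typing import Mapping


def age_adjusted_mean(group_means: Mapping[str, float],
                      population_share: Mapping[str, float]) -> float:
    """Weight each age group's mean contacts by its population share."""
    missing = set(group_means) - set(population_share)
    if missing:
        raise KeyError(f"no population share for age groups: {sorted(missing)}")
    total_share = sum(population_share[g] for g in group_means)
    if total_share <= 0:
        raise ValueError("population shares must sum to a positive value")
    weighted = sum(group_means[g] * population_share[g] for g in group_means)
    # Identical to the plain weighted sum when the shares sum to one.
    return weighted / total_share
```

When the sample over-represents some age groups, the adjusted mean moves toward the groups that are more common in the population than in the sample.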
after adjusting for age structure, the average number of daily contacts was . . the increase was consistent across all categories (table , figs and ). of the total number of contacts, . % ( , / , ) reported a contact without a facemask (mean = . , iqr - ). univariate poisson regression analysis showed that age above years, foreign nationality (other than portuguese), as well as english survey language were associated with a lower number of contacts (table ) . survey sampling week, portuguese nationality and french survey language were associated with higher number of contacts. in multivariable regression, age, foreign nationality (other than belgian) and calendar date remained significant predictors of the number of social contacts (table ). as shown in s fig, the effective reproduction number in luxembourg dropped below one shortly after lockdown (s fig) [ , ] and remained below one during the full lockdown period. the effective reproduction number increased to levels close to unity from the beginning of june onwards, although large confidence intervals were observed due to very low number of new daily cases. our study suggests that the strict physical distancing measures implemented in luxembourg had a substantial and immediate impact on social mixing patterns resulting in a large reduction of the average number of contacts per day. during the early lockdown period, survey participants reported . contacts on average, which is % lower than during the non-pandemic period [ ] . this decline was consistently observed across all age groups and household sizes. our study findings are similar to those from shanghai, wuhan and the uk showing %, % ( , ) and % reduction in the average number of daily contacts, respectively [ , ] . in these studies, the reduction of contacts was estimated to have resulted in a significant decrease of the basic reproduction number r below one [ ] . 
similar to these estimates, our results explain the rapid decline in sars-cov- transmission, also resulting in the rapid decline of covid- cases observed since the beginning of the lockdown in luxembourg. although in luxembourg the incidence of infections has been dropping to single figures by early may, further relaxing of physical distance restrictions poses significant risk for transmission. relaxing restrictions too early could lead to an earlier second wave leading to further tightening of restrictions [ , ] . between march and may , the average number of contacts increased by % associated with an increasing number of contacts at work and during leisure activities. this increase occurred after the easing of lockdown phase on april , when construction sites and recycling centers were reopened. in june, during the post lockdown period the number of average contacts increased to . , nevertheless it remains % lower than during the pre-pandemic period [ ] . this increase in social contacts most likely resulted in the increase of the reproduction number followed by growing covid- incidence that had been observed by the end of june. in addition, more than half of the contacts in the post lockdown period occurred without wearing a facemask increasing the transmission risk [ ] . our results suggest that older individuals are more compliant with restriction measures compared to younger persons, which is expected since the risk of hospitalization and death from covid- increases with age [ ] . similarly, results of a large survey study conducted by del fava et al. showed that participants older than years have a decreased number of contacts in belgium, france, germany, italy, netherlands, spain and united kingdom [ ] . residents of portuguese nationality had more daily contacts compared to other residents, which could be work related, as % of portuguese participants recorded contacts at work during lockdown. 
in luxembourg, portuguese residents represent the largest foreign community, accounting for  % of the total population, and appear to be at increased risk for transmission [ ] . direct communication with these foreign communities could help to ensure compliance with physical distancing measures. one limitation of our study is that we potentially overestimated the effect of lockdown due to selection bias. compliant individuals following strict restrictions might be more active on social media and thus more likely to come across the survey. secondly, we have to take into account that our study sample was not representative of the general population in terms of age structure and nationality: participants below  and above  years of age were underrepresented. similarly, participants of luxembourgish, french, belgian and german nationality were overrepresented, whereas portuguese residents were underrepresented. nevertheless, the adjusted mean number of contacts during lockdown was similar to the non-adjusted number of contacts. another limitation of our study is that the pre-pandemic survey conducted in  used a paper diary approach, and our online approach might lead to lower ascertainment of contact numbers. furthermore, the online survey does not account for multiple responses by a single respondent. we did not collect any further data on contacts (e.g. age), and thus were unable to construct an age-structured contact matrix or estimate the exact reduction of the basic reproduction number. on the other hand,  , respondents filled in the survey, representing over  % of the total population in luxembourg. in conclusion, our study shows that physical distance measures resulted in a significant reduction in social contacts and therefore decreased the spread of covid- in luxembourg.
monitoring social contact patterns using online surveys provides valuable early evidence of the effects of both lockdown and easing of lockdown measures on transmission and could be easily adapted for use in other countries and regions of the world.

situation update worldwide, as of
the contribution of social behaviour to the transmission of influenza a in a human population
social contacts and mixing patterns relevant to the spread of infectious diseases
estimating clinical severity of covid- from the transmission dynamics in wuhan, china
quantifying the impact of physical distance measures on the transmission of covid- in the uk
changes in contact patterns shape the dynamics of the covid- outbreak in china
how long will this situation continue in luxembourg?
statistics portal of the grand-duchy of luxembourg
improved inference of time-varying reproduction numbers during infectious disease outbreaks
estimating the time-varying reproduction number of sars-cov- using national and subnational case counts
covid- : extending or relaxing distancing control measures. the lancet public health
the effect of control strategies to reduce social mixing on outcomes of the covid- epidemic in wuhan, china: a modelling study. the lancet public health
to mask or not to mask: modeling the potential for face mask use by the general public to curtail the covid- pandemic. infectious disease modelling
the differential impact of physical distancing strategies on social contacts relevant for the spread of covid-

data curation: ardashel latsuzbaia, jean-paul bertemes.

key: cord- - pwqa zk authors: shetty, sameep s; merchant, yash; shabadi, nikita; aljunid, sharifah tahirah title: “c” in covid date: - - journal: oral surg doi: . /ors. sha: doc_id: cord_uid: pwqa zk world war “c”( ) has set in against an invisible virus.
the routes of transmission include contact with contaminated objects and circulating droplets in the air, called aerosols, disseminated through cough, sneeze and ocular secretions ( ) from an infected individual. the outbreak response steps are based on the "c" principle:
- contact tracing: data analytics using smart apps
- contact the suspects, test them, containment when in doubt
- contact and treat positive patients
- constant proactive steps: hand hygiene, sanitization, sensitization and awareness of the general population, social distancing, deep breathing exercises to reboot the immune system.
the cataclysm of covid is due to the cytokine storm in the lungs that leads to acute respiratory distress, high contagiousness, reverse transmission, temporal patterns of shedding, high reproductive number (r : . - . ), stochasticities in the initial phase of the outbreak, low surveillance intensity, carnal origin and its constant transformation into the human host of all age groups . the contrast in the degree of hypoxia ("silent hypoxia") and the pattern of infections in high-risk individuals who are unable to mount a stable immune response with modest symptoms explains the lethal spectrum of the novel coronavirus. the who has warned against countries issuing immunity passports, perhaps due to the fear of a second wave of the pandemic and insufficient evidence of an immune shield post-treatment. collapsing economy and health care system: we are now in the nascent phase of emolument and remuneration reduction, which may deteriorate to loss of employment across all sectors. this could further leap into the next phase and herald a tsunami of the economy if left unchecked.
the 'burn out' of health care workers as they multiply their efforts to combat covid and the increased risk to succumb to infection may accelerate the progression and death of non-covid ailments. the overlooked "c": immuno-compromised oncology patients are at a relatively higher risk. the dynamics of sars-cov is indeed altered by the atomic-level liaison between the spike protein receptor-binding domain (rbd) and the host receptor ace which is overexpressed in tobacco consumers. there are several reports that covid- can directly result in many cardiovascular complications, including fulminant myocarditis, myocardial injury, heart failure, and arrhythmia. the global shortage of drugs due to the pandemic, together with repurposing common drugs to test their efficacy against the novel coronavirus, has adverse effects and lacks evidence. in contrast, initial data on the role of ace inhibitors augmenting the onset of severe forms of sars-cov- infection has discouraged their use and triggered the onset of severe cardiovascular events. children and carrying mothers: the nascent immune system in children may explain the unique spectrum with infants and young children ≤ years more likely to succumb to severe clinical symptoms than older children (ie, ≥ years) . the decline in the immune activity during pregnancy can increase the risk and could lead to maternal morbidity or death. face contact) or the onset of infection post  days of contact to a known infected person (the average maximum incubation period) are low and should not be over-exaggerated . the rising infodemic and stigmatization around covid yield panic and preclude the implementation of epidemic control measures. flattening or crushing the curve is not anticipated to change the area under the curve .
covid in high risk individuals display severe respiratory symptoms, multi organ involvement and may require a long term rehabilitation as in a chronic disease. as we foresee this pandemic to prolong and maybe even become endemic, revamping our immune system, adopting a robust health care system and public health measures at the individual, community, national and global levels is the need of the hour. vicissitudes in oncological care during covid sars-cov- isolation from ocular secretions of a patient with covid- in italy with prolonged viral rna detection risk factors of critical & mortal covid- cases: a systematic literature review and meta-analysis accepted article this article is protected by copyright. all rights reserved cancer patients in sars-cov- infection: a nationwide analysis in china. the lancet oncology the novel coronavirus disease (covid- ) threat for patients with cardiovascular disease and cancer coronavirus fulminant myocarditis saved with glucocorticoid and human immunoglobulin use of hydroxychloroquine and chloroquine during the covid- pandemic: what every clinician should know position statement of the esc council on hypertension on ace-inhibitors and angiotensin receptor blockers epidemiological characteristics of pediatric patients with coronavirus disease in china potential maternal and infant outcomes from (wuhan) coronavirus -ncov infecting pregnant women: lessons from sars, mers, and other human coronavirus infections c-reactive protein levels in the early stage of covid- early reflection on the global impact of covid , and implications for physiotherapy taking the right measures to control covid- . 
the lancet infectious diseases the oral surgery response to coronavirus disease (covid- ) manifestations and prognosis of gastrointestinal and liver involvement in patients with covid- : a systematic review and meta-analysis key: cord- -gtkx r a authors: lapolla, pierfrancesco; lee, regent title: privacy versus safety in contact-tracing apps for coronavirus disease date: - - journal: digit health doi: . / sha: doc_id: cord_uid: gtkx r a nan with a view to a gradual exit from lockdown, governments around the world are considering deploying contact-tracing apps to prevent or manage a second wave of coronavirus disease . through smartphones, contact-tracing apps can identify people who may have come in contact with an infected person. based on bluetooth low energy (ble) and with optional geo-localisation (gps), this technology can track people's movements. when an infected subject is close enough to another person, the latter becomes a potential infected case who can be contacted and tracked. the aim is to isolate the potentially infected cases to reduce the spread of covid- . many concerns arise over efficacy, privacy issues and data management by governments or health authorities. another crucial concern is the cybersecurity of the appsupporting infrastructures which can be exposed to third-party attacks. a simulation on one million people found that % of smartphone users in the uk ( % of the general population) would need to install a contact-tracing app to suppress the epidemic effectively. a survey run in five countries with more than potential app users suggested that lower numbers would install a similar app ( . % of users in the uk, and . - . % in france, germany, italy and the usa). in singapore, the first country to deploy a voluntary contact-tracing app (tracetogether), launched in march, only an estimated % of the population installed the app. after a spike in new cases in april, the city-state introduced a lockdown named 'circuit breaker'. 
regarding the digital alphabetisation aspect, data for the italian population in show that only . % of the - age group of internet users have high digital skills; the majority of internet users have low ( . %) or basic ( . %) skills. not only that, but it is more the younger population in italy who have smartphones rather than mobiles. currently, various different frameworks have been developed to build contact tracing, such as open frameworks (ga-pptp, dp- t, blue trace, tcn) or private and controlled (pepp-pt). the nature of implementation may be open source (dp- t, blue trace, tcn) or private (pepp-pt, ga-pptp), and the control-based network can be decentralised or centralised proximity data. on april, a letter signed by nearly academics warned that centralised systems can risk surveillance, and suggested that apple and google (currently working jointly in developing a contact-tracing app) should consider developing one which uses an opt-in and decentralised system. countries such as china, singapore and colombia have officially adopted contact-tracing apps. , controversies arise over app security issues and data breach (as was the case in the netherlands), especially for apps including geo-localisation (such as the ones deployed in norway and israel). , contact-tracing apps might be an effective way of controlling the pandemic through the next phases. however, in order to be effective, contact tracing must be supported not only by solid technology, capable of minimising the risk of attacks, but also by a system offering safe communication with appropriate authorities. therefore, concerted pan-european efforts to resolve concerns over the privacy implications will be essential in the development of successful covid- contact-tracing apps. 
quantifying sars-cov- transmission suggests epidemic control with digital contact tracing digital contact tracing can slow or even stop coronavirus transmission and ease us out of lockdown digital contact tracing will fail unless privacy is respected, experts warn. the guardian report: internet use is increasing but . % of internet users have low digital skills sign the contact tracing joint statement china launches coronavirus 'close contact detector' app help speed up contact tracing with tracetogether data leak within hours with possible corona app covid alert together we can fight coronavirus -download the smittestopp app coronavirus: israeli court bans lawless contact tracing key: cord- -rgzmpoxv authors: keeling, matt j; hollingsworth, t. deirdre; read, jonathan m title: the efficacy of contact tracing for the containment of the novel coronavirus (covid- ). date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: rgzmpoxv contact tracing is a central public health response to infectious disease outbreaks, especially in the early stages of an outbreak when specific treatments are limited. importation of novel coronavirus (covid- ) from china and elsewhere into the united kingdom highlights the need to understand the impact of contact tracing as a control measure. using detailed survey information on social encounters coupled to predictive models, we investigate the likely efficacy of the current uk definition of a close contact (within meters for minutes or more) and the distribution of secondary cases that may go untraced. taking recent estimates for covid- transmission, we show that less than in cases will generate any subsequent untraced cases, although this comes at a high logistical burden with an average of . individuals ( th percentiles - ) traced per case. 
changes to the definition of a close contact can reduce this burden, but with increased risk of untraced cases; we estimate that any definition where close contact requires more than  hours of contact is likely to lead to uncontrolled spread. contact tracing is the main public health response to importations of rare or emerging infectious diseases, and was implemented in the uk during the 'containment stage' of the  influenza pandemic (mclean et al ) . in more recent years, contact tracing was also a valuable tool following importation of ebola virus disease into the uk in  (crook et al ) and the cases of monkeypox in the uk in  (vaughan et al ). in general, contact tracing is a highly effective and robust strategy given sufficient resources. the main advantages are that it can identify potentially infected individuals before severe symptoms emerge, and if conducted sufficiently quickly can prevent onward transmission from the secondary cases.
contact tracing has proved hugely successful in the treatment of sexually transmitted infections, where the definition of a contact is relatively straightforward, where infection is often asymptomatic and where the time-scales of transmission are slow (hogben et al , rönn et al . in contrast, the use of contact tracing for novel invading pathogens has received less quantitative consideration, in part due to greater uncertainties over social contact structure (although see ahmed et al , hoang et al . modelling studies have often focused on quantifying the importance of pre-symptomatic and pretracing infectiousness, but are usually based on statistical distributions of contact networks (fraser et al , kwok et al . here we leverage detailed social network data from the uk to model both transmission and the act of tracing, and identify the implications of contact tracing for containment of a novel pathogen, using parameters for the novel coronavirus (covid- ) (read et al , li et al . we characterised contact patterns in the uk using a postal and online cross-sectional survey, which asked participants to report the number of social encounters with unique individuals during a given day, as well as the duration and typical frequency of those encounters (danon et al (danon et al , . in total, , respondents reported more than , encounters -one of the biggest studies of its kind to date. the encounter patterns of this study were in good qualitative agreement with other similar studies of social interactions (mossong et al , isella et al . in this study, the daily encounter data was first extrapolated to generate a pattern of contacts over a day period (replicating random encounters and increasing the total duration associated regular contacts), to act as the basis for transmission and contact tracing simulations. using this extrapolated data, we can classify interactions into those which satisfy the definition of a close contact for the purpose of contact tracing. 
from our social encounter data we can also distinguish interactions with people who could be later identified and traced, from those with unidentifiable strangers (schematic figure ). we assume that all contact of longer than hour or repeated contacts can be identified and traced, whereas shorter meetings with people for the first time are strangers who are unidentifiable. the second element of the simulation is to determine who gets infected from a source case chosen representatively from the survey respondents. this transmission process is stochastic, accounting for both the time spent with each contact and the infectivity on each day (see appendix). taken together these two predictions allow us to bound the efficacy of contact tracing. one of the most notable features of human social contacts is the huge variability in the number and strength of contacts -which is reflected as variation in both the number of secondary cases and the number of individuals that match the contact-tracing definition (figure ). using preliminary estimates of covid- transmission (average latent period days, average effective infectious period . days, r = . and assuming a simple seir formulation (read et al )) we compute the distribution of epidemiological, social and contact tracing characteristics across the population. extrapolating the data from the social contact survey suggests that the average number of contacts over a day period is , although the distribution is significantly over dispersed (with a median of and around % of individuals having > , total contacts). of these total encounters, an average of contacts ( %) meet the definition of a close-contact (in contact for > minutes, phe ) and of these close-contacts we predict an average of ( %) to be individuals known to the infected case that can be traced. 
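The two classification rules above (does an encounter meet the close-contact definition, and can the contact be identified afterwards) can be sketched as below. This is an illustrative sketch, not the authors' simulation code: the `Encounter` record is hypothetical, and both thresholds are left as required parameters because their numeric values are not legible in the text. Treating the close-contact threshold as inclusive is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """One contact of a survey respondent (hypothetical record)."""
    total_minutes: float  # cumulative contact time over the tracing window
    repeated: bool        # True if this contact is met on more than one occasion


def is_close_contact(e: Encounter, close_threshold_min: float) -> bool:
    # Meets the close-contact definition: enough cumulative contact time.
    return e.total_minutes >= close_threshold_min


def is_identifiable(e: Encounter, identifiable_threshold_min: float) -> bool:
    # Repeated contacts, or sufficiently long ones, are assumed traceable;
    # short first-time meetings are treated as unidentifiable strangers.
    return e.repeated or e.total_minutes > identifiable_threshold_min


def classify(e: Encounter, close_threshold_min: float,
             identifiable_threshold_min: float) -> str:
    if not is_close_contact(e, close_threshold_min):
        return "not_close"
    if is_identifiable(e, identifiable_threshold_min):
        return "close_traceable"
    return "close_untraceable"
```

Keeping the two thresholds separate makes it straightforward to vary the close-contact definition while holding the identifiability assumption fixed.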
therefore, simply considering social contacts, it is clear that there are very many short duration contacts which do not meet the definition of a close contact, and although unlikely to become infected may pose a risk due to their greater abundance. given that the risk of infection increases with duration of contact, the distribution of cases effectively represents a biased sample of all contacts. as expected, given the model assumptions, the expected number of total secondary cases agrees with the assumed r (mean= . , median= , and th percentiles - ). given that these cases are most likely to be those contacts of the longest duration, we predict that % of cases match the definition of a close contact. however, not all of these contacts will be identifiable; assuming that all repeated contacts and contact of longer than hour can be traced, we predict that % of all cases meet the definition and can be identified. however, because of the extreme heterogeneity in contacts between individuals and the stochastic nature of transmission, we would still expect % of all primary cases to generate at least one secondary case that cannot be identified. aggregating across all individuals and under the optimistic assumption that all the contact tracing can be performed rapidly, we expect contact tracing to reduce the basic reproductive ratio from . to . -enabling the outbreak to be contained (figure ). . cc-by-nc-nd . international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint . rapid and effective contact tracing can therefore be highly effective in the early control of covid- , but places substantial demands on the local public-health authorities. each new case requires an average of individuals to be traced, with . % of cases having more than close traceable contacts (figure ). 
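One way to read the quantities reported here: if every traced contact is contained in time (the optimistic assumption stated above), onward spread comes only from secondary cases that tracing misses. The sketch below computes, for a single index case, the expected number of untraced secondary cases and the probability of at least one. It assumes infections of different contacts are independent and that each contact's infection probability is already known. The function names are illustrative and this is not the authors' model.

```python
import math
from typing import Iterable, Sequence, Tuple

Contact = Tuple[float, bool]  # (probability of infection, was the contact traced?)


def expected_untraced_cases(contacts: Iterable[Contact]) -> float:
    # Expected number of secondary cases among contacts that were not traced.
    return sum(p for p, traced in contacts if not traced)


def prob_any_untraced_case(contacts: Iterable[Contact]) -> float:
    # 1 - P(no untraced contact is infected), assuming independence.
    return 1.0 - math.prod(1.0 - p for p, traced in contacts if not traced)


def reproduction_ratio_with_tracing(cases: Sequence[Sequence[Contact]]) -> float:
    # Average expected untraced secondary cases over a set of index cases.
    if not cases:
        raise ValueError("need at least one index case")
    return sum(expected_untraced_cases(c) for c in cases) / len(cases)
```

An outbreak is expected to be containable when the value returned by `reproduction_ratio_with_tracing` is below one.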
we therefore consider the implications of changing the definition of a close contact. clearly a more strict definition of a close contact (requiring more contact time) reduces the burden on the health services as fewer contacts need to be traced, but also increases the risk of cases being missed. figure  provides a quantitative assessment of changes to the close contact definition. definitions requiring more than four hours of contact are unlikely to control an outbreak as the expected number of untraced secondary cases is greater than one. this therefore places a strict bound on the level of contact tracing required. definitions shorter than  hour have relatively little impact on the mean number of untraced cases (figure b), but do reduce the probability that some untraced contacts occur. throughout we have used a value of r that represents a population-level average once the local infection has become established. however, the first invasion into any new population or social setting generally has a larger expected number of secondary cases. the first invader enters a completely susceptible population; moreover all their close contacts (eg family members) are susceptible. in contrast, due to clustering of contacts, most secondary cases will be in a landscape with a depleted number of susceptibles, as close contacts such as family members will already have been exposed to the primary case. this susceptible depletion in the local social network may help to explain the change in r t over time reported for covid- (yang et al ). we therefore consider the impact of different values of the initial reproductive ratio (figure ), which could capture this social aspect, or could represent heterogeneity between individuals in the amount of virus shed, or could inform about innate differences in behaviour between china and the uk.
given the strong biasing of transmission towards long-duration contacts, the impact of varying the initial reproductive ratio is less extreme than might be expected; it is only for the highest values of initial reproductive ratio simulated (> . ) that contact tracing fails to find more than one case such that infection can escape. mathematical models have an important role to play in preparedness for novel infectious diseases, allowing policy makers to plan for potential public health scenarios before they arise. however, in such scenarios reliable data is often limited, so predictions of long term dynamics are generally associated with wide confidence intervals. in contrast, while short term predictions are subject to greater stochasticity, the distribution of possible behaviours can be readily captured. here we have investigated contact tracing of a close-contact pathogen, using novel coronavirus (covid- ) as the example, and considered the efficacy of contact tracing as a control measure. this work combines a detailed survey of social encounters with bespoke mathematical modelling of the transmission and tracing processes. given the huge heterogeneities present in social encounters (both in terms of duration and number), mathematical models are vital to interpret the interplay between a low number of high-risk encounters (e.g., household members) and the high number of low-risk, less-identifiable encounters (e.g., commuters or retail customers). the uk currently defines a close contact as  minutes within  meters over two weeks before detection (phe ) . under this definition, there are unlikely to be many untraced secondary cases, although the burden of tracing could be large. relaxing the definition of a contact (such that longer contact durations are needed) lessens this burden but at the greater risk of undetected cases ( figure ) .
surprisingly, small changes to the reproductive ratio, within the bounds estimated from early data ( figure ), or even changes to the distribution of infectivity, are predicted to have a relatively modest impact on the success of contact tracing, illustrating the robustness of this control measure. our model has addressed the simple and optimistic question of whether contact tracing is sufficient to identify secondary infections. the public health implications of this tracing are more complex, and depend on the relative timing of events and the treatment of identified contacts. for contact tracing to be an effective public health measure, secondary cases must be discovered before they become infectious; hence the time from the primary case becoming infectious to the tracing of their contacts needs to be shorter than the incubation period. longer time scales would allow tertiary cases to be infected and would snowball the tracing process. in addition, those contacts that are traced need to be either effectively screened for infection and quarantined or otherwise isolated so that they do not pose a risk to others. therefore, while contact tracing has the potential to control covid- (and other close-contact pathogens) the ultimate success relies on the speed and efficacy with which suspect contacts can be contained.
with contacts positioned by their total contact duration. here, the definition of a contact is someone whom the index case encountered for  minutes or longer. some contacts will be identifiable (green), while others will be unidentifiable (orange). a definition of contact that is too restrictive and inappropriate for the infection means some encounters may fail to meet the definition yet may be at risk of infection; these excluded contacts could be identifiable (light grey) or unidentifiable (orange). (b) examples of ego-centric networks collected by the survey.
the participant (ego) is the blue central triangle; circles represent individual contacts, squares represent groups of contacts (size of group indicated). colours represent social settings of encounters (red=home, cyan=work/school, yellow=travel, pink=other). larger symbol sizes represent longer contact durations, while a closer proximity to the ego indicates the contact is more frequently encountered.
the copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license.
figure . impact of different assumptions for the definition of a close-contact. (a) the total number of contacts traced (b) the number of secondary contacts that are not traced (c) the probability that at least one secondary case is not traced. for (a) and (b) the crosses mark the mean value, boxes contain the th percentiles while bars contain the th percentiles, colours correspond to those in figure a.
figure . impact of different values for the initial reproduction number of the primary case; changing this does not affect the number of contacts traced. (a) the number of secondary contacts that are not traced (b) the probability that at least one secondary case is not traced. for (a) the crosses mark the mean value, boxes contain the th percentiles while bars contain the th percentiles, colours correspond to those in figure a. distributions are across all respondents to the survey and across stochastic realisations. (based on an seir model with latent period , infectious period . , r = . ).
effectiveness of workplace social distancing measures in reducing influenza transmission: a systematic review. lack of secondary transmission of ebola virus from healthcare worker to contacts. social encounter networks: collective properties and disease transmission. social encounter networks: characterizing great britain. factors that make an infectious disease outbreak controllable. a systematic review of social contact surveys to inform transmission models of close-contact infections. what's in a crowd? analysis of face-to-face behavioral networks. epidemic models of contact tracing: systematic review of transmission studies of severe acute respiratory syndrome and middle east respiratory syndrome. computational and structural biotechnology journal. early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia. pandemic (h n ) influenza in the uk: clinical and epidemiological findings from the first few hundred (ff ) cases. epidemiology & infection. social contacts and mixing patterns relevant to the spread of infectious diseases. novel coronavirus -ncov: early estimation of epidemiological parameters and epidemic predictions.
from the contact tracing data, we extrapolate to estimate the duration of contact $T_{i,j}^{d}$ between individual i (the respondent) and individual j (the contact) on day d. we then define a close contact of i as all contacts j for which $\sum_{d=1}^{D} T_{i,j}^{d} \geq T$, where in the uk, we have defined the total contact time T as minutes over a duration D of two weeks before detection and isolation of individual i (phe ). the probability of transmission to individual j from individual i is then calculated from these contact durations and $\beta_{d}$, where $\beta_{d}$ is an estimate of the transmission rate from individual i on day d.
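the close-contact rule described above can be illustrated with a minimal sketch. the threshold value, the example durations, and the function names are hypothetical, and the exponential form used for the transmission probability is an assumption made for illustration only; the source does not state the formula the authors used.

```python
from math import exp


def is_close_contact(daily_minutes, threshold_minutes):
    """Return True when total contact time over the tracing window
    reaches the threshold (the close-contact rule in the text)."""
    return sum(daily_minutes) >= threshold_minutes


def transmission_probability(daily_minutes, daily_rates):
    """ASSUMED form, not taken from the source:
    p = 1 - exp(-sum_d rate_d * minutes_d)."""
    exposure = sum(r * m for r, m in zip(daily_rates, daily_minutes))
    return 1.0 - exp(-exposure)


# Hypothetical contact: minutes spent with the index case on four days.
minutes = [5, 0, 10, 20]
# Hypothetical per-minute transmission rates for the same four days.
rates = [0.01, 0.01, 0.01, 0.01]

traced = is_close_contact(minutes, threshold_minutes=30)
risk = transmission_probability(minutes, rates)
```

raising `threshold_minutes` traces fewer contacts but leaves more exposed contacts outside the definition, which is the trade-off the figures describe.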
key: cord- -ocp yodg authors: swaan, corien m; appels, rolf; kretzschmar, mirjam ee; van steenbergen, jim e title: timeliness of contact tracing among flight passengers for influenza a/h n date: - - journal: bmc infect dis doi: . / - - - sha: doc_id: cord_uid: ocp yodg background: during the initial containment phase of influenza a/h n , close contacts of cases were traced to provide antiviral prophylaxis within h after exposure and to alert them on signs of disease for early diagnosis and treatment. passengers seated on the same row, two rows in front or behind a patient infectious for influenza, during a flight of ≥ h were considered close contacts. this study evaluates the timeliness of flight-contact tracing (ct) as performed following national and international ct requests addressed to the center of infectious disease control (cib/rivm), and implemented by the municipal health services of schiphol airport. methods: elapsed days between date of flight arrival and the date passenger lists became available (contact details identified - ci) was used as proxy for timeliness of ct. in a retrospective study, dates of flight arrival, onset of illness, laboratory diagnosis, ct request and identification of contacts details through passenger lists, following ct requests to the rivm for flights landed at schiphol airport were collected and analyzed. results: requests for ct were identified. three of these were declined as over days had elapsed since flight arrival. in out of requests, contact details were obtained within days after arrival ( %). the average delay between arrival and ci was , days (range - ), mainly caused by delay in diagnosis of the index patient after arrival ( , days). in four flights ( %), contacts were not identified or only after > days. ci involving dutch airlines was faster than non-dutch airlines (p < , ). passenger locator cards did not improve timeliness of ci. in only three flights contact details were identified within days after arrival. 
conclusion: ct for influenza a/h n among flight passengers was not successful for timely provision of prophylaxis. ct had little additional value for alerting passengers to disease symptoms, as this information already was provided during and after the flight. public health authorities should take into account patient delays in seeking medical advice and laboratory confirmation in relation to the maximum time to provide postexposure prophylaxis when deciding to install contact tracing measures. international standardization of ct guidelines is recommended. aircraft can function as a transport vehicle for patients infected with influenza, leading to introduction of a new virus strain to non-endemic areas [ , ] . although the risk is small, passengers might be infected by a contagious patient during the flight [ ] [ ] [ ] [ ] , as well as during public transport [ ] . transmission during the flight increases the possibility of further transmission in the area of destination. for these reasons, during the initial phase of the influenza a/h n pandemic, many countries initiated contact tracing among flight passengers of flights where contagious patients with laboratory confirmed influenza a/h n were notified. a risk assessment guideline for infectious diseases transmitted on aircraft has been developed by the european centre for disease prevention and control (ecdc) [ ] , which includes influenza. literature study revealed on-board transmission in flights with a duration of less than h. the majority of infected contacts during these flights were seated on the same row, or one or two rows in front of or behind the index [ ] [ ] [ ] [ ] . contacts up to and rows distance from the index were infected in one study [ ] . as these contacts also had personal contact with the index during the flight, transmission across a distance of so many rows is not proven. the guideline concludes that it is difficult to design a single contact tracing algorithm for influenza.
due to the short incubation period of influenza, it is almost impossible to provide contacts with postexposure prophylaxis (pep) within the time that it is most effective, which is h after exposure [ ] . therefore, the main aim of contact tracing might be to interrupt the chain of transmission by alerting contacts for early diagnosis and treatment. although the world health organization (who) developed technical advice for case management of influenza a/h n in air transport during the pandemic [ ] , no international standardized protocol for contact tracing for this pathogen was available. in line with the ecdc guideline [ ] and the dutch guideline for 'incidental introduction of a new influenza strain' [ ] , in the netherlands close contacts of a patient with laboratory confirmed pandemic influenza were identified. in case the index had been contagious during a flight with a duration of ≥ h, passengers and cabin crew were to be informed on signs and symptoms of the disease and to seek medical care in case they would occur. in addition, close contacts, defined as passengers seated on the same row, two rows in front and two rows behind the index case, as well as the cabin crew working in this compartment, were traced by public health authorities to provide a day prophylactic course of oseltamivir as soon as possible (preferably within h after exposure). schiphol airport is the only airport in the netherlands where trans-atlantic flights arrive. its municipal health services (mhs, ggd kennemerland) and the center for infectious disease control (cib-rivm) frequently experienced that, despite all efforts, the time period elapsing from exposure to administration of the first oseltamivir dose exceeded the required h. acquiring contact details from airlines was time consuming, and contact details on passenger lists were often minimal, so that contacts were difficult to trace. 
in this study, we assess the time delay in contact tracing of flight passengers for influenza a/h n as performed in the netherlands during the initial phase of the pandemic. our data show that despite all efforts the effectiveness of this control measure in daily practice is minimal. from april th until june nd , contact tracing among flight passengers in the netherlands was indicated for laboratory confirmed influenza a/h n cases, who traveled on a flight for h or longer while being contagious, defined as day before, until days after disease onset. these criteria were installed by the cib, which also functions as national focal point (nfp). the procedure for contact tracing is complex, see figure . requests for contact tracing to the cib for dutch index patients originate from any dutch mhs which identifies a patient who traveled by plane while being contagious for an infectious disease which requires contact tracing. other nation's health authorities will make a request to the cib in case they diagnosed a patient which arrived at schiphol airport for transit while being infectious. requests for ct in the last group are submitted to the national focal point (nfp) or through the early warning and response system of the eu (ewrs). the cib verifies laboratory confirmation, and the indication for contact tracing regarding flight duration. the mhs of the airport where the specific flight landed coordinates contact tracing for flight passengers. in case of schiphol, mhs kennemerland approaches the involved airline company requesting the passenger list. the airline provides passenger lists with at least passenger names, seat numbers and booking or contact details. mhs kennemerland then completes contact details through booking offices or using other search methods. close contacts living in the netherlands are traced by the respective dutch mhs's. 
for tracing foreign contacts, the cib sends a notification with contact details to the nfp of the country of final destination, or through the ewrs system for eu countries. during the pandemic, ct requests were turned down if more than days had elapsed after flight arrival, as contact tracing was not considered to have additional value. during the study period, passenger locator cards (plc) only were used on direct flights from mexico during the initial phase of the pandemic. these flights were all run by dutch airlines. for each contact investigation performed in the period april th until june nd , the following data were collected: flight arrival date, first day of illness of index patient, date of laboratory diagnosis, date of contact tracing request and the date passenger lists were obtained and contact details were completed ('contacts details identified'). from these data, time intervals (in days) between flight arrival and date of diagnosis (interval i), between diagnosis and request dates (interval ii) and between request and contact details identified dates (interval iii) were calculated, see figure . date of actual contact tracing and oseltamivir administration was not available in this study, but is inherently always hours if not days later. as the airline company traces contacts amongst crewmembers, these are not included in this study. data were analyzed using spss software (version , usa). the influence of availability of plc's on timeliness and the origin of the airline company (dutch or non-dutch) were statistically analyzed. in the period april th until june nd , indications for ct were identified. three international requests concerning ct for influenza patients diagnosed outside the netherlands were declined as already more than days had elapsed since flight arrival. in out of the remaining contact investigations, passenger lists with contact details were obtained within days after arrival ( %), see table . 
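the three intervals defined above can be computed directly from the recorded dates. the following is a minimal sketch; the function name, dictionary keys, and example dates are hypothetical and are not taken from the study data.

```python
from datetime import date


def tracing_intervals(arrival, diagnosis, request, contacts_identified):
    """Return the day counts for the three intervals described in the text.

    interval i:   flight arrival -> laboratory diagnosis
    interval ii:  diagnosis -> contact tracing request
    interval iii: request -> contact details identified
    """
    interval_i = (diagnosis - arrival).days
    interval_ii = (request - diagnosis).days
    interval_iii = (contacts_identified - request).days
    return {
        "interval_i": interval_i,
        "interval_ii": interval_ii,
        "interval_iii": interval_iii,
        "total": interval_i + interval_ii + interval_iii,
    }


# Made-up dates for a single contact investigation (illustration only).
example = tracing_intervals(
    arrival=date(2001, 3, 1),
    diagnosis=date(2001, 3, 4),
    request=date(2001, 3, 5),
    contacts_identified=date(2001, 3, 7),
)
```

because the inputs are whole dates, each interval is resolved only to the day, which matches the resolution of the data described in the study.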
in total contact details of close contacts were identified, of which contacts lived in the netherlands, and contacts abroad. the average number of close contacts per flight was (range: - ). in four contact investigations ( %), contact details were not obtained, or provided later than days after flight arrival and ct was stopped. these ct were all related to non-dutch airlines, and total delay was stated days for further data processing.
table footnotes: *: in the beginning of the pandemic one request for contact tracing was accepted after days. **: these late ct requests were accepted as the passenger lists of the concerned flights already were available from earlier contact investigations. ***: date of diagnosis not known. ª: passenger locator card.
of the requests, the total delay between request and contact detail identification was longer for non-dutch airlines (mean , sd , ) compared with dutch airlines (mean . days, sd , ) ( -sided mann-whitney test, p = , ). for the completed contact investigations, interval i was the largest interval in the contact tracing procedure (mean , days, range - , % ci , - , , n = ). the other intervals ii and iii were shorter, with a mean of , days and , days respectively, see table . figure shows the medians of the described intervals. since / index cases were already ill before, or during the day of arrival of the flight, the delay in interval i is mainly caused by delay in seeking medical advice and the diagnostic procedure itself. after acceptance of the request for ct by the cib, ggd kennemerland needed on average , days (range - , % ci , - , days) to collect the passenger list from the airlines and complete contact details (interval iii). the total delay between flight arrival and identification of contact details was on average , days (range - days, % confidence interval , - , days), see table . in only out of contact investigations ( %), contacts were identified within days after arrival.
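the comparison of delays between dutch and non-dutch airlines relies on a mann-whitney test. the sketch below computes only the u statistic by direct pair counting, in plain python; it does not compute a p-value, for which a statistics package (for example `scipy.stats.mannwhitneyu`) would normally be used. the delay values are invented for illustration and are not the study data.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: the number of (a, b) pairs with a > b,
    counting each tie as half a pair."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Invented delays in days (illustration only).
non_dutch_delays = [3, 5, 8]
dutch_delays = [1, 2, 3]

u_non_dutch = mann_whitney_u(non_dutch_delays, dutch_delays)
u_dutch = mann_whitney_u(dutch_delays, non_dutch_delays)
```

the two statistics always sum to the product of the sample sizes, so a value of `u_non_dutch` close to that product indicates that delays for non-dutch airlines tend to be longer.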
in out of these contact investigations, plc's were available. interval iii of the ct with plc's available was shorter ( , days, sd , ) than for ct's without plc ( , days, sd , ), this was not significant however (p: , ). overall delay in ct with plc's also was shorter (mean , , sd , ), but not significant, when compared to ct without plc's (mean , , sd , ) (p: , ). in this study we evaluated the timeliness of contact tracing (ct) of flight contacts in daily practice. we conclude that the prevailing policy to provide close contacts antiviral pep during the early phase of the influenza pandemic is very difficult to implement effectively and therefore has little effect to control disease spread. active case finding through contact tracing of exposed persons is an important procedure during the containment phase of an emerging communicable disease. however, our data show that, even in a small-industrialized country with modern communication tools, tracing of flight contacts exceeds the required maximum of h after exposure. for influenza, close contacts of contagious index cases are entitled to receive antiviral pep within h after exposure to prevent them from becoming ill and further spreading of the disease. starting oseltamivir within h does not prevent disease but shortens the disease period, mitigates symptoms and might decrease further transmission. awareness among contacts to seek medical evaluation when influenza-like (ili) symptoms occur, for both proper antiviral treatment and (home-) isolation advice, reduces further spreading. as influenza has a relative short latent period, for influenza a(h n )/ varying between , - , days [ , ] , contacts ideally should be informed within day. oseltamivir postexposure prophylaxis for this pandemic strain is reported to be effective even when administrated more than h after exposure in household settings [ ] , however, delays in administration are not specified. 
we cannot exclude the possibility that in our study, even delayed administration of oseltamivir prophylaxis may have prevented some people from becoming ill, although we anticipate the effectiveness of the intervention overall to be less in this setting than in households. our study among contact investigations showed an average total delay of , days between flight arrival and identification of contacts by passenger list, which is too late for effective pep, and late for alerting on first symptoms of disease. only in three contact investigations ( %), contact details were obtained within h. however, after identification of passenger details, health authorities need time to actually trace the contact and administer pep. it is highly unlikely that this was achieved within the same h. we therefore conclude that contact investigation for provision of pep as conducted here was ineffective. regarding the awareness of ili symptoms, schiphol airport handed all passengers on flights arriving from mexico information leaflets on influenza a/h n with information on early symptoms and requesting them to seek medical advice in case of fever and respiratory symptoms such as coughing. posters with this information were placed in passenger halls, to inform passengers arriving indirectly from mexico via transit through other airports, or arriving from non-endemic areas with higher transmission (e.g. usa). as contact details were identified on average . days after exposure, however not contacted yet, we conclude that ct did not have additional value for timely achievement of increased awareness. it is not a new finding that contact tracing of flight passengers is a time-consuming procedure [ ] . in one study among flight passengers during the pandemic in , % ( / ) of the contacts were reached within h [ ] . in a measles contact investigation, % ( / ) of responding passengers were contacted within h. 
in this study, however, the diagnosis of measles was already suspected during the flight, and laboratory confirmation was initiated immediately after landing [ ] . it also helped that many contacts were tourists staying at the same hotels, which facilitated tracing them. our study shows that the longest delay before identification of contact details for an influenza index case is caused by the time between arrival and laboratory diagnosis (interval i, , days). this delay is a result of patient delay in seeking medical care, and doctor's delay, including laboratory confirmation. for influenza, the indicated laboratory test was polymerase chain reaction (pcr), which takes several hours to obtain the result, and in the beginning of the pandemic the pcr test was not yet available in many laboratories. patient delay was considerable, however. even for the seven passengers with date of onset before the flight, who were therefore symptomatic during the flight, it took to days after arrival before laboratory confirmation was made. also, none of the airlines reported that these patients had already been identified during the flight, nor that infection control measures had been taken. for the indexes that became ill on the day of arrival, delay until laboratory confirmation still lasted days (range - days). a prepandemic study by sharangpani et al. among flight passengers showed that they are more willing to seek a physician's care in case they develop flu-like symptoms when they perceive the pandemic as serious [ ] . leggat et al. demonstrated during the pandemic that only a minority ( , %) of australian citizens would cancel their air travel in case of cough and fever lasting more than day. this was higher among persons who were more concerned about the pandemic [ ] . in the netherlands, the perceived severity of the disease decreased significantly during this study period [ ] .
we expect that the delay until laboratory diagnosis in this study is considerably affected by patient delay in seeking medical care, which might be shorter for diseases experienced as more threatening. collecting passenger details from foreign airlines also caused considerable delay because of differences in time zones and the need to convince the concerned airline companies about the urgency to collect and hand over passenger lists with contact details. sometimes official request letters were necessary for legal reasons to release personal contact details. dutch companies were easier to convince by dutch health authorities to hand over passenger details. our data show that contact details that were identified too late or not at all indeed more often originated from non-dutch than from dutch airline companies. an internationally standardized contact tracing protocol, communicated with the international civil aviation organization (icao) and international air transport association (iata), would facilitate the timeliness, and therefore effectiveness, of contact tracing. although one might expect differently, timeliness of ct for flights where plc's were available was not better than ct for flights without plc. however, plc's considerably reduce the effort, in terms of staff support, for airline companies and the municipal health service to collect useful passenger information. plc's were only used by dutch airlines, who already were able to provide passenger lists relatively quickly. this also explains the limited improvement in timeliness attributable to plc's. contact details on plc's might be more accurate to trace the passenger than details provided by the passenger list or booking station. this is being investigated further. this study has several limitations. as available data were recorded in days, and not in hours, it was not possible to determine the time intervals more precisely. as this applied to both the first and last date of the intervals, we expect no negative or positive bias.
secondly, the arrival date was used for date of exposure, while the actual exposure might have already taken place the day before at departure of the flight. this would imply an increase in delay and a decrease in the effectiveness of contact tracing. also, we have no data on whether, and when, contacts were actually reached and oseltamivir was administered. since several steps were still required to reach the contacts after they were identified through passenger lists, this would only have led to further delay in administering prophylaxis. further investigation into the timeliness of administration of prophylaxis among these contacts has been initiated, to gain insight into the delay of this last interval and to facilitate future decisions on the effectiveness and necessity of contact tracing among flight passengers. lastly, this study includes ct initiated at only one airport. ct procedures might be different at airports in other countries, which influences interval iii. as this is not causing the main delay, we do not expect that in other countries ct would be much faster. we conclude that tracing close contacts among flight passengers during the initial phase of pandemic a/ h n was not effective, as timely provision of pep could not be achieved in most cases. most contacts came from an endemic area (mexico) or areas with well known increased transmission during the first months of the pandemic. the additional risk for those travelers of being a close contact during a long haul flight is small ( , %) [ ] . furthermore, airline companies and/or schiphol airport already provided contacts with information on the disease and its symptoms. the benefit of informing them of the fact that they were contacts of a laboratory confirmed case did not justify the extra effort health authorities invested in contact tracing, especially during a period where public health officials, airports and airline companies were absorbed by efforts of other pandemic related control measures.
in hindsight, the limited burden of disease of influenza a/h n did not justify contact tracing efforts. the main reason for flight contact tracing is raising alertness for possible exposure to uncommon infectious diseases, enabling early recognition and treatment of the disease and timely installation of control measures (e.g. sars and viral hemorrhagic fevers). for some diseases, pep is indicated as well. the risk assessment upon which the decision to install contact tracing is based should incorporate -apart from an evaluation of the severity and rarity of disease -an assessment of the required timeliness of effective control measures [ ] . the expected time for laboratory confirmation of index cases and identification and tracing of contacts should be related to the maximum period during which quarantine, pep or other control measures are effective in order to decide on the benefit of this time consuming procedure. lastly, also cabin crew should be aware of their role of signaling infectious patients. in consultation with medical professionals, direct control measures can be installed, as well as medical evaluation after landing. empirical evidence for the effect of airline travel on inter-regional influenza spread in the united states spread of a novel influenza a (h n ) virus via global airline transportation calculating the potential for withinflight transmission of influenza a (h n ) transmission of infectious diseases during commercial air travel transmission of pandemic a/h n influenza on passenger aircraft: retrospective cohort study lack of airborne transmission during outbreak of pandemic (h n ) among tour group members, china is public transport a risk factor for acute respiratory infection? 
ecdc: risk assessment guidelines for infectious diseases transmitted on aircraft an outbreak of influenza a/taiwan/ / (h n ) infections at a naval base and its association with airplane travel outbreak of influenza-like illness [corrected] related to air travel an outbreak of influenza aboard a commercial airliner mixed outbreak of parainfluenza type and influenza b associated with tourism and air travel antiviral agents for the treatment and chemoprophylaxis of influenza -recommendations of the advisory committee on immunization practices (acip) who: case management of influenza a(h n ) in air transport influenza: operationeel deeldraaiboek . incidentele introductie nieuw humaan influenzavirus in nederland estimated epidemiologic parameters and morbidity associated with pandemic h n influenza population modeling of influenza a/h n virus kinetics and symptom dynamics household transmission of pandemic influenza a (h n ) virus in osaka contacting passengers after exposure to measles on an international flight: implications for responding to new disease threats and bioterrorism attitudes and behaviors of international air travelers toward pandemic influenza level of concern and precaution taking among australians regarding travel during pandemic (h n ) : results from the queensland social survey perceived risk, anxiety, and behavioural responses of the general public during the early phase of the influenza a (h n ) pandemic in the netherlands: results of three consecutive online surveys eu funded react project: response to emerging infectious disease: assessment and development of core capacities and tools the authors would like to thank josé ferreira for advice on statistics. authors' contributions cs and ra designed the study and collected the data. mk advised on the data management and presentation of the results. js, cs and ra interpreted the data. js critically revised the manuscript. 
cs wrote the manuscript and all authors commented on drafts and approved the final version. all authors read and approved the final manuscript. the authors declare that they have no competing interests. the pre-publication history for this paper can be accessed here: key: cord- - gkiukh authors: clark, eva; chiao, elizabeth y; amirian, e susan title: why contact tracing efforts have failed to curb covid- transmission in much of the u.s date: - - journal: clin infect dis doi: . /cid/ciaa sha: doc_id: cord_uid: gkiukh by late april , public discourse in the u.s. had shifted toward the idea of using more targeted case-based mitigation tactics (e.g., contact tracing) to combat covid- transmission while allowing for the safe “re-opening” of society, in an effort to reduce the social, economic, and political ramifications associated with stricter approaches. expanded tracing-testing efforts were touted as a key solution that would allow for a precision approach, thus preventing economies from having to shut down again. however, it is now clear that many regions of the u.s. were unable to mount robust enough testing-tracing programs to prevent major resurgences of disease. this viewpoint offers a discussion of why testing-tracing efforts failed to sufficiently mitigate covid- across much of the nation, with the hope that such deliberation will help the u.s. public health community better plan for the future. many countries that have successfully mitigated the covid- pandemic to date did so via stringent measures to limit personal movement and abate public interactions [ , ] , but these approaches are unlikely to be acceptable from an economic, legal, or sociocultural perspective in the united states.
partly for this reason, our nation rushed to espouse the idea of targeted, case-based covid- management [ ] [ ] [ ] [ ] , focusing on expanded testing and contact tracing, while disregarding several major obstacles that set us apart from countries that succeeded in mounting a timely, targeted response. indeed, expansive testing-tracing programs have largely succeeded in curtailing community spread in certain countries, most notably south korea, which is commonly referred to as the archetype for controlling covid- while avoiding strict lockdowns [ , , ] . arguments were made that the initial set of "stay at home" orders implemented in many regions of the u.s. were intended to prevent hospital overflow and to essentially buy time to plan out a more precise strategy that would have less impact on daily life [ ] , taking note of what worked best in other parts of the world that preceded us in the pandemic curve. here, we discuss some urgent public health considerations related to why heavy reliance on expanded testing-tracing efforts was largely unsuccessful in many states in the u.s. that are now experiencing record-breaking surges in case counts. from the beginning of the pandemic, there has been a noticeable lack of unified national leadership and coordination, which has resulted in both the absence of a robust plan (or common goals) for local and state health departments and the dissemination of confusing mixed messages to the lay public [ , ] . for the most part, the u.s. centers for disease control and prevention (cdc) has remained uncharacteristically silent during this national crisis [ ] . in may , at a time when many jurisdictions had already started relaxing their "stay at home" mandates, the cdc released a watered-down version of the original guidance documents censored by the trump administration [ , ] .
the resulting guidance allowed for the potential re-opening of schools, restaurants, bars, and other institutions that were closed in many jurisdictions earlier in the pandemic, with limited specific direction for addressing sustained community transmission [ , ] . notably, joint white house and cdc benchmarks for re-opening (described elsewhere [ ]) were flouted by several states, including texas, georgia, and florida [ ] . few states had come close to meeting even one of the cdc benchmarks when they started reopening under the impression that voluntary social distancing and expanded testing-tracing would be sufficient to curb the epidemic in regions with seemingly flattening rates of ongoing transmission [ ] [ ] [ ] . had states been encouraged to heed cdc benchmarks, it may have been possible to avoid the major surges now being observed in these states [ , ] . in addition to the lack of coordinated public health leadership, it has been surprising that despite being a resource-rich nation, the u.s. still struggles to achieve adequate and consistent testing rates [ , ] . in areas experiencing surges, there have been reports of long lines, test shortages, and turnaround times of over a week, despite the fact that the past months since the start of the epidemic should have provided ample time to increase supply chains for testing materials [ , ] . it is a fundamental concept that health departments cannot trace cases that remain undetected. yet, even prior to the current surges, many putative cases, even those who were symptomatic, were unable to obtain timely sars-cov- testing and results, and very few jurisdictions had implemented widespread, freely available public surveillance testing [ , ] . 
this ineptitude in deploying a cohesive testing strategy stems from many organizational and national leadership barriers, including an underfunded public health outpatient testing infrastructure, regional insufficiencies in testing supplies/reagents, and a lack of national guidance regarding the best strategy for implementing surveillance testing [ ] . reliable, widespread, no-cost surveillance testing should have been available nationwide early in the u.s. epidemic, as it is the basis on which the other tools in the public health toolbox are predicated. had widespread testing been available while community spread was still relatively low, contact tracing endeavors may have been able to quickly identify and eradicate hotspots and transmission chains within affected communities. however, that window of opportunity has passed and sustained transmission has led to rapidly growing caseloads and inability to keep up with contact tracing in many jurisdictions, despite some efforts to scale capacity [ , , [ ] [ ] [ ] [ ] . currently, the goal of contact tracing is still to identify the maximal number of sars-cov- infected and exposed individuals in order to enable transmission mitigation through isolation and intervention [ ] . however, contact tracing is usually most successful during troughs of the epidemic curve, when such efforts are more manageable. in these situations, theoretically, if nearly every case can be isolated quickly, and the majority of their contacts quarantined, then the local epidemic could be quelled enough by these targeted tactics to permit loosening of more stringent public health measures. however, loosening stricter social isolation measures before adopting the infrastructure prerequisite to allow for timely and thorough contact tracing is generally inadvisable, especially in the context of our decentralized and fragmented public health and healthcare systems. 
in states where the virus is currently surging, implementation and sustainable management of testing-tracing efforts became virtually unfeasible as transmission increased, and capacity was exceeded in some jurisdictions [ , ] . indeed, cdc guidance states that contact tracing is not usually recommended in communities with "sustained ongoing transmission" [ ] ; however, "sustained ongoing transmission" has not been clearly defined for covid- . this confusion may have contributed to the development of ineffective policies in states that have now experienced dramatic increases in case counts and hospitalizations, like florida and texas [ , ] , both of which were depending on attempts to conduct contact tracing in the midst of high levels of sustained ongoing transmission [ , , ] . comprehensive testing and contact tracing plans require a high level of forethought, coordination, communication, and social acceptability to be effectively executed [ ] . this is partly why many countries that have more synchronized public health systems with legal authority to provide strong oversight have generally fared better [ , , ] . robust plans, backed by considerable resources (i.e., financial, personnel, legal, and technological), combined with high adherence to physical distancing and face covering recommendations, have been instrumental to covid- mitigation in many countries, including ones with regions that have very high population densities [ , , ] . for example, south korea, which had prior experiences with both sars and mers, had modified legislation after prior outbreaks to allow for prompt responses to epidemics [ ] . as a result, south korea was able to integrate the following rich information sources into their contact tracing efforts: patient interview data, medical records data, gps data from mobile phones, cctv footage, and credit card transaction data. they also published the pre-diagnosis movements of confirmed covid- cases [ ] . 
similarly, taiwan merged data between health insurance records, medical records data, travel history, and data from both an app and a toll-free hotline set up for the public to report suspected cases [ , , ] . such methods are likely to be considered quite invasive, and therefore, neither legally nor culturally acceptable in the u.s. [ , ] , particularly during the current climate of civil unrest and the expanding backlash against public health measures that may be due partly to the politicization of certain recommendations [ , ] . there are some comparatively less intrusive voluntary technologies that have been used to supplement contact tracing and augment local public health efforts in some countries [ , ] . generally, with voluntary technologies, users agree to data collection and sharing for contact tracing purposes, and the data are deleted once obsolete. in europe, efforts are underway to develop and utilize general data protection regulation (gdpr) compliant phone apps [ ] . a voluntary app was used in iceland to help mitigate covid- spread, but successful solutions from small countries like iceland may not be generalizable to other regions for many reasons beyond regulatory and sociocultural differences [ ] . it is not possible to infer exactly how effective voluntary technology use will be in the u.s., especially without the aid of other major preventive tactics, given high levels of community spread in some parts of the country [ , ] . attempts to use opt-in mobile phone apps are ongoing in various parts of the u.s. (e.g., massachusetts, california, san francisco) [ , ] . evaluating the success of such programs over time may be helpful in planning for future surges of covid- . it is clear that public buy-in and engagement are crucial to ensuring cooperation with, adherence to, and sustainability of expanded testing-tracing programs [ , ] . the u.s. 
has substantial regulations that preclude enforcement of compliance with contact tracing [ ] . this implies that the public's participation will be voluntary [ , ] , and therefore, less likely to provide accurate and comprehensive information, limiting the effectiveness of these endeavors. besides legal hurdles related to civil liberties, there are also relevant ethical considerations about access to and use of data about people's contacts and whereabouts that need to be weighed [ , ] . vulnerable individuals, such as immigrants and victims of crime or domestic violence, may not be comfortable with sharing such information, even with health departments [ , ] . because some corporations have decided to conduct testing and tracing of their employees [ ] , individuals may be concerned that hiring or termination decisions will be based on test results. therefore, public messaging about expanded testing-tracing must clarify how the data can legally be used and how they will be managed and protected, especially if private companies will be contracted to aid with data collection efforts [ ] . potential for misuse by law enforcement, immigration enforcement, and for-profit companies should be addressed unequivocally [ ] . many of these urgent considerations necessitate national-level guidance and leadership. at present, most local health departments are left to manage the public health concerns of their own jurisdictions with little support, and most lack the resources needed to adequately fulfill this responsibility [ , ] . despite the fact that there is a pandemic roughly every decade, contact tracing systems run by health departments are generally not designed to handle rapidly transmissible, pandemic-scale diseases. 
taking over a week to conduct contact tracing may be effective for some communicable illnesses (e.g., syphilis, according to the association of state and territorial health officials (astho), the enormous scope of conducting contact tracing for sars-cov- is most closely exemplified by the u.s. response to the west african ebola outbreak, which was the largest contact tracing endeavor ever implemented here [ ] . during this response, , individuals were actively monitored, but there were no reported cases [ , ] ; by contrast, almost million covid- cases already exist on u.s. soil to date [ ] . even for ebola contact tracing, there were significant operational barriers including: resource limitations, barriers in coordination and communication between jurisdictions, challenges in quarantine enforcement, and difficulties related to provision of isolation housing [ ] . covid- has logarithmically amplified these obstacles. the size of the public health workforce required to adequately implement sars-cov- testing-tracing efforts depends upon many factors, such as the catchment area population size and the true incidence and prevalence. larger numbers of staff may be necessary as social distancing measures are loosened (or public adherence decreases) and case counts increase, or if technologic tools are not used for augmentation. smaller numbers of staff would likely be necessary if local, state, and national public health agencies were able to communicate and coordinate effectively. creation of a national contact tracing system could eliminate geographic restrictions for hiring and would increase procedural standardization. astho and other organizations have been advocating for a coordinated, national approach for expansion of contact tracing, and requests were made for support from the federal government to acquire an additional , - , contact tracers [ , ] . 
such a national resource would reduce the burden on the current public health workforce and ultimately, could also set the stage for a more strategic and effective national system for responding to current and future pandemics. since the beginning of the covid- outbreak in the u.s., a paucity of timely, national guidance and strategic planning, in combination with an overwhelmed public health system, has served as a substantial obstacle to rapid disease mitigation [ ] . in just a few months, the covid- crisis has exposed the deficiencies in our public health infrastructure and has led us to mull over the palpable changes that would have prevented the current tug-of-war between epidemiologic, political, and economic sacrifices. largely due to these deficits, we missed a pivotal opportunity to curtail the spread of this epidemic in much of the u.s. however, the course of an epidemic is dynamic, and if tough, decisive, and critically needed policy decisions are made in the upcoming months to curb transmission, we may once again find ourselves in a relatively better position to consider effective strategies, though the disease may become endemic. at the very least, public health practitioners and scientists must acknowledge the complexities of real-world testing-tracing efforts, and promote new policies aimed at both mitigating sustained community transmission and bolstering contact tracing capacity in their jurisdictions. while this type of resource is worthy of investment for the longer-term, if contact tracing is to be considered the principal solution across the nation under current circumstances, then indicators for when social distancing can be relaxed (or needs to be strengthened) should include actionable thresholds around local contact tracing capacity [ ] . these thresholds are particularly relevant because of the sparse capacity that is the reality for many jurisdictions at this time [ , ] . 
overall, this is a crucial moment for our public health system to reassess its unmet needs, to evaluate and address the reasons behind its shortcomings, and to cultivate change before public momentum fades and we fall back into a national complacency, abandoning the opportunity for re-hauling and re-imagining a politically-independent, well-resourced, and innovative public health system. dr. chiao reports grants from nih, outside the submitted work. all other authors have no relevant conflicts of interest. covid- -we urgently need to start developing an exit strategy they've contained the coronavirus. here's how. the new york times reopening of america: more than half of states will lift coronavirus restrictions by the end of the week all u.s. states have taken steps toward reopening in time for memorial day weekend. the washinton post very aggressive' contact tracing needed for u.s. to return to normal coronavirus contact tracing could stop covid- and reopen america a national plan to enable comprehensive covid- case finding and contact tracing in the social distancing could buy u.s. valuable time against coronavirus. the washington post reviving the us cdc complicating outbreak response, preparedness centers for disease control and prevention. child care programs during the covid- pandemic restaurants and bars during the covid- pandemic public health experts say many states are opening too soon to do so safely states moving forward with reopening are seeing increases in new coronavirus cases. the washington post covid- dashboard coronavirus cases spike across sun belt as economy lurches into motion. the new york times sun belt hospitals are feeling the strain from virus' surge -and bracing for worse. the washington post trump hopes for million tests per week by end of may-the low end of experts' estimates of what's needed to reopen cities still lack testing capacity as cases surge, lines for coronavirus tests sometimes stretch miles in the summer heat. 
the washington post early detection of covid- through a citywide pandemic surveillance platform rapid sentinel surveillance for covid- thousands of coronavirus tests are going unused in us labs sun belt hospitals are feeling the strain from virus' surge -and bracing for worse texas is short of its contact tracing workforce goal by more than , people coronavirus contact tracing is 'not going well,' dr. fauci says, u.s. still needs more testing houston's surge of covid- cases overwhelms contact tracing efforts contact tracing: part of a multipronged approach to fight the covid- pandemic what florida contact tracing is like during the covid- pandemic. health news florida contact tracing assessment of covid- transmission dynamics in taiwan and risk at different exposure periods before and after symptom onset contact tracing, testing, and control of covid- -learning from taiwan most americans are not willing or able to use an app tracking coronavirus infections. that's a problem for big tech's plan to slow the pandemic. the washington post covid- contact tracing we can live with: a roadmap and recommendations coronavirus recommendations ignored as case numbers rise. the washington post a scramble for virus apps that do not harm. the new york times review of mobile application technology to enhance contact tracing capacity for covid- , . . show evidence that apps for covid- contact-tracing are secure and effective covid- : the us state copying a global health template for contact tracing success ethics of instantaneous contact tracing using mobile phone apps in the control of the covid- pandemic should immigration status information be included in a patient's health record? 
contact tracing poses 'pandora's box' for reopening businesses key: cord- - z l b authors: sturzenegger, david; sardon, aetienne; deml, stefan; hardjono, thomas title: confidential computing for privacy-preserving contact tracing date: - - journal: nan doi: nan sha: doc_id: cord_uid: z l b contact tracing is paramount to fighting the pandemic but it comes with legitimate privacy concerns. this paper proposes a system enabling both, contact tracing and data privacy. we propose the use of the intel sgx trusted execution environment to build a privacy-preserving contact tracing backend. while the concept of a confidential computing backend proposed in this paper can be combined with any existing contact tracing smartphone application, we describe a full contact tracing system for demonstration purposes. a prototype of a privacy-preserving contact tracing system based on sgx has been implemented by the authors in a hackathon. the covid- pandemic has caused a severe human and economic tragedy. as of april , more than . million covid- infections have been confirmed globally [ ] . governments all over the world have taken action to prevent their health systems from being overwhelmed. national lockdowns as well as social and physical distancing measures have been imposed to slow the spread of disease, [ ] . however, these measures have also brought large parts of the economy to a standstill, elevating the risk of a sustained economic downturn, [ ] . a. the importance of contact tracing current research suggests that contact tracing could play a critical role in avoiding or leaving lockdown, [ ] . contact tracing can help maintain a relatively unrestricted society and economy, while minimizing the damage to the health of the population, [ ] . 
since most transmissions are estimated to occur from pre-symptomatic individuals, traditional manual contact tracing procedures are too slow to effectively contain the covid- spread, [ ] . but smartphone apps that immediately alert recent close contacts and prompt them to self-isolate may significantly increase the efficacy of contact tracing. it is estimated that % of a country's population would need to participate in contact tracing for it to be effective, but privacy concerns may slow adoption [ ] . the fundamental problem is the simple fact that, to determine whether two people were in contact, their location data need to be compared. this directly conflicts with the desire of most people to keep their location data private, leading to a trade-off between health and privacy. contact tracing systems built by the chinese and south korean governments have favored health over privacy in the context of the current pandemic. these systems recently came under public scrutiny over issues of data protection and privacy. critics argue that emergency measures tend to be expanded beyond their original scope, [ ] . hence, liberal countries are clearly in favor of opt-in based apps that use privacy-preserving technologies to minimize privacy and civil liberty intrusions, [ ] . for example, a group of european experts recently launched the pan-european privacy preserving proximity tracing initiative to provide guidance on best practices for developing contact tracing apps, but privacy concerns remain, [ ] , [ ] . conventional systems rely on a trusted third party (ttp) to keep track of potential infection chains and orchestrate notifications (see section ii). this has led some to conclude that elaborate governance structures are needed. for example, in [ ] the authors suggest amending the epidemics act to incorporate so called data trustees, who shall be entrusted with guaranteeing proper data handling. 
such considerations are based on the assumption that contact tracing requires the presence of a ttp. however, with the advent of confidential computing this assumption seems outdated, as trusted execution environments (tees) may make ttps obsolete. confidential computing refers to performing computations with additional data confidentiality and integrity guarantees. tees have recently emerged as one of the most flexible and mature technologies, which can enable confidential computing. many of today's leading technology companies are actively developing and promoting confidential computing technologies. for example, companies like microsoft, google, alibaba cloud and others have joined forces to form the confidential computing consortium under the linux foundation, [ ] . currently, intel's sgx is the most advanced tee implementation and the main technology the members of the confidential computing consortium focus on, [ ] . this paper proposes an intel sgx-based contact tracing system which provably cannot reveal any user's location data while providing all benefits of a traditional contact tracing system. we focus on a confidential computing backend that can be used in combination with any of the currently existing contact tracing apps, requiring only minimal modifications. current contact tracing apps typically rely on pushing the infected user's location data to the entire system. these location data include gps and/or proximity data, i.e. (typically randomized) identifiers of devices that were close to the current device. on every user's phone, all infected users' data are then compared to the locally stored location data in order to determine whether the mobile user has been within close proximity to infected individuals. if the data is gps data, this immediately reveals the infected user's past movements and offers very little privacy. if the data is proximity data, this may substantially leak privacy as well. 
privacy loss may occur (a) to the mobile-phone user, and/or (b) to the diagnosed patient. attackers will prefer to attack large data sets located at hospital servers. these privacy problems can negatively influence a user's decision and willingness to disclose their infection to the system. they may therefore substantially degrade the system's overall effectiveness. two examples of existing contact tracing systems are discussed in the following. israel's health ministry recently launched the contact tracing system hamagen, [ ] , [ ] , [ ] . hamagen claims that it only processes the users' location data on their devices. however, the system relies on pushing the location data of all infected users over government servers to all users in the system. hence, the location data of infected people is not protected at all. tracetogether's approach is similar to the idea behind apple's "find my device" technology, [ ] . every active phone continuously monitors for bluetooth low energy (ble) beacon messages, which are broadcasted from other devices together with some identifier. when it picks up one of these signals, the participating phone tags the data and stores it. as a result, no location data is stored on device, but rather a list of "identifiers" of the users one has met. in order to make location tracking more difficult, regularly changing random identifiers, derived from a user's secret key, are used. however, in order to identify potential transmissions, an infected user has to reveal his or her entire proximity data to a central authority, increasing the risk of re-identification (see section ii). current contact tracing apps need to address the following problems: • revealing data of infected users. contact tracing apps like hamagen perform on-device transmission detection. while this protects non-infected user data, it exposes infected individuals to re-identification risk by pushing their identifiers to all edge devices for local matching. 
• trustworthiness of central data processing. other systems, like tracetogether or pepp-pt [ ] , require all edge devices to send their collected location or proximity data to a central server, where matching of infected and non-infected identifiers is performed. typically, this makes it difficult for users to verify how their data is processed by the server. more specifically, it becomes impossible to guarantee that their contributed data will not be used beyond the pre-agreed purpose of contact tracing and will be deleted afterwards. contact tracing systems consist of two components: the smartphone contact tracing app, installed on the user's device, and a contact tracing backend. while special-purpose tees exist on smartphones, they currently do not offer all the guarantees that are needed to conduct confidential computing. especially the concept of remote attestation is lacking in most existing smartphone implementations, which makes them impractical for the use-case discussed in this work. hence we propose to build a confidential contact tracing backend to address the problems mentioned in section iii. while this backend in general can be used with any contact tracing app, we propose a full contact tracing system (i.e. including a specific app) for demonstration purposes. the proposed backend shall leverage intel sgx to confidentially determine potential chains of transmission, without ever exposing any user data to anyone-not even to the platform operator. much of the following description will not be particular to a confidential computing solution. the key benefits of using intel sgx are twofold: one can prove that the system works as described, thereby preventing data misuse, and one can achieve a higher level of privacy protection than with conventional systems. using sgx technology, the gps data from the infected patients are encrypted by the hospital in such a way that it can only be decrypted inside the sgx tee. 
similarly, gps data from the user's mobile phone is encrypted for the same target sgx environment. once both data sets are within the trusted boundary of the sgx tee, they can be decrypted safely and be compared. if a positive match is found, the sgx tee will report the result over a secure channel to (a) the mobile user, and (b) optionally also to the hospital. the benefit of the sgx tee is that gps data is never accessible in plaintext outside the tee. once the sgx tee finishes the comparison of gps data-sets, sgx will delete ("flush") the data from its memory. this ensures that the original gps data-sets are present inside the sgx tee only for a very short time. this has the advantage that attackers are unable to obtain access to large gps sets. we therefore recommend that hospitals in possession of gps data-sets of infected patients encrypt their data while in storage. assume each device generates and emits a random identifier in discrete time intervals ∆t. for example, device a emits a_1 during [t, t + ∆t) and a_2 during [t + ∆t, t + 2∆t) and so forth. devices in proximity to one another pick up these random identifiers reciprocally and locally store the sent and received identifiers in a contact tuple log. for example, assume that, while in proximity to one another, device a sends a_i and device b sends b_j. in this case, a locally stores (a_i, b_j) and b stores (b_j, a_i) (see figure ). let's now assume user c is tested at a health authority h. if the test is positive, the health authority submits this information to the confidential computing backend. the (authenticated) user c polls the backend to see whether or not the test was positive. note that c cannot produce false infection notifications, since only the health authority can perform these calls to the backend. if c decides to notify the user network of his or her infection, he or she sends his or her contact tuple log, e.g. 
{(c_i, a_j), (c_k, b_l), ...}, to the backend, which stores it in an encrypted database that is provably only accessible to the backend. all devices regularly poll the backend for matches in the encrypted storage by sending their contact tuples. for example, when a polls the storage, he or she sends {..., (a_j, c_i), ...}. as there was a match between a's and c's tuples, a is informed that he or she has been in contact with an infected individual. note that provably neither a's tuples nor the fact that a was in contact with an infected individual get stored or submitted elsewhere by the backend (consider again footnote ). v. discussion much of the system proposed above is similar to existing contact tracing systems. the main difference consists of the fact that using intel sgx, it can be proved using remote attestation as well as memory encryption and memory isolation (see also [ ] for a high-level introduction to these concepts) that the backend operates exactly as advertised. it is important to stress again that the concept of a confidential computing backend can be used in combination with any contact tracing application, not just the one described here for demonstration purposes. a prototype of a privacy-preserving contact tracing system based on sgx has been implemented and open-sourced in the context of the codevscovid hackathon, [ ] . the main benefits of the confidential computing-based backend are twofold: on the one hand it enables effective data minimization (i.e., data does not need to be exposed to perform contact tracing logic); and, on the other hand, it provides transparent and verifiable data processing. this means that users can be guaranteed that their data is only used for the pre-agreed specific purpose of contact tracing. the specific data processing logic can be open-sourced, audited and verified through independent parties. 
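the contact tuple matching described for the demonstration system can be sketched in a few lines of python. this is only an illustration of the matching logic under stated assumptions: the class and function names (Device, Backend, exchange) are hypothetical and not taken from the prototype, identifiers are drawn with python's secrets module, and the backend is an ordinary in-memory object. nothing here involves sgx, encryption, or remote attestation, which are what would actually provide the confidentiality guarantees discussed above.

```python
import secrets


class Device:
    """Hypothetical phone that emits a fresh random identifier per interval."""

    def __init__(self, name):
        self.name = name
        self.current_id = None
        self.contact_log = []  # list of (sent_id, received_id) tuples

    def rotate_identifier(self):
        # Called once per time interval; 8 random bytes as a hex string.
        self.current_id = secrets.token_hex(8)


def exchange(dev_x, dev_y):
    """Two devices in proximity log each other's current identifier."""
    dev_x.contact_log.append((dev_x.current_id, dev_y.current_id))
    dev_y.contact_log.append((dev_y.current_id, dev_x.current_id))


class Backend:
    """Plain-Python stand-in for the enclave logic (no SGX involved)."""

    def __init__(self):
        self._confirmed = set()        # users reported positive by the health authority
        self._infected_tuples = set()  # contact tuples uploaded by confirmed users

    def report_positive(self, user_name):
        # Assumption: only the health authority can reach this call.
        self._confirmed.add(user_name)

    def submit_log(self, device):
        if device.name not in self._confirmed:
            raise PermissionError("no positive test on record for this user")
        self._infected_tuples.update(device.contact_log)

    def poll(self, contact_log):
        # A stored (sent, received) matches an infected user's (received, sent).
        # The polled tuples are not stored anywhere.
        return any((recv, sent) in self._infected_tuples for sent, recv in contact_log)


a, b, c, d = (Device(n) for n in "abcd")
for dev in (a, b, c, d):
    dev.rotate_identifier()
exchange(a, c)  # a and c meet in the first interval
exchange(a, d)  # a and d meet as well; d never meets c
for dev in (a, b, c, d):
    dev.rotate_identifier()
exchange(c, b)  # c and b meet in the second interval

backend = Backend()
backend.report_positive("c")
backend.submit_log(c)
a_exposed = backend.poll(a.contact_log)
b_exposed = backend.poll(b.contact_log)
d_exposed = backend.poll(d.contact_log)
```

in this sketch a and b are flagged because each holds the mirror image of a tuple in c's uploaded log, while d is not flagged even though d met a. an upload attempt by a user without a positive test on record is rejected, mirroring the statement that only the health authority can trigger infection notifications.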
note that the confidentiality and integrity guarantees of any system-including confidential computing systems-depend on a correct implementation. we did not describe such a full implementation. the purpose of this paper is to demonstrate the concept and feasibility of a privacy-preserving contact tracing system. we believe that a privacy-preserving backend can enable a more widespread and therefore effective contact tracing system. we described the need for a privacy-preserving contact tracing solution: without a strong focus on data privacy, contact tracing is unlikely to be widely adopted in liberal countries. we propose the use of intel sgx to build a confidential computing backend that provably cannot reveal any user data and outline a complete contact tracing system for demonstration purposes. together with currently available contact tracing smartphone applications, such a privacy-preserving contact tracing system could help mitigate some of the adverse effects of the current pandemic. , ) coronavirus disease (covid- ) outbreak situation , ) coronavirus deaths in italy overtake china as economic damage mounts restarting the economy and avoiding big brother , ) coronavirus deaths in italy overtake china as economic damage mounts , ) quantifying sars-cov- transmission suggests epidemic control with digital contact tracing , ) european experts ready smartphone technology to help stop coronavirus , ) snowden warns: the surveillance states we're creating now will outlast the coronavirus , ) us and europe race to develop 'contact tracing' apps pan-european privacy-preserving proximity tracing , ) an even deeper dive into the secure enclaves , ) covid- : gouvernanzmodell für ein digitales proximity tracing confidential computing consortium. ( ) confidential computing consortium intel sgx israel's ministry of health's covid- exposure prevention app isreal's ministry of health. 
( , ) partial error at the "hamagen"' application hamagen" application -fighting the corona virus singapore's ministry of health. ( , ) help speed up contact tracing with tracetogether ) an even deeper dive into the secure enclaves cocotrace -confidential contact tracing key: cord- - cmekxs authors: malmberg, hannes; britton, tom title: inflow restrictions can prevent epidemics when contact tracing efforts are effective but have limited capacity date: - - journal: j r soc interface doi: . /rsif. . sha: doc_id: cord_uid: cmekxs when a region tries to prevent an outbreak of an epidemic, two broad strategies are available: limiting the inflow of infected cases by using travel restrictions and quarantines or limiting the risk of local transmission from imported cases by using contact tracing and other community interventions. a number of papers have used epidemiological models to argue that inflow restrictions are unlikely to be effective. we simulate a simple epidemiological model to show that this conclusion changes if containment efforts such as contact tracing have limited capacity. in particular, our results show that moderate travel restrictions can lead to large reductions in the probability of an epidemic when contact tracing is effective but the contact tracing system is close to being overwhelmed. two main factors determine if, and when, a region will be affected by an epidemic like the current covid- outbreak, be it the first epidemic or a second wave after a successful lockdown has eliminated internal spread. the first factor is the rate λ at which infectious individuals (either visitors or returning local residents) enter the country. the second factor is the probability π that such an entry gives rise to a major outbreak. 
potential preventive measures by health authorities can target reductions of λ using, for example, travel restrictions or quarantines, or reductions of π using, for example, contact tracing in conjunction with the isolation of imported cases (henceforth 'contact tracing'). quite often, preventive measures aim at reducing both λ and π. contact tracing can be fully effective (i.e. π = ) if it manages to bring the epidemic's effective reproduction number r below during the early stage of the outbreak, where r is defined as the expected number of infections caused by an infected individual. intuitively, if each individual, on average, infects less than one other individual, a large outbreak is not possible. we use a simple epidemiological model to analyse the effects of reducing π and λ on epidemic outbreaks. the background to our work is a literature that has found small effects from regulating the inflow rate λ. anzai et al. [ ] show that inflow reductions alone (assuming π > ) cannot prevent an epidemic outbreak from taking place and at best delay epidemic onset, often for just a very limited time. chinazzi et al. [ ] study the effect of reducing the inflow of infected individuals while simultaneously reducing π in the community at large and also find that inflow reductions can only marginally delay an epidemic unless π is reduced drastically. these pessimistic findings mirror those from a large number of earlier models [ ] [ ] [ ] [ ] [ ] [ ] . the world health organization's systematic review on the role of travel restrictions in containing pandemic influenza reviewed papers and concluded that a % reduction in international air travel would only slow down a pandemic by - weeks and would not prevent the introduction of a pandemic into any given country [ ] . we focus on the joint effect of inflow reductions and contact tracing. 
we incorporate contact tracing by having two different outbreak probabilities: one negligible for incoming cases that are contact traced effectively and another much higher probability for incoming cases that are not contact traced. with no capacity constraints, changing λ does not meaningfully affect the probability of an outbreak, regardless of whether effective contact tracing is in place or not. if effective contact tracing is not in place, reducing the inflow rate λ will only marginally delay the epidemic, in line with the literature findings above. if effective contact tracing is in place, there will not be an epidemic for any λ, so reductions in λ are again irrelevant. however, this conclusion relies critically on the assumption of unlimited capacity in contact tracing, i.e. that all imported cases are contact traced effectively irrespective of how many are currently being traced. we show that when contact tracing is effective but has limited capacity, reducing λ may well be very effective in reducing the risk of an outbreak. regulating λ is particularly important when the contact tracing system is close to being overwhelmed by new cases arriving from elsewhere, in which case even moderate reductions in λ can strongly reduce the probability of an epidemic outbreak. since contact tracing is both resource and labour intensive, we believe that our limited capacity assumption is reasonable. we also conjecture that, while our model is simple, our qualitative findings translate to more realistic set-ups. hence, we think that epidemiological models used for policy analysis should incorporate capacity constraints, since they might otherwise underestimate the potential of travel restrictions to prevent epidemic outbreaks (or the re-emergence of an epidemic). formally, we study a stochastic epidemic model where infected cases arrive at some poisson rate λ. without contact tracing, each new case leads to an epidemic outbreak with some probability π nt > . 
the containment effort consists of contact tracing that reduces the probability of an epidemic outbreak to π t ≥ 0. we say that contact tracing is effective if π t = 0 and is imperfect if π t > 0. to model potential capacity constraints, we assume that contact tracing is conducted by a set of n teams which process every case arrival with an intensity μ (so the mean time for completing contact tracing is 1/μ). if a case arrives when at least one team is free, that case has probability π t of leading to an epidemic. if all teams are occupied when an infected case arrives, that case is lost to the system and causes an epidemic with probability π nt . we say that contact tracing has unlimited capacity if n = ∞ and has limited capacity if n < ∞.

this set-up can be modelled as a queuing system where the state is the number of people currently in the contact tracing system. to analyse the probability of an epidemic outbreak at different time horizons, we add the outbreak as an additional, absorbing state. the set-up is illustrated in figure .

with this set-up, we consider the effect of varying λ under four different combinations of parameters: contact tracing being imperfect versus effective and having unlimited versus limited capacity. doing this, we confirm the recent literature's main finding: varying λ is relatively inconsequential when capacity is unlimited. however, we also find that varying λ can be very important when contact tracing is effective but has limited capacity. we proceed by discussing each case in turn.

imperfect contact tracing with unlimited capacity. in this case, there is an epidemic outbreak at a constant rate λπ t > 0. at a horizon t, the probability of an outbreak is $1 - \exp(-\lambda \pi_{t}\, t)$, where the subscript of π t denotes tracing and the final t is the horizon. reducing λ can proportionally delay the outbreak but cannot stop it.

effective contact tracing with unlimited capacity. in this case, all arriving cases have zero probability of causing an epidemic. thus, regardless of λ, there will not be an outbreak.
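the queuing set-up described above can be sketched as a small event-driven simulation. this is an illustrative reconstruction under stated assumptions, not the authors' replication code: arrivals are poisson with rate `lam`, each busy team finishes at rate `mu`, an outbreak is decided at the moment a case arrives, and `n_teams=None` stands for unlimited capacity. all function names and the parameter values in the usage lines are hypothetical.

```python
import random


def simulate_outbreak_time(lam, mu, n_teams, pi_t, pi_nt, horizon, rng):
    """Simulate one path of the tracing queue with an absorbing outbreak state.

    Returns the outbreak time, or None if no outbreak occurs before `horizon`.
    n_teams=None means unlimited tracing capacity.
    """
    t = 0.0
    busy = 0  # number of cases currently being traced
    while True:
        total_rate = lam + busy * mu
        t += rng.expovariate(total_rate)
        if t > horizon:
            return None
        if rng.random() < lam / total_rate:
            # a new infected case arrives
            if n_teams is None or busy < n_teams:
                busy += 1  # a free team starts tracing this case
                if rng.random() < pi_t:
                    return t
            elif rng.random() < pi_nt:
                # all teams busy: the case is lost to the system
                return t
        else:
            busy -= 1  # one team finishes tracing


def outbreak_probability(lam, mu, n_teams, pi_t, pi_nt, horizon,
                         runs=1000, seed=0):
    """Monte Carlo estimate of the probability of an outbreak by `horizon`."""
    rng = random.Random(seed)
    hits = sum(
        simulate_outbreak_time(lam, mu, n_teams, pi_t, pi_nt, horizon, rng)
        is not None
        for _ in range(runs)
    )
    return hits / runs


if __name__ == "__main__":
    # hypothetical parameter values, chosen only to show the qualitative effect
    for lam in (4.0, 2.0, 1.0):
        p = outbreak_probability(lam, mu=0.2, n_teams=10, pi_t=0.0,
                                 pi_nt=0.1, horizon=100.0)
        print(f"lambda={lam}: P(outbreak by t=100) ~ {p:.3f}")
```

with `pi_t = 0` the only route to the absorbing state is a case arriving while every team is busy, so lowering `lam` acts entirely through the probability of being at full capacity.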
imperfect contact tracing with limited capacity. in this case, epidemics break out at a rate λπ t when the contact tracing system is below capacity and at a rate λπ nt > λπ t when at capacity. by contrast with the imperfect, unlimited-capacity case, reducing λ has the benefit of reducing the probability of being at full capacity. however, since the containment system is not fully effective, reducing λ can still only delay the outbreak.

effective contact tracing with limited capacity. in this case, epidemics do not occur when the contact tracing system is below capacity but do occur at a rate λπ nt > 0 when at capacity. thus, by preventing the system from reaching full capacity, reducing λ can be very effective at preventing outbreaks. the effect of λ can also be highly nonlinear. indeed, if the queue were not truncated at n, the value λ = nμ would be a critical value where the system would change discretely from being a subcritical to a supercritical system.

figure . simplified model of an epidemic outbreak with contact tracing. this diagram outlines the basic evolution of a disease from emergence to epidemic outbreak in the presence of a contact tracing system. when the system is imperfect, each traced case has a positive probability of leading to an epidemic, regardless of the arrival rate of new cases, the rate at which cases are processed or the number of cases that can be processed at once. when the system is effective, an outbreak will only occur if the system's capacity is limited and not all newly arriving cases can be processed.

royalsocietypublishing.org/journal/rsif j. r. soc. interface :

to illustrate our findings, we perform simulations varying λ under each of these four different cases. all simulation parameters are given in table . we assume that, when capacity is limited, there are n = tracing teams and cases are handled by a team, on average, in days (i.e. μ = . ). we consider three infected case arrival rates: a baseline rate λ = , a moderate reduction λ = and a strong reduction λ = . when contact tracing is imperfect we assume that π t = .
%, and when contact tracing is effective we assume that π t = . %. in the absence of any contact tracing, we assume that π nt = %. the results for each case are displayed in figure , with the number of days on the horizontal axis and the probability of an epidemic outbreak on the vertical axis. the first row shows the first two cases, where contact tracing has unlimited capacity. in the left-hand panel, contact tracing is imperfect, and reducing λ only delays the outbreak. in the right-hand panel, contact tracing is effective and there is a low probability of an outbreak independent of λ. the second row shows the two cases where contact tracing has limited capacity. in the left-hand panel, contact tracing is imperfect and the result is qualitatively similar to the case with unlimited capacity; reducing λ only delays the epidemic, with a somewhat stronger effect in that the epidemic starts immediately in the absence of inflow restrictions. by contrast, when contact tracing is effective, the result is very different from that with unlimited capacity. this case is shown in the right-hand panel, where reducing λ can change the probability of an epidemic from virtual certainty to almost nil by preventing the tracing system from being overwhelmed.

in the figure, (a,c) display results from simulations where contact tracing is assumed to be less than fully effective, while (b,d) assume that it is fully effective.

the model makes a simplifying assumption that contact tracing resources are only used for imported cases. in practice, domestic cases also have to be contact traced, since not every imported case will be isolated before it causes secondary transmissions. in such a situation, contact tracing will be effective if it brings the effective reproduction number below 1, in which case standard branching process logic implies that the epidemic dies out before it ever becomes large.
we conjecture that our qualitative conclusions are not affected by allowing for domestic spread. just as before, travel restrictions are not important with unlimited contact tracing: either contact tracing is ineffective, in which case travel restrictions can only slow down the epidemic, or contact tracing is effective, in which case there will not be an epidemic regardless of travel restrictions. by contrast, if contact tracing is effective but has limited capacity, there can still be an epidemic because of an overload of the system, and travel restrictions will affect the probability that such an overload happens. the main difference from introducing domestic spread is that a larger set of parameters becomes relevant for policy. for example, the probability of being overloaded now will depend on how quickly imported cases are discovered, how infectious they are during the waiting time, how much they interact during the waiting time and how many infections each case causes on average, and also on factors such as the dispersion in the number of infections per case and dispersion in the time taken to process a case. we also assume that the capacity of the system is fixed over time. however, during the covid- pandemic, many countries have invested in expanded contact tracing capacity as the epidemic has progressed. an interesting extension to the model would be to allow for a gradual expansion of the contact tracing system. in this situation, it is likely that travel restrictions could adapt dynamically over time to become looser as the contact tracing system achieves a larger capacity. taking stock, we conclude that introducing capacity constraints can imply large changes in the effectiveness of inflow restrictions. instead of inflow restrictions leading to a gradual delay of an epidemic, there are nonlinear effects once the system goes below capacity. 
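the nonlinear effect of λ near capacity can be made concrete with the erlang b formula for the blocking probability of a loss system with n servers. the sketch below is an illustration added under an assumption that is not part of the paper: it treats the tracing queue as being at its stationary distribution, so the long-run outbreak hazard is approximated by λ[(1 − b)π t + b π nt], where b is the blocking probability. function names and the numbers in the usage lines are hypothetical.

```python
def erlang_b(n_teams, offered_load):
    """Blocking probability of an M/M/n/n loss system (Erlang B).

    Uses the standard recursion B(0) = 1,
    B(k) = a * B(k-1) / (k + a * B(k-1)), with a = lambda / mu.
    """
    b = 1.0
    for k in range(1, n_teams + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def stationary_outbreak_rate(lam, mu, n_teams, pi_t, pi_nt):
    """Approximate long-run outbreak hazard, assuming a stationary queue."""
    b = erlang_b(n_teams, lam / mu)
    return lam * ((1.0 - b) * pi_t + b * pi_nt)


if __name__ == "__main__":
    # hypothetical values: 10 teams, mean tracing time 1, effective tracing
    for lam in (10.0, 7.5, 5.0):
        rate = stationary_outbreak_rate(lam, mu=1.0, n_teams=10,
                                        pi_t=0.0, pi_nt=1.0)
        print(f"lambda={lam}: outbreak hazard ~ {rate:.4f}")
```

under these assumptions, halving λ from nμ reduces the hazard by far more than half, because the blocking probability itself falls steeply once the offered load drops below the number of teams.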
thus, in cases where systems are at the risk of being overrun, even moderate travel restrictions can be highly effective in reducing the risk for a local epidemic. while most rich countries are now beyond the point of preventing domestic outbreaks of covid- , we believe that the reasoning in this paper is still relevant for countries that have outbreaks that are later brought under control. in these cases, inflow restrictions may be helpful in preventing an epidemic from re-emerging, as they allow the country to stay below capacity in their contact tracing efforts. data accessibility. replication code for the simulation exercises is available upon request. assessing the impact of reduced travel on exportation dynamics of novel coronavirus infection (covid- ) the effect of travel restrictions on the spread of the novel coronavirus (covid- ) outbreak effectiveness of travel restrictions in the rapid containment of human influenza: a systematic review the feasibility of age-specific travel restrictions during influenza pandemics. theor human mobility networks, travel restrictions, and the global spread of h n pandemic mitigation measures for pandemic influenza in italy: an individual based model considering different scenarios delaying the international spread of pandemic influenza controlling pandemic flu: the value of international air travel restrictions small islands and pandemic influenza: potential benefits and limitations of travel volume reduction as a border control measure acknowledgements. we thank patrick kiernan and asma naeem for excellent research assistance. key: cord- -pg l zb authors: abueg, m.; hinch, r.; wu, n.; liu, l.; probert, w. j. m.; wu, a.; eastham, p.; shafi, y.; rosencrantz, m.; dikovsky, m.; cheng, z.; nurtay, a.; abeler-dörner, l.; bonsall, d. g.; mcconnell, m. v.; o'banion, s.; fraser, c. 
title: modeling the combined effect of digital exposure notification and non-pharmaceutical interventions on the covid- epidemic in washington state date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: pg l zb contact tracing is increasingly being used to combat covid- , and digital implementations are now being deployed, many of them based on apple and google's exposure notification system. these systems are new and are based on smartphone technology that has not traditionally been used for this purpose, presenting challenges in understanding possible outcomes. in this work, we use individual-based computational models to explore how digital exposure notifications can be used in conjunction with non-pharmaceutical interventions, such as traditional contact tracing and social distancing, to influence covid- disease spread in a population. specifically, we use a representative model of the household and occupational structure of three counties in the state of washington together with a proposed digital exposure notifications deployment to quantify impacts under a range of scenarios of adoption, compliance, and mobility. in a model in which % of the population participated, we found that digital exposure notification systems could reduce infections and deaths by approximately % and %, effectively complementing traditional contact tracing. we believe this can serve as guidance to health authorities in washington state and beyond on how exposure notification systems can complement traditional public health interventions to suppress the spread of covid- . the covid- pandemic has brought about tremendous societal and economic consequences across the globe, and many areas remain deeply affected. due to the urgency and severity of the crisis, the poorly understood long-term consequences of the virus, and the lack of certainty about which control measures will be effective, many approaches to stopping or slowing the virus are being explored. 
in seeking solutions to this problem, many technology-based non-pharmaceutical interventions have been considered and deployed ( ) , including data aggregation to track the spread of the disease, gps-enabled quarantine enforcement, ai-based clinical management, and many others. contact tracing, driven by interviews of infected persons to reveal their interactions with others, has been a staple of epidemiology and public health for the past two centuries ( ) . these human-driven methods have been brought to bear against covid- since its emergence, with some success ( ) . unfortunately, owing in part to the rapid and often asymptomatic spread of the virus, these efforts have not been successful in preventing a global pandemic. further, as infections have reached into the millions, traditional contact tracing resources have been overwhelmed in many areas ( ) ( ) . given these major challenges for traditional contact tracing, technology-based improvements are being explored, with particular focus on the use of smartphones to detect exposures to others carrying the virus. smartphone apps may approximate pathogen exposure risk through the use of geolocation technologies such as gps, and/or via proximity-based approaches using localized radio frequency (rf) transmissions like bluetooth. location-based approaches attempt to compare the places a user has been with a database of high-risk locations or overlaps with infected people ( ) , while proximity-based approaches directly detect nearby smartphones that can later be checked for "too close for too long" exposure to infected people ( ) . in either approach, users who are deemed to be at risk are then notified, and in some implementations, health authorities also receive this information for follow-up. due to accuracy and privacy concerns, the majority of contact tracing proposals have avoided the location signal and focused on a proximity-based approach, such as pepp-pt ( ) and nshx ( ) . 
further privacy safeguards may be achieved by decentralizing and anonymizing important elements of the system, as in dp- t ( ) and apple and google's exposure notifications system (ens) ( ) . in these approaches, the recognition of each user's risk level can take place only on the user's smartphone, and server-side knowledge is limited to anonymous, randomized ids. technological solutions in this space have never been deployed at scale before, and their effectiveness is unknown. there is an acute need to understand their potential impact, to establish and optimize their behavior as they are deployed, and to harmonize them with traditional contact tracing efforts. specifically, we will examine these issues in the context of ens, which is currently being adopted by many countries ( ) . there are many variables to consider when characterizing the behavior of any system of this type. technology-dependent parameters, such as those needed to convert bluetooth signal strength readings to proximity ( ) ( ) , vary from device to device and require labor-intensive calibration. they will not be discussed in this paper. here we seek to explore the general conditions and public health backdrop in which an ens deployment may exist, and the policy characteristics that can accompany it. in order to improve our understanding of this new approach, we employ individual-based computational models, also known as agent-based models, which allow the exploration of disease dynamics in the presence of complex human interactions, social networks, and interventions ( ) . this technique has been used to successfully model the spread of ebola in africa ( ) , malaria in kenya ( ) , and influenza-like illness in several regions ( ) ( ) , among many others. in the case of covid- , the openabm-covid model by hinch et al. ( ) has already been used to explore smartphone-based interventions in the united kingdom.
this model seeks to simulate individuals and their interactions in home, work, and community contexts, using epidemiological and demographic parameters as a guide. in this work, we adapt the openabm-covid model to simulate the ens approach and apply it to data from washington state in the united states in order to explore possible outcomes. we use data at the county level to match the population, demographic, and occupational structure of the region, and calibrate the model with epidemiological data from washington state and google's community mobility reports for a time-varying infection rate ( ) . similar to hinch et al., we find that digital exposure notification can effectively reduce infections, hospitalizations, and deaths from covid- at all levels of participation. we extend the findings by hinch et al. to show how digital exposure notification can be deployed concurrently with traditional contact tracing and social distancing to suppress the current epidemic and aid in various "reopening" scenarios. we believe the demographic and occupational realism of the model and its results have important implications for the public health of washington state and other health authorities around the world working to combat covid- . to model the combined effect of digital exposure notification and other non-pharmaceutical interventions (npis) in washington state, we use a model first proposed by hinch et al. ( ) , who have also made their code available as open source on github ( ) . openabm-covid is an individual-based model that models interactions of synthetic individuals in different types of networks based on the expected type of interaction (fig. ) . workplaces, schools, and social environments are modeled as watts--strogatz small-world networks ( ) , households are modeled as separate fully connected networks, and random interactions, such as those on public transportation, are modeled in a random network. 
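to illustrate the small-world structure used for workplaces, schools, and social environments, the following is a minimal standard-library sketch of the watts-strogatz construction: a ring lattice in which each node is linked to its k nearest neighbours, followed by random rewiring of each edge with probability p. it is not the openabm-covid implementation; the function name and the parameter values in the usage line are hypothetical.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Build a Watts-Strogatz small-world graph as a dict of adjacency sets.

    n: number of nodes; k: even number of nearest neighbours in the ring
    (requires k < n); p: probability of rewiring each lattice edge.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # ring lattice: connect each node to k/2 neighbours on each side
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    # rewire each lattice edge (i, i+j) with probability p
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < p:
                candidates = [v for v in range(n)
                              if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


# hypothetical workplace network: 100 workers, 6 lattice neighbours each
workplace = watts_strogatz(100, 6, 0.1, seed=42)
```

rewiring preserves the total number of edges, so the mean number of contacts per person stays at k while a small share of long-range links shortens typical path lengths.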
the networks are parameterized such that the average number of interactions matches the age-stratified data in ( ) . contacts between synthetic individuals in those interaction networks have the potential for transmission of the virus that causes covid- and are later recalled for contact tracing and possible quarantine. while the original model by hinch et al. ( ) included a single occupation network for working adults, we extend this to support multiple networks for workplace heterogeneity. this is motivated by increasing evidence that workplace characteristics play an important role in the spread of sars-cov- , such as having to work in close physical proximity to other coworkers and interacting with the public. baker et al. found that certain u.s. working sectors experience a high rate of sars-cov- exposure, including healthcare workers, protective services (e.g., police officers), personal care and services (e.g., child care workers), community and social services (e.g., probation officers) ( ) . as another example, the centers for disease control and prevention (cdc) has issued specific guidance to meat and poultry processing workers due to the possible increased exposure risk in those environments ( ) . therefore, we model each individual industry sector as its own small-world network and parameterize it with real-world data such as the sector size and interaction rates. in openabm-covid , transmission between infected and susceptible individuals through a contact is determined by several factors, including the duration since infection, susceptibility of the recipient (a function of age), and the type of network where it occurred (home networks assume a higher risk of transmission due to the longer duration and close proximity of the exposure). individuals progress through stages of susceptible, infected, recovered, or deceased. 
in this model, the dynamics of progression through these stages are governed by several epidemiological parameters, such as the incubation period, disease severity by age, asymptomatic rate, and hospitalization rate, and are based on the current literature of covid- epidemiology. a complete list of the epidemiological parameters can be found at ( ) and any modifications to those are described in the subsequent sections and documented in the supplementary materials (table s , s ). in this work we model the three largest counties in washington state --king, pierce, and snohomish --with separate and representative synthetic populations. the demographic and household structure were based on data from the u.s. census of population and housing ( ) and the - acs public use microdata sample ( ) . we combined census and public use microdata sample (pums) data using a method inspired by ( ) . for each census block in washington state we took distributions over age, sex, and housing type from several marginal tables (called census summary tables) and from the pums, and combined them into a multiway table using the iterative proportional fitting (ipf) algorithm. we then resampled the households from the pums to match the probabilities in the multiway table. the resulting synthetic population in each census block respects the household structure given by pums and matches marginals from the census summary tables. our synthetic working population was drawn to match the county-level industry sector statistics reported by the u.s. bureau of labor statistics in their quarterly census of employment and wages for the fourth quarter of ( ) . we also used a report by the washington state department of health (doh) containing the employment information of lab-confirmed covid- cases among washington residents as of may , to parameterize each occupation sector network ( ) . 
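the iterative proportional fitting step mentioned above can be sketched for the simplest two-way case. the paper fits a multiway table over age, sex, and housing type; the version below only shows the core idea of alternately rescaling rows and columns of a seed table until both sets of marginals are matched. the function name and all numbers are hypothetical.

```python
def ipf(seed_table, row_targets, col_targets, iterations=100, tol=1e-10):
    """Two-way iterative proportional fitting.

    Rescales a positive seed table so that its row sums match row_targets
    and its column sums match col_targets (targets must share one total).
    """
    table = [row[:] for row in seed_table]
    for _ in range(iterations):
        # scale each row to its target sum
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [x * target / s for x in table[i]]
        # scale each column to its target sum
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # columns are now exact; stop once rows are also close enough
        err = max(abs(sum(table[i]) - row_targets[i])
                  for i in range(len(row_targets)))
        if err < tol:
            break
    return table


# hypothetical example: a sample cross-tabulation (seed) fitted to
# marginal totals for one block
fitted = ipf([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [30.0, 70.0])
```

the fitted table keeps the association pattern of the seed while matching the target marginals, which is the property relied on when households are subsequently resampled to match it.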
for each sector, we use its lab-confirmed case number weighted by the total employment size as a multiplier factor to adjust the number of work interactions of that occupational network. while the doh report does not explicitly measure exposure risk for different industries, it is, to the best of our knowledge, the best source of data for confirmed covid- cases and occupations to date. our model should be refined with better data from future work that studies the causal effect of workplace characteristics on covid- transmission. a complete list of the occupation sectors and interaction multipliers can be found in the supplementary materials (table s ,s ). testing and quarantine in the openabm-covid model, if an individual presents with covid- symptoms, they receive a test and are % likely to enter a voluntary -day isolation with a % drop out rate each day for noncompliance. if the individual receives a positive test result, they isolate for a full days from initial exposure with a daily drop out rate of %. prior to confirmation of the covid- case via a test result, the household members of the voluntarily self-isolating symptomatic individual do not isolate, which is in line with current recommendations by the cdc ( ) . household quarantines may still occur through digital exposure notification or manual contact tracing, described in the following sections. we simulate digital exposure notification in openabm-covid by broadcasting exposure notifications to other users as soon as an app user either tests positive or is clinically diagnosed with covid- during hospitalization. the model recalls the interaction networks of this app user, known as the "index case", to determine their first-order contacts within the previous days. those notified contacts are then % likely to begin a quarantine until days from initial exposure with a % drop out rate each day for noncompliance. see ( ) for a more comprehensive description of the model. 
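the effect of a constant daily drop-out rate on quarantine adherence can be worked through with a short sketch. it assumes, as a simplification that may differ from the model's exact bookkeeping, that an individual starts quarantine with probability `p_start` and then independently drops out each subsequent day with probability `daily_dropout`. the function names and the parameter values are hypothetical and are not the values used in the paper.

```python
def prob_still_quarantined(p_start, daily_dropout, day):
    """Probability of still being in quarantine after `day` full days."""
    return p_start * (1.0 - daily_dropout) ** day


def expected_quarantine_days(p_start, daily_dropout, duration):
    """Expected number of days actually spent in quarantine."""
    return sum(prob_still_quarantined(p_start, daily_dropout, d)
               for d in range(duration))


# hypothetical values: 80% initial compliance, 2% daily drop-out, 14 days
adherence_at_end = prob_still_quarantined(0.8, 0.02, 14)
mean_days = expected_quarantine_days(0.8, 0.02, 14)
```

because drop-out compounds geometrically, even a small daily rate noticeably reduces the share of people who complete a full quarantine period.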
while the actual ens allows health authorities to configure notifications as a function of exposure distance and duration, our model does not have the required level of resolution and instead assumes that % of all "too close for too long" interactions are captured between users that have the app. (see the supplemental materials for a sensitivity analysis of this parameter.) the overall effect of digital exposure notification depends on a number of factors that we explore in this work, including the fraction of the population that adopts the app and the delay between infection and exposure notification. as an upper bound on app adoption, we configure the age-stratified smartphone population using data on smartphone ownership from the u.s. from the pew research center ( ) for ages + and common sense media ( ) for ages - . since this data was not available for washington state specifically we assumed that the u.s. distribution was representative of washington state residents. we also extend openabm-covid to model traditional or "manual" contact tracing as a separate intervention. in contrast to digital exposure notification, human tracers work directly with index cases to recall their contact history without the proximity detection capabilities of a digital app. those contacts are then given the same quarantine instructions as those traced through the digital app. we configure the simulation such that manual contact tracers have a higher likelihood of tracing contacts in the household and workplace/school networks ( % and %, respectively) than for the additional random daily contacts ( %). this is based on the assumption that people will have better memory and ability to identify contacts in the former (e.g., involving family members or coworkers) compared to the latter (e.g., a random contact at a restaurant). 
additionally, we configure the capacity of the contact tracing workforce with parameters for workforce size, maximum number of index-case interviews per day, and maximum number of tracing notification calls per day following those interviews. tracing is initiated on an index case after either a positive test or hospitalization, subject to the capacity in that area. finally, we add a delay parameter between initiation of manual tracing and finally contacting the traced individuals to account for the processing and interview time of manual tracing. model calibration is the process of adjusting selected model parameters such that the model's outputs closely match real-world epidemiological data. to calibrate openabm-covid for washington state we use components of a bayesian seir model by liu et al. ( ) for modeling covid- . they extend the classic seir model by allowing the infection rate to vary as a function of human mobility and a latent changepoint to account for unobserved changes in human behavior. we fit that model to washington state county-level mortality data from the new york times ( ) and mobility data from the community mobility reports published by google and publicly available at ( ) . the community mobility reports are created with aggregated, anonymized sets of data from users who have turned on the location history setting, which is off by default. no personally identifiable information, such as an individual's location, contacts or movement, is ever made available ( ) . the reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. we note that, because of the opt-in nature of this dataset, it may not be representative of the overall population. we extend the methodology in liu et al. 
to model calibration in openabm-covid by applying the time-varying infection rate coefficients to the relevant county-specific parameters that guide user interaction levels and disease transmission likelihood. more specifically, the number of daily interactions in the random and occupation networks, $i_r(t)$ and $i_w(t)$, are scaled by the mobility coefficient, $m(t)$, at time step $t$, which is calculated based on the aggregated and anonymized location visits from the community mobility reports. the time-dependent infectious rate, $\beta(t)$, is scaled by a weighting term, $\sigma(t)$, that depends on how far time step $t$ is from a learned changepoint, which is modeled as a negative sigmoid. both $\sigma(t)$ and $m(t)$ are learned functions and are described in more detail in ( ) . finally, we use an exhaustive grid search to compute two openabm-covid parameters for each county: its initial infectious rate and the infection seed date. the infectious rate is the mean number of individuals infected by each infectious individual with moderate-to-severe symptoms, and can be considered a function of population density and social mixing. the infection seed date is the date at which the county reaches total infections, possibly before the first official cases due to asymptomatic and unreported cases. we pick the parameters where the simulated mortality best matches the actual covid- mortality from epidemiological data, as measured by root-mean-square error (rmse). the results of the calibrated models for king, pierce, and snohomish counties are shown in fig. . note that while there is a strong correlation between the predicted and reported incidence, the absolute predicted counts are approximately x higher than those that were officially reported. we attribute this difference to the fact that openabm-covid is counting all asymptomatic and mildly symptomatic cases that may not be recorded in reality.
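the exhaustive grid search over the infectious rate and the seed date, scored by rmse against observed mortality, can be sketched as below. this is only an illustration under stated assumptions: `toy_simulate` is a hypothetical stand-in for a full simulation run, and the candidate grids are invented for the example.

```python
import itertools
import math


def rmse(simulated, observed):
    """Root-mean-square error between two equally long series."""
    return math.sqrt(
        sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)
    )


def grid_search(simulate, observed_deaths, rates, seed_days):
    """Try every (rate, seed_day) pair and keep the one with the lowest RMSE."""
    best = None
    for rate, seed_day in itertools.product(rates, seed_days):
        err = rmse(simulate(rate, seed_day), observed_deaths)
        if best is None or err < best[0]:
            best = (err, rate, seed_day)
    return best  # (rmse, rate, seed_day)


def toy_simulate(rate, seed_day, n_days=30):
    """Hypothetical stand-in for a simulation run: exponential growth of
    cumulative deaths that starts on seed_day."""
    return [
        0.0 if day < seed_day else math.exp(rate * (day - seed_day)) - 1.0
        for day in range(n_days)
    ]
```

in practice each call to `simulate` would be a full (and stochastic) model run, so the search cost grows with the product of the two grid sizes.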
this gap between predicted and reported counts is approximately consistent with the results of a seroprevalence study by the cdc that estimated that there were to times more infections than official case report data ( ) .
the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license. this version posted september , .
in this section we present several forward-looking simulations for washington state counties by comparing multiple hypothetical scenarios that implement some combination of digital exposure notification, manual contact tracing, or social distancing. each simulation uses the same calibrated model parameters up to july , at which point the hypothetical interventions are implemented. beyond this date, each simulation uses the model parameters from the last week of the calibration period, except where explicitly specified as part of the intervention. for each simulated intervention we report the number of infections (daily and cumulative), cumulative number of deaths, number of hospitalizations, number of tests per day, and fraction of the population in quarantine. each simulation covers consecutive days from march , through dec , , plus the additional calibrated seeding period before march . unless otherwise stated, the reported result is the mean value over runs with different random seeds of infection. note that results may be affected by the end date of the simulation because of the time it takes some interventions to have their full effect. we believe that a time horizon of approximately and a half months is long enough to be practically useful for public health agencies who are considering deploying such interventions, but short enough to minimize the long-term uncertainty and effects of externalities such as a vaccine becoming available.
we first study the effect of a digital exposure notification app at different levels of app adoption -- %, %, %, %, and % (or all smartphone owners) -- of the population in each county. as a baseline, we compare those results to the "default" scenario without digital exposure notification and assume no change in behavior or interventions beyond july , . the results show an overall benefit of digital exposure notification at every level of app adoption ( fig. and ) . when compared to the default scenario of only self isolation due to symptoms, each scenario results in lower overall incidence, mortality, and hospitalizations. unsurprisingly, the effect on the epidemic is more significant at higher levels of app adoption. an app with % adoption reduces the total number of infections by - %, - %, and - % and the number of total deaths by - %, - %, and - % for king, pierce, and snohomish counties, respectively. even at a relatively low level of adoption of %, total infections are reduced by . - . %, . - . %, and . - . % and total deaths are reduced by . - . %, . - . %, and . - . % for king, pierce, and snohomish counties, respectively.
in addition to its effect on the epidemic, we also evaluate the trade-off between exposure notification app adoption and the total number of quarantine events. there is an incentive to minimize the quarantine rate because of the perceived economic and social consequences of stay-at-home orders.
at % exposure notification app adoption the number of total quarantine events increases by . - . %, . - . %, and . - . % for king, pierce, and snohomish counties (fig. ) . in general, the higher the level of exposure notification adoption the greater the number of total quarantine events, with the exception of very high levels of adoption ( % and %) where this number plateaus or even decreases, likely due to the significant effect of the intervention in suppressing the overall epidemic in those scenarios. from another perspective, achieving epidemic control at the price of high initial quarantine is preferable to lower levels of quarantine that are sustained for much longer.
fig. : estimated total quarantine events of king, pierce, and snohomish counties for various levels of exposure notification app uptake among the population from july , to december , . note that even for the "default" ( % en app uptake) scenario there is a non-zero number of quarantine events because this assumes that symptomatic and confirmed covid- positive individuals will self-quarantine at a rate of %, even in the absence of an app.
next we study the potential impact of manual contact tracing in suppressing the epidemic as a function of the contact tracing workforce size. manual tracing with the full desired staffing levels of workers per , people is able to affect the epidemic trend in all three counties, but has a significantly smaller effect at current staffing levels (fig.
) . unsurprisingly, the impact for a given level of staffing is dependent upon the current epidemic trend, reinforcing the need for concurrent interventions to effectively manage the epidemic. additionally, we compare the performance of exposure notification to manual contact tracing to establish similarities between relative staffing level and exposure notification adoption and to verify an additive effect of concurrent manual tracing and exposure notification. we see improvements in all cases when combining interventions (fig. ) . in all three counties, exposure notification has a stronger effect at the given staffing and adoption levels, but adding either intervention to the other results in reduced infections, albeit to different extents based on the trend of the epidemic. this suggests that both methods are useful separately and combined, even if they do not explicitly coordinate.
while the results shown above suggest that the interventions are effective in suppressing the covid- epidemic to various degrees, in practice, health organizations will implement multiple intervention strategies simultaneously to try to curb the spread of the virus while also allowing controlled reopenings. therefore, we also study the combined effect of concurrent interventions including digital exposure notification, manual contact tracing, and social distancing (fig. ) . we model social distancing as a function of infectiousness of interactions in the random and occupation networks, where increasing social distancing decreases the relative transmission likelihood on a network by a multiplicative factor relative to its value as of march , (i.e., before broad-based social distancing and mobility reductions). for example, social distancing of . x is equivalent to multiplying the relative transmission by / . = . . note that this does not change the number of person-to-person interactions, but rather the likelihood of transmission of any individual encounter, which may be affected by factors other than physical distancing such as mask usage, improved hygiene, use of personal protective equipment, etc.
next we examine the effects of combined npis under various "reopening" scenarios by gradually increasing the number of interactions in every interaction network, including households, workplaces, schools, and random networks. specifically, we increase these interactions by a given percentage from the levels as of july , ( % reopen) up to the initial levels at march , , at the very start of the epidemic ( % reopen). given the average number of interactions for a network at the end of the baseline and before the lockdown, a reopening of a given percentage raises the baseline level by that percentage of the difference between the two. the increase in new infections from a - % reopening is balanced by - % exposure notification app adoption, although the effect varies by county (fig. ). this shows that limited additional reopenings may be possible after introducing exposure notification alongside existing fully staffed manual tracing ( staff per , people), but that social distancing remains an important measure under these circumstances. additionally, there is an increased effect of adding exposure notification under greater reopening scenarios. as an example, we plot some primary metrics for a % network reopening and see significant reductions in nearly all metrics at even % adoption (fig. ) .
fig. .
estimated total infected percentage, total deaths, and peak hospitalized under a % reopening scenario (an increase of % of the difference between pre-lockdown and post-lockdown network interactions) at various exposure notification adoption rates for king, pierce, and snohomish counties, assuming no change to social distancing $\beta(t)$ after the baseline and manual contact tracers per k people.
as part of the washington state department of health's "safe start" plan, a key target metric to reopen washington is to reach fewer than new cases per , inhabitants over the prior two weeks ( ) . here, we examine how many days it would take to reach that target under the combined npis. with the recent spike in cases, the trajectory for reaching these targets without renewed lockdowns is out of the range of the simulations. therefore, to show the relative benefits of the npis, we introduce an artificial renewed lockdown at the mobility levels averaged over the month before the phase reopenings (phase . for king county) that occurred on june , . using this averaged mobility from may to june , , we model the relative effects of manual tracing and exposure notification on the washington safe start key metric. we find that for all three counties, manual contact tracing at the recommended staffing levels combined with an exposure notification app can significantly reduce the amount of time it takes to achieve this metric (fig. ) . under the recommended standard for manual tracing, adding exposure notification at % adoption results in reaching the target in %, %, and % of the time versus no exposure notification for king, pierce, and snohomish counties respectively. at the reduced levels of .
tracers per , population, the target is reached in less than % and % of the time for king and snohomish respectively, although the exact ratio cannot be calculated as the metric is not achieved in the baseline simulation.
our individual-based modeling approach attempts to simulate the behavior of humans in a complex environment, in order to better understand the relative effects of different levels of intervention. while we have attempted to add realistic elements and calibrate it with the best available data, it still represents a dramatic simplification of the real world. choices and simplifications made surrounding the behavior of the individuals, their movements in the world, disease dynamics, and many others, mean that the results should be viewed as an exploration of possible outcomes, not a prediction ( ) .
a more specific limitation in our work is that we modeled each county separately without cross-county interactions. in particular, we did not model how cross-county human movement contributes to the spread of disease. we plan to explore this effect in our future work.
our simulations assume that it takes days from symptom onset to receive a covid- test result and we acknowledge that this is a key assumption underlying our findings. ferretti et al. ( ) showed that the delay from initial exposure to case confirmation, notification, and quarantine has a significant impact on the efficacy of the intervention. rapid testing protocols can shorten the time between symptom development and case confirmation, and are essential for epidemic control ( ) .
we used published covid- mortality data to calibrate model parameters.
while the death count is arguably a good proxy for the true infection numbers, the published mortality data are sparse and noisy in small counties, which makes those counties difficult to model accurately. the synthetic occupation networks are based on the latest employment data corresponding to the fourth quarter in ( ) . since the beginning of the pandemic, the size and structure of occupation networks may have changed compared to the latest available data.
in our work we used the mobility data along with a changepoint to model time-varying infection rates. while the changepoint vector models the net effect of various latent factors, it may be limited when multiple change points or more complex latent factors exist. the derived time-varying infection rate is applied homogeneously to the random and occupational networks. this is an approximation of reality, where the change may vary across networks.
the compartmental modeling approach ( ) ( ) ( ) has been widely used for epidemic study. this approach segments the total population into subgroups according to disease progression stage and models the transitions between stages with differential equations. seir (susceptible-exposed-infected-recovered) ( ) ( ) ( ) ( ) is a common type of compartmental model used to study covid- spread. however, this approach is not suitable for studying the impact of individual-level interventions like exposure notification apps because it characterizes the disease dynamics at the population level.
in contrast to the compartmental model, the individual-based modeling approach ( , , , - ) simulates the infectious disease progression of individuals and can consider demographics, social interactions, and the environment.
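for contrast with the individual-based approach, the compartmental seir formulation mentioned above can be sketched as a small system of differential equations integrated with forward euler steps. this is a generic textbook sketch, not a model from any of the cited works; the parameter names `beta`, `incubation_rate`, and `recovery_rate` and the step size are assumptions made for the example.

```python
def seir_step(s, e, i, r, beta, incubation_rate, recovery_rate, dt):
    """Advance the four compartments by one forward-Euler step of size dt."""
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt          # S -> E
    new_infectious = incubation_rate * e * dt    # E -> I
    new_recovered = recovery_rate * i * dt       # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def simulate_seir(s0, e0, i0, r0, beta, incubation_rate, recovery_rate,
                  days, dt=0.1):
    """Return the list of (S, E, I, R) states, one per Euler step."""
    state = (s0, e0, i0, r0)
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = seir_step(*state, beta, incubation_rate, recovery_rate, dt)
        trajectory.append(state)
    return trajectory
```

because every individual in a compartment is treated identically, there is no natural place in this formulation to represent which specific contacts an app user has had, which is the limitation the text points to.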
these individual-based models can predict the spread of covid- in multiple countries by fitting the stochastic model of disease progression and human interactions from historical data. however, the impact of additional interventions such as digital exposure notification is unexplored. in ( ) ( ) , disease transmission is modeled by a stochastic process to fit the reproduction number of the total population. however, manipulating the reproduction number by real contact tracing actions can be challenging as it is subject to human interaction patterns, adoption rate, and many other types of interventions. this model lacks the characteristics of individuals as it uses the mean field theory to approximate the total population. ( ) ( ) ( ) study contact tracing by situating individuals randomly in a space and mimicking human contacts by the individual's collision from the spatial movement. while this spatial individual-based model reveals promising results in virus spread in relatively small and closed areas, such as public buildings ( ) , and cruise ships ( ) , the ad-hoc assumptions in individual mobility patterns are not suitable for studying the impact of contact tracing in the scale of a city. ( ) introduces the spatial temporal model which has more realistic mobility patterns. however, the spatial movement used in these models is a simplification of contact tracing which lacks the individual interactions among family members, workmates and from random activities. the effectiveness of manual and digital contact tracing is discussed in ( ) through empirical contact data collected from the work related network at a small scale, without considering virus spread among family members and other random interactions. the references ( ) ( ) are the closest to ours, but they do not cover the joint impact of manual and digital contact tracing. in addition, model calibration is missing in their case studies. 
in contrast, openabm-covid ( ) simulates concurrent manual contact tracing and digital exposure notification interventions over interaction networks at a large scale.
in this study we conducted a model-based estimation of the potential impact of a digital exposure notification app in washington state. openabm-covid simulates interactions among synthetic agents in various small-world networks, representing households, workplaces, schools, and random interactions. interactions in those networks can result in covid- transmission and are recalled to simulate different tracing interventions, including "manual" contact tracing or digital exposure notification, such as the recently released apple and google exposure notifications system (ens). we calibrated our model using real-world data on human mobility and showed how it can accurately match epidemiological data in washington state's three largest counties, king, pierce, and snohomish.
similar to hinch et al.'s report on digital contact tracing in the uk ( ) , we found that a digital exposure notification app can meaningfully reduce infections, deaths, and hospitalizations in these washington state counties at all levels of app uptake, even if only a small fraction of the eligible population participates. we also showed how digital exposure notification can be combined with manual contact tracing at the recommended levels to further suppress the epidemic, even if the two interventions do not explicitly coordinate. our simulations showed that the simultaneous deployment of both interventions can help these washington counties meet the key incidence metric defined by the safe start washington plan before december, .
the potential overall effect of digital exposure notification seems to be greater than even optimal levels of manual contact tracing, likely because of its ability to scale and better identify random interactions. we also found that quarantine rates, which contribute to the social and economic cost of these interventions, scale sublinearly with app adoption, meaning that in some cases there are fewer people quarantined even though a greater fraction of the population is participating in the app. we credit this effect to the success of the app at suppressing the epidemic at high levels of adoption. given a longer simulation time horizon we may see a similar effect even at the lower levels of app adoption. health authorities may consider this when appealing to the public by explaining how greater rates of collective participation may reduce the severity of the epidemic while also minimizing or reducing the need for quarantine. finally, we looked at the combined effects of digital exposure notification and manual tracing in the context of different reopening scenarios, where mobility and interaction levels increase to the pre-epidemic levels. our results suggest that both interventions are helpful in counterbalancing the effect of reopening, but are not totally sufficient to offset new cases except at very high levels of adoption and manual tracing staffing. as a result we believe that continued social distancing and limiting person-to-person interactions is essential. future work is needed to study targeted reopening strategies, such as reopening specific occupation sectors or schools, or more stringent social distancing interventions in places that do reopen. looking ahead to future work, we are considering the question of coordination between different regions when deploying digital exposure notification as part of a suite of non-pharmaceutical interventions. 
the united states has seen a highly spatially varied response to the covid- pandemic, with significant consequences to epidemic control ( ) . under the conditions of varying cross-county and cross-state flows, we seek to quantify the empirical efficiency gap between coordinated and uncoordinated deployments and policies around testing, tracing, and isolation in which a digital exposure notification system can aid. in particular, the beginning of such cross-state collaborations is evident in the consortia of state governments such as the western states pact and a multi-state council in the northeast, both working together to coordinate their responses. we expect that coordinated deployments of digital exposure notification applications and public policies may lead to more effective epidemic control as well as more efficient use of limited testing and isolation resources.
applications of digital technology in covid- pandemic planning and response. the lancet digital health john haygarth's th-century "rules of prevention" for eradicating smallpox spread of sars-cov- in the icelandic population israel's contact tracing system said to be vastly overwhelmed by virus spread california's plan to trace travelers for virus faltered when overwhelmed, study finds a case for participatory disease surveillance of the covid- pandemic in india nist pilot too close for too long (tc tl) challenge evaluation plan dp t -decentralized privacy-preserving proximity tracing privacy-preserving contact tracing a flood of coronavirus apps are tracking us.
now it's time to keep track of them using bluetooth low energy (ble) signal strength estimation to facilitate contact tracing for sc ' : proceedings of the acm/ieee conference on supercomputing spatiotemporal spread of the outbreak of ebola virus disease in liberia and the effectiveness of non-pharmaceutical interventions: a computational modelling analysis simulation of malaria epidemiology and control in the highlands of western kenya comparing large-scale computational approaches to epidemic modeling: agent-based versus structured metapopulation models flute, a publicly available stochastic influenza epidemic simulation model effective configurations of a digital contact tracing app: a report to nhsx agent-based model for modelling the covid- epidemic collective dynamics of "small-world" networks social contacts and mixing patterns relevant to the spread of infectious diseases estimating the burden of united states workers exposed to infection or disease: a key factor in containing risk of covid- infection cdc, meat and poultry processing workers and employers openabm-covid /baseline_parameters_transpose.csv at v census of population and housing creating synthetic baseline populations. transportation research part a: policy and practice covid- confirmed cases by occupation and industry estimating the changing infection rate of covid- using bayesian models of mobility the new york times, coronavirus in the u.s.: latest map and case count google covid- community mobility reports: anonymization process description (version . 
) seroprevalence of antibodies to sars-cov- in sites in the united states king county safe start application moving from modified phase to phase building covid- contact tracing capacity in health departments to support reopening american society safely safe start washington -phased reopening county-by-county agent-based modeling in public health: current applications and future directions quantifying sars-cov- transmission suggests epidemic control with digital contact tracing a contribution to the mathematical theory of epidemics estimation of parameters in a structured sir model epidemic analysis of covid- in china by dynamical modeling modified seir and ai prediction of the epidemics trend of covid- in china under public health interventions a modified seir model to predict the covid- outbreak in spain and italy: simulating control scenarios and multi-scale epidemics revealing covid- transmission in australia by sars-cov- genome sequencing and agent-based modeling sustainable and resilient strategies for touristic cities against covid- : an agent-based approach predicting the impacts of epidemic outbreaks on global supply chains: a simulation-based analysis on the coronavirus outbreak (covid- /sars-cov- ) case an agent-based epidemic model reina for covid- to identify destructive policies social network analysis and agent-based modeling in social epidemiology modelling disease outbreaks in realistic urban social networks covasim: an agent-based model of covid- dynamics and interventions covid-abs: an agent-based model of covid- epidemic to simulate health and economic effects of social distancing interventions modeling covid- on a network: super-spreaders, testing and containment modelling transmission and control of the covid- pandemic in australia enhancing response preparedness to influenza epidemics: agent-based study of influenza season in switzerland. 
simulation modelling practice and theory initial simulation of sars-cov spread and intervention effects in the continental us impact of delays on effectiveness of contact tracing strategies for covid- : a modelling study feasibility of controlling covid- outbreaks by isolation of cases and contacts. the lancet global health strategies for containing an emerging influenza pandemic in southeast asia strategies for mitigating an influenza pandemic an agent-based model to evaluate the covid- transmission risks in facilities how to restart? an agent-based simulation model towards the definition of strategies for covid- "second phase how many infections of covid- there will be in the "diamond princess"-predicted by a virus transmission model based on the simulation of crowd flow a spatiotemporal epidemic model to quantify the effects of contact tracing, testing, and containment effect of manual and digital contact tracing on covid- outbreaks: a study on empirical contact data modelling the impact of testing, contact tracing and household quarantine on second waves of covid- interdependence and the cost of uncoordinated responses to covid- key: cord- -jmpterrj authors: eilersen, andreas; sneppen, kim title: cost–benefit of limited isolation and testing in covid- mitigation date: - - journal: sci rep doi: . /s - - - sha: doc_id: cord_uid: jmpterrj the international community has been put in an unprecedented situation by the covid- pandemic. creating models to describe and quantify alternative mitigation strategies becomes increasingly urgent. in this study, we propose an agent-based model of disease transmission in a society divided into closely connected families, workplaces, and social groups. this allows us to discuss mitigation strategies, including targeted quarantine measures. we find that workplace and more diffuse social contacts are roughly equally important to disease spread, and that an effective lockdown must target both. 
we examine the cost–benefit of replacing a lockdown with tracing and quarantining contacts of the infected. quarantine can contribute substantially to mitigation, even if it has short duration and is done within households. when reopening society, testing and quarantining is a strategy that is much cheaper in terms of lost workdays than a long lockdown. a targeted quarantine strategy is quite efficient with only days of quarantine, and its effect increases when testing is more widespread.
within families, workplaces, and friend groups, everyone is assumed to know everyone. each agent is assigned one family and workplace, as well as two groups of friends. workplaces on average contain ten people, whereas each friend group on average contains five. in the simulation runs presented here, we use a population of n = agents. increasing the number of agents changes the outcome very little, except for minimising stochastic noise. we also do not allow migration in or out of the system.
we use a discrete-time stochastic algorithm. at each time-step ( . days), each person has one interaction with some other person. a "die roll" decides whether the person will interact with family, friends, work, or the public. the respective odds are the above-mentioned percentages : : : . if the public is chosen, an entirely random person is selected, otherwise a person is drawn from a predefined group (family etc.). for each interaction, an infectious person has a fixed probability of passing on the disease to the person they interact with. the family size distribution is based on the distribution of danish households . the average number of people per household is approximately , and large households of more than people have been ignored, as they account for less than % of the population.
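the "die roll" time-step described above can be sketched as follows. this is a minimal illustration, not the authors' code: the odds in `DEFAULT_ODDS` are hypothetical placeholders for the elided percentages, the two friend groups are assumed to be merged into one list per agent, and transmission is assumed to be possible in either direction of an interaction.

```python
import random

# Hypothetical interaction odds (assumed values; the source's percentages
# are not reproduced here).
DEFAULT_ODDS = {"family": 0.4, "friends": 0.2, "work": 0.3, "public": 0.1}


def pick_partner(agent, groups, population, odds, rng):
    """Roll the die for a setting, then draw one partner from that setting."""
    settings = list(odds)
    setting = rng.choices(settings, weights=[odds[s] for s in settings])[0]
    if setting == "public":
        # An entirely random person from the whole population.
        candidates = [p for p in population if p != agent]
    else:
        # A person from the agent's predefined group for that setting.
        candidates = [p for p in groups[agent][setting] if p != agent]
    return rng.choice(candidates) if candidates else None


def step(infectious, susceptible, groups, population, p_transmit,
         odds=DEFAULT_ODDS, rng=None):
    """One time-step: every agent has one interaction; return new infections."""
    rng = rng or random.Random()
    newly_exposed = set()
    for agent in population:
        partner = pick_partner(agent, groups, population, odds, rng)
        if partner is None:
            continue
        # Assumption: either party of the interaction can infect the other.
        for source, target in ((agent, partner), (partner, agent)):
            if (source in infectious and target in susceptible
                    and rng.random() < p_transmit):
                newly_exposed.add(target)
    return newly_exposed
```

the caller would move the returned agents from the susceptible set into the first exposed state before the next time-step.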
we believe that in a country where family sizes are larger and there are fewer singles, the family would be more important to the spread of disease. we test the effect of larger families in the supplement (figs. s -s ) and find that it does not change our overall conclusions. we simulate the progression of disease using an seir model with four exposed states, e = e + e + e + e , each lasting on average . days, corresponding to a mean incubation period of days. the exposed states are presymptomatic, meaning that people will not get tested in the incubation period. we let stages e ; be as infectious as the i-stage, as data suggest that a substantial fraction of covid- transmission happens before the onset of symptoms . multiple exposed states are included in order to get a naturalistic distribution of incubation periods. li et al. report that the mean incubation period is approximately five days and the reported distribution is fitted well by the gamma distribution we obtain from our four e-stages. a further problem is the duration of the infectious period (i). viral shedding has been observed to last up to eight days in moderate illness . on the other hand, according to linton et al. , the median time from onset to hospitalisation is three days. a bedridden patient (even if not hospitalised) is likely to transmit the disease less. to fit the observed mean serial intervals of . days of nishiura et al. we model the infectious period as a single state with an average duration of three days. in addition, the infectious presymptomatic period lasts on average . days. in comparison ref. uses a serial interval distribution with mean of . days. other authors have suggested a longer serial interval with presymptomatic infections. finally, the transmission rate of the disease is estimated from an observed rate of increase of % per day in fatalities in the usa. this also fits the observation of a growth rate of icu admissions of about . % per day in italy . 
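the reason for chaining several exposed states can be illustrated numerically: the sum of k exponential stages is gamma (erlang) distributed, with the same mean as a single stage of equal total length but a much narrower spread. the sketch below assumes four stages with a total mean of five days (1.25 days per stage); this split is an assumption made for illustration, and `sample_incubation` is a hypothetical helper, not part of the authors' code.

```python
import random
import statistics

def sample_incubation(n_stages, mean_stage_days, rng):
    """Total time spent in n_stages sequential exponential stages."""
    return sum(rng.expovariate(1.0 / mean_stage_days) for _ in range(n_stages))

rng = random.Random(1)
# Assumption: four stages of 1.25 days each versus one stage of 5 days.
four_stage = [sample_incubation(4, 1.25, rng) for _ in range(20000)]
one_stage = [sample_incubation(1, 5.0, rng) for _ in range(20000)]

mean_four = statistics.mean(four_stage)
# Coefficient of variation: theory gives 1/sqrt(4) = 0.5 for four stages
# and 1.0 for a single exponential stage.
cv_four = statistics.stdev(four_stage) / mean_four
cv_one = statistics.stdev(one_stage) / statistics.mean(one_stage)
```

with a single exponential stage, very short incubation periods are the most likely outcome; with four stages, the distribution peaks near its mean, which is closer to observed incubation data.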
with our parameters this is reproduced by a basic reproduction number r ~ (as we allow transmission figure . a diagram of the model structure. each agent has a network consisting of a family, a workplace and two groups of friends. the family accounts for % of interactions. work accounts for % and socialisation with friends accounts for a further %. the members of each of these groups are fixed throughout the simulation. finally, % of interactions happen "in public", which we implement as an interaction with a randomly chosen other agent. everyone in the work and friend sub-graphs are assumed to be connected to each other. below the graph, the underlying mechanisms of the disease are shown. we divide the exposed state into four in order to get a more naturalistic gamma distribution of incubation periods. the two last exposed states are infectious, but asymptomatic, meaning that individuals will not get tested. this is to include presymptomatic infection. in our simulation we set the family groups to an average of people, and the work network to completely interconnected people. the friend network consists of two groups with five in each. having calibrated the model in this way, we want to explore mitigation strategies for the corona epidemic. specifically, we will investigate the relative importance of the areas of social life, and the extent that reducing workplace size reduces disease spread. moreover, we will examine the possible gain and cost by simple contact tracing and light quarantine practices. to illustrate the relative importance of the workplace and public life, we consider the scenarios in fig. a . in the first scenario, nothing is done. in the second, contacts within the workplace are reduced by %, while in the third, contacts with friends and the public are reduced. finally, we compare these with similar scenarios, but where good hygiene or keeping a distance reduces the probability of infection from all types of encounters by half. 
in the figure, we see that the effects of reducing workplace and social contacts are roughly of the same magnitude. this reflects the assignment of % weight to each of these contact types. the slightly larger effect of social contacts reflects our assumption that these connections are less clustered than the workplace network. the two latter graphs show the scenarios where we both reduce infection probability within one group by % and overall infection probability by %. they show that an effective lockdown requires both restrictions of the time spent in the workplace and in the public sphere, and measures that reduce infection probability by increased hygiene and physical distancing. the above results provide one useful piece of information. if the effect of workplace and social contacts are of the same order, it is of little importance which one is restricted. ideally, both will be restricted for a period. however, when restrictions need to be lifted, authorities will primarily be able to control the workplace, whereas the social sphere relies on local social behavior. obviously, it is economically more sustainable to lift the one with the largest social consequences first, by allowing people to return to work while encouraging keeping social gatherings at a minimum. if restrictions are lifted before a substantial level of immunity is achieved, the epidemic will re-ignite. therefore, we now examine what can be done to minimise spread in the reopened workplaces. one possible strategy is to reduce the number of people allowed at any one time in each workplace. in fig. b , we compare an epidemic scenario where the average number of employees per workplace is with an epidemic where this number is reduced to . we further assume that the number of contacts per coworker remains the same, meaning that the number of contacts per person drops when workplace size is reduced. 
it can be seen that fragmentation of physical spaces at workplaces could have a significant effect on the peak number of infected. in a situation with a risk of straining the healthcare system, this could be part of a mitigation strategy. once again, the strategy becomes relatively more effective if the infection probability per encounter is also reduced. compared to the cases with no workplace size reduction, making workplaces smaller leads to a greater relative reduction in peak size if infection probability is lower, completely eliminating the epidemic at an infection probability reduction of %. a more local strategy that can be employed when reopening society is widespread testing and contact tracing. as mentioned above, hellewell et al. have suggested that this can be effective in containing covid- www.nature.com/scientificreports/ outbreaks provided high efficiency in detecting infected individuals. contact tracing has previously been modeled in relation to other epidemics , and used successfully against smallpox and sars . one obstacle to the widespread implementation of this strategy is the difficulty of tracing contacts. therefore, we will here implement a crude form of contact tracing where we ( ) close the workplaces of people who are tested positive for the disease, ( ) isolate their regular social contacts for a limited period, and ( ) keep symptomatic individuals in quarantine until they recover. we will see that such a step tracing and quarantine strategy ( stq) can give a sizeable reduction in disease spread while costing fewer lost workdays than overall lockdown. our simulations include the limitations imposed by not being able to trace the estimated % of infections from random public transmissions. thus, the strategy does not require sophisticated contact tracing but could be implemented based on infected people being able to recollect their recent face-to-face encounters with friends. 
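the three steps of the one-step tracing and quarantine rule can be written as a small function. this is a sketch under stated assumptions: `stq_response` and its return keys are hypothetical names, `quarantine_days` is a free parameter, and whether coworkers are also home-quarantined (rather than only having their workplace closed) is not spelled out in the text above, so the sketch only marks the workplace as closed.

```python
def stq_response(index_case, groups, quarantine_days):
    """Actions triggered by one positive test under one-step tracing."""
    g = groups[index_case]
    coworkers = set(g["work"]) - {index_case}
    friends = set(g["friends"]) - {index_case}
    return {
        # (3) the symptomatic, tested agent stays in quarantine until recovery
        "until_recovery": {index_case},
        # (1) the workplace of the positive case is closed
        "workplace_closed": coworkers,
        # (2) regular social contacts are isolated for a limited period
        "isolated_for_days": {a: quarantine_days for a in friends},
        # Random public contacts cannot be traced and are not listed.
    }
```

for example, `stq_response(0, groups, quarantine_days=5)` returns the friends of agent 0 with a five-day isolation period; household members are not added, since quarantine takes place within the household.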
it should be noted that we here quarantine persons in their own households, thereby making our contact tracing strategy easier to implement in practice. in particular, family members of a quarantined person are still free to interact outside their home if they are not themselves tested positive. the drawback of such light quarantine practices is that infected persons in quarantine may still transmit the infection to their families. figure examines how increased testing efficiency systematically improves our ability to reduce the peak disease burden. this would then be a more cost efficient way to mitigate the pandemic than a complete lockdown where each person would lose several man-months. even detecting as little as % of covid- infected per day (which with an average symptomatic disease duration of days corresponds to finding approximately % of the infected) can potentially reduce the peak number of cases by %. if % efficiency is possible, corresponding to detecting about a third of infectious cases, then peak height could be reduced by a factor of almost three with to a % drop, if the probability of infected people being tested is only % per day of illness. however, the price of this is that each person is on average quarantined once during the epidemic. if testing is more widespread, the epidemic peak can be further reduced, until it finally becomes unstable at a testing probability of around % per day. (d) epidemic peak and time spent in quarantine as a function of quarantine length for a testing probability of % per day. the average time spent in quarantine increases linearly with the length of quarantine. on the contrary, the effect of quarantine on the peak height appears to stagnate at approximately days. www.nature.com/scientificreports/ less than two weeks in quarantine per person during the entire epidemic. this is illustrated in fig. a where peak height is reduced from . to . at % testing efficiency. 
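the relation between a daily detection probability and the overall fraction of cases found can be made explicit. the sketch below assumes independent daily chances of detection over a fixed symptomatic duration; this is one simple way to do the conversion and is not necessarily the one used by the authors. the function name and the example values are hypothetical.

```python
def fraction_detected(p_per_day, symptomatic_days):
    """Probability of being detected at least once while symptomatic."""
    return 1.0 - (1.0 - p_per_day) ** symptomatic_days

# Hypothetical example: 10% per day over three symptomatic days.
example = fraction_detected(0.10, 3)

# For small daily probabilities the result is close to p_per_day * days.
small = fraction_detected(0.01, 3)
```

for small daily probabilities the linear approximation (probability per day times number of days) is adequate; for larger values it overstates the detected fraction.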
the main cost of the quarantine option is the quarantine time. figure d examines the efficiency versus cost as a function of quarantine length. it can be seen that there is little gain in extending the quarantine period beyond the -day duration of the incubation period. for this reason we opted for days in quarantine in panels (a, b). as a consequence, an average person will stay around days in quarantine during the course of the epidemic with a testing probability of % per day. this time can be reduced if people can be convinced to adopt smaller work environments and fewer face-to-face contacts per week. fragmentation of our networks into smaller groups will reduce both quarantine overhead and the direct transmission of the disease (fig. b, orange curve). a prolonged lockdown will hugely disrupt society, and it is questionable whether a complete eradication of the virus is possible anyway. therefore, most governments have aimed at softening the epidemic curve, with varying degrees of success. the one-step contact tracing with testing and quarantine is a means to this end and would work most effectively in combination with other efforts to reduce r . finally, we investigate whether an aggressive testing and contact tracing strategy could work if implemented at a late stage in an epidemic. this could be relevant if, for example, the strategy is part of an effort to reopen society after a period of lockdown. in fig. , we show two possible scenarios where testing and contact tracing are implemented after a -day lockdown with a % reduction of the work and social spheres. the lockdown is initiated when % of the population is infected. in (a) we subsequently test and quarantine the infected and their contacts for days, while in (b) the required quarantine is set to days. we assume a testing efficiency of % chance of detection for each day a person is symptomatic. the progression of the epidemic without testing is marked by a black graph for comparison.
from the figure one sees that the strategy of even relatively short quarantines also works with a late onset. at a realistic detection probability, it prevents a resurgence of the epidemic. nonetheless, it is quite costly initially, with a very high peak in number of quarantined people. importantly, the effect does not increase with a longer quarantine period, but the cost is substantially larger. pandemics such as the one caused by covid- can pose an existential threat to our social and economic life. the disease itself is serious and leaves specific epidemic signatures and characteristics that make traditional contact tracing difficult. in particular it is highly infectious, can sometimes be transmitted already two days after exposure, and a large fraction of transmission happens before the onset of symptoms. as such it is difficult to contain without a system-wide lockdown of society. nonetheless, a successful containment in south korea used contact tracing. this motivated us to explore a one-step contact tracing/quarantine strategy ( stq). using reasonable covid- infection parameters we find that the stq strategy can contribute to epidemic mitigation, in the sense that it can reduce the peak number of infected individuals by about a factor of two even with a realistic testing rate of % per day of illness. this was illustrated systematically in fig. . the main cost was people in self-quarantine and not contributing to the workforce. in comparison one has to consider that a society-wide lockdown with similar reduction in peak height would have to last for about days (see fig. ). thus, the lockdown would require of order days of quarantine (or at least extensive social distancing) per person, whereas testing and isolation only requires on average around days per person with a -day quarantine www.nature.com/scientificreports/ even at high testing probabilities. importantly these numbers can be reduced if people are able to lower their number of contacts. 
a notable objection to the stq strategy is the fraction of cases with symptoms so weak that people do not contact health authorities. in our model, the effect of such limitations is parameterized through the detection probability. from fig. c one sees that when the detection probability goes below % (a rate of % per day) the peak reduction of the stq strategy becomes only of the order percentage point. it should also be noted that, since we rely on symptoms to determine who stays in quarantine, and people in the infectious/symptomatic stage are assumed to always stay in quarantine, we implicitly assume that all infected persons develop at least some symptoms at some point. this may depart from reality. the increasing availability of tests may also change the prospects of the stq strategy. with widely available rapid tests, it will be possible to test everyone regularly, and to test all quarantined persons before they leave quarantine. supplementary figure s deals with the results of such a testing strategy and finds that it makes it possible to totally control the epidemic, or to mitigate it without quarantining any healthy individuals. to put this into perspective, the drawbacks of widespread but slow testing are examined in the supplementary fig. s . here, we find that the stq strategy is most efficient with no test delay, and that delayed contact tracing is comparable to a primitive lockdown. one interesting point, which we have not examined here, is that real-world social networks are heterogeneous, with a large variance in number of contacts. it may be expected, for example, that workers in customer-facing positions in shops will have a high risk of catching the disease and passing it on. the effects of this heterogeneity are examined more closely in ref.
here, it is concluded that heterogeneity in the number of contacts enhances the effect of contact tracing, since persons with many contacts are both more likely to pass on the disease and more likely to be quarantined. in ref. , the authors suggest a stq strategy similar to the one we model here. the main points of the present analysis are the focus on mitigating instead of eradicating the epidemic, our suggestion of a shorter quarantine length, and the implementation of quarantine together with other members of the household instead of total isolation. our stochastic, agent-based approach also allows for local failures due to the limited duration of quarantine (people may not yet be symptomatic when exiting quarantine) and the non-traceable public contacts (set to %). finally, one notable finding is that contact tracing and reduction of contacts per person are still feasible even at a later stage of the epidemic. as can be seen in fig. , a lockdown and subsequent reopening with testing and contact tracing is highly effective in controlling the epidemic. our study suggests that lockdowns have an important role to play in epidemic mitigation, but that they can be replaced by a stq strategy once the epidemic is under control. the covid- pandemic has put governments, health professionals, and epidemiologists in a situation that is more stressful and more rapidly evolving than anything in recent years. due to the uncertainties caused by a situation in flux, it is difficult to predict anything definite about what works and what does not. the empirical observation that lockdowns worked in china and, in a milder form, in denmark shows that our assumption of a % reduction in specific infection rates under lockdown is realistic. our main result is that some of these restrictions can be replaced by testing, one-step contact tracing and short periods of quarantine. this is far cheaper than total lockdowns. perhaps most importantly, these measures work best in combination.
as is highly relevant to the current epidemic stage of covid- , we pinpoint that stq can be successfully implemented also at a late stage of the epidemic where testing may become massively available. plots of alternative variants of our model (including alternative testing strategies and larger family sizes) can be found in the supplementary material. the code used to produce the plots shown in this article is available on figshare under the url https ://doi.org/ . /m .figsh are. .v . how will country-based mitigation measures influence the course of the covid- epidemic report : impact of non-pharmaceutical interventions (npis) to reduce covid- mortality and healthcare demand social contacts and mixing patterns relevant to the spread of infectious diseases epidemic analysis of covid- in china by dynamical modeling the effect of control strategies to reduce social mixing on outcomes of the covid- epidemic in wuhan, china: a modelling study modelling transmission and control of the covid- pandemic in australia substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov ) contagion! the bbc four pandemic: the model behind the documentary contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia feasibility of controlling covid- outbreaks by isolation of cases and contacts fam n: families . january by municipality, type of family, size of family and number of children serial interval of novel coronavirus (covid- ) infections ) pandemic: increased transmission in the eu/eea and the uk-seventh update epidemiological characteristics of novel coronavirus infection: a statistical analysis of publicly available case data report : estimating the number of infections and the impact of non-pharmaceutical interventions on covid- in european countries covid- and italy: what next? 
the effectiveness of contact tracing in emerging epidemics smallpox and its eradication epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in hong kong heterogeneity is essential for contact tracing we thank gorm gruner, bjarke frost nielsen, andreas roepstorff, and lone simonsen for enlightening discussions. this project has received funding from the european research council (erc) under the european union's horizon research and innovation program under grant agreement no. . a.e. and k.s. both participated in devising the model. code was written and plots produced by a.e.. the functionality of the code was checked by comparison with an alternative algorithm written by k.s. a.e. and k.s. wrote and edited the manuscript. the authors declare no competing interests. supplementary information is available for this paper at https ://doi.org/ . /s - - - .correspondence and requests for materials should be addressed to a.e.reprints and permissions information is available at www.nature.com/reprints.publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.open access this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. 
to view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/ . /. key: cord- -ucm frol authors: nuzzo, andrea; tan, can ozan; raskar, ramesh; desimone, daniel c.; kapa, suraj; gupta, rajiv title: universal shelter-in-place vs. advanced automated contact tracing and targeted isolation: a case for st-century technologies for sars-cov- and future pandemics date: - - journal: mayo clin proc doi: . /j.mayocp. . . sha: doc_id: cord_uid: ucm frol abstract objective to model and compare effect of digital contact tracing versus shelter-in-place on sars-cov- spread. methods using a classical epidemiologic framework, and parameters estimated from literature published between february , and may , , we modeled two non-pharmacologic interventions- shelter-in-place and digital contact tracing- to curb spread of sars-cov- . for contact tracing, we assumed an advanced, automated contact tracing (aact) application that sends alerts to individuals advising self-isolation based on individual exposure profile. model parameters included percentage population ordered to shelter-in-place, adoption rate of aact, and percentage individuals who appropriately follow recommendations. under influence of these variables, number of individuals infected, exposed, and isolated were estimated. results without any intervention, a high rate of infection (> million) with early peak is predicted. shelter-in-place results in rapid decline in infection rate at the expense of impacting a large population segment. the aact model achieves reduction in infected and exposed individuals similar to shelter-in-place without impacting a large number of individuals. for example, a % aact adoption rate mimics a shelter-in-place order for % of the population and results in > % decrease in peak number of infections. however, as compared to shelter-in-place, with aact significantly fewer individuals would be isolated. 
conclusion wide adoption of digital contact tracing can mitigate infection spread similar to universal shelter-in-place, but with considerably fewer individuals isolated. in the absence of a vaccine or cure, non-pharmacological interventions are critical to reducing spread of severe acute respiratory syndrome-coronavirus- (sars-cov- ). this is primarily accomplished through containment and isolation, such as the universal shelter-in-place order used in many cities and states across the united states. the need for such orders is due to the exponential growth of cases that occur during an outbreak, which generally needs to be countered by a fast, coordinated and widespread response. transmission of sars-cov- during asymptomatic infectious periods further complicates this response. asymptomatic infected individuals may not see the need to self-isolate, and it is difficult for public health infrastructure to identify such cases and enforce isolation during this period. contact tracing has the potential to limit spread of infectious diseases. this has been proven in epidemics such as sars, bird flu, middle east respiratory syndrome (mers), and others. , traditional contact tracing suffers from the problem of scalability as they are based on phone interviews and record keeping. on the other hand, current technologies permit constant tracking of individuals and locations via mobile phones, global positioning systems (gps), wifi, and bluetooth. a system that leverages these technologies to track and record movement of individuals, and monitor proximity to others for potential exposure, can help overcome difficulties posed by manual contact tracing. many app-based systems -for example, private kit: safepaths (http://safepaths.mit.edu/), covid symptom tracker (https://covid.joinzoe.com/us), and the apple/google collaborative venture (https://www.apple.com/covid /contacttracing) -are currently being tested. 
such advanced automated contact tracing (aact) systems, which could infer exposure risk and propagate warnings to people at risk, may help curb disease spread by facilitating targeted self-isolation rather than universal mandates such as shelter-in-place. in this paper, we compare universal shelter-in-place with targeted self-isolation envisioned in aact. with available data pertaining to sars-cov- we model strategies for the united states (available at https://github.com/andreanuzzo/aact_simulation).

our disease model is based on the seir (susceptible, exposed, infected, recovered) model assuming a constant susceptible population. using these data, two separate models were created: aact and universal shelter-in-place. in both, computational methods were used to determine impact in terms of infected individuals and proportion of population impacted by isolation/quarantine orders. modeling and study was performed based on data regarding the pandemic published between february , and may , . for the model, we assumed the following:
• t inc = incubation period (~ . days);
• t lat = latency period before development of symptoms (~ . days);
• basic r = . .
preliminary death rate µ = . (with case fatality ratio of . %, as estimated by recent global data). further details are summarized in supplemental material.

several variables were considered in model development and summarized in figure . compartments included: s (susceptible individuals), e (exposed to infection, unclear symptomatic conditions, potentially infectious), i (infected, confirmed symptomatic and infectious), r (recovered, immune from further infection), and d (death due to sars-cov- ). in aact, an additional compartment sq (traced contacts that are exposed and under self-isolation) was used while for shelter-in-place, the compartment q (individuals isolated through universal enforcement measures) was used.
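the baseline compartment structure (s, e, i, r, d, without the isolation compartments sq or q) can be sketched with a simple forward-euler integration. this is a generic illustration, not the authors' r implementation: the function name `seird`, the parameter values in the example, and the choice to treat the exposed compartment as non-infectious are all assumptions made for this sketch.

```python
def seird(n, i0, beta, t_inc, t_inf, cfr, days, dt=0.1):
    """Forward-Euler integration of a basic S-E-I-R-D compartment model.

    beta  : transmission rate per day (hypothetical value in the example)
    t_inc : mean time from exposure to infectiousness, in days
    t_inf : mean infectious period, in days
    cfr   : fraction of removed cases that die
    """
    s, e, i, r, d = float(n - i0), 0.0, float(i0), 0.0, 0.0
    history = []
    for _ in range(int(round(days / dt))):
        new_exposed = beta * s * i / n * dt   # S -> E
        new_infectious = e / t_inc * dt       # E -> I
        removed = i / t_inf * dt              # I -> R or D
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - removed
        r += (1.0 - cfr) * removed
        d += cfr * removed
        history.append((s, e, i, r, d))
    return history

# Hypothetical parameters, chosen only to produce a growing epidemic.
run = seird(n=1_000_000, i0=10, beta=0.5, t_inc=5.0, t_inf=3.0, cfr=0.01, days=200)
peak_infected = max(state[2] for state in run)
```

isolation measures would enter such a model by diverting part of the flow out of s (shelter-in-place) or out of e (traced contacts) into an additional compartment that does not take part in transmission.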
the basic difference between the models is that isolation/quarantine is based solely on exposure history in aact, while isolation orders apply to the entire population in universal shelter-in-place. we assumed that through the aact app, it is possible to inform exposed (asymptomatic/noninfected) individuals of exposure risk. once warned, they may self-isolate and prevent second-order spreading. therefore, self-isolated contacts will depend on penetrance p of the aact app in infected and exposed populations. the equations and details used to build the model are summarized in the supplemental section. two key elements that fall into aact include percentage of individuals adopting (ie, downloading) the app, and percentage who selfisolate in response to an exposure alert. our model assumes that for the fraction of individuals who heed the warning, there is no transmission of sars-cov- from exposed individuals to other susceptible individuals. universal shelter-in-place model ( figure , right panel) measures that limit public gatherings or mandate full lockdown uniformly impact the susceptible population. they are successful in isolating a fraction of the population, with the unquarantined transitioning through exposure, infection, recovery or death. such measures, depending on duration of enforcement (assumed to be constant in our model), are independent of percentage of infected population or percentage exposed. the key variable considered is percentage of population that is under shelter-in-place orders. for example, if % of the population (including essential personnel) is ordered to stay at home, nobody will be allowed outside and disease transmission will be halted. in real life, percentages far below % would be expected. in the current model, we assume shelter-in-place measures will be released after days. all models were created using r (version . . , ), and tidyverse ( ) and stats packages (https://cran.r-project.org). all graphs were created using r. 
both models agree with each other when adoption of digital contact tracing and universal shelter-in-place mandate are close to zero (i.e., p= and g= , where p is adoption rate of aact and g is both shelter-in-place and aact achieve reductions in number of infected cases (table ). for example, with % adoption and % compliance, aact would lower peak number of infected individuals by % and cumulative deaths by %. enforcing shelter-in-place measures for % of the population would almost completely halt sars-cov- spread. however, such a measure would quarantine, at peak, more than million people as opposed to isolating approximately million in aact to achieve similar reduction. as can be seen in panels (e) and (f) of figure , the main difference between the models is in societal burden imposed in terms of number of individuals expected to be quarantined or isolated. both adoption of aact (i.e., how many people downloaded the application), and percentage of people who heed the advice of the application (i.e., self-isolate when a warning is issued) are critical to success of digital contact tracing. for example, if % of users respond to an exposure alert by self-isolating, lower adoption rates would be sufficient; conversely, lower response rates to alerts require a higher adoption rate in the general population. figure a and the supplementary video (https://youtu.be/h crrfdek i) summarize this tradeoff for different adoption rates and user response rates over the course of the pandemic. figure b offers a graphical representation of percentage of the population impacted at peak as a function of the application adoption rate and user response rate. sars-cov- is a global pandemic with variable approaches implemented to address its spread. past experience with spanish flu, sars and mers shows that interventions that limit contact, increase social distance, and reduce exposure risk are essential to "flattening the curve".
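the tradeoff between adoption and response rates can be illustrated with a deliberately simple rule: if the effective coverage of exposed individuals is taken to be the product of adoption rate and response rate, a fixed target coverage defines the adoption needed at each response rate. the product rule is an assumption made for this sketch (the model equations are in the supplement and are not reproduced here), and `required_adoption` is a hypothetical helper.

```python
def required_adoption(target_coverage, response_rate):
    """Adoption rate needed to reach target_coverage, or None if unreachable.

    Assumes effective coverage = adoption rate * response rate.
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    adoption = target_coverage / response_rate
    # An adoption rate above 100% means the target cannot be met.
    return adoption if adoption <= 1.0 else None

# Hypothetical numbers: a 30% target coverage with 60% of users responding.
needed = required_adoption(0.30, 0.60)
```

under this rule, halving the response rate doubles the adoption needed, and some targets become unreachable once the response rate falls below the target coverage itself.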
governments around the world have instituted isolation measures such as shelter-in-place or stay-at-home to achieve these goals. however, universal isolation measures disrupt the fabric of society by hindering social interactions, limiting support for people with disabilities, and exacerbating mental health political discourse refers to the pain and suffering associated with these measures. contact tracing is routinely used for controlling infectious diseases. stochastic mathematical models, and past experience in the swine flu pandemic of and ebola outbreak of , have shown contact tracing can reduce r by as much as %. preliminary studies have shown that, accounting for heterogeneity of social interactions, it may be sufficient to trace contacts per infected person to reduce r for sars-cov- from . to . . contact tracing is not novel, but the exponential nature of the ever-enlarging tree of exposures makes conventional manual contact tracing cumbersome. especially in later stages of an epidemic, an automated or semi-automated solution is required in order to be scalable-a solution that we have dubbed aact. in this paper we compared universal containment against aact, a version of automated contact tracing that is able to recursively enumerate all persons who came into contact with an infected person. aact envisions a system that can instantaneously trace individuals in the exposure network of an index case, and issue warnings to everyone in this network. aact coupled with targeted self-isolation has several advantages over universal containment measures. the obvious advantage is society can still function with a select number of individuals in isolation. this approach also attempts to halt disease spread at the earliest time point after identification of infected individuals. aact enables first or second order exposures to isolate and limit further disease spread even when not showing symptoms. 
therefore, it enables remedies that may work in the pre-symptomatic stage. from the point of view of public health officials, aact may provide an early estimate of exposure risk and disease burden that the healthcare system will face. such information can be used to increase readiness. it may also facilitate patient surveillance and streamline flow and distribution through the healthcare system. finally, with the envisioned pandemic control system, aact and targeted isolation can be quickly deployed at first signs of an outbreak with the goal of limiting disease spread without resorting to measures such as shelter-in-place. spread while impacting fewer individuals. success of aact hinges not only on user adoption, but also on users' willingness to abide by recommendations. if individuals do not universally respond to alerts by self-isolating, impact of aact on disease spread would be minimal. similarly, at lower adoption rates, exposures could not be tracked, thus undercutting benefits. aact would be most successful with universal adoption and universal response. nonetheless, we have demonstrated even at modest adoption and response rates, it is feasible to significantly mitigate disease spread while limiting number of individuals isolated. in a real-world context, several countries have started introducing aact to help reopen societies and mitigate continued disease spread. data from singapore suggested that digital contact tracing carries higher sensitivity and specificity for identifying contacts than traditional approaches. the data on the efficacy of these measures, however, is limited and requires rigorous analysis before conclusions from models can be made. thus, recommendations have been proposed to achieve this and hopefully will result in more rigorous analysis. 
the need for real-world context is especially important given that several factors, including technological literacy, infrastructure, governmental regulations, user adoption based on culture, and factors such as regional population flow may impact efficacy. for example, likelihood of broad user adoption and compliance would likely be lower in the absence of governmental support, depending on the population. furthermore, populations with high frequency of exchange with surrounding countries, states, or regions in which aact is not used may overcome any value of aact. additionally, without appropriate infrastructure (wireless systems to transmit data, centralized databases that can aggregate data, etc), the viability of aact would be limited. there are several limitations to our models. first, we initialized our models with fixed parameters; in reality, parameters have been dynamic and evolved as the pandemic progressed. however, the intent of this paper was to compare strategies for mitigating disease spread assuming a common disease model. it is fair to assume comparative outcome of aact and universal stay-at-home would be similar regardless of their initialization. second, success of aact may depend on type of technology used. for example, gps systems have lower location accuracy than bluetooth or wifi. thus, systems that predict exposure based on proximity between an infected individual and an app user would be more accurate (and thus impact fewer people) when technology has higher location accuracy. also, we assumed adoption of aact is uniformly distributed throughout the population. diffuse uptake evenly throughout a society would be expected to have more benefit than uptake in dense pockets. finally, our modeling doesn't account for transmission from exposed individuals to other susceptible individuals (eg, household members) between the time of exposure and the time they self-quarantine. 
such third order exposures were not accounted for by the model and thus skew the results in favor of aact. however, with comprehensive use, near real-time results, and application of self-quarantine rules to household exposures, such deviations could be reduced.
contact tracing can mitigate disease spread through a curated approach of identifying and isolating exposed individuals, as opposed to shelter-in-place orders. applications that can be implemented through available smart phones and other devices may offer an opportunity to facilitate contact tracing and alert individuals to self-isolate after exposure. these efforts afford the ability to mitigate disease spread at rates similar to universal shelter-in-place when adopted at sufficient rates, assuming a high percentage of users respond to exposure alerts issued by the system.
figure a summarizes curves for different levels of adoption and response rates over the course of the pandemic. in this figure, the right panel summarizes response rates, and inset numbers are adoption rates. figure b offers a graphical representation of the number of individuals expected to be isolated at different adoption and response rates.
approach
in this framework we analyze two possibilities to implement non-clinical procedures to stop the spread of the epidemic:
• advanced contact tracing: through aact, it is possible to inform exposed (asymptomatic/non-infected) members of the community of the exposure risk. once warned, they would ideally self-isolate and prevent second-order spreading of the contagion. therefore, the number of self-isolated contacts will depend on the aact penetrance p in both the infected and the exposed population. we are assuming efficacy % (or rather, traced contacts receiving warnings and not self-isolating would pose as much risk as nontraced contacts). self-isolated members might still develop symptoms.
the percentage of aact penetration will also limit further exposure, thus reducing the transition between susceptible and exposed.
• traditional measures: in order to stop the contagion, authorities might resort to enforcing social distancing through different measures, ranging from limits on public gatherings to full lockdown. we use the variable g to model these interventions, which act nonspecifically on the susceptible, exposed and infected populations. this measure does not depend on the percentage of infected patients, but will still limit the of the susceptible population. quarantine will last for a time of days (assumed reasonable in the current scenario).
we assume the following initial parameters:
• t inc = incubation period (~ . days)
• t lat = latency period before development of symptoms (~ . days)
• basic r = .
• preliminary death rate µ = . (with case fatality ratio of . %, as estimated by recent global data).
imputed from the definition of =
compartment: functional definition
s: susceptible individuals
e: exposed to infection, unclear symptomatic conditions, potentially infectious
i: infected, confirmed symptomatic and infectious
sq: traced contacts, thus exposed but (self-)isolated
r: recovered, immune from further infection
d: case fatality (death due to covid- , not other causes)
here we will consider p as the percentage of adoption of the contact tracing digital solution among the whole population and the percentage of the population with the app that would eventually follow the recommendation and self-isolate. we are assuming that the percentage of responsible use corresponds to the efficacy and timeliness of isolation. moreover, for simplicity, we do not model the second- and third-order exposure risks from the first contacts. here we will consider g as the strength of intervention, which is hard to quantify numerically but can be assumed to increase from limiting large gatherings up to full lockdown, together with a rate of intervention (assuming an intervention time of days).
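the compartment structure described above can be illustrated with a minimal simulation sketch. the differential equations of the original model are not recoverable from this text, so the flows below are an assumed, generic seir-style system and not the authors' equations: only the i compartment transmits, a share p*p*c of new exposures is diverted to sq (index case and contact both carry the app, and the contact complies with probability c), and g is treated as a proportional reduction of the contact rate. every numeric value, as well as the names `c`, `t_inf` and `t_quar`, is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All defaults are hypothetical placeholders, not values from the paper.
    r0: float = 2.5       # basic reproduction number
    t_inc: float = 5.0    # days from exposure (E) to infectious (I)
    t_inf: float = 7.0    # days spent infectious (assumed parameter)
    t_quar: float = 14.0  # days spent in self-isolation (assumed parameter)
    cfr: float = 0.01     # case fatality ratio
    p: float = 0.0        # app adoption rate
    c: float = 1.0        # share of alerted users who self-isolate (assumed name)
    g: float = 0.0        # strength of shelter-in-place, 0 = none, 1 = full


def simulate(params, n=1_000_000.0, i0=10.0, days=365, dt=0.1):
    """Forward-Euler integration of an assumed S, E, I, Sq, R, D system."""
    # Assumption: shelter-in-place scales the contact rate by (1 - g).
    beta = params.r0 / params.t_inf * (1.0 - params.g)
    # Assumption: a contact is traced only if both parties use the app.
    traced = params.p * params.p * params.c
    s, e, i, sq, r, d = n - i0, 0.0, i0, 0.0, 0.0, 0.0
    peak_i = i
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n * dt
        e_to_i = e / params.t_inc * dt
        i_out = i / params.t_inf * dt
        sq_out = sq / params.t_quar * dt
        s -= new_exposed
        e += (1.0 - traced) * new_exposed - e_to_i
        sq += traced * new_exposed - sq_out   # isolated, never transmits
        i += e_to_i - i_out
        r += (1.0 - params.cfr) * (i_out + sq_out)
        d += params.cfr * (i_out + sq_out)
        peak_i = max(peak_i, i)
    return {"peak_infected": peak_i, "deaths": d,
            "total": s + e + i + sq + r + d}


if __name__ == "__main__":
    baseline = simulate(Params())
    with_app = simulate(Params(p=0.8, c=0.9))
    print(baseline["peak_infected"], with_app["peak_infected"])
```

in this sketch the two interventions enter at different points: g lowers the contact rate for everyone, while p and c only redirect already exposed individuals into isolation, which is one way to reproduce the qualitative contrast between universal and targeted measures discussed in the text.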
here the intervention will have an effect on the susceptible population. the number of quarantined people will decrease after the intervention time (and they are ideally assigned to the recovered, not the susceptible, population for simplicity). the incidence of intervention does not depend on the i compartment.
presumed asymptomatic carrier transmission of covid-
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov )
epidemic models of contact tracing: systematic review of transmission studies of severe acute respiratory syndrome and middle east respiratory syndrome
the effectiveness of contact tracing in emerging epidemics
contributions to the mathematical theory of epidemics
contributions to the mathematical theory of epidemics -ii the problem of endemicity
contributions to the mathematical theory of epidemics -iii further studies of the problem of endemicity
the incubation period of coronavirus disease (covid- ) from publicly reported confirmed cases: estimation and application
early in the epidemic: impact of preprints on global discourse about covid- transmissibility
first-principles machine learning modelling of covid- contact tracing
the efficacy of contact tracing for the containment of the novel coronavirus (covid- )
people are under quarantines in new york city. the new york times. / /
use of a real-time locating system for contact tracing of health care workers during the covid- pandemic at an infectious disease center in singapore: validation study
digital health and the covid- epidemic: an assessment framework for apps from an epidemiological and legal perspective
key: cord- -fm gl b authors: andersen, bjørg marit title: scenarios: serious, infectious diseases date: - - journal: prevention and control of infections in hospitals doi: . / - - - - _ sha: doc_id: cord_uid: fm gl b
scenarios for serious, infectious diseases are important procedures used to understand the special microbe’s behaviour (clinical illness, spread of infection, etc.)
and how to act most rational during special dangerous outbreaks. furthermore, scenarios describe how to handle patients, personnel and others possibly exposed to infections,- outside and inside the hospital- to stop spread of the infection as soon as possible. today, it is not acceptable to place a patient with a known high-risk, serious infection in the same hospital room as other patients with not the same disease (who). in this chapter, some seldom but realistic scenario is described to better understand how to react and treat patients to stop spread of microbes during the primary phase of dangerous transmittable diseases. • all personnel who are the first in contact with the infected/exposed person, relatives, other contacts and environment. • it is not acceptable to place a patient (with not defined same disease) in the same room with a patient with known high-risk infectious disease (who). the hospital's management provides written plans for how to react in situations where personnel and others may be exposed to known/unknown serious communicable disease and a practical arrangement for how to handle the situation. the infection control officer at the level of where the problem occurs, and at the departments/ward where the infection may spread, is responsible for following local emergency plans. personnel who unprotected have come in a seriously contagious situation and may have been exposed to infectious agents are responsible for contacting the nearest responsible/infectious unit for advice and of following written guideline and practical advice. [ ] • the patient (infected/suspected infected) is usually relatively easy to deal with since there are guidelines for preventing spread of infection and treating the patient for the current disease; see isolation routines. • contacts exposed to infection are often worse to handle since it may be larger numbers of people (travel company, etc.) and because fear of being infected itself creates uncertainty. 
therefore, some imaginable scenarios are made that deal with infected contacts. • transport by ambulance. all transport of infectious patients from the place of arrival to the hospital should take place in ambulances using the same infection control regime as for the individual infectious disease (contact infection, airborne infection, strict isolation); see isolation regimes; chaps. proper use of protective equipment is used when handling such patients. the ebola outbreak in african countries in showed that almost , were registered ill, more than , died, health professionals became ill and more than half of them died. lack of use of ppe and proper infection control led to escalation of the epidemic. the staff used only m distance from the patient as a zone of infection, as recommended by the who and cdc, and lacked protection for the head, hair and neck when within the m zone and used only ordinary masks [ ] [ ] [ ] [ ] . this occurred despite the fact that ebola is defined as a high-risk, biosafety level infection in which airborne infection could be relevant [ ] [ ] [ ] [ ] . the epidemic declined from september , following the introduction of more proper use of personal protection equipment and infection control routines [ , , ] . contacts are differentiated after infection risk: . high-risk contact: physical contact with vhf, or with blood secretion or excretion. healthcare staff, ambulance staff, laboratory staff, family or others who have treated the patient before admission. . low-risk contact: been in the same room with the patient after the onset of the disease, but not in direct contact with the patient, equipment or others in the room. examine the contacts; measure temperature two times for weeks. . transmission from healthy contacts is considered unlikely. however, everyone who has been in the same place at the same time with a vhf sick patient should be informed and followed up. 
if probable or verified vhf, inform the contacts-low chance of infection: • the contacts keep calm at home and measure the temperature daily two times for weeks after the last contact with the index patient. no crowding with many people and no use of collective traffic. • if temperature °c or more, or rash/flu symptoms/sickness, contact the infection medical department. • if this cannot be achieved at home, the contact may come to defined outpatient clinic for temperature measurement by appointment or is admitted to hospital. • served with food, etc. brought out from store to door, possibly, while isolated at home. mrsa (methicillin-resistant staphylococcus aureus), vancomycin-resistant mrsa, penicillin-resistant pneumococci, super-resistant gram-negative bacteria (esbl, cre,cp, ndm- bacteria), multidrug-resistant tubercle bacteria (tuberculosis), vancomycin-resistant enterococci (vre), etc. evaluated in collaboration with microbiological laboratory. this was especially observed during the major tsunami disaster in [ ] . serious problems can occur in areas with melioidosis and other highly virulent bacteria. in hospital, not usually many at the same time, but dependent on endemic situation. the patient can be contact or air isolated, depending on the infectious agent. • registering of direct contacts-depending on the infectious agent (name, address), and include where the patient have been earlier (information). • ambulance personnel and other personnel use routines for the relevant infection type according to isolation procedures and in accordance with emergency department's report. in case of doubt, contact infection control personnel. • use respiratory protection, p mask, if suspecting pulmonary tuberculosis, and put a surgical mask or p /p mask without a valve on the patient in case of suspected pulmonary tuberculosis or respiratory tract infection. 
contacts/carriers-differentiated follow-up-low chance of getting sick: example: three people in a norwegian travel company get voluminous, watery, painless diarrhoea on return from bangladesh, just before landing at oslo airport, gardermoen. due to a loss of fluid, they were transported with an ambulance equipped for "import infection" and admitted directly into contact isolation. municipal infection control doctor is notified and is responsible for reporting, measures and follow-up outside the hospital together with the municipal emergency response group. this is one of the old, major quarantine diseases. only a few get sick (top of the iceberg), i.e. - patients out of infected. there is a low mortality by proper treatment (< %). the transmission risk is relatively low due to good hygiene and good sanitation in today's norway and other developed countries. most patients are shedding bacteria in large amounts and are also carriers without symptoms. patients are isolated with contact isolation regimens. the infection can reach unmanageable heights in disaster areas, by hunger, contaminated water supply and destroyed infrastructure. following the natural disaster in haiti, cholera was introduced with infected helper crew from asia (carriers), and an epidemic started in , which in had increased to , cholera patients, , hospitalized and deaths [ ] . • registering of contacts and remaining passengers in the airplane and from same travel company (name, address, telephone number). • ambulance staff and other personnel use the contact regime. if risk of spills, etc., use also surgical mask, visor and cap in addition to gloves and gown/overall. example: an elderly woman who recently attended a bus trip to moscow became sick days after returning to oslo. she had sore throat, fever, cough and eventually a white, firm-sitting "plaque" in the throat. she was hospitalized after days because of suspected diphtheria. 
contact persons and other close contacts were contacted for follow-up and treatment with erythromycin (according to resistance pattern). municipal infection control doctor is notified and is responsible for reporting, measures and follow-up outside the hospital together with the municipal emergency response group. diphtheria may still be periodic problems in eastern europe and many places in the world. vaccination status is good in children in most countries but more uncertain in elderly, especially in women. this is a contact and airborne infection, relatively highly infectious. there are probably few cases in an outbreak due to herd immunity. patients are isolated with air and contact isolation regime until free from bacteria (negative culture). a historically serious infectious disease also in norway, with high mortality rates until the middle of the last century [ ] . ullevål hospital, oslo, introduced treatment with the diphtheria serum in and achieved an impressive response-from - % mortality to - % [ ] . • registering: all exposed persons (name, address, telephone number) and followup; see below. • ambulance staff and other personnel use the contact and airborne infection regime when picking up and transporting a patient. use respiratory protection, visor and cap in addition to gloves and gown/overall, and put a surgical mask on the patient. contacts/carriers of diphtheria are differentiated-vaccination protects against disease: . sampling (nasopharynx samples) of all exposed persons (even if vaccinated and are not sick, you may be a carrier). . prophylactic/therapeutic treatment with erythromycin may be initiated rapidly. . if living at home, others should not be exposed to infection/carrier state. . short-time airborne isolation may be relevant for carrier or exposed to infection until the infection state is clarified/effect of antibacterial therapy. . booster vaccine against diphtheria is considered for all contacts. 
example: one member of a family of five who had stayed in madagascar for a month became sick on his way home to norway. he coughs, has fever and develops skin rashes that resemble large boils, especially in the groin. he was admitted directly to strict isolation in hospital with a suspected serious import infection. the municipal infection control doctor is notified and is responsible for reporting, measures and follow-up outside the hospital together with the municipal emergency response group.
• quarantine disease: periodic problem in east asia, especially india, and ongoing outbreaks in africa, namibia and madagascar. plague may be a war-related disease (biological warfare). untreated, more than % of cases die, whereas with streptomycin treatment fewer than % die. this is a typical airborne disease, relatively highly infectious when respiratory tract symptoms are present. outbreaks are usually small, with few cases ( - ). air and contact isolation regime until free of bacteria. in november , new plague outbreaks were reported in madagascar. about patients got sick, of whom died of bubonic plague [ ] . the infection spread rapidly between humans and "killed quickly" [ ] . multidrug-resistant yersinia pestis has been described in this country [ ] .
• registering: all infected persons (same travel company, all on the same flight home) are registered (name, address, telephone number) and followed up.
• ambulance staff and other personnel use the contact and airborne isolation regime when picking up and transporting a patient. use respiratory protection (p mask), visor and cap in addition to gloves and gown/overall, and put a surgical mask on the patient.
all close contacts/carriers are isolated until the infection state is clarified (little chance of getting sick):
1. sampling from all exposed persons
2. prophylactic/therapeutic treatment, possibly vaccine
3. short-time airborne isolation of exposed cases until the infection state is clarified/effect of antibacterial therapy . .
anthrax after staying in turkey, sick on the plane home . . . patient: strict isolation-air pressure isolate with pressure [ , ] example: two out of six people who have been on family visits in turkey for a week, on farms with goats and skin production, are acutely ill on the plane home with cough, shortness of breath and fever. upon arrival, the emergency outpatient clinic was contacted by the patients who were immediately transferred to intensive care unit for airway symptoms and suspected import infection. at the hospital, anthrax is suspected, and patients are strictly isolated in air isolate and treated. municipal infection control doctor is notified and is responsible for reporting, measures and follow-up outside the hospital together with the municipal emergency response group. • endemic problem in several places (asia, africa, middle east, north and south america), war-related (biological warfare). varying number of cases of anthrax at outbreaks during peacetime, depending on how many people have eaten, for example, infected food, etc. • symptoms: % are cutaneous, % are respiratory, and a few cases are gastrointestinal anthrax. the latter two are two-phasic and almost always fatal. • the bacterium bacillus anthracis is a spore-forming, resistant, gram-positive rod that survives nearly infinity in the environment if not removed. the bacteria are usually penicillin-sensitive. • infection: person-to-person infection is unlikely, but hospital infection is described. air and contact regime is conducted around such patients in hospitals until free of bacteria (ca h treatment); however spores may survive for a long time. • registering: all directly exposed persons are registered (name, address, telephone number) and followed up. • ambulance personnel use contact and airborne regime for patient pickup and transport. use respiratory protection (p mask), visor and cap in addition to gloves and gown/overall, and put a surgical mask on the patient. 
contacts/carriers-low risk of getting sick: . sampling and antibacterial treatment are offered to all contacts that may have a common source of infection with the index patients (travel company, co-passenger). person-to-person transmission is unlikely. . vaccine may, in addition to antibacterial treatment, be applicable to people with a common source of infection with the index patients-when risk of large outbreaks (note that vaccination should be discussed due to some serious adverse reactions). . while waiting for result of the sampling, close contacts live at home with contact isolation restrictions. . in case of detected anthrax in contact (incubation phase for disease - days), the person is isolated and treated, and vaccination may be assessed for close contacts. this is an endemic problem among wild animals in most countries. person-toperson transmission is unlikely. it is almost never reported more than one case at a time, infected by animal bites or licking but occasionally without known exposure. close contacts/exposed persons are registered. • registering: all exposed persons are registered (name, address, telephone number) and followed up. • ambulance staff and other personnel use the contact and airborne regime when picking up and transporting a patient. use respiratory protection and ppr; put a surgical mask on the patient. contacts/exposed-low chance of getting sick: . close contacts/exposed persons are assessed for vaccine and rabies immunoglobulin. . followed up at the infection outpatient clinic. patients and contacts are treated just like at vhf. patients and contacts are treated mainly like vhf. see also sars and bird flu. new infectious diseases and agents still appear, for example, sars, avian viruses (h n ), htlv and other retroviruses, hpv, sindbis virus, parvovirus, bocavirus, coronavirus, etc., or bacteria like legionella and borrelia, or agents with virulence changes, like group a streptococci, meningococci, etc. 
• biological terrorism made anthrax, plague, botulism, coxiella, brucella, vhf, poxviruses and a number of other unusual agents more appropriate as biological weapons . • bacteria sensitive to common antibacterial agents will probably not be a major problem. • viruses with no vaccine or treatment against low infection dose and high capacity to survive will be a major problem if associated with incurable disease, disability or death. • it is probable that this will be a problem first in countries with low hygiene standards/high population density. if the infectious agent is unknown, transmission ways are unknown and the situation is uncertain or uncontrollable, this practical measure may be followed: patient and contacts: strict isolation-negative air pressure isolation . serious illness: isolation of index case and all contacts . less severe disease: isolation of index case and close contacts • registering: all exposed persons are registered (name, address, telephone number) and followed up. • ambulance personnel use contact and airborne regime for patient pickup and transport. use respiratory protection (p mask), visor and cap in addition to gloves and gown/overall/shoe covers/dedicated shoes, and put a surgical mask on the patient. • botulism is caused by a bacteria-produced toxin (clostridium botulinum) that causes paresis and is common in soil as spores. the disease can be associated with toxin formation in contaminated and poorly canned foods, shrimp fish, bacon, etc. under anaerobic conditions and randomly affects both healthy people and vulnerable groups, such as infants who have had honey infected with the bacterium, which has happened repeatedly [ ] . the toxin is the most dangerous we know and is on the list of bioterrorism. • brucella bacteria (zoonosis) are particularly related to laboratory outbreaks but are easily transferable outside the laboratory and are considered highly infectious. 
[ ] • francisella tularensis (zoonosis) is defined as a category a bioterrorism agent, highly infectious and increasing in the society, has low infection dose ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) bacteria) and can be inhaled or infected via food and water. occasionally, such patients are detected in hospitals, even in the operating department [ ] . • mers-middle east respiratory syndrome-newly discovered coronavirus zoonosis (dromedaries, bats, etc.) and acts as sars, also with a tendency for nosocomial spread in hospitals. in there were cases, of which % died [ ] . • polio-like illnesses, especially among children, were discovered in august in different states in the united states. probably caused by enterovirus d [ ] . • hiv-aggressive variants (crf ) have been detected in cuba in and earlier in africa, with a faster course from infection to aids development [ ] . • prion disease-new shy-drager syndrome is rediscovered in ; multiplesystem atrophy (msa) [ ] . • ricin is a plant-derived toxin that is still used for bioterrorism in letters to, among others, president obama [ ]. • fungi and mould: different types that are especially related to floods, water damage, etc. and to pollution of medical products [ ] [ ] [ ] [ ] . in norway, a recent overview of candida in blood cultures showed a stable state for the past years [ ] . • zika virus: newly discovered flavivirus with mild to serious symptoms and teratogenic effect [ ] . ministry of labour and administration. regulations on protection against exposure to biological factors (bacteria, viruses, fungi and more) in the workplace guidelines for environmental infection control in health-care facilities disinfection, sterilization, and control of hospital waste serious, common contagious scenarios. in: handbook in hygiene and infection control for hospitals. olso: ullevål university hospital other serious viral infections-zoonoses. in: handbook in hygiene and infection control for hospitals. 
key: cord- -vz n jsj authors: keeling, matt j; hollingsworth, t deirdre; read, jonathan m title: efficacy of contact tracing for the containment of the novel coronavirus (covid- ) date: - - journal: j epidemiol community health doi: . /jech- - sha: doc_id: cord_uid: vz n jsj objective: contact tracing is a central public health response to infectious disease outbreaks, especially in the early stages of an outbreak when specific treatments are limited. importation of novel coronavirus (covid- ) from china and elsewhere into the uk highlights the need to understand the impact of contact tracing as a control measure. design: detailed survey information on social encounters from over respondents is coupled to predictive models of contact tracing and control. this is used to investigate the likely efficacy of contact tracing and the distribution of secondary cases that may go untraced. results: taking recent estimates for covid- transmission we predict that under effective contact tracing less than in cases will generate any subsequent untraced infections, although this comes at a high logistical burden with an average of individuals traced per case. changes to the definition of a close contact can reduce this burden, but with increased risk of untraced cases; we find that tracing using a contact definition requiring more than hours of contact is unlikely to control spread. conclusions: the current contact tracing strategy within the uk is likely to identify a sufficient proportion of infected individuals such that subsequent spread could be prevented, although the ultimate success will depend on the rapid detection of cases and isolation of contacts.
given the burden of tracing a large number of contacts to find new cases, there is the potential the system could be overwhelmed if imports of infection occur at a rapid rate. contact tracing is the main public health response to importations of rare or emerging infectious diseases, and was implemented in the uk during the 'containment stage' of the influenza pandemic. in more recent years, contact tracing was also a valuable tool following the importation of the ebola virus disease into the uk in and the cases of monkeypox in the uk in . in general, contact tracing is a highly effective and robust strategy given sufficient resources.
the main advantages are that it can identify potentially infected individuals before severe symptoms emerge, and if conducted sufficiently quickly can prevent onward transmission from the secondary cases. contact tracing has proven hugely successful in the treatment of sexually transmitted infections, where the definition of a contact is relatively straightforward, where the infection is often asymptomatic and where the timescales of transmission are slow. in contrast, the use of contact tracing for novel invading pathogens has received less quantitative consideration, in part due to greater uncertainties over social contact structure (although see ). modelling studies have often focused on quantifying the importance of pre-symptomatic and pre-tracing infectiousness, but are usually based on statistical distributions of contact networks. here we leverage detailed social network data from the uk to model both transmission and the act of tracing, and identify the implications of early contact tracing for containment of a novel pathogen, using parameters for the novel coronavirus (covid- ). we characterised contact patterns in the uk using a postal and online cross-sectional survey, which asked participants to report the number of social encounters with unique individuals during a given day, as well as the duration and typical frequency of those encounters. in total, respondents reported more than encounters, making this one of the biggest studies of its kind to date. the definition of a contact used in the survey was a face-to-face conversation within m or where skin-on-skin touch occurred. this will naturally include all conversational contacts within m (the standard definition for a contact for covid- tracing), but is unlikely to represent a significant overestimate. the encounter patterns of this study were in good qualitative agreement with other similar studies of social interactions.
in this study, the daily encounter data were first extrapolated to generate a pattern of contacts over a -day period (replicating random encounters and increasing the total duration of associated regular contacts), to act as the basis for transmission and contact tracing simulations (see online appendix for more technical details). using these extrapolated data, we can determine which interactions satisfy a given definition of a close contact for the purpose of contact tracing. from our social encounter survey, we consider all reported contacts of min or more as meeting the close contact definition. from our social encounter data, we can also distinguish interactions with people who could be later identified and traced, from those with unidentifiable strangers (schematic figure ); although we note that electronic means of tracing should be able to trace these individuals. we assume that all contact of longer than hour or repeated contacts can be identified and traced, whereas shorter duration encounters with people met for the first time are strangers who are unidentifiable and therefore untraceable. the second element of the simulation is to determine who gets infected from a source case chosen representatively from the survey respondents. this transmission process is stochastic, accounting for both the time spent with each contact and the infectivity on each day (see online appendix ). the transmission rate to a contact is scaled to generate the required basic reproductive ratio, r . taken together these two predictions allow us to bound the efficacy of contact tracing. one of the most notable features of human social contacts is the huge variability in the number and strength of contacts-which is reflected as variation in both the number of secondary cases and the number of individuals that match the contact tracing definition (figure ). 
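the two classification rules described above (a duration threshold for a close contact, and "repeated or long enough" for an identifiable contact) can be illustrated with a minimal sketch. this is not the authors' simulation code: the `Encounter` fields, the function names and the idea of passing both thresholds as arguments are assumptions made for illustration, and the threshold values used in the study are deliberately not filled in here.

```python
from dataclasses import dataclass


@dataclass
class Encounter:
    """One reported social encounter (field names are hypothetical)."""
    duration_min: float  # total time spent together, in minutes
    repeated: bool       # True if the person had been met before


def classify(enc, close_min, identifiable_min):
    """Label one encounter using the two rules described in the text.

    close_min        -- minimum duration for the close-contact definition
    identifiable_min -- duration above which a first-time contact is
                        assumed to be identifiable
    Both thresholds are supplied by the caller (assumption: the study's
    own values are not reproduced here).
    """
    is_close = enc.duration_min >= close_min
    is_identifiable = enc.repeated or enc.duration_min > identifiable_min
    if is_close and is_identifiable:
        return "traceable"
    if is_close:
        return "close_untraceable"
    return "not_close"


def summarise(encounters, close_min, identifiable_min):
    """Count encounters in each category for one survey respondent."""
    counts = {"traceable": 0, "close_untraceable": 0, "not_close": 0}
    for enc in encounters:
        counts[classify(enc, close_min, identifiable_min)] += 1
    return counts
```

changing `close_min` in such a sketch is the code-level counterpart of changing the close contact definition: a larger value moves encounters from the traced categories into `not_close`.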
using preliminary estimates of covid- transmission (average latent period days, average effective infectious period days, r = , and assuming a simple seir formulation ) we compute the distribution of epidemiological, social and contact tracing characteristics across the population. extrapolating the data from the social contact survey suggests that the average number of contacts over a -day period is , although the distribution is significantly over-dispersed (with a median of and around % of individuals having > total contacts). of these total encounters, an average of contacts ( %) meet the definition of a close contact (in contact for > min, ) and of these close-contacts we predict an average of ( %) to be individuals who can be identified by the infected case and can therefore be traced. this is comparable to early reports from singapore and taiwan where and confirmed cases led to and contacts being traced, respectively (approximately and contacts per case). therefore, simply considering social contacts, it is clear that there are very many short duration contacts that do not meet the definition of a close contact, and although unlikely to become infected may pose a risk due to their greater abundance. as expected, tightening the definition of a close contact can dramatically lower the number of contacts that would need to be traced: identifying contacts from days prior to detection reduces the average number of contacts to (median ). given that the risk of infection increases with the duration of contact, the distribution of cases effectively represents a biased sample of all contacts. as expected, given the model assumptions, the expected number of total secondary cases agrees with the assumed r (mean= , median= , and th percentiles - ). given that secondary cases are most likely to be those contacts of the longest duration, we predict that % of cases match the definition of a close contact. 
however, not all of these contacts will be identifiable; assuming that all repeated contacts and contact of longer than hour can be traced, we predict that % of all cases meet the definition and can be identified. however, because of the extreme heterogeneity in contacts between individuals and the stochastic nature of transmission, we would still expect approximately % of all primary cases to generate at least one secondary case that is not traced and % to generate a secondary case that cannot even be identified. similarly, we would expect around % ( %/r ) of detected cases to not be able to identify their infecting individual. neither of these results should be viewed as a failure of contact tracing, merely a reflection of the uncertainties in the approach.

figure caption (fragment): here, the definition of a contact is someone with whom the index case encountered for min or longer. some contacts will be identifiable (green), while others will be unidentifiable (orange). a definition of contact that is too restrictive and inappropriate for the infection means some encounters may fail to meet the definition yet may be at risk of infection; these excluded contacts could be identifiable (light grey) or unidentifiable (orange). (b) examples of ego-centric networks collected by the survey. the participant (ego) is the blue central triangle; circles represent individual contacts, squares represent groups of contacts (size of group indicated). colours represent social settings of encounters (red=home, cyan=work/school, yellow=travel, pink=other). larger symbol sizes represent longer contact durations, while a closer proximity to the ego indicates the contact is more frequently encountered.

aggregating across all individuals, and under the optimistic assumption that all the contact tracing can be performed rapidly such that all close contacts are traced before they become infectious, we expect such highly effective contact tracing to reduce the basic reproductive ratio r from to . 
-enabling the outbreak to be contained (figure ). less effective tracing (tracing only a random fraction of contacts) would lead to a linear scaling in the reduction of the r such that over % of contacts need to be traced to reduce r below and control the outbreak. this efficacy would need to be increased if contacts were not traced and isolated before they were infectious (a problem exacerbated by pre-symptomatic transmission), or could be reduced if the higher risk/longer duration contacts were preferentially traced. rapid and effective contact tracing can therefore be highly effective in the early control of covid- , but places substantial demands on the local public health authorities. each new case requires an average of individuals to be traced, with . % of cases having more than close traceable contacts (figure ). we therefore consider the implications of changing the definition of a close contact. clearly, a stricter definition of a close contact (requiring more contact time) reduces the burden on the health services as fewer contacts need to be traced, but also increases the risk of cases being missed. figure provides a quantitative assessment of changes to the close contact definition. definitions requiring more than hours of contact are unlikely to control an outbreak as the expected number of untraced secondary cases is greater than one. this therefore places a strict upper bound on the level of contact tracing required. definitions shorter than hour have relatively little additional impact on the mean number of untraced cases (figure b), but do reduce the probability that some untraced contacts occur. throughout we have used a value of r that represents a population-level average once the local infection has become established. however, the first invasion into any new population or social setting generally has a larger than expected number of secondary cases.
the first invader enters a completely susceptible population; moreover all their close contacts (eg, family members) are susceptible. in contrast, due to the clustering of contacts, most secondary cases will be in a landscape with a depleted number of susceptibles, as close contacts such as family members will already have been exposed to the primary case. this susceptible depletion in the local social network may help to explain the change in r t over time reported for covid- .

figure caption (fragment): figure a : white is all contacts; blue are those matching the > min definition of a close contact; green are those matching the definition that are also assumed to be identifiable (met previously or for more than hour), and therefore traceable. (c) frequency distribution of the number of secondary cases per index case, again using colours from figure a : red is all secondary cases; grey and orange are those that are not traced either through failing to meet the definition of a close contact or because they are assumed to be unidentifiable; orange are all secondary cases that are shorter than min or unidentifiable.

we therefore consider the impact of different values of the initial reproductive ratio (figure ), which could capture this social aspect, or could represent heterogeneity between individuals in the amount of virus shed, or could inform about innate differences in behaviour between china and the uk. given the strong biasing of transmission towards long-duration contacts, the impact of varying the initial reproductive ratio is less extreme than might be expected; it is only for the highest values of the initial reproductive ratio simulated (> . ) that contact tracing fails to find more than one case such that infection can escape. we also consider sensitivity to alternative formulations and parameter values for the epidemiological dynamics, and conclude that the success of contact tracing against covid- is predominantly driven by the initial reproduction ratio.
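the linear scaling between the fraction of contacts traced and the reproductive ratio, described in the results above, can be made concrete with a short sketch. it assumes only what the text states (tracing a random fraction of contacts interpolates linearly between no tracing and complete tracing); the function names are hypothetical and the input values are left to the reader rather than taken from the paper.

```python
def effective_r(r0, r_full_tracing, fraction_traced):
    """Reproductive ratio when a random fraction of contacts is traced.

    Linear interpolation between r0 (no tracing) and r_full_tracing
    (every traceable close contact traced before becoming infectious).
    """
    if not 0.0 <= fraction_traced <= 1.0:
        raise ValueError("fraction_traced must lie in [0, 1]")
    return r0 - fraction_traced * (r0 - r_full_tracing)


def critical_fraction(r0, r_full_tracing):
    """Traced fraction at which the ratio reaches one.

    Returns 0.0 if r0 is already at or below one, and None if even
    complete tracing leaves the ratio above one.
    """
    if r0 <= 1.0:
        return 0.0
    if r_full_tracing > 1.0:
        return None
    return (r0 - 1.0) / (r0 - r_full_tracing)
```

under this assumption, control requires tracing more than `critical_fraction(r0, r_full_tracing)` of contacts; a `None` result corresponds to a contact definition so strict that tracing alone cannot contain spread.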
mathematical models have an important role to play in preparedness for novel infectious diseases, allowing policymakers to plan for potential public health scenarios before they arise. however, in such scenarios reliable data are often limited, so predictions of long-term dynamics are generally associated with wide confidence intervals. in contrast, while short term predictions are subject to greater stochasticity, the distribution of possible behaviours can be readily captured. here we have investigated contact tracing of a close-contact pathogen, using the novel coronavirus (covid- ) as the example, and considered the efficacy of contact tracing as a control measure. this work brings together a detailed survey of social encounters with bespoke mathematical modelling of the transmission and tracing processes. given the substantial heterogeneities present in social encounters (both in terms of duration and number), mathematical models are vital to interpret the interplay between a low number of high-risk encounters (eg, household members) and the high number of low-risk less-identifiable encounters (eg, commuters or retail customers). throughout this work we have used a simple definition of a close contact as anyone being within m of an infected individual for min or more, over a -week period, relating to the stipulation in our earlier study. this is likely to be a slight overestimate compared to the uk definition which uses a m distance rule. however, other countries and regions have subtly different protocols with critical distances ranging from . to m and times from to min. our assumption of tracing all contacts in a -week period is likely to be pessimistic, with most countries now adopting an interval from days before symptoms to isolation of the patient. under our default definition, there are unlikely to be many unidentified secondary cases, although the burden of tracing all contacts could be large.
tightening the definition of a contact (such that longer contact durations are needed) lessens this burden, but at the greater risk of undetected cases (figure ). surprisingly, moderate changes to the reproductive ratio, within the bounds estimated from early data, or changes to the time course of infectivity are predicted to have a relatively modest impact on the success of contact tracing, illustrating the robustness of this control measure (figure ). our model has addressed the simple and optimistic question of whether rapid and complete contact tracing is sufficient to identify secondary infections. the public health reality of contact tracing is more complex, and depends on the relative timing of events and the management of identified contacts. for contact tracing to be an effective public health measure, most secondary cases must be discovered and isolated before they become infectious; hence the time from the primary case becoming infectious to the tracing of their contacts needs to be shorter than the incubation period. longer time scales would allow tertiary cases to be infected and potentially increase the scale of tracing required. in addition, those contacts that are traced need to be either effectively screened for infection and quarantined or otherwise isolated so that they do not pose a risk to others. we have also assumed that all index infections are identified as cases and start the process of contact tracing, leading to the tracing of all identified contacts. this is clearly an extremely optimistic assumption: not all infections are symptomatic so may go undetected, and not all those who are symptomatic will seek medical help; and not all identified contacts can be traced sufficiently rapidly to prevent further spread.
therefore, while contact tracing has the potential to contain covid- (and other close-contact pathogens) during the early stages of invasion, the ultimate success relies on the speed and efficacy with which suspect contacts can be contained and the capacity for contact tracing. contact tracing can also be used later in an outbreak to assist with other control methods in reducing the number of cases. in this scenario, other factors become important: the type and number of contacts are likely to be extremely different for countries in or exiting lockdown; the effective reproductive ratio is likely to be far lower; and household contacts may already self-isolate making tracing irrelevant. these considerations mean that contact tracing needs to be less effective to control the infection (more readily bringing the reproductive ratio below ) but is likely to have diminished impact due to the existence of other measures.

figure caption (fragment): and an infectious period of days; other points, in blue, use a latent period chosen from a lognormal distribution and an infection period between and days, and are based on a model with one, two or three latent and infectious classes).

what is already known on this topic
► contact tracing is known to be highly effective for diseases that spread slowly by close contact, and hence is used for many sexually transmitted infections.
► quantitative predictions of contact tracing have generally focused on the speed of tracing, and used assumed contact patterns. these studies have shown that control is most effective when the latent period is long and the disease transmits slowly.

what this study adds
► by considering the distribution of close contact encounters, we are able to predict the efficiency of contact tracing in identifying secondary cases.
► the uk definition of a close contact ( min or more, within m) is sufficient to contain imports of infection but at the cost of tracing many uninfected contacts.
► we would expect - % of cases to generate at least one unidentified secondary case which would need detecting by other means. pandemic (h n ) influenza in the uk: clinical and epidemiological findings from the first few hundred (ff ) cases lack of secondary transmission of ebola virus from healthcare worker to contacts two cases of monkeypox imported to the united kingdom services in std prevention programs: a review effectiveness of workplace social distancing measures in reducing influenza transmission: a systematic review systematic review of social contact surveys to inform transmission models of close-contact infections factors that make an infectious disease outbreak controllable epidemic models of contact tracing: systematic review of transmission studies of severe acute respiratory syndrome and middle east respiratory syndrome novel coronavirus -ncov: early estimation of epidemiological parameters and epidemic predictions early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia social encounter networks: collective properties and disease transmission social encounter networks: characterizing great britain social contacts and mixing patterns relevant to the spread of infectious diseases van den broeck w. what's in a crowd? analysis of faceto-face behavioral networks quantifying sars-cov- transmission suggests epidemic control with digital contact tracing public health england guidance to assist professionals in advising the general covid- in singapore: current experience: critical global issues that require attention and action assessment of covid- transmission dynamics in taiwan and risk at different exposure periods before and after symptom onset epidemiological and clinical features of the novel coronavirus outbreak in china european centre for disease prevention and control. 
contact tracing: public health management of persons, including healthcare workers, having had contact with covid- cases in the european union the transmissibility of novel coronavirus in the early stages of the - outbreak in wuhan: exploring initial point-source exposure sizes and durations using scenario analysis transmission dynamics of novel coronavirus ( -ncov) feasibility of controlling covid- outbreaks by isolation of cases and contacts key: cord- -xfky q p authors: narayan, venkataraman; hoong, poon beng; chuin, siau title: innovative use of health informatics to augment contact tracing during the covid pandemic in an acute hospital date: - - journal: j am med inform assoc doi: . /jamia/ocaa sha: doc_id: cord_uid: xfky q p this case report described the innovative design and build of an algorithm that integrated available data from separate hospital-based informatics systems that perform different daily functions to augment the contact tracing process of covid- patients through identifying exposed neighboring patients and healthcare workers and assess their risk. prior to the establishment of the algorithm, contact tracing teams comprising six members each would spend up to hours to complete contact tracing for five new covid- patients. with the augmentation by the algorithm, we observed ≥ % savings in overall manhours needed for contact tracing when there were five and above daily new cases through a time-motion study and monte-carlo simulation. this improvement to the hospital’s contact tracing process supported more expeditious and comprehensive downstream contact tracing activities as well as improved manpower utilization in contact tracing. contact tracing is the process of identifying, assessing, and managing people who have been exposed to the virus to prevent onward transmission. 
[ ] as a densely populated city state, timely identification and isolation of people exposed to the virus to reduce further spread through efficient and effective contact tracing is a key national strategy of singapore to manage the covid- pandemic. [ ] government agencies and healthcare institutions carry out contact tracing in an integrated manner for all covid- patients. the contact tracing process in our hospital and how it interfaced with the ministry of health's (moh) community-level contact tracing is illustrated in figure . the process was initiated at the hospital upon the diagnosis of a covid- patient. as only % of cases have an incubation period of longer than days, [ ] the activity map, which comprised minute-to-minute details on the covid- patient's activities for the period starting from days before the onset of symptoms to the point the patient was admitted to the designated covid- facilities in the hospital, was obtained through a phone interview. the activity map included specifying the time and areas the patient had been to, the people he/she had encountered and their contact information. in order to identify the healthcare workers (hcws) and other patients who were in contact with the index case during this period, the clinical notes, patient movement charts, other patients' locations, and staff rosters were reviewed and cross referenced by the contact tracing team. subsequently these preliminarily identified hcws were interviewed via telephone to verify their contact with the covid- patient, the nature of their exposure, and the personal protective equipment (ppe) they were attired in to assess their risk of contracting the virus. this was a time consuming and labor-intensive investigative process. healthcare workers and other patients assessed to be at a higher risk of infections based on the prevailing infection control guidelines were isolated at home or in the wards and monitored for symptoms. 
the hospital's contact tracing process must be expeditious and comprehensive to support downstream activities in the community as well as the hospital itself. the information from the activity mapping and contact tracing of other patients and hcws was shared with moh within hours so that follow-up actions of community-level contact tracing and isolation of the at-risk groups would be timely. this benchmark was in line with the us-cdc interim guidance for risk assessment and management of healthcare personnel with potential exposure to covid- that stated that the care team contact tracing process should be completed within hours of each case's identification. [ ] data scientists, operations managers and clinical staff worked closely to integrate data available in the informatics systems with human-based interviews to improve the timeliness, comprehensiveness and efficiency of the contact tracing process. a data-mining algorithm was developed to integrate the available data from hospital-based informatics systems that perform various day-to-day functions to augment the contact tracing process of covid- patients to identify exposed hcws and neighboring patients. the algorithm would run and generate a customized contact tracing report for each covid- patient, which provided the following information for the period the patient was in the hospital: the patient's presence in the various areas and the time period, patients at the same area during the same time period, and hcws who attended to the patient at the various areas. the contact tracing team then scoped the contacts to be interviewed based on this comprehensive report. hcws who attended to more than one covid- patient were reflected as a contact in each of their reports. the algorithm and contact tracing reports were piloted and fine-tuned for the first covid- positive patients admitted to our hospital. it was evaluated for its accuracy and effectiveness in improving the contact tracing process.
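the core matching step of such a report (who was in the same area as the index patient during an overlapping time window) can be sketched as an interval-overlap query. this is a minimal illustration and not the hospital's algorithm: the `Stay` record, its fields, and the simplification that an hcw's rostered presence in an area stands in for "attended to the patient" are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stay:
    """A person's presence in one hospital area (all fields hypothetical)."""
    person_id: str
    role: str    # "patient" or "hcw"
    area: str
    start: int   # minutes since an arbitrary reference time
    end: int     # exclusive end, same units


def overlap_minutes(a, b):
    """Minutes two stays share in the same area (0 if none)."""
    if a.area != b.area:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def contact_report(index_id, stays):
    """Map (person, role, area) to total minutes shared with the index case."""
    index_stays = [s for s in stays if s.person_id == index_id]
    report = {}
    for other in stays:
        if other.person_id == index_id:
            continue
        for own in index_stays:
            minutes = overlap_minutes(own, other)
            if minutes > 0:
                key = (other.person_id, other.role, other.area)
                report[key] = report.get(key, 0) + minutes
    return report
```

because the report is built per index case, a person who overlaps with several index cases appears in each of their reports, which matches the behaviour described for hcws above. the output is only a candidate list; exposure and ppe still have to be verified by interview.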
we observed significant time savings for our staff performing the detailed activity mapping and the report gave a reliable validation reference for the final contact tracing reports. in the covid- pandemic, expedient identification of individuals with significant exposure to covid- patients is a key strategy to break the chain of transmission and flatten the epidemiology curve. a literature review of the databases of pubmed, cochrane, and embase, with the search terms of "contact tracing" and "covid- " did not return any studies that described an integrated use of hospital informatics systems to augment contact tracing. with the increasing number of new cases diagnosed daily, the capacity for timely contact tracing would have to be met by increasing staff numbers to perform interviews of the covid- patient and the contacts. the algorithm's value-add was the rapid and comprehensive identification of the covid- patient's activity as well as individuals at risk (hcws and other patients) to be interviewed. the contact tracing staff could then focus on the interviews and risk assessment of the contacts. as a result of better efficiency in manpower utilization, we were also able to maintain a lean contact tracing team despite the hospital experiencing an increasing number of covid- patients. the algorithm was limited in real-time data content because of its dependency on the ehints data repository, which is designed as a midnight snapshot server with daily data uploads scheduled at hours. hence for patients who presented at a&e and were admitted on the same day, the contact tracing reports were supplemented with screenshots from the live emr system. the time needed for this manual sequence was negligible and can be automated in future.
close interaction between the data scientist, operations managers and clinical staff was essential in the design, improvement, and operationalization of the above contact tracing model that integrated the use of data from informatics systems with interview-based contact tracing by the contact tracing teams. this model delivered faster contact tracing in the hospital, hence supporting more expeditious and comprehensive downstream contact tracing activities, as well as improved manpower utilisation in contact tracing. this research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors. implementation and management of contact tracing for ebola virus disease evaluation of the effectiveness of surveillance and containment measures for the first patients with covid- in singapore -us cdc morbidity and mortality weekly report the incubation period of coronavirus disease (covid- ) from publicly reported confirmed cases: estimation and application guidance for risk assessment and public health management of healthcare personnel with potential exposure in a healthcare setting to patients with novel coronavirus ( -ncov) the authors have no competing interests to declare. key: cord- -x fpa authors: backer, j. a.; mollema, l.; klinkenberg, d.; van der klis, f. r. m.; de melker, h. e.; van den hof, s.; wallinga, j. title: the impact of physical distancing measures against covid- transmission on contacts and mixing patterns in the netherlands: repeated cross-sectional surveys date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: x fpa background during the current pandemic of coronavirus (covid- ) many countries have taken drastic measures to reduce transmission of sars-cov . the measures often include physical distancing that aims to reduce the number of contacts in the population. little is known about the actual reduction in number of contacts as a consequence of physical distancing measures.
methods in the netherlands, a cross-sectional survey was carried out in / in which participants retrospectively reported the number, age and gender of different persons they had contacted (spoken to in person or touched) during the previous day. the survey was repeated among of the original participants, using the same questionnaire, in march and april after physical distancing measures had been implemented.
results the average number of contacts in the community was reduced from . (interquartile range: - ) to . (interquartile range: - ) different persons per participant, a reduction of % ( % confidence interval: - ). the reduction in the number of community contacts was highest for children and adolescents (between and years) and smallest for elderly persons of years and older. the reduction in the effective number of total contacts, measured as the largest eigenvalue of the matrix with community and household contacts, was % ( % confidence interval: - ).
conclusion the substantial reduction in contacts has contributed greatly to halting the covid- epidemic. this reduction was unevenly distributed over age groups, household sizes and occupations. these findings offer guidance for the lifting of age-group targeted measures.
since the beginning of , the novel coronavirus sars-cov that causes covid- disease has rapidly spread around the world. most patients only have mild symptoms, but elderly persons and persons with comorbidities in particular can develop severe acute respiratory disease (china cdc, ) . hospitals have been confronted with a fast growing number of patients, often exceeding the intensive care (ic) capacity. many countries have implemented control measures that include a combination of increased hygiene, travel restrictions, case finding, contact tracing and physical distancing measures.
the specific physical distancing measures differ between countries and between regions; their overall aim is to reduce the number of contacts in the population, thus preventing the transmission of infection. the impact of these measures on the reduction of contacts in the population, and on the reduction of contacts made by different groups within that population, is poorly quantified. different approaches exist to measure behavioural changes. mobile telephone data provided by telecom companies are used to measure the change in mobility patterns (oliver et al., ; pepe et al., ) . similarly, the location history of smart phones can be tracked with apps (google ). the anonymized and aggregated mobility patterns can suggest changes in contact patterns in the population at large. to obtain direct and detailed information on contacts, cross-sectional studies are conducted in which participants report their age and gender as well as the age and gender of all persons they have contacted during a single day (mossong et al., ; hoang et al., ) . contact surveys have been successfully used to quantify the reduction in the number of contacts associated with physical distancing measures in shanghai and wuhan, china, estimated at % and %, respectively (zhang et al., ) , and in the uk (jarvis et al., ) among the adult population, at %. one of the challenges with the contact survey approach is to obtain a reliable baseline measurement from before physical distancing measures were implemented. in the wuhan study participants needed to recall the number of contacts during a regular weekday at the end of . in the uk study the baseline was provided by a similar study conducted years ago among a different representative uk study population (mossong et al., ) .
we present a large contact survey conducted in the netherlands in march and april . the participants were recruited from a large sample of the dutch population who had participated in an earlier large cross-sectional survey in / (verberk et al., ) . the contact questionnaire was nearly identical in both surveys, which allows us to use the first survey as a baseline measurement, and the second survey as a measurement of contacts during the implementation of physical distancing measures. by march, the netherlands had imposed physical distancing measures including closing daycare centers, schools, universities; working from home whenever possible; closing of cafes, pubs, restaurants, theaters, cinemas, and sport clubs; cancelling events with more than persons attending; maintaining . m distance from others outside one's household. by comparing the survey results before and after the implementation of these physical distancing measures, we could determine the impact on the number of contacts made in the community (i.e. outside the household), distinguishing between different age groups, genders, household sizes, days of the week and occupations. we also assessed how the physical distancing measures have affected the total number of contacts, including contacts with household members, and the age-specific mixing patterns.
the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license. this version posted may , . . https://doi.org/ .
between february and october , a cross-sectional study was conducted in a sample of the dutch population from to years of age (verberk et al., ) , henceforth referred to as the baseline survey. participants were randomly selected from the dutch population registry using a two-stage cluster design. infants under year of age, persons living in areas with low vaccination coverage, and persons with a migration background were oversampled in this survey.
the study consisted of an extensive questionnaire, and included questions regarding the participants' age, gender and occupation, the age and gender of their household members, and the total number of unique persons they had contacted outside their household the previous day and which day of the week (i.e., monday through sunday) that was. participants reported the age of the contacted persons in age groups ( - , - , - , - , - , - , - , - , - , - , +) . a contact was defined as a conversation in person, or a physical contact. of the participants that had filled out the questionnaire in / , had indicated to be willing to participate in future research. of these, were invited on march to participate in the follow-up study, referred to as 'the physical distancing survey'. as of april , had returned a questionnaire. participants not reporting their household composition or reporting more than contacts were excluded from the analysis. the questionnaire in the physical distancing survey was identical to the baseline survey questionnaire, apart from two questions. a question asking whether participants had had any contacts outside their household was added (before asking how many), and the question about occupation was changed to cover occupations that involve many contacts and that are part of vital processes. we analysed the baseline and physical distancing contact surveys by comparing the number of contacts in the community per participant stratified by several characteristics: age, gender, household size, day of the week, and occupation (as reported in the physical distancing survey, under the assumption that participants did not change occupation between the two surveys).
we obtain the total number of contacts in the population by adding contacts with reported household members to the reported contacts in the community. we estimate age-stratified contact matrices that contain the numbers of contacts made between and within age groups, using an approach that accounts for reciprocity of contacts between different age groups (van de kassteele et al., ) , and using age-specific population size data for the netherlands on january and on january (statistics netherlands, a). to check the effect of enforcing reciprocity between contacts, we compare the estimated and observed mean number of contacts per participant. we characterize the mixing pattern of the age-specific contacts by the disassortativeness index (farrington et al., ) , and we characterize the effective number of age-specific contacts by the largest eigenvalue of the contact matrix (cf. diekmann et al., ) . all analyses were done using r version . . (r core team, ).
in total baseline survey participants and physical distancing survey participants were included for analysis. the composition of the survey population by age and gender reflected the dutch population ( figure ). the mean age was years (range - ) in the baseline survey, and years (range - ) in the physical distancing survey. each age group consisted of more than participants in both surveys, except for the + age group in the physical distancing survey ( table ).
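as a minimal sketch of the effective number of contacts described in the methods above, the python snippet below computes the largest eigenvalue of a small contact matrix and the relative reduction between two matrices. the analyses in this study were done in r; the power iteration used here is a stand-in for a library eigenvalue routine, and the function names and the 3 x 3 matrices are illustrative assumptions, not the authors' code or data.

```python
def largest_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Dominant eigenvalue of a non-negative square matrix by power iteration."""
    n = len(matrix)
    vector = [1.0] * n  # strictly positive start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        # multiply the matrix by the current vector
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        new_eigenvalue = max(abs(x) for x in product)
        if new_eigenvalue == 0.0:
            return 0.0
        # rescale so the largest component equals one
        vector = [x / new_eigenvalue for x in product]
        if abs(new_eigenvalue - eigenvalue) < tol:
            return new_eigenvalue
        eigenvalue = new_eigenvalue
    return eigenvalue


def relative_reduction(baseline_matrix, distancing_matrix):
    """Fractional drop in the effective number of contacts between two surveys."""
    before = largest_eigenvalue(baseline_matrix)
    after = largest_eigenvalue(distancing_matrix)
    return 1.0 - after / before


# hypothetical contacts per person per day between three age groups (made-up values)
baseline = [[8.0, 3.0, 1.0],
            [3.0, 6.0, 2.0],
            [1.0, 2.0, 3.0]]
# hypothetical survey in which every contact rate falls to 40% of its baseline value
distancing = [[0.4 * value for value in row] for row in baseline]

print(round(relative_reduction(baseline, distancing), 3))
```

scaling every entry by the same factor scales the largest eigenvalue by that factor, so this toy case gives a reduction of about 0.6. with real survey data the reduction differs by age group, which is why the eigenvalue of the full matrix, rather than the mean number of contacts, is used as the summary.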
the household size distribution for the baseline survey population reasonably reflects the household size distribution in the netherlands (statistics netherlands, b) , whereas the household size distribution for the physical distancing survey population contained relatively few single-person households. the reported average household size of . in the baseline survey is smaller than the reported average household size of . in the physical distancing survey. there were more than participants reporting their contacts for each day of the week, for both surveys. the percentage of participants who did not report any community contacts increased from % in the baseline survey to % in the physical distancing survey. on average, a participant had . ( - , interquartile range or iqr) community contacts per day in the baseline survey, and . ( - ) community contacts in the physical distancing survey (tab. ). in the baseline survey, participants aged to years had the highest number of contacts, and this number gradually declined with increasing age. in contrast, the number of contacts was similar across the different age categories in the physical distancing survey (fig. ) . the reduction in the number of contacts was greatest for participants aged to years, and lowest for participants aged to (tab. ). the reduction in the mean number of contacts was similar for male and female participants. the number of community contacts increased with household size in the survey, whereas this number was similar for the different household sizes in . in the baseline survey, most contacts were made during weekdays, slightly fewer on saturdays and fewest on sundays.
the reduction of contacts between the two surveys was similar for all days of the week, except for saturdays, when participants decreased their contacts to a lesser extent. the reduction in contacts varied between participants depending on their occupation. among the occupations represented by more than participants, the greatest reduction was observed for schoolchildren and students, and the lowest reduction for those working in food industry, retail and health care. in the estimation of the contact rates between age groups, the reciprocity between contacts is explicitly taken into account. the observed and estimated mean numbers of community contacts per participant are in agreement (fig. ) , showing that participants accurately reported their contacts. the contact matrices for both contact surveys and for the different types of contact (community, household) are shown in fig. . the contact matrices for household members illustrate how participants live with persons in their own age category and with their children or parents (i.e., years younger and older); this is apparent in both baseline and the physical distancing surveys. the matrices for contacts in the community indicate fewer contacts within the younger age groups in the physical distancing survey compared with the baseline survey. in the physical distancing survey, persons of years and older had a relatively large number of community contacts with age groups from to years old. in this survey, % of the contacts with persons of years and older were reported by health care workers. taken together, the household and community contact patterns reveal that age-specific mixing does not change much between the baseline and the physical distancing surveys (disassortativeness index is . for the baseline and . for physical distancing survey). however, the effective number of contacts per person (i.e. the largest eigenvalue of the contact matrix) decreased from ( - , % credible interval) in the baseline survey to . 
( . - . ) in the physical distancing survey, an average reduction of % ( - ).
in the netherlands, physical distancing measures came into effect on march . three weeks after implementation, the numbers of occupied hospital and ic beds reached a peak, after which they gradually declined. through comparison with the baseline survey, physical distancing measures were estimated to have drastically reduced the numbers of contacts in the community, by %. the reduction in the mean number of community contacts varied with occupation, and tended to be larger for younger age groups, larger household sizes, and weekdays as compared to weekends. the effective number of the total contacts per person decreased by %. this effective number of contacts per person is proportional to the reproduction number (i.e. the number of secondary infections caused by a single infectious person in the population) under two conditions. first, the definition of contact (having a conversation in person or physical contact) is a good proxy measure for at-risk contact events where sars-cov- can be transmitted. second, all age groups are equally susceptible and infectious. as evidence accrues that children are less infectious or less susceptible (zhang et al., ; gudbjartsson et al., ) , and children report a larger reduction in contacts compared with other age-groups after imposition of physical distancing, a reduction in the effective number of contacts represents an upper bound for the reduction of the reproduction number. this, in turn, implies that the physical distancing measures that reduce the effective number of contacts by %, are sufficient to reduce a reproduction number with values up to .
, to or lower, and halt the covid- epidemic. elderly persons have a relatively small reduction in their number of contacts. most of their reported contacts after the implementation of physical distancing measures were with health care workers. these remaining contacts are likely essential, and there is little prospect for further reducing the number of contacts for this age group. measures other than physical distancing are required to protect the elderly, who are the most susceptible to infection (zhang et al, ) and have the highest risk of complications (china cdc ). the study participants were randomly sampled from the dutch population in , with oversampling of zero-year olds, persons living in a low vaccination coverage area, and persons with a migration background. we have identified the following limitations. not all participants of the baseline survey indicated that they would participate in a follow up, and not all of those who were invited participated. for instance, % of participants in the physical distancing survey were indigenous dutch, compared to % in the baseline survey (verberk et al., ) . apart from the potential for selection bias, there are differences in participant characteristics between the baseline study in / and the physical distancing survey in , and these differences may have affected the results. first, the physical distancing survey participants are older, because they have aged in the intervening . years, with the consequence that the physical distancing survey does not contain any to year olds.
to assess the effect on the mean number of contacts, we created a synthetic population with the size and age distribution of the physical distancing population, sampled from the baseline population. this synthetic population has on average . ( . - . , % bootstrap interval) community contacts per participant, which is higher than the baseline population with a mean of . community contacts per participant. this means that the contact reduction may have been underestimated. second, the physical distancing survey participants reported a larger household size on average, which results in a greater number of household contacts. third, the physical distancing participants reported their contacts more often on weekdays ( % versus % in the baseline survey), which would result in an underestimation of the reduction in the number of contacts. fourth, the physical distancing survey was carried out in two months in spring, whereas the baseline survey was conducted over a period of almost two years. because contact patterns change little throughout the course of a year (béraud et al, ) , we do not expect this to substantially affect the estimated reduction. fifth, the additional question in the physical distancing survey questionnaire that asked beforehand whether the participants had had any contacts allowed us to check the reliability of their reported number of contacts. the reported number of contacts in the baseline survey, without such a check, may have been larger. all these differences together may have led to a slight underestimation in the reduction in the number of contacts.
the reduction in the number of contacts associated with physical distancing measures in the netherlands is smaller than the reductions of % and % observed in shanghai and wuhan, china (zhang et al., ) , most probably because of the complete lockdown in both cities. in the uk a % reduction in the number of all contacts among adults ( +) has been reported (jarvis et al. ) . to compare this with our results, we calculate the total number of contacts including community and household contacts for the participants in our surveys who are years and older, and find a reduction of %. at the time of the studies the control measures in the uk and the netherlands rank at similar values of the stringency index (hale et al., ) . the effect on contact reduction differs, possibly due to study population differences or compliance.
the results of this study have immediate application for a number of important issues of concern to the public health response and management of the covid- pandemic. the estimated contact reduction is applicable to other countries and regions with similar control measures. the estimated age-specific contact matrices are useful for conducting scenario analyses with age-structured transmission models of covid- , to project the future course of the epidemic under physical distancing measures, and to explore the possible effects of lifting them (salje et al., ) . as we plan to repeat the contact survey at regular intervals, we will be able to monitor the number of contacts made over time, and detect the impact of future changes in these measures such as the reopening of schools, as well as changes in compliance with the measures that remain. we believe that contact surveys such as these can help to inform and guide infection control measures in this time of unprecedented physical distancing, and in the next phase of the epidemic, when the most drastic control measures might be lifted.
we gratefully acknowledge the participants of the contact surveys in / and , as well as dr. s. a. mcdonald for critically assessing the manuscript.
the mec-u (medical research ethics committees united; r . ) has approved the research protocol "third population-based immune surveillance study used for the evaluation of immunity against sars-cov- (pienter corona)" and written informed consent was obtained from all adult participants, and parents or legal guardians of minors included in the study.
data of contact matrices shown in fig. are available as tab separated file (s _contact_matrices.tsv).
* households in the netherlands consist of ( %), ( %), ( %), ( %) and + ( %) persons (statistics netherlands, b) , leading to an expected distribution per participant to live in a ( %), ( %), ( %), ( %) or + ( %) person household.
the french connection: the first large population-based contact survey in france relevant for the spread of infectious diseases
china cdc the novel coronavirus pneumonia emergency response epidemiology team. the epidemiological characteristics of an outbreak of , vital surveillances - china cdc weekly, available from cdn.onb.it
the construction of next-generation matrices for compartmental epidemic models
measures of disassortativeness and their application to directly transmitted infections
spread of sars-cov- in the icelandic population
oxford covid- government response tracker, blavatnik school of government
a systematic review of social contact surveys to inform transmission models of close-contact infections
quantifying the impact of physical distance measures on the transmission of covid- in the uk
social contacts and mixing patterns relevant to the spread of infectious diseases
mobile phone data for informing public health actions across the covid- pandemic lifecycle
ciro cattuto, michele tizzoni covid- outbreak response: a first assessment of mobility changes in italy following national lockdown
the effect of control strategies to reduce social mixing on outcomes of the covid- epidemic in wuhan, china: a modelling study
r: a language and environment for statistical computing. r foundation for statistical computing
estimating the burden of sars-cov- in france. . pasteur- , available from halpasteur
third national biobank for population-based seroprevalence studies in the netherlands, including the caribbean netherlands
efficient estimation of age-specific social contact rates between men and women
using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents
changes in contact patterns shape the dynamics of the covid- outbreak in china
key: cord- - mrsvg b authors: ng, pai chet; spachos, petros; gregori, stefano; plataniotis, konstantinos title: epidemic exposure notification with smartwatch: a proximity-based privacy-preserving approach date: - - journal: nan doi: nan sha: doc_id: cord_uid: mrsvg b
businesses planning for the post-pandemic world are looking for innovative ways to protect the health and welfare of their employees and customers. wireless technologies can play a key role in assisting contact tracing to quickly halt a local infection outbreak and prevent further spread. in this work, we present a wearable proximity and exposure notification solution based on a smartwatch that also promotes safe physical distancing in business, hospitality, or recreational facilities. our proximity-based privacy-preserving contact tracing (p$^ $ct) leverages the bluetooth low energy (ble) technology for reliable proximity sensing, and an ambient signature protocol for preserving identity. proximity sensing exploits the received signal strength (rss) to detect users' interactions and thus classify them as low- or high-risk with respect to a patient diagnosed with an infectious disease. more precisely, a user is notified of their exposure based on their interactions, in terms of distance and time, with a patient. our privacy-preserving protocol uses the ambient signatures to ensure that users' identities are anonymized. we demonstrate the feasibility of our proposed solution through extensive experimentation.
many industries suspended their daily operations in response to the government's effort to contain the covid- pandemic. in view of the urgency of resuming daily life routines, several countries have started to relax the restrictions so that some industries can resume operation and have their employees return to normal activities. however, each industry is expected to implement some preventive measures to minimize the risk of further outbreaks. among those preventive measures, such as temperature checks, face coverings, and frequent hand washing, contact tracing is deemed essential in monitoring the interaction between individuals and thus providing an immediate alert to all those who were exposed when someone is diagnosed with an infectious disease [ ] , [ ] . an efficient contact tracing approach needs to properly address the question of how to monitor the interactions between employees and customers and how to alert exposed individuals while preserving the anonymity of the patient. while there are many smartphone-based contact tracing systems (e.g., pan european privacy-preserving proximity tracing (pepp-pt) [ ] , covid- watch [ ] , privacy-preserving automated contact tracing (pact) [ ] , etc.), these solutions might not be effective in a workplace because employees do not necessarily carry their smartphone with them all the time, due to the inherent nature of their activity. furthermore, many people might put the smartphone inside a pocket or backpack, which increases the difficulty of achieving reliable proximity sensing. what is needed is an effective and low-cost contact tracing solution that employees can use without affecting their activity and that at the same time provides line-of-sight (los) signals for more accurate proximity sensing. motivated by this limitation, this paper proposes a wearable contact tracing solution based on a low-cost smartwatch, namely proximity-based privacy-preserving contact tracing (p ct).
first, we exploit the proximity sensing information computed from the received bluetooth low energy (ble) signals to monitor the interaction between employees [ ] . second, we design a privacy-preserving protocol that encapsulates the ble packet with an ambient signature packet rather than the employee's identity or location-related information. the main framework describing the contact tracing based on ble technology is shown in fig. . each smartwatch will broadcast a ble packet periodically according to a system-defined interval. rather than using the conventional two-way ble communication channels (i.e., a secure channel for data exchange established through a series of pairing and handshaking processes), the smartwatch uses a non-connectable advertising channel, which was primarily used by beacon-based applications, to broadcast the packet. hence, it is almost impossible for any malicious device to connect to the smartwatch to access sensitive information. when two users are in proximity to each other, that is, when the smartwatches are within the broadcasting range, they can listen to the incoming packet and measure the received signal strength (rss). the smartwatches will log the packet including the measured rss value into their local storage, as shown in fig. (a) . the packet contains the ambient signature information observed by the individual smartwatch at particular timestamps. when an individual is diagnosed with an infectious disease, as shown in fig. (b) , the smartwatch will upload the individual's own signatures generated for the past days to the signature database. all the other employees will download the infected signatures into their smartwatch for signature matching. in other words, the signature matching process takes place in the individual smartwatch rather than the cloud server. 
in this case, there is no way for others to know who has come into close contact with the infected person. the smartwatch will automatically trigger an alert when it finds a matching signature. based on the alert, the individual will be informed about the necessary actions, such as self-quarantine, testing, follow-up and monitoring, so that further spread of this highly contagious disease can be prevented.
fig. : when employees a and b are in proximity to each other, their smartwatches will log the received ble packet containing the ambient signature information into their watch's local storage. when employee a is diagnosed with an infectious disease, the watch will upload his/her own signatures to the signature database. all the other employees can download those signatures and compare them to a list of signatures they have observed in the past days. an alert will be triggered when the downloaded signatures match one of the signatures on the list.
while it is relatively straightforward to develop such an application for the smartwatch for contact tracing purposes, it remains unclear how accurate the proximity sensing information estimated through the rss value is, and how the ambient signature information can help to prevent information leaks. furthermore, rather than a simple alert when there are matched signatures, it is more effective to tell the individual about their exposure risk level based on how close and how long their exposure was. this is because the risk of an individual being infected is low if they spent less than one second in close proximity to the infected employee, compared to an individual who spent more than one hour at a greater distance that is still relatively near (i.e., the smartwatch is still in the broadcasting range) to the infected employee.
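the on-device matching step and the "how close and how long" summary discussed above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the protocol implementation: the string signatures, the `LogEntry` fields, and both function names are hypothetical, and the duration is taken as the span between the first and last matched packet, which treats all matches as a single encounter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    signature: str    # ambient signature carried in the received packet (assumed to be a string)
    timestamp: float  # reception time in seconds
    rss: float        # received signal strength in dBm


def match_exposures(local_log, infected_signatures):
    """Keep the locally logged packets whose signature was published as infected."""
    infected = set(infected_signatures)
    return [entry for entry in local_log if entry.signature in infected]


def summarise_exposure(matches):
    """Summarise how long and how close a matched exposure was, or None if no match."""
    if not matches:
        return None
    times = [entry.timestamp for entry in matches]
    rss_values = [entry.rss for entry in matches]
    return {
        "samples": len(matches),                 # number of packets observed
        "duration_s": max(times) - min(times),   # first to last matched packet
        "max_rss": max(rss_values),              # strongest signal seen
        "min_rss": min(rss_values),              # weakest signal seen
    }


# hypothetical local log on one watch: two packets from "sig-a", one from "sig-b"
log = [
    LogEntry("sig-a", 0.0, -70.0),
    LogEntry("sig-b", 30.0, -80.0),
    LogEntry("sig-a", 60.0, -55.0),
]
summary = summarise_exposure(match_exposures(log, ["sig-a"]))
print(summary)
```

the matching runs entirely against the watch's own log, so the downloaded signatures never have to be linked to an identity on a server. the summary only prepares inputs; turning them into a low- or high-risk label is left to the classifier.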
recognizing the above challenges, we carefully developed our proposed p ct, which makes the following contributions:
• accurate proximity sensing: this is the first work that provides a comprehensive investigation of the performance of proximity sensing based on the rss values measured by a smartwatch worn by individuals. while rss suffers severe attenuation due to the human body, our empirical analysis verified that we can achieve satisfactory performance with a carefully implemented machine learning method. since the smartwatch is always worn on the wrist, it is less challenging for the smartwatch to provide suitable line-of-sight (los) signals compared to smartphones.
• low-cost device: being a low-cost commercial off-the-shelf device equipped with the essential ble technology, the smartwatch is an ideal solution for privacy-preserving contact tracing. the widely available software development support allows the workplace to provide quick yet reliable prototypes for testing prior to workplace reopening.
• risk classification: we define the exposure risk of a user based on the interaction duration and distance with the infectious individual. in contrast to most works that simply rely on the rss value as an input feature to train a classification model, we explore other possible input features including the number of samples observed by the smartwatch, the maximum rss, the minimum rss, and the rss range. our experiment unveils the effects of selecting the right features on classification performance.
• real-time notification and dataset: our application can provide real-time notification when the physical distancing requirement is violated. also, our experimental results were validated with a real-world dataset that was collected with a smartwatch worn on the wrist.
the rest of the paper is organized as follows. section ii provides the background related to contact tracing and discusses its current development. section iii presents our proposed p ct.
section iv describes the method to classify the risk level. section v discusses our experimental evaluations. section vi concludes the paper with future works. recognizing the urgency to have an effective contact tracing system, various digital-based solutions, either based on a smartphone or a smart wearable, have emerged lately. during an epidemic of a highly contagious disease such as covid- , it is very likely for anyone to contract the virus when they interact with an infected individual in close proximity for a very long time. contact tracing aims to trace down this group of people so that they can be aware of their exposure to the virus and take the necessary action as soon as possible. we can divide the contact tracing into two major phases: ) interacting phase: the interacting phase keeps track of the daily contacts including distance and duration. a contact tracing system should be able to detect when any two persons are in proximity to each other at the same time keeping track of the duration they remain in close proximity. an effective contact tracing system should be able to detect the proximity with high accuracy rather than seeking to estimate the exact distance, which is quite challenging considering the dynamic movement of humans. ) tracing phase: when a person is diagnosed with the infectious disease, we need to trace down a list of people who have been in close contact with the infected person because they are more likely to get affected. if this group of people can be notified promptly, we reduce the chances for the virus to continue to spread to others. however, many people are concerned about exposing their identity during the tracing phase. hence, a privacy-preserving contact tracing system should provide these two pieces of information without disclosing one's identity. the traditional contact tracing is conducted manually through in-person interviews and investigations. 
such a manual method based on subjective feedback (i.e., feedback from the infected individual) is unable to gather the precise interaction distance and duration. furthermore, the investigator might acquire some sensitive information to identify those people who have come into close contact. in contrast to the manual contact tracing, the digital tools for contact tracing can provide more precise information regarding the interaction distance and duration, at the same time preserving the individual's privacy. to date, the digital-based contact tracing can be categorized into smartphone or smart wearable-based: ) smartphone-based contact tracing: the pervasiveness of smartphones has made smartphones the most popular choice when comes into the digital-based contact tracing system. the rich sensing features embedded in the smartphone provides a better estimation of interaction distance and duration [ ] - [ ] . for example, many works leverage location sensing [ ] and proximity sensing [ ] to keep track of the interaction between any two individuals. there are also works exploiting the heterogeneous sensing features in a smartphone to improve the distance estimation [ ] . however, most of these works fail to consider the location of the smartphone during the interacting phase. while people might carry their smartphone with them during grocery shopping, the smartphone will be inside a pocket or a purse most of the time. hence, the distance estimation is more complex and can be highly inaccurate for a contact tracing application. at the same time, people might not carry the smartphone with them all the time while working. ) wearable-based contact tracing: considered the inconsistency of smartphones, some companies have started to exploit the smart wearable approach to contact tracing. the goal is to resume the working routine with less distraction. 
for example, easyband [ ] presents a wearable solution to auto contact tracing while encouraging safe social distancing practice during interaction. however, easyband uses a centralized server for contact tracing, in which all the users' data is uploaded to the cloud through tcp/ip connection. such a centralized approach is not scalable as all the computations to find the close contact for all the workers are performed within the server. furthermore, there is a high possibility of information leak if the server is compromised. our proposed p ct, on the other hand, provides a privacy-preserving contact tracing by keeping no individual information on the cloud server. recognizing the importance of contact tracing in resuming the normal lifestyle while preventing the further virus outbreak, both government and academia have devoted efforts in developing a more effective contact tracing solution to fight against covid- . ) national-level efforts: china, south korea, and singapore are among the first countries enforcing digital tools for contact tracing. china leverages its existing surveillance strategy to implement a close contact detector based on qr codes technology [ ] . south korea leverages the location data (i.e., the gps data) from the smartphone to detect the distance of the users and push a notification containing personal details of the infected individuals to the nearby users [ ] . singapore developed the tracetogether application based on ble signals on the smartphone to detect the proximity between any two individuals [ ] . while the methods applied by china and south korea might be less strict on user's privacy, singapore adopted a more privacy-preserving approach by only tracking the proximity between users without explicit location information. ) academia-level efforts: there have been a number of initiatives from industry and academia researchers in delivering an effective contact tracing solution while preserving user privacy [ ] , [ ] . 
for example, pan european privacy-preserving proximity tracing (pepp-pt) detects the proximity based on the broadcast ble packet containing a fully anonymous id [ ] . covid- watch automatically alerts users when they have been in contact with an infected individual [ ] . the privacy-preserving automated contact tracing (pact) exploits the ble signals in combination with secure encryption to detect possible contacts while protecting users' privacy [ ] . most of these initiatives assume that the ble signals will work for proximity detection, yet no work provides a comprehensive study of the accuracy of using ble signals for proximity sensing. to bridge the gap, this paper presents extensive experiments to validate the feasibility of using ble signals for proximity detection.

our proposed p ct leverages the ble technology available on the smartwatch for proximity sensing. to achieve privacy-preserving contact tracing, we adopt the following signature protocol to define the ble advertising packet. as a popular short-range communication technology over the 2.4 ghz ism band [ ] , ble is readily available in many smart devices including smartwatches, earphones, smart thermostats, etc. [ ] , [ ] . ble communicates through either non-connectable advertising or connectable advertising [ ] . the latter advertising mode allows another device to request a secure connection through handshaking. our proposed p ct uses the non-connectable advertising mode, which rejects any incoming connection requests [ ] , as the main feature for exposure notification purposes. the non-connectable advertising mode allows the smartwatch to broadcast a short advertising packet periodically according to the system-defined advertising interval, t a . upon receiving the packet, the smartwatch can measure the rss and use it to estimate the proximity.
more precisely, the rss is inversely proportional to the square of the distance according to the inverse square law [ ] , [ ] : $P_r \propto \frac{1}{d^{n}}$, where $P_r$ is the signal strength in dbm, $d$ is the distance between any two smartwatches, and $n$ is the path loss exponent ($n = 2$ in free space). even though the rss-distance relationship holds for the signal in free space, the rss values suffer a great variation in practical environments owing to the multipath [ ] and body shadowing effects [ ] , [ ] . we can minimize the signal variation by applying signal filtering methods, such as a moving average. as shown in fig. , the rss values at each distance are more distinct and show less variation when a moving average is applied, as compared to the raw rss data. while we can set a cut-off threshold, for example, treating any value greater than − dbm as being in close proximity, such a thresholding approach results in a high false-negative rate with raw rss values (i.e., the system will not record the contact as close proximity) and a high false-positive rate with filtered rss values (i.e., the system will record the contact as close proximity when it is not). rather than using a thresholding approach, we apply machine learning methods to proximity sensing and further classify the sensing output into high-risk and low-risk.

we propose a signature protocol that constructs a signature vector that fits into the length-constrained advertising packet (i.e., the available payload is only bytes). specifically, each smartwatch is configured to execute the following functions:
i. signature generation: the smartwatch scans for the ambient environmental features. these features are selectively processed to generate a unique signature that fits into the bytes advertising payload. the signature is updated every few minutes.
ii. signature broadcasting: the smartwatch broadcasts the advertising packet containing the unique signature periodically according to the advertising interval t a . the packet is broadcast through the non-connectable advertising channels.
iii. signatures observation: the smartwatch scans the three advertising channels to listen for the advertising packets broadcast by the neighboring smartwatches. the scanning is performed in between the broadcasting events.
the signature is a -dimensional transformed vector containing the ambient environmental features. upon the generation of the signature, the smartwatch will encapsulate the signature information into its advertising packet and broadcast the packet through the non-connectable advertising channels. the nearby smartwatches can see the packet when they scan the advertising channels on which the packet is transmitted. the timing diagram for the advertising, scanning, and signature generation activities, in which each activity is triggered periodically according to its interval, i.e., generation interval t g , advertising interval t a , and scanning interval t s , is shown in fig. . given t s , the smartwatch will only stay active to listen for the incoming packet for a duration defined by the scanning window t w . while it is possible to use continuous scanning (i.e., by setting t w = t s ) to increase the packet receiving rate, such a scanning approach has an adverse effect on the energy consumption.

iv. risk classification
besides using proximity sensing to detect the approximate interaction distance between any two individuals, we also consider the interaction time when labeling the individual as low or high risk. proximity sensing has been employed in many scenarios, for example, to identify the user proximity to museum collections [ ] , gallery art pieces [ ] , etc. some works study proximity detection in a dense environment [ ] , or proximity accuracy with filtering techniques [ ] .
however, most of these works study the proximity detection between a human and an object with an attached ble beacon [ ] . so far, there is no work studying the proximity sensing between devices carried by two humans. while estimating the distance can help to check if the user maintains a safe physical distancing, an exact m distance should not be a rigid requirement in classifying the risk of a user. rather, we are more interested to know the proximity between any two workers, and how long they remain in proximity. then, we can forward these pieces of information to the epidemiologists and they can decide to classify a contact as high or low risk. ble is an excellent technology for the above purpose since ble is a short-range communication that can only be heard when two smartwatches are in the communication range of each other. upon receiving the advertising packet, the smartwatch can measure the rss and thus estimate its proximity to the nearby smartwatch. we classify the proximity into two classes, i.e., far and close. we define close proximity when the distance between any two smartwatches is less than m, and any distance greater than m but less than the broadcasting range is considered far. in other words, the two smartwatches are not in proximity if they are outside the broadcasting range of each other. the rss distributions for far and close proximity is shown in fig. . it is clear that there would be errors if proximity were decided by simply setting an rss threshold. for example, if we set anything above − dbm as close proximity, chances are some values greater than − dbm are from the smartwatch located at a distance greater than m. hence, it is unreliable to identify the risk of an individual simply based on the proximity. at the same time, some individuals might come in very close proximity when they pass by each other. hence, we also consider the interaction time when we want to identify the risk of an individual. 
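to make the limitation of thresholding concrete, the sketch below smooths a short rss trace with a trailing moving average and then applies a fixed cut-off to both the raw and the filtered values. it is only an illustration: the function names, the window length of 3 and the −65 dbm cut-off are assumptions chosen for the example, not values taken from the paper.

```python
from collections import deque


def moving_average(rss_values, window=5):
    """Trailing moving average; early outputs average the samples seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)
    smoothed = []
    for value in rss_values:
        buf.append(value)
        smoothed.append(sum(buf) / len(buf))
    return smoothed


def threshold_proximity(rss_values, threshold_dbm):
    """Label each sample 'close' if its RSS is above the cut-off, else 'far'."""
    return ["close" if v > threshold_dbm else "far" for v in rss_values]


# A fluctuating trace (made-up values in dBm) and a hypothetical -65 dBm cut-off.
raw = [-70.0, -60.0, -80.0, -62.0, -78.0]
smoothed = moving_average(raw, window=3)

raw_labels = threshold_proximity(raw, -65.0)
smoothed_labels = threshold_proximity(smoothed, -65.0)
```

on this toy trace the raw samples flip between "close" and "far" from one packet to the next, while the filtered samples all fall on one side of the cut-off. whichever side is correct, a single threshold cannot be right for both views, which is the motivation for learning the decision from several features instead.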
while an individual is more likely to be infected when in close proximity to the infected person, the risk of getting infected also depends on the interaction time. we therefore consider three hypotheses, where $H_{+}$ denotes the hypothesis that the user belongs to the high-risk ($+1$) group, $H_{-}$ the hypothesis that the user belongs to the low-risk ($-1$) group, and $H_{0}$ the null hypothesis. specifically, the null hypothesis holds when the user is risk-free, i.e., the user is outside the communication range of the infected person. obviously, miss detection is undesirable because the user might be at risk but the system considers the user safe. a false-negative misclassifies a high-risk user as low-risk, which may give the user the wrong impression that their probability of getting infected is low when it could actually be high. a false-positive misclassifies a low-risk user as high-risk; being more conservative, it is a relatively safer outcome than a miss detection or a false-negative.

we can apply supervised machine learning methods to train a classification model. however, supervised methods require a set of labeled data, which is not readily available. to address this problem, we developed an application on the smartwatch to collect the ble data. given the collected data, we can train a classification model, as shown in fig. . during the training phase, the data is split into training and validation sets before being fed to model learning. the objective is to learn a set of weights that fits the hypothesis function $R(\mathbf{x}, C)$ defined by the corresponding classification model $C$. validation is performed to evaluate the learned model and to prevent the model from overfitting. if necessary, model fine-tuning can be performed to improve the classification performance. mathematically, the learning process aims to fit the risk mapping function $R : \mathbf{x} \to y$ given a set of $n$ training samples $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)\}$, where $\mathbf{x} = (x_1, \ldots, x_m)^{\top}$ is an $m$-dimensional feature vector and $y \in \{+1, -1\}$ is the classification output indicating the risk of a user. in this paper, we exploit four types of classifiers: decision tree (dt), linear discriminant analysis (lda), naïve bayes (nb) and k-nearest neighbors (knn).

1) dt: the top-down approach is the commonly used method to learn a classification tree. more precisely, dt starts by choosing a feature from the feature vector that provides the best split with respect to the target risk label, and then repeats the same splitting procedure for each separate branch until it reaches a final decision. let $\theta = (x, \gamma)$ be the splitting rule given feature $x$ and threshold $\gamma$; we can split the $n$ samples of training data $T$ into two subsets, i.e., $T_{l}(\theta) = \{(\mathbf{x}, y) \in T : x \le \gamma\}$ and $T_{r}(\theta) = T \setminus T_{l}(\theta)$, where $T_{r}$ and $T_{l}$ are the resultant subsets representing the data for the right and left branches, respectively. the measure commonly used to govern the splitting rule is the gini impurity $G(\cdot)$, which tells how likely the model is to produce a misclassification if it labels a randomly chosen sample according to the label distribution of the subset. mathematically, the gini impurity can be computed as follows: $G(T, \theta) = \frac{n_l}{n} H(T_{l}(\theta)) + \frac{n_r}{n} H(T_{r}(\theta))$, where $n_l$ and $n_r$ are the number of training samples in each subset, and $H(\cdot)$ is the entropy function, i.e., $H(T) = \sum_{y} p_y (1 - p_y)$, and $p_y$ denotes the probability of correct classification. suppose that $I \in \{0, 1\}$ is the indicator function and $\tilde{y}$ is the predicted output; then we have $p_y = \frac{1}{n} \sum_{i=1}^{n} I(\tilde{y}_i = y)$. the objective of dt is to find the parameters that produce the best splitting rule, i.e., $\theta^{*} = \arg\min_{\theta} G(T, \theta)$.

2) lda: assuming that the covariance for each class is the same, lda learns a classifier by fitting a gaussian density to each class. let $P(\mathbf{x} \mid \tilde{y} = y)$ be the conditional distribution for each class $y \in \{+1, -1\}$; by applying bayes' rule, we obtain $P(\tilde{y} = y \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid \tilde{y} = y)\, P(\tilde{y} = y)}{\sum_{y \in \{+1, -1\}} P(\mathbf{x} \mid y)\, P(y)}$. then, the class (i.e., the risk) can be determined by selecting the output with the highest posterior probability.
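the decision-tree splitting rule described above can be illustrated with a small exhaustive search. this is a sketch under stated assumptions rather than the authors' matlab implementation: the function names are hypothetical, the impurity is the standard gini form $1 - \sum_y p_y^2$, and samples with a feature value at or below the threshold are sent to the left branch.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_y p_y^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Search every (feature index, threshold) pair for the lowest weighted impurity.

    rows   : list of feature vectors (lists of numbers)
    labels : list of class labels, e.g. +1 (high risk) and -1 (low risk)
    Returns (feature_index, threshold, weighted_impurity), or None if no split exists.
    """
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        for gamma in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= gamma]
            right = [y for row, y in zip(rows, labels) if row[j] > gamma]
            if not left or not right:
                continue  # a split must send samples to both branches
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[2]:
                best = (j, gamma, score)
    return best


# Toy data: column 0 is a mean RSS in dBm, column 1 a sample count (made-up values).
rows = [[-60.0, 30.0], [-62.0, 25.0], [-80.0, 28.0], [-85.0, 5.0]]
labels = [1, 1, -1, -1]
split = best_split(rows, labels)
```

a full tree would apply `best_split` recursively to each branch until the subsets are pure or a depth limit is reached; only the single-split step is shown here.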
3) nb: following a naïve assumption that each feature is conditionally independent, we can apply bayes' theorem to learn a classification model. by simplifying $P(x_i \mid y, \forall x \in \mathbf{x})$ to $P(x_i \mid y)$, we have $P(y \mid \forall x \in \mathbf{x}) = \frac{P(y) \prod_{i=1}^{m} P(x_i \mid y)}{P(\mathbf{x})}$. since $P(y \mid \forall x \in \mathbf{x})$ is proportional to $P(y) \prod_{i=1}^{m} P(x_i \mid y)$, we can use maximum a posteriori (map) estimation to estimate the probability for each class $P(y)$ and the conditional probability for each class given the feature $P(x_i \mid y)$. the output risk can then be predicted based on the following rule: $\tilde{y} = \arg\max_{y} P(y) \prod_{i=1}^{m} P(x_i \mid y)$.

4) knn: the goal of knn is to maximize the probability of correct classification. let $p_i$ indicate the probability that a training sample $i$ is classified correctly; according to the stochastic nearest neighbors' rule, we have: $p_i = \sum_{j \in T_i} p_{ij}$, where $p_{ij}$ is the probability that sample $i$ selects sample $j$ as its neighbor and $T_i$ is the subset of data belonging to the same class as training sample $i$. given $p_i$, the goal of knn can be defined as follows: $\arg\max \sum_{i=1}^{n} p_i$.

note that all the classifiers described above can be further extended by assuming different distribution functions. one possible direction for future work is to calibrate the classifier based on prior knowledge of the empirical distribution in a certain environment. more precisely, different environments might produce different distributions, and if we can acquire this information, it could help to better calibrate the classifier and thus improve the classification performance.

we consolidated the collected data from both smartwatches before dividing them into training and testing datasets. then, we evaluated the experimental results obtained from different classifiers. for the experiment, we used fossil sport, a smartwatch based on google's wear os . . the smartwatch is powered by a qualcomm snapdragon wear processor and has an internal memory of up to gb. the gb internal storage is sufficient to store the generated and observed signatures for at least days. the small form factor (i.e., . in amoled screen with mm case size and mm case thickness) makes the smartwatch an ideal candidate for contact tracing in the workplace. as shown in fig.
, the smartwatch can trigger the alert automatically when any two smartwatches are in close proximity to each other. we programmed the smartwatch to broadcast the advertising packet in the background. for experimental purposes, we also programmed the application to log all the advertising packet it received at every distance. in particular, the following information will be logged: the ground truth distance, name of the smartphone, mac address of ble chipset, the packet payload, rss values, time elapsed, and timestamp. the time elapsed indicates the time difference between the previous broadcast packet and the current broadcast packet, whereas the timestamp is the exact time when the smartphone received the packet. we performed the experiment by asking two volunteers to stand at a certain distance from each other, from . m up to m, as illustrated in fig. . a measuring tape is used as a reference to the ground truth distance. we first performed the experiment by asking volunteer a to wear the smartwatch on her left hand, and volunteer b on her right hand (i.e., left to right (lr)). after that, we repeated the same experiment with right hand to left hand (rl), left hand to left hand (ll), and right hand to right hand (rr). since lr and rl constitute a direct line between two smartwatches and ll and rr constitute a crosswise line, we categorize these four hand-combinations into two groups: a) direct line, and b) crosswise line. all the measurement data is saved into a "comma-separated values" (.csv) file format and exported to matlab for training and testing. in total, we have collected , data points from all the four combinations, as shown in table i . we consolidated the data from rr and ll into a single dataset (i.e., the crosswise dataset) and then apply an %- % splitting rule to split the data into training and testing set. similarly, we applied the same splitting rule to the consolidated data from rl and lr (i.e., the direct dataset). 
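the consolidate-then-split step can be sketched as below. the split percentages are not legible in this copy of the text, so the 20% test fraction used here is purely an assumption for illustration, as are the function name `split_dataset` and the toy row values.

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=None):
    """Shuffle the rows and return (training_rows, testing_rows).

    test_fraction is a placeholder value, not the paper's setting.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Consolidate two hand combinations into one dataset before splitting
# (the integers stand in for logged measurement rows).
rr_rows = list(range(0, 5))
ll_rows = list(range(5, 10))
crosswise = rr_rows + ll_rows
train, test = split_dataset(crosswise, test_fraction=0.2, seed=1)
```

fixing the seed makes a split reproducible; drawing a different seed per repetition gives the kind of repeated random test sets used later in the evaluation.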
for each training and testing set, the first five columns indicate the input features and the last column is the target label (i.e., the risk). these five input features are the number of samples observed by the smartwatch, the mean rss, the maximum rss, the minimum rss, and the rss range (i.e., maximum rss − minimum rss). note that the number of samples observed by the smartwatch tells how long the smartwatches have been in proximity to each other. the final training and testing data for both sets are shared openly in our github repository [ ] .

we used four metrics (i.e., precision ($P$), recall ($R$), $F_1$ score ($F_1$) and accuracy ($A$)) to evaluate the performance of the classifier. let $T_{+}$, $T_{-}$, $F_{+}$ and $F_{-}$ denote the true-positive, true-negative, false-positive and false-negative counts, respectively; then the above four metrics can be computed as follows: $P = \frac{T_{+}}{T_{+} + F_{+}}$, $R = \frac{T_{+}}{T_{+} + F_{-}}$, $F_1 = \frac{2 P R}{P + R}$, and $A = \frac{T_{+} + T_{-}}{T_{+} + T_{-} + F_{+} + F_{-}}$. precision tells how many of the samples the classifier predicted as high-risk are actually high-risk. in other words, high precision indicates that the classifier produces few false-positives, which means the classifier avoids creating unnecessary tension and anxiety. recall, on the other hand, tells how many of the samples that are in fact high-risk were predicted as high-risk. in contrast to the accuracy, which considers the number of correctly classified true-positives and true-negatives, the $F_1$-score considers the balance of precision and recall. the $F_1$-score is a useful metric when false-negatives and false-positives are important factors in evaluating the classifier performance.

we fed the two consolidated datasets, i.e., the direct and crosswise datasets, to the four different classifiers (i.e., dt, lda, nb, and knn) for training. we repeated the experiment times with a different set of testing data. specifically, we randomly sampled % of data from the dataset for testing purposes at every iteration. for each evaluation metric, we show the mean result and its corresponding % confidence interval (ci).
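the four metrics and the mean-with-confidence-interval summary can be computed as in the sketch below. the confidence level is not legible in this copy of the text, so a 95% normal-approximation interval (z = 1.96) is assumed here; the function names and the toy labels are likewise illustrative.

```python
import math
import statistics


def confusion_counts(y_true, y_pred, positive=1):
    """Count true-positives, true-negatives, false-positives and false-negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Precision, recall, F1 and accuracy; ratios with a zero denominator map to 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def mean_ci95(values):
    """Mean and half-width of an assumed 95% normal-approximation CI (needs >= 2 values)."""
    mean = statistics.mean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


# Toy labels: +1 is high risk, -1 is low risk.
y_true = [1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, 1, -1, -1, -1, -1, -1]
result = metrics(*confusion_counts(y_true, y_pred))
```

in a repeated-split evaluation, `metrics` would be called once per random test set and `mean_ci95` applied to the resulting list of values for each metric.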
an illustration of the f -score distribution obtained from dt with the testing sets, is shown in fig. . the overall mean results and % ci for both direct and crosswise dataset are shown in table ii and table iii , respectively. from both tables, we can see that all the classifiers achieve satisfactory performance with high precision and recall. in other words, the classifier did not penalize the recall in order to achieve high precision. hence, the f -scores for both datasets are high. we also observed that the direct dataset gave a better performance than the crosswise dataset. this can be explained by the possible signal attenuation when the two hands are blocked by the human body. among all the classifiers, dt achieves the best performance with the highest precision, recall, f -score, and accuracy. in fig. , it shows the precision-recall curve for (a) direct and (b) crosswise. the precision-recall curve provides further insight into the trade-off between precision and recall. both plots indicate that dt achieves superior performance with high precision and recall, whereas other methods tend to trade-off the recall in order to achieve high precision. previously, we used all the five input features (i.e., number of samples observed by the smartwatch, mean rss, maximum rss, minimum rss, and rss range) to train the model. all the four trained classifiers were able to produce satisfactory classification performance, i.e., at least % accuracy. hence, we would like to investigate the implication of input features on the classification performance. we repeated the experiment by using only one feature (i.e., mean rss), and then two features (i.e., mean rss and the number of samples), and so on. the classification accuracy achieved by all the four classifiers is shown in fig. . from both bar charts, we can see that knn suffers severe performance degradation when only one input feature available. overall, the performance increases when the number of features increases. 
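the five input features and the one-feature, two-feature, ... ablation can be sketched as follows. the feature order beyond the first two (mean rss, then number of samples) is not stated in the text, so the order in `FEATURE_ORDER` is an assumption, and all names and values are illustrative.

```python
# Assumed order in which features are added during the ablation; only the first
# two entries (mean RSS, then number of samples) are given in the text.
FEATURE_ORDER = ["mean_rss", "num_samples", "max_rss", "min_rss", "rss_range"]


def rss_features(rss_window):
    """Compute the five input features from one window of RSS readings (dBm)."""
    if not rss_window:
        raise ValueError("rss_window must contain at least one reading")
    highest, lowest = max(rss_window), min(rss_window)
    return {
        "num_samples": len(rss_window),
        "mean_rss": sum(rss_window) / len(rss_window),
        "max_rss": highest,
        "min_rss": lowest,
        "rss_range": highest - lowest,
    }


def feature_vector(features, k):
    """Return the first k features, in FEATURE_ORDER, as a list for a classifier."""
    if not 1 <= k <= len(FEATURE_ORDER):
        raise ValueError("k must be between 1 and the number of features")
    return [features[name] for name in FEATURE_ORDER[:k]]


window = [-60.0, -70.0, -65.0]      # made-up readings
feats = rss_features(window)
one_feature = feature_vector(feats, 1)
all_features = feature_vector(feats, 5)
```

training each classifier on `feature_vector(feats, k)` for increasing `k` reproduces the shape of the experiment, although the resulting accuracies depend entirely on the real dataset.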
the performance gain of each classifier when the number of features increases, is shown in fig. . we can see that knn benefited a lot when there are more input features. on the other hand, both lda and nb did not show improvement after two features. their performance saturated when the number of features is more than two. it can be noted that the performance of dt also increases when the number of features increases, even though the performance gain is quite minimal. overall, we can see that some features are indeed useful in training a good model, while some features might be redundant and can be excluded from training. for example, the maximum rss and minimum rss might not provide good information to the model training, whereas the rss range provides more useful information. the rss range provides an indication of how big the rss fluctuated during a particular observation period, and this piece of information is indeed helpful to model learning. as discussed, the number of samples observed by the smartwatch within a certain continuous time period is a good indication of how long the user has been interacting with each other. furthermore, we can make a better inference when the number of samples observed by the smartwatch increases. the effect of the number of samples on the classification accuracy is illustrated in fig. . it is clear that the accuracy increases when the number of samples increases and then slowly saturates after it obtains a sufficient number of samples. in other words, the increase in the number of samples has less effect on accuracy when the system has obtained a sufficient number of samples to make an inference. from the results, we can see that the accuracy starts to saturate when the number of samples reaches , for the (a) direct and (b) crosswise cases. hence, we can conclude that most classifiers can produce proper classification output when there are at least samples. 
if the smartwatch is configured to advertise the packet every ms, we should expect approximately samples per second, which means that approximately s are required for each classifier to reach a stable performance. in practice, this is a reasonable duration considering the interaction duration between users. furthermore, if the interaction duration is less than s, the risk of getting infected is low even if the user is very close to the infected individual. the world health organization recommends a distance of at least one meter. however, different countries implement different physical distancing requirements, from m to m, depending on factors including location, activity, and age of the individuals. considered the variations in physical distancing requirements, we conducted an experiment to verify our classification approach with different physical distancing thresholds. the classification accuracy with different physical distancing threshold is shown in fig. . the results prove the robustness of our classification approach, in which each classifier achieves almost similar accuracy despite the differences in the physical distancing threshold. this means that our proposed approach is practical and can be applied in any setting directly by simply updating the physical distancing threshold in correspondence to the set of required preventive measures. contact tracing is deemed to be an essential measure in the post-pandemic to prevent the second outbreak while slowly reopening the workplace. even though smartphonebased contact tracing is cost-effective considering the ubiquity of smartphones, it is not convenient to have the employee carry with them the smartphone all the time during working. on the other hand, a smart wearable approach provides a more practical solution to contact tracing in the workplace. in this paper, we verify the practicality of our proposed p ct with real-world ble data collected from the smartwatch. 
for future work, we can integrate the embedded sensors within the watch to monitor employee's activity and thus to better predict their interaction behaviors. the additional knowledge of interaction behaviors, besides the interaction proximity and

quantifying sars-cov- transmission suggests epidemic control with digital contact tracing
contact tracing and disease control
pan-european privacy-preserving proximity tracing
we put the power to reduce the spread of covid- in the palm of your hand
pact: private automated contact tracing
a reliable smart interaction with physical thing attached with ble beacon
smartphone inertial sensor-based indoor localization and tracking with ibeacon corrections
indoor positioning using smartphone camera
face-to-face proximity estimation using bluetooth on smartphones
coronavirus mobile apps are surging in popularity in south korea
epidemic contact tracing with smartphone sensors
easyband: a wearable for safety-aware mobility during pandemic outbreak
china launches coronavirus 'close contact detector' app
privacy guidelines for contact tracing applications
tracesecure: towards privacy preserving contact tracing
overview and evaluation of bluetooth low energy: an emerging low-power wireless technology
bluetooth: a viable solution for iot?
smartphones and ble services: empirical insights
secure seamless bluetooth low energy connection migration for unmodified iot devices
ble beacons for internet of things applications: survey, challenges, and opportunities
rss localization using unknown statistical path loss exponent model
improved distance estimation with ble beacon using kalman filter and svm
compressive sensing-based multipath exploitation for stationary and moving indoor target localization
body shadowing and furniture effects for accuracy improvement of indoor wave propagation models
human body shadowing effect on uwb-based ranging system for pedestrian tracking
ble beacons for indoor positioning at an interactive iot-based smart museum
notify-and-interact: a beacon-smartphone interaction for user engagement in galleries
high resolution beacon-based proximity detection for dense deployment
improving ble beacon proximity estimation accuracy through bayesian filtering
a compressive sensing approach to detect the proximity between smartphones and ble beacons

key: cord- -ti uye authors: vianya-estopa, marta; garcia-porta, nery; piñero, david p; mannion, luisa simo; beukes, eldre; s wolffsohn, james; allen, peter m. title: contact lens wear and care in spain during the covid- pandemic date: - - journal: cont lens anterior eye doi: . /j.clae. . . sha: doc_id: cord_uid: ti uye
aim: to establish contact lens wear and care practices during the covid- pandemic in spain. method: a -item anonymous online survey was distributed during the period th april to (th) may via qualtrics. the survey explored: a) demographic characteristics (age, sex, general health and where they were living during lockdown), b) changes in their contact lens use during lockdown, c) hygiene and contact lens compliance and d) concerns associated with contact lens wear and ways to support wearers during the pandemic. results: two hundred and sixty responses were analysed ( . ± . years old, % female).
three-quarters of participants reported that they were self-isolating or rigorously following social distance advice. sixty-seven percent of participants reported using their contact lenses less during the pandemic. respondents were found to be compliant with handwashing prior to inserting and removing contact lenses (in both cases % doing this 'most times' or 'every time'). however, only % complied with the ' -second rule' and % used a shared towel to dry their hands. a higher proportion of hydrogen peroxide users replaced the lens case monthly compared to multi-purpose users ( % vs. %; p < . ). twenty-four percent admitted wearing lenses whilst showering and % did not consider ceasing lens wear if feeling unwell with flu/cold symptoms. conclusion: eye care practitioners should continue to educate contact lens wearers to ensure safe contact lens wear to minimise the chance of developing contact lens related complications during the pandemic. modifiable factors that need particular attention in spain include: handwashing for at least seconds before lens handling, drying hands with single use paper towels, including a rub-and-rinse step for reusable lenses, lens case cleaning and renewal, avoidance of water exposure and when to cease lens wear during the pandemic. coronavirus disease (covid- ) has experienced a rapid spread globally since december [ ] , with the world health organisation (who) declaring it a pandemic on th march . spain was one of the first european countries affected by the covid- outbreak [ ] leading to one of europe's strictest and longest lockdowns (state of alarm started on th march and ended on st june). spanish residents were mandated to stay at home except to purchase food/medicines, to travel to work or to attend emergencies. non-essential shops and businesses (including optical/optometric practices) were closed.
stress and anxiety increased significantly during lockdown [ ] , promoted by concerns about how to avoid infection and the continuously increasing number of deaths. this was exacerbated by significant amounts of misinformation and speculation [ ] including inaccurate information around the increased risk of covid- associated with contact lens wear [ ] . the covid- pandemic has highlighted the importance of optimal hygiene and lens care to avoid contact lens related complications [ ] . the aim of the current study was to evaluate by means of a survey the behaviours associated with contact lens wear (compliance with hand hygiene and adherence to contact lens wear and care recommendations) as well as to elucidate the best ways to support wearers during the covid- pandemic in spain. to date, limited information exists about contact lens compliance in spain before [ ] and during the covid- pandemic [ ] . similar work related to contact lens compliance during the covid- pandemic has been conducted in the uk [ ] . given the prescribing differences between markets, with a larger number of reusable and rigid/orthokeratology wearers in spain compared with the uk, [ ] and different government approaches to control the virus impact, it is important to identify country-specific wearer behaviours that need addressing during these challenging clinical times. the study was conducted in agreement with the tenets of the declaration of helsinki. the faculty of science and engineering research ethics panel at anglia ruskin university (aru) reviewed and approved the study protocol. qualtrics software (qualtrics, provo, ut) was used to collect data via an online questionnaire. only one submission from each ip address was permitted by the survey software. non-identifiable data were collected, and all participants gave their informed consent online after reading a participant information sheet at the start of the survey. 
the inclusion criteria included adults older than who were using contact lenses and living in spain during the lockdown due to the covid- pandemic. the survey used was translated and adapted to accommodate the characteristics of the spanish lockdown from a survey distributed in the uk [ ] . the survey had questions divided into sections. the first section focused on assessing the demographic characteristics of the participants such as age, sex, general health, and where they were living during the lockdown. the second section asked about potential changes in their habits related to the use of contact lenses (cls) due to the lockdown. the third section focused on hygiene and contact lens compliance during the covid- pandemic (e.g. washing hands, lens care, lens case replacement). the questions related to lens care disinfection and lens case care were only displayed if the participant used reusable cls. the final section focused on assessing concerns associated with contact lens wear and ways to best support wearers during the covid- pandemic. respondents were also asked about compliance with recommendations regarding safe contact lens wear given by their ecp. the translation from english followed a multi-step process to ensure optimal accuracy. initial drafts were compared to the original english questionnaire to ensure the meaning of the questions was kept. the survey was then reviewed by a team of four spanish optometrists (all authors of this manuscript, ng-p, mv-e, dp and lsm) as a final step during the translation process to refine the final wording of the questions. the final version was inputted into qualtrics (qualtrics, provo, ut) and then reviewed by team members to ensure functionality. the survey was distributed using social media and was open from th april to th may when optical/optometric practices in spain were either closed or only offered urgent/emergency care.
a copy of the full questionnaire can be requested by contacting the corresponding author. the statistical package for social sciences version . (spss inc., chicago, il, usa) was used to carry out the statistical analysis. chi-square statistics for categorical variables (or fisher's exact test if the number of participants in any group was or less) were used to assess differences between lens types (daily disposable vs reusable cl) and between lens care products (multipurpose disinfecting solution vs hydrogen peroxide). a significance level of p≤ . was used for all analyses. a total of participants completed the anonymous online survey. before data analysis, responses were removed under the following circumstances: those living outside of spain (n= ), those not consenting to take part in the study (n= ), when only some initial questions regarding sex and location were completed (n= ) and those younger than years (n= ). a total of respondents were suitable for analysis. the mean age of the respondents was . ± . years old (range - years) and % were female. respondents' demographics and covid- symptomatology are shown in table . the reported frequency of contact lens modality and use of lens care products for lens disinfection is presented in table and table .
[table: contact lens modality: daily disposable; soft reusable; rigid (including ok)]
sixty-seven percent of respondents reported using their contact lenses less during the pandemic, % about the same amount of time and only % more during the pandemic. the most common reason for reduced wearing time was 'less need at home' ( %) followed by 'less effort to wear specs' ( %) and 'fear of infection/touching eyes' ( %). at the time of the survey, % of the respondents had not needed to buy any contact lenses during the lockdown period, % were purchasing them from their optometrist, % from the internet and % through eye clinics or hospital departments.
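the group comparison described in the statistical analysis (a chi-square test for categorical variables, with fisher's exact test when a group is small) can be sketched for a 2×2 table as follows. this is a pure-python illustration, not the spss procedure used in the study: the function names are hypothetical, the `small_cell` default of 5 is an assumed conventional cut-off (the value in the text is not recoverable), and the counts in the usage lines are invented for the example.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square statistic (no continuity correction) and p-value, 1 dof."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must contain at least one count")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with one degree of freedom.
    return stat, erfc(sqrt(stat / 2.0))


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value: sum of tables no more likely than the observed one."""
    n, row1, col1 = a + b + c + d, a + b, a + c

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def compare_groups(a: int, b: int, c: int, d: int, small_cell: int = 5):
    """Pick the test from the smallest cell count; small_cell=5 is an assumption."""
    if min(a, b, c, d) <= small_cell:
        return "fisher", fisher_exact_2x2(a, b, c, d)
    return "chi-square", chi_square_2x2(a, b, c, d)[1]


if __name__ == "__main__":
    # Invented counts: rows = lens care system, columns = monthly case replacement yes/no.
    print(compare_groups(20, 10, 10, 20))
    print(compare_groups(3, 1, 1, 3))
```

a statistics package would normally be used in practice; the hand-written versions are shown only so that the two tests and the switching rule are visible in one place.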
there were no statistically significant differences in wearing time between daily disposable wearers and reusable contact lens wearers (p= . ). seventy-five percent of respondents reported a change in their handwashing routine during the pandemic. self-reports of hand washing prior to inserting and after removing contact lens with either 'most times' or 'every time' were similar (both %). in addition, % also responded using soap and water during handwashing. in contrast, only % reported following the -second rule 'every time' (with a further % doing it 'most times') and % reported washing their hands 'every time' after coughing, sneezing or blowing their nose (with a further % doing it 'most times'). when asked about how they dried their hands, the responses included: cloth towel shared with other family members ( %), cloth towel only used by themselves ( %), paper towel ( %). there were no statistically significant differences between daily disposable and reusable lens wearers for handwashing before inserting and removing contact lenses (p= . and p= . respectively), handwashing method (i.e. using soap and water, water only, antibacterial wipes/solutions p= . ) and for drying habits (as per above options, p= . ). as shown in table , the majority of respondents used a multi-purpose lens solution for disinfection of their reusable contact lenses. first, the survey assessed aspects related to the cleaning of the lenses after use (figure a -b). the survey found that % of respondents 'never' rubbed the lenses before soaking and % either topped up the lens care solution 'most times' or 'occasionally' (rather than filling up the case with fresh solution). the survey also evaluated aspects related to the care of the lens case. figure a similar analyses were conducted to establish the cleaning routine of wearers of rigid lenses, using either a multi-purpose solution or two-step cleaning routine (cleaner and conditioner, table ). 
when asked about rubbing prior to soaking, % of respondents admitted they 'never/almost never' did this step and a further % did it only once a week (with the remaining doing it 'daily'/'almost every day'). respondents showed more compliant behaviour with regards to topping up the solution: % indicated they 'never' topped up and the remainder only did it 'occasionally'. with regards to the care of the lens case, % reported cleaning the case 'daily' with a further % cleaning it 'most days' and the remaining respondents cleaning less frequently or never. in addition, % of the respondents admitted using tap water. only one respondent used the optimal method (rinsing and drying with paper tissue). during the pandemic, the majority of respondents were planning on replacing the lens case monthly ( %) or -monthly ( %) with the remaining planning on changing it less often. a separate analysis was conducted with all reusable lens wearers using hydrogen peroxide as part of their lens disinfection in this study (n= for both soft and rigid lens wearers as shown in table ). around half ( %) of the respondents rubbed their lenses prior to soaking 'daily'/'almost daily' (remaining respondents 'never/almost never' included this step as part of their cleaning routine). in addition, % 'never' topped up the used solution, with the remaining respondents doing this 'always' ( %) or 'frequently' ( %). non-compliance was observed with aspects relating to the care of the lens case including % cleaning the lens case 'weekly' or 'monthly' or 'not at all'. a further % admitted using tap water whilst cleaning the contact lens case and % were planning on replacing the case monthly during the pandemic (with the remaining % planning on replacing it yearly). a statistically significant difference was found between the lens case replacement frequency of contact lens wearers using hydrogen peroxide and those using multi-purpose solutions (p< . ).
sixty-four percent of peroxide users replace their lens case monthly whilst only % of multipurpose solution users reported doing this. ninety-seven percent said that they were not exceeding the wearing time during the pandemic (e.g. sleeping in lenses if not previously recommended) and % also followed the recommendations regarding disposal of their contact lenses. the main reasons given by the respondents that did not follow recommendations on disposal included 'forgetting when to replace', 'to save money' and 'doesn't hurt my eyes' ( %, % and % respectively). the survey explored if wearers wore contact lenses whilst showering, with % responding negatively to this question. finally, the survey explored if users checked the health of their eyes daily before inserting lenses with % admitting skipping this health check. a further % responded that they would not consider ceasing lens wear during the pandemic if feeling unwell with cold or flu symptoms. of all respondents, % did not own a pair of up to date spectacles. the survey also asked how often respondents cleaned their spectacles with soap and water: % admitted doing this 'hardly ever' or 'never', % 'daily/almost daily' and % 'frequently'. finally, % had not sought any form of additional support for managing their contact lens wear during the pandemic, but % consulted with their optometrist/ophthalmologist/medical practitioner and % searched for information online. respondents were given space to add free texts regarding ways in which they would like to receive support regarding contact lens wear during the pandemic. of those that responded, the preferred method was online (n= ), via telehealth (phone or video call, n= ) or both (n= ). they also expressed videos and links to relevant sites through a health portal would be useful to support relevant aspects of contact lens wear, such as the cleaning of contact lenses and the impact of covid- on lens wear (n= ). 
since the start of the covid- pandemic, the contact lens community has been interested in how to best support wearers [ , , ] . the demographics of the respondents in this study are in agreement with data for the spanish market in terms of sex, age of wearers and types of lenses worn [ ] , and so are representative. in this study, % of respondents were socially distancing and/or self-isolating when responding to the survey. the current study is in accordance with other work conducted during the covid- pandemic and shows a reduction in lens wear during lockdown [ , , ] . optimal handwashing before contact lens application and removal is essential [ , ] . effective handwashing with soap and water should take a minimum of seconds [ , ] . however, less than half of respondents admitted to complying with this rule. garcia-ayuso et al. [ ] also found . % of participants were unaware or did not follow this rule. during the pandemic, spain emphasised good handwashing technique rather than duration in their public health campaigns [ ] . the greater awareness of hand hygiene created by the pandemic may have increased hand washing prior to lens handling as only % of respondents admitted to not washing their hands prior to application/removal of lenses. interestingly, only a quarter of respondents admitted washing their hands after coughing, sneezing or blowing their nose. the use of paper towels does not seem to be an established method for drying hands in this study (only % of respondents used this method). despite recommendations to dry hands with a single use paper towel [ , ] , % of participants admitted to using their own reusable towel and % admitted to sharing a towel. in view of these non-optimal behaviours, ecps are encouraged to discuss optimal handwashing and drying techniques and ensure wearers do not become complacent as inadequate handwashing is a risk factor for contact lens related microbial keratitis and corneal inflammatory events [ ] .
despite the known benefits of rubbing contact lenses prior to soaking [ , ] , reusable lens wearers using multi-purpose solution in this study often skipped this cleaning step. in addition, % admitted topping up multi-purpose lens solution frequently or always. in contrast, % of soft reusable lens wearers in the uk were non-compliant with topping up highlighting differences between countries during early periods of lockdown in the covid- pandemic [ ] . although the present study asked about rubbing, the survey did not include a specific question about rinsing. interestingly, there is no specific mention of rinsing in the recommendations for cleaning contact lenses [ ] . ecps in spain will need to reinforce the importance of rub-and-rinse during the pandemic. jones et al. [ ] indicated that in principle the presence of surfactants in multi-purpose solutions together with rubbing/rinsing steps are likely to be effective against sars-cov- but further work in this area is required. while a recent review suggested (with no cited evidence) that no rubbing/rinsing is required prior to lens disinfection when using hydrogen peroxide solutions [ ] , the cleanliness (which is linked to comfort and risk of infection) of rgp and soft lenses soaked in multi-purpose or hydrogen peroxide solution is better after rubbing and rinsing [ ] [ ] [ ] [ ] . the survey data showed that % of the respondents rubbed their lenses prior to soaking with hydrogen peroxide solution. the survey did not capture how respondents were instructed on lens care procedures when using hydrogen peroxide solutions but ecps need to be careful to only adopt evidence-based recommendations. nichols et al. [ ] suggested that peroxide systems should be the first-line recommendation for most wearers of reusable lenses and several organizations [ , ] indicate that peroxide-based solutions should be effective against the virus that causes covid- . finally, garcia-ayuso et al.
[ ] noticed that spanish wearers changed their frequency of wear to occasional use during the pandemic. therefore, when providing instructions about the correct use of solutions ecps will also need to include advice regarding storage of irregularly worn reusable lenses and how frequently solutions need to be changed. previous studies have found that lens cases receive much less cleaning attention than contact lenses [ , ] . in fact, in this study, % of users (the same percentage was found for soft multipurpose users and peroxide users) admitted cleaning the lens case either 'weekly', 'monthly' or 'never/almost never'. in contrast, wu et al. [ ] reported that an effective lens case cleaning process should include rubbing the lens case, rinsing it with contact lens disinfecting solution, wiping it with a tissue and drying it face down with air. current advice suggests regular case replacement [ , ] as infrequent lens case replacement increases the risk of suffering ocular infection [ , ] . nichols et al. [ ] noted that peroxide-based systems improve compliance with lens case replacement. in agreement with this, a statistically significant difference was found in the present work between lens care types (peroxide users were more likely to replace their cases monthly than multi-purpose users). as wearers' behaviours are clearly sub-optimal with regards to lens case cleaning and replacement, ecps and the research contact lens community are encouraged to find strategies to further educate wearers on these aspects during the pandemic. finally, the survey explored how well respondents were adhering to contact lens recommendations for safe contact lens wear. overall, wearers adhered to recommendations regarding wearing time and disposal of contact lenses, however, % did not check their eyes daily before inserting lenses and % would not consider stopping lens wear if feeling unwell with flu/cold symptoms.
twenty-four percent admitted wearing lenses whilst showering. exposure to water during handling (rinsing contact lenses or lens cases with tap water) and/or wearing contact lenses whilst showering has been shown to be a risk factor for infection [ ] . unfortunately, respondents of this study showed both behaviours (showering as well as use of tap water during lens case cleaning). similarly, vianya-estopa et al. [ ] reported % of uk wearers showered with contact lenses during the pandemic. arshad et al. [ ] have demonstrated an improvement in water-contact behaviours with the use of a no-water infographic and similar work should be attempted in countries like spain (where swimming is a popular activity, especially during the summer months). the present work highlights modifiable behaviours that need to be improved in contact lens compliance during the covid- pandemic. respondents indicated their preferred ways to be supported during lockdown, which include telehealth (either phone and/or video consultations) and access to educational tools online. since the end of the lockdown in june, spain has already experienced a resurgence of new infections and the affected regions have experienced further restrictions in an attempt to control the spread of the infection [ ] . as access to clinical care might continue to be limited over the coming months, ecps are encouraged to review patient education on safe contact lens wear and care to minimise the chance of contact lens related complications. daily disposables offer an advantage over reusable lenses in terms of noncompliance with cleaning and care procedures (relying on optimal handwashing only) [ ] [ ] . nagra et al. [ ] highlight that aftercare appointments had traditionally offered an ideal opportunity to assess contact lens compliance, but in the current times ecps are encouraged to also use alternative ways (e.g. videos or patient information sheets, raising awareness of lens care phone apps).
the latter will be necessary particularly if optometric practices need to prioritise face-to-face appointments for urgent and emergency contact lens related
first cases of coronavirus disease (covid- ) in the who european region
estimation of covid- prevalence in italy
idoiaga-mondragon, stress, anxiety, and depression levels in the initial stage of the covid- outbreak in a population sample in the northern spain
characteristics of youtube videos in spanish on how to prevent covid-
the covid- pandemic: important considerations for contact lens practitioners
an international analysis of contact lens compliance
soft contact lens wearers' compliance during the covid- pandemic
contact lens hygiene compliance and lens case contamination: a review
contact lens wear during the covid- pandemic
clean care is safer care: clean hands protect against infection
programa higiene de manos -diez preguntas clave sobre la higiene de manos
hand hygiene is linked to microbial keratitis and corneal inflammatory events
protocolos de higienización y seguridad en los establecimientos sanitarios de óptica y otros lugares de ejercicio professional version
the case for using hydrogen peroxide contact lens care solutions: a review
to rub or not to rub? -effective rigid contact lens cleaning
soft contact lens cleaning: rub or no-rub?
importance of rub and rinse in use of multipurpose contact lens solution
a comparison of regimen methods for the removal and inactivation of bacteria, fungi and acanthamoeba from two types of silicone hydrogel lenses
microbial contamination of contact lenses and lens care accessories of soft contact lens wearers (university students) in hong kong
microbial flora of tears of orthokeratology patients, and microbial contamination of contact lenses and contact lens accessories
risk factors for moderate and severe microbial keratitis in daily wear contact lens users
water exposure and the risk of contact lens-related disease
compliance behaviour change in contact lens wearers: a randomised controlled trial
number of confirmed coronavirus (covid- ) cases in spain between
daily disposable lenses: the better alternative
silicone hydrogel daily disposable benefits: the evidence
could telehealth help eye care practitioners adapt contact lens services during the covid- pandemic?
key: cord- -mlp zgk authors: johanns, paul; grandgeorge, paul; baek, changyeob; sano, tomohiko g.; maddocks, john h.; reis, pedro m. title: the shapes of physical trefoil knots date: - - journal: nan doi: nan sha: doc_id: cord_uid: mlp zgk we perform a compare-and-contrast investigation between the equilibrium shapes of physical and ideal trefoil knots, both in closed and open configurations. ideal knots are purely geometric abstractions for the tightest configuration tied in a perfectly flexible, self-avoiding tube with an inextensible centerline and undeformable cross-sections. here, we construct physical realizations of tight trefoil knots tied in an elastomeric rod, and use x-ray tomography and d finite element simulation for detailed characterization. specifically, we evaluate the role of elasticity in dictating the physical knot's overall shape, self-contact regions, curvature profile, and cross-section deformation.
we compare the shape of our elastic knots to prior computations of the corresponding ideal configurations. our results on tight physical knots exhibit many similarities to their purely geometric counterparts, but also some striking dissimilarities that we examine in detail. these observations raise the hypothesis that regions of localized elastic deformation, not captured by the geometric models, could act as precursors for the weak spots that compromise the strength of knotted filaments. the open trefoil knot, commonly known as the overhand knot, is the most elemental open knot, forming the basis of many, more complex, and more functional knots. the trefoil knot can be regarded as a building block in bend knots (e.g., fisherman's/english knot) [ ] , in binding knots (e.g., square or reef, and granny knots) [ ] [ ] [ ] , and in noose knots (e.g., lasso noose, honda knot, and lariat loop) [ ] . the classic overhand knot is also key in suturing procedures (e.g., surgeon's knot) [ ] [ ] [ ] [ ] [ ] [ ] . overhand knots can form spontaneously in various natural contexts, across a wide range of length scales, from polymers and dna strands [ ] [ ] [ ] to the umbilical cord of human fetuses [ ] , and even in vortex loops in plasma and fluid flows [ ] [ ] [ ] . the classic mathematical theory of knots is largely concerned with all possible topologies of knots tied in a closed loop. more recently, a smaller mathematical literature of the geometry of so-called ideal knot shapes has been developed [ ] . for context, in appendix a , we provide a brief review of recent advances on the theory of ideal knots that may be of interest to the mechanics community. in this purely geometric context, a knot is modelled as being tied in a closed loop of idealized rope approximated as a filament with an undeformable circular cross-section, inextensible centerline, and vanishing bending stiffness. 
* pedro.reis@epfl.ch
the ideal, or tightest, shape is then the centerline configuration of a closed tube with the given knot type and diameter d that has the shortest possible length l . for example, an unknot is any configuration of a closed loop that can be smoothly deformed to a circle without passing through itself. unsurprisingly, the ideal shape of the unknot is a circle of circumference l = πd . interestingly, the unknot is the only knot for which the ideal shape is known explicitly; all other ideal knot shapes have only been approximated numerically. the trefoil knot is the simplest nontrivial knot type, and numerical approximations are available, computed with a variety of algorithms, with the most accurate shape obtained to date by przybył et al. [ ] , with l /d = . . . . further geometric characteristics of the ideal closed trefoil are given in appendix a . ideal shapes of open knots can also be defined, for which both the diameter and a (long) arc length of the filament are prescribed. the ideal shape for a given knot type arises for the configuration with the maximal distance (in space, not arc length) between the two ends of the filament. this is a mathematically well-posed notion that was first simulated by pierański et al. [ , ] . these authors also sought to relate the equilibrium shape of a knotted filament to the decrease in its mechanical strength, as induced by the knot itself. they reported peaks in the curvature profile along the knot at both the entrance and exit points of the knot. consequently, it was hypothesized that the weakening of knotted filaments, commonly confirmed by practical experience, was rooted in these geometric features. this observation has also been corroborated at the atomic scale by saitta et al. [ ] , who performed molecular dynamic simulations on knotted polymer strands and pointed to a strain-energy localization at the entrance and exit to the open trefoil knot. however, more recent studies by uehara et al.
[ ] and przybył et al. [ ] suggest that the ideal rope model may not be appropriate to describe the mechanical properties of tight physical knots. whereas recent studies have addressed the mechanics of loose overhand knots [ ] [ ] [ ] , the mechanics of the corresponding tight configurations remains largely unexplored. here, we perform a compare-and-contrast investigation between the equilibrium shapes of physical realizations of tight elastic trefoil knots and those of ideal knots based on existing purely geometric models [ , ] , both in open and closed configurations. we realize physical knots tied onto elastomeric rods (which are straight in their unstressed configuration) in experiments complemented with fully d elastic simulation using the finite element method (fem); representative examples are provided in the experimental photographs and fem snapshots of fig. . data from x-ray micro-computed tomography (µct) are used for a thorough quantitative validation of our fem computations. firstly, we focus on the closed trefoil knot, given its advantage of having a closed centerline with periodic boundary conditions; in particular, no external forces are required to attain equilibria. in its tight equilibrium configuration, a d mapping of the contact surface in the physical knot revealed that the double-contact lines first computed by carlen et al. [ ] within the purely geometric model form an accurate outer skeleton for the contact surface patch observed in the elastic case. secondly, we study tight configurations of the open trefoil knot, where different levels of tightness can be systematically investigated by the application of a range of external forces, thereby elucidating the effects of elasticity. our measured curvature profiles for knotted elastic filaments, both in the closed and open trefoils, are qualitatively different from those predicted by the ideal geometric models.
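a curvature profile of the kind compared here can be estimated from discrete centerline coordinates. the sketch below is one possible approach under stated assumptions (the menger curvature of consecutive point triplets along a polyline); it is not the procedure used by the authors, all function names are hypothetical, and the test curve is a plain circle rather than a knot. a circle is a convenient check because its curvature is known exactly and because a circle of centerline radius d/2 has the circumference πd quoted above for the ideal unknot.

```python
import math


def menger_curvature(p, q, r) -> float:
    """Curvature of the circle through three 3d points (0.0 if they are collinear)."""
    a, b, c = math.dist(p, q), math.dist(q, r), math.dist(p, r)
    if a * b * c == 0.0:
        return 0.0
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    twice_area = math.sqrt(sum(t * t for t in cross))
    # Menger curvature = 4 * area / (a * b * c).
    return 2.0 * twice_area / (a * b * c)


def polyline_length(points, closed: bool = True) -> float:
    """Total length of the polyline, including the closing segment if closed."""
    n = len(points)
    last = n if closed else n - 1
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(last))


def curvature_profile(points, closed: bool = True):
    """Return (arc length, curvature) pairs at each vertex with two neighbours."""
    n = len(points)
    arc = [0.0]
    for i in range(1, n):
        arc.append(arc[-1] + math.dist(points[i - 1], points[i]))
    indices = range(n) if closed else range(1, n - 1)
    return [(arc[i], menger_curvature(points[i - 1], points[i], points[(i + 1) % n]))
            for i in indices]


def circle(radius: float, n: int):
    """Sample n points on a planar circle, used only as a test curve."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n), 0.0) for k in range(n)]


if __name__ == "__main__":
    d = 1.0                               # tube diameter (arbitrary units)
    pts = circle(d / 2.0, 2000)           # centerline of the tightest unknot
    print(polyline_length(pts) / d)       # should be close to pi
    print(curvature_profile(pts)[0][1])   # should be close to 2 / d
```

for measured centerlines, noise in the coordinates would be amplified by this three-point estimate, so some smoothing before differentiation would likely be needed; that step is left out of the sketch.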
specifically, physical open knots exhibit curvature peaks inside the knot, instead of at their entrance/exit, contrary to previous predictions for the tightest ideal knot [ ] . the excellent fem-experimental agreement confirms the observed curvature profiles and enables us to extract and map the contact pressure distribution, thereby revealing significant rod constrictions at the entrance and exit of the tight open knot. finally, we characterize these regions of localized elastic deformation, which we speculate could act as precursors for the weak spots that compromise the strength of knotted filaments. we have devised an experimental framework and performed fem simulations to realize tight physical knots tied on homogeneous, intrinsically straight, elastic rods. in this section, we describe the methodology that we followed on both fronts. a. experimental protocols . . . fabrication of customized elastic rods: we fabricated composite elastomeric rods with the goal of making them compatible with µct imaging and d image analysis to extract their centerline coordinates and self-contacting regions. we used the same fabrication protocol introduced recently to study the contact mechanics between two elastic rods [ ] , but with the additional feature described below. the method described in [ ] enabled the fabrication of composite elastomeric rods made out of vinyl polysiloxane, vps (elite double , zhermack, young's modulus e = . mpa, density ρ = kg/m ), decorated with an elastomeric concentric physical fiber (diameter µm) and a µm-thick elastomeric coating. the concentric physical fiber and the coated layer were made of a different, lighter, elastomer (solaris, smooth on, e solaris = kpa, ρ solaris = kg/m ). the % lower density of solaris with respect to vps allows for the segmentation of the features of interest (centerline and the self-contact regions) during post-processing stages of the µct tomographic images, as detailed in ref. [ ] . 
in the present work, we introduce an additional feature to our composite rods by embedding a second, eccentric, physical fiber made of solaris and diameter µm, parallel to the concentric fiber, at a distance mm. this inset fiber allowed us to match the twist of the glued extremities when fabricating the closed trefoil knot. finally, the elastomeric rods of total diameter d = . mm were then cut to different values of their total length of l , depending on the system of interest. we tied open trefoil knots on the fabricated rods. any build-up of excess twist at the free ends was avoided by carefully aligning the eccentric fiber at the extremities during the manual tying process. the knot was progressively tightened by increasing the end-to-end distance while the sample was immersed in a container of soapy water (palmolive original) to ensure vanishing friction conditions. the limited size of the sample holders of the µct apparatus required rods of different undeformed lengths: l = mm or mm, respectively, for the looser or tighter open knot configurations detailed next. pierański et al. [ ] computed the normalized knot length, Λ oc (the engaged knot length divided by the tube diameter), corresponding to the normalized difference between the arc lengths of the centerline associated to the first (entrance) and last (exit) contact points, s and s , respectively. both the µct scanning and the fem provide access to Λ oc . we chose two elastic configurations, one looser . . . tying of the closed trefoil knot: to compare the elastic closed trefoil knot and its ideal equivalent (results in sec. iii), we trimmed the elastic rod according to the length-to-diameter ratio computed by e.g. carlen et al. [ ] . for these experiments, our undeformed elastic rod of diameter d = . mm had a length of l = . d ≈ . mm dictated from the geometric model. 
the physical closed trefoil knot was tied by first producing an open trefoil knot and then joining the two rod extremities using a silicone adhesive (sil-poxy, smooth-on). during this closure procedure the eccentric fibers at each end were aligned at the joint location, which appeared to closely correspond to minimizing any additional, imposed, excess twist. the closed knot was placed in an ultrasonic bath (vwr, usc th) with a water-soap mixture (palmolive original, ≈ % in volume) for five minutes (at frequency khz and temperature ° c). the combination of the ultrasonic vibrations and lubrication by the soap minimized frictional effects in the regions of self-contact (ensuring the absence of tangential surface stresses there), in agreement with the assumption of frictionless self-contact of idealized rods. fig. a shows an optical photograph of the final physical closed trefoil knot.

. . . post-processing of µct images: we quantified the d geometry of the knotted rods using µct imaging (µct , scanco medical), with spatial resolutions (voxel size) of . µm or . µm for the open or closed knot configurations, respectively. an adaptation of the algorithm developed by grandgeorge and baek et al. [ ] was used for subsequent post-processing of the tomographic images. in this process, the segmentation of the images leveraged the density difference between the various rod features. the embedded concentric physical fiber allowed us to extract a discrete set of the locations of the centerline coordinates, r(s i ). the integer i corresponds to the index of the centerline locations with ≤ i ≤ n (where n is the total number of centerline points). the application of a gaussian-weighted moving average filter to r(s i ) was necessary (see sec. b) to compute the discretized curvature of the rod centerline, as described next. we first constructed the discrete set of tangent vectors e at s = s i such that $e(s_i) = r(s_i + \delta s) - r(s_i)$, with the increment $\delta s \equiv |r(s_{i+1}) - r(s_i)|$.
the discretization increment was fixed to δs = µm and δs = µm for the open and closed knot configurations, respectively. we then computed the curvature profiles of the discrete framed curves according to bergou et al. [ ] , with the length of the voronoi region $d(s_i) = (|e(s_{i-1})| + |e(s_i)|)/2$. finally, the regions in the µct images corresponding to the thin uniform outer coating layer of solaris were segmented to reveal the regions of self-contact. the individual contact points on the rod surface are shown in fig. a and a .

we used the finite element method (fem, abaqus standard . - , simulia, dassault systems ) to simulate the tying of the same knots realized in the experiments. these experimentally validated simulations yield information that cannot be accessed directly through experiment; e.g., the pressure field in the regions of self-contact. conversely, the close agreement between the two (see results in figs. , , and ) serves as a verification that the experimental configurations are, indeed, the equilibrium ones, with no additional significant experimental artifact. the fem computations were performed using the procedure reported recently by baek et al. [ ] , involving a dynamic-implicit analysis to capture the geometrically nonlinear deformation of the closed trefoil knot. a rod of diameter d and length l (paralleling the rod dimensions in the experiments) was meshed using d solid elements with reduced integration (c d rh). the number of elements per cross-sectional area was and for the open and closed knots, respectively. the mesh size along the axial direction of the rod was chosen such that the aspect ratio of the elements was close to unity. we modeled the elastomer as an incompressible neo-hookean material of young's modulus e = . mpa. self-contact of the rod was enforced using a penalty normal force model combined with zero tangential force (frictionless contact).
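as an illustration of the curvature post-processing described earlier in this section, the following is a minimal sketch and not the authors' code. it assumes one common discrete-elastic-rods form of the vertex curvature, namely the norm of the discrete curvature binormal divided by the voronoi length; the exact expression used in the study is not reproduced in the text, and the function name `discrete_curvature` is hypothetical.

```python
import numpy as np

def discrete_curvature(points, closed=True):
    """Pointwise curvature at the vertices of a polyline given as an (N, 3) array.

    Assumed form (discrete-elastic-rods style, an assumption of this sketch):
      kappa_i = |2 e_{i-1} x e_i| / (|e_{i-1}| |e_i| + e_{i-1} . e_i) / D_i
      D_i     = (|e_{i-1}| + |e_i|) / 2          (Voronoi length)
    """
    r = np.asarray(points, dtype=float)
    if closed:
        e_next = np.roll(r, -1, axis=0) - r      # e_i     = r_{i+1} - r_i
        e_prev = r - np.roll(r, 1, axis=0)       # e_{i-1} = r_i - r_{i-1}
    else:
        e = np.diff(r, axis=0)
        e_prev, e_next = e[:-1], e[1:]           # interior vertices only
    len_prev = np.linalg.norm(e_prev, axis=1)
    len_next = np.linalg.norm(e_next, axis=1)
    cross = np.cross(e_prev, e_next)
    dot = np.einsum("ij,ij->i", e_prev, e_next)
    # Norm of the discrete curvature binormal, equal to 2 tan(phi / 2)
    # where phi is the turning angle between consecutive edges.
    kb = 2.0 * np.linalg.norm(cross, axis=1) / (len_prev * len_next + dot)
    voronoi = 0.5 * (len_prev + len_next)
    return kb / voronoi

# Usage: a sampled circle of radius 2 should give a curvature close to 1/2.
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.column_stack([2.0 * np.cos(theta), 2.0 * np.sin(theta), np.zeros_like(theta)])
kappa = discrete_curvature(circle)
```

multiplying the result by the rod diameter gives a normalized curvature. for noisy tomographic data, the centerline would first need to be smoothed, as noted above.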
starting from the initially straight configuration of a rod, we obtained a final knotted geometry by applying a sequence of displacement steps at each extremity of the rod. firstly, we established a configuration of the open trefoil knot based on the knot-tying procedure described in baek et al. [ ] (see fig. b ). then, we gradually brought the extremities of the rods in contact to establish the closed configuration (see figs. b -b ), with the final equilibrium configuration of the closed trefoil knot presented in fig. b . the two extremities were constrained using the abaqus command *coupling, which enables the extremities to be displaced while allowing their crosssection to deform. throughout, we ensured that the simulation was quasi-static. having described our experimental and numerical toolbox, we proceed by quantifying the similarities and dissimilarities between physical and ideal closed trefoil knots, with the analogous discussion of the open case appearing in the next section. a closed knot offers the advantage of having a closed centerline curve with matching periodic boundary conditions; its configuration is not subject to external factors such as applied external forces. after a closed trefoil is tied on the physical elastic rod (with undeformed rod length l = . mm), the observed stretch of its centerline is . in experiment and . in fem-simulation. to compare our results with those of ideal knot theory, we take this small axial strain into account by using the normalized and rescaled arc length s = s/l × l /d , with the stretched rod length, l, and the ideal normalized rod length l /d = . , while assuming that the axial strain is constant along s. the overall length-to-diameter ratio of the stretched rod, l/d, was measured to be . and . in experiments and fem-simulations, respectively. to perform a comparison between the centerline coordinates r(s) = (x(s), y (s), z(s)) of the elastic and the ideal closed trefoil knots, we introduce (following ref. 
[ ] ) cylindrical coordinates in the cartesian basis {e x , e y , e z }, as shown in fig. a ,b. the knot lies flat on the e x -e y -plane, and the origin is chosen by the condition that the center of mass, or barycenter, of the centerline curve $g = \sum_{i=1}^{n} r(s_i)\,\delta s_i / l$ lies on the e z -axis. (here, n is the number of discretization points, l is the stretched rod arc length, and δs i is the length of the i th segment between two successive discretized centerline points.) in fig. c , we compare the radial distance between the centerline and the barycenter axis, quantified as $\rho(s) \equiv \sqrt{x^2(s) + y^2(s)}$, for the experimental, fem and ideal knot cases (with three individually scaled arc lengths on ordinate, but all plots with the same common length scale on abscissa). the experimental and fem data are in excellent agreement. compared to the ideal knot, the experimental and fem closed knots exhibit a radial inflation, presumably due to elasticity effects, as evidenced by the horizontal offset of the ρ data. for example, ρ/d differs by . and . in the inner segments (minima; shaded) and the outer segments (maxima of ρ/d curves), respectively. moreover, the effect of the cross-sectional deformation is reflected in the amplitude of ρ for the elastic knot which is . d (d for the ideal case). to complete the comparison of the cylindrical coordinates, in fig. d , we present the rescaled vertical centerline coordinate, z(s)/d , for the three cases, which, interestingly, shows an excellent match between the ideal and the elastic closed knots, unlike the ρ data presented in fig. c .

based on the µct and fem data, we construct a two-dimensional contact map: the projection of the contact surface onto the arc length s vs. arc length σ plane. to assemble this contact map, each point in the contact surface is assigned to the two closest centerline positions of the knotted rod, at arc lengths s and σ.
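the assignment of contact points to pairs of arc lengths can be sketched as follows. this is only an illustration under stated assumptions, not the procedure used in the study: the function name `contact_map_pairs`, the parameter `min_separation`, and the rule that the second position must be at least `min_separation` away in arc length (so that it falls on the other strand) are all hypothetical choices.

```python
import numpy as np

def contact_map_pairs(contact_points, centerline, arc_length,
                      min_separation, total_length=None):
    """Assign each contact point to two centerline arc lengths (s, sigma).

    Assumed rule: s is the arc length of the nearest centerline sample;
    sigma is the nearest sample whose arc-length distance from s is at
    least `min_separation`, so that the pair spans two different strands.
    Pass `total_length` for a closed curve to measure arc length periodically.
    """
    pts = np.asarray(contact_points, dtype=float)
    line = np.asarray(centerline, dtype=float)
    s = np.asarray(arc_length, dtype=float)
    pairs = np.empty((len(pts), 2))
    for k, p in enumerate(pts):
        dist = np.linalg.norm(line - p, axis=1)
        i = int(np.argmin(dist))                  # closest centerline sample
        gap = np.abs(s - s[i])
        if total_length is not None:              # closed curve: periodic arc length
            gap = np.minimum(gap, total_length - gap)
        # Exclude samples on the same strand before searching again.
        masked = np.where(gap >= min_separation, dist, np.inf)
        j = int(np.argmin(masked))                # closest sample on the other strand
        pairs[k] = (s[i], s[j])
    return pairs

# Usage: two parallel strands of one curve, with a contact point between them.
x = np.arange(0.0, 11.0)
strand_a = np.column_stack([x, np.zeros(11), np.zeros(11)])
strand_b = np.column_stack([x[::-1], np.ones(11), np.zeros(11)])
line = np.vstack([strand_a, strand_b])
s = np.concatenate([[0.0], np.cumsum(np.linalg.norm(np.diff(line, axis=0), axis=1))])
pairs = contact_map_pairs([[3.0, 0.4, 0.0]], line, s, min_separation=2.0)
```

plotting the resulting (s, σ) pairs, together with the mirrored pairs (σ, s), gives a scatter version of the contact map. the value of `min_separation` has to be chosen larger than the width of a contact patch but smaller than the arc length between strands in contact.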
in fig. f, we plot the contact map for the ideal case [ ] (black solid lines), together with the corresponding data extracted from µct and fem. note that by construction the arc length contact map is point-symmetric with respect to s = σ = [ ] . owing to this symmetry, we only present one half of the µct and fem contact data, respectively in the lower-right and upper-left quadrants of the s − σ plot in fig. f. we observe that, whereas, for the ideal knot, there are precisely two contact points σ σ for each s value, the physical knots exhibit an extended contact region with a range of σ values for each s value. moreover, we find that the contact set for the physical knot is a surface that lies fully inside the double contact lines (black lines) of the ideal closed trefoil knot; the geometric model acts as an outer skeleton for the elastic case. this filled (areal) contact region for the physical case replaces the double-line contact in the ideal knot (see sec. i and a ) due to cross-sectional deformation. the mismatch between the ideal and the elastic cases is particularly evident in the inner segments; there, the corners of the geometric contact set are not filled in the elastic case.

in fig. e , we plot the curvature profiles of the elastic and the ideal trefoil knots. the curvature data are also presented in fig. a (see color-map), along the centerline of the experimental case. the elastic knot exhibits plateaus in the three outer segments with average normalized curvatures of κ = kd ≈ . (whereas κ ≈ . for the ideal knot). despite these close values in the outer segments, the behavior in the inner segments is strikingly different between the elastic and ideal cases; the ideal model predicts curvature maxima up to κ ≈ . (bounded by the active curvature limit [ ] ), whereas the elastic knot exhibits clear curvature minima.
we hypothesize that this difference in curvatures between the two cases is rooted in the cross-section deformations allowed in d elasticity, which we address further in the next section, in the context of the open trefoil knot.

the open trefoil knot allows us to directly control the level of tightness by applying forces to the rod extremities, to study the role of elasticity more systematically. this feature is not possible in the closed case since the extremities are, naturally, 'glued' together. we will employ the experimental and numerical toolbox that we developed for the closed trefoil knot to explore the similarities and dissimilarities between the elastic open trefoil knot and the corresponding ideal case [ , ] . in fig. a , we present the d reconstruction of an experimental open trefoil knot (normalized knot length of Λ oc = . ), with the measured normalized curvature profile, κ(s) = kd superposed onto the centerline. this curvature profile is qualitatively similar to what we observed in sec. iii for the physical, closed trefoils, with minima at the inner segments (region ( ) in fig. a ). in fig. b , we plot the experimental and fem-computed κ(s) profiles for the two elastic knots that we investigated, with normalized knot lengths of Λ oc = . and . . by way of example, we describe the physical knot with Λ oc = . , referring to the features labeled in fig. a and b while traveling along arc length (increasing s). soon after the knot entrance ( ), the vanishing curvature of the almost straight elastic rod rises to a local maximum, in the central region of the inner segment ( ) . the transition of the rod from the inner to the outer segment has a curvature drop, followed by an abrupt rise. the normalized curvature then reaches its maximum value in the outer segment ( ) . in this high-curvature region, we find that κ > over a wide range of s due to cross-sectional deformation of the elastic rod.
eventually, there is a local curvature minimum at the central part of the loop ( ). the curvature profile of the ideal open trefoil knot in its tightest configuration (Λ oc = . ) obtained by pieranski et al. [ ] is also shown in fig. b , superposed onto the elastic profiles for comparison. there are important qualitative differences between the ideal and elastic results. for example the prominent curvature peaks occur at different locations and with different shapes between the two cases, a difference that can be attributed to elastic deformation of the cross-sections and the centerline. in fig. c , we map the contact region for elastic knots with Λ oc = . and . (blue and green regions, re- from the simulations, we extracted data for the contact pressure (normal traction) at the regions of selfcontact. in fig. a , we present a snapshot of the elastic knot with Λ oc = . , including the contact regions onto which we superpose the contact pressure (normalized by the young's modulus e). the contact pressure map is shown in fig. b , using a similar representation (in the s − σ space) used in fig. c for the contact map. the highest contact pressure is found along the entire central region of the contact set, with maximum characteristic normalized values of p/e ≈ . . note that the knot entrance/exit (shaded regions in fig. b) correspond to regions of localized pressure, aligned perpendicularly to the rod centerline. to further quantify the localization of deformation along the knot, in fig. c , we present measurements of the circumferential contact set width profile l c (s), normalized by the total perimeter of a rod cross-section at arc length s. we observe sharp peaks of l c at the inner segments (− . s − . and . s . ), where up to % of the circumference of the cross-section is in self-contact. the regions of pronounced contact pressure (fig. b) in combination with the sharp circumferential contact width peaks (fig. 
c ) lead to localization of high contact pressure in a narrow region with a small range of arc lengths. consequently, as shown in fig. d , where we quantify the profile of deformed cross-sectional area as a function along the centerline of the rod, the cross-section of the inner rod segment is elastically constricted by up to ∼ % compared to its rest cross-section area; such localized constrictions in knots are typically referred to as nip regions [ ] . we have systematically quantified the shapes of physical trefoil knots, in both closed and open configurations. excellent agreement was found in all considered quantities between fem and experiment. for the latter, we made extensive use of x-ray micro-computed tomography, gaining access to volumetric information, including centerline curvature and cross-sectional deformation profiles. in parallel, the experimentally validated fem enabled us to quantify the contact pressure field, which is not available in experiment. direct comparisons were also established between the experimental and fem data for elastic trefoil knots and prior numerical computations of their (purely geometric) ideal shape counterparts. for both open and closed physical trefoil knots the contact sets observed in both experiment and fem were smooth surfaces, with a positive contact set width l c , i.e. finite strips. for the closed trefoil, the physical contact surface is actually a closed strip, which, as an additional topological observation, we remark is a one and a half turn möbius band (the more common möbius band has . twist) and so is non-orientable (it has only one face) and only one edge. moreover for such . -turn möbius bands the single edge itself forms a trefoil knot (see supplementary movie ). 
this is perhaps at first sight surprising, but the topology of the contact strip is inherited from the topology of the contact line of the ideal closed trefoil configuration, where it is already understood that the contact set in d is a closed curve that is itself a trefoil knot [ ] . just as for the d arc length contact sets (cf. fig. ) , where the d contact region of the elastic configuration fills the outer skeleton provided by the double-contact line of the ideal geometric model when elastic deformation of the cross-section is allowed, the d ideal contact set curve acts as skeleton, which is fattened, or bridged, to arrive at a d physical contact surface strip, whose topology is inherited from the ideal case. in the comparison between the elastic and ideal cases of trefoil knots, we found that their curvature profiles were not just quantitatively different, but also qualitatively different. in both open and closed cases, elasticity regularizes the curvature peaks within the inner segment that are predicted by the purely geometric model. to gain insight into the discrepancies between the elastic and the ideal systems, we focused on the open configuration, allowing us to systematically vary the knot tightness. the curvature peaks of the elastic system occur in the outer segment, for both looser and tight knots, contrary to the geometric counterpart, where they appear at the knot's entrance/exit regions. the contact pressure distribution extracted from fem exhibited localized regions at the entrance of the knot (inner segments). this pressure localization leads to a prominent cross-sectional deformation in the inner segments, acting as local constrictions in these nip regions. as reported by c.w. ashley in his comprehensive reference manual on knots [ ] , "a rope is weakest just outside of the entrance of the knot"; a finding that is commonly confirmed by practical experience in knotted filaments. 
the significant reduction in the cross-sectional area reported in fig d at the entrance/exit of tight elastic open knots could act as a precursor for weak spots on knotted filaments. our interpretation is different from that of pierański et al. [ ] , who attributed the onset of failure to regions of high centerline curvature, computed using their purely geometry model, which our results demonstrate to be in strong disagreement with the curvature profile of physical knots. our investigation highlights that a mechanics-based approach, going beyond pure geometry, will be necessary to rationalize knot failure. given the high level of tightening in functional knots tied onto elasto-plastic material filaments, these constriction regions are prone to local plastic deformation [ ] . the effect of plasticity on the equilibrium shape of physical knots remains an open question, which we hope to untangle in future studies. declaration of competing interest. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. in the introduction section of the main text, we defined tightest or ideal knot shapes as the centerline configuration with the shortest possible length amongst all those tied in a closed loop of an idealized rope, which is taken to mean a filament with an undeformable circular cross-section of prescribed diameter, and inextensible centerline (and vanishing bending stiffness so the problem has no mechanics, only geometry). in this appendix, for completeness, we provide a brief overview of existing literature on ideal knots (primarily from the mathematics community, but which we hope may be of some interest to the mechanics community). 
specifically, we focus on a more technical description of the necessary conditions that must be satisfied by ideal shape centerline curves, and the double-contact feature that is manifested in the ideal trefoil knot and other geometries (not-knots). ideal shapes are known to exist for all standard knot types [ ] with centerlines that are c , curves, meaning that the centerline has a continuously varying unit tangent at every point, and a curvature that is defined almost everywhere, but not everywhere. in particular the curvature can be discontinuous. for example, a straight line segment joined to an arc of a circle with matching tangents, but a discontinuous curvature, can form part of an ideal shape, and numerics strongly suggest that straight line segments and discontinuities in curvature do arise in ideal shapes, for example on composite knots [ ] . as a side note, we point out that knowing the fine detail of the precise smoothness or regularity of ideal knot shapes is important in designing good numerical algorithms to approximate them. in addition to the circular centerline of the ideal shape of the unknot, the only other known explicit ideal shapes are comparatively simple, piece-wise planar, ideal shapes for certain links (i.e., knots with multi-component centerlines) [ , ] . the first conditions that must be satisfied by ideal knot shapes were derived in ref. [ ] , in terms of the global radius of curvature. technically, these results depended on the slightly too strong assumption of a c centerline. extensions to the weaker and sharp hypothesis of a c , centerline were obtained in ref. [ ] , where the appropriate euler-lagrange equations were also related to force balance. 
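for orientation, the global radius of curvature mentioned above is commonly defined through circumradii of point triples on the curve: the smallest circumradius over all triples gives the thickness of the curve, that is, the radius of the thickest tube around the centerline that does not overlap itself. the following brute-force sketch for a sampled closed curve is only an illustration, with hypothetical function names and a cost that grows as the cube of the number of samples; it is not one of the specialized algorithms used in the cited works.

```python
import numpy as np
from itertools import combinations

def circumradius(a, b, c):
    """Radius of the circle through three points (infinite if collinear)."""
    ab = np.linalg.norm(b - a)
    bc = np.linalg.norm(c - b)
    ca = np.linalg.norm(a - c)
    twice_area = np.linalg.norm(np.cross(b - a, c - a))
    if twice_area == 0.0:
        return np.inf
    # R = (ab * bc * ca) / (4 * area), with area = twice_area / 2.
    return ab * bc * ca / (2.0 * twice_area)

def discrete_thickness(points):
    """Smallest circumradius over all triples of samples of a curve."""
    r = np.asarray(points, dtype=float)
    return min(circumradius(r[i], r[j], r[k])
               for i, j, k in combinations(range(len(r)), 3))

# Usage: every triple on a circle has the same circumradius, so the
# thickness of a sampled circle equals its radius (up to round-off).
theta = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
circle = np.column_stack([1.5 * np.cos(theta), 1.5 * np.sin(theta), np.zeros_like(theta)])
thickness = discrete_thickness(circle)
```

in this picture, a tube of diameter d fits around the centerline only if the thickness is at least d/2, which is the constraint under which the ideal shape minimizes length.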
the necessary conditions that must be satisfied on an ideal shape include the three-way alternative that every point along an ideal centerline must be either (i) part of a straight segment, or (ii) a point where the local curvature achieves its maximal value /d (as is the case everywhere for the circular ideal shape for the unknot), or (iii) at one end of a locally minimal distance, or contact, chord of length d between two distinct points on the centerline. as mentioned in the main text, we believe that the most accurate numerical approximation to the ideal closed trefoil currently available is that provided by przybył et al. [ ] , with l /d = . . . . . this computed value of l /d is a rigorous (to machine arithmetic precision) upper bound to the actual ideal value, and very probably the upper bound is rather close to the actual, unknown ideal value. however, rather than comparing many digits of accuracy in the ideal value of l /d , the present study seeks to compare features of computed ideal closed trefoil shapes with both experiment and fem simulation, which include a combination of the elastic effects of bending and deformation of cross-sections. the features that we compare were first described for the ideal geometric trefoil by carlen et al. [ ] , and subsequently confirmed and better visualized on improved simulation data, as fully described in [ , ] , from where the images in fig. are reproduced.

fig. : (images reproduced from [ , ] .) a, a solid tube visualization, which obscures the inner structure of the one-parameter family of contact chords shown as a translucent yellow surface in panel b (at a slightly larger scale). the sharp edge of the surface is the centerline of the knot. the red curve traced out by the center points of the contact chords is the contact set where the tube of panel a touches itself. the contact curve can be seen to itself be a trefoil knot by the smooth homotopy from the contact curve (red) to knot center line (green) illustrated in panel c, where each of the nonintersecting multi-coloured closed curves lies on the yellow contact surface.

for the ideal closed trefoil, visualized as a solid tube in fig. a , each point along the tube centerline is in fact at the end of two distinct contact chords. this gives rise to the double contact lines in the (s, σ) plane shown in fig. f in the main text. in addition the maximum local curvature of /d is very close to being attained at six points, as shown in the spikes in fig. e in the main text. furthermore, the curvature is nowhere close to vanishing, so that no straight segment arises in the ideal trefoil knot centerline. the double-contact feature is present along the full arc length of the trefoil knot, which means that there is a one-parameter family of double contact chords which trace out a surface in d, as shown in fig. b , where the sharp edge of the translucent yellow surface is the center line curve of the ideal trefoil shape. the d contact set for the ideal trefoil is a closed curve lying on the surface of the tubes visualized in fig. a , but is obscured. this contact line is also traced out by the mid-points of the contact chords (red curve in fig. b ). the contact curve can be seen to itself be a trefoil knot by the homotopy illustrated in fig. c , where there is a family of non-intersecting multi-coloured closed curves that deform along the contact surface from the contact curve (red) to the knot center line (green). analogous double-contact phenomena have previously been reported for infinite double helices, depending on the pitch angle [ , ] , and in an ideal orthogonal clasp problem [ ] . maritan et al.
[ , , ] also showed that in an optimal packing problem, single helices frequently arise with both double contact chords and maximal curvature, and that the associated critical aspect ratio of this special helix arises for the c α carbons in α helical segments of protein crystal structures. thus the observed phenomenon of double contact chords with additionally maximal curvature is perhaps not as exceptional as it might first appear.

knots for climbers
ropes in equilibrium
the roles of impact and inertia in the failure of a shoelace knot
topological mechanics of knots and tangles
the ashley book of knots
surgical knots
surgical knot tying manual
ligatures et sutures chirurgicales
the constrictor knot is the best ligature
ieee international conference on technologies for practical robot applications (tepra)
a finite element model of a surgical knot
influence of a knot on the strength of a polymer strand
tying a molecular knot with optical tweezers
probability of dna knotting and the effective diameter of the dna double helix
knotted umbilical cords, in: physical and numerical models in knot theory
creation and dynamics of knotted vortices
how superfluid vortex knots untie
energy release in the solar corona from spatially resolved magnetic braids
physical and numerical models in knot theory: including applications to the life sciences, k & e series on knots and everything
high resolution portrait of the ideal trefoil knot
tight open knots
localization of breakage points in knotted strings
effects of knot characteristics on tensile breaking of a polymeric monofilament
tightening of the elastic overhand knot
elastic knots
matched asymptotic expansions for twisted elastic knots: a self-contact problem with non-trivial contact topology
untangling the mechanics and topology in the frictional response of long overhand elastic knots
biarcs, global radius of curvature, and the computation of ideal knot shapes, in: physical and numerical models in knot theory
mechanics of two filaments in tight contact: the orthogonal clasp
discrete elastic rods
technical brief: finite element modeling of tight elastic knots
ideal knots and other packing problems of tubes
computation and visualization of ideal knot shapes
global curvature and self-contact of nonlinearly elastic curves and rods
global curvature, thickness, and the ideal shapes of knots
on the minimum ropelength of knots and links
euler-lagrange equations for nonlinearly elastic rods with self-contact, arch. ration
best packing in proteins and dna
curves, circles, and spheres
a constructive approach to modelling the tight shapes of some linked structures
optimal shapes of compact strings

acknowledgments. the authors thank a. flynn for fruitful discussions. this work was supported by the fonds national de la recherche, luxembourg ( ), the grants-in-aid for jsps overseas research fellowship ( - ), and by the swiss national science foundation (award - to jhm).

appendix b: smoothing of the raw data to reduce 'noise' in the curvature computation

as described in the main text (sec. . . ), we computed the curvature profiles of the rod centerline by the numerical differentiation of r(s). prior to this differentiation, we applied a gaussian-weighted moving average filter (command smoothdata in matlab ) to r(s), with a window size defined by σ = round(n b /n gauss ). to test the fidelity of the computed curvature data, given the discrete nature of the raw data, we performed a parametric test of the filter on the closed trefoil knot with l /d = . (the same configuration studied experimentally in the main text). in this test, we fixed the total number of discrete centerline points n b = for the closed trefoil knot, and systematically varied n gauss = { , , }. without the filter (i.e., σ = , corresponding to n gauss = ), the data would be far too noisy for analysis. in fig.
, we present profiles of the normalized curvature, κ(s), for decreasing values of n_gauss (the data are increasingly smoothed as n_gauss decreases). we selected the window size σ = (i.e., n_gauss = ), which reasonably suppresses noise without over-smoothing the curvature features.

[figure caption] test of the smoothing of the curvature computed from the centerline data for the experimental closed trefoil knot. varying the window size of the gaussian-weighted moving average filter made it possible to find a trade-off between noisy and over-smoothed curves. the window size selected for the data presented in the main text is n_gauss = (i.e., σ = ).

homotopy between the centerline of the physical closed trefoil knot and the rim of its contact shape. motivated by the known homotopy between the contact line and the knot centerline for numerically simulated, geometrically ideal configurations of the closed trefoil knot (cf. fig. ), we provide an animation of a three-dimensional rendering constructed from the µct data for the physical closed trefoil knot presented in fig. a , fig. a , and fig. a ,b of the main text. the knotted elastomeric rod (vps ) has a rest length l = . mm and rest diameter d = . mm (d /l = . ). after locating the centerline of the knotted rod (black curve), as well as its self-contacting surface (blue surface), we extract the rim of the contact surface (red curve). without undergoing self-crossings (fixed topology), the rim of the contact surface is smoothly morphed into the rod centerline, thus revealing the homotopy between the rim of the contact shape and the rod centerline. to perform this morphing, we first parametrized the rim of the contact shape as r(s*), where s* is the arc length along the rim (of total length l_r).
the intermediate morphing curve, w(s), ranges from the rim curve to the centerline curve (parametrized as r(s)), following the parametrized deformation $w(s) = (1 - t)\, r(s\, l_r / l) + t\, r(s)$, where the first term is evaluated on the rim parametrization at $s^* = s\, l_r / l$, $l$ is the total length of the centerline curve, and $0 \le t \le 1$ is the morphing parameter.

key: cord- -baxmoutj authors: hobson, stacy; hind, michael; mojsilovic, aleksandra; varshney, kush r. title: trust and transparency in contact tracing applications date: - - journal: nan doi: nan sha: doc_id: cord_uid: baxmoutj

the global outbreak of covid-19 has led to a focus on efforts to manage and mitigate the continued spread of the disease. one of these efforts is the use of contact tracing to identify people who are at risk of developing the disease through exposure to an infected person. historically, contact tracing has been primarily manual, but given the exponential spread of the virus that causes covid-19, there has been significant interest in the development and use of digital contact tracing solutions to supplement the work of human contact tracers. the collection and use of sensitive personal details by these applications has led to a number of concerns among the stakeholder groups with a vested interest in these solutions. we explore digital contact tracing solutions in detail and propose the use of a transparent reporting mechanism, factsheets, to provide transparency of and support trust in these applications. we also provide an example factsheet template with questions that are specific to the contact tracing application domain.

the recent spread of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and the outbreak of the associated covid-19 disease has inspired the development of new software applications and ai models to address many of the challenges our global society is facing. public health agencies, corporations, and individuals have been racing to identify tools to help control the spread of the virus, find suitable treatment options, and aid in the creation of a vaccine. given the public health impact and urgent need to limit the continued spread of the disease, many government officials and policy makers have relaxed regulations to expedite the launch of technologies addressing these and other related concerns. many of these technologies collect and use sensitive data about individuals such as health history, medical conditions, infection state, current health symptoms, and location. one example is contact tracing applications, which focus on identifying individuals who are at risk of developing covid-19 through exposure to a person later identified as having been infected with sars-cov-2.
contact tracing applications use various techniques to identify exposure or contact events, and use sensitive personal data such as the examples listed above. the use of sensitive personal information has prompted concerns about the overall trustworthiness of these types of applications. these concerns have motivated interest in application transparency, so that application stakeholders can better understand details including the purpose of the application, the data that is collected, and the application's use of the collected data. in recent years there has been significant discussion around the need for transparent reporting, specifically with regard to ai models and services. we apply one of the recent transparent reporting techniques in the context of contact tracing applications. although this category of applications is not considered ai, there is significant risk to the end-users of the applications given the health implications and use of sensitive personal data. studies have shown that technologies applied in a healthcare or public health setting can lead to negative outcomes such as medical errors, harm, or death, especially if they are poorly designed, implemented, or applied [ ]. the limited understanding of the details of these applications motivates a need for transparency to support trust. the objective of this paper is to identify how we develop and use transparent reporting mechanisms for contact tracing applications. we do not aim to draw direct conclusions about the trustworthiness of specific applications but focus on the types of questions that must be addressed to provide transparency of and support trust in the applications in this domain.

ii. transparent reporting mechanisms

researchers in the software engineering community have focused on creating useful documentation for applications.
they have identified quality issues in existing documentation for conventional systems [ ], [ ], [ ] and discussed problems such as missing rationales for design decisions, too few examples to understand how to use a module or package, lack of overviews to illustrate how a system's component parts work as a whole, and insufficient guidance on how to map usage scenarios to elements of an api. ai applications pose a unique challenge, given their reliance on training data and their often probabilistic behavior with respect to test data. thus, there has been a recent focus on transparent reporting mechanisms for ai systems, covering datasets [ ], [ ], [ ], models [ ], [ ], and services [ ]. there have also been efforts focused on the ethical development of ai that highlighted the need for transparency or detailed assessments of ai systems [ ]. we build upon these efforts of transparent reporting to examine and provide transparency of contact tracing applications.

iii. covid-19 and contact tracing

sars-cov-2 poses a significant health challenge for global communities in that there are currently no identified vaccines or accepted proactive treatment methods for covid-19, the disease the virus causes. limiting the spread of the virus has emerged as one of the primary targets to reduce the occurrence of covid-19 and its impact on individuals and on the overburdened healthcare systems in many countries. two of the measures used to reduce the spread are 1) limiting the physical interactions and contact between people (social distancing) and 2) identification of people who have come into contact with or proximity of an infected person (contact tracing). contact tracing has been used for many years as a method to control disease and has primarily relied on the mobilization of trained human contact tracers: people who actively work with individuals with confirmed infections to generate a list of people whom they may have further exposed or infected [ ].
the contact tracers then notify each of the identified individuals of the exposure risk, encourage them to get tested for infection, and suggest potential immediate quarantine action. if any of those individuals are infected, the tracers begin the process of creating an exposure contact list for each of those people for further notification and action. manual contact tracing efforts are likely not sufficient in cases where the spread of the disease has been exponential, as we have seen with sars-cov-2. the initial doubling of cases in china was reported at every . days before advanced mitigation methods were employed [ ]. a recent publication by the johns hopkins university center for health security reports that the united states will need to add approximately , human contact tracers as part of the multi-pronged effort to manage the covid-19 epidemic [ ]. one way to scale contact tracing efforts and complement the work of human contact tracers is through the use of digital contact tracing solutions. the united states centers for disease control and prevention (cdc) identifies two types of digital contact tracing solutions: one focused on streamlining the capture and management of data on cases and contacts, the other on using bluetooth or gps to track an individual's exposure to an infected person [ ]. the approach we use for transparency can be applied to both solution types; however, we focus our remaining discussion on the most prevalent of the application types, those that fall into the latter category. there is no single agreed-upon way to achieve contact tracing; at the time of writing this paper, we identified contact tracing applications available worldwide. many of these applications establish contact events by keeping a record of all the devices (e.g. smartphones) that come within a certain distance of one another or are in the same geographical location at the same time.
once a person has been identified as infected with covid-19 and has indicated it in the application, a notification can be sent to all other devices running the same application that recorded close proximity to the device of the infected person within a set date range. these details are used to infer a contact event: that one or more people were close enough to an infected individual that respiratory droplets could pass from the infected person to the others. the most common approaches for digital contact tracing rely on either location tracking through the global positioning system (gps) or bluetooth low energy capabilities. most smartphones today continuously capture details on the device's location and the associated time via gps satellites. gps-enabled devices are reported to work best when they are outdoors under open skies, where they can accurately capture location within feet [ ]. location accuracy is known to degrade when devices are indoors, underground, or near items that obstruct a direct path to the satellites, e.g. buildings, bridges, or trees [ ]. gps-based location tracking for a device is achieved through trilateration using radio signals from gps satellites. the resulting coordinates indicating geographical location are paired with a timestamp to represent location at a specific time. contact tracing applications infer contact events by identifying devices that 1) have geographical coordinates that fall within a set distance parameter of one another (e.g. or feet), 2) have timestamps that overlap with one another, and 3) remain within the distance parameter for a specified duration (e.g. minutes), even if the geographical coordinates of one or both change. this inference also relies on certain assumptions, including that the device is always in a person's possession and that possession is by a single person.
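the three inference steps above can be illustrated with a minimal python sketch. it is not taken from any of the applications discussed here: the names `Fix`, `haversine_m`, and `is_contact_event`, the time-alignment tolerance `max_time_skew_s`, and any threshold values passed in are hypothetical, chosen only for illustration, and the haversine formula is just one common way to approximate the distance between two gps coordinates.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import List

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class Fix:
    """One timestamped GPS reading (hypothetical record layout)."""
    t: float    # seconds since some shared reference time
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Approximate great-circle distance in metres between two fixes."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def is_contact_event(track_a: List[Fix], track_b: List[Fix],
                     max_dist_m: float, min_duration_s: float,
                     max_time_skew_s: float = 30.0) -> bool:
    """True if the two time-sorted tracks stay within max_dist_m of each
    other, at overlapping times, for at least min_duration_s."""
    run_start = None  # time at which the current uninterrupted "close" run began
    j = 0
    for fa in track_a:
        # Step 2: find the fix in track_b closest in time to fa.
        while (j + 1 < len(track_b)
               and abs(track_b[j + 1].t - fa.t) <= abs(track_b[j].t - fa.t)):
            j += 1
        fb = track_b[j] if track_b else None
        # Step 1: timestamps overlap (within tolerance) and distance is small.
        close = (fb is not None
                 and abs(fb.t - fa.t) <= max_time_skew_s
                 and haversine_m(fa, fb) <= max_dist_m)
        if close:
            if run_start is None:
                run_start = fa.t
            # Step 3: proximity has persisted long enough.
            if fa.t - run_start >= min_duration_s:
                return True
        else:
            run_start = None
    return False
```

the check compares the devices to each other rather than to fixed places, so two devices that travel together still count as a contact event. like the inference it illustrates, the sketch says nothing about who was actually carrying each device.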
one case that violates these assumptions arises when the device is not in someone's possession, for example when it is left somewhere (on the seat of a train, on a table in a restaurant, etc.). the device's location (and not that of the person) would be tracked and a faulty contact event can be reported. similarly, if a device owner or primary user lets someone else (friend, family member, etc.) use the device and the owner is later found to be infected, the exposure of others may be reported for cases where the infected individual was not present. additionally, the challenge posed by inaccuracies for device use indoors may limit the ability to identify a significant portion of contact events and has been identified as a potential shortcoming of using gps location tracking for this particular purpose. bluetooth low energy (ble) capabilities can be used to establish contact events through proximity detection. most smartphones are equipped with bluetooth capabilities that are leveraged for this method of contact tracing and, unlike gps, can track proximity events indoors or outdoors. since bluetooth is used to track the proximity to other bluetooth-enabled devices, it does not track actual location. this has been considered one of the limitations of this method since it cannot assist in the identification of geographical areas where the virus is spreading. in a contact tracing context, bluetooth low energy is used to broadcast information from a device, including a timestamp and an identifier. since bluetooth low energy is based on short-range communications, only devices that are within a short distance are expected to receive the broadcast. the receiving device uses the received signal strength indicator to infer the distance between itself and the broadcasting device. a recent tech report highlights issues with relying on signal strength as an estimator of distance.
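how a receiving device might turn signal strength into a distance can be sketched with the log-distance path-loss model, a standard textbook approximation. the source does not state which model any particular application uses, so this is an assumption. the function name `estimate_distance_m`, the calibration value `tx_power_dbm` (the signal strength expected at one metre), and the path-loss exponent are likewise hypothetical values chosen for illustration.

```python
def estimate_distance_m(rssi_dbm: float,
                        tx_power_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Estimate distance in metres from a received signal strength reading.

    Log-distance path-loss model:
        rssi = tx_power - 10 * n * log10(d)   =>   d = 10 ** ((tx_power - rssi) / (10 * n))

    tx_power_dbm is the (assumed) RSSI measured at 1 m; path_loss_exponent n
    is about 2 in free space and typically larger in cluttered environments.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def distance_inflation(extra_loss_db: float, path_loss_exponent: float = 2.0) -> float:
    """Factor by which the distance estimate grows when the signal is
    attenuated by extra_loss_db (e.g. by a body or a wall)."""
    return 10 ** (extra_loss_db / (10.0 * path_loss_exponent))
```

under these assumptions, every additional 10 db of attenuation multiplies the estimated distance by roughly 3.2, so a phone in a pocket or behind a person can appear considerably farther away than it is. this sensitivity is one reason signal strength alone is a fragile proxy for distance.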
the authors of that report showed that the signal strength varied substantially based on the orientation of the device, absorption of the signal by the human body, and reflection or absorption of radio signals in buildings and trains [ ]. another fundamental difference between gps-based location tracking and proximity identification through bluetooth low energy is in how the data related to potential contact events are stored. gps-based location tracking relies on centralization of data on a remote server, while the bluetooth low energy technique can be either centralized or decentralized, with data being shared only locally on the individual devices. other, less-discussed methods for contact tracing may involve the use of bluetooth beacons [ ], location tracking through cellular or wi-fi, or tag scanning (e.g. qr or rfid). these techniques can be implemented as the sole method for contact identification or in combination with one of the other techniques. although these techniques can be used to identify potential contact events, they do not factor in pertinent details that can affect transmission likelihood. for example, transmission within an indoor, poorly ventilated space may be more likely than transmission in an outdoor space [ ]. additionally, appropriate use of items like medical-grade masks or respirators by one or more individuals can greatly reduce the likelihood of transmission during contact events. we are not suggesting that one method is better than another; we only present a brief introduction to the techniques since aspects of the technical implementation are important considerations for stakeholders interested in application transparency.

contact tracing has been identified as critical to the ability to manage the covid-19 pandemic and, along with significant testing capabilities, may be required to enable governments to relax the measures in place that limit the movement of their citizens [ ], [ ].
these measures have negatively impacted global economies given the effect the restrictions have on businesses in industries such as retail, hospitality, and travel & transportation. multiple stakeholders are interested in the development and use of contact tracing applications, and the underlying motivations for this interest may differ for each group. understanding some of the motivations of the stakeholders provides a foundation for identifying the expected benefits and concerns of use of these applications. one category of stakeholders includes those with a public health role: officials in organizations with a focus on the identification and management of viruses like sars-cov-2. examples of these organizations are the cdc in the united states and ministries of health in other countries. they have an interest in digital contact tracing solutions as a complement to the manual contact tracing efforts that many of them have employed for decades, with a goal of using these techniques to mitigate disease spread. a second group includes health care providers: hospitals, long- and short-term care facilities, and laboratories. their interests also include the management of the virus but extend to use of the applications for reporting infected cases among their patients and appropriate handling of people who have been notified of potential exposure. the health care provider and public health groups, along with government officials, likely also have interests in using applications to identify 'hot spots', or locations where the spread of the virus is growing. this information can be used in tailoring localized measures aimed at reducing the continued spread. many public and private companies have interests in digital contact tracing applications as part of their efforts to allow employees to return to a physical worksite.
since the start of the pandemic, many governments have instituted measures to encourage or mandate that their citizens remain at home, with exceptions being allowed for those in essential roles, e.g. health care staff, public safety officials, and critical infrastructure work in specific industries [ ]. digital contact tracing applications can be used to identify exposures within an office setting and enable employers to recommend that exposed individuals quarantine at home to reduce worksite-associated outbreaks. additionally, employees who have been notified of potential exposure through non-work-related activities can communicate the exposure to their employer and self-quarantine to prevent spread. software developers and information technology professionals are often the groups that are responsible for the development of contact tracing applications. people within this group may have interests in developing their own solutions for contact tracing to make available to the stakeholder groups mentioned above. there are often government and geographical considerations that apply to the applications, so adoption of an existing application by a different country may require updates by software developers to make it adhere to specific local policies or regulations. the final stakeholder group we identify here includes the individuals who are expected to actively use contact tracing applications. this could be individuals in a certain country, state, or geography, or, in a business context, the business's employees. trust in the applications by the target end-users is critical for effective adoption, especially in cases where application usage is not mandated.

v. benefits and concerns

the use of digital contact tracing applications is expected to provide a variety of benefits but also raises a number of concerns, including those relating to privacy and security.
the way in which the concerns are handled may differ given the technical design and implementation of each application. one key benefit of contact tracing that applies to both manual efforts and digital applications is the ability to identify people who have been exposed to an infected individual in order to encourage testing and quarantine. quarantining people who are infected but pre-symptomatic or asymptomatic reduces the chance of them infecting others before they become aware of their own infected state. recent studies have suggested that the median incubation period for covid-19 is about . days [ ] and that a portion of the spread of sars-cov-2 is from pre-symptomatic [ ], [ ] or asymptomatic individuals [ ], [ ]. therefore, the identification and quarantining of pre-symptomatic and asymptomatic people may be a useful factor in reducing continued covid-19 infections. digital solutions may also provide benefits beyond those of manual contact tracing methods. some of these additional benefits include:

1) faster notification of exposure: digital solutions can help reduce the time for notification to exposed individuals as compared to manual contact tracing. apps can notify exposed people within seconds after an infected person has been identified. manual tracing efforts require several steps after the identification of an infected individual, including completing an interview with the infected person or close family members to collect names of the people potentially exposed, potential additional time to locate a means to contact each person (phone numbers, addresses, etc.), and the time to establish contact.

2) identification of contact in public spaces: contact tracing applications may also help address areas where manual contact tracing is not effective, for example in identifying prolonged contact with strangers or in public spaces.
a specific example of this would be an asymptomatic individual traveling via public transportation or waiting in line in a coffee shop. the individual would not be able to identify most of the people they came into contact with. additionally, even if the individual could recall the exact day, time, and duration of the visit, this still would not be sufficient to identify and locate all others who were in the same location at the same time.

3) identifying outbreak 'hot spots': contact tracing solutions that capture location details in association with infections and exposures may be useful in identifying areas where 1) infections are growing, 2) the number of cases exceeds a threshold, or 3) congregations of large groups of people are enabling rapid transmission. this information may be used in implementing countermeasures like social distancing and shelter-in-place policies targeted at specific locations to reduce the increase in infections within that geographical area.

discussions around the potential use of digital contact tracing applications have brought to light a large number of concerns with the technologies, chief among which is privacy. we maintain that transparency of the technologies, through an understanding of how each addresses the concerns, is a foundation for building trust and enabling stakeholders to make decisions about which technologies they want to use and how they want to use them. some of the main concerns with the solutions include:

1) privacy: at the core of digital contact tracing is the awareness of personal information such as health status (infected or not infected), location details, social interactions, and in some cases name, gender, age, and health history (self-reported symptoms and medical conditions).
the collection of these details poses a number of issues, such as the potential for an individual's sensitive data to be made available to others (intentionally or unintentionally) and its use by governments or other groups for purposes other than management of covid-19 spread. some practical privacy concerns are the opportunities for government agencies such as law enforcement, or organizations like the united states' immigration and customs enforcement (ice) agency, to surveil people through their use of the application, and the potential for others to find out about their health conditions, including covid-19 infection.

2) security: another top concern for application stakeholders is application security. this includes two aspects: a) the vulnerability of the applications to attacks that attempt to change how the application works, to access personal data, or to disable usage of the application, and b) the embedding of code for nefarious purposes by an application developer or publisher. the data breach investigations report by verizon identified web applications as the second highest category of healthcare industry breaches after miscellaneous errors [ ]. an example of a specific security issue with a contact tracing application was highlighted in a recent report by amnesty international, in which they stated that they were able to access individuals' names, health status, and location details from a central server for the qatar government-sponsored digital contact tracing application ehteraz [ ].

3) coverage: the technical implementation of the applications also affects the expectation of deployment and use. for example, applications using bluetooth low energy may require many people in the specific community or location to download and use them in order to adequately assess potential spread amongst the population. if there is not enough coverage of use across the population, the ability to identify many of the exposed people is reduced.
we understand that people may have varying reasons for choosing to participate or not, one of which is their belief in the trustworthiness of the applications based on many of the specific concerns highlighted here.

4) access: a key requirement for digital contact tracing is that individuals have devices (e.g. smartphones) that enable the application to function properly. since many of the applications rely on ble or gps, individuals would need devices that have these capabilities embedded. results of a survey showed that approximately % of the people aged or older in the united states have a smartphone, while ownership among those between ages and was greater than % [ ]. also, for some of the systems, a newer version of a smartphone is required; people with older smartphones may not be able to use or get alerts from these types of applications. since the identification of contact events for individuals is based on these devices, children and disadvantaged groups may also be omitted given a lack of access to or continued use of a personal smartphone. countries like india and indonesia have large portions of the population that either do not have access to a compatible device or do not have a device at all [ ].

5) accuracy: we introduced some of the issues related to accuracy earlier in the discussion, specifically the limitations with tracing contact in large locations (e.g. apartment buildings) and areas where people are more geographically separated. we highlighted a concern with gps previously in that it is not as accurate indoors or in areas where there is no unimpeded path to open skies. with ble, accuracy may degrade based on the positioning or obstruction of the bluetooth-enabled device, and this may impact the proximity identification [ ].

6) asynchronous contact events: there is potential for exposure and spread of the virus in cases where there is an asynchronous contact event, for example when a person is in a small enclosed space (e.g.
elevator) for a period of time and then leaves, and shortly thereafter another person comes into the same space. it is believed that most of the spread of the virus is through aspiration of respiratory droplets; however, there is also the possibility that spread occurs when an uninfected person touches an object or surface that an infected person has previously touched and then puts their hands or fingers in the areas around their mouth, nose, or eyes. both of these examples of spread can occur through an asynchronous contact event but may not be captured as such in digital contact tracing solutions that focus on people being in the same location or in close proximity at the same time.

7) device impacts: each contact tracing application may also have specific considerations for and impacts on the devices on which it is run. there is a concern about the potentially high consumption of battery power with bluetooth-based techniques [ ]. some of the applications are required to run in the foreground of the device, meaning that when other applications are being used by the device holder, the application may not be able to work appropriately to identify contact events. additionally, the device makers may have restrictions in place that affect the way the applications work. one example of this is apple's restriction on allowing bluetooth transmissions when an ios-based device is locked [ ], which limits the functionality of contact tracing applications on these devices.

8) ability: these applications rely not only on adoption by individuals but also on appropriate use. if people are unaware of specific requirements for use, or are not comfortable using the device or the application, their interactions may not be sufficient for the application to be effective. consider an example where a novice technology user has a smartphone and downloads the application on it.
the user may not realize that downloading the application is not sufficient, and that use may also require completing a profile and providing consent for the application to run on the user's device. consent may also be required in the device settings to allow the application to access some of the smartphone's capabilities that are required. for example, a user may install an application but inadvertently prevent it from working by disabling access to the device's location services or bluetooth capabilities.

9) interoperability: limitations in contact tracing applications' ability to identify contact events may lead to missed episodes of exposure and potential transmission of the virus. we have highlighted some of these concerns relating to the coverage, access, and accuracy aspects already. another related concern about an application's ability to identify contact is that of interoperability between applications and/or devices. consider an example where an infected individual is located near another individual for an extended period of time. if the two people are running different contact tracing applications, or running applications on different devices (e.g. one with an android-based device and the other with an ios-based device), restrictions on the applications' ability to share details with one another, or from one platform to another, directly inhibit the identification of this contact event. apple and alphabet (google's parent company) have proposed a framework that allows interoperability between the device operating systems of contact tracing applications, which is a helpful step in addressing this issue but is limited to the applications that use the framework [ ]. in some cases, applications like aarogya setu have been developed in versions for both the android and the ios operating systems [ ].
) reluctance in disclosure: in some cases people may agree or are mandated to use a digital contact tracing application but have an interest in withholding an infection diagnosis because of privacy or security concerns, or personal reluctance to acknowledge the diagnosis. similarly, people may not want to acknowledge or disclose their exposure to infected individuals. in some geographies, people who are diagnosed as infected or are identified as having been exposed to an infected individual may be told to quarantine for a period of time. these measures will limit people's movements and ability to do things that they may want to do e.g. go to work, go to the grocery store, visit family members, or participate in social activities. some of these limitations may have an economic impact (restricting ability to work) which may reinforce a reluctance for an individual to disclose infection or exposure. the urgent global need for contact tracing has spurred the development of many digital solutions. to date, we have identified different applications created since december specifically to support the contact tracing needs required for management of covid- . these solutions may differ in technical implementation and specific policies of use. it is likely hard for public health agencies and government officials to quickly identify the differences between applications as they try to determine which one to select as part of their targeted virus management strategy. similarly, it is also difficult for individuals who are asked to install and use the applications to get consumable details regarding specific considerations relevant to them like requirements for use, types of data collected, and data use policies. we provide a list of the applications in table i including details on the organization that sponsored the development or group that directly developed the application, and the technical approach that is used for identifying contact. 
these details are based on information reported for each of the applications at the time of authorship of this paper, but we acknowledge that due to the dynamic nature in the development of these applications and efforts to address emerging concerns of the intended community of use, some of these details may change in the future. as a follow-on to our prior work on the use of factsheets for transparent reporting [ ], we now aim to help identify the questions that would provide useful and critical information about contact tracing applications. to achieve this goal, we first compiled questions that are relevant to provide basic information relative to any model, service or application. these include questions focused on scope of use, target stakeholders, and data that is collected. then, after detailed review of contact tracing technologies and their potential for use, we augmented the initial list with questions specific to contact tracing, namely, those that would elicit details addressing the benefits and concerns identified above. some of these questions focus on the technical implementation including the technique used for establishing proximity and/or location, method for identifying a contact event (centralization versus decentralization), and method for infection reporting. as a final step, we considered the beneficiaries of the applications and questions that would be of interest to them that were not already identified. examples of these questions include how infections are reported, whether usage is voluntary or mandated, and how compliance with local laws or regulations is achieved. these efforts enabled us to create a factsheet template, a list of questions that can be used to provide important details on and promote transparency of contact tracing applications. the factsheet template we created is organized into four main categories: general questions, data-specific questions, privacy questions, and use questions.
we introduce the template and the associated questions in tables ii-v.
privacy questions:
did you implement the right for a user to 1) withdraw consent, 2) object, and 3) be forgotten in the application?
does the application allow people to learn any personal information about others?
are privacy-preserving techniques incorporated in the application (e.g. data anonymization, encryption, aggregation)? if so, provide details on the techniques used.
what additional measures are used to protect the data and identity of infected and exposed individuals?
could this application be used in a way that identifies people who are infected or at risk to 1) the developers, 2) people within an individual's social circles, 3) to those the app is warning about contact and potential exposure, or 4) to the government, employer, or managing organization?
if the app connects to public health or hospital systems, how do you ensure that personal information isn't accessible during data sharing points?

viii. discussion
we have presented a broad factsheet template to support transparency of contact tracing applications. a key component of factsheets is the tailoring of the questions within the factsheet template to address a specific stakeholder group and provide clarity on the aspects of the applications that they are most concerned about. as we discussed in section iv, the motivations of interest in the applications may differ for each group, and these motivations influence the questions that enable transparency for each group. let's consider the general public stakeholder group whose interest in transparency may be most related to their own use of the applications. the questions that focus on data, privacy and device requirements may be the ones that are critical for their specific version of the factsheet template. some of these questions would include those relating to 1) types of data collected, 2) how the data is used, 3) requirements for efficacy of tracking and contact identification, 4) expected device impacts, 5) limitations of use, and 6) data privacy. the public health official group would potentially be engaged in the selection and management of the contact tracing applications for a specific geographical area (city, county, state, country, etc.) and therefore would likely have interest in the broadest set of questions from the template above. the full set of questions we identified in the template would provide information pertaining to the concerns from all stakeholder groups, and this could be useful as the public health officials evaluate and select the applications with the other stakeholders' concerns in mind. however, they might be less interested in questions relating to specific device impacts or requirements of use unless these considerations could greatly limit acceptance in their geographical communities. for the it professionals group, questions around the technical implementation, limitations, connections to other it systems, all data aspects (collection, policies, access rights, retention, and security), device requirements, decentralization versus centralization, and privacy-preserving techniques would be of particular interest. this group may also be interested in additional technical details about the applications including access to the code base. recent reports have suggested that the code for two of the applications referenced in table i, aarogya setu and tracetogether, will be publicly available as open source projects [ ], [ ]. we believe that this is another path for promoting transparency of these types of applications for this specific stakeholder group, and can be used together with factsheets to foster trust. we acknowledge that there could be additional relevant questions that were not listed in our factsheet template that might be useful for application transparency in this context.
we suggest the factsheet template as a useful starting point in the efforts towards transparency. we also acknowledge that as applications are updated, the answers to the questions may change. we suggest the generation of a factsheet for each application deployment and update. it is possible to create a base factsheet for the application that covers the details that will not change from one application instance to the other, and also include a supplementary factsheet that is generated for each version or use case. we have demonstrated the potential of factsheets in this context to promote transparency but note that factsheets are not limited to this purpose alone. they can be leveraged as a mechanism in additional contexts, for example as part of a robust trust and governance strategy within a business, or a path for evaluation and certification of models or services by a third-party. our proposal of the use of factsheets for transparency will help in providing consumable details about the applications for the stakeholder groups we discussed in section iii and help each group to understand application details related to the concerns of their group. we encourage people with an interest in fostering trust in models, services and applications to use transparent reporting techniques like factsheets to provide consumers and stakeholder groups with the necessary details to better understand these technologies. 
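the arrangement suggested above, a base factsheet plus a supplementary factsheet per version, with answers tailored to a stakeholder group, can be illustrated with a minimal sketch. everything named here is an assumption made for illustration and is not part of the factsheet template itself: the `Factsheet` class, the `merge` and `tailor` helpers, the short category keys, the example application name, and the sample questions and answers are all hypothetical.

```python
from dataclasses import dataclass, field

# short keys for the four template categories named in the text (the keys are assumptions)
CATEGORIES = ("general", "data", "privacy", "use")


@dataclass
class Factsheet:
    """answers to template questions, grouped by category."""
    application: str
    version: str
    answers: dict = field(default_factory=lambda: {c: {} for c in CATEGORIES})

    def answer(self, category, question, text):
        # reject categories that are not part of the template
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.answers[category][question] = text


def merge(base, supplement):
    """combine a base factsheet with a per-version supplement.

    where both answer the same question, the supplement's answer wins.
    """
    merged = Factsheet(base.application, supplement.version)
    for c in CATEGORIES:
        merged.answers[c] = {**base.answers[c], **supplement.answers[c]}
    return merged


def tailor(sheet, categories):
    """keep only the categories a given stakeholder group cares about."""
    return {c: dict(sheet.answers[c]) for c in categories if c in CATEGORIES}


# hypothetical application and answers, for illustration only
base = Factsheet("example-app", "base")
base.answer("general", "is usage voluntary or mandated?", "voluntary")
base.answer("privacy", "can a user withdraw consent?", "yes")

update = Factsheet("example-app", "1.1")
update.answer("data", "what types of data are collected?", "bluetooth proximity identifiers")
update.answer("privacy", "can a user withdraw consent?", "yes, from the settings screen")

current = merge(base, update)
public_view = tailor(current, ("data", "privacy"))
```

in this sketch the base answers are written once and each deployment or update only supplies what changed, which is one possible reading of the base-plus-supplement suggestion; which categories a stakeholder group receives is a choice left to whoever publishes the factsheet.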
health it and patient safety: building safer systems for better care evaluating usage and quality of technical software documentation: an empirical study a field study of api learning obstacles a study of the effectiveness of usage examples in rest api documentation datasheets for datasets the dataset nutrition label: a framework to drive higher data quality standards data statements for natural language processing: toward mitigating system bias and enabling better science factsheets: increasing trust in ai services through supplier's declarations of conformity model cards for model reporting ethics guidelines for trustworthy ai contact tracing: part of a multipronged approach to fight the covid- pandemic nowcasting and forecasting the potential domestic and international spread of the -ncov outbreak originating in wuhan, china: a modelling study pdf [ ] preliminary criteria for the evaluation of digital contact tracing tools for covid- , united states of department of health and human services united states national coordination office for space-based positioning, navigation, and timing coronavirus contact tracing: evaluating the potential of using bluetooth received signal strength for proximity detection bluetooth low energy beacons airborne transmission of sars-cov- : the world should face the reality quantifying sars-cov- transmission suggests epidemic control with digital contact tracing effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov- in different settings guidance on the essential critical infrastructure workforce: ensuring community and national resilience in covid- response the incubation period of coronavirus disease (covid- ) from publicly reported confirmed cases: estimation and application temporal dynamics in viral shedding and transmissibility of covid- asymptomatic and presymptomatic sars-cov- infections in residents of a long-term care skilled nursing facility king county presumed 
asymptomatic carrier transmission of covid- data breach investigations report qatar: contact tracing app security flaw exposed sensitive personal details of more than one million who owns cellphones and smartphones smartphone ownership is growing rapidly around the world, but not always equally exposure notification bluetooth specification government of india aarogya setu open source code things about opentrace, the open-source code published by the tracetogether team the authors would like to thank marc stoecklin for input on the list of current contact tracing applications. key: cord- -oyyj bl authors: parker, michael j; fraser, christophe; abeler-dörner, lucie; bonsall, david title: ethics of instantaneous contact tracing using mobile phone apps in the control of the covid- pandemic date: - - journal: j med ethics doi: . /medethics- - sha: doc_id: cord_uid: oyyj bl in this paper we discuss ethical implications of the use of mobile phone apps in the control of the covid- pandemic. contact tracing is a well-established feature of public health practice during infectious disease outbreaks and epidemics. however, the high proportion of pre-symptomatic transmission in covid- means that standard contact tracing methods are too slow to stop the progression of infection through the population. to address this problem, many countries around the world have deployed or are developing mobile phone apps capable of supporting instantaneous contact tracing. informed by the on-going mapping of ‘proximity events’ these apps are intended both to inform public health policy and to provide alerts to individuals who have been in contact with a person with the infection. the proposed use of mobile phone data for ‘intelligent physical distancing’ in such contexts raises a number of important ethical questions. in our paper, we outline some ethical considerations that need to be addressed in any deployment of this kind of approach as part of a multidimensional public health response. 
we also, briefly, explore the implications for its use in future infectious disease outbreaks. as we write this paper, europe is at the epicentre of the covid- pandemic. the pandemic has its origins in the emergence, late in , of a novel coronavirus in the chinese city of wuhan, which has a population of around million. it is estimated that between the official confirmation of the outbreak and the imposition of a lockdown, around million people left the city. the vast majority went to other parts of china. the epidemiological implication of this is that the chinese population outside wuhan came into contact with many more people infected with covid- than did the world outside china.
despite this, as of april , around months later, china's total number of cases is , and its daily case rate is close to zero. by contrast, the global total of cases is now approaching million and doubling every few days in many places. compared with other countries, china has been very successful at controlling the spread of covid-19. there are a number of features of china's response to covid-19 that would be unlikely to be effective or acceptable in other countries. this does not mean that there are not important lessons to learn from china's success. one element of the approach adopted by china and by several other countries in east and south east asia that has been highly successful in reducing cases is the use of mobile phone data combined with intensive testing programmes. there is evidence to suggest that the use of this kind of approach might be successfully transferable to other settings with different political and cultural systems. ii effective, rapid contact tracing is the cornerstone of effective public health response in the face of infectious disease outbreaks. its success depends on identifying cases (usually people with symptoms) quickly, gathering information from them about recent contacts and following up and quarantining those contacts to interrupt further transmission of the disease. covid-19 presents a problem for contact tracing as usually practiced because around % of transmissions happen early in infection, before symptoms start, and before test results can be acted on. this means that covid-19 moves too quickly through the population to be amenable to standard contact tracing methods.
i there is debate about the accuracy of the figures coming out of china but broad agreement about the success of their intervention in reducing the number of infections.
the use of a mobile phone app that captures 'proximity events' (events in which two mobile phones have been close enough for sufficient time for the risk of infection to be high) offers the potential for instantaneous contact tracing from the moment the infection is confirmed. iii this has the potential to stop the pandemic. the modelling for the use of a mobile phone app in covid-19 and a more detailed description of how this might work have been published elsewhere. a number of different approaches are currently under development by health systems in many countries around the world. in this paper, our aim is to set out a number of ethical considerations relevant to the use of mobile phone apps to enable rapid contact tracing. these issues will emerge in different ways in different settings. any consideration of the ethical questions arising in the context of the covid-19 pandemic has to place great importance on the moral significance of its international spread and the massive scale of its impact. as of april , there have been confirmed cases and deaths globally. these figures are likely to be significant underestimates. it is important to highlight the fact that, in addition to those who have died, very much larger numbers of people will be suffering symptoms sufficiently serious to warrant hospitalisation and intensive care. in low-income and middle-income countries in which health systems will often not have these facilities, the impact will be much greater.
ii the effectiveness and reach of any implementation of the app in democratic societies will inevitably be affected by varying configurations of state-citizen relationships, as well as by the roles of civil society groups and non-governmental actors.
iii the question of what constitutes adequate information about infection status for a population effect may be answered differently by different systems, ranging from self-reported symptoms through to clinically validated test results.
we are far from the end of the covid- pandemic: these numbers will continue to rise for quite some time. it hardly needs saying that the saving of lives and reduction of suffering are of immense moral importance and there are strong reasons to support efforts to achieve this. the ethical assessment of an innovation capable of making a contribution to addressing these harms needs to be understood and analysed against the dramatic scale of the deaths and suffering represented by these data. the policy decisions made by governments around the world in response to covid- have been inevitably varied. what is possible, what is required and what is socially and culturally appropriate will differ across the globe. such differences notwithstanding, many countries have introduced significant restrictions on freedom of movement with disruption to everyday life. one-third of the world's population is currently living under 'lockdown'. the terms and enforcement of this vary but all are causing serious economic and other harms to both individuals and institutions with long-term impact. their impact will be enduring. many people now and in the future will experience significant suffering as a consequence of these measures. in the context of public health emergencies, actions are often justified that would not be appropriate outside of such contexts. such actions do nonetheless require an explicit justification: the mere existence of an emergency does not in itself legitimise any intrusion on the autonomy or privacy of individuals or groups. the justification most commonly offered for the current imposition of lockdowns and other restrictions of movement has been that they are necessary to ensure sufficient 'physical distancing' to disrupt the transmission of the infection sufficiently to enable health systems to cope with predicted demand. it is estimated that the overwhelming of health systems, were it to happen, would be one of the main causes of death. 
current approaches to lockdown are, however, blunt tools applied at a national level. they apply to everyone, whether or not they are at risk, affected or immune. this is justified insofar as there is insufficient accurate, reliable information about the risk status of individuals or specific locations, which would enable more finely-tuned decisions to be made reliably. the justification of blanket lockdowns would be weaker were it possible to manage physical distancing in a more evidence-based, risk-adjusted way. were this so, it would remain the case that limiting the movements of those people who presented a high risk would be justified. it would not, however, be justified to restrict the movements of those individuals (and possibly populations) who were reliably known not to be contributing to this risk. rapid contact tracing enabled by the mobile phone app described above, combined with accurate testing, has the potential to be a tool of this kind. the evidence suggests the app has the potential to enable some (likely many) people to return more quickly to their lives. this evidence puts pressure on justifications for blanket lockdowns. the harms presented by such lockdowns also provide support for an argument that the development and implementation of the app as part of a broader package of public health interventions is not only ethically acceptable but also, where feasible, obligatory. iv the app is preferable to blanket lockdowns because intelligent physical distancing constitutes the minimum imposition compatible with addressing the epidemic safely. a fuller analysis would require the relative benefits and harms of other mooted options for non-pharmaceutical intervention to be compared and considered. controlled or delayed spread of sars-cov-2 with the primary intention of mitigating against overburdened healthcare resources, herd-immunity by controlled infection in the population, and cyclical lockdowns, have all been considered.
mathematical modelling can be used to compare the likely reductions in morbidity and mortality, alongside any societal costs of quarantine, mediated by each intervention. of note, of the options under consideration, only contact tracing aims to prevent transmission while explicitly minimising numbers of people in quarantine. v before the pandemic, questions about data protection, security and privacy were at or close to the top of lists of ethical concerns for many people. against that background, the use of a mobile phone app built on the gathering and sharing of proximity information, even if pseudonymised, may be seen as deeply concerning, particularly in combination with other socially restrictive measures. two important questions requiring clarification in this regard are: what is the nature of the infringement of privacy, if there is one, and can this be justified in the context of the covid-19 pandemic? starting with the question of justification, it seems clear now that some privacy infringements are potentially justifiable where they have the potential to contribute to the saving of many lives and reducing enormous suffering. imagine a scale running from to . at the end of the scale would be someone (person a) for whom privacy is the concern that trumps all others. people at this end of the scale would place privacy above all other concerns and would be unwilling to give up any privacy to achieve another goal, no matter how important. a person at the other end of the scale would be someone (person z) who has no interest at all in privacy and would willingly give up % of their privacy for any reason. person a's view is likely to be a minority position with regard to this pandemic. the scale of the suffering caused by the covid-19 pandemic means that if a case can be made that some degree of privacy infringement will save significant numbers of lives and reduce suffering, the intervention may be justified.
any such justification will depend on a clear case being made either that the privacy infringement is necessary or that it is significantly more effective than the alternatives. one aspect of a convincing attempt at justification might be the claim that the privacy infringement is less intrusive than blanket population level lockdowns for everyone. it would, however, also require a convincing case to be made that (i) any privacy impact would be minimised, (ii) that high standards of data security, protection and oversight would be in place, (iii) that there would be transparency about proposed and actual data uses, and (iv) that these would be complemented by other protections, for example, around non-discrimination. this is a useful reminder that person z's is also an ethically problematic position. of course, an important concern for many people will not only be about their privacy today during the epidemic, but also about their future privacy: will full privacy protections be reinstated after the epidemic? will data gathered now be used in unacceptable ways later? this final point highlights, importantly, the fact that any justification of infringements of privacy will need to include a convincing account of their scope and duration. the discussion above suggests not only that some constraints on liberty and on privacy may be justified in the context of a global health emergency. it also implies that there may be a tension or trade-off between them. would it, for example, be ethically justified to retain/impose a blanket lockdown on society as a whole (including those not at risk themselves or a risk to others) on privacy grounds alone?
iv subject to a number of caveats discussed later in the paper.
v the true effectiveness and sustainability of these interventions remains to be seen; benefits and harms of interventions should be evaluated post-implementation and alternative strategies (including combinations of approaches) continuously reconsidered.
much would depend on the details of the scope of the privacy infringement. however, its potential use in enabling many people now, and ultimately all, to emerge safely from a damaging lockdown provides a strong autonomy-based prima facie argument in favour of the introduction and use of the app even if it were considered to constitute a privacy infringement. it is worth noting that there are at least two ways in which the use of the app has the potential to be autonomy enhancing. the first is its potential to enable people to go about their lives freely without the constraints imposed by a lockdown. the second is that it would provide a tool to enable individual people to make informed choices about how to behave in a socially responsible way, e.g. to self-isolate as necessary to reduce the risks to others. the ideal situation would be for the downloading of the app to be voluntary and for the scale of voluntary uptake to be significant. this is a possibility given that it is believed that an uptake of below % would still, in combination with other measures, be sufficient to make an important impact. there are a number of reasons why those who have smartphones will have a strong incentive to sign up. the first of these is that this would ultimately mean that they and everyone else will emerge from the lockdown more quickly and safely. a second is that, by so doing, they will then be enabled to contribute to saving the lives of others, particularly the vulnerable, and those in caring roles, both locally and globally. appeals to a sense of 'we are all in this together' or 'solidarity' may be effective. vi a third is related to the impact on the user's own level of risk. although primarily aimed at population level impacts, if a person downloads the app, and so do their close contacts, their personal risk will be very significantly reduced.
this is because the de facto effects of app uptake will mostly act very locally except in busy urban environments such as the london underground. what if this does not work? if actual or predicted uptake is insufficient, is there an argument for the use of incentives? against this background of the scale of the current lockdown, the use of incentives to minimise the length of lockdown while also saving lives might be justified if uptake was insufficient and there was evidence that greater uptake would release large numbers of people from an avoidable lockdown. the nature of these incentives would need careful consideration on a case-by-case basis. some possible examples might include a donation to a nominated charity, or free mobile phone credit. the use of incentives inevitably raises a number of equity questions with regard to those who do not have access to suitable smartphones and would not have access to these benefits through this route. these would need to be acknowledged and addressed in any defensible policy. vii thus far, we have been considering ethical questions relating to the use of the app by individuals. there are, however, implications for institutions and professions such as those who manage care homes or places where large numbers of people congregate such as cafes and restaurants. as we emerge from the epidemic into a world in which infection rates are lower but in which there is not as yet a vaccine (a world in which the transmission of covid-19 needs to be minimised), such people might reasonably be expected to ensure that the level of risk in their establishment or workplace is minimised. in this transition period, people in these positions might reasonably be seen to have an obligation to allow entry only to people who are able to show they are low risk.
vi it is important to note that there are circumstances in which, and many people for whom, appeals to solidarity might also be exclusionary or deepening of existing social divisions.
it might reasonably be judged irresponsible of such an institution to subject residents or customers to avoidable levels of personal risk and to fail to contribute to the suppression of infection transmission in the public interest. viii this perspective suggests additional reasons for thinking that the uptake of the app might be high because there is good reason to assume that most people would want to be able to both emerge from the lockdown and also to know that when they went to work or to a café they would be safe to do so, and contributing to the safety of others. this might provide a way for professionals and institutions to meet their obligations and an additional incentive to individuals to act responsibly. if the app can be shown to offer the potential to provide information to enable individuals and those who manage institutions to ensure an intelligent and safe emergence from lockdown, there are good reasons for its use. the 'if ' here is important, however, because the app's success will depend not only on the effectiveness of the app itself but also upon the existence of complementary infrastructure such as easy access to reliable testing, support to make sustained self-isolation possible and employment protections to ensure that those who do self-isolate are protected. this suggests the need for an in-depth ethical analysis of the process of emerging from lockdown, potentially into a series of periodic lockdowns with significant impact on the lives and well-being of many people. should the data be deleted at the end of the epidemic? one way of increasing the chances that people will be willing to download the app and allow it to gather data of proximity events might be for clear legally enforceable commitments to be provided that when the epidemic is over (according to some agreed criteria) the app and its data will be deleted. 
if this is essential to create the conditions for sufficient uptake and hence for saving lives and reducing suffering, it should be considered. it is not an ethically unproblematic course of action, however. one of the most striking and disturbing aspects of the current pandemic has been the way it has revealed how poorly prepared the world and individual countries are for such an eventuality both in terms of health system resilience, availability of equipment and tests and in terms of reliable epidemiological modelling. [footnote vii: these and other equity questions are expanded on below.] [footnote viii: this paragraph needs to be understood in the context of those made in the later section on 'equity, fairness, and justice'.] against this background, it is clear that we have responsibilities not only to those who are currently suffering from covid- but also to future generations. if the app is adopted as an intervention, the data it produces could be an invaluable resource for the protection of future generations from serious harm, i.e. through research, the development of modelling methods and evaluation of the range of current responses. if these data are to be retained for such uses, a number of important questions about security, oversight, and ownership will need clear and enforceable answers. the successful and appropriate use of mobile phone apps to facilitate instantaneous contact tracing in the context of covid- in democratic countries depends on the establishment of sustained and well-founded public trust and confidence. this applies to the use of the app itself and of the data. the use of 'well-founded' here is intended to emphasise that mere presence of trust is insufficient in itself: such trust must be genuinely warranted. the requirements for well-founded trust will vary from country to country and perhaps even from person to person. 
however, in democratic contexts, in addition to the provision of clearly articulated and justified answers to the questions set out above, requirements are likely to include: the establishment of effective, transparent, accountable and inclusive oversight-perhaps by an ethics oversight body including members of the public; the agreement and publication at the outset of ethical principles by which the use of the intervention will be guided; the use of a transparent, auditable and easily explained algorithm; the highest possible standards of data security; and effective protections around the ownership uses of data. all public health emergencies and the actions taken to deal with them raise important justice questions because they are situations in which infringements of justice, discrimination and stigma commonly occur. it is also well established that the development and introduction of new technologies are capable of creating new forms of discrimination and further enhancing those that pre-existed the innovation. these can take the form of bias within the technology itself (perhaps because of biased data), biases arising out of the uses to which the technology is put and bias out of the fact that it may be available to some but not all. the response to covid- has been no different to previous public health emergencies in this regard. against this background, an important requirement for the credibility of any attempt to justify the use of the mobile phone app as part of a wider set of public health interventions to address the threat of covid- will be recognition of the importance of engaging seriously with equity and justice issues. notwithstanding the impossibility of addressing all structural issues in the compressed timescale of a pandemic, evidence is needed of a clear, actionable and ambitious plan for addressing these issues. 
once the current pandemic is over, there will inevitably be reviews of scientific, epidemiological and medical evidence about which interventions were or were not effective. if it turns out to be the case that the use of instantaneous contact tracing combined with widespread testing is effective, questions will arise about the ethical implications for its use in other infectious disease outbreaks. would it, for example, be acceptable or even required for a specifically designed app to be used each year in the context of seasonal influenza? these are important ethical questions. although there are differences, there are also morally significant similarities between covid- and seasonal influenza. for example, while its transmission rate is generally lower than covid- , the numbers of deaths internationally from seasonal influenza are very large indeed. one important difference, at present, between the two diseases is that mechanisms capable of developing a vaccine each year with some degree of effectiveness against seasonal influenza are in place. this may suggest that, unless judged less harmful or more effective than vaccination, the use of the app in seasonal influenza may not be justified. however, it is possible that apps will be appropriate in other settings and, where likely to be effective, constitute an important and ethically justified part of the public health toolkit. in this paper, we have set out a number of pressing ethical questions raised by the proposed use of a mobile phone app, the collection of proximity data for the control of the covid- pandemic, and the safe emergence of populations from government-imposed lockdowns. scientific and epidemiological evidence suggest that an app of this kind has the potential to contribute to reducing the suffering caused by the pandemic and minimise the harms caused by long periods of lockdown. these benefits and the avoidance of harms are clearly of great moral significance. 
if they are to be realised, however, several other ethical requirements need to be met. we have highlighted a number of such requirements which deserve attention in any ethically justified use of this technological intervention. in the uk, there is early empirical evidence that a high proportion of the population would choose to download the app under current circumstances, given adequate protections. in an on-line survey of predicted user-acceptance conducted by our collaborators, % of respondents said they would definitely or probably install a contact-tracing app. before they are invited to do so, they need to be assured that adequate protections and oversight are in place. a profoundly important ethical question presented by this technology concerns the problem of how and whether societies can find ways to benefit from the potential of algorithmic approaches to improve public and individual health, while also ensuring that the legacy of the deployment of these technologies does not impact negatively on future generations. correction notice: this paper has been corrected since it was first published online. there are two instances in the title and the main text where 'contact' was incorrectly spelt as 'contract'. acknowledgements: our thanks to yasmin gunaratnam, jonathan montgomery, and mariam motamedi-fraser for their helpful comments on earlier versions of this paper. funding: this study was funded by the wellcome trust ( ). patient consent for publication: not required. provenance and peer review: not commissioned; internally peer reviewed. 
key: cord- -sp o h authors: raskar, ramesh; nadeau, greg; werner, john; barbar, rachel; mehra, ashley; harp, gabriel; leopoldseder, markus; wilson, bryan; flakoll, derrick; vepakomma, praneeth; pahwa, deepti; beaudry, robson; flores, emelin; popielarz, maciej; bhatia, akanksha; nuzzo, andrea; gee, matt; summet, jay; surati, rajeev; khastgir, bikram; benedetti, francesco maria; vilcans, kristen; leis, sienna; louisy, khahlil title: covid- contact-tracing mobile apps: evaluation and assessment for decision makers date: - - journal: nan doi: nan sha: doc_id: cord_uid: sp o h a number of groups, from governments to non-profits, have quickly acted to innovate the contact-tracing process: they are designing, building, and launching contact-tracing apps in response to the covid- crisis. a diverse range of approaches exists, creating challenging choices for officials looking to implement contact-tracing technology in their community and raising concerns about these choices among citizens asked to participate in contact tracing. we are frequently asked how to evaluate and differentiate between the options for contact-tracing applications. here, we share the questions we ask about app features and plans when reviewing the many contact-tracing apps appearing on the global stage. more than , deaths are now attributed to the global covid- pandemic. many thousands more lives are expected to be lost before we have brought the disease under control and are capable of managing future spikes in the number of cases. in an effort to both slow and stop the disease, communities across the world have halted everyday life, requesting or requiring their residents to close non-essential businesses, stop going to school, and stay home. digital initiatives hope to support safe and well-considered approaches to the reopening of our societies while simultaneously reducing the human loss of life by giving frontline officials modern tools with which to control this pandemic. 
one particular set of modern digital tools aims to upgrade contact-tracing capacity, typically a lengthy and laborious process. in addition to increasing the speed with which contact-tracers can reach those who have been exposed to the disease, these tools can increase the accuracy of contact tracing. however, many first-generation digital contact-tracing tools have paved the way for a post-pandemic surveillance state and the mistreatment of private, personal information. privacy must remain at the forefront of the global response, lest short-term pandemic interventions enable long-term surveillance and abuse. the design and development of the next generation of contact-tracing tools offers an opportunity to sharply pivot to solutions using privacy-first principles and collaborative, open-source designs. these tools present an opportunity to save lives by flattening the curve of the pandemic and to provide economic relief without allowing privacy infringements now or in the future. covid- virus transmission occurs for several days before a person shows any symptoms. during this time, a person going about their daily life may interact with, and possibly pass the infection to, as many as a thousand people. without knowing they are infected, an individual who has only mild symptoms or is asymptomatic may continue to interact with others, further spreading the virus. this creates an exponential rise in infections. stopping the spread of covid- with pharmaceutical treatments and vaccines remains at least - months away from widespread availability. therefore, public health countermeasures, such as social distancing, offer the only possibility of stopping virus proliferation in the near future. when applied broadly, such measures disrupt every aspect of society and risk economic collapse. already, unemployment rates have skyrocketed, tenants are struggling to pay rent, and critical supply chains, including the food supply chain, have been interrupted. 
the longer strict social distancing measures remain in place, the more severe the consequences for economies and societies will be. however, if social distancing measures are lifted too quickly, the virus will spread once again, claiming many additional lives. the contact-tracing process evaluates the recent location history and social connections of those who become infected and notifies the people they have interacted with of their exposure to the virus. in this way, contact-tracing methods allow targeted measures (e.g., quarantining, virus testing) to be applied only to exposed individuals. traditionally, public health officials perform contact tracing manually, by interviewing patients diagnosed with a disease about their activity over the past days or weeks. then, officials reach out to people who crossed paths with the patient during the time the patient was contagious and recommend targeted interventions to prevent further spread of the disease. widespread, rapid transmission of a virus by respiratory droplets, as in the case of covid- , challenges the practicality of the traditional contact-tracing process. manual tracing is resource intensive, is time consuming, and will, at best, be limited to contacts within the social circles of the infected, and thus cannot trace strangers effectively. furthermore, the patient being interviewed is often extremely ill and at risk for memory errors during the interview. digital contact-tracing tools may help mitigate these challenges. today, almost half of the world's population carries a device, such as a smartphone, capable of gps tracking and bluetooth communication with nearby devices. each device is able to create a location trail (a timestamped log of the locations of an individual) as well as a list of anonymous id tokens that are collected when the device user crosses near another device. 
by comparing the device users' location trails or the anonymous id tokens they have collected with those from people who have covid- , one can identify others who have been near the person who is infected; this facilitates contact tracing in a more accurate and timely manner than the traditional manual approach. several pilot programs, particularly in china and south korea, have demonstrated the technical feasibility of contact-tracing applications as tools to help contain the covid- outbreak within a large population. however, these programs also highlight the very real risks that exist with the use of such technologies. a location trail and list of nearby device ids contains highly sensitive, private information about a person: everything from where they live and work and which businesses they support, to which friends and family members they visit. location data can be used to identify people who are infected and might then be targeted by their community. for example, data sent out by the south korean government to inform residents about the movements of persons recently diagnosed with covid- sparked speculations about the individuals' personal lives, from rumors of plastic surgery to infidelity and prostitution. more frightening still, enabling access to a person's location data by a third party, particularly a government, opens a path to potentially unrestrained state surveillance. in china, users suspect that an app developed to help citizens identify symptoms and their risk of carrying a pathogen was used to spy on them and share personal data with the police. care must be taken in the design of such apps. a number of groups, from governments to non-profits, have quickly acted to innovate the contact-tracing process: they are designing, building, and launching contact-tracing apps in response to the covid- crisis. 
a diverse range of approaches exists, creating challenging choices for officials looking to implement contact-tracing technology in their community and raising concerns about these choices among citizens asked to participate in contact tracing. we are frequently asked how to evaluate and differentiate between the options for contact-tracing applications. here, we share the questions we ask about app features and plans when reviewing the many contact-tracing apps appearing on the global stage. an open-source approach lets programmers and other experts outside the app development team review the code for a project. these outside programmers can make improvements, copy the code, or use it to create something entirely new. open source offers a layer of trustworthiness. because the code is publicly available, it can be reviewed by experts around the world to confirm it works the way the development team says it should. there are, at times, valid reasons to not use an open-source approach, such as when a business is seeking to develop a proprietary technology. during the covid- crisis, we believe that open-source projects promote collaboration and foster community. contact-tracing apps require the use of a data source to infer contact between two people: two of the most useful data sources are gps location data and bluetooth broadcasting. gps-based apps create a "location trail" for each user by recording their time-stamped gps location. if a person catches covid- , they can share their location trail with the responsible authority: the health worker, public health official, government official, or app creator. the authority then releases some or all of the location trail for other users to compare against their own. in some applications, the person who is infected might be able to directly share their location trail with other users. other apps rely on bluetooth to determine who the person who is infected has crossed paths with. 
such apps create a unique identifier, a number or token, which the app broadcasts to nearby devices. the user's phone then records the identifiers of other phones it has been near. if a person becomes infected, their unique identifiers can be compared to those stored by other users to determine who the infected person has crossed paths with. in some cases, such as the singapore tracetogether app, the central authority stores user information and can determine the user's phone number and identity from an identifier. in others, such as covid watch and coepi, the identifiers provided by the person who declares themselves to be infected cannot be used by the central authority to determine the person's real world identity. both approaches offer distinct advantages and challenges. gps-based approach: • allows for estimation of exposure related to surface transmission of disease. unlike bluetooth, gps-based systems can notify users if they were in a location shortly after a person infected with covid- , when the chance for exposure to the virus through commonly touched surfaces is high. • enables users to import historical data. other applications on the users' phones, such as google maps, are already collecting the users' location histories before they install the contact-tracing app. when users import these historical data, the app can alert the user to potential exposures from their location history, even before they downloaded the app. • provides redacted, anonymized gps data to help public health officials follow the spread of disease within a community. • is able to record the user's location history using a small amount of data, making scaling and implementation in regions with high data costs more likely. bluetooth-based approach: • uses signal strength, which is reduced by walls and other barriers, to estimate the distance between users. 
in some places, such as a large, multi-floor building, this estimate more accurately reflects the chance of exposure to disease than a gps-based approach. • uses time-range-dependent, randomly generated numbers as ids to ideally achieve relative anonymity. • requires the use of a compatible app by other users to record possible exposures. if an app is not widely adopted, the potential utility is limited. • offers no potential to collect historical data from before the user downloaded the app. in the near future, some solutions, including covid safe paths, will integrate both approaches, allowing the user to harness the advantages of each while mitigating some challenges. examples of apps by data source: both gps and bluetooth: aarogya setu (india); bluetooth: tracetogether (singapore). some bluetooth-based apps use a fixed identifier, meaning the unique number assigned to the device does not change and is permanently associated with the user. time-variable identifiers change on a set time interval, such as an hour, so each user is associated with many different identifiers. the use of time-variable identifiers adds a layer of privacy protection by making it difficult for a third party to track a particular phone over time based upon a single identifier. in a centralized version of contact tracing, location and contact data are collected and consolidated centrally by a single authority, often a government entity. china utilized a centralized approach with its app. other information about the user, such as mobile telecommunication service provider or payment data, may be collected and paired with the location data. the central authority identifies people who are infected, determines their contacts, and requests specific actions by those who may have been exposed to the virus. centralized systems create powerful tools for analysis and public health decision making. however, such systems also expose a person's data to a central authority, creating an opportunity to undermine the person's privacy. 
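the time-variable identifiers described above can be illustrated with a short sketch. the python below is a minimal illustration under stated assumptions, not the scheme used by any app named in this document: it assumes each device holds a random secret that never leaves the phone and derives one identifier per time interval with an hmac. the one-hour interval follows the example given in the text; the function names, the 16-byte truncation, and the use of hmac-sha256 are hypothetical choices made only for this example.

```python
import hashlib
import hmac
import secrets

INTERVAL_SECONDS = 3600  # assumed rotation interval: one hour, as in the text's example


def new_device_secret() -> bytes:
    """Create a random per-device secret (hypothetical design; it stays on the phone)."""
    return secrets.token_bytes(32)


def interval_index(timestamp: float, interval: int = INTERVAL_SECONDS) -> int:
    """Map a Unix timestamp to the number of the interval that contains it."""
    return int(timestamp // interval)


def rotating_id(secret: bytes, timestamp: float, interval: int = INTERVAL_SECONDS) -> str:
    """Derive the identifier broadcast during the interval containing `timestamp`."""
    idx = interval_index(timestamp, interval)
    digest = hmac.new(secret, idx.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[:16].hex()  # truncated for a compact broadcast payload (assumption)


def ids_for_period(secret: bytes, start: float, end: float,
                   interval: int = INTERVAL_SECONDS) -> list[str]:
    """List every identifier the device used between `start` and `end` (inclusive)."""
    first = interval_index(start, interval)
    last = interval_index(end, interval)
    return [rotating_id(secret, i * interval, interval) for i in range(first, last + 1)]


if __name__ == "__main__":
    secret = new_device_secret()
    print(rotating_id(secret, 0.0))          # identifier for the first hour
    print(rotating_id(secret, 3600.0))       # a different identifier one hour later
    print(len(ids_for_period(secret, 0.0, 7199.0)))  # two intervals are covered
```

because an observer who does not hold the secret cannot link two identifiers from different intervals, following one phone over time requires more than a single overheard identifier, which is the privacy property the text attributes to time-variable identifiers.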
in a decentralized approach, the healthy user's data never goes to a central server. location data are stored and processed on the phone of the user. only the location data of people confirmed to be infected need to be shared. tools, such as redaction and blurring of the infected person's data, can be used to help preserve their privacy. an israeli app, track virus, is an example of a decentralized approach, as is covid safe paths. decentralized systems typically offer greater privacy protection and are, therefore, more in line with privacy requirements and regulations such as gdpr. some utility may be lost compared to centralized systems as collection and aggregation of large data sets from users can be used for beneficial public health research. however, as we consider the various approaches, the grave privacy risks associated with centralized systems far outweigh the limited additional benefits, leading us to highly value decentralized approaches. when checking if a healthy user has been exposed to covid- , contact-tracing apps may either push the healthy user's data to the authority (centralized processing) or pull a list of locations and/or contact ids of those who have been infected from the authority (decentralized processing). with a push, the healthy user's data is pushed (shared) off of the user's device and is compared by the authority to the data of people who have been infected. this exposes a large amount of data to the authority. in a pull model, an anonymized history of location data or identifiers from people who have been infected are pulled onto the healthy user's device so that the comparison can take place locally without compromising the privacy of healthy individuals. given what is known to date about person-to-person transmission of covid- , contact-tracing apps can properly assess users' potential exposure to the virus if they take four important factors into consideration: • the distance between the person who is infected and the user. 
• the length of time the person who is infected and the user occupied the same space. • how many days prior to becoming infected the person interacted with the user. • whether or not the user may have had contact with contaminated surfaces after interaction with the person who is infected. a location history must be collected from a person who has been diagnosed with covid- in order for contact tracing to occur. several approaches are being piloted. in general, these approaches fall into two categories: • an authority (public health official, healthcare provider, government official) collects the location history from the person who is infected and makes it available to users of the app. • the patient self-reports symptoms and directly shares their data with other users of the app. use of an authority offers the advantage of confirmation that the person has covid- . the overlap of symptoms between covid- and other common respiratory illnesses might cause someone to suspect they have covid- when they actually have the flu or a common cold. systems where people self-report themselves as infected pose the risk that people with symptoms, but without a confirmed diagnosis, share their location trail. self-reporting approaches are also at risk from bad actors who may misreport their status as infected in order to create chaos and fear. however, self-reporting systems have the advantage of fuller consent of the infected person as the person definitively decides to share their location trail without influence from an authority figure. when evaluating contact-tracing solutions, we seek to understand how data will be collected from the person who is infected and how the solution will confirm that the person truly has covid- . at the base of every contact-tracing app lies an algorithm that determines whether the app user has been exposed to people who are infected and might have an increased chance of being infected themselves. 
the algorithm integrates many factors, such as the distance between the users, the length of time the users were in the same location, or the amount of time between the contact and the start of symptoms. two apps with different algorithms will potentially give a different likelihood of exposure to the same user. understanding the algorithm used is necessary for public health officials and healthcare providers to provide appropriate guidance to users who receive an exposure notification. contact-tracing app developers must clearly communicate their algorithm with all stakeholders and failure to do so will be a significant red flag. location data may potentially be repurposed to achieve additional objectives beyond contact tracing. we believe these data should be used only for response to an ongoing pandemic and that other uses should be strictly forbidden. turning app data over to law enforcement or other non-health actors, such as commercial entities seeking to target ads to potential customers, threatens users' rights and privacy. critically, this undermines public trust. without trust, citizens will not adopt contact-tracing apps at a wide enough scale to effectively control the spread of the epidemic. therefore, access to location-tracking data should be tightly limited to specific public health initiatives working on pandemic response. users should be able to confirm how their data is used. promises by the app's developers to delete data are insufficient. users should be able to check exactly what location data has been collected and stored and to confirm that their data is no longer there after the deadline for deletion (the disease's incubation period, to days for coronavirus). apps must obtain users' unforced and informed consent for any disclosure of their data. 
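to make the idea of an exposure algorithm concrete, the following python sketch combines the three factors named above (distance, length of time in the same location, and time between the contact and the start of symptoms) into a single score. it is an illustration only: every threshold, the linear weighting, and the names `Contact`, `exposure_score`, and `classify` are assumptions made for this example and do not reflect the algorithm of any app discussed here or any epidemiological guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    distance_m: float        # estimated distance between the two people, in metres
    duration_min: float      # minutes spent in the same place
    days_from_onset: float   # days between the contact and the infected person's symptom onset


def distance_factor(distance_m: float, near: float = 2.0, far: float = 10.0) -> float:
    """1.0 at or below `near`, 0.0 at or beyond `far`, linear in between (assumed shape)."""
    if distance_m <= near:
        return 1.0
    if distance_m >= far:
        return 0.0
    return (far - distance_m) / (far - near)


def duration_factor(duration_min: float, full: float = 15.0) -> float:
    """Grows linearly with time together and saturates at `full` minutes (assumed)."""
    return min(max(duration_min, 0.0) / full, 1.0)


def timing_factor(days_from_onset: float, window: float = 14.0) -> float:
    """Highest near symptom onset, falling to 0.0 at `window` days away (assumed)."""
    days = abs(days_from_onset)
    if days >= window:
        return 0.0
    return 1.0 - days / window


def exposure_score(contact: Contact) -> float:
    """Multiply the three factors into a score between 0.0 and 1.0."""
    return (distance_factor(contact.distance_m)
            * duration_factor(contact.duration_min)
            * timing_factor(contact.days_from_onset))


def classify(score: float, threshold: float = 0.5) -> str:
    """Turn a score into a user-facing decision using an assumed threshold."""
    return "notify" if score >= threshold else "no notification"


if __name__ == "__main__":
    close = Contact(distance_m=1.0, duration_min=30.0, days_from_onset=0.0)
    distant = Contact(distance_m=6.0, duration_min=15.0, days_from_onset=7.0)
    for contact in (close, distant):
        score = exposure_score(contact)
        print(contact, round(score, 2), classify(score))
```

changing any default (for example the `far` distance or the notification threshold) changes the result for the same contact, which is why two apps can report different likelihoods of exposure to the same user and why the algorithm needs to be communicated to stakeholders.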
recently, the a telekom austria group shared aggregated user location data from an app not regularly used for public health purposes with the austrian government's covid- emergency management team for reasons that were not initially specified. observers believe that a 's data was most likely being used to forecast disease spread or to monitor the population for large gatherings that might transmit the virus. however, the sharing of location data with government agencies for unspecified purposes attracted the criticism of privacy rights activists and created suspicions that weakened user trust, threatening long-term success. an opportunity for misuse and privacy violations arises whenever a third party, a government, a corporation, or any other entity is able to access the data of healthy users. a decentralized approach prevents privacy compromise for healthy users because all the calculations are done on their own phones. time-limited storage of location data also protects user privacy, such as only storing days of data with deletion of everything beyond this point. all contact-tracing app development teams should clearly articulate how they protect the privacy of all users, whether healthy or infected. as an example, a preliminary draft of the privacy principles of the covid safe paths team can be accessed in covid- contact tracing privacy principles. this overview of model privacy practices explains how the application embraces principles such as privacy by design, the fair information practice principles (fipps), and legal protection by design. historical location data and nearby device ids must be collected from a person who is infected to enable contact tracing. however, both the collection and release of that information have broad implications for the privacy rights of the individual. because the person who is infected is the most vulnerable stakeholder, several efforts must be undertaken to protect their privacy to the highest degree possible. 
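two of the protections mentioned above, on-device calculation for healthy users and time-limited storage, can be sketched together. the python below is a minimal illustration under stated assumptions: the class `LocalStore`, its methods, and the idea of matching on opaque identifier strings are hypothetical, and the retention period is a parameter because the number of days is not given in the text (the value used in the example run is arbitrary).

```python
from dataclasses import dataclass, field

SECONDS_PER_DAY = 86_400


@dataclass
class LocalStore:
    """On-device record of identifiers seen nearby (hypothetical structure)."""
    retention_days: int
    observed: dict[str, float] = field(default_factory=dict)  # identifier -> last time seen

    def record(self, identifier: str, seen_at: float) -> None:
        """Remember the most recent time an identifier was observed."""
        previous = self.observed.get(identifier, seen_at)
        self.observed[identifier] = max(previous, seen_at)

    def purge(self, now: float) -> int:
        """Delete every entry older than the retention period; return how many were removed."""
        cutoff = now - self.retention_days * SECONDS_PER_DAY
        stale = [ident for ident, seen in self.observed.items() if seen < cutoff]
        for ident in stale:
            del self.observed[ident]
        return len(stale)

    def match(self, infected_ids: set[str], now: float) -> set[str]:
        """Compare a pulled list of infected identifiers with local data, on the device."""
        self.purge(now)
        return set(self.observed) & set(infected_ids)


if __name__ == "__main__":
    store = LocalStore(retention_days=14)  # arbitrary illustrative value
    store.record("id-a", 0.0)
    store.record("id-b", 20 * SECONDS_PER_DAY)
    now = 21 * SECONDS_PER_DAY
    # The list of infected identifiers is pulled from the authority; nothing is uploaded.
    print(store.match({"id-a", "id-b", "id-c"}, now))
    print(sorted(store.observed))
```

in this sketch the healthy user's observations never leave the `LocalStore`, and a user could inspect `observed` after the deadline to confirm that older entries are gone.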
app development teams may design for privacy by utilizing a variety of approaches: • providing users with the ability to correct incorrect information. • notifying individuals about what data is collected, how long it is stored, and who will have access to it during each stage of use. • enabling people to obtain access to information about potential exposures to covid- without requiring that they consent to share their data with other parties. • deleting user location data after it is no longer necessary to perform contact tracing. • aligning with the fair information practice principles. • using open-source software to foster trust in the app's privacy protection claims. • limiting the amount of data published publicly. • providing tools that allow the person who has been diagnosed and their healthcare providers to redact any sensitive locations, such as a home or workplace. • applying end-to-end encryption to location data before sensitive locations are redacted. • eliminating the risk of third-party access to information by enabling voluntary self-reporting by the person who is infected. • supporting strict regulation around access to and usage of the data by any entity that collects it, particularly governments. • obtaining targeted, affirmative, informed consent for each use of the person's data. • providing users with the ability to see how their data is being used and revoke consent for usage of their information. requiring people who are infected or potentially infected to track their movements and disclose their contacts achieves the highest degree of efficacy in contact tracing within a community. however, if residents cannot choose to at least selectively withhold their information, they may be stigmatized, persecuted, or exploited by malicious actors on the basis of their data. voluntary reporting respects users' rights to privacy and to informed consent. it encourages app developers to include safeguards that reduce the risk for abuse of sensitive data. 
however, when individuals who become infected refuse to share their contact-tracing data, the accuracy of contact tracing declines, potentially contributing to misinformation and a false sense of security. we believe that no one should be forced to relinquish highly sensitive personal data. we dislike solutions that require potential users to consent to share their data if they become infected in order to access information about whether or not they have crossed paths with someone who was infected. incentives such as those outlined in the following sections should be implemented to encourage users who become infected to share their data. people who are healthy should also proactively choose to use a contact-tracing app rather than being mandated to do so. potential users should be encouraged to do so by incentives, such as the opportunity to take control of their information to benefit their health, strong privacy protection policies, trust in the app's developers, clear communication, and informed consent. in order to roll out a contact-tracing app on a global scale, three groups must work together: a substantial team to create and promote the app; large, trusted institutions to support development and deployment of the app; and local, on-the-ground partners in the various communities in which the app is deployed. contact-tracing apps are tools, not complete solutions. disease containment utilizing these tools requires multidisciplinary collaborations across the technology, healthcare, public health, and government sectors. we are working hard to create these partnerships for covid safe paths and look for such partnerships in other apps we evaluate. among the partnerships teams should be seeking to build are: • cloud players (aws, azure, gcp, etc.) • mobile carriers and local telecommunications providers. 
• partnerships with health authorities; these partnerships are particularly important in light of app store requirements for all apps addressing the covid- pandemic to have the support of a health organization • government agencies • local public health workers and healthcare providers: contact-tracing apps will only succeed if those who crossed paths with someone who became infected can receive guidance and support from local providers on what steps to take to protect themselves and their families. • current contact tracers; integrating into the current contact-tracing protocol increases the effectiveness of a contact-tracing app within a community • non-profit organizations and academic institutions we see apps aiming to deploy at a variety of levels, from a single city to an entire nation to those aiming for a global reach. regardless of the level at which they are deployed, contact-tracing apps must be paired with existing infrastructure in order to support a successful containment strategy. public health officials and healthcare providers must be ready to answer user questions, offer testing, or provide advice about what to do if someone has been exposed to a person with covid- . the resources and support necessary to follow this advice must also be made available. we look for well-considered deployment strategies with aggressive outreach to local partners. for this reason, we are building not only a contact-tracing app, but also safe places, a web-based tool for public health officials working to contain the covid- pandemic. it is also worth noting that as global travel resumes, cross-communication between apps operating in different regions will be necessary to achieve global containment of covid- . we look for teams that are thinking ahead and building the technological foundation for this collaboration into their application. taking any software tool from idea to widespread solution requires the team to think creatively. 
contact-tracing apps gain value with each additional user. many approaches to encouraging user adoption exist, and good teams will use a variety of them. a few steps we encourage are: • fostering trust • developing key partnerships, including with community officials who can help drive local support for the solution • creating solutions that meet the needs of public health officials responding to the pandemic • focusing on the needs of the users • providing value to the user during a contact-tracing interview even if they choose not to download the app before they have been diagnosed with covid- contact-tracing apps need a strong value proposition for each stakeholder-the healthy user, the person who is infected, the public health worker responsible for contact tracing, the public health authority responsible for the community's response to the pandemic, and government officials tasked with coordinating the local or national response to covid- . as an example, the incentives for each stakeholder from the safe paths solution are presented here. offers an opportunity to take control and gain information. the user is able to make decisions about where they should be going and what activities are safe for their families and themselves. users are more confident and more informed about their actual risk of spreading the disease. gives the ability to quickly and accurately share location history with public health contact tracers. sharing their history offers an opportunity to help protect their community. gives immediate relief to contact tracers. provides a tool to more efficiently conduct interviews and gather information from patients. increases data accuracy over current methods (e.g., remembering). enables them to work with infected patients to quickly remove information that the patient asserts is personal, private, and/or confidential. allows more efficient and more accurate data collection and analysis about the spread of covid- within their jurisdiction. 
provides data to make better, more targeted recommendations for intervention to their community and to utilize limited testing resources most constructively. offers an opportunity to communicate a personalized risk profile to each citizen, answering the question "should i be concerned or not?" for every individual in their constituency and to closely monitor those who have the highest chance of experiencing complications from covid- . faster and more accurate contact tracing allows officials to catch up with the virus and more effectively deploy resources. rather than undifferentiated application of lockdown measures risking economic and subsequent financial collapse, officials are able to implement a differentiated approach with targeted measures as recommended by the who. the utmost care must be taken when notifying users of a potential exposure to covid- given the serious health, economic, and social consequences of a notification. during this stressful time, clear, easy-to-understand communication reduces the possibility for the user to misjudge their situation. high-quality translations should be available for all users. transparency about how the decision to notify the user was made helps the user and their public health officials make decisions about whether and which containment measures the user needs to undertake. notifications should evolve to reflect advances in the understanding of disease transmission as scientists around the world continue to clarify how covid- passes from person to person. contact-tracing apps, particularly those that allow individuals to self-report themselves as infected, must address the risk that some people will make fraudulent reports. in some instances, a false report may be done in good faith-the person truly suspects they have covid- , but they have not undergone definitive testing and actually have a different virus. in other cases, bad actors may report themselves as infected with covid- in order to create chaos. 
storing sensitive information in an anonymized, redacted, and aggregated manner minimizes the risk of data-tampering, yet it does not eliminate the chance for human error or malicious intervention. one approach to reducing fraud requires the diagnosis to be confirmed by a healthcare provider. however, creative teams may find other ways to prevent false reports of illness. with large-scale deployment, most apps will experience an occasional false report or find an error in an otherwise correct report. each app should develop a protocol for its response when an incorrect report is identified. easy-to-use tools should allow all involved in reporting to quickly mark and remove errors as soon as the false report is identified. most often, users should be notified of the change in their exposure history. while most apps aim to obscure the identity of the person who is infected, accidental release of information sufficient to identify the person can occur on rare occasions, similar to accidental release of protected health information. these low risks should be communicated to the users during the consent process. a process for quickly removing identifiable information from public access should be in place. notification of a potential exposure to covid- will be frightening to many, particularly those at increased risk for serious complications, and may lead to panic among users. large groups of people seeking medical evaluation or demanding testing could quickly overwhelm an already strained healthcare system. we have seen panic related to the pandemic lead to hoarding and vigilantism. conversely, users who are not notified of a potential exposure may assume they are at no risk to catch covid- and disregard critical social distancing and hygiene recommendations. any contact-tracing solution will need to provide users with accurate information to reduce the chance for panic or risky behavior. 
when reviewing an app, we look for the following: • clear, easy-to-understand, culturally appropriate communication with the user • engagement of epidemiologists, public health officials, and healthcare providers, both as core members of the decision-making team and as local partners within the community to which the app is deployed, in order to provide assessment and recommendations to people who may have been exposed to covid- • measures to prevent individuals from falsely reporting themselves infected and thoughtful consideration of how a person reported to be infected is confirmed to have covid- • use of both gps and bluetooth systems, utilizing the strengths of each technology • creative algorithms that reduce the chance that insignificant exposures are flagged contact-tracing apps should be viewed as a tool to be utilized by experts in infectious disease control. epidemiologists, public health officials, and healthcare providers must be core members of any team designing and implementing a contact-tracing app. we look to see that such experts are included as team members, mentors, and strategic partners. ideally, contact-tracing apps should fit into the current care pathway. one of the leaders in this area is tracetogether in singapore, which supports a contact-tracing process put in place long before the app was ready. tracetogether uses bluetooth to identify nearby phones with the app installed and tracks both proximity and timestamps. if a person is diagnosed with covid- , they can choose to allow the ministry of health to access their tracetogether data, which is then used by the manual contact-tracing team to alert those who may have been exposed. we also aim to lead in this area with the development of covid safe places, a web-tool allowing public health officials to work more quickly, collect better data, and better respond to what is happening in their community. 
we are partnering with public health workers around the world to deploy covid safe places. the success of any contact-tracing program should be measured in lives saved. lives are saved both by a reduction in the spread of disease and by a reduction in the psychosocial and economic consequences of widespread quarantine actions. quantitative analysis of the effect of this new technology should be undertaken-not only to allow for further improvements during the current covid- pandemic, but also to better address the next outbreak of infectious disease. in addition to collecting real-world data about the impact of contact-tracing apps, teams should work to communicate their success to the public. if the apps are effective in helping to control the pandemic, the public may fail to notice the extent to which their use was critical to the community's ability to control the spread of disease. the covid- pandemic will not last forever. if we falter in our response and choose digital contact-tracing tools that compromise individual privacy for efficacy, the consequences will extend long after the last store has reopened and the last child has returned to school. we believe privacy does not have to be compromised in order to reduce new infections and slow the spread of disease. we are building covid safe paths with privacy protection at the forefront for this pandemic and the next. here, we have begun to detail the key questions that should be asked as we evaluate contact-tracing apps developed and deployed against the covid- pandemic. we plan to continue this discussion and are committed to serving as a resource for countries, states, cities, and individuals throughout the world. we welcome additions to and modifications of this report and analysis. 
to submit a change, please email info@pathcheck.org. 
key: cord- -m f de authors: trivedi, amee; zakaria, camellia; balan, rajesh; shenoy, prashant title: wifitrace: network-based contact tracing for infectious diseases using passive wifi sensing date: - - journal: nan doi: nan sha: doc_id: cord_uid: m f de contact tracing is a well-established and effective approach for containing the spread of infectious diseases. while bluetooth-based contact tracing methods using phones have become popular recently, these approaches suffer from the need for a critical mass of adoption in order to be effective. in this paper, we present wifitrace, a network-centric approach for contact tracing that relies on passive wifi sensing with no client-side involvement. our approach exploits wifi network logs gathered by enterprise networks for performance and security monitoring and uses them to reconstruct device trajectories for contact tracing. our approach is specifically designed to enhance the efficacy of traditional methods, rather than to supplant them with a new technology. we design an efficient graph algorithm to scale our approach to large networks with tens of thousands of users. we have implemented a full prototype of our system and deployed it on two large university campuses. we validate our approach and demonstrate its efficacy using case studies and detailed experiments using real-world wifi datasets. approaches use bluetooth for proximity sensing, sometimes in combination with gps and other locationing techniques present on the phone for location sensing [ ] . in this paper, we present an alternative network-centric approach for phone-based contact tracing. in contrast to client-side approaches that depend on the use of bluetooth and mobile apps, a network-centric approach does not require data collection to be performed on the device or apps to be downloaded by the user on the phone. 
instead, users use their phone or mobile device normally and the approach uses the network's view of the user to infer their location and proximity to others. our approach is based on wifi sensing [ , ] and leverages, for contact tracing, data such as system logs ("syslogs") that are already generated by enterprise wifi networks. although our approach does not require the use of wifi location [ ] , such techniques, where available, can further enhance the efficacy of our approach. our network-centric approach to contact tracing offers a different set of trade-offs and privacy considerations than bluetooth-based client-centric methods; one of the goals of our work is to carefully analyze these tradeoffs. the following scenario presents an illustrative use case of how our approach works. consider a student who visits the university health clinic and is diagnosed with an infectious disease. the university health clinic officials decide to perform contact tracing and seek the consent of the student for network-based contact tracing. since the user could have transmitted the disease to others over the past several days, it is important to determine what campus buildings and specific locations within each building were visited by the student during that period and which other users were in the proximity of the student during those periods. the health officials input the wifi mac address of the student's phone into the network-centric contact tracing tool. the tool analyzes wifi logs generated by the network, and specifically association and dissociation log messages for this device, at various access points on campus to reconstruct the locations (building, room numbers) visited by the user. it further analyzes all other users who were associated with those access points at those times to determine users who were in proximity of the user and for how long. these location and proximity reports are used by health officials to assist with contact tracing. 
additional reports for each impacted user can be recursively generated. in designing, implementing, deploying, and evaluating our network-centric contact tracing tool, our paper makes the following contributions: • we present a network-side contact tracing method that involves passive wifi sensing and no client-side involvement. we discuss why such an approach may be preferable in some environments, such as academic or corporate campuses, over client-side methods. • we present a graph-based model and graph algorithms for efficiently performing contact tracing on passive wifi data comprising tens of thousands of users. • we implement a full prototype of our system and deploy it on two large university campuses on two different continents. • we validate and experimentally evaluate our approach using anonymized data from two large university networks. our results show the efficacy of contact tracing for three simulated diseases and highlight the need to judiciously choose wifi session parameters to reduce both false positives and false negatives. through case studies, we show the efficacy of judicious iterative contact tracing while avoiding an exponential increase in co-located users who need to be traced, and also evaluate our approach for normal campus mobility patterns and mobility patterns under quarantine. we show that our graph-based approach can scale to settings with tens of thousands of users and also present the limitations of using wifi sensing for contact tracing. in this section, we provide background on contact tracing and present motivation for our network-centric approach. 
[figure: a user's arrivals and departures at several locations over a day, with the corresponding wifitrace location report and wifitrace proximity report.] contact tracing: contact tracing is a well-established method that is used by health professionals to track down the source of an infection and take pro-active measures to contain its spread [ ] . the traditional method is based on questionnaires: upon diagnosis, the user is asked to list places visited and other people with whom they have had contact, and this information is used to iteratively contact these individuals and so on [ ] . the goals of contact tracing are two-fold: identify the potential source of infection for the diagnosed individual and determine others who may have gotten infected due to proximity or contact. since there is often a to day incubation period between the time of infection and onset of the illness, infected users often need to use their recollection of where they have been over multiple days or weeks, a process that can be error-prone due to gaps in memory. the manual process is challenging to scale up to larger numbers of users, especially for larger outbreaks of disease. phone-based contact tracing: since smartphones are now ubiquitous, the use of phone-based sensing for contact tracing has emerged as a key technology to automate and scale the contact tracing process [ , ] . the most common method involves the use of bluetooth to transmit a unique (and often anonymized) identifier from each phone. a phone also listens for such identifiers from other phones in its proximity. thus, the device can determine which other users/phones are in its proximity at each time instant. when a particular user contracts an infection, their device id is used by others to determine if they have been in the proximity of the infected user. this basic approach has been implemented by apple and google into their contact tracing api [ ]. 
many standalone contact tracing apps have also implemented this approach, which also involves having each phone upload its collected data to a server for contact tracing analysis [ , , , , ] . we note that such a client-centric approach requires a user to first download a mobile app before contact tracing data can be gathered: users who have not downloaded the app (or have not opted in) are not visible to other phones that are actively listening for other devices in their proximity. thus the overall effectiveness of the approach depends on the level of user adoption. this is seen as a key hurdle from the experience of singapore's tracetogether app [ ] , which has seen only . million downloads despite needing a critical mass of million active users (around two-thirds of the population) to be effective [ ] . health experts have argued that while technology-based contact tracing solutions are useful, they should not be seen as a replacement for traditional means of contact tracing, which is still an effective approach [ ] . our network-centric approach is designed to address these issues. first, it is designed to help health professionals improve traditional contact tracing methods, rather than supplant manual contact tracing using technology. our network-centric tool is designed to integrate into health professionals' contact tracing workflows; unlike some bluetooth apps, it is not designed for end-users to self-monitor their proximity to infected users. second, a network-centric approach overcomes the critical mass adoption hurdles faced by bluetooth approaches, since it is based on passive wifi sensing that does not require any app to be downloaded by users or require active client participation. with near-ubiquitous availability of wifi in environments such as offices, university campuses, and shopping areas, wifi sensing has emerged as a popular approach for addressing a range of analytic tasks [ , ] . wifi sensing can be client-based (i.e. 
done on the mobile device) or network-based (i.e. done from the network's perspective). performing triangulation via rssi or time of flight measurements to multiple wifi access points to localize a device's position is an example of client-side wifi sensing [ ] . in contrast, network-centric wifi sensing involves using the network's view of one or more devices to perform analytics. the approach has been used for monitoring the mobility of wifi devices by analyzing the sequence of the access points that see the same device over a period of time [ ] . while mobility characterization and modeling using wifi sensing has seen more than a decade of research [ , ] , more recent work has leveraged wifi sensing for a range of analytic tasks such as tracking health [ ] , stress [ ] , retail analytics [ ] and more. we build on this prior body of work and focus on the network-centric approach for contact tracing. the key premise of the approach is that the mobility of a user's phone is visible to the network through the sequence of wifi access point associations performed by the device as the user moves, which allows the network to determine the locations visited by the user's device and other co-located devices that were present at those locations by virtue of being associated with those aps. thus, the approach relies on passive wifi sensing by passively observing devices as they move through the network. there are some key advantages of such an approach over a client-centric approach. first, unlike a client-based approach that needs a critical mass of users to opt in or download an app before proximity can be effectively determined, the wireless network can "see" all devices that are connected (associated) to it at all times. hence, a network-centric method is easier to deploy and scale to large numbers of users without any initial deployment hurdles. second, the client-centric approach involves data collection on each device for proximity sensing. 
by its very nature, a network-centric approach does not require any data to be collected on the device. in many cases, the approach may not even require any additional data to be collected by the network. this is because this method relies on syslog of network events, snmp reports, or rtls events that are routinely logged by many enterprise networks for purposes of performance and security monitoring. our network-centric approach "mines" this already logged data for performing contact tracing. of course, our approach does require network logging of ap events by the network if this information is not already being logged. third, a client-centric method uses bluetooth for proximity sensing and must use a second sensing modality such as gps for sensing location where those devices were seen. in contrast, a network-centric approach can use a single modality, wifi sensing, to determine the location (based on the ap locations) and proximity (based on ap associations). note that methods like gps do not work well inside buildings, while passive wifi sensing can provide ap-level locations of users even without any additional wifi locationing technology. however, the approach is not without challenges. bluetooth-based approaches claim to sense other devices that are within a few feet of the user, which is then used for proximity analysis. in contrast, passive wifi sensing is limited to coarse-grain proximity measurements (e.g., users co-located within the range of an access point). coarse-grain proximity sensing can increase false positives; hence the approach uses the duration of proximate co-location as an indicator of risk of infection, and this duration can be determined as accurately as with bluetooth. 
moreover, since we designed our approach to enhance traditional contact tracing, rather than replace it, coarse-grain proximity information, along with co-location duration, is still useful to health professionals for identifying users who should be subjected to traditional contact tracing checks. wifi-based contact tracing only works in areas with wifi coverage, which are largely indoor spaces and a few key outdoor spaces. this method does not work outdoors where no wifi coverage is available. in contrast, bluetooth methods work "everywhere", both indoors and outdoors, since they involve listening to other devices and do not depend on a network. while this is a key limitation of network-centric approaches, they are nevertheless effective in university campuses or corporate environments where employees spend a significant portion of their day. finally, all contact tracing methods, whether client or network-based, raise important privacy concerns. however, privacy considerations of network-based methods are different from those of bluetooth-based client methods [ ] . we discuss these in detail in section § and show how user privacy can be safeguarded in such methods. the deployment of network-centric contact tracing technology raises privacy issues, which we discuss in section . ethical considerations that came up during the design of this technology are discussed here. data collection for experimentally validating the efficacy of our approach has been approved by our institutional review board (irb) and is conducted under a data usage agreement (dua) with the campus network it group that restricts and safeguards all the wifi data collected. to avoid any leakage of private data, all the mac ids and usernames in the syslogs are anonymized using a strong hashing algorithm. the hashing is performed before syslog data is stored on disk under the guidance of the it manager who is the only person aware of the hash key of the algorithm. 
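the keyed anonymization step described above can be sketched as follows. this is a minimal illustration, not the study's actual procedure: the text only says that a strong hashing algorithm with a secret hash key is used, so the choice of hmac-sha256, the function names `anonymize` and `anonymize_record`, and the record field names `device_mac` and `user_id` are all assumptions made for the example.

```python
import hashlib
import hmac


def anonymize(identifier: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) of a MAC address or username.

    Assumption: HMAC-SHA256 stands in for the unspecified "strong hashing
    algorithm"; the key plays the role of the secret hash key.
    """
    # Normalize so that "AA:BB" and "aa:bb " map to the same pseudonym.
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def anonymize_record(record: dict, key: bytes) -> dict:
    """Replace identifying fields of one log record before it is stored.

    The field names are hypothetical; the AP identifier is left untouched
    because AP locations are needed later for contact tracing.
    """
    out = dict(record)
    for field in ("device_mac", "user_id"):
        if out.get(field):
            out[field] = anonymize(out[field], key)
    return out
```

because the same identifier always maps to the same digest under a given key, sessions of one device can still be linked across log records, while recovering the original mac address requires the key.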
any data analysis that results in the de-anonymization of the users is strictly prohibited under the irb and signed dua. users on the usa campus involved in the data collection consent to an acceptable use it policy, that permits the campus it department to collect network-level syslog data events for a system diagnosis or analysis of cyber-attacks on the enterprise network. additionally, all researchers sign a form of consent to adhere to the signed irb and dua and undergo mandatory ethics training. in short, the data used to validate and evaluate our approach prior to its actual deployment is anonymized and subjected to multiple safeguards as part of an irb-approved study. this section presents an overview of our approach, followed by the details of our graph-based contact tracing algorithm. fig depicts the architectural overview of our contact tracing system. the system uses a three-tier pipelined architecture. the data collection tier uses network logging capabilities that are already present in enterprise wifi systems to collect the wifi logs of device associations to access points within the network. many enterprise it administrators already collect this data for network monitoring, in which case this data can simply be fed to the next tier in the pipeline. otherwise, the it admins need to turn on logging to start gathering this data. the next tier in the pipeline ingests this raw data and converts it into a standard intermediate format. in other words, this tier performs pre-processing of the data. since the raw log files will have vendor-specific formats, this tier implements vendor-specific pre-processing modules that are specific to each wifi manufacturer and its logging format. this tier processes log files in batches every so often and generates data in intermediate form. our final tier ingests the data produced by the vendor-specific pre-processor and creates a graph structure that captures the trajectories of user devices. 
it exposes a query interface for contact tracing. for each query, it uses the computed trajectories over the query duration to produce (i) a location report listing locations visited by the infected user and (ii) a proximity report listing users who were in proximity of that user and for how long. as discussed below, this tier uses time-evolving graphs and efficient graph algorithms to efficiently intersect trajectories of a large number of devices (typically tens of thousands of users that may be present on a university campus) to produce its report. consider a wifi network with n wireless access points that serves m users with d devices. we assume that the n access points are distributed across buildings and other key spaces in an academic or corporate campus and that the location of each access point (e.g., building, floor, room) is known. large enterprises such as a residential university will comprise thousands of access points (our work is based on deployment and data from two large universities, one based in the northeastern usa that comprises access points and one based in singapore that comprises , access points). [figure: an example contact tracing report produced by our tool: (a) patient report, showing the locations visited by a user (janedoe) between the query start and end times; (b) proximity report, displaying co-located users in descending order of total co-location time.] the number of users and devices seen in such networks is typically in the order of tens of thousands. to manage such a large network, enterprise wifi networks use controller nodes that have the capability to administer and manage the aps and the network traffic, along with detailed logging and reporting capabilities. 
as a user moves from one location to another, their mobile device (typically a phone) associates with a nearby access point. since the locations of aps (building, floor, room) are known, the sequence of ap associations over the course of a day reveals the trajectory of the user and the visited locations. to reconstruct this trajectory we assume that the wifi network logs contain association and disassociation events as seen by each ap. typically this information is of the form: timestamp, ap mac address, device mac, optional user-id, event-type, where event-type can be one of association, disassociation, reassociation, authorization, and unauthorization. typically when a device switches to a new ap due to user mobility, this is visible to the network in the form of a disassociation with the previous ap and an association with a new ap. given this log information, contact tracing of a user involves two steps: (1) determine all aps visited by the user in the specified time period and (2) determine all users who were associated with each of those aps concurrently with the infected user. to do so, we can analyze the log to first construct the time-ordered sequence of ap sessions of the concerned device (a session is the time period represented by an association followed by a disassociation). since ap locations are known, this session list represents the locations visited by the user and the time spent at each. next, for each ap session in the above user trajectory, we can analyze the log to determine overlapping sessions of all other users at that ap. these are users (i.e., their devices) who were present in the proximity of the infected user. of course, the wifi log does not reveal the distance between the two users or whether physical contact occurred. nevertheless, it enables us to determine users at risk by computing the duration for which the two users were in proximity of one another.
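the two-step procedure above can be sketched directly over the event log. this is a minimal illustration under stated assumptions, not the tool's actual code: the tuple layout, the `Session` type, and the names `build_sessions`, `overlap`, and `trace` are hypothetical, timestamps are assumed to be numeric seconds, each device is treated as one user, and only association, reassociation, and disassociation events are handled.

```python
from collections import defaultdict
from typing import NamedTuple


class Session(NamedTuple):
    device: str
    ap: str
    start: float  # seconds (assumed unit)
    end: float


def build_sessions(events):
    """Pair each association with the next disassociation of the same (device, ap).

    `events` is an iterable of (timestamp, ap, device, event_type) tuples.
    """
    open_at = {}   # (device, ap) -> timestamp of the pending association
    sessions = []
    for ts, ap, device, kind in sorted(events):
        key = (device, ap)
        if kind in ("association", "reassociation"):
            open_at.setdefault(key, ts)
        elif kind == "disassociation" and key in open_at:
            sessions.append(Session(device, ap, open_at.pop(key), ts))
    return sessions


def overlap(a, b):
    """Length of the time two sessions have in common (0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def trace(sessions, infected, t_start, t_end):
    # step 1: ap sessions of the infected device inside the query window
    visited = [s for s in sessions
               if s.device == infected and s.start >= t_start and s.end <= t_end]
    # step 2: other devices with overlapping sessions at the same aps
    by_ap = defaultdict(list)
    for s in sessions:
        if s.device != infected:
            by_ap[s.ap].append(s)
    proximity = defaultdict(float)
    for v in visited:
        for other in by_ap[v.ap]:
            proximity[other.device] += overlap(v, other)
    return visited, {d: t for d, t in proximity.items() if t > 0}
```

note that `trace` rescans the full session list for every query, which is the cost that motivates the graph representation discussed next.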
in some cases, the location where they were co-located may reveal the degree of risk (e.g., a hour long meeting in a small conference room or a lecture classroom). to enable health professionals to further assess the risk during contact tracing, we generate a location report, showing locations visited by the user and for how long, as well as a proximity report of co-located users at each location and the duration of co-location. figure depicts a sample report resulting from the process. since an enterprise network with thousands of aps and tens of thousands of devices will generate very large log files (for example, the log file from one of our campuses contains more than billion events over a month semester period), scanning the log to compute the location and proximity can be slow and inefficient. consequently, we present an efficient graph-based algorithm based on time-evolving graphs in the next section.
to efficiently process contact tracing queries, we model the data as a bipartite graph between devices and aps. each device in the wifi log is modeled as a node in the graph; each ap in the network is similarly modeled as a node. an edge between a device node and an ap node indicates that the device was associated with that ap. each edge is annotated by the time interval $(t_1, t_2)$ that denotes the start and end times of the association session between that device and the ap. note that data is continuously logged to the log files, which causes new edges to be added to the graph as new associations are observed and new nodes to be added as new devices are observed in the logs. thus, our bipartite graph is a time-evolving graph. for computational efficiency, each device and ap node in the graph is limited to a time duration, say an hour or a day. this is done to limit the number of edges incident on each node, which can keep growing as devices associate with new aps or aps see new association sessions.
as a result of associating a time duration with each node, each device or ap is represented by multiple nodes in the graph, one for each time duration where there is activity. in this case, we can view the node id as the mac address concatenated with the time duration. for example, mac [ : , : ], mac [ : , : ], represent two nodes for the same device, each capturing ap association edges seen within that period. in case of ap nodes, this would capture all device associations to that ap within those time periods (see figure ). the duration for partitioning each node's activity in the graph is a configurable parameter, and this duration can be chosen independently for a device node and an ap node if needed. given such a bi-partite graph, a contact tracing request is specified by providing a device mac address and a duration $(t_{start}, t_{end})$ over which a contact trace report should be generated. the query also takes a threshold τ that specifies that only ap sessions of duration longer than τ should be considered. the graph algorithm first identifies all device nodes corresponding to this user that lie within the $(t_{start}, t_{end})$ interval and identifies all edges from these nodes. these edges represent all ap locations visited by the user, and session durations represent the time spent at each location. only edges that satisfy the following constraints are considered: (1) the session must lie within the query time interval, i.e., $[t_1, t_2] \subseteq [t_{start}, t_{end}]$ and (2) the session duration must be at least τ , i.e., $(t_2 - t_1) \geq \tau$. edges that fail either criterion are ignored and the remaining edges are used to enumerate the ap locations visited by the device and the time duration spent at each location. to compute the proximity report, the algorithm traverses each edge and examines the corresponding ap node. for each ap node, the list of incident edges corresponds to all devices that had active sessions with that ap.
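the time-partitioned bipartite graph and the location-report filter can be sketched as follows. this is an illustrative sketch rather than the released implementation: the class name `ContactGraph`, the one-hour default bucket, numeric timestamps in seconds, and the choice to attach a session to every time bucket it touches are all assumptions made here (the text does not say how sessions that span a node boundary are stored).

```python
from collections import defaultdict

BUCKET = 3600  # node duration in seconds; one hour is an assumed default


class ContactGraph:
    def __init__(self, bucket=BUCKET):
        self.bucket = bucket
        # node id = (mac or ap name, bucket index); value = incident edges
        self.device_edges = defaultdict(list)  # -> [(ap, t1, t2)]
        self.ap_edges = defaultdict(list)      # -> [(device, t1, t2)]

    def _buckets(self, t1, t2):
        return range(int(t1 // self.bucket), int(t2 // self.bucket) + 1)

    def add_session(self, device, ap, t1, t2):
        # assumption: a session is attached to every bucket node it overlaps
        for b in self._buckets(t1, t2):
            self.device_edges[(device, b)].append((ap, t1, t2))
            self.ap_edges[(ap, b)].append((device, t1, t2))

    def location_report(self, device, t_start, t_end, tau):
        """Sessions with [t1, t2] inside [t_start, t_end] and t2 - t1 >= tau."""
        seen = set()  # de-duplicates sessions stored under several buckets
        for b in self._buckets(t_start, t_end):
            for ap, t1, t2 in self.device_edges.get((device, b), []):
                if t_start <= t1 and t2 <= t_end and (t2 - t1) >= tau:
                    seen.add((ap, t1, t2))
        return sorted(seen, key=lambda e: e[1])
```

only the device nodes whose bucket falls inside the query window are touched, so the work per query depends on the activity of the queried device rather than on the size of the whole log.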
the session duration $[t'_1, t'_2]$ on each edge is compared to the infected user's session $[t_1, t_2]$ and the edge is included only if the two sessions overlap. this process yields a list of all other users who had an overlapping session with the infected user. the algorithm can also take an optional parameter w that specifies the minimum duration of co-location necessary for a user to be included in the proximity report, i.e., $|[t_1, t_2] \cap [t'_1, t'_2]| \geq w$. algorithm lists the pseudo code for our graph algorithm. thus, a time evolving bipartite graph allows for efficient processing of contact tracing queries over a large dataset.
since contact tracing technologies use location and proximity information of users, they raise important privacy concerns. privacy concerns for client-side bluetooth-based applications are well-known [ ] . since network-centric contact tracing is an alternative approach that raises a different set of concerns, we discuss these issues in this section and describe techniques used in our ongoing deployments to mitigate them. first, our network-centric tool is aimed at health-care and medical professionals who perform contact tracing and is not an end-user focused tool for self-monitoring prior contacts. contact tracing is a well-established approach that has traditionally been performed manually through questionnaires [ ] . our tool has been designed to fit into this workflow and serves as an additional source of information, in addition to interviews, for professionals engaged in contact tracing. unlike some bluetooth-based apps, it does not allow end-users to look up information about themselves or anonymous infected users. by focusing on health professionals and not end-users, our tool avoids some of the privacy pitfalls from giving end-users access to anonymous proximity data.
second, even though data access is limited to health professionals, the data contains sensitive location information and is still prone to privacy misuse. there are two approaches to handling this problem. first, we recommend that operational control of the tool be in the hands of the organization's it security group. recall that the approach is based on wifi network monitoring data that is already routinely gathered by it departments for network performance and security monitoring. for example, our campus uses such data to track down compromised devices that are connected to our wifi network and may be responsible for ddos attacks from inside. another example is tracking down student hackers, since the hacking of university computers (e.g., to change course grades) is a common exploit on university campuses. audit and compliance laws in many regions also necessitate gathering network logs for subsequent analysis and audits. to address these issues, it departments routinely collect detailed network logs and use them for optimizing performance or handling security incidents. since the it department already has access to the raw data used by our network-centric tool, deploying the tool within the it department does not increase privacy risks (since this raw data is already prone to the same privacy risks independent of our tool, and it departments have strict safeguards in place to protect such data and limit access to it). here, limiting operational control of the tool to the organization's it group can provide good privacy protection in practice. however, it may not always be feasible to limit control of the tool to it professionals alone. for instance, larger outbreaks of disease may require allowing direct access to health officials who are performing contact tracing. in this case, we can address privacy concerns by not storing user identities or real mac addresses with the tool itself. 
instead, user names and device mac addresses are anonymized by a cryptographic hash (e.g., sha- hash). all queries on the tool are done using hashed identities and not the real ones. the actual mapping of user names and device mac addresses to their hashed values is stored separately from the tool, and this information is accessible only to a small trusted group. to perform contact tracing on an individual, a trusted person needs to authorize it (e.g., once user consent is obtained) by releasing the mapping of the actual name of the user and their device mac to the hashed values. the tool can then be queried using the hashed values of that user's information. similarly, once proximity reports are generated, they can be sent to the trusted person, who can then deanonymize that information using the mapping table. in this manner, it is not possible to query the tool to track activities of an arbitrary user, unless first authorized, which prevents misuse. our current campus deployment uses this anonymized data approach for additional safety. finally, many countries have strict privacy laws that require user consent before collecting sensitive data. to comply, many organizations require users to consent to an it policy that enables them to gather network data for critical safety operations, a prerequisite for such network monitoring. further, health care professionals are required to obtain user consent to perform manual contact tracing, a process that can be used for network-centric contact tracing as well.
this section presents our system implementation. we have implemented our system using python and perl. our tool is available as open-source code to researchers and organizations who wish to deploy it (source code is available at http://wifitrace.github.io). as shown in figure , our implementation uses a three tier architecture. the first tier is based on the logging capabilities that are already supported by enterprise-grade wifi networks.
our system simply uses these capabilities and implements only the next two tiers. our system currently supports wifi access points from cisco and hp/aruba, two large vendors of enterprise wifi equipment. we have implemented pre-processing code for both these vendors to take raw monitoring data and convert it to a standard intermediate data format for our second tier.
(footnote: professionals can decide whether to pro-actively notify co-located individuals who are deemed to be at risk during an outbreak or to publicize a list of locations and times visited by an infected individual(s) and request other users to contact them if they are impacted. in the latter case, the proximity report data is used for further contact tracing once co-located users contact health officials. the latter approach is presently used on our usa campus.)
for hp/aruba networks, our tool supports the processing of both syslogs (generated by aruba's wifi controllers) as well as rtls logs generated by aruba aps. both types of logs provide association and disassociation information. in case of aruba rtls, we log wifi data directly from the controller nodes using the real-time location services (rtls) apis [ ] . in case of aruba syslogs, we periodically copy the raw syslogs generated by the controller and pre-process this raw data. for the cisco networks, we log wifi data directly from the network using the cisco connected devices (cmx) location api v [ ] . all of these preprocessor scripts convert raw logs into the following standard record format:
timestamp, ap name or id, device mac id, event type, (optional) user name
by default, we assume anonymized (or hashed) device macs and usernames and assume a separate secure file containing a mapping of real names to hashes. our third tier implementation then uses this data to support contact trace querying. a query is of the form: (hashed) username or device mac, start duration, end duration, threshold τ , and co-locator threshold w.
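the intermediate record and the query tuple described above can be pictured with a few small types. this is a hedged sketch, not the format used by the released scripts: the names `Record`, `Query`, `anonymize`, and `parse_record` are hypothetical, the comma-separated on-disk encoding is assumed from the field list, and the use of sha-256 with a salt is an assumption of this example (the text only says a cryptographic hash is used).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One pre-processed log event in the intermediate format."""
    timestamp: str
    ap: str
    device: str           # hashed device mac
    event_type: str
    user: Optional[str]   # hashed user name; None when the log has no owner info


@dataclass(frozen=True)
class Query:
    subject: str  # hashed user name or hashed device mac
    start: str
    end: str
    tau: float    # minimum ap session duration
    w: float      # minimum co-location duration


def anonymize(identifier: str, salt: str) -> str:
    """Hash an identifier; SHA-256 and the salt are assumptions of this sketch."""
    normalized = identifier.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def parse_record(line: str) -> Record:
    """Parse 'timestamp, ap, device, event_type[, user]' (comma-separated assumed)."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}")
    user = fields[4] if len(fields) == 5 and fields[4] else None
    return Record(fields[0], fields[1], fields[2], fields[3], user)
```

because the same identifier always maps to the same digest, a trusted person holding the mapping (or the salt) can turn a real name or mac into the `subject` of a `Query`, while the tool itself only ever sees hashed values.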
internally the data generated by the pre-processing code is represented as a bi-partite graph, as discussed in section . . our system supports a variety of queries on this graph through a graph api depicted in table . this graph api is used to implement the graph algorithm described in section . . the algorithm yields a location report, which shows all locations (aps) visited by the user for longer than τ and a proximity report that shows all users who were connected to those aps for a duration greater than w. figure shows a sample location and proximity report generated by our system. in addition to human-readable query reports, our system can optionally output query results in json format, which is convenient for visualization or subsequent processing. our system also supports additional report types beyond location and proximity reports. for example, it can produce reports of additional users who visit a location after the infected user has departed from that location. this is useful when a location has high-contact surfaces that may continue to transmit a contagious disease even after an infected user departs. such a report can be produced by specifying a window parameter, which sets the time window after the user departs over which additional users at each location are identified as being at risk.
we have deployed and operationalized our tool on both our university campuses (one in northeastern usa and one in singapore) through a collaboration with our university's health and it service. both campuses have large wifi networks, one with hp/aruba aps and the other with a mixed cisco/aruba network of , aps. while our tool can be used for contact tracing of any infectious disease (we originally began developing it after an outbreak of meningitis on our campus), health officials on both campuses view it as a method for scalable contact tracing for covid- .
while our tool has been operational for several months, fortunately, as of may , neither campus had seen any covid cases that required the use of our tool. this is largely because residential universities such as ours switched to online learning in march , asked most students to vacate their dorms, and enforced a work-from-home policy for faculty and staff. except for a small number of students who were unable to return to their home countries (due to global lockdowns), the campuses have been largely empty. one of our campuses saw a single employee case of covid- , but initial (manual) contact tracing determined that the employee worked in a setting with limited contact with others, and university health professionals did not see a need to perform additional contact tracing, manually or using our tool.
wifi sensing has been used by researchers for mobility studies since the early s [ , ] , and it is well established among researchers that wifi devices reveal user mobility patterns. our work builds on this wifi-based mobility research, and in this section, we validate its use for network-centric contact tracing. we conducted a small-scale user study to gather ground truth data to validate three questions related to the use of passive wifi sensing for contact tracing: (1) how accurately do wifi access point associations reveal true user locations? (2) how accurately do wifi session durations reveal true durations of time spent at a location? and (3) how accurately do co-located wifi device sessions reveal co-located users at those locations? to answer these questions, we had a group of volunteers walk around our campus to visit multiple locations for varying durations while carrying their mobile devices. each user manually logged the entry and exit times at each location as well as the path used to walk from one location to another.
the trajectories of some of the users' devices were correlated, which meant the users were co-located whenever the devices were connected to the same ap concurrently. our user study produced a ground truth dataset that includes seven devices that visited a total of , distinct locations over a course of ten days. for each of the users, we computed a location report containing all visited locations (assuming a threshold τ = ) and compared the locations as seen by the wifi network to the ground truth locations recorded by each user. figure (a) shows the confusion matrix, with a precision of . , recall of . , and a high f -score of . . as can be seen, the inferred location matches the ground truth location with high accuracy. the errors mainly occur when a user is walking (in all cases, these involved short sessions of tens of seconds to minutes). when a user is in transit between locations, their mobile device makes an ap transition by disassociating from the previous ap and associating with a more proximate ap. the threshold for switching aps and the aggressiveness of these switches vary across mobile phone makes and models. as a result, some mobile phones stay connected to an earlier ap even though there is a nearby ap with better connectivity; this can result in a location error where the ground truth location is a bit further away from that shown by the more distant ap. in almost all cases, if the user stays at the new location for more than a few minutes ( to minutes in our observations), their phone switches to the closer ap which has a stronger signal. hence, for very short sessions during walks, the true location may be off from the inferred location by up to one ap "cell." figure depicts the accuracy of the inferred location for varying session lengths observed across four of the devices (namely, iphone, samsung, motorola and lg phones) used in our user study.
as can be seen, once the session length exceeds around minutes, the accuracy rises to %. for contact tracing, we are typically interested in locations that are visited by a user for a few tens of minutes; as shown in the figure, the approach provides high accuracy for such cases. figure (b) shows a scatter plot of session duration as reported by our tool and the ground truth. as can be seen, there is a good match between the reported and ground-truth session durations; the small errors occur at location entry or exit due to the lag in the mobile device switching to the nearest ap. next we validate the accuracy of co-locations. we use our tool to generate the proximity report for each device and compare it to the ground truth trajectories reported for each device. figure shows the accuracy of the co-located devices as seen by our approach. we see that our approach can capture co-located devices (and users) with high accuracy for sessions exceeding minutes. as noted above, short transitions are often off by one ap cell, which implies that two devices that are near one another will be seen by the network as being connected to adjacent aps, rather than the same one. fortunately these effects do not hamper the efficacy of contact tracing since two users need to be near one another for a period of time (e.g., minutes or more) to be considered at risk. as can be seen, longer sessions are captured with high accuracy. finally, we conduct a validation experiment where we count the number of users entering and leaving a room in the library and compare it to the number of devices (users) reported by our approach at that location. as shown in figure , the wifi based occupancy closely follows the ground truth manual count. the slight mismatch occurs for short wifi sessions when a user is present only for a brief period (and when their devices have not switched from the previous ap to the one in the room).
the user counts are accurate for all sessions that exceed a few minutes since their devices eventually switch to the closest ap. together, these results validate the efficacy of using passive wifi sensing for location and proximity sensing for contact tracing.
in this section, we describe case studies that evaluate the efficacy of our contact tracing tool and also present results on the efficiency of our graph algorithms and general limitations of our wifi sensing approach. we first describe our dataset and then our results.
since our tool has been deployed on two university campuses, we use production wifi logs from the university wifi networks for our experimental evaluation. this is the same data that would be used by health professionals for their contact tracing, except that we use a fully anonymized version of this data for our experiments. table depicts the characteristics of the wifi logs. the us university has an aruba network of aps deployed across buildings. it has users comprising students and faculty/staff (figures are rounded to the nearest thousand). the dataset spans jan to may , which includes the covid- lockdown that began during spring break (mid-march). the singapore university has a mixed aruba and cisco network comprising , aps deployed across buildings. it has , users comprising , students and , faculty/staff. the dataset spans feb to may and also includes the covid quarantine, which was progressively announced by the government, ending with a full lockdown like the us university.
we randomly choose a user from our dataset and assume they are infected with one of the above diseases and use our tool to compute the number of locations visited by the user over that period and the number of co-located users. we perform contact tracing assuming τ = ω = mins and τ = ω = mins, which implies locations visited for at least (or ) minutes and co-location of at least (or ) minutes.
for each disease, we repeat each contact tracing experiment for randomly selected students, and then randomly chosen faculty or staff users. figure depicts our results. as can be seen, the number of locations visited by an infected user grows as the duration of contact tracing grows from days for flu to days for measles. we find that the number of location visits is insensitive to τ beyond τ > mins (as discussed in more detail in the next section). a student visits ≈ locations per day while a faculty/staff user is somewhat less mobile and visits ≈ locations per day. figure depicts the proximity results from our contact tracing experiment. as shown, τ = min yields a large number of colocated users: colocated users for flu over a day period, rising to over users for measles over a day period for a student. for τ = min, the number of colocated users is lower but still high: for flu and for measles. the colocation count is lower for faculty/staff users (figure (b) ) but is still quite high ( to ) for τ = and substantially lower (between and ) for τ = mins. these results yield the following insights:
• first, we note that the number of colocators does not increase linearly with an increase in contact trace duration. the growth is sublinear, indicating that users have a social circle of users and there are repeated interactions with the same set of users over different days.
• second, it is infeasible to manually contact trace several hundred users for each infected user. this can be addressed by carefully selecting the parameters τ and ω and also carefully considering the tool output for subsequent manual contact tracing. in particular, τ = min is too low due to a high rate of chance co-location. choosing τ = mins and ω to be or mins may yield better results. further, our results show that common areas such as dining areas and cafeterias add substantially to the colocation counts. it is straightforward to filter out those ap sessions to determine users with higher risk.
figure (a) and (b) show that the number of co-locators drops substantially once cafeteria visits are excluded. finally, our report (see figure ) provides the total time spent with each colocator in sorted order as well as the location where co-location occurred. it is possible to consider the top n (e.g., n = ) users with the most proximity minutes or only consider specific locations such as a small conference room or a classroom for subsequent manual tracing. such strategies are already used by professional contact tracers to home in on the most probable at-risk co-locators while eliminating users who may be false positives.
tracing. while the above experiment involved a single level of contact tracing, in many cases contact tracing may have to be iterative, with each colocator subjected to contact tracing. given that a user may come in contact with more than a hundred users in a single day (e.g., if they attend a few lectures in the classroom and visit a cafeteria), iterative tracing even for two iterations can be prohibitive. as explained in the previous section, the colocators list needs to be pruned at each step to identify the users at most risk. in the previous section, we suggested using a carefully chosen τ and ω to filter out certain locations or focus on high-risk locations (e.g., a small conference room). these are subjective strategies and can yield errors and miss "true positives". an alternate strategy is to "test and trace", which combines testing with contact tracing, a strategy used by many countries for covid- . in this case, each colocated user is administered a test to check if they are infected; only infected users are subjected to iterative contact tracing and the rest are filtered out. in this case, the number of users subject to contact tracing grows based on the rate of transmission (referred to as the r in the medical literature).
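the difference in growth between the two strategies can be made concrete with a back-of-the-envelope calculation. the sketch below is illustrative only: the function names are hypothetical, the values c = 50 co-locators per traced user and r = 2 infected co-locators per case are made-up inputs rather than figures from our datasets, and both functions give upper bounds because they ignore co-locators shared between traced users.

```python
def naive_iterative(c, levels):
    """Co-locators reached at each level when every co-locator is traced in turn."""
    return [c ** k for k in range(1, levels + 1)]


def traced_with_testing(c, r, levels):
    """Co-locators reached at each level when only the r infected ones are traced."""
    return [c * r ** (k - 1) for k in range(1, levels + 1)]


if __name__ == "__main__":
    print(naive_iterative(50, 3))         # [50, 2500, 125000]
    print(traced_with_testing(50, 2, 3))  # [50, 100, 200]
```

with these assumed inputs, naive tracing grows by a factor of c per level, while test-and-trace grows only by a factor of r per level.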
for example, if r= , then only out of the several tens of users identified by our tool will be subjected to additional tracing in each step (we assume that all users are tested to find r users who are infected). table depicts the number of users identified by this strategy for testing and tracing; as can be seen, the growth is much lower than a naive iterative strategy.
tracing during quarantine periods. while the previous experiments performed contact tracing during pre-covid semester periods where mobility patterns were "normal", we now examine how contact tracing results will change in the presence of strict lockdown policies. figure (a) shows the number of locations visited per day by different types of campus users. while users visited - locations per day for τ = mins during the normal period, after march th , the number of ap location visits drops sharply for all users due to lockdown policies. this will significantly alter contact tracing results for our tool for users who become ill during such a lockdown. figure (b) shows the number of locations visited for a user subjected to covid- contact tracing (duration of days). as shown, the number of locations visited varies from to for τ = and it drops to - location visits for τ = mins or greater. figure depicts the number of co-locators for τ = mins for several users based on the pre-covid and lockdown mobility patterns. as can be seen, social distancing and lockdown policies bring the colocator count to be less than ten for all types of users, an order of magnitude reduction. in such cases, comprehensive contact tracing of all colocators is feasible through manual means.
to evaluate the efficiency of our graph algorithm, we compare the execution time of a naive linear search approach and our graph based algorithm across varying numbers of co-locators. since different users display different amounts of mobility, the number of co-locators seen for each user will be different.
searching the co-locators using linear search requires a complete sequential scan of the entire dataset, resulting in a high overhead across all runs irrespective of the number of observed co-locators of a device. additionally, as the number of nodes increases, the search overhead also increases. in contrast, our graph algorithm efficiently identifies the edges and nodes relevant to the specified query, thereby reducing the search space. adding the constraint of τ results in further pruning of edges, which reduces the time and space complexity of our algorithm. this behavior is depicted in figure , which compares the execution overhead of the two approaches for our campus dataset. as shown, our graph-based implementation outperforms the naive sequential search by a significant margin.
wifi-sensing has well-known limitations and this section analyzes the implications of these limitations on contact tracing.
multi-device users : researchers have previously studied the behavior of multi-device users and shown that it is very common for users to own two or more devices [ ] . a key consequence of this result is that the device count seen by an ap does not equal the user count. while all wifi logs log device association information, not all of them provide user ownership information. if such information is missing, radius authentication logs should be additionally used to map devices to owners to avoid double counting devices as separate users. figure shows the number of unique devices seen by aps in different types of campus buildings and the corresponding user count (e.g., aruba syslogs provide both types of information). as shown, locations like dorms and classrooms see between . x to x difference in unique devices and unique users (since users may connect a phone and a laptop to the network); only dining areas (cafeteria) see low over counting since users are likely to carry only their phone when eating.
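mapping devices to owners before counting can be sketched in a few lines. this is a hypothetical helper, not part of the released tool: the name `count_users`, the `owner_of` dictionary (assumed to be built from ownership fields in the logs or from radius authentication logs), and the choice to count each device with no known owner as its own user are all assumptions of this example.

```python
def count_users(devices, owner_of):
    """Count distinct users behind the devices seen at an ap.

    devices  : iterable of (hashed) device macs seen at the ap
    owner_of : dict mapping device mac -> (hashed) user name
    Devices without a known owner are counted as separate users (an assumption).
    """
    users = set()
    for device in devices:
        users.add(owner_of.get(device, ("unknown-owner", device)))
    return len(users)


if __name__ == "__main__":
    seen = ["phone-1", "laptop-1", "phone-2", "tablet-x"]
    owners = {"phone-1": "alice", "laptop-1": "alice", "phone-2": "bob"}
    print(len(seen), count_users(seen, owners))  # 4 devices, 3 users
```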
this result highlights the importance of considering device ownership to avoid over counting users by only considering connected devices.
unassociated devices : not all users may connect their mobile devices to the wifi network. such devices are visible to the network when they perform ssid scans using a randomized mac address. unassociated devices can cause multiple challenges. first, ignoring them altogether will undercount users in a location, but simply counting all devices can yield a large number of false positives. figure (a) depicts the number of unassociated devices seen in four buildings in our singapore campus. since the buildings are next to a public road or public bus stop, the number of unassociated devices per day is x greater than the number of associated users. figure (b) shows that enforcing a session duration of minutes filters out most of these chance associations and the number of such devices (likely visitors) is around % of the total number of associated devices.
impact of session duration: our contact tracing tool uses two parameters τ and ω that are directly related to wifi session durations. judicious choice of these parameters can allow for a good tradeoff between eliminating false positives and retaining true positives. figure shows the number of ap locations visited by campus users for varying values of session length τ . the figure shows that the location visits stabilize around τ = mins and then yield - location visits per day. small values of τ include locations visited when in transit and should be ignored. figure shows the impact of varying values of τ and ω and the figure shows a decreasing gradient as both τ and ω are increased for all user types. finally, figure shows the number of colocated users for varying values of τ and ω. as shown, using values that are tens of minutes allows the tool to filter out overlapping sessions caused by users in transit.
these results highlight the importance of carefully choosing τ and ω depending on how infectious the disease is, while also avoiding false positives.
the prevalence of many infectious diseases in our society has increased the importance of contact tracing (the process of identifying people who may have come in contact with an infected person) for reducing their spread and containing disease [ , ] . for performing contact tracing, the infected user needs to provide the places visited and the persons who were in proximity or direct contact [ ] . while the traditional method relies on interviews, the covid- pandemic has seen the use of methods such as gps, bluetooth [ ] , credit card records [ ] , and cellular locationing. manual contact tracing as a mode for containment of diseases with a high transmission rate has proved to be too slow and cannot be scaled. research [ , , ] has shown that technology-aided contact tracing can help reduce the disease transmission rate through quicker, scalable tracing and help achieve quicker disease suppression. bluetooth and bluetooth low energy (ble) based contact tracing has emerged as a possible method for proximity detection [ ] . a handful of systems based on bluetooth or ble have been rolled out, a few of which have been supported by the governments of countries such as singapore [ ] and australia [ ] . the main limitations of these approaches are the need for mass adoption before they become effective [ ] and their reliance on bluetooth distance measurements, which may not always be accurate. authenticity and privacy attacks are other key issues in using bluetooth for contact tracing. [ ] has shown that authenticity attacks can be easily performed on bluetooth based contact tracing apps. such attacks can result in forging the locations visited and creating a fake history of a user, introducing risk to society as shown in [ ] . bluetooth apps suffer from privacy issues as noted in [ , ] .
as a result, privacy issues for bluetooth-based contact tracing have received significant attention [ , , ] . privacy-preserving methods include the use of homomorphic encryption for determining contacts [ ] and the use of private messaging to notify possible contacts [ ] , to name a few.
technology-aided contact tracing is becoming an increasingly important tool for quick and accurate identification of co-locators. while bluetooth-based contact tracing methods using phones have become popular recently, these approaches suffer from the need for a critical mass of adoption in order to be effective. in this paper, we presented a network-centric approach for contact tracing that relies on passive wifi sensing with no client-side involvement. our approach exploits wifi network logs gathered by enterprise networks for performance and security monitoring and utilizes them for reconstructing device trajectories for contact tracing. our approach is specifically designed to enhance the efficacy of traditional methods, rather than to supplant them with a new technology. we presented an efficient graph algorithm to scale our approach to large networks with tens of thousands of users. we implemented a full prototype of our system and deployed it on two large university campuses. we validated our approach and demonstrated its efficacy using case studies and detailed experiments using real-world wifi datasets. finally, we discussed the limitations and privacy concerns of our work and have made our source code available to other researchers under an open-source license.
apple google partner covid- contact tracing singapore built a coronaviris app but it hasnt worked so far tracetogether app covid- contact tracing flusense: a contactless syndromic surveillance platform for influenza-like illness in hospital waiting areas epic: efficient privacy-preserving contact tracing for infection detection bluetrace: a privacy-preserving protocol for community-driven contact tracing across borders assessing disease exposure risk with location data: a proposal for cryptographic preservation of privacy contact tracing mobile apps for covid- : privacy considerations and related trade-offs afshan amin khan, and roohie naaz. . applicability of mobile contact tracing in fighting pandemic (covid- ): issues, challenges and solutions. cryptology eprint archive development finance division. . moef: korea contact tracing johns hopkins university covid dashboard contact tracing and disease control apps gone rogue: maintaining personal privacy in an epidemic epidemic contact tracing via communication traces quantifying sars-cov- transmission suggests epidemic control with digital contact tracing estimated influenza illnesses, medical visits, hospitalizations, and deaths averted by vaccination in the united states sars basics fact sheet quest: practical and oblivious mitigation strategies for covid- using wifi datasets feasibility of controlling covid- outbreaks by isolation of cases and contacts experiences & challenges with server-side wifi indoor localization using existing infrastructure extracting a mobility model from real user traces the effectiveness of contact tracing in emerging epidemics analysis of a campus-wide wireless network interrupting transmission of covid- : lessons from containment efforts in singapore wireless health monitoring using passive wifi sensing location determination using wifi fingerprinting versus wifi trilateration aruba networks. . 
it analytics for operational intelligence eryk dutkiewicz, symeon chatzinotas, and bjorn ottersten. . enabling and emerging technologies for social distancing: a comprehensive survey stefaan verhulst, and patrick vinck. . mobile phone data and covid- : missing an opportunity a collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the united states covid- epidemic in switzerland: on the importance of testing, contact tracing and isolation a high-resolution human contact network for infectious disease transmission cisco systems. . cisco dna spaces qiang tang. . privacy-preserving contact tracing: current solutions and open questions empirical characterization of mobility of multi-device internet users how to return to normalcy: fast and comprehensive contact tracing of covid- through proximity sensing using mobile devices stressmon: scalable detection of perceived stress and depression using passive sensing of changes in work routines and group interactions analyzing shopper's behavior through wifi signals sensorless sensing with wifi
key: cord- -dk lwsrx authors: magklaras, georgios; bojorquez, lucia nikolaia lopez title: a review of information security aspects of the emerging covid- contact tracing mobile phone applications date: - - journal: nan doi: nan sha: doc_id: cord_uid: dk lwsrx
this paper discusses the aspects of data reliability and user privacy for the emerging practice of mobile phone based contact tracing for the covid- pandemic. various countries and large technology companies have already used or plan to design and use mobile phone based solutions, in an effort to urgently expedite the process of identifying people who may have been exposed to the disease and limit its spread to the general population. however, serious concerns have been raised both in terms of the validity of the collected data as well as the extent to which implemented approaches can breach the privacy of the mobile phone users.
this review examines the weaknesses of existing implementations and concludes with specific recommendations that can contribute towards increasing the safety of infrastructures that collect and process this kind of information, as well as the adoption and acceptance of these solutions by the public.
on march th , the director general of the world health organization (who) declared the outbreak of covid- a global pandemic [ ] . emergency measures have fundamentally altered the economy and society on a global scale, as health systems around the world struggled to keep up with the demand for emergency health care [ ] . as part of these measures and in an attempt to quickly identify people who may have been exposed to the disease and thus limit its spread to the general population, many governments around the world have deployed mobile phone applications to make the public health process of contact tracing more efficient on a massive scale. a non exhaustive list of countries that were among the first to deploy mobile phone based contact tracing applications includes australia [ ] , china [ ] , israel [ ] , norway [ ] , singapore [ ] and south korea [ ] . in addition, large technology companies such as google and apple are preparing their own infrastructure for covid- contact tracing [ ] . many of the previously mentioned governments that were early adopters of the technology and make the participation of their citizens in electronic contact tracing voluntary have claimed that their applications are safe to use and prompted their citizens to download and use them. however, many technology experts have criticized the technology [ ] or expressed concern about its efficacy versus its privacy implications [ ] . moreover, in certain countries, public response to the technology was lukewarm. for instance, india, singapore and norway have seen limited user acceptance of these solutions if one examines recently estimated application download numbers [ ] .
all these facts give merit to a closer examination of the problems of covid- contact tracing solutions.
before taking a closer look into the problems of contact tracing solutions, it is necessary to provide essential definitions about the concept and the technologies involved in making the transition from manual to electronic procedures. in a public health epidemiological context, contact tracing is the process of identifying persons who may have come into contact with a person whose infection has been confirmed [ ] . the infected person is often referred to as the "index case" and all the people who have come into contact with the index case in a way that meets certain criteria (proximity, type of transmission, duration) are referred to as the "contacts". the systematic collection of further information about these contacts aims to isolate them, test them for the infection and treat them where applicable. depending on the type and expected spread of an outbreak, the process can be recursively repeated for contacts of contacts. the overall aim is to limit the spread of the infection in the general population.
health authorities follow specific protocols that require manual contact tracing. this means that health workers evaluate the provided information, locate the contacts and notify them (by phone call), and all of this depends on the accuracy of the information that the index case and his/her subsequent contacts can provide. it is thus reasonable to assume that as health infrastructures are strained for resources in a fast spreading infection, the quality as well as the accuracy of the manual contact tracing procedure will suffer. this has been confirmed well before the covid- outbreak. in fact, electronic contact tracing has been tested in the pre-covid- world in many epidemiological emergencies, among them the ebola virus outbreak [ ] .
although this study is far from the technology implementation aspects we see in the covid- mobile phone contact tracing solutions, it highlighted the power of the ubiquity of the mobile phone as a tool to aid in monitoring the spread of infectious diseases.
in a post-covid- world, governments and technology companies turn to various aspects of mobile and general computing infrastructures to implement contact tracing solutions. in particular, most covid- contact tracing solutions make use of the following mobile phone technologies:
a) the use of global positioning system and assisted gps (a-gps) [ ] technology: every mobile phone has an embedded gps receiver and, through complementary components of a gpp compliant [ , ] telecommunications infrastructure, a time series of gps coordinates of the mobile device can be recorded. features like the google account location history [ ] , as well as the chinese [ ] , israeli [ ] and norwegian [ ] contact tracing applications make use of the position/location data. google has also used location data during the covid- pandemic to estimate the extent of the imposed quarantine measures in various countries with the so called 'mobility reports' [ ] .
b) the use of the bluetooth protocol [ ] : the bluetooth protocol is a complex wireless technology standard that encompasses different modes of transmission and functionality. the parts relevant to contact tracing concern its low energy variant called bluetooth le [ ] . this variant is used to perform proximity sensing calculations. the calculations are used to estimate the distance between the index case and the contacts and thus play a crucial role in most covid- contact tracing application implementations. another crucial aspect of the bluetooth operation is that the technology is used to exchange data between devices. later paragraphs will describe that process in more detail.
c) the increase in power and data storage in mobile phones, as well as the ubiquity of reliable g/ g (and in the near future g) connections, creates powerful ways of constructing big data sets with different levels of anonymity and susceptibility to linkage attacks [ ] . most of the solutions claim that they take precautions to anonymize the data they exchange. data exchange and collection can also occur in de-centralized or centralized ways. this has different implications for the privacy of the users that contribute the data in question.
leaving the substantial variations among existing covid- contact tracing implementations to the side, in simple terms, when a user downloads a contact tracing application to a smartphone, the device will in principle perform the following actions:
a) activate the bluetooth le interface and broadcast its presence by transmitting an anonymous identifier. the transmission of the identifier is performed repeatedly in the form of a beacon.
b) use the same bluetooth le interface to record received anonymous identifiers of other mobile phones within range.
c) for every received/intercepted anonymous bluetooth le identifier, attempt to estimate its proximity. this proximity sensing step is crucial to the validity of the sampled data.
d) store the collected data in the smartphone, where they are handled in different ways.
an abstraction of such a record has four fields: ble_id is the anonymous identifier, proximity_estimation represents a distance (meters), a_gps_data represents location data of the smartphone according to the data collected by its a-gps receiver and finally covid _flag represents whether the user of the smartphone has disclosed (voluntarily) whether he is infected with covid- . different contact tracing implementations upload these records (with the user's consent) to different types of central database infrastructures for processing.
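The record abstraction named above can be written down as a small data structure. The following is a hedged sketch rather than any application's actual schema: the field types, the name `covid_flag` (the digits of the original field name are lost in this text), the distance threshold and the helper `ids_to_alert` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ContactRecord:
    ble_id: str                   # anonymous identifier received over Bluetooth LE
    proximity_estimation: float   # estimated distance to that identifier, in meters
    a_gps_data: Optional[Tuple[float, float]]  # (lat, lon); None if no location is collected
    covid_flag: bool              # True if the uploading user disclosed an infection


def ids_to_alert(records, max_distance_m):
    """Identifiers seen within `max_distance_m` by a user who reported infection.

    Assumption: a server would alert the owners of these identifiers; the
    time-exposure criterion is left out to keep the sketch short.
    """
    return sorted(
        {
            r.ble_id
            for r in records
            if r.covid_flag and r.proximity_estimation <= max_distance_m
        }
    )


records = [
    ContactRecord("id-a", 1.2, None, True),
    ContactRecord("id-b", 7.5, None, True),            # too far away
    ContactRecord("id-c", 0.8, (59.9, 10.7), False),   # uploader not infected
    ContactRecord("id-a", 1.9, None, True),            # duplicate sighting
]
alerts = ids_to_alert(records, max_distance_m=2.0)
```

Making `a_gps_data` optional reflects that not every implementation collects location data; the threshold of 2.0 meters is only an example value.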
for the purposes of clarity, we need to emphasize that not all mobile application implementations collect gps data (a_gps_data field). the collection of location data creates privacy concerns that are discussed in section of this paper. the a_gps_data field can collect other forms of location data (cell tower id) to aid the accuracy of the proximity sensing process in various ways. a central database will process the collected records with particular emphasis on the records that have the covid _flag set and the proximity_estimation within a certain range (say for instance less than meters). consequently, it is possible to send an alert message to all users that have been within a pre-defined proximity and time exposure of a specific ble_id whose smartphone user has declared his/her infection. it is therefore evident that smartphones can provide time, location and proximity data that public health authorities consider valuable, in order to alert the general population [ ] . this process forms the very basis of smartphone based covid- contact tracing and will be used as a reference mechanism for analysis for the rest of this paper.
the following sections will focus on various implementation details of the reference mechanism. section will discuss information security aspects that concern the use of the bluetooth le protocol, its data accuracy, as well as its various information security weaknesses. section elaborates on the privacy aspects of storing anonymous data in central infrastructures. the fourth and final section of the paper concludes with concrete recommendations that aim to improve the security of electronic contact tracing solutions.
the bluetooth protocol is a vast and complex specification [ ] . different versions and smartphone chipset implementations can result in different operational and information security aspects of its use for the purposes of contact tracing. however, in broad terms, these aspects touch on three different areas.
the first is the area of user privacy. one needs to ask how likely it is that a user can be identified as a result of the bluetooth data exchange necessary to facilitate contact tracing. a second question relates to how accurate the data collected by bluetooth le are for the purposes of contact tracing. finally, a third question to raise is what the security implications are of using it to broadcast your (in theory anonymous) presence and exchange data with devices you do not know.
bluetooth le allows device manufacturers to use temporary random addresses in over-the-air communication instead of their permanent address to prevent tracking, as part of the bluetooth core specification version [ , ] . earlier versions of the bluetooth core specification broadcast the interface mac address, a permanent identifier that is unique for every smartphone [ ] and could thus be used to track an individual. while bluetooth core specification version addresses this issue, it also leaves gaps that could be exploited and lead, under specific circumstances, to identification of individuals. jameel and dungen [ ] examined bluetooth le beacon protocols and an array of mechanisms that facilitate localized interactions with smartphones and other bluetooth devices via the beacon mechanisms. the advlib library [ ] is a product of their work which allows software developers to easily integrate bluetooth le beacon advertising-based functionality into their applications, without having to embed them into the low-level protocol mechanisms. however, the practical consequence of this work is that an adversary could use the library to identify bluetooth powered devices. while it is not possible to track a specific individual by making use of this mechanism, identifying that someone has a specific phone and a specific accessory in an area with a limited number of people could aid the process of adversarial reconnaissance aiming towards personal identification.
becker et al. [ ] proceed further and demonstrate that even current bluetooth le anonymization measures are vulnerable to passive tracking. their work proposes an address-carryover algorithm that exploits the asynchronous nature of the bluetooth le payload and achieves tracking that bypasses the attempted address randomization of a device. the worrying aspect of their study and experimental setup is that it does not use differential cryptanalysis to decrypt the content of bluetooth le communication. their method works entirely by intercepting public, unencrypted bluetooth le advertising traffic which is necessary for steps a and b of the abstracted covid- contact tracing procedure outlined in section of this paper. it is broad, in the sense that it is effective against all ios, macos and windows devices.
another worrying aspect of the work outlined in [ ] and also supported by other theoretical and experimental work [ , ] is that despite the existence of bluetooth mac address randomization mechanisms to achieve anonymity, not all device manufacturers and operating system/application authors choose to employ them in the same way. there is a certain amount of flexibility in how to implement and transmit these randomized identifiers. standard ways of doing so exist, but different operating systems and applications might embed additional information in the bluetooth le public beacon payloads for the purposes of incorporating customized functionality. this additional information often leaks vital identity aspects and is dictated by software, from the operating system all the way to the application layer. consequently, different covid- contact tracing applications diverge substantially from whatever the relevant bluetooth standards dictate and offer different levels of user privacy.
as far as the data accuracy of bluetooth le collected data is concerned, there are also serious doubts expressed by experts.
step c of the abstracted covid- contact tracing procedure (section of this paper) attempts to estimate the distance of an intercepted bluetooth le beacon. the question here is with what accuracy bluetooth le can determine whether the user of another smartphone is closer than a predetermined distance (say meters). the best way to answer that question is to understand the mechanism employed to measure that distance. the bluetooth protocol uses the received signal strength indicator (rssi) to measure distance between devices [ ] . the principle is that the stronger the signal, the closer the devices are to each other, so a correlation between sensed signal strength and distance can be approximated. however, different bluetooth chipset implementations utilize the rssi in slightly different ways. while appropriate calibration can reduce these inaccuracies, the problems do not stop there. the bluetooth le transmission frequency often interferes with other devices in the . ghz range, such as older wifi routers, unshielded usb cables and microwave ovens. a bluetooth le device would do its best to extend the 'beacons' (advertisement of presence and availability) by keeping constant time and regulating the transmission power to overcome other sources of interference. in such a frequency congested environment, a real distance of . meters could be estimated as . meters (false negative), or a real distance of . meters could be estimated as under . meters (false positive). many experts, amongst them the bluetooth inventors jaap haartsen and sven mattisson, agree that these proximity sensing inaccuracies were and remain a limiting factor [ ] . as a result, the accuracy of the collected proximity data will be reduced and further post processing steps are needed, in order to allow someone to derive safe conclusions about who is in real danger of being infected due to proximity.
finally, an often overlooked aspect of bluetooth le is its transmission range.
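The link between signal strength and distance described above is often approximated with a log-distance path loss model. The sketch below uses that model as an assumption; the source does not say which formula any given application uses, and the calibration values (`measured_power_dbm`, the rssi expected at 1 meter, and `path_loss_exponent`) are illustrative defaults rather than values from any bluetooth specification.

```python
def estimate_distance_m(rssi_dbm, measured_power_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate distance in meters from a received signal strength reading.

    Log-distance path loss model:
        rssi = measured_power - 10 * n * log10(d)
    solved for d. `measured_power_dbm` is the RSSI expected at 1 meter and
    `path_loss_exponent` (n) is about 2 in free space and higher indoors.
    Both are assumed calibration values, not taken from the source.
    """
    return 10 ** ((measured_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


# A clean reading at the calibration point maps to 1 meter.
clean = estimate_distance_m(-59.0)

# The same physical distance with 6 dB of extra attenuation (interference,
# a pocket, a body in the way) is estimated as roughly twice as far.
attenuated = estimate_distance_m(-65.0)
```

Because the estimate depends exponentially on the rssi reading, a few decibels of interference are enough to move a contact across a fixed distance threshold, which is how the false positives and false negatives described above arise.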
while bluetooth le version has a line of sight (los) beacon range of meters, the next major version of the protocol specification (v ) extends that los range to meters [ ] . at the time of writing, most mobile phones are expected to support bluetooth le version within the next months. if every smartphone used to perform many personal and business critical things (e-banking, remote control of systems at work, email) has yet another interface that advertises the presence of an individual (apart from the g/ g and wifi interfaces), this provides an advantage for an adversary and can act as a catalyst for cyberattack vectors. the fact is that bluesnarfing attacks against mobile phones have been identified from the early adoption days of the bluetooth protocol [ ] . moreover, there is good evidence that these attacks have persisted over a number of years [ ] and will continue to persist, with many recent notable examples that target bluetooth device firmware features [ , ] . the conclusion derived from this body of work is that the covid- contact tracing applications increase the exploitable attack footprint of the average smartphone.
privacy and security aspects of storing and processing contact tracing data
the covid- contact tracing data collected by smartphones always require some data entry processing backend (central server or servers that operate independently). however, there are different degrees of data centralization among the various solutions. for instance, the norwegian [ ] and singaporean [ ] contact tracing implementations are some of the paradigms that require all collected data to be centralized for further processing. in direct contrast, the temporary contact numbers (tcn) protocol [ ] as well as the decentralized privacy-preserving proximity tracing (dp- t) protocol [ ] constitute examples of protocols that are designed to minimize both the amount of information and the necessary processing in a centralized infrastructure.
google and apple seem to follow the decentralized approach [ ] .
prior to discussing the relative merits of centralized versus decentralized covid- contact tracing approaches and beyond the bluetooth le related privacy threats discussed in section , it is useful to examine the context of what user privacy means when combined with a justified need to enhance the tools that health authorities can utilize to effectively contain the spread of a pandemic. the european union is among the major global players that have officially recognized the potential of smartphone and associated technological solutions to fight the covid- pandemic [ ] . part of this recognition is made amidst the presence of comprehensive regulations such as gdpr [ ] that set very strict requirements for the storage and processing of personal information. many countries have modified their national data protection laws to make urgent allowances for the collection and processing of personal data related to the covid- pandemic [ ] . as an example, the norwegian national data protection authority (datatilsynet) has explicitly permitted non anonymous location data processing for the purposes of covid- smartphone contact tracing, only if it is not possible to derive safe conclusions from anonymous proximity based data [ ] . these steps indicate that there is a need for balance between personal privacy and public health [ ] .
it is outside the scope of this paper to pass a judgement on whether amendments to national legislations should favor privacy over public health or vice versa. the goal of this review is to highlight what is in favor of the privacy of the smartphone user and thus help specialists and policy makers to implement electronic contact tracing in the least privacy intrusive manner. achieving such a goal is not always trivial and it will require adherence to international standards. validated international standards for smartphone based contact tracing do not exist at the time of writing.
what does exist is a set of eu recommendations [ ] that dictate a set of principles relevant to user privacy in the context of electronic contact tracing. in particular, the eu recommendations dictate that all smartphone based contact tracing solutions should:
a) operate on anonymized data with the goal of alerting users that have been in close proximity to confirmed cases without revealing the identity of the index case or the contacts. breach of anonymity and hence disclosure of the identity details of an individual
b) not track the location of the users.
c) be based on voluntary user participation. any unauthorized usage of data without the knowledge or the approval of the user is strictly prohibited.
d) be secure and effective end to end across the entire infrastructure. this includes any centralized components where data are deposited for processing.
e) be interoperable and scalable across a number of countries, as people travel from country to country.
having these requirements as a guide, one of the first conclusions we can derive is that any solution that stores, sends and processes gps and a-gps data is not acceptable from a privacy perspective. a time series of gps coordinates or other network assisted location data (cell tower id) is personal information and, whether deposited partly or completely in a central database server, reveals too much information about a user. research efforts that propose privacy preserving location based contact tracing exist. mit researchers have proposed a contact tracing system based on a method that redacts, transforms and encrypts gps coordinates to address the privacy preservation problem [ ] . the contact tracing is then computed by a process known as private set intersection (psi), a technique commonly employed as part of secure multiparty computing [ ] , aiming to reveal only the common data values that are necessary for the computation.
however, privacy preserving contact tracing techniques that use gps coordinates constitute best effort experimental approaches that need a reference implementation to be tested and proven. an additional practical matter is that of accuracy. gps and a-gps coordinates cannot at the moment provide a sufficient level of accuracy in terms of contact proximity, and this is why most solutions today resort to the use of bluetooth le, even with the problems discussed in section of this paper. a last practical aspect concerns compliance with existing legislation. if the law does not provide a clear framework for the sampling of location data for health related purposes, then it is not possible to employ these techniques, and thus approaches that rely on geolocating the users will be impractical and impossible to implement.
researchers that are proponents of techniques that employ gps coordinates [ ] point out that large companies already collect user location data for operational and advertisement purposes. while this is true [ , ] , there is a distinct difference between geolocating individuals for commercial purposes and doing the same in a health context. apart from the location information, contact tracing solutions of this kind contain references to health status (infected or not infected status of an individual). combining personal location information with health status raises the legal context and regulatory handling requirements of the collected/processed information. for instance, the european data protection supervisor considers all data concerning health as a special category [ , ] for which strict privacy preserving requirements apply when it comes to the handling and processing of the collected information.
the concentration of large amounts of (theoretically) anonymous health related information in central repositories for the purposes of centralized contact tracing solutions [ , ] creates certain risks and operational requirements for the storage and processing of the data.
weaknesses in the anonymity protocols (such as the ones described in section of this paper in connection to the bluetooth le protocol) or in the implementation of infrastructures could place a malicious adversary in a position to collect information that could compromise the privacy of millions of individuals. the handling of large amounts of anonymous (or desensitized) health data predates the electronic contact tracing era and can be observed in other fields of health informatics. a good example is that of genomic medicine, where certain types of genomic data, even if they have been anonymised in principle, still provide distinct probable ways to re-identify the subjects of a study [ ] . for these reasons, access to these types of data requires data consumers to follow certain ethical guidelines that bind them not to use the data in ways that could re-identify the anonymised study subjects and to conform to strict storage and data processing requirements [ ] .
on the other hand, centralized processing requirements are simpler to implement in principle when compared to decentralized contact tracing solutions such as those proposed in [ ] and [ ] . in general terms, the aim of decentralized contact tracing solutions is to reduce the privacy and security impact of having all the necessary data in one place. they still require a minimal centralized component, especially for steps that incorporate the health status (infected or not infected contact); however, the disclosure of information to central entities is minimal by design. this reduces the possibility for abuse of central data repositories. on the other hand, decentralized solutions delegate the processing of information to non trusted devices (the smartphones of the users). this increases implementation complexity. the entire concept has not yet been proven at scale, either in theory or in practice. most existing contact tracing solutions follow the centralized storage and information processing model at the time of writing.
a final consideration has to do with how the central it infrastructure for contact tracing solutions is implemented. there seems to be a certain lack of transparency about how this central part has been implemented. taking norway as an example, a country with a good tradition of respecting the privacy of its citizens and among the first to launch a covid- contact tracing application, no tender processes have been disclosed for awarding public funds to construct the application [ ], calls to open source the application in order to aid review by security experts were denied [ ], and data containing gps, bluetooth le smartphone identifiers and health status were stored with private cloud vendors [ ], with unclear status on whether the data can leave the norwegian geographic border. as a result, the norwegian implementation drew a lot of criticism from many it experts around the world [ ]. this is by no means unique to norway; other countries have faced similar criticism. transparency of data processing, as well as export control of health data, are issues that should be taken seriously, as dictated by pan-european (gdpr) and other international legislation [ ]. besides compliance, choices that limit transparency make public acceptance of a technology difficult. thus, implementing contact tracing technology should be a process with structure and best practices, which are missing at the moment. this structure and these recommended practices form the subject of the next section of this paper.

the previous sections of this paper have highlighted that the existing covid- contact tracing applications have serious problems, both in terms of the reliability of the collected data sets and in terms of preserving end user privacy and security. addressing these problems is not a trivial process and will require substantial efforts towards the creation of standards that oversee the development of contact tracing platforms.
the existing eu recommendations [ ] that were discussed in section of this paper can serve as a good start on a road map that will make electronic contact tracing both usable and acceptable by societies around the world. on the issues of bluetooth le accuracy discussed in section [ ] , there are research and development approaches aiming to increase the proximity sensing accuracy of the protocol. examples of such work can be found in [ ] [ ] . it is also possible that smartphone chipset manufacturers together with future versions of the bluetooth le protocol will add features that will increase the proximity sensing accuracy. however, no matter what technological measures are employed to achieve additional proximity sensing precision, the important thing is to put them to the test in a standard manner. the only reliable way to do this is to set control experiments where a group of individuals using smartphones can create verified/predetermined contacts under a variety of conditions (inside buildings with different level of rf noise environments different contact times and different number of individuals). if the subsequent analysis of the recorded data accurately represents the verified/predetermined conditions within a predetermined statistical accuracy (say less than % for both false positives or negatives) then this means that the data collected by a contact tracing implementation is good enough to be used for the public. launching an application on a national scale without proving the accuracy of the sampled data and verifying it by statisticians and experts can lead to misleading results and should be avoided. when it comes to the rest of the vulnerabilities of the bluetooth le protocol (range on los and software vulnerabilities discussed in section and referenced in [ ] [ ] [ ] [ ] [ ] ), there are various measures to be taken. 
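returning to the control experiments proposed earlier in this section, the comparison between recorded and predetermined contacts can be sketched in python. this is a minimal illustration and not part of any existing contact tracing implementation: the function names, the participant labels and the `max_error` threshold are hypothetical, and the false positive rate is assumed to be measured over all participant pairs that were never in verified contact.

```python
from itertools import combinations


def error_rates(participants, true_contacts, recorded_contacts):
    """Return (false_positive_rate, false_negative_rate) over unordered pairs.

    true_contacts: pairs that were deliberately brought into contact.
    recorded_contacts: pairs that the application logged as contacts.
    """
    all_pairs = {frozenset(pair) for pair in combinations(participants, 2)}
    truth = {frozenset(pair) for pair in true_contacts}
    recorded = {frozenset(pair) for pair in recorded_contacts}

    non_contacts = all_pairs - truth
    false_positives = recorded & non_contacts   # logged, but never happened
    false_negatives = truth - recorded          # happened, but never logged

    fpr = len(false_positives) / len(non_contacts) if non_contacts else 0.0
    fnr = len(false_negatives) / len(truth) if truth else 0.0
    return fpr, fnr


def passes_validation(fpr, fnr, max_error):
    """Hypothetical acceptance rule: both error rates must stay below max_error."""
    return fpr < max_error and fnr < max_error


# Illustrative experiment with four volunteers (labels are made up).
participants = ["a", "b", "c", "d"]
true_contacts = [("a", "b"), ("c", "d")]
recorded_contacts = [("b", "a"), ("a", "c")]

fpr, fnr = error_rates(participants, true_contacts, recorded_contacts)
accepted = passes_validation(fpr, fnr, max_error=0.05)  # 0.05 is an assumed bound
```

the acceptable error level is left as a parameter because the appropriate bound would have to be fixed by whatever standard governs the experiment.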
it is prudent that the bluetooth le power is regulated in a standardized manner when operating a contact tracing application, so that the effective range of the protocol is reduced. setting devices to the lowest power level that still performs proximity sensing reliably will reduce the effective adversarial surveillance range [ ]. in addition, smartphone manufacturers need to do a better job of addressing firmware and mobile operating system vulnerabilities, especially for older smartphone devices. as an example, in the android mobile operating system, critical bluetooth vulnerabilities such as the 'bluefrag' cve- - [ ] affected mainly older versions of the android system for several months. while the vulnerability in question has been patched at the time of writing, not all android device manufacturers have included this patch in their android oem versions. the result is that a substantial number of smartphone users that still operate android version are vulnerable if they use contact tracing and other bluetooth based data exchange applications. thus, it is our view that worldwide or regional regulations should make it mandatory for all smartphone vendors to issue critical system updates throughout the expected life cycle of a smartphone ( - years). drawing upon the eu contact tracing implementation requirements [ ], we advise against the use of any location data (gps, a-gps, cell tower id or other) in electronic contact tracing solutions. apart from conflicts with data protection legislation discussed in section [ , ], we do not see how location data can enhance contact discovery. for the purposes of contact tracing, bluetooth le proximity data are more relevant and accurate than any other form of satellite or network assisted location system.
incorporating location data, even when anonymised/desensitized increases the susceptibility of the collected data to differential privacy attacks [ ] , especially in implementations where the data is centralized and should be avoided. we do not have enough data on existing implementations to recommend whether existing decentralized approaches should be favored over centralized approaches. as discussed in section of this paper there are certain advantages and disadvantages for each of these approaches. decentralized approaches follow the principle of minimizing the amount of information necessary to perform the contact tracing, however they add implementation complexity and require information to be distributed to untrusted entities. while decentralized approaches look promising, they require further theoretical and practical implementation validation by experts, before definite conclusions are drawn. however, as both approaches require some main it infrastructure component beyond the information gathered by smartphones, the following paragraphs discuss some concrete recommendations that can aid the security of electronic contact tracing solutions. section discussed the paradigm of genomic medicine data [ ] and its analogy to that of electronic contact tracing solutions. the common denominator is the presence of a large amount of anonymized health data. whatever cryptographic precautions can be taken to protect the identity of the contact tracing users, this does not change the fact that a large amount of information about public health is stored in one form or another (centralized versus decentralized, different encryption standards). in our view, this should be good enough to treat this kind of anonymous data in the same way as eponymous medical data. this view is supported by existing data classification policies that form part of information security management practices [ ] . 
as an example, the university of oslo, the largest and oldest academic institution in norway, manages large amounts of electronic information, including sensitive eponymous data from the oslo university hospital. for that reason, its information security management system [ ] classifies large amounts of anonymous health data at the highest level of data sensitivity [ ] . this has several implications about how anonymised contract tracing information should be stored and processed. infrastructures that hold eponymous sensitive medical data and have approval by relevant national data protection authorities implement a lot of technical requirements to ensure that the confidentiality, integrity and availability of the sensitive data is safeguarded. drawing from the university of oslo's paradigm, its 'services for sensitive data (tsd)' platform [ ] is a practical implementation that provides these safeguards. elements such as multi-factor authentication [ ] , compartmentalization of computation activities on security hardened virtual machines and storage/backup encryption are some of the techniques employed by tsd. in our view, these should be mandatory technical elements that should form a standard for every core it infrastructure platform that handles electronic contact tracing data at national/international level. in addition, core it infrastructures should comply to gdpr [ ] and possibly the hipaa standard [ ] . compliance to these standards can also aid the interoperability among different national contact tracing solutions across a number of countries and continents. eu requirements dictate that contact tracing solutions should be interoperable [ ] . finally, as the use of cloud computing is increasing and the pressure for healthcare systems to be more cost effective is growing [ ] , there are certain risks associated to placing public health data in the cloud. 
a principal risk is that many large private cloud providers offer a utility service without safeguarding (or even wanting to know) the criticality and importance of the data and the tasks performed in their infrastructure [ ]. when private cloud providers are used for core it contact tracing infrastructure, we recommend three concrete rules. the first is that private cloud providers should comply with the same technical requirements and regulations set out in the previous paragraph. the second, which follows from regulatory compliance, is that private cloud providers should provide it infrastructures within the geographical territory of the country/region if laws dictate that the data should be localized. the third is that an independent cost-risk analysis should be commissioned prior to reaching decisions to store and process contact tracing data exclusively in private cloud providers. a better approach is to adopt hybrid cloud technologies, where a public authority has the option of easily moving the data and compute activities back to its own infrastructure in case it faces legislation or data availability problems.

who announces covid- outbreak a pandemic covid- and italy: what next australian government department of health the chinese qr code scanning based contact tracing application israeli ministry of health, hamagen contact tracing application website smittestopp' contact tracing application website tracetogether' contact tracing application website mobile apps, websites offer real-time data on covid- outbreak contact-tracing apps are not a solution to the covid- crisis brookings techstream coronavirus contact-tracing apps: can they slow the spread of covid- ? 
coronavirus contact-tracing apps struggle to make an impact principles of public health practice use of a mobile application for ebola contact tracing and monitoring in northern sierra leone: a proof-of-concept study understanding gps/gnss: principles and applications lte advanced: gpp solution for imt-advanced smartphones as locative media. st edn, digital media and society series how to view your location history in google maps androidcentral website google corporation, covid- community mobility reports website bluetooth essentials for programmers getting started with bluetooth low energy: tools and techniques for low-power networking theoretical results on de-anonymization via linkage attacks covid- contact tracing apps are coming to a phone near you. how will we know whether they work? bluetooth core specification. v . bluetooth technology protecting your privacy. bluetooth website bluesniff: eve meets alice and bluetooth low-power wireless advertising software library for distributed m m and contextual iot tracking anonymized bluetooth devices bleb: bluetooth low energy botnet for large scale individual tracking why mac address randomization is not enough the inventors of bluetooth say there could be problems using their tech for coronavirus contact tracing experimental performance evaluation of ble versus ble in indoors and outdoors scenarios mobile phone vulnerabilities: a new generation of malware pairing and authentication security technologies in low-power bluetooth a study of the feasibility of co-located app attacks against ble and a large-scale analysis of the current application-layer security landscape the knob is broken: exploiting low entropy in the encryption key negotiation of bluetooth br/edr coronavirus: guidance to ensure full data protection standards of apps fighting the pandemic the eu general data protection regulation (gdpr): a practical guide, first edition the international association of privacy professionals. 
dpa guidance on covid- norwegian data protection authority datatilsynet, declaration on covid- and processing of personal data contact tracing to manage covid- spread -balancing personal privacy and public health coronavirus: an eu approach for efficient contact tracing apps to support gradual lifting of confinement measures assessing disease exposure risk with location data: a proposal for cryptographic preservation of privacy secure multi-party computation: theory, practice and applications european data protection supervisor. the eu's independent data protection authority a community effort to protect genomic data sharing, collaboration and outsourcing genomic data user code of conduct norway: . million people download coronavirus tracking app despite security concerns norway launches virus app to keep contagion under control hundreds of it experts from around the world face tracking apps like norwegian smittestopp, nrk.no ( ) privacy policies, cross-border health data and the gdpr increasing accuracy of bluetooth low energy for distance measurement applications improving ble distance estimation and classification using tx power and machine learning: a comparative analysis security vulnerabilities in bluetooth technology as used in iot the algorithmic foundations of differential privacy chapter : evolution of a profession. in: practical information security management: a complete guide to planning and implementation how to classify data and information the university of oslo website two-factor mutual authentication based on smart cards and passwords healthcare informatics and privacy the problem of 'personal data' in cloud computing: what information is regulated? -the cloud of unknowing key: cord- -ptb dst authors: bilinski, a.; mostashari, f.; salomon, j. a. title: contact tracing strategies for covid- containment with attenuated physical distancing date: - - journal: nan doi: . / . . . 
sha: doc_id: cord_uid: ptb dst contact tracing has been recommended as a critical component of containment strategies for covid- . we used a simple epidemic model to evaluate how contact tracing might enable partial relaxation of current physical distancing restrictions. testing and tracing coverage need to exceed % in order for contact tracing to reduce transmission by at least %. with high isolation and quarantine efficacy, contact tracing could reduce overall transmission by > %, which would allow for substantial loosening of physical distancing measures. benefits of contact tracing could be enhanced by testing all contacts rather than only those with symptoms and by policies to support high adherence to voluntary isolation and quarantine. as of may , the novel coronavirus sars-cov- has infected more than . million people worldwide and caused over , deaths. since march, most communities in the united states have been living under physical distancing measures including stay-at-home orders in states. evidence suggests that these mitigation efforts have slowed the spread of the virus in many jurisdictions. , a number of frameworks have been proposed for the safe relaxation of non-pharmaceutical interventions, and most include scaling up testing and contact tracing to support containment. [ ] [ ] [ ] [ ] [ ] guidelines for contact tracing call for identifying and monitoring individuals who have been in close contact with confirmed positive cases, facilitation of testing for symptomatic contacts, and counseling and follow-up to encourage voluntary self-isolation, quarantine and symptom monitoring. , several previous papers have considered the role of contact tracing for containment of covid- , [ ] [ ] [ ] [ ] but important questions remain about potential impact given uncertainty around the extent of presymptomatic and asymptomatic transmission of sars-cov- and the efficacy of voluntary isolation and quarantine. 
as decision-makers look toward relaxing current physical distancing measures, there is an urgent need to quantify the degree to which contact tracing programs could allow for partial loosening of restrictions while maintaining control over resurgent infection. this paper uses a simple model to evaluate different contact tracing strategies to support modification of physical distancing restrictions. we examine the necessary conditions for maximizing benefits of contact tracing. we consider how broadening current testing guidelines from the centers for disease control and prevention to include testing for contacts without symptoms could amplify the impact of contact tracing programs. we developed a simple deterministic markov branching model of covid- ( figure s ). epidemiological parameters were adapted from prior modeling studies where available (table s ). infected individuals generate new infections based on whether they have symptoms, whether disease is detected, and whether they have been identified as a contact of an infected individual (table s ). in our base case analysis we assumed that % of infections are asymptomatic, [ ] [ ] [ ] [ ] and that confirmed cases have % lower rates of transmission than unconfirmed cases; we considered alternatives to both of these assumptions in sensitivity analyses. we assumed that symptomatic cases become infectious prior to emergence of symptoms. , estimates on the effectiveness of contact tracing vary considerably. , we modeled an array of different scenarios in order to characterize prerequisites for effective contact tracing, as well as to evaluate different possible policy priorities. we defined scenarios by the fraction of symptomatic cases detected in the community (not linked to a tracked case), the fraction of contacts successfully traced, the isolation and quarantine efficacy among traced but undetected contacts, and whether testing was restricted to those with symptoms or includes all traced contacts ( table ) . 
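to make the structure of such a model concrete, the following python sketch computes rt as the dominant eigenvalue of a two-type next-generation matrix (not traced versus traced cases). it is one possible reading of the description above, not the authors' implementation. the symbols a, k, p and q follow the parameter definitions given in the supplemental information; every numeric value is illustrative, and three simplifications are assumed: symptomatic and asymptomatic cases share the same transmission level, only contacts of detected cases can be traced, and a traced case that is also detected transmits at the lower of the two reduced rates.

```python
import math


def offspring(a, k_sym, k_asym, r_detected, r_undetected):
    """Expected offspring of one case, split by whether the case is detected."""
    frac_detected = (1 - a) * k_sym + a * k_asym
    return frac_detected * r_detected, (1 - frac_detected) * r_undetected


def effective_r(a, p, k_ns, k_na, k_ts, k_ta, r_u, q, efficacy):
    """Dominant eigenvalue of the 2x2 next-generation matrix (assumed model)."""
    # Offspring of a case that was not itself traced.
    n_det, n_undet = offspring(a, k_ns, k_na, q * r_u, r_u)
    # Offspring of a traced case; isolation/quarantine efficacy lowers spread.
    t_det, t_undet = offspring(
        a, k_ts, k_ta, min(q, 1 - efficacy) * r_u, (1 - efficacy) * r_u
    )
    # m_xy = expected number of type-y infections caused by one type-x case.
    # Only contacts of detected cases are traced, each with probability p.
    m_nn = (1 - p) * n_det + n_undet
    m_nt = p * n_det
    m_tn = (1 - p) * t_det + t_undet
    m_tt = p * t_det
    trace = m_nn + m_tt
    det = m_nn * m_tt - m_nt * m_tn
    disc = max(0.0, trace * trace - 4 * det)  # guard against rounding error
    return (trace + math.sqrt(disc)) / 2


# Illustrative values only; none are taken from the paper.
params = dict(a=0.4, k_ns=0.5, k_na=0.0, k_ts=0.9, k_ta=0.0,
              r_u=1.5, q=0.5, efficacy=0.5)
rt_without = effective_r(p=0.0, **params)  # no contact tracing
rt_with = effective_r(p=0.5, **params)     # half of eligible contacts traced
reduction = 1 - rt_with / rt_without       # fractional reduction in rt
```

with p set to zero the traced type never arises, so the result collapses to the average number of secondary infections per untraced case, which gives a simple sanity check on the matrix construction.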
given the likely importance of levels of community testing as a prerequisite condition for contact tracing, we conducted a secondary analysis that quantified the combined benefit of scaling up both testing and contact tracing against a counterfactual in which detection of symptomatic cases remains constant at an assumed current fraction of %. we evaluated the impact of different contact tracing strategies in terms of the percentage reduction in the effective reproductive number rt (average number of secondary infections produced by each infection) under each contact tracing scenario, compared to a scenario without contact tracing. assuming that contact tracing strategies would be implemented alongside policy changes to partially relax physical distancing measures, and that the containment phase would begin when rt was less than or equal to . , reductions in rt can be used to compute the containment margin for a given strategy. the containment margin signals how much current physical distancing measures could be relaxed in the presence of contact tracing, while maintaining rt below the critical threshold of . .

base-case results are summarized in figure (see figure s for details). both community detection of symptomatic cases that are not linked to a tracked case and successful tracing of contacts needed to be at least % in order for contact tracing programs to reduce rt by more than % compared to corresponding scenarios without contact tracing. testing asymptomatic contacts may substantially increase the impact of contact tracing strategies. across all scenarios with adequate fractions (≥ %) of symptomatic cases detected in the community and contacts traced, testing asymptomatic contacts increased the benefit of contact tracing by a median factor of . , with a range from . to . . benefits of asymptomatic testing were substantial in all scenarios except those in which efficacy of isolation and quarantine was already maximized for all contacts.
the overall impact of contact tracing depends strongly on isolation and quarantine efficacy. median reductions in rt assuming isolation and quarantine efficacy of %, % or % were %, % and % respectively, for strategies that tested only symptomatic contacts, and %, % and % for strategies that tested all contacts. the contact tracing scenario with the greatest impact overall-defined by high levels of symptomatic detection and successful tracing, high isolation and quarantine efficacy, and testing of all contacts irrespective of symptoms-reduced rt by %. in a sensitivity analysis (figures s /s ), we considered how the potential impact of contact tracing strategies might vary if the percentage of cases without symptoms was only % rather than %, as assumed in our base case. a lower asymptomatic fraction increased effectiveness for all scenarios, by a median factor of . , and a range from . to . . we also conducted secondary analyses to evaluate the combined effect of scaling up both testing and contact tracing against the counterfactual of persistently limited testing at % of symptomatic cases and no contact tracing (figures s /s ). accounting for both expanded testing and contact tracing together, the maximum reduction in rt increased to %, and the benefits in many scenarios were at least percentage points greater than the benefits of contact tracing alone. when we further varied the relative transmission rate associated with detected compared to undetected cases from % (as in base case) to %, results were largely similar, with a maximum reduction of %, and higher gains at lower levels of isolation and quarantine efficacy. to translate results into implications for potential modification of current policies, we used the percentage reductions in rt from each contact tracing scenario to compute a corresponding containment margin, which indicates how much current physical distancing measures could be relaxed with contact tracing in place, while holding rt below . . 
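the containment margin calculation can be sketched in python as follows. this is a minimal sketch under stated assumptions rather than the authors' code: the function name and all numeric inputs are hypothetical, the containment threshold defaults to the conventional epidemic threshold of 1.0, and partially relaxed distancing is assumed to move the reproductive number linearly between its unmitigated value and its current value.

```python
def containment_margin(r0, rt_current, tracing_reduction, threshold=1.0):
    """Fraction of current distancing effectiveness still required.

    r0: reproductive number with no distancing (illustrative input).
    rt_current: reproductive number under current, full distancing.
    tracing_reduction: fractional reduction in rt from contact tracing.
    A result above 1 means that current measures plus tracing are not enough.
    """
    if not 0 <= tracing_reduction < 1:
        raise ValueError("tracing_reduction must be in [0, 1)")
    if rt_current >= r0:
        raise ValueError("rt_current must be below r0")
    # Highest rt that distancing alone may leave, so that contact tracing
    # still pulls the final value down to the threshold.
    allowed_rt = threshold / (1 - tracing_reduction)
    if allowed_rt >= r0:
        return 0.0  # tracing alone suffices; distancing could be fully lifted
    # Linear interpolation between no distancing (r0) and full distancing.
    return (r0 - allowed_rt) / (r0 - rt_current)


# Illustrative numbers only; none are taken from the paper.
margin_strong = containment_margin(r0=2.5, rt_current=1.0, tracing_reduction=0.4)
margin_weak = containment_margin(r0=2.5, rt_current=1.0, tracing_reduction=0.2)
```

under these assumed inputs the weaker tracing strategy requires a larger share of current distancing to be kept, which is the qualitative relationship the containment margin is meant to capture.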
as an example, assume that current physical distancing measures have reduced the reproductive number from r = . to rt = . , and that a contact tracing strategy could reduce rt by %. under these parameters, containment would be possible if relaxed physical distancing measures on their own could maintain rt below . , because the further reduction by a factor of . due to contact tracing would bring rt below . . this implies that together with contact tracing, physical distancing measures could be applied at % of their current, full implementation effectiveness and still maintain the critical containment threshold of rt< . if rt has been reduced to levels well below . , the containment margin is greater (i.e. physical distancing measures could be further relaxed); if contact strategies are less effective the margin for loosening physical distancing shrinks. for example, if a contact tracing strategy were half as effective, producing a reduction in rt of %, physical distancing measures could only be reduced to % of their current intensity. further examples of containment margins under different assumptions about r and the benefits of contact tracing are provided in table . in this study, we computed expected reductions in the effective reproductive number, rt, under different contact tracing scenarios to quantify the degree to which contact tracing can allow for modification of public health orders and physical distancing restrictions while maintaining containment. to support containment, contact tracing must be implemented in concert with wide-scale community testing and must successfully track a high fraction of infected contacts. our results indicate that contact tracing will support a partial relaxation of physical distancing measures but not a complete return to levels of contact prior to physical distancing, consistent with prior studies. 
, for example, a recent paper estimated that adding contact tracing to self-isolation could reduce rt by - %, assuming % compliance, which is similar to the ranges estimated in our analysis. testing of asymptomatic contacts would substantially magnify potential benefits by extending the coverage of tracing and potentially contributing to improved efficacy of isolation and quarantine. another potential benefit of testing asymptomatic contacts, not captured in our model, is that negative test results could reduce the number of people needing to quarantine presumptively or could reduce the duration of quarantine, which might produce positive health and economic impacts. for contact tracing to be most effective, broadening testing guidelines to include asymptomatic contacts will be important once testing capacity bottlenecks are addressed. the benefits of contact tracing also depend substantially on levels of adherence to isolation and quarantine among traced cases, which could be enhanced through policies such as providing voluntary out-of-home accommodations and income replacement. limitations of this analysis include a simplified modeling framework that lacks network or household structure, and also does not explicitly capture nursing homes, work places, or other potentially high-transmission venues. we furthermore did not model the impact of broader testing of asymptomatic individuals other than traced contacts, which would increase the coverage and impact of a containment strategy. many uncertainties persist, including the extent of asymptomatic prevalence and transmission. nevertheless, by examining a range of scenarios that reflect key sources of uncertainty and policy-relevant variables, we provide benchmarks that can aid in developing evidence-based containment strategies to minimize the risk of resurgent covid- spread. . cc-by-nc . 
international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted may , . . https://doi.org/ . / . . . doi: medrxiv preprint

supplemental information

figure s . model structure and parameters. parameter definitions: a is the fraction of infections that are asymptomatic; k is the fraction of infections that are detected; r is the number of secondary infections from each infection and p is the fraction of cases that are successfully contact traced. for parameters indexed by subscripts: t is contact traced, n is not contact traced; s is symptomatic, a is asymptomatic; d is detected, u is undetected.

fraction of infections that are asymptomatic (a) estimates vary across studies. 
[ ] [ ] [ ] [ ] alternative value of % examined in sensitivity analysis ( figure s ).
symptomatic, not contact traced (kns) varies values shown in table
asymptomatic, not contact traced (kna) % assumed to be negligible based on current us testing guidelines
symptomatic, contact traced (kts) % assumption, reflecting referral to testing for traced contacts
asymptomatic, contact traced (kta) % assumption, reflecting referral to testing. applies only in contact tracing strategies that include testing for asymptomatics (see table ).
computed see table s for details. values shown in table .
relative number of secondary infections from detected infections compared to undetected infections (q) . limited empirical data, rationale for reduced secondary transmission includes: potentially increased likelihood of adherence to self-isolation, targeting of confirmed cases for public health support. alternative value of . examined in sensitivity analysis ( figure s ).
calibrated values calibrated to produce baseline rt= . note that relative reductions in secondary infections across program comparisons are scale invariant.
values shown in table . isolation and quarantine efficacy is approximately the product of how much infectious time remains when the contact is notified, and the degree of adherence to isolation and quarantine measures. estimates of adherence have ranged considerably in previous studies ( - %) , including % and % in previous covid- analyses. remaining infectious time is difficult to measure, but likely less than . , a prior modeling study used efficacy estimates of % for a 'low-feasibility setting' and % for a 'high-feasibility setting.'
figure s . results from sensitivity analysis assuming % of cases are asymptomatic. the horizontal axis shows the fraction of symptomatic cases that are detected in the community. the vertical axis shows the primary measure of strategy effectiveness: the percentage reduction in rt in the contact tracing scenario relative to rt without contact tracing. the color of the lines within each panel indicates the fraction of contacts that are successfully traced. 'isolation and quarantine efficacy' refers to the level of reduction in transmission rates from traced, undetected contacts.

figure s . results from secondary analysis on combined effects of scaling testing and contact tracing. the horizontal axis shows the fraction of symptomatic cases that are detected in the community. the vertical axis shows the primary measure of strategy effectiveness: the percentage reduction in rt in the increased testing plus contact tracing scenario relative to rt without increased testing or contact tracing. the color of the lines within each panel indicates the fraction of contacts that are successfully traced. 'isolation and quarantine efficacy' refers to the level of reduction in transmission rates from traced, undetected contacts. 
the first two columns show results that maintain the base-case assumption that detected infections produce % as many secondary infections as undetected infections in the same category (presymptomatic, symptomatic, asymptomatic). the second two columns show results for an alternative in which detected cases produce % as many secondary infections as undetected infections.

figure s . heatmap representation of figure . the horizontal axis shows the fraction of symptomatic cases that are detected in the community. the vertical axis shows the fraction of contacts that are successfully traced. the values and shading in each cell indicate the percentage reduction in rt in the contact tracing scenario relative to rt with no contact tracing. 'isolation and quarantine efficacy' refers to the level of reduction in transmission among traced, undetected contacts.

figure s . heatmap representation of figure s : sensitivity analysis assuming % of cases are asymptomatic. the horizontal axis shows the fraction of symptomatic cases that are detected in the community. the vertical axis shows the fraction of contacts that are successfully traced. the values and shading in each cell indicate the percentage reduction in rt in the contact tracing scenario relative to rt with no contact tracing. 'isolation and quarantine efficacy' refers to the level of reduction in transmission among traced, undetected contacts.
figure s . heatmap representation of figure s : secondary analysis on combined effects of scaling testing and contact tracing. the horizontal axis shows the fraction of symptomatic cases that are detected in the community. the vertical axis shows the fraction of contacts that are successfully traced. the values and shading in each cell indicate the percentage reduction in rt in the increased testing plus contact tracing scenario relative to rt without increased testing or contact tracing. 'isolation and quarantine efficacy' refers to the level of reduction in transmission among traced, undetected contacts. the first two columns show results that maintain the base-case assumption that detected infections produce % as many secondary infections as undetected infections in the same category (presymptomatic, symptomatic, asymptomatic). the second two columns show results for an alternative in which detected cases produce % as many secondary infections as undetected infections.

coronavirus map: tracking the global outbreak. the new york times incidence, clinical outcomes, and transmission dynamics of hospitalized coronavirus disease among , , individuals residing in california and washington, united states: a prospective cohort study the effect of control strategies to reduce social mixing on outcomes of the covid- epidemic in wuhan, china: a modelling study.
the lancet public health covid- us state policy database roadmap to pandemic resilience national coronavirus response: a road map to reopening resource estimation for contact tracing, quarantine and monitoring activities for covid- cases in the eu/eea comparative impact of individual quarantine vs. active monitoring of contacts for the mitigation of covid- : a modelling study feasibility of controlling covid- outbreaks by isolation of cases and contacts. the lancet global health effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov- in different settings quantifying sars-cov- transmission suggests epidemic control with digital contact tracing getting a handle on asymptomatic sars-cov- infection spread of sars-cov- in the icelandic population key: cord- -bgxc ti authors: wu, yan; song, shujuan; kao, qingjun; kong, qingxin; sun, zhou; wang, bing title: risk of sars-cov- infection among contacts of individuals with covid- in hangzhou, china date: - - journal: public health doi: . /j.puhe. . . sha: doc_id: cord_uid: bgxc ti abstract objectives this study determined the rate of secondary infection among contacts of individuals with confirmed covid- in hangzhou according to the type of contact, the intensity of the contact, and their relationship with the index patient. study design retrospective cohort study. methods the analysis used the data of , contacts of individuals with confirmed sars-cov- infection. the contacts were categorized according to the information source, type of contact, location, intensity of contact, and relationship with the index patient. results the incidence of infection differed significantly according to contact type. of the contacts, ( . %) developed symptoms and ( . %) had confirmed infection with sars-cov- . the main symptoms were cough and fever. compared to those who had brief contact with the index case, those who had dined with the index case had a . 
times greater risk of infection; those who had shared transport, visited, or had contact with the index case in a medical institution had a . times greater risk of infection; and household contacts had . times greater risk of infection. family members had a . times greater risk of infection than healthcare providers or other patients exposed to an index case. conclusions the form and frequency of contact are the main factors affecting the risk of infection among contacts of individuals with covid- . centralized isolation and observation of close contacts of individuals with confirmed sars-cov- infection, in addition to population-based control measures, can reduce the risk of secondary infections and curb the spread of the infection. january , , zhejiang province was among the first provinces to declare a major public health emergency and introduced ten policies including vigorously promoting public awareness on epidemic prevention, restricting public gatherings, and taking measures to prevent hospital-acquired infections to prevent the transmission of sars-cov- infection. after january , , the number of imported cases in hangzhou declined rapidly, and the majority of the cases were local cases, indicating that the prevention and control measures taken had produced effective results.
on continuous data were summarized as medians and interquartile ranges, and t-tests were used for intergroup comparisons. p-values < . were considered to be statistically significant. the incidence rate of contacts with data collected by field investigation was significantly higher than that of contacts with data collected by big data ( . % versus . %, p< . ). the geographical distribution of close contacts in the districts and counties of hangzhou is shown in supplementary figure s . during the observation period, of the ( %) individuals with symptoms were confirmed to have sars-cov- infection, of whom ( %) had a last exposure-onset interval of < days. the incidence rate of sars-cov- infection in the group with symptoms was significantly higher than that in the group with no symptoms ( . % versus . %, p< . ). the most frequently reported abnormal symptoms were cough ( . %), fever ( . %), sore throat and rhinorrhea ( . %). an additional contacts ( . %) were infected with sars-cov- but remained asymptomatic. the overall incidence of infection among the contacts was . %. there was no significant difference in covid- incidence among the close contacts according to age or sex, but significant differences were found according to the level of protection, type of contact, relationship with the index patient, and contact location. the results in table show that the infection rate among those living in the same household as the index case was . times higher than that of individuals who had only had brief contact with the index case. compared to those who only had brief contact with the index case, those who had dined with the index case had a . times greater risk of infection, and those who had shared transport, visited, or had contact with the index case in a medical institution had a . times greater risk of infection. among the relationships of contacts, family members had the highest risk of infection, with .
times greater risk of infection than healthcare providers or other patients who had been exposed to an index case. in terms of contact locations, the infection rate among those who had contact with the index case in or near his/her home was . times higher than that among those who had contact with the index case in a medical institution; and the infection rate of those who had contact with the index case through work, study, or in a place of entertainment was . times that among those who had contact with the index case in a medical institution. the incidence of disease among contacts according to age and sex was consistent with the variation in disease incidence according to age and sex in the population as a whole. the incidence rate among those who wore facemasks was significantly lower than that among those who did not use protective measures ( . in the process of case investigation, the hangzhou government took full advantage of big data technology in combination with a grid management mechanism to trace cases, analyze transmission routes, and efficiently collect information on close contacts. of the contacts identified, . % were identified using big data. this improved the screening efficiency of contacts and reduced the potential for recall bias or intentional concealment. in this way, contact screening was relatively complete. digitized epidemic prevention and control measures are likely to become more widely used in the future. not required (this research does not involve animals and human material and rights.)
a novel coronavirus outbreak of global health concern a novel coronavirus from patients with pneumonia in china coronavirus disease (covid- ) situation report a familial cluster of pneumonia associated with the novel coronavirus indicating person-to-person transmission: a study of a family cluster epidemiological characteristics of coronavirus diseases in zhejiang province protocol for covid- monitoring, prevention and control in zhejiang province national health commission of the people's republic of china epidemiological investigation of the first reported case of coronavirus disease (covid- ) in zhejiang province the novel coronavirus originating in wuhan, china: challenges for global health governance special expert group for control of the epidemic of novel coronavirus pneumonia of the chinese preventive medicine association. an update on the epidemiological characteristics of novel coronavirus pneumonia (covid- ) none declared. key: cord- -qp kq authors: klopfenstein, lorenz cuno; delpriori, saverio; francesco, gian marco di; maldini, riccardo; paolini, brendan dominic; bogliolo, alessandro title: digital ariadne: citizen empowerment for epidemic control date: - - journal: nan doi: nan sha: doc_id: cord_uid: qp kq the covid- crisis represents the most dangerous threat to public health since the h n influenza pandemic of . so far, the disease due to the sars-cov- virus has been countered with extreme measures at national level that attempt to suppress epidemic growth. however, these approaches require quick adoption and enforcement in order to effectively curb virus spread, and may cause unprecedented socio-economic impact. a viable alternative to mass surveillance and rule enforcement is harnessing collective intelligence by means of citizen empowerment. mobile applications running on personal devices could significantly support this kind of approach by exploiting context/location awareness and data collection capabilities. 
in particular, technology-assisted location and contact tracing, if broadly adopted, may help limit the spread of infectious diseases by raising end-user awareness and enabling the adoption of selective quarantine measures. in this paper, we outline general requirements and design principles of personal applications for epidemic containment running on common smartphones, and we present a tool, called 'diary' or 'digital ariadne', based on voluntary location and bluetooth tracking on personal devices, supporting a distributed query system that enables fully anonymous, privacy-preserving contact tracing. we look forward to comments, feedback, and further discussion regarding contact tracing solutions for pandemic containment. the novel coronavirus sars-cov- and its rapid spread have established a pandemic of global proportions over the course of the first months of . high fatality rates detected in the first affected regions are expected to be even higher in countries with an older population, low-income, or lack of suitable health-care facilities [ ] . in the absence of a viable vaccine, so far, the spreading of the disease due to the sars-cov- virus has been countered in many countries with countermeasures that attempt to suppress epidemic growth, thus avoiding to overwhelm the healthcare system with an unmanageable number of patients. the reduction of contagion is achieved through a set of increasingly severe measures that limit personal freedom and entail strong socio-economic drawbacks, going well beyond mass gathering prohibition and case isolation. social distancing rules, including school and university closure, household quarantine, internal and cross-border mobility constraints, and selective closure of non-essential productive and commercial activities, have brought many countries to complete lockdown [ ] . all these approaches require quick adoption and strict enforcement, in order to effectively curb virus spread in the short term. 
as observed in the pandemic, there is a strong correlation between excess mortality and earliness of containment measures. containment interventions that are introduced too late or lifted too early were shown to have very limited effect [ ] . in the long term, governments are required to trade off the adoption of dramatic full lockdown measures against more lax interventions, for instance by progressively adopting temporary small-scale contagion suppression actions that aim at keeping the virus' reproduction number, r , at a level that does not exceed the healthcare system's capacity. adaptive adoption of these kinds of containment policies at a regional level is expected to be effective even if enforced for shorter periods of time [ ] . the triggering of circumscribed quarantine measures can be directed through the widespread adoption of technological tools that allow tracing contacts and interactions with known cases of contagion [ ] . mobile apps running on personal smartphones are especially attractive as solutions because they enable immediate deployment on existing hardware and a quick response [ ] . several approaches of this kind have been proposed over the course of the last weeks, giving rise to a growing debate around privacy implications and potential risks of mass surveillance and stigmatization [ , ] , which has prompted authorities to provide recommendations and guidelines [ ] , and big players to develop ad hoc cross-platform protocols [ ] . in this paper, we suggest that these contact tracing tools should be designed to support end-user empowerment, as opposed to mass surveillance, granting citizens more data, awareness, and control, as envisioned by nanni et al. [ ] . in section , we outline the basic requirements and the founding principles on which they should be based. in section , we present a location/contact tracing solution, composed of a mobile app and a distributed query system, designed to meet these critical requirements.
the proposed system allows individuals to keep track of movements and contacts on their own private devices and to use local traces to select relevant notifications and alerts from health authorities, thus completely eschewing, by design, any risk of surveillance. taking end-user empowerment as the founding principle, in this section we outline requirements and design principles that address both regulatory and technical issues. compliance with national and international regulation is essential to protect natural persons and their fundamental rights and freedoms, while technical requirements are mainly meant to reconcile dependability needs with the features of general-purpose personal devices, characterized by software fragmentation, hardware diversity, variety of non-exclusive usage modes, lack of calibration, limited resources, and untrained users. a. collective intelligence. systems based on the voluntary participation of individuals, performing a collective effort in their pursuit of a common goal, leverage a form of collective intelligence, which is the only alternative to mass surveillance and enforcement. ict solutions should support and encourage such collaborative behaviors. b. social responsibility. individual participation in a collective effort towards a common goal is an act of social responsibility. the technology adopted must make the social value of end-user's behavior clearly perceptible. c. awareness and control. technology is not infallible and systems may not always behave as expected. mobile apps should not induce end-users to simply rely on them. rather, they should empower end-users by granting them control and awareness of the data gathering process and by allowing them to browse their data and possibly add spontaneous notes. d. privacy and anonymity by design. protection of sensitive data, such as locations or health-related information, cannot rely exclusively on trust or security promises. 
the system must be designed to keep user data private at all times, ideally storing them exclusively on the user's device, and to make identification impossible a posteriori. e. technology agnosticism. contact tracing is a challenging task. in spite of the many approaches that have been proposed, no single technology has proven to offer the ultimate solution. for instance, location services have limited accuracy, especially indoors, while bluetooth proximity does not reveal the exposure to indirect contagion (through infected surfaces). the solution of choice should exploit all available technologies and be open to any improvement or integration. f. effectiveness. the effectiveness of containment measures based on the voluntary adoption of a mobile app strongly depends on the percentage of the population making proper use of that app. although this is always true, each solution has to be evaluated in different scenarios, including those well below the nominal critical mass of the target technology. two types of performance indicators have to be used, to measure the support that the app can provide both to end-users tested/diagnosed positive for covid- , willing to cooperate with health authorities, and to all other individuals possibly infected by them, who should take timely countermeasures. g. interoperability. interoperability must be pursued as much as possible, in order to reduce the critical mass requirements of each single system and to fully exploit their potential. to this purpose, open standard protocols should be preferred to closed ad hoc ones, cooperation among institutions must be technically supported, and integration with synergistic healthcare systems must be enabled. h. openness of source code. open-source access to all system components is the key to speed up development, ensure continuous improvement, and guarantee coherence between specification and implementation.
transparency is essential both for end-users and for health authorities possibly adopting the solution. i. openness of statistical data. statistical data can provide valuable information to evaluate the effectiveness of epidemic containment, to monitor contagion, and to drive timely decisions. all statistical information that can be provided by end-users on a voluntary basis, without jeopardizing their privacy and anonymity, is worth being gathered and made available as an open dataset. open data enables study and research without providing questionable competitive advantages to any player. j. avoidance of false alarms. the ultimate goal of contact tracing systems is to reach susceptible or asymptomatic individuals who are considered to be the target of specific measures (e.g., quarantining or testing) according to the containment policies adopted. the solution adopted must minimize unneeded alarms that can overwhelm the healthcare system and spread panic. l. scalability. the higher the adoption rate, the more effective the solution. hence, scalability is a key requirement. since the target devices, i.e., smartphones, have their own storage, computation, sensing, and communication resources, scalability can be inherently achieved by exploiting local resources as much as possible without triggering any network effect.

digital ariadne or 'diary' is a privacy-preserving open-source tool, developed by digit srl and the university of urbino, that allows users to trace their movements and contacts, while also allowing governments or healthcare agencies to rapidly direct their epidemic containment efforts, in a way that aligns with the principles outlined above.
the system is composed of: a mobile application, that is voluntarily installed by users on their smartphones, keeping track of their locations through the device's gps sensor and interactions with other users through bluetooth radio beacons, a privacy-aware reward system, which incentivizes app usage while collecting anonymous usage information to feed an open data set, and a distributed query system that allows recognized public authorities to selectively and anonymously notify users about possible contagion sources. the mobile app works in background, with careful usage of battery and storage and without impairing the functioning of the personal devices. nonetheless, it provides a rich user interface to make end-users fully responsible and aware of their own contribution to epidemic containment. source code of the mobile applications and the back-end service, both currently in active development, is available on github . the digital ariadne mobile application is developed using the flutter framework for apple ios and google android. a combination of movement detection with the built-in accelerometers, activity recognition, data from gps sensors, and bluetooth low energy (ble) transmission is used to adaptively track the user's movements and interactions without negatively impacting battery and storage capacity of the device. traces are collected autonomously by a background service launched by the application, but end-users can decide at any time to interact with the app to browse stored data, to force sampling, to mark known locations or to add notes. tracking status is displayed on the main app screen, as shown in figure . three concentric circles, representing the hours of the current day, show the detected movements, the amount of time spent in known locations, and the notes added by the end-user. this information is stored for a maximum duration of days on the device.
the app requires approximately mb of data to store information about a day of full tracking (this requirement may slightly increase in case of frequent movement). the app, once activated by the user, starts tracking the device's location and records detected positions and movements. location tracking is adaptive, based on the user's activity and speed, in order to provide sufficient precision in the case of movement and low battery consumption otherwise. the user may voluntarily mark known locations through the app interface, thus specifying places such as home, workplace, school, or any other locations. access to and departure from these locations are detected through geofencing and allow the user to have a quick overview of his or her movements throughout the day. also, the user may decide to add notes to a specific location and time of the day, in order to remember specific events or situations that may be relevant for contact tracing purposes. user movements, known locations, and notes are shown on an interactive map like in figure . the diary app makes use of the temporary contact numbers (tcn) contact tracing protocol in order to broadcast randomly-generated and anonymous identifiers, which are updated every few minutes [ ] . all installations of the app keep a fully-local log of every identifier that has been broadcast and every identifier that was received from other devices making use of the same tcn protocol. this data, which expires together with location data after days, never leaves the phone and is not linked to private user information. the digital ariadne system makes use of a server-side component that is used to collect daily usage statistics in an anonymous fashion. users are never required to identify themselves and no user-identifying information is transmitted at any point.
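the fully local identifier log described above can be pictured with a minimal sketch. this is not the diary implementation and it does not implement the tcn protocol's key derivation: the class and method names are hypothetical, identifiers are plain random bytes used as a stand-in, and the retention period is a parameter because the number of days is not recoverable from this text.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Iterable, Set


@dataclass
class LocalContactLog:
    """On-device log of broadcast and received identifiers (hypothetical sketch)."""
    retention: timedelta  # assumption: the caller supplies the retention period
    broadcast: Dict[bytes, datetime] = field(default_factory=dict)
    received: Dict[bytes, datetime] = field(default_factory=dict)

    def new_broadcast_identifier(self, now: datetime) -> bytes:
        # Stand-in for a rotating anonymous identifier; NOT the real TCN derivation.
        identifier = secrets.token_bytes(16)
        self.broadcast[identifier] = now
        return identifier

    def record_received(self, identifier: bytes, now: datetime) -> None:
        # Remember the most recent time this identifier was heard nearby.
        self.received[identifier] = now

    def purge_expired(self, now: datetime) -> None:
        # Drop every entry older than the retention period.
        cutoff = now - self.retention
        self.broadcast = {k: t for k, t in self.broadcast.items() if t >= cutoff}
        self.received = {k: t for k, t in self.received.items() if t >= cutoff}

    def matches(self, published: Iterable[bytes]) -> Set[bytes]:
        # Compare published identifiers with the local log; nothing is uploaded.
        return set(published) & set(self.received)
```

in this sketch the comparison runs entirely on the device, so a published list of identifiers can be checked without the log ever leaving the phone.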
communication between mobile applications and the back-end makes use of secure connections using widely adopted standards (https with optional certificate pinning). an anonymous installation id is generated upon the first launch of the mobile application. this id is a randomly generated uuid and is used only to distinguish individual installations for statistical data collection and aggregation. installation ids are not linked to private user information or device characteristics. daily statistics include, for each installation:

collected information, made available as an open data set, gives an indication of how the mobile app is used, allowing researchers and policy makers to gauge the effectiveness of measures adopted at regional or national level. while personal data never leaves the user's device and collected statistical data cannot be used to identify users, digital ariadne is designed to give designated territorial or national authorities access to the system through a dashboard allowing them to publish epidemic-related "calls to action". calls to action can be seen as geographical and temporal distributed queries that operate with the following process: (a) an authority creates a new call to action based on a sequence of geolocated and timestamped points and/or a set of temporary contact numbers, (b) the call to action is stored by the back-end service until it expires, (c) the mobile app automatically downloads relevant calls to action, (d) the mobile app matches calls to action to private location and contact data, in order to verify whether the user has been exposed to possible sources of contagion, (e) if there is a match, the user is privately notified and can access information associated with the call to action. matching users may also directly choose to interact with the public authority, optionally disclosing part of their traces.
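steps (d) and (e) of the process above can be sketched as a purely local check. the sketch below is an illustration under stated assumptions, not the diary code: the type and function names are hypothetical, each call to action is assumed to carry one polygonal region, one time window, and a set of temporary contact numbers, and the point-in-polygon test treats latitude and longitude as planar coordinates, which is only reasonable for small regions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, Sequence, Tuple


@dataclass(frozen=True)
class TracePoint:
    lat: float
    lon: float
    when: datetime


@dataclass(frozen=True)
class CallToAction:
    polygon: Tuple[Tuple[float, float], ...]  # (lat, lon) vertices, hypothetical layout
    start: datetime
    end: datetime
    tcns: FrozenSet[bytes]
    message: str


def point_in_polygon(lat: float, lon: float,
                     polygon: Sequence[Tuple[float, float]]) -> bool:
    """Ray-casting test on planar coordinates (an approximation)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (lon1 > lon) != (lon2 > lon):
            crossing_lat = lat1 + (lon - lon1) * (lat2 - lat1) / (lon2 - lon1)
            if lat < crossing_lat:
                inside = not inside
    return inside


def is_match(call: CallToAction, trace: Iterable[TracePoint],
             received_tcns: Iterable[bytes]) -> bool:
    """True if any received identifier or any in-window trace point matches."""
    if call.tcns & set(received_tcns):
        return True
    return any(call.start <= p.when <= call.end
               and point_in_polygon(p.lat, p.lon, call.polygon)
               for p in trace)
```

only the boolean result is needed to decide whether to notify the user, so neither the trace nor the received identifiers have to be sent anywhere for the check to work.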
thus, a call to action is composed of a series of geographical regions (geo-polygons), associated time intervals, and a series of temporary contact numbers (tcns). the match is performed by checking whether the user has been within the indicated region in a given time period, or if any tcn is found among local records. sensitivity of the match can be fine-tuned by the health authority, by indicating a maximum distance from the region and a minimum time interval of match (i.e., exposure) in order to alert the user, with the intent of reducing panic and avoiding unnecessary alarms.

creating calls to action in the case of contagion

when diary users are positively diagnosed, they may grant to healthcare or government authorities partial access to their local traces, including geolocations, timestamps, and the list of temporary contact numbers generated by the diary app. this information can be used to generate anonymous calls to action made accessible to all the instances of the app in the interested area. calls to action are processed locally on each installation and displayed to end-users if and only if their traces match filtering criteria. this mechanism enables anonymous tracking of past interactions of diagnosed individuals, alerting potentially infected diary users and prompting them for self-isolation and further testing.

to further raise awareness and promote adoption and usage of the application, diary integrates with the 'worth one minute' platform. the platform has adopted diary as an instrument for the common good and rewards users with anonymous vouchers (called woms) for their collaborative behaviour [ ] .
these vouchers are intended to provide: (a) a simple gamified experience that allows users to earn points and thus obtain positive feedback of their voluntary contribution to epidemic containment; (b) a tangible currency-like reward that can be adopted as a voucher system at a local and national scale to promote microeconomic growth in a post-lockdown scenario; (c) a perception of the social value of individual actions and behaviours. in this paper, we have argued citizen empowerment to be the foundation on which novel epidemic control technologies must be built as a viable alternative to mass surveillance. general design principles driving the development of such technologies have been outlined and applied to the design of digital ariadne, an open-source privacy-preserving instrument that combines location and contact tracing capabilities to collect local traces that can be crossmatched with authoritative alerts and calls to action without leaving the end-user's device. just like ariadne's thread, the data stored on personal smartphones offers a trusted trace to find a way out of the maze of covid- . we invite any kind of feedback on this whitepaper, including comments on the design principles and technical contributions to the open-source diary project. age-dependent risks of incidence and mortality of covid- in hubei province and other parts of china. medrxiv ferguson. the effect of public health measures on the influenza pandemic in u.s. 
cities quantifying sars-cov- transmission suggests epidemic control with digital contact tracing feasibility of controlling covid- outbreaks by isolation of cases and contacts support for app-based contact tracing of covid- contact tracing mobile apps for covid- : privacy considerations and related trade-offs commission recommendation on a common union toolbox for the use of technology and data to combat and exit from the covid- crisis, in particular concerning mobile applications and the use of anonymised mobility data apple and google partner on covid- contact tracing technology jeroen van den hoven, and alessandro vespignani. give more data, awareness and control to individual citizens, and they will help covid- containment a global coalition for privacy-first digital contact tracing protocols to fight covid- worth one minute": an anonymous rewarding platform for crowdsensing systems the authors wish to thank the more than beta testers that signed up for testing and the more than users that have provided valuable feedback in the last three weeks. key: cord- -swc pitd authors: nosyk, bohdan; armstrong, wendy s; del rio, carlos title: contact tracing for covid- : an opportunity to reduce health disparities and end the hiv/aids epidemic in the us date: - - journal: clin infect dis doi: . /cid/ciaa sha: doc_id: cord_uid: swc pitd sars-cov testing and contact tracing have been proposed as critical components of a safe and effective covid- public health strategy. we argue that covid- contact tracing may provide a unique opportunity to also conduct widespread hiv testing, among other health promotion activities. massive sars-cov testing and contact tracing at a scale and speed never before seen have been proposed as critical components of a covid- public health strategy that could, in theory, safely allow us to relax social distancing measures and begin to bring back the world we left behind before a cure or effective vaccine is delivered.
one of a number of challenges this strategy faces is that contact tracing is extremely labor intensive. for contact tracing deployed in sexually transmitted infections such as hiv, a recent study from new york reported costs of over $ per contact interviewed with % of state-level costs attributable to personnel [ ] . for coronavirus these efforts are simpler, but the volume of contacts, an average of per positive case [ ] , poses substantial logistic challenges. digital applications and other technological supports may help, though it appears inevitable that a large labor force will be needed. already, massachusetts has begun hiring to fill their state's projected need for , contact tracing workers [ ] . health canada has an open call for a volunteer workforce to do contact tracing and other related tasks [ ] . south korea employed a massive workforce of public health workers at the peak of the epidemic [ ] . the value of such a public health workforce extends beyond contact tracing for covid- and could lead to progress fighting many other health conditions. programs in african communities have combined hiv testing with screening for infectious diseases like tuberculosis and malaria as well as non-communicable diseases like hypertension and diabetes [ ] . using this model, we can take this opportunity to scale up testing for infectious diseases as well as non-communicable diseases and by doing so improve community health. as sars-cov- testing is evolving to include not only serological but also saliva testing, similar approaches could be taken for opt-out hiv, hcv and hemoglobin a c testing, for which finger stick methods are already available. this will allow us to begin to address the unacceptable health disparities that have existed for many of these conditions and, not surprisingly, also occur in covid- [ , ] .
aside from the potentially profound health benefits of a combination implementation approach, pairing covid- contact tracing with testing for hiv may serve to offset the immense costs of such an approach. it is well established that hiv testing provides outstanding value and can even be cost-saving in the long term in high-prevalence populations and settings [ ] . simply learning of a new hiv infection is known to change behavior, with meta-analyses estimating an % reduction in sexual activity with partners of negative or unknown hiv status [ ] , and starting arvs with subsequent viral suppression stops hiv transmission [ ] . testing focused on the geographic regions with the highest rates of new diagnoses, and further targeted using phylogenetics to identify the largest and fastest-growing clusters of infection, has been positioned as a key pillar of the united states' 'ending the hiv epidemic' (ehe) strategy [ ] . these efforts will be familiar to public health leaders in the us: anthony fauci was an architect of the strategy [ ] , which was motivated by the profound successes in the african continent, engineered in part by the president's emergency plan for aids relief (pepfar) and headed by deborah birx [ ] . yet testing remains one weak link in our efforts to end the hiv epidemic, in part because of persistent stigma and because the communities at greatest risk often have limited access to healthcare and/or are young and without established healthcare. we [bn, cdr] have recently written that for six of the largest us cities comprising nearly in people living with hiv/aids in the us, implementing a wide range of interventions to diagnose, treat and protect against infection at even near-ideal levels would fall short of the ehe targets [ ] .
we excluded contact tracing because of the relatively limited experimental evidence supporting its effectiveness in hiv and the uncertainty regarding the potential scale this type of effort could actually reach. contact tracing for hiv is difficult to implement and may be profoundly threatening to already marginalized individuals. covid- contact tracing may provide a unique opportunity to also conduct widespread hiv testing with modified contact tracing that could be acceptable and important for ending the hiv epidemic. though the task is indeed monumental, the necessary temporary labor force is readily available. were implemented [ , ] . some project the unprecedented stimulus packages and protections for workers may not return the hardest-hit industries to pre-covid levels, and high unemployment may linger. job loss is a major life stressor and is associated with declines in psychological and physical well-being along with a host of other negative social effects [ ] . creating jobs, even if only for a limited term, is a necessary government intervention to not only kickstart the economy but also to mitigate further growth in the concurrent epidemic of loss and despair [ , ] . mobilizing this labor force in the covid- response will allow communities to take meaningful action to protect themselves, an act which in itself may have transformative benefits over the long-term. hiv is known to disproportionately impact urban african american and hispanic/latinx communities [ , ] as do other chronic health conditions such as diabetes and hypertension. now covid- shows the all too familiar progression into vulnerable populations. early cases were seen primarily in international travellers, but currently african american and latinx populations are disproportionately impacted.
a public health workforce with point-of-care screening for these conditions as well could make meaningful strides to reduce the widespread disparities inherent in health. furthermore, testing to promote health allows for stigma-free messaging and broad acceptance. testing and contact tracing for covid- will provide an entry into social networks in communities where risk factors for conditions like hiv or non-communicable diseases may be shared. it is imperative, however, that contact tracing does not increase stigma and discrimination against minority populations, and thus the development of a workforce dedicated to culturally competent contact tracing, point-of-care testing and overall health promotion [ ] has to be a priority in order to seize this opportunity. the speed with which covid- contact tracing must be conducted is on a different scale, though more intensive efforts can be triaged to specialized staff. the greatest effort, however, and the greatest value, is in testing. of course, these potential benefits hinge on the rapid development of antibody testing technology, not to mention sufficient availability of personal protective equipment, and the public health system's ability to rapidly train and deploy the massive influx of temporary workers that would be needed. aside from the devastating death toll, the potential long-term effects of covid- on the global economy are severe and may ultimately prove to be greater than those of any other period in living memory. difficult decisions regarding how and when to restart the economy lie ahead, but the balance sheet can include the long-term benefits of widespread hiv testing, screening for chronic diseases and potentially narrowing health disparities. though it may feel like a distant priority, now more than ever, as the official language of the plan goes, we have an unprecedented opportunity to end the hiv epidemic in america. accomplishing this and expanding the impact to other health conditions is an opportunity we must not let go to waste.
notes: this study was funded by the national institutes of health/national institute on drug abuse grant no. r -da- to emory center for aids research and emory vteu. the authors have no conflicts of interest to declare. cost analysis and performance assessment of partner services for human immunodeficiency virus and sexually transmitted diseases the efficacy of contact tracing for the containment of the novel coronavirus (covid ) coronavirus disease- : summary of , contact investigations of the first cases in the republic of korea. osong public health res perspect hiv testing and treatment with the use of a community health approach in rural africa covid- and african americans failing another national stress test on health disparities localized economic modeling study group. the impact of localized implementation: determining the cost-effectiveness of hiv prevention and care interventions across six united states cities meta-analysis of high-risk sexual behavior in persons aware and unaware they are infected with hiv in the united states-implications for hiv prevention programs hptn study team. prevention of hiv- infection with early antiretroviral therapy ending the hiv epidemic: a plan for the united states at years. lancet localized hiv modeling study group. ending the hiv the usa: an economic modelling study in six cities mortality and morbidity in the st century the epidemic of despair among white americans: trends in the leading causes of premature death trends in racial/ethnic disparities of new aids diagnoses in the united states what lessons it might teach us? 
community engagement in hiv research addressing hiv criminalization: science confronts ignorance and bias
key: cord- -u z rir authors: ranisch, robert; nijsingh, niels; ballantyne, angela; van bergen, anne; buyx, alena; friedrich, orsolya; hendl, tereza; marckmann, georg; munthe, christian; wild, verina title: digital contact tracing and exposure notification: ethical guidance for trustworthy pandemic management date: - - journal: ethics inf technol doi: . /s - - - sha: doc_id: cord_uid: u z rir there is growing interest in contact tracing apps (ct apps) for pandemic management. it is crucial to consider ethical requirements before, while, and after implementing such apps. in this paper, we illustrate the complexity and multiplicity of the ethical considerations by presenting an ethical framework for a responsible design and implementation of ct apps. using this framework as a starting point, we briefly highlight the interconnection of social and political contexts, available measures of pandemic management, and a multi-layer assessment of ct apps. we will discuss some trade-offs that arise from this perspective. we then suggest that public trust is of major importance for population uptake of contact tracing apps. hasty, ill-prepared or badly communicated implementations of ct apps will likely undermine public trust, and as such, risk impeding general effectiveness. digital technologies are increasingly being discussed and implemented for covid- pandemic management and as tools for easing restrictive measures, such as lockdowns (mello and wang ; ting et al. ) . due to the high penetration rate of smartphones, there has been a huge interest in mobile phone data as a source for public health research and measures (oliver et al. ) . to track the spread of the virus, in europe and elsewhere, network operators share (anonymized and aggregated) phone location data.
apple and google, two leading providers of smartphone operating systems, release data to show mobility trends in countries and selected regions (google ) . in addition, a range of new mobile-phone-based applications ("apps"), sometimes lumped together under the term "covid- apps", have been rolled out recently or are under development by private as well as public actors (sharma and bashir ; privacy international ; woodhams ; gdprhub ) . these apps may serve a variety of functions: provide users with covid- -related information, monitor people in quarantine, trace movements, or give users rapid warning of potential exposure to sars-cov- (gdprhub ; rimpiläinen et al. ) . frequently, mobile phone apps are designed to fulfil more than just one purpose, e.g. symptom checkers could generate data which might also be used for epidemiological modelling, for monitoring the virus spread or for evaluating public health measures. available apps differ widely regarding data use (e.g. self-reported, geolocation data, proximity tracing), data sources (e.g. gps, bluetooth), data handling (decentralized or centralized), as well as data protection (anonymization or pseudonymization) (woodhams ). proximity or contact tracing apps (ct apps) have gained notable attention so far. ct apps notify users if they have been in proximity to confirmed infected people and propose next steps (e.g. self-isolation, testing). a vital distinction to be made here is between apps that collect data on (in principle) identifiable individuals in a centralised database ('centralised' variants) and those that function by use of encrypted identifiers that connect individual users to each other ('decentralised' variants) . only the first allows 'contact tracing' in the stricter sense, in which individuals and encounters are retrospectively identified by a third party. the second variant warns users in the case of contact with infected individuals (i.e.
exposure notification), but does not allow a centralized tracing of possible infection chains. both variants, contact tracing and exposure notification, can play an important role in a digitally supported pandemic management strategy. since analogue contact tracing is comparatively slow, resource-intensive and lacks reliability, digital proximity tracing has been proposed as a complementary tool to indicate possible transmission chains that analogue contact tracing might miss or take a longer time to identify. one study suggests that ct apps could, in theory, effectively decrease virus transmission by enabling targeted testing or quarantine, and thus avoid mass confinements or lockdowns (ferretti et al. ) . informing or identifying potential spreaders earlier could reduce pre-symptomatic transmission, i.e. transmission before an infected person shows symptoms. this might also support micromanagement after lifting restrictive pandemic control measures or during future infection waves. to date, however, little is known about the effectiveness and efficiency of ct apps in real-world settings, and whether or not they could also have negative effects on pandemic management, or expose individuals to ethical downsides, such as lack of data protection. as of august , a wide range of ct apps is in use or under development globally, from algeria to vietnam . singapore pioneered a bluetooth-based open-source technology named bluetrace, which underpins the tracetogether app (tracetogether ). in europe, after a joint attempt to establish a pan-european "privacy-preserving approach" for ct (pepp-pt) seemingly failed, various countries have rolled out their own proximity tracing apps. initially, pepp-pt and a centralized database were considered as the preferred framework for ct apps, but massive criticism (e.g. joint statement ) has led some policy makers to switch to a decentralised approach (busvine and rinke ) .
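the 'decentralised' variant described above can be illustrated with a toy sketch. it is not the apple and google framework, bluetrace, or any deployed protocol; the key size, the number of intervals per day, the hmac-based derivation and all function names are assumptions made only for illustration. the point it shows is that matching happens on the user's device, so no central party learns who met whom.

```python
import hashlib
import hmac
import secrets

# Hypothetical parameter: one rotating identifier per 10-minute interval.
INTERVALS_PER_DAY = 144


def new_daily_key() -> bytes:
    """Random secret that stays on the device unless its owner reports an infection."""
    return secrets.token_bytes(16)


def ephemeral_id(daily_key: bytes, interval: int) -> bytes:
    """Derive the short-lived identifier a device would broadcast in one interval."""
    digest = hmac.new(daily_key, interval.to_bytes(4, "big"), hashlib.sha256).digest()
    return digest[:16]


def exposed_intervals(observed_ids: set[bytes], published_keys: list[bytes]) -> list[int]:
    """Re-derive identifiers from keys published by infected users and
    compare them, locally, with identifiers this device has overheard."""
    hits = []
    for key in published_keys:
        for interval in range(INTERVALS_PER_DAY):
            if ephemeral_id(key, interval) in observed_ids:
                hits.append(interval)
    return sorted(hits)


# Usage: Bob's phone overheard Alice's identifiers in intervals 10 and 11.
alice_key = new_daily_key()
bob_observed = {ephemeral_id(alice_key, 10), ephemeral_id(alice_key, 11)}

# Alice tests positive and publishes her daily key; Bob's phone checks locally.
print(exposed_intervals(bob_observed, [alice_key]))  # [10, 11]
```

in this sketch the published key reveals nothing about bob, and the server only ever sees keys from users who chose to report. a 'centralised' variant would instead upload the observed identifiers to a database, which is what makes retrospective tracing by a third party possible.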
however, european countries are divided on the question of whether to rely on centralized (e.g. france) or decentralized (e.g. germany) data management, making interoperability between different frameworks difficult. by now, the authorities of many european countries like austria, belgium, denmark, germany, italy, ireland and switzerland have opted for a decentralised approach based on a joint api from apple and google. there are plans from the european commission to build a gateway to allow cross-border exchange of information between these national ct apps. development of ct apps is not only promoted by public agencies but often relies on public-private partnerships with relevant corporate actors. notably, apple and google have collaborated to develop a joint contact tracing framework which is also founded on decentralized data management (apple and google ) . such efforts are important to guarantee interoperability between different smartphone systems and allow building efficient ct apps. however, due to this dependency, commercial companies are gaining a wide-ranging influence on the national strategies for digital contact tracing; e.g. the apple and google framework is only of limited use for countries which have opted for a centralized architecture for ct apps, like france (scott et al. ). furthermore, the use of the framework is restricted to only one tracing app per country (gurman and de vynck ) . ct apps may prove to be valuable public health tools, but they also raise significant concerns (gasser et al. ; lucivero et al. ) . as part of the covid- pandemic response, advisory bodies, ngos, and expert initiatives have interrogated the ethical aspects of digital surveillance technologies, including ct apps (e.g. algorithmwatch ; amnesty international et al. ; chaos computer club ; human rights watch ; swiss national advisory commission on biomedical ethics ; who ).
the first ethical frameworks for digital tools in the context of covid- have been proposed (mello and wang ; gasser et al. ; lucivero et al. ; kahn et al. ; morley et al. ; parker et al. ) , and the european commission ( ) has drafted various recommendations and guidelines for digital contact tracing in the eu. this paper focuses on ethical considerations for responsible development, design and implementation of effective and justifiable ct apps in pandemic management strategies. it considers legal and digital ethical concerns in a broader framework of public health ethics as well as related pragmatic and procedural considerations. it provides a framework for ethical analysis of concrete proposals, and suggests that to strengthen trustworthiness, policy makers need to be sensitive to the multi-faceted complexities of public health decision making. the viability of ct apps as a useful pandemic-response measure, depends on a complex interplay of criteria, such as pragmatic assumptions about effectiveness, the likelihood of public health benefit, technological specifications, legal requirements etc. to minimise the risk of adverse outcomes, ethical standards should guide and complement the process of development (ethics by design), implementation, use, and evaluation of ct apps. rather than asking general questions on the moral acceptability of ct apps, the crucial question is: "what specific interventions, if any, may be justified under what conditions?" inspired by ethical frameworks for big data in health and research, developed by the shapes initiative (xafis et al. ) , and other normative frameworks for digital health technologies (marckmann ) and pandemic management (thompson et al. ) , we propose relevant substantive values (which evaluate the outcome of measures) and procedural values (which guide decisionmaking) as well as corresponding questions, which should be considered in response to these requirements (table ) . 
the list of considerations provides a sketch of the complex set of criteria relevant to assessing ct apps as ethically justifiable public health tools. we neither claim that the list is complete, nor do we think that a responsible policy-making process should necessarily address all of them. on the contrary, it is highly unlikely that a solution would satisfy all these demands. not only is there a significant lack of available data and real-world experience regarding ct technologies, all pandemic management strategies will involve several trade-offs. but acknowledging the ethical values and specific questions can help during development, implementation and evaluation of ct apps in order to find ethically appropriate solutions. in what follows, we will describe some of the complexities in implementing ct apps (cf. nijsingh et al. ). considering the wide variety of mobile applications being developed in the context of the covid- pandemic, it is crucial to distinguish between different apps, their functions, purposes, and performance. the value of mobile applications being developed in the context of the covid pandemic essentially depends on specific pandemic contexts and factors such as the social and political environment, how ct apps are integrated into a comprehensive strategy of pandemic management, as well as possible and available alternatives (fig. ) . notably, and as we will demonstrate, the implementation of digital contact tracing may involve moral costs. in some countries, apps and other mobile based surveillance measures are imposed on people, leading to an infringement of privacy rights (human rights watch ). even without compulsion, ct apps can have severe consequences for social values: worries range from issues of data protection, to possible stigmatization of patients, social justice concerns, or function creep (woodhams ; hart et al. ) . 
nevertheless, risks that cannot be easily mitigated or avoided could still be acceptable, considering the severity of a pandemic situation, the importance of effective contact tracing to manage it, and the scope of established measures to stop virus transmission. to assess whether a certain ct app is justified, its use needs to be compared to available alternative strategies. from this perspective, infringements associated with a possible loss of privacy and risks related to an effective ct app may appear justifiable in light of the enormous costs in terms of welfare, liberty and health outcomes of either letting the virus run its course or maintaining comprehensive restrictions or lockdowns . to make a case in favour of a ct app, however, several conditions must be met. sufficient societal need and potential effectiveness need to be demonstrated, and ethical risks sufficiently mitigated in order to demonstrate proportionality. in addition, such evaluation and decision-making needs to demonstrate procedural fairness, with transparency and opportunity for potentially concerned parties to voice concerns. finally, the balance of reasons for and against needs to be superior to alternative solutions or strategies. here, again, context matters. for a ct app scheme to be worth its costs and risks, a society needs to be in a pandemic stage, in which contact tracing is a priority. this may depend both on the pattern of (community) transmission, and the healthcare capacity of this country relative to the transmission pattern.
table (values and corresponding questions):
public health benefit: is the pandemic situation such that contact tracing activity is motivated from a public health standpoint? is the general use of the ct app likely to enhance the effectiveness of contact tracing measures? is the technological make-up of the app such that it can actually produce public health benefit? is the pool of potential users who are willing to use a ct app large enough for epidemiological effectiveness?
harm minimisation: are ct apps the least harmful way of obtaining the desired benefits? are ct apps easy to use and do they minimise confusion or stress by design? has the risk of self- and social stigma effects, implicated by an elevated focus on one's or others' health status, been considered and mitigated? are safeguards in place to mitigate the vulnerability of and harm to marginalized groups from ct apps and related public health and security measures? are potential, harmful social effects related to the app (widespread anxiety, ineffective quarantines etc.) adequately considered?
privacy: are measures in place for data protection and against data loss or misuse? are data security authorities involved? is data parsimony guaranteed and access to non-essential personal data minimised? are the most privacy-preserving solutions (e.g. no real-time data, anonymization) prioritised? is collection of the tracing-data temporary (e.g. will it be deleted after a certain, specified amount of time
are social, and moral costs of ct apps proportionate to the pandemic threat and the expected effectiveness of using the app? is the cost-effectiveness of the ct app positive compared to alternative pandemic management strategies? are financial costs proportionate to the expected public health benefits?
general trustworthiness: are democratic procedures in place to guide decision making? can population uptake be assumed? do stated objectives of ct apps align with proposed measures?
if a society is not in such a state, no app will be able to promote better contact tracing. in addition, the utility of ct apps largely depends on broader public health measures beyond digital technologies. for ct apps to contribute to an effective public health strategy, sufficient staffing of public health services as well as reliable infrastructures (e.g. for testing and for quarantine) are needed.
to avoid false positive self-reports, health departments or other institutions need to confirm infection status of users. for 'centralised' ct apps, the data generated by the app needs to be collected and analysed in a meaningful and cost-effective way (from a public health perspective) in relation to a set of justified effective tracing actions that are thereby being facilitated (i.e. eased or made possible by the app data). for 'decentralised' apps, additional efforts of analogue contact tracing are necessary, because possible transmission chains are not tracked in a way that is accessible to health authorities. all ct apps require well-organised institutional efforts. little is known about the effectiveness of contact tracing apps in the real-world setting (anderson ) . even for countries with a high penetration rate of proximity tracing technologies such as iceland, the contribution of ct apps to suppressing the pandemic has been questioned (johnson ) . besides the risks of false positives (which can impose a burden on unaffected individuals) and false negatives (which may lead to a false sense of security), the implementation of an ineffective app has opportunity costs: wasting time and resources, undercutting other solutions and leading to wrong political decisions. this may result in a sub-optimal approach to pandemic control, leading to higher morbidity and mortality and greater economic damage. it is also crucial to view the value of a ct app with regard to the quality of information produced by it: mobile phones are not well equipped for contact tracing of individuals. bluetooth signals, which are central for the now widely used approach supported by apple and google, only allow a rough estimation of the distance between devices (leith and farrell ) . the same is true if location data (gps) is used. apps that would rely on user-generated subjective information are also likely to produce false predictions that could affect particular tracing policies.
this concerns both false positives and false negatives. as such, incorrect information will rather compromise than support particular public health measures, as well as health care systems more generally, and scarce resources may be wasted, or used suboptimally. by contrast, ct apps that appear to be effective in tracing individuals, may raise more severe privacy concerns (baumgärtner et al. ). it has been reported from south korea, where multi-source tracing and tracking technologies are being used (gdprhub ), that information was so detailed as to allow re-identification of individuals (zastrow ) . hence, the values of effectiveness and privacy need to be carefully balanced in digital public health measures. for example, while infringements on individual rights or liberties could be justified to secure health benefits, measures always need to be proportionate and aim for careful balance between competing values and considerations. effectiveness does not only presuppose a favourable context in terms of a suitable pandemic stage and accompanying interventions, but also sufficient uptake. for ct apps to offer a meaningful contribution to pandemic management, a large part of the population needs access to compatible mobile technologies (e.g. newer smartphones or beacons), install and set up the app, and be willing and able to use tools correctly. a study from the uk has estimated that to stop the pandemic on its own, around % of smartphone users (more than % of population overall) would have to use a ct app (hinch et al. ) , i.e. a user rate comparable to whatsapp or facebook messenger in some european countries. as mentioned, so far the highest penetration rate of ct apps in the world has been reported from iceland, where almost % of the overall population downloaded a ct app. for singapore's much heralded ct app, less than a quarter of the population are using this tool (tracetogether ). 
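a back-of-the-envelope calculation shows why uptake is such a bottleneck. it is not the epidemiological model of the study cited above; it assumes, purely for illustration, that contacts happen at random across the population and that app users are spread evenly. under that assumption a contact can only be registered when both people run the app, so the share of detectable contacts is the square of the uptake. the function names are hypothetical.

```python
import math


def both_have_app(uptake: float) -> float:
    """Share of random pairwise contacts in which both people run the app."""
    if not 0.0 <= uptake <= 1.0:
        raise ValueError("uptake must be a fraction between 0 and 1")
    return uptake ** 2


def required_uptake(target_share: float) -> float:
    """Uptake needed so that a given share of contacts is detectable."""
    if not 0.0 <= target_share <= 1.0:
        raise ValueError("target_share must be a fraction between 0 and 1")
    return math.sqrt(target_share)


# Illustrative uptake levels only; they are not figures from the text.
for uptake in (0.2, 0.4, 0.6, 0.8):
    share = both_have_app(uptake)
    print(f"{uptake:.0%} uptake -> {share:.0%} of contacts detectable")
```

with these illustrative values, doubling uptake from 20% to 40% quadruples the share of detectable contacts (from 4% to 16%), which is one way to read the claim that lower adoption still helps but that high adoption matters disproportionately. real contact patterns are clustered, so the simple square is only a rough guide.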
at the time of writing in early august, germany had introduced a ct app less than two months earlier, and download numbers had reached more than million, approximately % of the overall population (robert koch institut ). a lower adoption rate still has some positive effect for targeted testing and quarantine (hinch et al. ) . nevertheless, population uptake is a bottleneck for success of these digital technologies. predicting future uptake of ct apps is difficult and depends on various factors, such as the penetration rate of digital technologies in a society, the possibility to download and use the app on different types of smartphones, the credibility of institutions offering these solutions, and viable solutions for ethical concerns such as data security. recent surveys have been inconclusive about the possible uptake in different countries. a study showed a high level of support (around %) for ct apps in countries such as the uk, germany, france and the us (milsom et al. ) , while other surveys from the us and germany came to a less optimistic conclusion (anderson and auxier ; covid- snapshot monitoring ). the available data also show that some aspects could reduce the acceptability of ct apps: these include concerns about further continuation of surveillance after the pandemic and data security (anderson and auxier ) . one way to increase uptake is, of course, to force people to download and use ct apps. mandatory use of disease surveillance tools and possible moral obligations to comply with them are being discussed (lucivero et al. ; parker et al. ; schaefer and ballantyne ) . coercion, however, adds ethical downsides of liberty restrictions that are seen as substantial in a liberal democratic context, and thereby complicates the justification of a ct app policy. moreover, compulsory measures may undermine public trust and create incentives for cheating (floridi ) , necessitating even more forceful steps to secure the benefits of the policy.
as a consequence, these benefits need to be even more pronounced and certified in order for the policy to have the potential to be proportional. for this reason, ct app programs based on voluntary use with a good uptake appear preferable. but this assumes strong public trust in the apps and the program (ienca and vayena ; parker et al. ) . trust, however, must build on trustworthiness, and thus needs to be backed up by responsible design and corresponding policies. such "well founded" credence (parker et al. ) also remains a strong indicator that choices are self-determined and, thus, in line with democratic values. meanwhile, reports from china and other nations have already shown that digital measures utilised in the covid- pandemic response have been used for mass surveillance (woodhams ; human rights watch ) and that there might be plans to massively extend the use of newly established apps even after the pandemic (davidson ) . in some countries such as sweden or the netherlands, the launch of ct apps has been postponed or even cancelled due to weak data security, doubts about effectiveness and concerns about the legality of apps that process sensitive personal information (wassens ; hagberg ) . such evidence might have already fuelled public mistrust in ct apps in other nations, especially in societies in which trust in science and governance is limited. in countries like germany, public outreach by the political representation regarding the introduction of different apps has created confusion (barker ) . internationally, ct apps have already become the subject of conspiracy theories, fake news, and scams. from the perspective of liberal values, citizens should ideally support ct apps because they have (justified) faith in public health measures and, thus, freely choose to utilise disease surveillance technologies.
this, however, does not rule out some measures to increase population uptake (floridi ): encouragement, campaigning, nudges and even some stronger forms of incentives could be justified to increase adoption rates. possible benefits should be equally accessible for most citizens without disproportionate burdens, and negative incentives must not be so severe as to render ct apps de facto compulsory, for example by limiting access to essential infrastructures (lucivero et al. ; morley et al. ). incentives can also create new risks, e.g. owing to users' psychological responses to the information regarding user-surroundings and related health risks disclosed by a particular app. a privacy infringing, unfair or burdensome app may trigger negative responses, particularly if it is perceived as being imposed upon the public. uptake depends in part on the level of trust in agencies responsible for development, marketing, and distribution of ct apps, on solving issues such as data protection or stigmatization, but also on the usefulness and performance of digital proximity tracing itself. since using ct apps could have adverse consequences for individuals, for example by requiring tests and imposing isolation measures, demonstrated effectiveness and validity of ct apps will be a major factor for population uptake. trust, however, cannot be established quickly, or for just one public health intervention in isolation (ward ). it is a long-term endeavour and requires constant efforts to uphold it, e.g. through transparent communication and participatory elements in health care planning. this raises a pragmatic dilemma regarding the factor of trust: on the one hand, the effectiveness of ct apps is uncertain. on the other hand, digital proximity tracing essentially depends on population uptake and user adherence. broad scepticism about the effectiveness of digital contact tracing could eventually become a self-fulfilling prophecy.
this pragmatic dilemma must therefore also be incorporated into ethical considerations. for if the probability of uptake and thus of effective pandemic control with the app is too small, the risks and moral costs of the app could be too high. for an ethically appropriate introduction of an app, that also maintains or increases well-founded trust, the functions, goals, possible chances, and risks associated with specific ct apps must be communicated clearly, as well as the measures taken to mitigate the risks. the same goes for disclosure of conflicts of interest and the procedural management of state-business relationships linked to commissions of technological development and procurement of technical products. this last aspect becomes especially important if the decision is to adopt one particular national ct app solution and policy, meaning that private developers will be in serious competition to win the race for a state contract. to increase app uptake, focusing efforts on one single ct app with just one (or a limited number of) clearly defined purpose(s) and broad support from political and health institutions may be crucial. to prevent confusion and loss of trustworthiness, there may then be good reasons to restrict privately offered ct apps, or to institute mandatory quality assurance authorisation in order to ensure that pandemic management is not undermined by business ventures. the importance of trustworthiness of technologies and policies for earning sustainable public trust also means that it is important to prevent false expectations. for instance, simplistic "solutionism", i.e. the belief that pandemic challenges could be managed by technological fixes alone, must be avoided. public decision-making on pandemic policies including decision making on ct apps, requires a structured framework to work through these ethical considerations. 
such a framework can play a vital role in increasing the transparency of decisions made, as well as the trustworthiness of (and trust in) policies and technical solutions. based on our analysis, we conclude the following points for consideration:
• the covid- pandemic cannot be solved by technological means alone. digital proximity tracing is not a panacea in the covid- pandemic response, but could become a valuable component in a comprehensive strategy. thus, it is imperative to have appropriate public health measures and infrastructures in place before and while implementing ct apps.
• to ensure effectiveness and user-friendliness, there should only be a limited number of ct apps or, ideally, only one platform. reducing the functionality of apps, i.e. only one clear objective per app, seems advisable. while a joint, pan-european platform allowing interoperability between different ct apps is warranted, diverging requirements need to be considered.
• given the inevitable risks for privacy and the potential impact on individual liberty, especially related to the centralized ct apps, there should be a reasonable expectation of population benefit of ct apps prior to their large-scale applications. effectiveness and benefits must be evaluated alongside the implementation.
• the ubiquitous presence of risks necessitates a thorough and prudent approach. a particular focus on temporary measures is warranted. while science and policy have been confronted with deep uncertainty during the covid- pandemic, strategies must be carefully chosen, risks mitigated and measures reversible. uncertainties on the benefits of digital ct limit the set of legitimate pandemic response policies and actions. without sufficiently clear evidence of effectiveness, jeopardizing the rights or liberties of (some parts of) the population cannot be justified.
• trust is essential in public health decision-making in general, and for covid- ct apps in particular.
policies, recommendations and public health measures should be part of a broader endeavour to win and maintain trust in public health measures. well-founded trust requires taking seriously the ethical complexities relating to the implementation of ct apps as well as being transparent about the inevitable trade-offs that are being made. communicating goals and functions as well as possible benefits, risks, and limitations of ct apps clearly and early can play a crucial role in preventing squandering trust and misconceptions. automated decision-making systems and the fight against covid- -our position joint civil society statement: states use of digital surveillance technologies to fight pandemic must respect human rights most americans don't think cellphone tracking will help limit covid- , are divided on whether it's acceptable contact tracing in the real world mobility trends reports contact tracing bluetooth specification preliminary-subject to modification and extension germany's angst is killing its coronavirus tracing app mind the gap: security & privacy risks of contact tracing apps germany flips to apple-google approach on smartphone contact tracing prüfsteine für die beurteilung von, contact tracing"-apps ergebnisse aus dem wiederholten querschnittlichen monitoring von wissen, risikowahrnehmung, schutzverhalten und vertrauen während des aktuellen covid- ausbruchsgeschehens commission recommendation of . . on a common union toolbox for the use of technology and data to combat and exit from the covid- crisis, in particular concerning mobile applications and the use of anonymised mobility data quantifying sars-cov- transmission suggests epidemic control with digital contact tracing mind the app-considerations on the ethical risks of covid- apps digital tools against covid- : framing the ethical challenges and how to address them projects using personal data to combat sars-cov- . 
updated apple, google covid- tool to be limited to one app per country folkhälsomyndigheten ratar smittspårning via app outpacing the virus: digital response to containing the spread of covid- while mitigating privacy risks report-effective configurations of a digital contact tracing app: a report nhsx no, coronavirus apps don't need % adoption to be effective a flood of coronavirus apps are tracking us. now it's time to keep track of them mobile location data and covid- : q&a on the responsible use of digital data to tackle the covid- pandemic nearly % of icelanders are using a covid appand it hasn't helped much joint statement on contact tracing: date johns hopkins project on ethics and governance of digital contact tracing technologies ( ) coronavirus contact tracing: evaluating the potential of using bluetooth received signal strength for proximity detection covid- and contact tracing apps: technological fix or social experiment? retrieved ethische fragen von digital public health ethics and governance for digital disease surveillance survey of acceptability of app-based contact tracing in the uk ethical guidelines for covid- tracing apps applying a precautionary approach to mobile contact tracing for covid- : the value of reversibility mobile phone data and covid- : missing an opportunity? 
arxiv preprint ethics of instantaneous contact tracing using mobile phone apps in the control of the covid- pandemic infektionsketten digital unterbrechen mit der corona-warn-app global examples of covid- surveillance technologies: flash report downloading covid- contact tracing apps is a moral obligation how google and apple outflanked governments in the race to build coronavirus apps use of apps in the covid- response and the loss of privacy protection contact tracing as an instrument for pandemic control: central considerations from an ethical perspective pandemic influenza preparedness: an ethical framework to guide decision-making digital technology and covid- tracetogether-an overview improving access to, use of, and outcomes from public health programs: the importance of building and maintaining trust with patients/clients. frontiers in public health ministerie test corona-app regionaal ethical considerations to guide the use of digital proximity tracking technologies for covid- contact tracing. interim guidance covid- digital rights tracker an ethics framework for big data in health and research south korea is reporting intimate details of covid- cases: has it helped? nature, retrieved publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations acknowledgements this article builds upon a background paper published by the ethics working group within the german public health network covid- (https ://www.publi c-healt h-covid .de). the authors want to thank ansgar gerhardus, dagmar lühmann and dagmar starke for helpful feedback on an early version of this article and two anonymous reviewers for their valuable comments. we gratefully acknowledge samia hurst-majno for her valuable suggestions.funding open access funding enabled and organized by projekt deal. 
conflict of interest all authors declare that they have no conflict of interest.open access this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/ . /. key: cord- -hebe cjb authors: brooks-pollock, e.; read, j. m.; mclean, a. r.; keeling, m. j.; danon, l. title: using social contact data to predict and compare the impact of social distancing policies with implications for school re-opening date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: hebe cjb background social distancing measures, including school closures, are being used to control sars-cov- transmission in many countries. once "lockdown" has driven incidence to low levels, selected activities are being permitted. re-opening schools is a priority because of the welfare and educational impact of closures on children. however, the impact of school re-opening needs to be considered within the context of other measures. methods we use social contact data from the uk to predict the impact of social distancing policies on the reproduction number. we calibrate our tool to the covid- epidemic in the uk using publicly available death data and google community mobility reports. 
we focus on the impact of re-opening schools against a backdrop of wider social distancing easing. results we demonstrate that pre-collected social contact data, combined with incidence data and google community mobility reports, is able to provide a time-varying estimate of the reproduction number (r). from a pre-control setting when r= . ( %ci . , . ), we estimate that the minimum reproduction number that can be achieved in the uk without limiting household contacts is . ( %ci: . - . ); in the absence of other changes, preventing leisure contacts has a smaller impact (r= . , %ci: . - . ) than preventing work contacts (r= . , %ci: . - . ). we find that following lockdown (when r= . ( % ci . , . )), opening primary schools in isolation has a modest impact on transmission r= . ( %ci: . - . ) but that high adherence to other measures is needed. opening secondary schools as well as primary schools is predicted to have a larger overall impact (r= . , %ci: . - . ), however transmission could still be controlled with effective contact tracing. conclusions our findings suggest that primary school children can return to school without compromising control of transmission, however other measures, such as social distancing and contact tracing, are required to control transmission if all age groups are to return to school. our tool provides a mapping from policies to the reproduction number and can be used by policymakers to compare the impact of social-easing measures, dissect mitigation strategies and support careful localized control strategies. the reproduction number, or the 'r number', has become a central statistic used to characterise the transmission of novel severe acute respiratory syndrome-coronavirus (sars-cov- ). early estimates of the reproduction number, which is the average number of secondary cases due to a single case, range between . and .
[ , ] , indicating that at least out of every transmission events need to be prevented in order to avoid an outbreak or control an ongoing epidemic. in the united kingdom (uk), social distancing restrictions, including school closures, introduced on march , led to an overall reproduction number less than and a subsequent decline in the daily number of cases and deaths. it is therefore important to quantify the effect of interventions and their easing on the reproduction number. it is uncertain how the relaxation of these restrictions, especially the physical return to school of the school-age population, will affect the transmission of the virus, though contact tracing and isolation of discovered cases is anticipated to mitigate some of the impact. the reproduction number of close contact infections such as sars-cov- depends critically on social contact behaviour. questionnaire surveys, which enumerate and describe the face-to-face contacts an individual had on a given day, are the most direct way of assessing the potential for spread in a population [ ] . several such surveys have quantified the behaviour of the uk population prior to the pandemic in [ ] [ ] [ ] . a more recent survey in the uk estimated that post march the number of social interactions was . contacts per person per day [ ] . social distancing measures, such as the closure of schools and workplaces and mandatory reduction in social interactions, while effective at preventing transmission, have severe economic and psychological effects, and of particular concern is their impact on children [ ] . age-specific behavioural patterns mean that social distancing measures affect age groups differently. in normal circumstances, the majority of social contact hours for persons over years old occur at home while only a quarter of their social contact hours are associated with leisure activities outside the home. in contrast, nearly % of twenty to thirty-year olds' social contact hours are at work [ ] .
crucially, nearly half of children's social contact hours are made within a school setting, meaning that school closures have a major impact on the social experience of young people. in this study, we use social contact data [ ] , including an additional targeted survey of children, to quantify the impact of re-opening schools on the reproduction number in the uk [ ] . we used publicly available death data from the uk [ ] to estimate an exponential growth rate of . ( % ci . , . ) deaths per day between march and march . this corresponds to a reproduction number of . using a mean serial interval of . days [ ] [ ] [ ] . we combine this estimate of the reproduction number prior to lockdown with social contact data to estimate a transmission probability per contact hour of . hour$^{-1}$, see materials and methods. following lockdown, we use google community mobility reports as a proxy for the percentage reduction in active work, leisure and travel contacts. with a % reduction in work contacts, a % reduction in leisure contacts and a % reduction in school contacts, the reproduction number is reduced to . ( % ci . , . ) (figure ), which is consistent with direct estimates [ ] . eliminating all but household contacts results in a reproduction number of . ( % ci . , . ), but this estimate does not include contacts outside the home arising from essential services. adding work and school contacts to household contacts, with no leisure or other contacts, increases the reproduction number to . ( % ci: . − . ). adding leisure contacts to household contacts, while preventing work and school contacts, increases the reproduction number to . ( % ci: . − . ). tracing and isolating social contacts of symptomatic cases so they do not transmit onwards is beneficial but does not control transmission in isolation [ ] . the impact of contact tracing increases as social distancing measures are eased (figure ).
if all children are in school, then when % of normal contacts are present, the reproduction number is close to . in this scenario even modest contact tracing is enough to control transmission. the added benefit of tracing contacts per index case over contacts per case is minimal, because very few people have more than contacts. with % of contacts outside the home present, schools could fully re-open with effective contact tracing in place. if contact patterns return to pre-covid levels, then contact tracing on its own is unlikely to be able to control transmission without other measures in place. from the lockdown baseline reproduction number estimated above of . , we investigate the impact of school opening scenarios on the reproduction number. figure illustrates the predicted reproduction number under scenarios in which schools are open. the shaded regions represent different policies under the assumptions that children are half as infectious as adults and there is no immunity in the population [ ] . we find that if no other social contacts outside the home increase apart from those occurring within primary schools, then opening primary schools is consistent with a reproduction number less than , r = . ( % ci: . − . ) (fig. a). however, even a modest increase in contacts outside home and school, relative to post-lockdown levels, would push the reproduction number back above . in the absence of substantial population-level immunity, the additional opening of secondary schools is likely to bring transmission close to epidemic growth in the population (r = . , % ci: . − . ). in general, higher adherence to other social distancing measures is required as more children return to school. we predict that contact tracing could increase the options for opening schools (figs b and c). we assume that a given proportion of all contacts are successfully traced and self-isolate, and that their contribution to the reproduction number is effectively zero.
under a scenario similar to the situation in early june , where % of contacts are effectively traced and isolated, a larger proportion of pupils could return to school while still limiting transmission. if % of contacts of symptomatic cases were traced and isolated, we estimate that schools could fully re-open while maintaining control of transmission, as long as at least % of other contacts are prevented (r = . , % ci: . − . ). in this scenario, other forms of social distancing, including working from home and eliminating leisure contacts, would still be required if schools were to be fully open before a pharmaceutical solution is found. finally, we consider the impact of the relative infectiousness of children when re-opening schools. the most pessimistic scenario, where children are as infectious as adults, corresponds to the scenarios considered in figure . if children are less infectious than adults then re-opening primary and secondary schools has a smaller impact on the reproduction number, but the impact of increasing other contacts outside home and school settings remains the same.
the copyright holder for this preprint (this version posted july , ) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license.
in this paper, we demonstrate that a combination of early death counts and social contact data provides sufficient information to estimate the potential impact of combinations of social distancing measures on the reproduction number for covid- in the united kingdom. we focus on the effects of school closures/re-openings on covid- transmission in the context of other control measures. our findings suggest that high adherence to social distancing outside school settings is needed to maintain epidemic control.
opening primary schools has a modest impact on r, while opening secondary schools is predicted to have a larger overall impact; a combination of reopening both would result in a loss of epidemic control. our findings support the use of contact tracing as a key part of epidemic control; however, tracing needs to be highly effective. after the introduction of the test and trace system in the uk only % of social contacts of cases were successfully traced and isolated within hours, though this has substantially increased over time [ ] . while tracing % of contacts has a positive impact on the reproduction number, it is insufficient to prevent epidemic growth if all schools are fully open. the greater risk arises from contact with people outside the home and school contexts. it is likely that reopening of schools will also lead to an increase in contacts made outside school, due to caregivers returning to work and interactions between parents. a strength of this analysis is its predictive value of the effect of combined interventions. using metrics of adherence to social distancing measures, such as google mobility or contemporary social contact surveys, it is possible to map the country's progression across figure , and therefore estimate the effect of policy changes on the reproduction number and hence the population attributable fraction of cases due to multiple combined interventions [ ] . this analysis was made possible by pre-existing detailed social contact data. social contact patterns have been used to characterise the potential for disease transmission in a population [ ] , and to design vaccine and control programmes for infectious diseases including influenza [ ] , meningitis [ ] and now covid- [ ] . however, in most settings, such data are out-of-date or not available. given their proven value, we argue that regular, representative social contact surveys should become a routine part of epidemic control and preparedness.
the social contact survey surveyed , individuals in the uk in about their social contacts during a single day [ ] . participants were recruited using three approaches: a paper survey sent to people in the post, an online survey and an online survey aimed specifically at school-aged children. participants were asked to complete demographic information about themselves, including age and occupation, and about their social contacts on the previous day. participants were asked to report the number of people they met, the duration of the contact (< minutes, to minutes, to hours, + hours), the context (home, work/school, travel, other/leisure), and whether the contact involved touch, e.g. a handshake, hug or kiss. to make it easier to report large numbers of contacts per day, participants could report contacts as individual contacts or groups of contacts; this methodology better captures the right-hand tail of the degree distribution. participants were also asked about transitive interactions between contacts, reported in [ ] .
estimating the reproduction number from social contact data
we use an individual-based approach to calculate a reproduction number for each of the participants of the social contact survey study [ ] . the reproduction number for an individual is given by
$$R_i = \tau \sum_{j=1}^{k} n_j d_j,$$
where $k$ is the number of contact events reported by each participant, $n_j$ is the number of individuals in that contact (participants could report groups of similar contacts), $d_j$ is the duration of the contact and $\tau$ is the probability of transmission. because we do not have ages of contacts, this is an ego-centric estimate of r, and does not include local depletion of susceptibles.
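as an illustration of the individual-based calculation described above, the following is a minimal python sketch, not the authors' code. the class `ContactEvent`, the function `individual_r`, the example contacts and the value of `tau` are all hypothetical; in particular, the survey records duration in categories, so the `hours` field assumes that a representative number of hours has been assigned to each category.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ContactEvent:
    n_people: int   # individuals met in this contact event (n_j)
    hours: float    # assumed representative duration in hours (d_j)


def individual_r(events: Iterable[ContactEvent], tau: float) -> float:
    """Ego-centric reproduction number: tau * sum_j n_j * d_j."""
    return tau * sum(e.n_people * e.hours for e in events)


# Hypothetical participant: a household contact, a short chat, a work meeting.
events = [ContactEvent(3, 2.0), ContactEvent(1, 0.5), ContactEvent(20, 1.0)]
r_i = individual_r(events, tau=0.05)  # tau value is illustrative only
```

a group reported as a single event (here, 20 people for one hour) contributes in proportion to its size, which is how grouped reporting feeds the right-hand tail of the contact distribution into the estimate.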
the population-wide reproduction number, $R_t$, is calculated using the age-adjusted mean of the squared individual reproduction numbers, i.e.
$$R_t = \frac{1}{N} \sum_{i=1}^{N} w_i z_i R_i^2,$$
where $R_i$ is the individual reproduction number of participant $i$, $N$ is the number of participants in the social contact survey, $0 \le z_i \le 1$ is the infectiousness of children relative to adults, and $w_i$ is the age-specific weighting for participant $i$, estimated to match the age distribution in the uk population and calculated as the ratio of the proportion of individuals aged $a_i$ in the uk, $P_{UK}(a_i)$, to the proportion in the social contact survey sample, $P_{SCS}(a_i)$. the uncertainty associated with the reproduction number is estimated by bootstrapping the contact data, weighted by age, using the boot function in r. we report the bootstrapped mean and % percentile confidence intervals. the model can be calibrated using incidence data when the social contact patterns are known. here, we calibrated the model to the exponential growth phase of the epidemic in the uk prior to the introduction of widespread mass social distancing on march . we estimated the growth rate, $r$, from death data between march and march , then calculated the reproduction number as $R = \exp(rT)$, where $T$ is the serial interval. google has made community mobility reports [ ] available for the period during covid- transmission from february . the google mobility reports provide a point estimate for the percentage change in number of visits to and length of stay at places categorized as grocery and pharmacy, parks, transit stations, retail and recreation, residential, and workplace. the median percentage change is relative to the median value for the same day of the week for the period between january and february [ ] . we mapped the context reported in the social contact survey onto the google mobility data categories as follows: home is equivalent to residential, work/school to workplace, other/leisure to retail and recreation, and travel to transit. we assumed that % of contacts were active during the week of march .
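the population-level calculation, the bootstrap and the calibration step described above can be sketched as follows. this is a hedged python illustration under stated assumptions, not the authors' implementation: the text uses the boot function in r, whereas this sketch uses a plain resampling loop that carries each participant's age weight along; the "age-adjusted mean of the squared individual reproduction numbers" is read literally as (1/N) Σ w_i z_i R_i², which is an assumption; and all function names are hypothetical.

```python
import math
import random


def growth_to_r(growth_rate, serial_interval):
    """Calibration step: R = exp(r * T)."""
    return math.exp(growth_rate * serial_interval)


def age_weight(p_uk, p_survey):
    """Age weight: share of an age group in the UK over its share in the survey."""
    return p_uk / p_survey


def population_r(r_ind, weights, z):
    """Assumed reading of the text: (1/N) * sum_i w_i * z_i * R_i**2."""
    n = len(r_ind)
    return sum(w * zi * ri ** 2 for ri, w, zi in zip(r_ind, weights, z)) / n


def bootstrap_ci(r_ind, weights, z, n_boot=1000, alpha=0.05, seed=0):
    """Resample participants with replacement; return (mean, lower, upper)."""
    rng = random.Random(seed)
    n = len(r_ind)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(population_r([r_ind[i] for i in idx],
                                  [weights[i] for i in idx],
                                  [z[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return sum(stats) / n_boot, lower, upper
```

in practice the transmission probability would be tuned until `population_r` on the pre-lockdown contact data matches the value returned by `growth_to_r` for the observed growth rate and serial interval.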
we then used the google mobility estimates of the percentage of contacts that were active in subsequent weeks. to simulate $x$% of contacts in a given context being active, we take a random sample without replacement of a proportion $(1 - x/100)$ of all contacts for that context according to age group. the selected contacts are flagged with a comply flag $f_j$ equal to 1. the reduced individual reproduction number is given by:
$$R_i = \tau \sum_{j=1}^{k} n_j d_j \delta_{f_j},$$
where $\tau$ is the probability of transmission, $k$ is the number of contact events, $n_j$ and $d_j$ are the number of individuals in contact $j$ and its duration, and $\delta_{f_j}$ equals zero if $f_j = 1$ and one otherwise. we estimate the mean and % confidence intervals for the reproduction number by sampling contacts then bootstrapping contacts, weighted by age, times and taking the percentile confidence interval. we assumed that no age groups have pre-existing immunity against covid- and all age groups are equally infectious. to simulate school closures, we remove all contacts for the relevant school aged children that have "school" listed as the context for the contact. to capture the % of children who are still attending school, we re-instate a random sample of the removed contacts. we simulate other contacts being active as above. we modelled contact tracing from symptomatic index cases. we assumed that an age-specific proportion of index cases were symptomatic, where index cases under years old had a % chance of being symptomatic, then assuming a linear increase with age in the chance of symptoms up to % for people over years old [ ] . for each contact, we drew a random number to determine if the index case was symptomatic, and therefore eligible for contact tracing. we assumed that either % or % of contacts were traced and isolated before becoming infectious.
this preprint was not certified by peer review.
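the contact-thinning and contact-tracing steps described above can be sketched in python as follows. this is a simplified illustration under stated assumptions rather than the authors' code: it handles a single context and age group at a time, represents each contact as a hypothetical `(n_people, hours)` tuple, and takes the age thresholds and symptom probabilities as parameters because the specific values are not recoverable from the text. all function names are hypothetical.

```python
import random


def thin_contacts(contacts, pct_active, rng):
    """Flag a random (1 - pct_active/100) share of contacts as removed (flag 1)."""
    n_remove = round(len(contacts) * (1 - pct_active / 100))
    removed = set(rng.sample(range(len(contacts)), n_remove))
    return [1 if i in removed else 0 for i in range(len(contacts))]


def reduced_r(contacts, flags, tau):
    """tau * sum_j n_j * d_j over contacts whose flag is 0."""
    return tau * sum(n * d for (n, d), f in zip(contacts, flags) if f == 0)


def p_symptomatic(age, young_age, old_age, p_young, p_old):
    """Constant below young_age and above old_age, linear in between."""
    if age <= young_age:
        return p_young
    if age >= old_age:
        return p_old
    frac = (age - young_age) / (old_age - young_age)
    return p_young + frac * (p_old - p_young)


def trace_flags(n_contacts, age, p_trace, rng, **symptom_params):
    """Flag a contact 1 if the index case is symptomatic and the contact is traced."""
    flags = []
    for _ in range(n_contacts):
        symptomatic = rng.random() < p_symptomatic(age, **symptom_params)
        traced = symptomatic and rng.random() < p_trace
        flags.append(1 if traced else 0)
    return flags
```

flags from `thin_contacts` and `trace_flags` can be combined element-wise (a contact contributes only if both flags are 0) before calling `reduced_r`, which mirrors the assumption that traced and isolated contacts contribute nothing to the reproduction number.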
doi: medrxiv preprint novel coronavirus -ncov: early estimation of epidemiological parameters and epidemic predictions estimates of regional infectivity of covid- in the united kingdom following imposition of social distancing measures social contacts and mixing patterns relevant to the spread of infectious diseases social contacts and mixing patterns relevant to the spread of infectious diseases social encounter networks: characterizing great britain contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project quantifying the impact of physical distance measures on the transmission of covid- in the uk the psychological impact of quarantine and how to reduce it: rapid review of the evidence. the lancet. lancet publishing group individual-based perspectives on r coronavirus (covid- ) in the uk serial interval in determining the estimation of reproduction number of the novel coronavirus disease (covid- ) during the early outbreak transmission dynamics, serial interval and epidemiology of covid- diseases in hong kong under different control measures early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia efficacy of contact tracing for the containment of the novel coronavirus (covid- ) age-dependent effects in the transmission and control of covid- epidemics defining the population attributable fraction for infectious diseases social mixing patterns in rural and urban areas of southern china assessing optimal target populations for influenza vaccination programmes: an evidence synthesis and modelling study introducing vaccination against serogroup b meningococcal disease: an economic and mathematical modelling study of potential impact social encounter networks: characterizing great britain google covid- community mobility reports key: cord- - v qufw authors: vierlboeck, maximilian; nilchiani, roshanak r; edwards, christine m title: the easter and passover blip in new york city date: - - journal: 
nan doi: . / . . . sha: doc_id: cord_uid: v qufw abstract and executive summary - when it comes to pandemics such as the current covid- pandemic [ ], various issues and problems arise for infrastructures and institutions. due to possible extreme effects, such as hospitals potentially running out of beds or medical equipment, it is essential to lower the infection rate to create enough capacity to attend to the affected people and allow enough time for a vaccine to be developed. unfortunately, this requires that measures put into place are upheld long enough to reduce the infection rate sufficiently. in this paper, we describe research simulating the influence of the contact rate on the spread of the pandemic using new york city as an example (section iv), with particular attention to the already observed effects of contact rate increases during holidays [ - ] (section v). in multiple simulation scenarios for the passover and easter holidays, we evaluated %, %, %, and % temporary increases in contact rates, using a scenario close to the currently reported numbers as reference and contact rates based on bioterrorism research as a 'normal' baseline for nyc. the first general finding from the simulations is that singular events of increased visits/contacts amplify each other disproportionately if they happen close together in time. the second general observation is that contact rate spikes leave behind a permanently increased and devastating infection rate, even after the contact rate returns to the reduced one. in the case of a temporarily sustained increase in contact rate for just three days in a row, the aftermath is an increase in infection rate of up to %, which doubles the fatalities in the long run.
in numbers, since increases of % and % seem most likely given the data seen in germany for the easter weekend, for example [ , ], our simulations show the following increases (compared to the realistic reference run): for a temporary % surge in contact rate, the total cases grew by , , the maximum of required hospitalizations over time increased to , , and the total fatalities climbed by , accumulated over days. as for the % surge, we saw the total number of cases rise by , , the maximum number of required hospitalizations increase to , , and the total number of fatalities climb by , over days in nyc. all in all, we conclude that even very short, temporary increases in contact rates can have disproportionate effects and result in phenomena that can hardly be reversed or managed later. the numbers show possible phenomena before they develop effects in reality. this is important because phenomena such as the described blip can affect hospitals in reality. therefore, we warn that a wave of infections due to increased contact rates during passover/easter might follow! "a pandemic is the worldwide spread of a new disease" [ ] the above definition by the who describes the current global situation with regard to covid- , which has emerged worldwide in the past months. as of the writing of this paper (april ), there are , , confirmed covid- cases worldwide and , confirmed deaths in over countries [ ] . the virus is confirmed to be transmissible from human to human [ , ] and has been spreading continuously through contact between individuals. the problem with the spread, though, is that while it may look like a simple mathematical model, it is a dynamic complex system that does not necessarily behave linearly. thus, predictions can be difficult, and the actual behavior of the whole system, and therefore outcomes such as fatalities and infrastructure strain, is hard to evaluate.
one way to conduct such evaluations is to design a representative model that simulates and mimics the real world phenomena as closely as possible. with such a model, certain parameters and influences can be assessed by modifying the model and observing its reaction, which is what this paper is about. due to the importance of the human-to-human transmission described above and the contact it involves, the research presented in the following paragraphs examined the effective contact rates between humans in a theoretical dynamic simulation using new york city as an example, in order to determine which factors play which role and how certain influences interact. to that end, various simulations and scenarios were assessed in order to discover different behaviors and potential emergent phenomena that depend on different factors. the second section describes the research methodology, the model used for the simulations, and how the specific simulations were conducted. section iii describes the assumptions that were made in order to design and set up the model, as well as the resulting parameters. sections iv and v then demonstrate and discuss possible and likely scenarios in order to show the behavior of the system and certain emergent phenomena. lastly, section vi summarizes and discusses the outcomes and gives an outlook on how the research might continue. when looking at models for the spread of diseases, sir models present a simple and easily adapted starting point. sir stands for "susceptible-infective-removed" and was first proposed by kermack and mckendrick in [ ] . the model is described as a differential system in which multiple factors depend on each other to determine the behavior of the three levels s, i, and r. the equations herein were as follows [also see ]: with these equations, a simulation system was derived that models the current situation of the covid- spread in a simplified way.
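as an illustration of the kind of system described here, the following is a minimal sketch of a classic sir model integrated with one-day euler steps. the frequency-dependent form of the equations (division by the population n), the symbol names `beta` and `gamma`, and the function name are assumptions made for this sketch; they are not taken from the paper or from its vensim model.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Integrate a textbook SIR model with one-day Euler steps.

    Assumed equations (standard form, not the paper's exact notation):
        dS/dt = -beta * S * I / N
        dI/dt =  beta * S * I / N - gamma * I
        dR/dt =  gamma * I
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # Cap the flows so that no compartment can become negative.
        new_infections = min(beta * s * i / population, s)
        new_removals = min(gamma * i, i)
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Purely illustrative parameter values.
    for day, (s, i, r) in enumerate(simulate_sir(1000, 1, 0.3, 0.1, 50)):
        if day % 10 == 0:
            print(day, round(s, 1), round(i, 1), round(r, 1))
```

the one-day step mirrors the time step the authors report for their model; a smaller step or a higher-order integrator would be a reasonable alternative if accuracy matters more than simplicity.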
since the aforementioned infrastructure and hospital strain was of importance for the coronavirus pandemic, the model was modified to include time delays due to incubation and a portion of the infected people who would not go directly from "infected" to "removed" but would instead move to hospitalization. from hospitalization there were then two options: either a delayed demise of the individual, or a delayed recovery, which adds the individual back to r. these additions modify the equations above as follows and add equations (iv) and (viii): the simulation model based on these parameters was set up in vensim [ ] with time and calculation steps of one day. a flowchart of the model is depicted in figure . based on the equations and the structure shown in figure , the model was designed to allow for a flexible adjustment of the parameters, which will be described in section . with the model in place, the chosen research methodology was applied as described by maria [ ] . herein, after the problem definition described above as a first step, the parameters of the model were set to yield an adequate and verifiable outcome. this verification was conducted by comparing the model results to real world data reported during the current pandemic. with the set parameters (also see next section), multiple scenarios were simulated and examined based on various conditions, always derived from real and current circumstances. these scenarios will be described in the fourth and fifth sections. the outcomes were compared across the different levels of the simulation's components. for example, the infection rates and total cases could be compared to determine the speed of the spread and therefore the rise of the total case number over time.
another option is the comparison and evaluation of fatality numbers and hospital strain over time to assess how different scenarios affect the end results and possibly discover potential shortages at certain times. these scenarios then allowed for a general evaluation and also the discovery of the main focus of this paper, the phenomenon we called the "easter blip" (section v). based on the results, predictions of possible behaviors of the current pandemic were deduced to potentially support governing and regulating decisions in order to avoid and mitigate unwanted situations such as high fatality numbers or a collapse of medical support. the next section describes the assumptions the model was based upon to allow for simulations that mimic the current real world behavior as far as feasible. in order to design a model that could mimic and simulate the real world pandemic, the factors described in equations (iv) through (viii) above had to be set so that the simulation results would be in accordance with real world situations and data. this section therefore outlines the assumptions that were made to achieve that accordance. the following sub-sections each describe one parameter, based on new york city (nyc) in , with a population of , , people [ ] . the infection rate of the model, which describes at what rate the susceptible population will be infected, was defined depending on two factors: infectivity (i) and effective contact rate (c). these two factors together with the infectious population (i) and the susceptible population (s) allow the calculation of the infection rate according to the following formula: the infectivity (i) was defined as a constant based on the likelihood of infection when people interact; it was derived from various sources and set to % [ , ] due to the higher population density of nyc compared to the locations of the source data.
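the dependence of the infection rate on contact rate, infectivity, and the two populations can be sketched as follows. the exact formula did not survive in the text, so the multiplicative form below (a common system-dynamics formulation) is an assumption, as are the function and parameter names and the numbers in the example.

```python
def infection_rate(contact_rate, infectivity, infectious, susceptible, population):
    """New infections per day under an assumed mass-action form.

    Each infectious person has `contact_rate` effective contacts per day,
    a fraction susceptible / population of those contacts is with a
    susceptible person, and each such contact transmits with probability
    `infectivity`.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return contact_rate * infectivity * infectious * susceptible / population


if __name__ == "__main__":
    # Illustrative values only; none of them come from the paper.
    print(infection_rate(10.0, 0.05, 100, 900, 1000))
```

because infectivity is held constant, scaling the contact rate scales the infection rate by the same factor in this form, which is what makes the contact rate a convenient single control variable.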
the constant infectivity allowed the infection rate to be modulated and adjusted through the second component, the effective contact rate. this rate was furthermore used to model and simulate real world behavior, since circumstances such as social distancing affect the effective contact rate of the population and are therefore well suited to being modeled this way. for the general magnitude of the effective contact rate, the average number of contacts per person per day in nyc was researched in order to provide a realistic starting point without any measures such as social distancing. based on literature sources, the researched contact rate in nyc ranged from for people who do not use the subway up to at least for people who do use the subway [ ] . since this data was obtained and measured in and the population of nyc has increased by % since then, this would yield contact rates of . and . today. given that the number of subway users in nyc is higher than in any other city in the united states [ ] , it was assumed that % of the nyc population take the subway on a daily basis and therefore are more active and effectively have more contact, also through surfaces. together with the number of contacts for non-subway users, this would yield an average effective contact rate without restrictions or social distancing of . . all in all, the infection rate therefore was defined by the following equation: the parameters for the hospitalization and recovery rate were assumed to be directly connected, as an infected person would either recover or be hospitalized (see figure ). therefore, the recovery rate was exactly the complement of the hospitalization rate, yielding a recovery probability of one minus the hospitalization probability. since the numbers of hospitalizations vary strongly by age group and therefore depend on demographics, an average hospitalization rate was calculated based on official data by the city of new york [ ] to allow for the use of a constant. the resulting probability was . for hospitalizations and thus . 
for recovery. similarly to the last sub-section, the parameters for the hospital recovery and death rate were also assumed to be directly connected, as a hospitalized person would either recover or die. therefore, the hospital recovery rate was exactly the complement of the death rate, yielding a hospital recovery probability of one minus the death probability. since the death rate for people already hospitalized is much higher than the death rate of the virus in general, it was calculated based on the number of confirmed deaths and hospitalizations also provided by the city of new york [ ] , which resulted in a death rate of . and hence a hospital recovery rate of . . the first positive covid- case was reported in new york city on march st . unfortunately, this is only the first confirmed positive case and not necessarily, or even likely, the first case in general. throughout the spread of the virus, only cases that tested positive were reported, and therefore a pool of people who carry the virus without being aware of it has to be assumed throughout. this is further exacerbated by the fact that it is possible to carry the virus without ever showing symptoms [see ] . thus, the number of covid- cases resulting from a simulation has to be considerably higher than what the real data represents. actual numbers and estimates for the unknown cases are hard to find, and estimates range from over percent unknown cases [ ] to ten times the confirmed number or more [ ] . therefore, the number of unknown cases in the model was adjusted so that the model aligned with the reported real time data from march st to march th . in order to achieve this, the model was set to infections at the time of the first reported case. this led to a realistic outcome of the simulation and also served as verification of the design, as the fatality rate and the case numbers correlated with the data once the unknown cases were taken into consideration. with these settings and parameters, the scenarios for the simulation could be run and evaluated.
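the complementary rate pairs described in this section (hospitalization versus direct recovery, and death versus hospital recovery) can be sketched as a simple flow split. the function names, the dictionary keys, and the example numbers are hypothetical, and the sketch leaves out the incubation and hospitalization delays that the model also contains.

```python
def split_outflow(flow, branch_fraction):
    """Split a daily flow into (branch, remainder) using complementary fractions."""
    if not 0.0 <= branch_fraction <= 1.0:
        raise ValueError("branch_fraction must lie between 0 and 1")
    branch = flow * branch_fraction
    return branch, flow - branch


def route_outflows(leaving_infection, hospitalization_fraction,
                   leaving_hospital, death_fraction):
    """Route people leaving the infected and hospitalized compartments."""
    hospitalized, recovered_directly = split_outflow(
        leaving_infection, hospitalization_fraction)
    deaths, recovered_from_hospital = split_outflow(
        leaving_hospital, death_fraction)
    return {
        "hospitalized": hospitalized,
        "recovered_directly": recovered_directly,
        "deaths": deaths,
        "recovered_from_hospital": recovered_from_hospital,
    }


if __name__ == "__main__":
    # Illustrative fractions only; the paper's values are not reproduced here.
    print(route_outflows(100.0, 0.2, 50.0, 0.25))
```

defining each pair through a single fraction guarantees that nobody is lost or double counted when one of the two rates is changed.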
since the measures and regulations that were put into place are hard to quantify, the first scenarios address the effects of such measures and show how they could have affected the numbers. the ensuing scenarios then evaluate possible future occurrences. the following fourth section covers these scenarios and discusses the general effects of the variable in the simulation, the effective contact rate. the fifth section then discusses a possible and presumably likely phenomenon that could await in the near future, including its implications. as described above, the first baseline scenario establishes which trajectory the real world data most likely followed, in order to understand what the measures that were put into place changed and how they affected the model. as mentioned, the variable to be manipulated is the effective contact rate, which directly affects the infection rate. the first measure put in place in nyc was social distancing and the closing of certain institutions and stores, to be implemented immediately. this was accompanied by companies moving employees to work from home or stopping work altogether. official orders, for example, went into effect on march th and march st after the national emergency was declared on march th [see ]. therefore, the scenario below was constructed for the simulation to evaluate the effects. in a first run, the two dates were used to introduce step reductions of various heights in the effective contact rate, and the effects on the infection rate were compared in the form of a graph (figure ). looking at the outcome, we see that the different effective contact rate steps flatten the curve significantly and stretch out the infections. looking at the data as of april , which would correspond to the day in the simulation, the fatality count in nyc was , , which corresponds to the run above that included two steps with a reduction of each time.
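the step reductions used in this first scenario can be expressed as a small schedule function. this is a sketch under assumptions: the function name, the representation of steps as (start day, reduction) pairs, and the example values are illustrative and do not reproduce the dates or step heights used in the paper.

```python
def stepped_contact_rate(day, baseline, steps):
    """Effective contact rate on `day` after cumulative step reductions.

    `steps` is a list of (start_day, reduction) pairs. Every step whose
    start day has been reached lowers the rate by its reduction; the
    result is never allowed to drop below zero.
    """
    rate = baseline
    for start_day, reduction in steps:
        if day >= start_day:
            rate -= reduction
    return max(rate, 0.0)


if __name__ == "__main__":
    # Two equal steps, mirroring the two order dates in spirit only.
    example_steps = [(10, 2.0), (20, 2.0)]
    for day in (5, 10, 25):
        print(day, stepped_contact_rate(day, 10.0, example_steps))
```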
unfortunately, as figure and show, this version of scenario does not perform well in reducing the infection rate over time, and the fatalities keep increasing exponentially despite the measures. this is because the reductions are not significant enough overall to have a helpful impact. furthermore, such a scenario, while plausible and possible, is not realistic, since measures put in place do not take effect all at once, nor does everyone adhere to them immediately when they go into effect. a continuous reduction is therefore more realistic, which is why such a scenario is presented and evaluated in scenario . the second scenario, as alluded to above, evaluates the effects of gradually reduced effective contact rates over a number of days. this way, the rates decrease over time until they reach certain events or a limit, which is more realistic since people adjust to new circumstances, in this case regulations, gradually over time. thus, the starting point of the regulations mentioned in the previous scenario was used to introduce effective contact rate reductions with a delay of one day. for example, the blue line in figure indicates a reduction of for the effective contact rate after day for consecutive days until the rate reaches . . (figure legend: steps of to . ; steps of to . ; steps of . to . ; steps of . to . ) all rights reserved. no reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted april , . . https://doi.org/ . / . . . doi: medrxiv preprint with this data, we can see that the run with the steps of . down to . is closest to reality and approaches the fatality number currently reported.
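the gradual reduction of the second scenario can be sketched as a linear ramp with a floor. the function name, the parameter names, and the example values are assumptions for illustration; in particular, treating `start_day` as the last day at the baseline rate is one possible reading of the one-day delay mentioned above.

```python
def ramped_contact_rate(day, baseline, start_day, daily_reduction, floor):
    """Effective contact rate that falls by a fixed amount per day.

    The rate stays at `baseline` up to and including `start_day`, then
    drops by `daily_reduction` on each following day until it reaches
    `floor`, where it stays.
    """
    if day <= start_day:
        return baseline
    return max(baseline - daily_reduction * (day - start_day), floor)


if __name__ == "__main__":
    # Illustrative values only.
    for day in (5, 12, 100):
        print(day, ramped_contact_rate(day, 10.0, 10, 0.5, 4.0))
```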
furthermore, we see that the gradual reduction of the effective contact rate leads to a peak in the infection rate, followed by a downswing and a subsequent upswing, albeit the latter with a lower gradient of the infection rate over time. therefore, we can say that the gradual reduction of the effective contact rate is an effective measure to control the epidemic and can even dampen the upswing of the virus spread, as shown by figure . so far, we have looked at baseline scenarios which behave the same way over time and have changes that are linear or follow a gradient. unfortunately, this is not at all the case in reality, as singular or short term relaxations of rules or temporary exceptions can cause major changes to the effective contact rate for a brief period of time. such events can be short ones that increase the effective contact rate momentarily, but also longer time periods that show an increase or decrease, such as seasons, for example [see ]. since the increases are more critical than the decreases, we focus on them in this section. at the time of this writing (april , ), easter is taking place, and various other religious holidays have recently happened or are coming up in the near future. during such holidays, people tend to congregate, attend religious gatherings such as masses, and visit family members. after a prolonged period of solitude, the perceived need and yearning for such close contacts understandably increases, and there have already been reports of planned gatherings [ ] and of measured, significantly increased mobility in germany [ , ] ; in addition, people (including two of the authors) have witnessed good friday gatherings at homes in new jersey and new york, for example. these phenomena raise the question of what could happen if people give in to their yearnings and defy recommendations and regulations.
hence, this section will look at possibilities in two scenarios to estimate the implications of such defiance in order to enable a prediction regarding the outcome if the cause cannot be prevented. scenario will assess the possibility of increased effective contact rates on separate occasions and scenario will assess short periods of increases. as a basis for the scenarios, the trajectory closest to reality of scenario will be utilized. to utilize a real life example we simulated the run from scenario with the steps of . down to . and implemented two short increases in effective contact rate for good friday and easter sunday. in order to simulate various severities of increases, four runs were conducted with increments of % yielding the last run as a return all the way back to the effective contact rate c of . . the results are depicted in figure through on the next pages and discussed hereinafter. the figures through on the next two pages demonstrate the effects that short outbursts can have and a few takeaways have to be mentioned and pointed out. first, a return to the effective contact rates of a "normal" state can increase the infection rates temporarily by % as the first day with increased effective contact rates amplifies the second one. this is due to the decrease in between those two dates not being sufficient for the measures to fight back the short upswing in a limited time. therefore, these two increases could yield hundreds of thousands of new infections and thus could also even double the number of hospitalized patients. second, in the long run, these short increases in effective contact rates can have detrimental impacts when it comes to the fatality numbers as a result of the increased hospitalizations. 
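short increases of the effective contact rate on selected days can be sketched as follows. reading the four runs as equal fractions of the way back from the reduced rate to the "normal" rate is an assumption of this sketch, as are the function name, the parameter names, and all example values.

```python
def contact_rate_with_spikes(day, reduced_rate, normal_rate, spike_days,
                             return_fraction):
    """Effective contact rate with temporary returns toward the normal rate.

    On days listed in `spike_days` the rate moves `return_fraction` of the
    way from `reduced_rate` back to `normal_rate` (1.0 means a full
    return); on all other days the reduced rate applies.
    """
    if not 0.0 <= return_fraction <= 1.0:
        raise ValueError("return_fraction must lie between 0 and 1")
    if day in spike_days:
        return reduced_rate + return_fraction * (normal_rate - reduced_rate)
    return reduced_rate


if __name__ == "__main__":
    # Two separate spike days with one ordinary day in between.
    separate_days = {40, 42}
    for day in (39, 40, 41, 42):
        print(day, contact_rate_with_spikes(day, 2.0, 10.0, separate_days, 0.5))
```

passing a set of consecutive days, for instance `{40, 41, 42}`, turns the same helper into a temporarily sustained increase rather than two separate ones.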
in the worst case, this could lead to an increase in fatality numbers of % after days, not taking into consideration that hospitals might be overloaded and forced into triage procedures, where limited resources have to be allocated and decisions have to be made about which patients can receive treatment at all. overall, this scenario shows that even singular increases can have detrimental impacts and make the difference between the hospital infrastructure being overloaded or able to handle the demand. in addition, the simulation shows the numbers immediately, whereas in reality the incubation time might lead to a delay, so that individuals infected over easter could potentially affect the medical infrastructure one to two weeks later. with these aspects in mind, the last scenario will assess the worst possible option, a temporarily sustained increase. (figure caption, hospitalizations: the number of people who require hospitalization in scenario each day after the delay of the incubation. this represents the required hospitalizations, which may exceed the real capacities of the hospitals and therefore cause shortages and possibly even triage situations as described before. the predicted hospitalization numbers allow for estimation of the necessary resources for the simulated area. legend: reference, %, %, %, %.) the scenario described above assessed short singular increases, which might happen again in the future for certain events.
this leaves the question, though, since the last scenario already showed an interplay between two singular increases, of how sustained increases, even if limited to a few days, affect the numbers and whether the reciprocity effects multiply. therefore, this last scenario assesses a constant increase over easter weekend, for example if people spend multiple days with family or at other gatherings, which is not unusual. again, in order to simulate various severities of increases, four runs were conducted with differences of %, with the last run being a return all the way to the effective contact rate of . for three days (good friday through easter sunday). the results are depicted in figure through and discussed below. figure shows the hospitalizations over days for demonstration purposes. the figures resulting from the last scenario show that the effects are partly as expected based on scenario , since the infection rate rises steadily with every day the increase persists, and the impact of the measures once they are back in effect is therefore also reduced. for example, for the infection rate on the first day after the increased period, the numbers are between % and % higher than they were in the respective runs of scenario . this means that each day the increase persists will have permanent effects on the infection rates even once the effective contact rate goes back down. this permanent influence can have extreme ripple effects on the hospitalization and fatality numbers, as shown by figures and : the hospitalization numbers are between . % and . % higher than the respective runs of scenario and between % and % higher than the reference run; the fatality numbers are between . % and . % higher than the respective runs of scenario and between . % and % higher than the reference run over days.
all in all, we can see that a temporarily sustained increase not only raises the numbers while it lasts, it also affects the numbers after it subsides, as it permanently increases the severity of the pandemic. this allows for two conclusions: one, it is imperative to prevent such increases at any cost, and two, if they are inevitable, they have to be kept as low and short as possible to minimize their permanent impact. this concludes the simulations and scenarios assessed in this research. the last section gives an overview and summary, including a conclusion and an outlook regarding future research opportunities and plans. the previous sections have shown that the effects of a dynamic and complex system such as this pandemic are by no means linear or predictable by mere extrapolation. even with the measures and the current standings, short increases and possibly even returns to "normal" effective contact rates can have detrimental outcomes that cause permanent effects impossible to undo even when caught early. the two simulation scenarios and demonstrated that even single short increases can show these behaviors and that temporarily sustained ones increase and amplify the impact through reciprocity.
our simulations have shown that increases permanently raise infection rates after subsidence by as much as %, and that higher surges, such as a return to "normal" and therefore a % increase of the effective contact rate, would increase the infection rate temporarily by over , %. these effects ripple through the system and affect hospitalizations and ultimately fatalities, increasing the former by as much as % at the peak and the latter by as much as % in the worst case compared to the references without contact increases. in numbers, since increases of % and % seem most likely given the data seen in germany for the easter weekend, for example [ , ] , our simulations show the following increases (compared to the realistic reference run) for a temporary % surge in contact rate: the total cases grew by , , the maximum of required hospitalizations over time increased to , , and the total climb in fatalities was , accumulated over days. as for the % surge, we saw the total number of cases rise by , , the maximum number of required hospitalizations increase to , , and the total number of fatalities climb by , over days in nyc. in conclusion, the numbers and scenarios demonstrated that increases of any kind have to be prevented at any cost in order not to permanently affect the progress of the pandemic containment. if such increases cannot be prevented, it is imperative to keep them as short as possible and, if necessary, to separate the peaks as much as possible in order to allow for regulation and mitigation in between. furthermore, other strategies such as stricter regulations could help mitigate singular increases that have already happened. as described in the previous section, the results obtained in this simulation possess a certain predictive power, as they show possible phenomena, such as increased infection rates and their implications, before they develop effects in reality.
this is especially important when it comes to the hospitalization rates, as increased infection rates or even short phenomena such as the described easter blip can significantly affect hospitals in reality. thus, the results allow predictions, to an extent, of when a wave resulting from an increase in infection numbers might arrive. this can allow authorities to assign resources accordingly or at least prepare for possible impacts, especially since data seen in reality already shows the trajectory of the evaluated scenarios [ , ] . as for future research, other measures and effects, such as protective gear for the public, can be assessed, as they might reduce the infectivity and/or the effective contact rate, for example. this would allow for a selective use of such measures wherever necessary in order to purposefully utilize their effects. moreover, research on other branches and population groups, such as emts and police, is planned, as the impact of the pandemic on such forces is also important for general public safety. we see that the current pandemic affects all our lives and will most likely continue to do so for, as of the time of this writing, the unforeseeable future. fortunately, the research conducted allows simulation and mimicking of reality with predictive power, and we will continue to adjust our models to include any new and important occurrences. staying home and social distancing are our most powerful weapons in fighting this pandemic, but they only work if everyone participates; widespread individual exceptions can be neither granted nor accepted, as they can sabotage the whole mission. let's all do our part and participate in the fight; everyone can and everyone has to! stay safe, stay home, stay healthy!
coronavirus disease (covid- ) pandemic covid- -mobility trends reports deutsche sind immer mehr unterwegs states are restricting easter gatherings amid covid- . churches and lawmakers are pushing back world health organization gisaid: global initiative on sharing all influenza data -from vision to reality genome composition and divergence of the novel coronavirus ( -ncov) originating in chinacell a contribution to the mathematical theory of epidemics differential equations and mathematical biology vensim software introduction to modeling and simulation city of new york. www .nyc.gov/site/planning/planning-level/nyc-population/currentfuture-populations.page modelling transmission and control of the covid- pandemic in australia covid- update: transmission % or less among close contacts bioterrorism: mathematical modeling applications in homeland security. philadelphia, pa: society for industrial and applied mathematics means of transportation to work by selected characteristics city of new york. www presumed asymptomatic carrier transmission of covid- estimation of covid- outbreak size in italy based on international case exportations the total number of italian coronavirus cases could be ' times higher' than known tally, according to one official what's closed? what's mandatory? how tri-state covid- action affects daily life recurrent outbreaks of measles, chickenpox and mumps: i. seasonal variation in contact rates key: cord- -a ig t authors: hellewell, joel; abbott, sam; gimma, amy; bosse, nikos i; jarvis, christopher i; russell, timothy w; munday, james d; kucharski, adam j; edmunds, w john; working group, cmmid ncov; funk, sebastian; eggo, rosalind m title: feasibility of controlling -ncov outbreaks by isolation of cases and contacts date: - - journal: nan doi: . / . . . 
sha: doc_id: cord_uid: a ig t background: to assess the viability of isolation and contact tracing to control onwards transmission from imported cases of -ncov. methods: we developed a stochastic transmission model, parameterised to the -ncov outbreak. we used the model to quantify the potential effectiveness of contact tracing and isolation of cases at controlling a ncov-like pathogen. we considered scenarios that varied in: the number of initial cases; the basic reproduction number r ; the delay from symptom onset to isolation; the probability contacts were traced; the proportion of transmission that occurred before symptom onset, and the proportion of subclinical infections. we assumed isolation prevented all further transmission in the model. outbreaks were deemed controlled if transmission ended within weeks or before cases in total. we measured the success of controlling outbreaks using isolation and contact tracing, and quantified the weekly maximum number of cases traced to measure feasibility of public health effort. findings: while simulated outbreaks starting with only initial cases, r of . and little transmission before symptom onset could be controlled even with low contact tracing probability, the prospects of controlling an outbreak dramatically dropped with the number of initial cases, with higher r , and with more transmission before symptom onset. across different initial numbers of cases, the majority of scenarios with an r of . were controllable with under % of contacts successfully traced. for r of . and . , more than % and % of contacts respectively had to be traced to control the majority of outbreaks. the delay between symptom onset and isolation played the largest role in determining whether an outbreak was controllable for lower values of r . for higher values of r and a large initial number of cases, contact tracing and isolation was only potentially feasible when less than % of transmission occurred before symptom onset. 
interpretation: we found that in most scenarios contact tracing and case isolation alone is unlikely to control a new outbreak of -ncov within three months. the probability of control decreases with longer delays from symptom onset to isolation, fewer cases ascertained by contact tracing, and increasing transmission before symptoms. this model can be modified to reflect updated transmission characteristics and more specific definitions of outbreak control to assess the potential success of local response efforts. evidence before this study contact tracing and isolation of cases is a commonly used intervention for controlling infectious disease outbreaks. this intervention can be effective, but may require intensive public health effort and cooperation to effectively reach and monitor all contacts. when the pathogen has infectiousness before symptom onset, control of outbreaks using contact tracing and isolation is more challenging. this study uses a mathematical model to assess the feasibility of contact tracing and case isolation to control outbreaks of -ncov, a newly emerged pathogen. we used disease transmission characteristics specific to the pathogen and therefore give the best available evidence if contact tracing and isolation can achieve control of outbreaks. contact tracing and isolation may not contain outbreaks of -ncov unless very high levels of contact tracing are achieved. even in this case, if there is asymptomatic transmission, or a high fraction of transmission before onset of symptoms, this strategy may not achieve control within three months. as of th february , there have been over , confirmed cases of a novel coronavirus infection ( -ncov), including over international cases, and over reported deaths . control measures have been instigated within china to try to contain the outbreak . as infectious people arrive in countries or areas without ongoing transmission, efforts are being made to halt transmission, and prevent potential outbreaks , . 
isolation of confirmed and suspected cases, and identification of contacts are a critical part of these control efforts. it is not yet clear if these efforts will achieve control of transmission of -ncov. isolation of cases and contact tracing becomes less effective if infectiousness begins before the onset of symptoms , . for example, the severe acute respiratory syndrome (sars) outbreak that began in southern china in was amenable to eventual control through tracing contacts of suspected cases and isolating confirmed cases because the majority of transmission occurred after symptom onset . these interventions also play a major role in response to outbreaks where onset of symptoms and infectiousness are concurrent , for example ebola virus disease , and mers , , and for many other infections , . the effectiveness of isolation and contact tracing methods hinges on two key epidemiological parameters: the number of secondary infections generated by each new infection and the proportion of transmission that occurs prior to symptom onset . in addition, the probability of successful contact tracing and the delay between symptom onset and isolation are critical, since cases remain in the community where they can infect others until isolation , . transmission prior to symptom onset could only be prevented by tracing contacts of confirmed cases and testing (and quarantining) those contacts. cases that do not seek care, potentially due to subclinical or asymptomatic transmission represent a further challenge to control. if -ncov can be controlled by isolation and contact tracing, then public health efforts should be focussed on this strategy. however, if this is not enough to control outbreaks, then additional resources may be needed for additional interventions. there are currently key unknown characteristics of the transmissibility and natural history of -ncov; for example, whether transmission can occur before symptom onset. 
therefore we explored a range of epidemiological scenarios that represent potential transmission properties based on current information about -ncov transmission. we assessed the ability of isolation and contact tracing to control disease outbreaks using a mathematical model. by varying the efficacy of contact tracing efforts, the size of the outbreak when detected, and the promptness of isolation after symptom onset, we show how viable it is for countries at risk of imported cases to use contact tracing and isolation as a containment strategy. we implemented a branching process model in which the number of potential secondary cases produced by each individual (the 'infector') is drawn from a negative binomial distribution with a mean equal to the reproduction number, and heterogeneity in the number of new infections produced by each individual. each potential new infection was assigned a time of infection drawn from the serial interval distribution. secondary cases were only created if the infector had not been isolated by the time of infection. in the example in figure , person a can potentially produce three secondary infections (because three is drawn from the negative binomial distribution), but only two transmissions occur before the case was isolated. thus, a reduced delay from onset to isolation reduced the average number of secondary cases in the model. figure : example of the simulated process that starts with person a being infected. after an incubation period (blue) person a shows symptoms and is isolated at a time drawn from the delay distribution (green) (table ).
a draw from the negative binomial distribution with mean r and distribution parameter determines how many people person a potentially infects. for each of those, a serial interval is drawn (orange). two of these exposures occur before the time that person a is isolated. with probability ρ, each contact is traced; with probability 1-ρ they are missed by contact tracing. person b is successfully traced, which means that they will be isolated without a delay when they develop symptoms. they could, however, still infect others before they are isolated. person c is missed by contact tracing. this means that they are only detected if and when symptomatic, and are isolated after a delay from symptom onset. because person c was not traced, they infected two more people (e and f), in addition to person d, than they would have if they had been isolated at symptom onset. a version with asymptomatic transmission is given in figure s . we initialised the branching process with , , or cases to represent a newly detected outbreak of varying size. initial symptomatic cases were then isolated after symptom onset with a delay drawn from the onset-to-isolation distribution (table ). isolation was assumed to be 100% effective at preventing further transmission; therefore, in the model, failure to control the outbreak resulted from the lack of complete contact tracing and the delays in isolating cases rather than the inability of isolation to prevent further transmission. either % or % of cases became symptomatic, and all symptomatic cases were eventually reported. each newly infected case was identified through contact tracing with probability ρ. secondary cases that had been traced were isolated immediately upon becoming symptomatic. cases that were missed by contact tracing (probability 1-ρ) were isolated when they became symptomatic with a delay drawn from the onset-to-isolation distribution.
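the branching process described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the authors' implementation (their r code is in the ringbp package): the function names, the weibull and normal placeholder distributions and every numeric parameter are hypothetical, and subclinical cases are left out for brevity.

```python
import math
import random


def draw_negative_binomial(rng, mean, dispersion):
    """Negative binomial draw (given mean and dispersion) via a gamma-Poisson mixture."""
    rate = rng.gammavariate(dispersion, mean / dispersion)
    # Poisson draw by multiplying uniforms (Knuth); adequate for small rates.
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outbreak(rng, r0, dispersion, trace_prob, initial_cases,
                      max_cases=500, max_days=120.0):
    """Simulate one outbreak and return the infection time (days) of every case."""
    # Each pending case is (time of infection, traced by contact tracing?).
    pending = [(0.0, False)] * initial_cases
    infection_times = []
    while pending and len(infection_times) < max_cases:
        infected_at, traced = pending.pop()
        infection_times.append(infected_at)
        incubation = rng.weibullvariate(6.5, 2.3)  # hypothetical scale, shape
        onset = infected_at + incubation
        # Traced cases are isolated at symptom onset, missed cases after a delay.
        delay = 0.0 if traced else rng.weibullvariate(4.0, 1.5)  # hypothetical
        isolated_at = onset + delay
        for _ in range(draw_negative_binomial(rng, r0, dispersion)):
            # Placeholder serial interval centred on the infector's incubation period.
            exposure = infected_at + max(1.0, rng.gauss(incubation, 2.0))
            # A secondary case exists only if the infector was not yet isolated.
            if exposure < isolated_at and exposure <= max_days:
                pending.append((exposure, rng.random() < trace_prob))
    return infection_times


# Example run with illustrative (assumed) parameter values.
example = simulate_outbreak(random.Random(1), r0=2.5, dispersion=0.16,
                            trace_prob=0.8, initial_cases=5)
```

the cap on cases and days is only there to guarantee that the loop terminates; repeating the call many times per scenario and counting how often transmission dies out gives a rough analogue of the probability of control.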
in addition, each case had an independent probability of being subclinical (asymptomatic), and was therefore not detected either by self-report or by contact tracing. new secondary cases caused by an asymptomatic case were missed by contact tracing and could only be isolated based on symptoms. the model includes isolation of symptomatic individuals only, i.e. no quarantine, so isolation cannot prevent transmission before symptom onset. quarantining contacts of cases requires a considerable investment in public health resources, and has not been widely implemented for all contacts of cases. we ran , simulations of each combination of the proportion of transmission before symptom onset, r , onset-to-isolation delay, the number of initial cases, and the probability that a contact was traced (table ). we explored two scenarios of delay between symptom onset and isolation: "short" and "long". the incubation period for each case was drawn from a weibull distribution. a corresponding serial interval for each case was then drawn from a skewed normal distribution with the mean parameter of the distribution set to the incubation period for that case, a standard deviation of , and a skew parameter chosen such that a set proportion of serial intervals were shorter than the incubation period (meaning that a set proportion of transmission happened before symptom onset) (figure ). this sampling approach ensured that the serial interval and incubation period for each case were correlated, and prevented biologically implausible scenarios where a case can develop symptoms very soon after exposure but not become infectious until very late after exposure, and vice versa.
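the sampling scheme in the preceding paragraph can be illustrated with a short python sketch. it assumes that the "mean parameter" acts as the location of the skew normal and that the stated standard deviation acts as its scale; the weibull parameters, the default scale and all function names are hypothetical. the sketch relies on the fact that a skew normal with skew a places a fraction 1/2 - atan(a)/π of its mass below its location, so the skew can be chosen directly from the desired share of pre-symptomatic transmission.

```python
import math
import random


def skew_for_presymptomatic_fraction(fraction):
    """Skew for which a skew normal puts `fraction` of its mass below its location."""
    # For a skew normal with skew a, P(X < location) = 1/2 - atan(a) / pi.
    return math.tan(math.pi * (0.5 - fraction))


def sample_skew_normal(rng, location, scale, skew):
    """Skew-normal draw built from two independent standard normals."""
    delta = skew / math.sqrt(1.0 + skew * skew)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return location + scale * (delta * abs(z0) + math.sqrt(1.0 - delta * delta) * z1)


def sample_case_timing(rng, presymptomatic_fraction,
                       weibull_scale=6.5, weibull_shape=2.3, scale=2.0):
    """Return (incubation period, serial interval) in days for one case."""
    # Hypothetical Weibull parameters; the source does not state them here.
    incubation = rng.weibullvariate(weibull_scale, weibull_shape)
    skew = skew_for_presymptomatic_fraction(presymptomatic_fraction)
    # Centring the serial interval on the incubation period correlates the two.
    return incubation, sample_skew_normal(rng, incubation, scale, skew)
```

this sketch does not truncate the serial interval, so rare draws can be very short or negative; a simulation built on it would need to decide how to handle those.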
there are many estimates of the reproduction number for the early phase of the -ncov outbreak in wuhan, china, and therefore we used the values . , . , and . , which span most of the range of current estimates (table ). we used the secondary case [...] short delay to isolation, % of transmission before symptom onset, and % subclinical infection. table . b) the incubation distribution estimate fitted to data from the wuhan outbreak by backer et al. c) an example of the method used to sample the serial interval for a case that has an incubation period of days. each case has an incubation period drawn from the distribution in b; their serial interval is then drawn from a skewed normal distribution with the mean set to the incubation period of the case. in c, the incubation period was days. the skew parameter of the skewed normal distribution controls the proportion of transmission that occurs before symptom onset; the three scenarios explored are < % of transmission before onset (grey), % of transmission before onset (gold), and % of transmission before onset (pink). outbreak control was defined as no new infections between and weeks after the initial cases. outbreaks that reached , cumulative cases were assumed to be too large to control within - weeks, and were categorised as uncontrolled outbreaks. based on this definition, we reported the probability that an outbreak of a -ncov-like pathogen would be controlled within weeks for each scenario, assuming that the basic reproduction number remained constant and no other interventions were implemented.
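the control definition above translates into a simple classification rule, sketched here in python. the window bounds and the cumulative case cap are passed as arguments because their values are not legible in this text; the function names and the example values used with them are assumptions.

```python
def outbreak_controlled(infection_days, quiet_from_week, quiet_to_week, case_cap):
    """True if the outbreak stays below the cap and has no infections in the quiet window."""
    # Outbreaks that reach the cumulative case cap count as uncontrolled.
    if len(infection_days) >= case_cap:
        return False
    start, end = 7 * quiet_from_week, 7 * quiet_to_week
    return not any(start <= day < end for day in infection_days)


def probability_of_control(outbreaks, quiet_from_week, quiet_to_week, case_cap):
    """Share of simulated outbreaks classified as controlled."""
    controlled = sum(
        outbreak_controlled(days, quiet_from_week, quiet_to_week, case_cap)
        for days in outbreaks
    )
    return controlled / len(outbreaks)
```

each element of `outbreaks` is the list of infection times, in days since the initial cases, produced by one simulation run of a scenario.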
the probability that an outbreak is controlled gives a one-dimensional understanding of the difficulty involved in achieving control, because the model places no constraints on the number of cases and contacts that can be traced and isolated. in reality, the feasibility of contact tracing and isolation is likely to be determined both by the probability of achieving control, and the resources needed to trace and isolate infected cases. we therefore reported the weekly maximum number of cases undergoing contact tracing and isolation for each scenario that results in outbreak control. once the weekly number of cases reaches a certain point, it can overwhelm the contact tracing system and affect the quality of the contact tracing effort. all code is available as an r package (https://github.com/epiforecasts/ringbp). the funders of the study had no role in study design, data collection, data analysis, data interpretation, writing of the report, or the decision to submit for publication. all authors had full access to all the data in the study and were responsible for the decision to submit the manuscript for publication. to achieve % of outbreaks controlled required % of contacts to be traced and isolated for scenarios with a reproduction number of . (figure a). the probability of control was higher at all levels of contact tracing when the reproduction number was lower, and fell rapidly for a reproduction number of . . at a reproduction number of .
, the effect of isolation is coupled with the chance of stochastic extinction resulting from overdispersion, which is why some outbreaks were controlled even at % contacts traced. isolation and contact tracing decreased transmission, as shown by a decrease in the effective reproduction number (figure b). for the scenario where the basic reproduction number was . , the median estimate rapidly fell below 1, which indicates that control is likely. for the higher transmission scenarios a higher level of contact tracing was needed to bring the median effective reproduction number below 1. figure : the percentage of outbreaks controlled for the baseline scenario (black), and varied number of initial cases (a), time from onset to isolation (b), percentage of transmission before symptoms (c), and proportion of subclinical (asymptomatic) cases (d). the baseline scenario is r of . , initial cases, a short delay to isolation, % of transmission before symptom onset, and % subclinical infection. results for r = . and . are given in the supplement. a simulated outbreak is defined as controlled if there are no cases between weeks and after the initial cases. the number of initial cases had a large impact on the probability of achieving control. with five initial cases, there was a greater than % chance of achieving control in months, even at modest contact tracing levels (figure a). more than % of these outbreaks were controlled with no contact tracing due to the combined effects of isolation of symptomatic cases and stochastic extinction. the probability of control dropped as the number of initial cases increased, and for initial cases, even % contact tracing did not lead to % of simulations controlled within months.
the delay from symptom onset to isolation played a major role in achieving control of outbreaks (figure b). at % of contacts traced, the probability of achieving control falls from % to % when there is a longer delay from onset to isolation. if there is no transmission before symptom onset then the probability of achieving control is higher for all values of contacts traced (figure c). the difference between % and % of transmission before symptoms had a marked effect on probability to control. we found this effect in all scenarios tested (supplementary figure s ). including only % of cases being asymptomatic resulted in a decreased probability that simulations were controlled by isolation and contact tracing for all values of contact tracing (figure d). for % of contacts traced, only % of outbreaks were controlled, compared with % without subclinical infection. in many scenarios there were between and symptomatic cases within a week (figure ), all of whom would need isolation and would require contact tracing. the maximum number of weekly cases may appear counterintuitive because a lower maximum number of weekly cases is not associated with higher outbreak control. this occurs because with better contact tracing it becomes possible to control outbreaks with higher numbers of weekly cases. the maximum number of weekly cases is lower if the initial number of cases is and higher if it is (see supplement). in the ebola epidemic in liberia, each case reported between and contacts, and the number of contacts may be higher, as seen in mers outbreaks. tracing contacts per case could mean up to , contacts in the week of peak incidence.
figure : the maximum weekly cases requiring contact tracing and isolation in scenarios with index cases that achieved control within months. scenarios vary by reproduction number and the mean delay from onset to isolation. % of transmission occurred before symptom onset, and % subclinical infection. the percentage of simulations that achieved control is shown in the boxplot. this illustrates the potential size of the eventually controlled simulated outbreaks, which would need to be managed through contact tracing and isolation. * indicates that the % interval extends out of the plotting region. we determined conditions where case isolation, contact tracing, and preventing transmission by infected contacts would be sufficient to control a new -ncov outbreak in the absence of other control measures. we found that in many plausible scenarios, case isolation alone would be unlikely to control transmission within three months. case isolation was more effective when there was little transmission before symptom onset and when the delay from symptom onset to isolation was shorter. preventing transmission by tracing and isolating a larger proportion of contacts, thereby decreasing the effective reproduction number, improved the number of scenarios where control was likely to be achieved. however, these outbreaks required a large number of cases to be contact traced and isolated each week, which is of crucial concern when assessing the feasibility of this strategy. subclinical infection markedly decreased the probability of controlling outbreaks within months. in scenarios where the reproduction number was . , % of transmission occurred before symptom onset, and there was a short delay to isolation, at least % of infected contacts
needed to be traced and isolated to give a probability of control of % or more. this echoes other suggestions that highly effective contact tracing will be necessary to control outbreaks in other countries. in scenarios where the delay from onset to isolation was larger, similar to the delays seen in the early phase of the outbreak in wuhan, the same contact tracing success rate of % achieved less than % probability of containing an outbreak. higher pre-symptomatic transmission decreased the probability that the outbreaks were controlled, under all reproduction numbers and isolation delay distributions tested. our model does not include other control measures that may decrease the reproduction number and therefore also increase the probability of achieving control of an outbreak. at the same time, it assumes that isolation of cases and contacts is completely effective, and that all symptomatic cases are eventually reported. relaxing these assumptions would decrease the probability that control is achieved. we also make the assumption that contact is required for transmission between two individuals, whereas transmission via fomites may be possible. this would make effective contact tracing challenging, and good respiratory and hand hygiene would be critical to reduce this route of transmission, coupled with environmental decontamination in healthcare settings. we intentionally simplified our model to determine the effect of contact tracing and isolation on the control of outbreaks under different scenarios of transmission. however, as more data become available, the model can be updated, or tailored to particular public health contexts. it is likely that the robustness of control measures is affected both by differences in transmission between countries and by the concurrent number of cases that require contact tracing in each scenario.
practically, there is likely to be an upper bound on the number of cases that can be traced, and case isolation is likely to be imperfect. we reported the maximum number of weekly cases during controlled outbreaks but the capacity of response efforts may vary. we explored a range of scenarios informed by the latest evidence on transmission of -ncov. similar analyses using branching models have already been used to analyse the wuhan outbreak to find plausible ranges for the initial exposure event size and the basic reproduction number. our analysis expands on this by including infectiousness before the onset of symptoms, case isolation, and explicit modelling of case incubation periods and time to infectiousness. a key area of uncertainty is whether, and for how long, individuals are infectious before symptom onset, and whether asymptomatic or subclinical infection occurs. both are likely to make the outbreak harder to control. the model could be modified to include some transmission after isolation (such as in hospitals), which would decrease the probability of achieving control. in addition, we define an outbreak as controlled if it reaches extinction by months, regardless of outbreak size or number of weekly cases. this definition may be narrowed where the goal is to keep the overall caseload of the outbreak low. a low caseload may matter to local authorities seeking to reduce healthcare surges, and may provide a way to limit geographic spread. our study indicates that in most plausible outbreak scenarios case isolation and contact tracing alone are insufficient to control outbreaks, and that in certain scenarios even near perfect contact tracing will still be insufficient, and therefore further interventions would be required to achieve control. however, rapid and effective contact tracing can also reduce the initial number of cases, which would make the outbreak easier to control overall.
effective contact tracing and isolation could contribute to reducing the overall size of an outbreak or bringing it under control over a longer time period. the authors have no interests to declare. no data were used in this study. the r code for the work is available at https://github.com/epiforecasts/ringbp. . cc-by-nd . international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not peer-reviewed) the copyright holder for this preprint . https://doi.org/ . / . . . doi: medrxiv preprint world health organization world health organization. novel coronavirus ( -ncov) situation report public health management of persons having had contact with novel coronavirus cases in the european union effectiveness of airport screening at detecting travellers infected with -ncov factors that make an infectious disease outbreak controllable comparing nonpharmaceutical interventions for containing emerging epidemics modeling and public health emergency responses: lessons from sars contact tracing performance during the ebola epidemic in liberia implementation and management of contact tracing for ebola virus disease contact tracing for imported case of middle east respiratory syndrome mers-cov close contact algorithm: public health investigation and management of close contacts of middle east respiratory coronavirus (mers-cov) cases (v european centre for disease prevention and control. risk assessment guidelines for diseases transmitted on aircraft.
part : operational guidelines for assisting in the evaluation of risk for transmission by disease active contact tracing beyond the household in multidrug resistant tuberculosis in vietnam: a cohort study the effectiveness of contact tracing in emerging epidemics the transmissibility of novel coronavirus in the early stages of the - outbreak in wuhan: exploring initial point-source exposure sizes and durations using scenario analysis the transmissibility of novel coronavirus in the early stages of the - outbreak in wuhan: exploring initial point-source exposure sizes and durations using scenario analysis report : transmissibility of -ncov effectiveness of ring vaccination as control strategy for ebola virus disease estimating the potential total number of novel coronavirus cases in wuhan city, china superspreading and the effect of individual variation on disease emergence coronavirus: uk patient is university of york student epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in hong kong. the lancet early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia incubation period of novel coronavirus ( -ncov) infections among travellers from wuhan, china influenza forecasting with google flu trends pattern of early human-to-human transmission of wuhan early transmissibility assessment of a novel coronavirus social science research network novel coronavirus -ncov: early estimation of epidemiological parameters and epidemic predictions. medrxiv preliminary estimation of the basic reproduction number of novel coronavirus ( -ncov) in china, from to : a data-driven analysis in the early phase of the outbreak. biorxiv when is contact tracing not enough to stop an outbreak? contact tracing during an outbreak of ebola virus disease in the western area districts of sierra leone: lessons for future ebola outbreak response. 
front public health clinical features of patients infected with novel coronavirus in wuhan, china. the lancet key: cord- -bj u dtk authors: kimathi, mark; mwalili, samuel; ojiambo, viona; gathungu, duncan kioi title: age-structured model for covid- : effectiveness of social distancing and contact reduction in kenya date: - - journal: infect dis model doi: . /j.idm. . . sha: doc_id: cord_uid: bj u dtk coronavirus disease is caused by severe acute respiratory syndrome coronavirus . kenya reported its first case on march , and by march , it instituted physical distancing strategies to reduce transmission and flatten the epidemic curve. an age-structured compartmental model was developed to assess the impact of the strategies on covid- severity and burden. contacts between different ages are incorporated via contact matrices. simulation results show that a % reduction in contacts for a -days period resulted in a . – % reduction in infection severity and deaths, while the -days period yielded a . – . % reduction. the peak of infections in the -days mitigation was higher and happened about months after the relaxation of mitigation, as compared to that of the -days mitigation, which happened a month after mitigations were relaxed. low numbers of cases in children under years were attributed to a high number of asymptomatic cases. high numbers of cases are reported in the – years and – years age bands. the two mitigation periods considered in the study resulted in reductions in severe and critical cases, attack rates, hospital and icu bed demands, as well as deaths, with the -days period giving higher reductions. the coronavirus disease is caused by severe acute respiratory syndrome coronavirus (sars cov ). the first reported case was in mainland china, city of wuhan, hubei on the th of december (li et al., ).
subsequently the disease spread at an exponential rate to countries in contact with china, resulting in the world health organization (who) declaring it a public health emergency of international concern (pheic) on th january (who africa, ). as of st may there were over six million infections globally, with the european region leading in these infections (who, c). in africa, the first case was reported in egypt, followed by algeria (who, a). the first kenyan case was reported on the th of march and by st may there were about , confirmed cases, with nairobi and mombasa leading in these infections (moh-kenya, ). there are mainly three kinds of infections: asymptomatic, pre-symptomatic and symptomatic. the incubation period for covid- , which is the time between exposure to the virus and symptom onset, is on average - days; however, it can be as high as days. for a symptomatic covid- case, the disease manifests itself through symptoms such as fever, coughs, sneezes and headaches, whereas in an asymptomatic case the infected individual does not develop symptoms (who, b). the basic reproduction number, defined as the average number of secondary infections produced by an infectious individual in a population where everyone is susceptible (li et al., ), is affected by the rate of contacts in the host population, the probability of transmission during contact and the duration of infectiousness. it can also vary for different age bands since the attack rates are age-dependent. the basic reproduction number for covid- in kenya ranges from . ( % ci . - . ) to . ( % ci . - . ) (brand et al., ). reduction of the reproduction number can be achieved by instituting appropriate non-pharmaceutical interventions (npis) or by use of a vaccine.
in the absence of a vaccine, social/physical distancing strategies have globally become the most appropriate non-pharmaceutical interventions (npis) (ferguson et al., ) these mitigations can be implemented by reducing social contacts in workplaces, schools, markets and other public areas. social contacts are influenced by age structure of the population and the frequency of contacts across population (prem et al., ) . mathematical models that describe the impact of the npis in reducing morbidity, infection peak sizes, and excess mortality are vital in public-health planning (singh & adhikari, ) . in their first step towards developing a credible model for covid- dynamics in kenya, the authors of this paper studied the impact of social distancing and contaminated environment in the article (mwalili et al., ) . the current study presents an improved model with the aim of predicting the possible trajectory of covid- infections in kenya. similar to other countries in sub-saharan africa, the kenyan government has imposed travel restrictions across counties, dusk-to-dawn curfew and school closure to ensure social distancing in the population and consequently slowed transmission of covid- . although it is not clear for how long these measures should be in place to eradicate the epidemic in kenya, we state that premature and sudden lifting of interventions could potentially lead to a new peak of infections. however, intermittent application of the interventions can flatten the infections curve (prem et al., ) . previous study of covid- in kenya also predicted the risk of epidemic rebound after the social distancing measures are lifted (brand et al., ) . 
in this study, an age-structured seir mathematical model that examines the impact of npis in curbing covid- severity and deaths in kenya is developed, with the aim of achieving the following; (i) assessing the impact of reducing social contacts in different age-groups, (ii) examining the trend in infections during and after the npis, (iii) providing plausible period for lifting the npis. we postulate that this study can form a basis for policy formulation to enable kenya delay the disease transmission and eventually flatten the epidemic curve. the kenya population is split into the four broad age groups (knbs, ): those below years, - years, - years, and above years. these are denoted by subscript , , , i = respectively. each population of age group i is classified as either susceptible i s , the sum of the compartments i n gives the size of the population in age group i . ( ) the exposed i e compartment represents the individuals infected with coronavirus but are not yet infectious, since the virus is in incubation stage. the asymptomatic i a compartment has those individuals who are infectious, but do not exhibit the disease symptoms. the mild i m compartment has infectious individuals who exhibit symptoms of covid- , but their condition does not require hospitalization. the severe i h compartment has infected individuals who need to be hospitalized so as to manage their condition better. finally, the critical i c compartment contains infected individuals whose situation is much worse as compared to a severe case; in that they are required to be in intensive care unit. the susceptible individuals are exposed/infected through contact with infectious individuals from any of the age-groups. after the disease incubation period, exposed individuals progress into either the asymptomatic or mild compartment. infectious individuals who are asymptomatic, are assumed to recover over time whereas the mild cases either recover or progress to the severe compartment. 
depending on the disease severity, the individuals in i h either recover or become critically ill. these critical cases, now in i c either die or their condition improves to a severe case, no longer requiring ventilation. the dynamics of the epidemic in our age-structured model is governed by the flow diagram in figure . the flow diagram yields the following model equations: description of the age-dependent model parameters are presented in the table . human-human transmission of coronavirus depends on whom one is in contact with and where. the place of contact could be at home, school, work, or within the community e.g. markets, restaurants etc. therefore, we assume the susceptible individuals will acquire the virus when they come into contact with an infectious individual, and express the rate of infections and , ( ) i t β as follows: ij c denotes contact matrices and describe the interactions between the considered age-group i with other age-groups j . the constants β and β represent the likelihood of infection upon contact, and based on the basic reproduction number for covid- . the parameter denotes the proportion of mild (symptomatic) infectives who selfisolate to minimize their contacts, which is a control measure encouraged by the health experts during the coronavirus pandemic. contact matrix ij c comprises of contacts at home ( ) h , workplace ( ) w , school ( ) s and other ( ) contact which is not happening at home, work or in school. therefore, we express ij c as follows: matrix for , , , i = is expressed as where for instance the home contact follows: ( ) such that the matrix elements range between and ; with value implying no contact and value implying maximum contact. during the coronavirus epidemic the contact patterns are definitely not the same as compared to the no epidemic times (prem et al., ) . the mixing of different age-group populations has been incorporated in our model equations through contact matrices, ( ) ij c which are used in ( ). 
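as a minimal sketch of how the contact matrices can enter the infection rate, the python snippet below sums the four location-specific matrices (home, work, school, other) into c_ij and computes a per-group force of infection. the functional form is an assumption made for illustration, not the authors' exact equations: it uses frequency-dependent mixing, treats only asymptomatic and mild cases as transmitters, and scales the mild contribution by (1 - alpha) for self-isolation. the names `total_contacts`, `force_of_infection`, `beta_a` and `beta_m` are hypothetical.

```python
def total_contacts(home, work, school, other):
    """Element-wise sum of the four location-specific contact matrices."""
    n = len(home)
    return [
        [home[i][j] + work[i][j] + school[i][j] + other[i][j] for j in range(n)]
        for i in range(n)
    ]


def force_of_infection(C, A, M, N, beta_a, beta_m, alpha):
    """Per-group infection rate under the ASSUMED form
    lambda_i = sum_j C[i][j] * (beta_a*A[j] + beta_m*(1-alpha)*M[j]) / N[j],
    where A[j] and M[j] are asymptomatic and mild counts in group j,
    N[j] is the group size, and alpha is the self-isolating fraction.
    """
    groups = range(len(N))
    return [
        sum(
            C[i][j] * (beta_a * A[j] + beta_m * (1.0 - alpha) * M[j]) / N[j]
            for j in groups
        )
        for i in groups
    ]
```

each susceptible compartment would then lose individuals at the rate lambda_i times its current size; the choice of which infectious compartments contribute is the main modelling decision hidden in this sketch.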
the goal of social distancing measures is to reduce individuals' contacts in schools, workplaces, and the general community. these measures are imposed at different times and remain in place for a given duration. in order to implement the control measures adequately by specifying when each measure started, how long it will be in place, and how effective it is in reducing contacts, we introduce the following time-dependent control u(t); a similar approach is found in (singh & adhikari, ): ( ) where t_on and t_off are respectively the days of imposing and lifting the control measure, t_w is a shape parameter, and the constant e is chosen such that the desired reduction in contact is achieved. u(t) = 0 implies zero contacts, as is the case with closure of schools, and u(t) = 1 means no mitigation measure is in place. we apply this function to the non-household contact matrices as follows: ( ) where the constant u_h ≤ captures the fact that in the absence of "stay at home" measures, adults and school-going children will spend less time at home, hence fewer interactions in homes. in this study, we assume u_h = . for the unmitigated scenario so as to reflect the fewer interactions at home, and let u_h = for restricted movement, in which people are advised to stay at home. using ( ) in ( ) enables us to implement the interventions of school closure, dusk-to-dawn curfew, and movement restriction independently and at the precise time they were instituted. under the dusk-to-dawn curfew, the kenyan government imposed a nationwide curfew requiring citizens to be at home by : p.m. and to leave their homes only after : a.m. the term movement restriction refers to the partial lockdown of travel in and out of nairobi, mombasa, kilifi and kwale counties that the government imposed on th april, .
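the time-dependent control described above can be sketched in python as follows. since the exact expression did not survive in the text, the tanh-based window below is an assumed parameterization in the spirit of the cited approach: `u_min` is a hypothetical stand-in for the level the control reaches while a measure is active (the role played by the constant e in the text), and `control` and `mitigated_contacts` are illustrative names.

```python
import math


def control(t, t_on, t_off, t_w, u_min):
    """Smooth window: close to 1 outside [t_on, t_off], close to u_min inside.

    t_w sets how sharply the measure switches on and off (ASSUMED tanh shape).
    """
    window = 0.5 * (math.tanh((t - t_on) / t_w) - math.tanh((t - t_off) / t_w))
    return 1.0 - (1.0 - u_min) * window


def mitigated_contacts(t, home, work, school, other, u_h, u_w, u_s, u_o):
    """Scale each location matrix at time t.

    u_h is a constant for home contacts; u_w, u_s, u_o are callables of t
    for work, school and other contacts.
    """
    n = len(home)
    return [
        [
            u_h * home[i][j]
            + u_w(t) * work[i][j]
            + u_s(t) * school[i][j]
            + u_o(t) * other[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

passing a separate callable per location lets school closure, curfew and movement restriction start on different days, which matches the independent timing described in the text.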
closure of schools yields a % reduction in the school contacts; as such, u_s(t) c_ij^s will be a matrix of zeros since u_s(t) = 0. we assume that imposing a dusk-to-dawn curfew and restricting movement in and out of hotspots reduces the social mixing in workplaces and in other places (besides home, school, and work) by % and % respectively. we then choose e such that min(u_w(t)) = min(u_o(t)) = . for the curfew, and min(u_w(t)) = min(u_o(t)) = . for the restricted movement. using these minimum values to scale down the non-household contact matrices, we obtain the contact patterns depicted in figure . the actual contact matrices for kenya were unavailable to the authors, so we produced synthetic contact matrices guided by (prem et al., ) and (prem et al., ) by approximating the mean number of contacts per day from the matrices in (prem et al., ), then normalizing and adjusting them to best reflect the kenyan situation. panels (a), (b), and (c) depict the contacts at workplaces between working age groups. panel (b) shows a % reduction in contacts at work due to the time constraints brought about by the dusk-to-dawn curfew, in which the working hours are reduced to allow individuals to get home before dusk. when the movement restriction is instituted for non-essential services, whereby people are not allowed to travel in and out of certain regions, we see far fewer contacts at work in panel (c). panels (d), (e), and (f) show contacts which are dominant along the diagonal and in age groups less than years. these contacts happen in places that are not work, school or home. they therefore constitute contacts in marketplaces, entertainment places, or other social gatherings such as weddings. hence the mixing is highly assortative and is likely to bring into contact individuals of the same age groups but from distant regions.
therefore, it is imperative to control interactions in this category of contacts, otherwise the epidemic would spread very fast in the communities. as shown in panels (e) and (f), the social distancing measures imposed on ij c through ( ) are effective in minimizing these contacts by % and % respectively. in africa, majority of the population is generally less than years, and often in contact with children and their (grand)parents as indicated by the main diagonal and off-diagonals in panels (g), (h), and (i). noting that ministry of health is advising people to stay at home, we postulate that imposing the curfew and movement restriction increases home contacts by . % and % respectively, as shown in panels (h) and (i). to show the impact of the highlighted measures in kenya, we present results for daily and cumulative infections, severe and critical cases, deaths, as well as peak demand for hospital and icu beds. the simulation was done for a one year starting from th march, , but we present results for up to december, since the evolution of the epidemic after this period is subject to uncertainties. to initialize the simulation, we assumed ( ) a = , n , and n are respectively obtained as % , % , % , and % of . × , kenya's total population. the parameter values used are presented in table . for the unmitigated scenario we assumed . r = , which is within the range in the study (brand et al., ) , of covid- in kenya. the transmissions of infection were obtained from ( ) and ( we assumed the infectiousness of asymptomatic individuals to be % while the symptomatic individuals to be %. finally, we assume that % of confirmed symptomatic individuals will self-isolate to reduce their contact with susceptible population i.e. % α = . this control measure is uniform across all the agegroups and assumed part of kenya's mitigation efforts alongside the school closure, curfew and travel restrictions. . the simulation results are depicted in figure and table . 
in figure , the duration of school closure is indicated by cyan shaded region and is overlapped by the duration of implementing the dusk-to-dawn curfew indicated by yellow shaded region. the gray shaded region indicate duration of implementing travel restriction across counties. the interventions begin at different days but they all end at the same day, as shown by the light-gray region for the -days mitigation and darker-gray for the -day mitigation. the social distancing measure lasting for days resulted to a delay of the epidemic peak for about months compared to the unmitigated situation which peaked within - days. the % reduction in contacts for days resulted to between . - % reduction of cumulative infections. when the social distancing measures were in place for days the epidemic peak was delayed for about months compared to the unmitigated scenario. also, the % reduction in contacts for the days resulted to between . - . % reduction of cumulative infections. the peak of infections in the -days mitigation is higher and happens about months after the mitigation is relaxed as compared to that of the -days mitigation, which happens a month after mitigation is relaxed. this is due to insufficient herd-immunity since the infections are quite suppressed during the days as compared to significant presence of infections for the -days mitigation before the measures are relaxed, as shown in figure . also shown is a notable rise in infections after the interventions are lifted. however, due to herd-immunity and the depletion of susceptible in the population the rise in infections is not sustained. from table we show the age dependence in the simulated cases and peaks. in all the cases presented in the table, the numbers for those under years are low. this is the age group with a high number of asymptomatic infections, which are more likely to remain undetected. 
high number of cases are reported for the - years and - years age bands since majority of individuals in these age bands have wider interaction spheres (outside of schools and home), and they form a significant percentage of kenya population. the considered mitigation periods yielded reductions in the key health outputs, although applying the mitigation for entire simulation time of days would have resulted into more significant reductions. however, in reality the population might not withstand the long-term imposing of dusk-to-dawn curfew and travel restrictions. the high numbers of severe and critical cases translate to high demands for hospital and icu beds, and also deaths. in the -day mitigation in table there is an increase in hospital and icu beds peak demands which is likely due to the notable rise in infections after the measures have been relaxed, as shown in figure . the overall and symptomatic attack rates are presented in table and they exhibit agedependency. the younger population have lower attack rates (and lower epidemic peak sizes) as compared to the older population whereby those older than years have the highest overall attack rate, as well as the highest symptomatic attack rate. this result shows the agedependency of exposed individuals progressing to symptomatic cases. the -days mitigation period reduces the attack rates and subsequently flattens the epidemic curve. however, imposing these stringent measures for a prolonged period has adverse effects on the socio-economics of the country. the dependency of the attack rates on age underscores the variability of r across the age bands (van zandvoort et al., ). figure : effect of social distancing strategies on synthetic contact matrices for kenya population. in the unmitigated scenario, there will be maximum contacts in workplaces and other locations (excluding home, schools and workplace). this results to less contacts at home. 
the dusk-to-dawn curfew results to a % reduction in contacts at workplaces and other locations, but assumed to increase the home contacts. the movement restriction yields a % reduction in contacts at workplaces and other locations, but presumed to increase the home contacts by %. the dependency of covid- transmissions, severity and deaths on age is crucial to the design of social distancing measures and projection of the expected disease burden in the country. indeed, the considered interventions do not completely avert the epidemic, but they significantly slow down the transmissions and reduce the infection peak sizes, and deaths. we note that if there is no self-isolation of symptomatic cases, the number of cases and deaths will increase, which will result to the peaks happening earlier in all cases. prolonged implementation of social distancing measures will definitely resolve the epidemic; however, it will damage the country economically. it is not fully known how the epidemic would spread to various counties in kenya, and how people in these counties will react to the npis. there is need for coordination and frequent exchange of information between modeling and surveillance groups in order to refine predictions of the epidemic trajectory. table -simulation outputs of the epidemic in kenya in unmitigated and mitigated situations. age-specific cumulative symptomatic, severe, critical and death cases are displayed. the peak of infections, in days, and peaks of demands for hospital and icu beds, and deaths are also shown. the table also shows age-specific overall (symptomatic) attack rates, which are calculated as the number of infections (symptomatic cases) over the total population of that age band forecasting the scale of the covid- epidemic in kenya estimating the number of infections and the impact of non-pharmaceutical interventions on covid- in european countries. 
imperial college covid- response team distribution of population by administrative units early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia covid- outbreak in kenya seir model for covid- dynamics incorporating the environment and social distancing projecting social contact matrices in countries using contact surveys and demographic data the effect of control strategies to reduce social mixing on outcomes of the covid- epidemic in wuhan, china: a modelling study age-structured impact of social distancing on the covid- epidemic in india response strategies for covid- epidemics in african settings: a mathematical modelling study statement on the meeting of the international health regulations ( ) emergency committee regarding the outbreak of novel coronavirus ( -ncov) who siterep . world health organization a second covid- case is confirmed in africa the authors appreciate the valuable advice offered by peter young and thomas achia of centers for disease prevention and control (cdc), mozambique and kenya respectively. kimathi: conceptualization of this study, methodology, software, results discussion. mwalili: conceptualization of this study, data curation, writing -original draft preparation. ojiambo: conceptualization of this study, writing -review and editing. gathungu: conceptualization of this study, writing -review and editing. none key: cord- - pok authors: nan title: a smartphone magnetometer-based diagnostic test for automatic contact tracing in infectious disease epidemics date: - - journal: ieee access doi: . /access. . sha: doc_id: cord_uid: pok smartphone magnetometer readings exhibit high linear correlation when two phones coexist within a short distance. thus, the detected coexistence can serve as a proxy for close human contact events, and one can conceive using it as a possible automatic tool to modernize the contact tracing in infectious disease epidemics. 
this paper investigates how good a diagnostic test it would be, by evaluating the discriminative and predictive power of the smartphone magnetometer-based contact detection in multiple measures. based on the sensitivity, specificity, likelihood ratios, and diagnostic odds ratios, we find that the decision made by the smartphone magnetometer-based test can be accurate in telling contacts from no contacts. furthermore, through the evaluation process, we determine the appropriate range of compared trace segment sizes and the correlation cutoff values that we should use in such diagnostic tests. with an alarmingly large number of novel pandemics in this century, such as sars, swine flu, mers, ebola, and zika, there have been growing concerns about the "next big one" [ ] . many worry that we deal with them using strategies established over a century ago and technology that has been around for decades, with little innovation generated [ ] . consequently, there are calls for technology-based preparedness [ ] , especially in the areas of infection prevention, case finding, case investigation, and contact tracing [ ] . among others, the information technology (it) sector should respond to the calls, and continuously expand and refine the arsenal of technologies in each of these areas. in this paper, we tackle one of the areas that need technological revamping: contact tracing. on the brink of an infectious disease epidemic, the most urgent task is to trace those who possibly made contact with the infected person(s), in order to cut the chain of infection and prevent it from growing into a wider epidemic. but the traditional contact tracing technique has been predominantly analog. namely, contact graphs are constructed through interviews with confirmed cases, by asking whom they met and where they visited. this is a hugely costly and time-consuming task. worse yet, there is the issue of recall [ ] .
meanwhile, recent outbreaks have been fundamentally different from those of the past -highly mobile populations [ ] and the spread into densely populated cities [ ] -which exacerbate the problem with the traditional contact tracing approach. when there are many potential contacts that an infected person cannot identify or recollect as in our typical urban life, a potent tool we can marshal is the mobile devices such as smartphones. the mobile-based epidemic monitoring is nothing but a logical next step because only the mobile devices that move with people can keep up with the contacts they make. indeed, there have been increasing number of proposals for smartphone-based contact tracing. the employed technologies range from similar global positioning system (gps) positions [ ] , similar wi-fi fingerprints [ ] , bluetooth peer discovery [ ] , and identical cells in mobile communication [ ] . unfortunately, they either provide position information too coarse to be used for infectious contact detection [ ] (gps, cellular/wi-fi fingerprinting), require the infrastructure nearby (cellular/wi-fi), cannot be used indoors (gps), consumes too much power for extended monitoring use (gps) [ ] , or could compromise privacy by exposing the identity of the device and eventually its owner (bluetooth beacons). however, some recent works including our earlier pilot study [ ] , [ ] present a new possibility by demonstrating that a magnetometer traces-based approach can detect close contacts. they exploit the fact that the magnetic field strength is rich in spatial features (e.g. m − to . m − ) [ ] due to various distortions by ferromagnetic materials used in buildings such as reinforced concrete and metal doors. the magnetometer-based approach overcomes most of the aforementioned issues. first, thanks to the omnipresent geomagnetic field, the similarity comparison of the two magnetometer traces works both indoors and outdoors, and does not need any infrastructure support. 
second, it offers better privacy protection by not revealing any identity of the device or the location of the trace generation. third, it detects the coexistence only in close proximity. the lowpower smartphone magnetometers can only be affected by ferromagnetic structures within a few meters [ ] , [ ] . only the co-existing smartphones within this distance can bear sufficient similarity in their magnetometer readings [ ] , [ ] . this last characteristics is especially important as many infectious disease transmissions occur in close distances. public health policies for tracing close contacts or infection control guidance often use a distance of up to meters or feet [ ] . when the disease control authority performs an epidemiological investigation, they can use the smartphone magnetometer traces of the person confirmed infected and of the one suspected of a contact with the infected, in a system depicted in fig. . when people make contacts, they are recorded in their individual phones in the form of similar magnetometer readings. when they want to check if they could have met an infected person, they can ask the system to compare their traces with that of the infected person. since it is a pairwise comparison, it works for the case many people gather at a location. each individual pair from the gathering can be checked using the pairwise comparison method depicted in fig. . in this paper, we assume that it is an emergency situation, and people are cooperating with the disease control authority by downloading an application that records the magnetometer readings and submits it through the phone's cellular connection if necessary. indeed, there are recent efforts that seek such public participation to prepare for the next pandemic outbreak [ ] . in this effort by british broadcasting corporation (bbc), people are encouraged to download an app and activate it for helping model the spreading dynamics in future pandemics. 
the app then meticulously records the trajectory of the smartphone holder, before it reports the trajectory information to a central server. considering this precedent, our own system model in fig. is not excessively unrealistic. the rationale behind such cooperation from the public could be the fear arising from the lack of information [ ] . under the depicted scenario, not only the disease control authority but also each individual user can check for herself whether or not there has been close contact with an infected person. indeed, the world health organization (who) strongly recommends that disease control authorities ensure at-risk populations have the information they need, thereby minimizing social and economic disruption [ ] . although the existing magnetometer-based works have confirmed the feasibility of the idea, they are a far cry from a serious alternative to the traditional contact tracing method. first, many of the operating parameters are still to be determined. in particular, the existing works [ ] , [ ] , [ ] consider only two extreme and impractical cases: either continuous contact or no contact during the whole duration of comparison. however, when two people, possibly strangers, make contact, its duration can be very small compared to the entire span of comparison (which can match the most active transmission period of the disease). in fig. , if the infected person a was confirmed infected at time t and the disease transmissible duration is l_tx, the similarity measure between the traces r_a and r_b can come out low if the duration of contact l ≪ l_tx. but this will generally be the condition that we face in reality. therefore, to make it a valid test method, we need to define the window of comparison t_w that we slide over the entire trace pair to find any contact (i.e., a similarity measure over a threshold).
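the sliding-window comparison motivated above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the paper's implementation: the two traces are assumed to be time-aligned and sampled at the same rate, pearson correlation is used as the similarity measure, and the names `pearson`, `detect_contacts`, `window` (for t_w, in samples) and `threshold` (for the decision cutoff) are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def detect_contacts(r_a, r_b, window, threshold, step=1):
    """Start indices of windows where the traces correlate at or above threshold."""
    n = min(len(r_a), len(r_b))
    hits = []
    for start in range(0, n - window + 1, step):
        rho = pearson(r_a[start:start + window], r_b[start:start + window])
        if rho >= threshold:
            hits.append(start)
    return hits
```

a short contact inside a long trace pair raises the correlation only in the windows that overlap it, which is why the decision is taken per window rather than over the whole transmissible duration.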
second, when new diagnostic tests are introduced, it is necessary to evaluate the diagnostic accuracy and feasibility of the new test in comparison to the existing tests or the gold standard [ ] . this diagnostic accuracy can be quantified by calculating various measures such as sensitivity and specificity, positive and negative likelihood ratios, diagnostic odds ratio, etc. in this paper, we address these issues by defining the desirable length of t_w and the decision threshold θ_c, and by evaluating the quality of the contact diagnosis under these parameter values. as to the nature of the technology we propose in this paper, one can argue that it is only supplementary. this is true to a certain degree, in that we believe the final confirmation of an infection event should always be made by human experts. the technology can be used to quickly identify possible contacts with relatively high accuracy, so that the human experts can focus on the most likely ones. at the same time, however, the technology covers areas that the traditional method could not. without technical support, tracing may be not only costly and time-consuming but impossible for many contact events. first, the authority may not be able to keep up with the speed of spreading when the epidemic is full-blown. second, in many urbanized societies of today, we do not even know or remember those who happen to sit next to us on the bus or train or in a restaurant. in large-scale epidemics, the technology can quickly identify even such contacts that cannot be recovered from the memory of the infected person. in this second sense, the technology will be indispensable. the rest of this paper is organized as follows. in section ii, we briefly summarize the related work that exploits the geomagnetic field strength to detect location and coexistence.
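for reference, the diagnostic accuracy measures named above can be computed from a 2x2 table of test decisions against ground truth. the python sketch below uses the standard textbook definitions; the function name `diagnostic_measures` and the example counts in the usage note are illustrative and are not results from the paper.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Standard accuracy measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)          # fraction of real contacts detected
    specificity = tn / (tn + fp)          # fraction of non-contacts rejected
    inf = float("inf")
    # Positive likelihood ratio: how much a positive result raises the odds.
    lr_pos = sensitivity / (1.0 - specificity) if fp > 0 else inf
    # Negative likelihood ratio: how much a negative result lowers the odds.
    lr_neg = (1.0 - sensitivity) / specificity if tn > 0 else inf
    # Diagnostic odds ratio, equal to lr_pos / lr_neg when both are finite.
    dor = (tp * tn) / (fp * fn) if fp > 0 and fn > 0 else inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }
```

for example, with hypothetical counts of 90 true positives, 20 false positives, 80 true negatives and 10 false negatives, the sensitivity is 0.9, the specificity is 0.8 and the diagnostic odds ratio is 36.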
in section iii, we first discuss how we measure the similarity of two traces that signals a possible contact. then we identify the parameters that determine the performance of the magnetometer-based contact test, and discuss how we will measure it. in section iv, we evaluate the performance of the test using a set of real-life smartphone magnetometer traces. finally, we conclude the paper in section v. before delving into the discussion, we list the acronyms used throughout the paper in table . there is rich literature on co-presence detection or its use on epidemiology and social studies. in terms of the employed technology, existing works range from sensors to communications to social media. we summarize them below, with brief remarks on their relevance to our problem or the relation to our approach. although the disease transmissibility check in contact tracing needs not necessarily absolute but relative coordinates (i.e., relative to the infected person), one may well consider using gps trajectories to determine the distance of contact. for instance, qi et al. use gps to track and visualize space-time activities for a flu transmission study [ ] . unfortunately, gps is a power-inefficient sensor. as we need to amplify the signal and achieve a high processing gain due to the small received power, a significant reduction in battery time is inevitable. for instance, it can drain a smartphone battery in much less than a typical charge interval even with minimal activity [ ] , [ ] . in attempts to mitigate the problem, we could activate gps only when user movement exceeds the accuracy bound, or turn it off indoors by detecting the condition through other means such as the received signal strength (rss) fingerprints of cell towers. even if the power issue is resolved, however, problems remain. 
first, the distance estimate between two gps sensors may include a large error because each can have an average error over meters, whereas a few meters matter in a disease transmissibility check. second, and more importantly, gps is incapable of checking for possible infection events indoors. many studies have used radio frequency identification (rfid) or sensor network technologies to understand infection and to prevent it in hospitals [ ] and in schools [ ] . isella et al. use active tags to track contacts that take place in a pediatric ward; by analyzing the structure of the contact data, the study identifies the central groups that need close attention to prevent nosocomial infection. salathe et al. use telosb motes carried by students in a school to obtain close proximity interaction data and develop a more effective vaccination strategy. the study finds the small world phenomenon, and suggests a vaccination strategy based on the structure that is more effective than random vaccination. such technologies are also used in social studies [ ] and for security based on proximity [ ] . shafagh and hithnawi [ ] use ambient radio signals to detect other nodes in close proximity, for authentication between iot devices before they connect. bolić et al. use enhanced rfid tags that mutually detect proximity. when attached to people, they can be applied for tracking interactions at social events [ ] . but the biggest drawback of these approaches is that today's smartphones hardly support rfid or personal area network (pan) technologies other than bluetooth. due to the lack of a deployment base among the general public, they do not serve our purpose of massive mutual contact monitoring between strangers. recently, there have been efforts to introduce social media such as twitter to epidemic monitoring, for early detection, management, and control of epidemic outbreaks [ ] - [ ] .
in particular, participatory surveillance using social networks to collect symptom reports to detect infectious disease outbreaks has been tried. however, most studies limit their scope to common and seasonally recurring health events such as influenza due to the noisy nature of twitter [ ] . moreover, this post-symptomatic reporting can take long time because some diseases go through long incubation period (e.g. weeks in middle east respiratory syndrome (mers) [ ] ). moreover, subjective symptom reports do not provide information specific enough for disease control authorities to construct contact traces and obtain contact contexts. also, it gives us only collective statistics at coarse granularities, while contact tracing requires information on person-to-person interactions. in the same vein, search-based global disease trend tracking services [ ] are not directly helpful to contact tracing in emergency response. for its prevalence, wi-fi is extremely popular for indoor localization. for example, there is a recent work that leverages on participatory sensing [ ] . again, as in gps, we could consider using wi-fi assisted location information to determine the distance of contact, although the disease transmissibility check in contact tracing needs not necessarily absolute but relative coordinates. however, there are not many works in co-locating two devices using the technology. existing works based on wi-fi are mostly centered around proximity detection and its applications. but mutual proximity detection is not in the design of wi-fi, so it requires significant manipulation such as exploiting portable hot spot (phs) mode [ ] . carreras et al. [ ] use wi-fi to mutually discover smartphones in proximity and determine the distance using received signal strength indication (rssi). in line-of-sight condition, they argue that . m resolution is achievable using the rssi of the discovered smartphone and machine learning algorithms. 
as to the closeness estimation, most previous works rely on rssi [ ] - [ ] . the applications include authentication [ ] , [ ] and epidemic prediction [ ] . in particular, nguyen et al. [ ] show that the co-presence in disease transmissible distance can be determined through rssi signatures from public wi-fi access points. a drawback of using wi-fi is that access points may be unavailable or prove insufficient to fix positions with a consistent precision. also, the technology is not stellar in energy efficiency, especially for long and continual monitoring. for proximity detection, bluetooth is a popular technology for its relatively high precision in short distances [ ] . it has been mostly used for studies on social behavior and interaction such as duration and proximity [ ] , and those in mass gathering situations [ ] . liu et al. [ ] show that bluetooth can be used to detect face-to-face interaction within . m by mapping bluetooth rssi to distance. in this work, smartphones attempt to detect other bluetooth smartphones every seconds. compared with wi-fi and cellular location, they show that bluetooth can provide an order of magnitude more precise proximity detection. montanari [ ] proposes to use bluetooth low energy (ble) to measure the duration and the proximity of social contact using ble-enabled wearables. bluetooth has also been used to measure, understand, and predict how individuals change their social behavior in response to infectious diseases [ ] ; yoneki [ ] collects device proximity data over bluetooth for this purpose. jamil et al. [ ] use ble tags and smartphones to track group dynamics in a massive religious gathering. they investigate the best configurations for the ble tags and the scan durations for smartphones. compared with the infection study, the group dynamics study requires detection at farther distances, more than m.
also, the tags unilaterally advertise, and the smartphones unilaterally scan. the system also does % duty cycling, with minutes of hibernation between scans of seconds. therefore, it does not fit the continuous monitoring need for infectious contacts that could happen at any time. a recent study also points out the inefficiency of the bluetooth (le) protocol in connection-based interactions when there are hundreds or even thousands of ble devices in the communication range of each other [ ] . harris et al. [ ] consider the dense ble deployment scenario where hundreds or even thousands of tags interact with a large number of scanning devices such as smartphones. they raise the message collision and consequent energy waste issues of the ble active scanning mode, and propose an optimization scheme to solve them. although bluetooth technology has many desirable properties, it relies on beacon exchange for devices to detect each other. the beacons can reveal the identity of the transmitting device, threatening the privacy of the user. communication traces obtained by mobile phones are known to be good proxies for the physical interaction network, and they may provide a valuable tool for contact tracing. for example, calls and messaging activities were used to construct human contact networks [ ] . mobile network data or call detail records (cdrs) have also been used to model population flows, major mobility hubs, and movement typologies, and how they change as the ebola outbreak unfolds [ ] . we could even use two phones attaching to an identical cell as a signal for a possible physical contact. however, the coverage of a single cell tower is at least a few hundred meters in radius, so it would be too coarse to identify infectious physical contact events within a few meters [ ] . one important setting for encounters with strangers is public transport such as a train or bus, which people can share for long enough to enable several modes of disease transmission.
for example, an infected person openly coughing in the bus can infect fellow riders in case of aerosol or droplet transmission diseases. when there are many potential contacts that a confirmed case cannot identify or recollect as in public transports, a potent tool we can exploit is the mobile devices such as smartphones. these devices can be leveraged to detect co-location, which can be a good proxy for the physical contacts. for instance, two smartphones located in adjacent cars in a train, both close to the doors dividing the cars, will probably exhibit high similarity in all their measures. but on a multi-car vehicle, a more relevant question in the context of epidemic infection is whether two passengers are on the same car or on different cars. so, in this letter, we explore how we can differentiate locations in the same train at car-level granularity. our study reveals that accelerometer readings during train stop and start events tend to be characteristic of different car positions, so they can be used to generate a strong co-location signature on the car level. thanks to the movements of the train, it does not require a complex communication infrastructure on the train for classification [ ] , but an accelerometer. common ambient sound detection using the microphone sensor [ ] - [ ] can be a technology of choice. but using the microphone sensor has its own issues. first, the number of samples at its typically high sampling frequencies (e.g. . khz) is too large for continuous and indefinitely long monitoring required for detecting contacts that can happen at any time. second, privacy can be violated because any conversations are also recorded. finally, there is the possibility of false detection. for example, two people watching the same tv channel or listening to other broadcast sounds in different places can be classified as coexistent. 
the smartphone magnetometer has been extensively used for indoor localization and tracking (but not much for coexistence detection). researchers found that the indoor magnetic field is rich in spatial features [ ] , and easy to sense [ ] . moreover, the field is stable over long periods of time [ ] - [ ] . the richness and the stability of the magnetic field enables mapping (a.k.a. fingerprinting) and magnetic map-based applications. the first application is indoor location. chung et al. [ ] showed that the geomagnetic anomaly can provide signatures for indoor locations that can be leveraged for sub-meter-level location accuracy. frassl et al. [ ] used magnetic maps with centimeterlevel accuracy to localize a human or robot. li et al. [ ] discussed possible issues that can affect the precision and the feasibility of the fingerprinting approach for indoor location. angermann et al. [ ] found that the use of all three field components provides good resolution of ambiguities in a small indoor area. carrillo et al. [ ] used the three components of the measured magnetic field by smartphone magnetometers instead of just the intensity to improve accuracy. the second application is navigation. brzozowski and kazmierczak [ ] discussed ways of recording, visualizing, and mapping local magnetic field changes in d that can be used as a support for indoor navigation systems for unmanned aerial vehicles (uavs). riehle et al. [ ] considered a leader-follower style navigation application for visually impaired people where there is time gap between traversals, without relying on expensive indoor magnetic fingerprinting. a follower could compare its own magnetometer trace and the leader's to determine if the follower reached a waypoint and if the follower went off-route. the third application also does not require fingerprinting, and it is of our interest in this paper -coexistence detection. nguyen et al. 
[ ] used only smartphone magnetometers to detect co-location of passengers in public transport. they exploited the fact that the passengers share the trajectory between at least two consecutive stations, and the magnetometer traces exhibit high similarity, which was measured by the distance in derivative dynamic time warping (ddtw). kuk et al. [ ] showed that even outdoors the magnetometer traces can be compared to detect contacts within a few meters, where current gps can have order-of-magnitude larger errors. they showed that two closely located smartphones generate highly correlated magnetometer traces, which can be exploited to detect coexistence. kuk et al. also showed that they could lower the sampling frequency to hz without significantly harming the detection performance, while increasing the battery life significantly. the smartphone magnetometer overcomes undesirable properties of other technological alternatives. it can detect contacts within the very short distances relevant to infectious disease transmission monitoring, and it can work indoors. it is supported by all smartphones, and works without any infrastructure support. it consumes relatively little power compared with other sensors, and raises few privacy concerns. in this paper, therefore, we focus on contact detection with smartphone magnetometers and explore their potential to provide a diagnostic tool for potentially infectious contacts made between smartphone holders. as to the privacy concern of some of the technologies above, it may not be an issue in the event of an epidemic. authorities may legally have purpose-based access to the phone data of the infected or so suspected person, or rather, users may voluntarily give consent to the authority to use their trajectory data. indeed, we assume such a model in subsequent discussions. finally, it is worthwhile to mention that any combination of the magnetometer-based method proposed in this paper with other technologies is possible.
for instance, the cost of comparing two traces for checking close contacts could be avoided if their gps coordinates or cellular attachments show totally different values. many valuable combinations could be conceived, but in this paper, we focus on the smartphone magnetometer-based method first so that it can be used in such combination in a more intelligent way. in this section, we discuss how we measure the similarity of two magnetometer traces. among many similarity measures we can use, we pick the pearson correlation coefficient. it is a good measure of linear correlation, which fits the linear correlation that two magnetometers in close proximity show in their ambient magnetic field strength readings. fig. shows real traces generated by two phones held by the people walking side-by-side through a corridor in a university campus building. here, we let the phones measure the ambient magnetic field strength in µt at the rate of hz. in (a), the horizontal axis is the sample number of measured magnetometer values, and the vertical axis is the strength of the magnetic field vector perpendicular to the ground. we observe that these two time series do exhibit similar fluctuations. the fluctuations are the results of the magnetic distortions to the geomagnetic field by ferromagnetic materials such as steel doors, pillars, and rebars among others in the building the smartphone users are passing by. the synchronized fluctuations of the two magnetometer readings have a linear correlation, as shown by (b). therefore, when each phone records such trace while the user moves around in daily life, we can let the users or the disease control authority later check for possible contacts with an infected person using the strength of the linear correlation. for example, as in fig. , a susceptible user can check if her smartphone has a trace segment that computes a high correlation with an infected person's trace that can be provided by the disease control authority. 
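as a minimal sketch of the similarity check described above, the pearson correlation coefficient of two time-aligned traces can be computed as follows. the function name `pearson`, the sample readings, and the convention of returning 0 for a flat trace are assumptions made for illustration; they are not taken from the measurement system in this paper.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length, time-aligned traces."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("traces must have the same length (at least 2 samples)")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Co-variation and per-trace variation around the means (unnormalized).
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumed convention: a flat trace has no fluctuation to correlate.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical field-strength readings (in µT) from two phones carried side by side.
phone_a = [41.0, 43.5, 47.0, 44.0, 39.5, 42.0]
phone_b = [40.2, 42.9, 46.1, 43.8, 39.0, 41.1]
rho = pearson(phone_a, phone_b)
print(f"rho = {rho:.3f}")
```

a coefficient close to 1 indicates that the two phones saw the same fluctuations at the same time; whether that counts as a contact depends on the cutoff threshold applied to it.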
in order to compute the similarity of two smartphone magnetometer traces, we need to use a similarity measure. there are numerous similarity measures, but some popular ones in the literature are cosine similarity, dynamic time warping (dtw) distance, euclidean distance, kullback-leibler distance, jaccard similarity, and pearson correlation, among others [ ] . in the areas of epidemiology and psychology, however, the measure of association is frequently analyzed by correlation analysis and regression analysis [ ] . in this paper, we use the correlation analysis. as for the correlation measures, there are pearson, kendall, and spearman correlation coefficients [ ] . among these, we pick the pearson correlation coefficient, as it is a good measure for a linear correlation. to start with, table lists the notations we use in the subsequent discussion on how we compute the pearson correlation coefficient of two magnetometer traces. when the traces of two strangers are compared, we do not know whether or when the contact was made. as we discussed in section i, we need to inspect the traces in a window of time t_w, as we slide the inspection window (blue box in fig. ) over the entire span of the traces that we are interested in (l_tx). given t_w and the magnetometer sampling rate f_s, the pearson correlation coefficient for the samples in the window n_w = t_w · f_s starting from the k-th sample is defined to be:

$$\rho_k(a, b) = \frac{1}{n_w} \sum_{i=0}^{n_w - 1} \frac{(a_{k+i} - \mu_a)(b_{k+i} - \mu_b)}{\sigma_a \sigma_b}$$

where a_{k+i} and b_{k+i} are the (k + i)-th individual magnetometer readings from phones a and b, respectively, and µ_a, µ_b and σ_a, σ_b are the means and the standard deviations of the measured values in the two phones' compared traces in the inspection window. recollect that the existing works [ ] , [ ] compute the similarity over the entire span of samples l = l_tx · f_s, essentially making n_w = l. unfortunately, at an arbitrary length l, we cannot control the false detection possibilities at all, whether positive or negative.
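the windowed coefficient described above can be evaluated for every window start k in one pass. the sketch below uses running sums so that each step costs constant time; this is one common technique and is not claimed to be the implementation used in this paper. the name `windowed_pearson` and the convention of reporting 0 for a flat window are assumptions.

```python
import math


def windowed_pearson(a, b, n_w):
    """Return [rho_0, rho_1, ...], where rho_k correlates a[k:k+n_w] with b[k:k+n_w]."""
    if len(a) != len(b):
        raise ValueError("traces must have the same length")
    if not 2 <= n_w <= len(a):
        raise ValueError("window size must be between 2 and the trace length")

    # Sums over the first window (k = 0).
    s_a = sum(a[:n_w])
    s_b = sum(b[:n_w])
    s_aa = sum(x * x for x in a[:n_w])
    s_bb = sum(y * y for y in b[:n_w])
    s_ab = sum(x * y for x, y in zip(a[:n_w], b[:n_w]))

    rhos = []
    for k in range(len(a) - n_w + 1):
        if k > 0:
            # Drop the sample that left the window, add the one that entered.
            x_out, y_out = a[k - 1], b[k - 1]
            x_in, y_in = a[k + n_w - 1], b[k + n_w - 1]
            s_a += x_in - x_out
            s_b += y_in - y_out
            s_aa += x_in * x_in - x_out * x_out
            s_bb += y_in * y_in - y_out * y_out
            s_ab += x_in * y_in - x_out * y_out
        cov = n_w * s_ab - s_a * s_b
        var_a = n_w * s_aa - s_a * s_a
        var_b = n_w * s_bb - s_b * s_b
        if var_a <= 0 or var_b <= 0:
            # Assumed convention: flat window, correlation undefined, reported as 0.
            rhos.append(0.0)
        else:
            rho = cov / math.sqrt(var_a * var_b)
            rhos.append(max(-1.0, min(1.0, rho)))  # guard against rounding overshoot
    return rhos


# Toy traces: the phones agree at first, then diverge.
rhos = windowed_pearson([1, 2, 3, 2, 1], [1, 2, 3, 4, 5], 3)
print(rhos)
```

running sums trade some numerical robustness for speed; for very long traces with a large constant offset, subtracting a rough baseline from both traces first keeps the sums well conditioned.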
therefore, we will use a window n_w ≪ l to slide over the compared traces to find any interval for which ρ_k(a, b) > θ_c, ≤ k < l − n_w + , where θ_c is the cutoff threshold for the contact decision. when l_tx significantly increases, there are two aspects we need to consider: memory to store a trace (at smartphones) and correlation computation (at the contact tracing check server). first, in terms of memory, the smartphones should keep the samples collected during the long transmissible duration. in our implementation, each magnetometer measurement sample is a vector, whose size is bytes. at hz sampling, we generate data at bytes/second. for an hour of continuous recording, it is approximately . mb. for one week, it is approximately mb. modern smartphones typically have a few tens of gigabytes of memory, so it will not be an excessive burden, especially in an emergency situation (i.e., an infectious epidemic). in terms of computation, the trace comparison is performed not on the smartphones but on a server to which the traces are submitted by users who want to check if they were in close distance to the infected person. the correlation computation time grows in proportion to the length of the compared traces. but the computation itself is not extremely heavy. we tested the correlation computation with the sliding inspection window of seconds over two continuous traces of , samples collected at hz (or l_tx = , seconds or minutes). it takes approximately . seconds on a server that has an i - k processor with the clock speed of . ghz, using only a single core. for week-long traces, it will be slightly over minutes. note that the type of contact we aim to detect in this paper is coexistence [ ] that will enable the 'same-place-same-time' (spst) disease transmission. this contact type is more common in infectious disease transmissions than the 'same-place-different-time' (spdt) type [ ] .
since the smartphone users are assumed to stay/move together in this type of contact, we do not need to align the traces for the time gap and the moving speed differences by using such schemes as dynamic time warping (dtw) [ ] . finally, we focus on the contacts made in the indoor contexts, because urban life is % indoors [ ] , and indoors is where most infection events take place. as the length of the traces l over which the search is performed should be defined by the given disease of concern, e.g. by its incubation period [ ] or the duration of active transmission, we do not consider this parameter further in this paper. as for the window size n_w, it should be long enough to find the contacts of the critical duration that can enable the transmission. however, it is hard to definitively characterize the duration as it will be disease-specific. so, in this paper, we focus on the technical side. namely, we investigate the minimum window size that we can effectively use for the comparison, which will be equivalent to defining the granularity of inspection that smartphone magnetometers can offer. contacts longer than the window size will manifest as a series of consecutive or densely grouped positive decisions, as we slide the window over the entire trace. finally, we will show that the decision cutoff threshold θ_c is related to the window size n_w for a given target detection accuracy. if the magnetic field strength had a stationary distribution, we could easily draw on earlier works on the sample size planning for clinical research [ ] . specifically, the required sample size n_w over which the correlation is computed can be estimated as a function of the targeted cutoff θ_c. in particular, n_w decreases as θ_c or the confidence interval increases. unfortunately, the distribution of the magnetic field strength measured by a moving smartphone is not stationary [ ] .
without the stationarity of the magnetometer values in our environment, we cannot analytically derive the window size but turn to the measurement-based approach to estimate n_w to meet the given θ_c. in order to see whether we can use the similarity check of the smartphone magnetometer traces as a diagnostic test, we evaluate its discriminative and predictive power. in particular, we need to evaluate it under different choices of n_w and θ_c. in clinical studies, numerous metrics are used to evaluate the quality of a diagnostic test. some of them are: sensitivity, specificity, accuracy, positive and negative likelihood ratio, positive and negative predictive value, odds ratio, relative risk, risk difference, number needed to treat, etc. among these, we will use the ones that are not affected by the prevalence, which can only be artificial in our setting. given the ground truth (contact vs. no contact) and the decision using the smartphone magnetometer traces, there can be four cases among which true positive (tp) and true negative (tn) are desirable, and false positive (fp) and false negative (fn) should be minimized ( table ) . as in any other accuracy assessment of diagnostic tests, we use the × table. as to how the false detections (fp and fn) arise in our setting, we can consider two possibilities. suppose the length of the contact duration represented in the traces is t_c, and the number of samples generated during the duration is l_c = t_c · f_s. then, let us consider fig. , where two people move indoors with the smartphone magnetometers measuring the ambient field strength at hz. the two people come from different places (a vs. b ), meet in the middle, and move together in the region labeled ''a + b '' for t_c = seconds, and then part and return to their initial locations (a vs. b ). fig. shows the correlation coefficients obtained as we slide the inspection window over the entire trace under two different n_w values.
the x-axis is the sample number k in ( ) at which the coefficient is computed, and the y-axis is ρ_k. the shaded region represents the duration of contact. it is approximately from samples , through , in both graphs. there are two subcases. first, if n_w is very small, it can cause many spurious contact detections since coincidental high correlations may not be sufficiently averaged out. for example, with n_w = and , , fig. (a) and (b) show their pearson correlation coefficients, respectively. the circles in fig. (a) show that two spurious detection events are possible for n_w = and θ_c = . . second, even if n_w is large there are still chances for false detections, but only negative ones. this is because increasing n_w decreases ρ_k(a, b) as a consequence of the non-stationarity of the magnetic field strength distribution [ ] , when the human smartphone holder moves through space. using our coexistent trace pairs, we indeed confirm that the larger window sizes significantly reduce the correlation coefficient (fig. ) . in this case, a possible consequence is that the adjacent measurement samples outside the coexistence duration that happen to be included in the window decrease the cross correlation, possibly leading to a false negative decision depending on θ_c. observe that for high cutoff thresholds such as θ_c > . , fig. (b) will falsely determine that there was no contact, whereas the former will correctly detect the contact. either way, these problems can lead to false decisions about the contact, so it is clear that we need to determine the appropriate window size n_w as well as the cutoff threshold θ_c.

figure . pearson correlation coefficient ρ_k with % confidence interval, for a large number of coexistent trace pairs.

for our measurement-based study, we use indoor magnetometer traces collected in the korea university campus in seoul, korea. below, we first discuss how we collect the traces.
then we evaluate the smartphone magnetometerbased contact detection using the measures mentioned in section iii-c. to collect the magnetometer traces, we developed and installed a magnetometer sensing app for android smartphones, samsung galaxy s , s , s and lg g and g . we confirmed that our app works correctly on all these platforms. among the phones, we used two galaxy s 's to collect the traces used in this section. we synchronized their sensing activity through the network time protocol (ntp) [ ] for later comparison of their magnetometer traces. we collect the magnetometer traces in five different buildings in the campus. we picked three places in each building. at each place, we repeated the trace collection six times along the same walking path. so, in total, there are traces, and each trace is seconds long. there are c( , ) = pairs of traces per place to be judged co-existent. since there are different places from which the traces were collected, we have · c( , ) = co-existent trace pairs in total. on the other hand, there are c( , ) · c( , ) · c( , ) = · · · = , non-coexistent pairs. we measured the magnetic field strengths at the default sampling frequency of hz, a popular magnetometer sensing rate in the literature [ ] . the magnetometer readings are obtained in three phone-specific axes: x, y, and z. in order to simulate typical indoor walking dynamics, we let the smartphone holders walk approximately at the 'preferred' walking speed [ ] . it is known that people prefer to walk at approximately . m/s (or . km/h) irrespective of cultures, as they find slower or faster speed uncomfortable. each trace was produced in narrow corridors, and we saw to it that the traces do not deviate from each other more than an 'arm's length' to simulate the typical personal gap [ ] . as the smartphones can have arbitrary attitudes when and while the contact is made, the measured magnetic strengths in their x, y, and z axes will generally be misaligned. 
for comparison, therefore, they should be translated to a common coordinate system. for this, we use the android getRotationMatrix() method to translate the phone-specific coordinates to the absolute coordinates (i.e., north, east, etc). a desirable property of the geomagnetism is that it has absolute reference directions such as the east and the north. smartphones will change attitudes freely, but the translation method lets us readily compare the traces from different phones regardless of their attitudes. as to the robustness of the method against the accumulation of errors over a long duration of continuous operation, it is a research issue of its own [ ] . in this paper, we assume that such calibration is being done to maintain the precision of the magnetometers. fig. illustrates the alignment operation in our measurement system. under the misalignment (a), it is not straightforward to choose the axis for the comparison (b). the traces in (b) show that the x-axis of phone is aligned with the y-axis of phone , which is the ground truth as shown in (a). but after the translation, the readings from the two coexistent but misaligned phones are cleanly separated along the three absolute axes (c). we notice that the z-axis traces from (b) are identical to the up-axis traces in (c), because the phones were held parallel to the ground (a) in the generation of the traces in (b). finally, the east is simply the cross product of the two vectors north and up, so it is redundant. thus in our implementation, we choose whichever axis between north and up shows the highest correlation in the decision. here, we compute the evaluation measures for the combinations of the window size and the decision threshold. in particular, we will compute them for the first n_w samples from each trace pair, i.e., k = in ( ). but first, there is a caveat. in total, there are , and non-coexistent and co-existent trace pairs in our data set, totaling at , .
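the pair counts can be re-derived from the collection design stated earlier (five buildings, three places per building, six repetitions per place). the sketch below assumes that every pair of repetitions at the same place counts as coexistent and every pair of traces from different places counts as non-coexistent; the variable names are hypothetical.

```python
from math import comb

buildings = 5
places_per_building = 3
runs_per_place = 6

places = buildings * places_per_building   # distinct walking paths
traces = places * runs_per_place           # all recorded traces

# Coexistent: any two repetitions recorded along the same path.
coexistent = places * comb(runs_per_place, 2)

# Non-coexistent: pick two different places, then one repetition from each.
non_coexistent = comb(places, 2) * runs_per_place * runs_per_place

total = comb(traces, 2)
# Every pair of traces falls into exactly one of the two groups.
assert coexistent + non_coexistent == total

prevalence = coexistent / total
print(coexistent, non_coexistent, total, f"{100 * prevalence:.1f}%")
```

the check that the two groups add up to all possible pairs is a quick way to confirm that no pair is counted twice or left out under these assumptions.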
the prevalence in our data set is thus / , = . %. however, this is artificial, since we could have made it higher or lower by producing more coexistent or non-coexistent traces, respectively. naturally, it is meaningless to calculate the measures affected by the prevalence where the prevalence of disease is artificially controlled [ ] . sensitivity and specificity are not generally related to the prevalence of the disease in the population considered, since these are properties of the diagnostic tool. unlike sensitivity and specificity, measures such as predictive values, accuracy, relative risk and risk difference are affected by the prevalence. therefore, we exclude them, and use the measures that are not affected by the prevalence to evaluate the magnetometer-based contact test. sensitivity is expressed as the proportion of real contacts that are correctly classified as positive, tp/(tp + fn). in other words, it is the ability of the magnetometer-based test to correctly identify the trace pairs with a real contact. a highly sensitive test is useful when we do not want to miss a contact (with an infected person) in screening the population. the specificity is the ability to identify the absence of contact, expressed as tn/(tn + fp). a specific test will rarely misclassify the trace pairs without a contact as having made a contact. the sensitivity and specificity show the discriminative powers of a diagnostic test. fig. shows the sensitivity and the specificity of our smartphone magnetometer-based test, as functions of n_w and θ_c. we first find that larger n_w does not necessarily mean higher sensitivity. although happening at different values of n_w ( | θ_c = . ∼ | θ_c = . ), the sensitivity begins to decrease beyond a certain n_w at each cutoff threshold. it implies that the correlation decreases when computed for an excessively long trace segment used as the inspection window.
this is due to the non-stationary property of the magnetometer measurement value distribution [ ] . the specificity, on the other hand, steadily increases as we use larger n_w. the lesson here is that when we use the magnetometer-based diagnostic test, we need to examine the similarity of the two traces using the time window of n_w = ∼ to achieve the highest sensitivity. then, the choice of the exact cutoff threshold will depend on the target specificity. also, we find in fig. that the sensitivity is higher with lower cutoff values, whereas the specificity is higher with higher cutoff values. this tension is natural, and can be summarized in the receiver operating characteristic (roc) curve. fig. shows the roc curves for different parameter combinations. although we cannot show the area under curve (auc) itself due to the absence of very low specificity data points, it is clear that the aucs for various n_w are very high. namely, the magnetometer-based test is of high diagnostic quality. among the inspection sample window sizes, very small n_w ( , ) and very large n_w ( , ) lead to poorer auc than those in the middle (n_w = ∼ ) as shown in fig. (b) . n_w = achieves the best overall auc. in order to obtain the cutoff value θ_c that achieves the highest auc for a given n_w, we can compute the shortest distance

likelihood ratio (lr) is the most widely applied measure of diagnostic accuracy. also, it can serve as a predictive measure. in our context, lr tells us how many times more likely a decision is in the trace pairs with the contact than in those without contact. when both probabilities are equal (i.e., lr = ), such a test is of no value. the lr for positive test results (lr+) is defined as [tp/(tp + fn)] / [fp/(tn + fp)]. the higher the lr+, the more indicative the test is of the contact. good diagnostic tests have lr+ > and their positive result has a significant contribution to the diagnosis [ ] .
on the other hand, the lr for negative test results (lr−) is defined as [fn/(tp + fn)] / [tn/(tn + fp)], and it represents the ratio of the probability that a negative result will occur in trace pairs with the contact to the probability that the same result will occur in trace pairs without the contact. good diagnostic tests have lr− < . [ ] . the lower the lr−, the more significant the contribution of the test is in ruling out a contact. lrs do not depend on the prevalence of disease in the population, as only sensitivity and specificity values are used to calculate them. as a result, the lrs of one study could be used in another setting on the condition that the definition of contact is not changed. the likelihood ratios of the smartphone magnetometer-based contact test are shown in fig. . in (a), we observe that it is highly useful for positive identification of contacts. the criterion lr+ > tells us that the positive likelihoods can be a significant contribution to the diagnosis. we also note that we do not need large n_w to have lr+ > , especially when we use higher cutoff thresholds of θ_c ≥ . . less than measurement samples at hz, or equivalently seconds, is enough to qualify as a good test for positive identification of contacts. on the other hand, fig. (b) shows that the higher cutoff thresholds cannot achieve lr− < . regardless of n_w. it implies that using the higher cutoffs can produce a high fraction of false negatives. however, this issue may be mitigated if we require that a contact duration be composed of a series of positive decisions as we slide the inspection window. for example, in fig. (a) , hundreds of adjacent positive decisions will occur as we slide up k in ( ). interspersed false negatives will then affect the final decision less. diagnostic odds ratio (dor) is a relative measure for diagnostic accuracy, used for the estimation of discriminative power of diagnostic procedures [ ] .
the dor of a test is the ratio of the odds of positivity in traces with the contact relative to the odds in traces without contact. it is calculated according to the formula $\mathrm{dor} = \frac{\mathrm{tp}/\mathrm{fn}}{\mathrm{fp}/\mathrm{tn}}$. dor depends significantly on the sensitivity and specificity of a test. a test with high specificity and sensitivity, and thus a low rate of false positives and false negatives, has a high dor. for the same sensitivity, dor increases as the test specificity increases. for example, a test with sensitivity > % and specificity of % has a dor greater than . the diagnostic odds ratio ranges from zero to infinity, although for useful tests it is greater than one, and higher diagnostic odds ratios are indicative of better test performance. fig. shows that the dor of the smartphone magnetometer-based contact test is much larger than one for most n_w values. so, this measure also confirms that the magnetometer-based test is useful. if we use dor = as the example criterion, the figure tells us that higher cutoff thresholds qualify with fewer measurement samples n_w to look at (θ_c = . has fn = at n_w = , so it should qualify although we cannot plot it). for these higher cutoffs, fewer than samples (or equivalently seconds) are enough to achieve the high dor.
above, we evaluated the quality of the smartphone magnetometer trace comparison as a clinical test for potential (infectious) contact. all the metrics that we used for the evaluation, namely sensitivity, specificity, likelihood ratio, and diagnostic odds ratio, point to the fact that the number of magnetometer readings to be compared between two traces (n_w) can be small. these metrics produce slightly different optimal numbers for the required readings, but if we need one good number to apply in real-life cases, it is samples (or equivalently seconds at the hz sampling frequency). it leads to the best or close-to-the-best performance in all the evaluation measures.
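the accuracy measures used above can all be derived from the four cells of a 2 × 2 table of decisions against ground truth. the following is a minimal sketch in python, not the authors' evaluation code; the names `ConfusionCounts`, `_ratio` and `diagnostic_metrics` are hypothetical. it follows the definitions given in the text and returns infinity when a denominator is zero, which is the case the text mentions where fn is zero and the dor cannot be plotted.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of trace-pair decisions against ground truth (hypothetical type)."""
    tp: int  # contact present, test positive
    fp: int  # no contact, test positive
    tn: int  # no contact, test negative
    fn: int  # contact present, test negative


def _ratio(num: float, den: float) -> float:
    """Divide, returning inf (or nan for 0/0) instead of raising."""
    if den == 0:
        return math.inf if num > 0 else math.nan
    return num / den


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Sensitivity, specificity, LR+, LR- and DOR as defined in the text."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)
    false_neg_rate = _ratio(c.fn, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    false_pos_rate = _ratio(c.fp, c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # LR+ = (tp / (tp + fn)) / (fp / (tn + fp))
        "lr_pos": _ratio(sensitivity, false_pos_rate),
        # LR- = (fn / (tp + fn)) / (tn / (tn + fp))
        "lr_neg": _ratio(false_neg_rate, specificity),
        # DOR = (tp / fn) / (fp / tn), written without intermediate divisions
        "dor": _ratio(c.tp * c.tn, c.fp * c.fn),
    }
```

because the likelihood ratios and the dor use only within-row proportions of the table, the same function can be applied to any choice of window size and cutoff without reference to how common contacts are in the data.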
our recommendation is that when two magnetometer traces from two smartphones are compared, one needs to use a window of samples for the pearson correlation computation to achieve the most precise decision as to whether contact was really made between the smartphone holders. one further recommendation is that the correlation coefficient value used as the decision threshold can be high. specifically, θ_c = . is a good match for the -sample inspection window. note that these two numbers, n_w and θ_c, that produce the most precise decision are closely related, and other combinations than ( , . ) can be inferred from the results in the previous section.
when a large-scale epidemic crisis unfolds in today's highly urbanized society, the traditional contact tracing method of medical personnel interviewing the infected persons will become highly costly, slow, and ineffective. in this paper, we discuss how smartphones carried by most people can be harnessed to automate contact tracing in such a situation. we exploit the fact that smartphone magnetometers show high linear correlation when two phones coexist within a short, disease-contractible distance, such as less than two meters. then, we use a battery of metrics to evaluate the value of such smartphone magnetometer trace comparison as a clinical test that medical personnel can use in real life with a high trust level. our evaluation reveals that the magnetometer-based method qualifies as a valid clinical test, if used with certain parameter values in the correlation computation. specifically, our findings and recommendations are as follows. first, the sliding window of the trace section to be compared is best sized to correspond to seconds of samples. second, the decision threshold that matches the comparison window size is . , for the most precise contact decision.
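the windowed comparison described above can be sketched as follows. this is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the two traces are assumed to be time-aligned scalar magnetometer readings sampled at the same rate, and the final rule (a minimum fraction of positive windows, `min_fraction`) is one possible way to let interspersed false negatives matter less. the window size `n_w` and threshold `theta_c` are left as parameters.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length windows."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("windows must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant window: correlation undefined, treated as no evidence
    return sxy / math.sqrt(sxx * syy)


def window_decisions(trace_a: Sequence[float], trace_b: Sequence[float],
                     n_w: int, theta_c: float) -> List[bool]:
    """One positive/negative decision per window start k as the window slides."""
    n = min(len(trace_a), len(trace_b))
    return [
        pearson(trace_a[k:k + n_w], trace_b[k:k + n_w]) >= theta_c
        for k in range(n - n_w + 1)
    ]


def is_contact(trace_a: Sequence[float], trace_b: Sequence[float],
               n_w: int, theta_c: float, min_fraction: float) -> bool:
    """Assumed aggregation rule: enough of the sliding windows must be positive."""
    decisions = window_decisions(trace_a, trace_b, n_w, theta_c)
    if not decisions:
        return False  # traces shorter than one window
    return sum(decisions) / len(decisions) >= min_fraction
```

pearson correlation is invariant to offset and positive scaling, so two phones with different magnetometer biases or gains can still produce a positive decision as long as their readings move together.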
these two parameters are inversely related with respect to the precision of the contact detection, and other combinations around the recommended values are also possible. in future, we will further test the reliability of the proposed method with the recommended and other parameter settings in more extensive real-life environments, for instance with different smartphone movement speeds, with obstacles, people or objects between or around the smartphone holders, and with interferences such as power lines close to the smartphones. the artificial traces that we generated in a controlled environment could have biased our experiment results and our conclusion. therefore, we will need to optimize the proposed method further against the real-life traces in a building or in public places to make it more reliable and actually usable in the real-life epidemic situations.
waiting for the big one: a new flu pandemic is a matter of time
what we've learned about fighting ebola
bill gates says we must prepare for future pandemics as for 'war'. time magazine
ebola: decline encouraging, but critical gaps remain
networks and epidemic models
ebola: mobility data
fluphone study: virtual disease spread using haggle
on the feasibility of using two mobile phones and wlan signal to detect co-location of two users for epidemic prediction, in progress in location-based services
face-to-face proximity estimation using bluetooth on smartphones
epidemic contact tracing via communication traces
detecting outdoor coexistence as a proxy of infectious contact through magnetometer traces
sensingkit: evaluating the sensor power consumption in ios devices
co-location epidemic tracking on london public transports using low power mobile magnetometer
magnetic maps of indoor environments for precise localization of legged and non-legged locomotion
indoor location sensing using geo-magnetism
indoor localization using magnetic fields
empirical determination of efficient sensing frequencies for magnetometer-based continuous human contact monitoring
limiting the spread of pandemic, zoonotic, and seasonal epidemic influenza
the impact of information transmission on epidemic outbreaks
outbreak communication: best practices for communicating with the public during an outbreak
measures derived from a 2 × 2 table for an accuracy of a diagnostic test
tracking and visualization of space-time activities for a micro-scale flu transmission study
energy-efficient rate-adaptive gps-based positioning for smartphones
accurate, low-energy trajectory mapping for mobile devices
close encounters in a pediatric ward: measuring face-to-face proximity and mixing patterns with wearable sensors
a high-resolution human contact network for infectious disease transmission
proximity detection with rfid in the internet of things
poster: come closer: proximity-based authentication for the internet of things
the use of twitter to track levels of disease activity and public concern in the u.s. during the influenza a h1n1 pandemic
detecting disease outbreaks in mass gatherings using internet data
garbage in, garbage out: data collection, quality assessment and reporting standards for social media data use in health research, infodemiology and digital disease detection
national and local influenza surveillance through twitter: an analysis of the - influenza epidemic
why is it difficult to detect sudden and unexpected epidemic outbreaks in twitter?
comparison of incubation period distribution of human infections with mers-cov in south korea and saudi arabia
detecting influenza epidemics using search engine query data
slide: towards fast and accurate mobile fingerprinting for wi-fi indoor positioning systems
comm sense: detecting proximity through smartphones
amigo: proximity-based authentication of mobile devices
ensemble: cooperative proximity-based authentication
room-level proximity detection based on rss of dual-band wi-fi signals
multimodal indoor social interaction sensing and realtime feedback for behavioural intervention
hybrid participatory sensing for analyzing group dynamics in the largest annual religious gathering
bluetooth low energy in dense iot environments
epidemic contact tracing via communication traces
commentary: containing the ebola outbreak-the potential and challenge of mobile network data
ambient sound-based proximity detection with smartphone
sound-proof: usable two-factor authentication based on ambient sound
a wearable, ambient sound-based approach for infrastructureless fuzzy proximity estimation
characterization of the indoor magnetic field for applications in localization and mapping
magnetic maps for indoor navigation
how feasible is the use of magnetic field alone for indoor positioning?
magicfinger: d magnetic fingerprints for indoor location
magnetic field mapping as a support for uav indoor navigation system
indoor waypoint navigation via magnetic anomalies
clustering of time series data-a survey
applied multiple regression/correlation analysis for the behavioral sciences
sample size requirements for estimating pearson, kendall and spearman correlations
airborne disease propagation on large scale social contact networks
using dynamic time warping to find patterns in time series
the national human activity pattern survey (nhaps): a resource for assessing exposure to environmental pollutants
quantifying the risk and cost of active monitoring for infectious diseases
internet time synchronization: the network time protocol
the pace of life in countries
dynamic stride length adaptation according to utility and personal space
magnetometer calibration for portable navigation devices in vehicles using a fast and autonomous technique
measures of diagnostic accuracy: basic definitions
her research interests include digital disease detection and mobile computing.
seungho kuk received the b.e. degree in computer science and engineering from korea university.
from to , he was a research scientist with bell communications research.
he is currently a professor with korea university.
key: cord- -st ebdah authors: raskar, ramesh; schunemann, isabel; barbar, rachel; vilcans, kristen; gray, jim; vepakomma, praneeth; kapa, suraj; nuzzo, andrea; gupta, rajiv; berke, alex; greenwood, dazza; keegan, christian; kanaparti, shriank; beaudry, robson; stansbury, david; arcila, beatriz botero; kanaparti, rishank; pamplona, vitor; benedetti, francesco m; clough, alina; das, riddhiman; jain, kaushal; louisy, khahlil; nadeau, greg; penrod, steve; rajaee, yasaman; singh, abhishek; storm, greg; werner, john title: apps gone rogue: maintaining personal privacy in an epidemic date: - - journal: nan doi: nan sha: doc_id: cord_uid: st ebdah
containment, the key strategy in quickly halting an epidemic, requires rapid identification and quarantine of the infected individuals, determination of whom they have had close contact with in the previous days and weeks, and decontamination of locations the infected individual has visited. achieving containment demands accurate and timely collection of the infected individual's location and contact history. traditionally, this process is labor intensive, susceptible to memory errors, and fraught with privacy concerns. with the recent almost ubiquitous availability of smart-phones, many people carry a tool which can be utilized to quickly identify an infected individual's contacts during an epidemic, such as the current novel coronavirus (covid- ) crisis. unfortunately, the very same first-generation contact-tracing tools can also be, and have been, used to expand mass surveillance, limit individual freedoms and expose the most private details about individuals. we seek to outline the different technological approaches to mobile-phone based contact-tracing to date and elaborate on the opportunities and the risks that these technologies pose to individuals and societies. we describe advanced security enhancing approaches that can mitigate these risks and describe trade-offs one must make when developing and deploying any mass contact-tracing technology.
finally, we express our belief that citizen-centric, privacy-first solutions that are open source, secure, and decentralized (such as mit private kit: safe paths) represent the next generation of tools for disease containment in an epidemic or a pandemic. with this paper, our aim is to continue to grow the conversation regarding contact-tracing for epidemic and pandemic containment and discuss opportunities to advance this space. we invite feedback and discussion.
infectious diseases spread in an exponential fashion. containment is an effective means to slow the spread, allowing health care systems the capacity to treat those infected. however, 'lockdown'-like containment can also disrupt the productivity of the population, distort the markets (limiting transportation and exchange of goods), and introduce fear and social isolation for those that are not yet infected or that have recovered from an infection. finally, and most importantly, contact tracing can be quickly deployed at the first warnings of an outbreak, but continues to be effective when disease resurgence concerns exist. thus, following an initial epidemic peak, contact-tracing can be an effective means to enable disease decline and avoid multiple peak periods and disease resurgence.
lessons from china have suggested the utility of understanding gps localization of intersections between known infected individuals and others in stemming infection progression. this is specifically related to r0 (r naught), which describes how contagious an infectious disease is. r0 is the average number of people who will catch a disease from one contagious person. ideally, a lower number will optimize reduction of disease spread, which will buy time to develop a vaccine or for the disease to die out.
three factors that define r0 are the infectious period (which is generally fixed for a given disease), the contact rate (i.e., how many people come in contact with a contagious person), and the mode of transmission (which is similarly fixed for a given disease). thus, for a given disease, the most adjustable factor is the contact rate. one key issue with contact rate is how to optimally allow individuals and societies to limit it. contact amongst uninfected individuals will not facilitate disease spread. thus, ideally a society and/or an individual is principally concerned with understanding the contacts an infected individual has had. understanding if paths have been crossed between an infected individual and any number of other individuals will allow for identifying those who have been exposed (and who maybe should be tested, resulting in appropriate resource allocation, or who may isolate themselves in the absence of available testing). thus, at a societal level, this may limit the economic and public impact. with an application that allows users to understand potential exposure to an infected individual, and appropriate action by the exposed individuals, it may be possible to reduce the contact rate by more rapidly identifying cases/exposures, which will remove them from the contact chain. for example, if we assume uptake of an application amongst x% of a population, and assume that this portion of the population responds to known exposure by self-quarantining or pursuing testing to confirm lack of infection, r0 will decrease in turn by a multiple of that percentage based on the degree of mixing in the population. the reason for the multiple decrease is that r0 partially depends on the population size and density and on the exact number of people an individual may come in contact with after exposure, which varies amongst individuals.
furthermore, with an increasing number "x" in terms of user base, there will be an exponential decrease in r0 (e.g., for % use and appropriate action, r0 would be expected to fall < due to maximal reduction of contact rate). thus, for example, a % uptake will have downstream impacts on individuals that person may have come in contact with, through more rapid exposure/contact identification. this may eventually disrupt the contact rate, which may reduce r0 significantly more than is accounted for by the %. this ultimate effect on r0 with a % use and appropriate response to data will hopefully disrupt ongoing chains of transmission, thus affecting the mortality rate and eventually impacting the contact rate and infection curve. however, high enough utilization could reduce the contact rate to such a degree as to make the overall r0 < 1, which would ideally lead to dying off of the infection entirely.
almost half of the world's population carries a device capable of gps tracking. with this capability, location trails (timestamped logs of an individual's location) can be created. by comparing a user's location trails with those from diagnosed carriers of infectious disease, one can identify users who have been in close proximity to the diagnosed carrier and enable contact-tracing. as the covid- outbreak spreads, governments and private actors have developed and deployed various technologies to inform citizens of possible exposure to a pathogen. in the following, we give a brief overview of these technologies.
we take this opportunity to define several critical terms used throughout this paper.
• users are individuals who have not been diagnosed with an infectious disease who seek to use a contact-tracing tool to better understand their exposure history and risk for disease.
• diagnosed carriers, then, refers to individuals who have had a confirmatory diagnostic test and are known to have an infectious disease. of note, in the setting of an epidemic in which some infected individuals have mild or no symptoms, a subset of users will in fact be unidentified carriers. an inherent limitation in all containment strategies is the society's ability to identify and confirm disease.
• location trails refer to the time-stamped list of gps locations of a device, and presumably therefore, the owner of the device.
• finally, we broadly speak of the government as the entity which makes location data public and informs those individuals who were likely in close contact with a diagnosed carrier, acknowledging that this responsibility is carried out by a different central actor in every continent, country or local region.
• local businesses refer to any private establishment such as shops, restaurants or fitness clubs as well as community institutions like libraries and museums.
broadcasting refers to any method, supported by technology, by which governments publicly share locations that diagnosed carriers have visited within the time frame of contagion. governments broadcast these locations through several methods. for example, singapore updates a map with detailed information about each covid- case. south korea sends text messages containing personal information about diagnosed carriers to inform citizens. in the us, nebraska and iowa published information on where diagnosed carriers have been through media outlets and government websites. broadcasting methods can be an easy and fast way for a government to make this information public without the need for any data from other citizens. it requires citizens to access the information provided and evaluate whether they may have come in contact with a diagnosed carrier of a pathogen themselves.
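the evaluation that citizens are asked to make, namely whether their own location trail crossed that of a diagnosed carrier, can be sketched as a local comparison of two time-stamped trails. the sketch below is illustrative only: `TrailPoint`, `haversine_m` and `find_exposures` are hypothetical names, and the distance and time thresholds are left as required parameters because the text does not specify them.

```python
import math
from typing import List, NamedTuple, Tuple


class TrailPoint(NamedTuple):
    t: float    # timestamp in seconds (e.g. unix time)
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters on a spherical earth."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def find_exposures(user_trail: List[TrailPoint], carrier_trail: List[TrailPoint],
                   max_dist_m: float, max_dt_s: float) -> List[Tuple[TrailPoint, TrailPoint]]:
    """Pairs of (user point, carrier point) that are close in both space and time."""
    return [
        (u, c)
        for u in user_trail
        for c in carrier_trail
        if abs(u.t - c.t) <= max_dt_s
        and haversine_m(u.lat, u.lon, c.lat, c.lon) <= max_dist_m
    ]
```

because the comparison needs only the published carrier trail and the user's own trail, it can run entirely on the user's device; the pairwise loop is quadratic and is kept simple here for clarity.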
however, broadcasting methods risk exposing diagnosed carriers' identities and require exposing the locations with which the diagnosed carrier interacted, making these places, and the businesses occupying them, susceptible to boycott, harassment, and other punitive measures.
selective broadcasting releases information about locations that diagnosed carriers have visited to a select group, rather than the general public. for example, information might be selectively broadcast to people within a single region of a country. selective broadcasting requires collection of information, such as a phone number or current location, from users in order to define the selected groups. often, a user must sign up and subscribe to the service, e.g., via a downloaded app. selective broadcasting operates under one of two modes: (i) the broadcaster knows the (approximate) location of the user and sends a location specific message. thus, user location privacy is compromised. (ii) the broadcaster sends a message to all users, but the app displays only the messages relevant to the user's current location. the second approach is typically used when messages are intermittent. one example is katwarn, a german government crisis app that, once downloaded and granted access to location data, notifies users within a defined area of any major event that may impact their safety, such as a natural disaster or terrorist attack. user privacy is compromised by apps using the first mode, as the broadcasting agent receives information about the user's location. apps using the second mode do not have this same limitation, as location data is not reported back to the broadcaster.
in addition to the risk to the user's privacy with selective broadcasting, the same risks of identification of the diagnosed carrier and harassment of locations associated with the diagnosed carrier seen with broadcasting apply. further, requiring a user to sign up and subscribe risks decreased participation by possible users.
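the second mode of selective broadcasting, in which every device receives every message and filters locally, can be illustrated with a small sketch. this is an assumed design and not the protocol of katwarn or of any other real app: alerts are taken to be tagged with a coarse grid cell, and the names `Alert`, `grid_cell` and `relevant_alerts` as well as the cell size `cell_deg` are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class Alert(NamedTuple):
    cell: Tuple[int, int]  # coarse grid cell the alert applies to
    text: str              # message shown to the user


def grid_cell(lat: float, lon: float, cell_deg: float) -> Tuple[int, int]:
    """Snap a coordinate to the index of its grid cell of size cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def relevant_alerts(alerts: List[Alert], lat: float, lon: float,
                    cell_deg: float) -> List[Alert]:
    """Keep only alerts for the device's own cell; nothing is sent back."""
    here = grid_cell(lat, lon, cell_deg)
    return [a for a in alerts if a.cell == here]
```

in this arrangement the broadcaster never learns which cell a device is in, since the device's coordinates are used only inside `relevant_alerts`; the cost is that every device must download every alert.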
unicasting informs only those users who have been in close contact with a diagnosed carrier. unicasting requires government access to data, not only of diagnosed carriers, but also of every citizen who may have crossed their path. the transmission is unique to every user. china developed a unicasting system which shows who poses a risk of contagion. while highly effective at identifying users exposed to contagion for containment interventions, unicasting presents a grave risk for a surveillance state and government abuse.
in participatory sharing, diagnosed carriers voluntarily share their location trails with the public without prompting by a central entity, such as a government. advantageously, with participatory sharing, diagnosed carriers retain control of their data and presumably consent to its release. users are required to independently seek the information and assess their own exposure risks. however, these solutions present challenges as it is difficult to check for fraud and abuse.
risks exist for both the individual and the public with use of contact-tracing technology. the primary challenge for these technologies, as evident from their deployment in the covid- crisis, remains securing the privacy of individuals, diagnosed carriers of a pathogen, and local businesses visited by diagnosed carriers, while still informing users of potential contacts. additionally, contact-tracing technologies offer opportunities for bad actors to create fear, spread panic, perpetrate fraud, spread misinformation, or establish a surveillance state.
all containment strategies require analysis of diagnosed carrier location trails in order to identify other individuals at risk for infection. diagnosed carriers, therefore, are at the greatest risk of their privacy being violated, for example, by public identification. even when personal information is not published, these individuals may be identified by the limited set of location data points released.
when identified publicly, diagnosed carriers often face harsh social stigma and persecution. in one example, data sent out by the south korean government to inform residents about the movements of those recently diagnosed with covid- sparked speculations about individuals' personal lives, from rumors of plastic surgery to infidelity and prostitution. online witch hunts aiming to identify diagnosed carriers create an atmosphere of fear. as painfully articulated by the following quote, social stigma can be worse than the disease. with all currently available contact-tracing technologies, the risk for public identification of the diagnosed carrier remains high. further innovation is necessary to protect high risk populations. users also face privacy violations. providing an exposure risk assessment to the user requires the user's location data in order to establish where the user's path has crossed with that of a diagnosed carrier. however, enabling access to contact-tracing technology may, at times, violate the privacy of a non-user. users and non-users are networked together through social relationships and environmental proximity. when a family member or friend's identity as a diagnosed carrier is revealed, non-users close to the diagnosed carrier may endure the same public stigmatization and social repercussions. when a business loses customers or faces harassment due to association with a diagnosed carrier's location trail, its patrons and, particularly, its employees bear the economic and social burden whether or not they are a user of contact-tracing technology. non-users may be further negatively affected if location trails pinpoint sensitive locations, such as military bases and secure research laboratories. obtaining consent for any form of data collection and use helps manage privacy risks. consent's utility in real-world settings, however, is often undermined. language which is incomprehensible for typical users and a lack of real choice (e.g. 
users must often relinquish privacy and share their data in order to receive a service or opt not to use the service at all) severely limit the power of consent. contact-tracing technologies have yet to overcome the challenges associated with obtaining true consent from the user. typically, a user may be required to share their location with a third party in order to receive an exposure risk assessment. during an epidemic, complex and quickly evolving data must be accurately conveyed to and understood by the entire public, including individuals with low health literacy. serious harm, including heightened alarm among the public, may result from failure to appropriately communicate health risks. contact-tracing technologies have potential to introduce misinformation and cause panic. for example, if users receive an alert about a possible contact location without appropriate information and understanding of the exposure time frame, some users will inaccurately conclude they are at high risk. even when information regarding both location and time is provided to users, if the magnitude of the risk cannot be easily comprehended, an atmosphere of fear or a run on the medical system may be provoked. feeling a false sense of safety at having not received a notification of exposure, some users may underestimate their risk for disease. users who no longer perceive a significant risk may be less likely to engage in other forms of disease prevention, such as social distancing. a false sense of safety may occur when the limitations of contact-tracing technology within a community are not clearly communicated to the public. technological interventions in human crises are often targeted for fraud and abuse. in south korea, fraudsters quickly began blackmailing local merchants and demanding ransoms to not (falsely) report themselves as sick and having visited the business. 
additionally, bad actors may force individuals to provide their location data for purposes other than disease containment, such as for immigration or police purposes. fear of such abuse may prevent a contact-tracing system meant to help save lives from being adopted. hacking lingers as a serious risk for all data-gathering technologies with sensitive information, like health status and location. hackers have successfully infiltrated apps and services collecting sensitive information before, with million accounts from the genealogy and dna testing service myheritage hacked in . data security must lie at the center of every effort to use location data for contact-tracing and containment. ensuring equity and social justice challenges many technologies, including contact-tracing. if participation requires ownership of a smartphone, some people, often those most vulnerable, the elderly, the homeless, and those living in lower-income countries, will not be able to access the technology. a lack of access to devices among vulnerable populations will remain a significant challenge for contact-tracing technology in the near future. avoidance by the public may impact any business identified on a diagnosed carrier's location trail, but reduced hours or job loss hurt lower-income service workers most. finally, abuse of data collection and violations of user privacy are inflicted more often upon those who are already most vulnerable to government surveillance. in the following table, the various contact-tracing technological approaches are mapped against the reviewed risks and challenges. the inverse relationship between accuracy of the provided risk assessment and user privacy for contact tracing technologies necessitates compromise by the user community. the core trade-off between utility and user privacy, diagrammed below, illustrates this and highlights the potential of private kit: safe paths to fundamentally alter this relationship. 
deploying any form of contact-tracing technology requires contemplation of several risks outlined in the prior analysis. mitigation of these risks depends on thoughtful consideration of the trade-offs inherent to contact-tracing technology and containment strategies. in the following, we review decisions required for these trade-offs and best approaches for risk mitigation.
data must be collected from diagnosed carriers to facilitate containment of an epidemic. however, both data collection and release of that information to identified contacts may violate the diagnosed carrier's privacy. because the diagnosed carrier is the most vulnerable stakeholder in the containment strategy, several efforts must be undertaken to protect the carrier's privacy to the highest degree possible. limiting the publicly published data helps protect the known carrier's identity from the public. to date, with the exception of participatory sharing models, the diagnosed carrier's data must be shared with a third-party entity, requiring the carrier to relinquish at least some control over their data. ending the need for third party involvement would represent an immense step forward in privacy protection for diagnosed carriers. access and usage of the data by an entity, mostly governments, should be limited and highly regulated. harsh penalties for the abuse of such data should be established. obtaining true user consent further protects diagnosed carriers. not all approaches in use today require consent to share personal data. particularly in non-democratic regimes, diagnosed carriers may be unable to deny consent. in other instances, all users must consent to share their data in order to be informed of their own exposure risk. we believe no one should be obligated to share their personal information. time limited storage of location trails further protects the privacy of diagnosed carriers.
finally, using an open-source approach to create an app fosters trust in the app's privacy protection capabilities, as independent experts and media can access and evaluate the source code. containment of an epidemic requires publishing the sites of known exposure to a diagnosed carrier. yet doing so risks harassment of local businesses at these sites. providing broader location data may better protect the privacy of a local business, but also reduces the accuracy of the risk assessment. broad location data, such as notice of a x m area into which a diagnosed carrier sojourned, may still identify a business. any contact-tracing approach must balance the public health benefit of disease containment against the threat of economic hardship for local businesses connected to the epidemic. there is no easy answer to this trade-off, as any choice impacts the utility of the technology and risks affecting the viability of the business. evaluating the risk versus benefit of location data release should occur on a case-by-case basis. the time frame of possible contagion must be released so that users can understand the limits of the exposure risk. critically, the entity publishing the location data should consult with the local business and inform the business of any decision before the public is notified. issues of access and inclusion are not easily resolved by contact-tracing technology. limited access to a device capable of utilizing contact-tracing technology and difficulty understanding and acting on the provided risk assessment disproportionately affect the more marginalized members of our societies. however, containing an epidemic outbreak quickly benefits everyone within a community. implementation of contact-tracing technology within a community, even with unequal access, may increase the safety of all. 
the development of a simple gps device that can share location trails may be a medium-term solution to some accessibility concerns, particularly in countries with limited smartphone penetration. additionally, some form of access to information about a possible contagion must be made available to those without a smartphone, and all information should be presented in a way that accounts for variation in health literacy among users. the spread of misinformation cultivates instability and uncertainty during a crisis. release of information on the spread of a pathogen to the public invites public speculation, fear-mongering, and manipulation by bad actors. a false sense of safety among users may grow as contact-tracing technology becomes more efficient. entities providing contact-tracing technology are also at risk of introducing errors into the released information, despite best intentions. at this time, no strategies exist to eliminate these risks; however, such risks can be mitigated through educational outreach efforts and engagement with key stakeholders. storage of sensitive information invites attack by hackers. trade-offs must be made in order to mitigate this risk. only anonymized, redacted, and aggregated sensitive information should be stored. use of a distributed network, rather than a central server, makes hacking less attractive, but requires providing security to multiple sites. in the long term, the safest way to store location data will be in an encrypted database inaccessible to all, including the government. time limitations on data storage also work well to secure information and should be implemented in contact-tracing technology. during an epidemic outbreak, the appropriate amount of time for data storage equals the time during which a diagnosed carrier could have possibly infected another individual. for covid- , this time frame is set to be to days. 
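the time limitation on storage described above can be sketched in a few lines. this is a minimal illustration only, not the design of any particular app: the function name `prune_trail`, the record layout, and the value passed for `retention_days` are all hypothetical, and the actual retention window would be set from the infectious period of the pathogen.

```python
from datetime import datetime, timedelta

def prune_trail(trail, now, retention_days):
    """Keep only location points recorded within the retention window.

    trail: list of dicts with a "time" key (hypothetical record layout).
    retention_days: length of the window in days (placeholder value below).
    """
    cutoff = now - timedelta(days=retention_days)
    return [point for point in trail if point["time"] >= cutoff]

# Made-up example data; coordinates and dates are illustrative only.
now = datetime(2020, 4, 1)
trail = [
    {"time": datetime(2020, 3, 30), "lat": 42.36, "lon": -71.09},
    {"time": datetime(2020, 3, 10), "lat": 42.35, "lon": -71.06},
]
kept = prune_trail(trail, now, retention_days=14)
```

running such a pruning step on the device itself, rather than on a server, would keep expired points from ever leaving the user's control; that placement is a design assumption of this sketch.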
deleting data after such a short period, particularly during an outbreak of a poorly understood pathogen, has risks. however, we feel this trade-off should be made for data security and user privacy. our ability to accurately trace contacts of individuals diagnosed with a pathogen and notify others who may have been exposed has never been greater. real risks exist, though, so care must be taken in the design of the solution to prevent abuse and mass surveillance. as a beginning to the discussion of how to develop and deploy contact-tracing technologies in a manner which best protects the privacy and data security of their users, we have reviewed various technological methods for contact-tracing and have discussed the risks to both individuals and societies. private kit: safe paths eliminates the risk of government surveillance. it draws on the advantages of several models of contact-tracing technology while better mitigating the challenges posed by use of such technology. we have presented a discussion of precautions which should be taken and trade-offs which will need to be made. we invite feedback and discussion on this whitepaper. we would like to acknowledge amandeep gill of the international digital health bernardo mariano jr of the world health organization (who), and don rucker of the u.s. department of health and human services (hhs) for their mentorship in advancing contact-tracing solutions.
key: cord- 
-z qn tw authors: cho, pauline; boost, maureen title: covid —an eye on the virus date: - - journal: cont lens anterior eye doi: . /j.clae. . . sha: doc_id: cord_uid: z qn tw nan it is currently unknown how the organism can infect the eye, whether it is direct transmission by droplets or aerosol from infected patients or spread from the nasal tissues via the nasolacrimal duct [ ] . optometrists, whilst not considered frontline health care workers for covid , may nonetheless be exposed to asymptomatic patients or their close contacts. therefore, they need to take precautions to protect themselves and others in their practices. depending on local regulations, it is a sensible precaution to check the temperature of anyone entering the practice, including staff, and for all present to wear a surgical mask and use hand sanitizers. ideally, this should be supplemented with a face visor, as certain diagnostic procedures do require close contact with patients. in addition, ophthalmic instruments should be shielded as far as possible to prevent contamination and wiped down with disinfectant or alcohol after each patient. clients may try on several frames when choosing new spectacles, so, although it is not normal practice to clean these after every customer, at this extraordinary time, all items that come into contact with potentially infected persons must be wiped with alcohol or an equivalently effective method. this means that clients trying on frames must be supervised to ensure that frames tried by the clients can be identified for cleaning. although the optometry practice is not the correct setting, some patients may present with ocular manifestations, such as conjunctivitis [ ] , rather than attending the hospital or a medical practitioner. an ocular condition may be the first symptom of covid [ ] and it is considered that one of the first practitioners to draw attention to this new infection was dr li wenliang, an ophthalmologist [ ] . 
it is thought that he contracted the virus from a glaucoma patient. in china, guan et al. [ ] reported that less than % of covid patients had conjunctival congestion, but subsequent reports have observed higher percentages [ , , ] , although not all conjunctivitis present in covid patients was attributable to the virus [ ] . regarding the use of contact lenses during this period of worldwide infection, mixed messages have emerged from various health sources, making it difficult for practitioners to provide absolute guidance to their patients. the most frequent question regards the safety of wearing contact lenses for vision correction during the crisis. contact lens wear itself is safe if there is strict compliance with care and wear routines [ ] . this has not been changed by the pandemic. there is no definitive evidence so far of direct transmission via contact lenses. however, it is well recognized that contact lens patients are frequently not compliant [ ] [ ] [ ] . in addition, contact lens wearers may experience minor discomfort or irritation more frequently than spectacle wearers [ ] , and this in turn increases the chance of the natural response to touch or rub the eyes [ ] . this is especially true for patients who experience some dryness when using contact lenses. for these patients in particular, it may be prudent to switch to spectacles for this period. alternatively, those who are more dependent on their contact lenses may consider wearing plano spectacles or sunglasses, in addition to their lenses, in public. nevertheless, as spectacle wearers may also rub their eyes after removing their glasses, they also need to be reminded to avoid doing so. during this pandemic, many countries have instituted formal lockdowns, or working from home. whilst this reduces the risk of contact with others, it may encourage another form of non-compliance: napping while wearing contact lenses. 
this can lead to hypoxia, which increases the risk of infection [ ] . once again, practitioners should remind their patients of the importance of not sleeping with their lenses on, unless prescribed otherwise. another issue regarding lockdown is the lack of accessibility to aftercare consultations. some practitioners are sending out refill lenses on request, which allows them to remind patients to attend the clinic after lockdown is lifted. for other patients, a general reminder sent by social media helps to maintain rapport with the practitioners and may prevent patients from regularly ordering lenses online even after the lockdown. practitioners, of course, are responsible for impressing upon their patients the importance of good hygiene, but some additional precautions are needed during this time of pandemic. spectacle wearers should be reminded to clean their glasses on returning home, using either alcohol wipes or soap and water. clients should also be reminded that other frequently touched items, especially mobile phones, should be regularly cleaned to prevent cross transmission. any patient who does contract covid should be advised to discard all previously worn contact lenses and used accessories (e.g., solutions, lens storage case) after recovery. since the beginning of the pandemic, it has been believed that children are relatively safe from covid . however, according to a recent study in the us, the risk has been underemphasized [ ] , as several cases of severe, even fatal, disease have been reported. it appears to be related to an immense over-response of the immune system to the infection, which is very difficult to treat. the authors consider that a surge of severe pediatric cases of covid would be disastrous, in view of inadequate pediatric hospital care resources. this observation is likely to be the case in many other countries as well. 
at this time, it is essential to increase awareness of children wearing contact lenses for myopia control. these include those wearing specialized soft contact lenses in the daytime and those undergoing overnight orthokeratology. it is more difficult to impress on children the importance of not touching 'men'. therefore, the case for temporary discontinuation of daytime lens wear at this time may be stronger, but the final decision must be made by the practitioner after careful consideration of the likelihood of good compliance and the amount of time spent out of the home. in many places, schools are closed and time outdoors is strictly limited, reducing opportunities for contamination and infection. once again, when outdoors, the risk can be ameliorated by the use of plano glasses or sunglasses with the contact lenses. with respect to orthokeratology, lenses are only worn at home and insertion/removal is often performed with the assistance of parents. children sleep with the lenses in place, thereby reducing the chance of touching their eyes; after all, no one wears a face mask to sleep! one problem of school closures is that children may be allowed to sleep in longer. practitioners need to caution parents that extended hours of sleep with the orthokeratology lenses may lead to development of corneal oedema. this condition has recently been observed in a few children undergoing orthokeratology treatment in hong kong. as patients may be unable to visit clinics or practices due to lockdown, it may be advisable for practitioners to contact the parents of all of their orthokeratology patients to alert them about this problem. after all, as boris johnson said, "we must all be alert!" ( may, ). 
tropism, replication competence, and innate immune responses of the coronavirus sars-cov- in human respiratory tract and conjunctiva: an analysis in ex-vivo and in-vitro cultures face shields for infection control: a review potential utilities of mask-wearing and instant hand hygiene for fighting sars-cov- can the coronavirus disease (covid- ) affect the eyes? a review of coronaviruses and ocular implications in humans and animals ophthalmic manifestations of coronavirus (covid- ) characteristics of ocular findings of patients with coronavirus disease (covid- clinical characteristics of coronavirus disease in china the evidence of sars-cov- infection on ocular surface evaluation of ocular symptoms and tropism of sars-cov- in patients confirmed with covid- the covid- pandemic: important considerations for contact lens practitioners noncompliance and microbial contamination in orthokeratology hand hygiene is linked to microbial keratitis and corneal inflammatory events a study of contact lens compliance in a non-clinical setting responses of contact lens wearers to a dry eye survey eye rubbing-induced changes in intraocular pressure and corneal thickness measured at five locations, in subjects with ocular allergy pathogenesis of contact lens-associated microbial keratitis covid- in children in the united states: intensive care admissions, estimated total infected, and projected numbers of severe pediatric cases in key: cord- -x dqi ym authors: lowery-north, douglas w.; hertzberg, vicki stover; elon, lisa; cotsonis, george; hilton, sarah a.; vaughns, christopher f.; hill, eric; shrestha, alok; jo, alexandria; adams, nathan title: measuring social contacts in the emergency department date: - - journal: plos one doi: . /journal.pone. sha: doc_id: cord_uid: x dqi ym background: infectious individuals in an emergency department (ed) bring substantial risks of cross infection. 
data about the complex social and spatial structure of interpersonal contacts in the ed will aid construction of biologically plausible transmission risk models that can guide cross infection control. methods and findings: we sought to determine the number and duration of contacts among patients and staff in a large, busy ed. this prospective study was conducted between july and june . two -hour shifts per week were randomly selected for study. the study was conducted in the ed of an urban hospital. there were shifts in the planned random sample of ( %) with usable contact data, during which there were patient encounters. of these, ( %) were approached to participate, of which ( %) agreed. over the course of the year, staff members participated ( %). a radiofrequency identification (rfid) system was installed and the ed divided into distinct zones structured so copresence of two individuals in any zone implied a very high probability of contact < meter apart in space. during study observation periods, patients and staff were given rfid tags to wear. contact events were recorded. these were further broken down with respect to the nature of the contacts, i.e., patient with patient, patient with staff, and staff with staff. , contact events were recorded, with a median of contact events and contacts with distinct individuals per participant per shift. staff-staff interactions were more numerous and longer than patient-patient or patient-staff interactions. conclusions: we used rfid to quantify contacts between patients and staff in a busy ed. these results are useful for studies of the spread of infections. by understanding contact patterns most important in potential transmission, more effective prevention strategies may be implemented. presentation of an infectious patient to an emergency department (ed) brings a substantial risk of cross infection. 
ed cross infection risk was demonstrated dramatically during the severe acute respiratory syndrome coronavirus (sars co-v) epidemic. the son of the first index case arriving in toronto fell ill after caring for his mother [ ] . he visited a crowded ed and waited hours for a hospital bed assignment. subsequently, nosocomial sars infections among patients and staff were traced to direct or indirect exposure to this patient; several of these victims died. since this incident, ed crowding has worsened, [ ] increasing commingling of acutely infected patients with other susceptible and high-risk patients, thereby increasing the risk of ed cross infection. this sars outbreak is not an isolated incident. ed visits have previously been shown to be a significant risk factor for subsequent infection in the pediatric population, [ , ] as well as in the elderly [ ] . cross infection of patients in the ed is also an important concern to patients and staff in other hospital areas, since more than % of hospitalized patients originate from the ed [ ] . annually, a global epidemic of influenza results in significant morbidity and mortality [ , ] . although evidence suggests influenza may be transmitted via airborne and contact routes, [ ] most authorities agree influenza is transmitted primarily in droplets passing between people, as in the sars co-v outbreak [ ] . droplet-mediated cross infection typically occurs within a one ( ) meter (m) radius between source and exposed, as the droplets are of adequate size that gravity pulls them to the ground before they can travel laterally [ ] . until recently, many mathematical models of the cross infection process had assumed that individuals are mixing and coming into contact with each other randomly [ ] . recent research has begun to show that humans do not come into contact with each other according to a uniformly random process; human interaction is highly influenced by other external factors [ ] [ ] [ ] [ ] . 
knowledge of the social and spatial structure of interpersonal contacts in the ed will provide information useful for building biologically plausible mathematical models of cross infection risk, [ ] which can guide development of cross infection control measures. advances in technology have made the automated tracking of individuals possible and increasingly affordable using a variety of types of real time location sensing systems, such as radiofrequency identification (rfid) and motes. while the manufacturing and retail sectors of our economy have been making use of such technology to track goods for years, declining costs have enabled researchers to deploy such systems to investigate human motion in a variety of settings. although investigations of human movements and resulting contacts have been conducted in a variety of settings, such as schools [ , ] and academic conferences, [ , ] there has been relatively limited deployment for research purposes in the health care setting. notably there have been four such studies in settings such as a pediatric emergency department, [ ] two hospital wards with airborne precautions, [ ] a general pediatric ward, [ ] and a medical intensive care unit (micu) [ ] . we report here the results of a year-long deployment of a rfid system covering all areas of an adult ed, describing the contacts between and among patients and staff. the goal of this study was to determine contact characteristics among patients and staff in the ed of a busy urban hospital. the number and duration of contacts between individuals is described overall and by patient-patient (pp), patient-staff (ps), and staff-staff (ss) type contacts. ethics statement: the emory university institutional review board (irb) granted waiver of all elements of informed consent and waiver of hipaa authorization. this is a prospective study. the study was conducted in the ed of emory university hospital midtown in the midtown area of atlanta, ga. 
the ed occupies , square feet, and includes triage, fast track, acute care, and observation functions. it has an annual census of , visits. contact was defined as any two individuals located within m of each other in contiguous two-dimensional space, measured by radiofrequency identification (rfid). an active rfid proximity detection system was installed in and activated in early (radianse corp., amherst, ma). the ed was divided into two-dimensional zones, and these zones were configured around hard and soft architectural features such that when two individuals were in the same zone simultaneously, they were within m of one another with a very high probability. the floor plan of the ed as divided into zones is given in figure . all areas of the ed were covered by the system. rfid tags transmitted their unique identifier every ten seconds. the system was laid out such that rfid tag signals would be detected by at least three receivers. receivers relayed information back to a server, where a proprietary algorithm determined tag location and assigned a corresponding zone identifier. to facilitate line-of-sight determination in the presence of many walls (i.e., two subjects on either side of a wall but otherwise within m would not be considered as in contact), the ed was divided into zones. data were retrieved using mysql database (oracle corp., redwood shores, ca) queries and stored in microsoft excel (microsoft corp., redmond, wa). a representative sample of -hour shifts over one year was randomly selected for direct observation. two -hour periods (one am- pm (day shift) and one pm- am (night shift)) were randomly selected for study from each week between july and june . this selection was made in order to get a sense of seasonal variability and to ensure that day and night shifts were equally represented throughout the year. week was chosen as a blocking factor for the design since there is a distinct weekly rhythm to patient flow and workflow in the ed. 
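the week-blocked selection of shifts can be illustrated with a short sketch. it assumes, for illustration only, that each week offers seven day shifts and seven night shifts indexed 0 to 6, and that the study year spans 52 weeks; the function name `select_shifts` and the seed are hypothetical and do not describe the procedure actually used.

```python
import random

def select_shifts(n_weeks, seed=None):
    """Pick one day shift and one night shift at random within each week.

    Week is the blocking factor: every week contributes exactly two shifts,
    so day and night shifts are equally represented across the year.
    Days are indexed 0-6 within a week (hypothetical indexing).
    """
    rng = random.Random(seed)
    sample = []
    for week in range(n_weeks):
        sample.append((week, rng.randrange(7), "day"))
        sample.append((week, rng.randrange(7), "night"))
    return sample

# 52 weeks is an assumed stand-in for the one-year study window.
shifts = select_shifts(n_weeks=52, seed=1)
```

blocking by week in this way spreads the sample evenly over the seasons, at the cost of not allowing two day shifts to be drawn from the same week.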
the study budget was designed to support the study of no more than two shifts per week. staff. rfid tags were issued to assenting ed staff members to wear. patients. immediately prior to each study period, the research team placed rfid tags on assenting patients already enrolled in the process of care in the ed. research team members placed tags on newly arriving assenting patients during the study period. psychiatric patients, patients in police custody, and patients who were expected to be discharged within a short time were not approached for rfid tagging, due to either irb concerns (first two groups) or study staff limitations (last group). patients in this last group were generally waiting in an exam room and had very few contacts. ed staff removed patient tags at the earliest of time of patient discharge (for those discharged), time of hospital admission (for those admitted), or time of study period conclusion. data were collected from three different sources. the electronic health record (ehr) (cerner millennium electronic health record, cerner corp., kansas city, mo) provided standard clinical and demographic characteristics for patients. the tag identification database tracked tag information. the radianse database provided time-stamped information about tag zones. race was abstracted from the ehr as entered by hospital registration staff, who as part of routine operations ask patients to identify their race and ethnicity. this variable is reported since social mixing may vary culturally and thus may affect the ultimate generalizability of our study. contacts were described in two ways. a unique pair of individuals could make contact multiple times over the course of a shift. "contact pairs" defined multiple contacts as one pair, and the durations of these contacts were cumulated. in contrast, "discrete contact events" treated multiple discontinuous instances as multiple contacts of one contact pair. 
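the difference between the two descriptions can be made concrete with a small sketch. it assumes, purely for illustration, that the contact history of one pair of individuals is stored as a per-second sequence of 0s and 1s (1 meaning the two were in contact during that second); the function name `summarize_pair` and the example sequence are hypothetical.

```python
from itertools import groupby

def summarize_pair(sequence):
    """Return (discrete contact events, cumulative seconds in contact).

    sequence: per-second 0/1 flags for one pair of individuals.
    A discrete contact event is a maximal run of consecutive 1s;
    the contact-pair view reports only the cumulated duration.
    """
    events = sum(1 for value, _ in groupby(sequence) if value == 1)
    total_seconds = sum(sequence)
    return events, total_seconds

# Made-up example: three separate runs of contact, six seconds in total.
seq = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
events, total = summarize_pair(seq)
```

here the pair counts once as a contact pair with a cumulated duration of six seconds, but contributes three discrete contact events.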
figure gives a schematic representation of these definitions. for each second of each shift we created adjacency matrices in which every participant in that shift was cross-listed against every other participant. by aligning these matrices along the dimension of time, we created a three-dimensional time-resolved adjacency object (trao) for each shift in the study. we used data from the radianse system to determine if any two participants were in contact at a given instant, assigning 1s in the appropriate cells of the relevant adjacency matrix for that instant. otherwise 0s were assigned if participants were not in contact at that time or if neither participant was present in the ed at that time. by summing in the time dimension over these objects we could determine the number of contact pairs formed among participants and the total duration of contact for each pair. we then examined the contact sequence for each contact pair in the time dimension as represented by a sequence of 0s and 1s. each distinct sequence of 1s that was interrupted by 0s was called a contact event. this is also represented in figure . selection bias. to reduce selection bias, study periods were randomly selected. we aimed to study every assenting patient and ed-based staff member present in the ed during these periods. measurement bias. to reduce measurement bias, ed staff were advised to utilize standard operating procedures when performing data entry into the clinical information system. no subjects were asked to record study-specific information. the research team alone performed entry of linkage data for patients and rfid tag data. classification bias. to reduce classification bias, the rfid system determined subject locations independently of the clinical information systems. off-the-shelf software was utilized to determine contacts. the system was calibrated for accuracy and reliability prior to the start of the study. 
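the construction of a time-resolved adjacency object from zone records, described earlier in this section, can be sketched as follows. this is an illustrative reconstruction under stated assumptions, not the software used in the study: the `zone_log` layout, the zone names, and the function names `build_trao` and `pair_durations` are hypothetical, and time is reduced to a handful of steps.

```python
from itertools import combinations

def build_trao(zone_log, participants, n_steps):
    """Build one 0/1 adjacency matrix per time step.

    zone_log maps (participant, step) -> zone id; a missing key means the
    participant was not present at that step (hypothetical layout).
    Two participants are in contact at a step if they share a zone.
    """
    index = {p: i for i, p in enumerate(participants)}
    n = len(participants)
    trao = []
    for step in range(n_steps):
        matrix = [[0] * n for _ in range(n)]
        for a, b in combinations(participants, 2):
            zone_a = zone_log.get((a, step))
            zone_b = zone_log.get((b, step))
            if zone_a is not None and zone_a == zone_b:
                matrix[index[a]][index[b]] = 1
                matrix[index[b]][index[a]] = 1
        trao.append(matrix)
    return trao

def pair_durations(trao, participants):
    """Sum over the time dimension: steps in contact for each contact pair."""
    totals = {}
    for i, j in combinations(range(len(participants)), 2):
        steps = sum(matrix[i][j] for matrix in trao)
        if steps:
            totals[(participants[i], participants[j])] = steps
    return totals

# Made-up example: two patients and one staff member over three steps.
participants = ["p1", "p2", "s1"]
zone_log = {
    ("p1", 0): "triage", ("p2", 0): "triage", ("s1", 0): "acute",
    ("p1", 1): "triage", ("p2", 1): "acute", ("s1", 1): "acute",
    ("p1", 2): "triage", ("p2", 2): "triage",
}
trao = build_trao(zone_log, participants, n_steps=3)
durations = pair_durations(trao, participants)
```

the keys of `durations` are the contact pairs, and the values are the cumulated steps in contact; pairs that never share a zone do not appear.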
initial calibration of the system verified that a radio signal emitted anywhere within the ed footprint would be captured by at least three and typically four receivers. further verification was completed prior to study initiation to demonstrate that a radio signal received from any location would be accurately mapped to the appropriate zone. no ongoing systematic analysis was performed after the initial setup since hardware and software location and configuration were static. however, the system was utilized during routine, nonresearch operations, / during the entire study year, to locate both human and equipment resources in real time. no disparities were noted nor reported by ed staff during this extensive observation. furthermore, during the twelve-hour study periods, research assistants interacted with the system in real time to locate patients, staff, and equipment. again, no location disparities were noted. funding constraints limited our ability to record each instance research assistants interrogated the system for a specific badge location and then found the badge at that location, although research staff were trained to record any operational aberrations. this descriptive study was designed so sufficient data would be available to adequately characterize temporal aspects of contact variability and simulate epidemic spread at seasonally appropriate times of year. simple descriptive statistics were utilized to estimate number, duration, and nature of contacts (pp, ps, and ss) during the study period, using sas v . for windows enterprise (sas institute, cary nc). in particular, data were summarized over a shift, and these summary values were further aggregated. data were highly skewed, thus we present summary percentile values. friedman's test [ ] along with tukey's posthoc procedure [ ] was used to test if the contact characteristics were the same for pp, ps, and ss using the medians from each shift. 
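as an illustration of the comparison across contact types, the sketch below computes the friedman statistic with each shift as a block and the pp, ps, and ss medians as the three treatments. the data are made up, the tie correction is omitted, and tukey's post hoc procedure is not shown; the p-value uses the chi-square approximation, whose survival function with two degrees of freedom (three treatments) is exp(-q/2). the function names are hypothetical, and the study itself used sas.

```python
import math

def average_ranks(values):
    """Rank values from 1 to k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def friedman_statistic(blocks):
    """Friedman chi-square statistic (no tie correction).

    blocks: one row per shift, each holding k treatment values.
    """
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Made-up per-shift medians in the order (pp, ps, ss).
blocks = [[2, 3, 9], [1, 4, 8], [3, 2, 10], [2, 5, 7]]
q = friedman_statistic(blocks)
p_value = math.exp(-q / 2)  # chi-square survival function, valid for k = 3 only
```

with only a few blocks the chi-square approximation is rough, so an exact or permutation-based p-value would be preferable for small numbers of shifts.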
from potential twelve-hour sampling periods, we randomly selected for observation. usable data were obtained from shifts ( %). the remaining shifts in the sample were excluded due to equipment problems (n = ; %) and insufficient study staffing (due to illness, inclement weather, etc., n = ; %). over the course of the study year , distinct patient admissions occurred (table ). of these, ( %) occurred in the shifts studied. among these admissions, the research team did not approach ( %). of the remaining , ( %) were excluded for patient-related reasons (patient refused assent ( %), patient could not assent ( %), nurses recommended exclusion ( %), imminent discharge ( %), other ( %)). another ( %) were excluded for technical reasons, leaving patient admissions with usable data. the median percent of patients approached for each shift was % and the median percent participating was %. as the study progressed, the flux of patients increasingly exceeded the capacity of the research team to approach all patients (figure ), although the percentage of patients participating of those approached remained constant. a median of patients per shift were tagged (iqr - ). demographic and clinical characteristics of the patient admission data are shown in table , summarized for the population over the year, all patients in our sampled shifts, and patient study participants only. there are no clinically meaningful differences between participants and the general population with respect to age, sex, and race. however, participants tended to present with greater acuity. there were parallel increases in length of stay (los) and percent admitted to the hospital as the overall patient population narrows to participants. there were staff eligible, of whom ( %) agreed to participate. the median number of staff participants per shift was (iqr - ). 
although the staffing pattern provided for a maximum of staff per shift, occasional staff meetings, skill seminars, and double coverage caused the number of staff to exceed this maximum in some shifts. as staff are a vulnerable population we did not collect demographic information in order to preserve anonymity and thus maximize participation. staff participants were observed in shifts numbering between and , with a median of . we observed a median of participants per shift (interquartile range (iqr) - ). the median of the maximum number of participants in the ed simultaneously was for day and for night shifts. contact characteristics (number per shift and duration) are summarized over shifts (table ). median and iqr values for quantity and duration of contacts within shift and by type of contact (i.e., patient-patient, patient-staff, staff-staff) are shown. numbers of contact events are described by shift, by participant, and by contact pair. numbers of contact pairs are described by shift and by participant. duration of contact is described by total per shift, per contact event, per contact pair, and per participant. we observed a total of , contact events across all shifts. in the typical observation period, (median) contact events occurred, with most of type ss. a similar pattern was seen in contact events per participant and per pair. we observed (median) contact pairs per shift, with pp and ss contacts having equally large frequency ( pp, ss) while ps contacts ( ) were less numerous. the number of distinct participants with which any other participant came into contact was largest for ss ( . ) as compared to pp ( ) and patient -staff interactions of both types ( ) . a typical patient participant came into contact with staff participants, while a typical staff participant came into contact with patient participants. in a typical observation period, (median) hours of contact were observed, with ss hours being an order of magnitude greater than pp or ps. 
similar patterns were seen with hours of contact per hour of shift. most contact events were short (, minutes for pp and ps, , minutes for ss). total minutes per pair were similarly short for pp and ps ( . and . minutes respectively) but longer by a factor of more than for ss ( . minutes). a similar pattern was observed for total duration in contact for participants. a typical patient participant had . minutes of contact with staff participants, while a typical staff participant had . minutes of contact with patient participants. the distributions of the total duration of contacts per participant by contact type (pp, p with s, s with p, and ss) are given in figure . note that the distribution of total contact duration of staff with other staff is much less right-skewed than for the other contact types. the cumulative distributions of the number of contacts per participant (degree) by contact type are given in figure . note that these distributions are not consistent with a power law distribution.
an off-the-shelf, commercially available rfid system was used to measure contacts in an ed. the data described here illustrate the complexity of interpersonal interactions in this setting. we found substantial differences in contact characteristics of the three mixing subgroups. in general, ss interactions were more numerous and longer than pp or ps interactions. therefore, given a susceptible population of patients and staff, the biological gradient created by individuals in contact favors cross infection among staff. this finding is consistent with previous studies of contacts in the healthcare environment which highlight that
table . summary of contact characteristics per shift among patients and staff in an emergency department, over shifts.
table notes: there were a total of individuals in shifts that did not make a contact while under surveillance. they are not included in these calculations. the median and quartiles of each shift were calculated and the median of these values is reported. the median of all types will not be the sum of the subtype medians. all comparisons across group types were significant by friedman's test at p, . , except for contact pairs/shift, which was significant at p = . . tukey's post hoc procedure was used to determine which groups were different and the ordering. a contact event is defined as any two people being within 1 meter of each other; multiple discontinuous instances between the same two individuals are here counted as multiple contacts. one contact pair is defined as any two people who have at least one instance of being within 1 meter of each other ( = an edge or link); multiple discontinuous instances are here counted as a single contact. total hours/shift is the sum of all instances of contact. nb: shift duration ranged from to hours. doi: . /journal.pone. .t
the validity of this study for the general healthcare population is limited by several factors. the study period coincided with the novel h1n1 influenza outbreak. the methods of this study made it impossible to know how the novel influenza outbreak or its associated publicity affects the generalizability of our results to the general population of eds. informal observations of interpersonal behavior made by research study staff suggest no change in number or duration of interpersonal contacts. sources of bias. despite our countermeasures, several types of bias were present in the study, which might result in biased estimates of number and duration of contacts among all mixing groups. several factors led to the presence of selection bias. first, the study was performed in a busy, urban ed with unique facility footprint, staffing pattern, and patient demographics. therefore these results may not be applicable to all eds or even to other similar, non-ed healthcare environments.
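The distinction drawn in the table notes between contact events and contact pairs can be made concrete with a short sketch. It assumes, hypothetically, that each pair of participants has a 0/1 proximity series with one entry per time step (1 meaning the two were in contact); the participant labels and series below are invented for illustration and are not study data.

```python
from itertools import groupby

def count_contact_events(indicator):
    """Count maximal runs of 1s: each discontinuous run is one contact event."""
    return sum(1 for value, _ in groupby(indicator) if value == 1)

def summarize_contacts(series_by_pair):
    """Return (contact events, contact pairs, total time steps in contact)."""
    events = {pair: count_contact_events(s) for pair, s in series_by_pair.items()}
    n_events = sum(events.values())
    # A pair counts once as an edge no matter how many events it produced.
    n_pairs = sum(1 for n in events.values() if n > 0)
    total_steps = sum(sum(s) for s in series_by_pair.values())
    return n_events, n_pairs, total_steps

# Hypothetical proximity series for three pairs over eight time steps.
example = {
    ("p1", "s1"): [1, 1, 0, 0, 0, 1, 1, 1],  # two separate events, one pair
    ("s1", "s2"): [1, 1, 1, 1, 1, 1, 1, 1],  # one long event, one pair
    ("p1", "p2"): [0, 0, 0, 0, 0, 0, 0, 0],  # never in contact
}
print(summarize_contacts(example))
```

Under these definitions the event count can exceed the pair count, while total contact time is unaffected by how the runs are split.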
study criteria excluded visitors and non-ed based hospital staff present in the ed (e.g., cleaning staff, hospital chaplains) as well as prehospital personnel. moreover, only staff participated, although staff were employed during the study year. irb approval was contingent on anonymization of staff beyond job category (physician, nurse, other patient care, administrative). therefore, number and total duration of staff contacts are underestimated. second, over time we observed fewer rfid signals from staff participant tags. while it is reasonable to conclude that there was a decline in staff participation over the course of the year, there is also high probability that some (and perhaps most) of the decline could be attributed to battery failure in the permanent tags. the battery half-life was one year, and these badges were activated six months prior to the commencement of our year of official study observation in order to test and calibrate the system. indeed, the fall-off in staff participation is consistent with exponential failure time with half-life of one year. a spurious system setting prevented the receipt of planned alerts regarding weakening batteries. after the conclusion of the study period, we found a substantial number of staff tags with dead batteries. thus the fall-off in staff participation is likely due to battery failure rather than staff selecting out of the study. regardless of the reason, the effect of this decline is underestimation of the number and duration of contacts. lastly, despite utilizing randomly selected observation periods and waiver of all elements of informed consent, patient participants tended towards higher acuity triage score, higher likelihood of hospital admission, and longer ed los than the ed population at large. the study may also be affected by measurement bias. study staffing was inadequate to provide tagging of patients at the instant of ed arrival. 
thus actual los was longer than studied los, so number, degree, and duration of contacts have likely been underestimated. moreover we made a protocol decision not to tag patients who were awaiting discharge. in this case, the number and duration of contacts have also likely been underestimated. on the other hand, we assumed that discrete contact events occurred when the adjacency matrix elements of the trao for subjects i and j along the time dimension were two strings of 1s separated by as few as three 0s. such an occurrence could reflect, for instance, a staff member seeing a patient in an exam room (series of 1s), stepping out of the room for as little as three seconds (0, 0, 0), then returning to the room (series of 1s). since we could not separate these very real possibilities from those due to poor signal quality, we elected to leave these as separate events. in this case, we may have overestimated the number of contact events while underestimating event duration. another place where measurement bias may have occurred was in the waiting room. our focus on getting greater separation between patient exam rooms came at the expense of less separation in the waiting room, a large (, square foot) open square-shaped area that was divided for system purposes into two zones. all participants that the system located to one waiting room zone were counted as in contact. given the hard and soft architectural features of this space, it is highly likely that individuals colocated within the space would have a high probability of being in close personal contact. however, we could have overestimated the number and duration of contacts for patients in one waiting room zone that were actually more than 1 m apart, but we also could have underestimated the number and duration of contacts for patients in two zones that in reality were seated next to each other. the waiting room population is very dynamic. there are a number of factors that contribute to this.
the waiting room lies between the main entrance for walk-in patients and the registration area. thus upon first entry patients and/or their accompanying visitors must walk through the thick of other patients to sign into the system. second, the front-end process of care requires patients to make multiple trips to and from the waiting room: sign-in, triage, hospital registration, care initiation, emergency radiology, and, in some cases, to see a mid-level provider. also, to enhance patient satisfaction with the ed visit experience, the waiting room contains many diversions to address patient comfort, for example, a coffee station, telephones, reading materials, vending machines, restrooms, and, during the time of the study, a designated smoking area. all of these factors contribute to high mobility of patients in the waiting room during their visit, resulting in more brief contacts than might initially be expected. in the case of a highly transmissible virus, such patterns of contacts may be sufficient for cross infection. however, for a less transmissible virus, many brief contacts may not generate sufficient exposure to the index case that would result in cross infection. several factors contributed to classification bias. software requirements mandated division of the ed footprint into mutually exclusive zones, for which any two individuals simultaneously in a zone would likely (but not certainly) be in contact with one another. the net effect of this procedure should result in neither overestimation nor underestimation of study results. challenges were encountered among staff compliance with badges. hospital staff members typically work in designated areas, and thus perceive themselves to be in constant visual contact with their team members. 
therefore, to the staff, benefits to be obtained by tracking potential cross infection exposures are outweighed by costs -for instance, annoyance due to dropping of tag when affixed to pocket, tangling with stethoscope when carried on lanyards around the neck, or interfering with hospital staff id tags. we attempted to improve staff compliance by issuing permanent tags for staff to wear and providing periodic education sessions. staff did not have to activate their tags nor record their use, therefore we had a completely passive inclusion system for staff but not for patients. for staff, we considered the shift start time as the time that the tag was first located outside staff only areas or the observation period begin time, whichever occurred last. similarly the shift end time was the time that the tag was last located outside staff only areas, or the observation period end time, whichever occurred first. this was done in order to account for tags that might be stored in a locker or left on a sweater draped over a desk chair when the staff member was not working. most importantly the number and duration of staff-staff contacts demonstrate the dangers of ill or infectious staff members at work. this finding is consistent with simulations conducted by others [ , ] . if transmission is related to a biological gradient of exposure as defined by magnitude and duration of contacts, then pathogens transmissible by droplets appear to have a higher likelihood of cross infection from working contagious staff (i.e. ''presentees'') to susceptible peers. in the case of annual influenza outbreaks, this finding underscores the importance of vaccination of healthcare employees assigned to the ed in order to prevent health care service interruption due to widespread staffing illness (i.e. absentees) since infectious humans may be shedding virus hours or days prior to developing symptoms. 
in the case of novel infectious diseases transmissible by droplets, it also underscores the risk for staff cross infection once one staff member has been infected. healthcare systems might consider keeping staff in droplet precautions even when not with symptomatic patients, as well as other efforts to reduce the likelihood of cross infection among staff in general. ed staff tend to look at their patients as high risk, while viewing other staff as "safe" unless symptomatic. however, implementation of infection-control measures is more difficult in the ed than in other hospital areas, since patients' conditions have not been identified upon arrival. ed staff are at higher risk for cross infection than personnel in other hospital areas [ ]. interestingly, the number of contacts per participant is not consistent with a power law distribution, indicating that our networks are not scale-free. this finding was unexpected a priori, since the absence of a power law has not been reported for other common types of networks, although it has been reported by gundlapalli et al. in a study of contacts in a pediatric ed [ ]. as gundlapalli and colleagues noted, patients and staff in particular do not associate in a manner that would be appropriate for a preferential attachment model that would give rise to a power law distribution: newly arriving patients are assigned to staff as staff discharge their current patients. importantly, mathematical models of cross infection in the ed that assume contact networks that are scale-free in nature may not describe the cross infection process correctly. five other studies are directly comparable to ours. polgreen and colleagues report results of shadow observations of staff over hour observation periods during one year throughout a large, academic, rural hospital [ ]. the authors found % of staff contact events were with other staff. our results are compatible with this observation. isella et al.
report the results of one continuous week of rfid determination of contacts between staff, patients, and visitors in a general pediatric hospital in rome, italy [ ] . there are too many substantial differences in study design (pediatric vs adult populations, observation continuous over one week vs hour shifts throughout one year, general hospital ward vs emergency department) to compare their findings to ours. lucet and colleagues implemented an rfid system in two french hospital wards (one infectious disease, one pulmonology) over a three-month period in order to determine contacts between staff and patients under respiratory precautions due to tuberculosis [ ] . implicit in these precautions is severe curtailment of the possibility of pp contacts, precluding direct comparison with our results. most recently hornbeck and colleagues report on the use of a mote-based sensor system to characterize the locations of staff in a -bed micu [ ] . data were collected for only seven days, and reported for only two days. only staff were given wearable badges, whereas fixed beacons were placed in patient rooms and in other commonly shared patient care areas (e.g., hallways, nurses' station). no patient-specific data were collected. there were staff present, on average, in each shift, in comparison to our observation of a median of staff per shift. in the four shifts reported, the typical staff had approximately (median) contacts, with sp contacts less than ss contacts and for both day and night shifts. contacts were short, typically less than minute (median), for both sp and ss contacts and for both day and night shifts. these were much fewer and much shorter than the contacts we observed. other than the difference in the purpose of the unit, another factor that might account for the differences between our observations and theirs might be the area of the unit footprint, which we could not determine. 
in the study with the most comparable setting and methods, gundlapalli and colleagues report data on contacts between patients and staff in a pediatric hospital ed collected over the course of a randomly chosen month [ ]. they constructed networks from existing clinical informatics resources, notably a proprietary patient flow management system as well as a locator system for which staff were given ir badges to wear. in this case, locations of staff were zoned, then merged with the patient flow system data to create a dataset describing ps interactions. the system did not cover the waiting room, in contrast to ours. each staff member had contact, on average, with patients, while each patient had contact on average with staff. they also delimited these data for one day, and the resulting network described interactions among staff and patients. the average contact duration per pair was . seconds, and the average number of contacts per participant was . . for both analyses, ss and pp interactions were not considered. the degree distribution was not consistent with a power law distribution, as we also observed. coupled with our observation, there may be some other forces at work here, which may play out in the cross infection process. certainly, as these authors note, there is not a preferential attachment model working for staff-patient assignments, as at least the initial contacts have more to do with the length of the queue of patients for which the staff are already caring. although the two hospital-based rfid studies are not directly comparable to ours, both studies report issues with participation and with the system similar to our experience. in the italian study there was excellent participation ( . %), but data from approximately half of the patients ( / ) and visitors ( / ) had to be excluded for technical reasons [ ]. among days of observation in the french study, fewer than half could be used in the analysis due to reasons similar to ours (i.e.
failure of staff to carry tag, technical issues) [ ] . rfid systems and other remote location sensor technologies will find greater applications in health care systems in the future, with increasing technical capabilities. we have demonstrated that social contacts in the ed can be quantified over long periods. this paper has barely tapped this rich data resource. in future papers we will explore the dynamics of the social networks we have characterized by relating our contact metrics with staff and patient characteristics as well as with the stages of patient care. our findings will also inform mathematical models and simulation studies to determine the potential risks of cross infection and the likelihood of the infection spreading to the rest of the hospital through an admitted patient cross infected in the ed. considering patient and staff interactions in the frame of a network opens up possibilities for major improvements not only in infection control, but also in facilities design and in work-and patient-flow management. 
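The degree distributions discussed above (number of distinct contacts per participant, compared against a power law) can be tabulated from a list of contact pairs. The sketch below is illustrative only: the pair list is hypothetical, and it computes just the empirical complementary cumulative distribution. A power law would appear as a roughly straight line when these points are plotted on log-log axes; a formal goodness-of-fit test is outside the scope of this sketch.

```python
from collections import Counter

def degree_by_participant(contact_pairs):
    """Degree = number of distinct other participants each person contacted."""
    unique_edges = {frozenset(pair) for pair in contact_pairs if pair[0] != pair[1]}
    degree = Counter()
    for edge in unique_edges:
        for person in edge:
            degree[person] += 1
    return dict(degree)

def degree_ccdf(degree_values):
    """Return [(k, fraction of participants with degree >= k)] for observed k."""
    values = list(degree_values)
    n = len(values)
    return [(k, sum(1 for d in values if d >= k) / n) for k in sorted(set(values))]

# Hypothetical contact pairs; ("c", "b") repeats the edge ("b", "c").
pairs = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "b")]
deg = degree_by_participant(pairs)
ccdf_points = degree_ccdf(deg.values())
print(deg)
print(ccdf_points)
```

Repeated contacts between the same two people collapse to a single edge here, matching the contact pair definition rather than the contact event definition.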
investigation of a nosocomial outbreak of severe acute respiratory syndrome (sars) in toronto, canada
hospital emergency departments: crowding continues to occur, and some patients wait longer than recommended time frames
pediatric emergency room visits: a risk factor for acquiring measles
measles transmission in health facilities during outbreaks
risk of infection following a visit to the emergency department: a cohort study
the growing role of emergency departments in hospital admissions
influenza: its impact and control
transmission of influenza a in human beings
sars reference
infectious diseases of humans
the origin of bursts and heavy tails in human dynamics
the scaling laws of human travel
dynamics of person-to-person interactions from distributed rfid sensor networks
analysis of a large-scale weighted network of one-to-one human communication
critical review and uncertainty analysis of factors influencing influenza transmission
a high-resolution human contact network for infectious disease transmission. proceedings of the national academy of sciences of the
high-resolution measurements of face-to-face contact patterns in a primary school
what's in a crowd? analysis of face-to-face behavioral networks
social network analysis for patient-healthcare worker interactions: implications for disease transmission
electronic sensors for assessing interactions between healthcare workers and patients under airborne precautions
close encounters in a pediatric ward: measuring face-to-face proximity and mixing patterns with wearable sensors
using sensor networks to study the effect of peripatetic healthcare workers on the spread of hospital-associated infections
the use of ranks to avoid the assumption of normality implicit in the analysis of variance
nonparametric and distribution-free methods for the social sciences
prioritizing healthcare worker vaccinations on the basis of social network analysis
responding to the severe acute respiratory syndrome (sars) outbreak: lessons learned in a toronto emergency department
drs. lowery-north and hertzberg had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. the authors would like to thank the staff of the emergency department of emory university hospital midtown. we would also like to thank the staff of emory university technology services and the rollins school of public health office of information services.
key: cord- -dquztf l authors: schoenmakers, anne; mieras, liesbeth; budiawan, teky; van brakel, wim h title: the state of affairs in post-exposure leprosy prevention: a descriptive meta-analysis on immuno- and chemo-prophylaxis date: - - journal: res rep trop med doi: . /rrtm.s sha: doc_id: cord_uid: dquztf l
objective: annually, over , people are diagnosed with leprosy, also called hansen’s disease. this number has been relatively stable over the past years. progress has been made in the fields of chemoprophylaxis and immunoprophylaxis to prevent leprosy, with a primary focus on close contacts of patients.
in this descriptive meta-analysis, we summarize the evidence and identify knowledge gaps regarding post-exposure prophylaxis against leprosy. methods: a systematic literature search according to the preferred reporting items for systematic reviews and meta-analyses (prisma) methodology was conducted by searching the medical scientific databases cochrane, embase, pubmed/medline, research gate, scopus and web of science on jan. , , using a combination of synonyms for index terms in four languages: “leprosy” and “population” or “contacts” and “prevention” or “prophylaxis.” subsequently, infolep.org and google scholar were searched and the "snowball method" was used to retrieve other potentially relevant literature. the found articles were screened for eligibility using predetermined inclusion and exclusion criteria. results: after deduplication, , articles were screened, and articles were included in this descriptive meta-analysis. immunoprophylaxis by bacillus calmette-guérin (bcg) vaccination is known to provide protection against leprosy. the protection it offers is higher in household contacts of leprosy patients compared with the general population and is seen to decline over time. contact follow-up screening is important in the first period after bcg administration, as a substantial number of new leprosy patients presents three months post-vaccination. evidence for the benefit of re-vaccination is conflicting. the world health organization (who) included bcg in its guidelines for the diagnosis, treatment and prevention of leprosy by stating that bcg at birth should be maintained in at least all leprosy high-burden regions. literature shows that several vaccination interventions with other immunoprophylactic agents demonstrate similar or slightly less efficacy in leprosy risk reduction compared with bcg. however, most of these studies do not exclusively focus on post-exposure prophylaxis. 
two vaccines are considered future candidates for leprosy prophylaxis: mycobacterium indicus pranii (mip) and lepvax. for chemoprophylaxis, trials were performed with dapsone/acedapsone, rifampicin, and rom, a combination of rifampicin, ofloxacin, and minocycline. single-dose rifampicin is favored as post-exposure prophylaxis, abbreviated as sdr-pep. it demonstrated a protective effect of % in the first two years after administration to contacts of leprosy patients. it is inexpensive, and adverse events are rare. the risk of sdr-pep inducing rifampicin resistance is considered negligible, but continuous monitoring in accordance with who policies should be encouraged. the integration of contact screening and sdr-pep administration into different leprosy control programs was found to be feasible and well accepted. since , sdr-pep is included in the who guidelines for the diagnosis, treatment and prevention of leprosy. conclusion: progress has been made in the areas of chemoprophylaxis and immunoprophylaxis to prevent leprosy in contacts of patients. investing in vaccine studies, like lepvax and mip, and increasing harmonization between tuberculosis (tb) and leprosy research groups is important. sdr-pep is promising as a chemoprophylactic agent, and further implementation should be promoted. more chemoprophylaxis research is needed on: enhanced medication regimens; interventions in varying (epidemiological) settings, including focal mass drug administration (fmda); specific approaches per contact type; combinations with screening variations and field-friendly rapid tests, if available in the future; community and health staff education; ongoing antibiotic resistance surveillance; and administering chemoprophylaxis with sdr-pep prior to bcg administration. 
additionally, both leprosy prophylactic drug registration nationally and prophylactic drug availability globally at low or no cost are important for the implementation and further upscaling of preventive measures against leprosy, such as sdr-pep and new vaccines.
leprosy is a communicable disease caused by mycobacterium leprae (m. leprae). it can result in disabilities, disfigurements, blindness, and internal organ problems. it is estimated that million to million people are living with leprosy-related disabilities, which cause severe socioeconomic consequences, including stigma and poverty. [ ] [ ] [ ] leprosy is one of the oldest diseases known to mankind and was once endemic on all continents. today, the disease exists primarily in resource-poor countries with often warmer climates and it is considered a neglected tropical disease (ntd). in , the world health organization (who) reported a total of , new leprosy patients worldwide, a number that has been relatively stable in the past decade. m. leprae was discovered, as the first bacterium that caused disease in people, by the norwegian doctor armauer hansen in . almost years after hansen's discovery, modern science still has not succeeded in growing the bacterium in vitro; it grows in humans, in nine-banded armadillos, and in immunocompromised mice. both people and nine-banded armadillos are able to transmit m. leprae to humans. the infection mechanism of leprosy is not fully understood, but it is generally thought to spread via the respiratory route. the average incubation time of the disease is years, but it can take up to years before symptoms appear. during this period, a leprosy patient is contagious. ongoing transmission is implied by the fact that almost % of global new leprosy cases are children. whether colonization with m. leprae leads to infection and disease depends on the host's resistance and genetics, as well as environmental factors. up to % of people exposed to m.
leprae do not develop the disease. as mentioned, most people who do develop the disease live in resource-poor settings; poor living conditions (eg, insufficient food availability, pollution, lacking health care systems and secondary chronic psychological stress) can negatively affect immune function. for transmission, prolonged contact with an untreated patient is considered necessary. the genetic and physical distance to a leprosy patient are independently associated with the risk of developing clinical disease. hence, the risk for developing leprosy is increased not only in household contacts, but also in neighbors and social contacts. in the late s, the antibiotic dapsone was the first breakthrough in leprosy treatment. treatment often lasted a lifetime, which challenged compliance and fostered dapsone resistance in the s. when clofazimine and rifampicin were discovered in the s and s, these drugs were combined with dapsone, later labeled as multidrug therapy (mdt). mdt has been provided free of charge since via who. within a few days after starting mdt, patients are no longer considered to be contagious. the first focus on leprosy prevention started shortly after the discovery of m. leprae. compulsory isolation for persons affected by leprosy, mainly in leprosy colonies, was internationally promoted from the th century onward as one of the few existing prevention methods. healthy-appearing children were frequently separated from their parents with clinical leprosy. at the international leprosy congress in japan, isolation was finally labeled as outdated and inhumane, also because dapsone as a treatment option became available. when mdt was introduced, it was expected that early detection and prompt treatment would break the transmission chain of m. leprae. however, the incidence decline was slower than anticipated.
this implies that there could be an accumulation of people with leprosy in communities who remain undiagnosed and/or untreated. these "missing millions" contribute to further transmission of the disease. consequently, in the global leprosy strategy - , who determined that early case detection and targeted active case finding among high-risk groups are key to controlling leprosy and averting disabilities. at the same time, intensified population-based approaches for case detection are no longer considered cost-effective in all contexts. the risk of leprosy exposure in general populations is usually low, and, as stated, geographically closer contacts of patients are more likely to develop the disease. [ ] [ ] [ ] therefore, more targeted prevention methods were considered necessary to stop transmission. over the past years, progress has been made in the fields of chemoprophylaxis and immunoprophylaxis to prevent leprosy, with a primary focus on close contacts of patients. in this article, we present a descriptive meta-analysis on leprosy post-exposure prophylaxis. the aim of this review is to summarize the evidence on post-exposure prophylaxis for leprosy and to identify knowledge gaps and topics for future studies.
a descriptive meta-analysis method was chosen. a systematic literature search was conducted using six electronic medical databases: cochrane, embase, pubmed/medline, research gate, scopus, and web of science (table ). the search was performed using a combination of synonyms of the following index terms: "leprosy" and "population" or "contacts" and "prevention" or "prophylaxis," with a query translated into english, french, spanish, and portuguese. when possible, the search was limited to the title/abstract search fields in the search engines. medical subject headings (mesh) terms were added to the search query if available. in addition, an article search via infolep.org and google scholar was performed.
infolep is the international knowledge center for information resources on leprosy. its online database also includes "grey literature", such as meeting reports and theses. bibliographies of the relevant articles were screened for additional articles missing in the existing yield ("snowball method"). the articles were deduplicated, and subsequently a content review was performed, first in titles and abstracts and second in full texts. uncertainties in the selection were discussed with a second researcher until consensus was reached. regarding inclusion criteria, all human studies as well as meeting reports, expert opinions, editorials, and other relevant articles were included if methods for post-exposure chemoprophylaxis and post-exposure immunoprophylaxis or vaccines against leprosy were discussed. exclusion criteria were as follows: (a) studies in languages other than english, french, spanish, or portuguese; (b) studies for which no full text was available; (c) studies that focused predominantly on leprosy treatment instead of prevention; (d) studies that focused solely on primary prevention interventions (eg, primary vaccination in newborns); (e) studies that solely assessed immunological status changes after vaccination without assessing clinical infection; (f) study protocol descriptions or preliminary reports of clinical trial data that were later similarly described in more complete publications; and (g) articles in which no new information was discovered (data-saturation). every step in the search phase was documented according to the preferred reporting items for systematic reviews and meta-analyses (prisma) literature review methodology. after the search, a data and text mining process was performed.
the authors, year of publication, and study characteristics (ie, study design, population, period, geographic area, intervention type, results) were extracted. the search in the six medical literature databases on january , (table ) and the additional search via infolep.org, google scholar and articles' bibliographies resulted in , articles after deduplication (figure ). a total of articles included relevant information for this descriptive review on leprosy post-exposure prophylaxis. the bacillus calmette-guérin (bcg) vaccine is a live attenuated vaccine prepared from a strain of mycobacterium bovis, derived from a tuberculous cow. the first bcg vaccine, originally developed against tuberculosis (tb), was produced by albert calmette and camille guérin in at the pasteur institute. the influence of bcg on the lepromin reaction, described by josé fernández in , indicated that bcg might also provide leprosy protection. currently, over substrains of bcg are being manufactured. approximately million newborns each year receive the bcg vaccine, mainly as primary protection against tb. bcg is contraindicated in immunocompromised persons and pregnant women. about % of the people who receive the bcg vaccine experience a reaction at the injection site, often leaving a superficial scar. the protective effect against leprosy of bcg, usually given as primary tb prevention to newborns, varies widely in different countries. observational studies demonstrated a larger protective effect against leprosy, possibly caused by the shorter follow-up periods, compared with experimental trials ( % versus %). long follow-up periods are needed because of the long incubation period of leprosy. the meta-analysis of merle et al in stated that primary bcg vaccination in newborns is effective in reducing the risk of leprosy by % ( % ci ). the overall protection of bcg vaccination against leprosy in this analysis ranged from % to %.
similarly to tb, the protection by bcg against leprosy is most evident in children. , who's guidelines for the diagnosis, treatment and prevention state that bcg at birth should therefore be maintained in at least all leprosy high-burden regions. the heterogeneity in the protective effect is thought to have multiple causes, including the use of different bcg vaccine strains and batches; varying study populations (eg, immunogenic characteristics, age groups); different time periods and follow-up schedules; varying geographical settings regarding environmental mycobacteria variations and m. tuberculosis; differences in leprosy burden; variations in clinical experience of health staff; and discrepancies in study methodology. , , , , , [ ] [ ] [ ] [ ] additionally, the protective effect of bcg in leprosy may decrease over time. when focusing on bcg as a post-exposure immunoprophylactic agent for leprosy contacts, the number of studies is relatively limited. , [ ] [ ] [ ] [ ] [ ] [ ] in , stanley et al described a randomized controlled trial in which , ugandan children, who were contacts of leprosy patients, were included between and (table ) . the children did not show clinical symptoms and scored negative or weakly positive at the tuberculin test. one group received bcg vaccination; the control group was not vaccinated. the children were monitored for up to four times, over an average period of eight years. in the bcg vaccinated group, children developed leprosy; in the control group, developed the disease. the protective effect against leprosy was % ( % ci - ) and did not depend upon vaccination age, gender, or grades of physical contact and genetic relationship with a patient. a small decline in protection seemed to appear at eight years follow-up, with a leprosy incidence reduction of %. lwin et al published a randomized controlled trial performed in myanmar (burma) in (table ) . a group of , children aged - years received bcg; , ( . 
%) of these children were household contacts of leprosy patients. the control group counted , children, including , ( . %) household contacts. the inclusion period ranged from to . annual follow-up assessments were performed until - . the overall protective effect of bcg was . % ( % ci - ). the leprosy incidence in the vaccinated group was . % for females and . % for males, compared to . % and . % for controls, respectively. a higher protective effect was found in younger children. bcg protection showed to be independent of the initial tuberculin status. a protective effect of . % was seen in contacts of lepromatous/borderline patients, of . % in contacts of patients with other leprosy forms, and of . % among non-contacts. weaknesses in this study are that two strains with varying bacillary count were used, resulting in varying protection rates. in addition, the described total number of contacts in this study was relatively low, and the number of follow-up moments varied ( - times) . furthermore, non-contacts may have become contacts during follow-up. in brazil in , düppre et al published a cohort study on the effectiveness of bcg vaccination among contacts of , leprosy patients between and ( table ). of the , leprosy patient contacts who received bcg vaccination, , people already had a bcg scar, suggesting previous vaccination. the unvaccinated group counted , contacts, of whom , already had a bcg scar. the protection against leprosy from the bcg post-exposure vaccination was found to be % ( % ci - ). for those without a primary scar, protection reached % ( % ci - ); for those with a primary scar, protection was % ( % ci . it was stated that bcg protection in contacts was not substantially affected by previous bcg vaccination. the study strongly supported the routine bcg vaccination of leprosy contacts regardless of previous vaccination. 
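the protective effects quoted for these trials are, in their simplest form, one minus the ratio of the leprosy risk in the vaccinated group to the risk in the control group. the sketch below illustrates only that arithmetic, under the assumption of a simple cumulative-incidence comparison; the cited studies may have used person-time denominators or adjusted models. the function name `protective_effect` and all counts are hypothetical and are not taken from any study in this review.

```python
def protective_effect(cases_treated, n_treated, cases_control, n_control):
    """Return the protective effect as a fraction: 1 - relative risk.

    Uses crude cumulative incidence in each group; this is a simplification
    and ignores follow-up time, clustering, and confounding.
    """
    risk_treated = cases_treated / n_treated
    risk_control = cases_control / n_control
    if risk_control == 0:
        raise ValueError("control group has no cases; relative risk is undefined")
    return 1.0 - risk_treated / risk_control


# Hypothetical counts for illustration only.
pe = protective_effect(cases_treated=10, n_treated=1000,
                       cases_control=25, n_control=1000)
print(f"protective effect: {pe:.0%}")
```

a value near zero indicates no measurable protection, and a negative value would indicate more cases in the vaccinated group than in the controls.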
remarkably, during the first - months, new leprosy cases- vaccinated ( without a primary bcg scar) and seven unvaccinated-were detected. tuberculoid forms predominated. of these new cases, ( %) had a multibacillary leprosy index case in their family. however, misclassification of contact type may have led to residual confounding. furthermore, this study stated that no solid reason exists to doubt the disease classification data and bcg exposure, while this is debatable when examining other studies. , , moreover, this study assumed that the contacts who did not return after the initial examination were healthy, which may not always have been the case. merle et al found that bcg efficacy in studies focusing on household contacts was significantly higher compared with studies targeting the general population ( % and %, respectively). but it was mentioned that the risk of misclassification of cases and controls should be considered given the long incubation time of leprosy. included studies were found to be heterogeneous, caused by eg, varying study designs and some study groups receiving multiple doses of bcg. studies including bcg revaccination only or studies that did not (also) estimate the efficacy of one post-exposure dose of the bcg vaccine were not included in this metaanalysis. literature states that several vaccination interventions show similar or slightly less efficacy in reducing the risk of leprosy compared to bcg, but few studies focus primarily on post-exposure prophylaxis. sharma et al performed a double-blind trial on the mw vaccine in india in the s that did include household contacts of leprosy patients. when contacts only were vaccinated, the mw vaccine showed a protective efficacy of . % at the end of the -year follow-up period, % at years, and . % at years follow-up. 
when both patients and contacts received the mw vaccine, the observed protective efficacy was % at years, % at years, and % at years, with significance found at the first two survey moments (p < . ) and in a lesser degree at the third survey moment (p < . ). however, this study stated that early post-vaccination cases detected within year of administering the vaccine were not included in the analysis, as these vaccine recipients were thought to be harboring the infection. this is a major limitation and makes the results of this trial difficult to interpret. truoc et al published a study on bcg alone, bcg plus killed mycobacterium vaccae (m. vaccae or mv), and killed mv alone. the study included young people, - years old, living in close contact with leprosy patients in in ho chi minh city, vietnam. at least twice a year, signs of leprosy were routinely sought. the study was non-randomized, and intervention allocation in this study was also doubtful: for example, children who did not attend the initial examination moment were chosen as the control group. , over the entire years, of ( . %) unvaccinated controls and of ( . %) vaccinated children developed leprosy (p < . ), showing a protective effect of . % without significant differences among the three vaccines. the small sample size and uncertainties in scar-reading could potentially further have influenced the outcome of this study. , , promising vaccines at the moment, besides bcg, two vaccines are considered potential candidates for leprosy prophylaxis: the mw vaccine, developed in india, and lepvax, developed in the united states. [ ] [ ] [ ] [ ] [ ] because of the described positive protective efficacy of mw, this cultivable, non-pathogenic mycobacterium was selected for further development. it has been sequenced and is now named mycobacterium indicus pranii (mip) to avoid confusion with m. tuberculosis-w. , mip expedites bacterial clearance and shortens the recovery time in leprosy patients. 
[ ] [ ] [ ] furthermore, the addition of mip vaccine as an immunomodulator to mdt in leprosy patients leads to speedier attainment of slit-skin smear test negativity, and it seems to have a positive effect on leprosy reactions from six months onwards. , the mip vaccine has received approval from both the drugs controller general of india and the american food and drug administration (afda) and is now being manufactured as immuvac/cadi- by cadila pharmaceuticals. it also seems effective in other conditions, such as tb and warts. in , the american leprosy missions (alm) partnered with the infectious disease research institute (idri) in seattle to start a leprosy vaccine development trajectory. lepvax was developed to provide both effective preexposure and post-exposure prophylaxis against m. leprae infection. it was shown to be safe. a decrease and/or delay of neuropathy caused by m. leprae in nine-banded armadillos was also observed, making lepvax promising as immunotherapy. in august , the vaccine was approved by the afda. lepvax testing in humans is ongoing. when dapsone was the sole treatment of choice for leprosy, people sought ways to protect contacts of leprosy patients. the idea to introduce post-exposure prophylaxis (pep) was based on the hypothesis that close contacts of leprosy patients have already been infected by m. leprae by the time the patient is diagnosed and that post-exposure chemoprophylaxis would prevent the contacts from developing the clinical disease. the first reports on chemoprophylaxis trials, using dapsone, in india and seoul were published in the s. [ ] [ ] [ ] these first results suggested that weekly or biweekly doses of dapsone given to contacts of leprosy patients for a longer duration (months to years) had a protective effect against leprosy. in subsequent years, this was confirmed by additional studies in other contexts. 
[ ] [ ] [ ] [ ] [ ] [ ] in the s, several studies were published on the use of diacetyldapsone (dadds) or acedapsone, a long-acting repository sulphone given by injection. the administration intervals of approximately weeks added to the operational feasibility; also, no toxicity was found. [ ] [ ] [ ] neelan et al found that even a short duration of chemoprophylactic treatment with acedapsone through a series of three injections provided protection after four years of followup. smith and smith conducted a meta-analysis in to quantify the efficacy of dapsone chemoprophylaxis against leprosy. they concluded that chemoprophylaxis is an effective method to reduce the leprosy incidence. they also stated that it is more cost-effective in household contacts than when administered to entire communities. they advised more research on simple, single-dose regimens. this could improve compliance and feasibility in more remote settings. in , a workshop on leprosy prevention with international experts was held in the federated states of micronesia. during this workshop, studies were presented introducing new regimens comprising rifampicin alone or rifampicin combined with ofloxacin and minocycline (rom). an important improvement of these new regimens was that they were given either as a singledose or were repeated once, as compared to dapsone that was given repeatedly for several years. the first experience with rifampicin as chemoprophylaxis was gained in in french polynesia on the south marquesas islands, involving the screening of , inhabitants and administering mg/kg single dose rifampicin (sdr) to , inhabitants ( . %) ( table ) and also south marquesan "emigrants" with their families. , during the four-year follow-up, a decrease in the new case detection rate of % was observed. against a background of an already declining incidence after mdt introduction, the intervention's efficacy was estimated to be %. 
in addition, experiences were reported with sdr or the regimen rom as chemoprophylaxis, aiming at a decrease in the new case detection rate in the federated states of micronesia, kiribati, and the republic of the marshall islands. no serious side effects were recorded. some concerns about the complexity of the intervention and the costs involved in chemoprophylaxis against leprosy were discussed. a plea was made for a simple regimen to target high-risk groups, maximizing the cost-benefit ratio. it was also acknowledged that more operational research was needed to better define the populations that could benefit the most. in the years after the workshop, more publications provided evidence on the effect of chemoprophylaxis, but it was also recognized that the studies were not placebo controlled and/or could not distinguish between the effect of the chemoprophylaxis and the effect of intensified case detection interventions. cuba, a low-endemic country, was alone in starting early with nationwide implementation of sdr-pep in .

the colep study

moet et al conducted a large cluster-randomized controlled trial in bangladesh, the colep study, to determine the effectiveness of sdr in preventing leprosy in close contacts of leprosy patients (table ). the study focused on household contacts, neighbors, and social contacts, because these contacts were identified as most at risk. approximately , contacts received sdr, while another , received placebo. exclusion criteria were as follows: age < years, pregnancy, liver or renal disease, signs or symptoms of leprosy or tb, previous rifampicin intake, refusal, or residing only temporarily in that area. the study showed that sdr given to contacts of new leprosy patients reduced their risk of developing clinical leprosy by % ( % ci ). the effect was more pronounced in more distant contacts and less in blood-related family.
as part of the colep study, the cost-effectiveness of using sdr to prevent leprosy was assessed to be good at all contact levels. the incremental cost-effectiveness ratio (icer) was us$ per one prevented leprosy patient. overall, rifampicin administration to screened contacts of leprosy patients was shown to be affordable and feasible. , an additional post hoc finding during the colep study was that rifampicin given to contacts who have had a childhood bcg vaccination, had a protective effect of % ( % ci - ). in another colep substudy, the acceptability of sdr was assessed among healthy community members. all study participants expressed that they would be willing to take chemoprophylaxis, even after it was explained that full protection against leprosy was not guaranteed. however, many participants also expressed that they would not like to share information on their disease status with neighbors and social contacts if they would have leprosy. in follow-up of the colep study another workshop on the use of chemoprophylaxis against leprosy was held in , in the netherlands. here, the preliminary colep data were presented as well as results from another controlled trial using rifampicin in two repeated doses on a group of five islands in the flores sea in indonesia (table ) . , this trial showed similar results to the colep study, with a risk reduction of about % on the island where the total population was screened and received two dosages of rifampicin when eligible. , preliminary data of a randomized controlled chemoprophylaxis trial in india, the results of which were never individually published, were presented by declercq in this workshop. sdr given to household contacts was compared with placebo and showed a % risk reduction after - years follow-up. during the workshop, concerns were raised regarding the risk of inducing antibiotic resistance when using rifampicin on a large scale. 
ji addressed these concerns with the argument that resistance is very unlikely to emerge because sdr is proven not to select resistant mutants, as shown in publications on multibacillary patients who relapse after sdr treatment. , experts agreed that the use of a single dose as chemoprophylaxis was to be preferred, that powerful bactericidal activity was a necessity, and that adverse events risk should be minimal. several drugs in the rifamycin and quinolone group were considered suitable candidates for leprosy chemoprophylaxis, though the experts agreed that clinical trials were needed to provide evidence of their effectiveness and that drug prices should be considered. the following requirements were formulated for operational research programs: (a) screening of contacts for signs and symptoms of leprosy and tb; (b) provision of chemoprophylaxis under direct supervision; (c) a recording and reporting system; (d) training of health workers; (e) proper information for people receiving chemoprophylaxis; and (f) a system for antibiotic resistance monitoring. in , an updated meta-analysis of randomized controlled trials of leprosy chemoprophylaxis was published by reveiz et al. they confirmed the review by smith and smith, concluding that the use of rifampicin and acedapsone/ dapsone as chemoprophylaxis can help reduce leprosy incidence and that post-exposure chemoprophylaxis should become embedded in leprosy control programs. , in the same year, the who south-east asia regional office addressed the need for further research, based on an information consultation in the united kingdom. in , the technical commission of the international federation of anti-leprosy associations (ilep) published a review of leprosy research evidence ( - ) and implications for current policy and practice. 
they strongly recommended additional research on chemoprophylaxis implementation to evaluate acceptability, cost-effectiveness, feasibility, and ethical issues concerning disease disclosure and new regimens. after cuba, sdr chemoprophylaxis against leprosy was introduced in morocco in , another low-endemic country. there, leprosy detection declined over the years, showing an annual percentage reduction in leprosy detection rate of . to better understand the long-term effects of chemoprophylaxis on leprosy, several modeling studies have been performed using the mathematical simcolep model developed by erasmus mc. using this model, fisher et al demonstrated that leprosy incidence would be substantially reduced in a context similar to the colep study location (bangladesh) by a combination of childhood bcg and chemoprophylaxis for screened household contacts. de matos et al predicted a declining new case detection rate in para state in brazil, combining contact screening with chemoprophylaxis. mathematical modeling by gilkison et al for kiribati, a high leprosy burden pacific island, also suggested that leprosy incidence can decrease when using an intensive chemoprophylaxis approach. the model predicted the best results for the intensive chemoprophylaxis approach that combined a household contact approach with more than one round of mass sdr in consecutive years rather than every second year. an expert meeting on chemoprophylaxis against leprosy in in switzerland re-emphasized the importance of operational research on sdr-pep. some lessons learned from a pilot project implementing sdr-pep in sampang district, indonesia, were presented by dandel. he stated that sdr-pep implementation requires: (a) ongoing support and supervision; (b) strong local ownership; (c) continuous motivation of healthcare workers; and (d) adequate loose rifampicin supply routes. stakeholders received the intervention positively.
it was also seen as an opportunity to present health education, but stigma caused index patients to hesitate in disclosing their disease status to people outside their household. nevertheless, an average of contacts per index patient were included in the first year (budiawan, unpublished report). during the expert meeting, mahotarn presented results of another randomized, placebo-controlled trial with sdr conducted in thailand with a similar set-up as the colep study. , after years, the relative risk in the rifampicin group was . when compared with controls. this was similar to the difference observed in the colep trial, though not statistically significant (p = . ) because the study was not powered to detect risk reductions below %. , the expert meeting identified several important factors for the success of interventions that target contacts of leprosy patients (table ) . the meeting stated that high-endemic pockets would most likely benefit more from a mass drug administration (mda) or "blanket" approach than the contactbased strategy. in such settings, the entire community may be considered contacts. the experts recognized that pep implementation against leprosy should not be limited to high-endemic settings because low-endemic areas are likely to show a high clustering level. richardus et al found that when the endemicity of leprosy declines, a gradually higher proportion of new patients is found amongst the contacts of known cases. this aligns with field observations in west java, where leprosy endemicity varies; in the lowendemic districts, the proportion of new cases detected through contact screening is relatively high (budiawan, unpublished report). the need for more research on the feasibility of implementing sdr-pep, including its cost-effectiveness and acceptability in different geographical and sociocultural environments, was recognized. 
this and the urge to learn how to operationalize this innovative intervention led to the development of the lpep program. in , a multicountry research project was developed involving ministries of health, international non-governmental organizations (ngos), scientific institutes, leprosy experts, and a donor (table ). the lpep program, as an operational study, aimed to accelerate the uptake of evidence on sdr-pep effectiveness by gathering data on the impact and feasibility of implementing sdr-pep as part of routine leprosy control. the program was implemented in high-endemic areas in india, brazil, indonesia, myanmar, nepal, tanzania, and sri lanka. cambodia was also included, but followed a different study protocol. before implementation, an expert meeting was convened to address concerns regarding the risk of inducing rifampicin resistance in tb. the experts concluded that sdr given to contacts of leprosy patients, in the absence of symptoms of active tb, poses a negligible risk of generating resistance in m. tuberculosis in individuals and populations. two years into the lpep program, the integration of contact screening and sdr-pep administration into different leprosy control programs had proved to be feasible and well accepted by all stakeholders. the program was reported to have invigorated leprosy control. after three years, approximately , contacts of recently diagnosed leprosy patients had been screened. only % were not eligible for sdr-pep, because of young age (< years or < years, depending on the country), liver or renal disease, signs and symptoms of leprosy or tb, pregnancy, known allergy to rifampicin, rifampicin use in the past years, or refusal (< %). sdr-pep was found to be safe; adverse events were very rarely reported (richardus et al, full article under publication).
mathematical modeling predicted that implementing sdr-pep could potentially accelerate the reduction of new cases and, therefore, could potentially accelerate the cessation of m. leprae transmission (blok et al, under publication) . it is also estimated to be cost-effective when assessing sdr-pep in the indian health system (us$ , per new leprosy case averted). demonstrating the operational feasibility of integrating sdr-pep into leprosy programs on such a large scale in multiple settings has contributed to who recommending sdr-pep in its guidelines for the diagnosis, treatment and prevention of leprosy. as part of the lpep program, two specific approaches for implementing sdr-pep were piloted. one was a blanket approach in lingat village on selaru island, a remote highendemic village in indonesia. the blanket approach involved total population screening in (n = ). during two visits in consecutive years, , inhabitants were screened ( %), and , ( %) received sdr-pep. during these visits, new leprosy patients were diagnosed with leprosy ( , / , ). during the third visit, new leprosy cases in , screened persons ( / , ) were detected. the other approach made use of "drives" and was implemented in cambodia, a low-endemic country. the drives in cambodia were carried out by a mobile team with leprosy experts who screened the contacts of all patients diagnosed in the previous years, and administered sdr-pep. in the first four operational districts, contacts of index patients were traced and screened, ( %) received sdr-pep, and four new leprosy patients were identified ( / , ). both the blanket approach and the drive approach required sufficient resources and thorough logistic preparation, but were operationally feasible. regular monitoring is required to identify the long-term impact. an lpep substudy in india, indonesia, and nepal assessed how the intervention changed the perception of main stakeholders regarding leprosy. 
the lpep program was perceived positively, though more research was recommended on providing accurate and understandable health information to contacts, and on approaches that do not require disclosure of the index patient. the lpep program also had a positive effect on people's knowledge regarding leprosy (mieras et al, unpublished report). another side study was an acceptability study carried out in india. the results of this study aligned with the results of the main study and the perception study, illustrating that contact screening and sdr-pep distribution in dadra and nagar haveli, india, were well accepted by the stakeholders. , [ ] [ ] [ ] the lessons learned from the lpep program led to a recommended minimal set of data required to monitor contact tracing activities and sdr-pep administration for leprosy control in a routine setting. based on the experiences gained from the lpep program, an sdr-pep toolkit was developed to support national leprosy program managers concerned with the practical aspects of implementing sdr-pep, such as advocacy, training, and reporting and recording. in , gillini et al collected information to determine the extent of leprosy post-exposure immuno-and chemoprophylaxis in countries around the world as national policy. a total of countries responded, representing % of the total reported global leprosy burden. nationwide routine implementation of sdr-pep was reported by only a few countries with a low burden, such as cuba ( ), morocco ( ), and samoa ( ). since sdr-pep is addressed in the who guidelines for the diagnosis, treatment and prevention of leprosy, more countries are implementing sdr-pep in their national leprosy control programs. for example, in india and indonesia, the ministries of health adopted sdr-pep as a main strategy in leprosy prevention in and , respectively. 
this year, who is expected to publish a technical guidance document with a description of the steps that could be taken for the implementation of both contact tracing and post-exposure prophylaxis (who, under publication). the availability of preventive measures contributed to a new momentum to work toward stopping the transmission of leprosy. lwin et al concluded that bcg vaccination alone is not likely to be the solution for leprosy control. as described, schuring et al performed a secondary analysis on the colep trial. the individual protective effects of bcg, % ( % ci - ), and sdr-pep, % ( % ci - ), resulted in a combined protective effect of % ( % ci - ). after the findings of the colep trial, a single-center, cluster-randomized controlled trial (maltalep) was conducted to determine whether possible excess cases in the first year after immunoprophylaxis (with bcg) could be prevented with chemoprophylaxis (with sdr) without affecting bcg's protective effect. in the intervention group (n = , ), bcg vaccination was followed by sdr after - weeks. controls (n = , ) received only bcg. the combined preventive outcome was expected to be long-lasting and better than the effect of solely bcg or sdr-pep. however, it was found that one third of all new leprosy cases among contacts had appeared - weeks after bcg vaccination, before sdr-pep was administered. this made it impossible to determine whether excess cases in the first year after immunoprophylaxis (with bcg) could be prevented with sdr-pep without affecting the protective effect of bcg. this increase of cases after immunoprophylaxis administration in the first year is consistent with other studies. therefore, richardus et al concluded that bcg vaccination followed by sdr as a routine intervention is not recommended in leprosy control.
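as a rough aid to reading the combined protective effect of bcg and sdr-pep reported from the colep secondary analysis, the sketch below shows how two protective effects would combine if they acted independently, so that the residual risks multiply. this independence is an assumption made here for illustration; the combined estimate in the trial was derived from the trial data, not from this formula. the function name `combined_protection` and the input values are hypothetical.

```python
def combined_protection(*effects):
    """Combine protective effects (fractions in [0, 1]) assuming independence.

    Under independence the residual risks multiply:
    combined = 1 - (1 - e1) * (1 - e2) * ...
    """
    residual = 1.0
    for effect in effects:
        if not 0.0 <= effect <= 1.0:
            raise ValueError("each protective effect must be a fraction between 0 and 1")
        residual *= 1.0 - effect
    return 1.0 - residual


# Hypothetical protective effects, not the values reported in the trial.
combined = combined_protection(0.5, 0.5)
print(f"combined protective effect: {combined:.0%}")
```

if an observed combined effect is clearly lower or higher than this product rule suggests, the two interventions may interact rather than act independently.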
in a time in which a new infectious disease, coronavirus disease (covid- ), dominates the news, it is important to also keep paying attention to one of the oldest diseases known to mankind. today, leprosy still has devastating outcomes, mainly affecting the most marginalized. preventive methods are needed to stop the transmission of this disease. globally, bcg vaccination is mainly used as the primary tb prevention method in newborns. as stated, in its guidelines for the diagnosis, treatment and prevention of leprosy, who recommends only that bcg at birth should be maintained in at least all leprosy high-burden regions. nevertheless, since the early s, the brazilian ministry of health has recommended intradermal bcg for household contacts of leprosy patients. contacts with no or one primary scar receive bcg; contacts with two bcg scars do not receive another bcg dose. however, scar reading is not fully reliable, and scar formation does not always occur. for example, - % of bcg vaccinated individuals do not develop a scar. in infants vaccinated below the age of month, which is common when bcg vaccination is routinely given to newborns (eg, in brazil), fewer than % have a recognizable scar at the age of . furthermore, the evidence regarding the effectiveness of bcg revaccination is conflicting, because of the wide variation in study outcomes. when assessing other immunoprophylactic agents, bcg plus heat-killed m. leprae was found not to have an additional effect in leprosy prevention when compared with bcg alone. even if bcg plus heat-killed m. leprae provided additional protection, further development of a vaccine containing killed m. leprae is challenging because mass production would need to occur in armadillos or in immune-compromised mice. studies on the icrc and m. vaccae vaccines were limited, rarely focused solely on post-exposure prophylaxis, and were often of debatable quality.
it is therefore not entirely clear if these two immunoprophylactic agents should be fully ruled out when trying to prevent leprosy transmission. more research is also needed on newly developed vaccines like lepvax as well as on the vaccine mip (mw). furthermore, the introduction of new tb vaccines, possibly replacing bcg, could have a serious impact on leprosy. the number of vaccine studies is much greater for tb than for leprosy, consistent with the incidence numbers. despite the possibility of cross-protection between the two diseases, the potential impact on leprosy by tb vaccine candidates is rarely considered in tb research. only two recombinant subunit tb vaccines (id /gla-se and id /gla-se) have also been laboratory tested for their potential use against leprosy. more integration and harmonization between the tb and leprosy vaccine research groups would therefore be valuable. additionally, after the administration of bcg and other immunoprophylactic agents for leprosy prevention, a relative increase in the number of new (paucibacillary) leprosy patients is seen in the first follow-up year. a possible explanation for this is that bcg is catalyzing the existing anti-mycobacterial immunity in people infected with m. leprae, resulting in the clinical appearance of tuberculoid leprosy after bcg vaccination. gormus and meyers described the reasoning for the doses chosen in trials with integral mycobacterial vaccines as arbitrary. they argue that dosages may be too high, resulting in the protection of some, but also resulting in increased susceptibility to leprosy in other individuals shortly after vaccination. an individual's prior exposure to environmental mycobacteria may also affect the outcome of mycobacterial vaccines. additional research with varying dosages and in different target groups is recommended.
richardus et al concluded that bcg vaccination followed by sdr as a routine intervention is not recommended in leprosy control. the maltalep study had insufficient statistical power to detect a significant effect. a well-powered study focusing on the reversed maltalep order (first sdr administration, followed by bcg) would therefore be of great value. considering the above results, contact screening follow-up is especially important in the first period after bcg administration, as a relatively large number of new patients presents three months post-vaccination. this may also hold for other immunoprophylactic agents. as discussed, many immunoprophylaxis studies lacked power. more well-designed, sufficiently powered and long-lasting (regarding the leprosy incubation period) studies in this field are therefore suggested. but, given the evidence for the effectiveness of sdr-pep and the who guidelines for its use, the ethics of testing new post-exposure immunoprophylactic approaches for leprosy prevention without combining them with chemoprophylaxis in both the intervention and control group needs to be discussed.
chemoprophylaxis
the promising findings of the colep study, the systematic reviews, and the workshop recommendation on additional, operational research were not immediately pursued. this was partly because the use of sdr-pep has been criticized. concerns have been raised about the fact that it does not provide long-term protection like immunoprophylaxis, and that the colep study revealed that it provides an overall risk reduction of % and only % in blood-related household contacts, though the latter finding was not significant (p = . ) because the study was not powered for subgroup analysis. however, the effect seen in all contact subgroups trended with the overall risk reduction of %. other criticism concerned the logistics and cost-effectiveness of the intervention.
however, studies have demonstrated that both contact examination and sdr-pep implementation are cost-effective. rodrigues, lockwood, krishnamurthy and penna also argued that contact screening and sdr-pep administration raise ethical problems because the disclosure of the index patients' diagnosis to their contacts is required. however, the lpep program results in brazil illustrated that it is possible to distribute sdr-pep without disclosing the identity of the index case by solely saying: "there is leprosy in your area, and that is why we offer people preventive treatment against leprosy" (ignotti, unpublished report). the lpep program successfully addressed several concerns by demonstrating the feasibility and acceptability of making prevention, in the form of contact screening and sdr-pep administration, part of leprosy control. embedding the approach in existing leprosy services invigorated the leprosy program. two additional systematic reviews of quantitative and qualitative data and a meta-analysis in and confirmed the effectiveness of sdr as pep and the general acceptance of the approach. preventive chemotherapy has become part of leprosy control. this is needed, since the incidence has been more or less stable at around , in the past years. several high-burden countries (eg, india and indonesia) have embedded sdr-pep into routine leprosy control nationwide. subsequent to the publication of the latest who guidelines that recommend implementing sdr-pep, many other countries are following suit. recent modeling studies at erasmus mc rotterdam have estimated that a large-scale roll-out of sdr-pep may reduce the incidence of leprosy by % in - years (taal et al, under publication). in a strategy to halt leprosy transmission, a possible leprosy endgame strategy was described comprising contact screening and chemoprophylaxis.
additionally, the leprosy research initiative (lri) and the global partnership for zero leprosy (gpzl) described research priorities related to pep. this included the need to seek more effective regimens, especially for household and blood-related contacts. investments in continued (operational) research are needed to discover the most feasible and acceptable approaches to integrate sdr-pep into leprosy control programs in different contexts. da cunha et al recommended that if sdr-pep was implemented in brazil, it should start on a small scale, generating new evidence from the brazilian context. several post-exposure chemoprophylactic studies against leprosy are ongoing. the pep lep project in ethiopia, tanzania, and mozambique compares the feasibility of a health center-based contact approach to a community skin camp-based approach regarding integrated skin screening and sdr-pep administration (schoenmakers et al, under publication). in the skin camp group, disclosing the disease status of an (index) leprosy patient is not expected to be necessary. other initiatives are working on an improved regimen as chemoprophylaxis for leprosy. an enhanced antibiotic regimen is being tested in the pep++ project, conducted in india, brazil, and indonesia, comprising three standard weight-adjusted single doses of rifampicin plus clarithromycin, given at four-weekly intervals. the effectiveness of pep++ has yet to be established, but the regimen is promising. the expert meeting on defining an enhanced pep regimen for leprosy discussed other options for improving pep, including the use of more potent and/or longer-acting antibiotics, such as bedaquiline, rifapentine, oxazolidinone, and nitro-dihydro-imidazo-oxazoles. these antibiotics were excluded at the time because they were not yet registered in the countries included in the pep++ study (india, brazil, indonesia) or because no experience had been gained with these drugs in leprosy treatment.
clarithromycin was chosen after moxifloxacin was deemed unsuitable for prophylactic interventions because of the potential side-effects (publication expected). in the meantime, who has widely pushed the licensing of selected antibiotics because of their potential as second-line treatment in multidrug-resistant tb (mdr-tb), so there is ample opportunity to research their efficacy in preventive chemotherapy for leprosy. the advantages of higher dosages of rifampicin as chemoprophylaxis are currently being investigated by the people project. ortuno-gutierrez et al have published the protocol of this study, which is conducted in the comoros and madagascar. instead of the mg/kg rifampicin dose which was used in the colep study and the lpep program, a double dose of mg/kg is used in the people project. the study will use an anti-pgl-i test in one of its arms, in which the sdr double dose will be administered only to contacts who test positive on the anti-pgl-i test. improved understanding of leprosy transmission routes will facilitate the design of targeted interventions that complement early case detection and prophylaxis. when a field-friendly diagnostic (rapid) test becomes available (especially a test indicating who is infected with m. leprae, incubating and transmitting leprosy before signs or symptoms occur), targeted chemoprophylaxis for leprosy in certain groups, such as household members, would be an option. such a test would allow a more tailored approach at the individual level (personalized medicine), which may increase efficacy of pep for blood-related household and other high-risk contacts. the gpzl pep research agenda includes the question: "which type of pep intervention fits best with which epidemiological setting?", which requires more operational research. in very high-endemic hotspot or cluster areas, a total population screening and sdr-pep administration may be the best approach.
however, when screening entire populations is neither feasible nor affordable, focal mass drug administration (fmda) may be a reasonable alternative. the fmda approach would target the population that is not included in a contact-based pep approach and would be implemented alongside contact-based pep. the effectiveness and feasibility of such an approach first need to be tested in a randomized controlled trial. variations on screening, such as self-screening or self-referred screening only, need to be evaluated. an additional benefit of fmda would be that it is especially suitable in areas where stigma prevents patients from disclosing their disease status. questions concerning disease concealment for index patients, optimal contact and community education (eg, regarding the fact that chemoprophylaxis is not % effective), and the quality of leprosy screening by health workers need to be addressed in future research projects. another important topic is antibiotic resistance. even though the risk of inducing rifampicin resistance in m. tuberculosis is considered negligible, regular sampling and molecular monitoring for mutations associated with rifampicin resistance in m. tuberculosis and in m. leprae as recommended by who are encouraged in areas with a high rate of primary mdr-tb and among recipients of sdr-pep who develop leprosy. additionally, loose rifampicin that can be used as sdr-pep for leprosy prevention is urgently needed and should be made available on a global scale to facilitate further sdr-pep implementation. rifampicin is relatively cheap, and its cost-effectiveness was assessed to be good in the colep study, and also in india, based on the simcolep model and the lpep program. through the stop tb partnership/global drug facility, dosages of mg rifampicin cost us$ . , and dosages of mg cost around us$ .
but sdr-pep still requires an initial investment from national leprosy control programs in often resource-poor settings, especially when sdr-pep administration is, as recommended, combined with screening and active case-finding activities. free provision, similar to mdt, would be beneficial on a global scale. additionally, registration of the new indication (post-exposure prophylaxis in leprosy) of loose rifampicin in more national registers of authorized medicines should be initiated by ministries of health in leprosy endemic countries. furthermore, logistical structures and (national) registration systems, for contacts of leprosy patients and pep pharmaceutical stock, should be set up. as mentioned, taal et al have estimated the number of persons that need to be treated with sdr-pep using the simcolep model (taal et al, under publication). the model not only predicts the number of people who need to be treated with sdr-pep to achieve a % reduction in new leprosy case detection at a global level, but it also estimates how many people need chemoprophylaxis with rifampicin for a % leprosy case reduction. predictions per country will also be published. this model and the other available evidence make clear that the implementation of evidence-based preventive interventions against leprosy, like sdr-pep, in national programs needs to be encouraged to facilitate a large-scale roll-out of pep. this will help prevent those at risk from developing leprosy and will be a vital step to stop further transmission of this ancient disease. last, but not least, it will prevent more people from having to live with leprosy and its physical, psychological, and socioeconomic consequences.
leprosy: review of the epidemiological, clinical, and etiopathogenic aspects -part a strategy to halt leprosy transmission world health organization.
leprosy -key facts neglected diseases of poverty leprosy: an overview of pathophysiology current status of neglected tropical diseases (ntds) in the philippines moving towards a leprosy-free world leprosy: the disease leprosy in the armadillo: new model for biomedical research application of the thymectomized-irradiated mouse to the detection of persisting mycobacterium leprae the global campaign to eliminate leprosy physical distance, genetic relationship, age, and leprosy classification are independent risk factors for leprosy in contacts of patients with leprosy patient contact is the major determinant in incident leprosy: implications for future control social distance and spatial distance are not the same, observations on the use of gis in leprosy epidemiology prophylaxis and exclusion: compulsory isolation of hansen's disease patients in são paulo editorials -first congress of the international leprosy association. fourth international leprosy conference elimination of leprosy as a public health problem: progress and prospects the future incidence of leprosy: a scenario analysis chemoprophylaxis against leprosy: expectations and methodology of a trial the missing millions: a threat to the elimination of leprosy accelerating towards a leprosy-free world. world health organization leprosy control -international textbook of leprosy leprosy post-exposure prophylaxis (lpep) programme: study protocol for evaluating the feasibility and impact on case detection rates of contact tracing and single dose rifampicin basri s descriptive analysis and text analysis in systematic literature review: a review of master data management infolep. international knowledge center for information resources on leprosy | infolep preferred reporting items for systematic reviews and meta-analyses: the prisma statement history of bcg vaccine world health organization. 
bcg vaccine: who position paper whole genome sequence analysis of mycobacterium bovis bacillus calmette-guérin (bcg) tokyo : a comparative study of bcg vaccine substrains bacillus calmette-guérin (bcg) vaccine: a global assessment of demand and supply balance. vaccine vaccination against leprosy leprosy and tuberculosis: antagonistic diseases. a m a arch dermatology bcg and adverse events in the context of leprosy the role of bcg in prevention of leprosy: a meta-analysis protective effect of bacillus calmette guerin (bcg) against leprosy: a population-based case-control study in nagpur bcg vaccination and leprosy protection: review of current evidence and status of bcg in leprosy control effectiveness of bacillus calmette guerin (bcg) vaccination in the prevention of leprosy: a population-based case-control study in yavatmal district long lasting bcg protection against leprosy the effect of bcg vaccination upon the occurrence of leprosy in nursery children bcg vaccination in leprosy bcg vaccination of children against leprosy: fourteen-year findings of the trial in burma the combined effect of chemoprophylaxis with single dose rifampicin and immunoprophylaxis with bcg to prevent leprosy in contacts of newly diagnosed leprosy cases: a cluster randomized controlled trial (maltalep study) leprosy: diagnostic and control challenges for a worldwide disease leprosy epidemiology in a cohort of household contacts in rio de janeiro ( - ). cadernos de saúde pública bcg vaccination of children against leprosy in uganda: final results effectiveness of bcg vaccination among leprosy contacts: a cohort study combination chemoprophylaxis and immunoprophylaxis in reducing the incidence of leprosy immunoprophylactic effects of the anti-leprosy mw vaccine in household contacts of leprosy patients: clinical field trials with a follow up of - years comparative leprosy vaccine trial in south india world health organization. 
guidelines for the diagnosis, treatment and prevention of leprosy vaccination in leprosy. observations and interpretations randomised controlled trial of single bcg, repeated bcg, or combined bcg and killed mycobacterium leprae vaccine for prevention of leprosy and tuberculosis in malawi. karonga prevention trial group vaccination against leprosy at ben san leprosy centre vaccines for leprosy and tuberculosis: opportunities for shared research, development, and application lepvax, a defined subunit vaccine that provides effective pre-exposure and post-exposure prophylaxis of m. leprae infection addition of mycobacterium indicus pranii (mip) vaccine as an immunotherapeutic with standard chemotherapy in borderline leprosy: a double blind study to assess clinical improvement (a preliminary report) review of global leprosy vaccine development second coming: the re-emergence and modernization of immunotherapy by vaccines as a component of leprosy control making of a highly useful multipurpose vaccine the use of the name mycobacterium w for the leprosy immunotherapeutic bacillus creates confusion with m. tuberculosis-w (beijing strain): a suggestion clinical and histopathological evaluation of the effect of addition of immunotherapy with mw vaccine to standard chemotherapy in borderline leprosy addition of immunotherapy to chemotherapy in pediatric borderline leprosy: a clinical evaluation international federation of anti-leprosy associations. the leprosy vaccine prophylactic value of dapsone in leprosy chemoprophylaxis of leprosy contacts with d.d. s dds prophylaxis against leprosy four years' experience with dapsone as prophylaxis against leprosy chemoprophylaxis among contacts of non-lepromatous leprosy long-term effects of chemoprophylaxis among contacts of lepromatous cases. results of a . 
years follow-up antileprosy measures in bombay, india: an analysis of years' work prevention and early detection of leprosy in children management of household contacts of leprosy patients repository acedapsone in leprosy chemoprophylaxis and treatment experience with acedapsone (dadds) in the therapeutic trial in new guinea and the chemoprophylactic trial in micronesia chemoprophylaxis against leprosy with acedapsone limited duration acedapsone prophylaxis in leprosy chemoprophylaxis is effective in the prevention of leprosy in endemic countries: a systematic review and meta-analysis. milep study group. mucosal immunology of leprosy workshop on the prevention of leprosy, pohnpei, federated states of micronesia implementation of chemoprophylaxis of leprosy in the southern marquesas with a single dose of mg per kg rifampin chemoprophylaxis of leprosy with a single dose of mg per kg rifampin in the southern marquesas; results after four years monitoring the effects of preventive therapy in the federated states of micronesia elimination of leprosy in the federated states of micronesia by intensive case finding, treatment with who/mdt and administration of chemoprophylaxis implementation of chemoprophylaxis in chuuk state, federated states of micronesia implementation of chemoprophylaxis in pohnpei state, federated states of micronesia population screening and mass chemoprophylaxis in kiribati population screening and chemoprophylaxis for household contacts of leprosy patients in the republic of the marshall islands drugs and regimens for preventive therapy against tuberculosis, disseminated mycobacterium avium complex infection and leprosy the future of leprosy elimination rationale for the preventive treatment of leprosy chemoprophylaxis of leprosy in the southern marquesas with a single mg/kg dose of rifampicin. 
results after years leprosy chemoprophylaxis in micronesia eficacia de la rifampicina como profiláctico en contactos de primer orden de lepra a study on transmission and a trial of chemoprophylaxis in contacts of leprosy patients: design, methodology and recruitment findings of colep effectiveness of single dose rifampicin in preventing leprosy in close contacts of patients with newly diagnosed leprosy: cluster randomised controlled trial cost-effectiveness of a chemoprophylactic intervention with single dose rifampicin in contacts of new leprosy patients protective effect of the combination bcg vaccination and rifampicin prophylaxis in leprosy prevention acceptability of chemoprophylaxis for household contacts of leprosy patients in bangladesh: a qualitative study report of the workshop on the use of chemoprophylaxis in the control of leprosy held in amsterdam, the netherlands on prevention of leprosy using rifampicin as chemoprophylaxis chemoprophylaxis in contacts of patients with leprosy: systematic review and meta-analysis monitoring grade- disability rate and applicability of chemoprophylaxis in leprosy control: report of the information consultation review of leprosy research evidence ( - ) and implications for current policy and practice trend analysis of leprosy in morocco between and : evidence on the single dose rifampicin chemoprophylaxis the long term effect of current and new interventions on the new case detection of leprosy: a modeling study leprosy new case detection trends and the future effect of preventive interventions in pará state, brazil: a modelling study predicting the impact of household contact and mass chemoprophylaxis on future new leprosy cases in south tarawa, kiribati: a modelling study role of contact tracing and prevention strategies in the interruption of leprosy transmission close contacts with leprosy in newly diagnosed leprosy patients in a high and low endemic area: comparison between bangladesh and thailand chemoprophylaxis: 
sufficient evidence for starting implementation pilots negligible risk of inducing resistance in mycobacterium tuberculosis with single-dose rifampicin as post-exposure prophylaxis for leprosy the leprosy post-exposure prophylaxis (lpep) programme: update and interim analysis abstracts of oral presentations - feasibility and impact of leprosy post-exposure prophylaxis: evidence from lpep, a multi-country, -year implementation research program leprosy post-exposure prophylaxis in the indian health system: a cost-effectiveness analysis an innovative approach to screening and chemoprophylaxis among contacts of leprosy patients in low endemic settings: experiences from cambodia a single dose of rifampicin to prevent leprosy: qualitative analysis of perceptions of persons affected, contacts, community members and health professionals towards chemoprophylaxis and the impact on their attitudes in india, nepal and indonesia acceptability of contact screening and single dose rifampicin as chemoprophylaxis for leprosy in dadra and nagar haveli leprosy post-exposure prophylaxis with single-dose rifampicin: toolkit for implementation minimal essential data to document contact tracing and single dose rifampicin (sdr) for leprosy control in routine settings: a practical guide global practices in regard to implementation of preventive measures for leprosy peraturan menteri kesehatan republik indonesia, nomor tahun tentang penanggulangan kusta evidence, opportunity, ethics, and the allure of zero leprosy reply to: single-dose rifampicin and bcg to prevent leprosy effectiveness of single-dose rifampicin after bcg vaccination to prevent leprosy in close contacts of patients with newly diagnosed leprosy: a cluster randomized controlled trial clinical manifestations of leprosy after bcg vaccination: an observational study in bangladesh bcg vaccine and leprosy household contacts: protective effect and probability to becoming sick during follow-up under-explored experimental topics 
related to integral mycobacterial vaccines for leprosy risk and protective factors for leprosy development determined by epidemiological surveillance of household contacts portaria n o . ,de de outubro de . diretrizes para vigilância, atenção e controle da hansenías e bcg scars in northern malawi: sensitivity and repeatability of scar reading, and factors affecting scar size risk and protective factors for leprosy development determined by epidemiological surveillance of household contacts bcg revaccination does not protect against leprosy in the brazilian amazon: a cluster randomised trial protection against m. leprae infection by the id /gla-se and id /gla-se vaccines developed for tuberculosis protocol, rationale and design of people (post exposure prophylaxis for leprosy in the comoros and madagascar): a cluster randomized trial on effectiveness of different modalities of implementation of post-exposure prophylaxis of leprosy contacts leprosy chemoprophylaxis: what's the need? leprosy now: epidemiology, progress, challenges, and research gaps reply to the role of contact tracing and prevention strategies in the interruption of leprosy transmission -chemoprophylaxis: a call for more research single-dose rifampicin chemoprophylaxis protects those who need it least and is not a cost-effective intervention cost-effectiveness analysis of three leprosy case detection methods in northern nigeria effectiveness of rifampicin chemoprophylaxis in preventing leprosy in patient contacts chemoprophylaxis for contacts of leprosy patients: a systematic review and meta-analysis national health portal of india tm of h and fwg of i. leprosy. 
prevention reviewing research priorities of the leprosy research initiative (lri): a stakeholder's consultation global partnership for zero leprosy research agenda working group subgroup on post-exposure prophylaxis chemoprophylaxis to control leprosy and the perspective of its implementation in brazil: a primer for non-epidemiologists [quimioprofilaxia para prevenção de hanseníase e sua implantação no brasil: uma explicação introdutória para não epidemiologistas] an enhanced regimen as post-exposure chemoprophylaxis for leprosy: pep++ quinolone- and fluoroquinolone-containing medicinal products prevention of transmission of leprosy: the current scenario world leprosy day : how forward respecting the past? workshop explores how to scale up pep and advance research | global partnership for zero leprosy. global partnership for zero leprosy world health organization. a guide for surveillance of antimicrobial resistance in leprosy: update. who medicines catalog -global drug facility (gdf)
research and reports in tropical medicine is an international, peer-reviewed, open access journal publishing original research, case reports, editorials, reviews and commentaries on all areas of tropical medicine, including: diseases and medicine in tropical regions; entomology; epidemiology; health economics issues; infectious disease; laboratory science and new technology in tropical medicine; parasitology; public health medicine/health care policy in tropical regions; and microbiology.
key: cord- - o rf ii authors: bergasa-caceres, fernando; rabitz, herschel a. title: interdiction of protein folding for therapeutic drug development in sars cov- date: - - journal: j phys chem b doi: .
/acs.jpcb. c sha: doc_id: cord_uid: o rf ii
in this article, we predict the folding initiation events of the ribose phosphatase domain of protein nsp and the receptor binding domain of the spike protein from the severe acute respiratory syndrome (sars) coronavirus- . the calculations employ the sequential collapse model and the crystal structures to identify the segments involved in the initial contact formation events of both viral proteins. the initial contact locations may provide good targets for therapeutic drug development. the proposed strategy is based on a drug binding to the contact location, thereby aiming to prevent protein folding. peptides are suggested as a natural choice for such protein folding interdiction drugs. understanding the biochemistry of the new coronavirus sars-cov- has become an issue of prime importance and urgency as the virus spread has triggered an ongoing pandemic that has already cost thousands of lives and caused large economic disruption. many therapeutic strategies are being considered, including monoclonal antibodies, that rely on targeting selected virus proteins based on their native structure. considerable work has been done against coronaviruses in the past in this direction, and the experience gained is now being deployed against sars-cov- based on the crystal structures already available of several of the virus' proteins. the purpose of this paper is to introduce a distinct possible new therapy route by (a) presenting predictions of the earliest events along the folding pathway of two of the virus' proteins and (b) building on this foundation to propose an alternative drug development strategy based on reducing the functionality of the virus by interdicting the folding process of its proteins.
in order to fulfill goal (a) of the article, we will focus on the folding initiation events of two of the sars-cov- proteins: ( ) the adp ribose phosphatase domain of the nonstructural nsp protein , and ( ) the receptor binding domain of the spike protein. , , the earliest folding initiation event in both cases will be predicted employing the sequential collapse model (scm). , in the scm, the multistate folding process of proteins longer than ∼ amino acids is initiated by formation of specific nonlocal contacts called primary contacts. these primary contacts help constrain the folding process by dividing the protein into several smaller domains. in this way, overall folding becomes vastly more efficient than a purely random statistical search, resulting in what we have referred to as "nature's shortcut" to the native structure. nucleation of the folding process by the primary contacts then constitutes a potential set of emergent natural physical constraints that sidestep levinthal's paradox. in the case of misfolded protein-based diseases, the scm has been applied to investigate some general properties of the folding dynamics of neuropathogenic proteins. , in order to unequivocally determine the folding initiation events, we will support the scm predictions with structural information from the crystal structures of both proteins. goal (b) of this article will be addressed utilizing the scm predictions to provide potential target (intraprotein) regions for the development of therapeutic drugs able to interdict the folding initiation event. various possible therapeutic drugs could be considered, but peptides form a natural class for target region binding, ideally preventing subsequent folding, although other molecular categories could also be similarly employed. this therapeutic drug development strategy based on folding interdiction of target regions (fitrs) is similar to an earlier proposal to develop drugs to interfere in protein folding. 
, the novelty in the present work lies in the scm's ability to predict critical target regions for folding initiation from the primary sequence. the broader potential of the fitr strategy and possible development hurdles will also be discussed. in particular, it will be explained that the proposed fitr drug strategy could be extended to other proteins of sars-cov- as well as to other diseases in which the presence of specific proteins plays a decisive role. the physical basis of the scm and its most up-to-date formulation have been recently explained in full detail. , here, only a brief summary of the main concepts is presented that are relevant to the issues investigated in the present paper. . . scm entropic cost of loop formation. the scm considers early specific nonlocal contacts based on the entropy of formation of the resultant protein loops. the scm has successfully predicted many of the observed features of protein folding pathways. in the scm, two different loop regimes are considered when analyzing early nonlocal contacts: short loops for which the gyration radius, r g (n), is smaller than the average side chain length £(n) [i.e., r g (n) < £(n)], and long loops for which r g (n) > £(n). the loop length at which the transition between the short and long loops takes place [i.e., the length for which r g (n) ≈ £(n)] is called the optimal loop length n op . the optimal loop length has been estimated to be n op ≈ amino acids for typical protein sequences for £(n) ≈ . Å, although some sequence variability exists and n op is expected to be shorter for highly disordered proteins that contain few of the bulky hydrophobic amino acids. this value for n op is consistent with experimental data showing the behavior of poly-alanine, a polypeptide with smaller side chains than the average globular protein, in this case £(n) ≈ . Å, which exhibits deviations from gaussian statistics because of steric hindrance when n < amino acids. 
the long loop regime is physically equivalent to the classical flory−jacobson−stockmayer (fjs) picture, and the entropic cost of forming protein loops is well represented, assuming that the amino acids can be taken to be solid spheres, by a simple logarithmic function of the loop length. this is clearly an approximation; for example, the side chains would be better represented by solid spheres of different sizes according to the primary sequence. in the scm short loop regime, however, the internal degrees of freedom of the side chains cannot be neglected, and the entropic cost of forming short loops must be higher than when the amino acids are taken to be solid spheres. moreover, because most of the degrees of freedom are in the side chains, we expect the contribution of the side chains to the overall entropic cost to be dominant with respect to that of any constraints imposed by loop formation on the backbone. thus, in the scm it is expected that for a short loop, the entropic cost of loop formation Δs loop approximately becomes the fjs term plus a side chain crowding term, with Δs side-chain-crowding ≪ 0, opposing folding. when r g (n) ≥ £(n), we have Δs side-chain-crowding ≈ 0 and the standard fjs regime is recovered. the side chain crowding term Δs side-chain-crowding therefore appears as a correction to the fjs results for shorter loops. it is extremely difficult to obtain an analytical expression for the side chain crowding term, and in the scm it has been presented in generic boltzmann−gibbs form in terms of f (n,£), the average configurational freedom per amino acid in the unfolded chain, and f loop (n,£), the average configurational freedom of an amino acid in a loop. modifying the homogeneous flory-like representation of the protein chain to take into fuller account the microscopic details of the protein−solvent system is not exclusive to the scm and has been employed before to account, for example, for the effects of the solvent. 
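as a hedged illustration of the two loop regimes discussed above: for an ideal (gaussian) chain, the probability of closing a loop of n segments scales as n^(-3/2), which leads to the logarithmic jacobson−stockmayer form in the first line below. the additive constant c and the use of this exact prefactor are assumptions made here for illustration and may differ from the expressions used in the scm papers. the second and third lines only restate, in symbols, the decomposition into an fjs term and a side chain crowding correction described in the text, with £(n) written as \mathcal{L}(n).

```latex
\begin{align}
\Delta S_{\mathrm{FJS}}(n) &\approx -\tfrac{3}{2}\, k \ln n + c
  && \text{(ideal-chain loop closure; assumed form)} \\
\Delta S_{\mathrm{loop}}(n) &\approx \Delta S_{\mathrm{FJS}}(n)
  + \Delta S_{\text{side-chain-crowding}}(n) \\
\Delta S_{\text{side-chain-crowding}}(n) &
  \begin{cases}
    \ll 0, & r_g(n) < \mathcal{L}(n) \quad \text{(short loops)} \\
    \approx 0, & r_g(n) \ge \mathcal{L}(n) \quad \text{(long loops, FJS regime)}
  \end{cases}
\end{align}
```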
based on the model developed above, in the scm, the folding of proteins with more than ∼ amino acids likely involves the formation of an early nonlocal contact, called the primary contact within the scm, that defines the earliest folding phase with n ≥ n op ≈ amino acids. as only a few primary contacts can be established at most in proteins of length n ≥ n op , most of the tertiary structure contacts will still be defined by contacts at a shorter range established in later folding phases. formation of the primary contact in the scm defines the primary loop, which subsequently collapses through two-state kinetics. because proteins longer than ∼ amino acids do not generally undergo two-state collapse but rather fold through multistep pathways, consistent simple physical reasoning implies that there is a limit to the size of the primary loop that can successfully lead to the native scm folding pathway of ∼ amino acids. the concept of folding nucleated by nonlocal contacts is not exclusive to the scm, having arisen earlier in the context of the diffusion−collision model and in the energy landscape picture. it also has appeared in simulations of the transition state of two-state folding proteins. also, protein topology has been considered an essential element of folding mechanisms in a number of theoretical efforts. − the particular feature in the scm is that the early nonlocal contacts are highly specific as in the loop hypothesis, and a methodology is developed to derive their location from primary sequence information. . . determining the primary contact. based on the model presented in the previous sections, whether there is a nonlocal contact in an otherwise unfolded state is dependent upon the stability of the potential contact candidates at loop lengths of n ≥ n op amino acids. 
in the scm, the stability of a contact formed by the number n cont of amino acids, Δg contact (n cont ,n loop ), can be written as Δg contact (n cont ,n loop ) ≈ Δg loop + Δg int,h + Δg cont,s . here, Δg loop represents the entropic free energy cost of the loop as discussed in section . . the term Δg int,h denotes all the enthalpic interactions that help stabilize the contact, possibly including hydrophobic interactions, van der waals interactions, hydrogen bonds, disulfide bonds, and salt bridges, and its value satisfies Δg int,h < 0. the term Δg cont,s > 0 represents the entropic cost of constraining the side chains of the amino acids defining the contact such that the contact is stable, and it opposes contact formation. a segment-specific determination of the value Δg cont,s (n cont ) for a given contact would require detailed molecular dynamics techniques. however, a heuristic estimate can be made from earlier work within the scm, which showed that the average entropic cost of folding per amino acid for a sample of proteins was Δg folding/residue,s ≈ . kt/residue, and the maximum was Δg folding/residue,s ≈ . kt/residue. as these are estimates for the entropic cost of folding per residue of complete proteins that include highly buried as well as flexible exposed regions, it is reasonable to expect that the entropic cost of a contact-forming region must be closer to the highest calculated values for Δg folding/residue,s . here, we will assume that Δg cont,s (n cont ) for a contact including n cont amino acids is approximately Δg folding/residue,s multiplied by the number of residues defining the contact, such that Δg cont,s (n cont ) ≈ . n cont . this result is clearly an approximation, but in section it will be shown to suffice for establishing a cut-off in the number of possible contacts that is consistent with the available structural data. 
hydrophobic interactions are well understood to constitute the main driving force of the folding process. other interactions, such as hydrogen bonds, are weaker or, like disulfide bonds and salt bridges, form later along the folding pathway. thus, for an early contact forming from the unfolded state, we can take Δg int (n op ) ≈ Δg hyd (n op ), where Δg hyd (n op ) represents the stabilizing effect of hydrophobicity in the early contacts, and eq can be written as Δg contact ≈ Δg loop + Δg hyd + Δg cont,s . as the hydrophobic stabilization energy of the contact Δg hyd is determined by the hydrophobicity of the segments involved, the hydrophobicity values h k are obtained from the fauchère−pliska scale and assigned to each residue in accord with previous calculations within the scm. because the amino acid side chains are significantly larger than the typical peptide bond length, early contacts between two hydrophobic amino acids will inherently involve segments including several amino acids adjacent to the initial contact. the stability of this early hydrophobic contact will determine where the folding process is initiated. this picture is not unlike the zipping model of dill and collaborators. here the typical early contact segment size will be taken to be ∼ amino acids in line with previous calculations within the scm. the amino acid window size is based on the geometric considerations underlying the scm: with an average effective fluctuating width of the unfolded protein chain of w ∼ £(n) ≈ . Å, and a peptide bond length of . Å, the minimum number n cont of amino acids that can define a contact in the open fluctuating chain should be n cont ∼ int[ £(n)/ . ] = amino acids. the results for the location of the most stable primary contact were seen to be robust to the employment of five- and six-amino acid windows, while some deviations were observed when the window was reduced to four amino acids. 
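the segment-based search for a primary contact described in this section can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy two-letter scale, and the toy sequence are hypothetical; the loop-length bounds are left as required arguments because their numerical values are not recoverable from the text; and candidate contacts are ranked only by the summed segment hydrophobicity, without the conversion to kt or the loop and side chain entropy terms.

```python
from typing import Dict, List, Tuple


def segment_hydrophobicity(
    seq: str, scale: Dict[str, float], window: int = 5
) -> Dict[int, float]:
    """Sum per-residue hydrophobicity over a window centered at each residue.

    Indices are 0-based. Residues too close to either end to carry a full
    window are skipped.
    """
    if window % 2 == 0:
        raise ValueError("this sketch only handles odd, centered windows")
    half = window // 2
    return {
        i: sum(scale[aa] for aa in seq[i - half : i + half + 1])
        for i in range(half, len(seq) - half)
    }


def rank_contacts(
    seq: str,
    scale: Dict[str, float],
    min_loop: int,
    max_loop: int,
    window: int = 5,
) -> List[Tuple[float, int, int]]:
    """Rank candidate contacts (score, i, j) by summed segment hydrophobicity.

    Only pairs whose sequence separation j - i lies in [min_loop, max_loop]
    are considered. Higher score means a more hydrophobic candidate contact.
    """
    h = segment_hydrophobicity(seq, scale, window)
    pairs = [
        (h[i] + h[j], i, j)
        for i in h
        for j in h
        if min_loop <= j - i <= max_loop
    ]
    return sorted(pairs, key=lambda p: -p[0])


# Hypothetical toy scale and sequence, NOT the Fauchere-Pliska values.
TOY_SCALE = {"H": 1.0, "P": -0.5}
toy_seq = "PP" + "HHHHH" + "P" * 10 + "HHHHH" + "PP"

best = rank_contacts(toy_seq, TOY_SCALE, min_loop=10, max_loop=20)[0]
print(best)  # (10.0, 4, 19)
```

in a real calculation the scale argument would hold the published per-residue hydrophobicity values, and the ranked list would then be filtered against the crystal structure, as the text does when it discards non-native candidates.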
in practice within the scm, the hydrophobicity h k of each residue is added over a segment contact window of five amino acids centered at residue i, resulting in a segment hydrophobicity h i (a value of ∼ . is equivalent to a change in energy of kt, with the margin of error being ∼ . kt). in order to determine the best contact, the h i value of a segment centered at residue i is added to the h j value of a segment centered at residue j, located at a distance n ij of at least n op amino acids along the sequence and no longer than the maximum primary loop length of ∼ amino acids, to give the contact stability. we have chosen to focus in this paper on two domains of two major functional proteins of sars-cov- : (a) the nonstructural adp ribose phosphatase domain of protein nsp ; and (b) the structural receptor binding domain of the spike protein. this choice was made based on two distinct considerations: ( ) to study both a structural and a nonstructural protein that have a direct involvement in the viral infection mechanism, thus providing options for drug discovery; and ( ) to employ the scm within the boundaries of its demonstrated applicability, that is, on proteins long enough that a multi-state folding pathway is expected, but not so long that degeneracies in the results might cloud any definite conclusions. non-structural proteins of coronaviruses have been the object of intense study, concerning both their structures and their functionality. the multi-domain non-structural protein (nsp ) is the largest protein encoded by the coronavirus' genome. it includes up to sixteen domains, of which eight domains and two transmembrane regions are conserved. one of the conserved domains is the macrodomain (also called the x domain), which includes amino acids. the first available crystal structure of a nsp domain of any coronavirus was the unliganded x domain of sars-cov, obtained in . the crystal structure for the sars-cov- variant of the x domain is known. 
it has the typical x domain organization, with seven β-strands defining a central β-sheet, surrounded by six α-helices. one of the functions of the x domain is to bind adp-ribose and poly-(adp-ribose). the x domains of coronaviruses also show adp-ribose- ″phosphate phosphatase activity. it has been observed that this property is linked to the ability of the virus to compromise the immune system of the host. our calculations predict that the best possible primary contact for the x domain (i.e., the best folding initiation contact) is established between segments ( ptvvv ) centered at v , and ( llapl ) centered at a , with a stability of Δg cont ≈ − . kt. the predicted primary contact is in clear proximity in the crystal structure of the protein (pdb code vxs), and we have represented the two segments in figure a . the next-best possible contacts with energies within ∼ kt of the best primary contact ( , ) are shown in table . none of the possible alternatives to contact ( , ) is a good native contact on the crystal structure. none of the side chains of the two segments defining these alternative contacts appear within the van der waals interaction distance in the crystal structure. the issue of the multiplicity of possible primary contacts in the scm has been considered in previous work. because no major rearrangements of the protein core are expected post-collapse, it is generally assumed within the model that primary contacts that are non-native on the 3d folded structure are likely not to correspond to native pathways leading to the functional folded structure. in all the proteins studied to date within the scm it has been observed that the most stable primary contact is native-like. thus, our result implies that the majority of the protein molecules are expected to fold through the initiation event defined by the primary contact predicted here. 
in this regard, it is a prediction of the model that primary contact ( − ) is the gateway to "nature's shortcut" to the folding of the adp ribose phosphatase domain of sars-cov- . . . primary contact of the receptor binding domain of the spike protein of sars-cov- . the receptor recognition mechanisms of coronaviruses have been extensively studied. , in particular, for sars-cov- and its earlier viral variant sars-cov, entry into the host's cells is mediated by a virus-surface spike protein that includes a specific receptor-binding domain (rbd). − the rbd recognizes angiotensin-converting enzyme (ace ) as its specific receptor. the structure of sars-cov rbd is well known, and the structure of sars-cov- rbd is similar, albeit with some specific mutations in the ace binding ridge that enhance the ability of sars-cov- to bind human ace . the spike protein of sars-cov- is a natural therapeutic target given its critical biological role in facilitating the virus entry in the cell. the search for inhibitors, including peptides, that actively block rbd-ace binding of coronavirus has been an active field of investigation for many years. − the rbd of sars-cov- is amino acids long, making it the longest protein investigated within the scm to date. , , the best possible contacts with energies within ∼ kt of the best primary contact ( , ) are shown in table . our calculations predict that the best possible primary contact is established between segments ( cviaw ) centered at i , and ( vvlsf ) centered at l , residues apart, with a stability of Δg cont ≈ − . kt. the predicted contact is in clear proximity in the crystal structure of the protein (pdb code m j), and we have represented the two segments defining the contact in figure b . the second-best possible contact is defined by segments ( lcpfg ), centered at p , and ( cviaw ), residues apart, with a stability of Δg cont ≈ − . kt. contact ( , ) is not as good a contact on the native structure. 
thus, our result implies that the majority of the protein molecules will fold through the initiation event defined by the best primary contact ( , ) predicted here. the identification of the primary contacts along the folding pathway of viral proteins constitutes an important result for at least two reasons: (a) the sequences of the specific segments involved in the primary contacts provide a template to specify candidate peptide drugs of inhibitory effect with the maximum possible contact affinity to compete with the natural folding mechanism; and (b) it provides insight for further investigation into the subsequent folding steps leading to a fully functional viral protein, potentially providing for additional fitrs. the fact that the primary contact is defined by the interaction between two well defined amino acid sequences suggests that a strategy to develop fitr-based therapeutic drugs could be one utilizing trial peptide drugs as suggested above. peptide drugs offer several advantages versus other more classical approaches such as function-blocking monoclonal antibodies. in particular, being much smaller and flexible, peptide drugs can much more easily cross the cellular membrane to reach their intended targets. however, designing therapeutically effective peptide drugs remains an important challenge. there are several reasons why effective peptide drugs are hard to discover: ( ) the potential space of peptide candidates is very large; ( ) ensuring delivery at the right location on the target molecule is a considerable challenge; and ( ) making a peptide drug that is contact-site specific is not an easy task.
figure caption (partial): the color code reflects the location from the n-terminus (dark blue) to the c-terminus (yellow). only the side chains corresponding to the segments that define the primary contacts are shown. 
as a consequence, no more than ∼ effective peptide drugs are in use today, although there is active investigation of many more. the scm might prove of assistance in addressing at least difficulty ( ) above, and maybe also ( ) and ( ), through the utilization of the predicted primary contact sequence as a template to search for an effective therapeutic peptide capable of inhibiting primary contact formation. also, other non-peptide molecules have shown potentially therapeutic capabilities by binding to mostly unfolded states of proteins. for example, ceftriaxone binds specifically to the c-terminal region of the intrinsically disordered protein α-synuclein, generally understood to be involved in triggering parkinson's disease, and has shown therapeutic potential. also, recent experimental results suggest that the folding kinetics of proteins with two-state transitions can be modulated by the employment of suitable peptides that mimic specific segments of the protein chain. although the purpose of the experiment was the opposite of the one sought for here, as the goal was to speed up the folding transition, the results provide general support for the contention presented that specific peptide molecules can interfere and alter the folding kinetics of globular proteins. the proposed therapeutic strategy is depicted in figure . to summarize, we have presented a target-specific strategy to develop folding-interdicting drugs. such folding-interdicting drugs would function by specifically inhibiting the earliest folding events. in order to do so, we propose to rely on the a priori capabilities to identify fitrs embedded within the scm. this strategy is generally similar to earlier proposals to develop therapeutic peptide folding inhibitors relying on protein folding information. our expectation is that by developing such scm-based folding-interdicting drugs as figure . 
proposed inhibitory mechanism of viral functionality based on the employment of specific peptides in order to interdict the initial folding event of the viral proteins. the effect on the viral structure of the employment of such folding inhibitory peptides on the spike proteins of sars-cov- is illustrated in generic form, as its precise effect on the overall configuration of the virion is not known. additionally, multiple spike proteins would need to be inhibited simultaneously to fully disrupt the functionality of the virion. whether folding interdiction through inhibition of just the dominant folding initiation event suffices to preempt the onset of disease is a matter that calls for experimental assessment. in this article we presented theoretical predictions for the folding initiation events of two functionally relevant proteins of the sars-cov- virus. the predicted folding initiation events were shown to map into good contacts on the 3d structure of the proteins. we proposed that knowledge of the protein segments involved in the folding initiation event opens an attractive route to developing new therapeutic drugs intended to prevent the successful folding of key viral proteins. such a reduction of the population of properly folded viral proteins could lead to a decreased level of viral spread, thus reducing the lethality of the infection. on the basis of these findings, we proposed a general therapeutic drug development strategy based on interdicting the folding process through the identification of scm-predicted fitrs. the fitr strategy could be generally used to approach other diseases where specific proteins play an essential role. 
a pneumonia outbreak associated with a new coronavirus of probable bat origin a new coronavirus associated with human respiratory disease in china an interactive web-based dashboard to track covid- in real time phase ii trial of -b antibody therapy with autologous stem cell and transplantation for relapsed b cell lymphomas use of radiolabeled antibodies to carcinoembryonic antigen for the detection and localization of diverse cancers by external photoscanning human monoclonal antibodies against highly conserved hr and hr domains of the sars-cov spike protein are more broadly neutralizing structural basis of receptor recognition by sars-cov- sars-cov- cell entry depends on ace and tmprss and is blocked by a clinically proven protease inhibitor analysis of therapeutic targets for sars-cov- and discovery of potential drugs by computational methods nsp of coronaviruses: structures and functions of a large multi-domain protein crystal structure of adp ribose phosphatase of nsp from sars cov- spike receptor-binding domain bound to the ace receptor sequential collapse model for protein folding pathways nature's shortcut to protein folding macromolecular crowding facilitates the conformational transition of molten globule states of the prion protein predicting the location of the non-local contacts in alpha-synuclein a comprehensive review of current advances in peptide drug development and design the preferred conformation of the tripeptide ala-phe-ala in water is an inverse γ-turn: implications for protein folding and drug design protein folding and drug design moments and distribution functions for polypeptide chains intramolecular reactions in polycondensations. i. the theory of linear systems structural and energetic heterogeneity in protein folding. i. 
theory diffusion−collision model for protein folding how does a protein fold three key residues form a critical contact network in a protein folding transition state prediction of protein-folding mechanisms from free-energy landscapes derived from native structures a simple model for calculating the kinetics of protein folding from three-dimensional structures how native-state topology affects the folding of dihydrofolate reductase and interleukin- ß topomer search model: a simple, quantitative theory of two-state protein folding kinetics van der waals locks: loopn-lock structure of globular proteins nonlocal interactions stabilize long range loops in the initial folding intermediates of reduced bovine pancreatic trypsin inhibitor dominant forces in protein folding low entropic barrier to the hydrophobic collapse of the prion protein: effects of intermediate states and conformational flexibility trifluoroethanol-induced stabilization of the α-helical structure of β-lactoglobulin: implication for non-hierarchical protein folding hydrophobic parameters ii of amino-acid side chains from the partitioning of n-acetyl-amino-acid amides protein folding by zipping and assembly proteomics analysis unravels the functional repertoire of coronavirus nonstructural protein bioinformatics and functional analyses of coronavirus nonstructural proteins involved in the formation of replicative organelles putative papain-related thiol proteases of positive-strand rna viruses. 
identification of rubi-and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha-and coronaviruses the macro-domain protein family: structure, functions and their potential therapeutic implications structural basis of severe acute respiratory syndrome coronavirus adp-ribose- "-phosphate dephosphorylation by a conserved domain of nsp structural and functional basis for adp-ribose and poly(adpribose) binding by viral macro domains crystal structures of two coronavirus adpribose- "-monophosphatases and their complexes with adp-ribose: a systematic structural analysis of the viral adrp domain adp-ribose- "-monophosphatase: a conserved coronavirus enzyme that is dispensable for viral replication in tissue culture the conserved coronavirus macrodomain promotes virulence and suppresses the innate immune response during severe acute respiratory syndrome coronavirus infection mouse hepatitis virus liver pathology is dependent on adp-ribose- "-phosphatase, a viral function conserved in the alpha-like supergroup the molecular biology toolkit (mbt): a modular platform for developing molecular visualization applications. bmc bioinf. , , . ( ) bergasa-caceres, f.; rabitz, h. a. sequential collapse folding pathway of β-lactoglobulin: parallel pathways and non-native secondary structure role of topology in the cooperative collapse of the protein core in the sequential collapse model. 
folding pathway of r-lactalbumin and hen lysoz macromolecular crowding facilitates the conformational transition of molten globule states of the prion protein angiotensin-converting enzyme is a functional receptor for the sars coronavirus receptor recognition mechanisms of coronaviruses: a decade of structural studies efficient replication of severe acute respiratory syndrome coronavirus in mouse cells is limited by murine angiotensin-converting enzyme structure of sars coronavirus spike receptor binding domain complexed with receptor the spike protein of sars-cov: a target for vaccine and therapeutic development screening and identification of linear b-cell epitopes and entry-blocking peptide of severe acute respiratory syndrome (sars)-associated coronavirus using synthetic overlapping peptide library specific asparagine-linked glycosylation sites are critical for dc-sign and l-sign-mediated severe acute respiratory syndrome coronavirus entry getting across the cell membrane: an overview for small molecules, peptides, and proteins reaching for high-hanging fruit in drug discovery at protein-protein interfaces ceftriaxone blocks the polymerization of α-synuclein and exerts neuroprotective effects in vitro lewy bodies rational design of protein-specific folding modifiers design of hiv- -pr inhibitors that do not create resistance: blocking the folding of single monomers the authors declare no competing financial interest. the authors acknowledge support from nsf grant che- . key: cord- -ehp q ry authors: haber, michael j.; shay, davis k.; davis, xiaohong m.; patel, rajan; jin, xiaoping; weintraub, eric; orenstein, evan; thompson, william w. title: effectiveness of interventions to reduce contact rates during a simulated influenza pandemic date: - - journal: emerg infect dis doi: . /eid . sha: doc_id: cord_uid: ehp q ry measures to decrease contact between persons during an influenza pandemic have been included in pandemic response plans. 
we used stochastic simulation models to explore the effects of school closings, voluntary confinements of ill persons and their household contacts, and reductions in contacts among long-term care facility (ltcf) residents on pandemic-related illness and deaths. our findings suggest that school closings would not have a substantial effect on pandemic-related outcomes in the absence of measures to reduce out-of-school contacts. however, if persons with influenza-like symptoms and their household contacts were encouraged to stay home, then rates of illness and death might be reduced by ≈ %. by preventing ill ltcf residents from making contact with other residents, illness and deaths in this vulnerable population might be reduced by ≈ %. restricting the activities of infected persons early in a pandemic could decrease the pandemic's health effects. three influenza pandemics have occurred during the th century (in , , and ), and another pandemic is inevitable ( ). 
the requirements for a pandemic virus include the existence of a new influenza a hemagglutinin for which there is little immunity, the ability of this strain to infect humans efficiently, and person-to-person transmission. such viruses are likely to arise in densely populated agricultural communities where contact between humans and birds or pigs is close and persistent ( ). in , a highly pathogenic avian influenza a (h n ) virus was transmitted from live poultry to humans in hong kong special administrative region, people's republic of china, killing of infected persons ( ). from december through june , , the world health organization confirmed human cases and deaths associated with influenza a (h n ) infections in humans ( ), and in october , influenza a (h n ) infections among birds were identified for the first time in europe. currently circulating influenza a (h n ) viruses appear to infect humans infrequently, and person-to-person transmission, if it occurs, is certainly not efficient. however, international health officials are concerned that, as human exposure to such viruses increases, so does the possibility that a pandemic virus might appear. the next influenza pandemic in the united states could result in , to , deaths, , to , hospitalizations, and to million outpatient visits, with a direct economic effect between us $ and $ billion, according to one set of estimates ( ). others have described the possible effects of vaccine and antiviral interventions. one study estimated that vaccinating % of the population would be necessary to achieve optimal cost benefits, assuming that development and mass production of a vaccine would require - months after the pandemic virus was characterized ( ). longini et al. ( ) estimated the effectiveness of rapid targeted antiviral prophylaxis of persons early in a pandemic by using epidemic stochastic simulations. 
they found that if the next pandemic virus had a similar virulence to that of the - pandemic virus, then delivering prophylaxis to % of exposed persons for up to weeks could reduce attack rates by %- % and death rates by . - . / , persons. however, such a strategy would require a stockpile of . billion doses of antiviral agents, which exceeds the current production capacity for these drugs for at least the next years. in the absence of adequate supplies of vaccines and antiviral agents, at least during the first wave of an influenza pandemic, public health officials should consider using interventions designed to reduce the number of contacts between infected or exposed persons and susceptible persons. the us department of health and human services influenza pandemic plan discusses several possible containment strategies, including those directed to single persons or entire communities ( ) . we used new stochastic simulation models to estimate the effects of several interventions of this kind. these models represented the spread of a pandemic in an urban us community, allowing for contacts in different settings (or mixing groups), including households, daycare centers, schools, workplaces and long-term care facilities (ltcfs). by using the age distribution of the us population ( ), we placed each person in the community in a stratum, defined by age group and (if > years of age) by residence in the community or in an ltcf. person-to-person transmission probabilities depended on the daily duration of contacts. contact rates and their duration varied by each person's stratum and mixing groups. 
by using these models to simulate an influenza pandemic, we estimated the effects of school closings, home confinement of ill persons (i.e., isolation) or their household contacts (i.e., quarantine), and reduction of contacts among residents of ltcfs on overall illness attack rates, hospitalization rates, and mortality rates. we simulated an influenza outbreak in a small urban us community. the simulation model used data from the asian influenza a (h n ) pandemic in - ( ) and from studies on us influenza-related excess rates of hospitalizations and death ( ) ( ) ( ) . the simulation process begins with the generation of a community of households, where the distributions of sizes of the households and ages of the household members follow the us census. every person in the community belongs to of age-dependent strata: preschool children (ages birth- years), schoolchildren (ages - years), adults (ages - years), seniors (ages > years) living at home, and seniors (ages > years) living in an ltcf. in addition, each person belongs to > mixing groups, according to his or her stratum: households, daycare centers, schools, workplaces, ltcfs, and the community. the mixing matrix is presented as table . on any given day, a susceptible person, a, makes contacts with other persons that may lead him or her to become infected. these contacts take place in each of a's mixing groups. the probability that person a becomes infected depends on the following input parameters: ) the number of different persons with whom person a has contact in each mixing group, ) the total duration, in minutes, of all the contacts with each of these persons, and ) the per-minute rates of infection transmission if the contacted person is infectious. the number and duration of contacts may be different on weekdays and weekend days. the values of the parameters that were used in this study are presented in the online supplemental materials appendix (available at http://www.cdc.gov/eid/content/ / / . htm). 
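the dependence of infection risk on contact duration, as listed in the three input parameters above, can be sketched as follows. this is a minimal illustration and not the authors' implementation, which is specified in their supplemental appendix. it assumes that each minute of contact with an infectious person independently transmits infection with a fixed per-minute probability, so that the daily probability of infection is one minus the product of the per-contact escape probabilities. the function name, the data layout, and any numeric values are hypothetical.

```python
from typing import Iterable, Tuple

# Each contact is (total_minutes, per_minute_rate, contact_is_infectious).
Contact = Tuple[float, float, bool]


def daily_infection_probability(contacts: Iterable[Contact]) -> float:
    """Probability that a susceptible person is infected on one day.

    Assumption (not taken from the source): minutes of contact act as
    independent trials, so the chance of escaping infection from one
    infectious contact is (1 - rate) ** minutes.
    """
    escape = 1.0
    for minutes, rate, infectious in contacts:
        if infectious:
            escape *= (1.0 - rate) ** minutes
    return 1.0 - escape


if __name__ == "__main__":
    # Hypothetical day: 120 minutes with an infectious household member,
    # 30 minutes with a non-infectious coworker.
    day = [(120.0, 0.001, True), (30.0, 0.0005, False)]
    print(daily_infection_probability(day))
```

under this reading, risk grows with the total duration of contact rather than with the number of separate encounters, which matches the first of the model features the authors describe.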
once person a becomes infected, he or she undergoes a latent period, followed by a period in which he or she is infectious. the mean length of the latent and infectious periods are input parameters. this model has new features that are not shared by the commonly used simulation models (such as the model in [ ] ) for transmission of influenza: ) the probability of transmission depends on the total duration of all contacts between persons, rather than on the number of times they make a contact, ) the transmission parameters do not depend on the population size, and ) different contact parameters can be specified for weekdays and weekend days. technical details of the simulation model are presented in the supplemental materials appendix. the basic reproductive number (r ) for this model is . . this value is within the range ( . - . ) estimated by mills et al. ( ) for the influenza pandemic. the interventions we examined in this simulation study were school closings, confinement of ill persons and their household contacts to their homes, and reduction in contact rates among residents of ltcfs. interventions were implemented at the start of the outbreak. when this intervention was implemented, schools closed when the prevalence of illness among children in the school exceeded a predetermined threshold, set to %, %, or % in the simulations. a school remained closed for a predetermined period ( , , or days). on weekdays, household and community contact parameters of children whose school was closed were assigned their weekend levels; their contacts with other children who continued to attend school and with working adults did not change. when this intervention was implemented, a given fraction of households were assumed to comply. if a household complied, then all of its members followed the confinement rules unless they had been previously ill and had recovered. 
we considered types of confinement: ill persons only, and ill persons and all the members of the same household. confinement began after a given number of days of illness ( , , or days) and did not depend on the severity of illness. if symptoms were severe, then the person reduced his or her duration of contacts with other household members by %. when a person was confined on a weekday (because of his or her illness or illness of another household member) and did not withdraw due to severe symptoms, then the duration of contacts with household members who continued to go to school or work did not change. durations of contacts with household members who stayed at home and were not withdrawn were the same as on a weekend day. when ill persons were confined, they returned to school or work day after their illness ended. when ill persons and other household members were confined, a person returned to school or to work day after his or her illness ended (even if other ill persons remained in the household). a person who did not become ill returned to school or work on the third day after the last day of illness of any household member (because the length of the latent/incubation period was assumed to be days). we examined the effects of interventions on ltcf residents: reduction in duration of contacts with other residents who were ill, and reduction in duration of contacts with visiting family members. contacts with ltcf staff did not change. we first ran a set of simulations using the baseline settings for all the parameters, without any interventions (online supplemental materials appendix). the average rates for the outcomes of interest (overall illness rate, hospitalization rate, and death rate) were calculated for simulations and used as baseline rates. for each intervention, we ran a set of simulations and used the averages of these simulations as estimates of the expected rates under this intervention. 
the effectiveness of each intervention was defined as follows:
effectiveness = [(baseline rate) - (rate with intervention)] / (baseline rate)
we performed a sensitivity analysis to assess the robustness of our findings regarding the effectiveness of the modeled interventions. in common with all simulation studies, our findings depended on several parameters for which we have estimated values that we believe are reasonable starting points. these values included baseline contact rates, the probability of illness given infection, the relative infectiousness of an infected person without influenza symptoms, the probability of withdrawal to home because of severe symptoms, and the reduction in contact rates due to severe symptoms. we varied the values of these parameters and examined the effects of these changes on estimates of the effectiveness of school closings and confining ill persons to their homes. based on the simulations conducted with the baseline values of the pandemic parameters, the baseline rate of illness was . %, ( % confidence interval [ci] . %- . %), the baseline rate of hospitalization was . / , ( % ci . - . ) and the baseline rate of death was . / , ( % ci . - . ). these results were based on the assumption that the illness rates would be similar to their values in the influenza pandemic. two parameters affected the effectiveness of school closings: the percentage of ill schoolchildren required to close a school and the number of days the school remained closed. the effectiveness of the intervention varied as a function of the percentage of ill persons required for closing a school and the duration of the closure (figure ). for example, if each school were closed for days when the proportion of ill children exceeded %, then the overall illness rate was . ( % ci . - . ). the baseline illness rate was . ; therefore, the effectiveness of this intervention was ( . - . )/ . = . ( % ci . - . ). 
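the effectiveness measure defined above is a relative reduction in an outcome rate. a minimal sketch of the calculation follows; the function name and the example rates are hypothetical and are not the study's results.

```python
def effectiveness(baseline_rate: float, intervention_rate: float) -> float:
    """Relative reduction: (baseline - with intervention) / baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - intervention_rate) / baseline_rate


if __name__ == "__main__":
    # Hypothetical rates averaged over simulation runs (illustrative only).
    print(effectiveness(baseline_rate=0.40, intervention_rate=0.30))
```

a value of zero means the intervention left the rate unchanged, and a negative value means the rate rose under the intervention, which the text notes can happen for some groups when schools close.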
as expected, effectiveness usually decreased as the percentage of ill children required to close a school increased. the effect of the length of closure was less clear (figure ). when schools were closed, transmission in households and in the community increased; thus, school closings could increase death and illness rates in some groups. for example, when the illness rate required for school closing was %, then closing schools for days had the largest effect on hospitalization rates, compared with closings of or days. however, when the rate for closing was %, then closing schools for days had a smaller effect on hospitalization rates than closing for or days. in our models, confinement to home took place after a person showed symptoms of influenza. a delay of , , or days occurred between onset of symptoms (which coincided with the onset of infection) and the beginning of the confinement period. this delay and the proportion of households that complied with the confinement rules affected the effectiveness of the intervention. figure presents the effectiveness of these interventions as a function of the percentage of households that comply (between zero and %) for a delay of days. as expected, effectiveness usually increased with the compliance percentage. confining the ill persons and their household members was more effective than confining the ill persons only. for example, given a delay of days and % compliance, the effectiveness of these interventions on illness rates was . for confining the ill only and . for confining ill persons and their household members. effectiveness decreased when the length of the delay was increased. reducing contacts with ill residents of ltcfs decreased the rates of illness, hospitalization, and death for ltcf residents by > % (table ). reducing contacts also decreased the rates of hospitalization and death in the general population by up to % and %, respectively. 
figure presents the dynamics of the pandemic (a) without any intervention, (b) when schools are closed for days as the proportion of ill children exceed %, and (c) when ill persons and all their household contacts are confined after the second day of illness of the index case-patient and compliance is %. we see that these interventions do not affect the time to the peak of the pandemic (around week ). the rate of decline following the peak does not change under confinement to home, while it slightly decreases under school closing. the value of the basic reproductive number (r ) for the baseline setting of our parameters is . . because this value is higher than values used in recent simulation studies ( , ) , we evaluated the effectiveness of the interventions under smaller values of r . we found that reducing r resulted in an increase in the effectiveness of confinement to home and a decrease in the effectiveness of school closings. thus, our findings regarding the effectiveness of confinement and the lack of effectiveness of school closings remain valid for smaller values of r . the results of additional sensitivity analyses were as follows. emerging infectious diseases • www.cdc.gov/eid • vol. the most important parameters related to the effectiveness of school closings are those that underlie the contacts between children while they are in school. in our simulations we assumed that on a school day each child makes contact with other schoolchildren, each contact lasting minutes (see section d. .a in the online supplemental materials appendix). some of these contacts may be concurrent. to examine the effect of changing each child's exposure to other schoolchildren on the effectiveness of school closures, we increased and decreased the baseline duration of minutes by %. table shows the effectiveness of closing schools for days for the baseline values of duration of school contact. 
as we see, longer or shorter durations of contact while schools are open do not result in substantial changes in the effectiveness of school closings. we varied the values of several parameters in the baseline model and examined the effects these changes had on estimates of the effectiveness of confinement of ill persons to their homes (table ). we assumed that % of ill persons without severe symptoms were confined to home within days of symptom onset. when the fraction of infected persons who developed symptoms was increased from . to . , then the illness rate without an intervention (i.e., at the baseline level) changed only from . to . , while implementation of the intervention changed this rate from . to . . thus, the effectiveness of this intervention increased from . to . . the alternative values we used in table modeled a more severe pandemic than the pandemic modeled with the baseline initial values. the continuing epizootic of influenza a (h n ) among birds in asia and europe has raised concerns that the likelihood of an influenza pandemic may be increasing. shortages in the supply of neuraminidase inhibitors, the antiviral agents most likely to be effective against a pandemic influenza strain, and the months needed from the isolation of a pandemic strain until the availability of vaccine suggest that reducing contact rates between infected and uninfected persons will represent one of the few sets of interventions that can be rapidly implemented. we used a stochastic simulation model to estimate the effectiveness of several interventions that could reduce contact rates on pandemic-related outcomes. the use of individual-level (e.g., isolation and quarantine) and community-level (e.g., school closings) containment measures. our study considered possible interventions of both kinds, including early identification and confinement of case-patients and their household contacts, limiting visits to ltcfs, and closing of schools. 
our findings suggest that closing schools would result in relatively small reductions in morbidity and mortality rates during a pandemic. for example, when schools were closed when > % of children had influenza symptoms and remained closed for days, the rates of illness, hospitalization, and death decreased from the baseline rates of . %, / , , and / , to . %, / , , and / , , respectively. thus, the effectiveness of school closings was ≈ %- %. when we increased the threshold of illness incidence required for school closing to %, then these rates were . %, / , , and / , , respectively. these mild decreases in the rates of illness and death after school closures are explained by the fact that in our models, children whose schools were closed were more likely to increase their contacts with other groups. the attack rate of % that we used for school-age children may be considered high. however, if the attack rate were reduced, school closings would have an even smaller effect. our results do not contradict recent findings that vaccination of schoolchildren could be effective in controlling transmission during a seasonal influenza epidemic ( ) . vaccination of children reduces their chances of infection and of transmitting infection to household and community contacts, whereas closing schools may not decrease the likelihood of infection substantially and could increase the probability that an infected child will infect household and community contacts ( ) . the effect of school closings on overall illness rates in an influenza pandemic has been estimated in other recent simulation studies. germann et al. ( ) modeled the effect of a pandemic on the entire us population. they found that for r > . , closing of schools without any additional interventions had limited effectiveness. on the other hand, for r < . , school closings reduced the extent of illness. carrat et al. 
( ) , by using a simulation model for the spread of influenza in a community, found school closings to be effective. we believe that these inconsistencies in the reported effects of school closings depend on the details of the various simulation models, especially on the way the community is affected by school closing in terms of increased contact rates of schoolchildren when their school is closed. our simulations predict that it might be possible to decrease illness and death rates by as much as % by reducing the contact rates of all ill persons. however, achieving this level of effectiveness would require persuading % of those with symptoms to withdraw to their homes and confine themselves. simulation studies by longini et al. ( ) and ferguson et al. ( ) found that quarantine, when used in conjunction with vaccines and antiviral agents, would be effective in containing an influenza pandemic in southeast asia. one should remember that the effectiveness of any behavioral/social intervention may vary across cultures. residents of ltcfs are likely to be at high risk for serious pandemic-related illness and death. we found that by limiting contacts of ill residents, illness and death may be reduced among other residents. these are notable findings, as this vulnerable population responds poorly to seasonal influenza vaccination, and they are unlikely to receive the limited quantities of pandemic vaccine when it first becomes available. the effectiveness of any particular intervention designed to reduce contact rates depends on the initial values selected for the parameters affecting influenza transmission (e.g., contact durations, probability of withdrawal due to severe symptoms), and a limitation of our study is that few data exist on which to base these values. studies designed to obtain reliable estimates of these parameters during seasonal, interpandemic influenza outbreaks should be a high priority. 
however, the major findings of this study seem to be robust, given a range of realistic values for the parameters we used. the target attack rates we used to calibrate the contact parameters (provided in the supplemental materials appendix) are high, but lowering these attack rates should not have a major effect on our findings, because both the pre-and postintervention incidence rates would decrease concomitantly. we did not make formal estimates of the economic costs and benefits of the interventions we examined. however, some likely consequences of school closings may be considered, given current childcare practices. obviously, the longer the duration of school closure, the more costly the consequences as working parents either have to take time off work to supervise children or pay for somebody else to care for them. if a large number of school days are lost, school districts might consider extending the school year, which would incur additional costs, although the conditions would be expected to vary greatly between school districts. these increased costs would have to be weighed against the limited predicted effectiveness of this intervention. encouraging the voluntary withdrawal of ill persons appears to be a more effective strategy than school closings in reducing the impact of a pandemic, and it may represent a relatively inexpensive intervention. however, researchers have found that us workers routinely miss < day of work after reporting onset of influenzalike illness ( ) . encouraging longer durations of work loss could decrease compliance with self-isolation and increase the economic cost per case avoided. home quarantine of the immediate family members of an ill person would likely increase the costs per case averted. for example, during the quarantine efforts related to the severe acute respiratory syndrome outbreak in toronto ( ) , many families found it too expensive to rigidly comply with a household-level quarantine of ≥ days. 
our stochastic simulation model has several strengths. the model considers the length of time persons are in contact, in addition to the total number of contacts. the model parameters we used are not related to the size of the simulated population, unlike previous models ( ) . we repeated the simulations conducted for this study with a population twice as large as the original population and the same input parameters. the resulting rates were almost unchanged, so the differences can be attributed to the random effects associated with these simulations. the weaknesses of our present model are that it requires many input parameters and that it does not include the effects of antiviral medications. our model allows for estimating vaccine effects for susceptibility and infectiousness; however, this option was not used in the present study. on during a severe pandemic, including isolation of persons with confirmed or probable influenza, voluntary home quarantine of members of households with confirmed cases, dismissal of students from schools and school-based activities, and closure of childcare programs. during a pandemic with a severity index of or (defined as a case fatality rate of > %), this new guidance recommends not only school dismissals of ≤ weeks but also measures to protect children from being exposed or exposing others to the pandemic virus via reduction of their out-of-school social contacts and community mixing. in this article, we assessed the effectiveness of school closures of - weeks duration after school absenteeism rates reached high levels. we assumed that children dismissed from schools would increase their out-of-school contacts. these assumptions reduced the effectiveness of school closures in our model. in future work, we will explore the effectiveness of early dismissal of students from schools, together with changes in out-of-school contacts, and other interventions using our model. 
in summary, if persons who suspect they are infected with pandemic influenza virus were to withdraw to their homes quickly, the rates of illness and death associated with a pandemic may be substantially reduced. the withdrawal of all household contacts may further reduce rates of illness and death, but this additional intervention is likely to be relatively costly and difficult to implement. restricting the movement of ill ltcf residents will be beneficial in reducing their adverse health outcomes. before early and rapid implementation of such interventions during a pandemic is feasible, the public will need to be educated about the early symptoms of influenza and measures developed to increase the social acceptability of self-isolation when ill. influenza pandemic preparedness plan for the united states influenza pandemic planning casecontrol study of risk factors for avian influenza a (h n ) disease, hong kong confirmed human cases of avian influenza a (h n ) the economic impact of pandemic influenza in the united states containing pandemic influenza with antiviral agents united states department of health and human services pandemic influenza plan profiles of general demographic characteristics for the united states mortality associated with influenza and respiratory syncytial virus in the united states influenza-associated hospitalizations in the united states estimation of influenza-associated deaths and hospitalizations in the united states transmissibility of pandemic influenza containing pandemic influenza at the source strategies for containing an emerging influenza pandemic in southeast asia strategy for the distribution of influenza vaccine to high-risk groups and children mitigation strategies for pandemic influenza in the united states a 'small-world-like' model for comparing interventions aimed at preventing and controlling influenza pandemics effectiveness and cost-benefit of influenza vaccination of healthy working adults: a randomized controlled 
trial factors influencing compliance with quarantine in toronto during the sars outbreak interim pre-pandemic planning guidance: community strategy for pandemic influenza mitigation in the united states-early, targeted, layered use of nonpharmaceutical intervention we thank martin meltzer for his thoughts on the potential economic consequences associated with the interventions we modeled and keiji fukuda for his comments on early versions of the manuscript. m.j.h. was partially supported by contract ipa from the centers for disease control and prevention.dr haber is a professor of biostatistics at the rollins school of public health, emory university, atlanta. his research focuses on statistical models and methods for infectious disease epidemiology. key: cord- -qwq b a authors: kretzschmar, mirjam; van den hof, susan; wallinga, jacco; van wijngaarden, jan title: ring vaccination and smallpox control date: - - journal: emerg infect dis doi: . /eid . sha: doc_id: cord_uid: qwq b a we present a stochastic model for the spread of smallpox after a small number of index cases are introduced into a susceptible population. the model describes a branching process for the spread of the infection and the effects of intervention measures. we discuss scenarios in which ring vaccination of direct contacts of infected persons is sufficient to contain an epidemic. ring vaccination can be successful if infectious cases are rapidly diagnosed. however, because of the inherent stochastic nature of epidemic outbreaks, both the size and duration of contained outbreaks are highly variable. intervention requirements depend on the basic reproduction number r( ), for which different estimates exist. when faced with the decision of whether to rely on ring vaccination, the public health community should be aware that an epidemic might take time to subside even for an eventually successful intervention strategy. 
recently, concerns about a bioterror attack with the smallpox virus or other infectious disease agents have risen ( , ) . while new vaccines with fewer adverse consequences are being developed ( ), the existing vaccines, which have potential side effects and may be lethal, are the only vaccines available ( , ) . in the united states during the recent voluntary smallpox vaccination program, a limited number of healthcare workers volunteered for vaccination because of the risks associated with vaccination and the low risk for infection ( ) . if an outbreak occurs, vaccination strategies include ring vaccination around diagnosed cases of smallpox or a mass vaccination to begin as soon as the first cases are diagnosed. without natural smallpox infections, practical experience with ring vaccination against smallpox cannot be gained; accounts of the vaccination programs that eradicated smallpox in the s are the only source of information ( ) . combined with information collected during the last decades of smallpox circulation, mathematical modeling offers a tool to explore various vaccination scenarios if an outbreak occurs ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) . we investigated which conditions are the best for effective use of ring vaccination, a strategy in which direct contacts of diagnosed cases are identified and vaccinated. we also investigated whether monitoring contacts contributes to the success of ring vaccination. we used a stochastic model that distinguished between close and casual contacts to explore the variability in the number of infected persons during an outbreak, and the time until the outbreak is over. we derived expressions for the basic reproduction number (r ) and the effective reproduction number (r υ ). 
we investigated how effectiveness of ring vaccination depends on the time until diagnosis of a symptomatic case, the time to identify and vaccinate contacts in the close contact and casual contact ring, and the vaccination coverage required to contain an epidemic. the model describes the number of infected persons after one or more index cases are introduced. it simulates a stochastic process in which every infected person generates a number of new infections according to a given probability distribution. this process implies that contacts of different infected persons are independent of each other and that no saturation of the incidence occurs at higher prevalence. the model is applicable for the first few generations of infection, if the outbreak goes unchecked, and for the complete outbreak if it is contained. we summarize the main features of the model; the formal model definition is given in the appendix. the noninfectious state (incubation period plus prodromal phase) lasts - days ( , , ) with specified probabilities per day of moving to the infectious state. the assumption that infectivity during the prodromal phase is negligible is supported by a recently published statistical analysis of outbreak data ( ) . the duration of the infectious state d i is days ( , , ) , with variable infectiousness during that time ( , , ) . the probability of transmission per contact p τ , where τ denotes the day of the infectious period, is high at the beginning and low at the end of the infectious period ( figure a ). at the end of the infectious period, a person either recovers or dies. the case-fatality rate is % ( ) , which is an average value for the case-fatality rate of variola major. transmission takes place in two rings of contacts: ) household and other close contacts, and ) more casual face-to-face contacts. we assumed that in the close contact ring the probability of transmission is five times higher than in the casual contact ring (g = . ). 
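the branching process described above, in which every infected person independently generates new infections from a given probability distribution and no saturation occurs, can be sketched generically as follows. this is an illustration under simplifying assumptions and not the model defined in the authors' appendix: it tracks generations of infection rather than days, and the offspring distribution is supplied by the caller. the function name and parameters are hypothetical.

```python
import random
from typing import Callable, List


def simulate_branching(
    offspring: Callable[[random.Random], int],
    index_cases: int = 1,
    max_generations: int = 10,
    seed: int = 0,
) -> List[int]:
    """Return the number of infected persons in each generation.

    `offspring` draws the number of new infections caused by one case.
    Draws are independent across cases, so there is no depletion of
    susceptibles (appropriate only for early or contained outbreaks).
    """
    rng = random.Random(seed)
    sizes = [index_cases]
    current = index_cases
    for _ in range(max_generations):
        if current == 0:
            break  # the outbreak has died out
        current = sum(offspring(rng) for _ in range(current))
        sizes.append(current)
    return sizes


if __name__ == "__main__":
    # Hypothetical offspring rule: 0, 1, or 2 new cases with equal probability.
    print(simulate_branching(lambda rng: rng.choice([0, 1, 2])))
```

repeating such a run with different seeds gives the spread of outbreak sizes and durations that the authors emphasize as a feature of stochastic outbreaks.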
the number of contacts on day τ of the infectious period in the close contact ring follows a poisson distribution with mean µ τ ( ) , and in the casual contact ring this number follows a negative binomial distribution with mean µ τ ( ) . the values (µ τ ( ) = and µ τ ( ) = . ) were chosen such that the total number of contacts per day was comparable to numbers observed in empirical studies ( ) (figure b ). for every contact, the event of transmission is determined by the infectiousness by day of the infectious period. most people can be infected again, and smallpox can develop - years after vaccination ( , ) . while residual immunity might lower the case-fatality rate, it might also lead to a later diagnosis for infected persons because disease symptoms are milder. an infection with milder symptoms is probably less infectious, but an infectious person might have more contacts with others because he or she feels less impaired by disease symptoms. hence, the net effects of residual immunity are difficult to assess. we assumed that all persons are equally susceptible, and that no protective immunity remains in the population from previous vaccination. the basic reproduction number r describes the average number of secondary cases produced from contact with an infected person during the infectious period and without intervention. the number can be computed as the sum of the reproduction numbers in the close contact and casual contact ring. for the baseline parameter values given in the table, r = . , which, when broken down by rings of contacts, gives r ( ) = . and r ( ) = . , i.e., . % of all transmissions take place in the close contact ring. we use the parameter a in the function describing the transmission probability (table) to vary the basic reproduction number, i.e., if we want to simulate an outbreak under the assumption that r is , we choose a accordingly. in the literature, the estimates given for r vary between - ( , , ) and - ( , ) . 
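to illustrate how a reproduction number can be assembled as the sum over the two rings, the sketch below adds up expected transmissions over the days of the infectious period. it assumes that the expected number of transmissions on day τ in a ring equals the mean number of contacts times the per-contact transmission probability p τ, that the mean number of contacts is the same on every day, and that the casual ring's transmission probability is scaled by the factor g. this is one reading of the text, not the formal definition in the authors' appendix, and all numeric inputs are hypothetical.

```python
from typing import Sequence, Tuple


def reproduction_numbers(
    p: Sequence[float],  # per-contact transmission probability by day tau
    mu_close: float,     # mean daily contacts, close contact ring
    mu_casual: float,    # mean daily contacts, casual contact ring
    g: float,            # relative transmission probability, casual ring
) -> Tuple[float, float, float]:
    """Return (R in close ring, R in casual ring, total R) without intervention."""
    r_close = sum(p_tau * mu_close for p_tau in p)
    r_casual = sum(g * p_tau * mu_casual for p_tau in p)
    return r_close, r_casual, r_close + r_casual


if __name__ == "__main__":
    # Hypothetical two-day infectious period with declining infectiousness.
    r1, r2, r0 = reproduction_numbers([0.5, 0.25], mu_close=2.0, mu_casual=10.0, g=0.2)
    print(r1, r2, r0, r1 / r0)  # last value: share of transmission in close ring
```

only the means of the contact distributions enter this calculation; the choice of poisson versus negative binomial affects the variability of outbreaks, not the reproduction number.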
figure . a, the transmission probability per contact by day of the infectious period; b, the probability distribution of the number of contacts with susceptible persons per day; c, the probability of remaining an undiagnosed, but infectious, case by day of the infectious period; and d, the mean (solid line) and the . % and . % percentiles (dotted lines) of the number of infected persons for simulation runs for an epidemic without any intervention after the introduction of one index case at the beginning of the incubation period at t = .
ring vaccination in the model includes complete isolation of symptomatic patients with diagnosed cases of smallpox and vaccination of (some or) all contacts of the diagnosed case-patient. in our baseline scenario, we assumed that vaccinated contacts are not isolated after vaccination and may therefore transmit the infection to others if they become infectious. in addition, we enhance the baseline intervention by including monitoring of identified contacts. the effectiveness of the intervention therefore is determined by the probability of diagnosis per day of the infectious period, the time needed to identify contacts of the close contact and casual contact ring, the vaccination coverage in the close contact and the casual contact ring, and whether contacts are monitored. some of those parameters (speed of diagnosis and time to identifying contacts) differ between the first index case in the population and cases occurring later in the epidemic. in figure , the timing of the key events in the chain of transmission and intervention is shown schematically. the index patient can cause new cases of infection between the beginning of the infectious period until diagnosis and isolation. for a secondary case, vaccination has to take place within days after infection ( ) to prevent disease.
we denote with δ τ the probability of diagnosis on day τ of the infectious period for those persons who have not been diagnosed before. from those probabilities, one can derive the probability that an infectious person is not yet diagnosed on day τ of his or her infectious period ( figure c ). by υ τ (i) , we denote the probability that a contact in ring i (i = or ), who was infected on day τ of the index patient's infectious period, will be vaccinated within days of being infected. in the appendix, υ τ depends on the diagnosis probabilities, the time needed for contact tracing, and the vaccination coverage. throughout, we assume that the vaccine efficacy is %. we can now determine an effective reproduction number r υ that describes the number of secondary cases caused by an index patient in a situation with intervention: a special strategy included in this formula is an intervention where only case isolation is performed without vaccination of contacts. the vaccination coverage c (i) is set to zero. equivalently, it describes the situation that no window period exists ( ) . the reproduction number can then be calculated as if vaccination is ineffective, but contacts are monitored, the monitoring will have the same effect on r as an effective vaccination, because the contacts will not be able to disseminate the virus any further (assuming a fully effective monitoring). therefore, r υ can be computed with the formula including vaccination, where the window period is now set to w = , of the full duration of the infectious period. this assumption means that regardless of when the index patient's condition is diagnosed, contacts can effectively be excluded from further transmission. if monitoring of contacts is not % effective, the parameter c (i) for the coverage can be used to express the extent of successful monitoring. the outbreak can be controlled if r υ < . in the table, the model parameters and their baseline values are listed. 
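the quantities defined above can be combined in a short python sketch. the first function derives the probability of remaining undiagnosed from the daily diagnosis probabilities δ τ; the second weights each day's expected transmissions by that probability and by the chance that the infected contact is not vaccinated in time. this is one plausible reading under stated assumptions, not the formula from the appendix: it assumes that a case transmits on a given day only if still undiagnosed at the start of that day, and that a contact vaccinated in time never becomes a case. all names and values are hypothetical.

```python
def prob_undiagnosed(delta):
    """delta[k] is the probability of diagnosis on day k+1 for a case not
    diagnosed before. Returns u, where u[k] is the probability of still
    being undiagnosed at the start of day k+1."""
    u = []
    remaining = 1.0
    for d in delta:
        u.append(remaining)
        remaining *= 1.0 - d
    return u


def effective_reproduction_number(p, delta, mu, g, upsilon):
    """Sketch of R_v under the assumptions stated in the lead-in.

    p[k]          : transmission probability per close contact on day k+1
    mu[i]         : mean daily contacts in ring i (0 = close, 1 = casual)
    g[i]          : relative transmission probability in ring i
    upsilon[i][k] : probability that a contact in ring i infected on day
                    k+1 is vaccinated in time
    """
    u = prob_undiagnosed(delta)
    total = 0.0
    for k in range(len(p)):
        for ring in range(2):
            expected = mu[ring] * g[ring] * p[k]
            total += u[k] * expected * (1.0 - upsilon[ring][k])
    return total


# Hypothetical two-day example: case isolation only (no vaccination).
p = [0.2, 0.1]
delta = [0.5, 0.5]
no_vaccination = [[0.0, 0.0], [0.0, 0.0]]
r_v = effective_reproduction_number(p, delta, (2.0, 10.0), (1.0, 0.2), no_vaccination)
print(f"R_v with isolation only: {r_v:.2f}")
```

setting every entry of `upsilon` to zero corresponds to the case-isolation-only strategy mentioned in the text, and setting all diagnosis probabilities to zero recovers the reproduction number without intervention.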
in the appendix, the formal model definition is given. an epidemic starting with one index case in a completely susceptible population without intervention grows exponentially, if it survives early extinction. the large range of possible courses of the epidemic reflects the stochastic variability ( figure d ). if the intervention does not succeed in reducing the effective reproduction number r υ to below , the epidemic will continue to grow exponentially, albeit at a lower rate. for example, if diagnosed infectious persons are isolated, but no ring vaccination is performed, the effective reproduction number is r υ = . and the epidemic cannot be contained. if the intervention succeeds in reducing the effective reproduction number r υ to < , the size of successive generations of infected persons declines. for the parameter values of the baseline scenario given in the table, we have r υ = . and . % of all transmissions take place in the casual contact ring. the epidemic can then be contained and the virus eradicated. in figure a and b, the distribution of the total number of infected persons (excluding those who were vaccinated in time to prevent symptomatic infection) and of the time until recovery of the last infected patient is shown for simulation runs with the baseline parameters (table) . the time until the epidemic is over is quite variable: on average it takes days, in some cases it takes up to year (range - days). to contrast the baseline scenario, in figure c and d we show results for the case that r = . , i.e., twice the value of baseline scenario. to contain the epidemic, we now assumed that % of all contacts in the casual contact ring were vaccinated in time. the effective reproduction number r υ was . . a fraction of . % of transmissions took place in the casual contact ring. the mean number of infected persons during the epidemic was (range - ), excluding the infected contacts vaccinated in time and (range - , persons) including those contacts. 
the mean time to extinction was days (range to > days). the time to extinction can be very long when r υ is near because the epidemic can flare up again when a case by chance produces many secondary infections. a similar picture would result if, in addition to vaccinating casual contacts with a coverage of %, those contacts are monitored. the effective reproduction number is then . ; . % of transmissions are in the casual contact ring. the initial phase of the epidemic (time before discovery of the first case) is determined by the number of index patients that start the epidemic outbreak and by the time it takes to diagnose the first case. we varied those two variables separately while assuming that after diagnosis intervention took place within the parameters defined in the table, i.e., with an r υ of . (figure ). while the number of cases increases almost linearly with the number of index cases, the dependency on the time to diagnosis shows the influence of the variable infectiousness during the infectious period. in the beginning, when infectiousness is high, the number of infected persons increases rapidly. another rise occurs toward the end of the index patient's infectious period because the second-generation patients become infectious and produce the third generation of infected persons. once the second generation of infected persons has the opportunity to disseminate the infection further, the range of possible outcomes increases greatly (range - infected persons). a similar picture emerges for the time needed to extinguish the outbreak. the duration of the outbreak increases when diagnosis is delayed during the first few days of infectiousness, then stays on a stable level, and finally increases again when diagnosis is delayed towards the end of the infectious period ( figure d ). therefore, diagnosing the first index case before the second generation of infected persons start transmitting the virus is important. 
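the sensitivity of outbreak size and duration to an effective reproduction number close to the threshold can be illustrated with a deliberately simplified branching process in python. unlike the model in this paper, which tracks daily contacts in two rings, the sketch below gives every case a poisson-distributed number of secondary cases with mean r_eff and counts generations rather than days. the function names, run counts, and seeds are hypothetical choices for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method
    (adequate for the small means used here)."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_outbreak(rng, r_eff, index_cases=1, max_cases=100000):
    """Return (total infected, number of generations simulated)."""
    current = index_cases
    total = index_cases
    generations = 0
    while current > 0 and total < max_cases:
        new_cases = sum(poisson(rng, r_eff) for _ in range(current))
        total += new_cases
        current = new_cases
        generations += 1
    return total, generations


def summarize(r_eff, runs=5000, seed=1):
    """Mean outbreak size and longest outbreak (in generations)."""
    rng = random.Random(seed)
    results = [simulate_outbreak(rng, r_eff) for _ in range(runs)]
    mean_size = sum(size for size, _ in results) / runs
    longest = max(gens for _, gens in results)
    return mean_size, longest


for r_eff in (0.5, 0.9):
    mean_size, longest = summarize(r_eff)
    print(f"R_eff={r_eff}: mean size {mean_size:.1f}, longest run {longest} generations")
```

for a subcritical branching process started by one case, the expected total number of cases is 1/(1 - r_eff), so moving r_eff from 0.5 to 0.9 raises the expected size from 2 to 10 and makes long-lived outbreaks much more likely.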
if diagnosing the first case at the beginning of its infectious period is possible, the number of cases during the epidemic can be kept at a low level. among others, the value of r determines whether ring vaccination as defined above can contain an epidemic or not. as the value of r is uncertain ( , , ) , we studied r υ as a function of r ( figure ). in figure a , an intervention without monitoring of contacts is considered with various assumptions on how long tracing and vaccinating casual contacts take. in this case, ring vaccination can contain the epidemic if r is < , and contacts can be traced within days. in figure b , monitoring of contacts is added to the intervention. in that instance, the epidemic can be contained up to an r of . in figure a and b, we assumed that % of all casual contacts can be identified and vaccinated or monitored. in figure , we show how the critical vaccination coverage needed to control the epidemic depends on r , or, more specifically, on the average number of daily contacts ( figure c ). in addition, we varied the baseline assumption about the time to diagnosis by shifting the probability of being diagnosed by n days towards a later time in the symptomatic period. in figure a , without monitoring of vaccinated contacts, a shift by day greatly increases the coverage needed to contain the epidemic. if diagnosis is delayed by > day, the chances of controlling the epidemic diminish greatly. if vaccinated contacts are monitored, the situation improves ( figure b) , and a high vaccination coverage in the casual contact ring ensures that the epidemic stays under control. finally, we looked at how differences in intervention effectiveness influence the duration of the epidemic and the cumulative number of infected persons. the effective reproduction number r υ was varied by decreasing the vaccination coverage in the casual contact ring stepwise to . . the effective reproduction number increased up to . 
(starting from the baseline value of . ). figures a and b show how the cumulative number of infected persons and the time to extinction increase with increasing r υ . the mean time until extinction approximately doubles to almost days, and the range of possible outcomes increases with maximum possible durations of > years. the mean number of infected persons increases by a factor of , and the range of possible outcomes increases such that epidemics with several hundreds of infected persons are possible. hence, if r υ is slightly < , the epidemic might take a long time to control, and the number of persons who become infected and die might be high. our simulation results show that a smallpox epidemic starting from a small number of index cases can be contained by ring vaccination provided the intervention measures are very effective. the time to diagnosis has proven to be an essential and sensitive parameter in determining the intervention effectiveness. the speed of diagnosis is less essential if identified contacts are isolated to prevent them from transmitting further if their vaccination fails. the time window limiting the success of vaccination then loses its importance for determining the effectiveness of intervention. the time to diagnosis of cases and the fraction of contacts found by contact tracing are then the key parameters. quick contact tracing would be even more essential if substantial transmission would take place during the prodromal period of infection as is assumed by some authors ( , ) . some limitations of our modeling approach should be kept in mind. first, we only consider epidemics that are started by a small number of index cases. the branching process approach does not allow for overlapping rings of contact, but we implicitly include such an effect by varying the effective transmission probability such that the distribution of transmissions over the infectious period agrees with empiric findings ( ) . 
in other words, the decreasing probability of contacting new susceptible persons during the infectious period is incorporated in the decreasing transmission probability per contact. for larger numbers of index cases, our approach can be viewed as a worst-case scenario. second, we assume that the population is completely susceptible, i.e., no residual immunity from vaccination in the pre-eradication era exists. this assumption means that if previously vaccinated persons cannot become infectious to others, our results are too pessimistic, whereas if they become infectious with mitigated symptoms, our results might be too optimistic.
emerging infectious diseases • www.cdc.gov/eid • vol. , no. , may
figure . the effective reproduction number r υ , which determines the success of intervention, is shown as a function of the basic reproduction number r for a vaccination coverage of % in the casual contact ring. in a, contacts are not monitored after vaccination; in b, all identified contacts are isolated and cause no further transmission. the different lines in a are for different assumptions about how long it takes to trace and vaccinate those contacts. in b, it does not make a difference whether it takes , , or days to find the contacts. if r is , the intervention will be successful in both cases; if r is , % coverage is no longer sufficient to curb the epidemic.
figure . the critical vaccination coverage in the casual contact ring is shown as a function of the basic reproduction number r for different assumptions about the time it takes to diagnose infectious persons. a, for the baseline assumption that diagnosis is very quick after the beginning of the infectious period, a low coverage is sufficient if r is , but for r around the coverage has to be at least % for the intervention to be successful. if the probability of being diagnosed shifts to later days of the infectious period, the situation quickly gets out of control and vaccination can no longer curb the epidemic. in b, the same is shown with the difference that here we assume that vaccinated contacts are successfully monitored such that they can no longer produce any secondary infections, even if their vaccination was too late to prevent them from becoming infectious. in this case a later diagnosis is not that influential, but nevertheless, if r is , the vaccination coverage (or the percentage of contacts identified and monitored) must be at least % to guarantee success. in c, the critical coverage of the casual contact ring is shown as a function of the average number of contacts per day, again for the situation where vaccination is combined with monitoring of contacts. the average number of contacts was varied by varying the number of daily casual contacts. the effect is similar to that of varying r through the transmission probability per contact as shown in b.
in the recent literature, other models, both stochastic and deterministic, of smallpox outbreaks have been introduced to analyze the effects of ring and mass vaccination ( , ( ) ( ) ( ) ( ) ( ) . on the basis of a low estimate for r , meltzer et al. ( ) concluded that even quarantine alone can control the epidemic, as can vaccination alone, if the transmission rate is reduced sufficiently. however, the model does not allow analysis of how intervention parameters determine the reduction of the transmission rate. on the other hand, kaplan et al. ( ) conclude that with a large number of initial cases, mass vaccination will prevent more deaths than will a vaccination strategy based on contact tracing. those authors explicitly take into account the limited resources available for tracing and vaccinating contacts, a limitation that is an important factor in large outbreaks. also, they assume that infectivity is high during the prodromal phase. as in the model by kaplan et al. ( ) , our model takes into account that the intervals between infection and diagnosis of an index case, and between diagnosis of an index case and tracing of the contact, may exceed the time window in which vaccination has to take place. this window limits the possible effectiveness of contact vaccination, a phenomenon termed "race to trace" by kaplan et al. while the model by kaplan et al. is based on differential equations with exponentially distributed sojourn times in different compartments, our model is a stochastic model that is able to deal with more realistic distributions for sojourn times in different disease states. also, our model can provide estimates for variability in outcomes (discussion in [ ] ). in a study by halloran et al. ( ) , a stochastic model for smallpox outbreaks in small, structured communities is described. in some respects the model is similar to ours, namely, that there is a distinction between household contacts and other contacts in the community with differing transmission probabilities. the values of most biologic parameters are choices similar to ours, with the exception of halloran's assumption that persons are highly infectious during the prodromal phase. an important difference between the models is the natural limitation of the number of infected persons in any epidemic attributable to the rather small community size in halloran's model. the nonlinear effects of saturation play a rather large role in determining the outbreak size, especially for less effective intervention measures. also, halloran's model seems to be too complex to derive an explicit formula for the basic reproduction number r , and thus makes sensitivity analysis of the results based on that quantity much more tedious.
one of the main differences, however, is that the way vaccination is incorporated into the model does not allow investigation of the effects of those parameters that largely determine the success of intervention, namely the time to diagnosis of new cases and the time needed to trace contacts. in a small closed community such as the one described in the model, tracing of contacts is not that difficult, but in modern society with its increasing mobility the tracing of casual contacts can pose a big problem. the main difference between the modeling approach of bozette et al. ( , ) and our model is that in bozette's model r and r υ are set to prescribed values, while in our model those numbers can be derived from measurable quantities inherent to the transmission and intervention process. in comparison to the estimate of r υ of . in our baseline scenario with vaccination and monitoring of contacts, bozette et al. assume a much lower value of . . compared to our results, their results are therefore rather optimistic, but they cannot relate the assumed value of r υ to r or to parameters describing the process of contact tracing and vaccination. finally, eichner ( ) recently published a modeling study that uses a simulation model to assess the effectiveness of case isolation and contact tracing. the modeling approach and choice of parameter values resemble ours, but the intervention is modeled in a more phenomenologic way by defining the outcomes of intervention without explicitly including intervention-related parameters in the model. this does not allow for an explicit calculation of the effective reproduction number based on intervention parameters, as is possible in our approach. with respect to preparing for a smallpox outbreak, alertness and ability to diagnose quickly are important. physicians and nurses need to be educated and the public needs to be more aware.
also, since we know little about the timing and effectiveness of identifying infectious persons and their contacts in case of a bioterror attack, obtaining more empirical information about contact patterns and contact tracing will be helpful. recently, some useful data about contact patterns have been collected during severe acute respiratory syndrome outbreaks, but a more systematic investigation of contact tracing is advisable. considering the uncertainties connected to all parameter values, we conclude that any contingency plan for use of ring vaccination must also identify the criteria under which switching to large-scale mass vaccination is justified.
by e t,τ we denote the number of persons infected at time t-τ who are still latent at time t, < τ < d e . by i t,τ we denote the number of persons who became symptomatic at time t-τ and are not yet in isolation at time t, < τ < d i . by q t,τ we denote the number of persons who became symptomatic at time t-τ and are in isolation at time t, < τ < d i . s τ denotes a bernoulli distributed random variable with mean γ τ , < τ < d e , where γ τ is the probability of moving to the infectious state on day τ of the latent period. d e is the maximum duration of the latent period. t τ denotes a bernoulli distributed random variable with mean p τ , < τ < d i , where p τ is the transmission probability per contact on day τ of the infectious period. d i is the duration of the infectious period. we assume that p τ can be well described by the functional form $p_\tau = a_1 \tau e^{-a_2 \tau}$ with $a_1, a_2 > 0$. c τ ( ) denotes a poisson distributed random variable with mean µ τ ( ) , < τ < d i describing the number of contacts per day in the close contact ring. c τ ( ) denotes a random variable with a negative binomial distribution with parameters n τ and q, < τ < d i describing the number of contacts per day in the casual contact ring. the mean number of contacts per day in the casual contact ring is and the standard deviation is .
the transition from being undiagnosed infectious to isolation depends on the probability of diagnosis ∆ τ on day τ after the start of the infectious period. ∆ τ is a bernoulli distributed random variable with parameter δ τ , τ= ,...,d i . v τ ( ) and v τ ( ) are bernoulli distributed random variables with parameters υ τ ( ) and υ τ ( ) , respectively, with τ= ,...,d i . they describe whether a contact in ring or , respectively, will be effectively vaccinated within the time window of days after infection. the subscript τ refers to the day of the infectious period of the index case at the moment of transmission. the vaccination probabilities υ τ (i) for i= , depend on the probability that the index case is diagnosed and on the vaccination coverage as follows. the probability that an infectious person who has transmitted to a contact on day τ of the infectious period is diagnosed on day τ+j can be computed as for j= ,..., d i -τ+ . the probability for the infected contact to be vaccinated in time depends on the number of days w after infection that vaccination can still be effective, the number of days r (i) that are needed for tracing the contact, and the coverage c (i) (the fraction of contacts in ring i that are effectively immunized) for i= , . then we get for i= , . to describe the probability of diagnosis per day of the infectious period we use the functional form for τ > b , and δ τ = for τ < b with b > , and an integer b between and d i . in the simulations, different values for those parameters are chosen for the first index patient, who starts the epidemic, and for later cases, assuming that it takes longer to diagnose the first case because the public health system is not yet on alert. also, the time needed to find contacts and the vaccination coverage may vary between the first index case and later cases. the transitions through the different stages in time are described by the system of difference equations.
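the dependence of the vaccination probability on diagnosis delay, tracing time, and coverage can be sketched in python as follows. the sketch is an assumption-laden illustration, not the appendix formula: it assumes that the index case was undiagnosed at the start of the day of transmission, that diagnosis can occur on that same day (delay j = 0), and that a contact is vaccinated in time only if the diagnosis delay plus the tracing time does not exceed the window w. function names and values are hypothetical.

```python
def diagnosis_delay_distribution(delta, day):
    """delta[k-1] is the diagnosis probability on day k for a case not
    diagnosed before; `day` is the 1-based day of transmission.
    Returns q, where q[j] is the probability that the index case is
    diagnosed on day `day` + j, given it was undiagnosed at the start
    of `day`."""
    q = []
    undiagnosed = 1.0
    for k in range(day, len(delta) + 1):
        d = delta[k - 1]
        q.append(undiagnosed * d)
        undiagnosed *= 1.0 - d
    return q


def vaccinated_in_time(delta, day, window, tracing_days, coverage):
    """Probability that a contact infected on `day` is effectively
    vaccinated within `window` days of infection (assumed rule:
    diagnosis delay + tracing time <= window)."""
    q = diagnosis_delay_distribution(delta, day)
    max_delay = window - tracing_days
    in_time = sum(prob for j, prob in enumerate(q) if j <= max_delay)
    return coverage * in_time


# Hypothetical values: three infectious days, 50% daily diagnosis probability.
delta = [0.5, 0.5, 0.5]
print(vaccinated_in_time(delta, day=1, window=4, tracing_days=3, coverage=0.8))
```

under these assumptions the probability falls to zero as soon as the tracing time alone exceeds the window, which is the "race to trace" limitation discussed in the main text.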
the initial conditions for the case that the epidemic is started by one infected index case entering the population at the beginning of his latent period are given by at the end of the infectious period an infected patient either recovers and becomes immune or dies. death occurs in a fraction f of all cases, i.e., f denotes the case fatality. this implies that the number of deaths m t+ at time t+ is given by , where z is a bernoulli distributed random variable with parameter f , the case-fatality rate. the cumulative mortality from the start of the epidemic up to time t is given by . the model was implemented and run in mathematica . . the looming threat of bioterrorism smallpox as a biological weapon. medical and public health management usa to increase smallpox vaccine stockpile complications of smallpox vaccinations in : results of ten statewide surveys adverse events following smallpox vaccination-united states smallpox vaccination campaign in the doldrums smallpox and its eradication. geneva: world health organization modeling potential responses to smallpox as a bioterrorist weapon transmission potential of smallpox in contemporary populations correction: transmission potential of smallpox in contemporary populations emergency response to a smallpox attack: the case for mass vaccination containing bioterrorist smallpox a model for a smallpox-vaccination policy supplementary appendix . rand smallpox-vaccination policy model-model details, assumptions, and detailed results case isolation and contact tracing can prevent the spread of smallpox smallpox in europe, - transmission potential of smallpox: estimates based on detailed data from an outbreak bombay: the cotary book depot who mixes with whom? 
a method to determine the contact patterns of adults that may lead to the spread of airborne infections biological warfare and bioterrorism immunity conferred by smallpox vaccine infectious diseases of humans: dynamics and control the new cell culture smallpox vaccine should be offered to the general population smallpox bioterror response we used a stochastic model to describe the number of infected persons following the introduction of one or more index cases. the model simulates a discrete time-branching process. key: cord- - tkyv w authors: barrett, peter m; bambury, niamh; kelly, louise; condon, rosalind; crompton, janice; sheahan, anne title: measuring the effectiveness of an automated text messaging active surveillance system for covid- in the south of ireland, march to april date: - - journal: euro surveill doi: . / - .es. . . . sha: doc_id: cord_uid: tkyv w we report the effectiveness of automated text messaging for active surveillance of asymptomatic close contacts of coronavirus disease (covid- ) cases in the cork/kerry region of ireland. in the first weeks of the covid- outbreak, , close contacts received , automated texts. overall, contacts ( . %) reported symptoms which required referral for testing and ( . %) tested positive for covid- . non-response was high (n = , ; . %) and this required substantial clinical and administrative resources for follow-up. as part of ongoing efforts to control the spread of infection, national and international guidance recommends active surveillance of asymptomatic close contacts of confirmed cases of covid- [ ] [ ] [ ] [ ] [ ] . however, evidence for the effectiveness of active surveillance systems among community-based close contacts of cases of covid- has been limited to date. this study aimed to measure the effectiveness of an automated text-based active surveillance system which was used in cork/ kerry for the first weeks of the covid- response. 
during the study period from march to april inclusive, cases were defined according to clinical criteria (presence of fever/cough/shortness of breath) and laboratory detection of severe acute respiratory syndrome coronavirus- (sars-cov- ) nucleic acid in a clinical specimen [ ] . contact tracing was undertaken for all notified cases of covid- that arose in cork/kerry, in accordance with national protocols [ ] . contacts of confirmed cases were called individually by the department of public health (dph) for cork/ kerry and classified as casual (< min face-to-face exposure) or close (≥ min face-to-face exposure). close contacts who were symptomatic were referred for testing directly. asymptomatic close contacts were advised about the need to self-quarantine for days from the date of their last exposure to a confirmed case, and they were sent written information about their potential risk of infection with sars-cov- . they were offered the option of receiving a daily text message from the dph as part of active surveillance. those who declined were offered the option of a daily telephone call as an alternative, but are not included in the current analysis. participants' mobile telephone numbers were added to an automated text messaging system using text broadcasting software (saadian technologies, dublin, ireland). asymptomatic close contacts were texted every day from the day following their initial telephone call with the regional dph until the end of their -day follow-up period. text recipients were asked to provide a yes/no response to the question, "do you have new fever or cough or shortness of breath?" those who responded 'yes' were contacted directly by a clinician, assessed over the telephone and, if necessary, referred for priority testing for covid- . those who responded 'no' continued with active surveillance until the end of their -day follow-up period. 
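the handling of replies in this section, including the follow-up of non-responders, can be summarised as a small decision rule. the python sketch below is illustrative only: it is not the text broadcasting software used in the study, and the names `Action` and `triage` are hypothetical. a missing reply is represented by `None`.

```python
from enum import Enum


class Action(Enum):
    CLINICIAN_CALL = "contact directly by a clinician"
    CONTINUE_SURVEILLANCE = "continue daily texts"
    CLINICIAN_OR_NURSE_CALL = "contact by a clinician or a nurse"
    SEND_FOLLOW_UP_TEXT = "send one follow-up text"


def triage(reply, follow_up_sent):
    """Map one day's reply (or None for no reply) to the next action.

    reply          : text received from the contact, or None
    follow_up_sent : True if the single follow-up text was already sent
    """
    if reply is None:
        # Non-responders get one follow-up text, then a direct call.
        if follow_up_sent:
            return Action.CLINICIAN_OR_NURSE_CALL
        return Action.SEND_FOLLOW_UP_TEXT
    normalized = reply.strip().lower()
    if normalized == "yes":
        return Action.CLINICIAN_CALL
    if normalized == "no":
        return Action.CONTINUE_SURVEILLANCE
    # Free-text clinical queries or concerns instead of yes/no.
    return Action.CLINICIAN_OR_NURSE_CALL


print(triage("Yes", follow_up_sent=False).value)
print(triage(None, follow_up_sent=True).value)
```

every branch except a plain 'no' ends in manual work for staff, which is consistent with the workload that non-response and free-text replies generated in the study.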
those who responded with details of clinical queries or concerns (instead of responding 'yes') were contacted by a clinician or a nurse. non-responders were sent one follow-up text after h, and were then contacted directly by a clinician or a nurse if they did not respond to the second text. details of all responses to the text broadcast messaging system were exported to microsoft excel and collated. of those who had been tested for covid- , positive results were recorded on the computerised infectious disease reporting system, health service executive covid- tracker, or i.laboratory pathology results enquiry system. samples were tested in the national virus reference laboratory in dublin or in one of the regional microbiology laboratories in cork/kerry. results were verified from daily line listings received from each of these laboratories. all participants provided verbal consent during their initial telephone call with the dph to receive a daily text message and possible contact by a clinician or nurse. they had the option to withdraw from active surveillance at any time. if requested, they were provided with relevant information pertaining to data protection legislation and compliance with the general data protection regulation. in this study, we present aggregate data with no identifiable information. thus, ethical approval was not required. there were , asymptomatic close contacts added to the text-based active surveillance system and , texts were sent (mean: . texts per participant). the median age of respondents (or their parents/guardians) was years (range: months to years). in total, respondents ( . %) required clinical follow-up of whom ( . %) were female and ( . %) were male. the majority (n = ; . %) were referred for testing, and the results are shown in the table. overall, . % of close contacts were referred for testing and . % tested positive for covid- during follow-up. of those who required a clinical call-back, ( . 
%) did not meet criteria for testing; they had symptoms which were not deemed to be consistent with covid- , or else they sought clinical advice about returning to work, duration of self-quarantine or advice about family members or contacts. during the follow-up period, the national testing criteria for covid- also changed several times as knowledge of covid- and laboratory testing capacity evolved [ ] . six individuals who were referred for testing by their general practitioner (gp) were never swabbed because the eligibility criteria changed between ordering and time of testing and they no longer met the testing criteria. one test was returned as an invalid result and the individual did not wish to be re-tested. overall, the response rate to daily texts was high (n = , ; . %). nonetheless, the absolute number of non-responses was large (n = , ; . %) and this created a substantial workload for dph clinical and administrative staff. active surveillance has been recommended for close contacts of other coronavirus infections such as middle east respiratory syndrome coronavirus (mers-cov) [ ] and sars-cov- [ ] , but is considered too resource-intensive to be routinely recommended for other notifiable infectious diseases [ ] . in the current pandemic, regional public health teams are being challenged to use their finite resources as efficiently as possible to minimise onward transmission of covid- . early evidence from the covid- pandemic suggests that active surveillance of close contacts does increase case detection, which in turn facilitates earlier identification of additional contacts and limits onward transmission [ ] . in the first weeks of the covid- response in cork/kerry, . % of close contacts who consented to participate in active surveillance were referred for testing and . % tested positive for covid- . this is a higher detection rate than in a recent study from the united states where the positive case yield from active surveillance of close contacts was .
% [ ] . the world health organization has highlighted the need for robust electronic data capture tools to support efficient contact tracing and active surveillance of close contacts on a large scale [ ] . although our text message-based system resulted in the detection of additional positive cases and helped to break chains of transmission in the community, it was resource-intensive. it required manual data entry, daily data exports for follow-up and considerable input and oversight from clinical and administrative staff. in order to sustain active surveillance, extra resources are required in terms of staffing, robust it infrastructure and strong data protection safeguards. this has also been demonstrated recently in singapore where successful active surveillance mechanisms led to a high yield in positive cases [ ] . at the time of writing, several regional public health departments in ireland have discontinued active surveillance because of resource constraints. the system has been largely replaced by a centralised text messaging system for asymptomatic close contacts who are reminded to seek medical advice from their gp if they develop symptoms of sars-cov- infection, akin to passive surveillance. the overall effectiveness of any active surveillance system depends on the eligibility criteria applied in testing referrals and may also involve a value judgement over what constitutes an effective yield. to our knowledge, this is the first european study to measure the positive covid- yield from a text message-based active surveillance system. older people were more inclined to opt out or request follow-up by daily telephone calls rather than by text (data not shown). there was a lack of robust data on this cohort, partly because electronic data capture tools were lacking at the outset. further analysis of this cohort may have resulted in a greater understanding of the limitations of the text messaging system. 
strict national testing criteria were in place at times because of challenges in it infrastructure, limited laboratory capacity and large backlogs of test results with slow turnaround times owing to difficulties procuring reagents and physical swabs. these practical challenges, and the lack of testing of asymptomatic close contacts, are likely to have reduced the overall yield of positive results. furthermore, some text recipients indicated that they did not reply to daily texts because doing so involved a cost (if using pay-as-you-go mobile telephones), and this may have affected the response rate. at the time of writing, ireland has implemented testing for all symptomatic and asymptomatic close contacts of confirmed cases of covid- . if these criteria had applied during the study period, we may have had a higher yield of sars-cov- infections among this cohort. automated active surveillance systems can thus facilitate early identification of symptomatic close contacts and positive cases of covid- . however, such systems require resourcing with robust it infrastructure, sufficient laboratory capacity and dedicated clinical and administrative support. statement from the national public health emergency team -saturday report prepared by hpsc on / / for nphet. dublin: health service executive national interim guidelines for public health management of contacts of cases of covid- . v . . dublin: health service executive global surveillance for covid- caused by human infection with covid- virus. interim guidance. geneva: who contact tracing: public health management of persons, including healthcare workers, having had contact with covid- cases in the european union -second update updated: public health management of cases and contacts associated with coronavirus disease (covid- ) supporting the covid- pandemic response: surveillance and outbreak analytics covid- case definitions. dublin: health service executive health protection surveillance centre. covid- information. 
dublin: health service executive risk assessment guidelines for infectious diseases transmitted on aircraft (ragida) middle east respiratory syndrome coronavirus (mers-cov) who guidelines for the global surveillance of severe acute respiratory syndrome (sars) chapter : surveillance indicators. in: manual for the surveillance of vaccine-preventable diseases evaluation of the effectiveness of surveillance and containment measures for the first patients with covid- in singapore active monitoring of persons exposed to patients with confirmed covid- -united states contact tracing in the context of covid- . interim guidance. geneva: who we are very grateful to all staff in the department of public health hse-south (cork and kerry) who contributed to the identification and clinical management of close contacts of covid- cases. none declared. peter barrett and anne sheahan conceived the study. rosalind condon and janice crompton managed the active surveillance system and collated the data on close contacts. peter barrett and niamh bambury linked the data on close contacts to testing referrals and laboratory results. peter barrett, niamh bambury and louise kelly drafted the initial manuscript. all authors reviewed the draft for important intellectual content and approved the final version. this is an open-access article distributed under the terms of the creative commons attribution (cc by . ) licence. you may share and adapt the material, but must give appropriate credit to the source, provide a link to the licence and indicate if changes were made. any supplementary material referenced in the article can be found in the online version. key: cord- -v gfh m authors: maghdid, halgurd s.; ghafoor, kayhan zrar title: a smartphone enabled approach to manage covid- lockdown and economic crisis date: - - journal: nan doi: nan sha: doc_id: cord_uid: v gfh m the emergence of novel covid- is causing an overload in health systems and a high mortality rate. 
the key priority is to contain the epidemic and reduce the infection rate. in this context, many countries are now in some degree of lockdown to enforce extreme social distancing across the entire population and hence slow the spread of the epidemic. further, authorities use case quarantine and manual second/third contact-tracing to contain covid- . however, manual contact tracing is a time-consuming and labor-intensive task that places a tremendous load on public health systems. in this paper, we developed a smartphone-based approach to automatically and widely trace the contacts of confirmed covid- cases. in particular, the contact-tracing approach creates a list of individuals in the vicinity and notifies contacts or officials of confirmed covid- cases. this approach not only makes individuals aware that they are in proximity to an infected area, but also tracks incidental contacts that the covid- carrier might not recall. thereafter, we developed a dashboard that provides government officials with a plan for how lockdown/mass quarantine can be safely lifted, thereby helping to tackle the economic crisis. the dashboard is used to predict the lockdown level of an area based on the collected positions and distance measurements of the registered users in the vicinity. the prediction model uses the k-means algorithm, an unsupervised machine learning technique, for lockdown management. in an unprecedented move, china locked down the megacity of wuhan, in which the novel coronavirus was first reported, in the hope of stopping the spread of the deadly coronavirus. during the lockdown, all railway, port and road transportation was suspended in wuhan. with the increasing number of infections and fast person-to-person spread, hospitals were overwhelmed with patients. later, the disease was identified in many other countries around the globe [ ] , [ ] . 
subsequently, the world health organization (who) announced that the virus can cause a respiratory disease with a clinical presentation of cough, fever and lung inflammation. as more countries experienced dozens of cases or community transmission, who characterized covid- as a pandemic. halgurd s. maghdid is with the department of software engineering, faculty of engineering, koya university, kurdistan region-f.r.iraq. first.last@koyauniversity.org. kayhan zrar ghafoor is with the department of software engineering, salahaddin university-erbil, iraq; school of mathematics and computer science, university of wolverhampton, wulfruna street, wolverhampton, wv ly, uk. kayhan@ieee.org. *kayhan zrar ghafoor is the corresponding author. kayhan@ieee.org. the researchers can access the implementation and programming code in https://github.com/halgurd /lockdown covid . in such an unprecedented situation, doctors and health care workers are putting their lives at risk to contain the disease. further, in order to isolate infected people and combat the outbreak, many hospitals have been converted to covid- quarantine wards. moreover, a surge of covid- patients has led to long queues at hospitals for isolation and treatment. with such a high number of infections, emergency responders have been working non-stop sending patients to hospital, and overcrowded hospitals have refused to take in more patients. for instance, in italy, where medical resources have recently been in short supply, hospitals have had to give priority to people with a significant fever and shortness of breath over others with less severe symptoms [ ] . as covid- continues to spread, countries around the globe are implementing strict measures to intensify the lockdown, from mass quarantine to city shutdown, to slow down the fast transmission of coronavirus [ ] . during the lockdown, people are only allowed to go out for essential purposes such as purchasing food or medicine. 
ceremonies and gatherings of more than two people are not permitted. these strict quarantine rules allow only a few people to move around the city, including delivery drivers who provide a vital lifeline. on the other hand, a few countries, such as japan, have declared a state of emergency in many cities in an attempt to tackle the spread of the virus. although covid- started as a health crisis, it is possibly the gravest threat to the world economy since the global financial crisis [ ] . the covid- epidemic affects all sectors of the economy, from manufacturing and supply chains to universities. it also affects businesses and daily lives, especially in countries where covid- has hit the hardest. supply chain shortages have knock-on effects on the economic sector and on the demand side (such as trade and tourism). this constrains supply from producers and restrains consumer demand, which may lead to a demand shock due to psychological contagion. in order to prevent such widespread fallout, central banks and governments have been rolling out emergency measures to reassure businesses and stabilize financial markets, and so support the economy during covid- . currently, most countries are in the same boat, with the group of twenty and international organizations bearing the leading responsibility [ ] . to meet this responsibility, many companies and academic institutions around the world have made efforts to produce a covid- vaccine. however, health experts state that it may take time to produce an effective vaccine. as an effective vaccine for covid- is unlikely to be on the market until the beginning of next year, management of lockdown is imperative. thus, public health officials combat the virus by manually tracking the recent contact history of positive covid- cases. this manual contact tracing is very useful at the early stage of the spread of the virus. 
however, once the number of confirmed cases has increased tremendously, as it has in some countries, manual contact tracing of each individual is labor-intensive and requires huge resources [ ] . for example, an outbreak of covid- at a funeral ceremony in an avenue in erbil, kurdistan region left the regional government with hundreds of potential contacts. this situation, like many other scenarios with a massive number of cases, burdens a government that is trying to track all contacts manually [ ] . the risk is that health authorities cannot easily trace recent covid- carriers, so that the probability of transmission and its impact can hardly be measured. technology can potentially be useful for digital contact-tracing of positive coronavirus cases. smartphones can use wireless technology data to track people when they are near each other. in particular, when someone is confirmed positive for covid- , the status of their smartphone will be updated and the app will then notify all phones in the vicinity. for example, suppose someone tests positive for covid- and stood near a person in the mall earlier that week. the covid- carrier would not be able to recall the person's name for manual contact tracing. in this scenario, a smartphone contact-tracing app is a very promising way to notify that person [ ] . this automated virus tracking approach could really transform the ability of governments and health authorities to contain and control the epidemic. in this situation, a dashboard is required to assist governments and health authorities in predicting when lockdown and self-quarantine will end. this research first reviews the state-of-the-art solutions to combat covid- . then, we developed a smartphone-based approach to automatically and widely trace the contacts of confirmed covid- cases. in particular, the contact-tracing approach creates a list of individuals in the vicinity and notifies contacts or officials of confirmed covid- cases. 
this approach not only makes individuals aware that they are in proximity to an infected area, but also tracks incidental contacts that the covid- carrier might not recall. thereafter, we developed a dashboard that provides government officials with a plan for how lockdown/mass quarantine can be safely lifted, thereby helping to tackle the economic crisis. from a technical standpoint, we summarise the most important contributions of this paper as follows: 1) we build a tracking model based on positional information of registered users to conduct contact-tracing of confirmed covid- cases. 2) we propose a smart lockdown management approach to predict the duration of lockdown. 3) in order to notify contacts of confirmed cases, we also developed a notification model to cluster lockdown regions. the rest of this paper is organized as follows. section ii provides a literature review of recent advances in ai systems developed for covid- detection. this is followed by an overview of the proposed approach and details of the designed algorithm in section iii. section iv presents the experiments which are conducted in the paper. finally, section v concludes the paper. in [ ] the authors modeled how covid- spreads over populations in countries in terms of the transmission speed and the containment of its spread. in the model, r denotes the reproduction number, which is defined as the ability of the virus to infect other people in a chain of contagious infection. when r is high, infected individuals rapidly infect a group of people over a very short period of time, which then yields an outbreak. conversely, the infection would be under control if each person infects, on average, fewer than one other person. this is exactly what happens in fig. ; when people (black color) have come into contact with an infected person (red color), the infection spreads rapidly. 
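as a rough illustration of the threshold behaviour of the reproduction number, the sketch below computes the expected number of new infections in each generation when every case infects r others on average. the function name and the two values of r are illustrative assumptions and are not taken from [ ]; depletion of susceptible people is ignored.

```python
def expected_cases_per_generation(r, generations, initial_cases=1):
    """Expected new infections in each generation when every case infects
    r others on average (hypothetical helper, ignores depletion of susceptibles)."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * r)
    return cases


# Assumed values of r, chosen only to show the two regimes.
growing = expected_cases_per_generation(r=2.0, generations=5)
shrinking = expected_cases_per_generation(r=0.5, generations=5)
print(growing)    # each generation is larger than the last when r > 1
print(shrinking)  # each generation is smaller than the last when r < 1
```

with r above one the generations grow geometrically, and with r below one they die out, which is the sense in which the infection is "under control".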
one important aspect is that the number of infected people depends on several factors, such as the number of vulnerable people in the community, the time it takes for a person without symptoms to recover, and the person's social contacts and the possibility of infecting them with coronavirus. a further factor that affects the fast spread of coronavirus is the frequency of visits to crowded places such as malls and minimarkets. thus, governments and public health authorities are responsible for managing and planning a suitable way to contain the epidemic. moreover, countries at the early stage of virus spread need to control the epidemic, typically by isolating and testing suspected cases, tracing their contacts and quarantining those people in case they are infected. the wider the scale of testing and contact tracing, the better the chance of containment. in the case of covid- , research studies have been conducted on containing or controlling its fast spread, and hence helping governments and societies to end this epidemic. in [ ] , the authors investigated the importance of isolating confirmed covid- cases, which could play a key role in controlling the disease. they utilized a mathematical model to measure the effectiveness of this strategy in controlling the transmission speed of covid- . to achieve this goal, a stochastic transmission model was developed to represent the fast person-to-person transmission of covid- . according to their study, virus transmission can be controlled within weeks or by a threshold of cumulative cases. however, controlling the spread of the virus using this mathematical approach is highly dependent on other factors such as the pathogen and the reaction of people. contact-tracing plays a key role in tracking infected people and predicting the end of lockdown. when a patient is diagnosed with an infectious disease like covid- , contact-tracing is an important step in slowing down the transmission [ ] . 
this technique seeks to identify people who have had close contact with infected individuals and who therefore may be infected themselves. this targeted strategy reduces the need for stay-at-home periods. however, manual contact tracing is subject to a person's ability to recall everyone they have come into contact with over a two-week period. in [ ] , the authors exploited the cellphone's bluetooth to constantly advertise the presence of people. these anonymous advertisements, named chirps in bluetooth, do not contain positional or personally identifiable information. every phone stores all the chirps that it has sent and overheard from nearby phones. their system uses these lists to enable contact-tracing for people diagnosed with covid- . this system not only traces infected individuals, but also estimates the distance between individuals and the amount of time they spent in close proximity to each other. when a person is diagnosed with covid- , doctors would coordinate with the patient to upload all the chirps sent out by their phone to the public database. meanwhile, people who have not been diagnosed can have their phones do a daily scan of the public database, to see if their phones have overheard any of the chirps used by people later diagnosed with covid- . a match indicates that they were in close prolonged contact with that anonymous individual. fig. shows the procedure of exchanging anonymous ids among users for contact-tracing. as stated in the aforementioned section, manual contact-tracing is a labor-intensive task. in this section, we detail each part of the proposed smartphone-based digital contact-tracing shown in fig. . the main idea of the proposed framework in fig. is to enable digital contact-tracing so as to end lockdown and at the same time prevent the virus from spreading. 
the best thing to do seems to be to let people go out for their business, but if anybody tests positive for covid- , we would be able, through the proposed framework, to trace everybody who was in contact with the confirmed case and to manage the lockdown and mass quarantine. this will help prevent the spread of the virus to the rest of the population. the first step of the proposed contact-tracing model is the registration of users. there is no doubt that registration and coverage of a high percentage of the population are very significant for effective pandemic control. users provide information such as name, phone number, post code, and status of the covid- disease (positive, negative or recovered). the effectiveness of the application and of digital contact tracing depends on two factors: speed and coverage. for the proposed framework, we utilize a global navigation satellite system (gnss) receiver for outdoor environments, whereas bluetooth low energy is used indoors. speed depends on how far the time required for contact tracing can be reduced, from a few days to hours or minutes. the more people register in the system, the better the performance of the system in terms of both speed and coverage of contact tracing. in the second step, a global positioning system (gps) receiver is used by the proposed model to track either individuals or a group of people visiting a common place. the gps service class updates user coordinates in the database every few seconds. once a registered user reports being infected with covid- , their test result is sent to the public database on the central computer server. other registered users regularly check the central server for possible positive covid- cases they were in contact with in the past weeks. the server is responsible for comparing the infected id with its list of stored ids. a push notification is sent by the server to those who were in contact with a person who tests positive. 
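the server-side comparison described above can be sketched as follows. this is a minimal sketch under stated assumptions, not the authors' implementation: the record layout (`Fix`), the function names, the sample coordinates, and the distance and time thresholds are all hypothetical, and the great-circle (haversine) distance is assumed as the distance measure between two gnss positions.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class Fix:
    """One stored position report (hypothetical record layout)."""
    user_id: str
    timestamp: float  # seconds
    lat: float        # degrees
    lon: float        # degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_contacts(fixes, infected_id, max_distance_m, max_time_gap_s):
    """Return ids of users who were close to the infected user in space and time."""
    infected = [f for f in fixes if f.user_id == infected_id]
    contacts = set()
    for other in fixes:
        if other.user_id == infected_id:
            continue
        for own in infected:
            close_in_time = abs(other.timestamp - own.timestamp) <= max_time_gap_s
            close_in_space = haversine_m(own.lat, own.lon, other.lat, other.lon) <= max_distance_m
            if close_in_time and close_in_space:
                contacts.add(other.user_id)
                break
    return contacts


# Made-up sample data: B is about one metre from A, C is over a kilometre away,
# and D visited the same spot long after A did.
fixes = [
    Fix("A", 0.0, 39.73920, -104.9903),
    Fix("B", 10.0, 39.73921, -104.9903),
    Fix("C", 5.0, 39.75000, -104.9900),
    Fix("D", 1_000_000.0, 39.73920, -104.9903),
]
# The thresholds below are assumed values, not the ones used in the paper.
to_notify = find_contacts(fixes, "A", max_distance_m=5.0, max_time_gap_s=60.0)
print(to_notify)
```

the returned set would be the list of device ids to which the server sends a push notification; only ids are compared, in line with the statement that the server sees a phone id rather than personal details.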
it is important to note that the only information revealed to the central server is an id of the phone. firebase cloud messaging is used to send push notifications to multiple devices even when the apps are paused or running in the background. many apps send push notifications, which present an alert to the users. this happens when a person is approaching someone who is infected with covid- or is near a lockdown area. in order to protect the privacy of those who have the coronavirus, we only include an alerting message in the push notification. this would certainly be very useful for the entire population in making informed decisions about not getting close to a covid- area. however, this notification would help public health professionals rather than replace them. the proposal also includes a lockdown prediction model. the model works on the basis of the collected geographic information and crowding level of the registered users in the system. in this study, k-means, an unsupervised machine learning algorithm, is used to cluster the users' position information and predict whether or not the area should be locked down based on some empirical thresholds. both scenarios' results are shown in fig. . this section presents the details of how the proposed approach will be implemented. the proposal includes two main parts. the first is an application deployed on android-based smartphones, which is used by the users and tracks/sends the users' mobility information to the system. the second is a web portal (including a comprehensive dashboard) to monitor the visited areas and predict whether or not they should be locked down. -an android application is implemented on the smartphone. the application lets users register their information; part of it is automatically captured through the application without user interaction. the covid- status includes three options, which might be covid- , none covid- , and recovered. fig. a shows a snapshot of the application form for the registration process. 
-once the users have completed the registration process, they can enter the position tracking model. the tracking model sends the user's position information to the database of the system and shows a google map of their positions, as shown in fig. b . -besides this, the users can also receive a notification or alert about areas which have been visited by infected users. the notification works in the background, i.e. the user may have paused the application and be using another application on the smartphone. however, when the user has the application open and enters an infected area, they receive the alert dialog. fig. c and fig. d show an example of the notification and the alert dialog. the notification and alert dialog models are also configured for both outdoors and indoors. for example, outdoors, the gnss position information of the users is used to measure the distance between any two users' positions, and if the distance is less than meters then the notification or the alert dialog is raised. indoors, however, the application scans for bluetooth devices in the vicinity, and the result of the scan is then matched against the mac addresses pre-registered in the system. if the matched mac addresses belong to covid- or recovered cases, then the notification model and the alert dialog notify the users that there are covid- or recovered users in the scan area. a web portal for the system's administrators is designed and implemented using html , php, javascript, and google map api. 
this part of the system monitors and traces the registered users only in terms of whether or not the areas which have been visited by users should be locked down. to this end, an unsupervised machine learning (uml) algorithm has been implemented in the system. there are several uml algorithms, including neural networks, anomaly detection, clustering, etc. however, for this system, the k-means clustering algorithm is used to predict the lockdown approach for the visited area. the k-means algorithm first reads the tracked users' position information and their covid- status. then, in the next step, it calculates the centroid positions of the areas based on the dasv seeding method. the dasv method is a good algorithm for selecting the best centroid position of a set of nearest positions in the vicinity. the centroid positions are then updated according to which positions are nearest to each of them. the pseudo code of the k-means clustering algorithm is shown in algorithm . [algorithm : k-means clustering; recoverable steps: assign the position to that cluster; for j ← k to k n do; new centroid = mean of all positions assigned to that cluster.] once the clustering of the tracked users' position information has completed, a set of clusters is produced. then, for each cluster, the distances between the positions of the different users are calculated. this is to calculate how many times the users in the vicinity approach each other (from now on called aeo). for this study, five users (user a, user b, user c, user d, and user e) participated in the system in two different areas in the usa. therefore, two different scenarios with the five users are conducted for the k-means algorithm, as shown in figure . in the first scenario the users are walking and are located in the denver area in colorado-usa, while in the second scenario they are located in the aspen area in colorado-usa. a threshold for the approaching distance has been initialized to meters, i.e. if user a has approached to within around meters of user b, or c, or d, or e, it means the users are too near to other users. for the two scenarios, if aeo is greater than , the system assumes this area is too crowded and the system will predict that the area should be locked down. 
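the clustering and threshold rule described above can be sketched as follows. this is a minimal sketch under stated assumptions: positions are taken as planar coordinates in meters, the dasv seeding method is not reproduced (the first k positions are used as initial centroids instead), and the sample positions, the approach distance and the aeo limit are made-up values rather than those used in the experiments.

```python
import math
from itertools import combinations


def kmeans(points, k, iterations=20):
    """Plain k-means on 2-d points; seeds with the first k points (not DASV)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each position joins the cluster of its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: new centroid = mean of all positions assigned to that cluster.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def count_approaches(users, points, labels, cluster, approach_m):
    """AEO for one cluster: pairs of positions of different users within approach_m."""
    members = [(u, p) for u, p, label in zip(users, points, labels) if label == cluster]
    return sum(
        1
        for (u1, p1), (u2, p2) in combinations(members, 2)
        if u1 != u2 and math.dist(p1, p2) <= approach_m
    )


# Made-up positions in metres: A, B and C walk close together, D and E keep apart.
users = ["A", "D", "B", "C", "E"]
points = [(0.0, 0.0), (1000.0, 1000.0), (1.0, 0.0), (0.0, 1.0), (1010.0, 1000.0)]

labels, centroids = kmeans(points, k=2)
APPROACH_M = 2.0  # assumed approach distance threshold
AEO_LIMIT = 2     # assumed crowding threshold
decisions = {
    c: count_approaches(users, points, labels, c, APPROACH_M) > AEO_LIMIT
    for c in range(2)
}
print(decisions)  # True means the sketch would flag that cluster for lockdown
```

in this toy data the first cluster contains three close pairs and is flagged, while the second contains none and is not, mirroring the two outcomes of the rule.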
however, if the value of aeo is less than times, it means the area should not be locked down. in the trial experiments, the model predicts that the denver area in the first scenario should be locked down, since the five users, while walking in the area, approached each other times and passed the threshold (i.e. meters). in the second scenario, however, which was tested in parallel with the same trials, the model predicted that the aspen area does not need to be locked down, since the users walked far from each other. both scenarios' results are shown in figure . at the emergence of covid- , many countries worldwide commonly practiced social distancing, mass quarantine and even strict lockdown measures. smart lockdown management is a pressing need in order to ease lockdown measures in places where people are practicing social distancing. in this paper, we developed a smartphone-based approach to inform people when they are in proximity to an area infected with covid- . we also developed a dashboard to advise health authorities on how people in a specific area can safely get back to their normal life. the proposed prediction model uses positional information and distance measurements of the registered users in the proximity. the government and public health authorities would be able to benefit from the proposed dashboard by getting the latest statistics on covid- cases and lockdown recommendations for different areas. the weak point of this study is the privacy issue of tracking the users' position information. this issue could be addressed by applying encryption algorithms in the near future. nevertheless, the proposed system is significant for mitigating the economic crisis and easing lockdown issues. deep learning-based model for detecting novel coronavirus pneumonia on high-resolution computed tomography: a prospective study novel coronavirus in the united states lockdowns can't end until covid- vaccine found, study says can we compare the covid- and crises? 
( , april) what is contact tracing number of covid- cases reaches in kurdistan region iraq's total now apple and google partner on covid- contact tracing technology what we scientists have discovered about how each age group spreads covid- feasibility of controlling covid- outbreaks by isolation of cases and contacts safe paths: a privacy-first approach to contact tracing conflict of interest: the authors declare that they have no conflict of interest. moreover, this research was not funded by any funding agency. key: cord- -g wmmbx authors: sturniolo, s.; waites, w.; colbourn, t.; manheim, d.; panovska-griffiths, j. title: testing, tracing and isolation in compartmental models date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: g wmmbx existing compartmental mathematical modelling methods for epidemics, such as seir models, cannot accurately represent effects of testing, contact tracing and isolation. this makes them inappropriate for evaluating testing and contact tracing strategies to contain an outbreak. an alternative used in practice is the application of agent- or individual-based models (abm). however abms are complex, less well-understood and much more computationally expensive. this paper presents a new method for accurately including the effects of testing, contact-tracing and isolation (tti) strategies in standard compartmental models. we derive our method using a careful probabilistic argument to show how contact tracing at the individual level is reflected in aggregate on the population level. we show that the resultant seir-tti model accurately approximates the behaviour of a mechanistic agent-based model at far less computational cost. the computational efficiency is such that it can be easily and cheaply used for exploratory modelling to quantify the required levels of testing and tracing, alone and with other interventions, to assist adaptive planning for managing disease outbreaks. 
since the beginning of , the world has been in the midst of a covid- pandemic, caused by the novel coronavirus sars-cov- . to slow down the spread, many countries, including the uk, have imposed social distancing mitigation strategies. however, such measures cannot feasibly be imposed over a long period as they may cause economic collapse. as a consequence countries need to consider how to ease lockdown measures while controlling sars-cov- spread. the world health organisation has recently updated their guidance on this, recommending a six point strategy that requires firstly assuring that the pandemic spread has been suppressed, and is followed by detecting, testing, isolating and contact-tracing of infected individuals [ ] . mathematical modelling has figured prominently in decision making around control and containment of covid- spread, including the imposition of physical distancing measures [ ] . it provides a logical framework for understanding the propagation of an infectious disease through a population and allows different interventions to be explored, including testing and contact tracing of infected individuals as possible strategies to ease social distancing restrictions. such models are also necessarily simplifications and understanding of their assumptions and what they do and do not represent is required to correctly interpret them. mathematical models have a long history of being used to describe the spread of infectious diseases from plague outbreaks more than a century ago [ ] to the more recent sars [ ] and ebola [ ] , [ ] epidemics, and from making decisions around different vaccination strategies for influenza [ ] to modelling hiv [ ] , and from modelling pandemic influenza [ ] to currently facilitating real-time policy decision making around the covid- epidemic [ , [ ] [ ] [ ] [ ] [ ] . there are several common approaches, each with advantages and disadvantages [ , ] . 
compartmental models [ , , ] partition the population into different compartments, such as susceptible, exposed to the virus but not infectious, infectious and removed, and track the movements of individuals between these groups. though the dynamics of real disease outbreaks are fundamentally stochastic [ ] [ ] [ ] , this level of detail is mainly relevant for early stages or small outbreaks [ ] . commonly within compartmental models a mean-field approximation given by ordinary differential equations (ode) is used [ , , ] . this approach is particularly attractive because it is computationally efficient and can yield informative results. ode systems can be generalised to explicitly incorporate dependence on system state at some times in the past, yielding delay-differential equations (dde) [ ] [ ] [ ] , the analogue for continuous state of markov processes with finite memory. such formulations require meticulous care to solve accurately [ , ] and much of what is known about their behaviour consists of asymptotic results [ ] [ ] [ ] [ ] . branching processes are used [ , , ] where more flexibility is desired in representing the timing of transitions among compartments and, for continuous time, are amenable to stochastic differential equation (sde) treatment. for some choices of distribution, the sde formulation is markovian and can be analysed as a continuous-time markov chain (ctmc) [ , ] . finally, individual- or agent-based models (ibm/abm) explicitly represent each individual in the population and allow for fine-grained modelling of the characteristics of each one, such as different contact patterns or susceptibilities to the disease [ ] [ ] [ ] [ ] [ ] . they have been [ ] , and are being [ ] [ ] [ ] , widely used for planning and epidemic control. while abms allow for maximal flexibility and realism, this comes at a high computational cost and it can be difficult to extract analytical results that relate the fine-grained behaviour to population-level effects.
it is generally feasible to conduct agent-based simulations for populations of tens of thousands, but there are salient features of epidemics, such as the timing and size of peaks of infectious individuals, that depend on population sizes two orders of magnitude larger. an important subset of abms are network or graphical models [ ] [ ] [ ] [ ] [ ] [ ] , where the structure of the population, that is, the possible interactions among its members, is explicitly represented. in addition to the computational cost and analytical difficulties with abms, sufficient data to support their fine-grained realism is rarely available. for many purposes, including the one that concerns us here (an accurate qualitative understanding of the effect of interventions like testing and contact tracing), cheap, coarse, high-level models are more useful than expensive fine-grained models that rely on vast amounts of data that are often not readily available. while classic compartmental models can easily be used to simulate some interventions analogous to parameter changes, they cannot readily include the effects of contact tracing of infected individuals unless sweeping assumptions are made. this is because modelling contact tracing is intrinsically reliant on individual behaviour within a network structure. previous work on ebola [ ] , sars [ ] and covid- used simple approaches to represent contact tracing in a compartmental model: asserting that a constant fraction of exposed individuals becomes isolated due to contact tracing [ , , , ] or reducing transmission by a constant amount, perhaps after a delay [ ] . we believe that this kind of approach is insufficient for the purpose of understanding how the rate and timing of testing and contact tracing affect success in containing outbreaks. the purpose of contact tracing is to attempt to isolate infectious, or soon to be infectious, individuals.
therefore, contact tracing should result in the isolation of both infectious and exposed individuals, and this is a key assumption that previous work has missed. contact tracing will also inevitably result in the isolation of susceptible and recovered individuals, with the former contributing to a reduced rate of disease propagation. to properly understand this process it is imperative to model the effects of contact tracing with mathematical rigour. in this paper we develop an extension to the classic susceptible-exposed-infectious-removed (seir) model [ , , ] , simulated with odes, to include testing, contact-tracing, and isolation (tti) strategies. we call this model seir-tti. this model captures the salient features of how the dynamics of testing and tracing at the individual level manifest at the population level. due to its relative simplicity, seir-tti is applicable across a spectrum of diseases. with appropriate parametrisation, it can be used anywhere a standard seir model can be used, with the same caveats and limitations. though we are clearly motivated by the current covid- pandemic and wish to understand how interventions like tti can be used to contain it, we do not claim that we are modelling it in particular. our contribution is a mathematical tool and software implementation that can be used for understanding tti, not a model of covid- . the method that we present is general and can also be applied to other compartmental models, with the standard caveat that with more compartments comes more work to determine the appropriate rates. we validate our seir-tti ode model against a mechanistic agent-based model where testing, tracing and isolation of individuals is explicitly represented, and show that we can achieve good agreement at far less computational cost. we also provide a flexible software package at https://github.com/ptti/ptti with a convenient declarative language for specifying parameters and interventions and implementations of the seir-tti ode model, the mechanistic agent-based model, a second non-mechanistic rule-based model in the κ-language formalism [ , ] , and several related models such as classic seir.

we design a compartmentalised model describing the susceptible (s), exposed (e, infected but not infectious), infectious (i) and removed (r) population cohorts. these models are widely used to describe the spread of various infectious diseases [ ] . within the model framework, disease progression is captured by the sequential movement of individuals between compartments: susceptible individuals (s) are exposed to the virus and become infected but not infectious (e), then become infectious (i), until they recover (r). a schematic illustrating this model is shown in fig . the novelty of our model is that within each compartment we have included subgroups of people diagnosed and undiagnosed with the virus, attributable to reported and unreported diagnosis. individuals in our model are defined to be diagnosed either through testing or putatively through tracing. diagnosed individuals are then isolated.

schematic of an seir model with diagnosis described by testing and contact-tracing. seir is a compartmentalised model describing susceptible (s), exposed (e, infected but not infectious), infectious (i) and removed (r) population cohorts. individuals move between these compartments in sequence as they become exposed, infected and infectious during disease progression until recovery. the novelty here is that each compartment comprises diagnosed and undiagnosed individuals, with diagnosis leading to isolation. we assume that diagnosis happens through testing or putatively through tracing. individuals transition between compartments x and y at rates $\Delta_{x \to y}$, which we derive in the text.
before introducing contact tracing, we examine the standard seir model with testing. these results, and those in the following section, use the system of differential equations described in detail in the methods. we choose a relatively large initial number of infectious individuals merely for illustrative purposes, as it renders the dynamics clearer: the more aggressive testing regimes would result in immediate containment of a small outbreak, which would be difficult to see, whereas a large outbreak nevertheless takes some time to contain. the parameters have the usual meaning, with values fixed for the purposes of this section: n = . × individuals is the total population; i( ) = is the initial number of infected individuals; $\bar\beta$ = . infections/contact is the probability of transmission; c = contacts/day is the contact rate; α = . days − is the incubation rate, the rate of leaving the exposed state and becoming infectious; and γ = − days − is the rate of recovery, or leaving the infectious state. these values result in a basic reproduction number of r = . in the simplest case, testing is conducted at random at some rate θ of tests per individual per day, and only infectious individuals are tested and immediately isolated. representative trajectories from this system for various values of θ are shown in fig . the upper panel shows the time-series for total infections, exposed and infectious, and the lower panel shows the effective reproductive number, r(t). we can observe that while testing the entire population every days (θ = . ) results in a lower maximum total number of infections, we require very frequent testing, every - days (θ = . , . ), in order to control an outbreak and cross the r(t) = 1 threshold (red horizontal line).
it is straightforward to work out the condition under which testing crosses this threshold by analysing the fixed points of the underlying system of differential equations: the required condition is that there is no change in the number of infectious people, as each infects one other on average and is then removed. some arithmetic yields $\theta_{crit} = \bar\beta c - \gamma$, the red line in fig . the above shows that, whilst testing and isolating alone can be sufficient to control an outbreak, it would take a herculean effort on its own. without any form of distancing (c ≈ ) it is necessary to conduct tests about every . days. if a sizeable number of infected individuals are asymptomatic, there is no alternative but to test the entire population at this rate. distancing helps here. if the contact rate is cut by half, the required rate is closer to once per fortnight. there is, however, a strategy that avoids regularly sampling the entire population by directing tests to those most likely to be infected: contact tracing, which we consider next.

the dynamics represented here are for a scenario with normal contact, c = , and an initial number of infected individuals, i( ) = , . individuals who test positive are isolated for the duration of their illness. the top plot shows the total infections (exposed and infectious individuals) over time for various testing rates ranging from none, θ = , to testing all infectious individuals every two days, θ = . . the bottom plot shows the reproduction number over time for these same scenarios. observe that even fairly frequent testing, e.g. every five days, θ = . , is only sufficient to reduce peak infections by one order of magnitude, from about million to about two million. in the infrequent testing regimes, θ ∈ [ . , . ], we can also observe that the curve described by r(t) is not a sigmoid but instead first falls to a value above r(t) = 1 before stabilising and then falling again. this is because, though testing and isolating does have an effect at those rates, it is not sufficiently frequent to identify all of those who are infectious.

cc-by-nd . international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not certified by peer review) the copyright holder for this preprint this version posted may , . . https://doi.org/ . / . . . doi: medrxiv preprint

the central mathematical result is the expression for the rate at which individuals are isolated due to contact tracing (eq ). the notation is explained in detail in the methods section, but the intuition is that, for any compartment x, divided into exclusive unconfined, $x_u$, and isolated, $x_d$, sub-compartments, the rate of moving between them is proportional to the probability of having had contact with an infectious individual conditional on being in $x_u$. the effects of contact tracing are shown in fig . the scenario is the same as with testing alone, except that the testing rate is fixed at θ = − days − and the tracing rate is fixed at χ = − days − . the interpretation is that, on average, an infectious individual can expect to be tested in days and contacts can expect to be traced in days. the choice of these values for illustrative purposes is deliberate. recall from the previous section that γ, the recovery rate, is fixed at − days − . one would expect that testing and isolating individuals, on average, after they have recovered, when it is too late, would be insufficient to contain an outbreak. indeed it is not sufficient, but it does reduce the maximum number of infected individuals somewhat.
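the critical testing rate $\theta_{crit} = \bar\beta c - \gamma$ derived earlier in this section lends itself to a quick numerical check. the sketch below evaluates it, together with the implied mean interval between tests, 1/θ. the function names and all parameter values are illustrative assumptions of ours, not the values used for the figures.

```python
def critical_testing_rate(beta_bar: float, c: float, gamma: float) -> float:
    """Testing rate (per day) at which testing alone holds R(t) at 1.

    Implements theta_crit = beta_bar * c - gamma. A non-positive result
    means the outbreak already declines without any testing.
    """
    return beta_bar * c - gamma


def mean_test_interval(theta: float) -> float:
    """Mean number of days between tests implied by a rate theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return 1.0 / theta


# Hypothetical parameters, chosen only to illustrate the arithmetic.
beta_bar, c, gamma = 0.04, 12.0, 0.15
theta_full = critical_testing_rate(beta_bar, c, gamma)        # normal contact
theta_half = critical_testing_rate(beta_bar, c / 2.0, gamma)  # contact halved
print(mean_test_interval(theta_full), mean_test_interval(theta_half))
```

halving the contact rate lowers the critical rate by more than half, because γ is subtracted from a smaller product; this is why distancing lengthens the required testing interval so markedly.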
however, since tracing happens as a consequence of testing, it amplifies the effectiveness of testing. this can be seen in the figure, where even a modest tracing success rate of - % reduces peak infections by more than half. the relationship between testing rate and tracing rate can be seen from fig . when θ is very small, meaning very little testing, contact tracing has little effect. this is unsurprising because testing causes tracing. when there is very frequent testing, on the other hand, there is little benefit to contact tracing. when testing happens more frequently on average than an individual can infect another, it is sufficient to control the outbreak on its own. for intermediate values, however, contact tracing amplifies the effectiveness of testing. the above result can be seen from this plot as well: when testing of infectious individuals is expected in a week, a modest % success rate at tracing contacts in two days is enough to reduce the reproduction number from to less than . , a substantial benefit. the central result of this paper is not specific observations about how testing and contact tracing affect the propagation of epidemics, though those are valuable, but a technique to compute these effects efficiently.
this technique allows consideration of larger populations than would be possible with agent- or individual-based models, allowing for the exploration of many different scenarios. figs and , for example, each contain × = data points, each resulting from a separate simulation. performing these total simulations takes under a minute on a regular laptop. this would not have been possible with agent- or individual-based models with population sizes in the hundreds of thousands or millions. it could be argued that it is sufficient to capture these dynamics in an agent-based model for modest populations and simply rescale the output for large populations. that approach is not sound, for two reasons that are easily seen. first, small outbreaks. imagine a hypothetical country of million people with thousand infections. proportionally, that is . infections in a population of thousand. there is a non-negligible probability that an outbreak of size will die out on its own. this will be accounted for by the abm but is not a realistic possibility for an outbreak of thousand. scaling therefore suggests fundamentally different results. second, without intervention, the number of infectious individuals will reach a maximum as the available pool of susceptible individuals becomes depleted. this takes longer in a large population simply because the pool is larger. if the timing of the peak of an outbreak is a quantity of interest, a scaled abm will give the wrong result. however, computing these effects with odes requires some approximations, and it is important to understand where and how well these approximations hold.
to do this, we compare with an agent-based model as described in the methods, and show that our method agrees well for a large range of physically interesting and realistic parameter values. a comparison of the two systems for reasonable parameter values is shown in fig . the figure shows good agreement between the mean trajectory of the abm and the ode approximation. the agreement is particularly precise for the exposed and infectious compartments of both varieties. we can observe a slight over-estimate of the number of unconfined susceptible individuals and a corresponding under-estimate of the unconfined removed ones. these over- and under-estimates are nevertheless acceptably close, with a relative error in the magnitude of the susceptible population of under %. there exist extreme scenarios where the ode performs poorly at reproducing the mean trajectory of the abm system. an example is shown in fig . one such scenario is when the testing rate is very low. the figure shows the case θ = − days − . this circumstance violates the assumption underlying eq that the number of susceptible contacts available for tracing should be much smaller than the total susceptible population. intuitively, this can be understood as the ode approximation holding well when testing and tracing are conducted sufficiently rapidly to perform their required purpose. when they are not, the approximation is poor. even in this extreme scenario, however, where the curve produced by the ode system is several standard deviations distant from the average trajectory of the abm, its shape is still similar and realistic.

we consider the problem of determining the effect of testing and contact tracing in a population, p, consisting of a set of indistinguishable individuals among whom a disease propagates. to answer this we adapt the standard susceptible-exposed-infectious-removed (seir) compartmental model [ , ] to incorporate contact tracing as well as testing and isolation of cohorts of people. our adaptation extends the classic seir to not only include progression through disease stages from exposure, via infection, to recovery, but to also keep track of the changing make-up of the population as the disease progresses. to achieve this we require our model to have two additional features:

1. to keep track of whether people have been isolated from the rest (either due to testing positive, or having been traced as a contact of someone who tested positive);
2. to keep track of whether people have been in contact with an infectious individual recently enough to be potential targets for tracing.

ordinary compartment models like seir are designed to separate individuals into distinct, non-overlapping groups. this is not a problem for the first feature, as people who are isolated and people who are not constitute entirely distinct sets. we can therefore represent unconfined and isolated individuals simply by doubling the number of states, labelling $s_u$, $e_u$, $i_u$ and $r_u$ the undiagnosed people who are respectively susceptible, exposed, infectious, or removed, and similarly $s_d$, $e_d$, $i_d$ and $r_d$ the ones who have been diagnosed or otherwise distanced from the rest of the population by means of home isolation, quarantine, hospitalisation and such. however, dealing with contact tracing is harder, as it cannot be achieved with separate compartments. here we take two approaches. first, we describe an agent-based model that simulates contact tracing with an approximation of how it could take place in real life. this agent-based model serves as our reference.
then we describe our compartment model in full and, relying on a system of second order ordinary differential equations (odes), we introduce the concept of overlapping compartments. overlapping compartments represent model states that are not mutually exclusive, so that it is possible for an individual to belong to more than one of them, e.g. to be infected and contact-traced, or exposed and tested. we define equations for this model in order to represent the processes that happen in the agent-based model, providing the comparisons seen above in the results section.

an agent based model of contact tracing

among the possible measures to suppress an epidemic, contact tracing is defined as "an extreme form of targeted control, where the potential next-generation cases are the primary focus" [ ] . in other words, contact tracing is the process by which we aim to identify and isolate individuals who have been in contact with an infectious patient in the past and are thus more likely to have been exposed to the disease, in order to remove them from the pool of possible infectious patients before they develop symptoms. we start by defining our modified seir model in agent-based form. the model features n agents, each characterised by a state symbolising progression through the disease (s, e, i, or r) as well as a single bit characterising whether they are undiagnosed or diagnosed/distanced (u or d). as mentioned above, we label $s_u$, $s_d$, $e_u$, etc. the numbers of individuals in each combination of those states, and s, e, i, r the totals (u and d combined).
in addition, we store a contact matrix keeping track of which individuals have been in contact with which infectious members of the population, and an array of all those individuals for whom one past infectious contact has been identified and who can thus be traced as potentially exposed individuals. we call $c_t$ the total number of such traceable individuals. this contact matrix encapsulates a history of interactions in a way that is realistic but cannot be represented directly in ode form. it is specifically the functioning of this individual contact matrix that we claim to reproduce at the population level with our ode formulation below. we simulate the model using gillespie's algorithm [ ] , which provides a way to sample exact trajectories produced by such stochastic processes. the possible state transitions that can take place are:

1. contact between a random individual and one belonging to $i_u$, with rate $c\,i_u$. the contact is stored in the contact matrix. if the individual happens to belong to $s_u$, then with likelihood $\bar\beta \le 1$ the contact results in exposure, and the $s_u$ individual becomes $e_u$;
2. progression of the disease for an e individual into i, with rate αe;
3. recovery from the disease, or removal due to hospitalisation or death, for an i individual into r, with rate γi;
4. diagnosis by regular testing of an $i_u$ individual, with rate $\theta i_u$. the individual is moved to $i_d$; all its past contacts, retrieved from the contact matrix, are marked as traceable with likelihood η ≤ 1. if the individual moved to $i_d$ was marked as traceable, it is unmarked (as they are already in isolation and there is no need to trace them any more);
5. release from isolation of an $s_d$ individual, making them $s_u$, with rate $\kappa s_d$;
6. release from isolation of an $r_d$ individual, making them $r_u$, with rate $\kappa r_d$;
7. contact tracing of a traceable individual, with rate $\chi c_t$. the individual is moved from $x_u$ to $x_d$, where x is whatever state of progression they are in, and they are removed from the list of traceable individuals.

the transitions described above can be intuitively seen as corresponding to the ones that would happen in an idealised real-life version of epidemic spread with testing and contact tracing. the biggest deviation from reality is the perfect mixing of the population implied by the first process. the testing and tracing processes are parametrised by θ, the rate of diagnosis of infectious individuals, η, the likelihood or efficiency with which the tracing process identifies contacts, and χ, the rate at which they are found and isolated. we will describe the meaning and importance of these numbers as we explain how they fit into an ode model description of the same processes.

we begin by introducing the ode form of the standard seir model [ , ] . because of the large number of model compartments and exchange terms between them that will be featured in the full model, we introduce a systematic notation to refer to the rates that link them. we refer to $\Delta_{x \to y}$ as the rate at which members of the population move from compartment x to compartment y. for example, $\Delta_{s \to e}$ is the rate at which susceptible members of the population are exposed to the virus. in addition, for convenience when discussing movements that can happen due to multiple phenomena, we might add a superscript, such as $\Delta^{z}_{x \to y}$, to indicate only the part of that rate that can be ascribed to a given process z.
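before turning to the ode form, the seven agent-based transitions listed above can be sketched as a small gillespie simulation. this is a minimal illustration and not the implementation in the ptti package: the function name `gillespie_tti`, the per-agent contact sets standing in for the contact matrix, the choice to forget an individual's contacts once they recover, and all default parameter values are assumptions of ours.

```python
import random


def gillespie_tti(n=200, i0=5, beta_bar=0.05, c=10.0, alpha=0.2, gamma=0.14,
                  theta=0.1, eta=0.5, chi=0.5, kappa=0.07, t_max=60.0, seed=1):
    """Sample one trajectory; return final time and counts per compartment."""
    rng = random.Random(seed)
    stage = ["I" if k < i0 else "S" for k in range(n)]  # S, E, I or R
    isolated = [False] * n                              # the u/d bit
    contacts = [set() for _ in range(n)]  # who each infectious agent has met
    traceable = set()
    t = 0.0

    def members(s, iso=None):
        return [k for k in range(n)
                if stage[k] == s and (iso is None or isolated[k] == iso)]

    while t < t_max:
        i_u, e_all, i_all = members("I", False), members("E"), members("I")
        s_d, r_d = members("S", True), members("R", True)
        trace_list = sorted(traceable)
        rates = [c * len(i_u), alpha * len(e_all), gamma * len(i_all),
                 theta * len(i_u), kappa * len(s_d), kappa * len(r_d),
                 chi * len(trace_list)]
        total = sum(rates)
        if total == 0.0:
            break  # nothing left that can happen
        t += rng.expovariate(total)          # exponential waiting time
        event = rng.choices(range(7), weights=rates)[0]
        if event == 0:    # contact with an unconfined infectious individual
            src, dst = rng.choice(i_u), rng.randrange(n)
            if dst != src:
                contacts[src].add(dst)
                if (stage[dst] == "S" and not isolated[dst]
                        and rng.random() < beta_bar):
                    stage[dst] = "E"
        elif event == 1:  # progression E -> I
            stage[rng.choice(e_all)] = "I"
        elif event == 2:  # recovery I -> R; contacts are forgotten (assumption)
            k = rng.choice(i_all)
            stage[k] = "R"
            contacts[k].clear()
        elif event == 3:  # testing: I_u -> I_d, contacts marked with prob. eta
            k = rng.choice(i_u)
            isolated[k] = True
            traceable.discard(k)
            for j in contacts[k]:
                if not isolated[j] and rng.random() < eta:
                    traceable.add(j)
            contacts[k].clear()
        elif event == 4:  # release S_d -> S_u
            isolated[rng.choice(s_d)] = False
        elif event == 5:  # release R_d -> R_u
            isolated[rng.choice(r_d)] = False
        else:             # tracing: X_u -> X_d
            k = rng.choice(trace_list)
            isolated[k] = True
            traceable.discard(k)

    counts = {}
    for k in range(n):
        key = stage[k] + ("_d" if isolated[k] else "_u")
        counts[key] = counts.get(key, 0) + 1
    return t, counts
```

a call such as `gillespie_tti(n=100, t_max=20.0, seed=3)` returns one sample path; averaging many seeds gives the mean trajectory against which an ode approximation can be compared. the compartment lists are rebuilt at every event for clarity, which costs o(n) per step and is only suitable for small populations.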
with this notation, the differential equations that describe the standard seir model have the following form:

$$\frac{\mathrm{d}s_u}{\mathrm{d}t} = -\Delta_{s_u \to e_u}, \qquad \frac{\mathrm{d}e_u}{\mathrm{d}t} = \Delta_{s_u \to e_u} - \Delta_{e_u \to i_u}, \qquad \frac{\mathrm{d}i_u}{\mathrm{d}t} = \Delta_{e_u \to i_u} - \Delta_{i_u \to r_u}, \qquad \frac{\mathrm{d}r_u}{\mathrm{d}t} = \Delta_{i_u \to r_u}.$$

note that all terms involve compartments identified with u subscripts, as these equations all apply to the undiagnosed part of our model. they will be expanded upon to include the effects of isolation and testing in the next section. the terms in the above differential equations are defined in the usual way as

$$\Delta_{s_u \to e_u} = \beta \frac{s_u i_u}{n}, \qquad \Delta_{e_u \to i_u} = \alpha e_u, \qquad \Delta_{i_u \to r_u} = \gamma i_u,$$

where $\beta = \bar\beta c$ is the infection rate, α is the disease progression rate and γ is the disease recovery rate. while this formulation treats the populations as continuous analytical functions, in general these equations describe the mean trajectory of what is fundamentally a stochastic system. this stochastic system can be simulated with gillespie's algorithm and, up to this point, is equivalent in the continuous limit to an agent-based model featuring the same compartments and transition rates. now we add diagnosis to our description. four more compartments, $s_d$, $e_d$, $i_d$ and $r_d$, are created to keep track of population cohorts who have been identified as potentially infected, and thus isolated from the rest of the population as a measure to limit the spread of the disease. disease progression is not affected by this process; therefore the isolated compartments progress at the same rates, $\Delta_{e_d \to i_d} = \alpha e_d$ and $\Delta_{i_d \to r_d} = \gamma i_d$. including isolation will, however, change the infection rate: unlike population $i_u$, the isolated population $i_d$ does not contribute to further infection. hence we do not include an infection term here. this is an idealisation. in reality isolation will not be perfect, and we can imagine a reduced 'cross-infection' rate in which some people belonging to $s_u$ are infected by people in $i_d$.
this could happen with medical professionals treating infectious patients or care workers who maintain a quarantine facility. we could even consider infection of people in $s_d$ due to those in $i_d$, such as a patient in home isolation infecting their family. however, for present purposes, we will work in an ideal situation where isolation is perfect. finally, we need to incorporate mechanisms to move individuals between the u and d branches of the model. for this purpose we define a testing rate, θ, which represents the fraction of people belonging to $i_u$ who, each day, are diagnosed with the disease. we note that this parameter does not refer to any specific testing procedure; it just represents the total of people who are recognised as having the disease. it can represent, for example, actual testing for a specific pathogen as well as clinical diagnosis. we only focus on the category $i_u$, as these are the patients who are most likely to realise they are sick and seek medical help. this generic testing process is described by the equation $\Delta_{i_u \to i_d} = \theta i_u$. in addition, people will be released from isolation after a finite time without symptoms. for this reason, we don't include a mechanism for people in $i_d$ to return to the u branch of the model, as they are likely to be symptomatic or to test positive for the pathogen. instead, we consider that people who have been isolated despite not being infected, or who are still isolated after having recovered, will return to normal conditions at a rate κ: $\Delta_{s_d \to s_u} = \kappa s_d$ and $\Delta_{r_d \to r_u} = \kappa r_d$. with this model adaptation, a single infected individual can now take two paths: $s_u \to e_u \to i_u \to r_u$, in which they are exposed to the disease, become infectious, and finally recover, without being isolated or diagnosed, as in the normal seir model; or $s_u \to e_u \to i_u \to i_d \to r_d \to r_u$, in which, after becoming infectious, they are identified, isolated, removed from the pool of those who can infect other susceptible people and, after recovering, released from isolation.
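the seir model with testing and release described above can be sketched as a small ode integration. this is a minimal illustration under stated assumptions: mass-action infection normalised by the total population n, perfect isolation, no contact tracing yet, and a fixed-step euler scheme chosen for brevity rather than accuracy. the function names and parameter values are ours and are not taken from the ptti package.

```python
def seir_testing_rhs(y, beta, alpha, gamma, theta, kappa, n):
    """Time derivatives of (s_u, e_u, i_u, r_u, s_d, e_d, i_d, r_d)."""
    s_u, e_u, i_u, r_u, s_d, e_d, i_d, r_d = y
    infection = beta * s_u * i_u / n  # only unconfined infectious people infect
    return (
        -infection + kappa * s_d,                 # s_u
        infection - alpha * e_u,                  # e_u
        alpha * e_u - (gamma + theta) * i_u,      # i_u: recovery or diagnosis
        gamma * i_u + kappa * r_d,                # r_u
        -kappa * s_d,                             # s_d (unused without tracing)
        -alpha * e_d,                             # e_d (unused without tracing)
        alpha * e_d + theta * i_u - gamma * i_d,  # i_d
        gamma * i_d - kappa * r_d,                # r_d
    )


def integrate(y0, t_max, dt, **params):
    """Forward-Euler integration; returns final state and peak infections."""
    y = list(y0)
    peak = y[1] + y[2] + y[5] + y[6]  # exposed + infectious, both branches
    for _ in range(int(round(t_max / dt))):
        dy = seir_testing_rhs(y, **params)
        y = [v + dt * d for v, d in zip(y, dy)]
        peak = max(peak, y[1] + y[2] + y[5] + y[6])
    return y, peak


# Hypothetical parameter values, for illustration only.
n = 1.0e6
y0 = (n - 100.0, 0.0, 100.0, 0.0, 0.0, 0.0, 0.0, 0.0)
common = dict(beta=0.5, alpha=0.2, gamma=0.15, kappa=0.07, n=n)
_, peak_no_testing = integrate(y0, 200.0, 0.05, theta=0.0, **common)
final, peak_testing = integrate(y0, 200.0, 0.05, theta=0.5, **common)
print(peak_no_testing, peak_testing)
```

in this sketch `s_d` and `e_d` stay at zero throughout, which reflects the remark that without contact tracing there is no route into those states.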
having these two paths allows some degree of control of the epidemic; however, it must be noted that, although we have introduced them, the states $s_d$ and $e_d$ are here left unused. this is because at this stage we associate testing with symptomaticity; there is as yet no mechanism other than diagnosis to identify someone who could be infected. this is especially problematic because it makes it impossible to isolate exposed people. these are individuals with a latent infection who will soon become infectious. isolating them pre-emptively would contribute a great deal towards suppressing the epidemic. for this reason, we move on to include contact tracing as a means of preventive isolation. we have seen previously that it is intuitive how contact tracing can be represented in an agent-based model, in which individuals are simulated and each has a history of contacts with other members of the population. it is not as obvious how to treat contact tracing in a compartment model, where there is no memory of the histories of contacts of specific individuals, but only average quantities. we outline here a probabilistic method for doing this. let us define $\Pr(x)$ as the probability that an individual belongs to compartment x of the population. for example, $\Pr(s_u) = s_u / n$ is the probability that an individual is susceptible and undiagnosed. in addition, let us define $\Pr(c_i)$ as the probability that an individual has had contact in the past with an infectious individual who is still infectious. the latter detail is important because here we consider only "next-generation" tracing; in other words, we only try to trace the direct contacts of those infectious individuals who were found to test positive.
this is a conservative assumption. it would be possible to make contact tracing more effective by also tracing one generation further (the contacts of the contacts), but because the process requires exponentially more resources with each generation, with decreasing likelihood of correctly identifying exposed or infectious individuals, we simply opt to neglect that possibility. therefore, in this model the only people who can be traced are those whose most recent infectious contact is still infectious; once that contact recovers, it can no longer be identified as infectious, and thus it will be impossible to trace its contacts as well. finally, we define $\Pr(C_T)$ as the probability that an individual is traced. all these probabilities are functions of time, and quantities that evolve with the model itself. first, we find that the probability of being traced is
$$\Pr(C_T) = \Pr(C_T \mid C_I)\,\Pr(C_I) + \Pr(C_T \mid \neg C_I)\,\Pr(\neg C_I),$$
where $\Pr(C_T \mid C_I)$ is the conditional probability of being traced given that one has had an infectious contact in the past, and $\Pr(C_T \mid \neg C_I)$ the probability of being traced given that one has not. clearly, $\Pr(\neg C_I) = 1 - \Pr(C_I)$. if we ignore the possibility of false positives, then $\Pr(C_T \mid \neg C_I) = 0$; namely, a person can only be traced if they did have an infectious contact in the past. if we then set an 'efficiency' parameter $\eta$ representing the fraction of contacts that we are indeed able to identify, the probability of being traced at a given time is simply
$$\Pr(C_T) = \eta\,\Pr(C_I).$$
to derive transition rates among compartments, we consider that individuals will be traced proportionally to how quickly the infectious individuals who originally infected them are, themselves, identified. we add a factor $\chi$ to account for the speed of the tracing process itself, and we find a global tracing rate,
$$\Delta(C_T) = \chi\eta\theta\,\Pr(C_I).$$
it then follows that, for individuals in a given compartment $X$, the rate at which they are isolated by contact tracing is
$$\Delta(C_T \mid X) = \chi\eta\theta\,\Pr(X \mid C_I)\,\Pr(C_I) = \chi\eta\theta\,\Pr(C_I \mid X)\,\Pr(X),$$
where in the last step we made use of bayes' theorem [ ] .
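as a minimal illustration of the two probabilistic steps above (the law of total probability for the tracing probability, and bayes' theorem for conditioning on a compartment), the following python sketch may help. the function and argument names are hypothetical and are not part of the ptti package, and the numerical values are purely illustrative.

```python
def tracing_probability(pr_contact, eta, pr_false_positive=0.0):
    """Pr(C_T) via the law of total probability.

    pr_contact        -- Pr(C_I), probability of a still-infectious past contact
    eta               -- Pr(C_T | C_I), the tracing efficiency
    pr_false_positive -- Pr(C_T | not C_I); zero when false positives are ignored
    """
    return eta * pr_contact + pr_false_positive * (1.0 - pr_contact)


def bayes_posterior(pr_x_given_contact, pr_contact, pr_x):
    """Pr(C_I | X) = Pr(X | C_I) * Pr(C_I) / Pr(X)."""
    return pr_x_given_contact * pr_contact / pr_x


# Illustrative values only (not taken from the paper).
print(tracing_probability(pr_contact=0.2, eta=0.5))
print(bayes_posterior(pr_x_given_contact=0.3, pr_contact=0.2, pr_x=0.6))
```

with false positives set to zero the first function reduces to the product of the efficiency and the contact probability, which is the simplification used in the text.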
this is our eq , the central mathematical result of this paper. the difficulty is then computing the exact probabilities. these are functions that, in general, vary in time and require a certain degree of information about the past. we need to define useful assumptions and approximations in order to work with these probabilities in a model that inherently lacks any memory of the individual histories of the members of its population. one simple assumption for exposed and infectious individuals is
$$\Pr(C_I \mid E_U) = \Pr(C_I \mid I_U) = 1,$$
meaning that we assume that if an individual has been exposed or infected, they must also have had an infectious contact in the recent past. this is in fact the reason why contact tracing is an effective use of resources: it skews heavily towards identifying those who have in fact been exposed to the disease. we remark that this assumption does not hold in general in circumstances where it is possible for an individual to become infected indirectly, such as by contact with contaminated surfaces. for present purposes we assume that the likelihood of such events is small compared with the likelihood of being infected through contact with another individual. another limit of this assumption is that we have defined $\Pr(C_I)$ as the probability of having had an infectious contact who is still infectious. for $\alpha \ll \gamma$, or for some infectious individuals who may take a long time to recover, their original infector might have already recovered in the time it takes for them to be tested. however, here we study a model in which $\alpha > \gamma$, and it is reasonable to assume that those infectious individuals who are tested are identified relatively early on in their infection, especially if $\theta > \gamma$.
therefore, we deem this assumption acceptable at least insofar as these two conditions hold and indirect infection is unlikely. estimating $\Pr(C_I \mid S_U)$ and $\Pr(C_I \mid R_U)$ is more complicated. one possible approximation is to work as if $I_U$ were constant on the time-scales of interest; in that case we would obtain one expression for each probability, in which $\Gamma$ is the overall rate at which individuals are removed from the $I_U$ state. putting together recovery, regular testing, and contact tracing, we find $\Gamma = \gamma + \theta(1 + \eta\chi)$. the main difference between the two expressions is determined by the fact that someone in $S_U$ might still be infected, and thus only has a probability $1 - \beta$ of remaining susceptible after a contact with an infectious member of the population, whereas for recovered individuals this is not an issue any more. these expressions can be combined with the tracing rate derived above to compute rates of contact tracing. however, here we try to go beyond the crude approximation of constant $I_U$, as it may often reflect reality very poorly. we consider for example the total number of members of $S_U$ who also have had recent infectious contacts, $N(C_I \mid S_U) = \Pr(C_I \mid S_U)\,S_U$. we can describe these in first approximation as an integral over past contacts weighted by the $F_X(t, \tau)$, the 'survival functions' for the state $X$. in other words, these are the functions that determine how likely it is that an individual that was in $X$ at time $\tau$ is still in the same state at time $t$. we also use $F_I$, meaning the survival function of the total number of infectious individuals, $I = I_U + I_D$, because here we focus on overall infectiousness, not the fact that one might have been isolated before recovery. note, however, that only $I_U$ individuals participate in contacts. the reason that this is an approximation is that we are not excluding the $N(C_I \mid S_U)$ from the pool of $S_U$ that can be contacted, and thus there is a risk of double counting.
that risk will remain negligible as long as $N(C_I \mid S_U)/S_U$ is small; therefore, this model will perform better in a regime in which there are few infectious individuals, and thus, few contacts. this is in fact the regime in which contact tracing is most likely to be feasible in practice, to control small outbreaks rather than in the presence of an uncontrolled epidemic. regardless, we show in the results section that even when this approximation does not hold, and although it results in oscillatory behaviour early on, it still describes the overall trends and long-term equilibrium adequately. this integral expression is equivalent to the integral form of an equation for a compartment model [ ] . it can be written in differential form in terms of the 'hazard functions' for the state $X$, $h_X = -\frac{1}{F_X}\frac{dF_X}{dt}$. in particular, $h_I = \gamma$. given the similarities between these equations and the ones describing the compartment models, it is natural to think of creating a specific compartment for $N(C_I \mid S_U)$. this is in fact what we do. there is, however, an important difference from regular compartments, because this compartment does not include individuals that exclusively belong to it; rather, it overlaps with $S_U$. it is more of a device used for book-keeping purposes, to compute the integral within the confines of the model, than a compartment in the usual sense.
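to make the link between survival and hazard functions concrete, the sketch below assumes the simplest case of a constant removal rate, for which the survival function is exponential, and recovers the hazard numerically as minus the logarithmic derivative of the survival function. the function names and the rate value are illustrative assumptions, not taken from the model's implementation.

```python
import math


def survival(t, tau, rate):
    """Probability of still being in a state at time t, having been in it at time tau."""
    return math.exp(-rate * (t - tau))


def hazard(t, tau, rate, dt=1e-6):
    """Numerical hazard h = -(1/F) dF/dt, using a central finite difference."""
    f = survival(t, tau, rate)
    df_dt = (survival(t + dt, tau, rate) - survival(t - dt, tau, rate)) / (2.0 * dt)
    return -df_dt / f


gamma = 1.0 / 7.0  # illustrative recovery rate per day, not a fitted value
print(hazard(t=5.0, tau=2.0, rate=gamma))  # close to gamma for any t >= tau
```

for an exponential survival function the hazard does not depend on elapsed time, which is why a constant rate such as the recovery rate can stand in for the hazard function of the infectious state.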
we similarly define $N(C_I \mid E_U)$, $N(C_I \mid I_U)$ and $N(C_I \mid R_U)$, which leads to contact tracing rates $\Delta(C_T)$ for each compartment. in addition, we establish transition rates between these $N$ compartments. there is a lot going on in these equations; most importantly, these new compartments do not conserve the total size of the population. their membership grows as contacts happen and shrinks as time passes. all the key processes can be summed up as follows:
• elements are 'created' for each state proportionally to the rate of contact with individuals belonging to $I_U$, adjusted with $1 - \beta$ in the case of $S_U$ to account for the likelihood that the contact is infective. these terms are 'sources' and can be recognised by having an arrow with nothing on its left in the subscripts;
• elements 'decay' at a rate that amounts to $\gamma$ (the hazard function for $I$, which always appears as it refers to the original infector) plus a rate representing the hazard function for the transition $X_U \to X_D$. these terms are 'sinks' and can be recognised by having an arrow with nothing on its right in the subscripts;
• elements move between compartments following the usual transitions that control the dynamics of the seir model (infection, progression of the disease, recovery). these terms are analogous to the corresponding ones connecting $X_U$ states, and contribute the remainder of the hazard function for each $X_U$ to the corresponding differential equations.
it must also be noted that, in practice, given the assumption that exposed and infectious individuals have always had a recent infectious contact, it must be that $N(C_I \mid E_U) = E_U$ and $N(C_I \mid I_U) = I_U$, which removes the need for two of the four compartments above and simplifies the equations. a few words are necessary on the hazard function for the $X_U \to X_D$ transitions.
this is approximated as $\eta\theta\chi$ in states $S_U$ and $R_U$ even though that is not precisely correct; the correct hazard function would be $\eta\theta\chi\,N(C_I \mid X_U)/X_U$, but that introduces a risk of instability for small values of $X_U$. we justify this choice by the following reasoning. in a weak testing regime ($\eta\theta\chi \ll \gamma$), $N(C_I \mid X_U)/X_U$ might be high due to a great number of infected individuals, but in principle should never be greater than 1 (modulo the point above about double counting). therefore, the hazard function is dominated by $\gamma$. conversely, in a strong testing regime, the number of infected individuals, and thus $N(C_I \mid X_U)/X_U$, will be very small, and this assumption will at most end up underestimating the effect of contact tracing (by causing a faster decay in $N(C_I \mid X_U)$ than would otherwise happen). the examples shown in the results section illustrate how this affects the simulations, in general leading to good predictions for the behaviour of the $E_U$ and $I_U$ compartments. taken together, the equations presented so far define our model entirely. the parameters that appear in these equations are summarised for reference in table . we implement the above ordinary differential equations and agent-based model in our ptti python package (https://github.com/ptti/ptti) using the compyrtment [ ] package, which facilitates the formulation of initial value problems. it is written for python and makes use of the scientific computation libraries numpy and scipy [ , ] as well as the optimisation library numba [ ] . the ptti package provides a declarative language for specifying simulations of models implemented as python objects. it supports the setting of model parameters and simulation hyper-parameters, as well as interventions that modify parameters at particular times, so that piece-wise simulations reflecting changing conditions can be conducted in a convenient and user-friendly way.
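the snippet below does not use the ptti or compyrtment interfaces, about which no claim is made here; it is a dependency-free sketch of a reduced version of the model, with testing but without contact tracing, integrated with a hand-written runge-kutta step. the compartment layout, the function names and all parameter values are assumptions chosen for illustration only.

```python
def seir_testing_rhs(state, beta, c, alpha, gamma, theta):
    """Right-hand side for (S_U, E_U, I_U, I_D, R); only I_U takes part in contacts."""
    s, e, i_u, i_d, r = state
    n = s + e + i_u + i_d + r
    new_infections = beta * c * s * i_u / n
    return (
        -new_infections,                    # susceptible
        new_infections - alpha * e,         # exposed
        alpha * e - (gamma + theta) * i_u,  # infectious, undiagnosed
        theta * i_u - gamma * i_d,          # infectious, diagnosed and isolated
        gamma * (i_u + i_d),                # removed
    )


def rk4_step(state, dt, **params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = seir_testing_rhs(state, **params)
    k2 = seir_testing_rhs(shifted(k1, 0.5 * dt), **params)
    k3 = seir_testing_rhs(shifted(k2, 0.5 * dt), **params)
    k4 = seir_testing_rhs(shifted(k3, dt), **params)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c_ + d)
        for x, a, b, c_, d in zip(state, k1, k2, k3, k4)
    )


def simulate(days, dt=0.1, population=100_000.0, seeds=10.0, **params):
    """Integrate from a small seed of undiagnosed infectious individuals."""
    state = (population - seeds, 0.0, seeds, 0.0, 0.0)
    for _ in range(int(round(days / dt))):
        state = rk4_step(state, dt, **params)
    return state


# Illustrative rates per day; none of these values is taken from the paper.
rates = dict(beta=0.033, c=13.0, alpha=0.2, gamma=1.0 / 7.0)
without_testing = simulate(300, theta=0.0, **rates)
with_testing = simulate(300, theta=0.1, **rates)
print("never infected, no testing:  ", round(without_testing[0]))
print("never infected, with testing:", round(with_testing[0]))
```

in this reduced setting testing only shortens the time an infectious individual spends in contact with others; the tracing terms described above would add a further outflow from the undiagnosed compartments.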
we hope that this software formulation will be useful for easy and rapid exploration of the effects of different intervention scenarios for disease outbreak control. our work outlines a method for extending the classic seir model to include testing, contact-tracing and isolation (tti) strategies. we show that our novel seir-tti model can accurately approximate the behaviour of agent-based models at far less computational cost. our adaptation is applicable across compartmental models (e.g. sir, sis etc) and across infectious diseases. we suggest that the seir-tti model can be applied to the covid- pandemic to understand the impact of possible tti strategy to control this outbreak. the importance of modeling to support decision making is widely acknowledged, but models are far more useful when they can accurately represent the classes of interventions that are being considered [ ] . the approach described in this paper is based on sound mathematical reasoning that assures accurate and efficient modelling of contact tracing and testing across a wide range of relevant parameter values. the ability to accurately model tti strategies across parameter values is vital for controlling disease outbreaks including the current covid- pandemic. effective testing, contact tracing and isolation strategies have been the key measures that have prevented the epidemic spreading in south korea [ ] , new zealand and germany [ ] . our work is novel as it is to date, and to the best of our knowledge, the first deterministic model to explicitly incorporate contact tracing. this has been until now only done with agent-based models. an important aspect of our approach is that our ode formulation explains the behaviour of the agent-based model. namely, agent-based models are formulated in terms of local interactions among individuals and exhibit emergent behaviour at the population level. 
for interesting agent-based models, it is usually difficult to obtain any explicit connection between the local interactions and the population-level dynamics except through simulation and inspection of the results. we argue that our work here shows such an explicit connection: we have been able to capture the dynamics that arise at a population level from testing and contact tracing. we show that this is correct by demonstrating good agreement with the population-level dynamics that emerge from the agent-based formulation where only local interactions are specified. the seir-tti model here considers disease propagation in the classical well-mixed setting. this is appropriate especially in circumstances where data are sparse and gives qualitatively similar results to those from fine-grained models that might otherwise provide more quantitatively accurate results if only more detailed data were available. in particular, well-mixed models do not include any notion of the network of contacts across which a contagion spreads in the real world. in reality, individuals in a large population are not equally likely to have contact with one another and it has long been known [ - , , , - ] that heterogeneity in underlying population structure can have a strong effect [ , [ ] [ ] [ ] on disease propagation. future work will include developing a better understanding of the relationship between network structure and effectiveness of tracing, and mathematical characterisation of the classes of solution available for these models. another extension is investigating the extent to which individual decisions about compliance with measures to reduce disease propagation (voluntary distancing, wearing of masks, etc.) affect the success of containment. a game-theoretical approach such as that considered by zhao et al. [ ] may produce useful insights into this question. insights gained from these extensions can inform policy design for relaxing onerous restrictions on the population. 
an important next step in this work is the real-time, policy-driven application of seir-tti. as our next piece of work we are planning to explore how the seir-tti model can be combined with economic analysis to guide decisions around the optimal design of a tti strategy that can suppress the covid- epidemic in the uk.
this paper gives a primer on how, using mathematical theory, the classic seir model can be extended to incorporate a testing, contact tracing and isolation strategy. the resulting seir-tti model is a key development in the widely used seir models, and an important step if these are to be useful in policy decision making during outbreaks. the long and successful history of testing, contact tracing and isolation in slowing and stopping the spread of infectious diseases is well known [ ] , with clear immediate importance for covid- control [ ] . the design of policies that include a variety of infectious disease control tools, and understanding and applying them in ways that are effective for society at large, is critical. tools and models that allow policymakers to better understand the policies and the dynamics of a disease are therefore equally important. if making policy decisions without evidence is flying blindly, making decisions without understanding the consequences of the various control measures is flying without flight controls. models like seir-tti can inform policymakers of the role that testing and tracing can play in preventing the spread of disease. combined with economic and policy analysis, this can enable far better decision making both in the immediate future and in the longer term.
the next step in our work is indeed this: the application of the seir-tti model combined with economic models to investigate the effect of different tti strategies to conquer the covid- epidemic in the uk. world health organization. who director-general's opening remarks at the media briefing on covid- - impact of non-pharmaceutical interventions (npis) to reduce covid- mortality and healthcare demand containing papers of a mathematical and physical character controlling infectious disease outbreaks: lessons from mathematical modelling modeling contact tracing in outbreaks with application to ebola the contribution of biological, mathematical, clinical, engineering and social sciences to combatting the west african ebola epidemic assessing optimal target populations for influenza vaccination programmes: an evidence synthesis and modelling study how should hiv resources be allocated? lessons learnt from applying optima hiv in countries exploring the role of mass immunisation in influenza pandemic preparedness: a modelling study for the uk context the efficacy of contact tracing for the containment of the novel coronavirus (covid- ) effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov- in different settings isolation and contact tracing can tip the scale to containment of covid- in populations with social distancing age-dependent effects in the transmission and control of covid- epidemics modelling sars-cov spread in london: approaches to lift the lockdown improving decision support for infectious disease prevention and control: aligning models and other tools with policymakers' needs infectious diseases of humans: dynamics and control modeling infectious disease dynamics in the complex landscape of global health mathematical modeling in epidemiology an introduction to stochastic epidemic models stochastic epidemic models: a survey epidemiology of transmissible diseases after elimination mathematical 
modeling in epidemiology methods and models in mathematical biology: deterministic and stochastic approaches. lecture notes on mathematical modelling in the life sciences some epidemiological models with delays mathematical approaches for emerging and reemerging infectious diseases: an introduction time delays in epidemic models the effect of integral conditions in certain equations modelling epidemics and population growth solution of delay differential equations via a homotopy perturbation method global stability of an sir epidemic model with time delays global asymptotic stability of an sir epidemic model with distributed time delay global behavior of an seirs epidemic model with time delays global behavior and permanence of sirs epidemic model with time delay estimation for discrete time branching processes with application to epidemics mathematical modeling in epidemiology a primer on stochastic epidemic models: formulation, numerical simulation, and analysis. infectious disease modelling individual-based perspectives on r agent-based simulation tools in computational epidemiology formalizing the role of agent-based modeling in causal inference and epidemiology a taxonomy for agent-based models in human infectious disease epidemiology agent-based modeling in public health: current applications and future directions individualbased computational modeling of smallpox epidemic control strategies epidemic dynamics and endemic states in complex networks when individual behaviour matters: homogeneous and network models in epidemiology contact network epidemiology: bond percolation applied to infectious disease prediction and control reasoning about a highly connected world spatial epidemiology of networked metapopulation: an overview mathematics of epidemics on networks: from exact to approximate models modelling strategies for controlling sars outbreaks modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave 
scenarios of the covid- epidemic. institute for biocomputation and physics of complex systems preprint modelling the covid- epidemic and implementation of population-wide interventions in italy social distancing strategies for curbing the covid- epidemic seasonality and period-doubling bifurcations in an epidemic model formal molecular biology the kappa language and tools contact tracing and disease control exact stochastic simulation of coupled chemical reactions an essay towards solving a problem in the doctrine of chances. by the late rev time-varying and state-dependent recovery rates in epidemiological models python for scientific computing python for scientific computing numba: a llvm-based python jit compiler llvm ' transmission potential and severity of covid- in south korea countries test tactics in 'war' against covid- comparison of populations whose growth can be described by a branching stochastic process: with special reference to a problem in epidemiology heterogeneity in disease-transmission modeling epidemic spreading in real networks: an eigenvalue viewpoint modeling covid- on a network: super-spreaders, testing and containment the disease-induced herd immunity level for covid- is substantially lower than the classical herd immunity level individual variation in susceptibility or exposure to sars-cov- lowers the herd immunity threshold strategic decision making about travel during disease outbreaks: a game theoretical approach universal weekly testing as the uk covid- lockdown exit strategy the authors would like to thank greg colbourn, vincent danos, gabriel goh and rafaele vardavas for insightful comments on early drafts of this manuscript. this work used the cirrus uk national tier- hpc service at epcc (http://www.cirrus.ac.uk) funded by the university of edinburgh and epsrc (ep/p / ). ww was supported by the chief scientist office scotland (cov/edi/ / ). 
jpg was supported by the national institute for health research (nihr) applied health research and care north thames at bart's health nhs trust (nihr arc north thames). the funders had no role in study design, data collection, data analysis, data interpretation, or writing of the report. the views expressed in this article are those of the authors and not necessarily those of the nhs, the nihr, or the department of health and social care. ss, ww and jpg came up with the idea of the study. ss, ww and jpg developed the seir-tti model with input from tc and dm. ss and ww coded the model. ww, ss and jpg drafted the paper with inputs from tc and dm. the final version of the paper was approved by all authors. key: cord- - qr sak authors: ferrari, a.; santus, e.; cirillo, d.; ponce-de-leon, m.; marino, n.; ferretti, m. t.; santuccione chadha, a.; mavridis, n.; valencia, a. title: reproducing sars-cov- epidemics byregion-specific variables and modeling contacttracing app containment date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: qr sak targeted contact-tracing through mobile phone apps has been proposed as an instrument to help contain the spread of covid- and manage the lifting of nation-wide lockdowns currently in place in usa and europe. however, there is an ongoing debate on its potential efficacy, especially in the light of region-specific demographics. we built an expanded sir model of covid- epidemics that accounts for region-specific population densities, and we used it to test the impact of a contact-tracing app in a number of scenarios. using demographic and mobility data from italy and spain, we used the model to simulate scenarios that vary in baseline contact rates, population densities and fraction of app users in the population. 
our results show that, in support of efficient isolation of symptomatic cases, app-mediated contact-tracing can improve containment and achieve successful epidemic mitigation even with a relatively small fraction of the population using it, and, as adoption increases, suppression. however, when regional differences in population density are taken into consideration, the epidemic can be significantly harder to contain in higher density areas, highlighting potential limitations of this intervention in specific contexts. this work corroborates previous results in favor of app-mediated contact-tracing as a mitigation measure for covid- , and draws attention to the importance of region-specific demographic and mobility factors in achieving maximum efficacy in containment policies. the spread of covid- has raised new challenges for healthcare systems all over the world, hitting europe and the usa with particular strength, after china. according to the available data, italy currently has among the highest numbers of contagions and death tolls from covid- , with over , confirmed cases and more than , deceased as of may th . however, the spread of covid- has been quite heterogeneous in speed, reach, and lethality, not only from country to country, but also in different regions of the same country. the main possible explanation is the delay between the onset of the epidemic, the first diagnosis and the kick-off of containment measures. other reasons may be due to region-specific variables, such as population density and mean age, societal structure and behaviors. a third factor is the set of policies adopted for containment and testing, in particular with respect to the fraction of infectious individuals that do not display symptoms (asymptomatic).
most countries dealing with the epidemics have resorted to nation-wide lockdowns and social distancing to slow down the outbreak; however, managing the epidemics in the long term will likely require the use of information technology to help implement measures of containment and mitigation. in particular, precise identification of cases and contact tracing and isolation can hardly be performed with traditional methods, and the use of targeted phone apps could highly improve the efficiency of these processes, as shown by the experience of multiple asian countries -such as south korea. different infrastructures and working interfaces for such an instrument have been proposed, and its potential impact on the virus's reproductive rate has been studied [ , , ] . tracing apps seem to have a key role in ensuring that the epidemic remains sustainable on the healthcare systems, not exceeding their capabilities, which would otherwise lead to excess mortality. in this proof-of-concept study we built a comprehensive framework to model the covid- epidemic, taking into account population density, the different contributions of symptomatic, pre-symptomatic and asymptomatic contagions, and we used it to test the efficacy of targeted intervention such as the aforementioned contact tracing app. in contrast to previous work [ , ] aimed at modeling the two-way dynamic between individual behaviors and containment policies, we chose to build a global compartmental modeling framework that can account for region-specific factors, such as the effect of population density on contact rate or the role of expected compliance to containment procedures. our research builds a model that allows testing the effect of both case isolation and app-mediated interventions in a region-specific fashion. 
we built an improved susceptible-infectious-recovered (sir) model with the aims of a) faithfully reproducing the dynamics of the sars-cov- epidemics, including the respective roles of asymptomatic infection and population density; and b) testing the effects of specific interventions, specifically the use of phone apps for contact tracing. as italy was the first western country affected by sars-cov- and one for which region-specific and intervention-related data were readily available, our analysis focused on the italian case. this has allowed us to study how the virus spreads at a very different pace in different italian districts. the typical sir model assumes that some degree of immunity, at least temporary, is acquired after sars-cov- infection; therefore, it is assumed that subjects move from the $S$ (susceptible) compartment to the $I$ (infectious) compartment, and from there, with a daily rate equal to the inverse of the recovery time, to the $R$ (recovered) compartment, until the relative densities of $S$ and $I$ become too low for the epidemic to continue. in order to simulate the behavior of asymptomatic and pre-symptomatic individuals, an $A$ (asymptomatic) and a $P$ (pre-symptomatic) compartment were added to the model. to simulate the effect of targeted quarantine measures, we introduced a series of $Q_*$ compartments indicating the number of subjects that are quarantined and their status regarding the disease. to model the reversible transition from the different compartments into the state-specific quarantine compartments, there are four different $Q$ compartments: $Q_S$, $Q_I$, $Q_A$ and $Q_P$. (the copyright holder for this preprint, which was not certified by peer review, is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd . international license. this version posted may , .) whenever they are
infected, individuals are supposed to move from the $S$ compartment to the $A$ or $P$ compartments, with respective probabilities $p_a$ and $p_i = 1 - p_a$. subjects in the $P$ compartment then transfer to the $I$ compartment with a daily rate of $1/\tau_i$, where $\tau_i$ is the incubation period. we also assume that asymptomatic and pre-symptomatic patients are less infectious than symptomatic individuals by a factor $f$. however, there is evidence that pre-symptomatic individuals are highly infectious already about $\tau_d$ = days before symptom onset [ ] , and the focus of our model is transmission dynamics, rather than the manifestation of symptoms. in this respect, a $P$ subject is already akin to an $I$ subject $\tau_d$ days before symptom onset. therefore, we let $P$ subjects move into the $I$ compartment after an incubation period that is $\tau_d$ = days shorter, while at the same time increasing the recovery time by the same amount. our model assumes the use of a phone app that keeps track of contacts and, once a symptomatic case is identified, notifies everyone who had contact with that case in the pre-symptomatic period, so that they can enter a voluntary quarantine. such an application relies heavily, on the one hand, on the effectiveness of case isolation by the authorities, and on the other hand on compliance and widespread use of the system by the population. thus, we assume, for our model, that a j fraction of symptomatic cases is identified and undergoes perfect quarantine with zero contacts. we call j the fraction of the population using the app and we assume that, once in quarantine, they reduce their contact rate by a factor $q$. therefore, j is the fraction of contacts that happen between individuals using the app. quarantined subjects that develop the infection undergo the entire course of the disease, whereas those that were not infected in the contact eventually exit quarantine after the quarantine period $\tau_q$ ( days) and become susceptible again.
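under the assumption that app usage is independent between the two parties of a contact (an assumption of this sketch, not a statement about the data), the share of contacts in which both individuals run the app is the square of the user fraction. the helper name below is hypothetical and the adoption levels are illustrative.

```python
def fraction_of_traceable_contacts(app_fraction):
    """Share of contacts where both individuals use the app, under independent mixing."""
    if not 0.0 <= app_fraction <= 1.0:
        raise ValueError("app_fraction must lie in [0, 1]")
    return app_fraction ** 2


for users in (0.25, 0.5, 0.75):  # illustrative adoption levels
    print(users, fraction_of_traceable_contacts(users))
```

this is why the share of contacts that can be notified grows more slowly than adoption itself at low uptake: both sides of a contact need the app for a notification to be possible.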
in the classic sir model, the rate of transfer between the s and i compartments depends on the transmission rate β, that is, the product of the number of contacts per subject, c, and the probability of transmission during a single contact, µ [ ] . to estimate contact rate as a function of population density we built on previous results by rhodes and anderson [ ] , who derived a formula for estimating the daily contact rate of a subject in a population with density ρ moving with velocity v, where r is the minimum distance within which two individuals can be said to have a "contact"; for covid- , and other air-borne diseases transmitted by droplets expelled from nose or mouth, it is commonly estimated as m [ ] . in our model we assume that susceptible subjects move to the p compartment with a rate proportional to the probability of meeting an a, i or p subject, assuming that at least one of the two does not use the tracing app and/or the case is not successfully identified. this is modeled by making the rate dependent on 1 − jj . the remaining jj fraction of the contacts between s and i individuals leads to a transfer from s to one of the three qp , qa and qs compartments, each with probability equal to p i , p a or 1 − p i − p a , i.e. the probability of the contact leading to symptomatic infection, asymptomatic infection or no infection. individuals in the qp compartment eventually transfer to the qi compartment after incubation, whereas qa and qs transfer to r and to s respectively, with rates equal to 1/τ i and 1/τ q . the structure of the model is shown in figure and is described by the following set of differential equations.
we define α = jj as the proportion of contacts that are successfully contained. the model is governed by a set of stochastic differential equations ( - ) that depend on a set of different parameters. for several parameters, values were estimated from the bibliographic references (see table ), whereas for those parameters where no reference values were found, we set plausible values based on other models. for covid- attack rate µ we adopted data from the epidemic in shenzhen, where secondary attack rate was estimated between . and . [ ] . according to ferretti et al. the most likely estimates for f and p a are . and . . the . estimate for the asymptomatic fraction is corroborated by a recent epidemiological study on covid- prevalence in veneto [ ] . in simulation we included some basic vital dynamics by including a d (deceased) compartment (not shown), for which we assumed an overall mortality of % in symptomatics. mortality estimates for covid- are actually still quite uncertain, but infectious removal by mortality is not expected to significantly affect epidemic trends. the simulations were run using r package siminf , a system for stochastic simulation of data from compartmental models of epidemics [ ] . for our experiments we assumed that j = % of symptomatic cases are identified and perfectly quarantined at symptom onset, i.e., two days after their infectiousness increases. on the other hand, contacts undergoing voluntary self-quarantine are assumed to reduce their contact rate ten-fold (q = . ). we simulated scenarios with varying contact rate c and, most importantly, assuming a different proportion j of app users in the population ( . , . , . and no users).
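as an illustration of the compartment flows described above, the following is a minimal deterministic sketch of the s, p, a, i and r compartments only, integrated with a forward euler step. it is not the authors' siminf model: the quarantine compartments and stochasticity are left out, the function name `simulate_spair` and all parameter values are hypothetical, and the recovery rate 1/tau_r for asymptomatic subjects is an assumption.

```python
def simulate_spair(beta, p_a, f, tau_i, tau_r, days, n, i0, dt=0.1):
    """Forward-Euler integration of a simplified S-P-A-I-R model.

    Assumptions (not taken from the source equations): no quarantine
    compartments, deterministic dynamics, and A recovers at rate 1/tau_r.
    """
    s, p, a, i, r = float(n - i0), 0.0, 0.0, float(i0), 0.0
    peak_i = i
    for _ in range(int(round(days / dt))):
        # Force of infection: A and P transmit at a fraction f of I.
        lam = beta * (i + f * (a + p)) / n
        new_inf = lam * s * dt          # S -> P or A
        p_to_i = p / tau_i * dt         # end of incubation
        a_to_r = a / tau_r * dt         # asymptomatic recovery (assumed rate)
        i_to_r = i / tau_r * dt         # symptomatic recovery
        s -= new_inf
        p += (1.0 - p_a) * new_inf - p_to_i
        a += p_a * new_inf - a_to_r
        i += p_to_i - i_to_r
        r += a_to_r + i_to_r
        peak_i = max(peak_i, i)
    return {"S": s, "P": p, "A": a, "I": i, "R": r, "peak_I": peak_i}
```

calling `simulate_spair` with two values of beta (the product of contact rate c and attack rate µ) and the same remaining arguments gives a quick feel for how the height of the symptomatic peak responds to the contact rate.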
each simulation was run on nodes representing italian districts (with data on area and resident population updated to ), and was repeated times, with globally , simulations per scenario. for a first set of simulations we assumed a constant population density for all the nodes; this equals the assumption of a unique transmission rate and, therefore, a unique r for the entire set, so that the only region-specific factor affecting the outcome was relative population. transmissibility has been previously estimated, based on data from the diamond princess outbreak, at . [ ] ; accordingly, assuming . attack rate, in this scenario we set a contact rate equal to . . however, this was calculated in a particular scenario where contact rate was supposedly higher than in normal conditions, thus we simulated two more scenarios with lower contact rates, and . . a second set of simulations accounted for different population densities across the districts. here, contact rate was allowed to vary following population density under an assumed average daily distance traveled by the subjects, according to eq. . mobility data from two sample european cities (berga and barcelona) show an average daily distance traveled per person varying between . and . km (personal communication); starting from these figures we simulated scenarios in which subjects travel, on average, . , . and km per day. an interactive r shiny web application, enabling the exploration of simulation scenarios, is available at https://davidecirillo.shinyapps.io/ct_ app/. results of the simulations are summarized in figures and , showing the time curves of the sum of the i and qi compartments and expected mortality.
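the density-dependent contact rate used in the second set of simulations can be sketched as follows. the kinetic-theory form c = 8 r ρ v / π is an assumption here: the equation itself is not reproduced in the text above, and this is the form usually associated with a two-dimensional gas of individuals moving at speed v with contact radius r. the function names and the units (people per square km, km per day, metres) are hypothetical choices made for the example.

```python
import math


def contact_rate(density_per_km2, daily_distance_km, contact_radius_m):
    """Daily contacts per subject, assuming c = 8 * R * rho * v / pi.

    The functional form is an assumption (two-dimensional kinetic theory);
    R is converted to km so that the result is in contacts per day.
    """
    radius_km = contact_radius_m / 1000.0
    return 8.0 * radius_km * density_per_km2 * daily_distance_km / math.pi


def transmission_rate(contacts_per_day, attack_rate):
    """beta = c * mu: contacts per day times per-contact transmission probability."""
    return contacts_per_day * attack_rate
```

under this form the contact rate, and hence the transmission rate, scales linearly with both population density and average daily distance traveled, which is why districts with very different densities can sit on opposite sides of the suppression threshold for the same app uptake.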
clearly the epidemic peak is expected to vary with increasing contact rate, assuming that transmissibility and recovery rate are constant. as expected, in all simulated scenarios, app-aided contact tracing significantly decreased the effective reproductive number r t and the height of the epidemic peak. in scenarios with constant contact rate equal to . , isolation of symptomatic cases was per se very effective in slowing the epidemic, so much so that app-mediated contact tracing managed to achieve suppression even with only % (fig , upper right panel) of the population using the app and complying with self-quarantine, whereas with % peak height was reduced more than -fold (fig , upper right panel) . this shows that, in scenarios with lower baseline contact rate and efficient isolation of cases, app-mediated contact tracing can achieve epidemic suppression. on the other hand, nation-wide suppression was not achieved in the less optimistic scenarios with and . contact rate; however, the app induced a very effective mitigation, with peak number of infectious reduced roughly -fold in the worst-case scenario and with % of the population using the app (fig. , right middle and lower panels). these results highlight the benefit of introducing contact tracing as a measure of pandemic prevention and control, as well as the positive impact that this would have, especially in critical circumstances. in scenarios where contact rate was allowed to vary with population density the epidemic trend was, as expected, significantly different from region to region. in figure , left panels, the curves of symptomatic infections over time are
shown; compared to simulations with constant contact rate, two more or less distinct epidemic peaks are evident, reflecting groups of districts with different population densities. however, the distinction tends to disappear with increasing proportions of app users. in particular, in all the scenarios where % of the population used the app (dashed-dotted lines), suppression was indeed achieved in most regions, but the epidemics did not die out, being almost entirely sustained by the districts with the highest population density (milan, monza, naples; fig. ). this result is, clearly, achieved by using the app to augment an efficient tracking and isolation of new symptomatic cases, and indicates that, as for all interventions, effectiveness of app-mediated contact tracing and voluntary quarantine should be evaluated in the light of region-specific differences. figure showing successful/unsuccessful suppression in the first days. suppression is achieved in the most optimistic scenario (contact rate . , app users %) in the scenarios with fixed contact rate (right panels, upper), whereas its success varies from district to district in scenarios with density-dependent contact rate (left panels). the model also makes it possible to keep track of the number of i subjects successfully quarantined ("true positives") and of the subjects that underwent quarantine without actually being infected ("false positives"), by tracing the population in the qi (quarantined-infectious, subjects that were rightfully quarantined) and qs (quarantined-susceptible, subjects that were quarantined but did not contract infection) compartments.
the maximum number of subjects in each compartment at the same time in scenarios with % of app users is summarized in table . both false positives and true positives are naturally dependent on the success or failure of epidemic suppression, and will be very low when suppression is achieved. in density-dependent scenarios the maximum number of susceptible subjects quarantined at the same time ranged from , to , , , whereas in scenarios with fixed density the variation was much higher, mainly due to the fact that with contact rate c = . and % app users suppression is achieved. the use of a targeted app for contact tracing has been proposed as a means to control the covid- epidemics when lockdown measures are lifted. it has been shown that, given certain combinations of efficacy in case identification and compliant use of such an instrument, the approach can contribute to bringing the effective reproductive number r t of the disease below 1 [ ] . here we aimed to model the effect of app-mediated contact tracing taking into account population density and transportation, at the same time making it possible to monitor the number of patients that are quarantined and their status concerning the infection. this is a much needed approach if we wish to implement precise, timely and specific interventions that keep infections and contagion sustainable for health care systems. table : average maximum "true positives" (qi) and "false positives" (qs) in the scenarios with % app users.
the model uses a series of q * compartments to model the behavior and status of subjects that are quarantined for symptomatic infection or based on contact tracing. to our knowledge, this is the first model that allows simulation and prediction of the outcomes of the epidemic while accounting for both differential population density and quarantine measures. this is particularly important since it makes it possible to visualize the effect of contact-tracing apps along the entire duration of the epidemics, and also because the management of the infection has to take into account the specific characteristics of a given region and implement measures accordingly. in fact, the application of containment policies disregarding region-specific conditions can result in measures which are not needed or too drastic. as a consequence, rather than providing support, such policies might become a burden on the psychological well-being of people as well as being detrimental to the economy. despite the many parameters considered in our model, our work still falls into a classic compartmental epidemiologic framework; this is a net advantage in terms of interpretability of results and generalizability. to our knowledge, this is also the first model that allows keeping track of subjects that undergo voluntary quarantine. according to our model, case isolation is per se a very effective containment measure that, as long as cases are identified and isolated with a very high success rate, can achieve suppression of the epidemics in a series of theoretical scenarios. however, coupling case isolation with immediate app-mediated contact tracing has a remarkable impact on the success of the strategy, achieving in all scenarios a very effective mitigation of contagion and, in some scenarios,
full suppression. feasibility of epidemic control by app-mediated contact tracing has been suggested already by ferretti et al.; by decomposing contributions to r from symptomatic, pre-symptomatic, asymptomatic and environmental sources, they show that r t < 1 can be attained for certain combinations of efficiency in case isolation and compliance in contact tracing. our model makes it possible to directly simulate such scenarios while at the same time keeping track of the trends in isolated and quarantined cases. this is particularly relevant as it makes it possible to quantify the effect of interventions on specific compartments, e.g. it is possible to trace the number of individuals that are quarantined at a specific time-point, a piece of data that is potentially very helpful in designing cost-effectiveness analyses of containment measures. another important result comes from our simulations with a density-dependent contact rate. in our simulation different italian districts behaved very differently, and in all scenarios suppression was easily attained in the less densely populated regions, whereas it failed in the others (fig. ) . this is consistent with the different epidemic trends that have been observed to date in italian districts; however, it must be pointed out that differences between districts may also be explained by different approaches in dealing with the epidemics and by the time to first diagnosed case versus the number of people already infected in the population and not yet recognized and, most importantly, are influenced by the nation-wide/region-wide lockdown put in place by the central government. it is also of note that by making contact rate dependent on both density and daily distance traveled, our model takes into account the potential effectiveness of policies aimed at optimizing and regulating transportation, especially in high density regions.
according to the model, effective suppression of the epidemics in such areas is strongly dependent on such measures. the main limitation of our work is the uncertainty in the parameters that have to be plugged into the simulations. we adopted some credible figures for the asymptomatic/symptomatic infections ratio and for relative decrease in infectiousness in asymptomatic subjects, as well as for probability of infection per contact; however, the greatest uncertainty is precisely in the estimation of contact rate, as this is a variable that is influenced by specific environmental and cultural factors, e.g. individual mobility, social interactions, transportation systems, as well as general social distancing measures that have been implemented wherever the epidemics took place. our first choice estimate for contact rate came from the experience of the diamond princess, based on which we estimated it at a very impressive figure of . , a condition that would hardly allow for suppression in a nation-wide context. however, it is evident that the situation in a closed environment favors a higher contact rate, thus this number is likely to be significantly overestimated. this makes the scenario a worst-case one, suggesting that, in real world experience, case isolation and contact tracing may be more effective than predicted. in scenarios where we modeled contact rate as a function of population density, the most relevant scaling factor is v, i.e. the theoretical average daily distance traveled by individuals. this is a measure that is not readily estimable, which is why in the present work we showed results for a series of credible scenarios; however, using mobility data from european cities we managed to obtain credible figures to plug into the model.
our model does take into account interpersonal distance in the form of the r parameter; a potential expansion of this framework is to account for time spent within the m interpersonal distance, as well as distinguishing between high-risk (e.g. taxis, buses) and low-risk (e.g. walking, personal car) means of transportation. however, this work proves the feasibility of including population density and transportation in an expanded sir model, and suggests that, even if travel between districts is forbidden, the epidemics may still be significantly harder to contain in areas with very high population density (for example, in italy, the districts of milano, monza and napoli). the model can be further improved and expanded by adding age-specific compartments, sex and gender factors, and risk classes, by refining the implementation of vital dynamics, and by modeling different methods for contact tracing with varying degrees of compliance. we approached the model with a simulation-based approach; this is computationally intensive but still manageable by most software and hardware and less demanding than stochastic models based on individual data, and allows for a high degree of customizability by fine-tuning its parameters on specific interventions. the model constitutes a viable framework to monitor epidemic trends and assess the effect of interventions. our results show that (1) voluntary self-quarantine based on contact-tracing apps, together with efficient case isolation, can give a relevant, and in some scenarios decisive, contribution to epidemics mitigation/suppression; (2) at the same time, the success of this strategy can depend heavily on population density and transportation. this material is based upon work supported by the european commission's horizon program, h -ict- - - , "infore -interactive extreme-scale analytics and forecasting" (ref. ).
epidemiology and transmission of covid- in shenzhen china: analysis of cases and , of their close contacts quantifying sars-cov- transmission suggests epidemic control with digital contact tracing temporal dynamics in viral shedding and transmissibility of covid- the incubation period of coronavirus disease (covid- ) from publicly reported confirmed cases: estimation and application suppression of covid- outbreak in the municipality of vo, italy. in: medrxiv persistence and clearance of viral rna in novel coronavirus disease rehabilitation patients transmission dynamics and control of severe acute respiratory syndrome a spatiotemporal epidemic model to quantify the effects of contact tracing, testing, and containment covid- : estimating spread in spain solving an inverse problem with a probabilistic model considerations for quarantine of individuals in the context of containment for coronavirus disease (covid- ) contact rate calculation for a basic epidemic model covid- outbreak on the diamond princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures on air-borne infection: study ii. droplets and droplet nuclei report of the who-china joint mission on coronavirus disease (covid- ) siminf: an r package for data-driven stochastic disease spread simulations key: cord- -zxsm og authors: bengio, yoshua; janda, richard; yu, yun william; ippolito, daphne; jarvie, max; pilat, dan; struck, brooke; krastev, sekoul; sharma, abhinav title: the need for privacy with public digital contact tracing during the covid- pandemic date: - - journal: lancet digit health doi: .
/s - ( ) - sha: doc_id: cord_uid: zxsm og nan the need for privacy with public digital contact tracing during the covid- pandemic digital contact tracing applications represent a powerful yet controversial strategy to combat the covid- pandemic. manual contact tracing has important challenges, not limited to recall bias and delays in communicating with high-risk contacts. digital technologies are already increasingly used in the context of health-care delivery and clinical trials. due to the considerable strain on public health institutions, digital contact tracing through mobile phones is being used or explored in a growing number of countries despite concerns raised over individual privacy and state surveillance. mobile phone-enabled digital contact tracing colocalises individuals in time and space through the use of gps, bluetooth, or other such technologies. google and apple have promised to provide frameworks for how to use their technologies for contact tracing. a digital contact trail can be created when individuals who have downloaded such applications come into physical proximity. machine-learning strategies can improve on simple binary contact tracing systems by providing methods to calculate quantifiable individual risk of acquiring covid- depending on specific features such as distance and duration of interaction, self-reported comorbidities, demographics, and the presence of any symptoms in each individual in an interaction. as an individual's risk level for acquiring covid- increases, various behavioural messages can be delivered quickly to enable the individual to take appropriate, measured action. these multiple advantages have the potential to establish rapid epidemiological control of the pandemic. despite the potential advantages, most of the applications in use or under consideration have an impact on individual privacy that democratic societies would normally consider to be unacceptably high. 
in a free and democratic society, there are major concerns regarding privacy. the uk, australia, singapore, south korea, and other countries have deployed such tools (using binary variables of contact, not scalar risk probabilities for risk of infection); however, these applications have come under scrutiny relating to the ability of governments and other groups to access personal information. public trust in the use of these applications is paramount because widespread adoption of these technologies is needed for them to be effective in curbing viral transmission. indiscriminate collection of personal information, chronic privacy breaches, and lax attitudes towards individual privacy in the private sector have eroded public trust in digital technologies. moreover, tracing applications raise the spectre of generalised state surveillance in the face of the pandemic, with potentially devastating consequences if democratic societies learn to accept such an intrusion on civil liberties.
published online june , https://doi.org/ . / s - ( ) -
• download, installation, and use of the application must be entirely voluntary, and users must be able to uninstall the application at will
• there must be express consent for all collection, use, and disclosure of personal information (ie, users might choose to share some data and not others, such as official test results or to feed a machine-learning model)
• individuals must be able to opt-in or opt-out of data sharing. this includes consent to download the application, turn on location services, receive notifications, and share covid- test results
• a non-partisan independent oversight committee with representatives from legal, health, machine-learning, and privacy experts should be established to oversee ongoing development of the application, its information ecosystem, and data governance
• importantly, public representatives must be included in this oversight committee
virtual data acquisition
• no identifiable information regarding digital contact trails or personal health information that an individual enters on the application should be shared with other application users or public, private, and governmental agencies
• individual geolocation data should not be stored on a central server and should pass through a rigorous obfuscation protocol to reduce their information content to the bare minimum required for epidemiological and machine-learning modelling
• pseudonymised data should be used to inform machine-learning models, and only these data should be stored centrally on a protected server
• only non-identifiable aggregated data should be shared with public health institutions
• the source code of the application and the algorithms used should be made accessible for public scrutiny
• personal identifiable information should be deleted from the device once the pandemic is over
• user preferences should drive end-to-end experience
• user comprehension should be prioritised and verified rather than assumed
• user psychosocial wellbeing should be promoted
• user empowerment to protect themselves and others should be maximised
• user inclusivity should acknowledge the diversity of user needs in dimensions such as gender, race, education, and rural vs urban location
therefore, to counteract both negative perceptions and genuine threats, a privacy-protecting approach must be central in the development of such a contact tracing application. several strategies can be leveraged to increase and maintain the public trust with such applications (panel). express consent at each step of data sharing is crucial and must be meaningful, not buried within lengthy privacy policies or vague language agreements, and includes express consent to anonymously share covid- test results. no identifiable data should be shared with any public institution or private enterprise. pseudonymised or aggregate data can be adequately used to develop machine-learning and epidemiological models and inform public policy. otherwise data should be kept encrypted on users' devices and inaccessible to public authorities or private interests. the tracing application itself can propagate alerts to high-risk contacts and can recommend that users voluntarily contact health authorities where relevant, thereby assisting markedly in contact tracing while minimising the potential for state surveillance, snooping, or vigilantism. the granular non-identifying information used to train machine-learning models generally contains sufficient detail to re-identify individuals when correlated with other sources of data. this is why an independent, non-partisan trust or similar fiduciary structure must be established to protect and control access to these data, and manage the application and its ongoing development. the source code for the application and the privacy protocols used should be publicly available. individuals must be able to make independent informed choices based on recommendations released from the application rather than using coercive or penalising strategies. 
an application self-destruction strategy should be used so that once the pandemic is over, all application-related personal data is deleted from participants' phones and from the machine-learning server, leaving for further research only deidentified, aggregated, and statistical data, or artificial data generated from the epidemiological model. the approach presented here advocates that consent must be explicit for users to download the application, transmit covid- test results, and share data for research. recent projections suggest that at least % of a country's population would need to be using the application to ensure maximal chance of epidemiological control of the covid- pandemic. there is a tension between mandating use of the application and the consent-based approach that we are advocating. in the face of such tension, the trade-off between individual civil rights and the need for population-level control of the covid- pandemic comes to the forefront. trust in the application by individuals is pivotal for such applications to have population-level benefit. we would suggest that advocating an approach that emphasises consent and prevents any central public or private authority from accessing identifiable data would embolden more individuals to download the application, thereby optimising the population-level benefit. various designs are currently in place with regard to strategies for identifying contacts, the types of notifications that are received, and the use of centralised versus decentralised approaches. one question that arises in a system that emphasises a consent-based, opt-in approach is whether, for individuals who do not receive a notification, the absence of the notification implies the absence of contacts with other individuals with a covid- infection or that other users are not consenting to share data.
the absence of notifications might create a false sense of security in the user of the application or cause frustration if a user presumes that others are not sharing information. this limitation of such opt-in applications emphasises the need for broad public outreach and education to optimise the number of users who download the application and consent to share data. leveraging digital contact tracing technologies can change the course of the covid- pandemic. such technologies must robustly support democratic principles of privacy to maintain public trust and to enable individuals to make informed choices to help combat the pandemic. impact of contact tracing on sars-cov- transmission using digital health technology to better generate evidence and deliver evidence-based care cellphone tracking could help stem the spread of coronavirus. is privacy the price? privacy-preserving contact tracing deep learning quantifying sars-cov- transmission suggests epidemic control with digital contact tracing how coronavirus is eroding privacy yuval noah harari: the world after coronavirus effective configurations of a digital contact tracing app: a report to nhsx key: cord- - m cur authors: plank, m. j.; james, a.; lustig, a.; steyn, n.; binny, r. n.; hendy, s. c. title: potential reduction in transmission of covid- by digital contact tracing systems date: - - journal: nan doi: . / . . . sha: doc_id: cord_uid: m cur digital tools are being developed to support contact tracing as part of the global effort to control the spread of covid- . these include smartphone apps, bluetooth-based proximity detection, location tracking, and automatic exposure notification features. evidence on the effectiveness of alternative approaches to digital contact tracing is so far limited.
we use an age-structured branching process model of the transmission of covid- in different settings to estimate the potential of manual contact tracing and digital tracing systems to help control the epidemic. we investigate the effect of the uptake rate and proportion of contacts recorded by the digital system on key model outputs: the effective reproduction number, the mean outbreak size after days, and the probability of elimination. we show that effective manual contact tracing can reduce the effective reproduction number from . to around . . the addition of a digital tracing system with a high uptake rate over % could further reduce the effective reproduction number to around . . fully automated digital tracing without manual contact tracing is predicted to be much less effective. we conclude that, for digital tracing systems to make a significant contribution to the control of covid- , they need to be designed in close conjunction with public health agencies to support and complement manual contact tracing by trained professionals. contact tracing has become a key tool in the global effort to control the spread of covid- . contact tracing has been crucial in controlling several disease outbreaks, notably sars, mers and ebola (who & cdc, ; kang et al., ) . while contact tracing alone is unlikely to contain the spread of covid- (kucharski et al., ) , in countries like new zealand where cases have been reduced to very low numbers (cousins, ; binny et al., ) , it may allow population-wide social distancing measures to be relaxed. in countries with more widespread epidemics, it can allow safe reopening. manual contact tracing typically involves interviewing confirmed cases about their recent contacts, getting in touch with those contacts and asking them to take measures to prevent onward transmission of the disease in case they are infected.
such measures may include limiting their interactions with others, formal quarantine, getting tested, or remaining vigilant for symptoms. manual contact tracing is intensive work and requires highly trained public health professionals to be implemented effectively (verrall, ) . it is also difficult to scale manual contact tracing up to deal with very large outbreaks. in response, many countries have attempted to develop digital contact tracing systems using smartphone apps. there are multiple different approaches to this problem, for example qr code-based systems, bluetooth-proximity based apps, automated exposure notification features, and systems that do not rely on smartphones such as card-based proximity detection. these systems can offer, to varying extents, three main benefits to controlling covid- : (i) an increase in the proportion of contacts who are traced (e.g. contacts that would not be traced by case recall but are recorded by the digital system); (ii) a reduction in the time taken to identify and notify traced contacts (e.g. via an exposure notification feature); (iii) improved scalability over manual contact tracing. the effectiveness of digital contact tracing is still unproven, with limited real-world data (anglemyer et al., ) . here, we use a model of covid- transmission and contact tracing to evaluate the potential of digital contact tracing systems to reduce the spread of covid- . we evaluate the benefits of digital contact tracing both alone and in combination with manual tracing, over a range of uptake rates, tracing probabilities, and the effectiveness of quarantine. transmission and contact tracing model. we use an age-structured branching process model for covid- transmission and contact tracing that is an extension of the age-structured model of james et al. ( ) to include different contact types (see below). we assume that the outbreak is sufficiently small that the effect of infection-induced population immunity can be ignored. 
the time from infection to symptom onset is gamma distributed with mean . days and standard deviation . days (lauer et al., ). infectiousness is a weibull function, time-shifted such that % of transmission occurs prior to symptom onset (ferretti et al., ; ganyani et al., ). infections have an age-dependent probability of being subclinical (davies et al., ) that decreases linearly from % in the - year age group to % in the over years age group. subclinical infections are assumed to be % as infectious as clinical cases. contacts are categorised into one of four different types: home, work, school and casual. each contact has a probability, called the secondary attack rate (sar), of resulting in transmission. the sar is assumed to be % for home contacts and % for work, school and casual contacts. age-specific contact rates in each of these four settings are based on the contact rates estimated by prem et al. ( ) for the new zealand population. the average number of age-specific home contacts was taken directly from the results of prem et al. ( ) and assumes that household contacts do not vary from day to day. the average number of work, school and casual contacts made during the infectious period was chosen to be a fixed multiple of the number of daily work, school and casual contacts in prem et al. ( ), chosen to give a basic reproduction number (in the absence of any control measures, case isolation, or contact tracing) of = . . we model heterogeneity in number of contacts via gamma distributed individual multipliers for each of the four settings (lloyd-smith et al., ). heterogeneity is assumed to be higher in work, school and casual contacts than in home contacts, reflecting a greater occurrence of superspreading events in these settings. we model homophily by assuming that the number of contacts of a secondary case is correlated with the number of contacts of the index case in the setting in which transmission occurred.
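the gamma distributions above are specified by a mean and a standard deviation, whereas most samplers expect a shape and a scale. the sketch below shows one way to make that conversion and to draw incubation times and individual contact multipliers. the function names are hypothetical, the numerical values in the demonstration are illustrative placeholders rather than the values used in the paper, and the choice of mean 1 for the multipliers is an assumption made here so that the average contact rate is unchanged.

```python
import random


def gamma_shape_scale(mean, sd):
    """Convert a mean and standard deviation into gamma shape and scale."""
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must be positive")
    shape = (mean / sd) ** 2   # shape k = mean^2 / variance
    scale = sd ** 2 / mean     # scale theta = variance / mean
    return shape, scale


def sample_gamma(mean, sd, rng):
    """Draw one gamma variate with the requested mean and sd."""
    shape, scale = gamma_shape_scale(mean, sd)
    return rng.gammavariate(shape, scale)


def individual_multiplier(sd, rng):
    """Contact-rate multiplier with mean 1 (an assumption) and the given sd."""
    return sample_gamma(1.0, sd, rng)


if __name__ == "__main__":
    rng = random.Random(2020)
    # Placeholder parameters for illustration only, not the paper's values.
    print("incubation time (days):", sample_gamma(mean=5.0, sd=2.0, rng=rng))
    print("contact multiplier:", individual_multiplier(sd=1.0, rng=rng))
```

a larger standard deviation for the multiplier gives a heavier right tail, which is how greater superspreading in non-home settings could be represented in a sketch of this kind.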
we assume that each transmission via a non-home contact results in infection of a new household, which is assigned a household size according to the age-specific distribution of home contacts. each household is assumed to consist of a fixed group of individuals, so that subsequent infections within the same household deplete the pool of susceptible home contacts (see appendix for details). in the absence of any contact tracing, clinical cases are assumed to be eventually tested with probability = (i.e. all clinical cases are eventually detected). the delay from onset of symptoms to testing is assumed to be gamma distributed with mean . days and standard deviation . days (estimated from new zealand data for the march-april outbreak). cases are isolated at the same time as getting tested and this prevents any further transmission. there is a further delay between getting tested and the test result being returned that is a minimum of . days, plus an exponentially distributed random variable with a mean of . days. subclinical cases do not get tested and do not isolate. the contact tracing model is illustrated in figure . when a new positive test result is returned and contact tracing for that case begins, we refer to the individual testing positive as the index case and to contacts of the index case as secondary cases. contacts who do not end up getting infected are not modelled explicitly (but see discussion about effects of quarantining false positives). tracing of the index case's contacts begins when the index case returns a positive test result. under manual tracing, each contact has a probability of being traced, and a time taken to trace, both of which may differ across the four settings. traced contacts who are not currently symptomatic (i.e. either subclinical or presymptomatic) go into quarantine, which is assumed to reduce onward transmission to a level < relative to no quarantine.
traced contacts who are symptomatic go into isolation immediately on symptom onset, which is assumed to completely prevent any further onward transmission. untraced contacts do not go into quarantine, and experience a delay between symptom onset and isolation and testing (see above). effective contact tracing therefore reduces transmission in two ways: (i) quarantining of contacts who are not currently symptomatic; and (ii) prompt isolation of contacts immediately on symptom onset. home contacts are always assumed to be traced instantly (i.e. immediately after the index case returns a positive test result) with probability , and this happens independently of any digital contact tracing system. under manual contact tracing, work contacts are traced with probability . , school contacts with probability . , and casual contacts with probability . . work and casual contacts are assumed to have a tracing time that is gamma distributed with mean days and standard deviation . days. school contacts are assumed to be traced more rapidly but not instantly ( . days after the index case returns a positive test result). . cc-by-nc-nd . international license it is made available under a perpetuity. is the author/funder, who has granted medrxiv a license to display the preprint in (which was not certified by peer review) preprint the copyright holder for this this version posted september , . . https://doi.org/ . / . . . doi: medrxiv preprint figure . schematic diagram of the contact tracing model. infectious individuals are initially asymptomatic (yellow). for the index case who was not traced ( ), there is a delay between onset of symptoms (red) and getting tested. isolation occurs at the same time as getting tested. there is a subsequent delay to the test result being returned (+) and tracing of contacts. traced contacts ( - ) are quarantined when contacted by public health officials (phone icons) and are isolated and tested immediately on symptom onset. 
traced contacts ( ) who are already symptomatic prior to being traced are isolated immediately when contacted. traced contacts ( ) that have already isolated prior to being traced are not affected. contacts that cannot be traced ( ) may still get tested and isolated, but this is likely to take longer. subclinical individuals ( ) do not get tested or isolated, but will be quarantined if they are a traced contact. we model alternative digital contact tracing systems by varying key parameters of the contact tracing model. home contacts are assumed to be always traced rapidly by the manual system, so digital contact tracing applies to school, work and casual contacts. for each scenario, we assume there is an uptake rate and that each individual in the population is a user of the digital tracing system (e.g. has installed the app) with probability , independently of all other individuals. this ignores any correlation in the usage probabilities of close contacts. provided the index case and secondary case are both users of the system, the contact is digitally logged with probability , which we assume is the same for school, work and casual contacts (see table ). we set a default value of = %, but also investigate values of smaller than this. a tracing probability of % is likely to be unachievable, for example because there will be situations where one or both individuals failed to carry the smartphone or card with them. contacts that are digitally logged are assumed to be quarantined immediately after the index case returns a positive test result, the same as for home contacts. if neither or one of the index case and the secondary case is a user of the system, or both are users but the contact was not logged by the digital system, the contact is not traced digitally, but may still be later traced manually. 
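the tracing rules for school, work and casual contacts described above can be sketched as follows. a contact is traced digitally only if both parties are users and the contact is logged, and otherwise may still be traced manually. the function and parameter names are hypothetical, and treating manual tracing as independent of the digital outcome is an assumption of this sketch rather than something stated in the text.

```python
import random


def prob_digital(uptake, p_logged):
    """Chance a contact is digitally traced: both parties must be users."""
    return uptake * uptake * p_logged


def prob_traced(uptake, p_logged, p_manual):
    """Chance a contact is traced by either route (independence assumed)."""
    missed_digitally = 1.0 - prob_digital(uptake, p_logged)
    missed_manually = 1.0 - p_manual
    return 1.0 - missed_digitally * missed_manually


def trace_route(index_is_user, contact_is_user, p_logged, p_manual, rng):
    """Return 'digital', 'manual' or None for one non-home contact."""
    # Digital tracing needs both users and a successfully logged contact.
    if index_is_user and contact_is_user and rng.random() < p_logged:
        return "digital"
    # Contacts missed by the digital system may still be traced manually.
    if rng.random() < p_manual:
        return "manual"
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    # Placeholder parameters for illustration only.
    print(prob_traced(uptake=0.5, p_logged=0.8, p_manual=0.5))
    print(trace_route(True, True, p_logged=0.8, p_manual=0.5, rng=rng))
```

in a fuller simulation the route would also determine the tracing delay, since digitally logged contacts are quarantined immediately while manual tracing takes additional time.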
in addition to the benefits of instant tracing described above, some digital tracing systems may also help improve coverage of manual tracing. for example, location-based tracking or a digital diary feature may enable contact tracers to identify and follow up contacts who would otherwise be missed. we do not explicitly investigate these scenarios, though they could be modelled via an increase in the manual tracing probabilities of relevant settings in table . rather than making estimates of tracing probabilities for alternative digital systems, we investigate how the performance of the contact tracing system varies with the uptake rate and the probability of digital tracing (see table ). we assume that isolation of symptomatic traced contacts is % effective in preventing onward transmission, but quarantine of pre-symptomatic and subclinical traced contacts is only partially effective. we investigate two scenarios: ( ) quarantine reduces transmission by % ( = . ); ( ) quarantine reduces transmission by % ( = . ). we assume the effectiveness of quarantine/isolation is the same for contacts that are traced digitally and contacts that are traced manually. this is likely to require effective manual follow up of digitally traced contacts as opposed to relying solely on automatic exposure notifications. we also investigate the additional benefit from including recursive tracing of second-order contacts (i.e. quarantining the contacts of contacts of a confirmed case) in the model. recursive tracing and effective quarantine of second-order contacts is more difficult to achieve in practice because of the much larger number of second-order contacts and the lower risk of them being infected.
in the case of an ongoing outbreak with a large number of cases, the number of uninfected individuals being quarantined under recursive tracing is likely to be prohibitively large (firth et al., ). however, recursive tracing could potentially be useful in suppressing a small outbreak in its very early stages. we assume that tracing of second-order contacts begins days after tracing of the first-order contact and can occur digitally or manually following the same rules described above (figure ). this means that second-order contacts who are traced digitally are quarantined days after quarantining the first-order contact; second-order contacts traced manually are quarantined later. any second-order contacts made subsequent to quarantine of the first-order contact cannot be traced recursively (but may still be traced on or after the positive test result of the first-order contact). third-order contacts were not traced recursively.
manual contact tracing has different tracing probabilities and times for home, school, work and casual contacts. home contacts are always traced and this is assumed to happen with no delay. school contacts are traced with probability % and this takes half a day. work and casual contacts have lower tracing probabilities and a delay of days on average. we investigate manual contact tracing supported by a digital contact tracing system that has probability (default value %) of instantly tracing school, work and casual contacts, provided both the index case and the contact are users of the digital system. contacts that are not traced by the digital system may still later be traced manually.
we measured the reduction in spread of covid- by looking at three model outputs: (a) the effective reproduction number (average number of secondary infections per case); (b) the mean outbreak size (total number of cases per seed case) after days; (c) the probability of extinction of an outbreak starting from a single seed case. together these outputs measure the relative effectiveness of the contact tracing system in containing the virus. these results are robust to the initial number of seed cases: if there are multiple initial seed cases, the reproduction number is not affected; the mean outbreak size is simply multiplied by the initial number of cases, and the probability of elimination is raised to the power of the initial number of cases. for each combination of contact tracing parameters, we ran multiple simulations each of which was initialised with a single infected seed case. results are shown for uptake rates ranging from to % and were calculated by averaging over a sufficient number of simulations to provide an aggregate total of at least , cases. with case isolation in the absence of any contact tracing, the effective reproduction number was = . , the mean outbreak size after days was , and the probability of extinction was approximately %. manual-only contact tracing (which corresponds to a digital uptake rate of = in fig. ) with moderately ( %) effective quarantine of pre-symptomatic or subclinical individuals reduced to . , the mean outbreak size to approximately and increased the probability of extinction to %. when quarantine is moderately effective (reduces transmission by %, blue curves in fig.
), the addition of digital tracing with high uptake rate (> %) and high probability of logging contacts ( = %) reduced to around . , mean outbreak size to , and increased the probability of extinction to %. if quarantine is more effective (reduces transmission by %, red curves in fig. ), digital tracing can reduce to approximately . and increase probability of elimination to %. adding recursive tracing of second-order contacts (orange curves in fig. ) provides a relatively small reduction in to . , although this does increase the probability of elimination from % to %. lower uptake rates (< %) result in poorer performance although there is still some noticeable benefit of digital tracing at an uptake rate of around %, provided the probability of a close contact being logged is high ( = % in fig. ). if is much lower than this, performance will deteriorate. however, the results are not as sensitive to as they are to uptake rate , because the requirement for both the index and the secondary case to be users of the system means there is a quadratic dependence on uptake rate. this means that a % reduction in is approximately equivalent to a % reduction in uptake rate. we also considered a scenario in which there is no manual contact tracing, except for home contacts which are still assumed to be traced instantly (figure ). this could represent a situation where a larger outbreak has exceeded the capacity of the manual contact tracing system, so non-household contacts can only be traced digitally. in this scenario, we assume that quarantine of pre-symptomatic individuals is only moderately ( %) effective and there is no recursive tracing, representing a digital-only contact tracing system without consistent follow-up from trained public health professionals. we measure the effectiveness of contact tracing by , as the other two measures (mean outbreak size per seed case and probability of elimination) are less relevant in the case of a large outbreak.
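the quadratic dependence on uptake can be made concrete with a short calculation. if the probability that a contact is digitally traced is the square of the uptake rate times the logging probability, then cutting the logging probability by a fraction r has the same effect as cutting uptake by a fraction 1 - sqrt(1 - r). the sketch below works this out; the function names are hypothetical and the inputs are illustrative, not the values reported in the paper.

```python
import math


def digital_trace_prob(uptake, p_logged):
    """Probability a contact is digitally traced (both parties are users)."""
    return uptake ** 2 * p_logged


def equivalent_uptake_reduction(p_reduction):
    """Fractional cut in uptake matching a fractional cut in p_logged."""
    if not 0.0 <= p_reduction <= 1.0:
        raise ValueError("p_reduction must lie in [0, 1]")
    # Solve (1 - x)^2 = 1 - r for x, the fractional reduction in uptake.
    return 1.0 - math.sqrt(1.0 - p_reduction)


if __name__ == "__main__":
    for r in (0.1, 0.19, 0.36):
        x = equivalent_uptake_reduction(r)
        print(f"cutting the logging probability by {r:.0%} "
              f"matches cutting uptake by {x:.1%}")
```

for example, a 19% reduction in the logging probability corresponds to a 10% reduction in uptake, since the square root of 0.81 is 0.9.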
in this scenario, digital contact tracing makes a larger relative contribution to controlling the spread of covid- . however, it also means that digital tracing alone is unlikely to be able to contain an outbreak: even with a very high uptake rate ( %) of an effective digital tracing system ( = %), is around . . this implies that digital contact tracing would need to be combined with significant population-wide control measures in order to avoid a major epidemic.
; (b) mean number of cases per seed case after days; (c) probability of elimination of an outbreak starting from a single seed case. proportion of contacts logged by digital tracing system when both individuals are users of the system is = %. quarantine reduces onward transmission by % (blue), by % (red), by % and with recursive tracing (orange). isolation of symptomatic cases completely prevents onwards transmission. results are averaged over sufficient simulations of the branching process to give a total of , cases.
home contacts are still traced manually, but other contacts can only be traced digitally. proportion of contacts logged by digital tracing system when both individuals are users of the system is = %. quarantine reduces onward transmission by %. isolation of symptomatic cases completely prevents onwards transmission. results are averaged over sufficient simulations of the branching process to give a total of , cases.
table shows a comparison of the effectiveness of alternative technological approaches to digital contact tracing, modelled as having different probabilities of recording contacts. table shows the effective reproduction number for manual contact tracing plus digital tracing systems at different levels of uptake, with highly effective ( %) quarantine and without recursive tracing of second-order contacts. systems based on qr codes alone without proximity detection are likely to perform less well because of the additional steps required for a contact to be recorded: the location needs to have a qr code displayed and both the index case and secondary case need to scan it. this does not diminish the benefit to manual tracing of users scanning qr codes to maintain a record of their movements, but we do not explicitly model this here. fully decentralised bluetooth apps are estimated to be less effective than centralised apps at the same level of uptake, because of a reduced likelihood of users reacting to automatic exposure notifications from a decentralised system without follow-up from manual contact tracers. a card-based proximity system is estimated to perform similarly to a centralised bluetooth app, though with a slightly reduced effectiveness because notifications cannot be sent natively and need to be made via a separate system which requires current contact details.
effective reproduction number of manual contact tracing plus different types of digital tracing system at % uptake, % uptake and % uptake, with highly effective ( %) quarantine and without recursive tracing. a system based on qr codes with no proximity-detection is modelled as logging a low proportion ( %) of contacts, because the location needs to have a qr code displayed, as well as both contacts taking the additional step of scanning the code.
bluetooth apps are modelled as logging a high proportion ( %) of contacts, but in decentralised systems only % of users are assumed to quarantine following a notification. a card-based proximity detection system is assumed to have a detection probability similar to that of a bluetooth app, but only % of contacts lead to an instant notification because the card is separate from users' phones.
successful control of covid- is likely to require a range of intervention strategies, including some or all of: moderate population-wide social distancing; widespread use of face coverings; restrictions on large gatherings or other interventions targeting superspreading events; case isolation and household quarantine; manual and digital contact tracing (ferretti et al., ; hellewell et al., ). establishing trusted relationships with cases and contacts is crucial both to increasing contact tracing coverage and to supporting effective quarantine and isolation. there is still limited evidence on the effectiveness of digital tracing systems (anglemyer et al., ) and reliance on digital tracing alone is unlikely to be sufficient. however, there is potential for digital tracing to enhance the coverage and/or speed of contact tracing systems run by trained public health professionals. digital systems should therefore be seen as an opportunity to improve coverage and/or speed by acting in a complementary role to manual contact tracing. emphasis should be on how these systems can provide additional useful information to contact tracers in a timely way or fill potential gaps in the manual system. this implies that thorough consultation with public health agencies undertaking contact tracing should be a pre-requisite for an effective digital support system.
in this paper, we modelled the effect of manual contact tracing supported by a digital tracing system with varying levels of uptake and effectiveness. our results show that manual contact tracing can significantly reduce the spread of covid- , but on its own is not sufficient to make the effective reproduction number less than . manual contact tracing supported by a digital tracing system can further reduce spread, depending on the effectiveness of quarantine of unconfirmed (pre-symptomatic and subclinical) cases. if quarantine is moderately effective (reduces transmission by %), a manual plus digital tracing system with very high uptake (> %) could reduce to around . . if quarantine is highly effective (reduces transmission by %), can be reduced to approximately . , which could be sufficient to contain the majority of small outbreaks. however, if the uptake rate is less than around %, the reduction in is not sufficient to contain an outbreak without additional measures. our results suggest that the additional benefit from recursive tracing of second-order contacts is relatively small. in the case of a small outbreak, this may be worthwhile as it could increase the probability of elimination from % to % provided uptake is high and quarantine is effective. however, given that it is likely more difficult to effectively quarantine second-order contacts, and the number of uninfected people being quarantined would be much higher, it may be more beneficial to focus on effective quarantine of first-order contacts than attempting to locate second-order contacts. in our model, untraced clinical cases take . days on average to get tested after onset of symptoms. this assumption was based on the onset to reporting times in data from the march-april outbreak in new zealand. the aim of contact tracing is to find close contacts of confirmed cases and therefore enable early quarantine or isolation. 
however, it is important to recognise that reducing the time between symptom onset and isolation can bring significant benefits, even in the absence of contact tracing. this implies that enhancing public awareness of covid- symptoms and the need to get tested quickly, ensuring equitable access to healthcare, and maintaining rapid testing capacity are as important as investment in contact tracing. we have not considered the consequences of false positives from digital contact tracing systems, i.e. people who are recorded as potential contacts by the digital system but have not in fact been at risk of exposure to covid- . for countries such as new zealand that have reduced cases to very low numbers and where the primary goal is to achieve or maintain elimination of community transmission, we assume that the number of false positives is not a primary consideration. if case numbers exceed the capacity of the manual contact tracing system, its performance will rapidly deteriorate. under this scenario, digital tracing can make a significant contribution to slowing the growth of the outbreak, but if it became the dominant form of tracing it is likely to be insufficient to reduce under . this implies that population-wide control measures would likely be needed to prevent a major epidemic. a low false positive rate is a more important consideration in this situation. our model allowed for age-specific mixing patterns in home, work, school, and casual contacts, and for heterogeneity and homophily in individual contact rates. however, the model ignores other sources of heterogeneity, for example in types of workplace.
some workplaces will be much more amenable to rapid contact tracing than others, for example where employees tend to work in a consistent physical location each day. this could be modelled via individual heterogeneity and homophily in contact tracing probabilities as well as contact rates; however, the mean tracing probability is likely to remain the most important parameter. we have taken a technology-agnostic approach to modelling digital contact tracing. among the most important parameters for any digital tracing system are the uptake rate (number of people using the system) and the probability of close contacts being logged. our results show that to contain an outbreak, a well-functioning manual contact tracing system needs to be combined with a digital tracing system that should ideally have at least % uptake and record % of close contacts. different systems may have different uptake rates, for example due to usability and privacy issues, so a careful study of expected uptake rates is critical to choosing the best system. the proportion of contacts logged will be affected by the usability of the system and by individual behaviour. for example, contacts will be missed in situations where a user forgets or loses their phone or card, has the phone switched off or a flat battery, or has bluetooth deactivated. systems that rely solely on qr code scanning for exposure notification are likely to perform less well than systems with proximity-based detection at the same level of uptake. this is because successful tracing requires the location to have a compatible qr code displayed, and both individuals to have the app installed and to take the additional step of scanning the code. there may be benefits to a qr code app in keeping a record of movements or digital diary, but we did not explicitly consider these here.
prem k, cook ar, jit m ( ). projecting social contact matrices in countries using contact surveys and demographic data. plos computational biology.
( ): e .
tindale l, coombe m, stockdale je, garlock e, lau wyv, saraswat m, lee y-hb, zhang l, chen d, walinga j, colijn c ( ). transmission interval estimates suggest pre-symptomatic spread of covid- . medrxiv preprint. doi: https://doi.org/ . . . . .
verrall, a ( ). rapid audit of contact tracing for covid- in new zealand. ministry of health.
who & cdc ( ). implementation and management of contact tracing for ebola virus disease: emergency guideline. retrieved from: http://www.who.int/csr/resources/publications/ebola/contacttracing/en/
when case infects a new case, the new case occurs in age group and setting with probability . each new secondary case needs to be assigned individual multipliers ( ) for each of the setting types. the multiplier for the setting ( = * ) in which case was infected is correlated with the multiplier for the index case . this models assortative mixing, meaning people with high contact rates in setting tend to be linked to other people with high contact rates and vice versa. multipliers for different settings ( ≠ * ) are assumed to be uncorrelated with the multipliers for the index case. for simplicity, we assume that for work, school and casual transmission, the secondary case's multiplier for that setting is equal to the index case's multiplier for that setting with probability , and is independent from the index case's multiplier with probability − . when a secondary case occurs via household transmission it shares the same household id as the index case . the expected number of home contacts decreases by one with each new case in that household.
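the rule for assigning multipliers to a secondary case, and the depletion of susceptible household contacts, can be sketched as below. the names rho, draw_independent and deplete_household are hypothetical labels introduced here because the original symbols are not legible in this text, and the unit-mean gamma draw in the demonstration is a placeholder rather than the paper's parameterisation.

```python
import random


def secondary_multiplier(index_multiplier, same_setting, rho,
                         draw_independent, rng):
    """Multiplier of a secondary case for one non-home setting.

    In the setting where transmission occurred, the secondary case copies
    the index case's multiplier with probability rho; otherwise, and in all
    other settings, it receives an independent draw.
    """
    if same_setting and rng.random() < rho:
        return index_multiplier
    return draw_independent(rng)


def deplete_household(expected_home_contacts):
    """Reduce expected susceptible home contacts by one, never below zero."""
    return max(0.0, expected_home_contacts - 1.0)


if __name__ == "__main__":
    rng = random.Random(7)

    def draw(r):
        # Placeholder distribution: gamma with mean 1 and sd 1.
        return r.gammavariate(1.0, 1.0)

    index_mult = draw(rng)
    print(secondary_multiplier(index_mult, True, 0.5, draw, rng))
    print(deplete_household(2.5))
```

copying the multiplier with a fixed probability is a simple way to induce positive correlation between linked cases while leaving the marginal distribution of multipliers unchanged.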
this means that, eventually, transmission chains within a given household will always go extinct as the pool of susceptible household members is depleted. for simplicity, network effects and infection-induced immunity for work, school and casual contacts are ignored, i.e. all non-household contacts of the secondary case are assumed to be mutually exclusive of the non-household contacts of the primary case. the parameter ( ) , the standard deviation of the distribution of , represents the degree of heterogeneity in individual contact rates in setting s. higher values of ( ) correspond to a greater degree of superspreading in the transmission dynamics (lloyd-smith et al., ). we set ( ) = for home contacts and ( ) = √ for work, school and casual contacts. this represents greater heterogeneity in contact rates outside the home and more potential for superspreading events in non-household settings, and corresponds to a negative binomial over-dispersion parameter of = . for infections outside the home. supplementary figure s shows an example simulation of the branching process and contact tracing model.
supplementary figure s . contact matrices in -year age bands derived from results of prem et al. ( ). each matrix shows the average number of home, work, school or casual contacts that an individual in age group has with an individual in age group over the course of their infectious period. the number of home contacts is taken directly from the results of prem et al. ( ) for new zealand.
the number of work, school and casual matrices are scaled up by a factor of from the number of daily work, school, and other contacts in prem et al. ( ) to allow for an infectious period longer than day. the factor of was chosen to make = . in the absence of any control measures. supplementary figure s . example branching process simulation starting from a single infected seed case. cases are represented as nodes with the vertical coordinate corresponding to time of symptom onset. transmission routes are blue lines. node labels indicate age group ( = - years, = - years, etc.), red indicates traced contacts, black indicates untraced, grey/pink indicates subclinical infections.
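the link between the standard deviation of the individual contact-rate multiplier and the negative binomial over-dispersion parameter can be illustrated with a short simulation. the sketch below assumes, in the spirit of the superspreading framework of lloyd-smith et al., that the multiplier is gamma distributed with mean 1 and that each case produces a poisson number of secondary cases; under that assumption the dispersion parameter is k = 1/σ². the function names, the mean of 2.0 secondary cases, σ = 1 and the sample size are illustrative choices for the example and are not values taken from the model.

```python
import math
import random


def dispersion_from_sd(sigma):
    """Dispersion k of a mean-1 gamma multiplier with standard deviation sigma."""
    # A gamma distribution with shape k and scale 1/k has mean 1 and variance 1/k.
    return 1.0 / sigma ** 2


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_offspring(mean_cases, sigma, n, seed=0):
    """Secondary-case counts for n cases with gamma-distributed multipliers (assumed)."""
    rng = random.Random(seed)
    k = dispersion_from_sd(sigma)
    counts = []
    for _ in range(n):
        multiplier = rng.gammavariate(k, 1.0 / k)  # shape k, scale 1/k: mean 1, sd sigma
        counts.append(poisson(rng, mean_cases * multiplier))
    return counts


if __name__ == "__main__":
    mean_cases, sigma = 2.0, 1.0  # illustrative values only
    counts = simulate_offspring(mean_cases, sigma, n=50_000)
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    k = dispersion_from_sd(sigma)
    print(f"k = {k:.2f}, sample mean = {mean:.2f}, sample variance = {var:.2f}")
    # For a negative binomial, variance = mean + mean^2 / k.
    print(f"theoretical variance = {mean_cases + mean_cases ** 2 / k:.2f}")
```

a larger σ gives a smaller k, so the same mean number of secondary cases is concentrated in fewer, larger transmission events.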
digital contact tracing technologies in epidemics: a rapid review
epidemiology and transmission of covid-19 in cases and of their close contacts in shenzhen, china: a retrospective cohort study
probability of elimination for covid-19 in aotearoa new zealand
new zealand eliminates covid-19
cmmid covid-19 working group
age-dependent effects in the transmission and control of covid-19 epidemics
quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
using a real-world network to model localized covid-19 control strategies
estimating the generation interval for coronavirus disease (covid-19) based on symptom onset data
centre for the mathematical modelling of infectious diseases covid-19 working group
successful contact tracing systems for covid-19 rely on effective quarantine and isolation
quantifying the impact of physical distance measures on the transmission of covid-19 in the uk
contact tracing for imported case of middle east respiratory syndrome
effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov-2 in different settings
the incubation period of coronavirus disease (covid-19) from publicly reported confirmed cases: estimation and application
superspreading and the effect of individual variation on disease emergence

the authors acknowledge the support of statsnz, esr, and the ministry of health in supplying data in support of this work. the authors are grateful to andrew chen, matt parry and philippa yasbek for discussions about digital contact tracing systems and feedback on an earlier version of this paper. this work was funded by the ministry of business, innovation and employment and te pūnaha matatini, new zealand's centre of research excellence in complex systems.
individual cases ( for index case, for secondary case)
age groups
( ) age group of case
setting (home/work/school/casual)
contact structure
( ) contact matrix representing the average number of contacts that an individual in age group has in age group and setting (home/work/school/casual). these were obtained from the results of prem et al. ( ) using the number of daily contacts for new zealand combined into year age bands. the total numbers of work, school and casual contacts were scaled up by a common factor from the number of daily work, school and casual contacts to model an infectious period longer than day. the total number of household contacts was set equal to the number of daily household contacts, to reflect the fact that household contacts do not usually change from one day to the next. the common scaling factor for work, school and casual contacts was set to be . to give a basic reproduction number of = . in the absence of any control measures. this corresponds to a reproduction number of approximately . with case isolation and no contact tracing. contact matrices are shown in supplementary figure s .
has a greater than average number of contacts in setting and is more likely to be a superspreader. this parameter may be correlated between index case and secondary cases transmitted in setting (see below).
( ) attack rate for contacts in setting , assumed to be % for household contacts and % for non-household contacts, independent of age.
= . if case is subclinical and = otherwise. a case in age group is subclinical with probability ( ).
the total expected number of new cases infected by case between time and time + is:
where is the symptom onset time for case , and ( ) is equal to if case is in quarantine at time , if case is in isolation at time , or otherwise. the function is the probability density function of a weibull distribution (see supplementary table s ).
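the expected number of new cases described above combines a case-specific transmission term, a control factor for quarantine or isolation, and a weibull generation-time density. a minimal sketch of that structure follows, assuming the expected number of infections in an interval is the product of the case's expected total secondary cases, the control factor, and the weibull probability mass over the interval measured from the time of infection. the function names and every numerical value (weibull shape and scale, control factor, total secondary cases) are hypothetical placeholders and not the parameters of the model.

```python
import math


def weibull_cdf(t, shape, scale):
    """Cumulative distribution function of a Weibull distribution (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def expected_new_cases(t, dt, infection_time, total_cases, control_factor, shape, scale):
    """Expected infections caused by one case between t and t + dt.

    total_cases:    expected secondary cases with no control (hypothetical input)
    control_factor: multiplier in [0, 1]; 1 means no quarantine or isolation
    The generation-time clock is assumed to start at infection_time.
    """
    start = t - infection_time
    mass = weibull_cdf(start + dt, shape, scale) - weibull_cdf(start, shape, scale)
    return total_cases * control_factor * mass


if __name__ == "__main__":
    # Hypothetical case: infected at day 0, isolated (30% residual contact) from day 6.
    shape, scale, total_cases = 2.0, 6.0, 2.5
    uncontrolled = 0.0
    controlled = 0.0
    for day in range(60):
        factor = 0.3 if day >= 6 else 1.0
        uncontrolled += expected_new_cases(day, 1.0, 0.0, total_cases, 1.0, shape, scale)
        controlled += expected_new_cases(day, 1.0, 0.0, total_cases, factor, shape, scale)
    print(f"expected cases without control: {uncontrolled:.2f}")
    print(f"expected cases with isolation from day 6: {controlled:.2f}")
```

summing the daily contributions without control recovers the assumed total number of secondary cases, which is a quick consistency check on the discretisation.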
key: cord- -r dhmpq authors: khan, muhammad bilal; zhang, zhiya; li, lin; zhao, wei; hababi, mohammed ali mohammed al; yang, xiaodong; abbasi, qammer h. title: a systematic review of non-contact sensing for developing a platform to contain covid-19 date: - - journal: micromachines (basel) doi: . /mi sha: doc_id: cord_uid: r dhmpq the rapid spread of the novel coronavirus disease, covid-19, has prompted a major scientific effort to contain the virus. the tragedy has not yet fully run its course, but it is already clear that the crisis is thoroughly global and that science is at the forefront of the fight against the virus. this includes medical professionals trying to cure the sick at risk to their own health; public health management tracking the virus and guardedly calling on measures such as social distancing to curb its spread; and researchers engaged in the development of diagnostics, monitoring methods, treatments and vaccines. this study is motivated by recent advances in non-contact sensing for health care and aims to contribute to the containment of the covid-19 outbreak. its objective is to articulate an innovative solution for the early diagnosis of covid-19 symptoms such as abnormal breathing rate, coughing and other vital health problems. to obtain an effective and feasible solution from existing platforms, this study identifies the existing methods used for human activity and health monitoring in a non-contact manner. this systematic review presents the data collection technology, data preprocessing, data preparation, feature extraction, classification algorithms and performance achieved by the various non-contact sensing platforms. this study proposes a non-contact sensing platform for the early diagnosis of covid-19 symptoms and the monitoring of human activities and health during the isolation or quarantine period.
finally, we highlight challenges in developing non-contact sensing platforms to effectively control the covid-19 situation. the covid-19 pandemic is spreading exponentially all over the world. as a result, many people have been affected or have died, and much of the world is quarantined. as the outbreak continues to evolve, every country's government is considering options to prevent the spread of the virus to new places by stopping human movement in places where the virus that causes covid-19 is already circulating [ ] . self-quarantine is the only option to make one's own and others' lives safe. the quarantine of a person is the limit of activities or the separation of persons who are not actually ill but who may be exposed to disease or an existing platforms. we categorize the studies into monitoring of human activities and monitoring of health conditions. furthermore, the review presents advantages and limitations. finally, we summarize the challenges and open research problems that require further investigation and improvement. the contributions of this comprehensive study are as follows. this study provides a road map for developing a covid-19 pandemic platform for containing the virus. ( ) systematically review the non-contact sensing platforms used for human activity and health monitoring. ( ) propose a non-contact sensing platform for the early diagnosis of covid-19 symptoms and the monitoring of human activities and health during the isolation or quarantine period. ( ) highlight the challenges, testing environment, performance and optimal solutions to work on deployment. the rest of the paper is organized as follows: section includes a literature review of the covid-19 pandemic, the existing non-contact wireless sensing platforms and technology exploited, the monitoring of human activities and health, and the classification approach and accuracy achieved.
in section , the proposed platform for the early diagnosis of covid-19 symptoms and the monitoring of human activities and health during the isolation or quarantine period is described. in section , the experimental setup based on both commercial and specialized hardware is presented. in section , the advantages of developing a non-contact wcsi sensing platform for containing covid-19 are explained. in section , the challenges faced in developing a non-contact sensing platform are discussed. in section , future recommendations and possible solutions are discussed. in section , conclusions on the development of a non-contact sensing platform for containing covid-19 are drawn. the abbreviations used in this study are defined in table . in this section, we present a summary of the origin, spreading mechanisms, symptoms and prevention methods of covid-19, followed by a systematic review of non-contact sensing platforms for human activity and health monitoring. this review identifies reliable and intelligent existing work on which to base a new platform for the early diagnosis of covid-19 symptoms and the monitoring of human activities and health to protect human life. covid-19 is a respiratory disease caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2). the first case was reported in december , in the city of wuhan, in hubei province, china. since then, covid-19 has spread like a tsunami around the world and is now present in countries and independent territories. according to the who, viral infections caused by various coronaviruses continue to emerge and pose a serious public health problem [ ] [ ] [ ] . distinctive features of the virus include its highly contagious nature and relatively long ( - days) incubation period. during this time, a person can be infected with the virus and show no symptoms.
therefore, infected people can act as silent carriers of the virus without realizing it and contribute to the high basic reproduction number of the covid-19 virus. to date, there is no specific vaccine or treatment for the infection, and management protocols focus on containing disease development. most covid-19 cases have exhibited clinical features such as fever, cough and fatigue. some patients had symptoms such as headache, sore throat and shortness of breath, while symptoms such as runny nose, diarrhea, aches and pains were very rare, as shown in table . while most covid-19 patients develop mild to moderate disease, a few patients have been diagnosed with a severe ( . %) or critical ( . %) health condition [ ] . according to the united states centers for disease control and prevention (cdc), the people at greatest risk from covid-19 are older adults (those over years of age) and those with pre-existing conditions such as high blood pressure, asthma, diabetes or cardiovascular disease, as well as those taking immunosuppressing therapy [ ] . covid-19 is characterized by a specific dysfunction in respiratory physiological processes involving the lower respiratory tract and diaphragm, thereby affecting respiratory patterns during inhalation and exhalation [ ] . in speech initiation, at the time of exhalation, air from the lungs travels through the other basic vocal subsystems, namely the trachea, the larynx and the vocal tract, into the oral, pharyngeal and nasal cavities. the way we breathe during speech, including the speed and length of exhalation (depending on the number of words in a sentence or phrase) and its intensity and variability, greatly affects the quality of our voice. in addition, the respiratory system is highly coordinated with these laryngeal-based subsystems [ , ] . similarly, laryngeal activity is well linked to articulation in the oral and nasal cavities [ ] .
although the effects on speech subsystems and their coordination are perceptibly apparent with an inflammatory condition, these changes may be mild in the asymptomatic stages of a disease, at baseline or during recovery. speech subsystems and their coordination are assumed to be affected by covid-19. in addition to respiratory involvement, the current pandemic shows evidence that neurological involvement may occur with covid-19. headache and dizziness remain the most common symptoms; however, symptoms due to loss of muscle control and loss of proprioception caused by transient neuromuscular disorder have recently been reported [ , ] , as well as loss of smell and taste [ , ] . given the disruption of respiratory function and the evidence of neurological involvement in covid-19, biomarkers derived from measurements of vocal subsystem coordination are the most significant in the asymptomatic stage [ ] . although there are several studies of the pathophysiological properties of covid-19, its propagation mechanism remains somewhat unclear. while the initial covid-19 cases were associated with the direct exposure of individuals to infected animals, the rapid outbreak of the disease has shifted the focus of research to human-to-human transmission, whether direct or via contaminated surfaces. an analysis of around , cases of covid-19 in china revealed that the virus is primarily transmitted between people through respiratory droplets spread by coughing and sneezing [ ] . these respiratory droplets have the potential to travel a distance of up to . m ( feet). therefore, any person in close contact with an infected person is at risk of being exposed to the respiratory droplets and, by extension, the virus. although symptomatic people have been identified as the primary source of sars-cov-2 transmission, there is also a possibility of transmission via asymptomatic people.
direct and indirect contact with infected surfaces has been identified as another potential cause of covid-19 transmission. evidence suggests that the virus can survive on plastic and steel surfaces. researchers have reported that covid-19 is spread by contact. therefore, it is recommended to minimize human-to-human contact for the safety of human society. human activity monitoring plays an important role in human health. in the covid-19 pandemic, it is essential to monitor human activities in a non-contact manner to stop the spread of the virus. various non-contact human activity-sensing technologies and methods, and the performance achieved by existing platforms, were investigated for the development of the covid-19 platform. device-free detection is a valuable technology for detecting moving bodies in the operational region without requiring them to wear any device. the device-free passive detection of moving humans with dynamic speed (pads) scheme extracts csi with both amplitude and phase information and exploits space diversity across multiple antennas in multiple input multiple output (mimo) systems. the pads prototype uses commercial wi-fi devices to extract shape sensitive metrics for accurate and robust target detection [ ] . an active device-free system uses sdr to recognize whether a person is standing, walking, crawling or lying, or whether the environment is empty [ ] . since human bodies reflect wireless signals, activities can be recognized by monitoring the characteristics of received wi-fi signals; carm is a human activity recognition and monitoring system that extracts csi and was implemented on commercial wi-fi devices [ ] . a through the wall (ttw) presence detection system for both stationary and moving persons uses wi-fi signals with a single wi-fi access point (ap).
this system considers an empty environment, one stationary human, or a human moving in the room; the changes in the channel frequency response (cfr) over time carry significant information for monitoring activities [ ] . device-free solutions based on radio signals (wi-fi) available in the home, particularly the . standard, have been considered. fine-grained analysis based on the available csi has been proposed to detect human activities [ ] . human body motions were detected in a quasi-real-time environment using non-contact devices. csi patterns show unique changes caused by body motions, which identify particular human activities. sdr technology has been exploited to extract a dataset that contains radio wave signal patterns [ ] . human activity recognition (har) using ultra-wide band (uwb) technology has been used to investigate the feasibility of device-free activity recognition [ ] . hand gesture recognition is one of the issues in human-computer interaction. non-contact wi-fi-based gesture recognition systems (wi-gers) detect hand motions by capturing the changes in the csi of wi-fi signals. a public wi-fi router is used for the detection of hand motions [ ] . non-contact sensing has attracted a lot of attention due to the availability of wi-fi signals in homes, offices, shopping malls, airports, etc. a training-free human vitality sensing platform named wi-vit was proposed on commercial wi-fi infrastructure. this platform can capture real-time human motion speed information without the offline training or calibration that involves human effort. the feasibility study of the platform revealed that it can monitor long-term activities of daily living in practice for various applications [ ] . eating is an essential activity in human daily life. in this regard, a device-free system for the monitoring of eating uses devices with built-in wi-fi (e.g., laptop or smartphone).
this system automatically monitors human eating activities by extracting the fine-grained csi of eating motions from wi-fi signals and by detecting swallowing and chewing. it can differentiate eating from non-eating activities and further classifies eating motions with different utensils. eating monitoring is essential to understand eating behaviors, and it is useful for estimating a balanced diet [ ] . the wi-see system uses wi-fi signals for gesture recognition, since wireless signals can pass through walls and do not require line-of-sight (los) from source to destination. the system uses wireless resources to enable whole-home gesture recognition. wi-see was evaluated in a two-bedroom apartment and an office environment using sdr technology [ ] . wi-hear uses wi-fi signals to "hear" human speech without installing any devices. this system introduces a mouth motion profile (mmp) that leverages wavelet packet transformation and partial multipath effects to solve the micro-movement detection problem. it can "hear" human speech within the radio range, and it can simultaneously "hear" the speech of multiple people by exploiting mimo technology. it was implemented on both commercial wi-fi infrastructure and the sdr platform [ ] . an ambient radar sensor was proposed to recognize human activities in indoor environments. the radar uses a . ghz frequency to capture the fine dynamics of human activities while emitting pulse signals every second. this approach also includes a method to separate a collection of numerous activities into individual activities [ ] . another study introduces the concept of domain gap (dg) and a domain independent (di) feature, which is a promising solution to eliminate dg and achieve gesture recognition accuracy [ ] . the bumble-bee radar captures micro-doppler signatures for indoor human activity recognition. it can discriminate between human activities even under variable conditions [ ] .
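many of the systems reviewed above share a first step: they split complex csi samples into amplitude and phase and look for changes over time caused by body motion. the sketch below illustrates that step on synthetic data. the function names, window length, threshold and the synthetic channel are hypothetical, and the variance-threshold detector is a deliberately simple stand-in rather than the method of any cited system.

```python
import cmath
import math
import random


def amplitude_and_phase(csi):
    """Split a list of complex CSI samples into amplitude and unwrapped phase."""
    amplitude = [abs(h) for h in csi]
    phase = []
    offset = 0.0
    previous = None
    for h in csi:
        raw = cmath.phase(h)  # wrapped phase in [-pi, pi]
        if previous is not None:
            jump = raw - previous
            if jump > math.pi:
                offset -= 2 * math.pi
            elif jump < -math.pi:
                offset += 2 * math.pi
        previous = raw
        phase.append(raw + offset)
    return amplitude, phase


def moving_variance(values, window):
    """Variance of each sliding window of the given length."""
    result = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        mean = sum(chunk) / window
        result.append(sum((v - mean) ** 2 for v in chunk) / window)
    return result


def detect_motion(csi, window=20, threshold=0.01):
    """Flag windows whose amplitude variance exceeds a (hypothetical) threshold."""
    amplitude, _ = amplitude_and_phase(csi)
    return [v > threshold for v in moving_variance(amplitude, window)]


def synthetic_csi(n_static=100, n_moving=100, seed=0):
    """Static channel followed by a channel perturbed by a moving reflector."""
    rng = random.Random(seed)
    samples = []
    for i in range(n_static + n_moving):
        h = cmath.rect(1.0, 0.3)  # static propagation path
        if i >= n_static:
            # extra reflected path whose phase rotates as the body moves
            h += cmath.rect(0.4, 0.5 * (i - n_static))
        h += complex(rng.gauss(0, 0.01), rng.gauss(0, 0.01))  # receiver noise
        samples.append(h)
    return samples


if __name__ == "__main__":
    flags = detect_motion(synthetic_csi())
    first = flags.index(True) if True in flags else None
    print(f"{sum(flags)} of {len(flags)} windows flagged; first flagged window: {first}")
```

real csi is reported per subcarrier and per antenna pair, so a practical pipeline would apply the same idea across those dimensions and usually filter the signal before thresholding.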
occupant activity recognition (oar) is very important for building management systems (bms) to provide comfortable environments for occupants. the wi-oar system uses wi-fi signals to provide user-centric services and energy efficiency in smart offices. this system presents a fast and robust target component separation (frtcs) algorithm to achieve both high accuracy and time efficiency. a pair of commercial wi-fi devices was used to develop a prototype wi-oar system in diverse office environments [ ] . the wi-ar system was developed using a public dataset of wi-fi signals collected from ten volunteers performing sixteen different activities in indoor environments. the aim of the system is to reduce the cost of collecting signal data for researchers and to improve performance in different domains [ ] . wi-motion uses the amplitude and phase information extracted from the csi sequence to build classifiers. this system can recognize six different human activities [ ] . a device-free, non-wearable, privacy-preserving occupancy detection system uses wi-fi imaging for future smart buildings. this system was developed using an off-the-shelf commercial wi-fi router, omnidirectional antennas and a network interface card (nic) for imminent body-centric communication [ ] . a low-cost, non-intrusive, low-power radar-based sensing system with a novel approach to human activity recognition in the home was developed and used to investigate fifteen different challenging activities performed inside the kitchen [ ] . a har approach uses deep learning (dl) networks with enhanced csi, based on a csi feature enhancement scheme (cfes). it includes two modules of background correlation feature enhancement and reduction for preprocessing the data input to the dl network (dln) [ ] . at present, we are entering the era of the internet of things (iot), where it will be convenient to find aps at any location.
the presence of a human body between two aps alters the wi-fi signal waveforms from which csi is extracted. machine learning (ml) uses the csi data to recognize and predict human motion [ ] . with the popularization and development of wi-fi technology, using mobile devices to monitor daily activities has become a benefit in daily life [ ] . sleep monitoring is very important because sleep plays a key role in human health. sleep-guardian, a radio frequency (rf)-based healthcare system, combines signal processing, edge computing and ml [ ] . extensive running can be life-threatening if it is not monitored properly. wi-run, a non-invasive step estimation system, is a complete model-based system that intelligently estimates steps using commercial wi-fi devices. wi-run consists of two models. the first model is the single-runner csi-based step estimation, which measures the relationship between a single runner's running and the csi dynamics in the activity area. the second model is the multi-runner csi-based step estimation, which quantifies the relationship between each runner's running and the csi dynamics in the activity area [ ] . an investigation of spatial diversity in wi-fi signal-based har identified the dead zones and their dominant factors. a wi-fi signal-based spatial diversity aware non-contact activity recognition system (wi-sdar) was introduced. it covers the dead zones with only one physical wi-fi sender and receiver and is fully compatible with commercial off-the-shelf wi-fi devices [ ] . radar as a har sensor has unique characteristics such as contactless sensing and privacy protection. dl methods for activity recognition use radar to exploit human motion information [ ] . table summarizes the non-contact sensing technologies, detection and monitoring, classification methods and accuracy achieved by existing research platforms.
the review of human activity classification and monitoring reveals that non-contact sensing exploits wcsi to study human activities such as sitting, standing, walking, running, eating and sleeping. wi-fi sensing using commercial hardware is widely used because it is an inexpensive and easily available solution. human activities have been monitored and classified in existing research by ml and dl algorithms with accuracies over % [ , - ] . the average accuracy of promising non-contact technologies for monitoring human activities is shown in figure . regular health monitoring can detect potential health issues before they become a problem. in the covid-19 pandemic, it is essential to diagnose early symptoms and monitor health conditions in a non-contact manner to stop the spread of the virus. various non-contact sensing studies for monitoring health were investigated for the development of a covid-19 platform. wireless sensing was used to detect asthma attacks based on wcsi where doppler effects were manifested [ ] .
the wireless sensing-based healthcare facility utilized g c-band technology, which improved the efficiency of detecting falls and body motions with a wide spectrum range and maximum capacity. the system works at a . ghz frequency to capture the wcsi of post-surgical falls and other important human activities. the low-cost solution includes an rf signal generator, a nic and a desktop pc along with omnidirectional antennas. this system is feasible and reliable for detecting post-surgical falls with high accuracy [ ] . huntington's disease (hd) is a genetic disorder that cannot be cured easily. the quality of life of patients deteriorates as the disease progresses, so it is essential to examine patients in a timely and effective manner. a microwave sensing platform (msp) was developed for the continuous monitoring of hd patients in a non-contact manner. this platform also resolved patient inconvenience and privacy issues [ ] . parkinsonian gait is the most devastating symptom of parkinson's disease (pd) and has a more negative impact on quality of life than other pd symptoms. s-band wireless sensing technology has been used to detect parkinsonian gait by classifying normal walking and abnormal gait. additionally, the early diagnosis of shaking palsy (sp) symptoms in a non-contact manner was also achieved [ ] . patients suffering from dementia show signs of wandering behavior due to memory loss or boredom. dementia patients are exposed to serious injuries from falls if they are not continuously monitored. a wireless sensing platform was designed to evaluate the wandering behavior of dementia patients in an indoor environment [ ] . passive wi-fi sensing extracts the two-dimensional phase to monitor three health essentials: breathing rate, tremor and falls. the signal is processed with the cross-ambiguity function (caf), and various features are extracted from it [ ] . wireless sensing uses the c-band ( .
ghz) in the indoor environment to monitor the body movements of women, especially pregnant women, for the early detection of seizures in pre-eclamptic women, so that patients can be managed promptly and the mode of delivery can be decided early. the body movements show unique features that are extracted from wcsi and can easily be differentiated using ml classifiers [ ] . a bathroom has a comparatively higher probability of severe accidents than other rooms due to its slippery floor. a commercial wi-fi device-based danger-pose detection system was used while preserving privacy [ ] . a non-contact sensing method uses passive doppler radar to capture human body movements in order to recognize respiration and other physical activities used for monitoring health. the system uses existing wireless signals as a source to detect human activity. a two-stage signal processing framework was outlined to support the multi-purpose monitoring functionality. the first stage obtains the primary doppler information through high-speed passive radar signal processing. the second stage processes the extracted micro-doppler data for breathing detection and classification [ ] . parasomnia is a sleep disorder that causes involuntary, random and unwanted movements of a dreaming patient. unfortunately, these dreams may cause violent activities, which increase the chance of injury, including to a bed partner. continuous monitoring of patients can prevent difficult situations. the system for continuous monitoring of patients exploits fine-grained magnitude and phase information of the wcsi. the variations in the wcsi caused by patient body movements were monitored to identify the behavior [ ] . cerebellar ataxia (ca) is a neurological disease with symptoms of poorly coordinated movements and balance disorders. a non-contact sensing system was developed for detecting ca based on rapid alternating movements and heel-knee-shin diagnosis tests.
this system has the potential to monitor ca in a flexible and patient-friendly environment [ ] . a non-contact sensing method uses rf signals to detect paraparesis. it is a promising solution that can reduce doctors' workload and improve their efficiency. the system used a d-convolution neural network (cnn) model for the automatic extraction of valid features and classification. it performed efficient and accurate screening of patients with suspected paraparesis [ ] . parkinson's disease is a progressive neurological disorder that primarily affects movement and limits the motor ability of the patient. freezing of gait (fog) is a motor symptom of parkinson's disease in ageing people, and its timely treatment can reduce the probability of secondary disorders. the magnitude and phase information of the radio signals is used to detect the motor and non-motor symptoms. the method is very useful for real-time patient monitoring systems and requires minimal deployment of resources [ ] . cerebellar dysfunction (cd) is one of several neurological disorders that disturb the movement of the body. a user-friendly system was used to evaluate body movements in cd patients using an s-band sensing technique. this system quantified hand tremors and gait abnormality using wireless devices such as a nic, omnidirectional antennas and a router operating at . ghz to extract the csi data [ ] . wireless sensor networks (wsns) use directional antennas extensively for various applications. a four-beam patch antenna was used as a sensor node to evaluate the pill-rolling effect in parkinson's disease. the four-beam patch is highly directive, small in size and can mitigate the multipath fading present in an indoor environment for effective measurements. the pill-rolling effect indicates tremors in the hands, predominantly in the forefinger and the thumb.
the developed system was a low-cost framework that evaluates the movement disorder using an s-band sensing platform leveraging wireless devices working at . ghz. the system efficiently classifies tremor and non-tremor states in the fingers [ ] . particular body movements of multiple sclerosis patients, especially tremors and breathing patterns, are monitored by a c-band sensing system working at . ghz, a potential g band. this system can identify the particular condition of a patient efficiently [ ] . wireless signal technology successfully detects human motions and related diseases in a non-contact manner [ ] . the heart rate and breathing pattern of a person are major indicators of physical condition. a system was developed for measuring changes in the heart rate and breathing pattern of a person using commercial wi-fi devices. this is an inexpensive system that is very useful for monitoring health in daily life [ ] . monitoring the vital signs of heart rate and breathing, along with body posture during sleep, is very important for evaluating general physical health. a system was developed using off-the-shelf wi-fi devices to track both heart rate and breathing rate during sleep without dedicated devices. the system re-uses an existing wi-fi network and exploits the fine-grained csi to capture the movements caused by heartbeats and breathing. this system can monitor continuously and can be deployed easily and cheaply [ ] . fog is a periodic absence of forward movement in pd patients and is one of the disabling symptoms of the disease. wi-freeze is a non-invasive wi-fi-based sensing system used for the detection and classification of fog [ ] . the monitoring of various physical activities exploits wireless sensing devices, such as sensors used in medical cyber-physical systems (cps). patients undergoing epileptic seizures show signs of involuntary body movements.
the system exploits s-band operating frequencies used for data extraction and classification of a clinical condition of epileptic seizures [ ] . wi-fall is a system used for fall detection of independently-living people, especially the elderly. it can detect the fall of the human without any extra hardware setup or any wearable device. the system was implemented using commercial . n nic. it can achieve high fall detection accuracy for a single person [ ] . a real time (rt)-fall, contactless, inexpensive and accurate fall detection system used commercial wi-fi devices. it allowed users to perform routine activities continuously and naturally without attaching any devices on the body [ ] . res-beat is a commercial wi-fi device-based system used for non-contact real-time respiration rate monitoring. the system analyzes bimodal csi data for breathing signal anomalies to detect peak and estimate respiration rates [ ] . the dl-based cnn model classifies ankle movements after surgery using the sdr platform. wcsi image data accurately detected movement of the ankle of patients who suffered fracture ankle surgery [ ] . a non-contact sensing testbed was designed using universal software radio peripheral (usrp) devices for the classification of post-surgery activities. the testbed efficiently classified the weight lifting activity of spinal cord patients by exploiting wcsi [ ] . sometimes involuntary scratching may increase the spread of skin diseases such as atopic dermatitis. the frequency of scratching indicates the degree of itching and is helpful in analyzing clinical diagnosis. a system has the potential to monitor the scratching signal of a sleeping human body using a wi-fi router and a leaky coaxial cable (lcc) [ ] . hypopnea syndrome is a chronic respiratory disease that is described by repetitive occurrences of breathing disturbances during sleep. 
a contactless system provides an alternative to conventional medical testing for detecting incognito hypopnea syndrome using s-band wireless sensing. this system has the potential for monitoring accurate hypopnea syndrome in a user-friendly and flexible environment [ ] . respiratory rhythm is the indication of respiratory diseases. the ignored respiratory issues can be dangerous and may cause damage to other body tissues and organs. a non-contact respiratory rhythm detection used an msp to capture the minute variations caused by breathing. this solution is affordable and its performance is high [ ] . lcc has been used extensively in wireless communication to cover blind and semi-blind regions. a system used lcc to identify patient postures in bed in order to prevent or reduce bedsores. the indoor installation and periodic csi data collection using . n intel wlan nics helped to monitor postures [ ] . a system monitored abnormal breathing patterns caused by sudden infant death syndrome (sids) and sleep apnea patients. this system used s-band wireless sensing to extract csi data for the periodic and non-periodic signals that identify the normal and abnormal respiratory conditions [ ] . traditional, non-contact breathing detection systems required specialized hardware support that is not affordable in normal environments. non-contact breathing detection systems based on c-band wireless sensing can easily be deployed in any environment. it is based on a multi-input, multi-output orthogonal frequency division multiplexing (mimo-ofdm) system using . n protocol [ ] . the focus of the g autonomous network used wireless sensing for health care monitoring. 
the monitoring of respiratory symptoms for copd (chronic obstructive pulmonary disease) used c-band wireless sensing to detect respiratory conditions, including coughing and normal and abnormal breathing of a copd patient, by utilizing a nic and the csi tool for the extraction of csi with an omni-directional antenna operating at . ghz frequency. the g sensing technology enhanced the health care system for detection of various diseases effectively [ ] .
table summarizes the non-contact sensing technologies, diagnosis of symptoms and monitoring health, classification method and accuracy achieved by the existing research platforms. it was revealed from the literature review that the non-contact sensing approach has the potential for the early diagnosis of various symptoms to monitor health, such as breathing, heart rate, fall and sleep disorder. most systems in the existing literature exploit wi-fi technology using csi to detect and classify the health problems. the svm algorithm is widely applied because it is applicable to both linear and non-linear data. the classification accuracy achieved by ml and dl algorithms is over % [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] . the average accuracy achieved by various non-contact sensing technologies to monitor health issues is shown in figure .
a non-contact wireless platform is proposed on the basis of the existing literature. the five major functional blocks for the development of platforms are data collection devices, wcsi-based data extraction, data preprocessing, feature extraction and classification, as shown in figure .
the data can be collected by either specialized or commercial hardware devices. the coughing and breathing data are collected for early diagnosis of covid- symptoms. the data of sitting, standing, walking, sleeping, eating and posture are collected for the monitoring and detection of fall, heart rate, sleep disorder and diet to protect human lives in the covid- pandemic.
the ofdm signal is used for fine-grained wcsi extraction at the receiver. the wcsi frequency response of each activity is monitored continuously, having the information of the number of sub-carriers, the number of samples and the time taken to complete the activity. the time and samples can be expressed as the number of samples received in a unit time. this sampling time can be chosen on the basis of the device sample rate. the total frequency response of wcsi is expressed in equation ( ): where k represents the maximum number of sub-carriers, and s represents the total number of samples. the wcsi frequency response of a single ofdm frame can be expressed as in equation ( ): the wcsi frequency response of each sub-carrier contains amplitude and phase information, which can be expressed as in equations ( ) and ( ), respectively: |h(jω_k)| is the amplitude of the kth sub-carrier, and ∠h(jω_k) is the phase of the kth sub-carrier. the amplitude and phase information of wcsi is useful for identifying the human body motion to recognize human activity and health condition.
data preprocessing requires data cleaning, smoothing and grouping to ensure meaningful, accurate and efficient analysis. data cleaning is a process to remove and replace missing or bad data.
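the displayed equations referenced in the wcsi passage above did not survive in this copy. as a hedged sketch only, assuming the conventional csi notation (the matrix layout and the symbols H_{k,s}, H_s and \omega_k are assumptions inferred from the surrounding definitions of k, s, amplitude and phase, not the authors' typeset formulas), they plausibly take a form such as:

```latex
% assumed reconstruction: K sub-carriers (rows) by S samples (columns)
H =
\begin{bmatrix}
H_{1,1} & H_{1,2} & \cdots & H_{1,S} \\
H_{2,1} & H_{2,2} & \cdots & H_{2,S} \\
\vdots  & \vdots  & \ddots & \vdots  \\
H_{K,1} & H_{K,2} & \cdots & H_{K,S}
\end{bmatrix}

% assumed form for a single OFDM frame (one sample across all sub-carriers)
H_s = \left[ H_{1,s},\; H_{2,s},\; \ldots,\; H_{K,s} \right]^{T}

% polar form of the k-th sub-carrier response
H(j\omega_k) = \left| H(j\omega_k) \right| e^{\,j \angle H(j\omega_k)}

% amplitude and phase of a complex response (standard identities)
\left| H(j\omega_k) \right| = \sqrt{ \Re\{H(j\omega_k)\}^{2} + \Im\{H(j\omega_k)\}^{2} }
\qquad
\angle H(j\omega_k) = \operatorname{atan2}\!\left( \Im\{H(j\omega_k)\},\; \Re\{H(j\omega_k)\} \right)
```

only the last two identities are standard facts about complex numbers; the matrix and frame layouts are one common convention and may differ from the original equations.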
data cleaning also detects abrupt changes and local extrema, which helps to find significant data trends. the smoothing process removes noise using filtering and other signal processing techniques. the grouping process is used to identify correlations among the data values.
feature extraction is a transformation of information, which changes measured data into meaningful information. in addition, it is a dimension reduction process that reduces the computation complexity and time. presently, statistical approaches are used for feature extraction. in the literature, various features are extracted on the basis of data properties to improve the classification performance. according to wcsi data properties, statistical features are divided into two categories: time-domain and frequency-domain [ , [ ] [ ] [ ] . the most important features used for wcsi data, such as the minimum, are listed in table .
most human activity recognition approaches exploit ml and dl algorithms to classify the motion type and its corresponding human activity to test the health status. the efficiency of the classifier depends on the type of dataset. ml can be used to develop activity detection models that make health predictions based on wcsi data in the presence of uncertainty. adaptive algorithms classify normal and abnormal health patterns in the wcsi data. as a learning algorithm is exposed to more experimental data, its identification performance improves. the entire set of wcsi data is considered as a heterogeneous matrix. the wcsi response data set is a column vector where each row is labeled with the corresponding activity in the wcsi row data. ml model accuracy is used as a diagnostic measure to reflect the validated model results [ ] . in dl, a cnn learns useful information from images. in the existing literature, it is used for monitoring purposes in many research studies. currently, dl is efficiently applied in the biomedical area.
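the amplitude/phase extraction, cleaning, smoothing and time-domain feature steps described above can be sketched as follows. this is a minimal illustration using only the python standard library, not the pipeline of any cited system; the function names, the median fill rule, the moving-average window and the sample values are all assumptions chosen for the example.

```python
import cmath
import math
import statistics


def amplitude_phase(csi):
    """Split complex CSI samples of one sub-carrier into amplitude and phase lists."""
    return [abs(h) for h in csi], [cmath.phase(h) for h in csi]


def clean(values):
    """Data cleaning (assumed rule): replace missing samples (NaN) with the median of valid ones."""
    valid = [v for v in values if not math.isnan(v)]
    fill = statistics.median(valid)
    return [fill if math.isnan(v) else v for v in values]


def smooth(values, window=3):
    """Smoothing (assumed filter): centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def time_features(values):
    """A few time-domain statistical features of one sub-carrier stream."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


# hypothetical samples for a single sub-carrier; the second one is missing
nan = float("nan")
samples = [3 + 4j, complex(nan, nan), 5j, 6 + 8j]
amplitude, phase = amplitude_phase(samples)
features = time_features(smooth(clean(amplitude)))
```

the resulting feature dictionary is the kind of row that would be labeled with an activity and passed to a classifier such as an svm; frequency-domain features would be computed separately.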
alexnet and zfnet are the most popular cnn architectures and can be used in a parallel manner to classify wcsi numeric data that is converted to images [ ] . on the basis of the literature review, an experimental setup is proposed for the development of a covid- platform. in the following, two hardware setups are proposed to conduct experiments in a bedroom along with a bathroom to capture rt environments. this hardware platform consists of a wi-fi router, nic, desktop pc or laptop and omni-directional antennas, as shown in figure a . it utilizes a csi tool inbuilt on the intel wi-fi wireless link (iwl) . n mimo radio, which uses an open source linux wireless driver and custom modified firmware. it includes all the software to read and parse the channel measurements and scripts needed to run experiments. the iwl provides . n standard wcsi in a data format that reports the channel matrices of sub-carrier groups, which is about one group for every sub-carriers at mhz or one group for every sub-carriers at mhz frequency. each channel matrix entry is a complex number, with signed -bit resolution, each for the real and imaginary parts. it specifies the gain and phase of the signal path between a single transmit-receive antenna pair. the hardware setup is inexpensive, easily accessible and commercially available [ , ] . this hardware platform consists of two usrp devices, one for transmission and the other for reception, along with omni-directional antennas and a desktop pc or laptop, as shown in figure b . currently, different sdr platforms are used for experimental research; among them, the usrp manufactured by ettus research is mostly used, which has become the standard choice for wireless research [ , ] . the architecture incorporates the xilinx spartan- fpga along with the agile analog radio frequency integrated circuits (rfic)'s direct adaptation transceiver. the rfic determines the number of independent transceivers. 
it integrates independent coherent transceivers that allow implementation of n × n mimo systems. this device can cover a wide frequency range with an adjustable bandwidth and can run in frequency division duplex (fdd) or time division duplex (tdd) mode; fdd operates in full-duplex mode, while tdd operates in half-duplex mode [ ] . this hardware setup is flexible and portable.
various experiments can be carried out on commercial and specialized hardware platforms to develop a fully functional non-contact sensing covid- platform. experiments are to be divided into two main categories:
symptoms data collection: initially, coughing and shortness of breath data are collected from patients for developing an ml/dl model to classify suspected patients of covid- . from the literature review [ , , , , , ] , both coughing and breathing can be monitored in a non-contact manner.
activities data collection: it is essential to recognize human activities for stability during the isolation and quarantine period. standing, walking, sleeping, eating, bathing and posture data are collected for developing an ml/dl model to recognize the fall, sleep disorder and diet of the patient. from the literature review, heart problems, fall, sleep disorder and eating habits can be recognized by non-contact wireless sensing platforms [ , , , , , , [ ] [ ] [ ] [ ] , , , , , , , , [ ] [ ] [ ] , [ ] [ ] [ ] .
the following are the outcomes from the development of a non-contact sensing platform for containing covid- .
wireless signals can pass through the wall and do not require los. this feature of non-contact sensing eliminates the need for face-to-face contact and provides improved management to contain covid- .
in case covid- symptoms develop, the data sent by means of cloud computing platforms can enable healthcare authorities to respond quickly.
it will reduce the physical contact time with a covid- patient as much as possible.
it will not only monitor covid- symptoms but also provide continuous health monitoring during quarantine and isolation periods in a non-contact manner, supporting the transfer of care to home, or the treatment of high-risk elders and children in their own homes.
this will improve privacy of individuals during quarantine or isolation periods.
it will also help in early recognition of patients who need aggressive management or hospitalization to prevent them from serious or irreversible sequelae of the disease.
it will reduce the life risk of doctors, paramedical staff and caretakers during quarantine and isolation periods.
it provides innovative tools to construct useful contactless sensing platforms for health care applications.
these platforms can be deployed by re-using the existing infrastructure of wireless communication networks.
it offers improved access to care, increased quality of care and reduced care costs.
it can be deployed in any emergency condition at any place to counter health challenges.
although the literature review has demonstrated the potential capabilities for developing a non-contact sensing platform for the monitoring of covid- to contain the virus and protect humanity, there are still challenges and research problems that need further investigation and exploration.
the real-time environment is challenging for developing a classification model using wcsi. the environment varies from place to place due to furniture movement, closing and opening of windows and doors, electronic devices, etc., which changes the behavior of the wireless channel. it is necessary to develop a model which can adapt to a new environment.
one of the biggest challenges in data collection is the choice of subjects in medical experiments.
it is very difficult to use real patients in all the experiments. diseases and health status vary greatly from patient to patient and during different time periods. with covid- , early symptoms also vary from patient to patient. on the other hand, experimenting with real patients is not comfortable for them and also requires time to perform extensive experiments. researchers mainly used healthy subjects for performing experiments, which may not address the actual problem.
the user's orientation and location also have critical effects on the performance of wcsi-based sensing systems. differences in users' orientation and location may cause different variations in wcsi measurements. existing research mainly used the same orientation and location during experiments. however, a few research studies considered different orientations and locations to overcome such limitations.
most of the research on wcsi-based sensing platforms considered a single subject for investigation. it is challenging to differentiate the movements of multiple subjects using wcsi measurements. considering the covid- isolation or quarantine period guidelines, a single subject is enough for developing the platform. however, early diagnosis of covid- symptoms requires multi-subject sensing, because before a covid- diagnosis is confirmed, people are not quarantined from their families and are living in a multi-subject environment. this stage is difficult to monitor and is the main source of the viral spread.
the rapid expansion of wi-fi sensing technology also raises privacy and security issues. existing research has demonstrated that wi-fi signals can interfere with other users. adversaries may spy on activities and positions using existing human activity sensing systems. it is necessary to pay more attention to privacy and security concerns.
the following are the future recommendations to improve the non-contact sensing platform for the monitoring of human activities and health conditions to contain covid- :
it is recommended to perform extensive experiments with different environments and experimental setups to develop an rt model.
it is recommended to collect experimental data by using multiple subjects with extensive experimentation to develop a model.
an efficient solution must develop a rigorous theoretical model independent of the user's location and orientation, one that correctly maps the relationship between wcsi measurements and human body motions to identify the health conditions. it is recommended to conduct experiments with different orientations and locations for the collection of data for developing models.
it is recommended to extract more prominent features to differentiate human activities and health conditions. frequency-domain features are useful for classifying multiple subjects.
it is recommended to use sdr-based wcsi sensing to counter the privacy and security issues using a self-generated signal approach that can switch to different frequency bands.
various measures and research studies have been initiated to contain covid- throughout the world. limiting human-to-human contact is the best solution to reduce the spread of covid- . this research presents a comprehensive review on existing non-contact sensing of human activities and health monitoring that could be used for the development of a covid- pandemic platform. wi-fi and sdr technology has the potential to contain covid- in a non-contact manner. this study proposes a non-contact wcsi-based sensing platform for monitoring covid- to contain the deadly pandemic situation. the proposed platform has the potential to diagnose early symptoms like coughing and shortness of breath.
the development of the platform is very useful in the quarantine and isolation period because it will monitor fall, sleep disorder, shortness of breathing, coughing level, heartbeat and diet of suspected or confirmed covid- cases. although the proposed platform is a promising solution, there still exist limitations to achieve optimal performance. this study highlights the challenges, and it is expected that proposed solutions will contribute to contain covid- . world health organization. considerations for quarantine of individuals in the context of containment for coronavirus disease (covid- ): interim guidance evidence based management guideline for the covid- pandemic-review article different approaches for human activity recognition: a survey. arxiv human activity sensing with wireless signals: a survey a survey on wi-fi based contactless activity recognition design of software defined radios based platform for activity recognition balasingham, i. applications of software-defined radio (sdr) technology in hospital environments annual international conference of the ieee engineering in medicine and biology society (embc) wifi sensing with channel state information wireless sensing for human activity: a survey freesense: indoor human identification with wi-fi signals a survey on csi-based human behavior recognition in through-the-wall scenario wifi vision: sensing, recognition, and detection with commodity mimo-ofdm wifi evaluation and treatment coronavirus (covid- ) updated world health organization declares global emergency: a review of the novel coronavirus (covid- ) a comprehensive review of the covid- pandemic and the role of iot, drones, ai, blockchain, and g in managing its impact micromachines available online people who are at higher risk for severe illness. centers for disease control prevention (cdc) clinical management of severe acute respiratory infection (sari) when covid- disease is suspected. 
interim guidance respiratory laryngeal coordination in airflow conservation and reduction of respiratory effort of phonation relationship between changes in voice pitch and loudness speech motor coordination and control: evidence from lip, jaw, and laryngeal movements neurological manifestations of hospitalized patients with covid- in wuhan, china: a retrospective case series study neurologic features in severe sars-cov- infection association of chemosensory dysfunction and covid- in patients presenting with influenza-like symptoms a new symptom of covid- : loss of taste and smell a framework for biomarkers of covid- based on coordination of speech-production subsystems modes of transmission of virus causing covid- : implications for ipc precaution recommendations enabling contactless detection of moving humans with dynamic speeds using csi rf-sensing of activities from non-cooperative subjects in device-free recognition systems using ambient and local signals device-free human activity recognition using commercial wifi devices wifi-based through-the-wall presence detection of stationary and moving humans analyzing the doppler spectrum device free human activity and fall recognition using wifi channel state information (csi) an intelligent non-invasive real-time human activity recognition system for next-generation healthcare device-free activity recognition using ultra-wideband radios wifi-based gesture recognition system training-free human vitality monitoring using commodity wi-fi devices fine-grained device-free eating monitoring leveraging wi-fi signals. arxiv whole-home gesture recognition using wireless signals we can hear you with wi-fi! 
indoor human activity recognition based on ambient radar with signal processing and machine learning adversary helps: gradient-based device-free domain-independent gesture recognition micro-doppler-based human activity classification using the mote-scale bumblebee radar device-free occupant activity recognition in smart offices using intrinsic wi-fi components wiar: a public dataset for wifi-based activity recognition wi-motion: a robust human activity recognition using wifi signals privacy-preserving non-wearable occupancy monitoring system exploiting wi-fi imaging for next-generation body centric communication kitchen activity detection for healthcare using a low-power radar-enabled sensor network human activity recognition using deep learning networks with enhanced channel state information human activity recognition and prediction based on wi-fi channel state information and machine learning walls have no ears: a non-intrusive wifi-based user identification system for mobile devices sleepguardian: an rf-based healthcare system guarding your sleep from afar device-free step estimation system with commodity wi-fi on spatial diversity in wifi-based human activity recognition: a deep learning-based approach a survey of deep learning-based human activity recognition in radar effect of wireless channels on detection and classification of asthma attacks in wireless remote health monitoring systems post-surgical fall detection by exploiting the g c-band technology for ehealth paradigm monitoring of huntington's disease based on wireless sensing technology non-contact early warning of shaking palsy wandering pattern sensing at s-band wireless health monitoring using passive wifi sensing an efficient monitoring of eclamptic seizures in wireless sensors networks danger-pose detection system using commodity wi-fi for bathroom monitoring passive radar for opportunistic monitoring in e-health applications monitoring of patients suffering from rem sleep behavior disorder 
activity pattern mining for healthcare a non-contact paraparesis detection technique based on d-cnn ur-rehman, m. freezing of gait detection considering leaky wave cable $ s $-band sensing-based motion assessment framework for cerebellar dysfunction patients cognitive health care system and its application in pill-rolling assessment utilizing a g spectrum for health care to detect the tremors and breathing activity for multiple sclerosis gait signals classification and comparison design and implementation of monitoring system for breathing and heart rate pattern using wifi signals monitoring vital signs and postures during sleep using wifi signals multiresolution scalograms for freezing of gait detection in parkinson's leveraging g spectrum with deep learning seizure episodes detection via smart medical sensing system device-free fall detection by wireless networks rt-fall: a real-time and contactless fall detection system with commodity wifi devices resilient respiration rate monitoring with realtime bimodal csi data cognitive intelligence for monitoring fractured post-surgery ankle activity using channel information non-contact sensing testbed for post-surgery monitoring by exploiting artificial-intelligence monitoring of atopic dermatitis using leaky coaxial cable diagnosis of the hypopnea syndrome in the early stage breathing rhythm analysis in body centric networks posture recognition to prevent bedsores for multiple patients using leaking coaxial cable respiration symptoms monitoring in body area networks chronic obstructive pulmonary disease warning in the approximate ward environment ubiquitous smoking detection with commercial wifi infrastructures leveraging wifi for human activity classification using ofdm subcarriers' correlation a survey on wifi based human behavior analysis technology literature review on wireless sensing-wi-fi signal-based recognition of human activities revolutionizing software defined radio: case studies in hardware, software, and 
education demonstrating the practical challenges of wireless communications using usrp teaching software defined radio using the usrp and labview design of a portable and multifunctional dependable wireless communication platform for smart health care this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license key: cord- -m xba g authors: macintyre, chandini raina; costantino, valentina; kunasekaran, mohana priya title: health system capacity in sydney, australia in the event of a biological attack with smallpox date: - - journal: plos one doi: . /journal.pone. sha: doc_id: cord_uid: m xba g planning for a re-emergent epidemic of smallpox requires surge capacity of space, resources and personnel within health systems. there are many uncertainties in such a scenario, including likelihood and size of an attack, speed of response and health system capacity. we used a model for smallpox transmission to determine requirements for hospital beds, contact tracing and health workers (hcws) in sydney, australia, during a modelled epidemic of smallpox. sensitivity analysis was done on attack size, speed of response and proportion of case isolation and contact tracing. we estimated clinical hcws and public hospital beds in sydney. rapid response, case isolation and contact tracing are influential on epidemic size, with case isolation more influential than contact tracing. with % of cases isolated, outbreak control can be achieved within days even with only % of contacts traced. however, if case isolation and contact tracing both fall to %, epidemic control is lost. with a smaller initial attack and a response commencing days after the attack, health system impacts are modest. the requirement for hospital beds will vary from up to % to % of all available beds in best and worst case scenarios. 
if the response is delayed, or if the attack infects people, all available beds will be exceeded within days, with corresponding surge requirements for clinical health care workers (hcws). we estimated there are public health workers in sydney with up to , contacts to be traced. at least million respirators will be needed for the first days. to ensure adequate health system capacity, rapid response, high rates of case isolation, excellent contact tracing and vaccination, and protection of hcws should be a priority. surge capacity must be planned. failures in any of these could cause health system failure, with inadequate beds, quarantine spaces, personnel, ppe and inability to manage other acute health conditions.
smallpox is a category a bioterrorism agent, despite being declared eradicated in [ ] . the virus is retained in high security biosafety level laboratories in the united states and russia [ ] . the variola genome is fully sequenced and could be synthesized in a laboratory [ ] .
this study aimed to determine the capacity of the health system in sydney, a city of . million people in australia, during an epidemic of smallpox. specifically, we aimed to determine hospital bed capacity for isolation, public health workforce capacity for contact tracing and health care worker (hcw) personal protective equipment (ppe) requirements under different attack scenarios. we also aimed to test a worst case scenario among the range of possible attack scenarios and identify modifiable factors which would prevent a worst case scenario.
we constructed a modified seir model for smallpox transmission based on a model published in our previous study [ ] . model parameters and their estimation have been previously described [ ] . we assumed that the virus has not been genetically modified and that there is minimal residual immunity in the population from previous vaccination, as described in our previous study [ ] . we assumed an initial attack size of , or , infected.
case isolation was assumed to reduce transmission to zero [ ] . given antivirals would be commenced after diagnosis and isolation, we assumed this effect would only apply in the healthcare setting and would not add to interruption of transmission above the effect of isolation alone, with the main transmission risk being in the community for undiagnosed or early cases prior to hospitalisation. we assumed that antivirals would therefore have no effect on community transmission, acknowledging that they would likely reduce morbidity and mortality for treated cases. we estimated number of hospital beds needed to control the epidemic, ppe requirements for clinical hcws and public health workers required for contact tracing, under different scenarios. we constructed a modified model for smallpox transmission. population residual immunity and contact mixing are based on assumptions used in our previous study [ ] . using ordinary differential equations as described in s file, the population moves through the disease compartmental epidemiological states of being susceptible, exposed, infected and recovered (seir) from smallpox. once infected, people move into the next state following disease duration rates. euler's approximation was used to estimate age-specific force of infection, assuming contact would be similar to observed patterns in the uk [ ] [ ] [ ] . different infectivity levels were based on the reproduction number for hemorrhagic, flat, ordinary and modified smallpox as previously described [ ] . the force of infection was multiplied by a parameter (α , α , α , α ) to account for different population susceptibility levels. model parameters and their estimation are described in our previous study [ ] . the model runs for days. the model assumptions are shown in table . to test the preparedness of the health system we assumed the base case response would be case isolation and treatment, contact tracing and ring vaccination. 
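the compartmental structure described above can be illustrated with a minimal sketch. this is not the authors' model (their equations are in the s file): it assumes a single well-mixed population with no age structure, no disease types, no susceptibility multipliers and no interventions, and it integrates the seir equations with a forward euler step. the function name and the parameters beta, sigma, gamma and dt are hypothetical, chosen only for illustration.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Forward-Euler integration of a basic SEIR model (illustrative only).

    beta  : transmission rate per day (hypothetical value supplied by caller)
    sigma : rate of leaving the exposed state (1 / latent period in days)
    gamma : rate of leaving the infectious state (1 / infectious period in days)
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)  # assumption: the attack seeds infectious cases
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            force_of_infection = beta * i / population
            new_exposed = force_of_infection * s * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            # Move people between compartments; the total stays constant.
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Arbitrary placeholder values, not estimates from the study.
    run = simulate_seir(1_000_000, 100, beta=0.3, sigma=0.1, gamma=0.1, days=30)
    print(f"infectious on day 30: {run[-1][2]:.0f}")
```

a smaller dt gives a closer approximation to the underlying differential equations at the cost of more steps; an age-structured version would replace the scalar force of infection with one value per age group derived from a contact matrix.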
as the study was examining health system factors, ring vaccination was kept constant in the model at % with assumed adequate vaccine supply and trained vaccinators, and - % vaccine efficacy for uninfected people and - % for latent infected [ ] [ ] [ ] [ ] as described in s file. hospital bed requirements. we estimate the number of hospital beds in sydney using data published for - in nsw [ ] . in nsw for - there were beds available from public hospitals ( . per population) and beds available from private hospitals ( . per population). this was resized for the sydney population. the number of hospital beds needed for case isolation was then modelled under different scenarios based on variation of response time (t), the percentage of infected cases isolated each day and how many contacts were traced. we tested if the number of available hospital beds in sydney will be enough to isolate up to % of all new infected cases every day, under different scenarios. clinical health workforce and ppe requirements. the clinical health workforce was estimated by the number of hcws in sydney for / , including aboriginal and tsi health practitioners, chinese medicine practitioners, dental practitioners, medical practitioners, radiation therapists, nurses and midwives, occupational therapists, pharmacists and ambulance services workers [ ] . the total estimated health workforce number was for nsw, with a total population of . million in [ ] . we applied the same percentage adjusted for the sydney population from the same year, . million [ ] . hcw distribution by age group was estimated using national and global health worker data [ , ] . table (model assumptions). average number of contacts per case: [ ]. proportion of contacts traced around an infected case: %, sensitivity analysis with % and % [ ]. proportion of cases that get isolated once infected and symptomatic: %, sensitivity analysis with % and % [ ].
we estimated, based on epidemic size and duration, the amount of respiratory ppe (n respirators) required for sydney clinical hcws assuming two respirators per shift per hcw. this is based on recommendations that disposable respirators should not be re-used, and the fact that a standard shift for a hcw would include at least one break, after which a new respirator would need to be used. public health workforce for epidemic control. there are no published data to estimate the public health workforce, comprising trained public health officers working in health departments and capable of conducting contact tracing and outbreak investigation, as public health workers are not registered health practitioners. the only uniform qualification in public health is a master of public health (mph) and similar degrees. whilst there are a large number of mph graduates in australia, the number working in government public health roles would be a minority. it should also be noted that a mph does not equip people with the skills for field response to an epidemic. there are approximately alumni of the national field epidemiology training program (fetp). in addition, there is a medical specialisation in public health medicine for a relatively small number of medical doctors, with an estimated full time equivalent public health physicians nationwide in [ ] . based on discussions with national experts we estimated there are approximately skilled public health officers in australia, although the actual number may be lower. the public health workforce was calculated using estimates of mph graduates currently working in government, fetp graduates or current fetp trainees and public health physicians. an optimistic assumption of public health officers nationally was used to estimate the number working in sydney. contact tracing requirements. in the base case, we assumed % of contacts would be traced and % infectious people would be isolated. 
we used age-specific contact rates, with an average of contacts per case based on european social mixing data [ ] . we estimated the number of public health workers required to conduct contact tracing under different scenarios. given contact tracing may require complex communications and travel over large geographic distances, we assumed one public health officer could trace contacts per day. australian guidelines for management of smallpox state that isolation is needed for non-immune category a (high risk) contacts, in individual rooms with supervision by vaccinated staff [ ] . we conservatively assumed at least % of contacts traced would be category a, and would require supervised quarantine. data from tuberculosis studies [ ] as well as estimated social contact matrices [ ] suggest one person, on average, has - contacts at reasonable risk of infection. the closest of these contacts would include household contacts (about - people), plus - others in work or friendship circles. this would be about half of the - contacts. the number of contacts needed to be traced and managed was estimated based on attack size, time to response (t) and the percentage of infected cases isolated each day. a contact tracing day was defined as one day entirely spent tracing contacts per public health worker. sensitivity analysis. a sensitivity analysis was conducted on attack size, the proportion of infectious cases isolated and contacts traced, as well as time to commencing the response. to illustrate the difference in epidemic size between a single index case of smallpox imported from overseas, compared to a primary attack which results in or simultaneous first-generation cases, we modelled the epidemic resulting from , or initial first-generation cases. the size of an attack is unknown, and would depend on the technical sophistication of aerosol dispersion of variola.
to account for this uncertainty, we explored the influence of attack scenarios of , and initial infected as a wide range of possible attack sizes, to determine the impact of attack size on epidemic control. delays in diagnosis and time to obtaining laboratory confirmation could vary the time of onset of the response. we therefore varied the time of the response commencing between t = , and t = days following virus release. given an average incubation period of days for smallpox [ ] , this corresponds to day , and after the onset of symptoms of the index case. we estimated public hospital beds and private hospital beds in sydney. we estimated there are clinical hcws in sydney, the majority ( %) aged - years old, % nurses and % doctors [ ] . we estimated a public health workforce of nationally, with approximately public health workers in sydney. fig shows the relative epidemic size of a deliberate release scenario with or initial infected compared to a single importation of smallpox from an epidemic overseas, to illustrate the potential scale of the required public health response. the higher the initial number infected, the more rapid and severe the epidemic. without intervention, the death rate will reach an incidence (number of new infected people per day) rate of deaths per population per day. the overall reproductive number was estimated to be . . fig shows the influence on infections and deaths by varying time to response and case isolation rates. both timing and isolation rates of infected cases are highly influential in outbreak control. with infected initially, when isolation decreases, the deaths increase from a maximum of , and per day in the best scenario with % isolation (for t = , and respectively), to . , . and per day with only % of cases isolated (fig ) . fig shows the influence of varying time of starting the intervention and varying percentage of contacts traced on the incidence of infections and deaths with case isolation constant at %. 
with a high proportion of cases isolated, outbreak control can be achieved within days even with only % of contacts traced. fig shows the effect of varying both case isolation and contact tracing rates, with the intervention commencing at days; case isolation is more influential, with epidemic control severely impacted when isolation and contact tracing fall to %. in table we show the impact of the epidemic (total cases and contacts needed to be traced) by varying case isolation rates. the total number of cases ranges from to , ; and contacts that need to be traced between and in the best- and worst-case scenarios respectively. we estimated public hospital beds and private hospital beds in sydney. the modelled maximum number of people isolated at the same time is about . and . times the initial number of infected if we start the intervention at t = and at t = respectively, and peaks days after the response commences. therefore, if the initial number of infected is or , the available beds will not be completely exhausted, but treatment capacity for other illnesses may be impacted. in the case of initial infected, the maximum bed usage will reach . % and . % of sydney public available beds, if the response starts at time t = and t = days from the virus release respectively. if the initial number of infected is , the maximum bed usage will reach . % and . % of sydney public available beds days after the response commences, if the response starts at time t = and t = days from the virus release respectively. however, in the case of initial infected, the available hospital beds will all be used in the first few days of the response. fig shows the hospital bed usage in the worst-case scenario of initially infected, with varying start times of the response. the maximum number of beds needed at the same time, and the day on which it occurs, are shown in the square windows. with initial infected if we start the intervention at t = from virus release, beds will be used the first day, the second day.
at and days after commencing the intervention (at day and ) more than % and % of the total beds in sydney hospitals will be needed. at day post-attack, days after starting the intervention, % of all public and private beds will be used. if the intervention is delayed to day t = , almost % of the available beds will be used in the first days, % will be used at day of response and at day of the response (t = after the attack) all available public and private beds will be used. table shows the time to occupancy of all available hospital beds at levels of % or greater (table: maximum number of beds used in the same day, as % of the total, by initial infected; results shown for % of new infected isolated; https://doi.org/ . /journal.pone. .g ). whilst an attack of initial infections does not reach % of beds under any scenario, in the worst-case scenario (response commencing at day ) % of beds will be used by day . the number of hcws required and the ppe they need will be proportionate to the number of cases requiring treatment (table ). in scenarios described above where cases (beds) exceed % of available beds, staffing requirements will increase %, unless reduced staff/patient ratios are implemented. estimating a minimum of disposable respirators a day per hcw for days, over million respirators will need to be stockpiled for all clinical hcws in sydney. this number can be used to estimate requirements based on the estimated percentage of the clinical workforce needed for the epidemic, which will be proportionate to the number of cases requiring treatment ( table ). if % of clinical hcws are involved in care of smallpox patients, over million respirators will be needed. if the epidemic is not controlled within days this number will be doubled. the public health staff (phs) required to conduct contact tracing will depend on how many contacts one person can trace per day and the number of available phs, estimated to be in sydney.
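the respirator and contact tracing requirements described above reduce to simple arithmetic, sketched here under stated assumptions. the function and parameter names are hypothetical, and the numbers in the usage example are arbitrary placeholders rather than values from the study; only the default of two respirators per hcw per shift follows the assumption stated in the methods.

```python
import math


def respirators_needed(clinical_hcws, days, per_hcw_per_day=2, share_involved=1.0):
    """Disposable respirators for `days` of response.

    share_involved is the fraction of the clinical workforce caring for
    smallpox patients (1.0 means the whole workforce).
    """
    return math.ceil(clinical_hcws * share_involved * per_hcw_per_day * days)


def contact_tracing_days(contacts_to_trace, contacts_per_officer_per_day):
    """Person-days of tracing, one day being a full day of one officer's work."""
    return math.ceil(contacts_to_trace / contacts_per_officer_per_day)


def days_per_officer(tracing_days, officers):
    """Average number of full days each available officer must spend tracing."""
    return tracing_days / officers


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    print(respirators_needed(clinical_hcws=1000, days=10, share_involved=0.5))
    workload = contact_tracing_days(contacts_to_trace=950, contacts_per_officer_per_day=10)
    print(workload, days_per_officer(workload, officers=5))
```

keeping the workforce share as a separate argument mirrors the point made in the text that the total can be scaled by the percentage of clinical hcws actually involved in smallpox care.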
if one phs can trace contacts a day, then in the best-case scenario of contacts, over contact tracing days are required, based on the number of contacts in table above. in the worst-case scenario, , contacts and , contact tracing days are required. in the worst-case scenario, phs would work over days each doing contact tracing. if half of contacts are high-risk, quarantine spaces will be required for to , contacts. in the case of a smallpox release in sydney, a high-income, well-resourced city of over . million people, health system impacts may be substantial under some scenarios as shown in our model. we showed if smallpox arises overseas and is imported as a single case into australia by travel, control will be far easier than under an attack scenario. we showed that influential factors on epidemic impact are the size of the initial attack, time to commencing the response, case isolation rates and contact tracing for ring vaccination. whilst both are influential, case isolation is more influential than contact tracing. these public health interventions depend on physical and human resources, including clinical and public health workforce. whilst the size of an attack may not be within our control, the other influential factors are modifiable and potentially within our control. if the initial attack size is - and the response is rapid, an outbreak of smallpox can be controlled with case isolation, contact tracing and vaccination. however, if the response is delayed to days or longer (which equates to about weeks after the first symptoms occur), or if the attack infects people, epidemic control will be much more challenging, and the health system impacts will be substantial. in the worst-case scenario, available hospital beds will be exceeded in less than days. the requirement for hospital beds for isolation of cases will vary from up to % to % of all available beds depending on the size of initial release and speed of response.
even in the mid-range scenario of initial cases, up to % of all available hospital beds will be required for smallpox control. this does not account for the facilities required for quarantine of contacts, which must additionally be planned for, and in the worst-case scenario would require over , high risk contacts to be quarantined. quarantine and isolation capacity are critical to epidemic control. planning for surge bed capacity using available guidelines should be undertaken [ ] , and backup plans such as the use of community halls, school buildings, hotels or other large buildings should be made to ensure that other viable isolation sites are pre-designated as smallpox treatment centres and available. during the pandemic of influenza, which was reportedly not as severe as expected, studies reported a tripling of patient presentations to hospital [ ] . plans for managing hospital bed capacity in the event of a large initial attack should also be made, including designation of specific treatment facilities, cancellation of elective surgery and decanting of patients with non-urgent other conditions into private hospitals or other facilities. the capacity for hospital beds for non-smallpox patients who require urgent hospitalisation must also be considered, and in some scenarios, the care of patients with urgent non-infectious conditions such as myocardial infarction or stroke may be compromised by lack of hospital capacity and staffing shortages. rapid response time is critical and becomes even more critical when the initial infected number is higher. responding more than days from the virus release (which means commencing the public health response within days of symptom onset, given an average day incubation period) will result in a more severe outbreak. whether it is feasible to commence response within the best-case scenario of days post-release (or days after symptom onset) is unknown, but unlikely.
a rapid response depends on very early detection and diagnosis, as well as prompt commencement of case finding, isolation, contact tracing and vaccination. practically, the target for reducing the time to response is in early diagnosis, which depends on awareness first, followed by diagnostic testing. delays may occur if the diagnosis is missed in the index case. recent examples of serious emerging infectious diseases where the diagnosis was missed include ebola in nigeria and the us, both of which occurred during the height of the west african epidemic when media reports were at a peak and awareness should have been high [ , ] . the largest epidemic of mers coronavirus outside the arabian peninsula occurred in south korea following a missed diagnosis and failure of triage in a patient with a relevant travel history and a respiratory clinical syndrome [ ] . the last european epidemic of smallpox in also involved a missed diagnosis, when a traveller to the middle east returned to yugoslavia, which had been free of smallpox for years. the patient had haemorrhagic smallpox, which was misdiagnosed as a severe adverse reaction to an antibiotic, and smallpox was not suspected until second generation cases began occurring, resulting in an outbreak of cases [ ] . excellent surveillance systems and triage protocols for early detection of low probability, high impact outbreaks such as smallpox are recommended. improving diagnosis requires triage protocols and rapid diagnostics, the latter being useful only if the diagnosis is suspected clinically in the first instance. other delays in response could be avoided by having pre-vaccinated first responder teams, pre-designated isolation and quarantine facilities, and rapid human resources and surge capacity scale up plans [ ] . we have shown that epidemic control is highly sensitive to case isolation rates, which need to be maintained at high levels.
identifying and isolating less than half the new infected cases and tracing only half of all contacts will result in a blow-out of the epidemic. space and human resource requirements for case finding, isolation, contact tracing, vaccination and quarantine are therefore essential for preparedness planning. physical space requirements extend beyond isolation of smallpox cases, to quarantine of contacts. in the worst-case scenario, almost one million contacts need to be traced and there will be a lack of physical space for quarantine of high-risk contacts. plans for home quarantine and surveillance of contacts should also be undertaken and will require adequately trained personnel. the speed and effectiveness of contact tracing is also critical to the success of ring vaccination and will require an adequately trained critical mass of public health workers and epidemiologists, separate from the clinical workforce. in australia, as a federation, this will rely on state and territory capacity, and cross-border mobilisation of jurisdictional capacity in the event of a smallpox epidemic, and estimation of the current and required capacity for such an event. the fact that public health personnel are not registered as health practitioners or documented in any other centralised way, makes it more challenging to rapidly mobilise suitably qualified and experienced personnel for a large-scale epidemic response. this is a policy consideration that could be addressed as part of pandemic and health emergency planning, which may strengthen response. contact tracing may need to rely on community volunteers, as the available public health workforce will be inadequate in a large epidemic. staff surge requirements would track parallel to bed requirements and would be over % in some scenarios. the need for a clinical health workforce to treat smallpox will be high, with case numbers in the s to , s in many of the scenarios modelled, and just over , clinical hcws in sydney. 
limiting the number of hcws working in designated smallpox facilities is a sensible strategy. a possible approach to such a scenario would be a reduction of staff to patient ratios, as well as using trainee hcws. protection of these clinicians is key, with vaccination being the mainstay. up to , doses of vaccine will need to be reserved for clinical hcws and plans put in place to commence vaccination. ppe will not be an alternative to vaccination, but an additional protective measure for hcws. today, work health and safety requirements would dictate that paprs or disposable respirators with a hood and coveralls be available to clinicians treating smallpox cases. perceived lack of protection during a serious emerging infection outbreak may result in refusal to work or industrial action by hcws [ ] . hcws may be well protected by ppe, but there is large uncertainty around effectiveness of ppe. studies of other viruses transmitted by the respiratory route suggest good effectiveness of respirators against smallpox [ ] . it should be noted that surgical masks are unlikely to offer protection to hcws based on available data [ ] . stockpiling may provide a short duration of supplies. the modelled epidemic may run for - days or more, depending on the scenario. a very large quantity of respirators may need to be stockpiled, depending on the percentage of hcws involved in direct care of smallpox patients. given the likely duration of an epidemic, plans should be put in place for rapid procurement of ppe supplies beyond the stockpiled capacity. strategies to minimise the number of hcws treating each case of smallpox, including using designated smallpox hospitals, will reduce the quantity of ppe required. early identification of the epidemic, high rates of case isolation, excellent contact tracing and vaccination, and protection of hcws are the key influential components of epidemic control. failure in any of these could severely compromise the capacity of the health system.
australia has a detailed plan for smallpox response, [ ] and we have outlined key influential parameters for disease control which can add further guidance on mitigating severe outcomes in both the planning and response stages. excellent surveillance systems and triage protocols for early detection of low probability, high impact outbreaks such as smallpox can make a difference, given the criticality of timing of the response and better prospects of epidemic control in the early stages. planning for the health system should consider rapid surge capacity for beds, strategies to create and staff make-shift designated smallpox treatment facilities, and protection of hcws at all levels of care. requirements for contact tracing are substantial and may require mobilisation of community volunteers and additional space for quarantine and surveillance of high-risk contacts. designated surge smallpox facilities and plans for management of other urgent health conditions should be considered. we have outlined several modifiable factors which, with good planning, can ensure adequate health system capacity in the event of a smallpox epidemic. general fact sheets on specific bioterrorism agents smallpox and its eradication: the pathogenesis, immunology, and pathology of smallpox and vaccinia the de novo synthesis of horsepox virus: implications for biosecurity and recommendations for preventing the reemergence of smallpox. health security how canadian researchers reconstituted an extinct poxvirus for $ , using mailorder dna biopreparedness in the age of genetically engineered pathogens and open access science: an urgent need for a paradigm shift development of a risk-priority score for category a bioterrorism agents as an aid for public health policy influence of population immunosuppression and past vaccination on smallpox reemergence who (world health organizazation). smallpox. 
geneva, switzerland: world health organization smallpox vaccines: past, present, and future public health and health reform in australia respiratory protection for healthcare workers treating ebola virus disease (evd): are facemasks sufficient to meet occupational health and safety obligations? planning for smallpox outbreaks modelling disease outbreaks in realistic urban social networks containing bioterrorist smallpox modeling the effect of herd immunity and contagiousness in mitigating a smallpox outbreak planned and unplannedfutures for the public health physician workforce in australia. a labour market analysis for the australasian faculty of public health medicine: australasian faculty of public health medicine:sydney hospital resources - : australian hospital statistics canberra: australian institute of health and welfare social contacts and mixing patterns relevant to the spread of infectious diseases transmission potential of smallpox in contemporary populations effectiveness of postexposure vaccination for the prevention of smallpox: results of a delphi analysis smallpox in europe ambulance service of nsw. workforce statistics staff turnover counting health workers:definitions, data, methods and global results. geneva: world health organization annual report / : melbourne. australian health practitioner regulation agency smallpox response plan and guidelines (version . ). archive-use only for research purposes preventability of incident cases of tuberculosis in recently exposed contacts. the international journal of tuberculosis and lung disease a survey of emergency department pandemic influenza a (h n ) surge preparedness office of the chief health officer. 
public health workforce surge guidelines transmission dynamics and control of ebola virus disease outbreak in nigeria ebola us patient zero: lessons on misdiagnosis and effective use of electronic health records hospital outbreaks of middle east respiratory syndrome smallpox as actual biothreat: lessons learned from its outbreak in ex-yugoslavia in . annali dell'istituto superiore di sanita exercise mataika: white paper on response to a smallpox bioterrorism release in the pacific uncertainty, risk analysis and change for ebola personal protective equipment guidelines the efficacy of medical masks and respirators against respiratory infection in healthcare workers. influenza and other respiratory viruses smallpox cdna national guidelines for public health units. communicable diseases network australia key: cord- -sj dd jr authors: grantz, k. h.; cummings, d. a. t.; zimmer, s.; vukotich, c.; galloway, d.; schweizer, m. l.; guclu, h.; cousins, j.; lingle, c.; yearwood, g. m. h.; li, k.; calderone, p. a.; noble, e.; gao, h.; rainey, j.; uzicanin, a.; read, j. m. title: age-specific social mixing of school-aged children in a us setting using proximity detecting sensors and contact surveys date: - - journal: medrxiv : the preprint server for health sciences doi: . / . . . sha: doc_id: cord_uid: sj dd jr comparisons of the utility and accuracy of methods for measuring social interactions relevant to disease transmission are rare. to increase the evidence base supporting specific methods to measure social interaction, we compared data from self-reported contact surveys and wearable proximity sensors from a cohort of schoolchildren in the pittsburgh metropolitan area. although the number and type of contacts recorded by each participant differed between the two methods, we found good correspondence between the two methods in aggregate measures of age-specific interactions. 
fewer, but longer, contacts were reported in surveys, relative to the generally short proximal interactions captured by wearable sensors. when adjusted for expectations of proportionate mixing, though, the two methods produced highly similar, assortative age-mixing matrices. these aggregate mixing matrices, when used in simulation, resulted in similar estimates of risk of infection by age. while proximity sensors and survey methods may not be interchangeable for capturing individual contacts, they can generate highly correlated data on age-specific mixing patterns relevant to the dynamics of respiratory virus transmission. role schoolchildren play in facilitating transmission ( ) ( ) ( ) ( ) . schoolchildren generally display highly assortative mixing by age (i.e., they preferentially interact with children of the same age) and high contact rates with adults and the elderly (their parents and grandparents) which may facilitate transmission among schoolchildren and within their surrounding communities ( , , , , , ) . many public health interventions, including school closures and vaccination campaigns, focus on the role of schoolchildren in the spread of respiratory infections ( , ) . one challenge in drawing links between patterns of social contacts and respiratory disease transmission is the difficulty in empirically measuring patterns of proximal social interaction. social contacts that can lead to transmission of pathogens can potentially be transient, non-synchronous (i.e., through contamination of the environment), and of varying intensity ( , ). multiple methods have been used to measure social contact, the relative disadvantages and advantages of which have been described elsewhere ( ). 
the majority have used interviews or surveys to collect data on self-reported contacts, raising the possibility contacts was less skewed, but the presence of several high-degree nodes (individuals with many contacts) became increasingly apparent as the minimum number of cumulative contacts (an approximation of contact duration) required to be considered a unique contact was increased. the average duration of a survey-reported contact was . minutes, compared to just . minutes for sensor-recorded contacts. there was marked similarity between the distribution of survey-reported in-school contacts (fig. c) and unique sensor-recorded contact events with at least cumulative contacts (fig. g ), but the association at an individual level was unclear. in multivariate regression analysis adjusted for participant age, sex, and survey design, sensor- recorded and survey-reported contacts rarely served as significant predictors of one another (fig. , supp. fig. ). increasing the cumulative contact threshold for sensor contacts did not improve these associations. generally, the number of survey-reported contacts increased with age. duration of survey-reported contacts increased with age as well, but the effect size was reduced compared to the number of contacts. survey type or method of administration was not associated with number or duration of recorded contacts. male students were less likely than female students to report contacts and reported shorter contacts on average in contact surveys. results using multiple thresholds of cumulative sensor contact are shown in the supplement (supp. fig. ). we found significant associations between sensor outcomes and number of survey-recorded contacts; however, the effect size was small relative to other factors (e.g., age). age-specific mixing patterns age-specific contact patterns derived from both data collection methods showed highly proportionate mixing assumptions (fig. a) . 
there was also a striking consistency between pairwise survey- and sensor-recorded contact ratios as a function of the difference in grade. the average departure from proportionate mixing expectations for participants in the same grade was . , compared to just . for participants one grade apart and . for participants two or more grades apart (fig. e). assortativity of age-specific matrices based on contact surveys and sensor data ranged from q = . to q = . (fig. ). the range was partially due to the structure of the participating schools; in this study, there were no schools with both high school and non-high school students. however, even within each school, mixing patterns showed high degrees of assortative mixing (e.g., in-school contact survey-based matrices range from q = . to q = . , supp. fig. ). the effect of school structure on mixing patterns was most apparent in matrices based on unique sensor contacts, which revealed three elementary grade clusters (k- , - , - ) within which there was strong assortative mixing (fig. b). high school students (grades to ) represented a well-mixed, modular cluster (q = . and . for hs and hs , supp. fig. ).
transmission models
when used in age-specific simulation, sensor- and survey-based mixing matrices produced similar attack rates when adjusted by proportionate mixing expectations (fig. ). increasing the contact threshold resulted in more heterogeneity relative to the proportionate mixing baseline. there was discordance between the sensor- and survey-based predicted attack rates in particular schools, which increased with cumulative sensor contact threshold and disjuncture in contact matrices. however, in other schools, there was a marked degree of similarity between attack rates regardless of contact matrix employed.
in simulations based on unadjusted contact rate matrices, predicted attack rates were lower in younger children when using survey-based matrices, a reflection of the different reporting rates by age and specific demography of each school (supp. fig. ). we explored multiple parameters in our transmission model, assuming reproductive numbers of . , and . we found little qualitative difference between these simulations (supp. fig. ).
the utility of social contact data to the study of infectious diseases has been limited in part by questions of how to best measure social interactions relevant to transmission. in this project, we found that, while the two commonly used methods captured different information at the individual level, they gave similar results in several aggregate patterns of contact that are thought to be relevant to pathogen transmission, namely, patterns of age-specific mixing and probability distributions of the total number of contacts. as in other work, we found evidence for strong assortativity of contacts by grade ( ). this work has important implications for the empirical parameterization of mathematical models of transmission, particularly of respiratory pathogens. this work suggests that either empirical approach could be used to characterize
cc-by-nc-nd . international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted july , . . https://doi.org/ . / . . . doi: medrxiv preprint
found poor individual-level concordance between the two methods: anywhere from % to % patterns across age ranges using two different methods ( , ). we also observed substantial absolute differences in the number and type of contacts recorded by self-reported contact surveys and proximity sensors. we found either metric was a poor predictor of the other, even when adjusting for age, sex, and study factors.
however, we found stronger individual-level correspondence between the measures when we restricted sensor data to contacts with longer cumulative duration (true for -minute and -minute minimum thresholds), consistent with earlier work which found longer contacts were more likely to be reported in surveys ( , , ). in practice, the two methods are designed to capture different social interactions. per the study protocol, survey-recorded contacts should only have included those with interactions that involved talking, playing, or touching, while sensors recorded all other sensors within proximity regardless of whether participants were socially interacting. that the correspondence increased when limiting sensor information to proximal contacts with longer duration suggests that these were more likely to be contacts which include social interactions. it is unclear which type of contact (proximal or social interaction) is most relevant for the spread of respiratory pathogens. to determine whether contact patterns measured using different empirical approaches lead to different transmission dynamics, we simulated transmission using models parameterized with data from the two empirical techniques. in simulations using mixing matrices adjusted by proportionate mixing expectations, similar age-specific infection patterns were found using sensor and survey data. previous work has similarly found that, while simulations using unadjusted contact data from surveys and proximity sensors differ, appropriate adjustment to survey data which capture key structural elements of the contact network (e.g., age assortativity) leads to consistent simulation results using both kinds of contact data ( ).
here, differences in attack rates appear to be driven by increasing disjuncture between grades and age assortativity in certain mixing matrices. importantly, the metric we used to compare age-specific contact patterns from survey- and sensor-recorded data did not account for absolute differences in the overall contact rates of children in each grade. in simulation, the β estimation procedure (see supplementary methods) scaled the overall rate of contact between age-specific contact matrices, but did not account for
our study has some important limitations. though we adjusted for the demographics of the specific schools and deployments that we conducted, our results may not be generalizable to other settings. the physical and architectural environment of our schools, the density of sensors that we were able to deploy in our schools, and the specific days that we deployed our study may all have affected our results. technical issues, though not common, did occur with the sensors, resulting in lost data for some sensors. similarly, recall bias and misclassification by participants when completing contact surveys may have obscured the relationship between our two methodological measurements. we found that the design and administration of contact surveys led to some censoring in the number of contacts reported (fig. ). nonetheless, we believe that the relationships we found were robust to the misclassifications and biases that may be generated by these sources. previous work has indicated that risk of infection with influenza is more closely linked to the average mixing patterns of an individual's age group, rather than the individual's contact behaviour ( ).
we found that two common methods of collecting social contact data, self-reported surveys and proximity sensors, recorded qualitatively and quantitatively different individual social mixing behaviour but could still generate similar aggregate age-specific social contact patterns. the collection of high-quality social contact data through either method has important implications for surveillance, prediction, and prevention of respiratory virus transmission. our finding that the two methods showed some commonality in aggregate age-specific social contact patterns suggests that these phenomena are not an artefact of either specific empirical method but attributes of these study populations.
study description
enrolment in the social mixing and respiratory transmission (smart) study operated on an opt-out basis, and all students registered in a participating school before the start of the study were eligible to participate. students in kindergarten (typically aged years) to th grade (typically aged years) from two elementary (k to th grade, k to th grade), two middle ( th to th grade, th to th grade), two elementary-middle (k to th grade), and two high (both th to th grade) schools were eligible to participate in smart. participation rates were high in all schools ( to %). each school provided aggregate demographic information about the school population, and individual grade and sex of participating students.
proximity sensor deployments
the details of proximity sensor deployments have been described elsewhere ( ).
in brief, participating students were given proximity sensors in plastic pouches and instructed to wear the pouch around their neck for the duration of the school day without removing or otherwise tampering with the sensor. in six of the eight schools, all participating students were given a sensor; in two schools, the large student population limited the deployment to randomly selected classrooms in each grade. deployments typically lasted from the first class period ( : - : ) to the last class period ( : - : ). deployment days in each school were chosen to be representative of a typical school day, without any special schoolwide or grade-specific activities that could modify normal contact patterns. we used telosb wireless sensors ( ) programmed in the nesc language to send beacons every seconds (beacon frequency per minute). the receiving sensor recorded the contacting sensor's identity, an internal time stamp, and a received signal strength indicator (rssi). signal strength provided an estimate of physical proximity, but was highly dependent on the orientation of the two sensors and any obstructions between them and therefore could not be used to define an exact distance between contacts. based on pilot studies and previous work on effective distances of respiratory virus transmission ( , ), we chose a signal threshold (- dbm) that should correspond to contacts of relevance to respiratory disease transmission. the number of unique proximity sensor contacts recorded for a participant was defined as the total number of other participants with whom their proximity sensor recorded at least one interaction during each deployment. to explore patterns of contacts of varying length, we
participants were asked to report information about any individual they talked with, played with, or touched the previous day, including the contact's age and sex, whether they attended the same school as the participant, the context in which the contact was made, whether the contact
we defined total survey contacts as the total number of individuals a student reported having interacted with on the day before the survey was completed. detailed contacts were the subset of total contacts for which the student reported contact age, sex, duration, and context. we considered further subsets of detailed survey contacts, including those occurring within school, those reported to have lasted more than minutes over the course of the day, and those occurring on the same day as a sensor deployment. briefly, each sensor interaction was assumed to represent an independent contact of between to seconds; the total interactions between a pair of participants were summed to compute the total duration of contact in one deployment. participants were asked to record the approximate durations of survey-reported contacts. we used negative binomial regression to investigate which factors were associated with the number of reported contacts for each student who participated in a sensor deployment and completed at least one contact survey. each model included participant grade, gender, and a random intercept term for day of survey completion or sensor deployment. survey administration and sensor deployment days were unique to each school. terms for the type and method of survey administration were added to models of survey-recorded outcomes.
age-specific mixing matrices
we estimated two metrics of age-specific contact patterns: an average per-capita mixing rate, and the age-specific mixing ratio of observed contact rates to those expected under the assumption of proportionate mixing.
where xj is the number of individuals in grade j with whom participants in grade i could record a contact, and all other terms are as defined above. values greater than indicate more contacts were recorded by participants in grade i with individuals of grade j than would be expected under proportionate mixing. proportionate mixing assumes that an individual in grade i mixing at random will contact individuals in grade j with a probability equal to the proportion of the population in grade j, but no assumption is made on the probability of individuals in grade i making any contact relative to other groups. by design, rj, the participant population, is equal to xj, the contact population, in sensor deployments. for within-school contacts, we used the demographic information of all registered students in each school to define the potential contact population. combined k- matrices were generated by averaging age-specific matrices from all participating schools, weighted by the number of participants in each school. confidence intervals were calculated using , resampled bootstrap replicates of contact events. mantel correlation coefficients were used to compare mixing matrices.
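as a minimal sketch of the two metrics described above, the python snippet below computes a per-capita mixing rate and the ratio of observed contacts to those expected under proportionate mixing. it is illustrative only and is not the authors' code: the function names, the input layout (contacts[i][j] holding the contacts recorded by grade-i participants with grade-j individuals), and the choice to take the expected count as each grade's row total multiplied by the population share of grade j are assumptions drawn from the verbal definition in the text. it also ignores the fact that a participant cannot contact themselves.

```python
def per_capita_rates(contacts, participants):
    """Contacts per participant (hypothetical helper).

    contacts[i][j]: contacts recorded by grade-i participants with grade-j individuals.
    participants[i]: number of participants in grade i (r_i in the text).
    """
    return [[c / participants[i] for c in row] for i, row in enumerate(contacts)]


def mixing_ratios(contacts, contact_population):
    """Observed / expected contacts under proportionate mixing (hypothetical helper).

    contact_population[j]: number of individuals in grade j who could be
    contacted (x_j in the text). The expected count for cell (i, j) keeps
    grade i's own total number of contacts and spreads it across grades in
    proportion to x_j, so no assumption is made about how active grade i is.
    """
    total_population = sum(contact_population)
    ratios = []
    for row in contacts:
        row_total = sum(row)
        ratio_row = []
        for observed, x_j in zip(row, contact_population):
            expected = row_total * x_j / total_population
            # Undefined when no contacts are expected for this cell.
            ratio_row.append(observed / expected if expected > 0 else float("nan"))
        ratios.append(ratio_row)
    return ratios


if __name__ == "__main__":
    # Toy data: two equally sized grades with mostly within-grade contacts.
    toy_contacts = [[30, 10], [10, 30]]
    print(per_capita_rates(toy_contacts, [10, 20]))
    print(mixing_ratios(toy_contacts, [50, 50]))
```

in the toy example, each grade records 40 contacts and each grade holds half the population, so 20 contacts are expected per cell; the ratios are 1.5 within grade and 0.5 between grades, the pattern expected under assortative mixing.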
the degree of assortative mixing, q, was calculated as the ratio of the first minor eigenvalue to the dominant eigenvalue ( ), where q ranges from - , representing completely disassortative mixing, to , completely assortative mixing.
proximity sensors and self-reported surveys were likely to record contacts with different transmission potential, we fitted β for each set of parameters, including the age-specific mixing (ep/n / ).
the findings and conclusions in this report are those of the authors and do not necessarily represent the official position of cdc.
additional information
we declare no competing interests.
modes of transmission: a critical review. j. infect. , - ( plos one , e ( ).
figure . distribution of the number of contact events recorded in a us school setting by self-reported contact surveys and proximity-detecting sensors: (a) total survey-reported contacts; (b) detailed survey-reported contacts; (c) survey-reported in-school contacts; (e) all unique contacts recorded by sensors; (f) all unique contacts with more than cumulative contacts (roughly minutes of interaction); and (g) all unique contacts with more than cumulative contacts (roughly minutes of interaction). insets in (e)-(g) show the plot of in-school survey contacts versus each metric of sensor-recorded contacts with a cubic smoothing spline. (d) shows the population distribution by grade of participants who completed at least one contact survey or participated in a sensor deployment, compared to the population distribution of the pittsburgh standard metropolitan statistical area (psmsa) for .
figure . factors associated with the number and duration of survey-reported in-school contacts in a us school setting. all models include a random intercept for day of survey completion.
figure . age-specific mixing matrices generated from in-school survey contacts and unique sensor-recorded contacts in a us school setting at various cumulative contact thresholds. matrices are presented as log- ratio of observed contacts relative to expectation under proportionate mixing assumptions for survey-reported in-school contacts (a) and sensor-recorded unique contacts with thresholds of (b), (c), and (d) cumulative contacts. blue colours indicate more contacts than expected under proportionate mixing assumptions, and red colours indicate less mixing than expected. bolded ratio values deviate significantly from the null expectation, ɑ= . , and q equals the degree of assortative mixing. scatterplots (f-h) show the corresponding i,j values of the survey- and sensor-based mixing matrices at each threshold ( , , ).
(e) shows the average departure from proportionate mixing as a function of the difference in grade for each matrix.
figure . grade-specific final predicted attack rates of a respiratory virus in a us school setting, based on stochastic simulation using mixing matrices of in-school survey contacts and unique sensor-recorded contacts at various contact thresholds, adjusted by proportionate mixing expectations, within each school (elem, elementary; ms, middle school; hs, high school).
inactivation of influenza a viruses in the environment and using data on social contacts to estimate age- school opening dates predict pandemic influenza a(h n ) outbreaks in the united states spatial transmission of pandemic influenza in the us estimating the impact of school closure on social mixing behaviour and telos: enabling ultra-low power wireless research in th international symposium on information processing in sensor networks how far droplets can move in indoor transmissibility of swine flu at fort dix,