key: cord-0164881-60cwy9x5
authors: Maccari, Leonardo; Cagno, Valeria
title: Do we need a Contact Tracing App?
date: 2020-05-20
journal: nan
DOI: nan
sha: 6977c17ace421d1cc950106e9aa2474e2ebc6ed2
doc_id: 164881
cord_uid: 60cwy9x5

The goal of this paper is to shed some light on the usefulness of a contact tracing smartphone app for the containment of the COVID-19 pandemic. We review the basics of contact tracing during the spread of a virus, we contextualize the numbers to the case of COVID-19 and we analyse the state of the art for proximity detection using Bluetooth Low Energy. Our contribution is to assess whether there is scientific evidence that a contact tracing app, using present technologies, could be effective in slowing down the spread of the virus. Our conclusion is that such evidence is lacking, and that we should re-think the introduction of such a privacy-invasive measure.

During the recent pandemic of the coronavirus disease 2019 (COVID-19), national governments imposed various flavors of "lockdown", confining people to their houses and preventing many of them from working, with the goal of slowing down the pandemic and making it more manageable for national health systems. Such measures are unprecedented: they limit personal freedom and strongly impact national economies, and they should be maintained only for the shortest time required to stop the emergency. Once the emergency is tamed, governments introduce a so-called "phase 2", which imposes milder limitations and lets economies progressively recover. Yet, since this is the first time in modern history that our society faces such a threat, we do not have a clear path to follow for phase 2. On the one hand there is a widespread interest in removing personal limitations; on the other hand, the risk of provoking a second wave of the virus will be present until a vaccine or a cure is found and made available to everybody.

One of the measures that has been considered by many governments, and implemented by some, is the introduction of a mobile phone application that performs contact tracing, i.e., it provides a list of the contacts that took place between pairs of people. The app enables a person who tested positive for Sars-CoV2 (the aetiological agent of COVID-19) to send a warning to all those he/she was in contact with in a certain time span. Those who receive the warning will presumably act accordingly (isolating themselves, or asking to be tested), and this is expected to help keep the diffusion of the virus under control.

In a few weeks the interest in this application sky-rocketed: Google and Apple produced a dedicated API to support contact tracing, the European Union provided guidelines on the privacy implications of such applications, and a consortium of companies started to elaborate proposals. The public debate rightly focused on the privacy aspects, which are of paramount importance because this is, in essence, the first ever mass-scale contact tracing action promoted by democratic states. The privacy fallout caused by the extensive use of such an app may be extremely dangerous and is hard to estimate and forecast [1]; if our society takes such a high risk, we must expect a greater collective payoff. Unfortunately, little attention has been given to the question of whether this application can actually be effective in its goal, that is, limiting the diffusion of the virus. In this paper we take a first-principles approach and review the research literature to understand whether the technical means that we have today can correctly perform contact tracing as needed to stop the pandemic. If they cannot, then the whole debate about the privacy implications may simply lose its ground.

In our analysis we were not able to find any strong evidence that a privacy-preserving app based on current technologies can achieve the stated goal, which should lead to rethinking from scratch the adoption of a contact tracing application. We observe that with current technologies a similar app could be used to perform a mild risk assessment, but that this approach needs more technical discussion and social acceptance.

In the rest of the paper we discuss what is needed to slow down COVID-19 (section 2), and which privacy and technical constraints we have to respect for the app to be effective in a short time (sections 3 and 4). We then review the experiments carried out in the literature in similar realistic conditions (section 5). All these elements constitute the basis for a grounded discussion on the use of a contact tracing app (sections 6 and 7).

The reproduction number R expresses the expected number of people infected by a single individual at a certain stage of the epidemic. It is a key parameter to estimate the speed of the contagion and depends on a number of factors, including the way the virus is transmitted but also the containment measures that are enforced. The higher the value of R, the faster the virus will spread. The goal of containment measures is to lower R below 1, so that the number of new infections decreases with time. The basic reproduction number R0 is the value of R at the beginning of the contagion, when the whole population is susceptible to infection and no containment measures are in place; thus, it generally holds that R ≤ R0.
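As a minimal illustration of why the threshold R = 1 matters, the sketch below assumes a deliberately simplified generation-based model in which R stays constant and every generation of cases is R times the previous one. The function name and the sample values are ours and are not taken from the cited studies.

```python
def expected_cases_per_generation(initial_cases, r, generations):
    """Expected new cases in each generation for a constant reproduction number r.

    Simplifying assumption (ours): generation k has initial_cases * r**k cases.
    """
    return [initial_cases * r**k for k in range(generations + 1)]


if __name__ == "__main__":
    # Illustrative values only: growth (r > 1), stability (r = 1), decay (r < 1).
    for r in (2.5, 1.0, 0.8):
        cases = expected_cases_per_generation(100, r, 5)
        print(r, [round(c, 1) for c in cases])
```

Under this toy model the number of new cases grows geometrically when R is above 1 and shrinks geometrically when it is below 1, which is why the measures discussed here aim at that threshold.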

The transmission of Sars-CoV2 primarily happens through respiratory droplets [2, 3]; therefore, reducing the number of infection-producing contacts of an infected person is a key factor to slow the contagion. This is obtained with social distancing, hygiene measures and isolation of infected people. Intuitively, the earlier an infected person is isolated, the lower the chances of having contacts and the lower R will be overall. The rationale for using a contact tracing app is straightforward: when a person tests positive, we need to quickly identify those who were infected by this person so that we can isolate them as well. For COVID-19 there is evidence that transmission can also occur from asymptomatic and pre-symptomatic people [4], which makes it even more urgent to quickly isolate the close contacts of a person who tested positive.

There are estimates of the value of R0 based on data from the initial phases of the outbreak, when no containment measures were in place and the whole population was susceptible. One of the first attempts to estimate R0 was performed by observing the passengers of the Diamond Princess cruise ship [5]; it reports a median value of R0 = 2.28 with a distribution that never exceeds 3. A similar value (between 2 and 3) is confirmed by the analysis of the initial phases of the outbreak in China [6]. When containment measures are put in place R is expected to decrease; for instance, at the time of writing the value of R in Germany is estimated to be lower than 1 [7], as it was in Italy in the period between April 19th and May 7th [8].

Of course the average or median estimated value of R may not be representative of some extreme cases: the presence of super-spreaders has been reported for COVID-19 [9], with individuals possibly infecting up to 10 other people in a single event. This kind of event is likely at the early stage of the epidemic, but it can be controlled with containment measures that have a limited impact on personal freedom: maintaining interpersonal distance, using face masks and hygiene measures, and forbidding large in-person meetings. In phase 2 these measures must already be in place, so the overall effect of super-spreaders should become marginal.

Another element that is relevant for our analysis is the so-called secondary attack rate, that is, the percentage of people who are actually infected among the contacts of a person who tested positive for Sars-CoV2. The secondary attack rate can be further specified by defining it for different contact types (family, workplace, etc.). In the early stages of the outbreak of the virus in China (Jan 14th - Feb 15th, 2020) a group of people who tested positive was observed together with 1286 of their contacts [10]. Contacts were identified as "those who lived in the same apartment, shared a meal, travelled, or socially interacted with an index case"; the study adds that "contacts (eg, other clinic patients) and some close contacts (eg, nurses) who wore a mask during exposure were not included in this group". Among the contacts, the large majority of those who tested positive (77 out of a total of 81 positives) were household members of the infected person. The secondary attack rate restricted to households was estimated to be 11.2%.

A similar statistic for the advanced phases of the contagion comes from the Istituto Superiore di Sanità (ISS, the main center for research, control and technical-scientific advice on public health in Italy), which provides weekly updates on the evolution of the Italian situation. The report covering the month April 7th - May 7th [8] includes the distribution of the places of contact for more than 9,360 cases for which this information is available 1 . Table 1 summarises the data and shows several issues; the first is that in Italy 58% of the new infections in that period happened in retirement homes. This is due to the specific way the emergency has been managed in Italy, which made retirement homes a focus of infection. Although this also happened in other nations, it cannot be fully generalized. For this reason, in the last two columns of the table we report the distribution of cases excluding the first row. We can see that about 87% of the cases (70% excluding retirement homes) happen in extremely predictable locations: 18% (44%) in the family, 8% (21%) in hospitals, and 2.4% (5.9%) in workplaces. Only 10% (25%) of the cases happen somewhere else.

The previous sections exposed data that are generally overlooked when discussing contact tracing applications: i) every infected person infects on average between 2 and 3 people in the early stage of the epidemic, a number that decreases when containment measures are applied during the outbreak; ii) the large majority of the infections involve extremely predictable groups of people, family members and workmates, who can be identified with very basic means, i.e. with manual contact tracing among household members and workmates. If we consider a pessimistic value R = 2 for phase 2, and we assume that 30% of the infected people cannot be traced with manual contact tracing, then each infected person generates on average 2 × 0.3 = 0.6 infections that escape manual tracing, that is, slightly less than 1 non-identified infected contact for every 3 newly infected people. The contact tracing application, to be useful, should be able to identify this person. This does not automatically rule out digital contact tracing, as we know that fast tracing can be key to slowing down the contagion in its initial phase or in a second upsurge of the epidemic. Yet, the data show that we need fine-grained contact identification, which in turn raises the expectations on the required performance of digital contact tracing.
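The back-of-the-envelope estimate above can be reproduced with a few lines of code. This is only one possible reading of the estimate; the function name is ours, and the inputs (R = 2, 30% of infections missed by manual tracing) are the values assumed in the text.

```python
def untraced_infections(r, untraced_fraction):
    """Return two views of the infections that manual contact tracing misses.

    per_index_case: expected missed infections generated by one infected person.
    per_three_new_infections: expected missed infections among 3 newly infected people.
    """
    per_index_case = r * untraced_fraction
    per_three_new_infections = 3 * untraced_fraction
    return per_index_case, per_three_new_infections


if __name__ == "__main__":
    per_case, per_three = untraced_infections(r=2, untraced_fraction=0.3)
    print(f"missed infections per infected person: {per_case:.2f}")
    print(f"missed infections per 3 new infections: {per_three:.2f}")
```

The second figure is slightly below 1, so under these assumptions the app would be looking for roughly one person in every three new infections.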

Contact tracing refers to identifying "close contacts", a term that is specifically defined for each virus. For COVID-19 the World Health Organization defines a "close contact" as "Any person who had contact (within 1 metre) with a confirmed case during their symptomatic period, including 4 days before symptom onset" 2 . This definition leaves room for interpretation, because the duration of the period is uncertain and it does not take into consideration the prevention measures that could be adopted (such as face masks). Several national health systems provide a more specific version; the ones we were able to review generally define a close contact as being at short distance (less than 2 meters) from an infected person for more than 15 minutes (see Appendix A.1).

Mobile phones cannot be used for contact tracing as such, but they can be used for "proximity detection", which means, in a nutshell, identifying pairs of devices that have been in communication range of each other for a period of time.

2 See https://www.who.int/publications-detail/the-first-few-x-(ffx)-cases-and-contact-investigation-protocol-for-2019-novel-corona

The discussion on contact tracing apps starts from the assumption that proximity detection is a good approximation of contact tracing. One of the works that is often mentioned in support of a contact tracing application is from Ferretti et al. [11]. Ferretti et al. model the outbreak of COVID-19 using state-of-the-art data on its diffusion and show that, considering the specific characteristics of COVID-19, fast contact tracing can help lower the value of R below 1. The authors do not enter into the details of how such an application may work; they just treat it as a factor that can speed up the detection of newly infected people. In this sense, it is quite straightforward that any factor that makes detection faster will reduce R, and the paper provides a quantitative analysis of this reduction using the efficacy of contact tracing as one of the parameters of the model. They show that with realistic timing (one day passing from detection of infection to isolation and quarantine of contacts, and 100% success in isolating detected cases), achieving R < 1 requires a contact tracing efficacy between 50% and 60%. Note that this number cumulatively includes all contact tracing actions, of which a mobile phone application is only one.

On this basis, several governments started to express interest in the adoption of such an app, which of course initiated a debate on its privacy implications. To set the stage for such apps the European Union defined guidelines for privacy-preserving contact tracing apps. These guidelines must be respected for apps to be acceptable from a privacy point of view. At the same time, Apple and Google provided the design of an API to perform proximity detection on their operating systems. Since we cannot derogate from the privacy guidelines, the open question is: do we have enough evidence that the proposed proximity detection solution can achieve the necessary accuracy in contact tracing while respecting the privacy guidelines? To answer this question we need to review some of the technical guidelines provided by the EU.

Contact tracing is a privacy-intrusive operation. The people that we meet every day and the duration of the encounters are extremely sensitive information. As soon as the debate on contact tracing applications emerged, the European Union published guidelines on the use of contact tracing in the context of the COVID-19 pandemic [12] . These guidelines should be interpreted as technical functional requirements to be considered by the implementer of the contact tracing app.

Among the many guidelines, some of the points that are relevant for our analysis can be summarized as follows (Appendix A.2 reports excerpts from the mentioned document that expand on the bullet points):

• Contact tracing must be based on voluntary adoption, there should be no consequences for those that opt out

• Phone location should not be used, only proximity data should be used

• It should not be possible to track back the identity of a person using the data from the app. This is a crucial point: when Alice receives the notification that she was in contact with an infected person, she should not be able to tell whether this person is Bob 3 .

• Information should not exit the user phone if not absolutely necessary.

These guidelines go in the direction of retaining the least information possible, and maintaining it in the user device.

Another extremely important guideline concerns false positives. People who are diagnosed with COVID-19 will be subject to isolation, and thus decisions cannot be taken by an automated mechanism [1] . Beyond personal consequences, we add that false positives may have two side effects that can make the app useless, or even harmful. If the rate of false positives is too high, people who receive alerts will simply start to ignore them, thus defeating the goal of the app itself. If instead people take the alerts seriously and receive many false positives, the health system will need a huge testing capacity, a medical staff large enough to perform the tests, and efficient logistics to avoid the risk of transmission in the hospital setting. All these resources are scarce during the upsurge of an epidemic and cannot be wasted due to the inaccuracy of contact tracing.

The sensitivity to false positives calls for a fine-grained alert system. Users should receive alerts only when there is a concrete risk of being infected, which requires a non-trivial analysis of the data. This is at odds with the guidelines, which instead recommend limiting the collection of information and avoiding its concentration; such concentration is normally a prerequisite for performing inference on large and complex data sets, as we will see in the works described in section 5.
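To see why the false positive rate weighs so heavily, the sketch below applies Bayes' rule to a purely hypothetical alert system. The function name and all numeric values are our own illustrative assumptions; they are not taken from the guidelines or from any measurement.

```python
def alert_precision(sensitivity, false_positive_rate, close_contact_share):
    """Fraction of alerts that correspond to a real close contact (Bayes' rule).

    sensitivity: probability that a real close contact triggers an alert.
    false_positive_rate: probability that a non-close encounter triggers an alert.
    close_contact_share: share of detected encounters that are real close contacts.
    """
    true_alerts = sensitivity * close_contact_share
    false_alerts = false_positive_rate * (1.0 - close_contact_share)
    total_alerts = true_alerts + false_alerts
    return true_alerts / total_alerts if total_alerts > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical numbers: 80% sensitivity, 10% false positive rate,
    # and 5% of detected encounters being real close contacts.
    precision = alert_precision(0.8, 0.1, 0.05)
    print(f"share of alerts that are real close contacts: {precision:.1%}")
```

With these made-up inputs fewer than one alert in three would point to a real close contact, which illustrates how quickly a modest false positive rate can dominate when real close contacts are rare among detected encounters.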

Bluetooth Low Energy (BLE) is a natural candidate to perform proximity detection between pairs of mobile phones. It is a well-established technology, introduced in 2010 and currently part of the core Bluetooth specification 4 . There are no publicly available statistics on the market uptake of BLE in active devices at the time of writing. According to Katevas et al. [13] BLE is present in almost all existing iPhones, and the Bluetooth SIG estimates that in 2024 100% of new devices will be equipped with BLE 5 . Support for BLE has been available in Android since version 4.3 (2013) and in iOS since version 5 (2011), so it is reasonable to assume that a very large portion of the mobile phones on the market support BLE.

Proximity sensing can be performed with BLE with a simple mechanism. Each BLE-equipped device can be in one of two states, broadcaster or observer. The broadcaster sends a broadcast beacon message on three default channels every "Advertising Interval"; the observer instead wakes up every "Scan Interval" and listens for beacons for a "Scan Window" time. When the observer receives a beacon it estimates the distance from the broadcaster using the Received Signal Strength (RSS). Since descriptions of BLE can be easily found in the literature [14] , we focus only on the description of the solution that Google and Apple provided to perform contact tracing using BLE.

In April 2020 Apple and Google released a joint document with the technical specifications of a Privacy-Preserving Contact Tracing API, based on BLE and supported by their operating systems 6 (PPCT, for short). PPCT offers a tradeoff between energy consumption, user privacy, and efficacy, and it is supposed to become the layer on which every other contact tracing application is based. Nothing prevents developers from using other technologies, but considering the large diversity of devices and OS versions in the market it is unlikely that any custom solution may reach the uptake needed to be effective.

In PPCT phones act both as broadcaster and as observer. The Advertising Interval is set between 200 and 270 ms (corresponding to approximately 4 Hz), while the Scan Interval is not specified precisely: it should "have sufficient coverage to discover nearby Exposure Notification Service advertisements within 5 minutes [. . . ] with minimum periodic sampling every 5 minutes". The Scan Window is not specified at all. PPCT defines for each device a Temporary Exposure Key (TEK), which is changed once per day. This key is kept in Alice's phone and never leaves it unless Alice gets infected. Every ten minutes this key is used together with a counter ranging from 0 to 144 to generate another key, the Rolling Proximity Identifier (RPI). The RPI is included in the beacon (together with some metadata that is not relevant for this discussion). Alice's phone is thus identified by the same RPI for 10 minutes, which gives Bob the chance of observing the beacon at least twice. Bob stores all the RPIs he receives in his device.

4 See https://www.bluetooth.com/blog/bluetooth-low-energy-it-starts

5 See https://www.bluetooth.com/bluetooth-resources/2020-bmu/.

6 Privacy-Preserving Contact Tracing, see https://www.apple.com/covid19/contacttracing/.

If Alice at some point becomes infected she uploads all the TEKs of the last 14 days (or of any other period decided by the application) to a Diagnosis Server (Eve), which is run by the service provider (not necessarily Apple or Google, but the developer of the app). Eve periodically aggregates the keys "from all users who have tested positive, and distributes them to all the user clients that are participating in exposure notification". Bob then periodically receives sets of TEKs coming from many people who tested positive for COVID-19, re-generates all the corresponding RPIs and checks whether some of them are present in his own local storage. If some of them are present, he was possibly exposed to contagion.
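The exchange between Alice, Eve and Bob can be sketched in a few lines. This is a minimal illustration and not the PPCT implementation: the derivation below is a stand-in based on HMAC-SHA256 chosen so that the example runs with the standard library only, whereas the specification defines its own key derivation. All function names are hypothetical, and we assume 144 ten-minute intervals per daily key.

```python
import hashlib
import hmac
import os

INTERVALS_PER_DAY = 144  # ten-minute intervals in 24 hours


def new_tek():
    """Generate a random 16-byte daily key (stand-in for a TEK)."""
    return os.urandom(16)


def derive_rpi(tek, interval):
    """Derive a 16-byte rolling identifier from a daily key and an interval index.

    Stand-in construction (HMAC-SHA256 truncated to 16 bytes); the actual
    PPCT derivation is different and is not reproduced here.
    """
    message = b"RPI" + interval.to_bytes(2, "big")
    return hmac.new(tek, message, hashlib.sha256).digest()[:16]


def matching_intervals(published_teks, observed_rpis):
    """Return (tek_index, interval) pairs whose regenerated RPI Bob has stored."""
    observed = set(observed_rpis)
    return [
        (k, i)
        for k, tek in enumerate(published_teks)
        for i in range(INTERVALS_PER_DAY)
        if derive_rpi(tek, i) in observed
    ]


if __name__ == "__main__":
    alice_tek, stranger_tek = new_tek(), new_tek()
    # Bob heard Alice during intervals 50 and 51, and a stranger during interval 50.
    bob_storage = [derive_rpi(alice_tek, 50), derive_rpi(alice_tek, 51),
                   derive_rpi(stranger_tek, 50)]
    # Alice tests positive and her key is published; the stranger's key never is.
    print(matching_intervals([alice_tek], bob_storage))
```

Note that the matching step runs entirely on Bob's phone, and that its result tells Bob in which ten-minute intervals the exposure happened, not only that it happened.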

The rationale of PPCT seems to be the following:

• minimize the energy consumption. Since users are expected to constantly run this system, it should not severely impact battery use.

• minimize the amount of information exchanged between Alice and Bob. There is no unicast communication or packet handshake. This makes power consumption predictable, as it does not depend on the number of devices nearby.

• minimize the amount of information transmitted to Eve.

• provide Bob with enough information to estimate his exposure to infected people without revealing who these people were.

Note that Apple and Google do not take responsibility for deciding when to notify Bob: the task of passing from proximity detection to the definition of a close contact is completely delegated to the app.

PPCT seems to be aligned with the privacy guidelines from the EU. In this regard it is worth mentioning the notable work done by the DP-3T consortium, a group of international experts who are devising distributed, privacy-aware solutions for contact tracing. In a white paper they describe three solutions, two distributed ones and a centralized one, and analyze their privacy characteristics [15] . One of the distributed solutions is very close to the PPCT proposal, and their analysis basically confirms that PPCT can be used while minimizing the risk of privacy breaches.

A part of their analysis that is useful for this discussion concerns the possibility of de-anonymising infected people among close contacts, which is considered likely. In general, when Alice tests positive, if Bob and Alice spent time in proximity, Bob should receive enough information to know that there was a close contact with an infected person, but not enough to understand who this person was. In practice this is impossible to achieve with a decentralized solution. Bob will receive all the RPIs Alice generated, and he can associate with each RPI a short time interval in which he was in proximity of an infected person. If the contact lasted long enough, there are chances that Bob may infer that Alice was, for instance, the person who was sitting next to him in the office for the whole period. There is a clear tradeoff between having rolling identifiers that last long enough to allow a precise proximity detection and short enough not to uniquely identify a mobile phone for a long time interval.

It is interesting to note that a centralized solution may instead process all the contacts in the server and then notify Bob that he was close to an infected person for a certain amount of time, without Bob necessarily being able to track the contact intervals. On the other hand, a centralized solution introduces several other potential sources of loss of privacy, reviewed in the DP-3T white paper, which we do not cover.

In this section we review the works in the literature that deal with proximity detection using BLE on mobile phones. There is a large body of literature in this field which we tried to restrict to those works that provide insights on the applicability of BLE contact tracing for the COVID-19 use case. The necessary criteria are:

1. The work must present a real implementation.

2. The experimentation must involve COTS mobile devices, and not only custom devices.

3. Proximity detection should be performed without external aids if not for results validation. This excludes fixed BLE anchors or implicit constraints due to the set-up of the experiment (e.g., the experiment takes place in a single room only).

4. The proposal must be compatible with PPCT: it must use BLE, it should not require post-processing of data by a centralized entity and should not require handshakes between devices.

Unfortunately, none of the works in the literature satisfies all these criteria. We decided to list and comment on the ones that describe experiments that are as close as possible to the use case of contact tracing for COVID-19, in order to assess the main challenges for a real contact tracing app. We split contact tracing into two separate functions. The first is the estimation of the distance between two devices using the RSS of BLE beacons; we will see that this is already a non-trivial task. The second is to define when a "contact" is happening, which requires processing the data provided by the first function to estimate the reciprocity of the contact, its length and its duration.

Indoor positioning is itself an extremely popular research topic, and it can be achieved with a variety of techniques, such as triangulation or fingerprinting. Most of these techniques rely on the presence of static beacons and cannot be adopted for contact tracing (see Gu et al. for a recent review of this topic [16] ). PPCT uses RSS with BLE, and thus it is important to shed some light on its accuracy, even if only in controlled environments. Several papers [17, 18] analyse the accuracy of a BLE-based positioning system when using mobile phones as receivers and fixed, dedicated devices as broadcasters. The measurements show that the RSS measured on the smartphones is extremely noisy, for two reasons. The first is multipath fading; the second is that BLE broadcasts beacons on three different Bluetooth channels and the response of the smartphone radio differs across the three channels, which yields a very noisy figure when the levels are aggregated into a single value. When people walk, the measured RSS can drop by up to 30 dB [17] .
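To give a feeling for what a 30 dB drop means for distance estimation, the sketch below uses the textbook log-distance path-loss model. This model, the reference RSS at 1 metre (-60 dBm) and the path-loss exponent (2, i.e. free space) are our assumptions for illustration; the cited works do not necessarily use them, and real devices would need calibration.

```python
def estimate_distance(rss_dbm, rss_at_1m_dbm=-60.0, path_loss_exponent=2.0):
    """Estimate distance in metres with the log-distance path-loss model.

    Model: rss = rss_at_1m - 10 * n * log10(d)
    hence: d = 10 ** ((rss_at_1m - rss) / (10 * n))
    The default parameters are illustrative assumptions, not measured values.
    """
    return 10 ** ((rss_at_1m_dbm - rss_dbm) / (10.0 * path_loss_exponent))


if __name__ == "__main__":
    clean_rss = -66.0               # hypothetical reading close to 2 metres
    faded_rss = clean_rss - 30.0    # same position, with a 30 dB drop
    print(f"estimate without fading: {estimate_distance(clean_rss):.1f} m")
    print(f"estimate with 30 dB drop: {estimate_distance(faded_rss):.1f} m")
```

With a path-loss exponent of 2, a 30 dB drop multiplies the estimated distance by 10^(30/20), about 32 times, so a contact at about 2 metres would be read as being tens of metres away. This is why a single RSS sample is a poor proxy for distance.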

Katevas et al. [13] performed detailed experiments to estimate the accuracy of distance estimation with BLE in a very controlled environment, including an anechoic chamber. The results show that distance estimation on commercial devices (iPhone 5S and 6S) is very noisy, with an error that reaches 1.5 meters on an estimation of 3 meters. Similar measurements were also performed by Montanari [14] , and the interesting observation that the author himself makes is that the RSS values are completely different from the results of Katevas et al. This is not surprising, as different devices use different radio chips and antennas, but it underlines that even for "simple" distance estimation in a controlled environment (one single device transmitting, extremely low noise environment) a training phase is necessary for each specific receiver. Given the extremely large number of smartphone models in the market, it is unlikely that a single application could show the same performance on any mobile phone out of the box, and it is unclear how the ground truth for training may be provided to the app.

Montanari et al. [19, 14] use BLE to perform contact tracing in an office environment, with a set-up that is the most similar to a real world situation. The goal of the experiment was to track interactions among 25 co-workers in an office for 4 weeks; data were validated by human observers in the office and by stationary beacons. The ground truth consisted of 401 observed interactions, meaning two or more people standing at less than 3 meters from each other and having a conversation. On average the interactions lasted for 1 minute and 13 seconds; 70% of the interactions were shorter than 1 minute, while only 5% were longer than 5 minutes. The authors use a custom device in order to achieve a high precision in data collection, but they then re-sample the data in order to match the configuration that is achievable on commercial devices. Some of the results are encouraging, with realistic configurations that could achieve between 81% and 96% true positive detection rate.

Unfortunately the experimental set-up is far from the COVID-19 use case, as it breaks all the criteria we defined. The devices were smart watches and not smartphones, which makes a big difference because watches are always on people's wrists and cannot simply be left on a desk. With smart watches there are higher chances that the two devices are in line of sight, while a smartphone usually stays in a pocket or bag, and RSS strongly depends on shadowing and multipath fading. The choice of the parameters allowed a much more fine-grained sensing than PPCT, as Scan Interval and Scan Window were set below 5 seconds, orders of magnitude lower than in PPCT. The testing environment was a company office and only that office; participants were asked to remove their watches when leaving the office. Each contact was tracked using the RSS measured on both watches in order to mitigate the effects of multipath fading. To achieve this, all data were stored in a server and post-processed later. Post-processing used machine learning to identify contacts, with a supervised learning approach. All these issues make the set-up not at all comparable with the COVID-19 use case, and make it impossible to generalize the results.

Another work that is relevant for our analysis is from Girolami et al. [20] and investigates the possibility of using smartphones for contact tracing. In this case the experiment included students from a high school who were asked to perform certain interactions (such as standing or sitting in front of each other for 3 minutes) while their mobile phones were recording BLE messages. The reported accuracy of encounter detection reached almost 82%. Unfortunately, again the testing conditions were not comparable to the COVID-19 use case. Interactions were not spontaneous: the participants were asked to perform specific actions, and these actions were happening in certain places, not "in the wild". The whole data-set was collected and post-processed, and the reported accuracy was obtained with the best combination of tracking parameters and considering the beacons received on both the mobile phones involved in the contact. A key contribution to this discussion from Girolami's work is contained in this sentence: "Firstly, we investigate the possibility of using commercial smartphones to advertise and to collect BLE beacons demonstrating that, currently, such approach is not feasible due to the heterogeneous implementation of BLE firmware in different versions of mobile OS (both Android and iOS)." The authors initially tried to use the devices of the students but found out that, in the batch of Android devices owned by the participants (which comprised 15 different models), 42% were not usable for contact tracing. Even if the hardware and the software were supposed to be compatible with BLE, the device simply did not allow BLE to be used for active beaconing. In the end, the authors used mobile phones as observers but had to equip participants with a BLE watch acting as the broadcaster.
Even once the broadcaster was set to be a "standard" watch with fixed hardware and software features, and the receiver phone was kept in a pre-defined position (front pocket or back pocket of the participant's pants), the measured RSS changed substantially depending on the mobile phone receiving the data and on its position. Furthermore, the median share of lost beacons per session was larger than 50%, which suggests that in a real world scenario the loss of beacons is an extremely important factor, which can be mitigated only with a large Scan Window. Montanari et al. showed that enlarging the Scan Window is the highest source of power consumption and that the most aggressive device configurations were depleting the device battery below a threshold of acceptability (one day of use). The trade-off between battery consumption and accuracy is completely unexplored in a noisy use case, in which people may find themselves in crowded places with tens of other mobile phones in the range of a few meters (a bus, a shopping center, or even only the queue to get into a shopping center). In those cases the loss of packets due to noise and collisions could introduce false negatives, i.e., contacts that are not traced.
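The relation between beacon loss and Scan Window can be illustrated with an idealized calculation. The sketch assumes that beacons are lost independently of each other with a fixed probability and that the observer can hear every beacon sent during its Scan Window; both are simplifications of ours, and the default values (a 250 ms Advertising Interval and a 50% loss probability) are only indicative of the figures mentioned in the text.

```python
import math


def detection_probability(scan_window_s, advertising_interval_s=0.25,
                          loss_probability=0.5):
    """Probability of receiving at least one beacon during one Scan Window.

    Idealized assumptions (ours): beacons are lost independently with
    probability loss_probability, and every beacon sent during the window
    could in principle be heard by the observer.
    """
    beacons_in_window = math.floor(scan_window_s / advertising_interval_s)
    return 1.0 - loss_probability ** beacons_in_window


if __name__ == "__main__":
    for window in (0.1, 0.25, 0.5, 1.0, 2.0):
        p = detection_probability(window)
        print(f"scan window {window:4.2f} s -> detection probability {p:.3f}")
```

Even in this optimistic model a short Scan Window misses a large share of nearby devices, and increasing the window is exactly the knob that was reported to dominate power consumption.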

Katevas et al., in a recent paper [21], use iPhones to detect interactions between couples and groups of people. Twenty-two people were involved in a 45-minute experiment in which they were left in an open space and were free to socialize. The ground truth was obtained by post-processing video recordings. A total of 99 one-to-one encounters and 22 group encounters were detected. Again, the accuracy of encounter detection was satisfactory (about 89%) but the experimental set-up does not generalize to the COVID-19 use case, for reasons similar to those of the previous works. The experiment took place in a controlled environment, the participants had a dedicated beaconing device in one pocket and an iPhone in the other, and all data were stored and post-processed with machine learning techniques. Furthermore, the iPhone had access to data produced by other sensors: acceleration, gravity, rotation time.

It is important to note that as of today all the experiments involving iPhones used a second dedicated device playing the role of the broadcaster. This is due to the fact that iOS does not allow an app to act as a broadcaster if the app is not in the foreground. This was noted in the literature [13] and was brought to the attention of the media recently 7. This is a privacy-preserving feature that Apple introduced to prevent the exact goal of contact tracing apps: constantly tracking the user position. Even if it is likely that Apple could remove this limitation for the COVID-19 use case, there are currently no scientific works that estimate the efficacy of proximity detection using an iPhone as a broadcaster for prolonged periods.

7 See https://www.bloomberg.com/news/articles/2020-04-20/france-says-apple-s-bluetooth-policy-is-blocking-virus-tracker.

Palaghias et al. [22] present an accurate technique to perform proximity detection using smartphones only. The authors start from the observation that proximity detection using RSS only is not precise enough and elaborate a new strategy that needs 6 Bluetooth packets to correctly estimate user proximity. Results show an accuracy close to 82% in a realistic scenario, but again the technique cannot be generalized to the COVID-19 use case. Detection is done on-line by each phone (which is a peculiarity compared to other experiments, which perform off-line centralized detection) but still with a machine learning approach that requires training. Furthermore, proximity is estimated using an estimate of the direction the user is facing, based on a previously introduced technique [23] that requires access to various sensors on the phone. This technique also includes an interaction between the two phones using Bluetooth in ad hoc mode, and the experimental setup is limited to 8 people performing partly controlled operations in an indoor setting. It is also not clear what version of Bluetooth is used in the experiment, and thus what the effective power consumption of the proposed technique is.

The first important consideration that emerges from our analysis is that, although they go by the same name, the "contact tracing" that is needed to limit the spread of a virus is not what a mobile application can provide. According to international guidelines, a "close contact" occurs when two people stay at a distance of less than 1-2 meters without proper protection. A smartphone app can only estimate when two devices are in communication range, regardless of where their owners are and what lies in between (a thin wall, a pane of glass...), which is generally referred to as proximity detection. The second fundamental consideration is that improving contact tracing requires high precision. The reproductive number when containment measures are in place is of a few units, and the majority of the contacts are extremely predictable (family, workplace, hospital). Proximity detection should provide an estimate of, on average, less than one contact per infected person. The third important consideration is that a high rate of false positives could defeat the goal of the app itself (with people ignoring the messages they receive) or even be harmful (diverting precious resources to manage false positives). To be reasonably useful, proximity detection should provide only a very short list of people that had long-lasting contacts (more than 15 minutes) with Alice. Yet we know that the rolling identifiers change approximately every 10 minutes, and sensing can happen at most three times in this time span. After 10 minutes Alice's identifier changes and Bob's phone is not able to keep tracking Alice's phone. Thus, Bob's phone should make a precise estimate of the proximity with a very small number of samples (at most the samples collected in three Scan Windows). The literature analysis we performed shows that, at the time being, there is no scientific evidence that under these conditions a proximity detection application running on smartphones can provide such high precision.
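The sampling constraint described above can be illustrated with a minimal sketch that splits a contact into the pieces that Bob's phone observes under each of Alice's rolling identifiers. We assume, for simplicity, that identifiers rotate on a fixed grid of exactly 10 minutes (the text only says "approximately"); the function name and the example start times are hypothetical.

```python
def split_contact(start_min: int, duration_min: int, epoch_min: int = 10) -> list[int]:
    """Split a contact into per-identifier pieces (durations in minutes).

    Assumption (ours): the broadcaster changes identifier at every
    multiple of `epoch_min` minutes, on a fixed grid.
    """
    pieces = []
    t = start_min
    end = start_min + duration_min
    while t < end:
        # End of the identifier epoch that contains minute t.
        epoch_end = (t // epoch_min + 1) * epoch_min
        piece_end = min(epoch_end, end)
        pieces.append(piece_end - t)
        t = piece_end
    return pieces


if __name__ == "__main__":
    # A 15-minute contact starting exactly when the identifier rotates.
    print(split_contact(0, 15))   # [10, 5]
    # The same contact starting 7 minutes into an identifier epoch.
    print(split_contact(7, 15))   # [3, 10, 2]
```

In both cases a contact longer than 15 minutes appears to Bob's phone as two or three shorter sightings of unrelated identifiers, each of which has to be classified with the few samples collected while that identifier is in use.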

All the works we reviewed provide reasonable accuracy for proximity detection (between 80% and 90%) but use a setup that is far from being applicable to the COVID-19 use case, combining at least two of the following requirements:

• They require a centralized system to store all the raw data. Note that this database is not limited to detected contacts, but is needed in order to detect contacts and should contain the RSS for all received packets.

• They require training. Calibration of the system needs ground truth provided by the experimenters, and a controlled environment.

• They used dedicated devices. They were not able to perform the experiments with smartphones only and used custom devices that participants were wearing.

• They required direct communications between the two phones, or access to other sources of data.

This makes it impossible to forecast the accuracy of proximity detection when used "in the wild".

Note that the first two requirements basically call for a centralized system that collects every single received packet and performs contact tracing on the server. This is even more privacy-intrusive than what is normally described as a "centralized system" for contact tracing for COVID-19 [15], which generally refers to a system that collects all the proximity events after they were detected in the phones. According to the analyzed literature, instead, contact tracing performed by only one mobile phone is strongly affected by multipath fading and shadowing, and data from the two endpoints are needed to improve the detection. This creates a centralized system that collects all the received packets (tens per second!), owns enough information to de-anonymise the position of the users and their loose interactions, and in general goes well beyond the goals of contact tracing for COVID-19.

Let us call N the average number of close contacts that a person may have during the observation period. This depends on two factors: the length of the period and the kind of social interactions this person has. Consider a person that goes to work every day by public transportation (bus): it is reasonable that he/she will stay in the bus for more than 10 minutes, and repeat this routine twice a day. If a distance of 2 m is maintained between people in the bus, and we consider only the closest 2 persons to be in communication range, we may estimate 4 contacts per day (2 persons on each of the 2 daily trips). We set the observation period to three days: one day from the emergence of the symptoms to the positive test (a very optimistic estimate) plus the two preceding days (again, a very conservative choice). This yields N = 4 * 3 = 12. The numbers we introduced in section 2.2 tell us that we are looking, on average, for 1 person every 3 infected people. If the app has full penetration in the whole society and works perfectly, then for every 3 * 12 = 36 traced contacts there is only one person that is likely positive. That is, the task of contact tracing itself, under the assumption of a perfect application, produces 35 false positives for every true positive. This figure does not consider several other causes of contacts (people that work in contact with the general public, social gatherings, shopping malls, etc.), yet it already shows that using an app for contact tracing may produce an unbearable number of people that want to be tested, or that simply ignore the messages they receive.
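The arithmetic of this back-of-the-envelope estimate can be reproduced with the following minimal sketch. The function and parameter names are ours and purely illustrative; the default values are the ones used in the text (2 neighbours per bus trip, 2 trips per day, 3 days of observation, 1 person to be found every 3 infected people).

```python
def false_positives_per_true_positive(
    neighbours_per_trip: int = 2,
    trips_per_day: int = 2,
    observation_days: int = 3,
    infected_per_new_case: int = 3,
) -> tuple[int, int, int]:
    """Return (N, traced contacts per true positive, false positives).

    Assumes, as in the text, a perfect app with full penetration.
    """
    contacts_per_day = neighbours_per_trip * trips_per_day
    n = contacts_per_day * observation_days
    # One person to be found every `infected_per_new_case` infected
    # people, each of whom generates N traced contacts.
    traced = infected_per_new_case * n
    return n, traced, traced - 1


if __name__ == "__main__":
    n, traced, false_pos = false_positives_per_true_positive()
    print(f"N = {n}, traced contacts per true positive = {traced}, "
          f"false positives = {false_pos}")
```

Changing the defaults shows how quickly the ratio degrades when further sources of contacts are added to the daily routine.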

In order to estimate the potential efficacy of the application, we have to take into account the adoption rate. According to the Pew Research Center 8, in Italy in 2018 71% of the population owned a smartphone, while recent statistics 9 say that Android and iOS cover almost 100% of the market of new devices, with Android alone covering 86% of the market. We know that, at the time of writing, Apple iOS does not allow active beaconing for applications that run in the background, and one of the works observed that 42% of the Android smartphones did not allow beaconing even if the phone specifications theoretically allowed it [20]. Assuming that iOS will enable beaconing on all devices and that $s_a = 1 - 0.42 = 0.58$ is still a valid estimate of the fraction of BLE-equipped Android devices that support beaconing, the fraction of Italians that have software support for PPCT in their phone is given by $e = (0.14 + 0.86 \cdot s_a) \cdot 0.71 \approx 0.45$.

It is hard to estimate the fraction $a$ of phones that will be effectively able to use PPCT. There could be many reasons for a phone to be excluded: non BLE-enabled phones (recall that BLE is expected to reach all new phones on the market only in 2024), phones owned by people not willing to install the app, who failed to install the app, or who are not allowed to install the app (because their phone is provided by their employer). Then we should consider phones that are simply discharged, or impossible to use (carried in a bag that isolates BLE transmissions). We may use a prudent value $a = 0.5$, which produces a fraction of devices in condition to use PPCT corresponding to $E = e \cdot a = 0.225$.

These phones are in condition to receive beacons, similarly to the ones tested in the papers in the literature. Let us assume that the app never introduces false positives (an extremely strong assumption) and that it is able to detect $A = 81\%$ of the proximity events, the lower bound of the results reported by the literature in non-realistic conditions. Then the sensitivity of the contact tracing application (i.e., the number of contacts that are correctly traced divided by $N$) is given by $S = E \cdot A \simeq 0.18$. That is, the application will be able to provide less than one fifth of the possible contacts.
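The chain of estimates that leads to the sensitivity can be summarised in a short sketch. The constant and function names are ours; the values are the ones stated in the text. Note that the text rounds e to 0.45 before computing E, which gives E = 0.225, while carrying the unrounded value gives E of about 0.227; in both cases the sensitivity rounds to 0.18.

```python
ANDROID_SHARE = 0.86          # market share of Android
IOS_SHARE = 0.14              # remaining share, attributed to iOS
SMARTPHONE_OWNERSHIP = 0.71   # Italians owning a smartphone (2018)
ANDROID_NO_BEACONING = 0.42   # Android phones unable to beacon [20]
USABLE_FRACTION = 0.5         # prudent value chosen for a
DETECTION_RATE = 0.81         # A, lower bound from the literature


def sensitivity_estimate() -> tuple[float, float, float]:
    """Return (e, E, S) following the reasoning in the text."""
    s_a = 1.0 - ANDROID_NO_BEACONING
    # Fraction of the population with software support for PPCT,
    # assuming iOS enables beaconing on all devices.
    e = (IOS_SHARE + ANDROID_SHARE * s_a) * SMARTPHONE_OWNERSHIP
    # Fraction of devices effectively in condition to use PPCT.
    big_e = e * USABLE_FRACTION
    # Sensitivity, assuming no false positives at all.
    s = big_e * DETECTION_RATE
    return e, big_e, s


if __name__ == "__main__":
    e, big_e, s = sensitivity_estimate()
    print(f"e = {e:.2f}, E = {big_e:.3f}, S = {s:.2f}")
```

Keeping the parameters as named constants makes it easy to repeat the estimate for other countries or for more optimistic values of a.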

In the short term, with current technologies, we argue that the high number of potential false positives and the low sensitivity do not justify the introduction of a contact tracing application with an extremely high potential privacy risk. Yet, several apps are currently under development and will be released soon 10. In this section we build on the previous discussion to provide some pragmatic considerations on how to make the best use of digital contact tracing.

As a first point, we stress the importance of a rigorous monitoring of the ongoing efforts to produce contact tracing applications. It is essential that not only the source code of the apps, but also the results of their experimentation and of their daily use are made available for public scrutiny. Since we outlined important unsolved technical challenges, it is paramount that the way these challenges are addressed is made public, so that experts can evaluate the efficacy of the app and validate the approach. It is also fundamental that the results obtained with the app are constantly monitored during its use, to periodically assess its overall social utility. Should the app prove to be ineffective, there are still other ways we could reshape its goals to provide some social benefit.

The number of contacts with infected people and their cumulative duration, even if coming from a noisy source of information, may provide users with an estimate of their cumulative exposure to risk. This may be beneficial in convincing them to adopt protection measures, keep the safety distance or change some of their habits in order to see their risk profile improve. Such protection measures are a limitation of freedom, and certain categories of people that are less exposed to the risk may not be motivated enough to enforce them. Creating empathy towards others is key to motivating people to respect social distancing [24], and in this sense gamification is an approach that has proved useful in several domains, including health [25], and may be adopted also in this specific situation. Gamification means introducing game elements in non-gaming activities and can encourage people to reach goals quickly. Gamification goes beyond "points and badges" and can be used to create social awareness of the consequences of one's own actions towards the more vulnerable ones, which is an effective way of motivating people to respect containment measures [26]. The amount of information required for this task could be lower than what is required for contact tracing, thus lowering the privacy risk. For instance, the re-use of RPIs could make it harder to de-anonymize the identity of infected contacts, while still providing a reasonable measure of the exposure risk.

10 We just mention the Italian and Swiss ones, available at https://github.com/immuni-app and https://github.com/DP-3T/dp3t-app-ios-ch respectively.

One of the key challenges of analog contact tracing is that Alice needs to be interviewed by an expert that is able to identify the contacts that are at risk, based on the kind of exposure. This is a time-consuming task, and people cannot always remember all their contacts. On the other hand, we know that digital contact tracing cannot be used automatically, as it only identifies proximity. Merging the two forms of tracing would instead definitely improve contact identification. If the expert could access a list of contacts to be reviewed together with the infected person, the whole process would be made faster and more robust. Unfortunately this is not achievable with the current solution, due to the privacy requirements we (rightly) set on the application.

With more time and research efforts, it may be possible to achieve a satisfactory trade-off that follows this direction.

The risk assessment of a proximity tracing app, considering its privacy issues, is very hard to do, since the fallout of a data leakage can be simply impossible to predict with current information. For this reason, a first-principles approach would call for asking, before even proposing it, what the estimated benefit is of an application that introduces a potential privacy risk. We analyzed the available literature to try to answer this question, matching the data of the pandemic with the available data from experiments in proximity tracing, and our conclusion is that there is not enough evidence to support that such an app would help slow down the ongoing contagion.

A contact tracing app adopting the highest standards of privacy could indeed be useful to spread awareness and encourage changes in people's behavior, a goal that appears to be less daunting and more practical to achieve in the short term.

symptoms and the type of interaction (e.g., did the person cough directly into the face of the individual) remain important.

According to the Australian guidelines a close contact is a "face-to-face contact in any setting with a confirmed or probable case, for greater than 15 minutes cumulative over the course of a week [. . . ] sharing of a closed space with a confirmed or probable case for a prolonged period (e.g. more than 2 hours) in the period extending from 48 hours before onset of symptoms in the confirmed or probable case" 12 .

For the Irish health institution a close contact is defined as "Any individual who has had greater than 15 minutes face-to-face (<2 meters distance) contact with a case, in any setting." 13

The state of Alberta (CA) provides the following definition: "individuals that lived with or otherwise had close prolonged contact (i.e., for more than 15 min and within two metres) with a case without consistent and appropriate use of PPE [Personal Protection Equipment] and not isolating" 14.

The aforementioned EU Privacy Guidelines [12] contain the following provisions:

The systematic and large scale monitoring of location and/or contacts between natural persons is a grave intrusion into their privacy. It can only be legitimised by relying on a voluntary adoption by the users for each of the respective purposes. This would imply, in particular, that individuals who decide not to or cannot use such applications should not suffer from any disadvantage at all.

[. . . ] contact tracing apps do not require tracking the location of individual users. Instead, proximity data should be used; as contact tracing applications can function without direct identification of individuals, appropriate measures should be put in place to prevent re-identification; the collected information should reside on the terminal equipment of the user and only the relevant information should be collected when absolutely necessary.

[. . . ] procedures and processes including respective algorithms implemented by the contact tracing apps should work under the strict supervision of qualified personnel in order to limit the occurrence of any false positives and negatives. In particular, the task of providing advice on next steps should not be based solely on automated processing.

False positives will always occur to a certain degree. As the identification of an infection risk probably can have a high impact on individuals, such as remaining in self isolation until tested negative, the ability to correct data and/or subsequent analysis results is a necessity. This, of course, should only apply to scenarios and implementations where data is processed and/or stored in a way where such correction is technically feasible and where the adverse effects mentioned above are likely to happen.

12 See https://www1.health.gov.au/internet/main/publishing.nsf/Content/cdna-song-novel-coronavirus.htm
13 https://www.hpsc.ie/a-z/respiratory/coronavirus/novelcoronavirus/guidance/contacttracingguidance/National%20Interim%20Guidance%20
14 See https://open.alberta.ca/publications/coronavirus-covid-19

Any server involved in the contact tracing system must only collect the contact history or the pseudonymous identifiers of a user diagnosed as infected as the result of a proper assessment made by health authorities and of a voluntary action of the user. Alternately, the server must keep a list of pseudonymous identifiers of infected users or their contact history only for the time to inform potentially infected users of their exposure, and should not try to identify potentially infected users.

Data protection impact assessment for the corona app

Report of the who-china joint mission on coronavirus disease 2019 (covid-19)

Community transmission of severe acute respiratory syndrome coronavirus 2, shenzhen, china

Temporal dynamics in viral shedding and transmissibility of COVID-19

Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis

Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak

Coronavirus disease 2019 (covid-19) daily situation report

Secondary attack rate and superspreading events for SARS-CoV-2

Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study

Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing

European Data Protection Board, Guidelines 04/2020 on the use of location data and contact tracing tools in the context of the covid-19 outbreak

Detecting group formations using iBeacon technology

Devising and evaluating wearable technology for social dynamics monitoring

Decentralized privacy-preserving proximity tracing

Indoor localization improved by spatial context: a survey

Location Fingerprinting With Bluetooth Low Energy Beacons

Smartphone-Based Indoor Localization with Bluetooth Low Energy Beacons

A study of bluetooth low energy performance for human proximity detection in the workplace

IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)

Finding Dory in the Crowd: Detecting Social Interactions using Multi-Modal Mobile Sensing

Accurate detection of real-world social interactions with smartphones

Design, Realization, and Evaluation of uDirect-An Approach for Pervasive Observation of User Facing Direction on Mobile Phones

The emotional path to action: Empathy promotes physical distancing during the covid-19 pandemic

Gamification in theory and action: A survey

Motivating social distancing during the COVID-19 pandemic: An online experiment

The authors want to thank Gianrocco Lazzari for the useful feedback he provided to the realization of the paper.

Appendix A. 1 

Here we report a number of definitions of "close contact" from the official documents of several English-speaking national health institutions. Note that most of these documents extend the definition with specific provisions for, e.g., households or partners, which we don't report here as they are easy to detect without the need of a mobile phone app. The definitions may also change with time, as more evidence is accumulated on the way the virus spreads.

The US Centers for Disease Control and Prevention identifies a close contact as "Individual who has had close contact (< 6 feet) for a prolonged period of time" and specifies in notes that 11:

Factors to consider when defining close contact include proximity, the duration of exposure (e.g., longer exposure time likely increases exposure risk), whether the individual has symptoms (e.g., coughing likely increases exposure risk) and whether the individual was wearing a facemask (which can efficiently block respiratory secretions from contaminating others and the environment).

Data are insufficient to precisely define the duration of time that constitutes a prolonged exposure. Recommendations vary on the length of time of exposure from 10 minutes or more to 30 minutes or more. [. . . ] Brief interactions are less likely to result in transmission; however,