key: cord-0103790-3tbn7ctz
authors: Gunther, Christoph; Gunther, Daniel
title: Contact Classification in COVID-19 Tracing
date: 2020-08-02
journal: nan
DOI: nan
sha: 6defbc153cb85950cffdcfede944ae96622a51b0
doc_id: 103790
cord_uid: 3tbn7ctz

The present paper addresses the task of reliably identifying critical contacts by using COVID-19 tracing apps. A reliable classification is crucial to ensure a high level of protection, and at the same time to prevent many people from being sent to quarantine by the app. Tracing apps are based on the capabilities of current smartphones to enable the broadest possible availability. Existing capabilities of smartphones include the exchange of Bluetooth Low Energy (BLE) signals and of audio signals, as well as the use of gyroscopes and magnetic sensors. The Bluetooth power measurements, which are often used today, may be complemented by audio ranging and attitude estimation in the future. Smartphones are worn in different ways, often in pockets and bags, which makes the propagation of signals and thus the classification rather unpredictable. Relying on the cooperation of users to wear their phones hanging from their neck would change the situation considerably. In this case the performance achievable with BLE and audio measurements becomes predictable. Our analysis identifies parameters that result in accurate warnings, at least within the scope of validity of the models. A significant reduction of the spreading of the disease can then be achieved by the apps, without causing many people to unduly go to quarantine. The present paper is the first of three papers which analyze the situation in some detail.

The COVID-19 pandemic has spread to enormous dimensions with 16 million people affected and more than 644,000 fatalities up to July 26th, 2020. Unfortunately, the rate of increase has only flattened in China and selected European countries. The most important effective method to slow down the pandemic has so far been the enforcement of quarantine on large portions of the population, which led to a massive economic disruption. In countries such as China, South Korea, Singapore, and a number of European countries, the reduced infection rates made it possible to alleviate some of the restrictions. This involves the obligation to use masks and at least a recommendation to use some form of contact tracing. Different proposals for such a tracing have been made [1] and several different approaches are being followed in various countries. The most interesting proposals are those that fully focus on the tracing of contacts without tracking the movement of individuals, such as the scheme implemented in Germany [2]. The associated concepts were developed nearly synchronously by a number of authors and were published in [3], [4], and [5]. A review of associated requirements is found in [6] and a review of major apps in [7]. In view of the highly contagious nature of COVID-19, of the lack of a vaccine and of the high casualty rates, an effective tracing and significant testing capabilities are essential. In Germany 16 million people have downloaded the associated app on their iPhones and Android phones so far.

Christoph Günther is with the German Aerospace Center, 82234 Weßling, and with Technische Universität München, 80330 Munich, Germany, e-mail: KN-COVID@dlr.de. Daniel Günther is a student at Technische Universität München, 80330 Munich, Germany, e-mail: d.guenther@tum.de.

Tracing apps rely on Bluetooth to detect the proximity of other people's devices. These apps generate random IDs, which are broadcast and stored to identify contacts in the case that the owner of a device tests positive. If the owner tests positive, the list of IDs stored on his device is published. Conversely, each device keeps the IDs of past contacts and compares them to the published list of IDs on a regular basis in order to establish whether a critical contact has taken place. Apple published an update of its operating system to support the development of such apps (iOS 13.1.5) and Google updated its Application Programming Interface (API). In the case of a critical contact the person should quarantine himself and register for testing. The outcome might be that he is found to be a carrier of the disease. In this case, the owner should trigger the release of his device's list of random IDs. The consequences of positive and negative testing depend on local regulations. In Germany a contact is characterized as critical, and is called a Category 1 contact, if two people were in a face-to-face meeting at a distance of less than 2 meters for more than 15 minutes. The present paper relies on this definition but its parameters can easily be adapted to any other definition.
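The exchange and matching of random IDs described above can be illustrated by a minimal sketch. It is not the implementation of any real app: the class `Device`, its methods and the choice of 16-byte hexadecimal IDs are hypothetical, and the sketch assumes that the IDs published for a positively tested owner are the ones his device has broadcast.

```python
import secrets


class Device:
    """Toy model of a tracing device (illustrative names only)."""

    def __init__(self):
        self.own_ids = []        # random IDs this device has broadcast
        self.heard_ids = set()   # random IDs received from nearby devices

    def broadcast(self):
        # Generate a fresh random ID and remember it as one of our own.
        rid = secrets.token_hex(16)
        self.own_ids.append(rid)
        return rid

    def receive(self, rid):
        # Store an ID heard from another device.
        self.heard_ids.add(rid)

    def has_contact_with(self, published_ids):
        # Compare the stored IDs against the published list.
        return not self.heard_ids.isdisjoint(published_ids)


alice, bob, carol = Device(), Device(), Device()
bob.receive(alice.broadcast())    # Alice and Bob meet; Carol is elsewhere
published = set(alice.own_ids)    # Alice tests positive, her IDs are published
```

In this toy run only Bob's device finds a match. Whether the match counts as a Category 1 contact is a separate question, which requires the distance and duration estimates discussed in the remainder of the paper.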

Several countries have released tracing apps. The classification methods used are typically not discussed publicly. Ideally, the classification ensures a minimum missed detection rate at an acceptable level of false alarms. In the case of too many false alarms, people will be unduly sent to quarantine and the app-based approach will be rejected by the public. If, on the other hand, the app fails to identify potential carriers, they continue infecting others and its effectiveness is jeopardized. As shall be seen, both issues are most critical in the case of a high density of COVID-19 carriers. As a consequence, the present analysis will be most pertinent to regions with a high infection rate. This paper is the first of a series of three papers. The other two papers address the particularities of the evaluation of Bluetooth Radio Signal Strength Indicators (BT-RSSI) [8] and of audio ranging [9] in more detail.

Electromagnetic signals, such as Bluetooth signals, can be used for time of flight measurements, which provide accurate ranging results. Unfortunately, this and some other ideas cannot be considered presently, since a contact tracing app must rely on existing smartphones and devices. Thus, only existing functions provided by the chipsets, and even more importantly by the APIs of the devices, can be used. The options for Bluetooth on existing equipment are limited to power measurements. The outcome of such measurements depends very much on the location of the device, which might be in a pocket or in a bag, often together with keys, coins, metallic business card holders and the like. Furthermore, the human body, with its high water content, strongly absorbs Bluetooth signals. Together these uncertainties greatly influence the power levels measured at a distant receiver. The difficulties of tracing contacts by Bluetooth power measurements are also discussed in [10]. The remaining uncertainties about a potential contact to an infected person could potentially be resolved by interrogating the people involved. This would require the disclosure of the location at the time of contact, which might have been on a commuter train or at lunch in a restaurant, for example. The people must then identify where they sat or stood, which they might remember or not. In any case, this would be a source of privacy issues, discomfort and residual uncertainty. The German app would not support such a manual tracing anyway, since it does not collect the necessary information. Moreover, such a human intervention would reduce the level of acceptance.

As a consequence, we propose to carry the smartphone in an exposed manner, namely hanging around one's neck. In summer time, younger people often do that already. On the basis of the present findings, this is recommended to everyone, also in a business context, see Figure 1. Corresponding cases are available from several vendors. This mode of wearing the smartphone ensures a line of sight situation between two fellows facing each other. It leads to measurements that are a lot easier to interpret using the Bluetooth Radio Signal Strength Indicator (RSSI), audio ranging as well as gyro and magnetic sensors. The paper starts with a description of the statistical relationship between individual measurements and their classification in Section II. This section lays the foundation for evaluating the performance of classification in simulations or experiments. The probability of missed detection turns out to be critical for the success of the classification. Bluetooth RSSI evaluation is rather sensitive to the manner in which measurements are evaluated. Section III describes some aspects relating to the modeling of Bluetooth propagation and power measurements, as well as the essential result from the more in-depth study of the situation developed by Dammann et al. [8]. The following Section IV addresses audio ranging, which turns out to be an important complementary technique. Some audio properties of smartphones are summarized in this section. A more detailed study is published by Kurz et al. [9]. Section V briefly addresses the possibility of using attitude sensing, which is not explored in depth. Section VI finally discusses some basics of classifying contacts using the set of sensors mentioned.

The success of classifying contacts into Category 1 and other contacts depends critically on our capability of estimating distances. As a consequence, it is important to understand the influence of under- and overestimating distances from a pandemic point of view. This requires a study of the associated statistics. For a Category 1 contact, two fellows have to be facing each other at a distance of less than 2 meters for at least 15 minutes. This is called a $C_1$ contact throughout the paper.

Assume that we are the person A and that we monitor the presence of B. We aim at determining whether the contact to B is a $C_1$ contact or not, denoted by $C_1$ or $\neg C_1$, respectively. Furthermore, denote the outcome of the estimation process by $\hat{C}_1$ and $\neg\hat{C}_1$; then there are four different possibilities, as listed in Table I (classical hypothesis testing).

Obviously, in any good design $p_{md}$ and $p_{fa}$ are small. The four cases have to be considered jointly with the possibility that B tests positive, which happens with probability $p_i$, and shall be denoted by B. Current values for $p_i$, based on data published by Johns Hopkins University on July 26th, are 1/5100 for Germany and 1/113 for the USA. If B is either not tested or tested negatively, this shall be denoted by ¬B. Let finally $p_{C_1}$ be the probability of $C_1$; then this leads to the situations summarized in Table II. The first and fourth rows of Table II provide the desired outcome. The probability $p_{C_1}$ of a contact being $C_1$ is driven by social behavior. Social distancing aims at reducing $p_{C_1}$. This is important, since many people would otherwise be potentially infected and sent to quarantine by the first row in the table, whenever $p_i$ is significant. The product $p_{C_1} p_i$ is the probability that the contact is $C_1$ and that fellow B is infected at the same time. Aiming at a small value of $p_{md}$ ensures that few potential carriers continue spreading the disease (second row). The actual value of $p_{md}$ is a direct measure of the containment benefit provided by a tracing app. Since $1 - p_{C_1}$ is large, it is very important that the probability $p_{fa}$ of wrongly classifying a contact as being $C_1$ be small. Otherwise, numerous people would be unduly sent to quarantine by the third row. The value of $p_{fa}$ characterizes the extra load in terms of quarantining and testing generated by a tracing app. This has to be taken into account in the tradeoff of $p_{fa}$ versus $p_{md}$. Also note that the undesired outcomes, i.e. the rows 2 and 3, have a probability proportional to $p_i$, which means that they are unlikely to occur in the case of a small density of infectious people. As a consequence, a potential under-performance of an app only becomes apparent in environments with a high number of infectious people.

The decision for $\hat{C}_1$ or $\neg\hat{C}_1$ is taken after a substantial number of individual measurements. They are assumed to be performed at regular intervals. The number of such intervals in a time span of 15 minutes is denoted by $x_0$. Depending on the assumed behavior of people, different methods of analyzing the measurement data shall be considered:

• Model A: People are rather mobile and the environment is changing quickly; the contact duration is accumulated over many short intervals. Examples of such situations occur when people work closely together, which is not particularly critical in terms of classification. They also occur in underground trains, during breaks at conferences, at any form of party and the like. In these cases, a decision is taken every 15 seconds; if $x_0 = 60$ such decisions indicate that fellow B is in the contact zone of fellow A, the contact is classified as being $C_1$. It will turn out that this model cannot be addressed with the current capabilities.
• Model B: People come together, stay in a given relative pose and then separate again. This happens when people are seated in a train, especially in long-distance intercity trains, in restaurants, meeting rooms, lecture halls, theaters and the like. In this case, a single test ($x_0 = 1$) is performed to decide whether A is in the contact zone of B. Specifically, in the case of Bluetooth RSSI measurements, a timer is started when the RSSI value exceeds a critical value for the first time. From then on, the times for which the RSSI values are compatible with a $C_1$ contact are accumulated. If the time exceeds 15 minutes at the end of the contact, a $C_1$ contact is declared. There are many different options for the implementation of this model. They will not be further discussed, however, since they assume a static constellation of people, which is not the most common case.
• Model C: An intermediate case, in which elementary decisions are taken over intervals of a few minutes, e.g. 3-minute intervals with $x_0 = 5$. This approach is more robust with respect to the behavior of people and preferable to Model B.

Model A is most universally valid with respect to people's behavior. Its statistics are so unfavorable, however, that it does not lead to acceptable values of $p_{md}$. In all models, the number $n$ of RSSI measurements that are combined before taking an elementary decision is another parameter that can be adapted. Large values lead to more reliable decisions but also to a higher number of exchanged messages. The message rate will be $n \cdot x_0$ measurements in 15 minutes.
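The accumulation rule of Model B can be sketched as follows. This is only one of the many possible implementations mentioned above. The function name, the sample format (time in seconds, RSSI in dBm) and the threshold of -70 dBm used in the example are assumptions made for illustration and do not come from any app.

```python
def classify_model_b(samples, threshold_dbm, required_s=15 * 60):
    """Accumulate the time during which the RSSI is compatible with C1.

    samples: list of (timestamp_s, rssi_dbm) tuples, sorted by time.
    Each interval between two consecutive samples is counted if the
    RSSI at its start is at or above the threshold.
    Returns True if the accumulated time reaches required_s.
    """
    accumulated = 0.0
    for (t0, rssi0), (t1, _) in zip(samples, samples[1:]):
        if rssi0 >= threshold_dbm:
            accumulated += t1 - t0
    return accumulated >= required_s


# Hypothetical traces: one RSSI sample every 5 seconds.
long_contact = [(t, -60.0) for t in range(0, 1201, 5)]    # 20 minutes
short_contact = [(t, -60.0) for t in range(0, 601, 5)]    # 10 minutes
distant = [(t, -90.0) for t in range(0, 1201, 5)]         # 20 minutes, weak
```

With the hypothetical threshold of -70 dBm, only `long_contact` is declared a $C_1$ contact.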

In order to assess $p_{md}$, we need to know the number of times that the distance and attitude conditions for $C_1$ between A and B are fulfilled. This depends on the profession and personality of the person. It has two components, the first one is determined by the number of people met during one day. Let us assume that this number is $k$ and that it has probability $p_n(k)$; then the probability $p_S$ that a particular fellow A spreads the virus after having been in contact with $m \in \{1, 2, \ldots, k\}$ people, who are infectious with probability $p_i$, under the assumption that $i \in \{1, 2, \ldots, m\}$ of these contacts are not detected, is given by:

Since $p_i$ and $p_{md}$ are small numbers, the dominant term in this equation is obtained for $m = i = 1$:

with $K = \sum_{k=0}^{\infty} p_n(k)\,k$ being the average number of contacts, see also the second row in Table II. All these contacts take place mostly independently and can thus be treated as such. Each of them is associated with a contact time $x \in \mathbb{N}$, with a distribution $p_X(x)$. The latter is derived from social models and depends on whether people are practicing social distancing.

The accumulation of $n$ measurements leads to a decision $\hat{c}_1$. The latter has a probability of missed detection and false alarm denoted by $\pi_{md}$ and $\pi_{fa}$, respectively. In the present section, both quantities are written without further indices. In later sections, the dependency on $n$ will be made explicit. The combination of $x_0$ such decisions $\hat{c}_1$ finally leads to the decision $\hat{C}_1$, which is associated with a missed detection probability:

since the combined missed detection occurs whenever fewer than $x_0$ detections succeed. Using this in Equation (1) implies that the probability that A spreads the disease is:

with $x_M = 24 \cdot 4 \cdot x_0$ being the number of elementary decisions taken per day ($24 \cdot 4$ quarters of an hour times $x_0$). The above equation is an approximation since the distribution of contact times depends on the people and circumstances of the meeting, like sitting together in the train, having a joint lunch and so on. If $\pi_{md} \ll 1$, the term $m = x_0 - 1$ is dominant in Equation (3):

The second line in the equation is obtained by shifting the indices, the third one is obtained by expanding the binomial coefficients and bounding the terms in the numerator. Note that the term for $x = 1$ holds with equality. Under the same assumptions used so far, the probability that fellow A is a $C_1$ contact of B after a day is:

Thus, the comparison of $p_S$, i.e. the probability of spreading the virus with tracing, and of $p_{C_1}$, i.e. the corresponding probability without tracing, shows that contact tracing is a very effective option to reduce the spreading whenever

is small. This implies that the probability of missed detection must be constrained to a value smaller than $1/x_0$, which is possible to achieve if $x_0$ is small, as is the case in Model B and Model C, and not possible to achieve in Model A, even with very large values of $n$. Rephrasing this in words may help in developing some intuition: since $x_0$ individual $\hat{c}_1$ decisions are needed for a $\hat{C}_1$ decision, missing any one of them leads to a missed detection. Since there are $x_0$ options for that, $p_S$ becomes essentially proportional to $x_0 \pi_{md}$. We will use the latter product as a measure for the reduction in the spreading of the disease by the tracing app.
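The intuition behind the product $x_0 \pi_{md}$ can be checked numerically with a short sketch. It assumes the borderline case of a contact lasting exactly $x_0$ intervals and statistically independent elementary decisions, so that a contact is missed as soon as any one of the $x_0$ decisions is missed. The function names are illustrative, and the example values of $\pi_{md}$ are chosen only to show the two regimes.

```python
def combined_missed_detection(pi_md, x0):
    """Probability that at least one of x0 independent elementary
    decisions is missed (contact of exactly x0 intervals assumed)."""
    return 1.0 - (1.0 - pi_md) ** x0


def linear_approximation(pi_md, x0):
    """First-order approximation x0 * pi_md, valid for x0 * pi_md << 1."""
    return x0 * pi_md


# Small x0 (as in Model C): the product is a good approximation.
exact_small = combined_missed_detection(0.014, 5)
approx_small = linear_approximation(0.014, 5)

# Large x0 (as in Model A): almost every contact is missed.
exact_large = combined_missed_detection(0.12, 60)
```

For small products the two expressions nearly coincide. For large $x_0$ the exact value saturates close to one, which illustrates why the condition $\pi_{md} < 1/x_0$ is so restrictive for Model A.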

In order to evaluate $p_{fa}$, we need to additionally know the number of times $y$ that a person is close enough for a measurement to take place. The distribution $p_Y(y)$ does again depend on social parameters but additionally depends on radio propagation in the case of Bluetooth measurements, and on the triggering mechanism in the case of audio ranging. The number of contacts $K_Y \ge K$ is larger, since the presence detection by Bluetooth signaling is triggered well beyond $C_1$ separation. Consider Bluetooth measurements: among the $y$ time instances for which a radio contact to one particular fellow B persists, assume that $m < x_0$ are correctly detected as fulfilling the $C_1$ conditions. Then, $q$ additional erroneously identified contacts (erroneous $\hat{c}_1$ decisions) with $m + q \ge x_0$ are needed to cause a false alarm for that number $y$ of radio contacts to B (see Table III for a summary of the meaning of the variables):

for $y \ge x_0$ and $p_{fa}(y) = 0$ for $y < x_0$.

Table III: Meaning of the variables

Variable   Meaning
$y$        number of radio contacts
$x$        number of $C_1$ contacts
$x_0$      number of $\hat{c}_1$ decisions to declare $C_1$
$m$        number of correct $\hat{c}_1$ estimates
$q$        number of incorrect $\hat{c}_1$ estimates

Using Equation (6), the expected number of people unnecessarily sent to quarantine is approximated by:

This equation also includes the possibility that users move with respect to each other, which means that the conditions $C_1$ and $\neg C_1$ alternate as a function of time. If $C_1$ is fulfilled, $\pi_{fa} = 0$, and if $\neg C_1$ holds, $\pi_d = 0$. At the border of the $C_1$ domain, the two quantities change their role. This implies that a small $p_{fa}$ near that border is associated with a large $p_{md} \sim 1 - p_{fa}$ on the other side of the border. This is uncritical if the distributions are very narrow, i.e. concentrated around a value, as is the case for ranging, but becomes rather problematic with Bluetooth signal power measurements, which show a very flat distribution. Unless great precautions are taken, the classification becomes unreliable. Consider the case that fellow B is outside of the $C_1$ zone of fellow A, i.e. $p_X(0) = 1$. Then $x = 0$ for these measurements and the equation becomes:

Although terms with $x > x_0$ may be larger, the term $x = x_0$ gives us an idea of the scaling. Its asymptotic dependency can be evaluated using Stirling's formula and $\lim_{y \to \infty} \left(y/(y - x_0)\right)^y = e^{x_0}$:

This means that in the long term, it is the duration of the radio contact $y$ which dominates the rate of quarantining people. Some target figures for $\pi_{fa}$ can be obtained for a fully occupied train, for example. In Germany's 2nd class setups, there are 4 seats in one row on each side of a carriage, and around 10 rows in the carriage. The range of Bluetooth reaches well beyond the next row forward and backward. This means that $K_Y > 24$, of which 4-8 are within the contact zone and must thus be discounted, leading to an effective value $K_Y = 16$. The value $y$ itself is determined by the duration of the common journey. For commuter trains we choose 15 and 30 minutes, for inter-city journeys 1, 2, and 3 hours, which leads to $y/x_0 = 1, 2, 4, 8,$ and $12$. In such a train a carrier of the disease will send 4 people to quarantine; thus it should be tolerable that 2 additional people are sent to quarantine by false alarms as well. The value of $\pi_{fa}$ is then obtained by solving

Numerical values of $\pi_{fa}$ are indicated in Table IV. They are the values that can be tolerated, leading to a 50% increase in the quarantining of people riding a German train. The situation is rather uncritical on a short commuter train ride ($\pi_{fa} < 0.93$) and much more demanding on a longer intercity train journey.
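The way such tolerable values can be obtained is sketched below under simplifying assumptions that are ours and not necessarily those behind Table IV: fellow B stays outside the contact zone ($x = 0$), the $y$ elementary decisions are independent with a constant $\pi_{fa}$, and a false alarm requires at least $x_0$ erroneous $\hat{c}_1$ decisions. The condition that $K_Y = 16$ neighbors produce 2 false alarms on average is then solved by bisection. The function names and the value $x_0 = 30$ used in the example are illustrative.

```python
from math import comb


def p_false_alarm(pi_fa, y, x0):
    """Probability that at least x0 out of y independent elementary
    decisions are false alarms (binomial tail); zero if y < x0."""
    if y < x0:
        return 0.0
    return sum(comb(y, q) * pi_fa**q * (1.0 - pi_fa) ** (y - q)
               for q in range(x0, y + 1))


def tolerable_pi_fa(y, x0, k_y=16, extra=2.0, tol=1e-9):
    """Largest pi_fa for which k_y * p_false_alarm stays below `extra`,
    found by bisection (the binomial tail grows monotonically in pi_fa)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if k_y * p_false_alarm(mid, y, x0) < extra:
            lo = mid
        else:
            hi = mid
    return lo


short_ride = tolerable_pi_fa(y=30, x0=30)   # y / x0 = 1
longer_ride = tolerable_pi_fa(y=60, x0=30)  # y / x0 = 2
```

For $y = x_0$ the binomial tail reduces to $\pi_{fa}^{x_0}$, so the bound is $(2/16)^{1/x_0}$, which is about 0.93 for the assumed $x_0 = 30$. Longer journeys lead to smaller tolerable values.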

The Application Programming Interfaces (API) of Android and iOS allow triggering the transmission of Bluetooth Low Energy (BLE) advertisement messages and measuring the radio signal strength of the received signals. The corresponding values are provided in the form of a Radio Signal Strength Indicator (RSSI), which is defined as the received signal power on a logarithmic scale. Bluetooth uses frequencies from a band shared with microwave heating, which means that Bluetooth signals are strongly absorbed by water. As a consequence, any part of a human body obstructing the line of sight significantly attenuates the signal. The wide variety of options for carrying mobile phones in your hand, pocket or bag thus implies an enormous variability in received power levels. This is further amplified by the directional characteristic of low-cost antennas. You might try an experiment yourself using a Bluetooth module and a BLE scanner app on your smartphone, which can be downloaded from the iOS or Android stores. With the module and phone separated by 1.5 meters, I personally found the following RSSI values: -61 to -66 dBm when the module was in my hand and -81 to -89 dBm when it was in my pocket. Knowing that a 20 dB change corresponds to a factor 10 in distance exemplifies the difficulty of estimating distances using Bluetooth RSSI values. This led us to propose the rule of carrying smartphones hanging down from the neck. Note that the smartphone could as well be replaced by a much smaller device built around a Bluetooth module, an Inertial Navigation System (INS) and a sonic or ultra-sonic ranging system.

Even if people follow the above recommendation on how to carry their smartphone, the situation remains difficult due to uncertainties in radio propagation, which furthermore takes place on three different carrier frequencies. The unknown association of carrier frequencies to measurements adds an additional level of difficulty. Gentner et al. identified certain patterns in the use of carriers, see [11], which can be used to reduce the associated uncertainty. Traditional models of propagation are briefly addressed in the following section and in more detail in [8]. The section furthermore relates the associated statistics to the statistics of classification.

The smartphone is assumed to be worn on the chest, see [8] for details of the measurement setup used to obtain numerical results. For each individual carrier, the received signal power $P_{RX}$ is modeled by the equation:

with $P_{TX}$ denoting the transmit power, $\gamma$ denoting a stochastic fading coefficient, $d$ being the distance between the receiver and the transmitter, $\nu$ being the exponent of the decay law, which is 2 for free space propagation, and with $n$ representing a superposition of noise and interference. For simplicity, the noise and interference are not further considered here, since they are not dominant at short distances. In this case, the received power can be represented on a logarithmic scale, which leads to the definition of the RSSI:

with $\eta = 10 \log \gamma$ and with logarithms taken to the base 10.

The relationship between the reported RSSI value and d is the basis for distance measurement: the measured RSSI is compared to

with $d_c = 2$ m being the critical distance. Note that Equation (8) defines the units, which have to be maintained after taking logarithms.
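The threshold comparison can be sketched in a few lines. The sketch assumes the noise-free logarithmic model that follows from the definitions above, $\mathrm{RSSI} = 10 \log P_{TX} + \eta - \nu \cdot 10 \log d$, and a threshold $\Theta$ obtained by evaluating the same expression at $d_c$ with a reference fading value. The function names, the transmit level of 0 dBm and the reference fading value of 0 dB are illustrative assumptions.

```python
from math import log10


def rssi_dbm(p_tx_dbm, eta_db, d_m, nu=2.0):
    """RSSI predicted by the noise-free power-law model (d in meters)."""
    return p_tx_dbm + eta_db - nu * 10.0 * log10(d_m)


def threshold_dbm(p_tx_dbm, eta_ref_db, d_c_m=2.0, nu=2.0):
    """Decision threshold: model RSSI at the critical distance d_c."""
    return rssi_dbm(p_tx_dbm, eta_ref_db, d_c_m, nu)


def within_critical_distance(measured_rssi_dbm, p_tx_dbm, eta_ref_db,
                             d_c_m=2.0, nu=2.0):
    """Elementary decision: RSSI above the threshold means d < d_c."""
    return measured_rssi_dbm > threshold_dbm(p_tx_dbm, eta_ref_db, d_c_m, nu)


# Hypothetical transmitter at 0 dBm, reference fading 0 dB, nu = 2.
near = rssi_dbm(0.0, 0.0, 1.5)
far = rssi_dbm(0.0, 0.0, 3.0)
decade_drop = rssi_dbm(0.0, 0.0, 1.0) - rssi_dbm(0.0, 0.0, 10.0)
```

With $\nu = 2$ the model loses 20 dB per factor 10 in distance. A fading deviation of a few dB therefore shifts the apparent distance considerably, which is why the statistics of $\eta$ matter so much.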

In order to evaluate the missed detection probability per event $p_{md}$ or the false alarm probability per event $p_{fa}$, the statistics for $\eta$ or $\gamma$ need to be known. These statistics are dependent on the situation. In the case that two fellows face each other, they are in a line of sight situation. If the direct path dominates all other contributions, $\gamma$ is basically delta distributed with an average of $\Gamma$ determined by the antenna pattern. In other cases, the direct path remains present but is superposed by scattered components. In this case, the distribution of the amplitude of the received signal is modeled by a Ricean distribution. This model is considered to provide a faithful representation of reality, whenever the parameters are properly estimated. Presently the model is only considered for comparative purposes, as shall be seen below. The received power (or attenuation $\gamma$) in this model has a non-central $\chi^2$-distribution with two degrees of freedom:

with $\gamma_R$ being the non-centrality parameter and $\sigma_R^2$ being the variance. In the case that the decision about $C_1$ is taken on the basis of a single measurement ($n = 1$), e.g. in Model A, the criterion for the decision is:

with $\gamma_c$ being given by:

The associated estimate is denoted by $\hat{c}_1$, and the probability of missed detection for the distance $d < d_c$ is given by:

If one would add several power measurements, i.e. $n > 1$, e.g. in Model B and C, this would mean adding $n$ independent identically distributed variables, each of them being $\chi^2$-distributed with 2 degrees of freedom. The result would then be $\chi^2$-distributed with $2n$ degrees of freedom:

The Equations (11) and (12) would remain valid and the latter integral could be computed in closed form for arbitrary n.

The value $\gamma_c$ is the first moment of the $\chi^2$-distribution with $2n$ degrees of freedom and non-centrality parameter $n\gamma_R/\sigma_R^2$:

The probability of missed detection (13) in estimating $\hat{c}_1$ could then be computed in closed form using Marcum's Q-function $Q_n(\cdot, \cdot)$:

The above distributions are adequate for users A and B in close proximity of each other, as is the case for $d \le d_c$. It is the desired result in Model A and shall serve as a benchmark in the Models B and C. The reason for not using this result directly in the latter models is that apps are expected to add the RSSI values rather than the power values. In this case, the statistics cannot be determined in closed form but must rather be evaluated numerically. Before addressing this case, let us consider the situation $d > d_c$ with a line of sight that is often obstructed. In such cases, a lognormal fading distribution is considered to be a reasonable model of reality, see [12]. The distribution may either be written in terms of $\gamma$:

or in terms of $\eta = 10 \log \gamma$:

with $\eta_L = 10 \log \gamma_L = \bar{\eta}$. Equation (15) makes the Gaussian character and the meaning of $\eta_L$ and $\sigma_L$ obvious. In the above discussion, a decision in the case $n = 1$ was taken in favor of $C_1$, whenever the power level was above a threshold. On the logarithmic scale this condition reads $\mathrm{RSSI} > \Theta$, i.e. whenever the difference

is positive or equivalently whenever $\eta > \bar{\eta} + \nu \cdot 10 \log(d/d_c)$. Thus, a false alert occurs if this condition is fulfilled for $d > d_c$. The probability of a false alarm, i.e. an erroneous decision for $\hat{c}_1$, becomes

with the present Q-function being a scaled version of the error function complement:

In the case of $n = 1$, a closed form of the statistics thus exists for $\pi_{md}$ for $d \le d_c$ and for $\pi_{fa}$ for $d > d_c$. In the case $n > 1$, e.g. Model B and C, the situation changes somewhat since measurements are now combined by adding RSSI values. This corresponds to a geometric average of the received powers.

In this case, the probability of false alarm can be computed easily:

for $d > d_c$. This equation is a consequence of the scaling of $\eta_L$ and $\sigma_L^2$ by $n$. Using the same distribution, but with different parameters, for $d < d_c$ is expected to be a worse match to reality but also allows evaluating the probability of missed detection in closed form:

It leads to an interesting symmetry between the probabilities of missed detection and of false alert. Note that both probabilities $\pi_{md}$ and $\pi_{fa}$ depend on the parameters of the distribution, on the true distance $d$, and on the critical distance $d_c$, but that they do not depend on the explicit threshold $\Theta$, see Equation (16) and the associated explanations. The resulting functional dependence can either be used in a simulation of roaming users or can simply be averaged over the interior of a circle of radius $d_c$ for $\pi_{md}$ or over its complement or a relevant subset for $\pi_{fa}$. The closed form of Equation (6) provides the immediate insight that $\pi_{fa,n}(d_c) = 1/2$, which shows that the models are consistent with our intuition.

The companion paper by Dammann et al. [8] describes the measurements and their analysis in more detail. All these measurements have so far been made using ideal conditions with no additional people except A and B (in the very initial measurements A was actually a post carrying the receiver). The experimental basis shall be further broadened in the future. A first result can be derived from the estimated Rice parameters at a distance of 2 meters, $\gamma_R = 247$ pW and $\sigma_R^2 = 9.15$ pW, as well as for the lognormal distribution at 2 and 4 meters: 1.60 and 1.97 dBm, respectively. This allows plotting the functions from Equation (14) and (17) for $\pi_{md,R,n}(d)$ and for $\pi_{fa,n}(d_c^2/d) = \pi_{md,L,n}(d)$, respectively. The value of $n$ determines how many measurements are combined into an elementary decision $\hat{c}_1$. For $n = 1$, the values $\pi_{md,R,1}(d)$ and $\pi_{fa,1}(d)$ are the best models among those considered; the use of a decision threshold in the absolute or in the logarithmic domain is equivalent. The parameter for 4 meters, 1.97 dBm, is used for determining the false alarm rate.
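The lognormal expressions can be evaluated with the standard library alone. The sketch below is a reconstruction under assumptions: $n$ added RSSI values are Gaussian with mean and variance scaled by $n$, which gives $\pi_{fa,n}(d) = Q\left(\sqrt{n}\,\nu \cdot 10 \log(d/d_c)/\sigma_L\right)$, and the same form with $d_c/d$ is used for the missed detection. The quoted lognormal parameters are interpreted as standard deviations in dB and $\nu = 2$ is assumed. The function names are illustrative.

```python
from math import erfc, log10, sqrt


def q_function(x):
    """Gaussian tail probability, a scaled complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))


def pi_fa(d, n=1, sigma_db=1.97, nu=2.0, d_c=2.0):
    """False alarm probability of an elementary decision for d > d_c,
    assuming lognormal fading and n added RSSI values."""
    return q_function(sqrt(n) * nu * 10.0 * log10(d / d_c) / sigma_db)


def pi_md(d, n=1, sigma_db=1.60, nu=2.0, d_c=2.0):
    """Missed detection probability for d < d_c under the same
    lognormal approximation (parameter for 2 meters assumed)."""
    return q_function(sqrt(n) * nu * 10.0 * log10(d_c / d) / sigma_db)


at_border = pi_fa(2.0)                  # distance equal to d_c
single = pi_fa(3.0, n=1)
combined = pi_fa(3.0, n=60)
```

The value at $d = d_c$ is 1/2 for every $n$, and combining $n = 60$ values sharpens the transition around $d_c$, which is the diversity benefit mentioned in the text.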

If several RSSI values are added (logarithmic domain), the statistics associated with the more realistic Rice distribution in the near range cannot be determined in closed form, at least not today. In this case, Equation (19) for the lognormal distribution is used to determine $\pi_{md,L,n}(d)$ with the parameter for 2 meters. This is used as an approximation of the true distribution in the exemplary case $n = 60$. The plots in Figure 2 show two groups of curves. The upper group corresponds to $n = 1$ and the lower group to $n = 60$. The latter group of curves shows the benefit of diversity. Within these groups there are differences between $\pi_{md,R,n}(d)$ (wrong combination) and $\pi_{md,L,n}(d)$ (wrong fading statistics) but they turn out not to be fundamental. In Section III-A the probability of missed detection was determined as a function of distance. The probability of detection is additive in the sense that:

In this equation $\pi_d(r) = 1 - \pi_{md}(r)$ is the conditional probability of detection given that fellow B is at distance $r$ and $dS(r)\,\rho(r)$ is the probability density for fellow B to be at that distance. Equation (20) thus is the marginalization of $\pi_d(r)$ with respect to $r$. Note that the limitation of the integration is a consequence of $\pi_d(r) = 0$, whenever $r > d_c$. This allows defining the average probability of missed detection over the distribution of users:

$$\pi_{md,av,n} = \frac{\int_0^{d_c} 2\pi r\,dr\,\rho(r)\,\pi_{md,n}(r)}{\int_0^{d_c} 2\pi r\,dr\,\rho(r)}. \qquad (21)$$

The probability distribution of users in Equation (20) and (13) is given by:

In this expression n(r) = πr 2 is the number of people at a distance not greater than r in the case of a density of one person per square meter. This corresponds to the densest packing of people, each occupying a surface of 1 square meter. People are continuously spread in a symmetric manner around fellow A, which is a simple way of achieving a densest packing. The "function" dn(r)/dr is mostly zero. It jumps at the values r m = √(m/π), at which n(r) reaches the integer m, with

which is a distribution in the sense of Schwartz [13] . With these preparations, the integrals become:

with m c being the largest integer such that r mc ≤ d c .
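The discrete average can be sketched in a few lines. With one user at each radius r m where n(r) = πr 2 reaches the integer m, the ratio of integrals in Equation (21) reduces to the arithmetic mean of π md,n (r m ) over m = 1 ... m c . The missed-detection model below is an assumption made for illustration only: lognormal fading with the 2-meter parameter of 1.60, a free-space path-loss exponent of 2 (not stated in this section), a threshold placed at d c , and n values averaged in the logarithmic domain. All function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def user_radii(d_c):
    """Radii r_m = sqrt(m / pi), m = 1..m_c, of the users inside the critical zone."""
    m_c = math.floor(math.pi * d_c ** 2)
    return [math.sqrt(m / math.pi) for m in range(1, m_c + 1)]


def p_md_lognormal(d, n, d_c=2.0, sigma_db=1.60, path_loss_exp=2.0):
    """Assumed model: a detection is missed when the mean of n dB values
    falls below the level expected at the critical distance d_c."""
    margin_db = 10.0 * path_loss_exp * math.log10(d_c / d)
    return phi(-margin_db * math.sqrt(n) / sigma_db)


def p_md_average(n, d_c=2.0):
    """Arithmetic mean of the missed-detection probability over the user radii."""
    radii = user_radii(d_c)
    return sum(p_md_lognormal(r, n, d_c) for r in radii) / len(radii)


for n in (1, 60):
    print(n, round(p_md_average(n), 4))
```

Under these assumptions the average for n = 1 comes out close to the value of 0.12 quoted for the lognormal distribution, and the sum is dominated by the outermost radii, in line with the remark that the main contribution comes from the border of the contact zone.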

Note that the density of points r m increases with increasing m, which means that the main contribution comes from the border of the contact zone. Using the experimental results from [8], this integral is evaluated to π md,av,1 = 0.15 for n = 1 for the χ 2 -distribution and to π md,av,1 = 0.12 for the lognormal distribution, neither of which is compatible with the need for a small x 0 π md . Remember that the latter value is the reduction factor in the probability of further spreading of the disease, achieved by contact tracing. Table V lists values of π md,av,n for different n, which can be used to determine the reduction factor. Even in the case n = 120, the factor is x 0 π md = 0.21 in Model A, and this would require 4 measurements per second. It is only with n = 480 that the factor x 0 π md falls below 1%, which would require 16 measurements per second. This would seriously impact the standby time of the smartphone. Assuming Model C and a decision based on 3-minute intervals, i.e. x 0 = 5, means that we could achieve a reduction by a factor 0.07, provided that n = 60 measurements are performed and aggregated in each 3-minute interval, i.e. that one measurement is performed every 3 seconds. In the case of a decision every 5 minutes, which assumes lower dynamics in the relative movement of people, the reduction factor is 0.04 with the same 60 measurements, now spread over a 5-minute interval, which corresponds to one measurement every 5 seconds. So, lower requirements on the dynamics allow both improving the suppression of the spreading of the virus and reducing the measurement rate.

Tolerable alarm rates were derived for the train scenario. This led to the values in Table IV. The evaluation of π f a,n (d) is straightforward. For d = d c it gives π f a,n (d c ) = 1/2, as was already discussed previously. Assuming that people occupy a circular surface of 1 square meter gives them a radius δ = 1/√π. Thus, the minimum distance to people fully outside of the critical zone is d c + δ. Evaluating Equation (19) yields:

respectively. This means that n = 1 is compatible with a journey of 15 minutes before sending more than the two people to quarantine. For n = 3, long journeys of up to 3 hours become possible with the same consequences. The probability of false alarm thus does not strongly limit the number n of measurements aggregated to a decision, and one might consider the more demanding homogeneous distribution of users. This requires a study of the combination of false alarms. Consider two fellows B and B'. There is no alarm if neither B nor B' triggers an alarm, i.e.:

$$1 - \pi_{fa} = (1 - \pi_{fa,B})\,(1 - \pi_{fa,B'}).$$
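The same product rule extends to any number of fellows. The following sketch assumes, as the product form implies, that the individual false alarm events are independent; the function name and the example probabilities are hypothetical.

```python
import math


def combined_false_alarm(probabilities):
    """Probability that at least one of several independent fellows
    triggers a false alarm: 1 - prod(1 - p_i)."""
    no_alarm = math.prod(1.0 - p for p in probabilities)
    return 1.0 - no_alarm


# Two fellows with illustrative per-fellow false alarm probabilities.
print(combined_false_alarm([0.1, 0.2]))
```

For two fellows this reproduces the relation above; for many fellows with small individual probabilities the result is close to the sum of the individual probabilities.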

Furthermore, let users be at distances d c +δ(k+1) with k ∈ Z + being a positive integer and assume that there are

users at that distance (they cover an annular shell of thickness 2δ). This guarantees a densest packing. In that case, the probability of false alarm, i.e. an erroneous decision in favor of C 1 , becomes:

In this more demanding scenario, exemplary values are:

p f a,3 = 0.413 and p f a,9 = 0.009, which means that n = 9 would be sufficient to reduce the probability of false alarm to a very small level.

Table VI shows performance figures for a number of possible choices for the number n of measurements aggregated to an estimate ĉ 1 , as well as for the number x 0 of estimates ĉ 1 that lead to a decision C 1 . The product of n and x 0 leads to the measurement rate ρ = x 0 n/(15 · 60). The performance figures are the reduction factor x 0 π md,n of the spreading achieved by tracing as well as the probability of unduly sending a person to quarantine. A choice with n = 15 and x 0 = 3, for example, requires a measurement to be performed every 12 seconds, suppresses the risk of spreading by a factor 0.12 and hardly sends anyone unduly to quarantine. Performing a measurement every five seconds reduces the risk of spreading by a factor 0.04. This assumes that people let their phones hang from their neck, and some standard form of environment.

In reality, a number of additional factors have to be taken into account, such as a more complex propagation situation, e.g. due to metallic walls, higher dynamics of user movements, e.g. due to people entering and exiting commuter trains, or unpredictable shadowing due to the user's hands, arms or body in the path of radio signals. Thus, it is advisable to complement the Bluetooth measurement by an alternative. Audio ranging is the option that shall be described in the next section. The idea is to use it whenever the situation is not clear.
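The measurement rate ρ = x 0 n/(15 · 60) used in this section can be checked with a short sketch. The function names are hypothetical. The value x 0 = 30 used for Model A is an inference from the quoted rate of 4 measurements per second at n = 120 and is not stated explicitly in this section.

```python
WINDOW_S = 15 * 60  # accumulation window of 15 minutes, in seconds


def measurement_rate(n, x0, window_s=WINDOW_S):
    """Measurements per second for x0 decisions of n measurements each."""
    return x0 * n / window_s


def measurement_interval(n, x0, window_s=WINDOW_S):
    """Seconds between consecutive measurements."""
    return window_s / (x0 * n)


print(measurement_interval(60, 5))   # decisions every 3 minutes
print(measurement_interval(60, 3))   # decisions every 5 minutes
print(measurement_rate(120, 30))     # assumed x0 = 30 for Model A
print(measurement_rate(480, 30))
```

The first two calls correspond to the 3-second and 5-second intervals discussed for Model C, the last two to the rates of 4 and 16 measurements per second.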

Smartphones have a microphone and a speaker with rather good transmit and receive conditions if the device is carried on the chest or held in the hand. This can be used for audio ranging up to distances of a few meters. Signals and their transmission can be configured by the API. In experiments that we performed recently, we focused on the use of Android phones. Figure 3 shows the response of the microphones built into three different smartphones from two different brands. The references were an NT1-A microphone from Rode and an Adagio Infinite Speaker of A3 on the source side. The curves are very similar, suggesting that the same microphones are integrated in those phones. All microphones show a good sensitivity over all frequencies. A similar experiment was performed for the speakers with a rather different result. In that case only two smartphones were analyzed. The response of the better device is reduced by roughly 10 dB above 16 kHz, as compared to the reference. The response of the other one is degraded by another 3 dB and the degradation starts 2 kHz earlier. Covering the speaker by one layer of sweater fabric degrades the performance by another 4 dB. If both parties cover their smartphones, the associated attenuations add up. Thus, the use of audio ranging requires carrying the devices in an exposed manner, e.g. hanging from one's neck, see Figure 1. Transmission at lower, less attenuated frequencies is not considered a true option, since it would be too disturbing. The norm ISO 226:2003 compiles equal-loudness contours (isophones), which allow comparing the disturbance caused by acoustical signals at different frequencies. On the basis of such considerations, we propose modulating a carrier at 18 kHz with a modulation rate of 1 kbaud. This keeps the signal in a spectral range that is not too disturbing to most people. A spread spectrum modulation provides a good range resolution and allows operating at a low signal-to-noise ratio at the same time. Different options exist and are discussed in [9]. Since the velocity of sound in air is c s = 343 m/s under standard conditions, a chip duration of 1 ms corresponds to a length of 34 cm. At a typical signal-to-noise ratio this leads to a distance resolution of 1 to 3 cm. Let us be conservative and assume a resolution of 5 cm. A multipath delay of two meters leads to an offset by 6 chips and is well suppressed by the autocorrelation of the spreading code. The length of the spreading code is assumed to be around 350 chips. An alternative using chirps is also considered. The performance of audio ranging is further developed in Section IV.
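The arithmetic behind these signal parameters is easy to reproduce. The sketch below uses only the values given in the text (343 m/s, 1 ms chips, about 350 chips); the constant and function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0   # m/s, standard conditions
CHIP_DURATION = 1e-3     # s, i.e. a modulation rate of 1 kbaud
CODE_LENGTH = 350        # chips, approximate value from the text

# Distance travelled by sound during one chip.
chip_length_m = SPEED_OF_SOUND * CHIP_DURATION

# Duration of one complete ranging signal.
signal_duration_s = CODE_LENGTH * CHIP_DURATION


def delay_in_chips(extra_path_m):
    """Offset, in chips, caused by an additional path length in meters."""
    return extra_path_m / chip_length_m


print(chip_length_m)         # about 0.34 m per chip
print(delay_in_chips(2.0))   # a 2 m multipath detour, just under 6 chips
print(signal_duration_s)     # about 0.35 s per ranging signal
```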

Audio ranging can be performed in a peer-to-peer or in a networked manner. Consider the peer-to-peer situation first. Smartphones do not provide accurate timing control. However, the microphone input of a smartphone may be sampled at a fixed rate. Furthermore, smartphones can transmit and receive at the same time, and this is supported by the APIs of Android and iOS. Let the smartphones thus agree via Bluetooth to start audio ranging. In a first step they open their microphone channels and then proceed according to Figure 5: at time t T X,A , A transmits the ranging signal using its speaker. This transmission is delayed with respect to the API by τ T X,A . In parallel to its transmission, A's microphone captures the transmitted signal. This signal is delayed by the sum of the local propagation delay τ l,A and the internal receive delay τ RX,A . The delay τ l,A is determined by the device geometry and can be stored in memory. A standard value corresponding to 14 cm should be appropriate for most devices on the market. The time of reception thus is t RX,A = t T X,A + τ T X,A + τ l,A + τ RX,A , and is used for calibration purposes. The same definition of delays applies at B. Thus, the signal transmitted by A at time t T X,A is received at B at the time t′ RX,B :

with τ being the propagation time from A to B. After reception of the signal from A by B, B sends a corresponding signal to A. The equations are obtained by changing the roles of A and B:

At the end of the reception A sends

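to B and B sends ∆t B = t′ RX,B − t RX,B + τ l,B , using BLE, where t′ RX,B is the time at which B receives the signal from A and t RX,B the time at which B receives its own signal. Thus, both can compute the propagation time: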

and thus the distance d = τ c s . The property of audio signals that is crucial for this self-calibration is the possibility to observe one's own transmitted signal.
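The self-calibration can be illustrated with a small simulation. It is one consistent reading of the protocol rather than a definitive implementation: each device forms the difference between the reception time of the remote signal and that of its own signal, adds its local delay τ l , and the propagation time is taken as the mean of the two differences. All numerical delays and clock offsets below are hypothetical values chosen for the example.

```python
SPEED_OF_SOUND = 343.0  # m/s


def estimate_distance(true_distance_m):
    """Simulate the two-way exchange and return the estimated distance."""
    tau = true_distance_m / SPEED_OF_SOUND   # propagation time between A and B
    tau_local = 0.14 / SPEED_OF_SOUND        # speaker-to-microphone path of 14 cm

    # Hypothetical per-device values in seconds: transmit command time,
    # unknown transmit and receive latencies, and an arbitrary clock offset.
    a = {"t_tx": 0.100, "tau_tx": 0.037, "tau_rx": 0.012, "clock": 5.0}
    b = {"t_tx": 0.600, "tau_tx": 0.021, "tau_rx": 0.030, "clock": -2.0}

    def reception(sender, receiver, path_delay):
        """Reception time as recorded by the receiver's own clock."""
        return (sender["t_tx"] + sender["tau_tx"] + path_delay
                + receiver["tau_rx"] + receiver["clock"])

    # Remote reception minus own reception, plus the stored local delay.
    dt_a = reception(b, a, tau) - reception(a, a, tau_local) + tau_local
    dt_b = reception(a, b, tau) - reception(b, b, tau_local) + tau_local

    # The unknown latencies and clock offsets cancel in the sum.
    tau_estimate = 0.5 * (dt_a + dt_b)
    return tau_estimate * SPEED_OF_SOUND


print(estimate_distance(1.5))
```

The transmit latencies, receive latencies and clock offsets drop out because each device only compares two reception times taken with its own microphone and clock.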

The above peer-to-peer protocol can be extended to a networked protocol. In this case, the users agree on an ordering of transmissions via Bluetooth. All smartphones A 1 . . . A k activate their microphones and one after the other transmit their audio ranging signals. For simplicity, the scheduling is prearranged, which also works if some of the smartphones cannot acquire all signals. In this case, all delays are summed up: 350 ms for the ranging signal, 10 ms (corresponding to 4 meters) for propagation and 40 ms for the internal delays between the activation of the transmission command and the start of transmission (the latter needs to be confirmed by more data). This allows for a scheduling of a transmission every 400 ms. After the completion of the cycle and the evaluation of the reception time t RX,Ai by terminal A 1 , this terminal transmits the time difference using Bluetooth:

If all terminals see each other, they transmit k(k − 1) such values in total. The annoying transmissions of audio signals remain limited to k, however. The overall time interval spanned by all transmissions in the networked protocol may be long enough for users to move slightly. This is not critical, however. The snapshot measurements are simply converted to average values. The only instances that require some care are those in which the audio signals are used to calibrate Bluetooth measurements. Finally, it should be emphasized that audio beacon transmissions should not be activated if the device is held to the ear. Even if the signals are hardly heard, this seems a reasonable precaution.
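The slot budget of the networked protocol can be written down as a short sketch. The three delay contributions are the values given above; the function names and the choice of equally spaced slots starting at zero are assumptions made for illustration.

```python
RANGING_SIGNAL_MS = 350    # duration of the spread spectrum signal
PROPAGATION_MS = 10        # allowance corresponding to about 4 meters
INTERNAL_DELAY_MS = 40     # command-to-transmission latency (to be confirmed)

SLOT_MS = RANGING_SIGNAL_MS + PROPAGATION_MS + INTERNAL_DELAY_MS


def transmission_schedule(k):
    """Start times in ms of the k prearranged audio transmissions."""
    return [i * SLOT_MS for i in range(k)]


def cycle_duration_ms(k):
    """Total duration of one cycle with k terminals."""
    return k * SLOT_MS


def reported_values(k):
    """Number of time differences exchanged over Bluetooth when all terminals see each other."""
    return k * (k - 1)


print(transmission_schedule(4), cycle_duration_ms(4), reported_values(4))
```

With five terminals, for example, a cycle lasts 2 seconds, which gives an idea of how far users may move while the cycle is running.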

The received audio signal is filtered to remove out-of-band interference and noise to the best possible extent. The filtered signal is used to determine the in-band interference and noise level N 0 and is furthermore correlated with the filtered ranging signal. For simplicity, the further exposition focuses on spread spectrum signals. In a first step the I and Q components of the correlation C(∆τ ) are computed at intervals of T c /2 with T c = 1 ms denoting the chip duration. The result is searched for the delay leading to the maximum norm |C(∆τ )|. Although the implementations in widely used phones do not seem to require it, frequency offsets may be searched as well. This allows acquiring the signal, which may or may not be present. Thus, it is sufficient to search for the delay (and frequency offsets) leading to the maximum norm from early to late. The latter ordering is to avoid locking on an echo. If the signal-to-noise ratio is above the expected threshold, the signal is assumed present. In this case, a successive refinement of the result is performed in a DLL type of processing. The power discriminator D P (∆τ ) = |R(∆τ + δ)| 2 − |R(∆τ − δ)| 2 is used to iteratively increase/reduce the delay ∆τ depending on the value of D P (∆τ ) ≷ 0. In this equation δ is half the correlator spacing and is expressed as a fraction ∆ of the chip duration: δ = ∆T c . We will restrict ourselves to ∆ = 1. A further optimization is possible, see Betz and Kolodziejski [14], [15]. The uncertainty of the delay estimate ∆τ due to noise is given by (see Dierendonck, Fenton and Ford [16]):

In this expression, E i is the signal energy accumulated during the correlation, and N 0 is the spectral noise density of the audio noise and interference. The latter quantity is estimated using the norm of the filtered I and Q samples of the incoming signal:

with N denoting the number of samples and with B S denoting the bandwidth of the passband filter. This estimate is performed ahead of time and is used for setting the volume of the transmission, such that E i /N 0 = 6 dB at 4 meters. At this level the signal can be acquired, and Equation (25) implies that σ ∆τ ≈ T c /4, which corresponds to 9 cm. At 2 meters, this is half that value, i.e. 4.5 cm. The calibration of the transmit power may be performed by listening to one's own beacon. This allows detecting whether the user is inadvertently covering the microphone or the speaker, which should trigger a request to the user to remove the blockage. The distribution of audio ranging measurements is Gaussian with a standard deviation given by Equation (25). This allows computing π md , i.e. the probability of deciding against ĉ 1 , as a function of the distance d ≤ d c :

and π f a , i.e the probability of wrongly deciding in favor of c 1 , for distances d > d c :

Note that the symmetry of lognormal fading between π md (d) and π f a (d 2 c /d) is lost. The plot for audio ranging, corresponding to σ ∆τ = 5 cm, is shown in Figure 6. Again one might evaluate the average rate of missed detection and of false alarm as in Equation (22). In this case, the averaged probability of missed detection becomes π md,av = 0.016. In the present case, the number of measurements is primarily limited by the acoustical disturbances associated with the transmission of the beacon. The number of measurements n used for taking a decision is always 1. Furthermore, the number of measurements x 0 per 15 minutes must also be small for the same reason. With x 0 = 3, the reduction of the spreading rate of the disease is x 0 π md,av < 0.05, which is a low figure. The probability of false alarm described by Equation (27) decays so quickly that it is insignificant at d = d c + δ, i.e. π f a (d c + δ) ≈ 0. The same applies for the integration over a two-dimensional plane according to Equation (23).
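These figures can be reproduced approximately with a simple Gaussian sketch. It assumes that a contact is declared whenever the measured distance does not exceed d c , that the range error has a constant standard deviation of 5 cm, and that users sit at the radii where πr 2 reaches an integer, as in the Bluetooth analysis. The function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def p_md_audio(d, d_c=2.0, sigma=0.05):
    """Probability that a user at d <= d_c is measured beyond d_c."""
    return phi(-(d_c - d) / sigma)


def p_fa_audio(d, d_c=2.0, sigma=0.05):
    """Probability that a user at d > d_c is measured within d_c."""
    return phi(-(d - d_c) / sigma)


def p_md_audio_average(d_c=2.0, sigma=0.05):
    """Mean missed-detection probability over users at r_m = sqrt(m / pi)."""
    m_c = math.floor(math.pi * d_c ** 2)
    radii = [math.sqrt(m / math.pi) for m in range(1, m_c + 1)]
    return sum(p_md_audio(r, d_c, sigma) for r in radii) / m_c


delta = 1.0 / math.sqrt(math.pi)   # radius of a person occupying 1 square meter
print(p_md_audio_average())        # close to the averaged value quoted above
print(p_fa_audio(2.0 + delta))     # negligibly small
```

Under these assumptions almost the entire average stems from the outermost user, who is less than 5 cm inside the critical distance.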

The present discussion was about the contributions of uncertainty due to signaling. Additionally, the relative geometry of the microphones and speakers may add some bias, which may lead to a shift of the border to a contact zone by a few centimeters. This is rather uncritical, however. The important conclusion is that audio ranging provides sharp results. This form of ranging might thus be activated whenever the information gained by Bluetooth measurements may lead to a wrong conclusion.
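The delay refinement with the power discriminator D P , described earlier in this section, can be illustrated by a noise-free sketch. It assumes an ideal triangular code autocorrelation of half-width T c and uses a sign-driven update with a halving step, which is an illustrative choice and not necessarily the loop used in an actual receiver. In the sketch the true delay stands in for the measured correlation; all names are hypothetical.

```python
def triangle(offset, t_c):
    """Ideal (assumed) code autocorrelation: a triangle of half-width t_c."""
    return max(0.0, 1.0 - abs(offset) / t_c)


def power_discriminator(delay, true_delay, t_c, delta):
    """D_P = |R(delay + delta)|^2 - |R(delay - delta)|^2 without noise."""
    late = triangle(delay + delta - true_delay, t_c) ** 2
    early = triangle(delay - delta - true_delay, t_c) ** 2
    return late - early


def refine_delay(coarse_delay, true_delay, t_c=1e-3, spacing=1.0, iterations=30):
    """Increase the delay when D_P > 0 and reduce it when D_P < 0."""
    delta = spacing * t_c       # delta = Delta * T_c with Delta = 1
    delay = coarse_delay
    step = t_c / 4.0            # half the spacing of the T_c / 2 search grid
    for _ in range(iterations):
        d_p = power_discriminator(delay, true_delay, t_c, delta)
        if d_p > 0.0:
            delay += step
        elif d_p < 0.0:
            delay -= step
        step /= 2.0
    return delay


true_delay = 5.83e-3    # seconds, roughly 2 m of propagation
coarse = 6.0e-3         # nearest point of the T_c / 2 acquisition grid
estimate = refine_delay(coarse, true_delay)
print(abs(estimate - true_delay) * 343.0)   # residual error in meters
```

Without noise the residual error is limited only by the number of iterations; with noise, the achievable accuracy is governed by the standard deviation discussed above.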

V. ATTITUDE SENSING

This section is more a reference to options that may be considered. The benefits will become visible in the qualitative discussion of Section VI. Earth gravity in the − e z direction, i.e. towards the center of the earth, and the magnetic field in the direction of e mN , i.e. towards magnetic North, provide two directions that enable attitude determination. Both are seriously disturbed in ways that depend on the environment. A number of authors have investigated the quality of attitude sensing, both using algorithms built into smartphones and using their own estimation algorithms. Michel and co-authors summarize a number of findings [17]. They report an accuracy of 6° with a sampling rate of 40 Hz whenever the smartphone is kept in a relatively calm position (front pocket, texting or phoning). These results apply to their own algorithms "MichelObsF" and "MichelEkfF." They did not study the behavior in a train, which is a particularly difficult environment, with many sources of acceleration due to the track geometry, to passing switches or simply to irregularities in the tracks themselves. Similarly, the magnetic field in trains is modulated by electrical motors, permanent magnets and large currents. On the other hand, people sitting or standing next to each other are likely to be affected in a similar manner. Exploiting the latter property, however, requires the use of a common standardized algorithm and precise time stamping of measurements.

Carrying the smartphone by letting it hang from one's neck leads to two stable orientations, one with the display facing the chest and one with the display facing ahead. The resolution of the associated ambiguity is rather straightforward, at least as long as people do not predominantly walk backward. Alternatively, the cameras could be used for determining the orientation, since the brightness of the pictures is very different. Pitch angles are suppressed by gravity, as long as people do not bend backwards, which is unnatural. Roll angles may occur if one strap is shorter than the other one. They are compensated by sensing earth gravity. In our opinion the context of COVID-tracing is quite favorable to the use of relative attitude estimation, which would provide an interesting complement to Bluetooth sensing and/or acoustic ranging. This needs to be developed, however.

The definition of a Category 1 contact by the Robert Koch Institute [18] includes three elements:

• an accumulated duration of 15 minutes, which can easily be metered,
• a distance of less than 2 meters, which is more difficult to establish,
• and the concept of being face-to-face, discussed below.

From the previous sections, especially Sections II and III, we learned that under idealized conditions, Bluetooth RSSI measurements provide an adequate estimation of the distance between two fellows, or more exactly an estimate of whether B is in the critical zone of A. The probability of missed detection was found to be a critical performance measure. Audio ranging was found to be an interesting complement to Bluetooth measurements, in particular if the latter measurements are disturbed by shadowing or multipath. Audio measurements provide a comparatively sharp answer, and may be used to calibrate past and future Bluetooth RSSI measurements. They may be audible and thus annoying for younger people, as well as for dogs and other animals. As a consequence, it is beneficial to keep them sparse. In Section V, we very briefly addressed the use of attitude sensing.

In this section, we shall superficially address the potential of combining these measurement types. For this discussion, it is meaningful to differentiate different poses, as shown in Figure 7 . A selection of essential poses of two fellows in close proximity is shown in a top view. Fellow B is infected and exhales air charged with microscopic droplets carrying the virus. Fellow A inhales the droplets. Pose (a) in Figure 7 is what everyone would agree to call a face-to-face situation. It is the type of situation, which occurs during a meeting, lunch or in public transportation for people sitting or standing opposite to each other. It might also occur when desks are facing each other and in some other special situations. Pose (b) occurs in public transportation, in queues as well as in lecture halls, concert halls, cinemas or the like. It also appears dangerous, although Fellow B needs to be closer for that, but this might often be the case. However, unless B stands and is much taller than A, the air flow will only partially reach A's nose and mouth. A further specification by medical authorities would be helpful in this case. Pose (c) occurs in similar situations as Pose (b). Pose (d), (e) and (f) occur during meetings both while standing and sitting, in public transportation and some other situations. Pose (c) and (d) do not appear too critical, although B is likely to turn his head from time to time, which is not detected by the sensors considered. Pose (d), (e), and (f) are difficult to differentiate even using perfect ranging and orientation.

Assuming that there is no specific direction in the air-flow, due to wind or draft, and that the different poses can be differentiated, medical requirements would probably choose

• Pose (a), (d), and (e) to be Category 1, i.e. critical,
• Pose (b) would be critical for a lower distance which might depend on the height differences,
• Pose (c) and (f) would be essentially uncritical.

The possibility to discriminate the cases depends on the type of sensing, as described so far, and is discussed in the following three sections.

A. Bluetooth-only Measurements

BLE RSSI measurements will return similar results for the Poses (a), (c), (d), and (e). The distance d between the fellows might appear larger in Pose (f) than it actually is. This is uncritical, however. In Pose (b), the received power will be associated with a larger distance than the actual one, as well. Depending on how Pose (b) is classified, this leads to a missed detection. A similar situation may also occur in Pose (e) whenever Fellow A obstructs the line of sight with his left arm, e.g. by holding on to a bar in public transportation. All missed detection events are critical since they leave close encounters undetected. Finally, the Poses (c) and (d) will typically generate false alarms, which send people to quarantine and testing. This sort of differentiation has not been considered so far, at least to our knowledge.

The addition of attitude sensing allows separating the case of "Pose (b) with a small distance" from that of "Pose (a) with a large distance". Thus, one might use a lower threshold in the case of an aligned attitude and thus avoid the missed detection events in Pose (b). With a lower threshold, however, fellows in Pose (c) will be identified as C 1 up to a rather large relative distance, potentially generating many false alarms.

An extensive use of audio ranging would eliminate most false alarms. It would implement the conditions of Category 1 without the alleviation due to the condition of being face-to-face. When combined with the other measurements, audio measurements provide additional discrimination and allow reducing the rate of missed detection and false alarms. In reality, acoustical signals are subject to multipath, which might be critical if the direct path is strongly attenuated. Since the receiver searches from early to late, it is unlikely to be misled, however, as long as the direct path can still be detected.

VII. CONCLUSIONS

A number of scientists have mentioned difficulties in Bluetooth RSSI-based ranging in oral communications. The significant attenuation by the human body and other influencing factors, such as keys, coins, metallic pens, business card holders and the like, make the power levels very unpredictable. We thus propose to standardize the wearing of smartphones or alternative devices on the chest, when not held in the hand or used for making phone calls. This provides an environment that is much better defined for Bluetooth RSSI-based ranging, audio ranging and attitude determination. Currently, we don't see an alternative setting to the present one that allows for an analysis of the tracing performance in terms of identifying Category 1 contacts and avoiding unduly frequent alerts for contacts that are not Category 1. The analysis shows that the accumulated statistics require low figures for the per-event missed detection rate. This can be achieved with measurements every few seconds aggregated into decisions every few minutes, which is adequate for stable distributions of people, such as in a meeting, at lunch, in a train and the like. The false alarm rate is a lesser problem as soon as a few measurements are aggregated. The analysis presented in the paper is a preliminary one. Much more experimental data should be generated to refine the findings. In Germany, the current probability of encountering an infected person is rather low. In such a context the performance does not matter too much. There are many regions in the world where this is not the case, however. It would thus be quite beneficial if this work was taken up and further developed, in particular with respect to attitude sensing. Some individuals may reject the idea of carrying their smartphone around their neck. This could be addressed by producing decorative gadgets which are less obstructive to wear. Beyond that, carrying a device around the neck also enables the use of the camera. This would allow further refining the evaluation of the risk, but would drain the batteries much more and would raise concerns about privacy. Thus, the use of the sensors addressed in the present paper seems to remain most promising. In the future, Bluetooth ranging should be considered as well. The complete analysis of the paper and its validity rely on the current model of infection of the Robert Koch Institute.

Can China's COVID-19 Strategy Work Elsewhere

Slowing the Spread of Infectious Diseases Using Crowdsourced Data

Tracing Contacts to Control the COVID-19 Pandemic

Give more Data, Awareness and Control to Individual Citizens, and They Will Help COVID-19 Containment

Apps Gone Rogue: Maintaining Personal Privacy in an Epidemic

A flood of coronavirus apps are tracking us. now it's time to keep track of them

On BLE Proximity Detection Performance for COVID-19 Contact Tracing

Audio Ranging for COVID-19 Contact Tracing

Bluetooth contact tracing needs bigger, better data

Identifying the BLE Advertising Channel for Reliable Distance Estimation

The Indoor Radio Propagation Channel

Théorie des Distributions

Generalized Theory of Code Tracking with an Early-Late Discriminator Part I: Lower Bound and Coherent Processing

Generalized Theory of Code Tracking with an Early-Late Discriminator Part II: Noncoherent Processing and Numerical Results

Theory and Performance of Narrow Correlator Spacing in a GPS Receiver

On Attitude Estimation with Smartphones

Kontaktpersonennachverfolgung bei Respiratorischen Erkrankungen durch das Coronavirus SARS-CoV-2

The authors would like to thank Dr. Armin Dammann from the German Aerospace Center (DLR) for comments on Section III-A and for providing us with early results from the evaluation of the experiments.