key: cord-0158476-1mrsvg3b
authors: Ng, Pai Chet; Spachos, Petros; Gregori, Stefano; Plataniotis, Konstantinos
title: Epidemic Exposure Notification with Smartwatch: A Proximity-Based Privacy-Preserving Approach
date: 2020-07-08
journal: nan
DOI: nan
sha: 83d08a7ea67b5988a7c1084deef5844710363f00
doc_id: 158476
cord_uid: 1mrsvg3b

Businesses planning for the post-pandemic world are looking for innovative ways to protect the health and welfare of their employees and customers. Wireless technologies can play a key role in assisting contact tracing to quickly halt a local infection outbreak and prevent further spread. In this work, we present a wearable proximity and exposure notification solution based on a smartwatch that also promotes safe physical distancing in business, hospitality, or recreational facilities. Our proximity-based privacy-preserving contact tracing (P$^3$CT) leverages Bluetooth Low Energy (BLE) technology for reliable proximity sensing, and an ambient signature protocol for preserving identity. Proximity sensing exploits the received signal strength (RSS) to detect users' interactions and classify them as low- or high-risk with respect to a patient diagnosed with an infectious disease. More precisely, a user is notified of their exposure based on their interactions, in terms of distance and time, with a patient. Our privacy-preserving protocol uses the ambient signatures to ensure that users' identities remain anonymous. We demonstrate the feasibility of our proposed solution through extensive experimentation.

Many industries suspended their daily operations in response to government efforts to contain the COVID-19 pandemic. Given the urgency of resuming daily routines, several countries have started to relax restrictions so that some industries can resume operation and bring their employees back to normal activities.
However, each industry is expected to implement preventive measures to minimize the risk of further outbreaks. Among those preventive measures, such as temperature checks, face coverings, and frequent hand washing, contact tracing is deemed essential for monitoring the interactions between individuals and thus providing an immediate alert to all those who were exposed when someone is diagnosed with an infectious disease [1], [2]. An efficient contact tracing approach needs to address how to monitor the interactions between employees and customers and how to alert exposed individuals while preserving the anonymity of the patient. While there are many smartphone-based contact tracing systems (e.g., Pan European Privacy-Preserving Proximity Tracing (PEPP-PT) [3], COVID-19 Watch [4], and Privacy-Preserving Automated Contact Tracing (PACT) [5]), these solutions might not be effective in a workplace because employees do not necessarily carry their smartphones at all times, owing to the nature of their activities. Furthermore, many people put the smartphone inside a pocket or backpack, which makes reliable proximity sensing more difficult. What is needed is an effective and low-cost contact tracing solution that employees can use without affecting their activities and that provides line-of-sight (LOS) signals for more accurate proximity sensing. Motivated by this limitation, this paper proposes a wearable contact tracing solution based on a low-cost smartwatch, namely proximity-based privacy-preserving contact tracing (P$^3$CT). First, we exploit the proximity sensing information computed from the received Bluetooth Low Energy (BLE) signals to monitor the interaction between employees [6]. Second, we design a privacy-preserving protocol in which the BLE packet carries an ambient signature rather than the employee's identity or location-related information.
The main framework describing the contact tracing based on BLE technology is shown in Fig. 1. Each smartwatch broadcasts a BLE packet periodically according to a system-defined interval. Rather than using the conventional two-way BLE communication channels (i.e., a secure channel for data exchange established through a series of pairing and handshaking processes), the smartwatch uses a non-connectable advertising channel, which was primarily used by beacon-based applications, to broadcast the packet. Hence, it is almost impossible for any malicious device to connect to the smartwatch to access sensitive information. When two users are in proximity to each other, that is, when the smartwatches are within the broadcasting range, they can listen to the incoming packet and measure the received signal strength (RSS). The smartwatches log the packet, including the measured RSS value, into their local storage, as shown in Fig. 1(a). The packet contains the ambient signature information observed by the individual smartwatch at particular timestamps. When an individual is diagnosed with an infectious disease, as shown in Fig. 1(b), the smartwatch uploads the individual's own signatures generated over the past 14 days to the signature database. All the other employees download the infected signatures into their smartwatches for signature matching. In other words, the signature matching process takes place in the individual smartwatch rather than the cloud server. In this case, there is no way for others to know who has come into close contact with the infected person. The smartwatch automatically triggers an alert when it finds a matched signature. Based on the alert, the individual is informed about the necessary actions, such as self-quarantine, testing, follow-up, and monitoring, so that further spread of this highly contagious disease can be prevented.

Fig. 1. When employees A and B are in proximity to each other, their smartwatches will log the received BLE packet containing the ambient signature information into their watch's local storage. When employee A is diagnosed with an infectious disease, the watch will upload his/her own signatures to the signature database. All the other employees can download those signatures and compare them to a list of signatures they have observed in the past 14 days. An alert will be triggered when the downloaded signatures match one of the signatures on the list.

While it is relatively straightforward to develop such an application for the smartwatch for contact tracing purposes, it remains unclear how accurate the proximity sensing information estimated through the RSS value is and how the ambient signature information can help to prevent information leaks. Furthermore, rather than a simple alert when there are matched signatures, it is more effective to tell individuals their exposure risk level based on how close and how long their exposure was. This is because the infection risk of an individual who spent less than one second in close proximity to the infected employee is low compared to that of an individual who spent more than one hour in not so close proximity, yet still relatively near (i.e., with the smartwatch still in the broadcasting range), to the infected employee. Recognizing the above challenges, we carefully developed our proposed P$^3$CT, which makes the following contributions:
• Accurate proximity sensing: this is the first work that provides a comprehensive investigation of the performance of proximity sensing based on the RSS values measured by a smartwatch worn by individuals. While RSS suffers severe attenuation due to the human body, our empirical analysis verified that we can achieve satisfactory performance with a carefully implemented machine learning method. Since the smartwatch is always worn on the wrist, it is less challenging for the smartwatch to provide suitable LOS signals compared to smartphones.
• Low-cost device: Being a low-cost commercial off-theshelf device that is equipped with essential BLE technology, the smartwatch has become an ideal solution for privacy-preserving contact tracing. The widely available software development support allows the workplace to provide quick yet reliable prototypes for testing prior to workplace reopening. • Risk classification: we define the exposure's risk of a user based on the interaction duration and distance with the infectious individual. In contrast to most works that simply rely on the RSS value as an input feature to train a classification model, we explore other possible input features including the number of samples observed by the smartwatch, the maximum RSS, the minimum RSS, and the RSS range. Our experiment unveils the effects of selecting the right features on classification performance. • Real-time notification and dataset: Our developed application can provide real-time notification when the physical distance is violated. Also, our experimental results were validated with a real-world dataset that was collected with a smartwatch worn on the human wrist. The rest of the paper is organized as follows. Section II provides the background related to contact tracing and discusses its current development. Section III presents our proposed P 3 CT. Section IV describes the method to classify the risk level. Section V discusses our experimental evaluations. Section VI concludes the paper with future works. Recognizing the urgency to have an effective contact tracing system, various digital-based solutions, either based on a smartphone or a smart wearable, have emerged lately. During an epidemic of a highly contagious disease such as COVID-19, it is very likely for anyone to contract the virus when they interact with an infected individual in close proximity for a very long time. 
Contact tracing aims to trace down this group of people so that they can be aware of their exposure to the virus and take the necessary action as soon as possible. We can divide the contact tracing into two major phases: 1) Interacting Phase: The interacting phase keeps track of the daily contacts including distance and duration. A contact tracing system should be able to detect when any two persons are in proximity to each other at the same time keeping track of the duration they remain in close proximity. An effective contact tracing system should be able to detect the proximity with high accuracy rather than seeking to estimate the exact distance, which is quite challenging considering the dynamic movement of humans. 2) Tracing Phase: When a person is diagnosed with the infectious disease, we need to trace down a list of people who have been in close contact with the infected person because they are more likely to get affected. If this group of people can be notified promptly, we reduce the chances for the virus to continue to spread to others. However, many people are concerned about exposing their identity during the tracing phase. Hence, a privacy-preserving contact tracing system should provide these two pieces of information without disclosing one's identity. The traditional contact tracing is conducted manually through in-person interviews and investigations. Such a manual method based on subjective feedback (i.e., feedback from the infected individual) is unable to gather the precise interaction distance and duration. Furthermore, the investigator might acquire some sensitive information to identify those people who have come into close contact. In contrast to the manual contact tracing, the digital tools for contact tracing can provide more precise information regarding the interaction distance and duration, at the same time preserving the individual's privacy. 
To date, the digital-based contact tracing can be categorized into smartphone or smart wearable-based: 1) Smartphone-Based Contact Tracing: The pervasiveness of smartphones has made smartphones the most popular choice when comes into the digital-based contact tracing system. The rich sensing features embedded in the smartphone provides a better estimation of interaction distance and duration [7] - [9] . For example, many works leverage location sensing [10] and proximity sensing [3] to keep track of the interaction between any two individuals. There are also works exploiting the heterogeneous sensing features in a smartphone to improve the distance estimation [11] . However, most of these works fail to consider the location of the smartphone during the interacting phase. While people might carry their smartphone with them during grocery shopping, the smartphone will be inside a pocket or a purse most of the time. Hence, the distance estimation is more complex and can be highly inaccurate for a contact tracing application. At the same time, people might not carry the smartphone with them all the time while working. 2) Wearable-Based Contact Tracing: Considered the inconsistency of smartphones, some companies have started to exploit the smart wearable approach to contact tracing. The goal is to resume the working routine with less distraction. For example, EasyBand [12] presents a wearable solution to auto contact tracing while encouraging safe social distancing practice during interaction. However, EasyBand uses a centralized server for contact tracing, in which all the users' data is uploaded to the cloud through TCP/IP connection. Such a centralized approach is not scalable as all the computations to find the close contact for all the workers are performed within the server. Furthermore, there is a high possibility of information leak if the server is compromised. 
Our proposed P 3 CT, on the other hand, provides a privacy-preserving contact tracing by keeping no individual information on the cloud server. Recognizing the importance of contact tracing in resuming the normal lifestyle while preventing the further virus outbreak, both government and academia have devoted efforts in developing a more effective contact tracing solution to fight against COVID-19. 1) National-Level Efforts: China, South Korea, and Singapore are among the first countries enforcing digital tools for contact tracing. China leverages its existing surveillance strategy to implement a close contact detector based on QR codes technology [13] . South Korea leverages the location data (i.e., the GPS data) from the smartphone to detect the distance of the users and push a notification containing personal details of the infected individuals to the nearby users [10] . Singapore developed the TraceTogether application based on BLE signals on the smartphone to detect the proximity between any two individuals [14] . While the methods applied by China and South Korea might be less strict on user's privacy, Singapore adopted a more privacy-preserving approach by only tracking the proximity between users without explicit location information. 2) Academia-Level Efforts: There have been a number of initiatives from industry and academia researchers in delivering an effective contact tracing solution while preserving user privacy [15] , [16] . For example, Pan European Privacy-Preserving Proximity Tracing (PEPP-PT) detects the proximity based on the broadcast BLE packet containing a full anonymous ID [3] . COVID-19 Watch provides automatic alert the user when they are in contact with the infected individual [4] . The Privacy-Preserving Automated Contact Tracing (PACT) exploits the BLE signals in combination with secure encryption to detect possible contacts while protecting users' privacy [5] . 
Most of these initiatives assume that BLE signals will work for proximity detection, yet no existing work provides a comprehensive study of the accuracy of using BLE signals for proximity sensing. To bridge the gap, this paper presents extensive experiments to validate the feasibility of using BLE signals for proximity detection.

Our proposed P$^3$CT leverages the BLE technology available on the smartwatch for proximity sensing. To achieve privacy-preserving contact tracing, we adopt the following signature protocol to define the BLE advertising packet.

As a popular short-range communication technology over the 2.4 GHz ISM band [17], BLE is readily available in many smart devices including smartwatches, earphones, smart thermostats, etc. [18], [19]. BLE communicates through either non-connectable advertising or connectable advertising [20]. The latter advertising mode allows another device to request a secure connection through handshaking. Our proposed P$^3$CT uses the non-connectable advertising mode, which rejects any incoming connection requests [21], as the main feature for exposure notification purposes. The non-connectable advertising mode allows the smartwatch to broadcast a short advertising packet periodically according to the system-defined advertising interval, $T_a$. Upon receiving the packet, the smartwatch can measure the RSS and use it to estimate the proximity. More precisely, the RSS is inversely proportional to the square of the distance according to the inverse square law [22], [23]:

$$P_r \propto \frac{1}{d^{n}},$$

where $P_r$ is the signal strength in dBm, $d$ is the distance between any two smartwatches, and $n$ is the path loss exponent ($n = 2$ in free space). Even though the RSS-distance relationship holds for a signal in free space, the RSS values vary greatly in practical environments owing to multipath [24] and body shadowing effects [25], [26]. We can minimize the signal variation by applying signal filtering methods, such as a moving average. As shown in Fig. 2, the RSS values at each distance are more distinct and show less variation when a moving average is applied than with the raw RSS data. While we can set a cut-off threshold, for example, treating any value greater than −75 dBm as close proximity, such a thresholding approach results in a high false-negative rate with raw RSS values (i.e., the system does not record the contact as close proximity) and a high false-positive rate with filtered RSS values (i.e., the system records the contact as close proximity when it is not). Rather than using a thresholding approach, we apply machine learning methods to proximity sensing and further classify the sensing output into high-risk and low-risk.

We propose a signature protocol that constructs a signature vector that fits into the length-constrained advertising packet (i.e., the available payload is only 31 bytes). Specifically, each smartwatch is configured to execute the following functions:
i. Signature generation: The smartwatch scans for the ambient environmental features. These features are selectively processed to generate a unique signature that fits into the 31-byte advertising payload. The signature is updated every few minutes.
ii. Signature broadcasting: The smartwatch broadcasts the advertising packet containing the unique signature periodically according to the advertising interval $T_a$. The packet is broadcast through the non-connectable advertising channels.
iii. Signature observation: The smartwatch scans the three advertising channels to listen for the advertising packets broadcast by the neighboring smartwatches. The scanning is performed in between the broadcasting events.
The signature is a 31-dimensional transformed vector containing the ambient environmental features.
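The observation and matching steps of the protocol can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class `SignatureLog` and its methods `record`, `prune`, and `match` are hypothetical names, and representing a signature as a `bytes` object of at most 31 bytes is an assumption based on the payload size stated above. Only the 31-byte payload limit and the 14-day window come from the text.

```python
from dataclasses import dataclass, field

PAYLOAD_BYTES = 31                    # advertising payload size stated in the text
RETENTION_SECONDS = 14 * 24 * 3600    # 14-day matching window stated in the text


@dataclass
class SignatureLog:
    """Hypothetical on-watch store of signatures observed from nearby watches."""
    # Maps signature -> list of (timestamp in seconds, RSS in dBm) observations.
    observed: dict = field(default_factory=dict)

    def record(self, signature: bytes, timestamp: float, rss_dbm: int) -> None:
        """Log one received advertising packet."""
        if len(signature) > PAYLOAD_BYTES:
            raise ValueError("signature must fit in the 31-byte payload")
        self.observed.setdefault(signature, []).append((timestamp, rss_dbm))

    def prune(self, now: float) -> None:
        """Drop observations older than the 14-day retention window."""
        cutoff = now - RETENTION_SECONDS
        for sig in list(self.observed):
            kept = [(t, r) for (t, r) in self.observed[sig] if t >= cutoff]
            if kept:
                self.observed[sig] = kept
            else:
                del self.observed[sig]

    def match(self, infected_signatures) -> dict:
        """Return local observations whose signature is in the downloaded set."""
        infected = set(infected_signatures)
        return {s: obs for s, obs in self.observed.items() if s in infected}
```

Because `match` runs on the watch against locally stored observations, the server in this sketch only ever sees the signatures uploaded by the diagnosed user, which mirrors the on-device matching described in the text. The returned observations keep their timestamps and RSS values so that a later step could assess exposure duration and proximity.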
Upon the generation of the signature, the smartwatch will encapsulate the signature information into its advertising packet and broadcast the packet through the non-connectable advertising channels. The nearby smartwatches can see the packet when they scan on those advertising channels where the packet is transmitted. The timing diagram for the advertising, scanning, and signature generation activities, in which each activity is triggered periodically according to its interval, i.e., generation interval T g , advertising interval T a , and scanning interval T s , is shown in Fig. 3 . Given T s , the smartwatch will only stay active to listen for the incoming packet for a duration defined by the scanning window T w . While it is possible to use a continuous scanning (i.e., by setting T w = T s ) to increase the packet receiving rate, such a scanning approach has an adverse effect on the energy consumption. IV. RISK CLASSIFICATION Besides using proximity sensing to detect the approximate interaction distance between any two individuals, we also consider the interaction time when labeling the individual into low or high risk. Proximity sensing has been employed in many scenarios, for example, to identify the user proximity to museum collection [27] , gallery art pieces [28] , etc. Some works study the proximity detection in a dense environment [29] , or proximity accuracy with filtering technique [30] . However, most of these works study the proximity detection between a human and an object with an attached BLE beacon [31] . So far, there is no work studying the proximity sensing between devices carried by two humans. While estimating the distance can help to check if the user maintains a safe physical distancing, an exact 2 m distance should not be a rigid requirement in classifying the risk of a user. Rather, we are more interested to know the proximity between any two workers, and how long they remain in proximity. 
Then, we can forward these pieces of information to the epidemiologists and they can decide to classify a contact as high or low risk. BLE is an excellent technology for the above purpose since BLE is a short-range communication that can only be heard when two smartwatches are in the communication range of each other. Upon receiving the advertising packet, the smartwatch can measure the RSS and thus estimate its proximity to the nearby smartwatch. We classify the proximity into two classes, i.e., far and close. We define close proximity when the distance between any two smartwatches is less than 2 m, and any distance greater than 2 m but less than the broadcasting range is considered far. In other words, the two smartwatches are not in proximity if they are outside the broadcasting range of each other. The RSS distributions for far and close proximity is shown in Fig. 4 . It is clear that there would be errors if proximity were decided by simply setting an RSS threshold. For example, if we set anything above −80 dBm as close proximity, chances are some values greater than −80 dBm are from the smartwatch located at a distance greater than 2 m. Hence, it is unreliable to identify the risk of an individual simply based on the proximity. At the same time, some individuals might come in very close proximity when they pass by each other. Hence, we also consider the interaction time when we want to identify the risk of an individual. While it is more likely to be infected when the individual is in close proximity to the infected person, the risk of getting where H + denote the hypothesis that the user belongs to the high-risk (+1) group, H − the hypothesis that the user belongs to the low-risk (−1), and H 0 the null hypothesis. Specifically, the null hypothesis happens when the user is risk-free, i.e., the user is outside the communication range of the infected person. 
Obviously, miss detection is undesirable because the user might be at risk while the system considers the user safe. A false-negative misclassifies a high-risk user as low-risk, which may give the user the wrong impression that their probability of getting infected is low when it could actually be high. A false-positive, which misclassifies a low-risk user as high-risk, is more conservative and is a relatively safer outcome than miss detection and false-negative. We can apply supervised machine learning methods to train a classification model. However, supervised methods require a set of labeled data, which is not readily available. To address this problem, we developed an application on the smartwatch to collect the BLE data. Given the collected data, we can train a classification model, as shown in Fig. 6. During the training phase, the data is split into training and validation sets before being fed to model learning. The objective is to learn a set of weights that fits the hypothesis function $R(\mathbf{x}, C)$ defined by the corresponding classification model $C$. Validation is performed to evaluate the learned model and to prevent the model from overfitting. If necessary, model fine-tuning can be performed to improve the classification performance. Mathematically, the learning process aims to fit the risk mapping function $R : \mathbf{x} \to y$ given a set of $n$ training samples $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)\}$, where $\mathbf{x} = (x_1, \ldots, x_m)^T$ is an $m$-dimensional feature vector and $y \in \{+1, -1\}$ is the classification output indicating the risk of a user. In this paper, we exploit four types of classifiers: decision tree (DT), linear discriminant analysis (LDA), naïve Bayes (NB), and k-nearest neighbors (kNN).

1) DT: A top-down approach is commonly used to learn a classification tree. More precisely, DT starts by choosing a feature from the feature vector that provides the best split with respect to the target risk label, and then repeats the same splitting procedure for each separate branch until it reaches a final decision. Let $\theta = (x, \gamma)$ be the splitting rule given feature $x$ and threshold $\gamma$; we can split the $n$ samples of training data $\mathcal{T}$ into two subsets, i.e.,

$$\mathcal{T}_l(\theta) = \{(\mathbf{x}, y) \in \mathcal{T} : x \le \gamma\}, \qquad \mathcal{T}_r(\theta) = \mathcal{T} \setminus \mathcal{T}_l(\theta),$$

where $\mathcal{T}_r$ and $\mathcal{T}_l$ are the resultant subsets representing the data for the right and left branches, respectively. The measure commonly used to govern the splitting rule is the Gini impurity $G(\cdot)$, which tells how likely the model is to produce a misclassification if it predicts the labels based on the label distribution from a randomly chosen feature. Mathematically, the Gini impurity can be computed as follows:

$$G(\mathcal{T}, \theta) = \frac{n_l}{n} H(\mathcal{T}_l(\theta)) + \frac{n_r}{n} H(\mathcal{T}_r(\theta)),$$

where $n_l$ and $n_r$ are the number of training samples for each subset, and $H(\cdot)$ is the entropy function, i.e.,

$$H(\mathcal{T}) = -\sum_{y \in \{+1, -1\}} p_y \log p_y,$$

and $p_y$ denotes the probability of correct classification. Let $I(\cdot) \in \{1, 0\}$ be the indicator function and $\tilde{y}$ be the predicted output; then we have

$$p_y = \frac{1}{n} \sum_{i=1}^{n} I(\tilde{y}_i = y).$$

The objective of DT is to find the parameters that produce the best splitting rule, i.e.,

$$\theta^* = \arg\min_{\theta} G(\mathcal{T}, \theta).$$

2) LDA: Assuming that the covariance for each class is the same, LDA learns a classifier by fitting a Gaussian density to each class. Let $P(\mathbf{x} \mid \tilde{y} = y)$ be the conditional distribution for each class $y \in \{+1, -1\}$; by applying Bayes' rule, we obtain

$$P(\tilde{y} = y \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid \tilde{y} = y)\, P(\tilde{y} = y)}{\sum_{y' \in \{+1, -1\}} P(\mathbf{x} \mid y')\, P(y')}.$$

Then, the class (i.e., the risk) can be determined by selecting the output with the highest posterior probability.

3) NB: Following the naïve assumption that the features are conditionally independent given the class, we can apply Bayes' theorem to learn a classification model. By simplifying $P(x_i \mid y, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_m)$ to $P(x_i \mid y)$, we have

$$P(y \mid x_1, \ldots, x_m) = \frac{P(y) \prod_{i=1}^{m} P(x_i \mid y)}{P(x_1, \ldots, x_m)}.$$

Since $P(y \mid x_1, \ldots, x_m)$ is proportional to $P(y) \prod_{i=1}^{m} P(x_i \mid y)$, we can use maximum a posteriori (MAP) estimation to estimate the probability for each class $P(y)$ and the conditional probability of each feature given the class $P(x_i \mid y)$. The output risk can then be predicted based on the following rule:

$$\tilde{y} = \arg\max_{y} P(y) \prod_{i=1}^{m} P(x_i \mid y).$$

4) kNN: The goal of kNN is to maximize the probability of correct classification. Let $p_i$ indicate the probability that training sample $i$ is classified correctly; according to the stochastic nearest neighbors rule, we have

$$p_i = \sum_{j \in \mathcal{T}_i} p_{ij},$$

where $\mathcal{T}_i$ is the subset of data belonging to the same class as training sample $i$ and $p_{ij}$ is the probability that sample $i$ selects sample $j$ as its neighbor. Given $p_i$, the goal of kNN can be defined as follows:

$$\arg\max \sum_{i=1}^{n} p_i.$$

Note that all the classifiers described above can be further extended by assuming different distribution functions. One possible direction for future work is to calibrate the classifier based on prior empirical knowledge of the distribution in a certain environment. More precisely, different environments might produce different distributions, and if we can acquire this information, it could help to better calibrate the classifier and thus improve the classification performance.

We consolidated the collected data from both smartwatches before dividing them into training and testing datasets. Then, we evaluated the experimental results obtained from different classifiers. For the experiment, we used Fossil Sport, a smartwatch based on Google's Wear OS 2.17. The smartwatch is powered by a Qualcomm Snapdragon Wear 3100 processor and has an internal memory of up to 1 GB. The 8 GB internal storage is sufficient to store the generated and observed signatures for at least 14 days. The small form factor (i.e., 1.28 in AMOLED screen with 44 mm case size and 12 mm case thickness) makes the smartwatch an ideal candidate for contact tracing in the workplace. As shown in Fig. 7, the smartwatch can trigger the alert automatically when any two smartwatches are in close proximity to each other.
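The split-selection rule described for DT in Section IV can be illustrated with a single-feature "decision stump". This is a minimal sketch under stated assumptions rather than the trained model used in the experiments (which were run in Matlab): the function names `entropy` and `best_split` are hypothetical, Shannon entropy is assumed for the impurity $H(\cdot)$, and candidate thresholds are taken from the observed feature values.

```python
import math


def entropy(labels):
    """Shannon entropy (base 2) of a list of +1/-1 labels; 0 for a pure set."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for y in (+1, -1):
        p = labels.count(y) / n
        if p > 0:
            h -= p * math.log2(p)
    return h


def best_split(values, labels):
    """Search thresholds gamma on one feature.

    Returns (gamma, G) minimizing the weighted impurity
    G = (n_l / n) * H(left) + (n_r / n) * H(right),
    where left holds samples with value <= gamma.
    """
    n = len(values)
    best = (None, float("inf"))
    for gamma in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= gamma]
        right = [y for x, y in zip(values, labels) if x > gamma]
        if not left or not right:
            continue  # a split must send samples to both branches
        g = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        if g < best[1]:
            best = (gamma, g)
    return best
```

For example, with invented mean-RSS values `[-90, -85, -82, -70, -65, -60]` and labels `[-1, -1, -1, +1, +1, +1]`, the search separates the two classes perfectly at the threshold −82. A full tree would apply the same search recursively to each branch and across all features.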
We programmed the smartwatch to broadcast the advertising packet in the background. For experimental purposes, we also programmed the application to log all the advertising packet it received at every distance. In particular, the following information will be logged: the ground truth distance, name of the smartphone, MAC address of BLE chipset, the packet payload, RSS values, time elapsed, and timestamp. The time elapsed indicates the time difference between the previous broadcast packet and the current broadcast packet, whereas the timestamp is the exact time when the smartphone received the packet. We performed the experiment by asking two volunteers to stand at a certain distance from each other, from 0.5 m up to 5 m, as illustrated in Fig. 8 . A measuring tape is used as a reference to the ground truth distance. We first performed the experiment by asking volunteer A to wear the smartwatch on her left hand, and volunteer B on her right hand (i.e., left to right (LR)). After that, we repeated the same experiment with right hand to left hand (RL), left hand to left hand (LL), and right hand to right hand (RR). Since LR and RL constitute a direct line between two smartwatches and LL and RR constitute a crosswise line, we categorize these four hand-combinations into two groups: a) direct line, and b) crosswise line. All the measurement data is saved into a "comma-separated values" (.csv) file format and exported to Matlab for training and testing. In total, we have collected 37,644 data points from all the four combinations, as shown in Table I . We consolidated the data from RR and LL into a single dataset (i.e., the crosswise dataset) and then apply an 80%-20% splitting rule to split the data into training and testing set. Similarly, we applied the same splitting rule to the consolidated data from RL and LR (i.e., the direct dataset). 
For each training and testing set, the first five columns indicate the input features and the last column is the target label (i.e., the risk). These five input features are the number of samples observed by the smartwatch, mean RSS, maximum RSS, minimum RSS, and the RSS range (i.e., maximum RSS − minimum RSS). Note that the number of samples observed by the smartwatch tells how long the smartwatches have been in proximity to each other. The final training and testing data for both sets are shared openly in our GitHub repository [32]. We used four metrics (i.e., precision ($p$), recall ($r$), F1-score ($f_1$), and accuracy ($a$)) to evaluate the performance of the classifier. Let $T_+$, $T_-$, $F_+$, and $F_-$ denote the true-positive, true-negative, false-positive, and false-negative, respectively; then the above four metrics can be computed as follows:

$$p = \frac{T_+}{T_+ + F_+}, \qquad r = \frac{T_+}{T_+ + F_-}, \qquad f_1 = \frac{2pr}{p + r}, \qquad a = \frac{T_+ + T_-}{T_+ + T_- + F_+ + F_-}.$$

Precision tells how many of the users the classifier predicted as high-risk are actually high-risk. In other words, high precision indicates that the classifier produces few false-positives, which means the classifier avoids creating unnecessary tension and anxiety. Recall, on the other hand, tells how many of the users who are in fact at high risk of being infected were predicted as high-risk. In contrast to the accuracy, which considers the number of correctly classified true-positives and true-negatives, the F1-score considers the balance of precision and recall. The F1-score is a useful metric when false-negatives and false-positives are important factors in evaluating the classifier performance. We fed the two consolidated datasets, i.e., the direct and crosswise datasets, to the four different classifiers (i.e., DT, LDA, NB, and kNN) for training. We repeated the experiment 100 times with a different set of testing data. Specifically, we randomly sampled 20% of the data from the dataset for testing purposes at every iteration. For each evaluation metric, we show the mean result and its corresponding 95% confidence interval (CI).
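The evaluation procedure above (random 80%-20% splits, four metrics, mean and 95% CI over repeated runs) can be sketched as follows. The function names are hypothetical, and the normal-approximation interval (1.96 standard errors) is an assumption, since the text does not state how the CI was computed.

```python
import math
import random
import statistics


def random_split(rows, test_fraction=0.2, rng=random):
    """Shuffle rows and return (train, test) with the given test fraction."""
    rows = list(rows)
    rng.shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def metrics(y_true, y_pred):
    """Return (precision, recall, F1-score, accuracy) for +1/-1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == +1 and p == +1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == +1)
    fn = sum(1 for t, p in pairs if t == +1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, f1, accuracy


def mean_ci95(samples):
    """Mean and half-width of a normal-approximation 95% CI (assumed method)."""
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width
```

In use, one would call `random_split` at each of the 100 iterations, train a classifier on the training part, compute `metrics` on the test part, and pass the 100 values of each metric to `mean_ci95`.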
An illustration of the F1-score distribution obtained from DT with the 100 testing sets is shown in Fig. 9. The overall mean results and 95% CIs for the direct and crosswise datasets are shown in Table II and Table III, respectively. From both tables, we can see that all the classifiers achieve satisfactory performance with high precision and recall. In other words, the classifiers did not sacrifice recall in order to achieve high precision. Hence, the F1-scores for both datasets are high. We also observed that the direct dataset gave better performance than the crosswise dataset. This can be explained by the signal attenuation that may occur when the line between the two hands is blocked by the human body. Among all the classifiers, DT achieves the best performance, with the highest precision, recall, F1-score, and accuracy. Fig. 10 shows the precision-recall curves for the (a) direct and (b) crosswise datasets. The precision-recall curve provides further insight into the trade-off between precision and recall. Both plots indicate that DT achieves superior performance with high precision and recall, whereas the other methods tend to trade off recall in order to achieve high precision.
Previously, we used all five input features (i.e., number of samples observed by the smartwatch, mean RSS, maximum RSS, minimum RSS, and RSS range) to train the model. All four trained classifiers were able to produce satisfactory classification performance, i.e., at least 85% accuracy. We therefore investigate how the choice of input features affects the classification performance. We repeated the experiment using only one feature (i.e., mean RSS), then two features (i.e., mean RSS and the number of samples), and so on. The classification accuracy achieved by the four classifiers is shown in Fig. 11. From both bar charts, we can see that kNN suffers severe performance degradation when only one input feature is available.
Overall, the performance increases as the number of features increases. The performance gain of each classifier as the number of features increases is shown in Fig. 12. We can see that kNN benefits substantially from additional input features. On the other hand, LDA and NB show no improvement beyond two features; their performance saturates at that point. The performance of DT also increases with the number of features, although the gain is minimal. Overall, we can see that some features are indeed useful in training a good model, while others might be redundant and can be excluded from training. For example, the maximum RSS and minimum RSS might not provide much information for model training, whereas the RSS range is more informative. The RSS range indicates how much the RSS fluctuated during a particular observation period, and this information is helpful to model learning.
As discussed, the number of samples observed by the smartwatch within a certain continuous time period is a good indication of how long the users have been interacting with each other. Furthermore, we can make a better inference when the number of samples observed by the smartwatch increases. The effect of the number of samples on the classification accuracy is illustrated in Fig. 13. It is clear that the accuracy increases with the number of samples and then slowly saturates once a sufficient number of samples has been obtained. In other words, additional samples have less effect on accuracy once the system has enough samples to make an inference. From the results, we can see that the accuracy starts to saturate when the number of samples reaches 100 for both the (a) direct and (b) crosswise cases.
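To make the five features concrete, the sketch below derives them from one observation window of RSS readings and converts a sample count into an approximate proximity duration. The function names, the dictionary keys, and the assumption that one packet is received per advertising interval are illustrative and are not taken from the released code.

```python
def rss_features(rss_window):
    """Summarize one observation window of RSS readings (in dBm)."""
    if not rss_window:
        raise ValueError("rss_window must contain at least one RSS reading")
    rss_max = max(rss_window)
    rss_min = min(rss_window)
    return {
        "num_samples": len(rss_window),
        "mean_rss": sum(rss_window) / len(rss_window),
        "max_rss": rss_max,
        "min_rss": rss_min,
        "rss_range": rss_max - rss_min,  # how much the RSS fluctuated
    }


def observation_seconds(num_samples, adv_interval_ms):
    """Approximate time in proximity for a given sample count.

    Assumption: exactly one packet is received per advertising interval,
    so packet loss and scan gaps are ignored.
    """
    return num_samples * adv_interval_ms / 1000.0
```

Under this idealized assumption, `observation_seconds(100, 100)` evaluates to 10.0, i.e., 100 samples at a 100 ms advertising interval correspond to about 10 s of proximity.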
Hence, we can conclude that most classifiers can produce a proper classification output when there are at least 100 samples. If the smartwatch is configured to advertise a packet every 100 ms, we should expect approximately 10 samples per second, which means that approximately 10 s are required for each classifier to reach stable performance. In practice, this is a reasonable duration considering the typical interaction duration between users. Furthermore, if the interaction duration is less than 10 s, the risk of getting infected is low even if the user is very close to the infected individual.
The World Health Organization recommends a distance of at least one meter. However, different countries implement different physical distancing requirements, from 1 m to 2 m, depending on factors including location, activity, and age of the individuals. Considering the variations in physical distancing requirements, we conducted an experiment to verify our classification approach with different physical distancing thresholds. The classification accuracy with different physical distancing thresholds is shown in Fig. 14. The results demonstrate the robustness of our classification approach: each classifier achieves nearly the same accuracy regardless of the physical distancing threshold. This means that our proposed approach is practical and can be applied directly in any setting by simply updating the physical distancing threshold to match the required preventive measures.
Contact tracing is deemed to be an essential measure in the post-pandemic period to prevent a second outbreak while slowly reopening the workplace. Even though smartphone-based contact tracing is cost-effective considering the ubiquity of smartphones, it is not convenient for employees to carry a smartphone with them at all times while working. On the other hand, a smart wearable approach provides a more practical solution to contact tracing in the workplace.
In this paper, we verify the practicality of our proposed P$^3$CT with real-world BLE data collected from the smartwatch. For future work, we can integrate the embedded sensors within the watch to monitor employee's activity and thus to better predict their interaction behaviors. The additional knowledge of interaction behaviors, besides the interaction proximity and
Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing
Contact tracing and disease control
Pan-European privacy-preserving proximity tracing
We put the power to reduce the spread of COVID-19 in the palm of your hand
PACT: Private automated contact tracing
A reliable smart interaction with physical thing attached with BLE beacon
Smartphone inertial sensor-based indoor localization and tracking with iBeacon corrections
Indoor positioning using smartphone camera
Face-to-face proximity estimation using Bluetooth on smartphones
Coronavirus mobile apps are surging in popularity in South Korea
Epidemic contact tracing with smartphone sensors
EasyBand: A wearable for safety-aware mobility during pandemic outbreak
China launches coronavirus 'close contact detector' app
Privacy guidelines for contact tracing applications
TraceSecure: Towards privacy preserving contact tracing
Overview and evaluation of Bluetooth Low Energy: An emerging low-power wireless technology
Bluetooth: A viable solution for IoT?
Smartphones and BLE services: Empirical insights
Secure seamless Bluetooth Low Energy connection migration for unmodified IoT devices
BLE beacons for Internet of Things applications: Survey, challenges, and opportunities
RSS localization using unknown statistical path loss exponent model
Improved distance estimation with BLE beacon using Kalman filter and SVM
Compressive sensing-based multipath exploitation for stationary and moving indoor target localization
Body shadowing and furniture effects for accuracy improvement of indoor wave propagation models
Human body shadowing effect on UWB-based ranging system for pedestrian tracking
BLE beacons for indoor positioning at an interactive IoT-based smart museum
Notify-and-interact: A beacon-smartphone interaction for user engagement in galleries
High resolution beacon-based proximity detection for dense deployment
Improving BLE beacon proximity estimation accuracy through Bayesian filtering
A compressive sensing approach to detect the proximity between smartphones and BLE beacons