key: cord-0684077-6y3p0ujj
authors: Muthazhagan, Balaji; Panchapakesan, Aparnasri; Sundaramoorthy, Suriya
title: Real-time social distance alerting and contact tracing using image processing
date: 2021-05-21
journal: Data Science for COVID-19
DOI: 10.1016/b978-0-12-824536-1.00032-0
sha: 073858fd0b07f446c56c60a13dd75ee7f17ee16d
doc_id: 684077
cord_uid: 6y3p0ujj

Social distancing has been enforced by many governments and health organizations as an imperative measure to stop the spread of COVID-19. However, current surveillance systems can neither detect breaches in social distancing nor support real-time contact tracing. This chapter aims to provide a novel solution to this problem with the help of image processing techniques. The proposed system generates structured data on social distancing breaches and face mask usage from surveillance footage. When a person tests positive for the virus, it also identifies, through contact tracing, the susceptible people who have been in contact with that person. An installation of this kind can substantially decrease the spread and enable real-time contact tracing.

Companies across the world that have infrastructure for remote communication and work have required employees to work from home [9]. Sporting events have been postponed; the London Marathon, which was to be held on April 26, 2020, has been postponed to October 4, 2020 [10]. The opening ceremony of the Tokyo Olympics, which was to be held on July 24, 2020, has been postponed by a year [11]. The International Tennis Federation has also confirmed that 900 tournaments across all its circuits have been postponed [12]. The movement of people has been restricted across various modes of intracountry as well as intercountry transportation [13]. Proper hygiene and health protocols have been put in place in hospitals as well as public places. Markers and graphic cues have been placed at select places to indicate the minimum separation [14, 15].

As lockdowns across countries are eased, it is imperative to deploy quality surveillance systems that check whether individuals maintain the minimum separation at places of interest. In India, artificial intelligence-equipped drones are being used to keep an eye on people who breach evening curfews [16]. Singapore has begun trials of robot dogs, developed by Boston Dynamics, to patrol major parks and monitor crowd density [17]. Surveillance cameras in France [18] and thermal imaging technology in some parts of Britain [19] have also started to monitor social distancing compliance. These surveillance systems use machine-learning techniques to quickly identify breaches and raise alerts.

This chapter first introduces the concept of flattening the curve, which explains why social distancing works, then explores the currently available contact tracing methods, and finally proposes a surveillance system to identify susceptible members. Identifying susceptible people in image frames essentially boils down to a classification problem in which the object to be detected is a person. This task involves two steps: identifying the features and choosing the learning algorithm. The outcome depends largely on the efficacy of the feature selection technique. Deep learning techniques automate the feature selection process, thereby eliminating the errors that could arise from hand-engineered pipelines. Deep learning methods for object detection largely use one of region-based convolutional neural networks (R-CNN) [20-22], single shot detection (SSD) [23], or you only look once (YOLO) [24]. The R-CNN methods proposed by Girshick et al. [20-22] take a two-stage approach: first the bounding boxes and regions of interest are identified, and then the regions are passed on to the classification phase. To reduce latency, YOLO and SSD combine the two steps into one. Thus, SSD and YOLO are a good fit for real-time surveillance systems, which require low latency, and are considered for the implementation of the proposed system.

The idea of dampening the rate of virus spread so that fewer people need to seek treatment at the same time is referred to as "flattening the curve" [25]. Social distancing serves this objective by reducing $R_0$, the average number of people to whom an infected person passes the infection, to a value less than one [26]. Social distancing can also help increase the doubling time $T_d$, the time over which the infected population doubles, which is determined by the growth rate $r$ [25]. In this chapter, we consider the susceptible, exposed, infected, and recovered (SEIR) model to demonstrate this.
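The expression for the doubling time did not survive in this text. As a sketch only, assuming that the infected population grows exponentially at a constant continuous rate $r$, so that $I(t) = I(0)\,e^{rt}$, setting $I(T_d) = 2\,I(0)$ gives:

```latex
\[
  I(0)\,e^{r T_d} = 2\,I(0)
  \quad\Longrightarrow\quad
  T_d = \frac{\ln 2}{r}
\]
```

Under this assumption, a growth rate of $r = 0.1$ per day corresponds to a doubling time of about 6.9 days, and halving $r$ doubles $T_d$. If the growth rate is instead quoted as a per-period percentage, the expression takes a slightly different form.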

2.1 Susceptible, exposed, infected, and recovered model

The SEIR model [27], which stands for susceptible, exposed, infected, and recovered, buckets the entire population into the following categories:

Susceptible: This bucket consists of people who have neither contracted the virus nor have immunity to it.

Exposed: This bucket consists of people who have been exposed to the virus but are yet to show symptoms, as the virus is in the incubation stage.

Infected: This bucket consists of people who are mildly, severely, or critically ill because of the virus. At this stage, not every individual requires hospitalization or ICU attention; only those in the more severe categories do. The symptoms may worsen from one category to another. This stage also accounts for the mortality of the virus.

Recovered: This bucket consists of people who have completely recovered from the disease. It is assumed that people in this category do not relapse and fall into the infected category again (Fig. 15.1).

There are four constants, in addition to S, E, I, and R, which define the model: $\alpha$, $\beta$, $\sigma$, and $\gamma$. $\alpha$ is the inverse of the virus incubation period, which for COVID-19 is approximately 5.2 days [28]. $\beta$ is the probability that an infected person spreads the disease to a susceptible person. $\sigma$ represents the incubation rate and $\gamma$ represents the recovery rate. $\gamma$ is defined as $1/D$, where $D$ is the average duration of the infection. These constants are related to the SEIR model through first-order and second-order differential equations with time as the independent variable [27] (Table 15.1).

The entire population $N$ is the sum of the compartments S, E, I, and R: $N = S + E + I + R$.

$R_0$, the basic reproductive number of the virus, is given by $R_0 = \beta / \gamma$ [27] (Fig. 15.2).

To the above set of differential equations, we can add the element of social distancing through $\rho$, the population interaction factor: the greater the value of $\rho$, the more the population interacts. Therefore, the susceptible equation now becomes $\frac{dS}{dt} = -\rho \cdot \beta \cdot S \cdot I$, and the same factor $\rho$ scales the matching infection term in the exposed equation for $\frac{dE}{dt}$.

FIGURE 15.1 Susceptible, exposed, infected, and recovered flow.

To understand the effectiveness of social distancing in curbing the spread of infection, we plot the infection levels at various interaction factors among the population. As the rate of interaction among the population increases, the curve becomes higher, and as the rate of interaction reduces, the curve becomes flatter. Using the same model, let us understand how an early cessation or relaxation of a lockdown or of social distancing measures can lead to a second curve, which is often higher than the first.
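The effect of the interaction factor on the infection curve can be illustrated with a short simulation. This is a minimal sketch, not the chapter's implementation: it assumes the commonly used SEIR form in which people leave the exposed bucket at rate $\alpha$ and the infected bucket at rate $\gamma$, uses simple Euler integration, and takes illustrative values for $\beta$ and $\gamma$ (giving $R_0 = \beta/\gamma = 3.5$). Only the 5.2-day incubation period comes from the text. The function name `simulate_seir` and all parameter names are hypothetical.

```python
def simulate_seir(rho, beta=1.75, alpha=1 / 5.2, gamma=0.5,
                  days=200, dt=0.1, exposed0=1e-4):
    """Euler-integrate an SEIR model and return the infected fraction over time.

    rho   : population interaction factor (1.0 = no social distancing)
    beta  : transmission parameter (assumed value, for illustration only)
    alpha : inverse of the incubation period (5.2 days, as in the text)
    gamma : recovery rate, 1 / D (assumed value, for illustration only)
    """
    # Fractions of the population in each bucket; they always sum to 1.
    s, e, i, r = 1.0 - exposed0, exposed0, 0.0, 0.0
    infected = [i]
    for _ in range(round(days / dt)):
        new_exposed = rho * beta * s * i   # S -> E, scaled by the interaction factor
        new_infected = alpha * e           # E -> I once incubation ends
        new_recovered = gamma * i          # I -> R
        s -= dt * new_exposed
        e += dt * (new_exposed - new_infected)
        i += dt * (new_infected - new_recovered)
        r += dt * new_recovered
        infected.append(i)
    return infected


# Peak infected fraction for three interaction factors.
peaks = {rho: max(simulate_seir(rho)) for rho in (1.0, 0.5, 0.3)}
for rho, peak in peaks.items():
    print(f"rho={rho:.1f}: peak infected fraction = {peak:.4f}")
```

Under these assumptions, lowering `rho` lowers the peak of the infected curve, which is the flattening behavior described above. A production model would more likely use an adaptive ODE solver than a fixed Euler step.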

Supporting differential equations for the susceptible, exposed, infected, and recovered model.

With the help of a longer lockdown, we can reduce the population that remains susceptible to the virus over the course of time. This is known as "herd immunity", where the population gains immunity to a virus through indirect exposure. This ensures that the curve does not rise again once the social distancing measures are relaxed [30].

Contact tracing is a process used to track down the people who might have come in contact with someone carrying an infection such as COVID-19. These people are then kept under close observation to ensure that there is no further transmission of the infection. Contact tracing needs considerable technology and efficiency to succeed on a large scale. The process can be broken down as follows [31]:

When a person is confirmed to be infected or tests positive for an infection, their contacts must be identified. These contacts could come from various settings such as work, family, and recreational spots.

All the contacts who have been identified should be reached out to, informed about the infection, and asked to quarantine or isolate, depending on the nature of the infection.

Identified contacts must be followed up with regularly to find out whether they are developing any symptoms associated with the infection.

Contact tracing is currently carried out either through mobile applications or manually. Manual tracing involves identifying all the close contacts of the infected person and informing them. This is an arduous and error-prone process and is largely discouraged. Governments and private organizations have come up with mobile applications which, once installed, use Bluetooth Low Energy to help track an infected person or to inform a person if they have encountered an infected person. The Australian government launched COVIDSafe, which uses Bluetooth to track individuals who have been in close contact with each other [32]. In India, the contact tracing app Aarogya Setu, which is based on the same principle, has been downloaded over 100 million times, and the government has made it mandatory for government and private sector employees to download it [33].

The proposed system can be broadly described as an engine which extracts relevant information from footage that is fed into the system as input (Fig. 15.5).

The input is surveillance footage, which is passed into the analysis engine. The analysis engine handles the following tasks:

The engine first identifies unique people in the frame using image processing.

The engine then assigns a score based on social distancing breaches and face mask detection.

The output of the engine is a graph which can be used for real-time contact tracing and to identify susceptible members.

In this chapter, we consider YOLO [23] and SSD [24] cross trained with MobileNet [34] as our prediction networks for identifying people within a given frame.

SSD [24] is a feed-forward convolutional neural network. The architecture of an SSD model can be broken up into two main parts: a backbone network and an SSD head. Generally, pretrained image classification models such as VGG-16 or ResNet34 are used as backbone networks because of their accuracy in classifying images and the reduction in training time. The backbone produces feature maps from the image, from which semantic meaning must then be extracted. The SSD head consists of the layers that allow an input image of arbitrary size; without them, we are restricted by the dimensions of the base network. These are also CONV layers, which gradually reduce the size of the feature volume. Every such CONV layer is attached to the prediction layer, which allows for variations in the size of the input region being considered and thus reduces the need for resampling the feature maps. An improved version of the MultiBox algorithm is used for the bounding box suggestions. For every class present in the dataset, we have ground-truth bounding boxes. Bounding boxes which have been precomputed based on the sizes and locations of the ground-truth ones are called priors because of the use of a prior probability distribution [24] (Fig. 15.6).

The MobileNet architecture was created to give faster results with a smaller space complexity [34]. The cost of a standard convolution depends on the dimensions of the input feature map, the input and output channels, and the kernel. Let $S_f$ be the size of the input feature maps, $N_i$ the number of input channels, $N_o$ the number of output channels, and $S_k$ the kernel size. The complexity of evaluating a standard convolution filter is $S_f^2 \cdot N_i \cdot N_o \cdot S_k^2$. A depthwise convolution filter has a complexity of $S_f^2 \cdot N_i \cdot S_k^2$, since only a single convolution is applied to each input channel. Then there are pointwise convolutions, where the kernel size is 1, with a complexity of $S_f^2 \cdot N_i \cdot N_o$. The MobileNet architecture combines depthwise and pointwise convolutions into a depthwise separable convolution, which reduces the complexity to a fraction $\frac{1}{N_o} + \frac{1}{S_k^2}$ of that of the standard convolution. Apart from the first layer of the architecture, which is a full convolution, all other layers are depthwise separable convolutions, each followed by batch normalization and a rectified linear unit (ReLU). The architecture also accommodates a width multiplier, which reduces the number of channels, and a resolution multiplier, which reduces the input resolution [34]. In this chapter, we considered a Caffe implementation of SSD with MobileNet, which is around 45 MB in size. This enables the model to reside on portable devices with stringent memory requirements (Figs. 15.7 and 15.8).
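The saving from depthwise separable convolutions can be checked numerically. The sketch below assumes, as in the MobileNet design, that the depthwise step applies one filter per input channel; the function names and the layer dimensions are hypothetical and chosen only for illustration.

```python
def standard_conv_cost(s_f, n_i, n_o, s_k):
    """Multiply-accumulate count of a standard convolution layer."""
    return s_f ** 2 * n_i * n_o * s_k ** 2


def depthwise_separable_cost(s_f, n_i, n_o, s_k):
    """Cost of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = s_f ** 2 * n_i * s_k ** 2   # one s_k x s_k filter per input channel
    pointwise = s_f ** 2 * n_i * n_o        # 1x1 convolution that mixes the channels
    return depthwise + pointwise


# Illustrative layer: 14x14 feature map, 512 -> 512 channels, 3x3 kernel.
s_f, n_i, n_o, s_k = 14, 512, 512, 3
ratio = depthwise_separable_cost(s_f, n_i, n_o, s_k) / standard_conv_cost(s_f, n_i, n_o, s_k)
closed_form = 1 / n_o + 1 / s_k ** 2
print(f"cost ratio = {ratio:.4f}, closed form = {closed_form:.4f}")
```

For this layer the ratio is about 0.113, so the separable form needs roughly one ninth of the operations; with a 3 x 3 kernel the $1/S_k^2$ term dominates whenever the number of output channels is large.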

YOLO is an extremely popular detection network for the identification of objects. There are three variants of this network: YOLO [23], YOLOv2 [35], and YOLOv3 [36]. The first version of YOLO consisted of 24 CONV layers followed by two fully connected layers at the end. The ImageNet classification dataset, which houses 1000 classes, is used at a resolution of 224 × 224 to pretrain the first 20 CONV layers. The last four CONV layers are coupled with the two fully connected layers for detecting objects. The resolution is also increased to 448 × 448 to increase the granularity of object detection. However, the first version of YOLO had a major problem in detecting objects that were close to each other. This was improved in YOLOv2, which used Darknet-19 and introduced batch normalization and anchor boxes. The improved version was also able to take a higher resolution as input and to detect finer features. The network was also trained on multiple scales of the same class to ensure that objects of varying sizes can be detected. YOLOv3 builds on the Darknet-53 architecture by stacking an additional 53 layers. It improves on the previous version by incorporating residual blocks and skip connections. This version also handles upsampling. The additional 53 layers make the architecture slower but ensure better accuracy than the previous versions (Fig. 15.9). The output of the YOLO network is a prediction vector consisting of five normalized components (center_x, center_y, box_w, box_h, prediction_confidence), where center_x and center_y are the coordinates of the center point, box_w and box_h are the bounding box dimensions, and prediction_confidence is the product of the probability that an object is present and the intersection over union between the predicted box and the ground truth.
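To make the prediction vector concrete, the following sketch converts a normalized prediction into pixel corner coordinates and computes the intersection over union (IoU) of two boxes. The helper names `decode_box` and `iou` are hypothetical, and boxes are assumed to be axis-aligned and expressed as (x1, y1, x2, y2).

```python
def decode_box(prediction, frame_w, frame_h):
    """Convert (center_x, center_y, box_w, box_h, confidence), all normalized
    to [0, 1], into integer pixel corners (x1, y1, x2, y2) plus the confidence."""
    cx, cy, bw, bh, confidence = prediction
    x1 = round((cx - bw / 2) * frame_w)
    y1 = round((cy - bh / 2) * frame_h)
    x2 = round((cx + bw / 2) * frame_w)
    y2 = round((cy + bh / 2) * frame_h)
    return (x1, y1, x2, y2), confidence


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


box, confidence = decode_box((0.5, 0.5, 0.2, 0.4, 0.9), frame_w=100, frame_h=200)
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

The IoU here serves two purposes in detectors of this kind: during training it enters the confidence target described above, and at inference time it is typically used to suppress duplicate boxes for the same object.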

In this chapter, we use YOLOv3 over its predecessors because of its higher accuracy. The model occupies around 240 MB, which is suitable for systems without memory constraints (Fig. 15.10).

The image processing module that we implemented operates on each frame and can be broken up into three components:

Identification of unique members

Social distance breach identification

Identifying whether an individual is wearing a mask or not

The video is first processed to identify people in a frame using MobileNet-SSD and YOLOv3, which are pretrained to recognize a person as an object. The centroid of each identified individual is continuously tracked. To assign each individual an ID that persists across frames, we compare the centroids in subsequent frames: we define a threshold, and if the centroid difference is below this threshold, the ID is retained. In the unique person tracking implementation screenshot, we have identified two individuals as Person 0 and Person 3 (Person 1 and Person 2 existed at a previous point in time and have left the current frame). We then calibrate a specific distance in the frame as a reference and examine whether there is a social distancing breach. People who have caused a social distancing breach are continuously tracked. Thereafter, we examine the frame to determine whether each person is wearing a face mask, using MobileNet-SSD and YOLOv3. The training set for face mask detection was generated by web scraping 1400 faces, some with masks and some without. The face mask module is integrated with the person identification module to produce a unified result. We define the prediction accuracy $a$ of the system as $a = c/t$, where $c$ is the total number of human faces correctly classified and $t$ is the total number of human faces in the video. The dataset used to evaluate this measure was generated by scraping 250 pedestrian videos of 60 s duration each from publicly available sources; the videos contained a good mix of individuals with and without face masks (Figs. 15.11-15.13) (Table 15.3). Once these steps are completed, we forward this information to a graph tracing algorithm, which constructs structured data that can later be queried.
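The ID assignment and breach check described above can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's implementation: detections are assumed to arrive as (x, y) pixel centroids from the person detector, matching is a simple greedy nearest-centroid search, and the calibrated reference distance is assumed to be expressed in pixels. The names `CentroidTracker`, `max_shift`, and `find_breaches` are hypothetical.

```python
import math
from itertools import combinations


class CentroidTracker:
    """Keeps a person's ID while their centroid moves less than max_shift per frame."""

    def __init__(self, max_shift=50.0):
        self.max_shift = max_shift   # threshold on the centroid difference, in pixels
        self.next_id = 0
        self.centroids = {}          # person ID -> centroid seen in the previous frame

    def update(self, detections):
        """Match this frame's centroids to known IDs; return {person_id: centroid}."""
        unmatched = dict(self.centroids)
        assigned = {}
        for centroid in detections:
            best_id, best_dist = None, self.max_shift
            for person_id, previous in unmatched.items():
                dist = math.dist(centroid, previous)
                if dist < best_dist:
                    best_id, best_dist = person_id, dist
            if best_id is None:      # nobody close enough: a new individual
                best_id = self.next_id
                self.next_id += 1
            else:                    # ID retained; remove it from the candidates
                del unmatched[best_id]
            assigned[best_id] = centroid
        self.centroids = assigned    # people who left the frame are dropped
        return assigned


def find_breaches(tracked, min_distance):
    """Return the ID pairs whose centroids are closer than min_distance."""
    return [(a, b)
            for (a, ca), (b, cb) in combinations(sorted(tracked.items()), 2)
            if math.dist(ca, cb) < min_distance]


tracker = CentroidTracker(max_shift=20)
frame_1 = tracker.update([(10, 10), (100, 100)])
frame_2 = tracker.update([(12, 11), (300, 300)])   # second person replaced by a newcomer
breaches = find_breaches({0: (0, 0), 1: (3, 4), 2: (50, 50)}, min_distance=6)
```

Because IDs are never reused, a person who leaves the frame and a newcomer get different identifiers, which mirrors the Person 0 and Person 3 situation described above. Greedy matching can swap IDs when people cross paths; an optimal assignment method would be more robust at higher cost.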

Each person identified in a frame is given a susceptibility score, which is a measure of their likelihood of contracting the disease. We then establish a set of heuristic rules according to which this score is incremented (Table 15.4):

The susceptibility scores are maintained in a map where the key is the person identifier and the value is the susceptibility score. The people who breach social distancing are those who have come closer to each other than the distance threshold. This information is maintained in the form of an adjacency list with timestamps. Let us understand this with the help of an example. Consider that we have identified four individuals in a frame: A, B, C, and D. This translates to Fig. 15.14, where the susceptibility score is shown in parentheses within the person node and the adjacency list tracks the people who have been in contact with each other (breach in social distancing). Each individual is initially assigned a score of 1.

After that, the system checks whether each person is wearing a mask. Let us consider that B is not wearing a mask and that A, C, and D are wearing masks at t0. The susceptibility score of B is incremented by 1, but there is no increment for the rest of the candidates (Fig. 15.15). Now, in the next frame, let us consider that A and B commit a social distance breach at t1. The susceptibility scores of A and B are each incremented by 1. Notice that B has now been added to the adjacency list of A and A has been added to the adjacency list of B. The timestamp of the breach has also been added in Fig. 15.16. The timestamp will not be updated as long as A and B continue to be in breach.

Let us consider that C and D also commit a social distance breach, to understand the change in values. We can see that D has been added to the adjacency list of C, and C has been added to the adjacency list of D. The values of C and D have been incremented as well in Fig. 15.17 since there was a social distance breach.

Addition to susceptibility score

Identification of a unique individual: +1

Individual is not wearing a mask: +1

For every unique social distance breach: +1 for both the individuals involved

Individual was initially wearing a mask but has removed it (and for every mask activity that follows this action): +1

Individual was initially not wearing a mask but is now wearing a mask (and for every mask activity that follows this action): +0

FIGURE 15.14 Susceptibility score initialization at time t0.

Now let us consider the case where A and B separated and then reunited at a time t3 in Fig. 15.18. The susceptibility scores of A and B will not be updated since it is not a unique meeting. However, the timestamp t3 will be added to the adjacency list.
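The bookkeeping in the example so far can be sketched with a score map and a timestamped adjacency list. This is an illustrative reading of the heuristic rules, not the chapter's code: the class and method names are hypothetical, the mask penalty is assumed to apply only the first time a person is seen without a mask, timestamps are plain integers, and the time of the C and D breach (taken as 2 here) is an assumption because the text does not state it.

```python
from collections import defaultdict


class SusceptibilityGraph:
    def __init__(self):
        self.scores = {}               # person ID -> susceptibility score
        self.seen_unmasked = set()     # people already penalized for having no mask
        # person ID -> {other person ID -> [timestamps of separate meetings]}
        self.contacts = defaultdict(lambda: defaultdict(list))

    def add_person(self, person):
        """Identification of a unique individual: the score starts at 1."""
        self.scores.setdefault(person, 1)

    def observe_mask(self, person, wearing_mask):
        """Add 1 the first time the person is seen without a mask (assumed rule)."""
        self.add_person(person)
        if not wearing_mask and person not in self.seen_unmasked:
            self.seen_unmasked.add(person)
            self.scores[person] += 1

    def record_breach(self, a, b, timestamp):
        """Call once per separate meeting, not once per frame of the same meeting."""
        self.add_person(a)
        self.add_person(b)
        if b not in self.contacts[a]:  # unique breach: both scores go up by 1
            self.scores[a] += 1
            self.scores[b] += 1
        self.contacts[a][b].append(timestamp)
        self.contacts[b][a].append(timestamp)


graph = SusceptibilityGraph()
for person in "ABCD":
    graph.observe_mask(person, wearing_mask=(person != "B"))   # t0: only B is unmasked
graph.record_breach("A", "B", 1)   # t1
graph.record_breach("C", "D", 2)   # time assumed; not stated in the text
graph.record_breach("A", "B", 3)   # t3: repeat meeting, timestamp only
```

After these calls the scores are A = 2, B = 3, C = 2, and D = 2, and the entry for A and B holds both timestamps, in line with the walkthrough above.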

Let us now add two more candidates, E and F, at a time t4 in Fig. 15.19. E is not wearing a mask and F is wearing a mask. By similar logic, E now has a score of 3 because E is not wearing a mask and has two social distancing breaches, (E, A) and (E, D). The adjacency list has been updated accordingly.

Let us now consider the case of F removing the mask and E putting a mask on at a time t5 in Fig. 15.20. The susceptibility score of F has been updated to 2 since F has removed the mask, which increases F's susceptibility. However, the score of E is not updated because E has already been exposed. Now let us discuss how contact tracing is applied in Fig. 15.21 when we learn that F and A have been infected. Since the adjacency list has already been populated, we can use it to find the most susceptible members. F has not been in contact with anyone. A has been in contact with B at times t1 and t3. Therefore, B is susceptible. Since B has not been in contact with any other member, we stop probing further than B. Moving on to the next contact, A has been in contact with E at time t4, and hence E is also susceptible. We now look at the contacts of E at time t4 and later. Since D fits the description, D is also susceptible. Therefore, the susceptible members are B, E, and D. The general rule of thumb is to walk the events in the list and mark as susceptible the members whose contact occurred at or after the time currently being probed. For real-time analysis, without contact tracing, we can use the map which holds the susceptibility scores to understand who is susceptible.
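The rule of thumb above amounts to a time-respecting traversal of the adjacency list. The sketch below is one possible interpretation, not the chapter's algorithm: it tracks the earliest time at which each person could have been exposed and follows only contacts that happened at or after that time, using a priority queue. The function name `trace_contacts` is hypothetical, timestamps are integers, and the C and D breach is again assumed to have happened at time 2.

```python
import heapq


def trace_contacts(adjacency, infected, start_time=0):
    """Return {person: earliest possible exposure time} for susceptible contacts.

    adjacency maps person -> {other person -> [contact timestamps]}.
    Infected people are treated as infectious from start_time onward (assumption).
    """
    earliest = {person: start_time for person in infected}
    queue = [(start_time, person) for person in infected]
    heapq.heapify(queue)
    while queue:
        time_now, person = heapq.heappop(queue)
        if time_now > earliest[person]:
            continue                     # stale entry; a shorter path was found
        for other, stamps in adjacency.get(person, {}).items():
            later = [t for t in stamps if t >= time_now]
            if not later:
                continue                 # they met only before the exposure
            contact_time = min(later)
            if contact_time < earliest.get(other, float("inf")):
                earliest[other] = contact_time
                heapq.heappush(queue, (contact_time, other))
    return {p: t for p, t in earliest.items() if p not in infected}


# Adjacency list from the worked example (C-D time assumed to be 2).
adjacency = {
    "A": {"B": [1, 3], "E": [4]},
    "B": {"A": [1, 3]},
    "C": {"D": [2]},
    "D": {"C": [2], "E": [4]},
    "E": {"A": [4], "D": [4]},
    "F": {},
}
susceptible = trace_contacts(adjacency, infected={"A", "F"})
```

For this input the traversal marks B, E, and D, matching the walkthrough: C is skipped because its only contact with D predates the time at which D could have been exposed.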

Social distancing is a pivotal measure to reduce the current spread of infection and avoid overcrowding in hospitals. Thus, organizations need to install surveillance applications which can highlight the relevant social distancing breaches in real time and produce the list of susceptible members. The application proposed in this chapter achieved good accuracy on surveillance footage at a low development cost. Future work can focus on real-time face identification, where the faces appearing in the video are already known, and on blockchain-based networks in which timestamped transactions pertaining to social distancing breaches can be stored.

The COVID-19 pandemic calls for spatial distancing and social closeness: not for social distancing!

FIGURE 15.21 Contact tracing for infection detected at time t6

Chapter 15 Real-time social distance alerting and contact tracing 313

From SARS to Ebola: legal and ethical considerations for modern quarantine

Public health and public trust: survey evidence from the Ebola Virus Disease epidemic in Liberia

The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) - China

Coronavirus: How Do You Social Distance in Schools? BBC News, 2020

Big Tech Firms Ramp up Remote Working Orders to Prevent Coronavirus Spread, CNN

London Marathon Postponed to 4 October in Response to Coronavirus, The Guardian

Joint Statement from the International Olympic Committee and the Tokyo 2020 Organising Committee - Olympic News, International Olympic Committee

Tennis: ITF to Furlough Staff, 900 Tournaments Postponed. Reuters, 2020

Advisory: Travel and Visa Restrictions Related To COVID-19

The iconography of social distancing around the world

Social Distancing Signs Around the World Show the New Normal. Reuters, 2020

Up Is Testing Drones in India to Enforce Social Distancing

Singapore Deploys Robot 'dog' to Encourage Social Distancing, CNN

Cameras to Monitor Masks and Social Distancing

Thermal Imaging Used to Help Police Crack Down on People Flouting Social-Distancing Rules, MarketWatch, 2020

Rich feature hierarchies for accurate object detection and semantic segmentation

Proceedings of the IEEE International Conference on Computer Vision

Faster R-CNN: towards real-time object detection with region proposal networks

European Conference on Computer Vision

You only look once: unified, real-time object detection

Flattening the pandemic and recession curves, Mitigating the COVID Economic Crisis: Act Fast and Do Whatever It Takes

Revisiting the basic reproductive number for malaria and its implications for malaria control

Global stability for the SEIR model in epidemiology

The reproductive number of COVID-19 is higher compared to SARS coronavirus

Controlling epidemic spread by social distancing: do it well or not at all

Contact tracing with a real-time location system: a case study of increasing relative effectiveness in an emergency department

Why India's Covid-19 Contact Tracing App Is Controversial, BBC News

Research on a surface defect detection algorithm based on MobileNet-SSD

YOLO9000: better, faster, stronger

Yolov3: An Incremental Improvement