key: cord-0883191-273y4eca authors: Hasaninasab, Mehdi; Khansari, Mohammad title: Efficient COVID-19 testing via contextual compressive sensing date: 2021-08-13 journal: Pattern Recognit DOI: 10.1016/j.patcog.2021.108253 sha: 1f9225ead3edf4c18ff6007eb0914a2990306654 doc_id: 883191 cord_uid: 273y4eca

The COVID-19 pandemic threatens the lives of billions of people all over the world. As of March 6, 2021, COVID-19 had been confirmed in 115,653,459 people worldwide. It also has a devastating effect on businesses and social activities. Since there is still no definite cure for this disease, extensive testing is critical for determining the trend of the illness, choosing appropriate medical treatment, and setting social distancing policies. Moreover, testing more people in a shorter time helps to contain the contagion. PCR-based methods are the most popular tests, and they take about an hour to produce a result. This severely limits the number of tests that can be run and consequently hurts the efficiency of pandemic control. In this paper, we propose a new approach that identifies affected individuals with a considerably reduced number of tests, which saves both time and resources. We use contextual information to build a graph-based model for use in model-based compressive sensing (CS). The proposed model requires fewer tests than traditional testing methods and even group testing. We embed contextual information such as age, underlying disease, symptoms (i.e. cough, fever, fatigue, loss of consciousness), and social contacts into the graph-based model, which is then used in model-based CS to minimize the number of required tests. We take advantage of Discrete Signal Processing on Graphs (DSP_G) to generate the model. Our contextual model makes CS more efficient in both the number of samples and the recovery quality.
Moreover, it can be applied in cases where group testing is not applicable because of its strong dependence on sparsity. Experimental results show that the overall testing speed (the ratio of individuals to tests) increases more than 15 times compared to individual testing, with an error of less than 5%, which is dramatically lower than that of traditional compressive sensing.

COVID-19 has become a major threat to people's lives all over the world. It also disrupts both business and social activities. Economic depression is another problem that affects everyone, whether infected or not. In the absence of a vaccine and a definite cure, the only remedy is to stop the virus from spreading. Large-scale quarantine is not a good solution because of its serious, perhaps irreparable, long-term consequences, including economic downturn and social problems. The policy that minimizes economic damage is to isolate the infected people, so COVID-19 testing is the primary step toward isolation. Testing must be fast enough to cover most people in a short time and thus isolate the right individuals before it is too late. There are mainly two types of test that can detect COVID-19 in individuals: 1) serological tests, which look for antibodies in the blood of individuals, and 2) the swab test, which takes material from the cavity between the nose and mouth and looks for the RNA of a live virus. Although the serological test has its advantages, the swab test is strongly recommended by the Centers for Disease Control and Prevention (CDC) [1]. This class of tests uses Real-Time Polymerase Chain Reaction (RT-PCR) to detect COVID-19. The method is based on selective amplification of DNA strands in each cycle, which yields results in a shorter time and conserves reagents and testing kits. Additionally, its increased range of detection makes it a suitable technique for detecting COVID-19.
The RT-PCR is typically run for 40 cycles, and each cycle consists of three major steps [2]: High-temperature incubation (typically 95°C, the maximum temperature that DNA can withstand) is used to "melt" double-stranded DNA into single strands. During annealing, the temperature is reduced (usually about 5°C below the melting temperature of the primers) to let the complementary sequences hybridize. At 70-72°C, the optimal activity of DNA polymerase is achieved and primer extension occurs. It is expected to take 8 hours to complete the RT-PCR test for each individual. This long duration severely limits the number of individuals that can be tested per day. Although the RT-PCR test is recognized as the gold standard for determining COVID-19 status in individuals [3], its long turnaround time is a critical challenge. This has prompted diagnosis from chest CT images as an alternative [4]. One of the usual methods to mitigate this limitation is group testing [5]. Group testing is an old technique introduced by Dorfman in 1943. In group testing, people are divided into groups and the test is performed on the pooled sample of each group. If the result is negative, all group members are disease-free; otherwise, the test must be performed on each group member separately. Although it has advantages over sampling techniques (i.e. no data loss), it has two important restrictions. First, the output of group testing is a binary value: it only determines whether or not infection is present. This is inadequate for COVID-19, where the viral load (i.e. virus concentration) is important for determining COVID-19 severity among patients [6] and for studying the antibody response [7]. Second, group testing depends on the size of the groups. Increasing the group size increases the number of required tests and erodes the efficiency of group testing [5]. This is extremely critical in case of a large population (e.g. testing .
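The dependence on group size can be made concrete with a short calculation. The following is a minimal Python sketch, assuming the classical two-stage scheme described above (one pooled test per group, then an individual retest of every member of a positive pool), independent infections with a common probability `p`, and error-free tests. Under these assumptions the expected number of tests per individual is 1/n + 1 - (1 - p)^n for a group of size n. The function names are illustrative and are not taken from the paper.

```python
def dorfman_cost(n: int, p: float) -> float:
    """Expected tests per individual for two-stage pooling.

    n: group size, p: infection probability (assumed independent and
    identical for every individual; tests assumed error-free).
    """
    if n < 2:
        return 1.0  # a "group" of one is plain individual testing
    # One pooled test shared by n people, plus n retests whenever the
    # pool contains at least one infected member.
    return 1.0 / n + 1.0 - (1.0 - p) ** n


def best_group_size(p: float, max_n: int = 100) -> int:
    """Group size in [2, max_n] with the lowest expected cost."""
    return min(range(2, max_n + 1), key=lambda n: dorfman_cost(n, p))


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.40):
        n = best_group_size(p)
        print(f"p={p:.2f}: best n={n}, cost={dorfman_cost(n, p):.3f}, "
              f"cost at n=50: {dorfman_cost(50, p):.3f}")
```

Under these assumptions, a group size well beyond the optimum, or a high infection probability, brings the cost back toward or above one test per person, which is the loss of efficiency noted above.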
Over the last decade, compressive sensing (CS) has been a popular technique in fields related to signal processing. The key advantage of CS over traditional sampling is the much lower sampling rate required for signal recovery. CS has two main phases: signal sampling and signal recovery. Contrary to traditional sampling, a CS sample does not map directly to any specific data element; it is a linear combination of all data elements. In each CS sample, each individual's data has its own weight, which is specified by the sampling matrix. In the recovery phase, the signal elements are recovered from the CS samples. From a mathematical point of view, CS signal recovery finds a solution (i.e. the original signal) of an underdetermined system of equations that has an infinite number of solutions [8]. The original signal is singled out among these solutions by its distinguishing features (i.e. sparsity). In the application to COVID-19 testing, the CS sampling rate corresponds to the number of tests required to identify the COVID-19 status of all group members (i.e. the recovered signal). The appropriate sampling rate may vary with the recovery strategy chosen [9]. Despite its efficiency, CS has some drawbacks when implemented in the real world. In the COVID-19 testing scenario, decreasing the number of required tests while maintaining the required quality is the most critical challenge. Baraniuk introduced model-based CS to express the signal's distinguishing features concretely in the form of a mathematical model. In model-based CS, in addition to sparsity, the structure of the signal values and locations is used as a signal feature. It has been shown that model-based CS makes it possible to reduce the sampling rate without sacrificing robustness or output quality [10]. To the best of our knowledge, model-based CS has not been applied to COVID-19 so far. Moreover, in none of the proposed models, the interdependency between signal elements (i.e.
the relation between individual test results in the context of COVID-19) was considered. In this paper, we introduce Contextual Model-based CS (CMb-CS), which uses contextual information (e.g. lifestyle, underlying disease, and social contacts) to determine the interdependency between signal elements (i.e. individual test results). The model is built on the emerging field of Signal Processing on Graphs (DSP_G) and exploits its powerful features to represent the interdependency between signal elements. The aim of our model is to represent the signal characteristics so that the compressed signal (i.e. the compressed COVID-19 test results) can be recovered more efficiently. The interdependency between signal elements, signal smoothness, and sparsity are the three factors embedded into our model to represent the signal more precisely. Owing to the clear picture of the signal that the model offers, the signal is expected to be recovered from a much smaller number of samples. In the final part of this paper, the efficiency of the model in testing is compared with existing group testing and CS-based methods. The rest of this paper is organized as follows: Section 2 describes the necessary background. Section 3 presents the proposed model. Section 4 presents and explains the experimental results, and Section 5 concludes the paper.

Group testing was first introduced by Dorfman to detect syphilis from blood samples in a large population without the need to test each individual [5]. Instead of testing each individual separately, the population is divided into groups. In each group, the samples are mixed and the test is performed on the pooled sample. A negative result means that all group members are free from infection. Otherwise, at least one member is infected. It has been shown that this reduces the number of tests per individual.
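The pooling procedure described above can be sketched as a small simulation. This is an illustrative Python sketch rather than the protocol of any cited study: it assumes error-free tests, no dilution effect in the pooled sample, and independent infections, and `dorfman_round` and `simulate` are hypothetical helper names.

```python
import random


def dorfman_round(statuses, group_size):
    """Run one round of two-stage pooled testing.

    statuses: list of bools, True meaning the individual is infected.
    Returns (tests_used, decoded_statuses).
    """
    tests_used = 0
    decoded = []
    for start in range(0, len(statuses), group_size):
        group = statuses[start:start + group_size]
        tests_used += 1                      # one test on the pooled sample
        if any(group):                       # positive pool (ideal test assumed)
            if len(group) > 1:
                tests_used += len(group)     # retest every member individually
            decoded.extend(group)
        else:                                # a negative pool clears all members
            decoded.extend([False] * len(group))
    return tests_used, decoded


def simulate(population, infection_prob, group_size, seed=0):
    """Return the empirical number of tests per individual."""
    rng = random.Random(seed)
    statuses = [rng.random() < infection_prob for _ in range(population)]
    tests_used, decoded = dorfman_round(statuses, group_size)
    assert decoded == statuses               # ideal tests recover everyone
    return tests_used / population


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20):
        ratio = simulate(population=100_000, infection_prob=p, group_size=5)
        print(f"p={p:.2f}: {ratio:.3f} tests per individual")
```

A ratio below 1 means fewer tests than individual testing. The sketch returns only a binary status per person, which reflects the limitation that pooled testing of this kind does not report viral load.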
The efficiency of group testing depends on two key parameters: the infection rate (i.e. the probability of infection, $p$) and the group size ($n$). The expected relative cost ($C$), i.e. the expected number of tests per individual, is given by:

$$C = \frac{1}{n} + 1 - (1 - p)^{n} \qquad (1)$$

where $n$ is the size of the group (number of members) and $p$ is the average infection probability. Group testing has been applied to COVID-19 testing [11]. A multistage group testing approach was proposed to reduce the number of required COVID-19 tests. In that work, group testing was performed iteratively, and the group size was changed dynamically in each iteration. It was shown that this approach reduces the required number of tests [11]. From (1), it is clear that increasing $n$ or $p$ may cause $C \geq 1$, which means that group testing becomes ineffective. Therefore, group testing is generally not beneficial in a large population. Moreover, it gives only binary results (infection or non-infection) and leaves the viral load unspecified. These shortcomings make group testing unsuitable for COVID-19 pandemic conditions (i.e. a large population, the need for fast results, and the demand for high output quality). Another requirement is to further reduce the number of required tests, which in turn saves reagents and manpower. Taking all of the above together, one promising approach to these challenges is compressive sensing, which is described in the following section.

CS is an emerging field of research that merges sampling and compression into one step. In contrast to traditional sampling, it has been proven that with CS the signal can be recovered completely at a sampling rate much lower than the Shannon/Nyquist rate [8]. Because data acquisition and data compression are merged into a single step, there is no need to access all data elements. One of the key concepts in CS is sparsity, which is described in the following. Let x be a vector consisting of the N individual test results. Vector x is K-sparse if it has at most K nonzero coefficients (K<