key: cord-0151709-kmj37ynv
authors: Bastopcu, Melih; Ulukus, Sennur
title: Using Timeliness in Tracking Infections
date: 2022-03-07
journal: nan
DOI: nan
sha: ca658a8a8d24c19c11c74aa00f4f38ee3f54d179
doc_id: 151709
cord_uid: kmj37ynv

We consider real-time timely tracking of infection status (e.g., covid-19) of individuals in a population. In this work, a health care provider wants to detect infected people as well as people who have recovered from the disease as quickly as possible. In order to measure the timeliness of the tracking process, we use the long-term average difference between the actual infection status of the people and their real-time estimate by the health care provider based on the most recent test results. We first find an analytical expression for this average difference for given test rates, infection rates and recovery rates of people. Next, we propose an alternating minimization based algorithm to find the test rates that minimize the average difference. We observe that if the total test rate is limited, instead of testing all members of the population equally, only a portion of the population may be tested in unequal rates calculated based on their infection and recovery rates. Next, we characterize the average difference when the test measurements are erroneous (i.e., noisy). Further, we consider the case where the infection status of individuals may be dependent, which happens when an infected person spreads the disease to another person if they are not detected and isolated by the health care provider. Then, we consider an age of incorrect information based error metric where the staleness metric increases linearly over time as long as the health care provider does not detect the changes in the infection status of the people. In numerical results, we observe that an increased population size increases diversity of people with different infection and recovery rates which may be exploited to spend testing capacity more efficiently. Depending on the health care provider's preferences, test rate allocation can be adjusted to detect either the infected people or the recovered people more quickly.

We consider the problem of timely tracking of an infectious disease, e.g., covid-19, in a population of n people. In this problem, a health care provider wants to detect infected people as quickly as possible in order to take precautions such as isolating them from the rest of the population. The health care provider also wants to detect people who have recovered from the disease as soon as possible since these people need to return to work which is especially critical in sectors such as education, food retail, public transportation, etc. Ideally, the health care provider should test all people all the time. However, as the total test rate is limited, the question is how frequently the health care provider should apply tests on these people when their infection and recovery rates are known. In a broader sense, this problem is related to timely tracking of multiple processes in a resource-constrained setting where each process takes binary values of 0 and 1 with different change rates.

Recent studies have shown that people who recovered from infectious diseases such as covid- 19 can be reinfected. Furthermore, the recovery times of individuals from the disease may vary significantly. For these reasons, in this problem, the ith person gets infected with rate λ i which is independent of the others. Similarly, the ith person recovers from the disease with rate µ i . We note that the index i may represent a specific individual or a group of individuals that have common features such as age, gender, profession. Depending on the demographics, coefficients λ i and µ i may be statistically known by the health care provider. We denote the infection status of the ith person as x i (t) (shown with the black curves on the left in Fig. 1 ) which takes the value 1 when the person is infected and the value 0 when the person is healthy. The health care provider applies tests to people marked as healthy with rate s i and to people marked as infected with rate c i . Based on the test results, the health care provider forms an estimate for the infection status of the ith person denoted byx i (t) (shown with the blue curves on the right in Fig. 1 ) which takes the value 1 when the most recent test result is positive and the value 0 when it is negative. We measure the timeliness of the tracking process by the difference between the actual infection status of people and the real-time estimate of the health care provider which is based on the most recent test results. The difference can occur in two different cases: i) when the person is sick (x i (t) = 1) and the health care provider maps this person as healthy (x i (t) = 0), and ii) when the person recovers from the disease (x i (t) = 0) but the health care provider still considers this person as infected (x i (t) = 1). The former case represents the error due to late detection of infected people, while the latter case represents the error due to late detection of healed people. Depending on the health care provider's preferences, detecting infected people may be more important than detecting recovered people (controlling infection), or the other way around (returning people to workforce).

Age of information has been proposed to measure timeliness of information in communication systems, and has been studied in the context of queueing systems [1] - [8] , multi-hop and multicast networks [9] - [17] , social networks [18] , timely remote estimation of random processes [19] - [25] , energy harvesting systems [26] - [40] , wireless fading channels [41] , [42] , scheduling in networks [43] - [55] , lossless and lossy source and channel coding [56] - [66] , vehicular, IoT and UAV systems [67] - [70] , caching systems [71] - [82] , computation-intensive systems [83] - [90] , learning systems [91] - [93] , gossip networks [94] - [97] and so on. A more detailed review of the age of information literature can be found in references [98] - [100] . Most relevant to our work, the real-time timely estimation of a single and multiple counting processes [19] , [25] , a Wiener process [20] , a random walk process [101] , binary and multiple states Markov sources [23] , [51] , [102] have been studied. The work that is closest to our work is reference [23] where the remote estimation of a symmetric binary Markov source is studied in a time-slotted system by finding the optimal sampling policies via formulating a Markov Decision Process (MDP) for real-time error, AoI and AoII metrics. Different from [23] , in our work, we consider real-time timely estimation of multiple non-symmetric binary sources for a continuous time system. In our work, the sampler (health care provider) does not know the states of the sources (infection status of people), and thus, takes the samples (applies medical tests) randomly (exponential random variables) with fixed rates. Thus, we optimize the test rates of people to minimize the real-time estimation error.

In this paper, we consider the real-time timely tracking of infection status of n people. We first find an analytical expression for the long-term average difference between the actual infection status of people and the estimate of the health care provider based on test results. Then, we propose an alternating minimization based algorithm to find the test rates s i and c i for all people. We observe that if the total test rate is limited, we may not apply tests on all people equally. Next, we provide an alternative method to characterize the average difference, by finding the steady state of a Markov chain defined by (x i (t),x i (t)). By using this alternative method, we determine the average estimation error when there are errors in the test measurements expressed by a false positive rate p and a false negative rate q. Next, we consider the infection status of two people where an infected person may spread the disease to another person if the infection has not been detected by the health care provider to isolate the infected person. Finally, we consider an age of incorrect information based error metric where the estimation error increases linearly over time when the health care provider has not detected the changes in the infection status of the people.

Through extensive numerical results, we observe that increasing the total test rate helps track the infection status of people better, and increasing the size of the population increases diversity which may be exploited to improve the performance. Depending on the health care provider's priorities, we can allocate more tests to people marked as healthy to detect the infections more quickly or to people marked as infected to detect the recoveries more quickly. In order to combat the test errors, the health care provider may prefer to apply tests to only a selected portion of the population with higher test rates. When the infection status of a person depends on that of another person, the average time that a person remains infected can be reduced by increasing the total test rate as it helps to detect the infected people more quickly. Finally, we observe that depending on the error metric used, the test rate distribution among the population differs greatly, and thus, we should choose an error metric that aligns with the priorities of the health care provider.

We consider a population of n people. We denote the infection status of the ith person at time t as x i (t) (black curve in Fig. 2 (a)) which takes binary values 0 or 1 as follows,

In this paper, we consider a model where each person can be infected multiple times after recovering from the disease. We denote the time interval that the ith person stays healthy for the jth time as W i (j) which is exponentially distributed with rate λ i . We denote the recovery time for the ith person after infected with the virus for the jth time as R i (j) which is exponentially distributed with rate µ i .

A health care provider wants to track the infection status of each person. Based on the test results at times t i,ℓ , the health care provider generates an estimate for the status of the ith person denoted asx i (t) (blue curve in Fig. 2(a) ) bŷ

Whenx i (t) is 1, the health care provider applies the next test to the ith person after an exponentially distributed time with rate c i . Whenx i (t) is 0, the next test is applied to the ith person after an exponentially distributed time with rate s i .

An estimation error happens when the actual infection status of the ith person, x i (t), is different than the estimate of the health care provider,x i (t), at time t. This could happen in two ways:

(a) A sample evolution of xi(t) andxi(t), and (b) the corresponding ∆i(t) in (5) . Green areas correspond to the error caused by ∆i1(t) in (3). Orange areas correspond to the error caused by ∆i2(t) in (4). when x i (t) = 1 andx i (t) = 0, i.e., when the ith person is sick, but it has not been detected by the health care provider, and when x i (t) = 0 andx i (t) = 1, i.e., when the ith person has recovered, but the health care provider does not know that the ith person has recovered.

We denote the error caused by the former case, i.e., when x i (t) = 1 andx i (t) = 0, by ∆ i1 (t) (green areas in Fig. 2 

and we denote the error caused by the latter case, i.e., when x i (t) = 0 andx i (t) = 1, by ∆ i2 (t) (orange areas in Fig. 2 

Then, the total estimation error for the ith person ∆ i (t) is

where θ is the importance factor in [0, 1]. A large θ gives more importance to the detection of infected people, and a small θ gives more importance to the detection of recovered people.

We define the long-term weighted average difference between x i (t) andx i (t) as

Then, the overall average difference of all people ∆ is Fig. 3 . A sample evolution of (a) ∆i1(t), and (b) ∆i2(t) in a typical cycle.

Our aim is to track the infection status of all people. Due to limited resources, there is a total

Thus, our aim is to find the optimal test rates s i and c i to minimize ∆ in (7) while satisfying this total test rate constraint. We formulate the following problem,

In the next section, we find the total average difference ∆.

We first find analytical expressions for ∆ i1 (t) in (3) and ∆ i2 (t) in (4) when s i > 0 and c i > 0. We note that ∆ i1 (t) can be equal to 1 whenx i (t) = 0 and is always equal to 0 when

x i (t) = 1. Assume that at time 0, both x i (0) andx i (0) are 0. After an exponentially distributed time with rate λ i , which is denoted by W i , the ith person is infected, and thus x i (t) becomes 1.

At that time, sincex i (t) = 0, ∆ i1 (t) becomes 1. ∆ i1 (t) will be equal to 0 again either when the ith person recovers from the disease which happens after R i which is exponentially distributed with rate µ i or when the health care provider performs a test on the ith person after D i which is exponentially distributed with rate s i . We define T m (i) as the earliest time at which one of these two cases happens, i.e., T m (i) = min{R i , D i }. We note that T m (i) is also exponentially

If the ith person recovers from the disease before testing, we return to the initial case where both x i (t) andx i (t) are equal to 0 again. In this case, this cycle repeats itself, i.e., the ith person becomes sick again after W i and ∆ i1 (t) remains as 1 until either the person recovers or the health care provider performs a test which takes another T m (i) duration. If the health care provider performs a test before the person recovers, thenx i (t) becomes 1. We denote the time interval for whichx i (t) stays at 0 as I i1 which is given by

where K 1 is geometric with rate P(T m (i) = D i ) = s i µ i +s i . Due to [103, Prob. 9.4.1], K 1 ℓ=1 T m (i, ℓ) and K 1 ℓ=1 W i (ℓ) are exponentially distributed with rates s i and λ i s i µ i +s i , respectively. As

Whenx i (t) = 1, the health care provider marks the ith person as infected. The ith person recovers from the virus after R i . After the ith person recovers, either the health care provider performs a test after Z i which is exponentially distributed with rate c i or the ith person is reinfected with the virus which takes W i time. We define T u (i) as the earliest time at which one of these two cases happens, i.e., T u (i) = min{W i , Z i }. Similarly, we note that T u (i) is exponentially distributed with rate λ i + c i , and we have P(T u (i) = W i ) = λ i λ i +c i and P(T u (i) = Z i ) = c i λ i +c i . If the person is reinfected with the virus before a test is applied, this cycle repeats itself, i.e., the ith person recovers after another R i , and then either a test is applied to the ith person, or the person is infected again which takes another T u (i). If the health care provider performs a test to the ith person before the person is reinfected, the health care provider marks the ith person as healthy again, i.e.,x i (t) becomes 0. We denote the time interval thatx i (t) is equal to 1 as I i2 which is given by

where K 2 is geometric with rate P(T u (i)

are exponentially distributed with rates c i and c i µ i λ i +c i , respectively. As

We denote the time interval between the jth and (j + 1)th times thatx i (t) changes from 1 to 0 as the jth cycle

Then, using ergodicity, similar to [80] ,

.

Thus, we have

Next, we find ∆ i2 . We note that ∆ i2 (t) is equal to 1 when x i (t) = 0 in I i2 (j) and is always equal to 0 during I i1 (j). Similarly, we denote the total time duration where ∆ i2 (t) is equal to

By using (5), (14) , and (15), we obtain ∆ i as

Then, by inserting (16) in (7), we obtain ∆. In the next section, we solve the optimization problem in (8) .

In this section, we solve the optimization problem in (8) . Using ∆ i in (16) in (7), we rewrite

We define the Lagrangian function [104] for (17) as

where

for all i. The complementary slackness conditions are

First, we find s i . From (19) , we have

where we used the fact that we either have s i > 0 and ν i = 0, or s i = 0 and ν i ≥ 0, due to (21) . Here, (·) + = max(·, 0). On the other hand, when θ(c i + λ i ) < (1 − θ)µ i , we have ∂∆ i ∂s i > 0, and thus it is optimal to choose s i = 0 as our aim is to minimize ∆ in (7) . In this case, when

As we obtain the same ∆ i for all values of c i , and the total update rate is limited, i.e., n i=1 s i + c i ≤ C, in this case, it is optimal to choose c i = 0 as well (i.e., when s i = 0).

Next, we find c i . From (20) , we have

When

where we used the fact that we either have c i > 0 and η i = 0, or c i = 0 and η i ≥ 0, due to (21) . Similarly, when

Thus, for a given c i , the optimal test rate allocation policy for s i is a threshold policy where s i 's with small 1

we must have c i = 0. Thus, for a given s i , the optimal policy to determine c i is a threshold policy where c i 's with small 1

Next, we show that in the optimal policy, if s i > 0 and c i > 0 for some i, then the total test rate constraint must be satisfied with equality, i.e., n i=1 s i + c i = C.

In the optimal policy, if s i > 0 and c i > 0 for some i, then we have n i=1 s i + c i = C.

Proof: The derivatives of ∆ i with respect to s i and c i are

We note that s i > 0 in (23) implies that θ( (25) implies that (1−θ)(s i +µ i ) > θλ i . Thus, we have ∂∆ i ∂c i < 0. Therefore, in the optimal policy, if we have s i > 0 and c i > 0 for some i, then we must have n i=1 s i +c i = C. Otherwise, we can further decrease ∆ in (7) by increasing c i or s i .

Next, we propose an alternating minimization based algorithm for finding s i and c i . For this purpose, for given initial (s i , c i ) pairs, we define φ i as

Then, we define u i as

From (23) and (25), s i = u i and c i = u n+i , for i = 1, . . . , n.

Next, we find s i and c i by determining β in (29) . First, assume that, in the optimal policy, there is an i such that s i > 0 and c i > 0. Thus, by Lemma 

Then, given the initial (s i , c i ) pairs, we immediately choose u i = 0 for φ i < 0. For the remaining u i with φ i ≥ 0, we apply a solution method similar to that in [80] . By assuming φ i ≥ β, i.e., by disregarding (·) + in (29), we solve 2n i=1 u i = C for β. Then, we compare the smallest φ i which is larger than zero in (28) with β. If we have φ i ≥ β, then it implies that u i ≥ 0 for all remaining i. Thus, we have obtained u i values for given initial (s i , c i ) pairs. If the smallest φ i which is larger than zero is smaller than β, then the corresponding u i is negative and we should choose u i = 0 for the smallest non-negative φ i . Then, we repeat this procedure until the smallest non-negative φ i is larger than β. After determining all u i , we obtain s i = u i and c i = u n+i for i = 1, . . . , n. Then, with the updated values of (s i , c i ) pairs, we keep finding u i 's until the KKT conditions in (19) and (20) are satisfied.

We note that for indices (persons) i for which (s i , c i ) are zero, the health care provider does not perform any tests, and maps these people as either always infected, i.e.,x i (t) = 1 for all t, or always healthy, i.e.,x i (t) = 0. Ifx i (t) = 0 for all t,

and should choosex i (t) = 1 for all t, otherwise, without performing any tests. Finally, we note that the problem in (17) is not a convex optimization problem as the objective function is not jointly convex in s i and c i . Therefore, the solutions obtained via the proposed method may not be globally optimal. For that reason, we choose different initial starting points and apply the proposed alternating minimization based algorithm and choose the solution that achieves the smallest ∆ in (7).

In the next section, we first provide an alternative way to find the average difference ∆ in (6) and then characterize the average difference for the erroneous test measurements.

We note that the infection status of the ith person and its estimate at the health care provider form a continuous time Markov chain [105, Section 7.5] 

In this section, by finding the steady-state distribution for (x i (t),x i (t)), we provide an alternative method to find ∆ in (6) . Then, we consider the case with erroneous test measurements. For this case, we characterize the long-term average difference for the ith person denoted by ∆ e i .

When there is no error in the tests, the state transition graph is shown in Fig. 4 (a). Assuming that s i > 0, c i > 0, every state is accessible from any other state, and thus, the Markov chain induced by the system is irreducible. Note that in Section IV, we see that the testing rates for some people can be equal to 0, i.e., s i = 0 and c i = 0. For these people, we choosex i (t) to be either always 0 or 1, i.e., consider them as always healthy or sick all the time. Depending on the choice ofx i (t), when s i = 0 and c i = 0, either the states (0, 0) and (1, 0), or the states (0, 1) and (1, 1) will be transient, and thus, have 0 probability in the steady-state. By using small time-step approximation to a discrete time Markov chain, one can show that the self transition probabilities are non-zero, and thus, the Markov chain induced by the system is also aperiodic [105, Section 7.5] . Therefore, the Markov chain shown in Fig. 4 (a) admits a unique stationary distribution given by π = {π 00 , π 01 , π 10 , π 11 }. We find the stationary distribution by writing the local-balance equations which are given as π 00 λ i =π 10 µ i + π 01 c i ,

π 11 µ i =π 10 s i + π 01 λ i .

By using (30)-(33) and 2 k=1 2 ℓ=1 π kℓ = 1, we find the steady-state distribution π as

and π 00 = µ i +s i λ i π 10 , and π 11 = c i +λ i µ i π 01 . We note that ∆ i1 in (14) is also equal to π 10 in (35), i.e., we have ∆ i1 = π 10 . Similarly, ∆ i2 in (15) is equal to π 01 in (34) . Thus, by observing that the states (x i (t),x i (t)) form a continuous time Markov chain, we can find the average difference ∆ in (6) by finding the steady-state distribution for π. This method will be particularly useful in the following section where we consider the case with erroneous test measurements.

In this section, we consider the case where the test measurements can be erroneous. When a test in applied to an infected person, i.e., when x i (t) = 1, the test result will be 0 with probability q and 1 with probability 1 − q, where 0 ≤ q < 1 2 . In other words, the false negative probability is equal to q. Similarly, when a test is applied to a healthy person, i.e., when x i (t) = 0, the test result will be 1 with probability p and 0 with probability 1 − p, where 0 ≤ p < 1 2 . Thus, the false positive probability is equal to p. The probability distribution for the test measurements is provided in Table I . In this section, we consider the case where the health care provider applies only one test rate v i to the ith person, whether the person is currently marked as healthy or infected. That is, we do not consider separate testing rates of s i and c i for healthy and infected people as we did before, instead, here both s i and c i are equal o v i . Since the health care provider applies the same test rate for the ith person, here we do not consider the importance factor θ either. Then, we define the long-term average difference for the ith person with the error on the test measurements as follows, where the superscript e stands for "erroneous"

and the definitions of ∆ e i1 and ∆ e i2 follow similarly from (13) . We note that with the test rates v i and errors on the test measurements, the states (x i (t),x i (t)) form a continuous time Markov chain, and the corresponding state transition graph is shown in Fig. 4(b) . Assuming that v i > 0, one can show that there is a unique steady-state distribution π e = {π e 00 , π e 01 , π e 10 , π e 11 } which can be found by solving the local balance equations which are given as follows

Then, by using (37)-(40) and 2 k=1 2 ℓ=1 π e kℓ = 1, we find the steady-state distribution π e as π e 00 =

We note that ∆ e i1 , and ∆ e i2 are equal to π e 10 in (43), and π e 01 in (42), respectively. Thus, if v i > 0, then ∆ e i in (36) becomes

We immediately note that if false positive test probability p and false negative test probability q are equal to 0, ∆ e i becomes (14) and (15), respectively when v i = s i = c i . Then,

∂q ≥ 0 is equivalent to v i + λ i − µ i ≥ 0 which means that depending on the values of v i , µ i , and λ i , the long-term average difference ∆ e i can be an increasing function of only p or only q, or both p and q, but ∆ e i cannot be a decreasing function of both p and q. This is expected as false negative and false positive tests negatively affect the estimation process. One can also show that Next, we consider the case when v i = 0. Note that when v i = 0, the health care provider either maps these people as always sick or always healthy depending on their infection and recovery rates. Thus, when v i = 0 and depending on the estimatex i (t), two of the states in Fig. 4 (b) will never be visited and thus, these states will have 0 steady-state probabilities. For this case, the steady states are given byπ e 1,x i andπ e 0,x i . The local balance equation is λ iπ e 0,x i = µ iπ e 1,x i . By usingπ e 0,x i +π e 1,x i = 1, we find the steady-state distribution asπ e 0,x i = µ i µ i +λ i , andπ e 1,x i = λ i µ i +λ i . Thus, if µ i < λ i , i.e., if people are infected more frequently, then the health care provider chooses its estimate asx i (t) = 1 and, ∆ e i = µ i µ i +λ i . If µ i ≥ λ i , i.e., if people stay healthy more often, then we havex i (t) = 0, and ∆ e i = λ i µ i +λ i . Therefore, when v i = 0, we have

In order to find the optimal test rates v i in the case of errors on the test measurements, we formulate the following optimization problem

where the objective function is given by the summation of ∆ e i in (45) when v i > 0 and ∆ e i in (46) when v i = 0 over all people and ½{.} is the indicator function taking value 1 when {·} is true and 0, otherwise. In (47), we have a constraint on the total test rate, i.e., n i=1 v i ≤ C. We note that the optimization problem in (47) is in general not convex due to the indicator function in the objective function. However, for a given set of ½{v i = 0}, the optimization problem in (47) is convex and can be solved optimally. Thus, by solving the problem in (47) for all possible set of ½{v i = 0}, we can determine the global optimal solution which requires to solve 2 n different optimization problems which can be impractical for large n. Because of this reason, next, we provide a greedy algorithm to solve the optimization problem in (47) .

In the greedy solution, initially, assuming that ½{v i > 0} = 1 for all i, we consider the following the optimization problem

where the objective function in (48) is equal to ∆ e i in (45) . For this optimization problem, we define the Lagrangian function for (48) as

whereβ ≥ 0,ν i ≥ 0. We note that the problem defined in (48) is a convex optimization problem, and thus we can find the optimal test rates v i by analyzing the KKT and the complementary slackness conditions. The KKT conditions are given by

for all i. The complementary slackness conditions arē

By using (50) and (51), we find the optimal v i values for the problem in (48) as

With the test rates v i in (52) we find the average differences ∆ e i in (45) and then compare them with ∆ e i in (46) when v i = 0. Due to the errors in the tests, ∆ e i in (46) with v i = 0 can be smaller than ∆ e i in (45) with the test rates v i found in (52) . For these people, we choose index i where the difference between ∆ e i in (45) with the v i in (52) and ∆ e i in (46) is the highest. Then, we take v i = 0 as applying no test to this person can further decrease ∆ e i . For the remaining people, we solve the optimization problem in (48) . After obtaining the test rates for the remaining people, we again compare average differences ∆ e i with the test rates in (52) and with no test and we choose v i = 0 for the person where ∆ e i can be further decreased. We repeat these steps until all ∆ e i s with v i > 0 cannot be further decreased by choosing v i = 0. We note that the solution obtained in (52) has a threshold structure. As false positive and negative test rates increase, the term 2(1−p−q) β in (52) gets smaller. As a result, some people with a smaller portion of the population is tested with higher test rates in order to combat the test errors.

In this section, we consider the case where we have two people whose infection rates depend on each other. When these two people are healthy, they can be individually infected with the virus after an exponential time with rate λ. When one of these two people is infected and this has not been detected by the health care provider, this person can infect the other healthy person after an exponential time with rate λ 12 which has been illustrated in Fig. 5 . Thus, when both of the people are healthy, their individual infection rate is λ. However, when one of them is sick and this has not been detected by the health care provider, the healthy person's total infection rate is equal to λ + λ 12 . On the other hand, if only one person is infected, i.e., x i (t) = 1, which has also been detected by the health care provider,x i (t) = 1, then we assume that we isolate the infected person from the healthy one, and thus, the healthy person's infection rate remains as λ instead of λ + λ 12 . When the people are infected, they recover from the disease after an exponential time with rate µ.

When the health care provider thinks that a person is healthy, i.e.,x i (t) = 0, the next test is applied to this person after an exponential time with rate s. When the health care provider thinks that a person is sick, i.e.,x i (t) = 1, the next test applied to this person after an exponential time with rate c. Here, we note that since the people are identical in terms of their infection and the recovery rates, the health care provider applies the same test rates.

Similar to Section V, we note that the states {x 1 (t),x 1 (t), x 2 (t),x 2 (t)} form a continuous time Markov chain where the unique stationary distribution is given by π d = {π d 0000 , π d 0001 , . . . , π d 1111 }.

In order to find the stationary distribution, we write the local balance equations as follows

By using (53)-(68) and 2 j=1 2 ℓ=1 2 m=1 2 h=1 π d jℓmh = 1, we find the stationary distribution π d . We denote the long-term average estimation error for person i as ∆ d i for i = 1, 2, where the superscript d stands for "dependent", which is given by

where ∆ d i1 and ∆ d i2 follow from (13) . Then, we have In the context of infection tracking, it is important to know how long the estimation at the health care provider has been different from the actual infection status of the people. However, the error metric that we have considered so far does not have the time component, i.e., it only takes value 1 independent of the time duration that it has been off from the actual health status.

Motivated by the AoII introduced in [51], [102] which takes into account both the time and the information content, in this section, we consider the following error metric, where the superscript s stands for "synchronization" implied in AoII,

where V i (t) is the last time instant where the health care provider has the accurate estimation of the health status for the ith person, i.e., the last time instant when ∆ s i = 0. Similarly, we define

where V i1 (t) and V i2 (t) are equal to the last time instants when ∆ s i1 and ∆ s i2 are equal to 0, respectively. A sample evolution of ∆ s i1 and ∆ s i2 is shown in Fig. 6 and we note that ∆ s Fig. 6 . A sample evolution of (a) ∆ s i1 (t), and (b) ∆ s i2 (t) in a typical update cycle.

∆ s i1 (t) + ∆ s i2 (t). Similar to Section III, the infection and the recovery rates of the ith person are λ i and µ i , respectively. In this section, the health care provider applies only one test rate for each person denoted by w i . That is, we do not consider separate testing rates of s i and c i for healthy and infected people as we did before, instead, here both s i and c i are equal o w i . We first consider the case where w i > 0. By following the steps in Section III, one can show that E[

can be obtained by substituting w i instead of s i and c i in (10) and (12) , respectively. Next, we denote the total area when ∆ s i1 (t) > 0 as A e,1 (i, j) during the jth cycle where A e,1 (i, j) = and K 1 has a geometric distribution with success rate w i µ i +w i . Then, we have E[A e,1 (i)] = 1 w i (w i +µ i ) . Similarly, we denote the total area when ∆ s i2 (t) > 0 as A e,2 (i, j) during the jth cycle where A e,2 (i, j) =

Tu(i,ℓ) 2 2

and K 2 has a geometric distribution with success rate w i λ i +w i . Then, we have E[A e,2 (i)] = 1 w i (w i +λ i ) . By using ergodicity, the long-term average differences become ∆ s i1 = E[A e,1 (i)]

which gives

when w i > 0. One can show that ∆ s i is a decreasing function of w i , i.e., ∂∆ s i ∂w i < 0, and ∆ s i is a convex function of w i , i.e.,

When w i = 0, we have E[I i ] = µ i λ i µ i +λ i , i.e., E[I i ] is equal to the expected time of a person's healthy and sick states. Since the health care provider applies no tests to test a person, it either estimates this person to be always sick (x i (t) = 1) or always healthy (x i (t) = 0). When w i = 0

then the health care providerx i (t) = 1, andx i (t) = 0, otherwise. Thus, when w i = 0,

In order to find the optimal test rates, we formulate the following optimization problem

where the objective function in (78) is equal to the summation of ∆ s i in (77) when w i > 0 and ∆ s i when w i = 0 over all people. In order to solve the problem in (78), we follow the same greedy solution approach in Section V. First, by assuming that w i > 0, and thus, the average difference ∆ s i is given in (77), we solve the following optimization problem

Since the problem in (79) is a convex optimization problem, by defining Lagrangian function and analyzing the KKT and the complementary slackness conditions, we can find the optimal w i values. In order not to be repetitive, we skip these optimization steps. Then, we compare ∆ s i in (77) with w i values found in (79) with min{ 1

If we can reduce ∆ s i further, we choose w i = 0 for the person with the highest improvement. Then, we solve the optimization problem in (79) for the remaining people. We repeat these steps until there is no improvement in ∆ s i by choosing w i = 0. In the next section, we provide extensive numerical results to evaluate optimal test rates in various settings considered in this paper.

In this section, we provide seven numerical results. For these examples, we take λ i as

where r = 0.9 and a is such that n i=1 λ i = 6. Also, we take µ i as

where q = 1.1 and b is such that n i=1 µ i = 4. Since λ i in (80) decreases with i, people with lower indices get infected more quickly compared to people with higher indices. Since µ i in (81) increases with i, people with higher indices recover more quickly compared to people with lower indices. Thus, low index people get infected quickly and get well slowly.

In the first example, we take the total number of people as n = 10, the total test rate as C = 16, and θ = 0.5. We start with randomly chosen s i and c i such that n i=1 s i + c i = 16, and apply the alternating minimization based method proposed in Section IV. We repeat this process for 30 different initial (s i , c i ) pairs and choose the solution that gives the smallest ∆. In Fig. 7(a) , we observe that the first three people are never tested by the health care provider. We note that s i , which is the test rate whenx i (t) = 0, initially increases with i but then decreases with i. This means that people who get infected rarely are tested less frequently when they are marked as healthy. Similarly, we observe in Fig. 7 (a) that c i , which is the test rate when

x i (t) = 1, monotonically increases with i. In other words, people who recover from the virus quickly are tested more frequently when they are marked infected.

In Fig. 7 (b), we plot ∆ i resulting from the solution found from the proposed algorithm, ∆ i when the health care provider applies tests to everyone in the population uniformly, i.e., s i = c i = C 2n for all i, and ∆ i when the health care provider applies no tests, i.e., s i = c i = 0 for all i. In the case of no tests, we have ∆ i = min{ θλ i µ i +λ i , (1−θ)µ i µ i +λ i }. We observe in Fig. 7 (b) that the health care provider applies tests on people whose ∆ i can be reduced the most as opposed to uniform testing where everyone is tested equally. Thus, the first three people who have the smallest ∆ i are not tested by the health care provider. With the proposed solution, by not testing the first three people, ∆ i are further reduced for the remaining people compared to uniform testing. For the people who are not tested, the health care provider choosesx i (t) = 1 all the time, i.e., marks these people always sick as θλ i µ i +λ i > (1−θ)µ i µ i +λ i . This is expected as these people have high λ i and low µ i , i.e., they are infected easily and they stay sick for a long time.

In the second example, we use the same set of variables except for the total test rate C. We vary the total test rate C in between 5 and 20. We plot ∆ with respect to C in Fig. 8 . We observe that ∆ decreases with C. Thus, with higher total test rates, the health care provider can tract the infection status of the population better as expected.

In the third example, we use the same set of variables except for the total number of people n. In addition, we also use uniform infection and healing rates, i.e., λ i = 6 n and µ i = 4 n for all i, for comparison with λ i in (80) and µ i in (81) , while keeping the total infection and healing rates the same, i.e., n i=1 λ i = 6 and n i=1 µ i = 4, for both cases. We vary the number of people n from 2 to 30. We observe in Fig. 9 that when the infection and healing rates are uniform in the population, the health care provider can track the infection status with the same efficiency, even though the population size increases (while keeping the total infection and healing rates fixed).

For the case of λ i in (80) and µ i in (81), when we increase the population size, we increase the number of people who rarely get sick, i.e., people with high i indices, and also people who rarely heal from the disease, i.e., people with small i indices. Thus, it gets easier for the health care provider to track the infection status of the people. That is why when we use λ i in (80) and µ i in (81), we observe in Fig. 9 that the health care provider can track the infection status of the people better, even though the population size increases.

In the fourth example, we use the same set of variables as the first example except for the importance factor θ. Here, we vary θ in between 0.2 to 0.7. We plot ∆ in (7),∆ 1 which is Fig. 10 (a). Note that∆ 1 represents the average difference when people are infected, but they have not been detected by the health care provider, and∆ 2 represents the average difference when people have recovered, but the health care provider still marks them as infected. Note that when θ is high, we give importance to minimization of∆ 1 , i.e., the early detection of people with infection, and when θ is low, we give importance to minimization of∆ 2 , i.e., the early detection of people who recovered from the disease. That is why we observe in Fig. 10 (a) that∆ 1 decreases with θ while∆ 2 increases with θ.

We plot the total test rates n i=1 s i and n i=1 c i in Fig. 10 (b). We observe in Fig. 10 (b) that if it is more important to detect the infected people, i.e., if θ is high, then the health care provider should apply higher test rates to people who are marked as healthy. In other words, n i=1 s i increases with θ. Similarly, if it is more important to detect people who recovered from the disease, then the health care provider should apply high test rates to people who are marked as infected. That is, n i=1 c i is high when θ is low. Therefore, depending on the priorities of the health care provider, a suitable θ needs to be chosen.

In the fifth numerical result, we consider the case where there are errors in the test measurements, i.e., the model in Section V. We take the total test rate as C = 20, and vary error rates in the test p = q = {0.1, 0.2, 0.4}. In Fig. 11(a) , we provide the test rates v i that we found as a result of our greedy policy in Section V. When the error rates p and q are low, i.e., when p = q = 0.1, we see that the health care provider applies tests to everyone in the population and the corresponding ∆ e i is lower than applying no test as shown in Fig. 11(b) . As we increase the error rates, we see that some people in the population start to be not tested by the health care provider, see Fig. 11 (a) when p = q = {0.2, 0.4}. In this case, the health care provider applies more tests to remaining people to combat with the test errors. However, even though it applies more tests to the remaining people, we observe in Fig. 11 (b) that the achieved average difference ∆ e i gets higher as error rates increase. each other, then the average time that person 1 or 2 is sick is equal to λ λ+µ = 1 3 . As we increase infection rate λ 12 among the person 1 and 2, we see in Fig. 12 (a) that the average time that person 1 is sick increases. However, we note that as we increase the total test rate, the health care provider can detect a sick person more frequently, and that is why the average infected time is low in Fig. 12 (a) when the test rate is high. Then, we consider λ 12 = {5, 10, 15} and vary the total test rates λ = {2, . . . , 200}. We plot the average time that both person 1 and 2 stay as sick in Fig. 12(b) . As we increase the total test rate, the health care provider detects the infected person more quickly, and thus, prohibits the infection from spreading. As a result, we observe that the average time that both people are infected decreases in C in Fig. 12(b) . Since both people can be infected with the virus independent from each other with rate λ, the plots in Fig. 12(b) do not go down to 0.

In the last numerical result, we consider the age of incorrect information based error metric in Section VII. Here, the estimation error increases with the time that the health care provider does not detect the changes in the infection status of the people. As a result, the average difference expression given by ∆ s i in (77) is different than ∆ e i in (45) when p = q = 0. For this example, we consider the total test rate C = 4 and compare the normalized average differences given by and the corresponding test rates w i and v i . In Fig. 13(b) , depending on the error metric model, people who are tested by the health care provider and their test rates vary considerably. For example, with the error metric ∆ s i in (77), we apply tests to the 3rd person while the same person is not tested with the error metric ∆ e i in (45) . In Fig. 13(a) , we provide the normalized average difference values. Here, the average normalized error for the tested people has similar values whereas the normalized difference may vary for the untested people. Thus, we should choose a suitable error metric that satisfies the priorities of the health care provider as it greatly affects who is tested and with which test rates.

We considered timely tracking of infection status of individuals in a population. For exponential infection and healing processes with given rates, we determined the rates of exponential testing processes. We considered errors on the test measurements and observed that in order to combat the test errors, a limited portion of the population may be tested with higher test rates.

Then, we studied a dependent infection spread model for two people where an infected person can spread the virus to the other one if it has not been detected by the health care provider yet.

Finally, we studied an AoII based error metric where the error function linearly increases over time as the changes on the infection status has not been detected by the health care provider.

We observed in numerical results that the test rates depend on the individuals' infection and healing rates, the individuals' last known state of healthy or infected, as well as the health care provider's priorities of detecting infected people versus detecting recovered people more quickly.

In the literature, in order to model epidemics, population is partitioned into groups called compartments. One such example is the SIR model used in [106] with the compartments susceptible (S), infected (I), and recovered (R) which has been further developed by adding states hospitalized (H), and death (D) in [107] . In these epidemic models, the transitions between the compartments are assumed to be Markovian. In [107] , with the epidemiological data, the delay distributions for the infected (I) to hospitalized (H), and infected (I) to death (D) are well approximated by exponential and gamma distributions, respectively. However, due to the lack of data availability the delay distribution for infected (I) to recovered (R) is modeled with gamma distribution with higher tolerance. In our work, we modeled infection and recovery times, i.e., the delays between recovered (R) to infected (I) and infected (I) to recovered (R) with exponential distributions. Therefore, more realistic infection tracking models can be developed by considering gamma distributions as observed in [107] . This more realistic model corresponds to the problem of real-time timely tracking of a binary Markov source in a serially connected network. The serially connected network model has been studied in [8] with the traditional age of information metric. We note that considering the same networking model with the AoII based error metric to track information dissemination of a binary Markov source is a promising research direction and has direct applications to the real-time tracking of epidemic spread models. One can also study the extension of dependent infection spread model in Section VI to n > 2 people as a future research direction.

Real-time status: How often should one update?

Scheduling policies for minimizing age of information in broadcast wireless networks

Age of information with a packet deadline

Update or wait: How to keep your data fresh

Status updates in a multi-stream M/G/1/1 preemptive queue

Age of information in G/G/1/1 systems: Age expressions, bounds, special cases, and optimization

Age of information with Gilbert-Elliot servers and samplers

The age of information in networks: Moments, distributions, and sampling

Minimizing age-of-information in multi-hop wireless networks

Age of information in multi-source systems

The age of information in multihop networks

Multicast with prioritized delivery: How fresh is your data?

Age of information in two-hop multicast networks

Age of information in multihop multicast networks

Age of information in multicast networks with multiple update streams

Minimizing age of information in a multihop wireless network

Fundamental bounds on the age of information in multi-hop global status update networks

Optimal and scalable distribution of content updates over a mobile social network

Reconstruction of counting process in real-time: The freshness of information through queues

Remote estimation of the Wiener process over a channel with random delay

Information aging through queues: A mutual information perspective

Remote estimation over a packet-drop channel with Markovian state

Age of incorrect information for remote estimation of a binary Markov source

Sample, quantize and encode: Timely estimation over noisy channels

Who should Google Scholar update more often?

Achieving the age-energy trade-off with a finite-battery energy harvesting source

Sending information through status updates

Coded status updates in an energy harvesting erasure channel

Optimal status update for age of information minimization with an energy harvesting source

Optimal status updating for an energy harvesting sensor with a noisy channel

Minimizing age of information for an energy harvesting source with updating failures

Age-minimal online policies for energy harvesting sensors with incremental battery recharges

Age-minimal online policies for energy harvesting sensors with random battery recharges

Age-minimal transmission for energy harvesting sensors with finite batteries: online policies

Online timely status updates with erasures for energy harvesting sensors

Using erasure feedback for online timely updating with an energy harvesting sensor

Timely status updating over erasure channels using an energy harvesting sensor: Single and multiple sources

Average age of information for status update systems with an energy harvesting server

Age of information minimization for an energy harvesting cognitive radio

Age of information in a multiple access channel with heterogeneous traffic and an energy harvesting node

Throughput maximization with an average age of information constraint in fading channels

Peak-age violation guarantees for the transmission of short packets over fading channels

Age of information with soft updates

Minimizing age of information with soft updates

Age of information scaling in large networks

Age of information scaling in large networks with hierarchical cooperation

Scaling laws for age of information in wireless networks

Minimizing content staleness in dynamo-style replicated storage systems

Not just age but age and quality of information

Towards the tradeoff between service performance and information freshness

The age of incorrect information: an enabler of semantics-empowered communication

Semantic communications in networked systems

Age-of-information vs. value-of-information scheduling for cellular networked control systems

Timely group updating

Fundamental limits of age-of-information in stationary and non-stationary environments

Timeliness in lossless block coding

Timely lossless source coding for randomly arriving symbols

Optimal lossless source codes for timely updates

Optimal source codes for timely updates

Optimal selective encoding for timely updates

Optimal selective encoding for timely updates with empty symbol

Selective encoding policies for maximizing information freshness

Age of information with finite horizon and partial updates

On timely channel coding with hybrid ARQ

Timely transmissions using optimized variable length coding

Partial updates: Losing information for freshness

Average peak age-of-information minimization in UAV-assisted IoT networks

Age-optimal trajectory planning for UAV-assisted data collection

On the role of age of information in the internet of things

Joint information freshness and completion time optimization for vehicular networks

Distributed maintenance of cache freshness in opportunistic mobile networks

Age-optimal constrained cache updating

Information freshness and popularity in mobile caching

Towards fresh and low-latency content delivery in vehicular networks: An edge caching aspect

Age of information aware cache updating with file-and agedependent update durations

Two freshness metrics for local cache refresh

Edge caching with real-time guarantees

Maximizing information freshness in caching systems with limited cache storage capacity

Cache freshness in information updating systems

Information freshness in cache updating systems

Freshness based cache updating in parallel relay networks

Optimizing information freshness in two-hop status update systems under a resource constraint

Age-of-information for computation-intensive messages in mobile edge computing

Reducing age-of-information for computation-intensive messages via packet replacement

Trading off computation with transmission in status update systems

Age of information for updates with distortion

Age of information for updates with distortion: Constant and age-dependent distortion constraints

Timely updates in distributed computation systems with stragglers

Timely distributed computation with stragglers

Optimizing information freshness through computation-transmission tradeoff and queue management in edge computing

Timely communication in federated learning

Age-based coded computation for bias reduction in distributed learning

A reinforcement learning approach to age of information in multi-user networks with HARQ

The age of gossip in networks

Age of gossip in networks with community structure

Gossiping with binary freshness metric

Timely gossiping with file slicing and network coding

Age of information: A new concept, metric, and tool

Age of information: A new metric for information freshness

Age of information: An introduction and survey

Optimal real-time monitoring of an information source under communication costs

The age of incorrect information: A new performance metric for status updates

Probability and Stochastic Processes

Convex Optimization

Introduction to Probability

A time-dependent SIR model for covid-19 with undetectable infected persons

A data-informed approach for analysis, validation, and identification of covid-19 models