key: cord-0109803-46qyzpem authors: Hoeher, Peter Adam; Damrath, Martin; Bhattacharjee, Sunasheer; Schurwanz, Max title: On Mutual Information Analysis of Infectious Disease Transmission via Particle Propagation date: 2021-01-28 journal: nan DOI: nan sha: 23c09ce6a763ed934793f8aaa92caacee6660cb0 doc_id: 109803 cord_uid: 46qyzpem

Besides mimicking bio-chemical and multi-scale communication mechanisms, molecular communication forms a theoretical framework for virus infection processes. Towards this goal, aerosol and droplet transmission has recently been modeled as a multiuser scenario. In this letter, the "infection performance" is evaluated by means of a mutual information analysis, and by an even simpler probabilistic performance measure which is closely related to absorbed viruses. The so-called infection rate depends on the distribution of the channel input events as well as on the transition probabilities between channel input and output events. The infection rate is investigated analytically for five basic discrete memoryless channel models. Numerical results for the transition probabilities are obtained by Monte Carlo simulations for pathogen-laden particle transmission in four typical indoor environments: two-person office, corridor, classroom, and bus. Particle transfer contributed significantly to infectious diseases like SARS-CoV-2 and influenza.

Motivated by Claude E. Shannon's fundamental model of a noisy transmission system [1, Fig. 1], viral aerosol information retrieval in communication through exhaled breath was first studied in [2]. Respiratory events are exploited as a source message in a human-to-machine communication setup. After transmission via an atmospheric channel, the message is proposed to be scanned by a nanosensor-based machine-type detector. Subsequently, in [3] the same authors developed a mathematical analysis model of aerosol transmission and detection in order to examine detection probabilities and ranges.
Related work on end-to-end system modeling has been published in [4], but for a human-to-human communication setup, where transmitter and receiver are interpreted as infected and uninfected humans, respectively. Also, drag and buoyancy are taken into account in [4]. Inspired by papers such as [2], [4], the duality between molecular communication and pathogen-laden particle transmission has been explored in [5], [6]. Specifically, the analogy to a multiuser communication scenario is elaborated. In [5], it is briefly suggested to use mutual information to measure the "infection performance" of a particle-based transmission system. Possible mutual information minimization techniques to reduce the risk of infection are pointed out in [6]. In the context of infectious disease transmission via particles, novel contributions of the letter include
• an analytical investigation of the mutual information for basic discrete channel models,
• a critical review of a logarithmic infection measure, and
• numerical evaluations of the transition probabilities considering typical environments.

P. A. Hoeher, M. Damrath, S. Bhattacharjee, and M. Schurwanz are with the Faculty of Engineering, Kiel University, Kiel, Germany, e-mail: {ph,md,sub,masc}@tf.uni-kiel.de.

In the literature, the distinction between aerosols (small lightweight particles suspended in air) and droplets (larger particles) is not defined uniquely. Therefore, we subsequently refer to particles as a general expression if size is not relevant. Pathogen-laden particles are dubbed infected particles.

Subsequently, the end-to-end system is simplified by a discrete memoryless channel (DMC) model. The validity of this assumption is checked in Section IV. Let us denote the events at the channel input by $x_i \in \mathcal{X}$, $1 \le i \le L_X$, where $\mathcal{X}$ is the input alphabet of cardinality $L_X = |\mathcal{X}|$. Correspondingly, the channel output events are denoted as $y_j \in \mathcal{Y}$, $1 \le j \le L_Y$, where $\mathcal{Y}$ is the output alphabet of cardinality $L_Y = |\mathcal{Y}|$.
Channel inputs and outputs are randomly distributed. The corresponding random variables are $X$ and $Y$, respectively. $X$ and $Y$ are assumed to be discrete-valued. Furthermore, let $p_X(x_i)$, $p_Y(y_j)$, $p_{X,Y}(x_i, y_j)$, and $p_{Y|X}(y_j|x_i)$ be the marginal probability mass function of $X$, the marginal probability mass function of $Y$, the joint probability mass function of $X$ and $Y$, and the conditional probability mass function of $Y$ given $X$, respectively. With these notations, the average mutual information can be expressed as [7]

$$I(X;Y) = \sum_{i=1}^{L_X} \sum_{j=1}^{L_Y} p_{X,Y}(x_i, y_j) \log_2 \frac{p_{X,Y}(x_i, y_j)}{p_X(x_i)\, p_Y(y_j)}. \quad (1)$$

In terms of the input distribution $p_X(x_i)$ and the transition probabilities $p_{Y|X}(y_j|x_i)$, (1) can be reformulated as

$$I(X;Y) = \sum_{i=1}^{L_X} \sum_{j=1}^{L_Y} p_X(x_i)\, p_{Y|X}(y_j|x_i) \log_2 \frac{p_{Y|X}(y_j|x_i)}{\sum_{i'=1}^{L_X} p_X(x_{i'})\, p_{Y|X}(y_j|x_{i'})}. \quad (2)$$

The transition probabilities are a characteristic of the channel, including receiver-side properties (like the sensitive area), whereas the input distribution is independent of the channel. This model separation proves to be quite useful. Regarding data transmission, the maximum transmission rate for which a quasi-error-free data transmission is possible, i.e., the channel capacity, is obtained by maximizing the mutual information [1], [7]. Examples of recent publications considering particle transmission include [8], [9].

arXiv:2101.12121v1 [cs.IT] 28 Jan 2021

Concerning infection, neither the entire mutual information $I(X;Y) = \sum_j I(X; Y = y_j)$ nor $\sum_{j \in \mathcal{Y}_I} I(X; Y = y_j)$ is relevant, where $\mathcal{Y}_I$ is the set of infectious output events, but only the individual contributions for all $j \in \mathcal{Y}_I$. $R = I(X; Y \in \mathcal{Y}_I)$ is dubbed infection rate and is measured in bit/channel event. For $n$ events, the mutual infection is $n R$. As opposed to data communication systems, mutual information should be minimized in this context.

In the first example, the point-to-point scenario shown in Fig. 1 (a) is considered. The two channel input events are labeled "0" (referring to noninfected particles) and "1" (comprising all infected particles), respectively.
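The per-output contributions $I(X; Y = y_j)$ that enter the infection rate can be evaluated numerically from an input distribution and a transition matrix. The following is a minimal sketch under stated assumptions: the function names `mi_contributions` and `infection_rate` and the list-of-lists matrix layout are illustrative choices, not part of the letter.

```python
import math

def mi_contributions(p_x, p_y_given_x):
    """Return [I(X; Y = y_j)] in bit for every output index j.

    p_x[i]            : input probability p_X(x_i)
    p_y_given_x[i][j] : transition probability p_{Y|X}(y_j | x_i)
    """
    n_out = len(p_y_given_x[0])
    # Output marginal p_Y(y_j) = sum_i p_X(x_i) * p_{Y|X}(y_j | x_i)
    p_y = [sum(p_x[i] * p_y_given_x[i][j] for i in range(len(p_x)))
           for j in range(n_out)]
    contributions = []
    for j in range(n_out):
        total = 0.0
        for i, px in enumerate(p_x):
            joint = px * p_y_given_x[i][j]
            if joint > 0.0:  # terms with zero joint probability contribute 0
                total += joint * math.log2(p_y_given_x[i][j] / p_y[j])
        contributions.append(total)
    return contributions

def infection_rate(p_x, p_y_given_x, infectious_outputs):
    """Individual contributions restricted to the infectious output indices."""
    contributions = mi_contributions(p_x, p_y_given_x)
    return {j: contributions[j] for j in infectious_outputs}

# Illustrative values only: uniform binary input, symmetric channel.
example = mi_contributions([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])
print(example, sum(example))
```

Summing the returned list over all outputs gives the average mutual information, whereas the infection rate keeps only the entries belonging to infectious output events.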
Assume that the emitted particles are infected with probability $p_1$, i.e., $p_X(0) = 1 - p_1$ and $p_X(1) = p_1$. The channel output event is labeled "1" for absorbed viruses and "0" for unabsorbed viruses. Naturally, the transition probabilities leaving 0 are $p_{Y|X}(0|0) = 1$ and $p_{Y|X}(1|0) = 0$, because non-infected particles are not able to cause an infection at the receiver side. Conversely, the transition probabilities leaving 1 are $p_{Y|X}(0|1) = 1 - q_1$ and $p_{Y|X}(1|1) = q_1$, respectively. Only a fraction $q_1$ of infected particles will eventually cause the receiver to become infected. The only infectious output event is "1". It follows that $p_Y(0) = 1 - p_1 q_1$ and $p_Y(1) = p_1 q_1$. The mutual information between $X$ and $Y = 1$ can be written in closed form as

$$I(X; Y = 1) = -q_1 \cdot p_1 \log_2 p_1. \quad (3)$$

The maximum, $-(1/e) \log_2(1/e) \cdot q_1$, is attained at $p_1 = 1/e$. For $0 \le p_1 \le 1/e$, $I(X; Y = 1)$ is monotonically increasing, as desired. Beyond this maximum, however, it decreases although a larger fraction of the particles is infected, so the so-called logarithmic infection measure is meaningless there. We subsequently show that $p_1 \cdot q_1$, called linear infection measure, is a simpler yet more realistic infection measure.

In the second example, a second transmitter is added, see Fig. 1 (b). Now, $L_X = 4$ and $L_Y = 2$. The additional input and transition probabilities are called $p_2$ and $q_2$, respectively. The mutual information between $X = (X_1, X_2)$ and $Y = 1$ can be computed as

$$I(X; Y = 1) = -q_1 \cdot p_1 \log_2 p_1 - q_2 \cdot p_2 \log_2 p_2. \quad (4)$$

Note that any additional transmitter in a multipoint-to-point scenario linearly increases the mutual information.

Starting from Example 1, in the third example a second receiver is added (see Fig. 1 (c)), i.e., $L_X = 2$ and $L_Y = 4$. In this point-to-multipoint scenario, the mutual information between $X$ and $Y_1 = 1$ and between $X$ and $Y_2 = 1$ is obtained as

$$I(X; Y_1 = 1) = -q_1 \cdot p_1 \log_2 p_1, \quad (5)$$
$$I(X; Y_2 = 1) = -q_2 \cdot p_1 \log_2 p_1. \quad (6)$$

Compared to Scenario (b), there are fewer parameters involved because of the common input.
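The behaviour of the logarithmic measure $-q_1 \cdot p_1 \log_2 p_1$ around $p_1 = 1/e$, and its contrast with the linear measure $p_1 \cdot q_1$, can be checked numerically. This is a small sketch; the value $q_1 = 0.2$, the grid resolution, and the function names are arbitrary illustrative choices and do not come from the letter.

```python
import math

def log_infection_measure(p1, q1):
    """Closed-form Z-channel contribution I(X; Y=1) = -q1 * p1 * log2(p1)."""
    return -q1 * p1 * math.log2(p1) if p1 > 0.0 else 0.0

def linear_infection_measure(p1, q1):
    """Linear infection measure p1 * q1."""
    return p1 * q1

def multi_transmitter_measure(pairs):
    """Sum of per-transmitter terms for (p_k, q_k) pairs, as in the two-transmitter case."""
    return sum(log_infection_measure(p, q) for p, q in pairs)

q1 = 0.2  # hypothetical transition probability
grid = [k / 1000 for k in range(1, 1001)]
p_star = max(grid, key=lambda p: log_infection_measure(p, q1))
print("argmax on grid:", p_star, " 1/e =", 1 / math.e)

# Beyond 1/e the logarithmic measure falls while the linear one keeps rising.
for p1 in (0.2, 0.4, 0.9):
    print(p1, log_infection_measure(p1, q1), linear_infection_measure(p1, q1))
```

The grid search only locates the maximum to within the grid spacing; it is meant as a sanity check of the closed form rather than as an optimization method.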
As opposed to Scenario (b), the infection rates act individually rather than additively.

In the fourth example, two Z channels are concatenated as depicted in Fig. 1 (d). This scenario emulates relaying with sequential processing. Effectively, there are two receivers in this scenario. The first receiver, $Y$, is infected in the same way as in Scenario (a). After a certain time delay $D$, receiver $Y$ may turn into a transmitter. The second receiver, $Z$, can only be infected by $Y$. If $D$ is shorter than the incubation time, $Y$ is not able to amplify the viral load. In this case, dubbed passive relaying, the mutual information between $X$ and $Z = 1$ is obtained in closed form as

$$I(X; Z = 1) = -q_2 \cdot p_1 q_1 \log_2 (p_1 q_1). \quad (7)$$

According to the data processing theorem, $I(X; Z = 1) \le I(X; Y = 1)$. This is explained by the fact that $p_Y(1) = p_1 \cdot q_1 \le p_X(1)$. As a result, the relation to the initial particle spreader diminishes with an increasing number of relays. For passive relaying, the overall DMC between $X$ and $Z$ is again a Z channel with transition probabilities $p_{Z|X}(0|0) = 1$, $p_{Z|X}(0|1) = 1 - q_1 q_2$, and $p_{Z|X}(1|1) = q_1 q_2$. If, however, $D$ exceeds the incubation time, the relay acts as an active spreader. In that case,

$$I(X; Z = 1) = -q_2 \cdot p_2 \log_2 p_2, \quad (8)$$

where $p_Y(1) = p_2$ depends on the efficiency of the relay to boost the viral load.

A natural extension of the Z channel in Scenario (a) is shown in Fig. 1 (e). In this refinement, the binary input alphabet is replaced by a ternary input alphabet ($L_X = 3$). Input event "0" again corresponds to non-infectious particles, but now input events "1" and "2" are assigned to aerosols and droplets, respectively. The output events are kept unchanged ($L_Y = 2$). Let us assume that emitted aerosols are infected with probability $p_X(1) = p_1$, and emitted droplets with probability $p_X(2) = p_2$. Accordingly, the transition probabilities causing an infection are denoted as $p_{Y|X}(1|1) = q_1$ and $p_{Y|X}(1|2) = q_2$, respectively.
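For the passive relaying case, the statement that two concatenated Z channels again form a Z channel with parameter $q_1 q_2$ can be verified by multiplying the two transition matrices. The sketch below uses hypothetical values for $q_1$ and $q_2$, and the helper names `z_channel` and `cascade` are illustrative.

```python
def z_channel(q):
    """Transition matrix of a Z channel: rows are inputs 0/1, columns outputs 0/1."""
    return [[1.0, 0.0],
            [1.0 - q, q]]

def cascade(first, second):
    """Transition matrix of two channels in series (plain matrix product)."""
    return [[sum(first[i][k] * second[k][j] for k in range(len(second)))
             for j in range(len(second[0]))]
            for i in range(len(first))]

q1, q2 = 0.3, 0.5  # hypothetical per-hop transition probabilities
overall = cascade(z_channel(q1), z_channel(q2))
print(overall)  # second row is [1 - q1*q2, q1*q2] up to rounding
```

The first row stays at (1, 0), which reflects that a non-infected input can never lead to an infection at either receiver.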
Since event 0 is not able to cause an infection, $p_{Y|X}(0|0) = 1$. The mutual information between $X$ and $Y = 1$ can be computed as

$$I(X; Y = 1) = -q_1 \cdot p_1 \log_2 p_1 - q_2 \cdot p_2 \log_2 p_2. \quad (9)$$

Formally, this result is equivalent to the two-user case in Scenario (b). An extension to multiple particle sizes is apparent. The number of absorbed viruses can be expressed as

$$\Phi = n \sum_{d} p(d)\, q(d)\, N(d)\, \eta(d), \quad (10)$$

where $n$ is the number of respiratory events, $p(d)$ and $q(d)$ are the input and transition probabilities as a function of particle size $d$, $N(d)$ is the corresponding number of emitted particles per event, and $\eta(d)$ is the number of viruses per particle. A subject is said to be infected if the absorbed viral load $\Phi$ exceeds an infection threshold $\Theta$. Eqn. (10) in conjunction with the nonbinary-input channel model (e) shows the relevance of the proposed linear infection measure.

In this section, transition probabilities will be evaluated for the four environments depicted in Fig. 2. These environments are interesting from the perspectives of communication theory (point-to-point, point-to-multipoint, and multipoint-to-multipoint scenarios), epidemiology (short-range and long-range transmission), and socio-economics.

Environment (a) shows a typical office space occupied by two people sitting face to face at their desks for a long period of time, one of whom is assumed to be infected. A duration of 4 h, for example, corresponds to $n = 240$ coughs.

In Environment (b), two people meet in a hallway. They are looking at each other, but continue walking. At a distance of 1 m, the infected person coughs once ($n = 1$). The noninfected person is walking at a speed of 1 m/s (that is, 3.6 km/h) through the emitted particle cloud.

Environment (c) features a fully occupied classroom. One student as well as the teacher are assumed to be infected. The teacher's position is uniformly distributed in front of the class. A lesson typically lasts for 90 min ($n = 90$).
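The role of the event count $n$ and of the linear measure $p(d) \cdot q(d)$ in the absorbed viral load can be illustrated with a short computation. This sketch assumes that the load is the product-form sum $n \sum_d p(d)\, q(d)\, N(d)\, \eta(d)$ over discrete size classes, which is an assumption about the exact form of Eqn. (10). All numbers, including the threshold, are hypothetical placeholders rather than simulation results.

```python
def absorbed_viral_load(n_events, size_classes):
    """Assumed form: Phi = n * sum_d p(d) * q(d) * N(d) * eta(d)."""
    per_event = sum(c["p"] * c["q"] * c["N"] * c["eta"] for c in size_classes)
    return n_events * per_event

def is_infected(phi, theta):
    """A subject counts as infected if the absorbed load exceeds the threshold."""
    return phi > theta

# Hypothetical size classes: p = infected fraction, q = transition probability,
# N = emitted particles per event, eta = viruses per infected particle.
classes = [
    {"p": 0.03, "q": 0.010, "N": 4000, "eta": 1.0},   # small particles
    {"p": 0.50, "q": 0.001, "N": 900,  "eta": 50.0},  # large particles
]
theta = 500.0  # hypothetical infection threshold

for label, n in (("office", 240), ("corridor", 1), ("classroom", 90)):
    phi = absorbed_viral_load(n, classes)
    print(label, n, phi, is_infected(phi, theta))
```

Because the load scales linearly with $n$, the same per-event parameters give very different outcomes for a single encounter and for a multi-hour exposure.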
A similar situation is studied in Environment (d), where a 2 × 2-seater bus is assumed: although the bus is occupied by just 50 %, any infected person boarding or leaving the bus is likely to infect surrounding passengers. A single cough event is emulated at the position drawn in the figure.

Random trajectories of exhaled particles are emulated by means of Monte Carlo simulations. In all cases, the height $h$ of the mouth is taken to be 1.20 m and 1.64 m for sitting and standing people, respectively. One or two infected subjects are considered, coughing once per minute. Per cough, 4973 water particles are expected to be released at an initial speed of 11.2 m/s, see [4] and reference [37] therein. Their diameter $d$ is taken from [4] and reference [29] therein, where $d_{\min} = 2$ µm and $d_{\max} = 2000$ µm. Since the size of a SARS-CoV-2 virion is approximately 0.1 µm $\ll d_{\min}$, the number of viruses encapsulated per water particle is proportional to $d^3$. For instance, given an oral fluid average viral load of $7 \cdot 10^6$ copies per milliliter, the probability that a 20 µm particle contains a virion is about 3 %. Since the viral load depends on the course of the disease and varies individually, we have varied the viral load and hence $p_X(1)$ over a wide range. The temporal resolution is $\Delta T = 10^{-4}$ s, which is sufficiently small given a maximum particle velocity on the order of 10 m/s and a spatial resolution better than 1 mm. The beam width of emitted particles is zero-mean Gaussian distributed, $\mathcal{N}(0, 6.25^\circ)$, in horizontal and vertical directions [5]. The mean value of the angle of incidence is also assumed to be Gaussian distributed to account [...] where $v_{x,y,z}[0]$ are the initial velocities in the direction of the $x$, $y$, $z$ axes. The air drag is determined by $\alpha = (\beta/m)\,\Delta T$, where $\beta = 3\pi\eta d$ is the Stokes drag coefficient and $\eta \approx 1.85 \cdot 10^{-5}$ kg/(m s) is the dynamic viscosity [10].
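Two of the numbers quoted above can be reproduced with a few lines of arithmetic: the roughly 3 % chance that a 20 µm particle carries a virion, and the drag factor $\alpha = (\beta/m)\,\Delta T$. The sketch assumes spherical water particles with a density of 998 kg/m³ and approximates the probability by the expected number of copies per particle, which is reasonable only when that number is small. The function names are illustrative.

```python
import math

def particle_volume_ml(d_um):
    """Volume of a sphere of diameter d_um (micrometres) in millilitres."""
    d_m = d_um * 1e-6
    volume_m3 = math.pi / 6.0 * d_m ** 3
    return volume_m3 * 1e6  # 1 m^3 = 1e6 mL

def expected_virions(d_um, copies_per_ml):
    """Expected number of virus copies in one particle."""
    return copies_per_ml * particle_volume_ml(d_um)

def drag_factor(d_um, dt=1e-4, eta=1.85e-5, rho=998.0):
    """alpha = (beta / m) * dt with Stokes drag beta = 3*pi*eta*d.

    rho (kg/m^3) is an assumed water density; the mass is that of a sphere.
    """
    d_m = d_um * 1e-6
    beta = 3.0 * math.pi * eta * d_m
    mass = rho * math.pi / 6.0 * d_m ** 3
    return beta / mass * dt

print(expected_virions(20.0, 7e6))  # on the order of 0.03, i.e. about 3 %
print(drag_factor(20.0))
```

Since the particle volume grows with $d^3$, doubling the diameter multiplies the expected number of copies by eight, while the drag factor falls with $d^2$.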
Insertion of the mass $m = \rho_{\mathrm{water}}\, (\pi/6)\, d^3$ yields $\alpha = 18 \eta \Delta T / (\rho_{\mathrm{water}}\, d^2)$, where $\rho_{\mathrm{water}}$ decreases from approximately 998 kg/m³ at 20 °C to 994 kg/m³ at 35 °C. Note that $\rho_{\mathrm{water}} \gg \rho_{\mathrm{air}}$. Neglecting buoyancy, turbulence, and droplet shrinking in dry air, and assuming that the particles do not interact with each other, the maximum (terminal) vertical velocity is $|v_{z,\infty}| = m g/\beta$ if the particles are emitted horizontally. Consequently, the channel model is memoryless if the time difference between respiratory events exceeds the fall time $h/|v_{z,\infty}|$ for most of the pathogen-laden particles.

Numerical results obtained by Monte Carlo simulations are depicted in Fig. 3. For each scenario, averaging is performed over 90 to 240 statistically independent runs. Besides the infection rate, the number of events, $n$, is an important parameter for the probability of infection. Taking this into account, the office and the classroom are the most critical environments. Beyond a certain viral load the mutual information does not increase any more, which is counterintuitive.

The so-called infection rate is a novel quantity to mathematically measure the "infection performance." Besides a logarithmic infection rate based on the mutual information, an even simpler yet more realistic linear infection measure is suggested. Unlike transmission schemes targeting a maximization of the mutual information, in the area of pathogen-laden particle transmission the objective is to minimize the infection rate. Both infection rates are completely determined by the probability distribution of the channel input events and by the transition probabilities between input and output events. This model separation proves to be useful. The input probabilities are affected by the kind of respiratory event, masks, etc. The transition probabilities are affected by distances, particle flow, and many other parameters. In this contribution, the mutual information has been calculated in closed form for several basic discrete memoryless channel models.
It is shown that any additional infected transmitter linearly increases the infection rate. Furthermore, active and passive relaying are considered. By means of particle-based Monte Carlo simulations, the infection rates have been emulated for four typical environments: (a) two-person office space; (b) corridor; (c) classroom; (d) bus. Scenarios (a) and (c) are most critical.

[1] A mathematical theory of communication
[2] Communication through breath: Aerosol transmission
[3] Modeling of viral aerosol transmission and detection
[4] A molecular communication perspective on airborne pathogen transmission and reception via droplets generated by coughing and sneezing
[5] Duality between coronavirus transmission and air-based macroscopic molecular communication
[6] Infectious disease transmission via aerosol propagation from a molecular communication perspective: Shannon meets coronavirus
[7] Elements of Information Theory
[8] An information theoretic framework to analyze molecular communication systems based on statistical mechanics
[9] Capacities and optimal input distributions for particle-intensity channels
[10] Falling dynamics of SARS-CoV-2 as a function of respiratory droplet size and human height