Salathé, Marcel; Jones, James H. Dynamics and control of diseases in networks with community structure. PLoS Comput Biol.

The dynamics of infectious diseases that spread via direct person-to-person transmission (such as influenza, smallpox, and HIV/AIDS) depend on the underlying host contact network. Human contact networks exhibit strong community structure. Understanding how such community structure affects epidemics may provide insights for preventing the spread of disease between communities by changing the structure of the contact network through pharmaceutical or non-pharmaceutical interventions. We use empirical and simulated networks to investigate the spread of disease in networks with community structure. We find that community structure has a major impact on disease dynamics, and we show that in networks with strong community structure, immunization interventions targeted at individuals bridging communities are more effective than those simply targeting highly connected individuals. Because the structure of relevant contact networks is generally not known, and vaccine supply is often limited, there is great need for efficient vaccination algorithms that do not require full knowledge of the network. We developed an algorithm that acts only on locally available network information and is able to quickly identify targets for successful immunization intervention. The algorithm generally outperforms existing algorithms when vaccine supply is limited, particularly in networks with strong community structure. Understanding the spread of infectious diseases and designing optimal control strategies is a major goal of public health. Social networks show marked patterns of community structure, and our results, based on empirical and simulated data, demonstrate that community structure strongly affects disease dynamics.
These results have implications for the design of control strategies. Mitigating or preventing the spread of infectious diseases is the ultimate goal of infectious disease epidemiology, and understanding the dynamics of epidemics is an important tool for achieving this goal. A rich body of research [ , , ] has provided major insights into the processes that drive epidemics and has been instrumental in developing strategies for control and eradication. The structure of contact networks is crucial in explaining epidemiological patterns seen in the spread of directly transmissible diseases such as HIV/AIDS [ , , ], SARS [ , ], and influenza [ , , , ]. For example, the basic reproductive number R0, a quantity central to developing intervention measures and immunization programs, depends crucially on the variance of the distribution of contacts [ , , ], known as the network degree distribution. Contact networks with fat-tailed degree distributions, for example, where a few individuals have an extraordinarily large number of contacts, result in a higher R0 than one would expect from contact networks with a uniform degree distribution, and the existence of highly connected individuals makes them an ideal target for control measures [ , ]. While degree distributions have been studied extensively to understand their effect on epidemic dynamics, the community structure of networks has generally been ignored. Despite the demonstration that social networks show significant community structure [ , , , ], and that social processes such as homophily and transitivity result in highly clustered and modular networks [ ], the effect of such microstructures on epidemic dynamics has only recently begun to be investigated. Most initial work has focused on the effect of small cycles, predominantly in the context of clustering coefficients (i.e. the fraction of closed triplets in a contact network) [ , , , , ].
In this article, we aim to understand how community structure affects the epidemic dynamics and control of infectious disease. Community structure exists when connections between members of a group of nodes are denser than connections between members of different groups of nodes [ ]. The terminology is relatively new in network analysis, and recent algorithm development has greatly expanded our ability to detect sub-structure in networks. While there has been a recent explosion of interest and methodological development, the concept is an old one in the study of social networks, where it is typically referred to as a ''cohesive subgroup'': a group of vertices in a graph that share connections with each other at a higher rate than with vertices outside the group [ ]. Empirical data on social structure suggest that community structuring is extensive in the epidemiological contacts [ , , ] relevant for infectious diseases transmitted by the respiratory or close-contact route (e.g. influenza-like illnesses), and in social groups more generally [ , , , , ]. Similarly, the results of epidemic models of directly transmitted infections such as influenza are most consistent with the existence of such structure [ , , , , , ]. Using both simulated and empirical social networks, we show how community structure affects the spread of diseases in networks, and specifically that these effects cannot be accounted for by the degree distribution alone. The main goal of this study is to demonstrate how community structure affects epidemic dynamics, and which strategies are best applied to control epidemics in networks with community structure. We generate networks computationally with community structure by creating small subnetworks of locally dense communities, which are then randomly connected to one another.
A particular feature of such networks is that the variance of their degree distribution is relatively low, and thus the spread of a disease is only marginally affected by it [ ]. Running standard susceptible-infected-resistant (SIR) epidemic simulations (see Methods) on these networks, we find that the average epidemic size, epidemic duration, and peak prevalence of the epidemic are strongly affected by a change in community structure connectivity that is independent of the overall degree distribution of the full network (Figure ). Note that the range of Q values shown in Figure is in agreement with the range of Q found in the empirical networks used further below, and that lower values of Q do not affect the results qualitatively (see Supplementary Figure S ). Epidemics in populations with community structure show a distinct dynamical pattern depending on the extent of community structure. In networks with strong community structure, an infected individual is more likely to infect members of the same community than members outside of the community. Thus, in a network with strong community structure, local outbreaks may die out before spreading to other communities, or they may spread through the communities in an almost serial fashion, and large epidemics in populations with strong community structure may therefore last for a long time. Correspondingly, the incidence rate can be very low, and the number of generations of infection transmission can be very high, compared to the explosive epidemics in populations with less community structure (Figures A and B). On average, epidemics in networks with strong community structure exhibit greater variance in final size (Figures C and D), a greater number of small, local outbreaks that do not develop into a full epidemic, and a higher variance in the duration of an epidemic.
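The construction described above (small, locally dense communities joined by sparse random links) can be sketched with a planted-partition model. The generator and all parameter values below are illustrative assumptions, not the authors' exact procedure; lowering the between-community edge probability raises the modularity Q of the planted partition.

```python
import networkx as nx
from networkx.algorithms.community import modularity

def community_network(n_comms, comm_size, p_in, p_out, seed=None):
    """Locally dense communities (edge probability p_in within a community)
    joined by sparse random links (edge probability p_out between them)."""
    return nx.planted_partition_graph(n_comms, comm_size, p_in, p_out, seed=seed)

def block_partition(n_comms, comm_size):
    """Nodes are labelled consecutively, community by community."""
    return [set(range(i * comm_size, (i + 1) * comm_size))
            for i in range(n_comms)]

# Fewer between-community links -> stronger community structure -> higher Q.
strong = community_network(10, 20, p_in=0.4, p_out=0.002, seed=1)
weak = community_network(10, 20, p_in=0.4, p_out=0.05, seed=1)
parts = block_partition(10, 20)
q_strong = modularity(strong, parts)
q_weak = modularity(weak, parts)
```

Running the SIR simulations described in the Methods on `strong` versus `weak` networks of this kind is then a matter of sweeping `p_out`.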
In order to halt or mitigate an epidemic, targeted immunization interventions or social distancing interventions aim to change the structure of the network of susceptible individuals in such a way as to make it harder for a pathogen to spread [ ]. In practice, the number of people who can be removed from the susceptible class is often constrained for a number of reasons (e.g., limited vaccine supply or ethical concerns about social distancing measures). From a network perspective, targeted immunization translates into identifying which nodes should be removed from a network, a problem that has attracted considerable attention (see for example [ ] and references therein). Targeting highly connected individuals for immunization has been shown to be an effective strategy for epidemic control [ , ]. However, in networks with strong community structure, this strategy may not be the most effective: some individuals connect to multiple communities (so-called community bridges [ ]) and may thus be more important in spreading the disease than individuals with fewer inter-community connections, but this importance is not necessarily reflected in their degree. Identification of community bridges can be achieved using the betweenness centrality measure [ ], defined as the fraction of shortest paths that a node falls on.

Understanding the spread of infectious diseases in populations is key to controlling them. Computational simulations of epidemics provide a valuable tool for studying the dynamics of epidemics. In such simulations, populations are represented by networks, in which hosts and their interactions with each other are represented by nodes and edges. In the past few years, it has become clear that many human social networks have a remarkable property: they exhibit strong community structure. A network with strong community structure consists of smaller sub-networks (the communities) that have many connections within them, but only few between them. Here we use both data from social networking websites and computer-generated networks to study the effect of community structure on epidemic spread. We find that community structure not only affects the dynamics of epidemics in networks, but also has implications for how networks can be protected from large-scale epidemics.

While degree and betweenness centrality are often strongly positively correlated, the correlation between degree and betweenness centrality becomes weaker as community structure becomes stronger (Figure ). Thus, in networks with community structure, focusing on degree alone carries the risk of missing some of the community bridges that are not highly connected. Indeed, at low vaccination coverage, an immunization strategy based on betweenness centrality results in fewer infected cases than an immunization strategy based on degree as the magnitude of community structure increases (Figure A). This observation is critical because the potential vaccination coverage for an emerging infection will typically be very low. A third measure, random walk centrality, identifies target nodes by counting how often a node is traversed by a random walk between two other nodes [ ]. The random walk centrality measure considers not only the shortest paths between pairs of nodes, but all paths between pairs of nodes, while still giving shorter paths more weight. While infections are most likely to spread along the shortest paths between any two nodes, the cumulative contribution of other paths can still be important [ ]: immunization strategies based on random walk centrality result in the lowest number of infected cases at low vaccination coverage (Figures B and C).
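The three target-selection rules just described (degree, betweenness centrality, and random walk centrality) can be compared directly. A minimal sketch: random-walk betweenness is available in networkx as current-flow betweenness centrality, and the karate-club graph is an arbitrary stand-in for a real contact network.

```python
import networkx as nx

def immunization_targets(G, n, method):
    """Return the n nodes ranked highest by the chosen centrality measure."""
    if method == "degree":
        scores = dict(G.degree())
    elif method == "betweenness":
        # Fraction of shortest paths a node falls on.
        scores = nx.betweenness_centrality(G)
    elif method == "random_walk":
        # Random-walk betweenness == current-flow betweenness: all paths
        # between node pairs count, with shorter paths weighing more.
        scores = nx.current_flow_betweenness_centrality(G)
    else:
        raise ValueError(f"unknown method: {method}")
    return sorted(scores, key=scores.get, reverse=True)[:n]

G = nx.karate_club_graph()
top_betweenness = immunization_targets(G, 5, "betweenness")
top_degree = immunization_targets(G, 5, "degree")
```

In a network with strong community structure the two rankings diverge; immunizing the top-ranked nodes before seeding an outbreak implements the strategies compared in the figure.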
To test the efficiency of targeted immunization strategies on real networks, we used interaction data of individuals at five different universities in the US, taken from a social networking website [ ], and obtained the contact network relevant for directly transmissible diseases (see Methods). We find again that the overall most successful targeted immunization strategy is the one that identifies targets based on random walk centrality. Limited immunization based on random walk centrality significantly outperforms immunization based on degree, especially when vaccination coverage is low (Figure A). In practice, identifying immunization targets using such algorithms may be impossible, because the structure of the contact network relevant for the spread of a directly transmissible disease is generally not known. Thus, algorithms that are agnostic about the full network structure are necessary to identify target individuals. The only algorithm we are aware of that is completely agnostic about the network structure identifies target nodes by picking a random contact of a randomly chosen individual [ ]; once such an acquaintance has been picked n times, it is immunized. The acquaintance method has been shown to identify some of the highly connected individuals, and thus approximates an immunization strategy that targets highly connected individuals. We propose an alternative algorithm (the so-called community bridge finder (CBF) algorithm, described in detail in the Methods) that aims to identify community bridges connecting two groups of clustered nodes. Briefly, starting from a random node, the algorithm follows a random path on the contact network until it arrives at a node that does not connect back to more than one of the previously visited nodes on the random walk.
The basic goal of the CBF algorithm is to find nodes that connect to multiple communities. It does so based on the notion that the first node that does not connect back to previously visited nodes of the current random walk is likely to be part of a different community. On all empirical and computationally generated networks tested, this algorithm performed mostly better, often equally well, and rarely worse than the alternative algorithm. It is important to note a crucial difference between algorithms such as CBF (henceforth called stochastic algorithms) and algorithms such as those that calculate, for example, the betweenness centrality of nodes (henceforth called deterministic algorithms). A deterministic algorithm always needs complete information about each node (i.e. either the number or the identity of all connected nodes, for each node in the network). A comparison between algorithms is therefore of limited use if they are not of the same type, as they have to work with different inputs. Clearly, a deterministic algorithm with information on the full network structure as input should outperform a stochastic algorithm that is agnostic about the full network structure. Thus, we restrict our comparison of CBF to the acquaintance method, since this is the only stochastic algorithm we are aware of that takes as input the same limited amount of local information. In the computationally generated networks, CBF outperformed the acquaintance method in large areas of the parameter space (Figure D). It may seem unintuitive at first that the acquaintance method outperforms CBF at very high values of modularity, but one should keep in mind that epidemic sizes are very small in those extremely modular networks (see Figure A) because local outbreaks only rarely jump across community borders.
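The two stochastic algorithms can be sketched as follows. This is a minimal sketch, assuming that the walk's current node is immunized as soon as it links back to at most one of the nodes already visited on that walk; the original CBF's backtracking and tie-breaking details may differ, and the walk-length cap `max_steps` is an added safeguard, not part of the published description.

```python
import random
import networkx as nx

def acquaintance_targets(G, n_targets, n=1, seed=None):
    """Acquaintance method: pick a random node, then one of its neighbours
    at random; a neighbour nominated n times is immunized. Tends to find
    high-degree nodes without knowledge of the full network."""
    rng = random.Random(seed)
    counts, targets, nodes = {}, set(), list(G)
    while len(targets) < n_targets:
        v = rng.choice(nodes)
        if not G[v]:
            continue
        u = rng.choice(list(G[v]))
        counts[u] = counts.get(u, 0) + 1
        if counts[u] >= n:
            targets.add(u)
    return targets

def cbf_targets(G, n_targets, max_steps=1000, seed=None):
    """Community bridge finder sketch: follow a random walk and immunize
    the first node that connects back to at most one previously visited
    node of the walk (it likely sits in another community)."""
    rng = random.Random(seed)
    targets, nodes = set(), list(G)
    while len(targets) < n_targets:
        v = rng.choice(nodes)
        visited = [v]
        for _ in range(max_steps):
            candidates = [u for u in G[v] if u not in targets]
            if not candidates:
                break  # dead end; restart from a fresh random node
            v = rng.choice(candidates)
            back_links = sum(1 for u in G[v] if u in visited)
            if len(visited) > 1 and back_links <= 1:
                targets.add(v)
                break
            visited.append(v)
    return targets

G = nx.karate_club_graph()
acq = acquaintance_targets(G, 5, n=2, seed=4)
cbf = cbf_targets(G, 5, seed=4)
```

Both functions use only local information (a node's neighbour list), which is what makes them applicable when the full contact network is unknown.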
If outbreaks are mostly restricted to single communities, then CBF is not the optimal strategy, because immunizing community bridges is useless; the acquaintance method may at least find some well connected nodes in each community and will thus perform slightly better in this extreme region of the parameter space. In empirical networks, CBF did particularly well on the network with the strongest community structure (Oklahoma), especially in comparison to the similarly effective acquaintance method with n = (Figure C).

Figure . Assessing the efficacy of targeted immunization strategies based on deterministic and stochastic algorithms in the computationally generated networks. The color code denotes the difference in the average final size s_m of disease outbreaks in networks that were immunized before the outbreak using method m. The top panel (A) shows the difference between the degree method and the betweenness centrality method, i.e. s_degree − s_betweenness. A positive difference (colored red to light gray) indicates that the betweenness centrality method resulted in smaller final sizes than the degree method; a negative difference (colored blue to black) indicates that the betweenness centrality method resulted in bigger final sizes than the degree method. If the difference is not bigger than % of the total population size, no color is shown (white). Panel (A) shows that the betweenness centrality method is more effective than the degree-based method in networks with strong community structure (high Q). Panels (B) and (C) are like (A), but show s_degree − s_randomwalk (in (B)) and s_betweenness − s_randomwalk (in (C)); they show that the random walk method is the most effective method overall. Panel (D) shows that the community bridge finder (CBF) method generally outperforms the acquaintance method (with n = ) except when community structure is very strong (see main text). Final epidemic sizes were obtained by running SIR simulations per network, vaccination coverage, and immunization method.

As immunization strategies should be deployed as fast as possible, the speed at which a certain fraction of the network can be immunized is an additional important aspect. We measured the speed of the algorithms as the number of nodes that an algorithm had to visit in order to achieve a certain vaccination coverage, and find that the CBF algorithm is faster than the similarly effective acquaintance method with n = at low vaccination coverages (see Figure ). A great number of infectious diseases of humans spread directly from one person to another, and early work on the spread of such diseases was based on the assumption that every infected individual is equally likely to transmit the disease to any susceptible individual in a population. One of the most important consequences of incorporating network structure into epidemic models was the demonstration that heterogeneity in the number of contacts (degree) can strongly affect how R0 is calculated [ , , ]. Thus, the same disease can exhibit markedly different epidemic patterns simply due to differences in the degree distribution. Our results extend this finding and show that even in networks with the same degree distribution, fundamentally different epidemic dynamics are expected due to different levels of community structure. This finding is important for various reasons: first, community structure has been shown to be a crucial feature of social networks [ , , , ], and its effect on disease spread is thus relevant to infectious disease dynamics. Furthermore, it corroborates earlier suggestions that community structure affects the spread of disease, and is the first to clearly isolate this effect from effects due to variance in the degree distribution [ ].
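The effect of degree heterogeneity on R0 can be made concrete with the static-network formula R0 = T(⟨k²⟩/⟨k⟩ − 1) given in the Methods. The two degree sequences below are toy examples of my own (equal mean, different variance), not data from the paper.

```python
import numpy as np

def r0_static(degrees, T):
    """R0 = T(<k^2>/<k> - 1): for a fixed mean degree, higher degree
    variance (a fatter tail) yields a higher basic reproductive number."""
    k = np.asarray(degrees, dtype=float)
    return T * (np.mean(k**2) / np.mean(k) - 1.0)

uniform = [10] * 100             # <k> = 10, zero variance
bimodal = [5] * 50 + [15] * 50   # <k> = 10, higher variance
r0_uniform = r0_static(uniform, T=0.1)   # 0.1 * (100/10 - 1)  = 0.9
r0_bimodal = r0_static(bimodal, T=0.1)   # 0.1 * (125/10 - 1) = 1.15
```

Two populations with the same average number of contacts thus sit on opposite sides of the epidemic threshold R0 = 1 purely because of contact-number variance; community structure adds a further layer that this formula does not capture.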
Second, and consequently, data on the degree distribution of contact networks will not be sufficient to predict epidemic dynamics. Third, the design of control strategies benefits from taking community structure into account. An important caveat is that community structure in the sense used throughout this paper (i.e. measured as modularity Q) does not explicitly take into account the extent to which communities overlap. Such overlap is likely to play an important role in infectious disease dynamics, because people are members of multiple, potentially overlapping communities (households, schools, workplaces, etc.). A strong overlap would likely be reflected in lower overall values of Q; however, the exact effect of community overlap on infectious disease dynamics remains to be investigated. Identifying important nodes to affect diffusion on networks is a key question in network theory that pertains to a wide range of fields and is not limited to infectious disease dynamics. There are, however, two major issues associated with this problem: (i) the structure of networks is often not known, and (ii) many networks are too large to compute, for example, centrality measures efficiently. Stochastic algorithms like the proposed CBF algorithm or the acquaintance method address both problems at once. To what extent targeted immunization strategies can be implemented in an infectious disease/public health setting, given practical and ethical considerations, remains an open question. This is true not only for the strategy based on the CBF algorithm, but for most strategies that are based on network properties. As mentioned above, the contact networks relevant for the spread of infectious diseases are generally not known. Stochastic algorithms such as CBF or the acquaintance method are at least in principle applicable when data on network structure are lacking.
Community structure in host networks is not limited to human networks: animal populations are often divided into subpopulations, connected only by limited migration [ , ]. Targeted immunization of individuals connecting subpopulations has been shown to be an effective low-coverage immunization strategy for the conservation of endangered species [ ]. Under the assumption of homogeneous mixing, the elimination of a disease requires an immunization coverage of at least 1 − 1/R0 [ ], but such coverage is often difficult or even impossible to achieve due to limited vaccine supply, logistical challenges, or ethical concerns. In the case of wildlife, high vaccination coverage is also problematic, as vaccination interventions can be associated with substantial risks. Little is known about the contact network structure of humans, let alone of wildlife, and progress should therefore be made on the development of immunization strategies that can deal with the absence of such data. Stochastic algorithms such as the acquaintance method and the CBF method are first important steps in addressing the problem, but the large difference in efficacy between stochastic and deterministic algorithms demonstrates that there is still a long way to go. To investigate the spread of an infectious disease on a contact network, we use the following methodology: individuals in a population are represented as nodes in a network, and the edges between the nodes represent the contacts along which an infection can spread. Contact networks are abstracted as undirected, unweighted graphs (i.e. all contacts are reciprocal, and all contacts transmit an infection with the same probability). Edges always link two distinct nodes (i.e. no self-loops), and there is at most one edge between any pair of nodes (i.e. no parallel edges). Each node can be in one of three possible states: (S)usceptible, (I)nfected, or (R)esistant/immune, as in standard SIR models.
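The simulation procedure of the Methods, with the per-day infection probability 1 − exp(−βI) and recovery probability γ described below, can be sketched as follows. The example graph and parameter values are placeholders, and the optional up-front immunization set stands in for the targeted strategies discussed above.

```python
import math
import random
import networkx as nx

def run_sir(G, beta, gamma, immune=(), seed=None):
    """Discrete-time SIR on a contact network (one time step = one day).
    A susceptible node with I infected neighbours becomes infected with
    probability 1 - exp(-beta * I); infected nodes recover with
    probability gamma per day.  Returns the final epidemic size."""
    rng = random.Random(seed)
    immune = set(immune)
    state = {v: ("R" if v in immune else "S") for v in G}
    susceptibles = [v for v in G if state[v] == "S"]
    state[rng.choice(susceptibles)] = "I"  # patient zero
    while any(s == "I" for s in state.values()):
        nxt = dict(state)
        for v in G:
            if state[v] == "S":
                i = sum(1 for u in G[v] if state[u] == "I")
                if i and rng.random() < 1.0 - math.exp(-beta * i):
                    nxt[v] = "I"
            elif state[v] == "I" and rng.random() < gamma:
                nxt[v] = "R"
        state = nxt
    # Final size: nodes that went through infection (excludes the immunized).
    return sum(1 for v in G if state[v] == "R" and v not in immune)

G = nx.karate_club_graph()
size = run_sir(G, beta=0.3, gamma=0.2, seed=11)
```

Averaging `size` over many seeds, with `immune` chosen by different targeting algorithms, reproduces the kind of comparison reported in the figures.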
Initially, all nodes are susceptible. Simulations with immunization strategies implement those strategies before the first infection occurs: targeted nodes are chosen according to a given immunization algorithm (see below) until the desired immunization coverage of the population is achieved, and their state is then set to resistant. After this initial set-up, a random susceptible node is chosen as patient zero, and its state is set to infected. Then, during a number of time steps, the initial infection can spread through the network, and the simulation is halted once there are no further infected nodes. At each time step (the unit of time we use is one day, i.e. a time step is one day), a susceptible node can become infected with probability 1 − exp(−βI), where β is the transmission rate from an infected to a susceptible node and I is the number of infected neighboring nodes. At each time step, infected nodes recover at rate γ, i.e. the probability of recovery of an infected node per time step is γ (unless noted otherwise, we use γ = . ). If recovery occurs, the state of the recovered node is toggled from infected to resistant. Unless mentioned otherwise, the transmission rate β is chosen such that R0 ≈ (β/γ)·d, where d is the mean network degree, i.e. the average number of contacts per node. For the networks used here, this approximation is in line with the result from static network theory [ ], R0 = T(⟨k²⟩/⟨k⟩ − 1), where ⟨k⟩ and ⟨k²⟩ are the mean degree and mean square degree, respectively, and where T is the average probability of disease transmission from a node to a neighboring node.

Figure . Assessing the efficacy of targeted immunization strategies in empirical networks based on deterministic and stochastic algorithms. The bars show the difference in the average final size s_m of disease outbreaks (number of cases) in networks that were immunized before the outbreak using method m. The left panels show the difference between the degree method and the random walk centrality method, i.e. s_degree − s_randomwalk. If the difference is positive (red bars), the random walk centrality method resulted in smaller final sizes than the degree method; a negative value (black bars) means that the opposite is true. Shaded bars show non-significant differences (assessed at the % level using the Mann–Whitney test). The middle and right panels are generated using the same methodology, but measure the difference between the acquaintance method (with n = in the middle column and n = in the right column, see Methods) and the community bridge finder (CBF) method, i.e. s_acquaintance − s_CBF. Again, positive red bars mean that the CBF method results in smaller final sizes, i.e. prevents more cases, than the acquaintance method, whereas negative black bars mean the opposite. Final epidemic sizes were obtained by running SIR simulations per network, vaccination coverage, and immunization method.

As a consequence, growth occurs if . In the second case, the inequality i_{t+1}^{u} > i_t^{u} gives: in this example, for the sake of simplicity, we suppose that the epidemic spreads over N = cities, v , …, v , forming a complete graph K . We consider the following initial configuration: only one node has an infected population at time t = . The parameters used are p = . , r = . , and η_{ui} = . for ≤ i ≤ . Moreover, we suppose that the population of each node is the same, P_{ui} = for ≤ i ≤ , and that the transport capacity between any two nodes is the same, w_{ui,uj} = for ≤ i, j ≤ . Note that this example deals with a homogeneous, symmetric case. Figure shows the evolution of the total number of infected and susceptible individuals. If we set p = . instead of p = .
, the number of infected and susceptible individuals also remains constant with time, but in this case the number of susceptible individuals is greater than the number of infected individuals. In this work, a new mathematical model to simulate the spreading of an epidemic has been introduced. It is based on the use of cellular automata on graphs endowed with a suitable local transition function. The state of each cell is taken to be the portion of its population which is infected at each time step. The analysis of the model proposed in this paper appears to agree with the results obtained from other mathematical models not based on discrete event systems, such as ODEs or PDEs. Future work will aim at designing a more complete CA-based epidemic model involving additional effects such as population movement, virus mutation, etc. Furthermore, it would be interesting to consider non-constant connection factors, and the effect of migration between the cells must also be considered.

Fig. . Evolution of the total number of infected and susceptible individuals.

References:
- On some applications of cellular automata
- A simple cellular automaton model for influenza A viral infections
- Critical behaviour of a probabilistic automata network SIS model for the spread of an infectious disease in a population of moving individuals
- Cellular automata and epidemiological models with spatial dependence
- A model based on cellular automata to simulate epidemic diseases
- Contributions to the mathematical theory of epidemics, part I
- A cellular automata model for citrus variegated chlorosis
- The dependence of epidemic and population velocities on basic parameters
- The prevention of malaria
- Extending the SIR epidemic model
- A cellular automaton model for the effects of population movement and vaccination on epidemic propagation
- Cellular automata machines: a new environment for modeling
- A new kind of science

Acknowledgments.
This work has been partially supported by Consejería de Sanidad, Junta de Castilla y León (Spain).

Wang, Cheng; Zhang, Qing; Gan, Jianping. Study on efficient complex network model. Proceedings of the 2nd International Conference on Green Communications and Networks (GCN).

This paper systematically summarizes the research on complex networks in terms of statistical properties, structural models, and dynamical behavior. Moreover, it emphatically introduces the application of complex networks in economic systems. Transportation networks and the like are of the same kind [ ]. Emphasis on the structure of a system, and analysis of the system through its structure, is the research approach of complex networks. The difference is that the topological structure of the abstracted real networks differs from the networks discussed before and contains numerous nodes; as a result, we call them complex networks [ ]. In recent years, a large number of articles have been published in world-leading publications such as Science, Nature, PRL, and PNAS, which indirectly reflects that complex networks have become a new research hotspot. Research on complex networks can be summarized in three closely related aspects: measuring the statistical properties of real networks; understanding why networks have the statistical properties they have, by building corresponding network models; and forecasting the behavior of the networked system based on the structure and the formation rules of the network. The description of the world from the viewpoint of networks started when the German mathematician Euler solved the problem of the seven bridges of Königsberg.
What distinguishes complex network research is that one must first view the massive numbers of nodes, and the properties they have, from a statistical point of view. Different properties imply different internal structures of the network; moreover, different internal structures bring about differences in systemic function. Therefore, the first step in research on complex networks is the description and understanding of statistical properties, sketched as follows. In network research, we generally define the distance between two nodes as the number of edges on the shortest path connecting them; the diameter of the network as the maximum distance between any two nodes; and the average path length of the network as the average of the distances over all pairs of nodes, which represents the degree of separation of the nodes in the network, namely the size of the network. An important discovery in complex network research is that the average path length of most large-scale real networks is much smaller than one might imagine, which we call the ''small-world effect''. This viewpoint comes from Milgram's famous small-world experiment. The experiment required participants to forward a letter through acquaintances, making sure the letter would eventually reach its designated recipient, in order to figure out the distribution of path lengths in the network; the result showed that the average number of intermediaries was just six. The experiment is also the origin of the popular theory of ''six degrees of separation''. The extent to which nodes in the network aggregate is represented by the convergence factor C, that is, how close-knit the network is. For example, in social networks, your friend's friend may be your friend, or two of your friends may themselves be friends.
The computation is as follows: suppose node i is connected to k_i other nodes. If these k_i nodes were all connected to each other, there would be k_i(k_i − 1)/2 edges among them; if they in fact share e_i edges, then the ratio of e_i to k_i(k_i − 1)/2 is the clustering coefficient of node i. The clustering coefficient of the network is the average of the clustering coefficients of all its nodes. Obviously, only in a fully connected network does the clustering coefficient equal 1; in most other networks it is less than 1. It turns out, however, that nodes in most large-scale real-world networks tend to flock together: although the clustering coefficient C is far less than 1, it is far greater than that of a comparable random graph, which is of order n^(−1). The degree k_i of node i is, in graph-theoretic terms, the total number of edges attached to node i; the average of k_i over all nodes is called the average degree of the network, written ⟨k⟩. The degrees of the nodes are described by the distribution function P(k), the probability that a randomly chosen node has exactly k edges, which also equals the number of nodes of degree k divided by the total number of nodes. The statistical properties described above are the foundation of complex network research; with further work one generally discovers that real-world networks have other important statistical properties as well, such as the relationships among network resilience, betweenness, degree, and clustering. The simplest network model is the regular network, characterized by every node having the same number of neighbors; examples include the one-dimensional chain, the two-dimensional lattice, the complete graph, and so on.
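The local clustering coefficient e_i / [k_i(k_i − 1)/2] described above can be sketched in a few lines of pure Python (my illustration, not the authors' code):

```python
from itertools import combinations

def clustering(adj, i):
    """Local clustering coefficient of node i: the fraction of the
    k_i(k_i - 1)/2 possible links among i's k_i neighbours that exist."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    e = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return e / (k * (k - 1) / 2)

def average_clustering(adj):
    """Network clustering coefficient: the average over all nodes."""
    return sum(clustering(adj, i) for i in adj) / len(adj)

# Toy graph: a triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering(adj, 0))       # 1 closed pair out of 3 -> 1/3
print(average_clustering(adj))  # (1/3 + 1 + 1 + 0) / 4 = 7/12
```

Note the conventional choice of defining the coefficient as 0 for nodes of degree below 2, so a fully connected graph still averages to 1.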
Paul Erdős and Alfréd Rényi introduced the completely random network model in the late 1950s: in a graph of n nodes, any two nodes are connected with probability p. Its average degree is ⟨k⟩ = p(n − 1) ≈ pn, its average path length scales as L ~ ln n / ln⟨k⟩, and its clustering coefficient is C = p; when n is very large, the degree distribution is approximately a Poisson distribution. The random network model was a significant achievement in network research, but it can hardly describe the actual properties of the real world, so many new models have been proposed. As experiments show, most real-world networks are both small-world (short average path length) and clustered (large clustering coefficient). The regular network is clustered, but its average shortest path length is large; the random graph has the opposite properties, being small-world but weakly clustered. So neither regular networks nor random networks reflect the properties of the real world, which shows that the real world is neither fully regular nor completely random. In 1998, Watts and Strogatz found a network that is both small-world and highly clustered, a great breakthrough in complex network research. Starting from a regular ring lattice, they rewired each edge with probability p to a randomly chosen new endpoint, thereby building a network interpolating between the regular network and the random network (the WS network for short); it has a short average path length and a large clustering coefficient, while the regular network and the random network are recovered as the special cases p = 0 and p = 1 of the WS model. After the WS model was put forward, many scholars modified it further; the NW small-world model of Newman and Watts is the most widely used. The difference between the NW model and the WS model is that the NW model adds shortcut edges between randomly chosen pairs of nodes instead of cutting off original edges of the regular network.
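The Erdős–Rényi predictions ⟨k⟩ = p(n − 1) and C = p can be checked numerically. Below is a hedged sketch (my code; names and the tolerances are illustrative) that generates a G(n, p) graph and compares the measured average degree and clustering against the theory.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, rng):
    """G(n, p): connect every pair of nodes independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def avg_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def avg_clustering(adj):
    total = 0.0
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        e = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
        total += e / (k * (k - 1) / 2)
    return total / len(adj)

rng = random.Random(42)
n, p = 1000, 0.02
g = erdos_renyi(n, p, rng)
# Theory: <k> = p(n - 1) ~ 20 and C = p = 0.02, up to sampling noise.
print(avg_degree(g), avg_clustering(g))
```

Both quantities concentrate tightly around their expected values at this size, which is exactly the "no characteristic clustering beyond chance" behavior the text contrasts with real networks.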
The advantage of the NW model is that it simplifies theoretical analysis, since the WS model may produce isolated nodes, which the NW model cannot. In fact, when p is small and n is large, theoretical analysis of the two models gives the same results, and they are now jointly referred to as the small-world model. Although the small-world model describes the small-world and high-clustering features of the real world well, theoretical analysis reveals that its degree distribution still has an exponential form. Empirical results show that most large-scale real-world networks are more accurately described by a power law, namely P(k) ~ k^(−γ). Compared with an exponential distribution, a power law has no peak: most nodes have few connections while a few nodes have very many, and there is no characteristic scale as in the random network, so Barabási and co-workers called networks with such power-law degree distributions scale-free networks. To explain the emergence of scale-free networks, Barabási and Albert proposed the famous BA model. They argued that earlier models ignored two important properties of the real world: growth, meaning that new nodes continually join the network, and preferential attachment, meaning that newly arriving nodes prefer to connect to nodes of large degree. They not only analyzed the generating algorithm of the BA model by simulation, but also gave an analytic solution using the mean-field method of statistical physics. The result is that after sufficient evolution time, the degree distribution of the BA network no longer changes with time: it is a power law with a stable exponent. The BA model was another great breakthrough in complex network research, demonstrating a deeper understanding of the objective network world.
Since then, many scholars have improved the model in many ways, including nonlinear preferential attachment, accelerated growth, local rewiring events, aging, fitness competition, and so on. Note that most, but not all, real-world networks are scale-free; the degree distributions of some real-world networks are truncated power laws. Scholars have also proposed other network models, such as local-world evolving models, weighted evolving network models, and deterministic network models, to describe real-world network structure beyond the small-world and scale-free models. The study of network structure is important, but the ultimate purpose is to understand and explain the operation of the system on the basis of these networks, and then to forecast and control the behavior of the networked system. Such network-based dynamical properties are generally called dynamical behavior, and they cover many topics, including flows on networks, synchronization, phase transitions, web search, and network navigation. The research above is strongly theoretical; a kind of network-behavior research with strong applications has increasingly aroused interest, for example the spread of computer viruses on computer networks, the spread of communicable diseases among crowds, and the spread of rumors in society, all of which are propagation behaviors obeying certain rules and spreading on certain networks. Traditional network propagation models were always built on regular networks; with the further development of complex network research we have to revisit these issues. Here we emphasize the applied research. One of the foremost purposes of studying network propagation behavior is to understand the transmission mechanism of disease.
Represent each infectable unit by a node; if one unit can infect another, or be infected by it, through some channel, then we regard the two units as connected. In this way we obtain the topological structure of the propagation network, on which a propagation model can then be built to study spreading behavior. Obviously, the keys to network propagation modeling are the formulation of the propagation rule and the choice of the network topology. However, simply treating the disease contact network as a regular, uniformly connected network does not conform to reality. Moore studied disease propagation in small-world networks and discovered that the epidemic threshold in a small-world network is much smaller than in a regular network; for the same transmissibility, after the same time, the extent of spread in the small-world network is significantly greater than in the regular network. That is to say, compared with the regular network, disease spreads more easily in the small-world network. Pastor-Satorras and others studied propagation in scale-free networks, and the result was amazing: whereas there is always a positive epidemic threshold in regular and small-world networks, the threshold in the scale-free network proves to vanish; similar results are obtained for other scale-free topologies. Since many experiments show that real-world networks are both small-world and scale-free, this conclusion is quite discouraging. Fortunately, most biological and computer viruses have low infectivity and do little harm.
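The vanishing threshold on scale-free networks can be illustrated with the standard heterogeneous mean-field estimate λ_c = ⟨k⟩/⟨k²⟩ (a common textbook form for SIS-type dynamics; the formula and degree sequences below are my illustration, not given in the text):

```python
def mean_field_threshold(degrees):
    """Heterogeneous mean-field estimate of the epidemic threshold,
    lambda_c = <k> / <k^2>: hubs inflate <k^2> and push the threshold down."""
    n = len(degrees)
    k1 = sum(degrees) / n
    k2 = sum(d * d for d in degrees) / n
    return k1 / k2

regular = [6] * 1000                                   # homogeneous: degree 6 everywhere
heavy = [max(1, 1000 // r) for r in range(1, 1001)]    # crude heavy-tailed sequence
t_reg = mean_field_threshold(regular)   # = 6/36 = 1/6
t_sf = mean_field_threshold(heavy)      # orders of magnitude smaller
print(t_reg, t_sf)
```

On the homogeneous sequence the threshold is finite (1/6), while on the heavy-tailed sequence the single degree-1000 hub drives ⟨k²⟩ up and the estimated threshold toward zero, which is the scale-free effect described above.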
However, once the intensity of a disease or virus reaches a certain degree, we must pay sufficient attention to it, and the measures to control it cannot rely entirely on improved medical conditions: we have to quarantine nodes and shut down the relevant connections in order to cut off the avenues of infection, and in doing so we change the topological structure of the propagation network. In fact, it was precisely in this way that the fight against SARS was won in our country in the summer of 2003. Studying the transmission mechanism of a disease is not the whole question; the ultimate goal is to master how to control disease propagation efficiently. In practical applications, however, it is hard to determine the degree of a node, namely the number of units it could infect during the infectious period. For example, in research on the spread of sexually transmitted diseases, researchers obtain information about high-risk individuals and groups only through questionnaires and interviews, and the replies have little reliability. For that reason, quite a few immunization strategies have been put forward by scholars on the basis of such considerations, such as ''acquaintance immunization'', ''natural exposure'', and ''vaccination''. Analyzing disease spread is not the only purpose of research on network propagation behavior; many other things can be analyzed with it. For example, it can be applied to propagation in social networks, with the following basic idea: first abstract the topological structure of the social network using complex network theory, then analyze the transmission mechanism according to certain propagation rules, and finally analyze how the propagation can be influenced.
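The immunization strategies mentioned above can be contrasted in a small sketch. All function names and the toy network are mine; "acquaintance" here stands for the idea of immunizing a random neighbor of a random node, which reaches hubs without global knowledge, while degree-targeting assumes the whole network is known.

```python
import random

def edges_removed(adj, vaccinated):
    """Contact links neutralized by immunizing the given nodes."""
    vac = set(vaccinated)
    return sum(1 for i in adj for j in adj[i]
               if i < j and (i in vac or j in vac))

def random_strategy(adj, budget, rng):
    """Baseline: immunize uniformly random nodes."""
    return rng.sample(sorted(adj), budget)

def acquaintance_strategy(adj, budget, rng):
    """Ask random nodes to name a neighbour and immunize that neighbour;
    neighbours of random nodes are biased toward high degree."""
    chosen = set()
    nodes = sorted(adj)
    while len(chosen) < budget:
        i = rng.choice(nodes)
        if adj[i]:
            chosen.add(rng.choice(sorted(adj[i])))
    return sorted(chosen)

def targeted_strategy(adj, budget):
    """Immunize the highest-degree nodes (requires knowing all degrees)."""
    return sorted(adj, key=lambda i: len(adj[i]), reverse=True)[:budget]

# Toy hub-and-spoke network: node 0 linked to everyone, plus a chain of spokes.
n = 60
adj = {i: set() for i in range(n)}
for i in range(1, n):
    adj[0].add(i); adj[i].add(0)
for i in range(1, n - 1):
    adj[i].add(i + 1); adj[i + 1].add(i)

rng = random.Random(1)
# With a single dose, degree-targeting always hits the hub (59 links cut):
print(edges_removed(adj, targeted_strategy(adj, 1)))
```

On such hub-dominated topologies, one targeted dose neutralizes most transmission routes, while a random dose usually lands on a low-degree spoke, which is the motivation behind the strategies cited in the text.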
Work of this kind has in fact already started, for example on the spread of knowledge, the spread of new products through networks, and financial risk in banking; these problems are related but differ in purpose: in the former cases the aim of the research is to promote the spread, in the latter to prevent it. References: Systems Science (Shanghai Scientific and Technological Educational Publishing House; Pearson Education); Statistical mechanics of complex networks; The structure and function of complex networks. key: cord- -fiukemh authors: holme, petter title: three faces of node importance in network epidemiology: exact results for small graphs date: - - journal: phys rev e doi: . /physreve. . sha: doc_id: cord_uid: fiukemh We investigate three aspects of the importance of nodes with respect to susceptible-infectious-removed (SIR) disease dynamics: influence maximization (the expected outbreak size given a set of seed nodes), the effect of vaccination (how much deleting nodes would reduce the expected outbreak size), and sentinel surveillance (how early an outbreak could be detected with sensors at a set of nodes). We calculate exact expressions for these quantities, as functions of the SIR parameters, for all connected graphs of three to seven nodes. We obtain the smallest graphs where the optimal node sets do not overlap. We find that (i) node separation is more important than centrality when there is more than one active node, (ii) vaccination and influence maximization are the most dissimilar aspects of importance, and (iii) the three aspects are more similar when the infection rate is low. One of the central questions in theoretical epidemiology [ ] [ ] [ ] is how to identify individuals that are important for an infection to spread [ , ]. What ''important'' means depends on the particular scenario: what kind of disease spreads and what can be done about it. In the literature, three major aspects of importance have been discussed.
First, influence maximization aims to identify the nodes that, if they are the sources of the outbreak, would maximize the expected outbreak size (the number of nodes infected at least once) [ , ]. Second, vaccination aims to find the nodes that, if vaccinated (in practice, deleted from the network), would reduce the expected outbreak size the most [ ]. Third, sentinel surveillance aims to find the nodes that are likely to get infected early [ , ]. These three notions of importance do not necessarily give the same answer as to which node is most important. In this work, we investigate how the rankings of important nodes under these three aspects differ, and why (see fig. ). In this paper, we evaluate the three aspects of importance with respect to the susceptible-infectious-removed (SIR) disease-spreading model [ ] [ ] [ ] on small connected graphs (all connected graphs of three to seven nodes). The main reason we restrict ourselves to small graphs is that this allows us to use symbolic algebra, and thus exact calculations [ ]. In this way we can discover, e.g., the smallest graph where the three aspects of importance disagree about which node is most important; cf. ref. [ ]. We argue that graphs of seven nodes are still large enough to illustrate the effects of distance. Nevertheless, large networks are important to study, and a possible future extension of this work is to address the relationship between the three importance measures for larger networks. The related ref. [ ] studies the difference between influence maximization and vaccination problems on (some rather large) empirical networks. The authors compare the top results of heuristic algorithms for identifying influential single nodes, whereas in this paper we consider the influence of all nodes, and also of all sets of two and three nodes. (The terminology of ref.
[ ] is a bit different from ours: they call important nodes for vaccination ''blockers'' and important nodes for influence maximization ''spreaders''.) We proceed by discussing our setup in greater detail: our implementation of the SIR model, how we analyze the three aspects of importance, the network centrality measures needed for our analysis, and our results, including the smallest networks where different nodes come out as most important. In this section, we provide the background to our analysis. The basis of our analysis is graphs G(V,E) consisting of N nodes V and M links E. As mentioned earlier, there are three ways to think of importance in theoretical infectious disease epidemiology. Influence maximization was first studied in computer science with viral marketing in mind [ , ]. As was mentioned, a node is important for influence maximization if it is a seed of an infection that could cause a large outbreak. For epidemiological applications, this might be interesting if one could immunize people against a disease before an outbreak happens. We simply measure the expected outbreak size (the expected number of nodes to catch the disease) with S as the set of source nodes, and rank node sets accordingly. For vaccination, we use the average outbreak size from one random seed node to estimate the importance of a node [ , , , ]; one could also rephrase it as a cost problem [ ]. We assume the vaccinees are deleted from the network before the outbreak starts; the node whose removal gives the smallest expected outbreak size is the one that is most important for the vaccination problem. Sentinel surveillance assumes a response after the outbreak has already started (in contrast to influence maximization and vaccination, where the action affecting the nodes in question is assumed to take place before the outbreak happens).
A node is important for sentinel surveillance if it gets infected early, so that the health authorities can activate their countermeasures. This is usually quantified by the lead time: the expected difference between the time a sentinel node gets infected, or the outbreak dies out, and the infection time of any node in the graph [ ]. We will instead measure the average discovery time τ(i), from the beginning of the infection until node i gets infected or the outbreak dies out [ ]. [Figure caption:] Illustration of the three notions of importance explored in this work. Panel (a) shows an example of an SIR outbreak in a seven-node network. Panels (b)-(d) show how this outbreak relates to influence maximization (b), vaccination (c), and sentinel surveillance (d), respectively. The idea of influence maximization (b) is that a node is important if the outbreak originating at it is expected to be large. The idea of vaccination (c) is that a node is important if removing it would significantly reduce the average outbreak size. The idea of sentinel surveillance (d) is that a node is important if a sensor on it would detect the outbreak early. The shades of the nodes in (c) and (d) are proportional to their contributions. In a stochastic simulation, one would average these values over many runs and, for (c) and (d), over many seeds of the outbreak; in this work, however, rather than running simulations, we calculate the exact expectation values of these quantities. The node with the smallest discovery time is then considered most important for sentinel surveillance. If the purpose of the surveillance is just to discover the outbreak, and not to rid the population of the disease as early as possible, one could measure τ(i) conditioned on the outbreak reaching a sentinel before it dies out; we will briefly discuss such a conditioned τ. For all three problems mentioned above, one can consider sets of nodes rather than individuals.
There can be more than one source (for influence maximization), vaccinee, or sentinel. We will in general call these the active nodes and denote their number by n, and we will try to find the optimal sets of active nodes (calling them the optimal nodes). Note that this is not the same as ranking the nodes in order of importance and taking the n most important ones; such a ''greedy'' approach can fail in many cases [ , ]. Note also that for vaccination and sentinel surveillance we use one source node of the infection; this is the standard approach in infectious disease epidemiology, simply because most outbreaks are thought to start with one person [ , ]. We use the constant infection- and recovery-rate version of the SIR model [ ]. In this formulation, if a susceptible node i is connected to an infectious node j, then i becomes infected at rate β. Infected nodes recover at rate ν. Without loss of generality, we can set ν = 1 (equivalently, this means we measure time in units of 1/ν). Let C be a configuration (i.e., a specification of the state S, I, or R of every node), let m_SI be the number of links between S and I nodes, and let n_I be the number of infected nodes. Then the total rate of events (either infections or recoveries) is βm_SI + n_I, which gives the expected duration of C as t = 1/(βm_SI + n_I). Proceeding in the spirit of the Gillespie algorithm, the probability that the next event is an infection event is βm_SI t, and the probability that it is a recovery event is n_I t [ , ]. Exactly calculating the outbreak size and the time to discovery or extinction is, in principle, straightforward. Consider the change from configuration C into C′ by an infection event (changing node i from susceptible to infectious). This can happen in m_i ways, where m_i is the number of links between i and infectious nodes, so the probability of the transition from C to C′ is βm_i t. The probability that the next event is the recovery of a given infectious node is simply t.
To compute the probability of a chain of events, one simply multiplies these probabilities over all transitions; to compute the expected time of a chain of events, one sums the durations t over all configurations of the chain. We illustrate this with an example (see fig. ). The probability of an outbreak chain is the product of the probabilities of its transitions, and the expected duration of the chain is the sum of the expected durations of its configurations, which gives the contribution of that chain to τ. These contributions are then summed over all chains and averaged over all starting configurations. The resulting expressions for the expected outbreak size and τ are fractions of polynomials in β; for the largest networks we study (seven nodes), these polynomials can be of high order with very large integer coefficients. Calculating the expected outbreak size for the influence maximization or vaccination problems follows the same path as the τ calculation above; the difference is that instead of multiplying by the expected time of a chain, one multiplies by the number of recovered nodes in that branch. Furthermore, there are no sentinels to stop outbreaks, so the trees of chains (like fig. ) become larger. In practice, our approach to analyzing network epidemiological models is time-consuming; the major bottleneck is the polynomial algebra (to be precise, calculating the greatest common divisor needed to reduce the fractions of polynomials to their canonical form). Because of this, we could not handle networks of more than seven nodes. The code was implemented both in Python (with the SymPy library [ ]) and in C with the FLINT library [ ]. It also uses the subgraph isomorphism algorithm VF2 [ ] as implemented in the igraph C library [ ]. Our code is available at http://github.com/pholme/exact-importance, which also includes code to calculate the conditioned τ (mentioned above but not investigated in the paper).
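The chain-summation idea above can be reproduced in miniature. The sketch below is my toy reimplementation (not the authors' symbolic code): it computes the exact expected outbreak size for a numeric β by recursing over every possible event chain, using exact rational arithmetic with `fractions.Fraction` and the same rates as in the text (β per S-I link, 1 per infected node).

```python
from fractions import Fraction

def expected_outbreak_size(adj, infected, recovered, beta):
    """Exact expected final outbreak size of the SIR model, obtained by
    recursing over every possible chain of events.  A susceptible node i
    becomes infected at rate beta * (number of links to infected nodes);
    every infected node recovers at rate 1."""
    inf, rec = frozenset(infected), frozenset(recovered)
    if not inf:                    # outbreak over: size = nodes ever infected
        return Fraction(len(rec))
    events = []                    # (rate, next infected set, next recovered set)
    for i in adj:
        if i in inf or i in rec:
            continue
        m = sum(1 for j in adj[i] if j in inf)
        if m:
            events.append((beta * m, inf | {i}, rec))
    for j in inf:
        events.append((Fraction(1), inf - {j}, rec | {j}))
    total = sum(rate for rate, _, _ in events)
    return sum((rate / total) * expected_outbreak_size(adj, ninf, nrec, beta)
               for rate, ninf, nrec in events)

# Two nodes joined by one link, seeded at node 0:
# expected size = 1 + beta/(beta + 1), e.g. 3/2 at beta = 1.
dyad = {0: {1}, 1: {0}}
print(expected_outbreak_size(dyad, {0}, set(), Fraction(1)))
```

Because the recursion enumerates every chain, the cost explodes with graph size, which mirrors the paper's observation that the exact approach is limited to very small graphs (there, seven nodes with full symbolic β).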
To better understand how network structure determines which nodes are most important, we measure the average values of static importance predictors. In general, there are many ways for a node to be central: is it a node often passed by things traveling over the network, or a node with short paths to all other nodes? Different rationales give different measures. These are typically positively correlated, but they do not rank the nodes in exactly the same way, and thus they can complement each other [ ]. We focus on three measures: degree, closeness centrality, and vitality. Degree centrality is simply the number of neighbors of a node. If a node has twice as many neighbors as another, it has twice as many nodes to which it can spread an infection, which makes it more important for influence maximization and vaccination; it also has twice as many nodes from which to catch the infection, which contributes to its importance for vaccination and sentinel surveillance. On the other hand, degree is not a global quantity: it could happen that the neighbors of a high-degree node are so peripheral that a disease would easily die out there. The simplest way of extending degree to a global measure is to operationalize the idea that a node is central if it is the neighbor of many central nodes. With the simplest possible assumptions, this reasoning leads to eigenvector centrality, i.e., the centrality of node i can be estimated as the ith entry of the leading eigenvector of the adjacency matrix [ ]. For the small graphs we consider, however, eigenvector centrality is so strongly correlated with degree (intuitively, because ''everything is local'' in a very small graph) that it makes little sense to include it in the analysis. Many centrality measures are based on shortest paths. Perhaps the simplest of these is closeness centrality, based on the idea that a node is central if it is on average close to the other nodes [ , ].
This leads to a measure of the centrality of i based on the reciprocal distance to all other nodes in the network, c(i) = (N − 1)/Σ_{j≠i} d(i,j). The main problem with closeness centrality in general is that it is ill-defined on disconnected graphs; in our work, however, we consider only connected graphs. We chose the third centrality measure, vitality, with the vaccination problem in mind. Vitality is, in general, a class of measures that estimate node centrality from the impact on the network of deleting the node [ ]. In our work, we let vitality denote the particular metric v(i) = s(G) − s(G − i), where s(G) is the number of nodes in the largest connected component of G. This measure lies in the interval [1, N − 1], and it increases with i's ability, if removed, to fragment the network. Since vaccination is, in practice, like removing nodes from the network, we expect v to identify important nodes for large β. For large graphs we expect v to be very close to 1, so we only recommend it for small graphs such as the ones used here. Another popular centrality measure, betweenness centrality (roughly, how many shortest paths in the network pass through a node) [ ], is very strongly correlated with vitality for our set of small graphs, and it is thus omitted from the analysis. In our work, we systematically evaluate all small distinct (nonisomorphic) connected graphs with N ≤ 7: there are two such graphs with N = 3, six with N = 4, 21 with N = 5, 112 with N = 6, and 853 with N = 7. To generate them, we use the program geng [ ]. In our analysis, we focus on when and why the three notions of node importance rank nodes differently. We start with some extreme examples and continue with general properties of all small graphs. Inspired by ref. [ ], we start with a special example (fig. ). This is the smallest graph where the most important single node (n = 1) differs between influence maximization, vaccination, and sentinel surveillance.
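The static predictors discussed above can be computed with elementary breadth-first search. The sketch below is mine, not the paper's code; closeness is taken in the standard (N − 1)/Σd form, and vitality as the drop in the largest component size upon node removal. A five-node star makes the contrast vivid: the centre is both maximally close and maximally fragmenting.

```python
from collections import deque

def bfs_dist(adj, src, removed=frozenset()):
    """BFS distances from src, ignoring 'removed' nodes."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist and v not in removed:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def closeness(adj, i):
    """Closeness centrality c(i) = (N - 1) / sum of distances from i."""
    return (len(adj) - 1) / sum(bfs_dist(adj, i).values())

def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component, with some nodes removed."""
    best, seen = 0, set(removed)
    for s in adj:
        if s not in seen:
            comp = bfs_dist(adj, s, frozenset(seen))
            seen.update(comp)
            best = max(best, len(comp))
    return best

def vitality(adj, i):
    """v(i) = s(G) - s(G - i): how much removing i fragments the network."""
    return largest_component(adj) - largest_component(adj, frozenset({i}))

# Five-node star: centre 0, leaves 1-4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(closeness(star, 0), vitality(star, 0))   # centre: closest and most fragmenting
print(closeness(star, 1), vitality(star, 1))   # a leaf: peripheral, v at its minimum
```

On the star, the centre has closeness 1 and vitality N − 1 = 4, while each leaf has vitality 1, matching the stated range [1, N − 1].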
For β in an intermediate interval of values, one node is the most important for influence maximization, another for vaccination, and a third for sentinel surveillance. For small β values, a single node is most important for all three aspects. In this region, outbreaks die out easily, and the fact that two of the nodes have a larger degree than the others of course helps an outbreak take hold in the population. One of them is slightly more important as a seed node, since the extra link in its neighborhood helps the outbreak persist longer (there are infection paths that, although unlikely, do not exist for diseases starting at the other node). This reasoning also explains why it is most important for vaccination. For sentinel surveillance and low enough β, an outbreak typically ends by going extinct rather than by hitting a sentinel. Thus, for low β, when an outbreak has the highest chance of surviving if it starts at that node, placing the sentinel there is good, because an outbreak is either instantly discovered or will likely soon go extinct. With a conditioned discovery time, the curves are strictly decreasing (since the early die-off is omitted), so the same node is most important for all β. For larger β, another node becomes, relatively speaking, more important for influence maximization and sentinel surveillance: it is the most central node in senses other than degree. For vaccination, however, the node that fragments the network most is most important (the vitality is the same for both candidate nodes, but the size of the second-largest component is larger when one of them is deleted). Since this node overtakes the other at a larger β value for influence maximization than for sentinel surveillance, there is an interval of β in which the network of fig. has three distinct most important nodes for the three aspects of importance that we investigate. For two active nodes (n = 2), the smallest network with no overlap between the optimal node sets is actually smaller than for n = 1.
This network, displayed in fig. , has six nodes and eight links. Note that six is the smallest number of nodes that allows three disjoint sets of two nodes, so in that sense the n = 2 example seems more extreme than its n = 1 counterpart. For large β values, one pair of nodes is most important for influence maximization, another pair for vaccination, and a third for sentinel surveillance. The vaccination pair comprises the nodes that, if deleted, break the network into the smallest components, which explains why they are most important for vaccination (at least for large β). Together with the influence-maximization pair, they are the only pairs of nodes whose neighborhoods contain all other nodes. The influence-maximization pair both have the same degree, as opposed to the vaccination pair, whose degrees differ; it is not clear whether or why this makes the former better for influence maximization. Similarly, it is hard to understand why the third pair is best for sentinel surveillance: the neighborhoods of these nodes do not even cover the entire graph. We can see that the optimal sets of nodes in fig. contain no links within themselves. This seems natural for most networks and all three notions of importance, and it means that as n grows, the distance between the optimal nodes will be larger than 1. This is an observation we will make more quantitative in the next section. Another such observation is that for small β, the optimal nodes for the three importance aspects overlap. In this parameter region, most outbreaks die out before they reach a sentinel. If the outbreak starts at a high-degree node in a highly connected neighborhood, it has a larger chance of surviving, and for all three importance aspects it is important to have active nodes where an outbreak would be likely to survive. Still, as fig. shows, there are examples where the optimal nodes do not overlap. We now move to a more statistical evaluation of all graphs with 3-7 nodes.
We will present quantities averaged over all these graphs as functions of β; other summary statistics, including grouping the graphs by size, give the same conclusions. Let the optimal sets for a given network, β, and importance aspects a and b be denoted as above. The first quantity we look at is the pairwise overlap of the sets of optimal nodes, measured by the Jaccard overlap J(a,b,β) = |A ∩ B| / |A ∪ B|, where A and B are the optimal sets for aspects a and b. For example, in fig. at a particular β, taking a as influence maximization and b as sentinel surveillance gives the overlap of the corresponding optimal pairs. As seen in fig. , for one value of n the overlap between the optimal nodes for vaccination and sentinel surveillance has a minimum as a function of β; the same is true for sentinel surveillance versus influence maximization at another n. It is hard to say why, beyond noting that for individual graphs the J(a,b,β) curves can of course be nonmonotonic, as different aspects of graph structure determine the roles of the nodes. We note that (for a different spreading model and much larger networks) ref. [ ] also finds that the Jaccard similarity between influence maximization and vaccination has a minimum as a function of β. Next, we investigate the structural properties of the most influential nodes and how they depend on β. In fig. , we plot the degree, closeness centrality, and vitality as functions of β for all aspects of importance and n ∈ {1, 2, 3}. We start by examining the case n = 1; see figs. (a), (b), and (c). The first thing to notice is the general impression that the centralities of the optimal nodes decrease with β. The only case with the opposite trend is vitality [fig. (c)], where the curves increase monotonically. Focusing first on one active node, this can be understood through the ability of a node, if removed, to fragment the network: this ability is captured by vitality and becomes more important as β increases. Continuing the analysis for n = 1, when β is low the most important thing is for the outbreak to persist in the population.
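The pairwise-overlap measure just defined is straightforward to compute; here is a minimal sketch (my code, with hypothetical node sets as examples):

```python
def jaccard(a, b):
    """Jaccard overlap |A ∩ B| / |A ∪ B| between two optimal node sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets overlap fully
    return len(a & b) / len(a | b)

# Identical optimal pairs overlap fully; disjoint pairs not at all.
print(jaccard({1, 4}, {1, 4}))   # 1.0
print(jaccard({1, 4}, {2, 5}))   # 0.0
print(jaccard({1, 4}, {4, 5}))   # one shared node out of three distinct -> 1/3
```

J = 0 corresponds to the non-overlapping optimal sets of the six-node example above, and J = 1 to the small-β regime where all three aspects pick the same nodes.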
if an active node has a high degree, it is likely to be the source of a large outbreak, meaning it is important for influence maximization (as was also concluded by ref. [ ]). if a high-degree node is deleted, many links that could spread the disease are removed, making such nodes important for vaccination [ ]. for sentinel surveillance at low β, it is likewise important not to place a sentinel on a low-degree node, as diseases reaching low-degree nodes would likely die out. so figs. (a) and (c) can be understood as a shift from nodes of high degree to nodes of high vitality. closeness centrality, seen in fig. (b), is harder to explain. values of c increase with β for influence maximization but decrease for vaccination. one way of understanding this is from the observation that vitality is most important for vaccination [as evident from fig. (c)], and degree is most important for influence maximization [as seen in fig. (a)]. the results of fig. (b) then suggest that the high-vitality nodes optimizing the solution of the vaccination problem have a lower closeness centrality. indeed, for many of the graphs we study, the highest-vitality node has many degree- neighbors (cf. node in fig. ), which does not necessarily contribute to the closeness centrality. for influence maximization, it seems that the optimal nodes are central in the closeness sense: the closer, on average, the seed node is to the rest of the network, the higher the chance that the outbreak reaches the entire network. for n = and , the picture is somewhat different from that for n = . in these cases, all centrality measures decrease monotonically. the order of the importance measures is the same throughout, with vaccination having the largest values and influence maximization the smallest. it is no longer the case for vaccination that the optimizing nodes have high vitality and low closeness centrality (as it was for n = ).
indeed, for the vaccination case, the optimal nodes are usually independent of β, which is why the curves for vaccination in figs. (d)-(i) are almost straight. naively, one would think that some centrality measure needs to increase with β. however, as we will argue further below, the optimal nodes would usually not be close to each other. one could think of each node being responsible for (and centrally situated within) a region of the network, and of that tendency being so strong that it overrides all simple centrality measures. on the other hand, there are group centrality measures that could perhaps increase with β [ ] (that could be a theme for another paper). the fact that all the curves of figs. (d)-(i) are nonincreasing could be explained by the fact that the separation of the optimal nodes increases with β. in fig. , we try to make this argument more quantitative by measuring the average (shortest path) distance d between the optimal nodes. in the limit of small β, these values come rather close to their minimum of , but as β increases, so does d. essentially, the pattern from fig. is the reverse of figs. (d)-(i): the vaccination curve is almost constant, sentinel surveillance increases moderately, but influence maximization increases much more. a larger separation gives the sentinels the ability, on average, to be closer to outbreaks anywhere in the network, while for influence maximization a larger separation means that there are more susceptible-infectious links (fewer infectious-infectious links) in the incipient outbreak. for vaccination there is no such positive effect of a larger separation that we can think of, which is part of the explanation of why the optimal sets are relatively independent of β for n > . the rest of the explanation, i.e., why the trends for n = are so much weaker when n > , is not clear to us, and it is something we will investigate further in the future.
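the average shortest-path distance d between a set of optimal nodes, as measured in the figure, can be sketched as follows (python, bfs on an assumed small path graph; the node labels are illustrative):

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, src):
    """Hop distances from `src` to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def mean_separation(adj, nodes):
    """Average shortest-path distance d over all pairs of the given nodes."""
    pairs = list(combinations(nodes, 2))
    return sum(bfs_distances(adj, a)[b] for a, b in pairs) / len(pairs)

# assumed path graph 0-1-2-3-4: a spread-out node set is more separated
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

on this path graph, mean_separation(path, [0, 2, 4]) exceeds mean_separation(path, [0, 1, 2]), mirroring how the optimal sets spread out as β grows.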
we investigated the average properties of the optimal nodes for all our graphs. we found that the overlap between the optimal nodes of the different importance aspects is largest for small β. in the small-β region, a high degree seems most important for all importance aspects. for larger β, it becomes more important for nodes to be positioned such that they would fragment the network if they were removed, particularly for the vaccination problem (slightly less so for the sentinel surveillance problem, and much less for influence maximization). on the other hand, when the number of active nodes increases, it becomes important for the nodes to be spread out: the average distance between them increases. this effect is large for influence maximization, intermediate for sentinel surveillance, and very small for vaccination. the small effect for vaccination can be understood since all that matters is to fragment the network, and for that purpose the vaccinated nodes do not necessarily have to be distant from one another. most of the behavior discussed above seems quite natural. for small β, the dominant aspect of the dynamics is how fast an outbreak will die out. for large β, the outbreak will almost certainly reach all nodes. for vaccination and sentinel surveillance, this leads to a question of deleting nodes that would break the network into the smallest components. (in the former case, this is trivial since the size of the outbreak is almost surely the size of the connected component to which the seed node belongs. in the latter, we conclude this from the monotonically increasing vitality.) as an extension, it would be interesting to confirm this work in larger networks using stochastic simulations. this would not allow for the discovery of special graphs such as those in figs. and , but it could reinforce the connection between the different notions of centrality.
we believe that many of our conclusions hold for larger networks, an indication being that our results are consistent with the results of ref. [ ] (comparing vaccination and influence maximization for n = in large empirical networks).

references: a survey of models and algorithms for social influence analysis; proceedings of the ninth acm sigkdd international conference on knowledge discovery and data mining; networks: an introduction; proceedings of the third international congress on mathematical software; rd iapr-tc workshop on graph-based representations in pattern recognition; network analysis: methodological foundations.

acknowledgements: we thank petteri kaski, nelly litvak, and naoki masuda for helpful comments.

key: cord- -p vxnn z authors: lyu, tianshu; sun, fei; zhang, yan title: node conductance: a scalable node centrality measure on big networks date: - - journal: advances in knowledge discovery and data mining doi: . / - - - - _ sha: doc_id: cord_uid: p vxnn z node centralities such as degree and betweenness help detect influential nodes from a local or global view. existing global centrality measures suffer from high computational complexity and unrealistic assumptions, limiting their use in real-world applications. in this paper, we propose a new centrality measure, node conductance, to effectively detect spanning structural hole nodes and predict the formation of new edges. node conductance is the sum of the probability that node i is revisited at the r-th step, where r is an integer between and infinity. moreover, with the help of node embedding techniques, node conductance can be approximately calculated on big networks effectively and efficiently. thorough experiments present the differences between existing centralities and node conductance, its outstanding ability to detect influential nodes on both static and dynamic networks, and its superior efficiency compared with other global centralities.
electronic supplementary material: the online version of this chapter ( . / - - - - _ ) contains supplementary material, which is available to authorized users. social network analysis is used widely in the social and behavioral sciences, as well as in economics and marketing. centrality is an old but essential concept in network analysis. central nodes mined by centrality measures are more likely to help disseminate information, stop epidemics, and so on [ , ]. local and global centralities are classified according to the node influence being considered. local centralities, for instance degree and clustering coefficient, are simple yet effective metrics for ego-network influence. on the contrary, tasks such as information diffusion and influence maximization put more attention on the node's spreading capability, which needs centrality measurements at long range. betweenness and closeness capture structural characterization from a global view. as these measures operate on the entire network, they are informative and have been extensively used for the analysis of social-interaction networks [ ]. however, exact computation of these centralities is infeasible on large networks, and real spreading processes rarely follow the ideal route (shortest path or maximum flow) at most times. random walk centrality [ ] counts the number of random walks instead of the ideal routes. nevertheless, the computational complexity is still too high. subgraph centrality [ ], the most similar measure to our work, is defined as the sum of closed walks of different lengths starting and ending at the vertex under consideration. it characterizes nodes according to their participation in subgraphs. as subgraph centrality is obtained mathematically from the spectra of the adjacency matrix, it also incurs huge computational complexity. advances in nlp research: the neural language model has attracted great attention for its effective and efficient performance in extracting similarities between words.
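as a concrete illustration of the subgraph centrality mentioned above, the standard spectral (estrada-style) formula sc(i) = Σ_k (u_ik)² exp(λ_k) can be evaluated directly from an eigendecomposition of the adjacency matrix. the sketch below uses an assumed four-node toy graph; for large networks this eigendecomposition is exactly the expensive step the text criticises:

```python
import numpy as np

def subgraph_centrality(A):
    """sc(i) = sum_k (u_ik)^2 * exp(lambda_k): an exponentially damped count
    of closed walks starting and ending at node i (Estrada-style definition)."""
    lam, U = np.linalg.eigh(A)  # A must be symmetric
    return (U ** 2) @ np.exp(lam)

# assumed toy graph: a triangle (nodes 0, 1, 2) with a pendant node 3
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
sc = subgraph_centrality(A)  # triangle nodes join more closed walks than node 3
```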
skip-gram with negative sampling (sgns) [ ] has been proved to be, in fact, co-occurrence matrix factorization [ ]. many works concern the different usages and meanings of the two vectors in sgns. the authors of [ ] seek to combine the input and output vectors for better representations. similarly, in the area of information retrieval, input and output embeddings are considered to carry different kinds of information [ ]: input vectors are more reflective of function (type), while output vectors are more reflective of topical similarity. in our work, we further analyze the relationships between the learned input and output vectors and the network topology, bringing more insights to network embedding techniques. moreover, we bridge the gap between node embedding and the proposed centrality, node conductance. conductance measures how hard it is to leave a set of nodes. we name the new metric node conductance as it measures how hard it is to leave a certain node. we consider an undirected graph g, and for simplicity we assume that g is unweighted, although all of our results apply equally to weighted graphs. a random walk on g defines an associated markov chain, and we define the node conductance of a vertex i, nc_∞, as the sum of the probability that i is revisited at the s-th step, where s is an integer between and ∞. the next section demonstrates that the number of times that two nodes co-occur in a random walk is determined by the sub-network shared by these two nodes. node conductance is about the co-occurrence of the target node with itself and is thus able to measure how dense the connections are around the target node. the graph g is assumed to be connected and to have no periodically returning nodes (e.g. a bipartite graph). the adjacency matrix a is symmetric and its entries equal one if there is an edge between two nodes and zero otherwise. the vector d = a·1, where 1 is an n × 1 vector of ones and n is the node number, has the node degrees as its entries.
d is the diagonal matrix of degree: d = diag(d). graph g has an associated random walk in which the probability of leaving a node is split uniformly among its edges. for a walk starting at node i, the probability that we find it at j after exactly s steps is given by the (i, j) entry of p^s, where p = d^(-1)·a is the transition matrix. nc_r denotes the sum of the probability that the node is revisited at step s, for s between and r: nc_r(i) = Σ_s (p^s)_ii, where p_ii is the entry in the i-th row and i-th column of matrix p. supposing that r approaches infinity, nc_∞ becomes a global node centrality measure. in order to compute the infinite sum of matrix powers, the term s = is added for convenience. d − a, the laplacian matrix l of the network, is singular and cannot be inverted simply, so we introduce the pseudo-inverse. the spectral decomposition is l_ij = Σ_{k=1}^{n} λ_k u_ik u_jk, where λ and u are the eigenvalues and eigenvectors respectively. as the vector [ , , ...] is always an eigenvector with eigenvalue zero, the eigenvalues of the pseudo-inverse l† are defined by taking the reciprocals of the nonzero eigenvalues and keeping the zero eigenvalue at zero. nc_∞(i) only involves the diagonal of l†, together with d_i, the degree of node i (the i-th entry of d). although node conductance is a global node centrality measure, its value is mostly determined by local topology. as shown in eq. , in most cases the return probability (p^s)_ii is quite small when s is large. this corresponds to the situation that the random walk is less and less likely to revisit the start point as the walk length increases. in the supplementary material, we will prove that node conductance can be well approximated from local subgraphs. moreover, as the formalized computation of node conductance is mainly based on matrix powers and inverses, a fast calculation of node conductance is also required. we will discuss the method in sect. . node conductance seems to have a very similar definition to subgraph centrality (sc) [ ] and pagerank (pr) [ ]. in particular, node conductance counts only the walks that start and end at a given node.
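the finite-r node conductance defined above, summing the return probabilities (p^s)_ii for s up to r with p = d^(-1)·a, can be sketched directly (python/numpy, on an assumed four-node toy graph):

```python
import numpy as np

def node_conductance(A, r):
    """nc_r(i): total probability that a random walk started at i is back
    at i at step s, summed over the first r steps, with P = D^{-1} A."""
    P = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    power, nc = np.eye(len(A)), np.zeros(len(A))
    for _ in range(r):
        power = power @ P          # P^s, accumulated step by step
        nc += np.diag(power)       # add the return probabilities
    return nc

# assumed toy graph: triangle 0-1-2 plus pendant node 3 attached to node 2
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
nc = node_conductance(A, 10)
```

the partial sums are nondecreasing in r, matching the text's remark that the infinite sum converges because long walks rarely return to the start.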
and pr is the stationary distribution of the random walk, which means that it is the probability that a random walk with infinitely many steps, starting from any node, hits the node under consideration: pr = d(d − αa)^(-1), where the agent jumps to any other node with probability α. the difference between pr and eq. lies in the random walks taken into account. by multiplying by the vector of ones, the pr value of node i is the sum of the entries in the i-th row of d(d − αa)^(-1). in eq. , the nc value of node i is the entry in the i-th row and i-th column. in summary, nc is more about the node neighborhood while pr takes a global view. this difference makes pagerank a good metric in information retrieval but less effective in social network analysis; after all, social behavior has almost nothing to do with global influence. sc counts the number of subgraphs that the node takes part in, which is equivalent to the number of closed walks starting and ending at the target node. the authors later add a scaling factor to the denominator in order to make the sc value converge, but the measure becomes less interpretable. nc, on the contrary, is easy to follow and converges by definition. as the calculation of node conductance involves matrix multiplication and inversion, it is hard to apply to large networks. fortunately, the proof in our supplementary material indicates that node conductance can be approximated from the induced subgraph g_i formed by the k-neighborhood of node i, with an approximation error that decreases at least exponentially with k. random walks, which node conductance is based on, are also an effective sampling strategy to capture node neighborhoods in recent network embedding studies [ , ]. next, we aim at teasing out the relationship between node embeddings and network structures, and further introduce the approximation of node conductance. word2vec is highly efficient to train and provides state-of-the-art results on various linguistic tasks [ ].
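the distinction drawn above, pr as a row sum of d(d − αa)^(-1) versus nc as the diagonal entry, can be illustrated numerically; the toy adjacency matrix and the value of α below are assumptions for illustration only:

```python
import numpy as np

alpha = 0.85  # assumed damping value, chosen only for illustration
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
M = D @ np.linalg.inv(D - alpha * A)  # invertible: D - alpha*A is diagonally dominant

pr_like = M.sum(axis=1)  # row sums: walks arriving from anywhere (global view)
nc_like = np.diag(M)     # diagonal: walks that start and end at the node (local view)
```

since all entries of M are nonnegative, the diagonal entry is always at most the row sum: the pagerank-style quantity aggregates walks from the whole network, while the conductance-style quantity keeps only the returning ones.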
it tries to maximize the dot product between the vectors of frequent word-context pairs and minimize it for random word-context pairs. each word has two representations in the model, namely the input vector (word vector w) and the output vector (context vector c). deepwalk [ ] was the first to point out the connection between texts and graphs and to apply the word2vec technique to network embedding. although deepwalk and word2vec always treat the input vector w as the final result, the context vector c still plays an important role [ ], especially in networks. ( ) syntagmatic: if word i and j always co-occur in the same region (or two nodes have a strong connection in the network), the value of w_i · c_j is large. ( ) paradigmatic: if word i and j have quite similar contexts (or two nodes have similar neighbors), the value of w_i · w_j is high. in nlp tasks, the latter relationship enables us to find words with similar meaning and, more importantly, similar part-of-speech; that is the reason why only input embeddings are preserved in word2vec. however, we do not have such concerns about networks, and moreover, we tend to believe that both of these relationships indicate the close proximity of two nodes. in the following, we analyze the detailed meanings of these two vectors based on the loss function of word2vec. sgns is the technique behind word2vec and deepwalk, guaranteeing the high performance of these two models. our discussion of deepwalk consequently starts from sgns. the loss function l of sgns is as follows [ , ]: l = Σ_{i∈v_w} Σ_{j∈v_c} #(i, j)_r · log σ(w_i · c_j) + Σ_{i∈v_w} #(i)_r · k · e_{neg∼p} [log σ(−w_i · c_neg)], where σ is the sigmoid function. v_w is the vocabulary set, i is the target word and v_c is its context word set, #(i, j)_r is the number of times that j appears in the r-sized window with i being the target word, and #(i)_r is the number of times that i appears in the training pairs: #(i)_r = Σ_j #(i, j)_r, where w_i and c_i are the input and output vectors of i.
neg is a word sampled from the distribution p(i) = #(i)/|d|, corresponding to the negative-sampling part; d is the collection of observed word and context pairs. note that word2vec uses a smoothed distribution in which all context counts are raised to the power of 0.75, giving frequent words a lower probability of being chosen. this trick addresses word-frequency imbalance (a non-negligible amount of both frequent and rare words), whereas we found that node degree does not have such an imbalanced distribution in any of the datasets we test (as also reported in fig. of deepwalk [ ]). therefore, we do not use the smoothed version in our experiments. sgns aims to optimize the loss function l presented above. the authors of [ ] provide the detailed derivation of sgns as follows. we define x = w_i · c_j and find the partial derivative of l (eq. ) with respect to x: ∂l/∂x = #(i, j)_r · σ(−x) − k · #(i)_r · p(j) · σ(x). comparing the derivative to zero, we derive that w_i · c_j = log( #(i, j)_r / (#(i)_r · p(j)) ) − log k, where k is the number of negative samples. in the above section, we derived the dot product of the input and output vectors. now, for a certain node i, we calculate the dot product of its own input and output vectors: w_i · c_i = log( #(i, i)_r / (#(i)_r · p(i)) ) − log k. usually, the probability is estimated by the actual number of observations: p(i), namely the probability of a node being visited in a random walk, is proportional to the node degree. thus, in our experiments, the value of exp(w_i · c_i) · deg(i) is used as the relative approximate node conductance value of node i. actually, the exact node conductance value of each node is not that necessary: retaining the relative ranks is enough to estimate centrality. the variants of deepwalk also produce similar node embeddings. for example, node2vec is more sensitive to certain local structures [ ] and its embeddings have a lower capacity for generalization. we only discuss deepwalk in this paper for its tight connection to random walks, which brings more interpretability than other embedding algorithms.
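the approximation above, exp(w_i · c_i) · deg(i), is a one-liner once the input and output embedding matrices are available. in the sketch below the matrices are random stand-ins; in practice they would come from a trained deepwalk/sgns model (in gensim, roughly the word-vector matrix for w and the negative-sampling output matrix for c, with rows aligned by node id):

```python
import numpy as np

def approx_node_conductance(W, C, deg):
    """Relative node-conductance scores exp(w_i . c_i) * deg(i); W and C are
    the input and output embedding matrices with one row per node. Only the
    ranking of the returned scores is meaningful."""
    return np.exp(np.einsum('ij,ij->i', W, C)) * deg

# random stand-ins for trained embeddings (assumed shapes: 4 nodes, dim 8)
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
C = rng.normal(size=(4, 8))
deg = np.array([2.0, 2.0, 3.0, 1.0])
scores = approx_node_conductance(W, C, deg)
```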
deepwalk generates m random walks starting at each node, with walk length l, sliding window size w, and node embedding size d. we set m = , l = , w = , and d = . in order to compute the node embeddings, deepwalk uses word2vec optimized by sgns in gensim and preserves the default settings: the embeddings are initialized randomly, the initial learning rate is . and drops linearly to . , the number of epochs is , and the number of negative samples is . the formalized computation of node conductance is based on eigendecomposition, which scales as o(v ), where v is the number of nodes. using deepwalk with sgns, the computational complexity per training instance is o(nd + wd), where n is the number of negative samples, w is the window size and d is the embedding dimension. the number of training instances is determined by the random-walk settings; usually it is o(v ). although different measures are designed to capture the centrality of the nodes in the network, strong correlations have been proved to exist among these measures [ ]. we compute different centrality measures on several small datasets. nc_∞ is computed by eq. . nc_dw is computed by deepwalk with window size . as presented in table , we calculate their correlations by spearman's rank correlation coefficient. nc_∞ and network flow betweenness cannot be computed on the polblog dataset as the graph is disconnected. apart from the football dataset, degree, nc_∞ and pagerank show a significant relation with nc_dw on all the remaining datasets. node conductance is not sensitive to window size on these datasets. we visualize the special case, the football network, in order to get an intuitive sense of the properties of degree, betweenness, and node conductance (other centralities are presented in the supplementary material). moreover, we want to shed more light on the reason why node conductance does not correlate with degree on this dataset. figure presents the football network.
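spearman's rank correlation coefficient, used above to compare the centrality rankings, is simply the pearson correlation of the two rank vectors, with tied values receiving average ranks; a self-contained sketch:

```python
def rankdata(x):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1  # extend the run of tied values
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = rankdata(x), rankdata(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

handling ties via average ranks matters here because many nodes share the same degree or community count, as the later discussion of table notes.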
the color represents the ranking of nodes produced by different metrics (low value: red, medium value: light yellow, high value: blue). the values produced by these four metrics are normalized into the range [ , ] respectively. comparing fig. a and fig. b with fig. d, it seems that the result provided by node conductance (window = ) synthesizes the evaluations from degree and betweenness. node conductance gives low values to nodes with low degree (nodes , , ) and high betweenness centrality (nodes , , ). we thus have an intuitive understanding that node conductance captures both local and global structural characteristics. when the window size is bigger, the distribution of node colors in fig. c is basically consistent with fig. d. some clusters of nodes get lower values in fig. c because of the different levels of granularity being considered. we apply node conductance computed by deepwalk to both static and dynamic networks to demonstrate its validity and efficiency. node conductance with different window sizes was tested, and size proved to be the best choice. we try our best to calculate the baseline centralities accurately, although some of them do not scale to the big network datasets (table ). we employ the collaboration network of dblp, the amazon product co-purchasing network, and the youtube social network provided by snap. in dblp, two authors are connected only if they are co-authors, and the publication venues are considered the ground-truth communities. dblp has highly connected clusters and consequently the best clustering coefficient (cc). in the amazon network, an edge means that two products are co-purchased frequently, and the ground-truth communities are the groups of products in the same category. users in the youtube social network create or join groups based on their own interests, which can be seen as the ground truth. the link between two users represents their friend relationship. the cc of the youtube network is very low.
dynamic network: the flickr network [ ] between november nd, and may th, . as shown in table , there are altogether snapshots during this period. this unweighted and undirected network has about , new users and over . million new edges (table ). the configuration of our computer is: two intel(r) xeon(r) cpu e - at . ghz, gb of ram. node conductance is calculated by deepwalk with the settings m = , l = , w = , and d = , the same settings as in [ ]. as node conductance is a by-product of deepwalk, the actual running time of node conductance is the same as that of deepwalk. as presented at the beginning of the section, eigenvector centrality and pagerank are approximately calculated, and we set the error tolerance used to check convergence in the power-method iteration to e− . betweenness is approximately calculated by randomly choosing pivots; more pivots require more running time. subgraph centrality and network flow betweenness do not have corresponding approximations. time costs of some global centralities are listed in table . approximate eigenvector centrality, subgraph centrality and network flow betweenness cannot finish calculating in a reasonable amount of time on these three datasets. node conductance calculated by deepwalk is as fast as approximate pagerank and costs much less time than approximate betweenness. compared with the existing global centralities, node conductance computed by deepwalk is much more scalable and capable of being run on big datasets. we use node conductance to find nodes spanning several communities; such nodes are sometimes called structural holes. the amazon, dblp and youtube datasets provide the node affiliations, and we count the number of communities each node belongs to. in our experiments, nodes are ranked decreasingly by their centrality values. we first calculate the spearman ranking coefficient between the ranks produced by each centrality measure and the number of communities.
the error tolerance of approximate eigenvector centrality is set to e− . other settings are the same as in sect. . . results are shown in table . node conductance performs best and pagerank performs poorly. we further explore the differences between the rankings of these centralities and plot the number of communities per node (y-axis) against the node order under each centrality measure (x-axis). in order to smooth the curve, we calculate the average number of communities a node belongs to for every nodes. for example, a point (x, y) denotes that the nodes ranked from ( x) to ( (x + )) belong to y communities on average. in fig. , all six metrics are able to reflect the decreasing trend in the number of spanned communities. it is obvious that node conductance provides the smoothest curve compared with the other five metrics, which indicates its outstanding ability to capture node status from a structural point of view. the consistency of performance on different datasets (please refer to the supplementary material) demonstrates that node conductance is an effective tool for graphs with different clustering coefficients. degree and pagerank seem to have very different performances as shown in table and fig. . the ground-truth centrality is the number of communities that each node belongs to, which means many nodes have the same centrality rank. similarly, many nodes have the same degree too. however, under the other centrality measures, nodes have distinct centrality values and ranks. thus, degree has an advantage in achieving a higher ranking coefficient in table but performs badly as shown in fig. . as for the curves of pagerank, the tails are quite different from the curves of node conductance. in fig. e, the tail is not smooth. in other words, pagerank does not perform well for less active nodes and thus achieves a poor score in table .
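the smoothing used above, averaging the community count over consecutive blocks of ranked nodes, can be sketched as follows (the counts below are assumed data; the bin size used in the paper is elided in this copy):

```python
def binned_means(values, bin_size):
    """Mean of consecutive chunks of an already-ranked sequence; one point
    per chunk smooths the communities-per-node curve."""
    return [sum(values[i:i + bin_size]) / len(values[i:i + bin_size])
            for i in range(0, len(values), bin_size)]

# assumed community counts for nodes already sorted by some centrality
counts = [5, 4, 4, 3, 3, 2, 2, 1, 1, 1]
curve = binned_means(counts, 5)
```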
the calculation of node conductance is entirely based on the topology, while node affiliation (communities) is completely determined by the fields and applications. node affiliation is somehow reflected in the network topology, and node conductance captures it better. in this experiment, we focus on the mechanism of network growth. it is well known that network growth can be described by the preferential attachment process [ ]: the probability of a node getting connected to a new node is proportional to its degree. we consider the flickr network [ ] expansion during dec. rd, to feb. rd, . note that the results are similar if we observe other snapshots; given space limitations, we only show this expansion in the paper. nodes in the first snapshot are ranked decreasingly by their degree. we also count the newly created connections for every node. figure presents strong evidence of preferential attachment. however, there exist some peaks in the long tail of the curve, and these peaks should not be ignored as they almost reach and show up repeatedly. figure b presents the relationship between the degree increase and node conductance. comparing the left parts of these two curves, node conductance fails to capture the node with the biggest degree change. on the other hand, the node conductance curve is smoother and no peak shows up in its long tail. degree-based preferential attachment applies to the high-degree nodes, while for the nodes with fewer edges, this experiment suggests a new expression of preferential attachment: the probability of a node getting connected to a new node is proportional to its node conductance. in this paper, we propose a new node centrality, node conductance, measuring node influence from a global view. the intuition behind node conductance is the probability of revisiting the target node in a random walk.
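the attachment rule discussed above, connection probability proportional to a node score (degree classically, node conductance in the suggested variant), can be sketched with a simple roulette-wheel pick; the scores below are assumed toy values:

```python
import random

def preferential_pick(scores, rng):
    """Roulette-wheel selection: pick a node with probability proportional
    to its score (degree in classic preferential attachment, node
    conductance in the variant the text suggests)."""
    total = sum(scores.values())
    r = rng.uniform(0, total)
    acc = 0.0
    for node, s in scores.items():
        acc += s
        if r <= acc:
            return node
    return node  # guard against floating-point round-off at the top end

rng = random.Random(42)
degree = {'a': 8, 'b': 1, 'c': 1}
picks = [preferential_pick(degree, rng) for _ in range(2000)]
share_a = picks.count('a') / len(picks)  # empirically close to 8/10
```

substituting node-conductance scores for the degree dictionary yields the modified attachment process without changing the selection logic.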
we also rethink the widely used network representation model, deepwalk, and calculate node conductance approximately by the dot product of the input and output vectors. experiments present the differences between node conductance and other existing centralities. node conductance also shows its effectiveness at mining influential nodes on both static and dynamic networks.

references: internet: diameter of the world-wide web; approximating betweenness centrality; emergence of scaling in random networks; factoring and weighting approaches to status scores and clique identification; centrality and network flow; centrality estimation in large networks; subgraph centrality in complex networks; a set of measures of centrality based on betweenness; centrality in social networks: conceptual clarification; node2vec: scalable feature learning for networks; identification of influential spreaders in complex networks; neural word embedding as implicit matrix factorization; improving distributional similarity with lessons learned from word embeddings; word embedding revisited: a new representation learning and explicit matrix factorization perspective; enhancing the network embedding quality with structural similarity; distributed representations of words and phrases and their compositionality; growth of the flickr social network; improving document ranking with dual word embeddings; a measure of betweenness centrality based on random walks; the pagerank citation ranking: bringing order to the web; deepwalk: online learning of social representations; collective dynamics of small-world networks; centers of complex networks.

key: cord- -dkw sugl authors: singh, indu; swami, rajan; khan, wahid; sistla, ramakrishna title: delivery systems for lymphatic targeting date: - - journal: focal controlled drug delivery doi: .
/ - - - - _ sha: doc_id: cord_uid: dkw sugl the lymphatic system has a critical role in the immune system's recognition of and response to disease, and it is an additional circulatory system throughout the entire body. most solid cancers primarily spread from the main site via the tumour's surrounding lymphatics before haematological dissemination. targeting drugs to the lymphatic system is complicated by its intricate physiology; nevertheless, it is an important target for developing novel therapeutics. nanocarriers have encouraged lymphatic targeting, but there remain challenges in directing drugs and bioactives to specific sites, maintaining the desired action and crossing all the physiological barriers. lymphatic therapy using drug-encapsulated colloidal carriers, especially liposomes and solid lipid nanoparticles, emerges as a new technology to provide better penetration into the lymphatics where residual disease exists. by optimising the procedure, selecting the proper delivery route and target area and making use of surface-engineering tools, better carriers for lymphotropic systems can be achieved. thus, new methods of delivering drugs and other carriers to lymph nodes are currently under investigation. the lymphatic system was first recognised by gaspare aselli in , and the anatomy of the lymphatic system was almost completely characterised by the early nineteenth century; however, knowledge of the blood circulation continued to grow rapidly in the last century [ ]. two different theories have been proposed for the origin of the lymphatic vessels. first, the centrifugal theory of the embryologic origin of the lymphatics was described in the early twentieth century by sabin and later by lewis, postulating that lymphatic endothelial cells (lecs) are derived from the venous endothelium.
later, the centripetal theory of lymphatic development was proposed by huntington and mcclure in , which describes the development of the lymphatic system beginning with lymphangioblasts, mesenchymal progenitor cells arising independently of veins; the venous connection to the lymphatic system then happens later in development [ ] . the lymphatic vessels in the embryo originate at mid-gestation and develop after the cardiovascular system is fully established and functional [ ] . a dual origin of lymphatic vessels from embryonic veins and mesenchymal lymphangioblasts has also been proposed [ ] . recent studies provide strong support for the venous origin of lymphatic vessels [ - ] . the recent discovery of various molecular markers has allowed for more in-depth research of the lymphatic system and its role in health and disease. the lymphatic system has recently been elucidated as playing an active role in cancer metastasis. knowledge of the active processes involved in lymphatic metastasis provides novel treatment targets for various malignancies. the lymphatic system consists of the lymphatic vessels, lymph nodes, spleen, thymus, peyer's patches and tonsils, which play important roles in immune surveillance and response. the lymphatic system serves as the body's second vascular system in vertebrates and functions co-dependently with the cardiovascular system [ , ] . the lymphatic system comprises a single irreversible, open-ended transit network without a principal driving force [ ] . it consists of five main types of conduits: the capillaries, collecting vessels, lymph nodes, trunks and ducts. the lymphatic system originates in the dermis with initial lymphatic vessels and blind-ended lymphatic capillaries that are nearly equivalent in size to, but less abundant than, regular capillaries [ , ] . lymphatic capillaries consist of a single layer of thin-walled, non-fenestrated lymphatic endothelial cells (lecs), similar to blood capillaries.
the lecs, in contrast to blood vessels, have a poorly developed basement membrane and lack tight junctions and adherens junctions. these very porous capillaries act as a gateway for large particles, cells and interstitial fluid. particles as large as nm in diameter can extravasate into the interstitial space, get phagocytosed by macrophages and are ultimately passed on to lymph nodes [ - ] . lymphatic capillary endothelial cells are affixed to the extracellular matrix by elastic anchoring filaments, which prevent vessel collapse under high interstitial pressure. these initial lymphatics, under a positive pressure gradient, distend and create an opening between loosely anchored endothelial cells, allowing the entry of lymph, a protein-rich exudate from the blood capillaries [ , , ] . in initial lymphatic vessels, overlapping endothelial cell-cell contacts prevent fluid reflux back into the interstitial space [ , ] . after the collection of lymph by the lymphatic capillaries, it is transported through a system of converging lymphatic vessels of progressively larger size, is filtered through lymph nodes where bacteria and particulate matter are removed and finally returns to the blood circulation. lymph is received from the initial lymphatic capillaries by deeper collecting vessels that contain valves to maintain unidirectional flow of lymph. these collecting vessels have basement membranes and are surrounded by smooth muscle cells with intrinsic contractile activity that, in combination with contraction of surrounding skeletal muscles and arterial pulsations, propels the lymph to lymph nodes [ - ] . the collecting lymphatic vessels unite into lymphatic trunks, and the lymph is finally returned to the venous circulation via the thoracic duct into the left subclavian vein [ , ] . the flow of lymph toward the circulatory system is supported by increases in interstitial pressure as well as contractions of the lymphatic vessels themselves.
roughly l of lymphatic fluid enters the cardiovascular system each day [ ] . the key functions of the lymphatic system are maintenance of normal tissue fluid balance, absorption of lipids and fat-soluble vitamins from the intestine and the migration and transport of immune cells. lymphatics transport antigen-presenting cells as well as antigens from the interstitium of peripheral tissues to the draining lymph nodes, where they initiate immune responses via b- and t-cells in the lymph nodes [ , , , ] . tissue fluid balance is maintained by restoring interstitial fluid to the cardiovascular system [ ] . although capillaries have very low permeability to proteins, these molecules as well as other macromolecules and bacteria accumulate in the interstitium. without a mechanism for their removal, the accumulation of these large molecules in the interstitium would result in significant tissue oedema. the lymphatic system offers the mechanism by which these large molecules re-enter the blood circulation [ ]. the lymphatic system is the site of many diseases such as metastasis, tuberculosis (tb), cancer and filariasis [ ] . due to the peculiar nature and anatomy of the lymphatic system, localisation of drugs in the lymphatics has been particularly difficult to achieve. the lymphatic system has an active role in cancer metastasis. although many cancers may be treated with surgical resection, microscopic disease may remain and lead to locoregional recurrence. conventional systemic chemotherapy cannot deliver drugs effectively to the lymphatic system without dose-limiting toxicities [ ]. the lymphatic system's function in clearing particulate matter from the interstitium and presenting it to lymph nodes has created interest in developing microparticulate systems to target regional lymph nodes. a molecule's composition is important in determining uptake into the lymphatics and retention within the lymph nodes.
colloidal materials, for example, liposomes, activated carbon particles, emulsions, lipids and polymeric particulates, are highly taken up by the lymphatics, which is why these substances are emerging as potential carriers for lymphatic drug targeting [ ] . the vast majority of drugs following oral administration are absorbed directly into portal blood, but a number of lipophilic molecules may gain access to the systemic circulation via the lymphatic pathway [ , ] . intestinal lymphatic transport of lipophilic molecules is significant and presents benefits in a number of situations: the lymphatic system also acts as the primary systemic transport pathway for b- and t-lymphocytes as well as the main route of metastatic spread of a number of solid tumours [ , ] . therefore, lymphatic absorption of immunomodulatory and anticancer compounds may be more effective [ , ] . the presence of large numbers of hiv-susceptible immune cells in the lymphoid organs makes antiretroviral drug targeting to these sites of tremendous interest in hiv therapy. this strategy once again comprises targeting nanosystems to immune cell populations, particularly macrophages. evidence further suggests that lymph and lymphoid tissue, and in particular gut-associated lymphoid tissue, play a major role in the development of hiv, and antivirals which target acquired immunodeficiency syndrome (aids) may therefore be more effective when absorbed via the intestinal lymphatics [ , ] . targeting drugs to the lymphatic system is a tough and challenging task, and it depends entirely upon the intricate physiology of the lymphatic system. targeting facilitates direct contact of the drug with the specific site, decreasing the dose of the drug and minimising the side effects it causes.
currently, nanocarriers have advanced lymphatic targeting, but there are still challenges in delivering drugs and bioactives to specific sites, maintaining the desired action and crossing all the physiological barriers. these hurdles could be overcome by the use of modified nanosystems achieved through surface engineering. the development of new methods of lymph node drug delivery stems from the growing awareness of the importance of lymph nodes in cancer prognosis, their significance for vaccine immune stimulation and the realisation that the lymph nodes harbour hiv as well as other infectious diseases [ - ] . new methods of delivering drugs and other carriers to lymph nodes are currently under investigation. lymph node dissemination is the primary route of spread of the majority of solid cancers [ ] . with regard to cancer metastasis, the status of the lymph node is a major determinant of the patient's prognosis. the most important factor determining the appropriate care of the patient is correct lymph node staging [ ] . patient survival has been shown to improve with therapeutic interventions that treat metastatic cancer in lymph nodes with either surgery or local radiation therapy [ ] . viraemia is an early indication of primary infection with hiv, followed by a specific hiv immune response and a dramatic decline of virus in the plasma [ ] . long after the hiv virus can no longer be found in the blood, hiv can be found at high levels in mononuclear cells located in lymph nodes. viral replication in these lymph nodes has been reported to be about - to -fold higher than in the peripheral blood mononuclear cells [ ] . standard oral or intravenous drug delivery to these lymph node mononuclear cells is difficult [ ] . even though highly active antiretroviral therapy (haart) can reduce plasma viral loads in hiv-infected patients by %, active virus can still be isolated from lymph nodes even after months of haart therapy.
lymph nodes are a key element of the life cycle of several parasitic organisms, including filaria. lymphatic vessels and lymph nodes of infected patients can carry adult worms. the adult filaria obstruct the lymphatic drainage, resulting in swelling of the extremities distal to the infected lymph node. these symptoms of swollen limbs in patients with filarial disease have been termed elephantiasis. eradication of adult worms in lymph nodes is not frequently possible, and commonly a much extended course of medical therapy is required for it to be successful [ ] . new methods of curing anthrax have become of pressing interest following the recent outbreak of anthrax infections and deaths in the usa as a result of terrorism. in anthrax infection, endospores from bacillus anthracis that gain access to the body are phagocytosed by macrophages and carried to regional lymph nodes, where the endospores germinate inside the macrophages and become vegetative bacteria [ ] . in one report, computed tomography of the chest was performed on eight patients infected with inhalational anthrax; mediastinal lymphadenopathy was found in seven of the eight patients [ ] . in another case report, the anthrax bacillus was shown to be rapidly sterilised within the blood stream after initiation of antibiotic therapy. however, viable anthrax bacteria were still present in postmortem mediastinal lymph node specimens [ ] . treatment and control of these diseases are hard to accomplish because of the limited access of drugs to mediastinal nodes using common pathways of drug delivery. also, the anatomical location of mediastinal nodes makes them a difficult target for external beam irradiation. newer methods to target antituberculosis drugs to these lymph nodes could possibly shorten the duration of drug therapy.
tb requires lengthy treatment, a minimum of approximately months, probably because of the difficulty of delivering drugs into the tubercular lesions. tb infection is caused by mycobacteria that invade and grow chiefly in phagocytic cells. lymph node tb is the most common form of extrapulmonary tb, accounting for approximately . %. it is frequently found to spread from the lungs to lymph nodes. in one study, total tb lymph node involvement was found in % of the intrathoracic lymph nodes, % of the cervical lymph nodes and % of the axillary lymph nodes [ ] . targeted delivery of drugs can be achieved utilising carriers with a specified affinity for the target tissue. there are two approaches to targeting, i.e. passive and active. in passive targeting, most of the carriers accumulate at the target site during continuous systemic circulation to deliver the drug substance, a behaviour which depends highly upon the physicochemical characteristics of the carriers. much effort has also been concentrated on active targeting, which involves delivering drugs more actively to the target site. passive targeting involves the transport of carriers through leaky tumour vasculature into the tumour interstitium and cells by convection or passive diffusion. nanocarriers and drug then accumulate at the target site by the enhanced permeation and retention (epr) effect [ ] . the epr effect is most prominent in cancer targeting. moreover, the epr effect applies to almost all fast-growing solid tumours [ ] . the epr effect is most beneficial if nanocarriers can escape immune surveillance and circulate for a long period. very high local concentrations of drug-loaded nanocarriers can be attained at the target site, for example, about - to -fold higher than in normal tissue within - days [ ] .
however, there are some limitations to passively targeting the tumour: first, the degree of tumour vascularisation and angiogenesis is important for passive targeting of nanocarriers [ ] . second, due to the poor lymphatic drainage in tumours, the interstitial fluid pressure increases, which relates nanocarrier size to the epr effect: larger, long-circulating nanocarriers ( nm) are retained in the tumour, whereas smaller molecules easily diffuse out [ ] . active targeting is based upon the attachment of targeting ligands on the surface of the nanocarrier for binding to appropriate receptors expressed at the target site. the ligand binds specifically to a receptor overexpressed in particular diseased cells or tumour vasculature and not expressed by normal cells. in addition, targeted receptors should be present uniformly on all targeted cells. targeting ligands are either monoclonal antibodies (mabs) and antibody fragments or non-antibody ligands (peptidic or not). these can also be termed ligand-targeted therapeutics [ , ] . targeting approaches for lymphatic targeting are shown in fig. . . current research is focussed on two types of carriers, namely, colloidal carriers and polymeric carriers. targeting strategies for lymphatics are shown in fig. . much effort has been concentrated on achieving lymphatic targeting of drugs using colloidal carriers. the physicochemical nature of the colloid itself has been shown to be of particular relevance, with the main considerations being the size of the colloid and its hydrophobicity. the major purpose of lymphatic targeting is to provide effective anticancer chemotherapy that prevents the metastasis of cancer cells by accumulating the drug in the regional lymph node. emulsions are probably the best-known particulate carriers, with a comparatively long history of research, and have been widely used as carriers for lymph targeting. hashida et al.
demonstrated that injection of water-in-oil (w/o) or oil-in-water (o/w) emulsions favoured lymphatic transport of mitomycin c via the intraperitoneal and intramuscular routes, and uptake into the regional lymphatics was reported in the order o/w > w/o > aqueous solution. a nanoparticle-in-oil emulsion system, containing an anti-filarial drug in gelatin nanoparticles, was studied for enhancing lymphatic targeting [ ] . a pirarubicin and lipiodol emulsion formulation was developed for treating gastric cancer and metastatic lymph nodes [ , ] . after endoscopic injection of the pirarubicin-lipiodol emulsion, the drug was retained over days at the injection site and in the regional lymph node. hauss et al. explored the lymphotropic potential of emulsions and self-emulsifying drug delivery systems (sedds). they investigated the effects of a range of lipid-based formulations on the bioavailability and lymphatic transport of ontazolast following oral administration to conscious rats and found that all the lipid formulations increased the bioavailability of ontazolast relative to the control suspension; the sedds promoted more rapid absorption, and maximum lymphatic transport was found with the emulsion [ , ] . lymphatic delivery of drug-encapsulated liposomal formulations has been investigated extensively in the past decade. liposomes possess ideal features for delivering therapeutic agents to the lymph nodes: their size, which prevents their direct absorption into the blood; the large amount of drugs and other therapeutic agents that liposomes can carry; and their biocompatibility. the utility of liposomes as a carrier for lymphatic delivery was first investigated by segal et al. in [ ] . orally administered drug-incorporated liposomes enter the systemic circulation via the portal vein and intestinal lymphatics.
drugs entering the intestinal lymphatics through the intestinal lumen avoid the liver and first-pass metabolism, as they first migrate to lymphatic vessels and draining lymph nodes before entering the systemic circulation. lymphatic uptake of carriers via the intestinal route increases the bioavailability of a number of drugs. for oral delivery of drug-encapsulated liposomal formulations, intestinal absorbability and stability are the primary formulation concerns. ling et al. evaluated oral delivery of a poorly bioavailable hydrophilic drug, cefotaxime, in three different forms: a liposomal formulation, aqueous free drug and a physical mixture of the drug and empty liposomes [ ] . the liposomal formulation of the drug turned out to exhibit a . -fold increase in oral bioavailability compared to the aqueous dosage and a . -fold increase compared to the physical mixture. they also reported that the liposomal formulation leads to a significant enhancement of the lymphatic localisation of the drug relative to the other two formulations. as a result, liposome systems have emerged as useful carriers for poorly bioavailable hydrophilic drugs, promoting their lymphatic transport in the intestinal lymph as well as their systemic bioavailability. conventional liposomal formulations contain anticancer drugs incorporated in them for intravenous infusion in treating various types of cancers. doxil, a chemotherapeutic formulation of pegylated liposomes of doxorubicin, is widely used as first-line therapy for aids-related kaposi's sarcoma, breast cancer, ovarian cancer and other solid tumours [ - ] . liposomal delivery of the anticancer drug actinomycin d via intratesticular injection has shown greater concentration of the drug in the local lymph nodes. furthermore, a study by hirnle et al. found liposomes to be a better carrier for intralymphatically delivered drugs compared with bleomycin emulsions [ ] .
systemic liposomal chemotherapy is preferred mainly because of its reduced side effects compared to standard therapy and improved protection of the anticancer drugs from enzymatic digestion in the systemic circulation. effective chemotherapy by the pulmonary route could overcome various shortcomings associated with systemic chemotherapy, like serious non-targeted toxicities and poor drug penetration into the lymphatic vessels and surrounding lymph nodes, and the first-pass clearance associated with oral delivery, by concentrating drugs in the lungs and draining lymphatics. latimer et al. developed liposomes of paclitaxel and a vitamin e analogue, α-tocopheryloxy acetic acid (α-tea), in an aerosol formulation for treating murine mammary tumours and metastases [ ] . similarly, lawson et al. performed a comparative study of the anti-proliferative efficacy of a -nitro-camptothecin ( -nc)-encapsulated dilauroylphosphatidylcholine liposomal delivery, α-tea and a combination therapy of -nc and α-tea in a metastatic murine mammary tumour model. liposome-encapsulated individual as well as combination treatments were delivered via an aerosol for curing metastases of the lungs and of the surrounding lymph nodes. the animals treated with the combination therapy were found to have fewer proliferative cells compared to the animals treated with -nc alone when immunostained with ki- . the in vivo anticancer efficacy studies demonstrated that the combination treatment greatly hindered tumour progression compared to each treatment alone, leading to a prolonged survival rate [ ] . high levels of drugs could be targeted to lymph nodes containing tb using liposomal antituberculosis drug therapy [ ] . deep lung lymphatic drainage could also be visualised using mtc radioactive marker-incorporated liposomes. botelho et al. delivered an aerosolised nanoradioliposomal formulation to wild boars and observed their deep lung lymphatic network and surrounding lymph nodes [ ] .
this technique has also offered new information on the complicated structure of the lymphatic network and has emerged as a new, non-invasive molecular imaging technique for the diagnosis of early dissemination of lung cancers, as compared to conventional computed tomography. solid lipid nanoparticles (sln) could be a good formulation strategy for incorporating drugs with poor oral bioavailability due to low solubility in the gi tract or pre-systemic hepatic metabolism (first-pass effect), permitting transport into the systemic circulation through the intestinal lymphatics. bargoni et al. have performed various studies on the absorption and distribution of sln after duodenal administration [ - ] . in one study, i- -iodoheptadecanoic acid-labelled drug-free sln were delivered into the duodenal lumen of fed rats, and transmission electron microscopy and photon correlation spectroscopy results of the lymph and blood samples verified the transmucosal transport of sln [ ] . in a later study of tobramycin-loaded sln after duodenal administration, the improvement of drug absorption and bioavailability was ascribed mostly to the favoured transmucosal transport of sln to the lymph compared to the blood [ ] . the same group conducted a study using idarubicin-loaded sln administered via the duodenal rather than the intravenous route and observed enhancement in drug bioavailability [ ] . reddy et al. prepared etoposide-loaded tripalmitin (etpl) sln radiolabelled with mtc and administered the etpl nanoparticles subcutaneously, intraperitoneally and intravenously to mice bearing dalton's lymphoma tumours; h after subcutaneous administration, gamma scintigraphy and the radioactivity measurements showed that the etpl sln exhibited a clearly higher degree of tumour uptake via the subcutaneous route ( - and -fold higher than that of the intraperitoneal and intravenous routes, respectively) and reduced accumulation in reticuloendothelial system organs [ ] .
targeting therapies have great potential in small cell lung cancer, considering that intrathoracic lymph node metastasis occurs in approximately % of limited-stage patients and nearly % of extensive-stage patients [ ] . in the case of non-small cell lung cancer, extensive lymphatic metastasis is seen in greater than % of stage iv patients [ ] . videira et al. compared the biodistribution of inhaled mtc-d,l-hexamethylpropyleneamine oxime (hmpao)-radiolabelled sln with that of the free tracer administered through the same route, and gamma scintigraphic results indicated that the radiolabelled sln were primarily cleared from the lungs via the lymphatics [ , ] . nanocapsules tend to be the most promising approach for lymphatic targeting because of the possibility of attaining distinct qualities with an easy manufacturing process. nanocapsules coated with hydrophobic polymers can be easily captured by lymphatic cells in the body when administered, because the hydrophobic particle is generally recognised as a foreign substance. the lymphatic targeting ability of poly(isobutylcyanoacrylate) nanocapsules encapsulating -( -anthroxy) stearic acid upon intramuscular administration was evaluated and compared with three conventional colloidal carriers [ ] . an in vivo study in rats proved that poly(isobutylcyanoacrylate) nanocapsules were retained in the right iliac regional lymph nodes in comparison with other colloidal carriers following intramuscular administration. for effective targeted and sustained delivery of drugs to lymph, several polymeric particles have been designed and studied. the polymers are categorised in two types based on their origin: natural polymers like dextran, alginate, chitosan, gelatin, pullulan and hyaluronan, or synthetic polymers like plga, pla and pmma. dextran, a natural polysaccharide, has been used as a carrier for a range of drug molecules due to its outstanding biocompatibility. bhatnagar et al.
synthesised cyclosporine a-loaded dextran acetate particles labelled with mtc. these particles gradually distributed cyclosporine a throughout the lymph nodes following subcutaneous injection into the footpad of rats [ ] . a dextran (average molecular weights of , and kda)-conjugated lymphotropic delivery system of mitomycin c has been studied, and it was reported that after intramuscular injection in mice, the mitomycin c-dextran conjugates were retained for a longer period in regional lymph nodes, nearly h, while the free mitomycin was quickly cleared. hyaluronan, also called hyaluronic acid, is a natural biocompatible polymer that follows lymphatic drainage from the interstitial spaces. cai et al. demonstrated a novel intralymphatic drug delivery method, synthesising a cisplatin-hyaluronic acid conjugate for breast cancer treatment. following subcutaneous injection into the upper mammary fat pad of female rats, most of the carrier localised in the regional nodal tissue compared to the standard cisplatin formulation [ ] . poly(lactide-co-glycolide), a synthetic polymer used to prepare biodegradable nanospheres, has been reported to deliver drugs and diagnostic agents to the lymphatic system. similarly, nanospheres coated with block copolymers of poloxamers and poloxamines and radiolabelled with in-oxine have been used to trace the nanoparticles in vivo. upon s.c. injection, the regional lymph node showed a maximum uptake of % of the administered dose [ ] . dunne et al. synthesised a conjugate of cis-diamminedichloroplatinum(ii) (cddp) and the block copolymer poly(ethylene oxide)-block-poly(lysine) (peo-b-plys) for treating lymph node metastasis. one animal treatment with wt.% cddp-polymer resulted in limited tumour growth in the draining lymph nodes and prevention of systemic metastasis [ ] .
johnston and coworkers designed a biodegradable intrapleural (ipl) implant of paclitaxel consisting of a gelatin sponge impregnated with poly(lactide-co-glycolide) (plga-ptx) for targeting the thoracic lymphatics. in a rat model, this system exhibited lymphatic targeting capability and showed sustained drug release properties [ ] . kumanohoso et al. designed a new drug delivery system for bleomycin by loading it into a small cylinder of biodegradable polylactic acid to target lesions. this system showed significantly higher antitumour effect compared to bleomycin solution and no treatment [ ] . to treat lesions, a new biodegradable colloidal particulate-based nanocarrier system was designed to target the thoracic lymphatics and lymph nodes. various nano- and microparticles of charcoal, polystyrene and poly(lactide-co-glycolide) were studied for lymphatic distribution after intrapleural implantation in rats, and h after intrapleural injection, lymphatic uptake was observed [ ] . kobayashi et al. utilised dendrimer-based contrast agents for dynamic magnetic resonance lymphangiography [ ] . gadolinium (gd)-containing dendrimers of different sizes and molecular structures (pamam-g , pamam-g and dab-g ) (pamam, polyamidoamine; dab, diaminobutyl) were used as contrast agents. size and molecular structure play a great role in the distribution and pharmacokinetics of dendrimers. for example, pamam-g , when injected intravenously, had a comparatively long life in the circulatory system with minimal leakage out of the vessels, whereas pamam-g cleared rapidly from the systemic circulation due to rapid renal clearance but appeared immediately in the lymphatic circulation. the smaller-sized dab-g showed greater accumulation and retention in lymph nodes, useful for lymph node imaging using mr-lg. gadomer- and gd-dtpa-dimeglumine (magnevist) were evaluated as controls. imaging experiments revealed that all of the reagents are able to visualise the deep lymphatic system except gd-dtpa-dimeglumine.
to visualise the lymphatic vessels and lymph nodes, pamam-g and dab-g were used, respectively. while pamam-g provided good contrast of both the nodes and connecting vessels, gadomer- was able to visualise lymph nodes, but not as clearly as the gd-based dendrimers. kobayashi also delivered various gd-pamam (pamam-g , pamam-g , pamam-g , pamam-g ) and dab-g dendrimers to the sentinel lymph nodes and evaluated their visualisation against other nodes. the g dendrimer provided excellent opacification of sentinel lymph nodes and was able to be absorbed and retained in the lymphatic system [ ] . using a combination of mri and fluorescence with pamam-g -gd-cy, the sentinel nodes were more clearly observed, signifying the potential of dendrimers as a platform for dual imaging. kobayashi et al. further overcame the sensitivity and depth limitations of each individual method by the simultaneous use of two modalities (radionuclide and optical imaging). making use of pamam-g dendrimers conjugated with near-infrared (nir) dyes and an in radionuclide probe, multimodal nanoprobes were developed for radionuclide and multicolour optical lymphatic imaging [ , ] . later, kobayashi also proposed the use of quantum dots for labelling cancer cells and dendrimer-based optical agents for visualising lymphatic drainage and identifying sentinel lymph nodes [ ] . polylysine dendrimers have been best used for targeting the lymphatic system and lymph nodes. carbon nanotubes (cnt) possess various mechanochemical properties, like high surface area, mechanical strength and thermal and chemical stability, which make them versatile carriers for drugs, proteins, radiologicals and peptides to target tumour tissues.
hydrophilic multiwalled carbon nanotubes (mwnts) coated with magnetic nanoparticles (mn-mwnts) have emerged as an effective delivery system for lymphatic targeting: following subcutaneous injection of these particles into the left footpad of sprague dawley rats, the left popliteal lymph nodes were dyed black. mn-mwnts were preferentially absorbed by lymphatic vessels and subsequently transferred into lymph nodes, and no uptake was seen in the chief internal organs such as the liver, spleen, kidney, heart and lungs. gemcitabine loaded in these particles was evaluated for its lymphatic delivery efficiency, and mn-mwnts-gemcitabine displayed the maximum concentration of gemcitabine in the lymph nodes [ ] . mcdevitt et al. synthesised tumour-targeting water-soluble cnt constructs by covalent attachment of monoclonal antibodies like rituximab and lintuzumab, using , , , -tetraazacyclododecane- , , , -tetraacetic acid (dota) as a metal ion chelator, while the fluorescent probe was fluorescein. cnt-([ in] dota) (rituximab) specifically targeted a disseminated human lymphoma in in vivo trials compared to the controls cnt-([ in] dota) (lintuzumab) and [ in]rituximab [ ] . tsuchida and coworkers evaluated the drug delivery efficiency of water-dispersed carbon nanohorns in a non-small cell lung cancer model. a polyethylene glycol (peg)-doxorubicin conjugate bound to oxidised single-wall carbon nanohorns (oxswnhs), injected intratumourally into mice bearing human non-small cell lung cancer (nci-h ), caused a significant retardation of tumour growth. histological analyses showed that migration of oxswnhs to the axillary lymph node occurred, probably by means of interstitial lymphatic fluid transport; this node is a major site of breast cancer metastasis near the tumour [ ] . shimada et al. described a silica particle-based lymphatic drug delivery system of bleomycin and compared its therapeutic efficacy to that of free bleomycin solution in a transplanted tumour model in animals.
Silica particle-adsorbed bleomycin showed a considerable inhibitory effect on tumour growth and lymph node metastasis compared to free bleomycin solution [ ] . Activated carbon particles of aclarubicin are used for adsorption and sustained release into lymph nodes. Upon subcutaneous administration into the fore footpads of rats, these particles showed significantly elevated distribution of aclarubicin to the axillary lymph nodes compared to an aqueous solution of the drug [ ] . Activated carbon particles of aclacinomycin A, adriamycin, mitomycin C and pepleomycin have also been used by another group for adsorption; a higher level of drug concentration was maintained in the new dosage form than in the solution form [ ] . Antibody-drug conjugates enhance the cytotoxic activity of anticancer drugs by conjugating them with antibodies. Antibodies conjugated with cytostatic drugs such as calicheamicin have been used for the treatment of various lymphomas, including non-Hodgkin B-cell lymphoma (NHL), follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL) [ - ] . The CD B-cell marker is expressed on the surface membrane of pre-B-lymphocytes and mature B-lymphocytes. The anti-CD mAb rituximab (Rituxan) is now the most promising antibody for the treatment of non-Hodgkin B-cell lymphomas (B-NHL) [ ] . Rituximab-conjugated calicheamicin elevated the antitumour activity of rituximab against human B-cell lymphoma (BCL) xenografts in preclinical models [ ] . CD is a B-lymphoid lineage-specific differentiation antigen expressed on the surface of both normal and malignant B-cells. Hence, CD -specific antibodies could be effective in delivering chemotherapeutic drugs to malignant B-cells. Moreover, CD (Siglec- ) antibodies targeting CD are suited to a Trojan horse strategy: antibody-conjugated therapeutic agents bind to the Siglec and are carried efficiently into the cell [ ] .
Considerable interest has been seen in the clinical progress of conjugated anti-CD antibodies, especially inotuzumab ozogamicin (CMC- ) [ ] . CD is expressed in the malignant Hodgkin and Reed-Sternberg cells of classical Hodgkin lymphoma (HL) and in anaplastic large-cell lymphoma. Younes and Bartlett reported an ongoing phase I dose-escalation trial in relapsed and refractory HL patients with Seattle Genetics' SGN- , a novel anti-CD antibody-monomethylauristatin E conjugate. SGN- was stable in the blood and released the conjugate only upon internalisation into CD -expressing tumour cells [ ] . Huang et al. constructed another antibody-drug conjugate (anti-HER /neu-IgG -IFNα) and examined its effect on a murine B-cell lymphoma, C , expressing human HER /neu; it significantly inhibited C /HER tumour growth in vivo [ ] . Hybrid systems use a combination of two or more delivery forms for effective targeting. Khatri et al. prepared plasmid DNA-loaded chitosan nanoparticles for nasal mucosal immunisation against hepatitis B and investigated their in vivo efficacy. Chitosan-DNA nanoparticles prepared by the coacervation process adhere to the nasal or gastrointestinal epithelia and are easily transported to the nasal-associated lymphoid tissue (NALT) and the Peyer's patches of the gut-associated lymphoid tissue (GALT), both IgA inductive sites [ ] ; there, chitosan-DNA may be taken up by M cells and transported across the mucosal boundary, thereby transfecting immune cells within NALT or GALT [ ] . One study demonstrated the targeting of three peptides containing sequences that bind to cell markers expressed in the tumour vasculature (p -NRP- and p -Flt- ) [ , ] and tumour lymphatics (p -LyP- ) [ ] , which were tested for their ability to target (nitrilotriacetic acid)-ditetradecylamine (NTA -DTDA)-containing liposomes to subcutaneous B -F tumours.
Significantly, a potent antitumour effect was seen after administration of doxorubicin-loaded PEG liposomes engrafted with p -NRP- . Hybrid liposomes (HL- ) composed of L-α-dimyristoylphosphatidylcholine and polyoxyethylene ( ) dodecyl ether, prepared by sonication, showed a remarkable reduction of tumour volume, verified in vivo, in model mice of acute lymphatic leukaemia (ALL) treated intravenously with HL- without drugs after subcutaneous inoculation of human ALL (MOLT- ) cells. Prolonged survival (> %) was noted in the ALL model mice after treatment with HL- without drugs [ ] . In one report, LyP- peptide-conjugated PEGylated liposomes loaded with fluorescein or doxorubicin were prepared for targeting and treating lymphatic metastatic tumours. The in vitro cellular uptake and in vivo near-infrared fluorescence imaging results confirmed that the LyP- -modified liposomes showed increased uptake by tumour cells and metastatic lymph nodes. In another study, the in vitro cellular uptake of LyP- -conjugated PEG-PLGA nanoparticles (LyP- -NPs) was about four times that of PEG-PLGA nanoparticles without LyP- (NPs). In vivo, lymph node uptake of LyP- -NPs in metastases was about eight times that of NPs, indicating LyP- -NPs as a promising carrier for target-specific drug delivery to lymphatic metastatic tumours [ ] . Currently, surgery, radiation therapy and chemotherapy are the principal methods of cancer treatment; gene therapies may act synergistically or additively with them. For example, one study demonstrated that replacement of the p (protein ) gene in p -deficient cancer cell lines enhanced the sensitivity of these cells to Ad-p (adenovirus-expressed protein ) and cisplatin (CDDP) and resulted in greater tumour cell death [ ] . Later, Son and Huang [ ] stated that treatment of CDDP-resistant tumour cells with CDDP increased the sensitivity of these cells to transduction by DNA-carrying liposomes. Also, Chen et al.
[ ] described that, to improve tumour killing, herpes simplex virus thymidine kinase (HSV-tk) and interleukin (IL) expression can be combined. On the whole, a greater therapeutic effect can be achieved by effectively combining conventional cancer treatments with gene therapy. Colloidal carriers in particular have emerged as potential agents for targeting the lymphatic system. Physicochemical properties affect the efficiency of colloid uptake into the lymphatic system [ ] . These properties include size, number of particles, surface charge, molecular weight and colloid lipophilicity. Physicochemical properties can be altered by adsorption of hydrophilic polymers such as poloxamers and poloxamines to the particle surface. Such modifications change the biodistribution of particles in vivo, in particular enabling avoidance of the reticuloendothelial system (RES) upon intravenous administration [ , ] . One study suggested that opsonisation may cause alteration of the particle surface in vivo [ ] . Size may be an important factor in defining the behaviour of particulates after subcutaneous injection. Small particles with diameters less than a few nanometres are generally exchanged through the blood capillaries, whereas larger particles with diameters up to a few tens of nanometres are absorbed into the lymph capillaries; particles over a few hundred nanometres in size remain trapped in the interstitial space for a long time [ ] . Christy et al. have shown a relationship between colloid size and ease of injection-site drainage using model polystyrene nanospheres after subcutaneous administration to the rat [ ] . The results showed the distribution of polystyrene nanospheres in the size range - nm, h after administration; - % of the recovered dose was retained at the administration site, and as particle diameter increased, drainage became slower. It has been proposed that the optimum colloid size range for lymphoscintigraphic agents is - nm [ ] .
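The size dependence described above amounts to a simple decision rule. The following sketch encodes it in Python; note that the review gives only qualitative ranges ("a few" nm, "a few tens" of nm, "a few hundred" nm), so the numeric cut-offs below are placeholder assumptions for illustration, not values from the text.

```python
def particle_fate(diameter_nm, blood_cutoff=10, lymph_cutoff=100):
    """Classify the likely fate of a colloid after subcutaneous injection.

    The cut-offs (10 nm, 100 nm) are illustrative placeholders, not
    measured values from the review.
    """
    if diameter_nm < blood_cutoff:
        return "exchanged through blood capillaries"
    if diameter_nm <= lymph_cutoff:
        return "absorbed into lymph capillaries"
    return "trapped in the interstitial space"

for d in (5, 50, 500):
    print(d, "nm:", particle_fate(d))
```

With these assumed thresholds, a 50 nm colloid falls in the lymphatic-uptake window, which is why that size range is favoured for lymphoscintigraphic agents.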
Size is of less importance when colloids in the nanometre size range are administered intraperitoneally (i.p.), as drainage occurs directly from a cavity into the initial lymphatics, so no diffusion through the interstitial space is required [ ] . The size limit of the open junctions of the initial lymphatic wall is the only barrier to uptake from the peritoneal cavity into the lymphatics [ ] . A greater number of particles at the injection site decreases their rate of drainage, owing to increased obstruction of their diffusion through the interstitial space [ , ] . Scientists at Nottingham University investigated this effect using polystyrene nanospheres of nm. Following administration to the rat, the concentration range of nanospheres was approximately . - . mg/ml. Lower lymphatic uptake was seen on increasing the concentration of nanospheres in the injection volume, owing to slower drainage from the injection site. The effect of injection volume has been studied by injecting oily vehicles intramuscularly into the rat: increasing the volume of sesame oil accelerated oil transport into the lymphatic system. For s.c. administration, volumes of aqueous polystyrene particle suspensions have been investigated in the range - μl [ ] . Surface charge studies have been performed using liposomes as the colloidal carrier. The surface charge of liposomes affected their lymphatic uptake from s.c. and i.p. injection sites: negatively charged liposomes showed faster drainage than positively charged liposomes after i.p. administration [ ] . Patel et al. also indicated that liposome localisation in the lymph nodes followed a particular order: negative > positive > neutral [ ] . A macromolecule of high molecular weight has a decreased ability to exchange across blood capillaries, so lymphatic drainage becomes the route of clearance from the injection site; there is a linear relationship between the molecular weight of macromolecules and the proportion of the dose absorbed by the lymphatics.
For a compound to be absorbed by the lymphatics, the molecular weight should range between , and , [ , ] . The effect of molecular weight becomes negligible when targeting carriers to the lymphatic system, as the molecular weight of a colloidal carrier is generally less than , Da. The most important determinant of the phagocytic response, and hence of lymphatic uptake, is the lipophilicity of a colloid [ ] . Opsonins generally associate with lipophilic rather than hydrophilic surfaces; hence, hydrophilic particles show reduced phagocytosis [ ] . Hydrophobic polystyrene nanospheres adsorbed with hydrophilic block copolymers prior to i.v. administration showed a drastic reduction in phagocytosis [ ] . For polystyrene nanospheres of -nm diameter with PEO chains of poloxamers and poloxamines adsorbed onto the particle surface, the relationship between interstitial injection-site drainage and lymph node uptake has been described in the rat [ ] . Uncoated nanospheres of this diameter showed reduced drainage from the injection site, with % of the administered dose remaining after h. The adsorption of block copolymers can enhance drainage from the injection site, such that the level remaining at the injection site may be as little as % after h with very hydrophilic polymers such as poloxamine . Uptake of nanospheres into the regional lymph nodes may also be improved by the adsorption of block copolymers with intermediate lengths of polyoxyethylene, such as poloxamine ; this polymer may sequester up to % of the given dose in the lymph nodes after h [ ] . Surface modification can thus prove an effective strategy for targeting the lymphatic system; its influence can be summarised as follows. Coating a carrier with a hydrophilic, sterically stabilising PEG layer can successfully enhance lymphatic absorption, reducing specific interactions of the particle with the interstitial surroundings and inhibiting the formation of overly large particle structures [ ] .
Surface modification of liposomes with PEG does not, however, have a significant effect on lymph node uptake. Small liposomes coated with PEG showed the greatest clearance from the s.c. injection site, with small -nm PEG-coated liposomes having < % remaining at the injection site at h. Larger neutral and negatively charged liposomes were cleared less, with > % remaining at the initial s.c. injection site. However, the smaller fraction of large liposomes cleared from the injection site was compensated for by better retention in the lymph node [ ] . Oussoren et al. reported that the amount of liposomes cleared from the injection site was somewhat greater with PEG-coated liposomes [ ] . This improved clearance did not result in improved lymph node retention, because the fraction of PEG liposomes retained by the lymph node was decreased. Phillips et al. also observed slightly improved clearance of PEG-coated liposomes from the s.c. injection site [ ] . Porter and coworkers demonstrated that PEGylation of poly-L-lysine dendrimers resulted in better absorption from s.c. injection sites and stated that the extent of lymphatic transport may be improved by increasing the size of the PEGylated dendrimer complex. They estimated the lymphatic uptake and lymph node retention properties of several generation-four dendrimers coated with PEG or -benzene sulphonate after subcutaneous administration in rats. For this surface modification study, three types of PEG, with molecular weights of , or , Da, were used. PEG -derived dendrimers showed rapid and complete absorption into the blood when injected subcutaneously, and only % of the total given dose was found in the pooled thoracic lymph over h, whereas PEG - and PEG -derived dendrimers showed less absorption into the blood, and a higher amount ( %) was recovered in the lymphatics over h. However, the benzene sulphonate-capped dendrimer was not well absorbed into either the blood or the lymph following subcutaneous injection [ ] .
Carriers capped with nonspecific human antibodies as ligands showed greater lymphatic uptake and lymph node retention than uncoated carriers at the s.c. site. Liposomes coated with the antibody IgG have been shown to increase lymph node localisation of liposomes to . % of the injected dose at h, but this level decreased to % by h [ ] . In one study, liposomes containing positively charged lipids had approximately - times the lymph node localisation (up to . % of the injected dose) of liposomes containing neutral or negatively charged lipids ( . % of the injected dose) [ ] . Attachment of mannose to the surface of a liposome increased lymph node uptake threefold compared to control liposomes [ ] . Another study examined dried liposomes entrapping HBsAg, with their surfaces modified with galactose; a pharmacokinetic study in rats showed that the galactosylated liposomes delivered higher amounts of HBsAg to the regional lymph nodes than the ungalactosylated formulations [ ] . Lectin is another ligand that can be attached to carriers for improved targeting to the intestinal lymphatics. Bovine serum albumin microspheres containing acid phosphatase as a model protein, and polystyrene microspheres, were conjugated with the mouse M-cell-specific Ulex europaeus lectin. Ex vivo results showed preferential binding of the lectin-conjugated microspheres to the follicle-associated epithelium, indicating that coupling ligands such as lectins specific to cells of the follicle-associated epithelium can improve the targeting of encapsulated candidate antigens to the Peyer's patches of the intestine for better oral delivery [ ] . To improve carrier retention in lymph nodes, a new method of increasing the lymphatic uptake of subcutaneously injected liposomes utilises the high-affinity ligands biotin and avidin. Biotin is a naturally occurring cofactor and avidin is a protein derived from eggs; the two have extremely high affinity for each other.
For instance, upon injection, the avidin and the biotin liposomes move into the lymphatic vessels. Biotin liposomes that migrate through the lymphatic vessels meet the avidin, resulting in an aggregate that becomes trapped in the lymph nodes [ , ] . The biotin liposome/avidin system has promising potential as a therapeutic delivery agent for lymph nodes; it can be applied not only to s.c. targeting of lymph nodes but also to intracavitary lymph node targeting [ ] . Different ligands and their applications in lymphatic targeting are represented in Table . The lymphatics have the potential to play a major role in anticancer treatment, as lymphatic spread is recognised to precede haematological spread in many cancers, including melanoma, breast, colon, lung and prostate cancers. Currently, the focus is on the development of drug carriers that can localise chemotherapy to the lymphatic system, thus improving the treatment of localised disease while minimising the exposure of healthy organs to cytotoxic drugs. The delivery of novel carriers to lymph nodes for therapeutic purposes holds much promise. Given the importance of the lymphatic route in metastasis, such delivery systems may have great potential for the targeted delivery of various therapeutic agents to tumours and their metastatic lymph nodes. Various delivery systems have been discussed here, but colloidal carriers, especially liposomes, have been the carriers of choice to date. The purpose of this review is to describe improved and effective lymphotropic systems of a quality satisfactory for clinical use, and to establish preparation methods applicable to industrial production. Surface-engineered lymphotropic systems may prove effective carriers for anti-HIV, anticancer and oral vaccine delivery in the near future.
(Table , fragments recovered from extraction: delivery of antigens to gut-associated lymphoid tissue (GALT), intestinal delivery [ ] ; microparticles, active targeting of peripheral lymph nodes, Doppler ultrasonography contrast agent [ ] .)
(Table , continued: lymph vaccine delivery [ - ] ; block copolymer of poloxamine and poloxamer nanospheres, regional lymph nodes [ ] ; LyP- nanoparticles and liposomes, targeted to lymphatic vessels and to tumour cells within hypoxic areas, antitumour [ ] ; liposomes targeting the lymph node, mediastinal lymph node targeting [ ] ; liposomes targeting the lymph node, increased lymph node retention [ , ] .)
references (titles recovered from extraction; numbering lost):
- the physiology of the lymphatic system
- the anatomy and development of the jugular lymph sacs in the domestic cat (felis domestica)
- on the origin of the lymphatic system from the veins and the development of the lymph hearts and thoracic duct in the pig
- dual origin of avian lymphatics
- lineage tracing demonstrates the venous origin of the mammalian lymphatic vasculature
- an essential role for prox in the induction of the lymphatic endothelial cell phenotype
- prox function is required for the development of the murine lymphatic system
- live imaging of lymphatic development in the zebrafish
- developmental and pathological lymphangiogenesis: from models to human disease
- tumor lymphangiogenesis and melanoma metastasis
- cardiovascular physiology
- new insights into the molecular control of the lymphatic vascular system and its role in disease
- advanced colloid-based systems for efficient delivery of drugs and diagnostic agents to the lymphatic tissues
- the structure of lymphatic capillaries in lymph formation
- specific adhesion molecules bind anchoring filaments and endothelial cells in human skin initial lymphatics
- focal adhesion molecules expression and fibrillin deposition by lymphatic and blood vessel endothelial cells in culture
- the second valve system in lymphatics
- evidence for a second valve system in lymphatics: endothelial microvalves
- ultrastructural studies on the lymphatic anchoring filaments
- new horizons for imaging lymphatic function
- lymphatic smooth muscle: the motor unit of lymph drainage
- the fine structure and functioning of tissue channels and lymphatics
- clinically oriented anatomy
- lymphangiogenesis in development and human disease
- acyclic nucleoside phosphonate analogs delivered in ph-sensitive liposomes
- liposomes for drug targeting in the lymphatic system
- liposomes to target the lymphatics by subcutaneous administration
- novel method of greatly enhanced delivery of liposomes to lymph nodes
- current concepts in lymph node imaging
- old friends, new ways: revisiting extended lymphadenectomy and neoadjuvant chemotherapy to improve outcomes
- targeted delivery of indinavir to hiv- primary reservoirs with immunoliposomes
- studies on lymphoid tissue from hiv-infected individuals: implications for the design of therapeutic strategies
- lymphoid tissue targeting of anti-hiv drugs using liposomes
- a randomized clinical trial comparing single- and multi-dose combination therapy with diethylcarbamazine and albendazole for treatment of bancroftian filariasis
- infección bacteriana por ántrax (bacterial infection by anthrax)
- bioterrorism-related inhalational anthrax: the first cases reported in the united states
- fatal inhalational anthrax in a -year-old connecticut woman
- extrapulmonary tuberculosis: clinical and epidemiologic spectrum of cases
- nanoparticles for drug delivery in cancer treatment
- polymeric drugs for efficient tumor-targeted drug delivery based on epr-effect
- exploiting the enhanced permeability and retention effect for tumor targeting
- drug targeting and tumor heterogeneity
- does a targeting ligand influence nanoparticle tumor localization or uptake?
- high affinity restricts the localization and tumor penetration of single-chain fv antibody molecules
- vcam- directed immunoliposomes selectively target tumor vasculature in vivo
- lymphatic targeting with nanoparticulate system
- a lymphotropic colloidal carrier system for diethylcarbamazine: preparation and performance evaluation
- evaluation of endoscopic pirarubicin-lipiodol emulsion injection therapy for gastric cancer
- targeted lymphatic transport and modified systemic distribution of ci- , a lipophilic lipid-regulator drug, via a formulation approach
- self-emulsifying drug delivery systems (sedds) of coenzyme q : formulation development and bioavailability assessment
- liposomes as vehicles for the local release of drugs
- enhanced oral bioavailability and intestinal lymphatic transport of a hydrophilic drug using liposomes
- delivery of liposomal doxorubicin (doxil) in a breast cancer tumor model: investigation of potential enhancement by pulsed-high intensity focused ultrasound exposure
- reduced cardiotoxicity and comparable efficacy in a phase iii trial of pegylated liposomal doxorubicin hcl (caelyx™/doxil®) versus conventional doxorubicin for first-line treatment of metastatic breast cancer
- doxil offers hope to ks sufferers
- liposomal doxorubicin (doxil): in vitro stability, pharmacokinetics, imaging and biodistribution in a head and neck squamous cell carcinoma xenograft model
- caelyx/doxil for the treatment of metastatic ovarian and breast cancer
- patent blue v encapsulation in liposomes: potential applicability to endolymphatic therapy and preoperative chromolymphography
- aerosol delivery of liposomal formulated paclitaxel and vitamin e analog reduces murine mammary tumor burden and metastases
- novel vitamin e analogue and -nitro-camptothecin administered as liposome aerosols decrease syngeneic mouse mammary tumor burden and inhibit metastasis
- use of liposome preparation to treat mycobacterial infections
- nanoradioliposomes molecularly modulated to study the lung deep lymphatic drainage
- solid lipid nanoparticles in lymph and plasma after duodenal administration to rats
- duodenal administration of solid lipid nanoparticles loaded with different percentages of tobramycin
- transmucosal transport of tobramycin incorporated in sln after duodenal administration to rats. part i: a pharmacokinetic study
- pharmacokinetics and tissue distribution of idarubicin-loaded solid lipid nanoparticles after duodenal administration to rats
- influence of administration route on tumor uptake and biodistribution of etoposide loaded solid lipid nanoparticles in dalton's lymphoma tumor bearing mice
- metastatic patterns in small-cell lung cancer: correlation of autopsy findings with clinical parameters in patients
- metastatic pattern in non-resectable non-small cell lung cancer
- solid lipid nanoparticles (sln) for controlled drug delivery: a review of the state of the art
- lymphatic uptake of pulmonary delivered radiolabelled solid lipid nanoparticles
- inflammation imaging using tc- m dextran
- intralymphatic chemotherapy using a hyaluronan-cisplatin conjugate
- lymph node localisation of biodegradable nanospheres surface modified with poloxamer and poloxamine block co-polymers
- block copolymer carrier systems for translymphatic chemotherapy of lymph node metastases
- translymphatic chemotherapy by intrapleural placement of gelatin sponge containing biodegradable paclitaxel colloids controls lymphatic metastasis in lung cancer
- enhancement of therapeutic efficacy of bleomycin by incorporation into biodegradable poly-d,l-lactic acid
- targeting colloidal particulates to thoracic lymph nodes
- comparison of dendrimer-based macromolecular contrast agents for dynamic micro-magnetic resonance lymphangiography
- delivery of gadolinium-labeled nanoparticles to the sentinel lymph node: comparison of the sentinel node visualization and estimations of intra-nodal gadolinium concentration by the magnetic resonance imaging
- multimodal nanoprobes for radionuclide and five-color near-infrared optical lymphatic imaging
- a dendrimer-based nanosized contrast agent dual-labeled for magnetic resonance and optical fluorescence imaging to localize the sentinel lymph node in mice
- multicolor imaging of lymphatic function with two nanomaterials: quantum dot-labeled cancer cells and dendrimer-based optical agents
- hydrophilic multi-walled carbon nanotubes decorated with magnetite nanoparticles as lymphatic targeted drug delivery vehicles
- tumor targeting with antibody-functionalized, radiolabeled carbon nanotubes
- water-dispersed single-wall carbon nanohorns as drug carriers for local cancer chemotherapy
- enhanced efficacy of bleomycin adsorbed on silica particles against lymph node metastasis derived from a transplanted tumor
- selective distribution of aclarubicin to regional lymph nodes with a new dosage form: aclarubicin adsorbed on activated carbon particles
- carbon dye as an adjunct to isosulfan blue dye for sentinel lymph node dissection
- safety, pharmacokinetics, and preliminary clinical activity of inotuzumab ozogamicin, a novel immunoconjugate for the treatment of b-cell non-hodgkin's lymphoma: results of a phase i study
- therapeutic potential of cd -specific antibody-targeted chemotherapy using inotuzumab ozogamicin (cmc- ) for the treatment of acute lymphoblastic leukemia
- antibody-targeted chemotherapy with cmc- : a cd -targeted immunoconjugate of calicheamicin for the treatment of b-lymphoid malignancies
- preclinical anti-tumor activity of antibody-targeted chemotherapy with cmc- (inotuzumab ozogamicin), a cd -specific immunoconjugate of calicheamicin, compared with non-targeted combination chemotherapy with cvp or chop
- rituximab (rituxan®/mabthera®): the first decade
- cd -specific antibody-targeted chemotherapy of non-hodgkin's b-cell lymphoma using calicheamicin-conjugated rituximab
- siglecs as targets for therapy in immune-cell-mediated disease
- clinical activity of the immunoconjugate cmc- in b-cell malignancies: preliminary report of the expanded maximum tolerated dose (mtd) cohort of a phase study
- objective responses in a phase i dose-escalation study of sgn- , a novel antibody-drug conjugate (adc) targeting cd , in patients with relapsed or refractory hodgkin lymphoma
- targeted delivery of interferon-alpha via fusion to anti-cd results in potent antitumor activity against b-cell lymphoma
- chitosan for mucosal vaccination
- polysaccharide colloidal particles as delivery systems for macromolecules
- suppression of tumor growth and metastasis by a vegfr- antagonizing peptide identified from a phage display library
- antiangiogenic and antitumor activities of peptide inhibiting the vascular endothelial growth factor binding to neuropilin-
- a tumor-homing peptide with a targeting specificity related to lymphatic vessels
- chemotherapy with hybrid liposomes for acute lymphatic leukemia leading to apoptosis in vivo
- lyp- -conjugated nanoparticles for targeting drug delivery to lymphatic metastatic tumors
- successful treatment of primary and disseminated human lung cancers by systemic delivery of tumor suppressor genes using an improved liposome vector
- exposure of human ovarian carcinoma to cisplatin transiently sensitizes the tumor cells for liposome-mediated gene transfer
- combination gene therapy for liver metastasis of colon carcinoma in vivo
- polymeric microspheres as drug carriers
- the organ distribution and circulation time of intravenously injected colloidal carriers sterically stabilized with a block copolymer poloxamine
- fate of liposomes in vivo: a brief introductory review
- the characterisation of radio colloids used for administration to the lymphatic system
- effect of size on the lymphatic uptake of a model colloid system
- radiolabeled colloids and macromolecules in the lymphatic system
- electron microscopic studies on the peritoneal resorption of intraperitoneally injected latex particles via the diaphragmatic lymphatics
- lymphatic transport of liposome-encapsulated drugs following intraperitoneal administration: effect of lipid composition
- assessment of the potential uses of liposomes for lymphoscintigraphy and lymphatic drug delivery
- failure of -technetium marker to represent intact liposomes in lymph nodes
- effect of molecular weight on the lymphatic absorption of water-soluble compounds following subcutaneous administration
- surface engineered nanospheres with enhanced drainage into lymphatics and uptake by macrophages of the regional lymph nodes
- serum opsonins and liposomes: their interaction and opsonophagocytosis
- physicochemical principles of pharmacy
- targeting of colloids to lymph nodes: influence of lymphatic physiology and colloidal characteristics
- evaluation of [( m)tc] liposomes as lymphoscintigraphic agents: comparison with [( m)tc] sulfur colloid and [( m)tc] human serum albumin
- lymphatic uptake and biodistribution of liposomes after subcutaneous injection: iii. influence of surface modification with poly(ethyleneglycol)
- pegylation of polylysine dendrimers improves absorption and lymphatic targeting following sc administration in rats
- lymph node localization of non-specific antibody-coated liposomes
- modified in vivo behavior of liposomes containing synthetic glycolipids
- enhanced lymph node delivery and immunogenicity of hepatitis b surface antigen entrapped in galactosylated liposomes
- targeted delivery of antigens to the gut-associated lymphoid tissues: ex vivo evaluation of lectin-labelled albumin microspheres for targeted delivery of antigens to the m-cells of the peyer's patches
- avidin/biotin-liposome system injected in the pleural space for drug delivery to mediastinal lymph nodes
- pharmacokinetics and biodistribution of in-avidin and tc-biotin-liposomes injected in the pleural space for the targeting of mediastinal nodes
- folate-peg-ckk -dtpa, a potential carrier for lymph-metastasized tumor targeting
- nanotechnology in cancer therapeutics: bioconjugated nanoparticles for drug delivery
- molecular targeting of lymph nodes with l-selectin ligand-specific us contrast agent: a feasibility study in mice and dogs
- hyaluronan in drug delivery
- lymphatic targeting of zidovudine using surface-engineered liposomes
- alginate/chitosan microparticles for tamoxifen delivery to the lymphatic system
- homing of negatively charged albumins to the lymphatic system: general implications for drug targeting to peripheral tissues and viral reservoirs
key: cord- -l wo t authors: gao, chao; liu, jiming; zhong, ning title: network immunization and virus propagation in email networks: experimental evaluation and analysis date: - - journal: knowl inf syst doi: . /s - - - sha: doc_id: cord_uid: l wo t
Network immunization strategies have emerged as possible solutions to the challenges of virus propagation. In this paper, an existing interactive model is introduced and then improved in order to better characterize the way a virus spreads in email networks with different topologies. The model is used to demonstrate the effects of a number of key factors, notably nodes' degree and betweenness. Experiments are then performed to examine how the structure of a network and human dynamics affect virus propagation. The experimental results have revealed that a virus spreads in two distinct phases and shown that the most efficient immunization strategy is the node-betweenness strategy.
moreover, these results also explain, from the perspective of human dynamics, why old viruses can still survive in today's networks. many complex systems can be represented as networks, such as the internet, scientific collaboration networks and social networks [ , ]. in these networks, nodes denote individuals (e.g. computers, web pages, email-boxes, people, or species) and edges represent the connections between individuals (e.g. network links, hyperlinks, relationships between two people or species) [ ]. there are many research topics related to such network-like environments [ , , ]. one interesting and challenging subject is how to control virus propagation in physical networks (e.g. trojan viruses) and virtual networks (e.g. email worms) [ , , ]. currently, one of the most popular methods is network immunization, in which some nodes in a network are immunized (protected) so that they cannot be infected by a virus or a worm. given the same percentage of immunized nodes, the best strategy is the one that minimizes the final number of infected nodes. valid propagation models can be used in complex networks to predict potential weaknesses of a global network infrastructure against worm attacks [ ] and to help researchers understand the mechanisms of new virus attacks and new spreading patterns. at the same time, reliable models provide test-beds for developing and evaluating new or improved security strategies for restraining virus propagation [ ]. researchers can use reliable models to design effective immunization strategies which can prevent and control virus propagation not only in computer networks (e.g. worms) but also in social networks (e.g. sars, h n , and rumors). today, more and more researchers from statistical physics, mathematics, computer science, and epidemiology are studying virus propagation and immunization strategies. for example, computer scientists focus on the algorithms and computational complexity of strategies, i.e.
how to quickly find a short path from one "seed" node to a target node based only on local information, and then effectively and efficiently restrain virus propagation [ ]. epidemiologists focus on the combined effects of local clustering and global contacts on virus propagation [ ]. generally speaking, there are two major issues concerning virus propagation: (1) how can virus propagation be efficiently restrained? (2) how can the process of virus propagation in complex networks be accurately modeled? in order to address these problems, the main work of this paper is to (1) systematically compare and analyze representative network immunization strategies in an interactive email propagation model, (2) uncover the dominant factors in virus propagation and immunization strategies, and (3) improve the predictive accuracy of propagation models by drawing on research from human dynamics. the remainder of this paper is organized as follows: sect. surveys some well-known network immunization strategies and existing propagation models. section presents the key research problems of this paper. section describes the experiments which are performed to compare different immunization strategies, using measurements of immunization efficiency, cost and robustness, in both synthetic networks (including a synthetic community-based network) and two real email networks (the enron and a university email network), and to analyze the effects of network structures and human dynamics on virus propagation. section concludes the paper. in this section, several popular immunization strategies and typical propagation models are reviewed. an interactive email propagation model is then formulated in order to evaluate different immunization strategies and analyze the factors that influence virus propagation. network immunization is one of the best-known methods for effectively and efficiently restraining virus propagation.
it cuts epidemic paths by immunizing (injecting vaccines into, or installing patches on) a set of nodes chosen from the network according to some well-defined rule. in most published research, the immunized nodes are selected on the basis of node degree, which reflects the importance of a node in a network to a certain extent. in this paper, the influence of another node property, betweenness, on immunization strategies is also examined. pastor-satorras and vespignani have studied the critical immunization values for both random and targeted immunization [ ]. the random immunization strategy treats all nodes equally; in a large scale-free network, its immunization critical value is g_c → 1. simulation results show that % of the nodes need to be immunized in order to recover the epidemic threshold. dezso and barabasi have proposed a new immunization strategy, called targeted immunization [ ], which takes the actual topology of a real-world network into consideration. the distribution of node degrees in scale-free networks is extremely heterogeneous: a few nodes have high degrees, while many nodes have low degrees. the targeted immunization strategy therefore immunizes the most connected nodes in order to cut the epidemic paths through which most susceptible nodes may be infected. for a ba network [ ], the critical value of the targeted immunization strategy is g_c ∼ e^(−1/(mλ)). this formula shows that it is always possible to obtain a small critical value g_c even if the spreading rate λ changes drastically. however, one limitation of the targeted immunization strategy is that it requires knowledge of the global topology; in particular, the ranking of the nodes must be clearly defined. this is impractical and uneconomical for large-scale, dynamically evolving networks such as p p networks or email networks. in order to overcome this shortcoming, a local strategy, namely the acquaintance immunization [ , ], has been developed.
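The contrast between random and targeted immunization can be illustrated with a small simulation. The sketch below (plain Python; `ba_graph`, `outbreak_size` and `mean_outbreak` are my own helper names, not from the paper) grows a preferential-attachment network, immunizes the same budget of nodes either at random or by highest degree, and compares the average outbreak size reachable from a non-immunized seed when immunized nodes block transmission:

```python
import random

def ba_graph(n, m, seed=1):
    """Preferential-attachment (BA-style) network as an adjacency dict (sketch)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets, endpoints = list(range(m)), []
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]          # degree-proportional sampling pool
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return adj

def outbreak_size(adj, immunized, seed_node):
    """Nodes reachable from seed_node when immunized nodes block transmission."""
    if seed_node in immunized:
        return 0
    seen, stack = {seed_node}, [seed_node]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen and v not in immunized:
                seen.add(v)
                stack.append(v)
    return len(seen)

def mean_outbreak(adj, immunized):
    sizes = [outbreak_size(adj, immunized, s) for s in adj if s not in immunized]
    return sum(sizes) / len(sizes)

adj = ba_graph(300, 2)
degree = {v: len(adj[v]) for v in adj}
budget = 30
targeted = set(sorted(degree, key=degree.get, reverse=True)[:budget])
randomly = set(random.Random(7).sample(sorted(adj), budget))
```

On such seeded examples the targeted set typically fragments the network far more than the random set, mirroring the g_c → 1 versus small-g_c contrast described above.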
the motivation for the acquaintance immunization is to work without any global information. in this strategy, p% of the nodes are first selected as "seeds" from a network, and then one or more of their direct acquaintances are immunized. because a node with a higher degree has more links in a scale-free network, it will be selected as a "seed" with a higher probability. thus, the acquaintance immunization strategy is more efficient than the random immunization strategy, but less efficient than the targeted immunization strategy. moreover, there is another issue which limits the effectiveness of the acquaintance immunization: it does not differentiate between nodes, i.e. it randomly selects "seed" nodes and their direct neighbors [ ]. another effective distributed strategy is the d-steps immunization [ , ]. this strategy views decentralized immunization as a graph covering problem. that is, for a node v_i, it looks for a node to immunize that has the maximal degree within d steps of v_i. this method only uses local topological information within a certain range (e.g. the degree information of nodes within d steps). thus, the maximal acquaintance strategy can be seen as a 1-step immunization. however, the d-steps strategy does not take into account domain-specific heuristic information, nor is it able to decide what the value of d should be in different networks. the immunization strategies described in the previous section are all based on node degrees. the way the different immunized nodes are selected is illustrated in fig. (an illustration of different strategies): the targeted immunization will directly select v as an immunized node based on node degrees; supposing that v is a "seed" node, v will be immunized under the maximal acquaintance immunization strategy, and v will be indirectly selected as an immunized node under the d-steps immunization strategy, where d = . a second figure illustrates the betweenness-based strategies.
if we select one immunized node, the targeted immunization strategy will directly select the highest-degree node, v ; the node-betweenness strategy will select v , as it has the highest node betweenness; and the edge-betweenness strategy will select one of v , v and v , because the edges l and l have the highest edge betweenness. besides immunizing the highest-degree nodes of a network, many approaches cut epidemic paths by increasing the average path length of a network, for example by partitioning large-scale networks based on betweenness [ , ]. for a network, node (edge) betweenness refers to the number of shortest paths that pass through a node (edge). a higher value of betweenness means that the node (edge) links more adjacent communities and will be used frequently in network communications. although [ ] have analyzed the robustness of a network against degree-based and betweenness-based attacks, the spread of a virus in a propagation model was not considered, so the effect of these different measurements on virus propagation remains unclear. is it possible to restrain virus propagation, especially from one community to another, by immunizing nodes or edges with higher betweenness? in this paper, two types of betweenness-based immunization strategies are presented, i.e. the node-betweenness strategy and the edge-betweenness strategy. that is, the immunized nodes are selected in descending order of node- and edge-betweenness, in an attempt to better understand the effects of the degree and betweenness centralities on virus propagation. figure shows that if v is immunized, the virus will not propagate from one part of the network to another. the node-betweenness strategy will select v as an immunized node, as it has the highest node betweenness. the edge-betweenness strategy will select the terminal nodes of l or l (i.e. v , v or v , v ), as they have the highest edge betweenness.
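The node-betweenness ranking used by this strategy can be computed with Brandes' algorithm. Below is a minimal plain-Python sketch (the function names are mine, not the paper's) that computes unweighted node betweenness and picks the top-k nodes as immunization targets; on a toy network of two triangles joined through a bridge node, the bridge receives the highest score, exactly the kind of community-linking node the strategy aims to immunize:

```python
from collections import deque

def node_betweenness(adj):
    """Brandes' algorithm for unweighted node betweenness (standard sketch).
    For undirected graphs each path is counted in both directions, which
    doubles the scores but does not affect the ranking."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0); sigma[s] = 1
        dist = dict.fromkeys(adj, -1); dist[s] = 0
        queue = deque([s])
        while queue:                       # BFS, counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):          # back-propagate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def betweenness_targets(adj, k):
    """Immunization targets: the k nodes with the highest betweenness."""
    bc = node_betweenness(adj)
    return sorted(adj, key=lambda v: (-bc[v], v))[:k]

# Two triangles {0,1,2} and {4,5,6} joined through bridge node 3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
```

Here `betweenness_targets(toy, 1)` selects the bridge node 3, even though its degree (2) is lower than that of nodes 2 and 4 (degree 3).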
as with the targeted immunization, the betweenness-based strategies also require information about the global betweenness of a network. the experiments presented in this paper aim to find a new measurement that can be used to design a highly efficient immunization strategy. the efficiency of these strategies is compared both in synthetic networks and in real-world networks, such as the enron email network described by [ ]. in order to compare different immunization strategies, a propagation model is required to act as a test-bed for simulating virus propagation. currently, there are two typical kinds of models: (1) epidemic models based on population-level simulation and (2) the interactive email model, which uses individual-based simulation. lloyd and may have proposed an epidemic propagation model to characterize virus propagation, a typical mathematical model based on differential equations [ ]. some specific epidemic models, such as si [ , ], sir [ , ], sis [ ], and seir [ , ], have been developed and applied in order to simulate virus propagation and study the dynamic characteristics of whole systems. however, these models are all based on mean-field theory, i.e. differential equations. this type of black-box modeling approach only provides a macroscopic understanding of virus propagation; it does not give much insight into microscopic interactive behavior. more importantly, some assumptions, such as a fully mixed population (i.e. the individuals in contact with a susceptible individual are chosen randomly from the whole population) [ ] and equiprobable contacts (i.e. all nodes transmit the disease with the same probability, taking no account of the different connections between individuals), may not be valid in the real world.
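For reference, the mean-field models criticized here reduce to a few coupled differential equations. The sketch below integrates the classic fully mixed SIR equations with forward Euler (a generic textbook formulation, not the specific model of any paper cited here); it illustrates why such models are macroscopic: the entire contact structure collapses into the single scalar rate β.

```python
def sir_mean_field(beta, gamma, s0, i0, dt=0.01, steps=10_000):
    """Forward-Euler integration of the fully mixed SIR model:
       ds/dt = -beta*s*i,  di/dt = beta*s*i - gamma*i,  dr/dt = gamma*i."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    for _ in range(steps):
        infections = beta * s * i      # new infections per unit time
        recoveries = gamma * i         # recoveries per unit time
        s += -infections * dt
        i += (infections - recoveries) * dt
        r += recoveries * dt
    return s, i, r

# With R0 = beta/gamma = 3, a large epidemic sweeps the population by t = 100:
s_end, i_end, r_end = sir_mean_field(beta=0.3, gamma=0.1, s0=0.99, i0=0.01)
```

No notion of who contacts whom survives in these three state variables, which is precisely the limitation the interactive email model is designed to overcome.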
for example, in email networks and instant messaging (im) networks, communication and the spread of information tend to be strongly clustered in groups or communities with closer relationships, rather than being equiprobable across the whole network. these models may also overestimate the speed of propagation [ ]. in order to overcome the above-mentioned shortcomings, [ ] have built an interactive email model to study worm propagation, in which infections are triggered by human behavior, not by contact probabilities. that is to say, a node will be infected only if a user has checked his/her email-box and clicked an email with a virus attachment. thus, virus propagation in an email network is mainly determined by two behavioral factors: email-checking time intervals (t_i) and email-clicking probabilities (p_i), where i ∈ [ , n] and n is the total number of users in the network. t_i is determined by a user's own habits; p_i is determined both by user security awareness and by the efficiency of the firewall. however, the authors do not provide much information about how to restrain worm propagation. in this paper, an interactive email model is used as a test-bed to study the characteristics of virus propagation and the efficiency of different immunization strategies. this model makes it easy to observe the microscopic process of worm propagation and to uncover the effects of different factors (e.g. the power-law exponent, human dynamics and the average path length of the network) on virus propagation and immunization strategies. unlike other work, this paper mainly focuses on comparing the performance of degree-based and betweenness-based strategies, rather than on the critical epidemic threshold of a network. a detailed analysis of the propagation model is given in the following section.
an email network can be viewed as a typical social network in which a connection between two nodes (individuals) indicates that they have communicated with each other before [ , ]. generally speaking, a network can be denoted as e = (v, l), where v = {v_1, v_2, ..., v_n} is a set of nodes and l = {(v_i, v_j) | 1 ≤ i, j ≤ n} is a set of undirected links (if v_i is in the hit-list of v_j, there is a link between v_i and v_j). a virus can propagate along links and infect further nodes in the network. in order to give a general definition, each node is represented as a tuple with the following fields:
-id: the node identifier, v_i.id = i.
-state: the node state: a healthy default state if the node has no virus; danger if the node has received a virus but is not yet infected; infected if the node has been infected; immunized if the node has been immunized.
-nodelink: the information about its hit-list or adjacent neighbors, i.e. v_i.nodelink = {(i, j) | (i, j) ∈ l}.
-p_behavior: the probability that a node will perform a particular behavior.
-b_action: the different behaviors.
-virusnum: the total number of new unchecked viruses before the next operation.
-newvirus: the number of new viruses a node receives from its neighbors at each step.
in addition, two interactive behaviors are simulated according to [ ]: the email-checking time intervals and the email-clicking probabilities both follow gaussian distributions as the sample size goes to infinity. for the same user i, the email-checking interval t_i(t) in [ ] has been modeled by a poisson process, i.e. t_i(t) ∼ λe^(−λt). thus, the behavior fields of the tuple can be written as p_behavior = clickprob and p_behavior = checktime, where
-clickprob is the probability of a user clicking a suspected email,
-checkrate is the probability of a user checking an email,
-checktime is the next time the email-box will be checked, v_i.checktime = expgenerator(v_i.checkrate).
b_action can be specified as b_action = receive_email, b_action = send_email, and b_action = update_email. if a user receives a virus-infected email, the corresponding node will update its state, i.e. v_i.state ← danger. if a user opens an email that has a virus-infected attachment, the node will adjust its state, i.e. v_i.state ← infected, and send this virus email to all its friends according to its hit-list. if a user is immunized, the node will update its state to v_i.state ← immunized. in order to better characterize virus propagation, some assumptions are made in the interactive email model:
-if a user opens an infected email, the node is infected and will send viruses to all the friends on its hit-list;
-when checking his/her mailbox, if a user does not click virus emails, it is assumed that the user deletes the suspected emails;
-if nodes are immunized, they will never send virus emails, even if a user clicks an attachment.
the most important measurement of the effectiveness of an immunization strategy is the total number of infected nodes after virus propagation. the best strategy is the one that most effectively restrains virus propagation, i.e. keeps the total number of infected nodes to a minimum. in order to evaluate the efficiency of different immunization strategies and find the relationship between local behaviors and global dynamics, two statistics are of particular interest:
1. sid: the sum of the degrees of the immunized nodes, which reflects the importance of those nodes in a network.
2. apl: the average path length of a network, a measurement of the connectivity and transmission capacity of a network, where d_ij is the shortest path length between nodes i and j. if there is no path between i and j, d_ij → ∞; to facilitate the computation, the reciprocal of d_ij is used to reflect the connectivity of a network, with d_ij^(−1) = 0 if there is no path between i and j.
based on these definitions, the interactive email model given in sect. .
can be used as a test-bed to compare different immunization strategies and to uncover the effects of different factors on virus propagation. the specific research questions addressed in this paper can be summarized as follows:
1. how should network immunization strategies be evaluated? how can the performance of a particular strategy be determined, i.e. in terms of its efficiency, cost and robustness? what is the best immunization strategy, and what are the key factors that affect the efficiency of a strategy?
2. what is the process of virus propagation, and what effect does the network structure have on it?
3. what effect do human dynamics have on virus propagation?
the simulations in this paper have two phases. first, an email network is established in which each node has the interactive behaviors described in sect. . . next, virus propagation in the network is observed and the epidemic dynamics are studied under different immunization strategies. more details can be found in sect. . in this section, the simulation process and the structures of the experimental networks are presented in sects. . and . . section . uses a number of experiments to evaluate the performance (e.g. efficiency, cost and robustness) of different immunization strategies. specifically, the experiments seek to address whether or not betweenness-based immunization strategies can restrain worm propagation in email networks, and which measurements can reflect and/or characterize the efficiency of immunization strategies. finally, sects. . and . present an in-depth analysis in order to determine the effect of network structures and human dynamics on virus propagation. the experimental process is illustrated in fig. : some nodes are first immunized (protected) using different strategies; viruses are then injected into the network in order to evaluate the efficiency of those strategies by comparing the total numbers of infected nodes.
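The two statistics defined earlier (SID, and the reciprocal-distance variant of APL) can be computed directly from an adjacency structure. A minimal plain-Python sketch (function names are mine): SID sums the degrees of the immunized set, while the connectivity measure averages 1/d_ij over ordered pairs with 1/d_ij = 0 for unreachable pairs, so a fragmented network scores lower. Note that under this reciprocal convention higher values mean better connectivity, the opposite direction of the raw APL.

```python
from collections import deque

def sid(adj, immunized):
    """SID: the sum of the degrees of the immunized nodes."""
    return sum(len(adj[v]) for v in immunized)

def avg_reciprocal_distance(adj):
    """Mean of 1/d_ij over ordered node pairs, with 1/d_ij = 0 when j is
    unreachable from i; higher values indicate a better-connected network."""
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # BFS shortest paths from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for v, d in dist.items() if v != s)
    return total / (n * (n - 1))

# Example: a 3-node path 0-1-2 and a fully disconnected pair of nodes.
path = {0: {1}, 1: {0, 2}, 2: {1}}
disconnected = {0: set(), 1: set()}
```

For the path graph the ordered-pair reciprocal distances are 1, 1, 1, 1, 1/2, 1/2, giving 5/6; the disconnected pair scores 0.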
two methods are used to select the initially infected nodes: random infection and malicious infection, i.e. infecting the nodes with maximal degrees. the user behavior parameters are based on the definitions in sect. . , where μ_p = . , σ_p = . , μ_t = , and σ_t = . since the process of email worm propagation is stochastic, all results are averaged over runs. the virus propagation algorithm is specified in alg. . many common networks exhibit the scale-free phenomenon [ , ], in which node degrees follow a power-law distribution [ ], i.e. the fraction of nodes having k edges, p(k), decays according to a power law p(k) ∼ k^(−α) (where α usually lies between and ) [ ]. recent research has shown that email networks also follow power-law distributions with a long tail [ , ]. therefore, three synthetic power-law networks and a synthetic community-based network are used in this paper, generated with the glp algorithm [ ], in which the power-law exponent can be tuned. the three synthetic networks all have nodes, with α = . , . , and . , respectively. the statistical characteristics and visualization of the synthetic community-based network are shown in table and fig. c, f, respectively. in order to reflect the characteristics of a real-world network, the enron email network, which was built by andrew fiore and jeff heer, and the university email network, which was compiled by members of the university rovira i virgili (tarragona), are also studied. the structure and degree distributions of these networks are shown in table and fig. . in particular, the cumulative distributions are estimated with maximum likelihood using the method provided by [ ]. the degree statistics are shown in table . in this section, a comparison is made of the effectiveness of different strategies in the interactive email model. experiments are then used to evaluate the cost and robustness of each strategy. input: nodedata[nodenum] stores the topology of an email network; timestep is the system clock.
v is the set of initially infected nodes.
output: simnum[timestep][k] stores the number of infected nodes in the network in the k-th simulation.
for k = 1 to runtime                // repeated runs to obtain an average value
    nodedata[nodenum] ← initialize the email network and each user's checking time and clicking probability
    nodedata[nodenum] ← choose immunized nodes based on the given immunization strategy and adjust their states
    while timestep < endsimul
        for i = 1 to nodenum
            if nodedata[i].checktime == timestep
                prob ← probability of opening a virus-infected email, based on the user's clickprob and virusnum
                if the email is opened, send a virus to all friends according to the node's hit-list
            endif
        endfor
        for i = 1 to nodenum
            update the next checktime based on the user's checkrate
        endfor
    endwhile
endfor
the immunization efficiency of the following immunization strategies is compared: the targeted and random strategies [ ], the acquaintance strategy (random and maximal neighbor) [ , ], the d-steps strategy (d = and d = ) [ , ] (introduced in sect. . ), and the proposed betweenness-based strategies (node- and edge-betweenness). (for the synthetic community-based network, the immunized nodes of interest are the bridges between different communities; for the whole network, α = . and k = . .) in the initial set of experiments, the proportion of immunized nodes ( , , and %) is varied in the synthetic networks and the enron email network. table shows the simulation results in the enron email network, which is initialized with two infected nodes. figure shows the average numbers of infected nodes over time. tables , , and show the numerical results in the three synthetic networks, respectively. the simulation results show that the node-betweenness immunization strategy yields the best results (i.e. the minimum number of infected nodes, f), except for the case where % of the nodes in the enron network are immunized under a malicious attack. the average degree of the enron network is k = . .
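The propagation algorithm above can be condensed into a runnable sketch. In the code below (plain Python; the state constants, the 1 − (1 − p)^k click model for k unread virus mails, and the integer-rounded exponential check intervals are my simplifications of the model described here, not the authors' exact implementation), a node is infected only when its user checks mail and clicks an infected attachment, and immunized nodes never forward the virus:

```python
import random

HEALTHY, INFECTED, IMMUNIZED = range(3)   # "danger" corresponds to pending > 0

def simulate(adj, immunized, seeds, click_prob, check_rate, steps, rng):
    """One run of a simplified interactive email model (individual-based)."""
    state = {v: IMMUNIZED if v in immunized else HEALTHY for v in adj}
    pending = dict.fromkeys(adj, 0)                   # unread virus emails
    next_check = {v: 1 + int(rng.expovariate(check_rate[v])) for v in adj}
    for v in seeds:                                   # initial infections
        if state[v] != IMMUNIZED:
            state[v] = INFECTED
            for w in adj[v]:
                pending[w] += 1                       # virus sent to hit-list
    for t in range(1, steps + 1):
        for v in adj:
            if next_check[v] != t or state[v] != HEALTHY:
                continue
            k = pending[v]
            if k and rng.random() < 1 - (1 - click_prob[v]) ** k:
                state[v] = INFECTED                   # user clicked an attachment
                for w in adj[v]:
                    pending[w] += 1
            pending[v] = 0                            # unclicked mail is deleted
            next_check[v] = t + 1 + int(rng.expovariate(check_rate[v]))
    return sum(1 for v in adj if state[v] == INFECTED)
```

As a sanity check: on a 5-node line graph with click probability 1, immunizing the middle node confines the infection to the seed's side, while on a complete graph with no immunization every node is eventually infected.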
this means that only a few nodes have high degrees, while the others have low degrees (see table ). in such a network, if the nodes with maximal degrees are infected, viruses spread rapidly and the final number of infected nodes is larger than in other cases. the targeted strategy therefore does not perform any better than the node-betweenness strategy. in fact, as the number of immunized nodes increases, the efficiency of the node-betweenness immunization increases proportionally more than that of the targeted strategy. (in these experiments there are two initially infected nodes under each attack mode; with no immunization, the final number of infected nodes is with a random attack and with a malicious attack, apl = . , and the total simulation time is t = .) therefore, if global topological information is available, the node-betweenness immunization is the best strategy. the maximal sid is obtained using the targeted immunization. however, the final number of infected nodes (f) is consistent with the average path length (apl), not with the sid. that is to say, controlling a virus epidemic does not depend on the degrees of the immunized nodes but on the path length of the whole network. this also explains why the efficiency of the node-betweenness immunization strategy is better than that of the targeted immunization strategy: the node-betweenness immunization selects nodes according to their impact on the average path length, while the targeted immunization strategy selects them according to degree. a more in-depth analysis is undertaken by comparing the change in the apl under the different strategies in the synthetic networks. the results are shown in fig. . figure a, b compare the change in the final number of infected nodes over time, corresponding to fig. c, d, respectively. these numerical results validate the previous assertion that the average path length can be used as a measurement to design an effective immunization strategy.
the best strategy is to divide the whole network into different sub-networks and increase the average path length of the network, hence cutting the epidemic paths. in this paper, all comparative results are averaged over runs using the same infection model (i.e. virus propagation is compared for both random and malicious attacks) and the same user behavior model (i.e. all simulations use the same behavior parameters, as given in sect. . ). thus, it is reasonable and feasible to evaluate only how the propagation of a virus is affected by the immunization strategies, avoiding the effects caused by the stochastic process, the infection model and the user behavior. it can be seen that the edge-betweenness strategy is able to find some nodes with high centrality and then divide a network cleanly into a number of sub-networks (e.g. v in fig. ). however, compared with the nodes (e.g. v in fig. ) selected by the node-betweenness strategy, the nodes with higher edge betweenness cannot cut the epidemic paths, as they cannot effectively break the whole structure of the network. in fig. , the synthetic community-based network and the university email network are used as examples to illustrate why the edge-betweenness strategy cannot achieve the same immunization efficiency as the node-betweenness strategy. to select two immunized nodes from fig. , the node-betweenness immunization will select {v , v } in descending order of node betweenness. however, the edge-betweenness strategy may select {v , v } or {v , v }, because the edges l and l have the highest edge betweenness. this result shows that the node-betweenness strategy can not only effectively divide the whole network into two communities, but also break the interior structure of the communities. although the edge-betweenness strategy can cleanly divide the whole network into two parts, viruses can still propagate within each community. many networks commonly contain the structure shown in fig.
, for example, the enron email network and the university email network. table and fig. present the results for the synthetic community-based network. table compares the different strategies in the university email network, which also has some self-similar community structures [ ]. these results further validate the analysis stated above. from the above experiments, the following conclusions can be made:
1. as shown in tables - , apl can be used as a measurement to evaluate the efficiency of an immunization strategy. thus, when designing a distributed immunization strategy, attention should be paid to those nodes that have the largest impact on the apl value.
2. if the final number of infected nodes is used as the measure of efficiency, then the node-betweenness immunization strategy is more efficient than the targeted immunization strategy.
3. the power-law exponent (α) affects the edge-betweenness immunization strategy, but has little impact on the other strategies.
in the previous section, the efficiency of different immunization strategies was evaluated in terms of the final number of infected nodes when the propagation reaches an equilibrium state. from experiments on the synthetic networks, the synthetic community-based network, the enron email network and the university email network, it is easy to see that the node-betweenness immunization strategy has the highest efficiency. in this section, the performance of the different strategies is evaluated in terms of cost and robustness, as in [ ]. it is well known that the structure of a social network or an email network constantly evolves. it is therefore interesting to evaluate how changes in structure affect the efficiency of an immunization strategy.
-the cost can be defined as the number of nodes that need to be immunized in order to achieve a given level of epidemic prevalence ρ. generally, ρ → 0.
several parameters are of particular interest: f is the fraction of nodes that are immunized; f_c is the critical value of the immunization when ρ → 0; ρ_0 is the infection density when no immunization strategy is implemented; and ρ_f is the infection density under a given immunization strategy. figure shows the relationship between the reduced prevalence ρ_f/ρ_0 and f. it can be seen that the node-betweenness immunization achieves the lowest prevalence with the smallest number of protected nodes. the immunization cost increases as the value of α increases, i.e. in order to achieve an epidemic prevalence ρ → 0, the node-betweenness immunization strategy needs , , and % of the nodes to be immunized, respectively, in the three synthetic networks. this is because the node-betweenness immunization strategy can effectively break the network structure and increase the path length of a network with the same number of immunized nodes.
-the robustness shows the tolerance of a strategy against the dynamic evolution of a network, i.e. changes in the power-law exponent (α). figure shows the relationship between the immunization threshold f_c and α. a low level of f_c with a small variation indicates that the immunization strategy is robust. robustness is important when an immunization strategy is deployed in a scalable and dynamic network (e.g. p p and email networks). the figure also shows that the robustness of the d-steps immunization strategy is close to that of the targeted immunization, and that the node-betweenness strategy is the most robust. [ ] have compared virus propagation in synthetic networks with α = . and α = . , and pointed out that initial worm propagation has two phases. however, they do not give a detailed explanation of these results, nor do they compare the effect of the power-law exponent on different immunization strategies during virus propagation.
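Once the reduced prevalence ρ_f/ρ_0 has been measured over a grid of immunization fractions, estimating the critical value f_c is mechanical. A small sketch (my function name; the sweep values are hypothetical illustrative data, not results from the paper):

```python
def critical_fraction(prevalence_by_f, eps=1e-3):
    """Smallest immunized fraction f whose reduced prevalence rho_f / rho_0
    drops to (approximately) zero, i.e. below the tolerance eps."""
    for f, reduced in sorted(prevalence_by_f.items()):
        if reduced <= eps:
            return f
    return None  # the epidemic never dies out on this grid

# Hypothetical sweep: reduced prevalence rho_f / rho_0 measured at each f.
sweep = {0.00: 1.00, 0.05: 0.41, 0.10: 0.02, 0.15: 0.0004, 0.20: 0.0}
```

On this hypothetical sweep the prevalence first falls below the tolerance at f = 0.15, so f_c would be estimated as 15%; a more robust strategy would show a similar f_c across networks with different α.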
table presents the detailed degree statistics for the different networks, which can be used to examine the effect of the power-law exponent on virus propagation and immunization strategies. first, virus propagation in non-immunized networks is discussed. figure a shows the change in the average number of infected nodes over time; fig. b gives the average degree of the infected nodes at each time step. from the results, it can be seen that:
1. the number of infected nodes in non-immunized networks is determined by the attack mode, not by the power-law exponent. in figs. a, b, the three distribution curves (α = . , . , and . ) overlap with each other under both random and malicious attacks. the difference between them is that the final number of infected nodes under a malicious attack is larger than that under a random attack, as shown in fig. a, reflecting the fact that a malicious attack is more dangerous than a random attack.
2. a virus spreads more quickly in a network with a large power-law exponent than in one with a small exponent. because a malicious attack initially infects highly connected nodes, the average degree of the infected nodes decreases in a shorter time than under a random attack (t < t ). moreover, the speed and range of the infection are amplified by those highly connected nodes. in phase i, viruses propagate very quickly and infect most nodes in the network. in phase ii, however, the total number of infected nodes grows slowly (fig. a), because the viruses are now infecting nodes with low degrees (fig. b), and a node with fewer links is more difficult to infect.
in order to observe the effect of different immunization strategies on the average degree of the infected nodes in different networks, % of the nodes are initially protected against random and malicious attacks. figure shows the simulation results. from this experiment, it can be concluded that:
1. the random immunization has no effect on restraining virus propagation, because the curves of the average degree of the infected nodes basically coincide with the curves in the non-immunization case. 2. comparing fig. a, b, c and d, e, f, respectively, it can be seen that the peak value of the average degree is the largest in the network with α = . and the smallest in the network with α = . . this is because the network with a lower exponent has more highly connected nodes (i.e. the range of degrees is between and ), which serve as amplifiers in the process of virus propagation. 3. as α increases, so do the number of infected nodes and the virus propagation duration (t < t < t ). because a larger α implies a larger apl (average path length), the number of infected nodes will increase; if the network has a larger exponent, a virus needs more time to infect the nodes with medium or low degrees.
fig. : the average number of infected nodes and the average degree of infected nodes with respect to time, as viruses spread in different networks.
we apply the targeted immunization to protect % of the nodes in the network. first, consider the process of virus propagation in the case of a malicious attack where % of the nodes are immunized using the edge-betweenness immunization strategy. there are two intersections in fig. a. point a is the intersection of the two curves net and net , and point b is the intersection of net and net . under the same conditions, fig. a shows that the total number of infected nodes is the largest in net in phase i. correspondingly, fig. b shows that the average degree of infected nodes in net is the largest in phase i. as time goes on, the rate at which the average degree falls is the fastest in net , as shown in fig. b. this is because there are more highly connected nodes in net than in the others (see table ). after these highly connected nodes are infected, the viruses attempt to infect the nodes with low degrees.
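the random-versus-malicious attack contrast described above can be reproduced with a minimal si (susceptible-infected) sketch. the star-of-stars network, the seeding rule, and all parameters below are illustrative assumptions, not the paper's simulator:

```python
import random

def si_spread(adj, seed, beta, steps, rng):
    """Discrete-time SI spread; returns the infected set and, per step, the
    average degree of the infected nodes (the quantity discussed in the text)."""
    infected = {seed}
    avg_deg = [float(len(adj[seed]))]
    for _ in range(steps):
        new = set()
        for v in infected:
            for w in adj[v]:
                if w not in infected and rng.random() < beta:
                    new.add(w)
        infected |= new
        avg_deg.append(sum(len(adj[v]) for v in infected) / len(infected))
    return infected, avg_deg

def star_of_stars(hubs=10, leaves=5):
    """A crude hub-dominated network: one root, `hubs` mid nodes, leaf fans."""
    adj = {0: set()}
    nid = 1
    for _ in range(hubs):
        mid = nid; nid += 1
        adj[mid] = {0}; adj[0].add(mid)
        for _ in range(leaves):
            leaf = nid; nid += 1
            adj[leaf] = {mid}; adj[mid].add(leaf)
    return adj
```

a malicious attack seeds the highest-degree node (the root); with β = 1 the infection front then rolls outward to ever lower-degree nodes, so the average degree of infected nodes falls step by step — the qualitative behaviour the text reads off fig. b.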
therefore, in phases ii and iii, the average degree in net , which has the smallest power-law exponent, is larger than in the other networks. the total number of infected nodes in net continuously increases, exceeding those in net and net . the same phenomenon also appears under the targeted immunization strategy, as shown in fig. . the email-checking intervals in the above interactive email model (see sect. . ) are modeled using a poisson process. the poisson distribution is widely used in many real-world models to statistically describe human activities, e.g. in terms of statistical regularities in the frequency of certain events within a period of time [ , ] . however, statistics from user log files and databases that record information about human activities show that most observations of human behavior deviate from a poisson process. that is to say, when a person engages in certain activities, the waiting intervals follow a power-law distribution with a long tail [ , ] . vazquez et al. [ ] have tried to incorporate an email-sending interval distribution, characterized by a power law, into a virus propagation model. however, their model assumes that a user is instantly infected after he/she receives a virus email, and it ignores the impact of anti-virus software and the security awareness of users. therefore, there are some gaps between their model and the real world. in this section, the statistical properties associated with a single user sending emails are analyzed based on the enron dataset [ ] . the virus spreading process is then simulated using an improved interactive email model in order to observe the effect of human behavior on virus propagation. research results from the study of statistical regularities or laws of human behavior based on empirical data can offer a valuable perspective to social scientists [ , ] .
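a minimal, event-driven sketch of an interactive email model of the kind discussed above, with power-law email-checking intervals and a per-user opening probability ("security awareness"). the network, parameter values, and the omission of anti-virus effects are illustrative assumptions, not the paper's model:

```python
import heapq
import random

def simulate(adj, seed_user, alpha, t_end, rng):
    """Event-driven sketch: user u wakes up at power-law-distributed intervals
    and, if a virus mail is waiting, opens it with probability p_open[u]."""
    def next_gap():
        # inverse-transform sample of P(tau) ~ tau^-alpha for tau >= 1
        return (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))

    p_open = {u: min(max(rng.gauss(0.5, 0.15), 0.0), 1.0) for u in adj}
    virus_in_inbox = {u: False for u in adj}
    infected = {seed_user}
    for w in adj[seed_user]:            # the seed mails the virus to contacts
        virus_in_inbox[w] = True

    events = [(next_gap(), u) for u in adj]   # (next check time, user)
    heapq.heapify(events)
    while events[0][0] < t_end:
        t, u = heapq.heappop(events)
        if virus_in_inbox[u] and u not in infected:
            virus_in_inbox[u] = False   # the mail is read (or discarded)
            if rng.random() < p_open[u]:
                infected.add(u)         # the user clicks the attachment ...
                for w in adj[u]:        # ... and the virus mails the contacts
                    virus_in_inbox[w] = True
        heapq.heappush(events, (t + next_gap(), u))
    return infected
```

because the power-law sampler produces mostly short gaps with occasional very long ones, the sketch reproduces the bursty pattern the text describes: an explosive initial outbreak, then long dormancy while the virus sits in inactive users' inboxes.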
previous studies have also used models to characterize the behavioral features of sending emails [ , , ] , but their correctness needs to be further verified empirically, especially in view of the fact that there exist variations among different types of users. in this paper, the enron email dataset is used to identify the characteristics of human email-handling behavior. due to limited space, table presents only a small amount of the employee data contained in the database. as can be seen from the table, the distribution of intervals between emails sent by the same user is measured at different granularities: day, hour, and minute. figure shows that the waiting intervals follow a heavy-tailed distribution. the power-law exponent at the day granularity is not accurate because there are only a few data points; if more data points were added, a power-law distribution with a long tail would emerge. note that there is a peak at t = as measured at the hour granularity. eckmann et al. [ ] have explained that the corresponding peak in a university dataset is the interval between the time people leave work and the time they return to their offices. after curve fitting (see fig. ), the waiting-interval exponent is close to . , i.e. α ≈ . ± . . although it has been shown, by studying users in the enron dataset, that the email-sending distribution follows a power law, it is still not possible to assert that all users' waiting intervals follow a power-law distribution. it can only be stated that the distribution of waiting intervals has a long-tail characteristic. it is also not possible to measure the intervals between email checking, since there is no information about login times in the enron dataset. however, combining research results on human web browsing behavior [ ] and on the effect of non-poisson activities on propagation from the barabasi group [ ] , it can be found that there are similarities between the distributions of email-checking intervals and email-sending intervals.
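as a complement to the curve fitting described above, the exponent can also be estimated by maximum likelihood (the continuous-data estimator popularised by clauset, shalizi and newman); a short sketch, with illustrative names:

```python
import math

def fit_powerlaw_alpha(xs, x_min):
    """Continuous MLE for P(x) ~ x^-alpha on the tail x >= x_min:
    alpha_hat = 1 + n / sum(ln(x_i / x_min))."""
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)
```

on synthetic waiting intervals generated from a known exponent (via the power-law quantile function), the estimator recovers that exponent to within a few percent; unlike a log-log regression, it does not depend on histogram binning.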
the following section uses a power-law distribution to characterize email-checking behavior in order to observe the effect human behavior has on the propagation of an email virus. based on the above discussion, a power-law distribution is used to model the email-checking intervals of a user i, instead of the poisson distribution used in [ ] , i.e. t i (τ ) ∼ τ −α . an analysis of the distribution of the power-law exponent (α) for different individuals in web browsing [ ] and in the enron dataset shows that the power-law exponent is approximately . . in order to observe and quantitatively analyze the effect that the email-checking interval has on virus propagation, the email-clicking probability distribution (p i ) in our model is consistent with the one used by [ ] , i.e. the security awareness of different users in the network follows a normal distribution, p i ∼ n ( . , . ). figure shows that, following a random attack, viruses quickly propagate in the enron network if the email-checking intervals follow a power-law distribution. the results are consistent with the trends observed in real computer networks [ ] , i.e. viruses initially spread explosively, then enter a long latency period before becoming active again following user activity. the explanation is that users frequently have a short period of focused activity followed by a long period of inactivity. thus, although old viruses may be killed by anti-virus software, they can still intermittently break out in a network. that is because some viruses remain hidden with inactive users and cannot be found by anti-virus software; when the inactive users become active, the viruses start to spread again. the effect of human dynamics on virus propagation in three synthetic networks is also analyzed by applying the targeted [ ] , d-steps [ ] and aoc-based [ ] strategies. the numerical results are shown in table and fig. . from the above experiments, the following conclusions can be drawn:
1. based on the enron email dataset and recent research on human dynamics, the email-checking intervals in an interactive email model should be assigned based on a power-law distribution. 2. viruses can spread very quickly in a network if users' email-checking intervals follow a power-law distribution. in such a situation, viruses grow explosively at the initial stage and then grow slowly; the viruses remain in a latent state, awaiting activation by users. in this paper, a simulation model for studying the process of virus propagation has been described, and the efficiency of various existing immunization strategies has been compared. in particular, two new betweenness-based immunization strategies have been presented and validated in an interactive propagation model, which incorporates two human behaviors based on [ ] in order to make the model more practical. this simulation-based work can be regarded as a contribution to the understanding of the interactions between a network structure and local/global dynamics. the main results are summarized as follows: 1. a set of experiments systematically compares different immunization strategies for restraining epidemic spreading in synthetic scale-free networks, including a community-based network, and in two real email networks. the simulation results show that the key factor affecting the efficiency of an immunization strategy is the average path length (apl), rather than the sum of the degrees of the immunized nodes (sid). that is to say, an immunization strategy should protect the nodes with higher connectivity and transmission capability, rather than those with higher degrees. 2. several performance metrics are used to further evaluate the efficiency of the different strategies, i.e. their cost and robustness. simulation results show that the d-steps immunization is a feasible strategy in the case of limited resources, and the node-betweenness immunization is the best if global topological information is available.
3. the effects of power-law exponents and human dynamics on virus propagation are analyzed. more in-depth experiments show that viruses spread faster in a network with a large power-law exponent than in one with a small exponent. in particular, the results explain, from the perspective of human dynamics, why some old viruses can still propagate in networks today.

references:
- the mathematical theory of infectious diseases and its applications
- emergence of scaling in random networks
- the origin of bursts and heavy tails in human dynamics
- cluster ranking with an application to mining mailbox networks
- 'small worlds' and the evolution of virulence: infection occurs locally and at a distance
- on distinguishing between internet power law topology generators
- power-law distribution in empirical data
- efficient immunization strategies for computer networks and populations
- halting viruses in scale-free networks
- dynamics of information access on the web
- a simple model for complex dynamical transitions in epidemics
- distance-d covering problem in scale-free networks with degree correlation
- entropy of dialogues creates coherent structure in email traffic
- epidemic threshold in structured scale-free networks
- on power-law relationships of the internet topology
- improving immunization strategies
- immunization of real complex communication networks
- self-similar community structure in a network of human interactions
- attack vulnerability of complex networks
- targeted local immunization in scale-free peer-to-peer networks
- the large scale organization of metabolic networks
- probing human response times
- periodic subgraph mining in dynamic networks (knowledge and information systems)
- autonomy-oriented search in dynamic community networks: a case study in decentralized network immunization
- characterizing web usage regularities with information foraging agents
- how viruses spread among computers and people
- on universality in human correspondence activity
- enhanced: simple rules with complex dynamics
- network motifs: simple building blocks of complex networks
- epidemics and percolation in small-world networks
- code-red: a case study on the spread and victims of an internet worm
- the structure of scientific collaboration networks
- the spread of epidemic disease on networks
- the structure and function of complex networks
- email networks and the spread of computer viruses
- partitioning large networks without breaking communities
- epidemic spreading in scale-free networks
- epidemic dynamics and endemic states in complex networks
- immunization of complex networks
- computer virus propagation models
- the enron email dataset database schema and brief statistical report
- exploring complex networks
- modeling bursts and heavy tails in human dynamics
- impact of non-poissonian activity patterns on spreading process
- predicting the behavior of techno-social systems
- a decentralized search engine for dynamic web communities
- a twenty-first century science
- an environment for controlled worm replication and analysis
- modeling and simulation study of the propagation and defense of internet e-mail worms

chao gao is currently a phd student in the international wic institute, college of computer science and technology, beijing university of technology. he has been an exchange student in the department of computer science, hong kong baptist university. his main research interests include web intelligence (wi), autonomy-oriented computing (aoc), complex network analysis, and network security.
prof. liu is with hong kong baptist university. he was a professor and the director of the school of computer science at the university of windsor, canada.
his current research interests include autonomy-oriented computing (aoc), web intelligence (wi), and self-organizing systems and complex networks, with applications to: (i) characterizing the working mechanisms that lead to emergent behavior in natural and artificial complex systems (e.g., phenomena in web science, and the dynamics of social networks and neural systems), and (ii) developing solutions to large-scale, distributed computational problems (e.g., distributed scalable scientific or social computing, and collective intelligence). prof. liu has contributed to the scientific literature in those areas, including over journal and conference papers, and has authored research monographs, e.g., autonomy-oriented computing: from problem solving to complex systems modeling (kluwer academic/springer) and spatial reasoning and planning: geometry, mechanism, and motion (springer). prof. liu has served as the editor-in-chief of web intelligence and agent systems, an associate editor of ieee transactions on knowledge and data engineering, ieee transactions on systems, man, and cybernetics-part b, and computational intelligence, and a member of the editorial board of several other international journals. a further co-author heads a laboratory and is a professor in the department of systems and information engineering at maebashi institute of technology, japan. he is also an adjunct professor in the international wic institute. he has conducted research in the areas of knowledge discovery and data mining, rough sets and granular-soft computing, web intelligence (wi), intelligent agents, brain informatics, and knowledge information systems, with more than journal and conference publications and books. he is the editor-in-chief of web intelligence and agent systems and annual review of intelligent informatics, an associate editor of ieee transactions on knowledge and data engineering, data engineering, and knowledge and information systems, and a member of the editorial board of transactions on rough sets.
title: data mining for big dataset-related thermal analysis of high performance computing (hpc) data center
authors: de chiara, davide; chinnici, marta; kor, ah-lian
journal: computational science - iccs
abstract: greening of data centers could be achieved through energy savings in two significant areas, namely compute systems and cooling systems. a reliable cooling system is necessary to produce a persistent flow of cold air to cool the servers under increasing computational load demand. the heat dissipated by the servers places a strain on the cooling systems. consequently, it is necessary to identify the hotspots that frequently occur in the server zones. this is facilitated through the application of data mining techniques to an available big dataset of thermal characteristics of the high-performance computing enea data center, namely cresco . this work presents an algorithm that clusters hotspots, with the goal of reducing a data centre's large thermal gradient due to the uneven distribution of server-dissipated waste heat, and thereby increasing cooling effectiveness.
a large proportion of the electricity generated worldwide comes from hydrocarbon combustion. consequently, this causes a rise in carbon emissions and other greenhouse gases (ghg) in the environment, contributing to global warming. data centers (dcs) worldwide were estimated to have consumed between and billion kwh of electricity in the year [ ] , and in , us-based dcs alone used up more than billion kilowatt-hours of electricity [ ] . according to [ ] , unless appropriate steps are taken to reduce energy consumption and go green, the global dc share of carbon emissions is estimated to rise from million tons in to million tons in . servers in dcs consume energy in proportion to the allocated computing loads, and unfortunately, approximately % of the energy input is dissipated as waste heat.
cooling systems are deployed to maintain the temperature of the computing servers at the vendor-specified temperature for consistent and reliable performance. koomey [ ] emphasises that a dc's energy input is primarily consumed by the cooling and compute systems (the latter comprising servers in chassis and racks). thus, these two systems have been critical targets for energy savings. computing-load processing entails job and task management. on the other hand, dc cooling encompasses the installation of cooling systems and effective hot/cold aisle configurations. thermal mismanagement in a dc can be a primary contributor to it infrastructure inefficiency due to thermal degradation. server microprocessors are the primary energy consumers and waste heat dissipators [ ] . generally, existing dc air-cooling systems are not sufficiently efficient to cope with the vast amount of waste heat generated by high-performance microprocessors. thus, it is necessary to disperse dissipated waste heat so that heat is evenly distributed within the premises, avoiding overheating. undeniably, a more effective energy-saving strategy is needed that reduces the energy consumed by the cooling system and yet cools the servers (in the compute system) efficiently. one known technique is thermal-aware scheduling, where computational workloads are scheduled on the basis of waste heat. thermal-aware schedulers adopt different approaches (e.g. system-level work placement [ ] ; executing 'hot' jobs on 'cold' compute nodes; predictive models for job schedule selection [ ] ; node queues ranked by the thermal characteristics of rack layouts) and optimisation (e.g. optimal setpoints for workload distribution and the supply temperature of the cooling system). heat modelling provides a model that links server energy consumption to the associated waste heat.
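a minimal energy-balance sketch of the heat-modelling idea above, linking a node's power draw to its exhaust-air temperature. the formula t_out = t_in + p / (ṁ · c_p) is a textbook steady-state balance, not the paper's model, and the air-property constants are standard assumed values:

```python
# Standard dry-air properties at roughly room conditions (assumed constants).
AIR_DENSITY = 1.2        # kg/m^3
AIR_CP = 1005.0          # J/(kg*K), specific heat of air
M3S_PER_CFM = 0.000471947  # cubic feet per minute -> m^3/s

def exhaust_temperature(inlet_c, power_w, airflow_cfm):
    """Steady-state balance T_out = T_in + P / (m_dot * c_p), assuming all of
    the node's electrical power leaves as heat carried by the airflow."""
    m_dot = airflow_cfm * M3S_PER_CFM * AIR_DENSITY   # mass flow, kg/s
    return inlet_c + power_w / (m_dot * AIR_CP)
```

doubling the airflow halves the temperature rise, and doubling the power doubles it — the qualitative coupling between consumption and waste heat that any heat model has to capture.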
thermal-aware monitoring acts as a thermal eye for the scheduling process and entails recording and evaluating the heat distribution within dcs. thermal profiling builds on this monitoring information about workload-related heat emission and is useful for predicting the dc heat distribution. in this paper, our analysis explores the relationship between thermal-aware scheduling and computational workload scheduling. this is followed by selecting an efficient solution to distribute heat evenly within a dc, avoiding hotspots and cold spots. in this work, a data mining technique is chosen for hotspot detection, and thermal profiling is used for preventive measures. the novel contribution of the research presented in this paper is the use of a real big dataset of thermal characteristics for the enea high performance computing (hpc) cresco compute nodes. the analyses conducted are as follows: hotspot localisation; user categorisation based on the jobs submitted to the cresco cluster; and compute node categorisation based on the thermal behaviour of internal and surrounding air temperatures due to workload-related waste heat dissipation. this analysis aims to minimise the thermal gradient within a dc it room through consideration of the following: different granularity levels of thermal data; the energy consumption of calculation nodes; and the it room ambient temperature. an unsupervised learning technique has been employed to identify hotspots, due to the variability of thermal data and the uncertainties in defining temperature thresholds. this analysis phase involves determining the optimal workload distribution across cluster nodes. the available thermal characteristics (i.e. exhaust temperature, cpu temperatures) are inputs to the clustering algorithm. subsequently, a series of clustering results are intersected to unravel nodes (identified by ids) that frequently fall into high-temperature areas. the paper is organised as follows: sect. - introduction; sect. - background and related work; sect. - methodology; sect.
- results and discussion; sect. - conclusions and future work. in the context of a high performance computing data center (hpc-dc), it is essential to satisfy service level agreements with minimal energy consumption. this involves the following: efficient dc operations and management within recommended it room requirements, specifications, and standards; energy efficiency and effective cooling systems; and optimised it equipment utilisation. dc energy efficiency has been a long-standing challenge due to the multi-faceted factors that affect it and, adding to the complexity, the trade-off between performance (in the form of productivity) and energy efficiency. interesting trade-offs between geolocation and dc energy input requirements (e.g. cold geolocations and free air cooling; hot, sunny geolocations and solar-powered renewable energy) are yet to be critically analysed [ ] . one thermal equipment-related challenge is that raising the setpoint of cooling equipment, or lowering the speed of crac (computer room air conditioning) fans to save energy, may in the long term decrease it system reliability (due to thermal degradation). however, a trade-off solution (between optimal cooling system energy consumption and long-term it system reliability) is yet to be researched [ ] . another long-standing challenge is it resource over-provisioning, which causes energy waste due to idle servers. relevant research explores the optimal allocation of pdus (power distribution units) for servers, multi-step algorithms for power monitoring, and on-demand provisioning, reviewed in [ ] . other related work addresses workload management and network-level issues such as optimal routing, virtual machine (vm) allocation, the balance between power savings and network qos (quality of service) parameters, as well as appropriate metrics for dc energy efficiency evaluation.
one standard metric used by the majority of industrial dcs is power usage effectiveness (pue), proposed by the green grid consortium [ ] . it is the ratio of total dc energy utilisation to the energy consumed solely by the it equipment. a plethora of dc energy efficiency metrics evaluate the following: thermal characteristics; the ratio of renewable energy use; the energy productivity of various it system components; etc. there is a pressing need for a holistic framework that would thoroughly characterise dcs with a fixed set of metrics and reveal potential pitfalls in their operations [ ] . though some existing research has made such attempts, to date there is no standardised framework [ , , ] . to reiterate, the thermal characteristics of the it system ought to be the primary focus of an energy efficiency framework, because the it system is the main energy consumer within a dc. several studies have addressed this issue [ ] . sungkap et al. [ ] propose ambient temperature-aware capping to maximise power efficiency while minimising overheating. their research includes an analysis of the composition of energy consumed by a cloud-based dc. their findings for the composition of dc energy consumption are approximately % for compute systems, % for refrigeration-based air conditioning, and the remaining % for storage and power distribution systems. this implies that approximately half of the dc energy is consumed by non-computing devices. in [ ] , wang and colleagues present an analytical model that describes dc resources with heat transfer properties and workloads with thermal features. thermal modelling and temperature estimation from thermal sensors ought to consider the emergence of server hotspots and thermal solicitation due to increased inlet air temperature, inappropriate positioning of a rack, or even inadequate room ventilation. such phenomena are unravelled by thermal-aware location analysis.
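the pue metric defined above is a one-line computation; a tiny sketch (the sample figures below are placeholders, not measurements from this paper):

```python
def pue(total_facility_kwh, it_kwh):
    """Power Usage Effectiveness = total facility energy / IT equipment energy.
    An ideal DC has PUE = 1.0; cooling and power-distribution overheads
    push real values above that."""
    return total_facility_kwh / it_kwh
```

for example, a facility drawing 180 kwh to deliver 100 kwh to the it load has a pue of 1.8, i.e. 80% overhead on top of the useful it energy.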
the thermal-aware server provisioning approach to minimising total dc energy consumption calculates the value of energy by considering the maximum working temperature of the servers. this approach should account for the fact that any rise in the inlet temperature may cause the servers to reach their maximum temperature, resulting in thermal stress, thermal degradation, and severe damage in the long run. the typical identified types of thermal-aware scheduling are reactive, proactive, and mixed; however, that taxonomy makes no reference to heat modelling or to thermal monitoring and profiling. kong and colleagues [ ] highlight the important concepts of thermal-aware profiling, thermal-aware monitoring, and thermal-aware scheduling. thermal-aware techniques are linked to the minimisation of waste heat production, heat convection around server cores, task migrations, the thermal gradient across the microprocessor chip, and microprocessor power consumption. dynamic thermal management (dtm) techniques in microprocessors encompass the following: dynamic voltage and frequency scaling (dvfs), clock gating, task migration, and operating system (os) based dtm and scheduling. in [ ] , parolini and colleagues propose a heat model and provide a brief overview of power and thermal efficiency from the microprocessor micro-level to the dc macro-level. to reiterate, it is essential for dc energy efficiency to address thermal awareness in order to better understand the relationship between the thermal and it aspects of workload management. in this paper, the authors incorporate thermal-aware scheduling, heat modelling, thermal-aware monitoring, and thermal profiling using a big thermal characteristics dataset of an hpc data center. this research involves the measurement, quantification, and analysis of compute nodes and refrigerating machines. the aim of the analysis is to uncover the underlying causes of temperature rises that lead to the emergence of thermal hotspots.
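the unsupervised hotspot identification described earlier (clustering per-snapshot thermal readings, then intersecting the hottest clusters to find persistently hot nodes) can be sketched as follows. the 1-d k-means on exhaust temperatures, the two-cluster setup, and all values are illustrative assumptions; the paper's feature set and algorithm details may differ:

```python
def kmeans_1d(values, centers, iters=50):
    """Plain Lloyd's algorithm on scalars with explicit initial centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in values:
            i = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[i].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def hot_nodes(snapshot, centers):
    """snapshot: node_id -> temperature; return the ids assigned to the
    hottest cluster center."""
    centers = kmeans_1d(list(snapshot.values()), centers)
    hot = max(range(len(centers)), key=lambda j: centers[j])
    return {n for n, t in snapshot.items()
            if min(range(len(centers)), key=lambda j: abs(t - centers[j])) == hot}

def persistent_hotspots(snapshots, centers=(20.0, 40.0)):
    """Intersect the hot clusters of successive snapshots: the surviving ids
    are the nodes that frequently fall into high-temperature areas."""
    sets = [hot_nodes(s, list(centers)) for s in snapshots]
    out = sets[0]
    for s in sets[1:]:
        out &= s
    return out
```

a node that is hot in one snapshot but not the next (e.g. a transient load spike) drops out of the intersection, so only structurally hot nodes remain as candidates for intervention.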
overall, effective dc management requires energy use monitoring, particularly of the energy input, it energy consumption, supply air temperature and humidity at room level (i.e. granularity level in the context of this research), and air temperature at a higher granularity level (i.e. at the computer room air conditioning/computer room air handler (crac/crah) unit level, granularity level ). the measurements taken are further analysed to reveal the extent of energy use and the economisation opportunities for improving the dc energy efficiency level (granularity level ). dc energy efficiency metrics will not be discussed in this paper; the discussion in the subsequent section primarily focuses on the thermal guidelines from the american society of heating, refrigerating and air-conditioning engineers (ashrae) [ ] . to reiterate, our research goal is to reduce the dc-wide thermal gradient and hotspots and to maximise cooling effects. this entails identifying the individual server nodes that frequently occur in the hotspot zones, through the implementation of a clustering algorithm on the workload management platform. the big thermal characteristics dataset of the enea portici cresco computing cluster is employed for the analysis. it has measured values (or features) for each single calculation node (see table ) and comprises measurements for the period from may to january . briefly, the cresco cluster is a high-performance computing (hpc) system consisting of calculation nodes with a total of cores. it is based on the lenovo thinksystem sd platform, an ultra-dense and economical two-socket server in a . u rack form factor, inserted in a u four-node enclosure. each node is equipped with intel xeon platinum cpus (each with cores) and a clock frequency of . ghz, a ram size of gb (corresponding to gb/core), and a low-latency intel omni-path series single-port pcie . x hfa network interface.
the nodes are interconnected by an intel omni-path network with intel edge switches ( ports each, bandwidth equal to gb/s, and latency equal to ns). the connections between the nodes form a tier no-blocking tapered fat-tree topology. the power consumption of massive computing workloads amounts to a maximum of kw. this work incorporates thermal-aware scheduling, heat modelling, and thermal monitoring, followed by user profiling from a "waste heat production" point of view. thermal-aware dc scheduling is designed based on data analytics conducted on real data obtained from running cluster nodes in a real physical dc. for the purpose of this work, approximately months' worth of data has been collected. the data collected relate to: relevant parameters for each node (e.g. inlet air temperature, internal temperature of each node, energy consumption of cpu, ram, memory, etc.); environmental parameters (e.g. air temperatures and humidity in both the hot and cold aisles); cooling system related parameters (e.g. fan speed); and, finally, the individual users who submit their jobs to the cluster nodes. this research focuses on the effect of dynamic workload assignment on the energy consumption and performance of both the computing and cooling systems.
table (monitored features for each node):
- temperature: measured at the front, inside (on cpu and cpu ), and at the rear of every single node (expressed in celsius)
- sysairflow: speed of air traversing the node, expressed in cfm (cubic foot per minute)
- dc energy: meter of total energy used by the node, updated at the corresponding timestamp and expressed in kwh
the constraint is that each arriving job must be assigned irrevocably to a particular server without any information about impending incoming jobs. once a job has been assigned, no pre-emption or migration is allowed, a rule typically adhered to for hpc applications due to the high costs incurred by data reallocation.
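the online, no-preemption assignment rule above can be sketched as a greedy scheduler: each arriving job goes irrevocably to an eligible node, here chosen as the one with the smallest exhaust-minus-inlet temperature difference. the node attributes and the "coolest eligible node" criterion are illustrative assumptions, not the paper's scheduler:

```python
def assign_job(nodes, cores_needed, mem_needed):
    """nodes: {id: {'free_cores', 'free_mem', 'inlet', 'exhaust'}}.
    Returns the chosen node id (resources are deducted), or None if no node
    can host the job."""
    eligible = [n for n, s in nodes.items()
                if s['free_cores'] >= cores_needed and s['free_mem'] >= mem_needed]
    if not eligible:
        return None
    # thermally coolest node: smallest exhaust-minus-inlet difference
    chosen = min(eligible, key=lambda n: nodes[n]['exhaust'] - nodes[n]['inlet'])
    nodes[chosen]['free_cores'] -= cores_needed
    nodes[chosen]['free_mem'] -= mem_needed
    return chosen
```

because the decision is irrevocable and made without knowledge of future arrivals, a bad placement cannot be migrated away later, which is exactly why the thermal state at assignment time matters so much in this setting.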
in this research, we particularly explore an optimised mapping of nodes that have to be physically and statically placed, in advance, in one of the available rack slots in the dc. this forms a matrix of computing units with specific characteristics and a certain resource availability level at a given time t. the goal is to create a list of candidate nodes to deliver the "calculation performance" required by a user's job. when choosing the candidate nodes, the job scheduler evaluates the suitability of the thermally cooler nodes (at the instant t) based on their capability to satisfy the calculation requested by a user (in order to satisfy the user's sla). to enhance the job scheduler's decision making, it is essential to know in advance the type of job a user will submit to the node(s) for computation. such insight is provided by several years' worth of historical data and advanced data analytics using machine learning algorithms. through platform load sharing facility (lsf) accounting data, we code user profiles into macro-categories. this behavioural categorisation provides an opportunity to save energy and to better allocate tasks to cluster nodes in order to reduce overall node temperatures. additionally, when job allocation is evenly distributed, thermal hotspots and cold spots can be avoided; the temperatures of the calculation nodes can be evened out, resulting in a more even distribution of heat across the cluster. based on thermal data, it is necessary to understand in depth what users do and how they solicit the calculation nodes with their jobs. the three main objectives of understanding users' behaviour are as follows: identify parameters based on the diversity of submitted jobs for user profiling; analyse the predictability of various resources (e.g.
cpu, memory, i/o) and identify their time-based usage patterns; and build predictive models for estimating future cpu and memory usage from the historical data held in the lsf platform. abstracting behavioural patterns in job submission and the associated resource consumption is necessary to predict future resource requirements; this is especially vital for dynamic resource provisioning in a dc. a user profile is created from submitted job-related information and, to reiterate, the macro-categories of user profiles are: ) cpu-intensive, ) disk-intensive, ) both cpu- and memory-intensive, or ) neither cpu- nor memory-intensive. a crosstab of the accounting data (provided by the lsf platform) and the resource consumption data guides the calculation of the thresholds that code jobs into distinct utilisation categories. for instance, if the cpu load is high (e.g. larger than %) during almost % of an application's running time, the job can be labelled cpu-intensive. the goal is for the job scheduler to optimise task placement when a job with the same appid (i.e. the same type of job) or the same username is re-submitted to a cluster. in case of a match with a previous appid or username, the relevant utilisation statistics are retrieved from the profiled log. based on the utilisation patterns, this user/application is placed into one of the previously discussed categories. once a job is categorised, a thermally suitable node is selected to satisfy its computation requirements. a task with high cpu and memory requirements is not processed until the node temperature is well under a safe threshold. node temperature here refers to the difference between the node's outlet (exhaust) air temperature and its inlet air temperature (note: the latter generally corresponds to the air temperature in the aisles cooled by the air conditioners). it is necessary to have a snapshot of the relevant thermal parameters (e.g.
temperatures of each component in the calculation nodes) for each cluster to facilitate efficient job allocation by the job scheduler. generally, a snapshot is obtained through direct interrogation of the nodes and of the sensors installed in their vicinity or inside the calculation nodes. for each individual node, the temperatures of the cpus and memories, the instantaneous energy consumption and the speed of the cooling fans are evaluated. undeniably, the highest-priority parameter is the difference between the node's inlet and exhaust air temperatures: a marked difference is evidence that the node is very busy (with jobs that consume a lot of cpu- or memory-related resources). therefore, for each calculation node, the relevant data are monitored in real time and stored in a matrix that represents the state of the entire cluster. each matrix cell represents the state of one node (its relevant parameters). for a new job allocation, the scheduling algorithm chooses a node based on the states depicted in the matrix (e.g. by recency or euclidean distance). through this, the generated waste heat is evenly distributed over the entire "matrix" of calculation nodes, so that hotspots are significantly reduced. additionally, the user profile is an equally important criterion for resource allocation, because user profiles provide insight into user consumption patterns, the type of submitted jobs and their associated parameters. for example, if we know that a user will run cpu-intensive jobs for h, we allocate the job to a "cell" (calculation node), or to a group of cells when the request requires many calculation nodes, that are physically well distributed or in antipodal locations. this selection strategy aims to spread out the high-density nodes, and with them the necessary cooling load.
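The state matrix just described can be sketched as a grid of per-node records indexed by rack position. The selection rule below, picking the cell with the smallest exhaust-inlet difference (a delta-T proxy for load), is one plausible reading of the text's selection criteria; the field names are illustrative assumptions.

```python
def least_loaded_cell(matrix):
    """Return the (row, col) of the node with the smallest exhaust-minus-inlet
    temperature difference; a small delta suggests a lightly loaded node.

    `matrix` maps (row, col) rack positions to per-node state dicts.
    """
    return min(matrix, key=lambda rc: matrix[rc]["t_out"] - matrix[rc]["t_in"])

matrix = {
    (0, 0): {"t_in": 21.0, "t_out": 35.0},  # busy node: large delta
    (0, 1): {"t_in": 21.5, "t_out": 24.0},  # lightly loaded: small delta
    (1, 0): {"t_in": 20.5, "t_out": 30.0},
}
assert least_loaded_cell(matrix) == (0, 1)
```

Routing new jobs to low-delta cells, especially cells far apart in the rack grid, is what spreads the waste heat over the whole matrix.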
this helps minimise dc hotspots and ensures efficient cooling with a reduction in cooling-related energy consumption. as previously discussed, we create user profiles from submitted job-related information. these profiles are dynamic: they are constantly revised according to the user's resource consumption behaviour. for example, a user may have been classified as "cpu-intensive" for a certain period; if the user's submitted jobs are no longer cpu-intensive, the user is re-categorised. the deployment of the thermal-aware job scheduler aims to reduce the overall cpu/memory temperatures and the outlet temperatures of the cluster nodes. the following principles guide the design and implementation of the scheduler: ) job categories: assign an incoming job to one of these categories: cpu-intensive, memory-intensive, neither cpu- nor memory-intensive, or both cpu- and memory-intensive; ) utilisation monitoring: monitor cpu and memory utilisation while making scheduling decisions; ) redline temperature control: ensure cpus and memory operate under their threshold temperatures; ) average temperature maintenance: monitor the average cpu and memory temperatures in a node and manage the average exhaust air temperature across the cluster. to reiterate, user profile categorisation is facilitated by maintaining a log profile of both cpu and memory utilisation for every job (with its associated user) processed in the cluster. a log file contains the following user-related information: ( ) user id; ( ) application identifier; ( ) the number of submitted jobs; ( ) cpu utilisation; ( ) memory utilisation. a list of important thermal-management terms is as follows: ) cpu-intensive: applications that are computation-intensive (i.e.
require a lot of processing time); ) memory-intensive: applications in which a significant portion of the work involves ram processing and disk operations; ) maximum (redline) temperature: the maximum operating temperature specified by a device manufacturer or a system administrator; ) inlet air temperature: the temperature of the air flowing into a node (i.e. the air sucked in from the front of the node); ) exhaust air temperature: the temperature of the air coming out of a node (i.e. the air extracted from the rear of the node). by applying these criteria, we have built an automated procedure that infers a user's categories from present and historical data. the algorithm always compares a job just submitted by a user against that user's time series, if any. if the application launched, or the type of job submitted, remains the same, the user is grouped into one of the categories by a supervised learning algorithm. during each job execution, the temperature variations of the cpus and memories are recorded at pre-established time intervals. finally, the procedure continuously refines the user's behavioural profile using the average length of time the user's jobs run. this yields a more accurate user (and job) profile, because it provides reliable information on the type of job processed on a calculation node and its total processing time. the job scheduler exploits such information for better job placement within an ideal array of calculation nodes in the cluster. a preliminary study was conducted to provide insight into the functioning of the clusters. for months, we observed the power consumption ( fig. ) and temperature (fig. ) profiles of the nodes under workload. we have depicted the energy consumed by the various server components (cpu, memory, other) in fig.
and presented a graph that highlights the difference in energy consumption between idle and active nodes (fig. ). it is observed that, for each node, an increase in load produces an increase in the temperature difference between inlet and exhaust air for that node. figure depicts the average observed inlet air temperature (blue segment, in the cold aisle) and the exhaust air temperature at the rear (amaranth segment, in the hot aisle). note that temperature measurements are also taken at the two cpus of every node. the setpoints of the cooling system are approximately °c at the output and °c at the input of the cooling system, shown in fig. as blue and red vertical lines respectively. however, the lower setpoint appears to be variable (supply air at - °c), while the higher setpoint varies from - °c. as the graph shows, the cold aisle maintains the setpoint temperature at the inlet of the node, which confirms the efficient design of the cold aisle (i.e. the plastic panels isolating the cold aisle from the other spaces in the it room). the exhaust air temperature, however, registered on average °c above the hot-aisle setpoint. notably, the exhaust temperature sensors are located directly at the rear of the node (i.e. in the hottest parts of the hot aisle). hotspots are therefore found immediately at the back of the server racks, while the hot-aisle air is cooled down to - °c. this is due to the crac (computer room air conditioning) cooling system, which drives hot-air intake, air circulation and cold-hot air mixing in the hot aisle. meanwhile, the aforementioned temperature difference of °c between the hotspots and the ambient temperature reveals a weak point of the cooling system: it cannot cool the hotspots directly. in the long term, the constant presence of hotspots might affect server performance (i.e.
thermal degradation), which should be carefully addressed by the dc operator. remarkably, although hotspots are present at the rear of the nodes, the cooling system does cool the air around the nodes. cold air flows through the node and is measured at the inlet, then at the cpu and cpu locations (directly on the cpus) and, finally, at the exhaust point of the server. the differences between the temperature ranges observed at these locations are averaged over all the nodes. an investigation of the observed temperature distribution contributes to the overall understanding of the thermal characteristics, as it provides an overview of the prevailing temperatures shown in fig. and fig. . for every type of thermal sensor, the temperature values are recorded as integers, so the percentage of occurrences of each value can be calculated. the inlet air temperature registers around °c in the majority of cases and rises up to °c in around . % of cases. it can be concluded that the cold-aisle temperature remains around the - °c setpoint for most of the monitored period. the exhaust temperature and the cpu temperatures lie in the range - °c, with the most frequently monitored values in the - °c interval. although these observations may include measurement errors, they reveal servers at risk of frequent overheating when benchmarked against the manufacturer's recommended data sheets. additionally, this study examines the variation between subsequent thermal measurements, with the aim of exploring temperature stability around the nodes. all temperature types have a distinct peak at zero variation, which decreases symmetrically and approximates a gaussian distribution. it can be concluded that temperatures tend to be stable in the majority of monitored cases. however, the graphs for the exhaust and cpu temperature variations (fig. ) reveal that fewer than .
% of the recorded measurements show air-temperature changes with an amplitude of °c or more at the corresponding locations. sudden, infrequent temperature fluctuations are less dangerous than prolonged periods of constantly high temperature. nevertheless, further investigation is needed to uncover the causes of abrupt temperature changes, so that dc operators can take appropriate measures to maintain prolonged periods of favourable conditions. we propose a scheduler upgrade that aims to optimise cpu- and memory-related resource allocation, as well as exhaust air temperatures, without relying on profile information. the prescribed targets for the proposed job scheduler are shown in table . the design of the proposed job scheduler must address four issues: ) differentiate between cpu-intensive tasks and memory-intensive tasks; ) consider cpu and memory utilisation during the scheduling process; ) keep cpu and memory temperatures under the redline threshold temperatures; ) minimise the average exhaust air temperature of the nodes to reduce cooling cost. the job scheduler receives feedback on node status by querying the confluent platform [ ] (monitoring software installed on each node). when all nodes are busy, the job scheduler manages the temperatures and embarks on a load-balancing procedure by keeping track of the coolest nodes in the cluster. in doing so, the scheduler continues job executions even in hot yet undamaging conditions. the job scheduler maintains the average cluster cpu and memory utilisation, denoted u_{cpuavg} and u_{memavg}, and the average cpu and memory temperatures, denoted t_{cpuavg} and t_{memavg}, respectively. the goal of our enhanced job scheduler is to maximise the cop (coefficient of performance). the constraints (at the node level) for our enhanced scheduler are as follows: each job is assigned to at most one node;
minimise the response time of each job; and, with the first and second constraints satisfied, ensure that the memory and cpu temperatures remain below their threshold temperatures — if a cluster's nodes exceed the redline threshold, optimise the temperature by assigning jobs to the coolest node in the cluster. the third constraint specifies that if the average memory or cpu temperature rises above the maximum temperature, the scheduler should stop scheduling tasks, as hardware failures might otherwise occur. the fourth constraint states that the exhaust air temperature of a node should be no greater than the average exhaust air temperature of the cluster (over its n nodes). the fifth constraint ensures that a node gets at most one job at a single point in time. the last constraint aims at reducing job completion time to achieve optimal performance. our algorithm is described as follows:

**** matrix of nodes with position r-ow and c-olumn ****
cluster = matrix[r, c]
user = getuserfromsubmittedjob_in_lsf
jobtype = getjobprofile(user)

**** push the utilisation and temperature values for cpu and memory into the matrix ****
for (i = ; i <= number_of_node; i++) do
    nodename = getnodename(i)
    u_i_cpu = getcpu_utilization(nodename)
    u_i_memory = getmemory_utilization(nodename)
    t_i_cpu = getcpu_temperature(nodename)
    t_i_memory = getmemory_temperature(nodename)
end for

**** if the user is not profiled, infer the job type at run time ****
if jobtype = null then
    if (ucpu <= u_threshold_cpu) && (umemory <= u_threshold_memory) then
        jobtype = easyjob
    else if (ucpu > u_threshold_cpu) && (umemory < u_threshold_memory) then
        jobtype = cpuintensivejob
    else if (ucpu <= u_threshold_cpu) && (umemory > u_threshold_memory) then
        jobtype = memoryintensivejob
    else
        jobtype = cpu&memoryintensivejob
    end if
end if

**** find the candidate nodes for each type of job ****
avgtempcluster = avgtemp(cluster)
mint_nodename = gettempnodename(mintemp(cluster))
maxt_nodename = gettempnodename(maxtemp(cluster))

**** temperature intervals for candidate nodes ****
bestcpuintensivenode = getnode(mint_nodename, mint_nodename + %)
bestmemoryintensivenode = getnode(mint_nodename + %, mint_nodename + %)
bestcpu&memoryintensivenode = getnode(mint_nodename + %, mint_nodename + %)
besteasyjob = getnode(maxt_nodename, maxt_nodename - %)

**** job assignment ****
if jobtype = cpuintensivejob then
    assignjob(bestcpuintensivenode)
else if jobtype = memoryintensivejob then
    assignjob(bestmemoryintensivenode)
else if jobtype = cpu&memoryintensivejob then
    assignjob(bestcpu&memoryintensivenode)
else
    assignjob(besteasyjob)
end if

the algorithm populates the node matrix according to the physical arrangement of every node inside the racks. first, it obtains the profile of the user who submits the resource request, retrieving it from a list of stored profiles. the algorithm is executed over all nodes to assess the resource utilisation level and temperature profile of each node. if a user profile does not exist (i.e. the user is executing a job for the first time), the algorithm computes a profile on the fly. all the threshold values indicated are operating values calculated for each cluster configuration, and they are periodically recalculated and revised according to the use of the cluster nodes. subsequently, some temperature calculations are made from the current state of the cluster (through a snapshot of its thermal profile). finally, the job is assigned to a node based on the expected type of job. through this, the algorithm helps avert the emergence of hotspots and cold spots by distributing jobs uniformly across the cluster. in order to support sustainable development goals, energy efficiency ought to be the ultimate goal for a dc with a sizeable high-performance computing facility.
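As a runnable sketch of the assignment step above: rank nodes by current temperature and steer each job category to a band of that ranking. Since the percentage offsets are elided in the text, the band boundaries below are placeholders, and the function and field names are illustrative, not the paper's implementation.

```python
def schedule(cluster_temps, jobtype):
    """Pick a node for a job from a thermal snapshot of the cluster.

    `cluster_temps` maps node name -> current temperature (Celsius).
    Heat-producing jobs go to the coolest band; easy jobs reuse warm
    nodes. Band widths stand in for the paper's elided percentages.
    """
    ranked = sorted(cluster_temps, key=cluster_temps.get)  # coolest first
    quarter = max(1, len(ranked) // 4)
    bands = {
        "cpuintensivejob":        ranked[:quarter],
        "memoryintensivejob":     ranked[quarter:2 * quarter],
        "cpu&memoryintensivejob": ranked[:quarter],
        "easyjob":                ranked[-quarter:],        # warmest nodes
    }
    return bands[jobtype][0]

temps = {"n1": 30.0, "n2": 24.0, "n3": 36.0, "n4": 27.0}
assert schedule(temps, "cpuintensivejob") == "n2"   # coolest node
assert schedule(temps, "easyjob") == "n3"           # warmest node
```

Sending trivial jobs to already-warm nodes while reserving the coolest band for heat-producing jobs is what evens out the thermal load and suppresses both hotspots and cold spots.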
to reiterate, this work primarily focuses on two major aspects: it equipment energy productivity, and the thermal characteristics of an it room and its infrastructure. the findings of this research are based on the analysis of the available monitored thermal-characteristics data for cresco . these findings feed into recommendations for enhanced thermal design and load management. in this research, clustering performed on big datasets of cresco it room temperature measurements grouped the nodes into clusters based on their thermal ranges, and then uncovered the clusters they most frequently fell into during the observation period. additionally, a data mining algorithm was employed to locate the hotspots: approximately % of the nodes were frequently placed in the hot-range category (and thus labelled as hotspots). several measures to mitigate the risks associated with hotspots have been recommended: more efficient directional cooling, load management, and continuous monitoring of the it room's thermal conditions. this research brings two positive effects in terms of dc energy efficiency. firstly, as a thermal design pitfall, hotspots pose a risk of local overheating and thermal degradation of servers through prolonged exposure to high temperatures; information on hotspot localisation can facilitate better thermal management of the it room, where waste heat is evenly distributed, and this ought to be the focus of enhanced thermal management in the future. secondly, we discussed ways to avert hotspots through thermal-aware resource allocation (i.e. selecting the coolest node for a new incoming job) and through the selection of nodes, for a particular job, that are physically distributed throughout the it room.

references:
- growth in data center electricity use
- greenpeace: how dirty is your data?
a look at the energy choices that
- metrics for sustainable data centers
- recent thermal management techniques for microprocessors
- a cyber-physical systems approach to data center modeling and control for energy efficiency
- thermal aware workload placement with task-temperature profiles in a datacenter
- thermal guidelines for data processing environments: expanded data center classes and usage guidance
- green data centers: a survey, perspectives, and future directions. arxiv
- measuring energy efficiency in data centers. in: pervasive computing
- next generation platforms for intelligent data collection
- data center, a cyber-physical system: improving energy efficiency through the power management
- atac: ambient temperature-aware capping for power efficient datacenters
- thermal metrics for data centers: a critical review
- review on performance metrics for energy efficiency in data center: the role of thermal management
- confluent site
- optimized thermal-aware job scheduling and control of data centers
- energy efficiency of thermal-aware job scheduling algorithms under various cooling models

key: cord- -odnlt fr
authors: wu, jia; chen, zhigang
title: reducing energy consumption and overhead based on mobile health in big data opportunistic networks
date: - -
journal: wirel pers commun
doi: . /s - - -
sha:
doc_id:
cord_uid: odnlt fr

a great number of people and unevenly distributed medical resources have become a serious contradiction in developing countries. this not only affects people's lives but also causes serious epidemics, because patients cannot get help from a hospital in time. with the development of wireless communication networks, patients may obtain medical information through wireless network devices, which can ease the contradiction between patients and medical resources. but in developing countries the population itself is big data, and how to handle data packets in a wireless communication network is a major problem when researchers face such a huge population.
in order to solve some problems of big data communication, this paper proposes an availability data transmission routing algorithm. the algorithm reduces energy consumption and overhead, and improves the delivery ratio, in big data communication. compared with the spray and wait and binary spray and wait algorithms in opportunistic networks, it achieves good results in energy consumption, overhead and delivery ratio. life in developing countries cannot be fully protected by medicine, because medical technology is underdeveloped and the population is large. one result is that patients with a light illness may worsen and may even cause a disastrous infection; developing countries then have to spend a great amount of personnel and finance to solve the problem. china, as a representative developing country in asia, suffered a disaster because of the uneven distribution of limited medical resources and its huge population. in , the sars virus, which originated in china [ ], affected the whole of asia and caused serious consequences: thousands of people were affected and many of them died. if the early affected patients had had the chance to consult doctors and receive treatment, they might not have become serious cases, and the disastrous infection could have been prevented. in china, a country with a population of more than . billion, the shortage of health resources is serious: people share one doctor on average, and a doctor has to treat up to patients a day, according to statistics from china's ministry of health [ ]. there are further figures: in a hospital in a big city, the average number of treatments reaches million people a year, and a better hospital holds . million treatments per year. there is another record in china: there are more than million mobile phone users. this is big data in a mobile network. with the development of g and g mobile technology, the mobile phone signal covers every city.
people can communicate with each other and look up information on mobile devices wherever there is a g or g signal [ ]. everyone can check medical information, such as health care, epidemic prevention and hospital equipment, on a mobile device once hospital data are shared on a web platform. in this way, medical resources can be utilised to the fullest and serve patients better. at the same time, symptoms can be sent to a doctor immediately via the mobile device, from which the doctor can make a judgement and give an electronic treatment to the patient. under these conditions, hospitals are open to millions of people; this is a revolution in it technology. there are, however, problems: the mobile network may not be able to transmit such big data successfully, the data may not arrive on time, and the mobile terminal may not be able to store so much data. a new transmission network is therefore needed. as a kind of ad hoc network, the greatest characteristic of the opportunistic network [ , ] is that transmission among nodes follows a ''carry-store-transfer'' pattern. if a node is not in the communication area, it stores the information and moves to transmit it to the next-hop node [ ]. no complete link is needed in opportunistic networks: communication is accomplished through node movement, and information transmission is diffusive. this feature makes the opportunistic network suitable for mobile health [ ]: people store health information on mobile devices and transmit it to one another, sharing information anywhere, even without mobile signal coverage [ ]. although the big data problem is common to the whole mobile computing environment and is not limited to health applications, some problems must be considered when mobile health is applied to big data communication in this paper, such as:
how to select available data to reduce energy consumption when a mobile device receives or sends messages? how to improve available data reception within the limited storage space of a mobile device? how to improve the delivery ratio when transmitting electronic medical records? this paper proposes the availability data transmission routing algorithm (adtra) for big data opportunistic networks. the algorithm judges routing requests and predicts data packets, and can thereby improve overhead, energy consumption and delivery ratio in transmission. research in opportunistic networks focuses on routing algorithms, which have been applied in different fields as researchers improve the characteristics of opportunistic networks. some existing algorithms are as follows. huang et al. [ ] propose the epidemic algorithm. as a store-and-forward mechanism, the epidemic algorithm mimics the transmission mechanism of infectious diseases: when two nodes meet, they exchange the messages that the other does not yet store. given ample network channels, cache space, etc., this ensures that messages reach the target node along the shortest path. in a big data environment, however, congestion occurs during transmission as the number of nodes increases; in mobile health applications over big data, this method cannot achieve good results owing to resource limitations. spyropoulos et al. [ ] propose the spray and wait algorithm, which builds on the epidemic algorithm. the algorithm comprises two steps: a spray step and a wait step. in the spray step, the source node identifies its neighbours and transmits message copies to the surrounding nodes by spraying. in the wait step, when no further available node can be found in the spray step, messages are transmitted to the target node to complete the transmission. this modified algorithm improves on flood transmission.
but the spray step may exhaust source nodes when this method is applied in a big data environment, because of the increasing number of nodes: a great number of neighbours consume much overhead and energy at the source nodes, so the algorithm may cause node death through over-spraying. searle et al. [ ] propose the prophet algorithm for opportunistic networks. it improves network utilisation by counting the available message-transmission nodes in the vicinity and calculating appropriate delivery nodes to form message groups. in a big data environment, however, this counting takes too much time, so the algorithm is not suitable for mobile health big data. the same holds for the mv algorithm: cui et al. [ ] propose the probability-based mv algorithm, which calculates transmission probability from statistics recorded on node meetings and area visits. burgess et al. [ ] propose the maxprop algorithm, founded on priority-ordered arrays. its feature is that the transmission sequence is determined by the settled array priority when two nodes meet. this method can reduce resource consumption by setting a reasonable message-transmission sequence, but in big data transmission it is hard to establish such a sequence because node movement lacks a regular pattern. musolesi et al. [ ] propose a context-aware routing algorithm based on calculating the probability of source nodes reaching target nodes. it finds the intermediate node by cyclically exchanging and calculating transmission probabilities, then collects and groups messages to guide the intermediate node to transmit directly to the node with the higher transmission probability. this algorithm lacks a high delivery ratio as nodes increase, so in a big data environment it is hard to achieve a high delivery ratio.
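As a concrete illustration of the Spray and Wait family discussed above, the binary variant halves a message's copy budget at each encounter during the spray phase and falls back to direct delivery once only one copy remains. This sketch assumes a simple per-message copy-count field, which is the standard formulation rather than anything specific to this paper.

```python
def binary_spray(carrier_copies: int) -> tuple:
    """Split a message's copy budget on an encounter (binary spray phase).

    Returns (copies kept by the carrier, copies handed to the peer).
    With one copy left, the node is in the wait phase: it keeps the copy
    and only delivers it when it meets the destination directly.
    """
    if carrier_copies <= 1:
        return (carrier_copies, 0)          # wait phase: no further spraying
    handed = carrier_copies // 2
    return (carrier_copies - handed, handed)

copies, hops = 8, 0
while copies > 1:                           # spray until a single copy remains
    copies, given = binary_spray(copies)
    hops += 1
assert (copies, hops) == (1, 3)             # 8 -> 4 -> 2 -> 1
```

Halving the budget bounds the total number of copies in the network, which is exactly the overhead control that pure Epidemic flooding lacks.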
from the discussion above, adtra needs to address energy consumption, overhead and delivery ratio in a big data environment. the next question is how to design the algorithm. against the background of big data mobile health, how to handle data transmission among hospitals, doctors, patients and their relations is an important problem. patients and their relations use mobile devices to receive and send messages, and those devices have limited storage space. the electronic medical record of each patient occupies more than gb of storage. if all messages were received or sent by the device, a great deal of energy and overhead would be consumed; meanwhile, big data in transmission would reduce the delivery ratio because of data loss. this problem is discussed next. in the real world, patients' personal data are confidential, and diagnosis suggestions from doctors are likewise protected. thus, the effective data in mobile health are only the predicted disease and the therapeutic schedule; especially in underdeveloped areas, these help patients and doctors judge the disease. in this paper, effective data in mobile health comprise diagnostic pictures, diagnosis collections from the clinic, historical diagnosis data, diagnostic reports, electronic patient records and so on. the model design for the big data mobile health environment is shown in fig. , where p denotes patients and r denotes patients' relations. when a patient establishes a relationship with a hospital, the hospital or its doctors send messages to the patient's mobile device, as long as the patient is in the hospital's communication area. patients receive the messages from the hospital and can then forward the data to their own relations. in this way, patients need not go to a hospital to look over their inspection results, and the load on hospitals is relieved. this communication model resembles an opportunistic network: all messages are transmitted without a complete link.
given the characteristics of opportunistic networks, the model design can adopt an opportunistic network as its communication method. in the opportunistic network, all the roles become nodes: they can transmit messages without a link, and messages can be stored on the devices. this process is shown in fig. , which depicts data transmission in the big data mobile health environment. effective data packets are recorded by sensor devices and transmitted to mobile devices. after data collection, all the data can be sent to the hospital and doctors for analysis. the hospital and doctors receive the patient's request through mobile devices; the request contains the patient's check indicators, illness description, patient requirements and so on. doctors diagnose the patient's condition from the indices on the mobile device and then issue a treatment list for the patient. patients can download the list and look over the diagnostic results; at the same time, they may send these results to their relations. the whole process can be carried out over a g or g network environment, so users can complete a diagnosis with only a mobile device. patients and doctors are nodes in the big data opportunistic network, and they transmit information by moving. with this communication method, patients and their relations need not go to hospitals, saving money and time, while hospitals reduce population pressure and maintain work efficiency. in countries with large populations, this model may also reduce spending on hospital beds, because patients can receive medical service at home through their mobile devices. in big data communication, the storage space of mobile devices is limited: we cannot receive all the data when a great number of data packets are created in the clinic, so selecting effective data becomes important work. figure shows diagnostic pictures from pet-ct. it is formed of pictures: eight contain yellow-bright areas (suv concentration), and four display the scanning location.
To judge pathological changes, doctors need only four of the pictures in Fig. , and can mark the effective images (pictures , , and are circled; picture shows the maximum concentration area). Accurate marking supports doctors' decision-making. In real conditions, however, each patient may have many more pictures, of which only a few are adopted by doctors; the effective data rate is only . %. Moreover, the pictures occupy approximately . GB of storage on a device. In wireless communication, if doctors or hospitals sent all pictures to patients, a great deal of network resource would be wasted; in a big data environment especially, with masses of people joining the transmission, there is not enough space to store the effective information. Figure shows data collected in the clinic. For one patient in hospital, MB per day are stored in the database, yet only % of the data are abnormal. That is, only % of what is transmitted helps the patient and doctor judge the disease, and % of the data are useless. To keep the % of effective abnormal data, a great amount of space must be wasted. Not only that: a patient in the clinic may appear normal during the day, as in the morning readings of Fig. , but at night, within a short moment, heart rate, blood pressure, and blood supply can change very quickly, which is dangerous for the patient. Such data are important for saving lives and analyzing the illness. As these figures show, it is essential to select effective data when we aim to save storage space and transmit effective data packets, especially in a big data communication environment; how to make data decisions and select effective data is therefore the key problem. In big data communication, a great number of electronic medical records (EMRs) are exchanged between wireless devices. Table shows the EMR of one hospital patient. As the table indicates, many types of documents are recorded; a single EMR may occupy GB of storage, and its documents span different types in the database, making direct diagnosis by one or two doctors at a given moment difficult.
Moreover, if every document not yet reviewed by a doctor were transmitted wirelessly, there would not be enough network resources; more seriously, data packets containing important patient records could be lost, and transmitting full EMRs between people is insecure. We must therefore consider how to select effective data within an EMR. Figure shows the data decision tree for an EMR. The root of the decision tree is the patient; the first layer holds the patient's basic information assemblies; the second layer holds sub-assemblies recording detailed information; and the third layer is the diagnosis category, which contains many kinds of parameters. In the hospital, this process is carried out by medical treatment machines. A traditional wireless method would deliver all diagnosis categories to patients and doctors; to reduce transmission energy and overhead, the new method must cut down the number of transmitted data packets. We set a normal range for each parameter in the decision layer, so that devices can separate abnormal data from normal parameters. The resulting decision data form assemblies k_i through k_n, which help the doctor and patient judge the illness. Figure shows how data assemblies are exchanged between patients and doctors. Patients send requests as data packets containing diagnostic reports, the electronic patient record, and image information. On receiving these packets, doctors judge the patient's condition and return a treatment recommendation; at the same time, they may select the part of the effective data that records the patient's results from the medical instruments. The packets k_i through k_n come from the EMR data decision. When doctors receive a patient's request, they may recommend only the important result packets, such as k_i through k_j and k_m through k_n; in this way, patients need not download all the data packets and can still understand the therapeutic regimen clearly.
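The decision layer described above amounts to a simple filter: each diagnostic parameter is compared against a configured normal range, and only abnormal readings are kept for the decision assemblies. The sketch below is an illustration of that idea; the parameter names and ranges are assumptions, not values from the paper.

```python
# Sketch of the EMR decision layer: keep only parameters outside their
# configured normal range. Parameter names and ranges are illustrative.
NORMAL_RANGES = {
    "heart_rate": (60, 100),   # beats per minute
    "systolic_bp": (90, 140),  # mmHg
    "spo2": (95, 100),         # percent
}

def select_effective_data(readings):
    """Return the abnormal (effective) readings from one diagnosis category."""
    effective = {}
    for name, value in readings.items():
        low, high = NORMAL_RANGES.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            effective[name] = value
    return effective

readings = {"heart_rate": 120, "systolic_bp": 110, "spo2": 92}
print(select_effective_data(readings))  # only the abnormal parameters remain
```

Only the filtered dictionary would need to be packed into data packets and transmitted, which is what reduces the packet count in the scheme above.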
The exchange of data packets is completed by an APP device (accelerated parallel processing device). Effective data packets reduce energy consumption and overhead, and patients can select them on a mobile device from anywhere at any time in the big data environment. This is not only convenient for receiving and sending messages, but also broadens the applications of mobile devices and mobile health. The communication resembles an opportunistic network: all messages are transmitted without a persistent link. The next task is to design an effective scheme for big data mobile health over an opportunistic network. In opportunistic networks, the characteristic of the spray and wait algorithm [ , ] is that nodes transmit information to neighbor nodes by spraying data packets. This algorithm has serious problems. First, since the storage space of neighbor nodes is limited, they may not be able to receive all the data packets, which causes data loss. Second, that loss leads to a low delivery ratio. Third, with multiple neighbor nodes, spraying incurs great routing overhead at the source node. Spray and wait is therefore not an optimal choice. The binary spray and wait algorithm [ ] is an optimization of spray and wait: the source node sends half of its data packets to the neighbor nodes and stores the rest, which reduces routing overhead. However, it can be adopted only in networks with few nodes; as the number of nodes and data packets grows, the problems of spray and wait reappear. This paper therefore designs ADTRA. The algorithm improves the receiving and sending of data packets when they are transmitted in a big data opportunistic network for the mobile health environment; the data transmission process follows Fig. . That is, the full set of data packets from doctors and patients forms an assembly V, written as {k_i, k_{i+1}, ……, k_n}.
Then, in ADTRA, the effective data packets exchanged between doctors and patients form the assembly V1, which contains {k_i, ……, k_j} and {k_m, ……, k_n}; the relationship between V1 and V is V1 ⊆ V. Each element of V and V1 comprises several data packets: if the assembly V contains p data packets and V1 contains p1, then V1 ⊆ V implies p1 ≤ p, where p1 counts the effective data packets sent from doctors to patients. The next step compares the effective data packets with the full set of data packets in terms of energy consumption, overhead, and delivery ratio in a big data opportunistic network. Mobile devices consume energy in big data mobile health whenever they send and receive messages; if the required energy exceeds the node's stored energy, the node dies, i.e. the device stops working. Messages can be sent and received only on the condition E_b ≤ E_0, where E_b is the energy consumed sending data packets and E_0 is the energy stored by the node. Energy consumption in an opportunistic network comes from data transmission, signal processing, and hardware operation. For data transmission in particular, we consider the energy consumed by scanning, sending, and receiving, and these factors establish the following models. (1) Scanning: each scan consumes e_s, with scan period T, so the scanning energy of a node is E_s = (t/T) · e_s, where t is the node's working time. (2) Transmission: each sent data packet consumes e_t; with p_t packets transmitted, the transmission energy is E_t = p_t · e_t. (3) Receiving: the condition is the same for packets received by the node; each received packet consumes e_r, so with p_r packets received, the receiving energy is E_r = p_r · e_r. From these equations, the total energy consumption is E_c = E_s + E_t + E_r and the surplus energy is E_sur = E_0 − E_c; energy can thus be calculated easily from the established energy assessment model.
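The energy assessment model above can be written out directly: E_c = E_s + E_t + E_r, with per-scan, per-packet-sent, and per-packet-received unit costs, and E_sur = E_0 − E_c. The sketch below uses purely illustrative numeric values.

```python
def energy_consumption(work_time, scan_period, e_scan,
                       p_sent, e_send, p_recv, e_recv):
    """Total energy E_c = E_s + E_t + E_r from the assessment model."""
    e_s = (work_time / scan_period) * e_scan  # scanning energy
    e_t = p_sent * e_send                     # transmission energy
    e_r = p_recv * e_recv                     # receiving energy
    return e_s + e_t + e_r

def surplus_energy(e0, e_c):
    """Surplus energy E_sur = E_0 - E_c remaining at the node."""
    return e0 - e_c

# Illustrative values: one hour of work, a scan every 60 s, 100 packets
# sent and 80 received.
e_c = energy_consumption(work_time=3600, scan_period=60, e_scan=0.5,
                         p_sent=100, e_send=0.2, p_recv=80, e_recv=0.1)
print(e_c, surplus_energy(100.0, e_c))
```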
From the energy assessment model, we can derive a functional relation between a node and its neighbors when they exchange messages. In an opportunistic network, each packet a node sends to a neighbor consumes energy E_s + E_t. If a node has n neighbors and sends p_r data packets to each neighbor, the energy consumption of the node is E = n · p_r · (E_s + E_t). In ADTRA, p_r is decided by the neighbors' requests; in the spray and wait algorithm, each node sends the same uniform number p. It is obvious that p_r ≤ p, because in spray and wait a node forwards all the packets it carries to its neighbors. In binary spray and wait, each node sends the uniform number p/2. If the effective data packets in ADTRA are at most half of the total, i.e. p_r ≤ p/2, the energy consumption of ADTRA is no greater than that of binary spray and wait, and it is strictly less whenever the effective data packets are fewer than half of the total. There is a close relationship between overhead and packet transmission: when a node sends data packets to its neighbors, it incurs overhead. Overhead in an opportunistic network normally reflects the load level of a node; high overhead brings a low delivery ratio and high transmission delay, and the more packets are transmitted, the more overhead the node incurs. Especially in the big data mobile health environment, reducing overhead improves network quality. Assume that doctors, patients, and hospitals are all nodes, and let o_1 be the overhead of one transmission. If a node has n neighbors and sends p_r data packets to each, its overhead in the opportunistic network is O = n · p_r · o_1. The situation mirrors energy consumption: spray and wait sends the uniform number p per node and binary spray and wait the uniform number p/2. The overhead of ADTRA is less than that of spray and wait because p_r ≤ p, and it is no greater than that of binary spray and wait when the effective data packets are at most half of the total, i.e. p_r ≤ p/2. The packet delivery ratio is the most important parameter of the network.
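The comparison among ADTRA, spray and wait, and binary spray and wait reduces to comparing p_r with p and p/2 inside the common cost form n · p_r · (unit cost), which applies to both energy and overhead. A minimal sketch with illustrative numbers:

```python
def node_cost(n_neighbors, packets_per_neighbor, unit_cost):
    """Per-node cost (energy or overhead): n * p_r * unit cost."""
    return n_neighbors * packets_per_neighbor * unit_cost

p = 40            # packets carried by the node (illustrative)
p_effective = 12  # effective packets requested by neighbors in ADTRA
unit = 0.2        # E_s + E_t per packet, or o_1 for overhead
n = 5             # number of neighbors

cost_sw = node_cost(n, p, unit)        # spray and wait forwards all p
cost_bsw = node_cost(n, p // 2, unit)  # binary spray and wait forwards p/2
cost_adtra = node_cost(n, p_effective, unit)

assert cost_adtra <= cost_sw
# ADTRA beats binary spray and wait whenever p_r < p/2:
assert p_effective < p // 2 and cost_adtra < cost_bsw
```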
In the big data mobile health environment, a high delivery ratio guarantees that messages are sent and received successfully between patients and doctors, which in turn protects the integrity of the patient's electronic medical record and the doctor's diagnostic results. The delivery ratio is related to energy consumption and overhead because node storage and energy are limited: nodes in an opportunistic network cannot provide service when overhead floods them or their energy is exhausted, just as a mobile device cannot operate normally in that state. The delivery ratio is therefore founded on a function of a triple: d_{peer-to-peer} = f(E_k, O_k, p_k) gives the delivery ratio between two nodes, where N is the number of nodes, and the average delivery ratio between nodes follows from Eqs. ( ) and ( ). Since the energy consumption E_k and overhead O_k are themselves functions of p_k by Eqs. ( ) and ( ), Eq. ( ) can be rewritten so that the delivery ratio is a function of the number of data packets p by Eqs. ( ), ( ), ( ). If messages are delivered by nodes with enough energy and acceptable overhead, the delivery ratio is decided by the ratio of packets received by the neighbor to packets sent by the node, that is, d = d_r / d_s; over the whole network, the average delivery ratio D is the mean of d over all node pairs. This average delivery ratio is tested and verified by simulation. In network communication, transmission delay is also a problem; in a mobile health environment especially, reducing the delay of EMR transmission between devices gives patients more rescue time. If transmitting a packet from a node to its neighbor takes time t_0, and the actual transmission time between the two nodes is t_n, the delay is t_del = t_n − t_0. From this equation we can calculate the delay between nodes when they transmit and receive data packets. The algorithm scheme can now be designed from the models above; the availability data transmission routing algorithm is given in Table . In this algorithm, the time complexity of both overhead and energy consumption is O(n), and the delivery ratio can be calculated from the sent data packets, the overhead, and the energy consumption.
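The delivery-ratio and delay relations above reduce to d = d_r / d_s per link, a network-wide average, and t_del = t_n − t_0. A small sketch with hypothetical per-link statistics:

```python
def link_delivery_ratio(received, sent):
    """d = d_r / d_s for one node-to-neighbor link."""
    return received / sent if sent else 0.0

def average_delivery_ratio(links):
    """Network-wide average of the per-link delivery ratios."""
    ratios = [link_delivery_ratio(r, s) for r, s in links]
    return sum(ratios) / len(ratios)

def transmission_delay(actual_time, ideal_time):
    """t_del = t_n - t_0."""
    return actual_time - ideal_time

links = [(8, 10), (5, 10), (9, 10)]   # (received, sent) per link, illustrative
print(average_delivery_ratio(links))  # mean of 0.8, 0.5, 0.9
print(transmission_delay(2.5, 1.0))   # 1.5
```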
This time complexity is better than the O(n^2) of spray and wait and binary spray and wait. The simulation adopts the Opportunistic Networking Environment (ONE) simulator [ ] and its extended opportunistic network model to compare the energy consumption of these algorithms. Parameters are set according to the random model of the opportunistic network and are listed in Table . In this paper, the data packets in transmission are various kinds of EMRs (electronic medical records); to keep the environment realistic, we adopt EMRs from a hospital, with the corresponding parameters also given in Table . The next step is simulation result analysis. After the algorithm design and simulation setup, we compare three algorithms: spray and wait (S&W), binary spray and wait (BS&W), and ADTRA. Figure shows the relationship between energy consumption and the number of nodes. Until the number of nodes participating in transmission reaches , the energy consumption of ADTRA is greater than that of BS&W; once the number of nodes exceeds , ADTRA consumes less energy than BS&W, so ADTRA saves energy under those conditions, and at nodes its consumption is the lowest. BS&W consumes less energy than S&W. Both ADTRA and BS&W achieve good results, but ADTRA is the more suitable in a big data environment. Figure shows the relationship between routing overhead and the number of nodes. In the simulation, the routing overhead of S&W is very high: at nodes it is around , and at nodes it exceeds , so S&W is clearly unsuitable for application in a big data environment. The overhead of ADTRA and BS&W does not increase noticeably. With fewer than nodes, BS&W has smaller routing overhead than ADTRA; with more than nodes, ADTRA's routing overhead is lower, and when the number of participating nodes reaches , ADTRA's routing overhead is % of BS&W's consumption, showing that ADTRA has the better optimization effect.
Figure shows the relationship between delivery ratio and the number of nodes. In the initial stage, with participating nodes, the delivery ratios of ADTRA and BS&W exceed % and that of S&W also exceeds %. As nodes are added, the delivery ratios of all three algorithms rise, suggesting that in an opportunistic network the delivery ratio improves as the number of nodes increases. When the nodes reach , ADTRA's delivery ratio exceeds %; when they reach , it exceeds %; and when they reach , it exceeds %, showing that with more data ADTRA attains a higher transmission rate. Moreover, when the nodes reach , ADTRA's delivery ratio is . times that of S&W, confirming that ADTRA improves the delivery ratio in transmission. Since the delivery ratio is the most important parameter in the big data mobile health environment, we also send data packets at different time intervals of , , and s; the simulation result is shown in Fig. . At the start of the simulation, the s interval achieves a higher delivery ratio than the and s intervals, but the gap is less than % and the trend continues. When the nodes reach , the and s delivery ratios exceed % while s is stable; when the nodes reach , the and s ratios exceed % and s approaches %. Thus ADTRA attains a higher delivery ratio when data packets are distributed over time, and the longer the packet interval, the higher the delivery ratio. Figure shows the average end-to-end delay. In the simulation, three samples are collected at different moments as nodes join the transmission. In Fig. , the highest delay occurs in the spray and wait algorithm, exceeding ; binary spray and wait and ADTRA perform better because the multi-copy algorithm loses many packets during transmission. In ADTRA, the end-to-end delay stabilizes at once nodes transmit, because effective data packets are adopted.
Table and Fig. record the surplus energy of the transmitting nodes in the simulation. All the nodes are divided into four groups, indexed C-C, T-T, W-W, and R-R on the map, and surplus energy is recorded per hour, averaged over each hundred nodes. The surplus energy of the three algorithms is shown in Table . It shows that spray and wait consumes energy rapidly: all energy is exhausted within h. Binary spray and wait does better: node assembly C-C retains . energy at h, and the other assemblies last more than eight hours. ADTRA is the best algorithm, with surplus energy of approximately at the end of the simulation time; its energy economy can thus extend node lifetime in communication. Figure shows the EMR types transmitted under the three algorithms. In spray and wait (Fig. a), image data reports occupy a great share of the transmission space: over % of the space carries image data packets, because nodes accept packets in order and each node must receive all image data packets from its neighbors, so much space is wasted. In Fig. b the situation is better, because binary spray and wait forwards only half of the data packets from neighbors to nodes, so each node can receive many types of packets. In Fig. c we can see that the EMR types occupy the transmission much more evenly: only % of the space carries image data packets under ADTRA, so far more of the EMR can be transmitted between nodes. In a big data environment, ADTRA can reduce energy consumption and overhead and extend node lifetime; moreover, far more of the EMR can be transmitted when patients and doctors request it. This paper has proposed ADTRA for the big data mobile health environment; it reduces energy consumption and overhead and thereby improves the delivery ratio.
That is to say, the algorithm can run on mobile devices taking part in message transmission, and it achieves good results in the big data mobile health environment. This research can help ease the contradiction between large populations and limited medical resources. In future work, the selection of effective data could be further improved with artificial intelligence and applied on mobile devices.

Zhigang Chen received the BE, MS, and PhD degrees from Central South University, China. He is currently a professor, PhD supervisor, and dean of the School of Software at Central South University. He is also a director and advanced member of the China Computer Federation (CCF) and a member of the CCF Pervasive Computing Committee. His research interests cover cluster computing, parallel and distributed systems, computer security, and wireless networks. He is a senior member of CCF (China Computer Federation) and a member of IEEE and ACM; his research interests include wireless communications and networking, wireless networks, and big data research.

Reducing energy consumption and overhead based on mobile…

References
- Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
- Business ecosystem strategies of mobile network operators in the G era: the case of China Mobile
- The third-generation-mobile (3G) policy and deployment in China: current status, challenges, and prospects
- Usability and evaluation of a deployed G network prototype
- Opportunities in opportunistic computing
- Proposal for efficient routing protocol for wireless sensor network in coal mine goaf
- Efficient and secure routing protocol for wireless sensor networks through SNR based dynamic clustering mechanisms
- EPTR: expected path throughput based routing protocol for wireless mesh network
- A survey of opportunistic networks
- Efficient routing in intermittently connected mobile networks: the multiple-copy case
- Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies
- Energy efficient opportunistic network coding for wireless networks
- MaxProp: routing for vehicle-based disruption-tolerant networks
- CAR: context-aware adaptive routing for delay-tolerant mobile networks
- Spray and wait: an efficient routing scheme for intermittently connected mobile networks
- Spray and wait routing based on average delivery probability in delay tolerant network
- The ONE simulator for DTN protocol evaluation

title: SubRank: Subgraph Embeddings via a Subgraph Proximity Measure
authors: Balalau, Oana; Goyal, Sagar
journal: Advances in Knowledge Discovery and Data Mining

Representation learning for graph data has gained a lot of attention in recent years. However, state-of-the-art research has focused mostly on node embeddings, with little effort dedicated to the closely related task of computing subgraph embeddings. Subgraph embeddings have many applications, such as community detection, cascade prediction, and question answering. In this work, we propose a subgraph-to-subgraph proximity measure as a building block for a subgraph embedding framework. Experiments on real-world datasets show that our approach, SubRank, outperforms state-of-the-art methods on several important data mining tasks.

In recent years we have witnessed the success of graph representation learning in many tasks such as community detection [ , ], link prediction [ , ], graph classification [ ], and cascade growth prediction [ ]. A large body of work has focused on node embeddings, techniques that represent nodes as dense vectors that preserve the properties of nodes in the original graph [ , ]. Representation learning of larger structures has generally been associated with embedding collections of graphs [ ]. Embeddings of paths, subgraphs, and communities have received far less attention despite their importance in graphs.
In homogeneous graphs, subgraph embeddings have been used for community prediction [ , ] and cascade growth prediction [ , ]. In heterogeneous graphs, subgraph embeddings have tackled tasks such as semantic user search [ ] and question answering [ ]. Nevertheless, the techniques proposed in the literature for computing subgraph embeddings have at least one of two drawbacks: (i) they are supervised techniques, and as such depend on annotated data and do not generalize to other tasks; or (ii) they can handle only a specific type of subgraph.

Approach. In this work, we tackle the problem of computing subgraph embeddings in an unsupervised setting, where embeddings are trained on one task and tested on different tasks. We propose a subgraph embedding method based on a novel subgraph proximity measure, inspired by the random walk proximity measure Personalized PageRank [ ]. We show that our subgraph embeddings are comprehensive and achieve competitive performance on three important data mining tasks: community detection, link prediction, and cascade growth prediction.

Contributions. Our salient contributions in this work are:
• we define a novel subgraph-to-subgraph proximity measure;
• we introduce a framework that learns comprehensive subgraph embeddings;
• in a thorough experimental evaluation, we highlight the potential of our method on a variety of data mining tasks.

Node embeddings. Methods for computing node embeddings aim to represent nodes as low-dimensional vectors that summarize their properties, such as their neighborhood. The numerous embedding techniques differ in the computational model and in which properties of nodes are preserved. For example, matrix factorization approaches perform dimension reduction on a matrix that encodes the pairwise proximity of nodes, where proximity is defined as adjacency [ ], k-step transitions [ ], or Katz centrality [ ].
Random walk approaches have been inspired by the important progress achieved in the NLP community on word embeddings [ ]. These techniques optimize node embeddings such that nodes co-occurring in short random walks in the graph have similar embeddings [ , ]. Another successful technique takes as input a node similarity distribution and an embedding similarity distribution, and minimizes the KL-divergence between the two [ , ].

Subgraph embeddings. A natural follow-up question is how to compute embeddings for larger structures in the graph, such as paths, arbitrary subgraphs, motifs, or communities. In [ ], the authors propose a method inspired by ParagraphVector [ ], where each subgraph is represented as a collection of random walks. Subgraph and node embeddings are learned such that, given a subgraph and a random walk, we can predict the next node in the walk using the subgraph embedding and the node embeddings. The approach is tested on link prediction and on community detection, using ego networks to represent nodes. In [ ], the authors present an end-to-end neural framework that, given an input cascade graph, predicts the future growth of the cascade over a given time period. A cascade graph is sampled for a set of random walks, which are fed to a gated neural network to predict the future size of the cascade. [ ] is similarly an end-to-end neural framework for cascade prediction, but based on the Hawkes process; the method transforms the cascade into diffusion paths, where each path describes the process of information propagation within the observation time frame. Another very important type of subgraph is a community, and in [ ] community embeddings are represented as multivariate Gaussian distributions.

Graph embeddings. Given a collection of graphs, a graph embedding technique learns a representation for each graph.
In [ ], the authors propose an inductive framework for computing graph embeddings, based on training an attention network to predict a graph proximity measure such as the graph edit distance. Graph embeddings are closely related to graph kernels, functions that measure the similarity between pairs of graphs [ ]. Graph kernels are used together with kernel methods such as SVMs to perform graph classification [ ].

Preliminaries. PageRank [ ] is the stationary distribution of a random walk in which, at a given step, with probability α the surfer teleports to a random node, and with probability 1 − α moves along a randomly chosen outgoing edge of the current node. In Personalized PageRank (PPR) [ ], instead of teleporting to a random node with probability α, the surfer teleports to a node chosen at random from a set of predefined seed nodes. Let PR(u) denote the PageRank of node u and PPR(u, v) the PageRank score of node v personalized for the seed node u.

Problem statement. Given a directed graph G = (V, E), a set of subgraphs S_1, S_2, ···, S_k of G, and an integer d, compute the d-dimensional embeddings of the subgraphs.

We define a subgraph proximity measure inspired by Personalized PageRank. Let S_i and S_j be two subgraphs of a directed graph G. Their proximity in the graph is:

px(S_i, S_j) = Σ_{v_i ∈ S_i} Σ_{v_j ∈ S_j} PR_{S_i}(v_i) · PPR(v_i, v_j) · PR_{S_j}(v_j),

where PR_{S_i}(v_i) is the PageRank of node v_i within the subgraph S_i, and PPR(v_i, v_j) is the PageRank score of node v_j personalized for node v_i in the graph G. Our intuition when defining proximity between subgraphs is as follows: important nodes in subgraph S_i should be close to important nodes in subgraph S_j. This condition is fulfilled, as PageRank gives high scores to important nodes in the subgraphs and Personalized PageRank gives high scores to nodes that are "close" or "similar". We note that our measure is a similarity measure; hence subgraphs that are similar receive a high proximity score.
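The proximity definition above can be implemented directly once the subgraph PageRank vectors and the personalized PageRank scores are available. The sketch below assumes these are supplied as dictionaries; computing them is a standard PageRank iteration and is omitted here.

```python
def subgraph_proximity(pr_si, pr_sj, ppr):
    """px(S_i, S_j) = sum over v_i in S_i and v_j in S_j of
    PR_{S_i}(v_i) * PPR(v_i, v_j) * PR_{S_j}(v_j).

    pr_si, pr_sj: node -> PageRank within each subgraph;
    ppr: (v_i, v_j) -> PageRank of v_j personalized for v_i in G.
    """
    total = 0.0
    for v_i, pr_i in pr_si.items():
        for v_j, pr_j in pr_sj.items():
            total += pr_i * ppr.get((v_i, v_j), 0.0) * pr_j
    return total

# Toy example with two small subgraphs; all values are illustrative.
pr_s1 = {"a": 0.5, "b": 0.5}
pr_s2 = {"c": 0.7, "d": 0.3}
ppr = {("a", "c"): 0.2, ("a", "d"): 0.1, ("b", "c"): 0.4}
print(subgraph_proximity(pr_s1, pr_s2, ppr))
```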
We choose the term proximity to emphasize that our measure relates to nearness in the graph, as it is computed using random walks. We can interpret Eq. ( ) using random walks as follows: Alice is a random surfer in the subgraph S_i, Bob is a random surfer in the subgraph S_j, and Carol is a random surfer in the graph G. Alice decides to send a message to Bob via Carol. Carol starts from the node Alice is currently visiting (with probability PR_{S_i}(v_i)), and she reaches a node v_j ∈ S_j with probability PPR(v_i, v_j). Bob is there to receive the message with probability PR_{S_j}(v_j).

Normalized proximity. Given a collection of subgraphs S = {S_1, S_2, ···, S_k}, we normalize the proximities px(S_i, S_j), ∀j ∈ [1, k], so that they can be interpreted as a probability distribution. The normalized proximity for a subgraph S_i is:

px̂(S_i, S_j) = px(S_i, S_j) / Σ_{l=1}^{k} px(S_i, S_l).

Rank of a subgraph. Similarly to PageRank, our proximity can inform us of the importance of a subgraph. The normalized proximities over a collection of subgraphs S_1, S_2, ···, S_k can be expressed as a stochastic matrix, where each row i encodes the normalized proximity given subgraph S_i. The importance of subgraph S_i can be computed by summing the elements of column i.

Sampling according to the proximity measure. Given an input subgraph S_i, we present a procedure for efficiently sampling from px(S_i, ·) as introduced in Eq. ( ). We suppose that the PageRank vectors of all the subgraphs {S_1, S_2, ···, S_k} have been precomputed. First, we select a node n_i in S_i according to the distribution PR_{S_i}. Second, we start a random walk from n_i in the graph G and select n_j, the last node in the walk before teleportation. Lastly, node n_j may belong to several subgraphs S_{j1}, S_{j2}, ···; we return one of them according to the normalized distribution over PR_{S_{j1}}(n_j), PR_{S_{j2}}(n_j), ···. The procedure does not require computing the Personalized PageRank vectors, which saves O(n^2) space.
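The three-step sampling procedure can be sketched as follows. The random-walk representation of the graph, the membership index from nodes to subgraphs, and the teleportation probability α are assumptions filled in to make the example runnable; they are not fixed by the text above.

```python
import random

def sample_proximate_subgraph(s_i, pr, graph, membership, alpha=0.15, rng=random):
    """Sample a subgraph s_j approximately according to px(s_i, .).

    pr[s]: node -> PageRank dict for each subgraph s (precomputed);
    graph: node -> list of out-neighbors in G;
    membership: node -> list of subgraphs containing that node.
    """
    # Step 1: pick a start node in s_i according to PR_{s_i}.
    nodes, weights = zip(*pr[s_i].items())
    node = rng.choices(nodes, weights=weights)[0]
    # Step 2: random walk in the full graph until teleportation.
    while rng.random() > alpha and graph.get(node):
        node = rng.choice(graph[node])
    # Step 3: pick one of the subgraphs containing the end node,
    # weighted by that node's PageRank inside each subgraph.
    candidates = membership[node]
    cand_weights = [pr[s][node] for s in candidates]
    return rng.choices(candidates, weights=cand_weights)[0]

# Tiny illustrative instance.
pr = {"s1": {"a": 1.0}, "s2": {"a": 0.5, "b": 0.5}}
graph = {"a": ["b"], "b": ["a"]}
membership = {"a": ["s1", "s2"], "b": ["s2"]}
print(sample_proximate_subgraph("s1", pr, graph, membership))
```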
We use this sampling procedure when computing embeddings, thus avoiding computing and storing the full proximity measure px. Given a graph G = (V, E) and a set of subgraphs of G, S = {S_1, S_2, ···, S_k}, we learn their representations as dense vectors, i.e. as embeddings. We extend the framework of [ ], proposed for computing node embeddings, to an approach for subgraph embeddings. In [ ], the authors learn node embeddings that preserve an input similarity distribution between nodes. The similarity of a node v to every other node in the graph is given by the distribution sim_G, where Σ_{w∈V} sim_G(v, w) = 1; the corresponding embedding similarity distribution is sim_E. The optimization function of the learning algorithm minimizes the Kullback-Leibler (KL) divergence between the two proximity distributions:

Σ_{v∈V} KL( sim_G(v, ·) ‖ sim_E(v, ·) ).

The authors propose several options for instantiating sim_G, such as Personalized PageRank and adjacency similarity; the similarity between embeddings, sim_E, is the normalized dot product of the vectors. To adapt this approach to our case, we define the subgraph-to-subgraph proximity sim_G to be the normalized proximity presented above. The embedding similarity sim_E is computed in the same manner, and the optimization function now minimizes the divergence between distributions defined on the input subgraphs, i.e. sim_G, sim_E : S × S → [0, 1]. In our experimental evaluation we refer to this method as SubRank. We note that sim_G is never fully computed, but approximated using the sampling procedure described above.

Proximity of ego networks. Two very important tasks in graph mining are community detection and link prediction. Suppose Alice is a computer scientist and she joins Twitter. She starts following the updates of Andrew Ng, but also the updates of her friends, Diana and John.
Bob is also a computer scientist on Twitter; he follows Andrew Ng, Jure Leskovec, and his friend Julia. As shown in Fig. , there is no path in the directed graph between Alice and Bob. A path-based similarity measure between the nodes Alice and Bob, such as Personalized PageRank, will return similarity 0, while returning high values between Alice and Andrew Ng and between Bob and Andrew Ng. An optimization algorithm for computing node embeddings has to address this trade-off, with a potential loss in the quality of the representations; we might thus miss that both Alice and Bob are computer scientists. To address this issue, we capture the information stored in the neighbors of a node by considering ego networks: in our work, a node v is represented by its ego network of size k (the nodes reachable from v in k steps). In Sect. , we perform a quantitative analysis to validate this intuition.

Proximity of cascade subgraphs. In a graph, an information cascade can be modeled as a directed tree, where the root represents the original content creator and the remaining nodes represent the content reshares. When considering the task of predicting the future size of a cascade, the nodes already in the cascade are important, as their neighbors are very likely to be affected by the information propagation. However, nodes that have reshared the information more recently are more visible to their neighbors. When running PageRank on a directed tree, we observe that nodes on the same level have the same score, and that scores increase with depth. Hence, two cascade trees have a high proximity score px if the nodes that joined the cascades late (i.e., are on lower levels of the trees) are "close" or "similar" according to Personalized PageRank. In Sect. , a quantitative analysis shows that our approach gives better results than a method that assigns equal importance to all nodes in the cascade.

Datasets.
We perform experiments on five real-world graphs, described below; we report their characteristics in the accompanying table. • Citeseer is a citation network created from the CiteSeer digital library. Nodes are publications and edges denote citations. The node labels represent fields in computer science. • Cora (see footnote) is also a citation network, and the node labels represent subfields in machine learning. • PolBlogs is a directed network of hyperlinks between political blogs discussing US politics. The labels correspond to Republican and Democrat blogs. Competitors. We evaluate our method, SubRank, against several state-of-the-art methods for node and subgraph embedding computation. For each method, we used the code provided by the authors. We compare with: • DeepWalk [ ] learns node embeddings by sampling random walks and then applying the skip-gram model. The walk length, number of walks, and window size are set to the recommended values. • node2vec [ ] is a hyperparameter-supervised approach that extends DeepWalk. We fine-tuned the hyperparameters p and q on each dataset and task; the remaining parameters take their default values, and the optimization is run for one epoch. • LINE [ ] proposes two proximity measures for computing two d-dimensional vectors for each node. In our experiments, we use the second-order proximity, as it can be used for both directed and undirected graphs. We run experiments with the numbers of samples and negative samples described in the paper. • VERSE [ ] learns node embeddings that preserve the proximity of nodes in the graph. We use personalized PageRank as the proximity measure, the default option proposed in the paper, and run the learning algorithm for a fixed number of iterations. • VERSEavg is an adaptation of VERSE in which the embedding of a node is the average of the VERSE embeddings of the nodes in its ego network. • Sub2Vec [ ] computes subgraph embeddings; for the experimental evaluation, we compute the embeddings of the ego networks.
using the guidelines of the authors, for cora, citeseer and polblogs we select ego networks of size and for the denser networks cithep and dblp, ego networks of size . for the first four methods, node embeddings are used to represent nodes. for sub vec, subrank and verseavg, the ego network embedding is the node representation. the embeddings are used as node features for community detection and link prediction. we compute dimensional embeddings. parameter setting for subrank. we represent each node by its ego network of size . we run the learning algorithm for iterations. our code is public. we assess the quality of the embeddings in terms of their ability to capture communities in a graph. for this, we use the k-means algorithm to cluster the nodes embedded in the d-dimensional space. in table we report the normalized mutual information (nmi) with respect to the original label distribution. on polblogs, subrank has a low nmi, while on citeseer and cora it outperforms the other methods. on dblp it has a comparative performance with verse. node classification is the task of predicting the correct node labels in a graph. for each dataset, we try several configurations by varying the percentage of nodes used in training. we evaluate the methods using the micro and macro f score, and we report the micro f , as both measures present similar trends. the results are presented in table . on citeseer and cora subrank significantly outperforms the other methods. on polblogs, subrank performs similarly to the other baselines, even though the embeddings achieved a low nmi score. on dblp, subrank is the second best method. to create training data for link prediction, we randomly remove % of edges, ensuring that each node retains at least one neighbor. this set represents the ground truth in the test set, while we take the remaining graph as the training set. 
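The edge-removal split described above (holding out a fraction of the edges while ensuring each node retains at least one neighbor) can be sketched as follows. The function name, the greedy removal strategy, and the fraction default are illustrative, not the authors' implementation:

```python
import random

def split_edges(edges, frac=0.3, seed=7):
    """Hold out `frac` of the edges as a test set, but only remove an
    edge if both endpoints still keep at least one other neighbour."""
    random.seed(seed)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    pool = list(edges)
    random.shuffle(pool)
    target = int(len(edges) * frac)
    train, test = [], []
    for u, v in pool:
        if len(test) < target and degree[u] > 1 and degree[v] > 1:
            test.append((u, v))
            degree[u] -= 1
            degree[v] -= 1
        else:
            train.append((u, v))
    return train, test
```

The embeddings are then learned on the `train` edges only, and the held-out `test` edges (plus an equal number of sampled non-edges) form the evaluation set.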
In addition, we randomly sample an equal number of node pairs that have no edge connecting them as negative samples in our test set. We then learn embeddings on the graph without the removed edges. Next, for each edge (u, v) in the training or the test set, we obtain the edge features by computing the Hadamard product of the embeddings of u and v; the Hadamard product has shown better performance than other operators for this task [ , ]. We report the accuracy of the link prediction task in the results table. Our method achieves the best performance on the majority of the datasets. Given as input (i) a social network G = (V, E), captured at a time t_0; (ii) a set of information cascades C that appear in G after timestamp t_0 and that are captured at a fixed duration after their creation; and (iii) a time window, our goal is to predict the growth of a cascade, i.e., the number of new nodes the cascade acquires within that window after its creation. Note that, given a cascade c = (V_c, E_c) in C, the nodes V_c are present in V; however, c can contain new edges not present in E. Datasets. We select for evaluation two datasets from the literature: • AMiner [ ] represents cascades of scientific citations. We use the simplified version made available by the authors. The dataset contains a global citation graph and the cascade graphs. A node in a graph represents an author, and an edge from a_1 to a_2 represents a citation of a_1 in an article of a_2. A cascade shows all the citations of a given paper. Competitors. We compare SubRank with the following state-of-the-art methods for the task of predicting the future size of cascades: • DeepCas [ ] is an end-to-end neural network framework that, given the cascade graph as input, predicts the future growth of the cascade for a given period. The parameters are set to the values specified in the paper.
• DeepHawkes [ ] is also an end-to-end deep learning framework for cascade prediction, based on the Hawkes process. We set the parameters to the defaults given by the authors, including the learning rates for the user embeddings and for the other variables. • In addition, we consider the node embedding method VERSE [ ], as one of the top-performing baselines in the previous section. The node embeddings are learned on the original graph, and a cascade is represented as the average of the embeddings of the nodes it contains. We then train a multi-layer perceptron (MLP) regressor to predict the growth of the cascade. Parameter setting for SubRank. We recall that our subgraph proximity measure requires the computation of the PPR of nodes in the graph and the PR of nodes in the subgraphs. For this task, we consider the PPR of nodes in the global graph and the PR of nodes in the cascades. We obtain the cascade embeddings, which are then used to train an MLP regressor. For both VERSE and SubRank, we perform a grid search for the optimal parameters of the regressor. We report the mean squared error (MSE) on the logarithm of the cascade growth value, as done in previous work on cascade prediction [ , ], in the results table. We observe that SubRank outperforms VERSE, corroborating our intuition that nodes appearing later in a cascade should be given more importance. The best MSE overall is obtained by the end-to-end framework DeepHawkes, which is expected, as the method is tailored to the task. We note, however, that SubRank achieves the best results on AMiner. In this work, we introduced a new measure of proximity for subgraphs and a framework for computing subgraph embeddings. In a departure from previous work, we focus on general-purpose embeddings, and we shed light on why our method is suited to several data mining tasks.
Our experimental evaluation shows that the subgraph embeddings achieve competitive performance on three downstream applications: community detection, link prediction, and cascade prediction.

References:
• Sub2Vec: feature learning for subgraphs
• Distributed large-scale natural graph factorization
• Unsupervised inductive graph-level representation learning via graph-graph proximity
• Question answering with subgraph embeddings
• A comprehensive survey of graph embedding: problems, techniques, and applications
• DeepHawkes: bridging the gap between prediction and understanding of information cascades
• GraRep: learning graph representations with global structural information
• Learning community embedding with community detection and node embedding on graphs
• Graph embedding techniques, applications, and performance: a survey
• node2vec: scalable feature learning for networks
• Topic-sensitive PageRank: a context-sensitive ranking algorithm for web search
• Distributed representations of sentences and documents
• DeepCas: an end-to-end predictor of information cascades
• Subgraph-augmented path embedding for semantic user search on heterogeneous social network
• Distributed representations of words and phrases and their compositionality
• Asymmetric transitivity preserving graph embedding
• The PageRank citation ranking: bringing order to the web
• DeepWalk: online learning of social representations
• LINE: large-scale information network embedding
• VERSE: versatile graph embeddings from similarity measures
• Graph kernels
• RetGK: graph kernels based on return probabilities of random walks

key: cord- -onj zpi
authors: Abuelkhail, Abdulrahman; Baroudi, Uthman; Raad, Muhammad; Sheltami, Tarek
title: Internet of Things for healthcare monitoring applications based on RFID clustering scheme
journal: Wireless Netw
cord_uid: onj zpi

COVID-19 surprised the whole world by its quick and sudden spread.
The coronavirus pandemic has pushed all community sectors (government, industry, academia, and nonprofit organizations) to take steps to stop and control its spread, and it is evident that IT-based solutions are urgently needed. This study is a small step in this direction, in which health information is monitored and collected continuously. In this work, we build a network of smart nodes where each node comprises a radio-frequency identification (RFID) tag, a reduced-function RFID reader (RFRR), and sensors. The smart nodes are grouped in clusters, which are constructed periodically. The RFRR of the clusterhead collects data from its members, and once it is close to the primary reader, it conveys its data, and so on. This approach reduces the primary RFID reader's burden by receiving data from the clusterheads only, instead of reading every tag that passes by its vicinity. Besides, this mechanism reduces channel access congestion and thus reduces interference significantly. Furthermore, to protect the exchanged data from potential attacks, two levels of security algorithms, including AES encryption with hashing, have been implemented. The proposed scheme has been validated via mathematical modeling using integer programming, simulation, and prototype experimentation. The proposed technique shows low data delivery losses and a significant drop in transmission delay compared to contemporary approaches. The coronavirus will have a long-term impact on the whole world; the most significant impact will manifest itself in the penetration of IT surveillance and tracking. Wireless sensor networks (WSNs) have become efficient and viable for a wide variety of applications in many aspects of human life, such as tracking systems, medical treatment, environmental monitoring, intelligent transportation systems (ITS), public health, smart grid, and many other areas [ ].
Radio frequency identification (RFID) is a wireless technology with a unique identifier that utilizes radio frequency for data transmission: data is transferred from the device to the reader via radio-frequency waves. The data is stored in tags, which can be passive, active, or battery-assisted passive (BAP). The active and BAP tags contain batteries that allow them to communicate over a broader range, reaching kilometers in enterprise and military applications. Unlike battery-powered tags, passive tags use the reader's RF signal to generate power and transmit/receive data [ ]. Using WSNs and RFID together is a promising solution that has become prevalent in recent years. With its low cost and low power consumption, RFID is easy to install, deploy, and combine with sensors [ ]. These features make RFID combined with sensors a viable and enabling technology for IoT. With a wide variety of increasingly cheap sensors and RFID technologies, it becomes possible to build a real-time healthcare monitoring system at low cost and with very high quality. The RFID system is considered the strategic enabling component for the healthcare system due to the energy autonomy of battery-less tags and their low cost. In addition, RFID can be attached to the monitored items to be recognized, hence enhancing the efficiency of monitoring and managing the objects [ ] [ ] [ ] [ ]. Having real-time data collection and management is very important, especially in health-related systems. For instance, the United Nations International Children's Emergency Fund (UNICEF) and the World Health Organization (WHO) reported that hundreds of thousands of women die every year from causes related to pregnancy and childbirth [ ]; this is due to the unavailability of timely medical treatment. Moreover, the report stated that the main causes of cancer-related deaths stem from the late detection of abnormal cellular growth, at the last stage.
Many lives can be saved by utilizing real-time IoT smart nodes that continuously monitor the patient's health condition, empowering physicians to detect serious illnesses such as cancer at an early stage. The motivations for the proposed framework are threefold: low cost, high performance, and real-time collection of data. An RFID reader cannot rapidly get data from tags because of its static nature and short transmission range; therefore, a high-power and costly RFID reader would be required to extend the range for quick information gathering. However, this would increase the price of the framework, considering the high cost of a long-range RFID reader (not less than $ ) and the increased expenditure of initiating the connection between back-end servers and RFID readers. The question is: can we limit the number of RFID readers while still accomplishing sufficient information accumulation? Moreover, in customary RFID monitoring applications, such as tracking luggage in airlines, an RFID reader is required to rapidly handle many tags at various distances, yet it can only read tags within its range. Many limitations could negatively affect the performance of data collection, such as multi-path fading and limited bandwidth; these issues can be mitigated by transmitting information over short distances through a multi-hop transmission mode in WSNs. Besides, in every data collection system, the most critical challenge is to meet real-time requirements. Combining RFID tags with RFID readers and WSNs helps significantly in addressing this challenge [ ] [ ] [ ]. In this paper, we develop a framework that integrates RFID with wireless sensor systems based on a clustering scheme to gather information efficiently. Essentially, our framework utilizes the smart node proposed by Shen et al. [ ]. The smart node contains an RFID tag, a reduced-function RFID reader (RFRR), and a wireless sensor.
the cluster's construction depends on multi-criteria in choosing the clusterhead among smart nodes in the same range. for instance, each node can read the tag id and battery level of all smart nodes in its range; the node with the highest battery level will be chosen as the clusterhead. the cluster consists of a clusterhead and cluster members; each member in the cluster transmits their tag information to the clusterhead. then, the rfid readers send the collected data to the backend server for data management and processing. also, to protect exchanged data from potential attacks, we have applied two levels of security algorithms. the proposed technique can lend itself to a wide range of applications, for example, collecting data in smart cities, aiming to monitor people's healthcare in large events such as festivals, malls, airports, train stations, etc. the specific contributions of this paper are listed below: • we exploit the smart nodes to develop an efficient healthcare monitoring scheme based on a collaborative adaptive clustering approach. • the proposed clustering scheme reduces the reader's burden to read every node and allows them to read only the node within its range. this approach minimizes the channel access congestion and helps in reducing any other interference. it also reduces the transmission delay, thus collecting the information between nodes efficiently for a large-scale system. • we formulate the clustering problem as a mathematical programming model to minimize the energy consumption and the interference in a large-scale mobile network. • to protect the collected data by the proposed approach from security threats that might occur during data communication among smart nodes and primary readers, we secure the exchanged data by two security levels. • we develop a small-scale prototype where we explore the performance of the proposed approach. 
the prototype is composed of a set of wearable smart nodes that each consists of rfid tag, reduced function rfid reader, and body sensor. also, all exchanged data among the smart nodes have been encrypted. the rest of the paper is organized as follows. section presents the related work on health care monitoring applications. in sect. , the proposed system is discussed, starting with explaining the problem statement followed by the proposed clustering approach. in sect. , the cluster formation is modeled as an integer program. in sect. , we present and discuss the three used methods to evaluate our proposed approach. first, the optimal solution using integer programming is discussed. given the long-running time required for integer programming, the proposed system is simulated using matlab, where the local information is employed to construct the clusters. thirdly, a small-scale prototype is built to test the approach. finally, we conclude this paper with our findings and suggestions for future directions. this section summarizes some of the previous work related to health care monitoring applications. many researchers have focused on solving this problem by using either rfid or wsn as the short-range radio interfaces. however, very few of these solutions are suitable for the problem (health care monitoring applications for a large-scale system) that addresses a crowded area with high mobility. sun microsystems, in collaboration with the university of fribourg [ ] proposed a web-based application called (rfid-locator) to improve the quality of hospital services. rfid-locator tracks the patients and goods in the hospital to build a smart hospital. all patients in the hospital are given an rfid based on wristband resembling a watch with a passive rfid tag in it. all patient's history and treatment records are stored in a centralized secure database. 
Doctors have RFID-enabled personal digital assistant (PDA) devices to read the patient's data stored on the patients' RFID wristbands. The results are promising, but more work is needed on the security and encryption of the collected data. Dsouza et al. [ ] proposed a wireless localization network to follow the location of patients in indoor environments as well as to monitor their status (e.g., walking, running). The authors deploy static nodes at different locations of the hospital that interact with a patient's mobile unit to determine the patient's position in the building. Each patient carries a small mobile node composed of a small-size Fleck Nano wireless sensor and a three-axis accelerometer to monitor his or her physical status. However, using everybody's smartphone GPS and Wi-Fi is not an energy-efficient solution, because it requires considerable power. Chandra-Saharan et al. [ ] proposed a location-aware WSN to track people in a disaster site using a ranging algorithm based on received signal strength indicator (RSSI), environment- and mobility-adaptive (REMA). Like [ , ], the authors in [ ] focused on the healthcare area and provided a survey of current work on RFID sensing from the viewpoint of IoT for individual healthcare, which shows that RFID technology is now established enough to be part of the IoT. On the other hand, the paper reveals many challenging issues, such as the reliability of the sensors and the actual dependence on the reader's node. There are even more advanced solutions in [ ]: the authors proposed the iHome approach, which consists of three key blocks: iMedBox, iMedPack, and the Bio-Patch. RFID tags are used to enable communication capabilities in the iMedPack block, and flexible, wearable biomedical sensor devices are used to collect data (the Bio-Patch). The results are promising, but the study did not consider monitoring purposes.
Another smart healthcare system is proposed in [ ] to monitor and track patients, personnel, and biomedical devices automatically using different technologies: RFID, WSN, and smart mobile devices. To allow these heterogeneous technologies to interoperate, a complex network communication relying on the CoAP, 6LoWPAN, and REST paradigms is used, and two use cases have been implemented. The results showed good performance, not only for operation within hospitals but also for power-effective remote patient monitoring. The results are promising, but the approach needs substantial wired and wireless sensor network infrastructure. Gope and Hwang [ ] proposed a secure IoT healthcare application using a body sensor network (BSN) to monitor a patient's health with a collection of tiny, low-power, and lightweight wireless sensor nodes. The system can also efficiently protect a patient's privacy by utilizing a lightweight anonymous authentication protocol and the authenticated encryption scheme offset codebook (OCB). The lightweight anonymous authentication protocol can achieve mutual authentication, preserve anonymity, and reduce the computation overhead between nodes. The OCB block cipher encryption scheme is well suited for secure and expeditious data communication as well as efficient energy consumption. The results are promising, but the approach needs infrastructure. Furthermore, an intelligent framework for healthcare data security (IFHDS) has been proposed to secure and process large-scale data using a column-based approach with less impact on data processing [ ]. The following table compares the proposed approach with the existing literature and shows that there is no prior work similar to the proposed approach.
The comparison table lists the competing techniques (a hybrid RFID energy-efficient routing scheme for dynamic clustering networks; a smart real-time healthcare monitoring and tracking system based on a mobile clustering scheme; a data collection method based on mobile edge computing for WSN; energy-efficient large-scale tracking systems based on a mobile clustering scheme; and energy-efficient large-scale tracking systems based on two-level hierarchical clustering) against the following features:
• Feature (F- ): the smart node is a wearable smart node that includes a reduced-function RFID reader (RFRR), a body sensor (BS), an RFID tag, and a microcontroller, wherein the RFID reader has a greater transmission range than the RFRR, and the RFRR reads other smart nodes' tags and stores this data into its own RFID tag.
• Feature (F- ): a plurality of smart nodes, which integrate radio-frequency identification (RFID) and wireless sensor network (WSN) technology.
• Feature (F- ): the clustering scheme, in which each node reads the tag ID of all nodes in its range and the clusterhead is the node with the highest cost function (e.g., battery level); the cluster consists of a clusterhead and cluster members.
• Feature (F- ): the data collection scheme, in which an RFID reader receives all packets of node data from the CH, and the RFID reader sends the collected information to a back-end server for data processing and management.
• Feature (F- ): formulating a novel mathematical programming model which optimizes the clustering structures to guarantee the best performance of the network.
The mathematical model optimizes the following objective functions: (1) minimizing the total distance between CHs and CMs to improve positioning accuracy; and (2) minimizing the number of clusters, which reduces the signal transmission traffic.
• Feature (F- ): two-level security, obtained as follows: when a node writes data to its RFID tag, the data is signed with a signature, which is a hash value, and the obtained hash is encrypted with a shared AES key.
In this section, the proposed system is discussed, starting with the problem statement, followed by the proposed solution. During healthcare monitoring of people, the main challenge is to ensure safety, efficient data collection, and privacy. People stay in a bounded area, with various random movements in their vicinity. Different technologies have been suggested to collect data from crowds; they can be categorized as passive and active sensing. Passive sensing, such as computer vision, does not need any connection with the user and can aid in movement detection, counting people, and density approximation [ , ]. However, these approaches fail to deliver accurate identification of individuals, in addition to requiring a ready infrastructure, which is very costly. There are also active systems, such as RFID tags, that can be attached to the individual to obtain the user's data. Nevertheless, these systems require an expensive infrastructure for organizing RFID readers at points of data collection [ ]. Therefore, to deliver accurate identification of individuals, reduce the cost of the infrastructure, and attain efficient large-scale data collection for healthcare monitoring applications, we suggest employing a system of mobile smart nodes composed of RFID and WSN. The mobile smart nodes are clustered to minimize data traffic and ensure redundancy and delivery to the command center.
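The two-level write path described above (hash signature over the tag data, then protection of the hash with the shared key) can be sketched with stdlib primitives. Since the Python standard library has no AES, a keyed HMAC over the digest stands in here for the paper's AES encryption of the hash; all names are hypothetical:

```python
import hashlib
import hmac
import os

SHARED_KEY = os.urandom(16)  # stand-in for the key shared by node and reader

def sign_tag_data(payload: bytes) -> bytes:
    # level 1: hash signature over the tag payload
    digest = hashlib.sha256(payload).digest()
    # level 2: the paper encrypts the hash with a shared AES key; the
    # stdlib has no AES, so a keyed HMAC over the digest stands in here
    return hmac.new(SHARED_KEY, digest, hashlib.sha256).digest()

def verify_tag_data(payload: bytes, tag: bytes) -> bool:
    # constant-time comparison to avoid timing side channels
    return hmac.compare_digest(sign_tag_data(payload), tag)
```

A reader holding the shared key recomputes the protected hash and rejects any tag whose payload was tampered with in transit.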
However, clustering RFID nodes into groups comes with many technical challenges, such as achieving accurate positioning, collecting information in each cluster, and reporting this information from the clusterheads to the server for processing. There are also challenges related to clustering itself, where managing transmissions is crucial to avoid interference. Furthermore, the RFID tag is susceptible to malicious attacks; therefore, we implemented two levels of security algorithms to protect the stored data from potential attacks. This section discusses the proposed data collection technique that can efficiently collect health information (e.g., temperature, heartbeat) and make it available to the back-end server in real time. The main components of our system architecture are smart nodes, RFID readers, and a back-end server, as shown in the figure. The smart node integrates the functionalities of RFID and a wireless sensor node. It consists of a body sensor (BS), an RFID tag, and a reduced-function RFID reader (RFRR). Unlike standard sensors, the BS does not have a transmission function; it is responsible for collecting the body-sensed data, such as heartbeat, muscle activity, and temperature. The RFRR is an RFID reader with a small range compared to the traditional RFID reader. The protocol is composed of two phases: cluster construction and data exchange. In the beginning, each node reads the tag particulars (e.g., ID, battery level) of all nodes in its range. Then the node with, for example, the highest battery level is autonomously nominated as the clusterhead for this group of nodes. All smart nodes maintain a table of the nominees to be the clusterhead of the newly constructed cluster. The clusterhead sends an "I am a clusterhead" message to all nodes within its range, inviting them to join its group.
Secondly, a node accepting the offer from this clusterhead sends an acknowledgment message; this is important to avoid duplicate association with multiple nodes. This step ends the cluster construction. Once the cluster is formed, the clusterhead reads the other smart nodes and stores their data in its local tag; the clusterhead tag works as data storage. Finally, when the RFRR comes across an RFID reader, the stored data are transferred to the reader and on to the back-end server for further processing. This feature helps reach remote nodes and hence enhances system reliability and reduces infrastructure cost. This process is repeated periodically; new clusters are formed, and new clusterheads are selected along with their children. This technique guarantees fair load distribution among multiple devices to attain the maximum lifetime of the network and avoid draining the battery of any individual smart node. The pseudo-code for our algorithm is shown below.

  1: choose the CH with the highest BL
  2: if it is a CH and meets its CM then
  3:   read data from the cluster member
  4: end if
  5: if it is a CH and meets an RFID reader then
  6:   send its data to the RFID reader
  7: end if

The ultimate goal of this research is to design an optimal healthcare monitoring application based on the RFID clustering scheme. To meet the practical requirements for applying the system in large-scale environments, the proposed system's energy consumption should be minimal, and communication quality must be high. Therefore, the integer programming model presented below aims to optimize the following objectives: • minimizing the total distance between clusterheads (CHs) and cluster members (CMs); • minimizing the number of clusters. The first objective, minimizing the total distance between all CHs and their respective CMs, is meant to enhance tag detectability; shorter distances also improve the signal quality and reduce the time delay of transmissions within each cluster.
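Under the assumptions above (every node can read the tag ID and battery level of the nodes in its radio range), the local clusterhead election can be sketched as follows. The names, the battery threshold, and the id-based tie-break are illustrative choices, not the paper's exact protocol:

```python
import math

def elect_clusterheads(nodes, radio_range=2.0, min_battery=0.2):
    """nodes: {id: (x, y, battery)}. Each node adopts, among the in-range
    nodes whose battery clears the threshold, the one with the highest
    battery level as its clusterhead (ties broken by node id)."""
    def dist(a, b):
        (x1, y1, _), (x2, y2, _) = nodes[a], nodes[b]
        return math.hypot(x1 - x2, y1 - y2)

    clusters = {}
    for v in nodes:
        candidates = [u for u in nodes
                      if nodes[u][2] >= min_battery and dist(u, v) <= radio_range]
        # fall back to self-heading if no candidate clears the threshold
        head = max(candidates, key=lambda u: (nodes[u][2], u)) if candidates else v
        clusters.setdefault(head, []).append(v)
    return clusters

# four wearables; 0, 1, 2 are mutually in range, 3 is out of reach
demo = elect_clusterheads({0: (0, 0, 0.9), 1: (1, 0, 0.5),
                           2: (0.5, 0.5, 0.4), 3: (10, 10, 0.7)})
# → {0: [0, 1, 2], 3: [3]}
```

Re-running the election each round with fresh battery readings rotates the clusterhead role, which is what yields the fair load distribution described above.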
for example, in traditional rfid monitoring applications, such as supply chain management and baggage checking in delta airlines, an rfid reader is required to process several tags at different distances in a short time frame. an rfid reader can only read tags in its range. limited communication bandwidth, background noise, multi-path fading, and channel accessing contention between tags would severely deteriorate the performance of the data collection [ ] . the second objective is pursued because minimizing the number of clusters reduces signal transmission traffic, lowering the interference between signals. this results in reducing the use of energy and maximizing the lifetime of the network. for instance, rfid tag data usually is collected using direct transmission mode, in which an rfid reader communicates with a tag only when the tag moves into its transmission range. if many tags move towards a reader at the same time, they contend to access the channels for information transmission. when a node enters the reading range of an rfid reader, the rfid reader reads the node's tag information. suppose several nodes enter the range of rfid reader at the same time. in that case, the rfid reader gives the first meeting tag the highest priority to access the channel, reducing channel contention and long-distance transmission interference [ ] . in the clusterhead based algorithm, cluster members replicate their tag data to the clusterhead. when a clusterhead of a particular cluster reaches an rfid reader, the rfid reader receives all nodes' information in this cluster. this enhanced method significantly reduces channel access congestion and reduces the information exchanges between nodes. the method is suitable for a wide range of applications where monitored objects (e.g., zebras, birds, and people) tend to move in clusters. 
Let i = 1 to N denote the CM index, j = 1 to N denote the CH index, d_ij denote the distance between CM_i and CH_j, and f denote the fixed cost per CH. A node may serve as a CH only if its battery level (BL) is at least a predefined node energy threshold. The decision variables x_ij (CM_i is assigned to CH_j) and y_j (node j acts as a CH) are binary integer variables.

[Figure: the architecture of the healthcare monitoring system. Figure: (a) timeline of the transactions carried out between smart nodes; (b) timeline of the transactions carried out between smart nodes and the main RFID reader.]

The complete integer-programming model of the clustering problem minimizes the objective function

  Z = sum_i sum_j d_ij * x_ij + f * sum_j y_j,

whose first term is the total distance between CHs and CMs and whose second term is the (weighted) total number of clusters in the network. Z is minimized subject to four sets of constraints. Constraint (i) ensures that every CM has a CH, so no smart node is isolated. Constraint (ii) controls the maximum cluster size (CS). Constraint (iii) ensures that all cluster members are within the CH's RFID range, i.e., not more than d_max away (e.g., two feet). Finally, constraint (iv) ensures that a CH node's battery level is at least the predefined threshold. The fixed cost of each CH is denoted by f, which is analyzed later. In this section, the performance of the proposed approach is evaluated using three methods: integer programming, simulation, and a small-scale prototype. The general algebraic modeling system (GAMS) is designed for modeling and solving linear programming (LP), nonlinear programming (NLP), and mixed-integer programming (MIP) optimization problems [ ]. Since the model above is a binary integer program, it is solved with the MIP feature of GAMS. We consider two different scenarios.
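For a toy instance, the model above can be solved by exhaustive search over clusterhead sets. This sketch uses a greedy nearest-feasible assignment inside each candidate head set rather than full branching, so it illustrates the objective and constraints rather than replicating a GAMS-grade solver; all names and defaults are hypothetical:

```python
import itertools
import math

def solve_clustering(pos, batt, f=1.0, d_max=2.0, cs=3, theta=0.5):
    """Brute-force the tiny integer program: pick clusterheads with
    battery >= theta, assign every node to an in-range head (capacity cs),
    and minimise total member-head distance + f * (number of heads)."""
    n = len(pos)
    dist = [[math.dist(pos[i], pos[j]) for j in range(n)] for i in range(n)]
    eligible = [j for j in range(n) if batt[j] >= theta]
    best_cost, best_heads = float("inf"), None
    for k in range(1, len(eligible) + 1):
        for heads in itertools.combinations(eligible, k):
            cost, load, feasible = f * k, {h: 0 for h in heads}, True
            for i in range(n):
                # greedy nearest feasible head; a full ILP would branch here
                options = sorted((dist[i][h], h) for h in heads
                                 if dist[i][h] <= d_max and load[h] < cs)
                if not options:
                    feasible = False
                    break
                d, h = options[0]
                cost += d
                load[h] += 1
            if feasible and cost < best_cost:
                best_cost, best_heads = cost, heads
    return best_cost, best_heads
```

On two well-separated pairs of nodes, the range constraint forces one head per pair, and the optimizer picks the high-battery node of each pair.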
the first scenario tackles the problem by considering the two terms in the objective function, which aim at minimizing the number of clusters and the total distance between chs and cms, to find the optimal cluster size (cs) in constraint (ii). the second scenario applies sensitivity analysis by fixing the total number of nodes to n = , , , , and ; this is done while changing the fixed cost of each ch, f, and calculating the optimal value of the number of clusters and the total distance as well. both scenarios are analyzed under the condition that the size of the service region is set to * ft . to achieve a % confidence level, we repeated each experiment times using different random inputs for the nodes' locations and the battery level of each node. it can be observed from fig. that the total distance between the chs and the cms is reduced on average when cs is equal to (i.e., one clusterhead and five cluster members) for nodes and nodes. the total distance between the chs and the cms is also reduced on average when cs is equal to , , for nodes, nodes, and nodes, respectively. for example, with nodes, the minimum accumulated distance between all clusters and their members is about ft when the cluster size is equal to , whereas with cluster size it is about ft. similar to the nodes scenario, the minimum distance is about ft when the cluster size is equal to , whereas it is about ft with cluster size and when the cluster size is . therefore, the clustering approach is effective in reducing the total distances when cs is equal to for nodes and nodes, and , , for nodes, nodes, and nodes, respectively. figure displays the number of clusters while the cluster size is changing. it can be observed that the number of clusters drops when the cluster size increases. 
however, we are interested not only in minimizing the number of clusters but also in minimizing the total distances between the clusterheads and the cluster members, to achieve positioning accuracy and maximize the lifetime of the network. for instance, with nodes, the optimum minimum distance is about ft when the cluster size is equal to , and with nodes, the optimum minimum distance is about ft when the cluster size is equal to . therefore, the optimum value of the cluster size is equal to for nodes and nodes, and , , for nodes, nodes, and nodes, respectively. figure demonstrates the model's total distance when the fixed cost per master f is equal to e , where e = , , …, . for nodes, the optimal (minimum) total distance is ft, which is obtained when f is equal to (e = ). for the case of nodes, the optimal total distance is ft, which is also obtained when f is equal to . these numbers indicate that the clustering approach is well suited for large-scale monitoring applications. figure illustrates the optimal number of clusters when the value of the fixed cost per master f is equal to e, where e = , , …, . for nodes, the optimal (minimum) number of clusters is clusters, which is obtained when e = , or f = . for the case of nodes, the optimal number of clusters is clusters, which is also obtained when f = . therefore, the best value of f for both terms in the optimization function in eq. ( ) to work effectively is . in this section, we formulate the energy consumption of the proposed clustering approach and the traditional approach analytically. we begin by defining the following parameters: r: rfid maximum data rate (bps); p_a: rfid active power (w); p_i: rfid idle power (w); t_a_tag = l/r: rfid tag active time in seconds, where l is the data length in bits. we define the total energy consumption for the traditional approach as follows. for the traditional approach, t_a_tag = t_round. 
given the current advancement in rfid technology, we can confidently assume that the collision rate is very low; hence, b_i ≈ . then, for the proposed approach, we define the following specific parameters: e_ch: total energy consumption per clusterhead; e_total: total energy consumption. since the collision rate is minimal, b_i ≈ and t_i_tag(i) ≈ t_round − t_a_tag, for all i. besides, in order not to miss any data, the clusterhead is kept on for the whole round period; hence, t_a_rfrr = t_round and t_i_rfrr = . equation ( ) can then be rewritten accordingly.
fig. the average total distance when changing cs, from to nodes
fig. number of clusters when changing cs, from to nodes
fig. the total distance when changing f, from to nodes
we have implemented the proposed system for monitoring the health parameters using cisco packet tracer, since it supports iot, rfid, and many other functions. figure shows the smart node components as built using cisco packet tracer. the smart node consists of the rfrr, bs, rfid tag, and the microcontroller. the rfrr is a standard rfid reader with a limited range. we program the rfrr to perform two tasks: the first task is reading data from the attached body sensors and storing the data into its tag. the second task is reading the data from other smart nodes within its transmission range and storing it into its tag. the body sensor is responsible for collecting body-sensed data such as temperature and heartbeat. the rfid tag works as data storage. the microcontroller (mcu), on the other hand, is used to monitor, verify, and process smart node readings. the data transmitted between smart nodes and rfid readers has three fields: a unique smart node id assigned to each user ( byte), the sensed data ( byte), and the timestamp, which records the time at which the data is collected ( bytes). 
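the per-round energy accounting implied by the parameters above can be sketched as follows, assuming energy = power × time split into active and idle parts and a negligible collision rate (b_i ≈ 0). this is a reconstruction from the parameter definitions, not the paper's exact equations, and every numeric value in the usage is hypothetical.

```python
def tag_energy(p_a, p_i, t_active, t_round):
    """Energy of one RFID tag over a round: active part plus idle part."""
    return p_a * t_active + p_i * (t_round - t_active)

def traditional_total(n, p_a, p_i, l, r, t_round):
    """Traditional approach: every tag stays active for the whole round
    (t_a_tag = t_round), waiting to meet a reader."""
    return n * tag_energy(p_a, p_i, t_round, t_round)

def clustered_total(n, n_ch, p_a, p_i, l, r, t_round):
    """Proposed approach: a member tag is active only while writing its
    l bits at rate r (t_a_tag = l/r); each clusterhead additionally keeps
    its RFRR on for the whole round (t_a_rfrr = t_round)."""
    t_a_tag = l / r
    members = (n - n_ch) * tag_energy(p_a, p_i, t_a_tag, t_round)
    heads = n_ch * (tag_energy(p_a, p_i, t_a_tag, t_round) + p_a * t_round)
    return members + heads
```

with one clusterhead serving five members, the clustered total is already far below the traditional total, because only one radio stays on for the full round instead of all of them.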
furthermore, to protect the collected data from potential attacks, we apply the rivest-shamir-adleman (rsa) algorithm [ ] . figure shows the components of the rfid reader and its connectivity with the backend server. the rfid readers are responsible for collecting the data from smart nodes and delivering it to the backend server. the transmission range of the rfid reader is much greater than that of the rfrr. upon reading a smart node's tag data, the reader sends that data directly to the backend server wirelessly, carried by udp packets. the rsa algorithm is also applied to the data transmitted from smart nodes to primary readers. using the above setup, we start by studying the packet delay and the number of delivered packets for the traditional approach and the clustering approach. in the traditional approach, every node sends its packets directly to an rfid reader. in the clustering approach, every node sends its packets to its clusterhead, and the clusterhead forwards them to an rfid reader. each node sends ten packets every minute, and the simulation was run for min to achieve a % confidence interval. the average delay per packet is calculated using eq. ( ): average delay per packet = (1/n) Σ (r_t − s_t), where n is the number of delivered packets, r_t is the receiving time, and s_t is the sending time. table shows a sample of the data collected at the backend server before and after applying the rsa algorithm. the smart node appends the timestamp to the sensed data and stores the information in its tag through the rfrr. as stated before, the data transmitted between smart nodes and rfid readers has three fields, namely, the unique smart node id, the sensed data, and the timestamp at which the data was collected. figure illustrates the average transmission delay per packet for different numbers of nodes. we notice that the traditional approach's delay per packet is almost fixed regardless of the number of available smart nodes. 
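the average-delay formula, (1/n) Σ (r_t − s_t) over the n delivered packets, is a one-liner; a minimal sketch with hypothetical send and receive timestamps:

```python
def average_delay(send_times, recv_times):
    """Average delay per delivered packet: (1/n) * sum(r_t - s_t),
    where the two lists pair up send and receive timestamps."""
    pairs = list(zip(send_times, recv_times))
    if not pairs:
        return 0.0  # no delivered packets, no delay to report
    return sum(r - s for s, r in pairs) / len(pairs)
```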
this behavior can be attributed to the fact that each node meets the rfid readers for forwarding its packets with equal probability. on the other hand, when the clustering approach is employed, the delay drops significantly; for example, when n = , the packet delay drops by %. the higher the number of smart nodes, the lower the packet delay; this happens because when the number of smart nodes increases in the same area, the density increases, and so does the number of clusterheads. therefore, the probability that a regular node meets a clusterhead increases, which reduces the delay in delivering the collected data to the primary reader and then to the back-end server.
fig. number of clusters when changing f, from to nodes
fig. the smart node components as built in packet tracer
figure displays the number of delivered packets for different numbers of nodes. in the clustering approach, the system delivers exactly packets, which is the total number of packets generated by all smart nodes. on the other hand, in the traditional approach, the system suffers packet loss (e.g., % loss for n = ) due to the increase in channel-access congestion as the number of nodes increases. next, we study the energy consumption of the traditional approach, the optimal approach, and the proposed clustering approach. in the traditional approach, every node sends its packets directly to an rfid reader. in the clustering approach, as explained in sect. . , every node reads the tag particulars (battery level) of all nodes in its range. the node with the highest battery level is then chosen as the clusterhead for this group of nodes. the clusterhead then broadcasts a message to all nodes within its range, announcing itself as a clusterhead and inviting them to join its group. a node accepting this clusterhead's offer sends an acknowledgment message; this is important to avoid duplicate association with multiple clusterheads. 
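the cluster-formation rule just described (the highest-battery node in range becomes clusterhead, and each node acknowledges exactly one clusterhead) can be sketched greedily; node coordinates, battery levels, and d_max below are hypothetical.

```python
import math

def elect_clusters(nodes, d_max):
    """Greedy cluster formation: among unassigned nodes, the one with
    the highest battery level becomes clusterhead; unassigned nodes
    within d_max join it. Removing a joiner from the unassigned set
    mimics the acknowledgment step (single association).
    nodes: dict id -> (x, y, battery_level)."""
    unassigned = set(nodes)
    clusters = {}
    while unassigned:
        ch = max(unassigned, key=lambda n: nodes[n][2])
        unassigned.remove(ch)
        members = []
        for n in list(unassigned):
            dx = nodes[n][0] - nodes[ch][0]
            dy = nodes[n][1] - nodes[ch][1]
            if math.hypot(dx, dy) <= d_max:
                members.append(n)
                unassigned.remove(n)  # acknowledged: joins only this CH
        clusters[ch] = members
    return clusters
```

repeating this election every round (with fresh battery readings) rotates the clusterhead role, which is what spreads the energy load across devices.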
once the cluster is formed, the clusterhead remains active, and the cluster members remain in sleep mode. the clusterhead reads other smart nodes and stores their data into its local tag. each cluster member switches to active mode every s to store its data into its own local tag. finally, the clusterhead sends the data to an rfid reader, and then to the backend server for further processing and management. this process is repeated every min; new clusters are formed, and new clusterheads are selected along with their children. this technique guarantees a fair load distribution among the devices to attain the maximum lifetime of the network and to avoid draining the battery of any individual smart node. the relative performance of the three methods has been evaluated using matlab. it is assumed that each node can send data traffic at a rate of kbps and can send frames with sizes up to bytes (one byte for the id tag number, one byte for the data (heartbeat), and two bytes for the timestamp and sequence number). table shows the rfid hardware energy consumption parameters, as specified by sparkfun [ ] . in order to achieve a % confidence interval, each simulation experiment was repeated times using different random topologies. for each simulation run, the total energy consumption for each round was calculated for different values of the number of nodes (n = , ,…, ). figure and table show the average total energy consumption for the traditional approach, the clustering algorithm, and the optimal gams solution of the integer-programming model. figure shows that the clustering solution's total energy consumption is close to the minimum total consumption obtained by the optimal gams solution. the clustering algorithm's total energy becomes closer to the optimal value as the number of nodes increases. 
this result is clear from table , which shows a difference of % between the clustering algorithm's performance and the optimal gams solution when the number of nodes is equal to , but only a difference of . % when the number of nodes is equal to . this shows that the proposed clustering algorithm can produce high-quality, near-optimum solutions for large-scale problems. as shown in table , the traditional approach's energy consumption is . % higher than the optimal consumption specified by gams when the number of nodes is equal to , and . % higher when the number of nodes is equal to . the traditional approach (without clustering) is therefore not a practical solution for large-scale systems. in this section, we evaluate the performance of the proposed approach using a small-scale prototype. we begin by describing the experimental setup and then discuss the experimental results. figure shows the smart node components in our prototype testbed. the smart node consists of the rfrr, bs, rfid tag, and the microcontroller. the rfrr is a standard rfid reader with a limited range, which can read up to two feet, as in the sparkfun specification, with an onboard antenna [ ] . we program the rfrr to perform two tasks. the first task is reading the heartbeat and muscle sensed data from the bs (via a pulse sensor and a muscle sensor, respectively) and storing this data into its tag. the second task is reading the data from other smart nodes within its transmission range and storing it into its tag. the bs is responsible for collecting the body-sensed data such as heartbeat and muscle data. the rfid tag works as a packet memory buffer for data storage. an arduino-compatible redboard microcontroller is used to monitor, verify, and process smart node readings. the data transmitted between smart nodes and rfid readers has three fields: the smart node id, the sensed data, and the sequence number of the data, which indicates when the data was recorded. 
for each node, three packets of data need to be published so that other nodes can get its information. therefore, we need only four bytes of data entries: node id ( byte), heart rate information ( byte), and the sequence number ( bytes). the sequence number helps in discovering how recent the carried information is, and helps other nodes decide whether to record newly read data or discard it. each rfid tag has a -byte capacity; the first bytes are divided into chunks of bytes, each of which is used to store the information of one node; this sums to a total of data slots. the remaining bytes are used for authentication. the first data slot is reserved for the node's own tag. the other data slots are initially marked as available; that is, they do not contain data about other nodes and are ready to be utilized for that purpose. figure shows the flowchart that presents the process of handling new data. when new data arrives and is to be stored, the controller first checks whether a slot that contains data for the same id exists. if so, the slot is updated if its sequence number is less than the new sequence number; otherwise, the new data is discarded. if the controller does not find a previous record for that id, it stores the data in an available slot, if any; otherwise, some data is lost. we implement two levels of security algorithms in our scheme to ensure the integrity of the arriving data as well as to authenticate its source. when a node writes the data bytes into its tag, the data is signed with a -byte signature, which is used for authentication. to obtain the signature, the controller calculates the md5 hash value of the data bytes. the obtained hash is then encrypted with the aes shared key. the result is the signature and is stored on the tag. to verify a newly read tag, the controller computes the hash of the new data (but not the signature), encrypts it with the shared key, and compares the result with the signature. 
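the slot-update flowchart and the hash-then-encrypt signature can be sketched as follows. the slot count is a placeholder (the tag's exact capacity is not reproduced here), and since aes is not in the python standard library, the encryption step is abstracted as a callable passed in by the caller; md5 comes from hashlib, as in the text.

```python
import hashlib

def store_reading(slots, node_id, data, seq):
    """Update a tag's slot table per the flowchart: replace a slot for
    the same id only when the incoming sequence number is newer;
    otherwise take a free slot, or drop the reading if the tag is full.
    slots: list of None (available) or (node_id, data, seq) tuples."""
    for k, slot in enumerate(slots):
        if slot is not None and slot[0] == node_id:
            if slot[2] < seq:                 # newer reading: update
                slots[k] = (node_id, data, seq)
            return slots                      # older duplicate: discard
    for k, slot in enumerate(slots):
        if slot is None:                      # first record for this id
            slots[k] = (node_id, data, seq)
            return slots
    return slots                              # tag full: data is lost

def sign(data, encrypt):
    """md5-hash the data bytes, then encrypt the digest with the shared
    key (encrypt stands in for the AES step)."""
    return encrypt(hashlib.md5(data).digest())

def verify(data, signature, encrypt):
    """Recompute the signature from the data and compare."""
    return sign(data, encrypt) == signature
```

the xor "cipher" in the usage below is only a stand-in so the example is self-contained; a real deployment would pass an aes encryptor sharing the tag's key.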
the new data is valid if the result and the signature match; otherwise, the node is considered invalid and its data is discarded. the experimental prototype consists of three smart nodes ( , , ) and one primary rfid reader, as shown in fig. . each smart node consists of an rfid tag, a microcontroller, a pulse sensor, and an rfrr, a regular rfid reader with a limited range that can read up to two feet with an onboard antenna. the primary rfid reader is an rfid reader attached to an external antenna to increase its transmission range. in this prototype, node , which has the highest battery level, plays the role of the clusterhead, and node and node play the role of the cluster members. node reads the tag information of node and node . the primary rfid reader then receives all packets of node , node , and node from node when it moves into the primary rfid reader's range. the rfid reader then sends the collected information to the backend server for data processing. figure shows a sample of the collected pulse sensor data, which includes the beats per minute (bpm), the live heartbeat or interbeat interval (ibi), and the analog signal (as), on the serial monitor. each row in fig. includes bpm, ibi, and as; for instance, the first row shows the bpm, ibi, and as values. the typical beats-per-minute reading of the pulse sensor should be between and ; otherwise, it is considered an emergency case. it can be observed from figs. and that a valid foreign tag # is read and updated, and a valid foreign tag # is read and then updated, on the serial monitor, respectively. figures and show that tag # and tag # are valid. 
figure shows the captured data packets for an invalid foreign tag. in this example, using the authentication process executed by the controller, the reader reported that tag number four is invalid. the controller computes the hash of the new data, encrypts it with the shared key, and compares the result with the signature; since the result and the signature do not match, tag four is considered an invalid node and its data is discarded. in this paper, we presented a novel technique for iot healthcare monitoring applications based on an rfid clustering scheme. the proposed scheme integrates rfid with wireless sensor systems to gather information efficiently, aiming at monitoring the health of people in large events such as festivals, malls, airports, and train stations. the developed system is composed of clusters of wearable smart nodes. a smart node is composed of an rfid tag, a reduced-function rfid reader, and body sensors. the clusters are reconstructed periodically based on specific criteria, such as the battery level. these clusters collect data from their members, and when they come across rfid readers, they deliver the collected data to these readers. with traditional approaches, on the other hand, only the nodes in the range of the rfid readers can send their tag data to the readers. this causes several performance problems, such as long delays, dropped packets, missing data, and channel-access congestion. the proposed clustering approach overcomes all these problems. it demonstrated outstanding performance in reducing packet transmission delay and inter-node interference and in improving energy utilization. the experimental results support this performance. the proposed approach can easily lend itself to monitoring and collecting the health information of the population continuously, especially in the current pandemic. 
as future research directions, we are planning to integrate the smart nodes with other sensors to ensure a full healthcare application and to test the new application in large-scale scenarios. there is also a need to improve the clustering algorithm to guarantee a high level of service quality for the deployed health applications. conflict of interest: the authors declare that they have no conflict of interest. ethical approval: the study only includes humans roaming a large hall to test the connectivity of the established networks.
references:
a survey on the internet of things security
handbook: fundamentals and applications in contactless smart cards, radio frequency identification and near field communication
efficient data collection for large-scale mobile monitoring applications
influence of thermal boundary conditions on the double-diffusive process in a binary mixture
engineering design process
an object-oriented finite element implementation of large deformation frictional contact problems and applications
x-analysis integration (xai) technology. virginia technical report
preventing deaths due to hemorrhage
taxonomy and challenges of the integration of rfid and wireless sensor networks
neuralwisp: a wirelessly powered neural interface with -m range
a capacitive touch interface for passive rfid tags
building a smart hospital using rfid technologies
wireless localization network for patient tracking
empirical analysis and ranging using environment and mobility adaptive rssi filter for patient localization during disaster management
the research of network architecture in warehouse management system based on rfid and wsn integration
bringing iot and cloud computing towards pervasive healthcare
rfid technology for iot-based personal healthcare in smart spaces
a health-iot platform based on the integration of intelligent packaging, unobtrusive bio-sensor, and intelligent medicine box
an iot-aware architecture for smart healthcare systems
bsn-care: a secure iot-based modern healthcare system using body sensor network
ifhds: intelligent framework for securing healthcare bigdata
effective data collection in multi-application sharing wireless sensor networks
a hybrid approach of rfid and wsn system for efficient data collection
an analysis on optimal cluster ratio in cluster-based wireless sensor networks
wireless regulation and monitoring system for emergency ad-hoc networks using nodes
concurrent data collection trees for iot applications
a cooperation-based routing algorithm in mobile opportunistic networks
a data prediction model based on extended cosine distance for maximizing network lifetime of wsn
crpd: a novel clustering routing protocol for dynamic wireless sensor networks
secure data transmission in hybrid radio frequency identification with wireless sensor networks
real-time healthcare monitoring system using smartphones
energy-efficient data collection scheme based on mobile edge computing in wsns
iterative clustering for energy-efficient large-scale tracking systems
an asynchronous clustering and mobile data gathering schema based on timer mechanism in wireless sensor networks
optimum bilevel hierarchical clustering for wireless mobile tracking systems
modeling and representation to support design-analysis integration
crowd analysis: a survey. machine vision and applications
data-driven crowd analysis in videos
gams specifications
the sparkfun specification
key: cord- - lmwnfda
authors: ray, sumanta; lall, snehalika; mukhopadhyay, anirban; bandyopadhyay, sanghamitra; schonhuth, alexander
title: predicting potential drug targets and repurposable drugs for covid-19 via a deep generative model for graphs
date: - -
journal: nan
doi: nan
sha:
doc_id: cord_uid: lmwnfda
coronavirus disease (covid-19) has been creating a worldwide pandemic situation. repurposing drugs, already shown to be free of harmful side effects, for the treatment of covid-19 patients is an important option for launching novel therapeutic strategies. therefore, reliable molecule interaction data are a crucial basis, and drug-/protein-protein interaction networks establish invaluable, year-long carefully curated data resources. however, these resources have not yet been systematically exploited using high-performance artificial intelligence approaches. here, we combine three networks, two of which are year-long curated, and one of which, on sars-cov-2-human host-virus protein interactions, was published only most recently ( th of april ), raising a novel network that puts drugs, human and virus proteins into mutual context. we apply variational graph autoencoders (vgaes), representing the most advanced deep learning based methodology for the analysis of data that are subject to network constraints. reliable simulations confirm that we operate at utmost accuracy in terms of predicting missing links. we then predict hitherto unknown links between drugs and human proteins against which virus proteins preferably bind. 
the corresponding therapeutic agents present splendid starting points for exploring novel host-directed therapy (hdt) options. the pandemic of covid-19 (coronavirus disease) has affected more than million people. so far, it has caused about . million deaths in over countries worldwide (https://coronavirus.jhu.edu/map.html), with numbers still increasing rapidly. covid-19 is an acute respiratory disease caused by a highly virulent and contagious novel coronavirus strain, sars-cov-2, which is an enveloped, single-stranded rna virus. sensing the urgency, researchers have been relentlessly searching for possible therapeutic strategies in the last few weeks, so as to control the rapid spread. in their quest, drug repurposing establishes one of the most relevant options, where drugs that have been approved (at least preclinically) for fighting other diseases are screened for their possible alternative use against the disease of interest, which here is covid-19. because they were shown to lack severe side effects before, the risks in the immediate application of repurposed drugs are limited. in comparison with de novo drug design, repurposing drugs offers various advantages. most importantly, the reduced development time frame suits the urgency of the situation in general. furthermore, the most recent and most advanced artificial intelligence (ai) approaches have boosted drug repurposing enormously in terms of throughput and accuracy. finally, it is important to understand that the 3d structures of the majority of viral proteins have remained largely unknown, which puts up even higher obstacles for direct approaches to work. the foundation of ai-based drug repurposing are molecule interaction data, optimally reflecting how drugs, viral and host proteins get into contact with each other. during the life cycle of a virus, the viral proteins interact with various human proteins in the infected cells. 
through these interactions, the virus hijacks the host cell machinery for replication, thereby affecting the normal function of the proteins it interacts with. to develop suitable therapeutic strategies and design antiviral drugs, a comprehensive understanding of the interactions between viral and human proteins is essential. when watching out for drugs that can be repurposed to fight the virus, one has to realize that targeting single virus proteins easily leads to the virus escaping the (rather simpleminded) attack by raising resistance-inducing mutations. host-directed strategies avoid this issue. therefore, ( ) we link existing high-quality, long-term curated and refined, large-scale drug/protein-protein interaction data with ( ) molecular interaction data on sars-cov-2 itself, raised only a handful of weeks ago, and ( ) exploit the resulting overarching network using the most advanced, ai-boosted techniques ( ) for repurposing drugs in the fight against sars-cov-2 ( ) in the frame of hdt-based strategies. as for ( )-( ), we will highlight interactions between sars-cov-2-host proteins and human proteins important for the virus to persist, using the most advanced deep learning techniques that cater to exploiting network data. we are convinced that many of the fairly broad spectrum of drugs we raise will be amenable to developing successful hdts against covid-19. in the following, we first describe the workflow of our analysis pipeline and the basic ideas that support it. we proceed by carrying out a simulation study proving that our pipeline accurately predicts missing links in the encompassing drug - human protein - sars-cov-2-protein network that we raise and analyze. namely, we demonstrate that our (high-performance, ai-supported) prediction pipeline accurately re-establishes links that had been explicitly removed before. this provides sound evidence that the interactions we predict in the full network most likely reflect true interactions between molecular interfaces. 
subsequently, we continue with the core experiments. we predict links that are missing in the full (i.e., without artificially removed links), encompassing drug - human protein - sars-cov-2-protein network, raised by combining links from year-long curated resources on the one hand and most recently published covid-19 resources on the other hand. as per our simulation study, a large fraction, if not the vast majority, of the predictions establish true, hence actionable interactions between drugs on the one hand and sars-cov-2-associated human proteins (hence of use in hdt) on the other hand.
figure . overall workflow of the proposed method: the three networks sars-cov-2-host ppi, human ppi, and drug-target network (panel-a) are mapped by their common interactors to form an integrated representation (panel-b). the neighborhood sampling strategy node2vec converts the network into fixed-size low-dimensional representations that preserve the properties of the nodes belonging to the three major components of the integrated network (panel-c). the resulting feature matrix (f) from the node embeddings and the adjacency matrix (a) from the integrated network are used to train a vgae model, which is then used for prediction (panel-d).
for the purposes of high-confidence validation, we carry out a literature study on the overall drugs we put forward. for this, we inspect the postulated mechanism of action of the drugs in the frame of several diseases, including sars-cov and mers-cov driven diseases in particular. see figure for the workflow of our analysis pipeline and the basic ideas that support it. we will describe all important steps in the paragraphs of this subsection. this reduces the training time compared to the general graph autoencoder model. 
we tested the model performance for different numbers of sampled nodes, keeping track of the area under the roc curve (auc), the average precision (ap) score, and the model training time in the frame of a train-validation-test split at proportions : : . table shows the performance of the model for sampled subgraph sizes n_s = , , , and . for sampled nodes, the model's performance is sufficiently good with respect to its training time and validation auc and ap scores. the average test roc-auc and ap scores of the model for n_s = are . ± . and . ± . . to assess the efficacy of the model in discovering the existing edges between only cov-host and drug nodes, we train the model (with n_s = ) on an incomplete version of the graph from which the links between cov-host and drug nodes have been removed. we further compute the feature matrix f based on the incomplete graph and use it. the test set consists of all the previously removed edges. the model performance is markedly better for discovering those edges between cov-host and drug nodes (roc-auc: . ± . , ap: . ± . over runs). the fastgae model is learned with the feature matrix (f) and the adjacency matrix (a). the node feature matrix (f) is obtained from a using the node2vec neighborhood sampling strategy. the model performance is evaluated with and without using f as the feature matrix. figure shows the average performance of the model on validation sets with and without f as input for different numbers of sampled nodes. we calculate average auc and ap scores over complete runs of the model. from figure , it is evident that including f as the feature matrix enhances the model's performance markedly. we use the node2vec framework to learn low-dimensional embeddings of each node in the compiled network. it uses the skipgram algorithm of the word2vec model to learn the embeddings, which eventually groups nodes with a similar 'role' or a similar 'connection pattern' within the graph. 
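the auc and ap scores reported above can be computed from the scores the model assigns to held-out true edges versus non-edges. a minimal stdlib re-implementation (not the authors' evaluation code) makes the two metrics concrete:

```python
def roc_auc(pos_scores, neg_scores):
    """ROC-AUC as a rank statistic: the probability that a randomly
    chosen held-out true edge is scored above a randomly chosen
    non-edge, counting ties as one half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def average_precision(pos_scores, neg_scores):
    """AP: mean of the precision measured at the rank of each true edge
    in the score-sorted list."""
    ranked = sorted([(s, 1) for s in pos_scores] +
                    [(s, 0) for s in neg_scores], reverse=True)
    hits, ap = 0, 0.0
    for rank, (_, is_pos) in enumerate(ranked, start=1):
        if is_pos:
            hits += 1
            ap += hits / rank
    return ap / hits
```

both metrics only depend on the ranking of edge scores, which is why they are the standard choice for link-prediction studies where the absolute probabilities are not calibrated.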
a similar 'role' means that nodes within the groups are more structurally similar/equivalent to each other than to the nodes outside the groups. two nodes are said to be structurally equivalent if they have identical connection patterns to the rest of the network. to explore this, we have analyzed the embedding results in two steps. first, we explore structurally equivalent nodes to identify 'roles' and similar connection patterns to the rest of the network, and later use louvain clustering to examine the same within the groups/clusters. the most_similar function of node2vec inspects the structurally equivalent nodes within the network. we find all the cov-host nodes that are most similar to the drug nodes. while it is expected to observe nodes of the same type within the neighborhood of a particular node, in some cases we found that some drugs are neighbors of cov-host proteins with high probability (p_obs > . ), e.g., for the sars-cov-2 3cl protease. some other drugs, such as 'clenbuterol' and 'fenbendazole', the probable neighbors of ppp cb and eef a, respectively, are used as bronchodilators in asthma. to explore the closely connected groups, we have constructed a neighborhood graph from the node embeddings using the k-nearest-neighbor algorithm and applied louvain clustering (figure, panel-c). although there is a clear separation between the host protein (including cov-host) cluster and the drug cluster, some of the louvain clusters contain both types of nodes. for example, louvain clusters - and - contain four and two drugs along with the other cov-host proteins, respectively. figure panel-d represents a network consisting of these six drugs and their most similar cov-host nodes. for drug-cov-host interaction prediction, we exploit the variational graph autoencoder (vgae), an unsupervised graph neural network model first introduced in [ ] to leverage the concept of the variational autoencoder in graph-structured data. 
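a most_similar-style query over the learned embeddings is just a cosine-similarity ranking; a sketch with hypothetical two-dimensional embeddings and made-up node names (real node2vec embeddings have many more dimensions):

```python
import math

def most_similar(embeddings, query, topn=3):
    """Rank all other nodes by cosine similarity to the query node's
    embedding, mimicking node2vec's most_similar lookup."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    q = embeddings[query]
    scored = [(cos(q, vec), node)
              for node, vec in embeddings.items() if node != query]
    return [node for _, node in sorted(scored, reverse=True)[:topn]]
```

because node2vec places structurally equivalent nodes close together, a drug node turning up among the nearest neighbors of a cov-host protein is exactly the signal the text exploits.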
To make learning faster, we utilized the FastGAE model to take advantage of its fast decoding phase. We have used two data matrices in the FastGAE model for learning: one is the adjacency matrix, which represents the interaction information over all the nodes, and the other is the feature matrix, representing the low-dimensional embeddings of all the nodes in the network. We create a test set of 'non-edges' by removing all existing links between drugs and CoV-host proteins from all possible combinations (CoV-host × drugs) of edges. The model is trained on the whole network with the adjacency matrix A and feature matrix F. The trained model is then applied to the test 'non-edges' to identify the most probable links. We identified a total of most probable links involving drugs and CoV-host proteins with a probability threshold of . . The predicted CoV-host proteins are involved in different crucial pathways of viral infection (Table ). The p-values for pathway and GO enrichment are calculated using the hypergeometric test with . FDR correction.
Figure caption: Drug-CoV-host predicted interactions. Panel A shows the heatmap of probability scores between drugs and CoV-host proteins; the four predicted bipartite modules are annotated as B , B , B and B within the heatmap, and the drugs are colored based on their clinical phase (launched: red; preclinical: blue; phase /phase : green; phase /phase : black). Panels B, C, D and E represent the networks corresponding to the B , B , B and B modules; the drugs are annotated using the disease areas found in the CMap database.
Figure caption: Predicted interactions for probability threshold . . Panel A shows the interaction graph between drugs and CoV-host proteins, with drugs annotated with their usage. Panels B, C, D and E represent quasi-bicliques for one, two, three and more than three drug molecules, respectively.
To get more details of the predicted bipartite graph, we use a weighted bipartite clustering algorithm proposed by J. Beckett. This results in bipartite modules (panel A of Figure ): B ( drugs, CoV-host), B ( drugs, CoV-host), B ( drugs and CoV-host), and B ( drugs and CoV-host). The other panels of the figure show the network diagrams of the four bipartite modules. B contains drugs, including some antibiotics (anisomycin, midecamycin) and anti-cancer drugs (doxorubicin, camptothecin). B also has some antibiotics, such as puromycin, demeclocycline, dirithromycin, geldanamycin, and chlortetracycline; among them, the first three are widely used for bronchitis, pneumonia, and respiratory tract infections. Some other drugs included in the B module, such as lobeline and ambroxol, have a variety of therapeutic uses, including respiratory disorders and bronchitis. The high-confidence predicted interactions (with threshold . ) are shown in Figure , panel A. To highlight some repurposable drug combinations and their predicted CoV-host targets, we performed a weighted clustering (ClusterONE) on this network and found some quasi-bicliques (shown in panels B-E). We matched our predicted drugs with the drug list recently published by Zhou et al. and found six common drugs: mesalazine, vinblastine, menadione, medrysone, fulvestrant, and apigenin. Among them, apigenin has a known antiviral activity together with quercetin, rutin, and other flavonoids. Mesalazine has also proven to be extremely effective in the treatment of other viral diseases such as influenza A/H5N1 virus infection. Baclofen, a GABA(B)-receptor agonist, has a potential role in antiviral treatment. The anti-inflammatory agent fisetin has also been tested for antiviral activity, such as for the inhibition of dengue virus (DENV) infection; it down-regulates the production of proinflammatory cytokines induced by a DENV infection.
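The drug-CoV-host link prediction step described above (enumerate the unobserved drug × CoV-host pairs, score each with the decoder, keep those above a probability threshold) can be sketched as follows. The embeddings, node names and threshold here are hypothetical toy values; in the real pipeline the scores come from the trained FastGAE decoder.

```python
import itertools
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_links(emb, drugs, hosts, known_edges, thresh=0.6):
    """Score every drug x host pair that is NOT a known edge with
    sigmoid(z_d . z_h), and keep the pairs above the probability threshold."""
    preds = []
    for d, h in itertools.product(drugs, hosts):
        if (d, h) in known_edges:
            continue  # only previously unobserved pairs are candidates
        z = sum(a * b for a, b in zip(emb[d], emb[h]))
        p = sigmoid(z)
        if p >= thresh:
            preds.append((d, h, p))
    return sorted(preds, key=lambda t: -t[2])

# Hypothetical toy embeddings and one already-known edge:
emb = {"drug1": [2, 1], "drug2": [-2, 0], "h1": [1, 0], "h2": [0, 2]}
links = predict_links(emb, ["drug1", "drug2"], ["h1", "h2"],
                      known_edges={("drug1", "h1")})
```

With these toy values only the pair (drug1, h2) clears the 0.6 threshold, so it is the single predicted link.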
Both of these drugs are listed in the high-confidence interaction set with three CoV-host proteins: TAPT (interacting with the SARS-CoV-2 protein ORF C), SLC A (interacting with the SARS-CoV-2 protein ORF C), and TRIM (interacting with the SARS-CoV-2 protein ORF A) (Figure , panel C). Topoisomerase inhibitors play an active role as antiviral agents by inhibiting viral DNA replication. Some topoisomerase inhibitors, such as camptothecin, daunorubicin, doxorubicin, irinotecan and mitoxantrone, are predicted to interact with several CoV-host proteins. It has been demonstrated that the anticancer drug camptothecin (CPT) and its derivative irinotecan have a potential role in antiviral activity. They inhibit the host-cell enzyme topoisomerase I, which is required for the initiation as well as the completion of viral functions in the host cell. Daunorubicin (DNR) has also been demonstrated to be an inhibitor of HIV-1 replication in human host cells. The conventional anticancer antibiotic doxorubicin was identified as a selective inhibitor of in vitro dengue and yellow fever virus replication. It has also been reported that doxorubicin coupled with a monoclonal antibody can create an immunoconjugate that can eliminate HIV-1 infection in mouse cells. Mitoxantrone shows antiviral activity against the human herpes simplex virus (HSV) by reducing, in many human cells, the transcription of viral genes that are essential for DNA synthesis. Histone deacetylase inhibitors (HDACi) are generally used as latency-reversing agents for purging HIV-1 from latent reservoirs such as CD4+ memory cells. Our predicted drug list (Table ) contains two HDACi: scriptaid and vorinostat. Vorinostat can be used to achieve latency reversal in the HIV-1 virus safely and repeatedly. Asymptomatic patients infected with SARS-CoV-2 are of significant concern, as they are likely to infect a larger number of people than symptomatic patients.
Moreover, in most cases ( percentile), patients develop symptoms after an average of - days, which is longer than the incubation period of SARS, MERS, or other viruses. To this end, HDACi may serve as good candidates for recognizing and clearing the cells in which SARS-CoV-2 latency has been reversed. Heat shock protein 90 (Hsp90) is described as a crucial host factor in the life cycle of several viruses, involved in cell entry, nuclear import, transcription, and replication. Hsp90 has also been shown to be an essential factor for the SARS-CoV-2 envelope (E) protein, and it has been described as a promising target for antiviral drugs. The list of predicted drugs contains three Hsp90 inhibitors: tanespimycin, geldanamycin, and its derivative alvespimycin. The first two have a substantial effect in inhibiting the replication of herpes simplex virus and human enterovirus (EV ), respectively. Recently, geldanamycin and its derivatives have been proposed as effective drugs for the treatment of COVID-19. Inhibiting DNA synthesis during viral replication is one of the critical steps in disrupting viral infection. The list of predicted drugs contains several such small molecules/drugs, viz. niclosamide, azacitidine, anisomycin, novobiocin, primaquine, menadione, and metronidazole. The DNA synthesis inhibitor niclosamide has great potential to treat a variety of viral infections, including the SARS-CoV, MERS-CoV, and HCV viruses, and has recently been described as a potential candidate to fight the SARS-CoV-2 virus. Novobiocin, an aminocoumarin antibiotic, is also used in the treatment of Zika virus (ZIKV) infections due to its protease inhibitory activity. Chloroquine (CQ) has been demonstrated to be an effective drug against the spread of the severe acute respiratory syndrome coronavirus (SARS-CoV). Recently, hydroxychloroquine (HCQ) sulfate, a derivative of CQ, has been evaluated and shown to efficiently inhibit SARS-CoV-2 infection in vitro.
Therefore, another antimalarial aminoquinoline drug, primaquine, may also contribute to the attenuation of the inflammatory response of COVID-19 patients. Primaquine is also established to be effective in the treatment of Pneumocystis pneumonia (PCP). Cardiac glycosides have been shown to play a crucial role as antiviral drugs. These drugs target host-cell proteins, which helps reduce the resistance to antiviral treatments. The antiviral effects of cardiac glycosides have been attributed to the inhibition of the pump function of Na,K-ATPase, which makes them interesting drugs against human viral infections. The predicted list of drugs contains three cardiac glycoside ATPase inhibitors: digoxin, digitoxigenin, and ouabain. These drugs have been reported to be effective against different viruses, such as herpes simplex, influenza, chikungunya, coronavirus, and respiratory syncytial virus. The proteasome inhibitor MG132 is established to be a strong inhibitor of SARS-CoV replication in the early steps of the viral life cycle: MG132 inhibits the cysteine protease m-calpain, which results in a pronounced inhibition of SARS-CoV replication in the host cell. Resveratrol has been demonstrated to be a significant inhibitor of MERS-CoV infection; resveratrol treatment decreases the expression of the nucleocapsid (N) protein of MERS-CoV, which is essential for viral replication. As MG132 and resveratrol play a vital role in inhibiting the replication of the other coronaviruses SARS-CoV and MERS-CoV, they may be potential candidates for the prevention and treatment of SARS-CoV-2. Another drug, captopril, is an angiotensin-converting enzyme (ACE) inhibitor that directly reduces the production of angiotensin II. Angiotensin-converting enzyme 2 (ACE2) has been demonstrated to be the binding site for SARS-CoV-2, so drugs acting on the renin-angiotensin system, including angiotensin II receptor blockers (ARBs), may be good candidates for the tentative treatment of SARS-CoV-2 infections.
In summary, our proposed method predicts several drug targets and multiple repurposable drugs that have prominent literature evidence of use as antiviral drugs, especially for the two other coronavirus species SARS-CoV and MERS-CoV. Some drugs are also directly associated with the treatment of SARS-CoV-2, as identified by recent literature. However, further clinical trials and several preclinical experiments are required to validate the clinical benefit of these potential drugs and drug targets. In this work, we have successfully generated a list of high-confidence candidate drugs that can be repurposed to counteract SARS-CoV-2 infections. The two major novelties are the integration of the most recently published SARS-CoV-2 protein interaction data on the one hand, and the use of the most recent, advanced AI (deep learning) based high-performance prediction machinery on the other. In experiments, we have validated that our prediction pipeline operates with high accuracy, confirming the quality of the predictions we have raised. The recent publication (April ) of two novel SARS-CoV-2-human protein interaction resources has unlocked enormous possibilities for studying the virulence and pathogenicity of SARS-CoV-2 and the driving mechanisms behind it. Only now have various experimental and computational approaches to the design of drugs against COVID-19 become conceivable, and only now can such approaches be exploited truly systematically, at both sufficiently high throughput and accuracy. Here, to the best of our knowledge, we have done this for the first time. We have integrated the new SARS-CoV-2 protein interaction data with well-established, long-term curated human protein and drug interaction data. These data capture hundreds of thousands of approved interfaces between encompassing sets of molecules, reflecting either drugs or human proteins.
As a result, we have obtained a comprehensive drug-human-virus interaction network that reflects the latest state of the art in terms of our knowledge about how SARS-CoV-2 interacts with human proteins and repurposable drugs. For exploiting the new network (already a new resource in its own right), we have opted for the most recent and advanced deep learning based technology. A generic reason for this choice is the surge in advances, and the resulting boost in predictive performance, of related methods over the last - years. A particular reason is to make use of the most advanced graph neural network techniques, namely variational graph autoencoders, a deep generative model of high accuracy whose practical, scalable implementation was presented only a few months ago (just like the relevant network data). Note that only this recent implementation makes it possible to process networks of sizes in the range of common molecular interaction data. In essence, graph neural networks "learn" the structure of links in networks and infer rules that underlie the interplay of links. Based on the knowledge gained, they can predict links and output the corresponding links together with probabilities for them to indeed be missing. Simulation experiments, reflecting scenarios where links known to exist in our network were re-established by prediction upon their removal, showed that our pipeline does indeed predict missing links with high accuracy. Encouraged by these simulations, we proceeded to perform the core experiments and predicted links to be missing, without prior removal of links, in our encompassing network. These core experiments revealed high-confidence interactions relating to drugs. In our experiments, we focused on predicting links between drugs and human proteins that in turn are known to interact with SARS-CoV-2 proteins (SARS-CoV-2-associated host proteins).
We have decidedly put the focus not on drug-SARS-CoV-2-protein interactions, which would have reflected more direct therapy strategies against the virus. Instead, we have focused on predicting drugs that serve the purposes of host-directed therapy (HDT), because HDT strategies have proven to be more sustainable with respect to mutations by which the virus escapes a response to the applied therapy. Note that HDT strategies particularly cater to drug repurposing attempts, because repurposed drugs have already been shown to lack severe side effects, given that they are either already in use or have successfully passed the preclinical trial stages. We further systematically categorized the repurposable drugs based on their domains of application and molecular mechanisms. Accordingly, we identified and highlighted several drugs that target host proteins that the virus needs in order to enter (and subsequently hijack) human cells. One such example is captopril, which directly inhibits the production of angiotensin II via the angiotensin-converting enzyme axis; ACE2 is in turn already known to be a crucial host factor for SARS-CoV-2. Further, we identified primaquine, an antimalarial drug used to prevent malaria and also Pneumocystis pneumonia (PCP) relapses, because it interacts with the TIM complex TIMM and ALG . Moreover, we have highlighted drugs that act as DNA replication inhibitors (niclosamide, anisomycin), glucocorticoid receptor agonists (medrysone), ATPase inhibitors (digitoxigenin, digoxin), topoisomerase inhibitors (camptothecin, irinotecan), and proteasomal inhibitors (MG132). Note that some drugs are known to have rather severe side effects in their original use (doxorubicin, vinblastine), but the disrupting effect of their short-term usage in severe COVID-19 infections may provide sufficient compensation.
In summary, we have compiled a list of drugs which, when repurposed, are of great potential in the fight against the COVID-19 pandemic, for which therapy options are urgently needed. Our list of predicted drugs suggests both options that had been identified and thoroughly discussed before, and new opportunities that had not been pointed out earlier. The latter class of drugs may offer valuable chances for pursuing new therapy strategies against COVID-19. We have utilized three categories of interaction datasets: human protein-protein interactome data, SARS-CoV-2-host protein interaction data, and drug-host interaction data. We have taken the SARS-CoV-2-host interaction information from two recent studies, by Gordon et al. and by Dick et al. In the former, high-confidence interactions between SARS-CoV-2 and human proteins were identified using affinity-purification mass spectrometry (AP-MS); in the latter, high-confidence interactions were identified using sequence-based PPI predictors (PIPE and SPRINT). The drug-target interaction information has been collected from five databases: the DrugBank database, the ChEMBL database, the Therapeutic Target Database (TTD), the PharmGKB database, and the IUPHAR/BPS Guide to Pharmacology. The total numbers of drugs and drug-host interactions used in this study are and , respectively. We have built a comprehensive list of human PPIs from two datasets: (1) the CCSB human interactome database, consisting of genes and high-quality binary interactions, and (2) the Human Protein Reference Database, which consists of proteins and PPIs. A summary of all the datasets is provided in Table . The CMap database is used to annotate the drugs with their usage in different disease areas. We have utilized node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. It maps the nodes to a low-dimensional feature space that maximizes the likelihood of preserving network neighborhoods.
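Merging the three interaction categories into one network amounts to unioning several undirected edge lists. A minimal sketch (the edge lists and node names below are hypothetical miniature stand-ins for the real datasets):

```python
def build_network(edge_lists):
    """Merge several undirected edge lists (e.g. virus-host, human PPI,
    drug-target) into a single adjacency-set representation."""
    adj = {}
    for edges in edge_lists:
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj

# Hypothetical miniature edge lists, one per data category:
virus_host = [("viral_prot", "HOST1")]
human_ppi = [("HOST1", "HOST2")]
drug_target = [("drugX", "HOST2")]
net = build_network([virus_host, human_ppi, drug_target])
```

In the merged graph, HOST1 is adjacent both to the viral protein and to HOST2, which is exactly the kind of bridge node the link predictor later exploits.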
The principle of feature learning in a graph can be described as follows. Let G = (V, E) be a given graph, where V represents the set of nodes and E the set of edges. The feature representation of the |V| nodes is given by a mapping function f : V → R^d, where d specifies the feature dimension; f may also be represented as a node feature matrix of dimension |V| × d. For each node v ∈ V, N_S(v) ⊂ V defines a network neighborhood of node v, generated using a neighborhood sampling strategy S. The sampling strategy can be described as an interpolation between the breadth-first search and depth-first search techniques. The objective function can be written as max_f Σ_{v∈V} log Pr(N_S(v) | f(v)), which maximizes the likelihood of observing a network neighborhood N_S(v) for a node v given its feature representation f(v). Assuming conditional independence, the probability of observing the neighborhood factorizes as Pr(N_S(v) | f(v)) = Π_{n_i∈N_S(v)} Pr(n_i | f(v)), where n_i is the i-th neighbor of node v in the neighborhood set N_S(v). The conditional likelihood of each source-neighborhood node pair (v, n_i ∈ N_S(v)) is modeled as the softmax of the dot product of their features f(v) and f(n_i): Pr(n_i | f(v)) = exp(f(n_i) · f(v)) / Σ_{u∈V} exp(f(u) · f(v)). The variational graph autoencoder (VGAE) is a framework for unsupervised learning on graph-structured data. This model uses latent variables and is effective in learning interpretable latent representations for undirected graphs. The graph autoencoder consists of two stacked models: (1) an encoder and (2) a decoder. First, an encoder based on graph convolutional networks (GCN) maps the nodes into a low-dimensional embedding space. Subsequently, a decoder attempts to reconstruct the original graph structure from the encoder representations. Both models are jointly trained to optimize the quality of the reconstruction from the embedding space, in an unsupervised way.
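The softmax term Pr(n_i | f(v)) above can be computed directly for a toy set of feature vectors; the sketch below uses hypothetical 2-d vectors simply to illustrate that the probabilities normalize over all nodes and favor neighbors whose vectors align with the source.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def skipgram_prob(f, target, source):
    """Pr(target | source) under the softmax over all node feature vectors,
    as in the objective above."""
    scores = {u: math.exp(dot(vec, f[source])) for u, vec in f.items()}
    return scores[target] / sum(scores.values())

# Hypothetical 2-d feature vectors for three nodes:
f = {"v": [1.0, 0.0], "a": [1.0, 0.0], "b": [-1.0, 0.0]}
```

Node "a", whose vector aligns with the source "v", receives a higher conditional probability than node "b", whose vector points the opposite way, and the probabilities over all nodes sum to one.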
The functions of these two models can be described as follows.
Encoder: it applies a graph convolutional network (GCN) to the adjacency matrix A and the feature representation matrix F. The encoder generates a d'-dimensional latent variable z_i for each node i ∈ V, with |V| = n, corresponding to each embedded node, with d' ≤ n. The inference model of the encoder is q(z_i | A, F) = N(z_i | µ_i, diag(σ_i²)), where µ_i and σ_i are the Gaussian mean and variance parameters. The actual embedding vectors z_i are samples drawn from these distributions.
Decoder: it is a generative model that decodes the latent variables z_i to reconstruct the matrix A, using inner products of the embedding vectors with a sigmoid activation: Â_ij = sigmoid(z_i^T z_j), where Â is the decoded adjacency matrix.
The objective function of the variational graph autoencoder (VGAE) can be written as C_VGAE = E_{q(Z|A,F)}[log p(A | Z)] − D_KL(q(Z | A, F) || p(Z)). The objective C_VGAE maximizes the likelihood of decoding the adjacency matrix with respect to the graph autoencoder weights, using stochastic gradient descent. Here, D_KL(·||·) represents the Kullback-Leibler divergence and p(Z) is the prior distribution of the latent variables.
Drug-SARS-CoV-2 link prediction. Adjacency matrix preparation: in this work, we consider an undirected graph G = (V, E) with |V| = n nodes and |E| = m edges. We denote by A the binary adjacency matrix of G. Here, V consists of SARS-CoV-2 proteins, CoV-host proteins, drug-target proteins, and drugs. The matrix A covers a total of n = n_nc + n_dt + n_nt + n_d nodes, where n_nc is the number of SARS-CoV-2 proteins, n_dt is the number of drug targets, and n_nt and n_d represent the numbers of CoV-host and drug nodes, respectively. The total number of edges is given by m = e_1 + e_2 + e_3, where e_1 represents the interactions between SARS-CoV-2 and human host proteins, e_2 is the number of interactions among human proteins, and e_3 represents the number of interactions between drugs and human host proteins.
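The decoder equation above, Â_ij = sigmoid(z_i^T z_j), is simple enough to sketch directly. The toy latent vectors below are hypothetical: two nodes embedded close together receive a high reconstructed edge probability, while an oppositely embedded node receives a low one.

```python
import math

def decode(Z):
    """Inner-product decoder with sigmoid: A_hat[i][j] = sigmoid(z_i . z_j)."""
    n = len(Z)
    return [[1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(Z[i], Z[j]))))
             for j in range(n)] for i in range(n)]

# Toy latent vectors: nodes 0 and 1 embedded together, node 2 opposite.
Z = [[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
A_hat = decode(Z)
```

Here A_hat[0][1] = sigmoid(4) is close to 1 (a confident edge), while A_hat[0][2] = sigmoid(-4) is close to 0.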
Feature matrix preparation: the neighborhood sampling strategy is used here to prepare a feature representation of all nodes. A flexible biased random-walk procedure is employed to explore the neighborhood of each node. A random walk in a graph G can be described by the probability P(a_i = x | a_{i−1} = v) = π(v, x)/Z if (v, x) ∈ E, and 0 otherwise, where π(v, x) is the transition probability between nodes v and x, a_i is the i-th node in a walk of length l, and Z is a normalizing constant. The transition probability is given by π(v, x) = c_pq(t, x) · w_vx, where t is the node preceding v in the walk, w_vx is the static edge weight, and p, q are the two parameters that guide the walk. The coefficient c_pq(t, x) is equal to 1/p if distance(t, x) = 0, to 1 if distance(t, x) = 1, and to 1/q if distance(t, x) = 2, where distance(t, x) represents the shortest-path distance between node t and node x. The generation of the feature matrix F (of dimension n × d) is governed by the node2vec algorithm. It starts from every node and simulates r random walks of fixed length l. In every step of a walk, the transition probability π(v, x) governs the sampling. The walk generated in each iteration is added to a walk list. Finally, stochastic gradient descent is applied to optimize the skip-gram objective over the list of walks, and the result is returned.
Link prediction: the scalable and fast variational graph autoencoder (FastVGAE) is utilized in our work to reduce the computational time of VGAE on large networks. The adjacency matrix A and the feature matrix F are given to the encoder of FastVGAE. The encoder uses a graph convolutional network (GCN) on the entire graph to create the latent representation Z; it operates on the full adjacency matrix A. After encoding, sampling is performed and the decoder works on the sampled subgraph. The mechanism of the FastVGAE decoder is slightly different from that of the traditional VGAE: it regenerates the adjacency matrix based on a subsample of the graph nodes, V_s, using a node sampling technique that randomly samples the reconstructed nodes at each iteration. Each node i is assigned a probability p_i, and nodes are selected based on high p_i scores.
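The biased-walk coefficient c_pq(t, x) and the resulting one-step transition probabilities can be sketched as follows (toy graph and unit edge weights; the real walks also use the static weights w_vx):

```python
def search_bias(dist_t_x, p, q):
    """node2vec coefficient c_pq(t, x): 1/p to return to t (distance 0),
    1 to stay at distance 1 from t, 1/q to move outward to distance 2."""
    if dist_t_x == 0:
        return 1.0 / p
    if dist_t_x == 1:
        return 1.0
    return 1.0 / q

def transition_probs(adj, t, v, p, q):
    """Normalized probabilities pi(v, x) for one walk step v -> x,
    given the previous node t and unit edge weights."""
    weights = {}
    for x in adj[v]:
        d = 0 if x == t else (1 if x in adj[t] else 2)
        weights[x] = search_bias(d, p, q)
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}

# Toy graph: previous node 't', current node 'v'.
adj = {"t": {"v", "a"}, "v": {"t", "a", "b"}, "a": {"t", "v"}, "b": {"v"}}
probs = transition_probs(adj, "t", "v", p=1.0, q=2.0)
```

With q = 2, the outward move to node "b" (distance 2 from t) gets half the unnormalized weight of the other options, so its probability is 0.2 against 0.4 each for returning to "t" or moving to "a".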
The probability p_i is given by p_i = f(i)^α / Σ_{j∈V} f(j)^α, where f(i) is the degree of node i and α is a sharpening parameter (fixed in our study). The node selection process is repeated until |V_s| = n_s, where n_s is the number of sampled nodes. The decoder reconstructs the smaller matrix A_s, of dimension n_s × n_s, instead of decoding the main adjacency matrix A. The decoder follows the equation A_s(i, j) = sigmoid(z_i^T z_j), ∀(i, j) ∈ V_s × V_s. At each training iteration a different subgraph (G_s) is drawn using the sampling method. After the model is trained, the drug-CoV-host links are predicted using Â_ij = sigmoid(z_i^T z_j), where Â_ij represents the possible link between each combination of SARS-CoV-2-associated host nodes and drug nodes. For each combination of nodes, the model gives a probability based on the logistic sigmoid function.
References:
A new coronavirus associated with human respiratory disease in China
Host-pathogen systems biology
Host-directed therapies for bacterial and viral infections
Network-based drug repositioning: approaches, resources, and research directions
New horizons for antiviral drug discovery from virus-host protein interaction networks
Drug target prediction and repositioning using an integrated network-based approach
Mapping protein interactions between dengue virus and its human and insect hosts
A review of in silico approaches for analysis and prediction of HIV-1-human protein-protein interactions
Network-based study reveals potential infection pathways of hepatitis-C leading to various diseases
Prediction of the Ebola virus infection related human genes using protein-protein interaction network
A genome-wide positioning systems network algorithm for in silico drug repurposing
deepDR: a network-based deep learning approach to in silico drug repositioning
Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2
Network bioinformatics analysis provides insight into drug repurposing for COVID-19
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Comprehensive prediction of the SARS-CoV-2 vs. human interactome using PIPE, SPRINT, and PIPE-Sites
node2vec: scalable feature learning for networks
FastGAE: fast, scalable and effective graph autoencoders with stochastic subgraph decoding
From community to role-based graph embeddings
Specific plant terpenoids and lignoids possess potent antiviral activities against severe acute respiratory syndrome coronavirus
A next generation connectivity map: L1000 platform and the first 1,000,000 profiles
Improved community detection in weighted bipartite networks
DrugBank: a comprehensive resource for in silico drug discovery and exploration
Detecting overlapping protein complexes in protein-protein interaction networks
The therapeutic potential of apigenin
Delayed antiviral plus immunomodulator treatment still reduces mortality in mice infected by high inoculum of influenza A/H5N1 virus
Baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis C virus (HCV) infection
Antiviral and immunomodulatory effects of polyphenols on macrophages infected with dengue virus serotypes and enhanced or not with antibodies
Evaluation of topoisomerase inhibitors as potential antiviral agents
Potent antiviral activity of topoisomerase I and II inhibitors against Kaposi's sarcoma-associated herpesvirus
Antiviral action of camptothecin
An analog of camptothecin inactive against topoisomerase I is broadly neutralizing of HIV-1 through inhibition of Vif-dependent APOBEC3G degradation
Water-insoluble camptothecin analogues as potential antiviral drugs
Inhibition of HIV-1 replication by daunorubicin
A derivate of the antibiotic doxorubicin is a selective inhibitor of dengue and yellow fever virus replication in vitro
Elimination of HIV-1 infection by treatment with a doxorubicin-conjugated anti-envelope antibody
Antiviral activity of mitoxantrone dihydrochloride against human herpes simplex virus mediated by suppression of the viral immediate early genes
Histone deacetylase inhibitors for purging HIV-1 from the latent reservoir
Interval dosing with the HDAC inhibitor vorinostat effectively reverses HIV latency
The incubation period of coronavirus disease (COVID-19) from publicly reported confirmed cases: estimation and application
Synthesis and in vitro anti-HSV-1 activity of a novel Hsp90 inhibitor BJ-B11
Heat shock protein 90 facilitates formation of the HBV capsid via interacting with the HBV core protein dimers
Severe acute respiratory syndrome coronavirus envelope protein regulates cell stress response and apoptosis
Hsp90: a promising broad-spectrum antiviral drug target
Drug repositioning suggests a role for the heat shock protein 90 inhibitor geldanamycin in treating COVID-19 infection
Broad spectrum antiviral agent niclosamide and its therapeutic potential
Hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting SARS-CoV-2 infection in vitro
Pharmacokinetic optimisation in the treatment of Pneumocystis carinii pneumonia
The antiviral effects of Na,K-ATPase inhibition: a minireview
Severe acute respiratory syndrome coronavirus replication is severely impaired by MG132 due to proteasome-independent inhibition of m-calpain
Effective inhibition of MERS-CoV infection by resveratrol
Structural basis of receptor recognition by SARS-CoV-2
Angiotensin receptor blockers as tentative SARS-CoV-2 therapeutics
Next-generation sequencing to generate interactome datasets
Development of Human Protein Reference Database as an initial platform for approaching systems biology in humans
DrugBank 4.0: shedding new light on drug metabolism
ChEMBL: a large-scale bioactivity database for drug discovery
Therapeutic Target Database update: enriched resource for bench to clinical drug target and targeted pathway information
The IUPHAR/BPS Guide to Pharmacology: an expert-driven knowledgebase of drug targets and their ligands
Towards a proteome-scale map of the human protein-protein interaction network
A proteome-scale map of the human interactome network
A reference map of the human binary protein interactome
Stochastic backpropagation and approximate inference in deep generative models
On information and sufficiency (The Annals of Mathematical Statistics)
key: cord- -is su ga
authors: kalogeratos, argyris; mannelli, stefano sarao
title: winning the competition: enhancing counter-contagion in sis-like epidemic processes
journal: nan
doi: nan
cord_uid: is su ga
In this paper we consider the epidemic competition between two generic diffusion processes, where each competing side is represented by a different state of a stochastic process. For this setting, we present the Generalized Largest Reduction in Infectious Edges (GLRIE) dynamic resource allocation strategy to advantage the preferred state against the other. Motivated by social epidemics, we apply this method to a generic continuous-time SIS-like diffusion model where we allow for: i) arbitrary node transition rate functions that describe the dynamics of propagation depending on the network state, and ii) competition between the healthy (positive) and infected (negative) states, which are both diffusive at the same time, yet mutually exclusive on each node. Finally, we use simulations to compare empirically the proposed GLRIE against competitive approaches from the literature. In recent years, the growing amount of available data on networks has led to a revolution in the application of diffusion processes.
The enrichment of analysis by means of detailed information regarding specific populations has yielded a plethora of realistic and accurate models. Through diffusion models, it is possible to study disparate branches of knowledge: economics (competition among products [ ], viral marketing campaigns [ ]), epidemiology (disease spreading, vaccination and immunization problems), computer science (computer viruses, information flow), social sciences (social behavior [ ]) and medicine (obesity diffusion [ ], smoking cessation [ ], alcohol consumption [ ]) are just some instances. A large number of social behaviors can be modeled as states propagating over networks [ , , , ]. Following the availability of diffusion models, many intervention strategies have been developed aiming to answer questions like: What are the most dangerous computers in a network? How to maximize customer awareness for a product? On which individuals is it better to focus to win a poll? A few studies have proposed strategies to advantage one state compared to another (as in marketing campaigns) or to mitigate the diffusion of an undesirable state (as in epidemiology). Most of them are static strategies based on the network structure (e.g. [ ]), while others are dynamic strategies that use the whole information about the current state of the system to suggest the best elements to treat. Among them, the Largest Reduction in Infectious Edges (LRIE) [ ] has been shown to be the optimal greedy algorithm for resource allocation under limitations in the resource budget, in the N-intertwined susceptible-infected-susceptible (SIS) epidemic model. This model is a two-state continuous-time Markov process over a network, in which a node changes state according to a transition rate that is linear in its neighbors' states.
however, sis models have been deemed too simple to describe the complexity of real-world phenomena such as the simultaneous presence of two distinct viruses spreading on the same network. in particular, two cases can be considered: in the first, an individual can be infected by both diseases at the same time (e.g. as in the si i s model [ ] ); in the second, mutual exclusivity applies and only one infection is allowed for each individual at a given time (e.g. the si | s model [ ] ). other attempts changed the dynamical equations themselves (as in the sisa model [ , ] ). in this study, we propose the generalized largest reduction in infectious edges (glrie) strategy, which is adapted for the diffusion competition of recurrent epidemics, as well as for non-linearity and saturation in the node transition rate functions. this strategy subsumes the lrie strategy [ ] and, as such, provides an optimal greedy approach for this more sophisticated network diffusion setting. glrie computes a node score using only local information about the state of close-by nodes. although in the present formulation the method can be applied to any two-state recurrent markov process, and is easily generalizable to more states, in this work we focus on social behaviors that can be 'healthy' or 'unhealthy', the latter with negative effects on the social environment. given a limited amount of resources, we would like to target the few key individuals so as to minimize the negative effects. apart from the aforementioned habits affecting one's personal health (e.g. unhealthy diet, smoking, etc.), the recent covid- pandemic highlighted yet another interesting 'unhealthy' misbehavior: the disregard of confinement under a city lock-down, or of social distancing guidelines in general.
indeed, this kind of misbehavior is a determinant factor for the reproduction rate (the infamous r t ) of an epidemic over time, and compliance can be readily enforced by making more controls in key areas, or by using mobility and contact information at the individual level. a graph g = (v, e) consists of a set of nodes v, with n = |v|, endowed with a set of edges e ⊂ v × v. it can be intuitively represented by its adjacency matrix a ∈ { 0, 1 } n×n , where each element a ij is 1 if (i, j) ∈ e, and 0 otherwise. without loss of generality, we refer to undirected graphs without self-loops, i.e. a = a t and a ii = 0, ∀i = 1, ..., n. the neighborhood of node i is the set of all nodes connected to it by a direct edge, denoted n i = {i k , ∀k ∈ {1, ..., d i } : (i k , i) ∈ e}. the size of n i equals the node degree, i.e. |n i | = d i = ∑ j a ji . we also denote the indicator function by 1{·}. the standard continuous-time homogeneous sis model describes the spread of a disease over a graph, where each node i represents an individual that can be in either the susceptible or the infected state: x i (t) = 0 or 1, respectively. the system at time t is hence globally represented by the node state vector x(t) ∈ {0, 1} n . the state of a specific node i evolves according to stochastic transition rates in which the parameters β, δ encode respectively the infection aggressiveness and the self-recovery capability of nodes. the epidemic control is realized by the resource allocation vector r(t) ∈ {0, 1} n , whose coordinate r i (t) = 1 iff we heal node i at time t, and 0 otherwise. finally, ρ is the increase in recovery rate when a node receives a resource unit (thought of as treatment). a generic two-state recurrent model. in this paper we study the dynamic epidemic suppression problem by first introducing the following generic two-state markovian process (eq. ii. ), where i i and h i are two node-specific memoryless functions: respectively, the infection rate function and the recovery rate function for node i. the rate functions depend on the current overall network state x(t) and implicitly on the network structure (we omit this dependency in our notation). a markovian poisson process can be recovered using the rate functions of eq. ii. . in the dynamic resource allocation (dra) problem [ , ] , the objective is to administer a budget of b treatment resources, each of strength ρ, in order to suppress an undesired state's diffusion. the treatments cannot be stored, and their efficiency is limited to a certain value. in [ ] , a greedy dynamic score-based strategy called largest reduction of infectious edges (lrie) is developed to address the dra problem. specifically, each node is associated with a score quantifying how critical it is for further spreading the infection of the standard sis model, eq. (ii. ). other score-based solutions have been proposed, e.g. based on fixed priority planning [ ] , or static ones based on spectral analysis [ ] (see details in sec. iv). the proposed generalized largest reduction in infectious edges (glrie) strategy identifies and targets, at each time, the most critical nodes in order to reduce the disease as quickly as possible. the idea generalizes the one introduced in [ ] , extending it to a wider range of models. let n i (t) = ∑ i x i (t) be the number of infected nodes at time t. in a markovian setting, given the state of the population x at time t, the best intervention with respect to the resource allocation vector r would minimize a cost function in which γ can be chosen so as to put emphasis on short-term effects [ , ] . expanding the cost in series yields three terms; their detailed evaluation can be found in the supplementary material. here, though, we present the final results.
to simplify our notation, we denote the updated transition rates of node j when node i is considered healthy, for the negative and the positive diffusion respectively, and we define accordingly the differences in these rates, from which the final forms of the derivatives follow. in the third equation, since our purpose is to minimize eq. iii. with respect to r i , we let the terms that are independent of any r i be absorbed into the function Ξ(t). the terms of the expansion provide information about the way in which healing a node affects the cost function: the first order does not provide any new information, the second order suggests something as trivial as healing only infected nodes, while the third order quantifies the contribution of healing a specific node to reducing the cost function. based on eq. iii. we derive the following score for each infected node i. the score has the following interpretation. we can identify two main parts: the quantification of the transition rate h i + i i of the node, and the effect of its recovery on its neighbors. on the one hand, if a node could easily get reinfected (high i i value) or is going to be healed rapidly by either self-recovery or the positive diffusion (high h i value), then it is not a good candidate to invest resources on. on the other hand, if a possible node recovery would largely increase the healing rate of its infected neighbors (low ∆h −i j value), then the node is attributed a higher score. finally, if a possible node recovery would largely decrease the infection rate of its infected neighbors (low ∆i −i j value), then the node also receives a higher score. algorithm. at time t, the glrie strategy takes as input the network state x(t) and the budget of resources b. it independently computes the criticality score of eq. (iii. ) for each node, ranks the nodes, and finally marks with 1's in the resource allocation vector r(t) which nodes to target while respecting the budget, i.e. ∑ i r i (t) = min(b, ∑ i x i (t)).
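the ranking-and-budget step above can be written in a few lines. the function below is an illustrative sketch, not the authors' implementation; it assumes the per-node criticality scores of eq. (iii. ) have already been computed, and the list-based representation is hypothetical.

```python
def glrie_allocate(x, scores, b):
    """Illustrative sketch of the GLRIE allocation step: rank infected
    nodes by their (precomputed) criticality score and mark the top-b
    of them in the resource allocation vector r."""
    n = len(x)
    r = [0] * n
    infected = [i for i in range(n) if x[i] == 1]
    # rank infected nodes by decreasing criticality score
    ranked = sorted(infected, key=lambda i: scores[i], reverse=True)
    # respect the budget: sum(r) = min(b, number of infected nodes)
    for i in ranked[:min(b, len(infected))]:
        r[i] = 1
    return r
```

for instance, with x = [1, 0, 1, 1, 0], scores = [0.5, 9.0, 0.2, 0.8, 0.0] and b = 2, the two highest-scoring infected nodes (3 and 0) are targeted; node 1, despite its high score, is ignored because it is already healthy.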
the computational cost of the algorithm is o(n + n log n ). in this section we select competitors from the literature, define specific diffusion functions for the comparison, and present simulations on random and real networks. other strategies. as a naive baseline, we use random allocation (rand), which targets infected nodes at random. the second competitor is the largest reduction in spectral radius (lrsr) [ ] , which is based on spectral graph analysis generalized to arbitrary healing effects (ρ = ∞). lrsr selects nodes that maximize the eigen-drop of the largest eigenvalue of the adjacency matrix, known as the spectral radius. the next competitor is maxcut minimization (mcm) [ ] , which introduces the priority-planning approach. the strategy proceeds according to a precomputed node priority order, that is, a linear arrangement of the network with minimal maxcut, i.e. the maximum number of edges that need to be cut in order to split the ordering into two parts. the last but most direct competitor is the greedy dynamic lrie [ ] that we generalize in this work. diffusion function. generally, the state transition rate for a node can be assumed to be a function either of the absolute number of neighbors in the opposing state (standard for sis), or of the fraction of those nodes out of all neighbors. here we take the former option as an example, since the strategies presented in the literature consider that type, which makes for a fair comparison. future work could include additional experiments with the latter type. social behaviors have complex properties that are not covered by standard sis models, such as non-linearity and saturation in the node transition rates [ ] . we employ sigmoid functions to model these properties, where n and d − n are the number of infected and healthy neighbors; the s i (resp. s h ) parameter controls the saturation level and the i (resp. h ) parameter the slope at the origin.
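one concrete sigmoid family with exactly these two controls (a saturation level and a slope at the origin) is the scaled tanh below. the exact parameterization used in the paper is not reproduced in this excerpt, so this functional form is an assumption for illustration only.

```python
import math

def sigmoid_rate(n_opposing, saturation, slope):
    """Assumed sigmoid transition rate: grows with the number of
    opposing-state neighbors, has derivative `slope` at the origin,
    and saturates at `saturation` for many opposing neighbors."""
    return saturation * math.tanh(slope * n_opposing / saturation)
```

with a large saturation level the function stays close to linear over the relevant range (recovering standard-sis-like behavior), while a small saturation level makes it flatten after the first opposing-state neighbor, matching the "left border" regime discussed in the experiments.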
next, we present comparative experiments in erdös-rényi (er), preferential attachment (pr), and small-world (sw) random networks of size nodes each. first we gradually introduce non-linearity in the diffusion, and then we show the effects of also introducing competition in the diffusion. from linear to non-linear spreading. we first consider only the negative diffusion (i.e. h = ) and gradually increase i in an er random graph, moving from linear functions (as in the standard sis model) to non-linear ones. fig. shows the percentage of infected nodes over time, averaged over , simulations, with the % confidence interval under the hypothesis of a gaussian distribution. the results show that in the presence of non-linearity our strategy becomes much more efficient than the competitors. introducing competition. next, in fig. we present the effects of the positive diffusion, embedded in the function h, on er, pr, and sw random networks. the last plot of each row shows the shape of the diffusion functions used in the simulations. the simulations show that, unlike glrie, the methods from the literature lack the modeling power to deal with this complex setting involving non-linearity and competition, and to suppress the infection. we performed simulations on the gnutella peer-to-peer network containing , nodes and , edges. two scenarios were used for the simulations, with and without positive diffusion, using a wide range of parameters. out of the many possible evaluation metrics for the quality of a strategy, e.g. expected extinction time (eet), final percentage of infection (fis), and area under the curve (auc), we choose the auc. this has several advantages: it provides useful measurements even if the strategy did not remove the infection, which is a limitation of the eet metric, and it accounts for the total number of infected nodes over the course of the process, which in a socioeconomic context is more interesting than the fis metric.
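as a minimal sketch of the chosen metric, the auc of an infection curve can be computed by trapezoidal integration of the fraction of infected nodes over time (the function name and inputs are illustrative, not from the paper):

```python
def infection_auc(times, infected_fraction):
    """Area under the infection curve: trapezoidal integration of the
    fraction of infected nodes over time. Lower values mean the
    strategy kept the infection smaller, or removed it sooner."""
    auc = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        auc += 0.5 * (infected_fraction[k] + infected_fraction[k - 1]) * dt
    return auc
```

for example, an infection that decays linearly from 100% to 0% over two time units has an auc of 1.0, while one that is suppressed immediately approaches 0.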
the empirical comparison between glrie and competitors such as lrie and mcm is summarized in fig. . in each heatmap, we fix the shape of the transition function h of the positive diffusion and vary only the parameters of the function i of the negative diffusion: its saturation level increases along the x-axis and its slope along the y-axis. on the top-left side of a heatmap, the epidemic parameters define a weak infection and any strategy performs well, while on the bottom-right side the infection becomes hard for all strategies to remove completely (given the amount of resources). moreover, at the left border, the low saturation level causes i to saturate with just one neighbor of the opposing state. in the regime where i is almost linear and h = , glrie and lrie are equivalent and perform almost the same. the general remark on the results is that glrie appears to be the most versatile and best-performing strategy in this setting of competitive spreading. in this paper we discussed a general form of recurrent two-state continuous-time markov process that allows both non-linear node transition functions and competition between the two states. we then proposed the generalized lrie (glrie) strategy to suppress the diffusion of the undesired state. experiments showed that glrie is well adapted to the considered setting of competitive spreading, and makes better use of the available resources by targeting the most critical infected nodes, compared to competitors from the literature. future work could generalize the approach to more competing epidemic states and incorporate factors related to the network structure into the node scores.
winner takes all: competing viruses or ideas on fair-play networks
the dynamics of viral marketing
emotions as infectious diseases in a large social network: the sisa model
the spread of obesity in a large social network over years
the collective dynamics of smoking in a large social network
the spread of alcohol consumption behavior in a large social network
infectious disease modeling of social contagion in networks
on the vulnerability of large graphs
a greedy approach for dynamic control of diffusion processes in networks
interacting viruses in networks: can both survive?
suppressing epidemics in networks using priority planning
optimal control of epidemics in metapopulations
optimizing the control of disease infestations at the landscape scale
key: cord- -cx elpb authors: hassani-pak, keywan; singh, ajit; brandizi, marco; hearnshaw, joseph; amberkar, sandeep; phillips, andrew l.; doonan, john h.; rawlings, chris title: knetminer: a comprehensive approach for supporting evidence-based gene discovery and complex trait analysis across species date: - - journal: biorxiv doi: . / . . . sha: doc_id: cord_uid: cx elpb generating new ideas and scientific hypotheses is often the result of extensive literature and database reviews, overlaid with scientists' own novel data and a creative process of making connections that were not made before. we have developed a comprehensive approach to guide this technically challenging data integration task and to make knowledge discovery and hypothesis generation easier for plant and crop researchers. knetminer can digest large volumes of scientific literature and biological research to find and visualise links between the genetic and biological properties of complex traits and diseases.
here we report the main design principles behind knetminer and provide use cases for mining public datasets to identify unknown links between traits such as grain colour and pre-harvest sprouting in triticum aestivum, as well as an evidence-based approach to identify candidate genes under an arabidopsis thaliana petal size qtl. we have developed knetminer knowledge graphs and applications for a range of species including plants, crops and pathogens. knetminer is the first open-source gene discovery platform that can leverage genome-scale knowledge graphs, generate evidence-based biological networks and be deployed for any species with a sequenced genome. knetminer is available at http://knetminer.org. manual review is prone to information being overlooked and subjective biases being introduced. even when the task of gathering information is complete, it is demanding to assemble a coherent view of how each piece of evidence might come together to "tell a story" about the biology that can explain how multiple genes might be implicated in a complex trait or disease. new tools are needed to provide scientists with a more fine-grained and connected view of the scientific literature and databases, rather than the conventional information retrieval tools currently at their disposal. scientists are not alone with these challenges. search systems form a core part of the duties of many professions. studies have highlighted the need for search systems that give confidence to the professional searcher, and therefore trust, explainability, and accountability remain significant challenges. knetminer provides search term suggestions and real-time query feedback. from a search, a user is presented with the following views: gene view is a ranked list of candidate genes along with a summary of related evidence types. map view is a chromosome-based display of qtl, gwas peaks and genes related to the search terms.
evidence view is a ranked list of query-related evidence terms and enrichment scores along with linked genes. by selecting one or multiple elements in these three views, the user can get to the network view to explore a gene-centric or evidence-centric knowledge network related to their query and the subsequent selection. early work established the inheritance of grain colour (nilsson-ehle, ) and that the red pigmentation of wheat grain is controlled by r genes on the long arms of chromosomes a, b, and d (sears, ; figure a ). this network is displayed in the network view, which provides interactive features to hide or add specific evidence types. nodes are displayed in a defined set of shapes, colors and sizes to distinguish different types of evidence. a shadow effect on a node indicates that more information is available but has been hidden. the auto-generated network, however, does not yet tell a story specific to our traits of interest, and is limited to evidence that is phenotypic in nature. to further refine and extend the search for evidence that links tt to grain color and phs, we can provide additional keywords relevant to the traits of interest. seed germination and dormancy are the underlying developmental processes that activate or prevent pre-harvest sprouting in many grains and other seeds. the colour of the grain is known to be determined through accumulation of proanthocyanidin, an intermediate in the flavonoid pathway, found in the seed coat. these terms and phrases can be combined using boolean operators (and, or, not) and used in conjunction with a list of genes. thus, we search for traescs d g (tt ) and the keywords: "seed germination" or "seed dormancy" or color or flavonoid or proanthocyanidin.
this time, knetminer filters the extracted tt knowledge network ( nodes) down to a smaller subgraph of nodes and relations in which every path from tt to another node corresponds to a line of evidence for a phenotypic or molecular characteristic matching our keywords of interest ( figure b ). overall, the exploratory link analysis has generated a potential link between grain color and phs due to the tt -mft interaction, and suggested a new hypothesized relationship between two traits (phs and root hair density) that were not part of the initial investigation and were previously thought to be unrelated. furthermore, it raises the possibility that tt mutants might lead to increased root hairs and to higher nutrient and water absorption, and therefore cause early germination of the grain. more data and experiments will be needed to address this hypothesis and close the knowledge gap. certain graph patterns are ones that biologists would generally agree to be informative when studying the function of a gene. searching a kg for such patterns is akin to searching for relevant sentences containing evidence that supports a particular point of view within a book. such evidence paths can be short, e.g. gene a was knocked out and phenotype x was observed; or alternatively the evidence path can be longer, e.g. gene a in species x has an ortholog in species y, which was shown to regulate the expression of a disease-related gene (with a link to the paper). in the first example, the relationship between gene and disease is directly evident and experimentally proven, while in the second example the relationship is indirect and less certain but still biologically meaningful. there are many evidence types that should be considered when evaluating the relevance of a gene to a trait. in a kg context, a gene is considered to be, for example, related to 'early flowering' if any of its biologically plausible graph patterns contain nodes related to 'early flowering'.
in this context, the word 'related' doesn't necessarily mean that the gene in question will have an effect on 'early flowering'; moreover, the amount of connected evidence can be far too large to be shown to a user, let alone when combining gcss for tens to hundreds of genes. there is therefore a need to filter and visualise the subset of information in the gcss that is most interesting to a specific user. however, the interestingness of information is subjective and will depend on the biological question or the hypothesis that needs to be tested. a scientist with an interest in disease biology is likely to be interested in links to publications, pathways, and annotations related to diseases, while someone studying the biological process of grain filling is likely more interested in links to physiological or anatomical traits. to reduce information overload and visualise the most interesting pieces of information, we have devised two strategies. 1) in the case of a combined gene and keyword search, we use the keywords as a filter to show only paths in the gcs that connect genes with keyword-related nodes, i.e. nodes that contain the given keywords in one of their node properties. in the special case where too many publications remain even after keyword filtering, we select the most recent n publications (default n= ). nodes not matching the keyword are hidden but not removed from the gcs. 2) in the case of a simple gene query (without additional keywords), we initially show all paths between the gene and nodes of type phenotype/trait, i.e. any semantic motif that ends with a trait/phenotype, as this is considered the most important relationship by many knetminer users. gene ranking. we have developed a simple and fast algorithm to rank genes and their gcss by importance. we give every node in the kg a weight composed of three components, referred to as sdr, standing for the specificity to the gene, the distance to the gene, and the relevance to the search terms. specificity reflects how specific a node is to a gene in question.
for example, a publication that is cited (linked) by hundreds of genes receives a smaller weight than a publication which is linked to only one or two genes. we define the specificity of a node x in terms of n, the frequency with which the node occurs across all n gcss. distance assumes that information associated more closely with a gene can generally be considered more certain, versus information that is further away, e.g. inferred through homology, where each additional interaction increases the uncertainty of annotation propagation. a short semantic motif is therefore given a stronger weight, whereas a long motif receives a weaker weight. thus, we define the second weight as the inverse shortest-path distance between a gene g and a node x. both weights s and d are not influenced by the search terms and can therefore be pre-computed for every node in the kg. relevance reflects the relevance or importance of a node to user-provided search terms using the well-established measures of inverse document frequency (idf) and term frequency (tf) (salton & yang). we define the knetscore of a gene as a sum that considers only gcs nodes containing the search terms. in the absence of search terms, we sum over all nodes of the gcs with r= for each node. after the computation of the knetscore, initial views familiar to biologists, such as tables and chromosome views, allow them to explore the data, make choices as to which gene to view, or refine the query if needed. these initial views help users to reach a certain level of confidence in the selection of potential candidate genes. however, they do not tell the biological story that links candidate genes to traits and diseases. in a second step, to enable the stories and their evidence to be investigated in full detail, the network view visualises highly complex information in a concise and connected format, helping facilitate biologically meaningful conclusions.
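the sdr weighting can be sketched as follows. the exact functional forms are not reproduced in this excerpt, so the inverse-frequency specificity, inverse-distance weight, and tf-idf relevance below are assumed shapes consistent with the description, and all function names are illustrative.

```python
import math

def specificity(gcs_count):
    # a node linked by many gene-centric subgraphs (GCSs) is less
    # specific to any one gene, so it gets a smaller weight
    return 1.0 / gcs_count

def distance_weight(path_len):
    # evidence closer to the gene (a shorter semantic motif) weighs more
    return 1.0 / path_len

def relevance(tf, n_docs, df):
    # classic tf-idf, as referenced in the text
    return tf * math.log(n_docs / df)

def knetscore(matching_nodes):
    """matching_nodes: (gcs_count, path_len, tf, n_docs, df) tuples,
    one per GCS node that contains the search terms."""
    return sum(
        specificity(c) * distance_weight(d) * relevance(tf, n, df)
        for c, d, tf, n, df in matching_nodes
    )
```

under these assumed forms, a node linked by twice as many genes contributes half the weight, matching the publication example in the text.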
consistent graphical symbols are used for representing evidence types throughout the different views, so that users develop a certain level of familiarity before being exposed to networks with complex interactions and rich content. scientists spend a considerable amount of time searching for new clues and ideas by synthesizing many different sources of information and using their expertise to generate hypotheses. knetminer is a user-friendly platform for biological knowledge discovery and exploratory data mining. it allows humans and machines to effectively connect the dots in life science data and literature, search the connected data in an innovative way, and then return the results in an accessible, explorable, yet concise format that can be easily interrogated to generate new insights.
discovering protein drug targets using
the monarch initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species
a wheat homolog of mother of ft and tfl acts in the regulation of germination
zur kenntnis der mit der keimungsphysiologie des weizens in zusammenhang stehenden inneren faktoren
bioinformatics meets user-centred design: a perspective
meta-analysis of the heritability of human traits based on fifty years of twin studies
information retrieval in the workplace: a comparison of professional search practices
progress in biomedical knowledge discovery: a -year
on the specification of term values in automatic indexing
cytogenetic studies with polyploid species of wheat
knowledge graphs and knowledge networks: the story in brief
knetmaps: a biojs component to visualize biological knowledge networks
identification of loci governing eight agronomic traits using a gbs-gwas approach and validation by qtl mapping in soya bean
big data: astronomical or genomical?
sensitivity to "sunk costs" in mice, rats, and humans
iwgsc whole-genome assembly principal investigators, whole-genome sequencing and assembly: shifting the limits in wheat research and breeding using a fully annotated reference genome
trend analysis of knowledge graphs for crop pest and diseases
mother of ft and tfl regulates seed germination through a negative feedback loop modulating aba signaling in arabidopsis
use of graph database for the integration
allelic variation and transcriptional isoforms of wheat tamyc gene regulating anthocyanin synthesis in pericarp
the authors declare that they have no competing interests.
key: cord- -eelqmzdx authors: guo, chungu; yang, liangwei; chen, xiao; chen, duanbing; gao, hui; ma, jing title: influential nodes identification in complex networks via information entropy date: - - journal: entropy (basel) doi: . /e sha: doc_id: cord_uid: eelqmzdx identifying a set of influential nodes is an important topic in complex networks, playing a crucial role in many applications, such as market advertising, rumor controlling, and predicting valuable scientific publications. in this regard, researchers have developed algorithms ranging from simple degree methods to all kinds of sophisticated approaches. however, a more robust and practical algorithm is required for the task. in this paper, we propose the enrenew algorithm, aimed at identifying a set of influential nodes via information entropy. firstly, the information entropy of each node is calculated as its initial spreading ability. then, the node with the largest information entropy is selected and its l-length reachable nodes' spreading ability is renovated by an attenuation factor; this process is repeated until a specific number of influential nodes has been selected. compared with the best state-of-the-art benchmark methods, the performance of the proposed algorithm improved by . %, . %, . %, . %, . %, and .
% in the final affected scale on the cenew, email, hamster, router, condmat, and amazon networks, respectively, under the susceptible-infected-recovered (sir) simulation model. the proposed algorithm measures the importance of nodes based on information entropy and selects a group of important nodes through a dynamic update strategy. the impressive results on the sir simulation model shed light on a new method of node mining in complex networks for information spreading and epidemic prevention. complex networks are common in real life and can be used to represent complex systems in many fields. for example, collaboration networks [ ] capture the scientific collaborations between authors, email networks [ ] denote the email communications between users, protein-dna networks [ ] help people gain deep insight into biochemical reactions, railway networks [ ] reveal the structure of railways via complex network methods, social networks show interactions between people [ , ] , and the international trade network [ ] reflects the trade of products between countries. a deep understanding and control of different complex networks is of great significance for information spreading and network connectivity. on the one hand, by using influential nodes, we can make successful advertisements for products [ ] , discover drug target candidates, and assist information spreading in weighted networks [ ] and social networks [ ] . however, a node set built by simply assembling and sorting nodes, as employed by the aforementioned methods, may not be comparable to an elaborately selected set of nodes due to the rich-club phenomenon [ ] , namely, that important nodes tend to overlap with each other. thus, many methods that aim to directly select a set of nodes have been proposed. kempe et al. defined the problem of identifying a set of influential spreaders in complex networks as the influence maximization problem [ ] , and they used a hill-climbing-based greedy algorithm that is within % of optimal in several models.
the greedy method [ ] is usually taken as the approximate solution of the influence maximization problem, but it is inefficient due to its high computational cost. chen et al. [ ] proposed the newgreedy and mixedgreedy methods. borgatti [ ] approached mining influential spreaders in social networks through two classes of problems, kpp-pos and kpp-neg, based on which he calculated the importance of nodes. narayanam et al. [ ] proposed the spin algorithm, based on the shapley value, to deal with the information diffusion problem in social networks. although the above greedy-based methods can achieve relatively good results, they cost a great deal of time on monte carlo simulations, so more heuristic algorithms were proposed. chen et al. put forward the simple and efficient degreediscount algorithm [ ] , in which, once a node is selected, its neighbors' degrees are discounted. zhang et al. proposed voterank [ ] , which selects the influential node set via a voting strategy. zhao et al. [ ] introduced coloring techniques into complex networks to separate independent node sets, and selected nodes from different sets, ensuring the selected nodes are not closely connected. hu et al. [ ] and guo et al. [ ] further considered the distance between independent sets and achieved a better performance. bao et al. [ ] sought to find dispersively distributed spreaders by a heuristic clustering algorithm. zhou [ ] proposed an algorithm to find a set of influential nodes via message passing theory. ji et al. [ ] considered percolation in the network to obtain a set of distributed and coordinated spreaders. researchers have also sought to maximize influence by studying communities [ ] [ ] [ ] [ ] [ ] [ ] . zhang [ ] separated graph nodes into communities using the k-medoid method before selecting nodes. gong et al. [ ] divided the graph into communities of different sizes, and selected nodes using degree centrality and other indicators. chen et al. [ ] detected communities using the shrink and kcut algorithms.
later, they selected nodes from different communities as candidate nodes and used the cdh method to find the final k influential nodes. recently, some novel methods based on node dynamics have been proposed which rank nodes to select influential spreaders [ , ] . Şirag erkol et al. made a systematic comparison between methods focused on the influence maximization problem [ ] . they classify multiple algorithms into three classes, and made a detailed explanation and comparison between methods. more algorithms in this domain are described and classified clearly by lü et al. in their review paper [ ] . most of the non-greedy strategy methods suffer from the possibility that some spreaders are so close that their influence may overlap. degreediscount and voterank use an iterative selection strategy: after a node is selected, they weaken its neighbors' influence to cope with the rich-club phenomenon. however, these two algorithms only roughly incorporate nodes' local information. besides, they do not further make use of the differences between nodes when weakening nodes' influence. in this paper, we propose a new heuristic algorithm named enrenew, based on node entropy, to select a set of influential nodes. enrenew also uses an iterative selection strategy. it initially calculates the influence of each node by its information entropy (further explained in section . ), and then repeatedly selects the node with the largest information entropy and renovates its l-length reachable nodes' information entropy by an attenuation factor, until a specific number of nodes has been selected. experiments show that the proposed method yields the largest final affected scale on real networks under the susceptible-infected-recovered (sir) simulation model, compared with state-of-the-art benchmark methods. the results reveal that enrenew could be a promising tool for related work.
besides, to make the algorithm more useful in practice, we provide enrenew's source code and all experiment details at https://github.com/yangliangwei/influential-nodes-identification-in-complex-networksvia-information-entropy, which researchers can download freely. the rest of the paper is organized as follows: the identification method is presented in section . experiment results are analyzed and discussed in section . conclusions and future research topics of interest are given in section . the best way to measure the influence of a set of nodes in complex networks is through a propagation dynamic process on real-life network data. the susceptible-infected-removed model (sir model) was initially used to simulate the dynamics of disease spreading [ ]. it has since been widely used to analyze similar spreading processes, such as rumors [ ] and populations [ ]. in this paper, the sir model is adopted to objectively evaluate the spreading ability of the nodes selected by each algorithm. each node in the sir model is in one of three states: susceptible (s), infected (i), or recovered (r). initially, the selected seed nodes are set to the infected state and all other nodes in the network to the susceptible state. in each propagation iteration, each infected node randomly chooses one of its direct neighbors and infects it with probability µ. meanwhile, each infected node recovers with probability β and cannot be infected again. in this study, λ = µ/β is defined as the infection rate, which is crucial to the spreading speed in the sir model. the network reaches a steady state with no infection after sufficiently many propagation iterations. to enable information to spread widely in networks, we set µ = . µ c, where µ c = ⟨k⟩/(⟨k²⟩ − ⟨k⟩) [ ] is the spreading threshold of sir and ⟨k⟩ is the average degree of the network. when µ is smaller than µ c, spreading in sir can only affect a small range or even dies out immediately.
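as a concrete illustration, the sir variant described above can be sketched in a few lines of python. this is our own minimal sketch, not the paper's code: `adj` is a plain adjacency dict, and the function name and signature are assumptions for illustration.

```python
import random

def sir_contact_process(adj, seeds, mu, beta, rng=None, max_steps=100000):
    """SIR variant described in the text: in each iteration every infected
    node picks ONE random direct neighbour and infects it with probability
    mu; each infected node then recovers with probability beta and cannot
    be infected again.  Returns the set of ever-infected (recovered) nodes
    and the affected count (infected + recovered) per time step."""
    rng = rng or random.Random()
    infected, recovered = set(seeds), set()
    affected = [len(infected)]
    for _ in range(max_steps):
        if not infected:
            break
        new = set()
        for v in infected:
            u = rng.choice(adj[v])
            if u not in infected and u not in recovered and rng.random() < mu:
                new.add(u)
        for v in list(infected):       # recover after this step's infections
            if rng.random() < beta:
                infected.discard(v)
                recovered.add(v)
        infected |= new
        affected.append(len(infected) + len(recovered))
    return recovered, affected
```

the per-step `affected` list corresponds to n·f(t) in the metrics discussed later; with beta > 0 the process terminates with probability one.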
when µ is much larger than µ c, nearly all methods affect the whole network, which makes comparison meaningless. thus, we select µ around µ c in the experiments. during the sir propagation described above, enough information can be obtained to evaluate the impact of the initially selected nodes in the network; the metrics derived from the procedure are explained in section . . the influential-node selection algorithm proposed in this paper is named enrenew, after the core concept of the algorithm: enrenew introduces entropy and renews the nodes' entropy through an iterative selection process. enrenew is inspired by the voterank algorithm proposed by zhang et al. [ ], in which influential nodes are selected in an iterative voting procedure. voterank assigns each node a voting ability and a score; initially, each node's voting ability toward its neighbors is 1. after a node is selected, its direct neighbors' voting ability is decreased by 1/⟨k⟩, where ⟨k⟩ = 2m/n is the average degree of the network. voterank thus assigns the same voting ability and the same attenuation factor to every node in the graph, which ignores nodes' local information. to overcome this shortcoming, we propose a heuristic algorithm named enrenew, described as follows. in information theory, the information quantity measures the information brought about by a specific event, and information entropy is the expectation of the information quantity. these two concepts were introduced into complex networks in references [ ] [ ] [ ] to calculate the importance of nodes. the information entropy of any node v can be calculated by e_v = ∑_{u∈Γ_v} h_uv, with h_uv = −p_uv log p_uv and p_uv = d_u / ∑_{l∈Γ_v} d_l (so that ∑_{l∈Γ_v} p_lv = 1), where Γ_v indicates node v's direct neighbors and d_u is the degree of node u. h_uv is the spreading ability provided from u to v, and e_v is node v's information entropy, indicating its initial importance, which is renewed as described in algorithm . a detailed calculation of node entropy is shown in figure .
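the entropy definition above can be sketched directly from the formulas; this is a minimal illustration (function name is ours), assuming an adjacency dict with no isolated nodes.

```python
import math

def node_entropy(adj):
    """Initial importance E_v = -sum_{u in N(v)} p_uv * log(p_uv), with
    p_uv = d_u / sum_{l in N(v)} d_l, i.e. neighbours' degrees normalised
    over v's own neighbourhood."""
    deg = {v: len(ns) for v, ns in adj.items()}
    entropy = {}
    for v, ns in adj.items():
        z = sum(deg[u] for u in ns)   # normalisation constant over N(v)
        entropy[v] = -sum((deg[u] / z) * math.log(deg[u] / z) for u in ns)
    return entropy
```

for a star graph, for example, the centre's entropy is log of its degree (a uniform distribution over its leaves), while every leaf's entropy is zero, matching the intuition that hubs carry the most initial importance.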
it shows in detail how the red node's (node ) entropy is calculated. node has four neighbors, from node to node , and its information entropy is calculated from them accordingly. simply selecting nodes by degree as initial spreaders might not achieve good results, because most real networks show an obvious clumping phenomenon; that is, high-impact nodes in the network are often closely connected within the same community, so information cannot be widely disseminated to the whole network. to manage this situation, after each high-impact node is selected, we renew the information entropy of all nodes in its local scope and then select the node with the highest information entropy; this process is shown in algorithm . here e_⟨k⟩ = −⟨k⟩ · (1/⟨k⟩) · log(1/⟨k⟩) = log⟨k⟩, where ⟨k⟩ is the average degree of the network, and 1/2^(l−1) is the attenuation factor: the farther a node is from node v, the smaller the impact on it. e_⟨k⟩ can be seen as the information entropy of any node in a ⟨k⟩-regular graph, if ⟨k⟩ is an integer. from algorithm , we can see that after a new node is selected, the renewal of its l-length reachable nodes' information entropy depends on h and e_⟨k⟩, which reflect local structure information and global network information, respectively. compared with voterank, enrenew replaces the voting ability by the h value between connected nodes, which incorporates more local information than the uniform voting ability of voterank. at the same time, enrenew uses h/e_⟨k⟩ as the attenuation factor instead of 1/⟨k⟩ as in voterank, retaining global information. computational complexity (usually time complexity) describes the relationship between inputs of different scales and the running time of an algorithm. generally, brute force can solve most problems accurately, but it cannot be applied in most scenarios because of its intolerable time complexity. time complexity is thus an extremely important indicator of an algorithm's effectiveness.
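the selection-plus-renewal loop can be sketched as follows for the l = 1 case. this is our reading of the description above, not the published implementation: we decrement each neighbour's entropy by h_vu / e_⟨k⟩, which reduces to voterank's 1/⟨k⟩ decrement on a ⟨k⟩-regular graph; the exact renewal rule of the original algorithm may differ in detail. the graph is assumed connected with ⟨k⟩ > 1.

```python
import math

def enrenew_l1(adj, r):
    """Sketch of EnRenew-style iterative selection (l = 1 only): repeatedly
    take the node with the largest information entropy, then damp its
    direct neighbours' entropy by H_vu / E_<k>."""
    deg = {v: len(ns) for v, ns in adj.items()}
    k_avg = sum(deg.values()) / len(adj)
    e_k = math.log(k_avg)              # entropy of a node in a <k>-regular graph
    h, ent = {}, {}
    for v, ns in adj.items():
        z = sum(deg[u] for u in ns)
        for u in ns:
            p = deg[u] / z
            h[(u, v)] = -p * math.log(p)   # spreading ability from u to v
        ent[v] = sum(h[(u, v)] for u in ns)
    chosen = []
    for _ in range(r):
        v = max((u for u in adj if u not in chosen), key=lambda u: ent[u])
        chosen.append(v)
        for u in adj[v]:               # renew direct neighbours only (l = 1)
            if u not in chosen:
                ent[u] -= h[(v, u)] / e_k
    return chosen
```

on a small graph of two triangles joined by a bridge, the first pick is one of the two bridge endpoints (the highest-entropy nodes), and the damping then discourages picking its close neighbours next.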
through this analysis, the algorithm is shown to be able to identify influential nodes in large-scale networks in limited time. the computational complexity of enrenew can be analyzed in three parts: initialization, selection, and renewing. n, m, and r represent the number of nodes, edges, and initial infected nodes, respectively. at the start, enrenew takes o(n · ⟨k⟩) = o(m) to calculate the information entropies. node selection picks the node with the largest information entropy and requires o(n), which can be further decreased to o(log n) if the entropies are stored in an efficient data structure such as a red-black tree. renewing the l-length reachable nodes' information entropy needs o(⟨k⟩^l) = o(m^l/n^l). as suggested in section . , l = yields impressive results with o(m/n). since the selection and renewing parts are performed r times to obtain enough spreaders, the final computational complexity is o(m + n) + o(r log n) + o(r⟨k⟩) = o(m + n + r log n + r m/n). in particular, when the network is sparse and r ≪ n, the complexity decreases to o(n). the algorithm's performance is measured by properties of the selected nodes, including their spreading ability and their locations. spreading ability is measured by the infected scale at time t, f(t), and the final infected scale f(t_c), which are obtained from sir simulations and widely used to measure the spreading ability of nodes [ , [ ] [ ] [ ] [ ] [ ]]. l_s is obtained from the selected nodes' locations by measuring their dispersion [ ]. the infected scale f(t) = (n_i(t) + n_r(t))/n quantifies the influence scale at time t, where n_i(t) and n_r(t) are the numbers of infected and recovered nodes at time t, respectively. at the same time step t, a larger f(t) indicates that more nodes have been infected by the initial influential nodes, while a shorter time t indicates that the initial influential nodes spread faster in the network. f(t_c) is the final affected scale when the spreading reaches its stable state; it reflects the final spreading ability of the initial spreaders.
the larger this value, the stronger the spreading capacity of the initial nodes. f(t_c) is defined by the same expression evaluated at t = t_c, where t_c is the time when the sir propagation procedure reaches its stable state. l_s is the average shortest path length of the initial infection set s; usually, a larger l_s means the initial spreaders are more dispersed and can influence a larger range. it is defined by l_s = ∑_{u,v∈s, u≠v} l_{u,v} / (|s|(|s| − 1)), where l_{u,v} denotes the length of the shortest path from node u to v. if u and v are disconnected, the shortest path length is replaced by d_gc + 1, where d_gc is the largest diameter among the connected components. the example network shown in figure is used to illustrate the rationality of the nodes the proposed algorithm chooses. the first three nodes selected by enrenew are distributed over three communities, while those selected by the other algorithms are not. we further ran the sir simulation on the example network with enrenew and five other benchmark methods. the detailed results, obtained by averaging repeated experiments, are shown in table for an in-depth discussion. the network in the figure consists of three communities at different scales; the first nine nodes selected by enrenew are marked red. the network typically shows the rich-club phenomenon, that is, nodes with large degree tend to be connected to each other. table shows the experiment results when choosing nodes as the initial spreading set. the greedy method is usually used as the upper bound, but it is not efficient in large networks due to its high time complexity. enrenew and pagerank place nodes in community , nodes in community , and node in community ; this distribution matches the community sizes. however, the nodes selected by the other algorithms, except the greedy method, tend to cluster in community . this confines spreading to a high-density area, which is inefficient for spreading over the entire network. like the greedy method, enrenew and pagerank adaptively allocate a reasonable number of nodes based on the size of each community.
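the dispersion measure l_s defined above can be computed with plain breadth-first search; this is a minimal sketch (function names are ours), including the d_gc + 1 substitution for disconnected pairs.

```python
from collections import deque
from itertools import combinations

def _bfs(adj, src):
    """Hop distances from src to every node reachable from it."""
    dist, q = {src: 0}, deque([src])
    while q:
        v = q.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    return dist

def dispersion_ls(adj, seeds):
    """L_s: mean shortest-path length over all pairs of seed nodes; a
    disconnected pair contributes d_gc + 1, where d_gc is the largest
    diameter among the connected components."""
    # largest component diameter = max eccentricity over all nodes
    d_gc = max(max(_bfs(adj, v).values()) for v in adj)
    total = pairs = 0
    for u, v in combinations(seeds, 2):
        total += _bfs(adj, u).get(v, d_gc + 1)
        pairs += 1
    return total / pairs
```

the all-pairs eccentricity scan is fine for small graphs; for large networks one would restrict the bfs to the seed set and cache distances.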
nodes selected by enrenew have the second largest average distance, behind only the greedy method, which indicates that enrenew tends to distribute nodes sparsely in the graph. this aptly alleviates the adverse effect on spreading caused by the rich-club phenomenon. although enrenew's average distance is smaller than pagerank's, it has a higher final infected scale f(t_c). the result for pagerank also indicates that merely selecting nodes spread widely across the network does not necessarily lead to a larger influence range. enrenew performs closest to the greedy method, at a low computational cost, which shows the proposed algorithm's effectiveness at maximizing influence with a limited number of nodes. note: n and m are the total numbers of nodes and edges, respectively; ⟨k⟩ = 2m/n stands for the average node degree; k_max = max_{v∈V} d_v is the maximum degree in the network; and the average clustering coefficient c = (1/n) ∑_{i=1}^{n} 2 i_i / (|Γ_i|(|Γ_i| − 1)), where i_i denotes the number of edges between direct neighbors of node i, measures the degree of aggregation in the network. table describes six networks, varying from small to large scale, which are used to evaluate the performance of the methods. cenew [ ] is the edge list of the metabolic network of c. elegans. email [ ] is an email user communication network. hamster [ ] reflects friendship and family links between users of the website http://www.hamsterster.com, where nodes represent web users and edges the relationships between them. the router network [ ] reflects the internet topology at the router level. condmat (condensed matter physics) [ ] is a collaboration network of authors of scientific papers from the arxiv, showing author collaboration in papers submitted to condensed matter physics: a node represents an author, and an edge between two nodes indicates that the two authors have published papers together.
in the amazon network [ ], each node represents a product, and an edge between two nodes indicates that the two products were frequently purchased together. we first conduct experiments on the parameter l, the influence range used when renewing the information entropy. if l = , only the importance of the selected node's direct neighbors is renewed; if l = , the importance of -length reachable nodes is renewed, and so forth. the results with l varying from to on the networks are shown in figure . it can be seen that when l = , the method achieves the best performance in four of the six networks. on the email network, although the results for l = and l = are slightly better than for l = , the running time increases sharply. besides, the three degrees of influence (tdi) theory [ ] states that an individual's social influence extends only over a relatively small range. based on our experiments, we set the influence range parameter l to for the remaining experiments. for a given ratio of initial infected nodes p, a larger final affected scale f(t_c) indicates a more suitable value of the parameter l. the best l differs between networks; in real-life applications, l can be treated as a tuning parameter. many factors affect the final propagation scale in networks. a good influential-node mining algorithm should prove its robustness in networks varying in structure, node count, initial infection set size, infection probability, and recovery probability. to evaluate the performance of enrenew, the voterank, adaptive degree, k-shell, pagerank, and h-index algorithms are selected as benchmark methods for comparison. furthermore, the greedy method is usually taken as an upper bound for the influence maximization problem, but it is not practical on large networks due to its high time complexity; we therefore add the greedy method as an upper bound on the two small networks (cenew and email) only.
the final affected scale f(t_c) of each method for different initial infected sizes is shown in figure . it can be seen that enrenew achieves impressive results on the six networks. on the small networks, cenew and email, enrenew clearly outperforms the other benchmark methods, and it nearly reaches the upper bound on the email network. on the hamster network, it achieves an f(t_c) of . with an initial infected ratio of only . , a huge improvement over all the other methods. on the condmat network, the number of affected nodes is nearly times the number of initial ones. on the large amazon network, nodes are affected on average for each selected initial infected node. however, the algorithm performs unsatisfactorily on the router network: none of the methods yields good results there, owing to the network's highly sparse structure, in which information can hardly spread out from a small number of initial spreaders. comparing the methods in figure , enrenew surpasses all the others on five networks for nearly all values of p, from small to large. this reveals that enrenew retains its superiority as the size of the initial infected set varies. it is worth noticing that enrenew performs about the same as the other methods when p is small but improves more as the initial infected ratio p rises. this behavior supports the rationality of the importance-renewing process: the renewing process influences more nodes when p is larger, and the greater improvement of enrenew over the other methods shows that the renewing process reasonably redistributes nodes' importance. a time-step experiment was conducted to assess the propagation speed for a fixed number of initial infected nodes. the results for f(t) as a function of the time step t are shown in figure .
from this experiment, it can be seen that with the same number of initial infected nodes, enrenew always reaches a higher peak than the benchmark methods, indicating a larger final infection scale. in the steady stage, enrenew surpasses the best benchmark method by . %, . %, . %, . %, . %, and . % in final affected scale on the cenew, email, hamster, router, condmat, and amazon networks, respectively. in terms of propagation speed, enrenew reaches the peak at about the th time step on cenew, the th on email, the th on hamster, the th on router, the th on condmat, and the th on amazon. enrenew always takes less time to influence the same number of nodes than the other benchmark methods. from figure , it can also be seen that k-shell performs worst from the early stage onward in all the networks: nodes with high core values tend to cluster together, which makes information hard to disseminate. especially on the amazon network, after time steps, all other methods reach an f(t) of . , more than twice as large as k-shell's. in contrast to k-shell, enrenew spreads the fastest from the early stage to the steady stage. this shows that the proposed method not only achieves a larger final infection scale but also a faster rate of propagation. in real-life situations, the infection rate λ varies greatly and has a huge influence on the propagation process; different values of λ represent viruses or information with different spreading abilities. the results for different λ and methods are shown in figure . from the experiments, it can be observed that in most cases enrenew surpasses all other algorithms for λ varying from . to . on all networks. moreover, the results on cenew and email show that enrenew nearly reaches the upper bound. this indicates that enrenew generalizes better than the other methods; in particular, enrenew shows impressive superiority in strong-spreading experiments, when λ is large.
generally speaking, if the selected nodes are widely spread over the network, they tend to have an extensive influence on information spreading in the entire network. l_s is used to measure the dispersion of the initial infected nodes chosen by each algorithm. figure shows the l_s of the nodes selected by the different algorithms on the different networks. it can be seen that, except for the amazon network, enrenew always has the largest l_s, indicating that the selected nodes are widely spread. especially on cenew, enrenew performs far beyond all the other methods, as its l_s is nearly as large as the upper bound. as for the large-scale amazon network, it contains many small cliques, and k-shell selects dispersed cliques, which gives k-shell the largest l_s; but k-shell's other experimental results show poor performance. this further confirms that enrenew does not naively distribute the selected nodes widely across the network, but rather distributes them according to the potential propagation ability of each node. figure . this experiment compares the different methods with regard to spreading speed; each subfigure shows the results on one network. the ratio of initial infected nodes is % for cenew, email, hamster, and router, . % for condmat, and . % for amazon. the results are obtained by averaging over independent runs with spreading rate λ = . in sir. at the same spreading time t, a larger f(t) indicates a larger influence scale in the network, revealing a faster spreading speed. it can be seen from the figures that enrenew spreads noticeably faster than the other benchmark methods on all networks; on the small networks cenew and email, enrenew's spreading speed is close to the upper bound. figure . this experiment tests the algorithms' effectiveness under different spreading conditions; each subfigure shows the results on one network. the ratio of initial infected nodes is % for cenew, email, hamster, and router, . % for condmat, and . % for amazon.
the results are obtained by averaging over independent runs. different infection rates λ in sir imitate different spreading conditions. enrenew obtains a larger final affected scale f(t_c) for the different λ than all the other benchmark methods, which indicates that the proposed algorithm generalizes better across spreading conditions. figure . this experiment analyzes the average shortest path length l_s of the nodes selected by the different algorithms; each subfigure shows the results on one network, and p is the ratio of initial infected nodes. generally speaking, a larger l_s indicates that the selected nodes are more sparsely distributed in the network. it can be seen that the nodes selected by enrenew have clearly the largest l_s on five networks, showing that enrenew tends to select sparsely distributed nodes. the influential-node identification problem has been widely studied by scientists from computer science and many other disciplines [ ] [ ] [ ] [ ] [ ]. the various algorithms that have been proposed aim to solve particular problems in this field. in this study, we proposed a new method named enrenew by introducing entropy into complex networks, and the sir model was adopted to evaluate the algorithms. experimental results on real networks, varying from small to large in size, show that enrenew is superior to state-of-the-art benchmark methods in most cases. besides, with its low computational complexity, the presented algorithm can be applied to large-scale networks. the enrenew algorithm proposed in this paper can also be applied to rumor control, advertisement targeting, and many other related areas. nevertheless, for influential-node identification, many challenges remain from different perspectives. from the perspective of network size, how to mine influential spreaders in large-scale networks efficiently is a challenging problem.
in the area of time-varying networks, the contact structure is constantly changing, which poses the challenge that influential spreaders may shift with the changing topology. as for multilayer networks, they contain information from different dimensions with interactions between layers and have attracted much research interest [ ] [ ] [ ]. to identify influential nodes in multilayer networks, we need to further consider how to better combine information from different layers and the relations between them. the scientific collaboration networks in university management in brazil arenas, a. self-similar community structure in a network of human interactions insights into protein-dna interactions through structure network analysis statistical analysis of the indian railway network: a complex network approach social network analysis network analysis in the social sciences prediction in complex systems: the case of the international trade network the dynamics of viral marketing extracting influential nodes on a social network for information diffusion structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review efficient immunization strategies for computer networks and populations a study of epidemic spreading and rumor spreading over complex networks epidemic processes in complex networks unification of theoretical approaches for epidemic spreading on complex networks epidemic spreading in time-varying community networks suppression of epidemic spreading in complex networks by local information based behavioral responses efficient allocation of heterogeneous response times in information spreading process absence of influential spreaders in rumor dynamics a model of spreading of sudden events on social networks daniel bernoulli's epidemiological model revisited herd immunity: history, theory, practice epidemic disease in england: the evidence of variability and of persistency of type
infectious diseases of humans: dynamics and control thermodynamic efficiency of contagions: a statistical mechanical analysis of the sis epidemic model a rumor spreading model based on information entropy an algorithmic information calculus for causal discovery and reprogramming systems the hidden geometry of complex, network-driven contagion phenomena extending centrality the h-index of a network node and its relation to degree and coreness identifying influential nodes in complex networks identifying influential nodes in large-scale directed networks: the role of clustering collective dynamics of ?small-world?networks identification of influential spreaders in complex networks ranking spreaders by decomposing complex networks eccentricity and centrality in networks the centrality index of a graph a set of measures of centrality based on betweenness a new status index derived from sociometric analysis mutual enhancement: toward an understanding of the collective preference for shared information factoring and weighting approaches to status scores and clique identification dynamical systems to define centrality in social networks the anatomy of a large-scale hypertextual web search engine leaders in social networks, the delicious case using mapping entropy to identify node centrality in complex networks path diversity improves the identification of influential spreaders how to identify the most powerful node in complex networks? 
a novel entropy centrality approach a novel entropy-based centrality approach for identifying vital nodes in weighted networks node importance ranking of complex networks with entropy variation key node ranking in complex networks: a novel entropy and mutual information-based approach a new method to identify influential nodes based on relative entropy influential nodes ranking in complex networks: an entropy-based approach discovering important nodes through graph entropy the case of enron email database identifying node importance based on information entropy in complex networks ranking influential nodes in complex networks with structural holes ranking influential nodes in social networks based on node position and neighborhood detecting rich-club ordering in complex networks maximizing the spread of influence through a social network efficient influence maximization in social networks identifying sets of key players in a social network a shapley value-based approach to discover influential nodes in social networks identifying a set of influential spreaders in complex networks identifying effective multiple spreaders by coloring complex networks effects of the distance among multiple spreaders on the spreading identifying multiple influential spreaders in term of the distance-based coloring identifying multiple influential spreaders by a heuristic clustering algorithm spin glass approach to the feedback vertex set problem effective spreading from multiple leaders identified by percolation in the susceptible-infected-recovered (sir) model finding influential communities in massive networks community-based influence maximization in social networks under a competitive linear threshold model a community-based algorithm for influence blocking maximization in social networks detecting community structure in complex networks via node similarity community structure detection based on the neighbor node degree information community-based greedy algorithm for mining top-k 
influential nodes in mobile social networks identifying influential nodes in complex networks with community structure an efficient memetic algorithm for influence maximization in social networks efficient algorithms for influence maximization in social networks local structure can identify and quantify influential global spreaders in large scale social networks identifying influential spreaders in complex networks by propagation probability dynamics systematic comparison between methods for the detection of influential spreaders in complex networks vital nodes identification in complex networks sir rumor spreading model in the new media age stochastic sir epidemics in a population with households and schools thresholds for epidemic spreading in networks a novel top-k strategy for influence maximization in complex networks with community structure identifying influential spreaders in complex networks based on kshell hybrid method identifying key nodes based on improved structural holes in complex networks ranking nodes in complex networks based on local structure and improving closeness centrality an efficient algorithm for mining a set of influential spreaders in complex networks the large-scale organization of metabolic networks the koblenz network collection the network data repository with interactive graph analytics and visualization measuring isp topologies with rocketfuel graph evolution: densification and shrinking diameters defining and evaluating network communities based on ground-truth the spread of obesity in a large social network over years identifying the influential nodes via eigen-centrality from the differences and similarities of structure tracking influential individuals in dynamic networks evaluating influential nodes in social networks by local centrality with a coefficient a survey on topological properties, network models and analytical measures in detecting influential nodes in online social networks identifying influential spreaders in 
noisy networks spreading processes in multilayer networks identifying the influential spreaders in multilayer interactions of online social networks identifying influential spreaders in complex multilayer networks: a centrality perspective we would also like to thank dennis nii ayeh mensah for helping us revise the english of this paper. the authors declare no conflict of interest.

key: cord- -fgk n z authors: holme, petter title: objective measures for sentinel surveillance in network epidemiology date: - - journal: nan doi: . /physreve. . sha: doc_id: cord_uid: fgk n z

assume one has the capability of determining whether a node in a network is infectious or not by probing it. the problem of optimizing sentinel surveillance in networks is then to identify the nodes to probe such that an emerging disease outbreak can be discovered early or reliably. whether the emphasis should be on early or on reliable detection depends on the scenario in question. we investigate three objective measures from the literature that quantify the performance of nodes in sentinel surveillance: the time to detection or extinction, the time to detection, and the frequency of detection. as a basis for the comparison, we use the susceptible-infectious-recovered model on static and temporal networks of human contacts. we show that, for some regions of parameter space, the three objective measures can rank the nodes very differently. this means sentinel surveillance is a class of problems, and solutions need to choose an objective measure for the particular scenario in question. as opposed to other problems in network epidemiology, we draw similar conclusions from the static and temporal networks. furthermore, we do not find one type of network structure that predicts the objective measures; rather, which structure is predictive depends both on the data set and the sir parameter values.

infectious diseases are a big burden to public health.
their epidemiology is a topic in which the gap between the medical and theoretical sciences is not so large. several concepts of mathematical epidemiology, like the basic reproductive number or core groups [ ] [ ] [ ], have entered the vocabulary of medical scientists. traditionally, authors have modeled disease outbreaks in society by assuming that any person has the same chance of meeting anyone else at any time. this is of course not realistic, and improving this point is the motivation for network epidemiology: epidemic simulations between people connected by a network [ ]. one can continue increasing the realism of the contact patterns by observing that the timing of contacts can also have structures capable of affecting the disease. studying epidemics on time-varying contact structures is the basis of the emerging field of temporal network epidemiology [ ] [ ] [ ] [ ]. one of the most important questions in infectious disease epidemiology is to identify people, or in more general terms units, that would get infected early and with high likelihood in an infectious outbreak. this is the sentinel surveillance problem [ , ]. it is the aspect of node importance most actively used in public health practice. typically, it works by selecting some hospitals (clinics, cattle farms, etc.) to screen, or test more frequently, for a specific infection [ ]. defining an objective measure (a quantity to be maximized or minimized) for sentinel surveillance is not trivial. it depends on the particular scenario one considers and the means of intervention at hand. if the goal for society is to detect as many outbreaks as possible, it makes sense to choose sentinels that maximize the fraction of detected outbreaks [ ]. if the objective rather is to discover outbreaks early, then one could choose sentinels that, if infected, are infected early [ , ].

* holme@cns.pi.titech.ac.jp
finally, if the objective is to stop the disease as early as possible, it makes sense to measure the time to extinction or detection (infection of a sentinel) [ ] . see fig. for an illustration. to restrict ourselves, we will focus on the case of one sentinel. if one has more than one sentinel, the optimal set will most likely not be the top nodes of a ranking according to the three measures above. their relative positions in the network also matter (they should not be too close to each other) [ ] . in this paper, we study and characterize our three objective measures. we base our analysis on empirical data sets of contacts between people. we analyze them both as temporal and static networks. the reason we use empirical contact data, rather than generative models, as the basis of this study is twofold. first, there are so many possible structures and correlations in temporal networks that one cannot tune them all in models [ ] . it is also hard to identify the most important structures for a specific spreading phenomenon [ ] . second, studying empirical networks makes this paper-in addition to elucidating the objective measures of sentinel surveillance-a study of human interaction. we can classify data sets with respect to how the epidemic dynamics propagate on them. as mentioned above, in practical sentinel surveillance, the network in question is rather one of hospitals, clinics, or farms. one can, however, also think of sentinel surveillance of individuals, where high-risk individuals would be tested extra often for some diseases. in the remainder of the paper, we will describe the objective measures, the structural measures we use for the analysis, and the data sets, and we will present the analysis itself. we will primarily focus on the relation between the measures, secondarily on the structural explanations of our observations. assume that the objective of society is to end outbreaks as soon as possible. if an outbreak dies by itself, that is fine. 
otherwise, one would like to detect it so it can be mitigated by interventions. in this scenario, a sensible objective measure would be the time for a disease to either go extinct or be detected by a sentinel: the time to detection or extinction t x [ ] . suppose that, in contrast to the situation above, the priority is not to save society from the epidemic as soon as possible, but just to detect outbreaks fast. this could be the case if one would want to get a chance to isolate a pathogen, or start producing a vaccine, as early as possible, maybe to prevent future outbreaks of the same pathogen at the earliest opportunity. then one would seek to minimize the time for the outbreak to be detected, conditioned on the fact that it is detected: the time to detection t d . for the time to detection, it does not matter how likely it is for an outbreak to reach a sentinel. if the objective is to detect as many outbreaks as possible, the corresponding measure should be the expected frequency of outbreaks to reach a node: the frequency of detection f d . note that for this measure a large value means the node is a good sentinel, whereas for t x and t d a good sentinel has a low value. this means that when we correlate the measures, a similar ranking between t x and f d or t d and f d yields a negative correlation coefficient. instead of considering the inverse times, or similar, we keep this feature and urge the reader to keep it in mind. there are many possible ways to reduce our empirical temporal networks to static networks. the simplest method would be to just include a link between any pair of nodes that has at least one contact during the course of the data set. this would, however, make some of the networks so dense that the static network structure of the node-pairs most actively in contact would be obscured. for our purpose, we primarily want our networks to span many types of network structures that can impact epidemics. 
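as an illustrative sketch (not the paper's code), the three objective measures defined above can be estimated from repeated outbreak simulations; here each run is summarized as a hypothetical (time, detected) pair, where detected indicates whether the sentinel was ever infected:

```python
from statistics import mean

def objective_measures(runs):
    """Estimate the three sentinel-surveillance objective measures.

    `runs` is a list of (time, detected) pairs, one per simulated outbreak:
    `time` is when the outbreak reached the sentinel (if detected) or went
    extinct (if not); `detected` says whether the sentinel was infected.
    """
    times_all = [t for t, _ in runs]
    times_det = [t for t, detected in runs if detected]
    t_x = mean(times_all)                 # time to detection or extinction
    t_d = mean(times_det) if times_det else float("inf")  # time to detection
    f_d = len(times_det) / len(runs)      # frequency of detection
    return t_x, t_d, f_d
```

note how t d conditions on detection while t x averages over all runs; this is the source of the differing rankings discussed later.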
without any additional knowledge about the epidemics, the best option is to threshold the weighted graph, where an edge (i, j) means that i and j had more than θ contacts in the data set. in this work, we assume that we do not know what the per-contact transmission probability β is (this would anyway depend on both the disease and precise details of the interaction). rather, we scan through a very large range of β values. since we do that anyway, there is no need either to base the choice of θ on some epidemiological argument, or to rescale β after the thresholding. note that the rescaled β would be a non-linear function of the number of contacts between i and j . (assuming no recovery, for an isolated link with ν contacts, the transmission probability is 1 − (1 − β)^ν .) for our purpose, the only thing we need is that the rescaled β is a monotonic function of β for the temporal network (which is true). to follow a simple principle, we omit all links with a weight less than the median weight θ . we simulate disease spreading by the sir dynamics, the canonical model for diseases that give immunity upon recovery [ , ] . for static networks, we use the standard markovian version of the sir model [ ] . that is, we assume that the disease spreads over links between susceptible and infectious nodes in the infinitesimal time interval dt with probability β dt. then, an infectious node recovers after a time that is exponentially distributed with average 1/ν. the parameters β and ν are called the infection rate and recovery rate, respectively. we can, without loss of generality, put ν = 1/t (where t is the duration of the sampling). for other ν values, the ranking of the nodes would be the same (but the values of t x and t d would be rescaled by a factor ν). we will scan an exponentially increasing progression of values of β, from − to . the code for the disease simulations can be downloaded [ ] (github.com/pholme/sir). for the temporal networks, we use a definition as close as possible to the one above. 
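a minimal sketch of the thresholding step described above, assuming contacts are given as (i, j, t) triples and θ is the median link weight (function and variable names are ours, not the paper's):

```python
from collections import Counter
from statistics import median

def threshold_static_network(contacts):
    """Project a temporal contact list onto a static network.

    Links are weighted by their number of contacts; links with weight
    below the median weight theta are omitted, as described in the text.
    Returns the kept links as sorted (i, j) tuples.
    """
    weights = Counter(frozenset((i, j)) for i, j, _t in contacts)
    theta = median(weights.values())
    return {tuple(sorted(e)) for e, w in weights.items() if w >= theta}
```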
we assume an exponentially distributed duration of the infectious state with mean 1/ν. we assume a contact between an infectious and a susceptible node results in a new infection with probability β. in the case of temporal networks, one cannot reduce the problem to one parameter. like for static networks, we sample the parameter values in exponential sequences in the intervals . β and . ν/t respectively. for temporal networks, with our interpretation of a contact, β > 1 makes no sense, which explains the upper limit. furthermore, since temporal networks usually are effectively sparser (in terms of the number of possible infection events per time), the smallest β values will give similar results, which is the reason for the higher cutoff in this case. for both temporal and static networks, we assume the outbreak starts at one randomly chosen node. analogously, in the temporal case we assume the disease is introduced with equal probability at any time throughout the sampling period. for every data set and set of parameter values, we sample runs of epidemic simulations. as motivated in the introduction, we base our study on empirical temporal networks. all networks that we study record contacts between people and fall into two classes: human proximity networks and communication networks. proximity networks are, of course, most relevant for epidemic studies, but communication networks can serve as a reference (and it is interesting to see how general results are over the two classes). the data sets consist of anonymized lists of two identification numbers in contact and the time since the beginning of the contact. many of the proximity data sets we use come from the sociopatterns project [ ] . these data sets were gathered by people wearing radio-frequency identification (rfid) sensors that detect proximity between and . m. 
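the temporal-network sir dynamics described above can be sketched as follows (a simplified illustration, not the downloadable simulation code; the contact format and all names are our assumptions):

```python
import random

def temporal_sir(contacts, n, beta, nu, seed_node, t0=0.0):
    """Run one SIR outbreak on a time-ordered contact list.

    `contacts` is a list of (t, i, j) triples sorted by time. A contact
    between an infectious and a susceptible node transmits with
    probability beta; infectious periods are exponential with mean 1/nu.
    Returns the final state ('S', 'I', or 'R') of every node.
    """
    state = {v: "S" for v in range(n)}
    recovery = {}
    state[seed_node] = "I"
    recovery[seed_node] = t0 + random.expovariate(nu)
    for t, i, j in contacts:
        if t < t0:
            continue  # disease introduced at time t0, per the text
        for u in (i, j):  # apply recoveries that happened before this contact
            if state[u] == "I" and t >= recovery[u]:
                state[u] = "R"
        for u, v in ((i, j), (j, i)):  # possible transmission either way
            if state[u] == "I" and state[v] == "S" and random.random() < beta:
                state[v] = "I"
                recovery[v] = t + random.expovariate(nu)
    return state
```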
one such data set comes from a conference, hypertext , (conference ) [ ] , another two from a primary school (primary school) [ ] and five from a high school (high school) [ ] , a third from a hospital (hospital) [ ] , a fourth set of five data sets from an art gallery (gallery) [ ] , a fifth from a workplace (office) [ ] , and a sixth from members of five families in rural kenya [ ] . the gallery data sets consist of several days, of which we use the first five. in addition to data gathered by rfid sensors, we also use data from the longer-range (around m) bluetooth channel. the cambridge [ ] and [ ] datasets were measured by the bluetooth channel of sensors (imotes) worn by people in and around cambridge, uk. st andrews [ ] , conference [ ] , and intel [ ] are similar data sets tracing contacts at, respectively, the university of st. andrews, the conference infocom , and the intel research laboratory in cambridge, uk. the reality [ ] and copenhagen bluetooth [ ] data sets also come from bluetooth data, but from smartphones carried by university students. in the romania data, the wifi channel of smartphones was used to log the proximity between university students [ ] , whereas the wifi dataset links students of a chinese university who are logged onto the same wifi router. for the diary data set, a group of colleagues and their family members self-recorded their contacts [ ] . our final proximity data set, the prostitution network, comes from self-reported sexual contacts between female sex workers and their male sex buyers [ ] . this is a special form of proximity network, since contacts represent more than just proximity. among the data sets from electronic communication, facebook comes from the wall posts at the social media platform facebook [ ] . college is based on communication at a facebook-like service [ ] . dating shows interactions at an early internet dating website [ ] . 
messages and forum are similar records of interaction at a film community [ ] . copenhagen calls and copenhagen sms consist of phone calls and text messages gathered in the same experiment as copenhagen bluetooth [ ] . finally, we use four data sets of e-mail communication. one, e-mail , recorded all e-mails to and from a group of accounts [ ] . the other three, e-mail [ ] , [ ] , and [ ] , recorded e-mails within a set of accounts. we list basic statistics-sizes, sampling durations, etc.-of all the data sets in table i . to gain further insight into the network structures promoting the objective measures, we correlate the objective measures with quantities describing the position of a node in the static networks. since many of our networks are fragmented into components, we restrict ourselves to measures that are well defined for disconnected networks. otherwise, in our selection, we strive to cover as many different aspects of node importance as we can. degree is simply the number of neighbors of a node. it is usually presented as the simplest measure of centrality and is one of the most discussed structural predictors of importance with respect to disease spreading [ ] . (centrality is a class of measures of a node's position in a network that try to capture what a "central" node is; i.e., ultimately centrality is no more well defined than the vernacular word.) it is also a local measure in the sense that a node is able to estimate its own degree, which could be practical when evaluating sentinel surveillance in real networks. subgraph centrality is based on the number of closed walks a node is a member of. (a walk is like a path, but one that may overlap itself.) the number of closed walks of length λ from node i to itself is given by (a^λ) ii , where a is the adjacency matrix. 
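for illustration, the closed-walk counts above can be combined into the subgraph centrality by weighting walks of length λ by 1/λ! (a brute-force sketch for a small adjacency matrix; the truncation length is our choice, not from the paper):

```python
import math

def subgraph_centrality(adj_matrix, i, max_len=30):
    """Subgraph centrality of node i: closed walks of every length lambda,
    weighted by 1/lambda!  (equivalently, the i-th diagonal entry of the
    matrix exponential exp(A)).  Truncated power series for small graphs.
    """
    n = len(adj_matrix)
    power = [row[:] for row in adj_matrix]  # holds A^lambda, starting at A^1
    total = 0.0
    for lam in range(1, max_len + 1):
        total += power[i][i] / math.factorial(lam)
        power = [[sum(power[r][k] * adj_matrix[k][c] for k in range(n))
                  for c in range(n)] for r in range(n)]  # A^(lambda+1)
    return 1.0 + total  # the lambda = 0 term contributes 1
```

for a single edge, the diagonal of exp(A) is cosh(1), which the truncated series reproduces.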
reference [ ] argues that the best way to weigh walks of different lengths together is through the formula c s (i) = Σ λ (a^λ) ii / λ! , i.e., the i-th diagonal element of the matrix exponential e^a. as mentioned, several of the data sets are fragmented (even though the largest connected component dominates components of other sizes). in the limit of high transmission probabilities, all nodes in the component of the infection seed will be infected. in such a case, it would make sense to place a sentinel in the largest component (where the disease most likely starts). (table i gives basic statistics of the empirical temporal networks: n is the number of nodes, c is the number of contacts, t is the total sampling time, t is the time resolution of the data set, m is the number of links in the projected and thresholded static networks, and θ is the threshold.) closeness centrality builds on the assumption that a node that has, on average, short distances to other nodes is central [ ] . here, the distance d(i, j) between nodes i and j is the number of links in the shortest paths between the nodes. the classical measure of closeness centrality of a node i is the reciprocal average distance between i and all other nodes. in a fragmented network, for every node there will be some other node that it does not have a path to, meaning that the closeness centrality is ill defined. (assigning the distance infinity to disconnected pairs would give the closeness centrality zero for all nodes.) a remedy for this is, instead of measuring the reciprocal average of distances, to measure the average reciprocal distance [ ] , where d −1 (i, j) = 0 if i and j are disconnected. we call this the harmonic closeness, by analogy to the harmonic mean. vitality measures are a class of network descriptors that capture the impact of deleting a node on the structure of the entire network [ , ] . specifically, we measure the harmonic closeness vitality, or harmonic vitality for short. 
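a sketch of the harmonic closeness for possibly fragmented networks (breadth-first search on an adjacency-list dictionary; the naming is ours):

```python
from collections import deque

def harmonic_closeness(adj, i):
    """Harmonic closeness: average reciprocal shortest-path distance from i,
    with 1/d = 0 for unreachable nodes, so the measure stays well defined
    on fragmented networks (unlike classical closeness centrality).
    """
    dist = {i: 0}
    queue = deque([i])
    while queue:  # breadth-first search for all reachable nodes
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    n = len(adj)
    return sum(1 / d for v, d in dist.items() if v != i) / (n - 1)
```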
this is the change of the sum of reciprocal distances of the graph (thus, by analogy to the harmonic closeness, well defined even for disconnected graphs): c v (i) is the sum of reciprocal distances over node pairs in the graph g divided by the same sum in the graph with the node i deleted, so the denominator concerns the graph g with the node i deleted. if deleting i breaks many shortest paths, then c c (i) decreases, and thus c v (i) increases. a node whose removal disrupts many shortest paths would thus score high in harmonic vitality. our sixth structural descriptor is coreness. this measure comes out of a procedure called k-core decomposition. first, remove all nodes with degree k = 1. if this creates new nodes with degree one, delete them too. repeat this until there are no nodes of degree 1. then, repeat the above steps for larger k values. the coreness of a node is the last level at which it is present in the network during this process [ ] . as for the static networks, in the temporal networks we measure the degree of the nodes. to be precise, we define the degree as the number of distinct other nodes a node is in contact with within the data set. strength is the total number of contacts a node has participated in throughout the data set. unlike degree, it takes the number of encounters into account. temporal networks, in general, tend to be more disconnected than static networks. for node i to be connected to j in a temporal network, there has to be a time-respecting path from i to j , i.e., a sequence of contacts increasing in time that (if time is projected out) is a path from i to j [ , ] . thus two interesting quantities-corresponding to the component sizes of static networks-are the fraction of nodes reachable from a node by time-respecting paths forward in time (downstream component size) and backward in time (upstream component size) [ ] . if a node only exists in the very early stage of the data, the sentinel will likely not be active by the time the outbreak happens. if a node is active only at the end of the data set, it would also be too late to discover an outbreak early. 
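the k-core decomposition described above can be sketched as follows (an illustrative implementation, not the paper's code):

```python
def coreness(adj):
    """k-core decomposition on an adjacency-list dictionary.

    Repeatedly strip nodes of degree <= k for increasing k; a node's
    coreness is the level k at which it is removed (the last level at
    which it is present in the network).
    """
    deg = {v: len(neighbors) for v, neighbors in adj.items()}
    core = {}
    remaining = set(adj)
    k = 0
    while remaining:
        low = [v for v in remaining if deg[v] <= k]
        if not low:
            k += 1          # nothing left to strip at this level
            continue
        for v in low:       # remove nodes and update neighbor degrees
            core[v] = k
            remaining.discard(v)
            for u in adj[v]:
                if u in remaining:
                    deg[u] -= 1
    return core
```

for a triangle with one pendant node, the pendant has coreness 1 and the triangle nodes coreness 2.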
for these reasons, we measure statistics of the times of the contacts of a node. we measure the average time of all contacts a node participates in; the first time of a contact (i.e., when the node enters the data set); and the duration of the presence of a node in the data (the time between the first and last contact it participates in). we use a version of the kendall τ coefficient [ ] to elucidate both the correlations between the three objective measures, and between the objective measures and the network structural descriptors. in its basic form, the kendall τ measures the difference between the number of concordant (with a positive slope between them) and discordant pairs, relative to all pairs. there are a few different versions that handle ties in different ways. we count a pair of points whose error bars overlap as a tie and calculate τ = (n c − n d )/(n c + n d + n t ), where n c is the number of concordant pairs, n d is the number of discordant pairs, and n t is the number of ties. we start by investigating the correlation between the three objective measures throughout the parameter space of the sir model for all our data sets. we use the time to detection or extinction as our baseline and compare the other two objective measures with it. in fig. , we plot the τ coefficient between t x and t d and between t x and f d . we find that for low enough values of β, the τ for all objective measures coincide. for very low β the disease just dies out immediately, so the measures are trivially equal: all nodes would be equally good sentinels in all three aspects. for slightly larger β-for most data sets . < β < . -both τ (t x , t d ) and τ (t x , f d ) are negative. this is a region where outbreaks typically die out early. for a node to have a low t x , it needs to be where outbreaks are likely to survive, at least for a while. this translates to a large f d , while for t d , it would be beneficial to be as central as possible. if there are no extinction events at all, t x and t d are the same. 
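the tie-aware kendall τ above can be sketched as follows (the tie predicate stands in for the "error bars overlap" test; the equality-based default is our simplification):

```python
def kendall_tau_with_ties(points, is_tie=None):
    """Kendall tau as defined in the text: tau = (n_c - n_d)/(n_c + n_d + n_t).

    `points` is a list of (x, y) observations; `is_tie` is a predicate on
    two points (e.g., overlapping error bars).  By default, a pair sharing
    an x or y value counts as a tie.
    """
    if is_tie is None:
        is_tie = lambda p, q: p[0] == q[0] or p[1] == q[1]
    n_c = n_d = n_t = 0
    for a in range(len(points)):
        for b in range(a + 1, len(points)):
            if is_tie(points[a], points[b]):
                n_t += 1
            elif (points[a][0] - points[b][0]) * (points[a][1] - points[b][1]) > 0:
                n_c += 1  # concordant: positive slope between the pair
            else:
                n_d += 1  # discordant
    return (n_c - n_d) / (n_c + n_d + n_t)
```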
for this reason, it is no surprise that, for most of the data sets, τ (t x , t d ) becomes strongly positive for large β values. the τ (t x , f d ) correlation is negative (of a similar magnitude), meaning that for most data sets the different methods would rank the possible sentinels in the same order. for some of the data sets, however, the correlation never becomes positive even for large β values (like copenhagen calls and copenhagen sms). these networks are the most fragmented ones, meaning that one sentinel would be unlikely to detect the outbreak (since it probably happens in another component). this makes t x rank the important nodes in a way similar to f d , but since diseases that do reach a sentinel do so faster in a small component than in a large one, t x and t d become anticorrelated. in fig. , we perform the same analysis as in the previous section but for static networks. the picture is to some extent similar, but also much richer. just as for the temporal networks, τ (t x , f d ) is always nonpositive, meaning the time to detection or extinction ranks the nodes in a way positively correlated with the frequency of detection. furthermore, like the temporal networks, τ (t x , t d ) can be both positively and negatively correlated. this means that there are regions where t d ranks the nodes in the opposite way than t x . these regions of negative τ (t x , t d ) occur for low β and ν. for some data sets-for example the gallery data sets, dating, copenhagen calls, and copenhagen sms-the correlations are negative throughout the parameter space. among the data sets with a qualitative difference between the static and temporal representations, we find that prostitution and e-mail both have strongly positive values of τ (t x , t d ) for large β values in the static networks but moderately negative values for the temporal networks. in this section, we take a look at how network structures affect our objective measures. in fig. 
, we show the correlation between our three objective measures and the structural descriptors as a function of β for the office data set. panel (a) shows the results for the time to detection or extinction. there is a negative correlation between this measure and traditional centrality measures like degree or subgraph centrality. this is because t x is a quantity one wants to minimize to find the optimal sentinel, whereas for all the structural descriptors a large value means that a node is a candidate sentinel node. we see that degree and subgraph centrality are the two quantities that best predict the optimal sentinel location, while coreness is also close (at around − . ). this is in line with research showing that certain biological problems are better determined by degree than by more elaborate centrality measures [ ] . overall, the τ curves are rather flat, which is partly explained by τ being a rank correlation. for t d [ fig. (b) ], most curves change behavior around β = . . this is the region where larger outbreaks can happen, so one can understand that there is a transition to a situation similar to t x [ fig. (a) ]. f d [ fig. (c) ] shows a behavior similar to t d in that the curves start changing order, and what was a correlation at low β becomes an anticorrelation at high β. this anticorrelation is a special feature of this particular data set, perhaps due to its pronounced community structure. nodes of degree , , and have strictly increasing values of f d , but for some of the high-degree nodes (that all have f d close to one) the ordering is anticorrelated with degree, which makes kendall's τ negative. since rank-based correlations are more principled for the skew-distributed quantities common in networks, we keep them. we are currently investigating what creates these unintuitive anticorrelations among the high-degree nodes in this data set. next, we proceed with an analysis of all data sets. we summarize plots like fig. 
by the structural descriptor with the largest magnitude of the correlation |τ |; see fig. . we can see that there is not one structural quantity that uniquely determines the ranking of the nodes; there is not even one that dominates overall. (1) degree is the strongest structural determinant of all objective measures at low β values. this is consistent with ref. [ ] . (2) component size only occurs for large β. in the limit of large β, f d is only determined by component size (if we extended the analysis to even larger β, subgraph centrality would have the strongest correlation for the frequency of detection). (3) harmonic vitality is relatively better as a structural descriptor for t d , less so for t x and f d . t x and f d capture the ability of detecting an outbreak before it dies, so for these quantities one can imagine that more fundamental quantities like degree and component size are more important. (4) subgraph centrality often shows the strongest correlation for intermediate values of β. this is interesting, but difficult to explain, since the rationale of subgraph centrality builds on cycle counts and there is no direct process involving cycles in the sir model. (5) harmonic closeness rarely gives the strongest correlation. when it does, it is usually succeeded by coreness, and the data set is typically rather large. (6) data sets from the same category can give different results; perhaps college and facebook are the most conspicuous example. in general, however, similar data sets give similar results. the final observation can be extended. we see that, as β increases, one color tends to follow another. this is summarized in fig. , where we show transition graphs of the different structural descriptors such that the size of a node corresponds to its frequency in fig. , and the size of the arrows shows how often one structural descriptor is succeeded by another as β is increased. 
for t x , degree and subgraph centrality are the most important structural descriptors, and the former is usually succeeded by the latter. for t d , there is a common, peculiar sequence of degree, subgraph centrality, coreness, component size, and harmonic vitality that is manifested as the peripheral, clockwise path of fig. (b) . finally, f d is similar to t x except that there is a rather common transition from degree to coreness, and harmonic vitality is, relatively speaking, a more important descriptor. in fig. , we show the figure for temporal networks corresponding to fig. . just like the static case, even though every data set and objective measure is unique, we can make some interesting observations. (1) strength is most important for small ν and β. this is analogous to degree dominating the static networks at small parameter values. (2) upstream component size dominates at large ν and β. this is analogous to the component size of static networks. since temporal networks tend to be more fragmented than static ones [ ] , this dominance at large outbreak sizes should be even more pronounced for temporal networks. (3) most of the variation happens in the direction of larger ν and β. in this direction, strength is succeeded by degree, which is succeeded by upstream component size. (4) like the static case, and the analysis of figs. and , t x and f d are qualitatively similar compared to t d . (5) temporal quantities, such as the average and first times of a node's contacts, are commonly the strongest predictors of t d . (6) when a temporal quantity is the strongest predictor of t x and f d , it is usually the duration. it is understandable that this has little influence on t d , since the ability to be infected at all matters for these measures; a long duration is beneficial since it covers many starting times of the outbreak. (7) similar to the static case, most categories of data sets give consistent results, but some differ greatly (facebook and college are yet again a good example). the bigger picture these observations paint is that, for our problem, the temporal and static networks behave rather similarly, meaning that the structures in time do not matter so much for our objective measures. at the same time, there is not only one dominant measure for all the data sets. rather, there are several structural descriptors that correlate most strongly with the objective measures, depending on ν and β. in this paper, we have investigated three different objective measures for optimizing sentinel surveillance: the time to detection or extinction, the time to detection (given that detection happens), and the frequency of detection. each of these measures corresponds to a public health scenario: the time to detection or extinction is most interesting to minimize if one wants to halt the outbreak as quickly as possible, and the frequency of detection is most interesting if one wants to monitor the epidemic status as accurately as possible. the time to detection is interesting if one wants to detect the outbreak early (or else it is not important), which could be the case if manufacturing new vaccine is relatively time consuming. we investigate these cases for temporal network data sets and for static networks derived from the temporal networks. our most important finding is that, for some regions of parameter space, our three objective measures can rank nodes very differently. this comes from the fact that sir outbreaks have a large chance of dying out in the very early phase [ ] , but once they get going they follow a fairly deterministic path. for this reason, it is important to be aware of what scenario one is investigating when addressing the sentinel surveillance problem. 
another conclusion is that, for this problem, static and temporal networks behave reasonably similarly (meaning that the temporal effects do not matter so much). naturally, some of the temporal networks respond differently than the static ones, but compared to, e.g., the outbreak sizes or time to extinction [ ] [ ] [ ] , the differences are small. among the structural descriptors of network position, there is no particular one that dominates throughout the parameter space. rather, local quantities like degree or strength (for the temporal networks) have a higher predictive power at low parameter values (small outbreaks). for larger parameter values, descriptors capturing the number of nodes reachable from a specific node correlate most with the objective measure rankings. also in this sense, the static network quantities dominate the temporal ones, which is in contrast to previous observations (e.g., refs. [ ] [ ] [ ] ). for the future, we anticipate more work on the problem of optimizing sentinel surveillance. an obvious continuation of this work would be to establish the differences between the objective measures in static network models. to do the same in temporal networks would also be interesting, although more challenging given the large number of imaginable structures. yet another open problem is how to distribute sentinels if there is more than one. it is known that they should be relatively far apart [ ] , but more precisely, where should they be located? we thank sune lehmann for providing the copenhagen data sets. this work was supported by jsps kakenhi grant no. jp h . key: cord- - ua z authors: reddy, c. rajashekar; mukku, t.; dwivedi, a.; rout, a.; chaudhari, s.; vemuri, k.; rajan, k. s.; hussain, a. m. title: improving spatio-temporal understanding of particulate matter using low-cost iot sensors date: - - journal: nan doi: nan sha: doc_id: cord_uid: ua z current air pollution monitoring systems are bulky and expensive, resulting in a very sparse deployment. in addition, the data from these monitoring stations may not be easily accessible. this paper focuses on studying dense-deployment-based air pollution monitoring using iot-enabled low-cost sensor nodes. for this, a total of nine low-cost iot nodes monitoring particulate matter (pm), which is one of the most dominant pollutants, are deployed in a small educational campus in the indian city of hyderabad. out of these, eight iot nodes were developed at iiit-h while one was bought off the shelf. a web-based dashboard website is developed to easily monitor the real-time pm values. data is collected from these nodes for more than five months. different analyses such as correlation and spatial interpolation are done on the data to understand the efficacy of dense deployment in better understanding the spatial variability and time-dependent changes of local pollution indicators. air pollution is one of the world's largest environmental causes of disease and premature death [ ] . among air pollutants, particulate matter (pm) has been identified as one of the most dangerous. 
because of long-term exposure to pm, every year millions of people die and many more become seriously ill with cardiovascular and respiratory diseases [ ] . the issues are more aggravated in a developing country like india, where large sections of the population are exposed to high levels of pm [ ] . with increasing urbanization, the situation is only going to get worse. a recent study [ ] has also shown that a small increase in long-term exposure to pm2.5 leads to a large increase in the covid-19 death rate. therefore, it is important to develop tools for monitoring pm so that timely decisions can be made. in this paper, the focus is particularly on monitoring mass concentrations of pm2.5 (fine pm, or particles with aerodynamic diameter less than 2.5 µm) and pm10 (coarse pm, or particles with aerodynamic diameter between 2.5 µm and 10 µm), as these two pm classes are most strongly linked with human health impacts [ ] . traditionally, pm monitoring is done using scientific-grade devices such as the beta attenuation monitor (bam) and the tapered element oscillating microbalance (teom), deployed by pollution control boards and other governmental agencies. although these systems are reliable and accurate, there are two important issues. the first is that these systems are expensive, large, and bulky, which leads to sparse deployment. for example, there are six monitoring stations deployed by the central pollution control board (cpcb) in the indian city of hyderabad, which is spread over an area of km [ ] . also, these stations provide temporally coarse data (hourly or daily). this in turn leads to a low spatio-temporal resolution, which is not enough to understand the exposure of citizens to pollution, which is non-uniformly distributed over the city. the second issue is that the measured pollution data at the monitoring stations and estimates at other locations are not readily available [ ] . 
this lack of access to information results in a lack of awareness among citizens regarding the pollution in their area of residence or frequently visited locations such as home, office, schools and gardens. low-cost portable sensors along with the internet of things (iot) can overcome the above two issues of traditional monitoring systems. low-cost portable ambient sensors provide a huge opportunity for increasing the spatio-temporal resolution of air pollution information and are even able to verify, fine-tune or improve existing ambient air quality models [ ] . it has been shown in [ ] that a low-cost monitoring system, while not as accurate as a traditional and expensive one, can still provide reliable indications about air quality in a local area. iot along with dense deployment of such low-cost sensors can provide real-time access to pollution data with high spatio-temporal resolution. government and citizens can use this information to identify pollution hot-spots so that timely and localized decisions can be made regarding reducing and preventing air pollution. there has been some work on pm monitoring in the literature [ ] , [ ] , [ ] , [ ] . in [ ] , [ ] , the performance of different low-cost optical pm . sensors such as the nova sds , winsen zh a, plantower pms , honeywell hpma s and alphasense opc-n has been evaluated. the authors in [ ] presented regulatory pm . and pm data availability along with the current status of the national monitoring networks and plans. in [ ] and [ ] , very few (six and three, respectively) iot nodes measuring pm . and pm were deployed in different geographical regions of santiago, chile, and southampton, uk, respectively, to examine the suitability of low-cost sensors for pm monitoring in urban environments. however, there is a dearth of actual deployments and measurements of dense iot networks to map fine spatio-temporal pm variations, which is precisely the focus of this paper.
this paper focuses on studying dense-deployment-based air pollution monitoring using iot-enabled low-cost sensor nodes in indian urban conditions. for this, eight sensor nodes measuring pm . and pm are developed and deployed on the iiit-h campus, which is . km . a web-based dashboard is developed to easily monitor the real-time air pollution. one of the eight deployed nodes is co-located with a commercially available and factory-calibrated sensor node with a view to calibrating the developed sensor nodes. data is collected from these nine nodes for approximately five months. correlation analysis is done to understand the correlation between different nodes in this denser (than traditional) deployment. for spatial interpolation, the inverse distance weighting (idw) scheme is applied to these nodes for the data collected before and during the bursting of firecrackers on the main night of diwali (one of the most popular festivals in india), to show the variability pattern in a small campus, hot-spot detection, and the need for a dense deployment to provide better local pollution indicators. the paper is organized as follows. in section ii, details on iot network development and deployment along with the measurement campaign are presented, followed by data analysis tools in section iii. section iv presents the results, while section v concludes the paper. figs. (a) and (b) show the block architecture and circuit diagram, respectively, of the pm monitoring sensor node developed at iiit-h. each node consists of an esp -based nodemcu microcontroller and sensors for pm, temperature and humidity. the specifications of the sensors used are given in table i . the nova pm sds , a light-scattering-based sensor, has been used for pm . and pm measurements, as it has been shown to have the best performance among several low-cost pm sensors in terms of closeness to the expensive and accurate beta attenuation monitor (bam) and reproducibility among different sds units [ ] .
since light-scattering-based pm sensors do not perform reliably in extreme temperature and humidity conditions, a dht is used to monitor these parameters to check the reliability of the sds sensor readings. the nodemcu samples data from the sensors and transmits it periodically via wifi to thingspeak [ ] , a cloud-based iot platform for storing and processing data using matlab, for logging the data. the nodemcu uses the on-chip esp module to connect to an available wifi access point for internet connection. the nodemcu samples the nova pm sds sensor for pm . and pm in µg m − and the dht sensor for the environmental conditions temperature and relative humidity (in • c and %, respectively) at a sampling interval of seconds, plus the network delay added for the communication with the server. the connections are made using a pcb printed and designed at iiit-h for stability of the connectors between the sensors and the microcontroller. (the website is live, but the historic data, schematics and codes will be made public once the paper is published.) table i lists, for each parameter, the sensor resolution and relative error: the sds [ ] for pm . and pm , and the humidity sensor with resolution . % and relative error ± %. fig. shows a deployment-ready sensor node, which consists of the sensors, a nodemcu, a ma h power bank, a g -based portable wifi router (volte-based jiofi jmr [ ] ) and a weather shield. the power bank is needed for backup in case of any fluctuations or drops in the power supply. a weather shield design with vents is used to cater to the ambient air-flow requirements of the dht for temperature and humidity. the components are enclosed in a polycarbonate box of ip rating, as the deployment is outdoors. ip enclosures offer complete protection against dust particles and a good level of protection against water. the g -based wifi router shown in the figure is not common to all the deployed nodes and is used only when a node is deployed outside the campus wifi coverage. the prototype deployment and measurement region is the iiit-h campus, gachibowli, hyderabad, india, as shown in fig. .
the area of the measurement region is acres ( . km ). in this small campus, the eight nodes developed at iiit-h were deployed outdoors at the locations shown in fig. . the figure also shows the notation and numbering of the nodes, which will be followed for the rest of the paper. before deployment, these eight nodes were collocated in a lab and measured data for seven days to ensure that none of the devices deviates too much from the rest. the deployment period of the nodes has been from october to april (more than months). in addition to the eight nodes, a ninth node was also deployed by buying an off-the-shelf commercial node from airveda [ ] . this node was factory calibrated with respect to a bam and has been used as a reference node for our nodes in this work. this node is denoted as node -airveda and is collocated with node -maingate, as shown in fig. . all the nodes are connected to a continuous power supply. nodes , and are connected to the wifi provided by access points that are part of the campus wifi network. node could connect to the campus wifi network but with weak signal strength, which sometimes resulted in connection outages and data loss. to avoid this and strengthen the wifi signal, a nodemcu has been deployed at an appropriate location as a wifi repeater. nodes , , and are out of the campus wifi coverage and have been equipped with individual g -based portable jiofi wifi routers for internet connectivity. node -airveda uses the wifi provided by the jiofi connected to node -maingate, since these two nodes are collocated. each of the eight iiit nodes (i.e., nodes to ) uploads the sampled sensor data, namely pm . , pm , temperature and relative humidity, to individual channels created on the thingspeak server using the get method of the http protocol. node -airveda uses an esp for wifi communication and uploads data to the airveda server. the same data is retrieved using the airveda application program interface (api) and saved in a separate channel on the thingspeak server.
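as a concrete illustration of the upload path described above, a node-side sketch (in python rather than the nodemcu firmware) might build a thingspeak-style get update request as follows; the field-to-quantity mapping and the api key are assumptions for illustration, not taken from the paper:

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE = "https://api.thingspeak.com/update"

def build_update_url(api_key, pm25, pm10, temp_c, rh_pct):
    # Map each reading to a ThingSpeak channel field; the field
    # numbering here (field1..field4) is a hypothetical assignment.
    params = {
        "api_key": api_key,
        "field1": pm25,
        "field2": pm10,
        "field3": temp_c,
        "field4": rh_pct,
    }
    return THINGSPEAK_UPDATE + "?" + urlencode(params)
```

a node would then issue this request once per sampling interval, e.g. with `urllib.request.urlopen(build_update_url(...))`.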
the website developed for displaying the real-time pm values is hosted at https://spcrc.iiit.ac.in/air/. fig. shows the process flow of the web-based dashboard. the webpage is designed such that the data is fetched from the thingspeak server and displayed on the webpage on an open-source map, openstreetmap [ ] . the front-end of the webpage is designed in hypertext markup language (html) and the back-end in javascript. to get the map data, we used the javascript library leaflet [ ] . to get data from thingspeak, another javascript technique, asynchronous javascript and xml (ajax), is used, which allows us to get data from the thingspeak api using the get function. after we get the data from the thingspeak api, the data is averaged and a colour is associated with the data value. next, using leaflet functions, the marker colour and information on the map are set. the process then goes to sleep. the complete process is repeated every minutes. note that this dashboard does not show node -airveda at the moment, as it can be viewed on the airveda webpage or app by adding the station id. the following tasks were done to convert the raw data received from the sensor nodes into a usable data set: • it is essential to remove the outliers in a raw dataset, as there are a few extreme values that deviate from the other samples in the data, which might be the result of several factors. data cleaning can be done using clustering-based outlier detection, a well-known unsupervised method used extensively. in this paper, the density-based clustering algorithm in [ ] has been employed to identify the outliers, and the vectors containing outliers have been dropped. environmental conditions such as temperature and humidity can affect the working of laser-based pm sensors like the sds . for example, there is overestimation of pm values at higher humidity. as such, these points also act as outliers, and the corresponding vectors are removed using density-based clustering.
• data averaging helps to look past random fluctuations and see the central trend of a data set. the sensor used in the pm measurements has a relative error of %, so averaging the data helps to smooth the time-series curve. b. analysis tools ) quantile-quantile plots: the quantile-quantile (qq) plot is an analysis tool to assess whether the populations of a pair of data variables possibly came from the same distribution or not. a qq plot is a scatterplot created by plotting two sets of quantiles against one another. if both sets of quantiles come from the same distribution, the scatter plot forms a line that is roughly straight. many distributional aspects, such as shifts in location, shifts in scale, symmetry, and the presence of outliers, can be detected from these plots. for two data sets that come from similar populations whose data distribution functions differ only by the shifts mentioned earlier, the data points lie along a straight line displaced either up or down from the -degree reference line. qq plots help us understand the distributional features of the data sets and provide the necessary confidence in the assumptions for further analysis. ) correlation analysis: correlation is a bivariate analysis that measures the strength of association between two variables and the direction of the relationship. the correlation coefficient is a statistical tool used to measure the extent of the relationship between variables when compared in pairs. in terms of the strength of the relationship, the value of the correlation coefficient varies between + and − . there are several types of correlation coefficients, such as pearson and kendall. pearson's correlation is one of the most commonly used correlation coefficients but makes several assumptions about the data, such as normally distributed variables, linearly related variables, complete absence of outliers and homoscedasticity. on the other hand, kendall's tau does not require the above-mentioned assumptions and is more suitable for the work in this paper.
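kendall's tau, argued above to be the more suitable coefficient here, can be computed directly from concordant and discordant pairs; a simple o(n²) sketch without tie corrections:

```python
def kendall_tau(x, y):
    """Kendall's tau: (n_c - n_d) / (n*(n-1)/2), where a pair (i, j)
    is concordant if (x_i - x_j)*(y_i - y_j) > 0 and discordant if
    it is < 0. This simple form ignores tie corrections."""
    n = len(x)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            z = (x[i] - x[j]) * (y[i] - y[j])
            if z > 0:
                nc += 1
            elif z < 0:
                nd += 1
    return (nc - nd) / (n * (n - 1) / 2)
```

two perfectly concordant series give tau = 1 and two perfectly discordant series give tau = −1; in practice a library routine with tie handling (e.g. `scipy.stats.kendalltau`) would be used on the hourly-averaged node data.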
kendall's tau (τ ), which is a non-parametric rank-based measure of dependence, is defined as τ = (n_c − n_d) / (n(n − 1)/2), where n_c and n_d are the numbers of concordant pairs and discordant pairs, respectively. for a given pair (x_i , y_i ) and (x_j , y_j ), let us define z = (x_i − x_j )(y_i − y_j ). this pair is concordant if z > 0 and discordant if z < 0. ) spatial interpolation: it is not practical to deploy sensors and measure pm values at every location in the area of interest. however, using the nearest measurement point to approximate the pm value at a location of interest may lead to erroneous results, given the variability of pollution levels and weather at different locations in an urban environment. this can be mitigated by using spatial interpolation to estimate the pm values at unmeasured locations using the known values at the measurement locations. in this paper, we have used idw, one of the simplest and most popular deterministic spatial interpolation techniques [ ] . idw follows the principle that nodes closer to the location of estimation have more impact than those farther away. idw uses a linearly weighted combination of the measured values at the nodes to estimate the parameter at the location of interest. the weight corresponding to a node is a function of the inverse distance between the location of the node and the location of the estimate. in this paper, the weights have been chosen to be the inverse distance squared. the following analyses were applied to the data set obtained after cleaning and preprocessing: qq plots, time-series plots, correlation analysis and spatial analysis. qq plots have been used on the two co-located nodes, node -airveda and node -maingate, to verify the distribution similarity. node -airveda is an air quality monitoring device from airveda which has been tested against the standard pm sensor, a bam monitor, and node -maingate is a sensor node developed at iiit-h. qq plots have been plotted with one-hour averaged data for pm . in fig.
(a) and for pm in fig. (b), with node -airveda on the horizontal axis and node -maingate on the vertical axis. the plots show linearity for the most part, with most of the sample points close to a straight line with high density and very few points deviating from the linear relationship for both pm . and pm samples. in the case of pm . , the few deviating points belong to the higher end of the distribution, while in the case of the pm samples, a few deviations can be seen at both the lower and higher ends of the distribution. from the plots, it is safe to assume that the populations of the data samples of node -airveda and node -maingate follow a similar distribution, with very few samples deviating. figs. (a) and (b) show idw-based interpolation maps for pm plotted at timestamps : : (before burning crackers) and : : (after burning crackers) on the day of diwali. in fig. (a) , the hot-spot of the pm values is at node -airveda and node -maingate, which are placed near a six-lane highway and exposed directly to vehicular pollution. spatial variation can clearly be seen in fig. (a) between the nine points in an area of only acres ( . km ), with node -ftbg, node -kcis and node -library showing comparatively lower values, being in the center of the campus. in fig. (b) , which shows the values at : after the bursting of crackers, the values increase dramatically by to times. now the number of hot-spots has increased to four, of which node -flyg and node -fcyq are the sites for bursting crackers, while node -airveda, node -maingate and node -bakul are affected by both vehicular pollution and crackers burned outside the campus. node -obh was off due to a technical issue on the evening of diwali, which has affected the interpolated values at that point, resulting in lower values than the actual ones. fig. (b) shows three nodes and the area in the center of the campus which are surrounded by the pollution hot-spots but still show significantly lower values of pm.
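the idw estimates behind these interpolation maps reduce to a weighted average with inverse-distance-squared weights; a minimal sketch (node coordinates and values are illustrative):

```python
def idw_estimate(nodes, target, power=2):
    """Inverse-distance-weighted estimate at `target` from a list of
    (x, y, value) measurement nodes. Weights are 1/d**power; the paper
    uses inverse distance squared, i.e. power=2."""
    num = den = 0.0
    for x, y, value in nodes:
        d = ((x - target[0]) ** 2 + (y - target[1]) ** 2) ** 0.5
        if d == 0:
            return value  # exactly at a measurement node
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den
```

a point equidistant from two nodes simply receives their mean; a full interpolation map is produced by evaluating this on a grid over the campus.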
the spatial variation among the nine nodes is clearly dominant and hence demonstrates the need for local deployment of sensor nodes for accurate monitoring of air quality conditions locally. fig. also shows the temporal variation of the values within a small period of five hours, with an increase in value from to around at node -flyg and node -fcyq. although similar results have been obtained for pm . , they are not shown here for brevity. in this paper, the dense deployment of iot nodes has been evaluated for monitoring pm values in an urban indian setting. for this, nine nodes have been deployed in the small campus of iiit-h. a web-based dashboard has been developed for real-time pm monitoring. the measurements done over a period of more than five months clearly show a significant increase in pm values during diwali as well as a noticeable reduction in pm values during the national lockdown during covid- . it has been shown that the correlation coefficient between some nodes on the same campus has low values, demonstrating that the pm values across a small region may be significantly different. moreover, the idw-based spatial interpolation results on the day of diwali show significant spatial variation in pm values on the campus, ranging from to for locations just a few hundred meters apart for pm . the results also show notable temporal variations, with pm values rising up to times at the same spot in a few hours. thus, there is sufficient motivation to use dense deployment of iot nodes for improved spatio-temporal monitoring of pm values.
• the lancet commission on pollution and health
• field performance of a low-cost sensor in the monitoring of particulate matter in
• monitoring particulate matter in india: recent trends and future outlook
• exposure to air pollution and covid- mortality in the united states
• a low cost sensing system for cooperative air quality monitoring in urban areas
• evaluation of low-cost sensors for ambient pm . monitoring
• city scale particulate matter monitoring using lorawan based air quality iot devices
• sds nova sensor specifications
• dht sensor specifications
• jiofi portable router jmr
• javascript library for mobile-friendly interactive maps, accessed
• anomaly detection in temperature data using dbscan algorithm
• principles of geographical information systems
key: cord- - uzl jpu authors: li, peisen; wang, guoyin; hu, jun; li, yun title: multi-granularity complex network representation learning date: - - journal: rough sets doi: . / - - - - _ sha: doc_id: cord_uid: uzl jpu network representation learning aims to learn low-dimensional vectors for the nodes in a network while maintaining the inherent properties of the original information. existing algorithms focus on the coarse-grained topology of nodes or on text information alone, which cannot describe complex information networks. however, node structure and attributes are interdependent and indecomposable. therefore, it is essential to learn the representation of a node based on both the topological structure and the node's additional attributes. in this paper, we propose a multi-granularity complex network representation learning model (mnrl), which integrates the topological structure and additional information at the same time and learns representations of the fused information in the same granularity semantic space, refining the complex network from fine to coarse. experiments show that our method can not only capture indecomposable multi-granularity information, but also retain various potential similarities of both topology and node attributes. it has achieved effective results in the downstream tasks of node classification and link prediction on real-world datasets.
a complex network is a description of the relationships between entities and a carrier of various kinds of information in the real world; it has become an indispensable form of existence in, for example, medical systems, judicial networks, social networks and financial networks. mining knowledge in networks has drawn continuous attention in both academia and industry. how to accurately analyze and make decisions on these problems and tasks across different information networks is a vital research problem. e.g., in the field of sociology, the large number of interactive social platforms such as weibo, wechat, facebook and twitter creates many social networks, including relationships between users, and a sharp increase in interactive review text information. studies have shown that these large, sparse new social networks, at different levels of cognition, present the same small-world nature and community structure as the real world. data analysis based on these interactive information networks [ ] , such as the prediction of criminal associations and sensitive groups, can then be applied directly to the real world. network representation learning is an effective analysis method for the recognition and representation of complex networks at different granularity levels; while preserving the inherent properties, it maps high-dimensional, sparse data to a low-dimensional, dense vector space. vector-based machine learning techniques can then be applied to handle tasks in different fields [ , ] , for example link prediction [ ] , community discovery [ ] , node classification [ ] , recommendation systems [ ] , etc. in recent years, various advanced network representation learning methods based on topological structure have been proposed, such as deepwalk [ ] , node vec [ ] and line [ ] , which have become classical algorithms for representation learning of complex networks and solve the problem of retaining the local topological structure.
a series of deep-learning-based network representation methods was then proposed to further address the preservation of global topological structure and the high-order non-linearity of the data, and to increase efficiency, e.g., sdne [ ] , gcn [ ] and dane [ ] . however, existing research has focused on coarser levels of granularity, that is, on a single topological structure, without comprehensive consideration of various granular information such as behaviors, attributes and features. this is not interpretable, which makes many decision-making systems unusable. in addition, the structure of an entity and its attributes or behavioral characteristics in a network are indecomposable [ ] . therefore, analyzing a single granularity of information alone loses a lot of potential information. for example, in the job-related crime relationship network shown in fig. , the anti-reconnaissance behavior of criminal suspects leads to a sparser network than common social networks. an undiscovered edge does not really mean that two nodes are unrelated, as with p and p (or p and p ); in case detection, the additional information of the suspects needs to be considered. if the two suspects without an explicit relationship were involved in the same criminal activity at a certain place (l ), they may have some potential connection. suspects p and p are related through attribute a ; the topology without attributes cannot explain why the relation between them is generated. these location attributes and activity information are inherently indecomposable from, and interdependent with, the suspects, so that, when the two nodes are recognized at a finer granularity based on the additional information and the relationship structure, their learned low-dimensional representation vectors have a certain similarity. we can directly predict the hidden relationship between the two suspects based on these potential similarities. therefore, it is necessary to consider both the network topology and the additional information of the nodes.
the cognitive learning mode of information networks is exactly in line with the multi-granularity thinking mechanism of human intelligent problem solving: data is taken as knowledge expressed at the lowest granularity level of a multiple granularity space, while knowledge is the abstraction of data at coarse granularity levels [ ] . multi-granularity cognitive computing fuses data at different granularity levels to acquire knowledge [ ] . similarly, network representation learning can represent data at a lower-dimensional granularity level while preserving the underlying properties and knowledge. to summarize, complex network representation learning faces the following challenges. information complementarity: node topology and attributes are essentially two different types of granular information, and fusing these granular information sources to enrich the semantic information of the network is a new perspective; but how to deal with the complementarity of their multiple levels and represent it in the same space is an arduous task. in complex networks, the similarity between entities depends not only on the topological structure, but also on the attribute information attached to the nodes. they are indecomposable and highly non-linear, so how to represent potential proximity is still worth studying. in order to address the above challenges, this paper proposes a multi-granularity complex network representation learning method (mnrl) based on the idea of multi-granularity cognitive computing. network representation learning can be traced back to traditional graph embedding, which is regarded as a process of reducing data from high dimensions to low dimensions. the main methods include principal component analysis (pca) [ ] and multidimensional scaling (mds) [ ] . all these methods can be understood as using an n × k matrix to represent the original n × m matrix, where k ≪ m.
later, some researchers proposed isomap and lle to maintain the overall structure of the nonlinear manifold [ ] . in general, these methods have shown good performance on small networks. however, their time complexity is extremely high, which makes them unable to work on large-scale networks. another popular class of dimensionality reduction techniques uses the spectral properties (e.g., eigenvectors) of a matrix derived from the graph to embed the nodes. laplacian eigenmaps [ ] obtain a low-dimensional vector representation of each node from the eigenvectors associated with the k smallest non-trivial eigenvalues of the graph laplacian. more recently, deepwalk was inspired by word vec [ ] : a node is selected as a starting point and a sequence of nodes is obtained by random walk; the obtained sequence is then regarded as a sentence and input to the word vec model to learn low-dimensional representation vectors. deepwalk can obtain the local context information of the nodes in the graph through random walks, so the learned representation vector reflects the local structure of a node in the network [ ] . the more neighboring points two nodes share in the network, the shorter the distance between the corresponding two vectors. node vec uses biased random walks to choose between breadth-first (bfs) and depth-first (dfs) graph search, resulting in a higher-quality and more informative node representation than deepwalk, which makes it more widely used in network representation learning. line [ ] proposes first-order and second-order approximations for network representation learning from a new perspective. harp [ ] obtains a vector representation of the original network through graph coarsening aggregation and node hierarchy propagation.
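the classical embedding view described earlier (representing an n × m data matrix by an n × k score matrix with k ≪ m) can be sketched with pca via the svd:

```python
import numpy as np

def pca_embed(X, k):
    """Represent an n x m matrix by an n x k score matrix (k << m)
    using the top-k principal components obtained from the SVD."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # n x k embedding
```

for data whose centred matrix is rank one, a single component already captures all of the variance, which is why these linear methods work well only when the network data actually lies near a low-dimensional linear subspace.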
recently, graph convolutional networks (gcn) [ ] have significantly improved the performance of network topological structure analysis: a convolutional layer aggregates each node with its neighbors in the network and outputs the weighted average of the aggregation results in place of the original node representation. through the continuous stacking of convolutional layers, nodes can aggregate high-order neighbor information well. however, when the convolutional layers are stacked beyond a certain number, the newly learned features become over-smoothed, which damages the network representation performance. multi-gs [ ] combines the concepts of multi-granularity cognitive computing, divides the network structure according to people's cognitive habits, and then uses a gcn to convolve the different granularity layers to obtain low-dimensional feature vector representations. sdne [ ] directly inputs the network adjacency matrix to an autoencoder [ ] to solve the problem of preserving highly non-linear first-order and second-order similarity. the above network representation learning methods use only network structure information to learn low-dimensional node vectors. but nodes and edges in real-world networks are often associated with additional information, and these features are called attributes. for example, on social networking sites such as weibo, the text content posted by users (nodes) is available. therefore, the node representation in the network also needs to learn from the rich content of node attributes and edge attributes. tadw studies the case where nodes are associated with text features. the authors of tadw first proved that deepwalk essentially factorizes a transition probability matrix into two low-dimensional matrices. inspired by this result, tadw represents the text feature matrix and node features in low dimensions through a matrix factorization process [ ] .
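the gcn aggregation described above — each node averaging its own and its neighbours' features before a linear map and nonlinearity — is commonly written h' = σ(d^(−1/2)(a + i)d^(−1/2) h w); a minimal numpy sketch:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: add self-loops, symmetrically
    normalise the adjacency matrix, aggregate neighbour features,
    apply a linear map and a ReLU nonlinearity."""
    A_hat = A + np.eye(A.shape[0])                  # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^{-1/2}
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU(A_norm H W)
```

stacking this layer k times lets each node see its k-hop neighbourhood, which is also where the over-smoothing mentioned above comes from: repeated averaging drives all node features toward one another.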
cene treats text content as a special type of node and uses node-node structure and node-content associations for node representation [ ] . more recently, dane [ ] and can [ ] use deep learning methods [ ] to preserve potentially non-linear node topology and node attribute information. these two kinds of information provide different views of each node, but their heterogeneity is not considered. anrl optimizes the network structure and attribute information separately, and uses the skip-gram model to skillfully handle the heterogeneity of the two different types of information [ ] . nevertheless, the consistent and complementary information in the topology and attributes is lost, and the sensitivity to noise is increased, resulting in lower robustness. to process different types of information, wang put forward the concepts of "from coarse to fine" cognition and "fine to coarse" fusion learning in the study of multi-granularity cognitive machine learning [ ] . people usually perform cognition at a coarser level first; for example, when we meet a person, we first recognize who the person is from the face, and then refine the features to see the freckles on the face. computers, by contrast, obtain semantic information that humans understand by fusing fine-grained data up to coarse-grained levels. refining the granularity of complex networks and the fusion between different granularity layers is still an area worthy of deeper research [ , ] . inspired by this, we divide complex networks into different levels of granularity: single nodes and attribute data form the micro structure, role similarity and community similarity form the meso structure, and global network characteristics form the macro structure. the larger the granularity, the wider the range of data covered; the smaller the granularity, the narrower the data covered.
our model learns semantic information that humans can understand at the above-mentioned levels, from the finest-grained attribute information and the topological structure, and finally saves it into low-dimensional vectors. let g = (v, e, a) be a complex network, where v represents the set of n nodes, e represents the set of edges, and a represents the set of attributes. in detail, a ∈ R^(n×m) is a matrix that encodes all the nodes' additional attribute information, and a_i ∈ a describes the attributes associated with node v_i ; e_ij ∈ e represents an edge between v_i and v_j . we formally define multi-granularity network representation learning as follows: given g = (v, e, a), we represent each node v_i and attribute a_i as a low-dimensional vector y_i by learning a function f_g : v → R^d with d ≪ |v|, such that y_i retains not only the topology of the nodes but also the node attribute information. definition . given a network g = (v, e, a), semantic similarity indicates that two nodes have similar attributes and neighbor structure, and that the low-dimensional vectors obtained by network representation learning maintain the same similarity as in the original network. e.g., if v_i ∼ v_j , then through the mapping function f_g we get the low-dimensional vectors y_i = f_g (v_i ) and y_j = f_g (v_j ), and y_i and y_j are still similar: y_i ∼ y_j . complex networks are composed of node and attribute granules (elementary granules), which can no longer be decomposed. learning from these granules yields different levels of semantic information, including topological structure (micro), role acquaintance (meso) and global structure (macro). the complete low-dimensional representation of a complex network is the aggregation of these granular layers of information.
in order to solve the problems mentioned above, inspired by multi-granularity cognitive computing, we propose a multi-granularity network representation learning method (MNRL), which refines complex network representation learning from the topology level down to each node's attribute characteristics and various attachments. the model not only fuses finer-grained information but also preserves the node topology, which enriches the semantic information of the relational network and addresses the indecomposability and interdependence of information. the algorithm framework is shown in fig. . firstly, the topology and additional information are fused through the function h; then the variational encoder is used to learn the network representation from fine to coarse. the outputs of the embedding layer are low-dimensional vectors that combine the attribute information and the network topology. to better characterize multi-granularity complex networks and handle nodes with potential associations that cannot be captured through the relationship structure alone, we refine the granularity to the additional attributes and design an information fusion method, defined as follows: where n(v_i) is the set of neighbors of node v_i in the network and a_i is the attributes associated with node v_i; w_ij > 0 for weighted networks and w_ij = 1 for unweighted networks; d(v_j) is the degree of node v_j. the fused vector x_i contains multiple granularities of information: both the neighbors' attribute information and the node's own. to capture the complementarity of different granularity hierarchies and avoid the effects of various noises, our model in fig.  is a variational auto-encoder, a powerful unsupervised deep model for feature learning that has been widely used in multi-granularity cognitive computing applications. in multi-granularity complex networks, the auto-encoder fuses different granularity data into a unified granularity space from fine to coarse. 
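A minimal sketch of one plausible form of the fusion function h. The exact equation for x_i is elided above, so the degree-normalized weighting used here is an assumption consistent with the surrounding description (node's own attributes plus neighbor attributes weighted by w_ij / d(v_j)); `fuse_attributes` is a hypothetical name:

```python
import numpy as np

def fuse_attributes(A, adj):
    """Hypothetical fusion h: each node's attribute vector plus the
    degree-normalized sum of its neighbors' attributes,
    x_i = a_i + sum_j (w_ij / d(v_j)) * a_j  (an assumed form)."""
    deg = adj.sum(axis=1, keepdims=True)   # d(v_j) for every node
    deg[deg == 0] = 1.0                    # guard isolated nodes
    # divide column j of the (weighted) adjacency by d(v_j), then mix attributes
    neighbor_term = (adj / deg.T) @ A
    return A + neighbor_term
```

For an unweighted two-node network (adj entries w_ij = 1) each node simply adds its single neighbor's attributes to its own.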
the variational auto-encoder contains three layers, namely the input layer, the hidden layer, and the output layer, which are defined as follows: here, k is the number of layers of the encoder and decoder. σ(·) represents a possible activation function such as ReLU, sigmoid, or tanh. w^k and b^k are the transformation matrix and bias vector in the k-th layer, respectively. y_i^k is the unified vector representation learned by the model, which obeys the distribution function of e, reducing the influence of noise. e ∼ N(0, 1) is the standard normal distribution in this paper. in order to make the learned representation as similar as possible to the given distribution, we need to minimize the following loss function: to reduce the potential information loss of the original network, our goal is to minimize the following auto-encoder loss function: where x̂_i is the reconstruction output of the decoder and x_i incorporates prior knowledge into the model. to formulate homogeneous network structure information, the skip-gram model has been widely adopted in recent work, and in the field of heterogeneous network research, skip-gram variants suitable for processing different types of nodes have also been proposed [ ] . in our model, the context of a node is its low-dimensional potential information. given the node v_i and the associated reconstruction information y_i, we generate random walks c ∈ C and maximize the objective: where b is the size of the generation window and the conditional probability p(v_{i+j} | y_i) is defined as the softmax function: in the above formula, v'_i is the context representation of node v_i, and y_i is the result produced by the auto-encoder. directly optimizing eq. ( ) is computationally expensive, as it requires a summation over the entire set of nodes when computing the conditional probability p(v_{i+j} | y_i). 
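The reparameterization y = μ + σ·e with e ∼ N(0, 1), and the KL distribution-matching loss described above, can be sketched as follows. This is a NumPy illustration, not the authors' implementation; the single-hidden-layer `encode` and the parameter names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W, b):
    """Illustrative one-hidden-layer encoder producing mean and log-variance
    (hypothetical shapes; the paper uses K stacked layers)."""
    h = np.tanh(x @ W["enc"] + b["enc"])
    return h @ W["mu"] + b["mu"], h @ W["logvar"] + b["logvar"]

def reparameterize(mu, logvar):
    # y = mu + sigma * e, with e drawn from the standard normal N(0, 1)
    e = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * e

def kl_loss(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, 1) ): the distribution-matching term L_kl
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
```

When μ = 0 and log σ² = 0 the latent distribution already matches N(0, 1), so the KL term vanishes.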
we adopt the negative sampling approach proposed in metapath2vec++, which draws multiple negative samples according to a noise distribution: where σ(·) = 1/(1 + exp(−·)) is the sigmoid function and s is the number of negative samples. we set p_n(v) ∝ d_v as suggested in word2vec, where d_v is the degree of node v [ , ] . through the above methods, the node attribute information and the heterogeneity of the node's global structure are processed, and the potential semantic similarity is kept in a unified granularity space. multi-granularity complex network representation learning, through the fusion of multiple kinds of granularity information, learning the elementary granules with an auto-encoder, and representing different levels of granularity in a unified low-dimensional vector, captures the potential semantic similarity between nodes that lack direct edges. the model simultaneously optimizes the objective function of each module to make the final result robust and effective. the function is shown below: in detail, l_re is the auto-encoder loss function of eq. ( ), l_kl has been stated in formula ( ), and l_hs is the loss function of the skip-gram model in eq. ( ) . α, β, ψ, γ are the hyper-parameters that balance each module. l_vae is the parameter optimization function, given by: where w^k, ŵ^k are the weight matrices of the encoder and decoder, respectively, in the k-th layer, and b^k, b̂^k are the bias vectors. the complete objective function is expressed as follows: MNRL preserves multiple types of granular information, including node attributes, local network structure, and global network structure, in a unified framework. the model addresses the high non-linearity and complementarity of the various granularity information while retaining the underlying semantics of the topology and the additional information. finally, we optimize the objective function l in eq. ( ) through stochastic gradient descent. 
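The negative-sampling objective just described (one observed context node against s negative samples drawn from the noise distribution) can be sketched numerically. A hedged illustration, not the authors' code; `neg_sampling_loss` is a hypothetical name and the vectors are toy inputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_sampling_loss(y_i, v_pos, v_negs):
    """Skip-gram objective with negative sampling (assumed form):
    maximize log sigma(v_pos . y_i) + sum_s log sigma(-(v_neg_s . y_i)).
    Returned negated so it can be minimized by gradient descent."""
    obj = np.log(sigmoid(v_pos @ y_i))          # observed context term
    for v_n in v_negs:                          # s negative samples
        obj += np.log(sigmoid(-(v_n @ y_i)))
    return -obj
```

With a zero embedding both dot products are 0, each sigmoid is 0.5, and the loss is exactly 2·log 2 for one negative sample.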
to ensure the robustness and validity of the results, we iteratively optimize all components at the same time until the model converges. the learning algorithm is summarized in algorithm .
algorithm . the model of MNRL.
input: graph g = (v, e, a), window size b, number of walks p, walk length u, hyper-parameters α, β, ψ, γ, embedding size d.
output: node representations y^k ∈ R^d.
1: generate node contexts by starting p random walks of length u at each node.
2: fuse multi-granularity information for each node by the function h(·).
3: initialize all parameters.
4: while not converged do
5: sample a mini-batch of nodes with their contexts.
6: compute the gradient ∇l.
7: update the auto-encoder and skip-gram module parameters.
8: end while
9: save representations y = y^k.
datasets: in our experiments, we employ four benchmark datasets: Facebook, Cora, Citeseer, and Pubmed. these datasets contain edge relations and various attribute information, which can verify that the social relations of nodes and individual attributes are strongly interdependent and indecomposable, and jointly determine the properties of entities in the social environment. Cora, Citeseer, and Pubmed are paper citation networks consisting of bibliographic publication data; an edge represents that one paper cites or is cited by another. the publications are classified into one of the following six classes in Citeseer: Agents, AI, DB, IR, ML, HCI, and into one of three classes (i.e., "diabetes mellitus experimental", "diabetes mellitus type ", "diabetes mellitus type ") in Pubmed. the Cora dataset consists of machine learning papers which are classified into seven classes. the Facebook dataset is a typical social network: nodes represent users and edges represent friendship relations. we summarize the statistics of these benchmark datasets in table . to evaluate the performance of our proposed MNRL, we compare it with baseline methods, which can be divided into two groups. 
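The context-generation step of the algorithm (p random walks of length u per node, skip-gram window b) can be sketched as follows; the function name and adjacency-list layout are assumptions, not the authors' implementation:

```python
import random

def random_walk_contexts(adj_list, p, u, b, seed=0):
    """Generate (node, context) skip-gram pairs from p random walks of
    length u started at each node, with window size b (illustrative)."""
    rng = random.Random(seed)
    pairs = []
    for start in adj_list:
        for _ in range(p):
            walk = [start]
            while len(walk) < u:
                nbrs = adj_list[walk[-1]]
                if not nbrs:          # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            for i, v in enumerate(walk):
                lo, hi = max(0, i - b), min(len(walk), i + b + 1)
                for j in range(lo, hi):
                    if j != i:
                        pairs.append((v, walk[j]))
    return pairs
```

On a two-node path with p = 1, u = 2, b = 1, each of the two walks contributes one pair in each direction.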
the former category of baselines leverages network structure information only and ignores node attributes; it contains DeepWalk, node2vec, GraRep [ ] , LINE, and SDNE. the other methods try to preserve both node attribute and network structure proximity and are competitive competitors; we consider TADW, GAE, VGAE, and DANE as compared algorithms. for all baselines, we used the implementation released by the original authors, with parameters tuned to be optimal. for DeepWalk and node2vec, we set the window size as , the walk length as , and the number of walks as . for GraRep, the maximum transition step is set to . for LINE, we concatenate the first-order and second-order results together as the final embedding result. for the rest of the baseline methods, parameters are set following the original papers. at last, the dimension of the node representation is set as . for MNRL, the number of layers and dimensions for each dataset are shown in table . table . detailed network layer structure information for Citeseer, Pubmed, Cora, and Facebook. to show the performance of our proposed MNRL, we conduct node classification on the learned node representations. specifically, we employ an SVM as the classifier. to make a comprehensive evaluation, we randomly select %, %, % of the nodes as the training set and the rest as the testing set, respectively. with these randomly chosen training sets, we use five-fold cross-validation to train the classifier and then evaluate the classifier on the testing sets. to measure the classification result, we employ micro-F1 (Mi-F1) and macro-F1 (Ma-F1) as metrics. the classification results are shown in the corresponding tables. from these tables, we can find that our proposed MNRL achieves significant improvement compared with plain network embedding approaches, and beats other attributed network embedding approaches in most situations. 
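The Mi-F1 and Ma-F1 metrics used above can be computed from scratch as follows (a self-contained sketch; for single-label multi-class data, micro-F1 reduces to accuracy):

```python
import numpy as np

def f1_scores(y_true, y_pred):
    """Return (micro-F1, macro-F1) for integer label arrays.
    Micro pools TP/FP/FN over classes; macro averages per-class F1."""
    labels = np.unique(np.concatenate([y_true, y_pred]))
    tp = fp = fn = 0
    per_class = []
    for c in labels:
        tp_c = np.sum((y_pred == c) & (y_true == c))
        fp_c = np.sum((y_pred == c) & (y_true != c))
        fn_c = np.sum((y_pred != c) & (y_true == c))
        tp, fp, fn = tp + tp_c, fp + fp_c, fn + fn_c
        denom = 2 * tp_c + fp_c + fn_c
        per_class.append(2 * tp_c / denom if denom else 0.0)
    micro = 2 * tp / (2 * tp + fp + fn)
    return float(micro), float(np.mean(per_class))
```

For example, with true labels [0, 0, 1, 1] and predictions [0, 0, 1, 0], class 0 has F1 = 0.8 and class 1 has F1 = 2/3, giving micro-F1 0.75 and macro-F1 ≈ 0.733.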
experimental results show that the representations produced by each comparison algorithm perform well on node classification as a downstream task. in general, a model that considers both node attribute information and node structure information performs better than structure alone. from these tables, we can find that our proposed MNRL achieves significant improvement compared with single-granularity network embedding approaches. for joint representation, our model performs more effectively than most algorithms of a similar type, especially in the case of sparse data, because our model's input is the fused information of multiple nodes with extra information. when comparing with DANE, our experiments did not improve significantly, but the expected results were achieved. DANE uses two auto-encoders to learn and express the network structure and attribute information separately; since the increase in parameters enlarges the search for an optimum during learning, its performance improves with more training data, but the demand for computing resources also increases and the interpretability of the algorithm is weak. MNRL instead uses a single variational auto-encoder to learn the structure and attribute information at the same time, so the interdependence of the information is preserved, heterogeneous information is handled well, and the impact of noise is reduced. in this subsection, we evaluate the ability of the node representations to reconstruct the network structure via link prediction, a typical task in network analysis that aims to predict whether an edge exists between two nodes. following prior work, to evaluate the performance of our model we randomly hold out % of the existing links as positive instances and sample an equal number of non-existing links as negatives. then, we use the residual network to train the embedding models. specifically, we rank both positive and negative instances according to the cosine similarity function. 
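The cosine-similarity ranking just described, scored by AUC as in the next paragraph, can be sketched as follows (illustrative names; AUC here is computed directly as the probability that a held-out positive pair outranks a sampled negative pair):

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def link_auc(emb, pos_pairs, neg_pairs):
    """Rank held-out links against sampled non-links by cosine similarity
    of the node embeddings; AUC = P(score_pos > score_neg), ties count 1/2."""
    pos = [cosine(emb[i], emb[j]) for i, j in pos_pairs]
    neg = [cosine(emb[i], emb[j]) for i, j in neg_pairs]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

If every positive pair is more similar than every negative pair, the AUC is 1.0; random embeddings give values near 0.5.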
to judge the ranking quality, we employ the AUC to evaluate the ranking list; a higher value indicates better performance. we perform the link prediction task on the Cora dataset, and the results are shown in fig. . compared with traditional algorithms that learn representations from single-granularity structural information alone, algorithms based on both structure and attribute information are more effective. TADW performs well, but methods based on matrix factorization have the disadvantage of high complexity on large networks. GAE and VGAE perform better in this experiment and are suitable for large networks. MNRL refines the input and retains potential semantic information; since link prediction relies on the additional information, it performs better than the other algorithms in this experiment. in this paper, we propose a multi-granularity complex network representation learning model (MNRL), which integrates topological structure and additional information, and embeds the fused information into the same granularity semantic space through fine-to-coarse refinement of the complex network. its effectiveness has been verified by extensive experiments, which show that the relations of nodes and their additional attributes are indecomposable and complementary, and jointly determine the properties of entities in the network. in practice, it has good application prospects in large information networks. although the model saves a lot of computation cost and represents complex networks of various granularities well, it needs different parameters in different application scenarios, which is troublesome and needs to be optimized in the future. multi-granularity complex network representation learning also needs to consider dynamic networks and adapt to the changes of network nodes, so as to realize real-time information network analysis. 
references
social structure and network analysis
network representation learning: a survey
virtual network embedding: a survey
the link-prediction problem for social networks
community discovery using nonnegative matrix factorization
node classification in social networks
recommender systems
DeepWalk: online learning of social representations
node2vec: scalable feature learning for networks
LINE: large-scale information network embedding
deep learning
deep attributed network embedding
structural deep network embedding
semi-supervised classification with graph convolutional networks
DGCC: data-driven granular cognitive computing
granular computing
data mining, rough sets and granular computing
structural deep embedding for hypernetworks
principal component analysis
the Isomap algorithm and topological stability
Laplacian eigenmaps for dimensionality reduction and data representation
network representation learning based on multi-granularity structure
word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method
HARP: hierarchical representation learning for networks
sparse autoencoder
network representation learning with rich text information
a general framework for content-enhanced network representation learning
ANRL: attributed network representation learning via deep neural networks
granular computing with multiple granular layers for brain big data processing
an approach for attribute reduction and rule generation based on rough set theory
metapath2vec: scalable representation learning for heterogeneous networks
GraRep: learning graph representations with global structural information
co-embedding attributed networks

key: cord- -w tftva
authors: Suran, Jantra Ngosuwan; Wyre, Nicole Rene
title: imaging findings in domestic ferrets (Mustela putorius furo) with lymphoma
date: - -
journal: Vet Radiol Ultrasound
doi: . /vru. sha:
doc_id: cord_uid: w tftva
lymphoma is the most common malignant neoplasia in domestic ferrets, Mustela putorius furo. 
however, imaging findings in ferrets with lymphoma have primarily been described in single case reports. the purpose of this retrospective study was to describe imaging findings in a group of ferrets with confirmed lymphoma. medical records were searched between and . a total of ferrets were included. radiographs (n = ), ultrasound (n = ), computed tomography (CT; n = ), and magnetic resonance imaging (MRI; n = ) images were available for review. median age at the time of diagnosis was . years (range . – . years). clinical signs were predominantly nonspecific ( / ). the time between the first imaging study and lymphoma diagnosis was a day or less in most ferrets ( ). imaging lesions were predominantly detected in the abdomen, and most frequently included intra-abdominal lymphadenopathy ( / ), splenomegaly ( / ), and peritoneal effusion ( / ). lymphadenopathy and mass lesions were typically hypoechoic on ultrasound. mild peritoneal effusion was the only detected abnormality in two ferrets. mild pleural effusion was the most common thoracic abnormality ( / ). expansile lytic lesions were present in the vertebrae of two ferrets with T-L myelopathy and in the femur of a ferret with lameness. hyperattenuating, enhancing masses with secondary spinal cord compression were associated with vertebral lysis on CT images of one ferret. the MRI study in one ferret with myelopathy was inconclusive. findings indicated that imaging characteristics of lymphoma in ferrets are similar to those previously reported in dogs, cats, and humans. lymphoma is the most common malignant neoplasia in domestic ferrets, Mustela putorius furo. following insulinoma and adrenocortical neoplasia, it is the third most common neoplasia of domestic ferrets overall. [ ] [ ] [ ] [ ] in ferrets, lymphoma can be classified based on tissue involvement, including multicentric, mediastinal, gastrointestinal, cutaneous, and extranodal. 
[ ] [ ] [ ] the presentation and organ distribution of lymphoma have been associated with the age of onset. mediastinal lymphoma is more prevalent in young ferrets, particularly those less than a year of age. these ferrets tend to have an acute presentation and may present with dyspnea. ferrets with mediastinal lymphoma may also have multicentric involvement. ferrets years of age and greater have a variable presentation, with multicentric disease being more prevalent. clinical signs in older ferrets may be chronic and nonspecific depending on organ involvement. some ferrets may have intermittent signs over several months, while others may be asymptomatic, with lymphoma diagnosed incidentally, detected either during routine physical examination or during evaluation of comorbidities. [ ] [ ] [ ] despite lymphoma being common in domestic ferrets, and the use of radiography and ultrasonography being touted as part of the minimum database in the diagnosis of lymphoma, imaging findings in ferrets with lymphoma have been limited to a few case reports. [ ] [ ] [ ] [ ] the goal of this retrospective study was to describe radiography, ultrasonography, computed tomography (CT), and magnetic resonance imaging (MRI) findings in a series of ferrets with a confirmed diagnosis of lymphoma. medical records at the Matthew J. Ryan Veterinary Hospital of the University of Pennsylvania were searched for domestic ferrets with a diagnosis of lymphoma confirmed with cytology or histopathology that had radiography, ultrasonography, CT, or MRI performed between January and April . signalment, clinical signs, laboratory findings, and any prior or concurrent disease processes were recorded. Vet Radiol Ultrasound, vol. , no. , pp. - . radiographs, CT, MRI, static ultrasound images, and, when available, ultrasound cine loops were retrospectively evaluated, and abnormal findings were recorded by J.N.S. 
the imaging reports generated by a board-certified veterinary radiologist at the time each study was performed were also reviewed. ferrets were excluded if the imaging studies were unavailable for review. any imaging studies obtained after the reported start of clinical signs until a diagnosis was achieved were included. in addition to the reported ultrasonographic findings, the maximal splenic and lymph node thicknesses were measured from the available images if they were not reported. as splenomegaly is common in ferrets, most frequently due to extramedullary hematopoiesis, splenomegaly was subjectively graded as within incidental variation ("incidental splenomegaly") or larger than expected for "incidental splenomegaly." those with subjectively normal spleens or "incidental splenomegaly" were referred to as normal for the purposes of this paper, unless otherwise specified. retrieved CT images were reconstructed with a high-frequency, high-resolution algorithm (bone algorithm with edge enhancement) at . mm slice thickness. the maximal lymph node thickness was measured on precontrast images. fourteen ferrets met the inclusion criteria. lymphoma was diagnosed from ultrasound-guided aspirates, surgical biopsies, and/or necropsy; three ferrets had two diagnostic procedures performed. ultrasound-guided aspirate cytology was performed in nine ferrets, surgical biopsy in three, and necropsy in five. both aspirates and biopsy were performed in two ferrets, and both aspirates and necropsy in one ferret. one ferret in this study was previously described. the median age at the time of lymphoma diagnosis was . years (range . - . years). eight of the ferrets were neutered males, and six were spayed females. 
prior disease histories included adrenal disease (n = ), cardiovascular disease ( ), cutaneous mast cell tumors ( ), diarrhea ( ), insulinoma ( ), cataracts ( ), and one each of granulomatous lymphadenitis secondary to mycobacteriosis, renal insufficiency, and a chronic pelvic limb abscess. cardiovascular disease included second-degree atrioventricular block ( ), systemic hypertension ( ), hypertrophic obstructive cardiomyopathy ( ), and, in one individual, both aortic insufficiency and arteriosclerosis. for most ferrets, the time between the first imaging study and a diagnosis of lymphoma was a day or less ( / ). in one ferret each, the time between the initial imaging study and the final diagnosis of lymphoma was days and . months. the duration of clinical signs prior to reaching the diagnosis of lymphoma ranged from less than a day to months, with a mode of less than a day and a median of days. clinical signs included lethargy (n = ), diarrhea ( ), inappetence ( ), weight loss ( ), ataxia ( ), lameness ( ), and vomiting ( ). diarrhea was chronic in three ferrets and consistent with melena in two. two ferrets did not have overt clinical signs. physical exam findings included palpable abdominal masses ( ), generalized splenomegaly ( ), palpable splenic nodules or splenic masses ( ), dehydration ( ), paraparesis and ataxia ( ), abdominal pain ( ), hypotension ( ), lumbar pain ( ), abdominal effusion ( ), inguinal and popliteal lymphadenopathy ( ), pyrexia ( ), urinary and fecal incontinence ( ), and a right femoral mass ( ). paraparesis and ataxia were attributed to T-L myelopathy in three ferrets and to hypoglycemia in one. one ferret also presented with ptyalism and tremors, which resolved with dextrose administration. blood analyses, including a complete blood count and chemistry profile, were performed in of the ferrets. blood glucose evaluation alone was performed in one ferret. 
abnormalities included azotemia ( ), elevated liver enzymes ( ), nonregenerative anemia ( ), hypoglycemia ( ), hypoalbuminemia ( ), lymphocytosis ( ), hyperglobulinemia ( ), hypercalcemia ( ), and elevated total bilirubin ( ). two of the four ferrets with hypoglycemia had a previous diagnosis of insulinoma. radiographs were performed in / ferrets. each of these studies included the thorax and abdomen: six studies included a right or left lateral projection and a ventrodorsal projection, and six included left lateral, right lateral, and ventrodorsal projections. one ferret's radiographs were in an analog format (screen-film), and the remainder were digital (RapidStudy, Eklin Medical Systems Inc., Santa Clara, CA). ultrasound was performed in / ferrets using a - MHz linear transducer (GE Medical LOGIQ ultrasound imaging system, General Electric Medical Systems, Milwaukee, WI). abdominal ultrasonography was performed in / ferrets. ultrasonography of a rib mass and of a femoral mass was performed in one ferret each. abdominal ultrasonography was performed twice in one ferret after the start of the reported clinical signs but prior to the diagnosis of lymphoma. ultrasounds were performed by a board-certified radiologist or a radiology resident under direct supervision of a board-certified radiologist. in the one ferret with two ultrasounds performed prior to the diagnosis of lymphoma, the later scan was used for measurements. computed tomography of the thorax and abdomen was performed in one ferret. the ferret was scanned under general anesthesia in dorsal recumbency using a -slice multidetector CT unit (GE BrightSpeed, General Electric Company, Milwaukee, WI) with medium-frequency soft tissue algorithms ( . mm slice thickness, . pitch) before and immediately after IV administration of nonionic iodinated contrast (iohexol, mgI/mL, dosage mgI/kg [Omnipaque, GE Healthcare, Inc., Princeton, NJ]). 
contrast was manually injected through an IV catheter preplaced in a cephalic vein. one ferret underwent MRI evaluation of the lumbar and sacral spine. magnetic resonance imaging was performed using a . T MRI unit (GE Medical Systems, Milwaukee, WI) with the patient in dorsal recumbency under general anesthesia. image sequences included T2-weighted (T2W) image series in sagittal and transverse planes, T2W fat-saturated images in sagittal and transverse planes, and a sagittal single-shot fast spin echo. additional sequences, including T1-weighted (T1W) images and administration of gadolinium, were not performed due to anesthetic concerns for the patient. radiographic and ultrasonographic findings are summarized in table . radiographic abnormalities were predominantly noted in the abdomen. decreased abdominal serosal detail was present in of the ferrets with radiographs. this was interpreted to be potentially due to poor body condition in two ferrets. abdominal serosal detail was additionally mottled in seven ferrets. sonographically, peritoneal effusion was detected in / ferrets, and was considered mild in , moderate in , anechoic in , and echogenic in . the spleen was considered enlarged in out of ferrets radiographically and in out of ferrets in which abdominal ultrasonography was performed. the remaining five ferrets were considered to have "incidental splenomegaly" on ultrasound and were radiographically considered normal (n = ) and enlarged ( ). of the eight spleens sonographically considered abnormal, multifocal hypoechoic splenic nodules were present in six ferrets (fig. ); one of these six ferrets was considered to have a radiographically normal spleen. an isoechoic to hypoechoic mass with central hypoechoic nodules was present in one ferret. the spleen had a mottled echotexture in one ferret. 
three ferrets were sedated with butorphanol (Torbugesic, Fort Dodge Animal Health, Fort Dodge, IA) and midazolam (Hospira Inc., Lake Forest, IL) prior to abdominal ultrasonography; sedation was not performed in any ferret prior to radiographs. all three sedated ferrets had enlarged spleens, with one spleen having a mottled echotexture and the other two spleens having multifocal hypoechoic nodules. splenic cytology or histopathology was not performed in any of the ferrets that received sedation for ultrasound. cytology or histopathology was available in seven ferrets: aspirates were performed in / , necropsy in / , and both aspirates and necropsy in / . lymphoma was confirmed in three out of six ferrets with splenic nodules, the one ferret with a splenic mass, and one out of five ferrets with "incidental splenomegaly." in three of the five ferrets with "incidental splenomegaly," marked splenic extramedullary hematopoiesis with splenic congestion ( ) and without concurrent splenic congestion ( ) was diagnosed. the other seven ferrets did not have cytologic evaluation of the spleen. splenic thickness in ferrets where the spleen was considered within incidental variation ranged from . to . mm (median . mm, mean . mm, standard deviation ± . mm; n = ), while in ferrets with splenomegaly splenic thickness ranged from . to . mm (median . mm, mean . mm, standard deviation ± . mm; n = ). single or multiple, round to oblong, soft-tissue opaque abdominal masses consistent with enlarged lymph nodes were visible radiographically in / ferrets (fig. ). one of these ferrets had a large cranial abdominal mass, which was subsequently confirmed to be a markedly enlarged pancreatic lymph node. one ferret had splenic lymphadenopathy detected on the radiographs retrospectively after evaluation of the sonographic findings. lymphadenopathy was reported in of the ferrets with abdominal ultrasonography and in the one ferret in which whole-body CT was performed. 
sonographically, abnormal lymph nodes were hypoechoic, rounded, variably enlarged, and surrounded by a hyperechoic rim (fig. ). some lymph nodes had patchy hyperechoic regions within or a reticular, nodular appearance. abdominal lymph nodes involved included the mesenteric (n = ), hepatic ( ), sublumbar ( ), splenic ( ), gastric ( ), gastroduodenal ( ), colonic ( ), pancreatic ( ), ileocolic ( ), renal ( ), and inguinal ( ) lymph nodes. in one ferret, other lymph nodes were reported to be involved in addition to the mesenteric and sublumbar nodes but were not specifically identified. nine of the ferrets with abdominal ultrasound and the one ferret with CT had involvement of or more lymph nodes reported; / ferrets had only one reportedly abnormal lymph node ( splenic lymph node, mesenteric lymph node). lymph nodes measured from . to . mm thick (median . , mean . mm, standard deviation ± . mm). the lymph node thickness in ferrets with radiographically evident lymphadenopathy ranged from . mm to . mm, with a median of . , mean of . , and standard deviation of ± . mm (n = ). in ferrets in which lymphadenopathy was detected sonographically but not radiographically, lymph nodes measured . mm, . mm, and . mm thick (n = ). lymph node cytology or histopathology was available in seven ferrets: aspirates were performed in / ferrets, both aspirates and surgical biopsy in / , and necropsy in / . it was not clear whether lymph nodes were histopathologically assessed in / ferrets in which a necropsy was performed.
fig. . ultrasound image of an enlarged hepatic lymph node in the same ferret as fig. . the hepatic lymph node is markedly enlarged and lobular. although the node is predominantly hypoechoic, there are patchy hyperechoic regions and smaller, round, hypoechoic, nodule-like regions within (arrow). the surrounding fat is hyperechoic, producing a halo around the lymph node (arrowheads). scale at the top of the image is mm between the major ticks. 
lymphoma was confirmed in / ferrets with sonographically abnormal lymph nodes (range . - . mm thick, median . mm, mean ± standard deviation . ± . mm). in / ferrets, lymphoid hyperplasia was identified postmortem ( . mm thick). cytologic evaluation of lymph nodes was not performed in the remaining ferrets. in ferrets with lymphadenopathy, eight had concurrent splenomegaly with splenic nodules ( ), a splenic mass ( ), or a mottled splenic echotexture ( ). lymphoma was identified in the liver of / ferrets, by surgical biopsy ( ) and at necropsy ( ). mild hepatomegaly with a normal echotexture was noted on ultrasound in the ferret with lymphoma diagnosed by biopsy; additional findings in this ferret included mild peritoneal effusion and lymphadenopathy. the ferret with hepatic lymphoma identified at necropsy had no radiographic or reported sonographic liver abnormalities. this ferret also had hepatic lipidosis. imaging findings in this ferret included an aggressive vertebral lesion, splenomegaly with splenic nodules, and lymphadenopathy. lymphoma was confirmed in each of these organs, as well as in the pancreas, which was reportedly normal on ultrasound. in addition to these two ferrets, hepatic histopathology from necropsy was available in four other ferrets. hepatic lipidosis was identified in each of these four ferrets; one ferret also had extramedullary hematopoiesis. none of these four ferrets had radiographic or reported sonographic abnormalities. cytology or histopathology was not available in the remaining / ferrets. of these, / had no radiographic or sonographic abnormalities noted. mild hepatomegaly was noted radiographically in / ferrets; however, sonographic hepatic changes were not noted in these ferrets and cytologic evaluation was not performed.
fig. . ultrasound image of the gastric antrum. the wall of the gastric antrum, especially the muscularis layer, is circumferentially thickened (arrows). a small amount of fluid and gas (*) is present in the lumen. 
on ultrasound, one ferret had moderate hepatomegaly with a hypoechoic, mottled echotexture (radiography was not performed in this ferret). additionally, one ferret had two hypoechoic cystic masses, one of which had central mineralization, detected with ultrasound (radiography was not performed in this ferret). gastrointestinal lymphoma was confirmed at necropsy in two ferrets. in one ferret, thickening of the gastric antrum up to . mm and blurring of wall layering was identified sonographically (fig. ). (to the authors' knowledge, the normal gross or sonographic wall thickness of the gastrointestinal tract in ferrets has not been previously reported.)
[fig. : ultrasound image of the right kidney. there is a round, hypoechoic nodule in the parenchyma, which bulges from the renal contour. the calipers (+) denote the renal length ( ) and margins of the nodule ( , ).]
no abnormalities were noted in the small or large intestines. at necropsy, lymphoma was identified in the stomach and small intestines, in addition to a chronic ulcerative gastroenteritis. in the second ferret, aside from poor abdominal serosal detail attributable to poor body condition and mild anechoic peritoneal effusion, there were no other radiographic or sonographic abnormalities. in this ferret, lymphoma was identified postmortem in the descending colon. the colon was discolored, but there was no reported gross colonic wall thickening. renal masses were present in two ferrets. in each ferret, a single renal mass was detected with ultrasound and was well defined, hypoechoic, centered on the cortex, and protruded from the kidney. the mass was right-sided, rounded, and measured . mm in diameter in one of these two ferrets. in the second ferret, the mass was left-sided, lobular, and measured up to . mm in diameter.
cytology of the right renal mass in the first ferret was not performed; however, the patient received chemotherapy for the treatment of lymphoma, confirmed from aspiration and biopsy of an enlarged lymph node, and the mass was seen to decrease in size during follow-up studies (fig. ) . although the masses in both ferrets had similar ultrasound characteristics, the renal mass in the second ferret was diagnosed as a spindle cell sarcoma. additionally that mass had been identified sonographically year prior to the diagnosis of lymphoma, was progressively increasing in size, and did not decrease in size following administration of chemotherapy for the treatment of lymphoma. concurrent sonographic changes in both ferrets included lymphadenopathy and peritoneal effusion. a large lobular retroperitoneal mass was present in one ferret. the mass was on midline, extending into the right and left sides of the retroperitoneal space, laterally displacing the right kidney. the left kidney was not visualized radiographically. in the right cranial retroperitoneal space, cranial to the right kidney, there was a cluster of heterogeneous mineral opacities in an adjacent second, smaller mass. sonographically the large retroperitoneal mass was heterogeneous, hypoechoic with patchy hyperechoic regions, and laterally displaced both kidneys. the smaller mineralized mass was confirmed to be an enlarged right adrenal gland sonographically. at necropsy, the retroperitoneal mass was confirmed to be lymphoma; however, a specific tissue of origin was not determined. as a normal left adrenal gland could not be identified sonographically or at postmortem, an adrenal origin for this mass was considered most likely, although adrenal tissue was not identified histopathologically within the mass. alternatively the mass may have arisen from a retroperitoneal lymph node or retroperitoneal adnexa. 
concurrent abdominal imaging findings considered incidental to the diagnosed lymphoma included renal cysts ( ), cystic lymph nodes ( ), adrenomegaly in ferrets with diagnosed adrenal disease ( ), and pancreatic nodules in ferrets diagnosed with insulinoma ( ). on thoracic radiographs, pleural fissure lines, consistent with a small volume of pleural effusion, were present in / ferrets. pericardial and pleural effusions were noticed during abdominal ultrasonography in one ferret. possible sternal ( ) and tracheobronchial lymphadenopathy ( ) were seen radiographically. in one ferret, sternal and cranial mediastinal lymphadenopathy were detected with ct. an interstitial pulmonary pattern was present in two ferrets, but was potentially attributable to the radiographic projections being relatively expiratory. aggressive osseous lesions were detected radiographically in three ferrets. the one ferret with a history of lameness had a soft-tissue mass involving the entire right femur with marked, multifocal areas of geographic to moth-eaten, expansile lysis throughout. smooth to mildly irregular periosteal reaction was present along the femoral diaphysis and greater trochanter. the adjacent acetabulum and ilium were questionably involved based on the radiographs. sonographically, the soft-tissue components of the mass were homogeneously hypoechoic. cortical irregularities and disruption, consistent with lysis, were also present. histopathology of the mass following limb amputation was consistent with plasmablastic lymphoma. this ferret was previously described. vertebral lysis was apparent radiographically in two of the three ferrets with t -l myelopathy. in one of these two ferrets, there was geographic lysis of l involving the majority of the vertebral body and a pathologic fracture of the cranial end plate (fig. ). other radiographic changes present in this ferret included splenomegaly and decreased abdominal serosal detail likely due to poor body condition.
on ultrasound, peritoneal effusion, splenomegaly with splenic nodules, and lymphadenopathy were detected. at necropsy, intramedullary lymphoma was found in the l vertebra with epidural extension of the tumor. lymphoma was also found affecting the spleen, liver, pancreas, and mesenteric lymph nodes.
[fig. : right lateral radiograph cropped and centered on l . at l there is geographic lysis, including cortical thinning or loss, of the cranial two-thirds of the vertebral body and the cranial aspect of the pedicles (arrows). the cranial end plate of l has a concave indentation, presumptively secondary to a pathologic fracture (arrow head).]
in the second ferret with vertebral lysis, there was geographic lysis of the cranial two-thirds of the body and pedicles of t . there was also possible lysis of the cranial body and pedicles of l . the dorsal half of the left th rib was lytic and no longer visible. associated with this rib, there was a large, ill-defined, soft-tissue mass, which extended into the thoracic cavity. the adjacent ribs and vertebra were not appreciably involved. additional radiographic findings in this ferret included splenomegaly, hepatomegaly, and abdominal masses consistent with enlarged lymph nodes. with ct, expansile lysis of the left t vertebral body and pedicle was seen associated with a hyperattenuating (to muscle), strongly enhancing mass (fig. ). the mass occupied the ventral two-thirds of the spinal canal and resulted in severe spinal cord compression. a possible pathologic fracture was present in the cranial endplate. at l , there was lysis of the left pedicle and body associated with a mildly compressive, hyperattenuating, contrast-enhancing mass. an additional, similar mass lesion was seen at t , with lysis of the midvertebral body and mild spinal cord compression. the rib mass was isoattenuating (to muscle), heterogeneous, mildly enhancing, and resulted in severe, expansile lysis.
cytology of the rib mass obtained by ultrasound-guided fine-needle aspiration was diagnostic for lymphoma; cytological assessment of the other lesions was not performed in this ferret. one ferret with t -l myelopathy did not have gross skeletal pathology. radiographic changes included splenomegaly and poor, mottled serosal detail. sonographically, a mild peritoneal effusion was present, and the spleen was considered within incidental variation. an mri of the lumbar spine revealed an ill-defined area of suspect intramedullary t w hyperintensity within the spinal cord at the level of l . differential diagnoses for this lesion considered at the time included an artifact, prior infarct, gliosis, edema, myelitis, neoplasia, and hydromyelia. at the postmortem examination performed months after the mri, lymphoma was detected in the brain, meninges, choroid plexus, spinal cord, and extracapsular accessory adrenal tissue. additionally there was multifocal spinal cord malacia and hemorrhage. the lesions in the spinal cord were identified in histopathologic samples obtained at intervals from the cervical spine at c through to the lumbar spine at l , including at the level of l . specific correlation between the suspect mri lesion and histopathologic findings was not performed. splenic changes were consistent with congestion and extramedullary hematopoiesis. multicentric lymphoma was the most common presentation in this study. this is consistent with prior reports in which multicentric lymphoma is the most common presentation in ferrets older than years of age. , , , the most common imaging findings in this study were intra-abdominal lymphadenopathy and splenomegaly with mild-to-moderate peritoneal effusion. lymphadenopathy consisted of multiple enlarged, predominantly intra-abdominal lymph nodes, particularly including the mesenteric lymph node.
only one ferret had peripheral lymphadenopathy, consisting of enlargement of the inguinal and popliteal lymph nodes, in addition to abdominal lymphadenopathy. lymph nodes greater than . mm thick sonographically were generally appreciable radiographically as round to oblong, soft-tissue nodules or masses in their respective locations. of the three ferrets in which sonographically detected lymphadenopathy was not appreciable radiographically, only one had a lymph node thickness greater than . mm. that ferret also had a large, retroperitoneal mass that likely accounted for a lack of visualization of the enlarged mesenteric lymph node due to silhouetting and displacement. previous studies in normal ferrets using ultrasound have reported the normal thickness of mesenteric lymph nodes as . ± . mm and . ± . mm. , given that some radiographically visible lymph nodes measured as small as . mm (which is within the reported normal ranges for mesenteric lymph nodes) it is possible that normal lymph nodes may be radiographically appreciable. in the authors' experiences, however, visualization of normal, small abdominal lymph nodes on radiographs of ferrets is uncommon. normal abdominal lymph nodes are not radiographically distinguishable in dogs and cats. , although some lymph nodes in this study that were considered abnormal measured within the reported normal ranges, there were other changes to those nodes to suggest pathology, such as hypoechogenicity. in dogs and cats, sonographic changes that have been associated with malignancy include an increase in maximal short and long axis diameter (enlarged), an increase in short-to-long axis length ratio (more rounded appearance), hypoechogenicity, hyperechoic perinodal fat with an irregular nodal contour, and heterogeneity. [ ] [ ] [ ] similar to previous reports, the spleen was the most common extranodal site of neoplastic infiltration with lymphoma in the current study. 
in a prior study, splenomegaly was attributable to neoplastic infiltration in % of ferrets with lymphoma and extramedullary hematopoiesis in %. in general, splenomegaly secondary to extramedullary hematopoiesis is common in ferrets. to the authors' knowledge, there is no reference for normal splenic size in ferrets using ultrasound. grossly, the normal spleen has been reported to measure . cm in length, . cm in width, and . cm thick. given that these are gross measurements, however, they were likely obtained postmortem. splenic size is variable and decreases postmortem, so these measurements may not be translatable to antemortem studies with sonographic measurements. the smallest splenic thickness in this study was . cm; using the gross measurement guidelines, all spleens in this study would be considered enlarged. the degree of splenomegaly was therefore subjectively characterized as either within incidental variation or larger than expected for "incidental splenomegaly" based on the authors' experiences. of the seven ferrets in which splenic cytology was available, lymphoma was confirmed in the four ferrets in which the spleen was considered abnormal and cytology was performed. of the three ferrets in which cytology was performed and the spleen was considered within incidental variation, lymphoma was identified in one, and extramedullary hematopoiesis was confirmed in the other two. potential differential diagnoses for multicentric lymphadenopathy with splenomegaly in ferrets include reactive lymphadenopathy secondary to gastrointestinal disease with splenic extramedullary hematopoiesis, systemic mycobacteriosis, granulomatous inflammatory syndrome, and aleutian disease. , [ ] [ ] [ ] [ ] with systemic mycobacteriosis, ferrets can have other lymph nodes affected in addition to the abdominal lymph nodes, with the retropharyngeal lymph nodes being affected as commonly as the mesenteric lymph nodes.
as with lymphoma, clinical signs of mycobacteriosis in the ferret depend on the organs that are affected and can include lethargy, anorexia, vomiting, and diarrhea; but as with other infectious diseases, changes in white blood cell counts can be seen. mycobacteriosis can be diagnosed on cytology and biopsy of the affected lymph node or organ. granulomatous inflammatory syndrome is a newly recognized systemic disease associated with coronavirus that causes inflammation in the spleen and lymph nodes. this syndrome results in a severe granulomatous disease that can affect the gastrointestinal tract, mesenteric lymph nodes, liver, and spleen. unlike lymphoma, it is usually seen in younger ferrets, but like lymphoma, clinical signs are nonspecific and depend on the organ that is affected. patients with this syndrome usually have polyclonal gammopathy that can also be seen with aleutian disease virus and lymphoma. definitive diagnosis requires cytology or biopsy of the affected organs. aleutian disease, caused by a parvovirus, can produce lymphadenopathy and splenomegaly. as with lymphoma and granulomatous inflammatory syndrome, aleutian disease can cause a polyclonal gammopathy. ferrets with this virus usually present with generalized signs of illness (lethargy, weight loss) as well as neurologic signs such as paresis or tremors. aspirates and biopsy samples of lymph nodes and the spleen can be difficult to interpret as the disease causes lymphoplasmacytic inflammation that can be easily confused with other diseases such as small cell lymphoma and epizootic catarrhal enteritis. in the one ferret with colonic lymphoma, there were minimal imaging findings including poor abdominal serosal detail and mild peritoneal effusion. at postmortem, lymphoma with mucosal erosions was detected in the colon. segmental lymphoplasmacytic enteritis was identified in the small intestines. this ferret presented cachectic, hypotensive, and anemic, had melena, and died within h of presentation.
given that melena is referable to upper gastrointestinal bleeding, the clinical findings in this ferret could have been attributable to both helicobacter mustelae gastritis and lymphoma. it is also possible that lymphoma was present in other portions of the gastrointestinal tract, but was not detected postmortem. after the spleen, the next most common extranodal sites of neoplastic involvement with lymphoma in ferrets have been reported to be the liver, kidneys, and lungs. in this study, two ferrets had confirmed hepatic infiltration: one had mild hepatomegaly on ultrasound (subjectively normal on radiographs) and the other had no reported hepatic abnormalities. hepatic lipidosis, identified in four ferrets, was not associated with radiographic or sonographic changes and may have been due to inappetence. given these findings, ultrasound does not appear to be sensitive for the detection of hepatic lymphoma in ferrets. sensitivity of ultrasound for hepatic lymphoma has also been reported to be low in dogs, cats, and humans. , one of two ferrets with renal masses had probable renal lymphoma based on the response to treatment. the second renal mass, a confirmed renal sarcoma, was not sonographically differentiable from the presumptive renal lymphoma. pulmonary involvement was not identified in this study. the small number of individuals in this study precludes extensive comparisons of the affected organ distribution to prior studies. the most common thoracic finding in this study was mild pleural effusion, which was present in four ferrets. there were no ferrets with a mediastinal mass in this study. mediastinal involvement, in general, is more prevalent in ferrets less than years of age, and has been reported to be the more common presentation of lymphoma in that age group with or without concurrent multicentric involvement.
[ ] [ ] [ ] ferrets with mediastinal lymphoma may present for tachypnea or dyspnea secondary to the space-occupying effect of a large mediastinal mass, as well as concurrent pleural effusion. no ferrets in this study were less than years old, which may have accounted for the lack of mediastinal involvement in this cohort. additionally, although this institution also provides primary care to nontraditional small mammal species, it is also a tertiary care facility and the population of ferret patients may not have been representative of the general domestic ferret population. there may have been a selection bias for ferrets with more insidious signs, which tend to occur in ferrets greater than years of age, as opposed to younger ferrets, which may have a more acute and more rapidly progressive presentation. aggressive osseous lesions were present in three ferrets with skeletal lymphoma involvement. to the authors' knowledge, only three other ferrets with osseous involvement have been described previously. , in those ferrets, lytic lesions were present in the tibia, in the lumbar spine, and in the lumbosacral spine. , based on those ferrets and the ferrets in this study, it is possible that the lumbar spine is a predilection site for vertebral lymphoma; however, this remains speculative. an alternate possibility is that lysis may be relatively easier to detect in the lumbar spine where there is less superimposition of structures over the vertebrae, compared to the thoracic vertebrae where the ribs proximally are superimposed on the vertebrae. 
in humans with primary bone lymphoma, three radiographic patterns are described: the lytic-destructive pattern, which is predominantly lytic with or without a lamellated or interrupted periosteal reaction or cortical lysis; the blastic-sclerotic pattern, in which there are mixed lytic and sclerotic regions; and "near-normal" findings, in which there are only subtle radiographic changes and additional imaging (scintigraphic bone scans or mri) is required. osseous lesions seen in the ferrets of this study and the prior reports were similar to the lytic-destructive pattern and had cortical disruption. this is also the pattern most typically seen in canines and felines with osseous involvement from lymphoma or other round cell neoplasms. diffuse central nervous system infiltration with lymphoma was present in one ferret. as lymphoma outside of the central nervous system was only detected in accessory adrenal tissues, this ferret presumably had a primary central nervous system lymphoma. primary central nervous system lymphoma in dogs and cats has not been reported to have an extraparenchymal vs. an intraparenchymal predilection. in humans, lesions with primary central nervous system lymphoma are most frequently intraparenchymal, and metastatic central nervous system lymphoma is more frequently extraparenchymal. , in this ferret, the meninges and choroid plexus were involved in addition to the brain and spinal cord. protracted clinical signs in that ferret consisted of variable paraparesis and lumbar pain over months from the start of clinical signs to the final diagnosis of lymphoma at necropsy. gradual progression of signs and the protracted clinical signs suggest a relatively slow-growing process. prednisone, administered for palliative treatment, was started approximately months after the initial clinical signs. radiographs and ultrasound, performed prior to starting prednisone, had minimal, nonspecific findings.
magnetic resonance imaging, performed month after initiation of the prednisone regimen and months prior to the diagnosis of lymphoma, was inconclusive. administration of prednisone prior to the mri may have resulted in partial regression of lymphoma, therefore making it more difficult to identify; however, the ferret did not demonstrate improvement of the clinical signs, so whether prednisone affected detection of neoplastic infiltration or not is speculative. additionally, the mri was limited in that only t w images were obtained. perhaps if additional sequences were performed, particularly t w postcontrast images, or if a follow-up mri was performed at a later date, meningeal or parenchymal abnormalities may have been detected. also, because the necropsy was performed months following mri, it is likely that the extent of the lesions seen postmortem had progressed compared to the time of imaging. magnetic resonance imaging lesions in dogs and cats with primary central nervous system lymphoma (compared to white matter) have been reported to be predominantly t w hyperintense with indistinct margins, t w hypointense, and contrast enhancing, with perilesional hyperintensity on flair consistent with perilesional edema, and a mass effect. in humans, lesions have similar signal characteristics (compared to white matter), being t w hyperintense, t w iso- to hypointense, and contrast enhancing. , these findings are considered nonspecific in dogs, cats, and humans, and lesions may not be detected with mri at the onset of clinical signs. , two ferrets had no clinical signs referable to lymphoma. in one ferret, the owner palpated a markedly enlarged abdominal lymph node. radiographic and sonographic findings consisted of multicentric lymphadenopathy, peritoneal effusion, a renal mass, a hyperechoic liver, and "incidental splenomegaly." in the other ferret, progressive lymphocytosis was detected during routine treatment and monitoring of adrenocortical disease.
lymphoma was identified in the peritoneal effusion of that ferret. additional sonographic findings included multicentric lymphadenopathy, splenomegaly with splenic nodules, a cystic hepatic mass, and a renal mass (sarcoma). adrenal disease was a common comorbidity seen with lymphoma, as found in other studies. this is likely because adrenal disease is common in older ferrets in general. , , other relatively common comorbidities found in this study were cardiovascular disease and cutaneous mast cell tumors, both of which also commonly occur in older ferrets. , one ferret had a history of granulomatous lymphadenitis suspected to be secondary to mycobacteriosis. although this study describes the imaging findings in a small number of ferrets with lymphoma, it provides an important source of information for practicing clinicians. the small number of ferrets able to be included during the time frame of the study likely reflects that imaging is not performed in every ferret with suspected or confirmed lymphoma, and that a definitive diagnosis was not always attained prior to treatment in individuals with suggestive clinical and imaging findings. ultrasound-guided aspirates of lymph nodes, spleens, and aggressive osseous lesions performed in ferrets of this study were each diagnostic for or strongly suggestive of lymphoma. although aspirates are often the initial tissue sampling procedure, previous reports have cautioned against the use of lymph node aspirates in the diagnosis of lymphoma, as inflammatory and reactive changes may be misinterpreted as lymphoma. , this is particularly true of the gastric lymph node in ferrets with gastrointestinal signs. a false positive diagnosis of lymphoma is considered unlikely to have occurred in the ferrets included in this study. lack of a definitive diagnosis (i.e., false negative results) from aspirate samples likely resulted in exclusion of some individuals from this study.
analysis of the frequency of misdiagnosis and nonconfirmatory aspirate samples in patients with lymphoma was not performed. this study was also limited in that histopathology was not performed on all organs in each individual, and therefore, whether or not the changes seen were each attributable to lymphoma cannot be confirmed. additionally, because ultrasound findings were based on the reports and images obtained, some structures were unable to be reassessed. this is particularly the case in which multiple lymph nodes were affected. images of each lymph node may not have been attained, the imaging report may not have been complete in describing which nodes were affected, and measurements performed retrospectively on the available static images may not have reflected the actual maximal nodal thickness in that individual. in conclusion, findings from the current study indicated that imaging characteristics of lymphoma in ferrets are similar to those previously reported for dogs, cats, and humans. lymphoma may most commonly be multicentric in ferrets. imaging findings frequently included intra-abdominal lymphadenopathy, splenomegaly, and peritoneal effusion. lymphadenopathy and mass lesions were typically hypoechoic on ultrasound. osseous lesions, when present, were predominantly lytic. lack of imaging abnormalities did not preclude the diagnosis of lymphoma.
ferrets, rabbits, and rodents. saint louis: w.b. saunders
hematopoietic diseases
ferret lymphoma: the old and the new
neoplastic diseases in ferrets: cases ( - )
clinical and pathologic findings in ferrets with lymphoma: cases
malignant lymphoma in ferrets: clinical and pathological findings in cases
ferrets: examination and standards of care
diagnosis and treatment of myelo-osteolytic plasmablastic lymphoma of the femur in a domestic ferret
t cell lymphoma in the lumbar spine of a domestic ferret (mustela putorius furo)
t-cell lymphoma in a ferret (mustela putorius furo)
malignant b-cell lymphoma with mott cell differentiation in a ferret (mustela putorius furo)
cytomorphological and immunohistochemical features of lymphoma in ferrets
ultrasonographic anatomy of the abdominal lymph nodes of healthy european ferrets
ultrasonography and fine needle aspirate cytology of the mesenteric lymph node in normal domestic ferrets (mustela putorius furo)
bsava manual of canine and feline abdominal imaging. gloucester: british small animal veterinary association
the peritoneal space
characterization of normal and abnormal canine superficial lymph nodes using gray-scale b-mode, color flow mapping, power, and spectral doppler ultrasonography: a multivariate study
observations upon the size of the spleen
splenomegaly in the ferret. gainesville: eastern states veterinary association
ferret coronavirus-associated diseases
aleutian disease in the ferret
mycobacterial infection in the ferret
gastrointestinal diseases
diagnostic accuracy of gray-scale ultrasonography for the detection of hepatic and splenic lymphoma in dogs
ultrasonographic findings in hepatic and splenic lymphosarcoma in dogs and cats
primary bone lymphoma: radiographic-mr imaging correlation
mri features of cns lymphoma in dogs and cats
primary cns lymphoma in the spinal cord: clinical manifestations may precede mri detectability
bienzle d. laboratory findings, histopathology, and immunophenotype of lymphoma in domestic ferrets
the senior ferret (mustela putorius furo)
key: cord- -tw armh
authors: ma, junling; van den driessche, p.; willeboordse, frederick h.
title: the importance of contact network topology for the success of vaccination strategies
date: - -
journal: journal of theoretical biology
doi: . /j.jtbi. . .
sha: doc_id: cord_uid: tw armh
abstract: the effects of a number of vaccination strategies on the spread of an sir type disease are numerically investigated for several common network topologies including random, scale-free, small world, and meta-random networks. these strategies, namely, prioritized, random, follow links and contact tracing, are compared across networks using extensive simulations with disease parameters relevant for viruses such as pandemic influenza h n / . two scenarios for a network sir model are considered. first, a model with a given transmission rate is studied. second, a model with a given initial growth rate is considered, because the initial growth rate is commonly used to impute the transmission rate from incidence curves and to predict the course of an epidemic. since a vaccine may not be readily available for a new virus, the case of a delay in the start of vaccination is also considered in addition to the case of no delay. it is found that network topology can have a larger impact on the spread of the disease than the choice of vaccination strategy. simulations also show that the network structure has a large effect on both the course of an epidemic and the determination of the transmission rate from the initial growth rate. the effect of delay in the vaccination start time varies tremendously with network topology. results show that, without the knowledge of network topology, predictions on the peak and the final size of an epidemic cannot be made solely based on the initial exponential growth rate or transmission rate.
this demonstrates the importance of understanding the topology of realistic contact networks when evaluating vaccination strategies.
the importance of contact network topology for the success of vaccination strategies
for many viral diseases, vaccination forms the cornerstone in managing their spread, and the question naturally arises as to which vaccination strategy is, given practical constraints, the most effective in stopping the disease spread. for evaluating the effectiveness of a vaccination strategy, it is necessary to have as precise a model as possible for the disease dynamics. the widely studied key reference models for infectious disease epidemics are the homogeneous mixing models, where any member of the population can infect or be infected by any other member of the population; see, for example, anderson and may ( ) and brauer ( ). the advantage of a homogeneous mixing model is that it lends itself relatively well to analysis and therefore is a good starting point. due to the homogeneity assumption, these models predict that the fraction of the population that needs to be vaccinated to curtail an epidemic is equal to 1 − 1/r₀, where r₀ is the basic reproduction number (the average number of secondary infections caused by a typical infectious individual in a fully susceptible population). however, the homogeneous mixing assumption poorly reflects the actual interactions within a population, since, for example, school children and office co-workers spend significant amounts of time in close proximity and therefore are much more likely to infect each other than an elderly person who mostly stays at home. consequently, efforts have been made to incorporate the network structure into models, where individuals are represented by nodes and contacts are represented by edges. in the context of the severe acute respiratory syndrome (sars), it was shown by meyers et al.
( ) that the incorporation of contact networks may yield different epidemic outcomes even for the same basic reproduction number r₀. for pandemic influenza h n / , pourbohloul et al. ( ) and davoudi et al. ( ) used network theory to obtain a real-time estimate for r₀. numerical simulations have shown that different networks can yield distinct disease spread patterns; see, for example, bansal et al. ( ), miller et al. ( ), and section . in keeling and rohani ( ). to illustrate this difference for the networks and parameters we use, the effect of different networks on disease dynamics is shown in fig. . descriptions of these networks are given in section and appendix b. at the current stage, most theoretical network infectious disease models incorporate, from a real world perspective, idealized random network structures such as regular (all nodes have the same degree), erdős–rényi, or scale-free random networks where clustering and spatial structures are absent. for example, volz ( ) used a generating function formalism (an alternate derivation with a simpler system of equations was recently found by miller, ), while we used the degree distribution in the effective degree model presented in lindquist et al. ( ). in these models, the degree distribution is the key network characteristic for disease dynamics. from recent efforts (ma et al., ; volz et al., ; moreno et al., ) on incorporating degree correlation and clustering (such as households and offices) into epidemic models, it has been found that these may significantly affect the disease dynamics for networks with identical degree distributions. fig. shows disease dynamics on networks with identical degree distribution and disease parameters, but with different network topologies. clearly, reliable predictions of the epidemic process that only use the degree distribution are not possible without knowledge of the network topology.
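The role of the degree distribution described above can be made concrete. Under homogeneous mixing, the critical vaccination fraction is 1 − 1/r₀, and for a configuration-model random network a standard mean-field estimate (in the style of the excess-degree calculation from percolation theory, not a formula taken from this paper) gives r₀ = T(⟨k²⟩ − ⟨k⟩)/⟨k⟩ for per-edge transmissibility T, so a fat-tailed degree distribution raises r₀ even at the same mean degree. A minimal sketch; the degree sequences and the value of T are illustrative:

```python
def critical_vaccination_fraction(r0):
    """Homogeneous mixing: fraction that must be immunized to curtail an epidemic."""
    return max(0.0, 1.0 - 1.0 / r0)

def network_r0(degrees, transmissibility):
    """Mean-field R0 on a configuration-model network:
    R0 = T * (<k^2> - <k>) / <k>, which grows with the variance of the degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return transmissibility * (mean_k2 - mean_k) / mean_k

# Two degree sequences with the same mean degree (4) but different variance.
regular = [4] * 100
fat_tailed = [1] * 75 + [13] * 25   # mean = (75*1 + 25*13)/100 = 4

print(network_r0(regular, 0.5))     # 0.5 * (16 - 4) / 4 = 1.5
print(network_r0(fat_tailed, 0.5))  # 0.5 * (43 - 4) / 4 = 4.875
print(critical_vaccination_fraction(2.0))  # 0.5
```

The heterogeneous sequence more than triples the estimated r₀ at identical mean degree, which is exactly why highly connected individuals are attractive targets for control measures.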
Such predictions need to be checked by considering other topological properties of the network. Network models allow more precise modeling of control measures that depend on the contact structure of the population, such as priority-based vaccination and contact tracing. For example, Shaban et al. consider a random graph with a pre-specified degree distribution to investigate vaccination models using contact tracing. Kiss et al. compared the efficacy of contact tracing on random and scale-free networks and found that, for transmission rates greater than a certain threshold, the final epidemic size is smaller on a scale-free network than on a corresponding random network; in a companion study they considered the effects of degree correlations. Cohen et al. (see also Madar et al.) considered different vaccination strategies on scale-free networks and found that acquaintance immunization is remarkably effective. Miller and Hyman considered several vaccination strategies on a simulation of the population of Portland, Oregon, USA, and found it most effective to vaccinate the nodes with the most unvaccinated susceptible contacts, although this strategy may not be practical because it requires considerable computational resources and information about the network. Bansal et al. took a contact network built from data for Vancouver, BC, Canada, considered two vaccination strategies (mortality- and morbidity-based), investigated the detrimental effect of vaccination delays, and found that, on realistic contact networks, vaccination strategies based on detailed network topology information generally outperform random vaccination. However, in most cases, contact network topologies are not readily available. Thus, how different network topologies affect various vaccination strategies remains of considerable interest.
To address this question, we explore two scenarios to compare the percentage reduction, due to vaccination, of the final size of epidemics across various network topologies. First, the various network topologies are considered with the disease parameters held constant, assuming that these have been independently estimated. Second, the different network topologies are fitted to the observed incidence curve (the number of new infections each day), so that their disease parameters differ yet they all line up with the same initial exponential growth phase of the epidemic. Vaccines are likely to be lacking at the outbreak of an emerging infectious disease (as seen in the H1N1 pandemic; Conway et al.), and thus can only be given after the disease is already widespread. We investigate numerically whether network topologies affect the effectiveness of vaccination strategies started with a delay after the disease is widespread; for example, the delay that occurred in the second wave of the influenza pandemic in British Columbia, Canada (Office of the Provincial Health Officer). Details of our numerical simulations are given in Appendix A.

This paper is structured as follows. A brief overview of the networks and vaccination strategies is given first (more details are provided in Appendices B and C). We then investigate the scenario where the transmission rate is fixed, followed by the scenario where the growth rate of the incidence curve is fixed. To this end, we compute the incidence curves and the reductions in final sizes (the total number of infections during the course of the epidemic) due to vaccination. For the homogeneous mixing model, these two scenarios are identical (Ma and Earn), but as will be shown, they are completely different once topology is taken into account. We end with conclusions.

[Fig.: incidence curves on the different networks. On all networks, the average degree, population size, transmission rate, recovery rate, and initial number of infectious individuals are identical. Both graphs represent the same data, but the left graph has a semi-log scale (highlighting the growth phase) while the right graph has a linear scale (highlighting the peak). Panel (b) shows dynamics on networks with identical disease parameters and degree distribution (shown in (a)); the network topologies are the random, meta-random, and near neighbor networks. See Appendix B for details of the construction of these networks.]

Detailed network topologies for human populations are far from known. However, this detailed knowledge may not be required when the main objective is to assess the impact that topology has on the spread of a disease and on the effects of vaccination. It may be sufficient to consider a number of representative network topologies that, at least to some extent, can be found in the actual population. Here, we consider the four topologies listed in Table , which we now briefly describe. In the random network, nodes are connected with equal probability, yielding a Poisson degree distribution. In a scale-free network, a small number of nodes have a very large number of links and a large number of nodes have a small number of links, such that the degree distribution follows a negative power law. Small world (SW) networks are constructed by adding links between randomly chosen nodes of a network in which each node is connected to its nearest neighbors. The last network considered is what we term a meta-random network, in which random networks of various sizes are connected by a small number of interlinks. All networks are undirected, with no self-loops or multiple links. The degree histograms of the networks are shown in Table , and the details of their construction are given in Appendix B.

The vaccination strategies considered are summarized in Table . In the random strategy, an eligible node is randomly chosen and vaccinated.
In the prioritized strategy, nodes with the highest degrees are vaccinated first. In the follow links strategy, inspired by notions from social networks, a randomly chosen susceptible node is vaccinated, then all its neighbors, then its neighbors' neighbors, and so on. Finally, in contact tracing, the neighbors of infectious nodes are vaccinated. For all the strategies, vaccination is voluntary and quantity limited; that is, only susceptibles who do not refuse vaccination are vaccinated, and each day only a certain number of doses is available. In the case of (relatively) new viral diseases, the supply of vaccines will almost certainly be constrained, as was the case for the pandemic influenza H1N1 virus. Also, in the case of mass vaccinations, there are resource limitations on how many doses can be administered per day. The report of the Office of the Provincial Health Officer states that the vaccination program was prioritized and that it took weeks before the general population had access to vaccination. We therefore assume that a vaccination program can be completed within a number of weeks; for a given population size, this fixes the maximum number of doses that can be dispensed per day. For each strategy, in each time unit, first a group of eligible nodes is identified, and then up to the maximum number of doses is dispensed among the eligible nodes according to the strategy chosen. More details of the vaccination strategies and their motivations are given in Appendix C.

To study the effect of delayed availability of vaccines during an emerging infectious disease, we compare vaccination programs starting on the first day of the epidemic with programs starting later. These delays range over a number of days after the start of the epidemic, with an emphasis on the delay that occurred in British Columbia, Canada, during the influenza H1N1 pandemic.
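The dose-cap arithmetic described above (a campaign window of fixed length determining the maximum number of doses per day) can be sketched as follows; the population size and campaign length in the example are placeholders, since the paper's actual figures are elided in this copy.

```python
def daily_dose_cap(population, campaign_days):
    """Doses needed per day so that the whole population could be reached
    within the campaign window (ceiling division). The example values are
    placeholders, not the paper's elided figures."""
    return -(-population // campaign_days)

# e.g. a hypothetical population of 10,000 and a 50-day campaign
cap = daily_dose_cap(10_000, 50)
```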
When a node is vaccinated, the vaccination is considered to be ineffective in a fixed percentage of cases (Bansal et al.); in such cases, the vaccine provides no immunity at all. For the nodes for which the vaccine is effective, a two-week span to reach full immunity is assumed (Clark et al.): during these two weeks, immunity increases linearly, starting at zero at the time of vaccination and reaching 100% after 14 days.

The effect of vaccination strategies has been studied (see, for example, Conway et al.) using disease parameter values estimated in the literature; however, network topologies were not the focus of those studies. Below, the effect of vaccination strategies on various network topologies is first compared for a fixed per-link transmission rate. The per-link transmission rate β is difficult to obtain directly and is usually derived as a secondary quantity. To determine β, we pick a basic reproduction number R0 and a recovery rate γ close to those of the influenza A H1N1 virus; see, for example, Pourbohloul et al. and Tuite et al. In the homogeneous mixing SIR model, the basic reproduction number is given by R0 = T/γ, where T is the per-node transmission rate; our parameter values yield T accordingly. For networks, T = β⟨k⟩, where ⟨k⟩ is the average degree (equal to five throughout), which gives the per-link transmission rate β = T/⟨k⟩. The key parameters are summarized in Table . We use this transmission rate to compare the incidence curves for the networks of Table under the vaccination strategies of Table .

[Table: illustration of the different types of networks used in this paper: random, scale-free, small world, meta-random.]
[Table: degree histograms of these networks.]

Some of the most readily available data in an epidemic are the numbers of reported new cases per day.
These cases generally display exponential growth in the initial phase of an epidemic, and a suitable model therefore needs to match this initial growth pattern. Exponential growth rates are commonly used to estimate disease parameters (Chowell et al.; Lipsitch et al.). Later, we consider the effects of the various network topologies on the effectiveness of vaccination strategies for epidemics with a fixed exponential growth rate. For the homogeneous mixing SIR model, the chosen basic reproduction number R0 and recovery rate γ yield an exponential growth rate λ = T − γ. We tune the transmission rate for each network topology to give this initial growth rate.

In this section, the effectiveness of vaccination strategies on various network topologies is investigated for a given set of parameters, identical for all simulations. The disease parameter values are chosen based on what is known about influenza H1N1; qualitatively, these parameters should provide substantial insight into the effects topology has on the spread of a disease. Unless indicated otherwise, the parameter values listed in Table are used. The effects of the vaccination strategies summarized in Table , applied without delay, are shown in Fig. ; for reference, Fig. shows the incidence curves with no vaccination. Since the disease dies out on the small world network for the parameter values taken (see Fig. ), vaccination is not needed on this network. The effects of vaccination are drastic on the random and meta-random networks, and still considerable on the scale-free network. Particularly notable is that, comparing the various outcomes, topology has an impact on the epidemic as great as, if not greater than, the vaccination strategy.

Besides the incidence curves, the final sizes of epidemics, and the effect vaccination has on these, are also of great importance.
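The parameter relations used above (R0 = T/γ for the homogeneous mixing SIR model, T = β⟨k⟩ for networks, and λ = (R0 − 1)γ = T − γ) can be collected in a short sketch. The numeric values below are illustrative only, since the paper's values are elided in this copy.

```python
def per_link_beta(r0, gamma, mean_degree):
    """Per-link rate from R0 = T/gamma (homogeneous SIR) and T = beta*<k>."""
    t = r0 * gamma            # per-node transmission rate T
    return t / mean_degree    # beta = T / <k>

def growth_rate(r0, gamma):
    """Initial exponential growth rate lambda = (R0 - 1)*gamma = T - gamma."""
    return (r0 - 1.0) * gamma

# Illustrative values only (the paper's figures are elided in this copy):
beta = per_link_beta(1.5, 0.25, 5.0)   # 0.075
lam = growth_rate(1.5, 0.25)           # 0.125
```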
Table shows the final sizes, and the reductions in final sizes, for the various networks on which the disease can survive (for the chosen parameter values), under the different vaccination strategies with no delay in the vaccination. Fig. and Table show the incidence curves and the reductions in final sizes for the same parameters, but with a delayed start of the vaccination. As can be expected for the given parameters, the delay has the biggest effect on the scale-free network: in that case, the epidemic is already past its peak and vaccination has only a minor effect. For the random and meta-random networks, the delay has a smaller effect.

[Table: illustration of the vaccination strategies (random, prioritized, follow links, contact tracing). Susceptible nodes are depicted by triangles, infectious nodes by squares, and vaccinated nodes by circles. The average degree in these illustrations has been reduced to aid clarity. The starting point for contact tracing is labeled A, while the starting point for the follow links strategy is labeled B.]
[Table: reductions in final sizes for the network topologies of Table given a fixed transmission rate β; there is no delay in the vaccination, and the parameters are equal to those used in Fig. .]

To further investigate the effects of delay in the case of random vaccination, we compute the reductions in final sizes for a range of delays on the random, scale-free, and meta-random networks. Fig. shows that, not surprisingly, these reductions diminish with longer delays. However, the reductions are strongly network dependent: on the scale-free network, the reduction becomes negligible as the delay approaches the epidemic peak time, while on the random and meta-random networks a substantial reduction remains even with a delay equal to the epidemic peak time.
This section clearly shows that, given a certain transmission rate β, the effectiveness of a vaccination strategy is impossible to predict without reliable data on the network topology of the population.

Next, we consider the case where the initial growth rate, rather than the transmission rate, is given. We line up the incidence curves on the various network topologies with the growth rate λ predicted by the homogeneous mixing SIR model with the chosen basic reproduction number R0 and recovery rate γ (in this case λ = (R0 − 1)γ). Table summarizes the transmission rates that yield this exponential growth rate on the corresponding network topologies. The initial number of infectious individuals for the model on each network topology needs to be adjusted as well, so that the curves line up along the homogeneous mixing SIR incidence curve during the initial days. As can be seen from the table, the variations in the parameters are indeed very large, with the transmission rate for the small world network being several times the value for the scale-free network. The incidence curves corresponding to these parameters are shown in Fig. . As can clearly be seen, the curves overlap very well in the initial days, demonstrating the desired identical initial growth rates. However, the curves diverge strongly later on, with the epidemic on the small world network being the most severe. These results show that the spread of an epidemic cannot be predicted on the basis of a good estimate of the growth rate alone. In addition, comparing the two scenarios, the higher transmission rate yields a much larger final size and a longer epidemic on the meta-random network.

The effects of the various vaccination strategies for the case of a given growth rate are shown in Fig. . Given the large differences in the transmission rates, it may be expected that the final sizes show significant differences as well.
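The tuning step described above (choosing, for each topology, the transmission rate that reproduces a target initial growth rate) can be sketched as a bisection on β. This is a sketch under the assumption that the measured growth rate increases monotonically with β; `measure_growth` is a stand-in for a simulation that estimates the initial growth rate for a given β, not the authors' code.

```python
def calibrate_beta(measure_growth, target_lambda, lo=1e-4, hi=1.0, tol=1e-6):
    """Bisection for the per-link rate beta whose measured initial growth
    rate matches target_lambda. measure_growth(beta) is assumed to be
    monotone increasing in beta (a sketch of the tuning step, not the
    authors' code)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure_growth(mid) < target_lambda:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# With the homogeneous-mixing relation lambda(beta) = beta*<k> - gamma
# (illustrative values <k> = 5, gamma = 0.25), target lambda = 0.125
# recovers beta = 0.075.
beta_hat = calibrate_beta(lambda b: b * 5.0 - 0.25, 0.125)
```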
This is indeed the case, as can be seen in Table , which shows the percentage reductions in final sizes for the various vaccination strategies. With no vaccination, the final size on the small world network is several times that on the scale-free network, yet for all but the follow links strategy the percentage reduction on the small world network is greater. The effects of a delay in the start of the vaccination are shown in Fig. and Table ; apart from the delay, all parameters are identical to those in Fig. and Table . The delay has the largest effect on the final sizes on the small world network, increasing them severalfold except in the follow links case. On the scale-free network, the delay renders all vaccination strategies nearly ineffective. These results confirm the importance of network topology in disease spread even when the incidence curves have identical initial growth: the initial stages of an epidemic are insufficient to estimate the effectiveness of a vaccination strategy in reducing the peak or final size of an epidemic.

The relative importance of network topology for the predictability of incidence curves was investigated. This was done by considering whether the effectiveness of several vaccination strategies is affected by topology, and whether the growth in the daily incidence has a topology-independent relation with the disease transmission rate. It was found that, without fairly detailed knowledge of the network topology, initial data cannot predict epidemic progression. This holds both for a given transmission rate β and for a given growth rate λ. For a fixed transmission rate, and thus a fixed per-link transmission probability, a disease spreading on networks with a fixed average degree spreads fastest on scale-free networks, because high degree nodes have a very high probability of being infected early in the epidemic.
In turn, once a high degree node is infected, it passes the infection on, on average, to a large number of neighbors. The random and meta-random networks show identical initial growth rates because they have the same local network topology.

[Fig.: incidence curves without vaccination for the case where the initial growth rate is given. The transmission rates and initial numbers of infections for the various network topologies are given in Table , while the remaining parameters are the same as in Fig. .]
[Fig.: the effects of the vaccination strategies for different topologies when the initial growth rate is given. The transmission rates β are as indicated in Table , while the remaining parameters are identical to those in Fig. .]

On different network topologies, diseases respond differently to parameter changes. For example, on the random network, a higher transmission rate yields a much shorter epidemic, whereas on the meta-random network it yields a longer one, with a more drastic increase in final size. These differences are caused by the spatial structure of the meta-random network. Considering that a meta-random network is a random network of random networks, it is likely that the meta-random network represents a general population better than a plain random network. For a fixed exponential growth rate, the transmission rate needed on the scale-free network to yield the given initial growth rate is the smallest, about half that of the random and the meta-random networks. Hence, the per-link transmission probability is lowest on the scale-free network, which in turn yields a small epidemic final size.

For the different network topologies, we quantified the effect of a delay in the start of vaccination. We found that the effectiveness of vaccination strategies decreases with delay, at a rate strongly dependent on network topology. This emphasizes the importance of knowledge of the topology when formulating a practical vaccination schedule.
With respect to policy, the results presented seem to warrant a significant effort to obtain a better understanding of how the members of a population are actually linked together in a social network. In the meantime, policy advice based on rough estimates of the network structure should be viewed with caution.

This work was partially supported by NSERC Discovery Grants (JM, PvdD) and Mprime (PvdD). We thank the anonymous reviewers for their constructive comments.

The nodes in the network are labeled by their infectious status: susceptible, infectious, vaccinated, immune, refusing vaccination (but susceptible), or vaccinated but susceptible (the vaccine not working). The stochastic simulation is initialized by first labeling all nodes as susceptible and then randomly labeling i0 nodes as infectious. Then, before the simulation starts, a fixed percentage of susceptible nodes are labeled as refusing vaccination (but susceptible). During the simulation, when a node is vaccinated, the vaccine has a fixed probability of being ineffective. If it is not effective, the node remains fully susceptible, but will not be vaccinated again. If it is effective, immunity is built up linearly over a certain period of time, taken as two weeks. We assume that infected persons generally recover in a few days, giving the recovery rate γ. The initial number of infectious individuals i0 is set to a small fixed value unless otherwise stated, chosen to reduce the number of runs in which the disease dies out due to statistical fluctuations. All simulation results presented are averages over many runs, each with a newly randomly generated network of the chosen topology. The parameters used in the simulations are shown in Table .
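The vaccination bookkeeping described above can be sketched as follows. The vaccine failure probability is passed as a parameter because the percentage is elided in this copy; the two-week linear immunity ramp is as stated in the text; the node representation and function names are ours.

```python
import random

def immunity(days_since_vaccination, ramp_days=14):
    """Linear immunity build-up over the two-week ramp stated in the text:
    0 at vaccination, full protection after ramp_days days."""
    return min(1.0, max(0.0, days_since_vaccination / ramp_days))

def vaccinate(node, failure_prob, rng):
    """Vaccinate a susceptible node. With probability failure_prob (a
    placeholder for the percentage elided in this copy) the dose takes no
    effect and the node stays fully susceptible, but it is not vaccinated
    again; otherwise the immunity ramp starts."""
    if rng.random() < failure_prob:
        node["status"] = "vaccinated_but_susceptible"
    else:
        node["status"] = "vaccinated"
        node["days_since_vax"] = 0
```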
The population size n was chosen to be sufficiently large to be representative of a medium-size town, the average degree is taken as ⟨k⟩ = 5, and a maximum degree m is imposed (the maximum degree only affects the scale-free network, since the probability of a node having degree m is practically zero for the other network types).

When considering a large group of people, a good first approximation is that the links between these people are random. Although this clearly cannot represent the population accurately, since it lacks, for example, the clustering and spatial aggregation found in such common contexts as schools and workplaces, it is possible that, if the population is big enough, most if not all non-random effects average out. Furthermore, random networks lend themselves relatively well to analysis, so that a number of interesting (and testable) properties can be derived. As is usually the case, the random network employed here originates from the concepts first presented rigorously by Erdős and Rényi. Our random networks are generated as follows:
(1) Create n unlinked nodes.
(2) To avoid orphaned nodes, first link every node to another uniformly randomly chosen node that is not yet a neighbor.
(3) Uniformly randomly select two nodes that are not already linked. If the degree d of both nodes is less than the maximum degree m, establish a link; if one of the nodes has maximum degree m, select a new pair of nodes.
(4) Repeat step (3) n⟨k⟩/2 − n times.
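A minimal sketch of this construction (the function name and the link-count bookkeeping are ours):

```python
import random

def random_network(n, mean_degree, max_degree, seed=None):
    """Random network following the appendix recipe: every node first
    receives one link (no orphans), then uniformly random pairs are
    linked, respecting max_degree, until n*<k>/2 links exist in total."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for i in range(n):                      # step (2): avoid orphaned nodes
        j = rng.randrange(n)
        while j == i or j in adj[i]:
            j = rng.randrange(n)
        adj[i].add(j)
        adj[j].add(i)
    links = sum(len(a) for a in adj) // 2
    target = n * mean_degree // 2           # total links for average degree <k>
    while links < target:                   # steps (3)-(4)
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j or j in adj[i]:
            continue
        if len(adj[i]) < max_degree and len(adj[j]) < max_degree:
            adj[i].add(j)
            adj[j].add(i)
            links += 1
    return adj
```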
When considering certain activities in a population, such as the publishing of scientific work or sexual contacts, it has been found that the links are often well described by a scale-free network structure, in which the relationship between the degree and the number of nodes with that degree follows a negative power law; see, for example, the review by Albert and Barabási. Scale-free networks can easily be constructed with the help of preferential attachment: the network is built up step by step, and new nodes attach to existing nodes with a probability proportional to the degrees of the existing nodes. Our network is constructed with the help of preferential attachment, but with two modifications made to render the scale-free network more comparable with the other networks investigated here. First, the maximum degree is limited to m, not by restricting the degree from the outset, but by first creating a scale-free network and then pruning all nodes with a degree larger than m. Second, the number of links attached to each new node is either two or three, with a probability set such that after pruning the average degree is very close to that of the random network (i.e. ⟨k⟩ = 5). Our scale-free network is generated as follows:
(1) Start with three fully connected nodes and set the total number of links to L = 3.
(2) Create a new node. With a certain probability, add three links; otherwise add two. For each of these links, find a node to link to as outlined in step (3).
(3) Loop through the list of nodes and create a link with probability d/(2L), where d is the degree of the currently considered target node.
(4) Increase L by two or three, depending on the choice in step (2).
(5) Repeat steps (2)-(4) n − 3 times.
(6) Prune all nodes with a degree larger than m.

Small world networks are characterized by the combination of a relatively large number of local links with a small number of non-local links.
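The preferential-attachment construction described above can be sketched as follows. The probability of adding three rather than two links is a placeholder for the elided value, and the stub list simply makes the d/(2L) attachment probability explicit (each link contributes two stubs, so drawing uniformly from the stub list selects a node with probability proportional to its degree).

```python
import random

def scale_free_network(n, max_degree, p_three=0.5, seed=None):
    """Preferential attachment per the appendix sketch. Each new node
    attaches 2 or 3 links (3 with probability p_three, a placeholder for
    the elided value); targets are drawn with probability d/(2L) via a
    stub list; nodes exceeding max_degree are pruned afterwards."""
    rng = random.Random(seed)
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}   # three fully connected nodes, L = 3
    stubs = [0, 0, 1, 1, 2, 2]                # each link contributes two stubs
    for new in range(3, n):
        k = 3 if rng.random() < p_three else 2
        targets = set()
        while len(targets) < k:               # distinct, degree-biased targets
            targets.add(rng.choice(stubs))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            stubs += [new, t]                 # stub list tracks 2L
    doomed = {v for v in adj if len(adj[v]) > max_degree}   # pruning step
    return {v: nbrs - doomed for v, nbrs in adj.items() if v not in doomed}
```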
Consequently, there is in principle a very large number of possible small world networks. One of the simplest ways to create a small world network is to first place nodes sequentially on a circle and couple them to their neighbors, similar to the way many coupled map lattices are constructed (Willeboordse), and to then create some random shortcuts. This is essentially how the small world network used here is generated. The only modification is that the coupling range (i.e. the number of neighbors linked to on each side) is randomly varied between two and three in order to obtain an average degree equal to that of the random network (i.e. ⟨k⟩ = 5). We also use periodic boundary conditions, which, while not strictly necessary for a small world network, is commonly done. The motivation for studying small world networks is that small groups of people in a population are often (almost) fully linked (such as family members or co-workers), with some connections to other groups of people. Our small world network is generated as follows:
(1) Create n new unlinked nodes with index i = 1, ..., n.
(2) With a certain probability, link node i to its nearest and second-nearest neighbors (i.e. create the links i↔i−1, i↔i+1, i↔i−2, i↔i+2); otherwise, also link it to its third-nearest neighbors (i.e. additionally create the links i↔i−3, i↔i+3). Periodic boundary conditions are used (the left nearest neighbor of node 1 is node n, while the right nearest neighbor of node n is node 1).
(3) Create the 'large world' network by repeating step (2) for each node.
(4) With a small probability, add a link to a uniformly randomly chosen node, excluding the node itself and nodes already linked to.
(5) Create the small world network by carrying out step (4) for each node.

In the random network, the probability for an arbitrary node to be linked to any other arbitrary node is constant, and there is no clear notion of locality.
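The ring-plus-shortcuts construction of the small world network above can be sketched as follows; both probabilities are placeholders for values elided in this copy.

```python
import random

def small_world_network(n, p_range3=0.5, p_shortcut=0.01, seed=None):
    """Ring lattice with shortcuts per the appendix sketch: each node links
    to its 2 (or, with probability p_range3, 3) nearest neighbors on each
    side with periodic boundaries; shortcuts are then added with
    probability p_shortcut per node. Both probabilities are placeholders
    for values elided in the text."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for i in range(n):                        # local ('large world') links
        reach = 3 if rng.random() < p_range3 else 2
        for d in range(1, reach + 1):
            j = (i + d) % n                   # periodic boundary conditions
            adj[i].add(j)
            adj[j].add(i)
    for i in range(n):                        # random shortcuts
        if rng.random() < p_shortcut:
            j = rng.randrange(n)
            if j != i and j not in adj[i]:
                adj[i].add(j)
                adj[j].add(i)
    return adj
```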
In the small world network, on the other hand, tightly integrated local connections are supplemented by links to other parts of the network. To model a situation in between, where randomly linked local populations (such as the populations of villages in a region) are randomly linked to each other (for example, some members of the population of one village are linked to some members of other villages), we consider a meta-random network. When the number of shortcuts is increased, a meta-random network transitions to a random network. It can be argued that, among the networks investigated here, the meta-random network is the most representative of the population of a state, province, or country. Our meta-random network is generated as follows:
(1) Create n new unlinked nodes with index i = 1, ..., n.
(2) Group the nodes into randomly sized clusters with a certain minimum size (chosen to be larger than ⟨k⟩, which equals five throughout, to exclude fully linked graphs). This is done by randomly choosing values in the range from 1 to n to serve as cluster boundaries, with the restriction that a cluster cannot be smaller than the minimum size.
(3) For each cluster, create an Erdős–Rényi-type random network.
(4) For each node, with a small probability, create a link to a uniformly randomly chosen node of a uniformly randomly chosen cluster, excluding its own cluster.

The network described in this subsection is a near neighbor network and is therefore mostly local. Nevertheless, there are some shortcuts, though shortcuts to very distant parts of the network are not very likely. It could therefore be called a medium world network (situated between small and large world networks). The key feature of this network is that, despite being mostly local, its degree distribution is identical to that of the random network. Our near neighbor network is generated as follows:
(1) Create n new unlinked nodes with index i = 1, ..., n.
(2) For each node, set a target degree by randomly choosing a degree with a probability equal to that degree's probability in the degree distribution of the random network.
(3) If the node has reached its target degree, continue with the next node; if not, continue with step (4).
(4) With a certain probability, create a link to a node with a smaller index; otherwise create a link to a node with a larger index (using periodic boundary conditions).
(5) Starting at the nearest neighbor by index, and continuing by decreasing (smaller indices) or increasing (larger indices) the index one by one while skipping nodes already linked to, search for the nearest node that has not yet reached its target degree and create a link with this node.
(6) Create the network by repeating steps (3)-(5) for each node.

For all the strategies, vaccination is voluntary and quantity limited. That is to say, only susceptibles who do not refuse vaccination are vaccinated, and each day only a certain number of doses is available. For each strategy, in each time unit, first a group of eligible nodes is identified, and then up to the maximum number of doses is dispensed among the eligible nodes according to the strategy chosen.

In the prioritized strategy, nodes with the highest degrees are vaccinated first. The motivation for this strategy is that high degree nodes can, on average, be assumed to transmit a disease more often than low degree nodes. Numerically, the prioritized vaccination strategy is implemented as follows:
(1) For each time unit, start at the highest degree (i.e. consider nodes with degree d = m) and repeat the steps below until either the number of doses per time step or the total number of available doses is reached.
(2) Count the number of susceptible nodes of degree d.
(3) If the number of susceptible nodes with degree d is zero, set d = d − 1 and return to step (2).
(4) If the number of susceptible nodes with degree d is smaller than or equal to the number of available doses, vaccinate all these nodes, then set d = d − 1 and continue with step (2). Otherwise continue with step (5).
(5) If the number of susceptible nodes with degree d is greater than the number of currently available doses, randomly choose nodes with degree d to vaccinate until the available doses are used up.
(6) When all the doses are used up, end the vaccination for the current time unit and continue when the next time unit arrives.

In practice, prioritizing on the basis of certain target groups, such as health care workers or people at high risk of complications, can be difficult. Prioritizing on the basis of the number of links is even more difficult: how would such individuals be identified? One of the easiest vaccination strategies to implement is random vaccination. Numerically, the random vaccination strategy is implemented as follows:
(1) For each time unit, count the total number of susceptible nodes.
(2) If the total number of susceptible nodes is smaller than or equal to the number of doses per time unit, vaccinate all the susceptible nodes. Otherwise do step (3).
(3) If the total number of susceptible nodes is larger than the number of doses per time unit, randomly vaccinate susceptible nodes until all the available doses are used up.

One way to reduce the spread of a disease is to split the population into many isolated groups. This could be done by vaccinating nodes with links to different groups. However, given the network types studied here, breaking links between groups is not really feasible, since apart from the meta-random network there is no clear group structure in the networks. Another approach is the follow links strategy, inspired by notions from social networks, in which an attempt is made to split the population by vaccinating a randomly chosen susceptible node, its neighbors, its neighbors' neighbors, and so on.
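The random and prioritized dispensing procedures described earlier in this appendix can be sketched as follows; the node states "S"/"V" and the function names are ours, and each call represents one time unit's allotment of doses.

```python
import random

def dispense_random(states, doses, rng):
    """Random strategy: vaccinate up to `doses` randomly chosen
    susceptible nodes ("S") in this time unit."""
    eligible = [i for i, s in enumerate(states) if s == "S"]
    for i in rng.sample(eligible, min(doses, len(eligible))):
        states[i] = "V"

def dispense_prioritized(states, degrees, doses, rng):
    """Prioritized strategy: highest-degree susceptibles first, ties
    broken at random, until the time unit's doses run out."""
    eligible = [i for i, s in enumerate(states) if s == "S"]
    rng.shuffle(eligible)                     # random tie-breaking
    eligible.sort(key=lambda i: -degrees[i])  # highest degree first (stable)
    for i in eligible[:doses]:
        states[i] = "V"
```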
numerically, the follow links strategy is implemented as follows: (1) count the total number of susceptible nodes. (2) if the total number of susceptible nodes is smaller than or equal to the number of doses per unit time, vaccinate all the susceptible nodes. (3) if the total number of susceptible nodes is greater than the number of available doses per unit time, first randomly choose a susceptible node, label it as the current node, and vaccinate it. (4) vaccinate all the susceptible neighbors of the current node. (5) randomly choose one of the neighbors of the current node. (6) set the current node to the node chosen in step 5. (7) repeat steps 4-6 until all the doses are used up or no susceptible neighbor can be found. (8) if no susceptible neighbor can be found, randomly choose a susceptible node from the population and continue with step 4.

contact tracing was successfully used in combating the sars virus. in that case, everyone who had been in contact with an infectious individual was isolated to prevent further spread of the disease. de facto, this kind of isolation boils down to removing links, rendering the infectious node degree 0, a scenario not considered here. here, contact tracing instead tries to isolate an infectious node by vaccinating all its susceptible neighbors. numerically, the contact tracing strategy is implemented as follows: (1) count the total number of susceptible nodes. (2) if the total number of susceptible nodes is smaller than or equal to the number of doses per unit time, vaccinate all the susceptible nodes. (3) count only those susceptible nodes that have an infectious neighbor. (4) if the number of susceptible nodes neighboring an infectious node is smaller than or equal to the number of doses per unit time, vaccinate all these nodes. (5) if the number of susceptible nodes neighboring an infectious node is greater than the number of available doses, repeat step 6 until all the doses are used up.
(6) randomly choose an infectious node that has susceptible neighbors and vaccinate its neighbors until all the doses are used up.
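the contact-tracing allocation described above can be sketched as follows. again this is only a sketch under assumed conventions (adjacency-dict network, single-letter status codes, hypothetical function name), not the authors' implementation.

```python
import random

def contact_tracing_vaccination(neighbors, status, doses_per_step):
    """Vaccinate susceptible neighbors of infectious nodes.

    neighbors: dict mapping node -> set of neighboring nodes
    status:    dict mapping node -> 'S', 'I', 'R' or 'V'
    Returns the list of nodes vaccinated in this time step.
    """
    # susceptible nodes that have at least one infectious neighbor
    exposed = [n for n in neighbors
               if status[n] == 'S' and any(status[m] == 'I' for m in neighbors[n])]
    if len(exposed) <= doses_per_step:
        targets = exposed  # enough doses: vaccinate all exposed susceptibles
    else:
        # too few doses: pick infectious nodes at random and vaccinate
        # their susceptible neighbors until the doses run out
        targets = []
        infectious = [n for n in neighbors if status[n] == 'I'
                      and any(status[m] == 'S' for m in neighbors[n])]
        random.shuffle(infectious)
        for inf in infectious:
            for m in neighbors[inf]:
                if status[m] == 'S' and m not in targets:
                    targets.append(m)
                    if len(targets) == doses_per_step:
                        break
            if len(targets) == doses_per_step:
                break
    for node in targets:
        status[node] = 'V'
    return targets
```

on a small path network with one infectious node in the middle, both susceptible neighbors are vaccinated when supply suffices; when only one dose remains, one of the two is chosen.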
key: cord- -zf w ksm
authors: suran, j. n.; latney, l. v.; wyre, n. r.
title: radiographic and ultrasonographic findings of the spleen and abdominal lymph nodes in healthy domestic ferrets
date: - -
journal: j small anim pract
doi: . /jsap.
sha:
doc_id:
cord_uid: zf w ksm
objective: to describe the radiographic and ultrasonographic characteristics of the spleen and abdominal lymph nodes in clinically healthy ferrets. materials and methods: fifty-five clinically healthy ferrets were prospectively recruited for this cross-sectional study. three-view whole-body radiographs and abdominal ultrasonography were performed on awake ( out of ) or sedated ( out of ) ferrets. on radiographs, splenic and abdominal lymph node visibility was assessed. splenic thickness and echogenicity and lymph node length, thickness, echogenicity, number and presence of cyst-like changes were recorded. results: the spleen was radiographically detectable in all ferrets. on ultrasound the spleen was hyperechoic to the liver ( out of ) and mildly hyperechoic ( out of ), isoechoic ( out of ) or mildly hypoechoic ( out of ) to the renal cortices. mean splenic thickness was . ± . mm. lymph nodes were radiographically discernible in out of ferrets and included caudal mesenteric and sublumbar nodes. an average of ± lymph nodes (mean ± standard deviation; mode ) were identified in each ferret using ultrasound.
a single large jejunal lymph node was identified in all ferrets and had a mean thickness of . ± . mm. for other lymph nodes the mean thickness measurements plus one standard deviation were less than . mm ( % confidence interval: ≤ . mm). clinical significance: the information provided in this study may act as a baseline for evaluation of the spleen and lymph nodes in ferrets.

radiography and ultrasonography are part of the standard of care in domestic ferrets; however, there are few reports describing imaging findings or anatomic variations, despite lymph node lesions such as reactive lymphadenopathy and lymphoma occurring commonly in ferrets (o'brien et al., paul-murphy et al., schwarz et al., kuijten et al., zaffarano, garcia et al., eshar et al., mayer et al.). radiographic and ultrasonographic references provide a baseline for clinical evaluations. in ferrets, there is a large jejunal lymph node in the mid-abdomen at the root of the mesentery which is commonly palpable in healthy individuals (paul-murphy et al.). the jejunal lymph node is also known as the mesenteric lymph node or cranial mesenteric lymph node. according to the nomina anatomica veterinaria (wava-icvgan), while the jejunal lymph nodes are part of the cranial mesenteric lymphocentre, carnivores lack a cranial mesenteric lymph node. therefore, although historically this lymph node has been identified as the mesenteric lymph node, it may be more appropriately termed the jejunal lymph node in ferrets, and will be referred to as such throughout this article. in both previous studies evaluating ferret lymph nodes with ultrasound, a single large jejunal lymph node was found in all animals (paul-murphy et al., garcia et al.). with ultrasound, the jejunal lymph node was described as a round to ovoid structure with uniform echogenicity near the centre of the small intestinal mesentery, near the cranial and caudal mesenteric veins, surrounded by fat (paul-murphy et al.).
the mean and standard deviation for mesenteric lymph node dimensions varied somewhat between the studies and was reported as · ± · mm thick by · ± · mm long and · ± · mm thick by · ± · mm long (paul-murphy et al., garcia et al.). in the latter of the two ultrasound studies, the pancreaticoduodenal, splenic, gastric and hepatic lymph nodes were also examined (garcia et al.). anatomic landmarks for those lymph nodes were similar to those previously described in cats (schreurs et al.). pancreaticoduodenal lymph nodes were identified in %, splenic lymph nodes in %, gastric lymph nodes in % and hepatic lymph nodes in % of the ferrets (garcia et al.). lymph nodes were described as circular to elongate, hypoechoic structures surrounded by fat; some lymph nodes also had a faint echogenic halo. length measurements (mean ± standard deviation) for the pancreaticoduodenal lymph nodes were reported as · ± · mm, for the gastric lymph nodes as · ± · mm, and for the splenic lymph nodes as · ± · mm; thickness measurements were only provided for the mesenteric lymph node (garcia et al.). in dogs and cats these and several other lymph nodes, such as the colic and medial iliac nodes, can be detected with ultrasound (d'anjou, schreurs et al.). whether all of the lymph nodes detected in other carnivore species can be ultrasonographically identified in ferrets has not been determined. while splenomegaly is common in ferrets, the imaging appearance of the spleen in clinically healthy ferrets has also not been previously described. potential causes for splenomegaly include extramedullary haematopoiesis, neoplasia (especially lymphoma), lymphoid or myeloid hyperplasia, hypersplenism and infectious diseases such as aleutian disease, systemic coronavirus infection, mycobacteriosis and cryptococcus (ferguson, eshar et al., dominguez et al., morrisey & kraus, pollock, mayer et al., nakata et al., lindemann et al.).
the spleen also increases in size with age and after administration of anaesthetics (fox, mayer et al.). the goal of this prospective, cross-sectional study was to describe the characteristics of the spleen and abdominal lymph nodes on radiographs and with ultrasound in a sample of client-owned, clinically healthy domestic ferrets (mustela putorius furo). healthy, client-owned ferrets between four months and four years of age were prospectively recruited at the matthew j ryan veterinary hospital of the university of pennsylvania between february and october . for the calculation of sample sizes, the standard deviations of previously reported measurements of abdominal viscera in clinically healthy ferrets were compared, including gross renal measurements ( · , · mm), ultrasonographic adrenal thickness ( · , · mm) and ultrasonographic jejunal lymph node thickness ( · , . mm) (o'brien et al., paul-murphy et al., kuijten et al., garcia et al., krautwald-junghanns et al., fox). using an averaged standard deviation for adrenal gland thickness of · mm and to % confidence intervals (ci) of ± · mm, the number of individuals needed to detect significant differences in organ measurements would be ( % ci) to ( % ci) (hulley). using an averaged standard deviation of the renal measurements ( · mm) and to % ci of ± · mm gives the same results (hulley). these standard deviation values were chosen because of the relatively small differences between the respective reported values. as adrenal and renal size has been shown to vary with sex, we estimated that a total of ferrets of each sex would be needed, requiring a total recruitment of at least clinically healthy ferrets (neuwirth et al., eshar et al.). additional ferrets were able to be enrolled because of the success of recruitment and remaining funding.
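the sample-size reasoning above (choosing n so that the confidence-interval half-width around a mean stays within a target) follows the standard normal-approximation formula n = (z·sd/E)², where E is the desired half-width. the sketch below is a generic illustration under that assumption; the function name is hypothetical, and the study's exact inputs are not reproduced because the digits are not legible in this copy.

```python
from math import ceil

def sample_size_for_ci(sd, half_width, z=1.96):
    """Minimum n so that the CI half-width for a sample mean is <= half_width.

    Normal approximation: half_width = z * sd / sqrt(n),
    hence n = (z * sd / half_width) ** 2, rounded up to a whole animal.
    z defaults to 1.96 for a 95% confidence interval.
    """
    return ceil((z * sd / half_width) ** 2)
```

for instance, with sd = 1·0 mm and a target half-width of ±0·2 mm at 95% confidence, 97 individuals would be needed.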
this study was approved by and conducted in accordance with the institutional animal care and use committee - privately owned animal protocol committee (iacuc-poap # ); informed owner consent was received for all procedures, which were conducted in accordance with the privately owned animal protocol committee. a total of presumably healthy ferrets were actively recruited; ferrets were subsequently excluded for failure to meet the inclusion criteria. ferrets determined to be clinically healthy based on history, physical examination performed by an exotic animal veterinarian, complete blood count, chemistry panel, urinalysis and follow-up owner contact were included in the study. all procedures were performed on the same day as diagnostic imaging. owners were contacted regarding the health of their ferret following the study visit (mean days, range to days) in an attempt to exclude those with occult illness at the time of imaging. exclusion criteria included a history of either transient illness in the past six months or a long-term illness, administration of medications or the presence of any hormonal implant, any gross physical examination or clinicopathologic abnormality, and manifestation of illness reported at any point in an individual's follow-up. individuals were also excluded if they had gross radiographic or ultrasonographic abnormalities based on previously published guidelines for the adrenal glands in ferrets and based on our experience in ferrets and in other species (o'brien et al., neuwirth et al., paul-murphy et al., kuijten et al., garcia et al., krautwald-junghanns et al., eshar et al.). ferrets were not excluded if a cutaneous mast cell tumour was the sole abnormality (n = ); cutaneous mast cell tumours are typically focal benign lesions in ferrets, with visceral involvement and malignancy being rare (orcutt & tater).
reasons for exclusion are summarised in table . diagnostic imaging was performed and evaluated by a single board-certified radiologist. three-view whole-body radiographs (including the thorax, abdomen and pelvis) were obtained and included right lateral, left lateral and ventrodorsal projections (canon lanmix cxdi- g detector; sound-eklin). abdominal ultrasonography was performed using an to mhz linear transducer (ge logiq s vet, sound-eklin), with the exception of five ferrets at the initiation of the study which were imaged with a to mhz linear transducer (ge medical logiq ultrasound imaging system; general electric medical systems) due to equipment changes at our facility. ferrets were scanned in dorsal recumbency. the ventral abdomen was shaved and warmed coupling gel was used. ferrets were not fasted prior to imaging. the spleen was radiographically identified using similar guidelines as in dogs and cats (armbrust). whether the ventral extremity of the spleen was visible on the lateral radiographic projections along the ventral abdomen was recorded. on ultrasound, the splenic echotexture (homogeneous, mottled, nodular) and relative echogenicity (compared to the hepatic parenchyma and compared to the renal cortices) were recorded. with the spleen imaged in a longitudinal plane, parallel to the long axis, such that the splenic vein branches were evident along the mesenteric margin of the spleen, maximal thickness of the spleen was measured from the mesenteric margin to the anti-mesenteric margin (fig a). radiographs were reviewed for distinguishable lymph nodes, which were identified as small, round to oblong, soft-tissue opaque structures in the expected anatomic location of a lymph node and not associated with other visceral structures; these included sublumbar and caudal mesenteric nodes.
the term "sublumbar lymph node" generally refers to lymph nodes from the iliosacral lymphocentre, including the medial iliac and internal iliac nodes, which may not be radiographically distinguishable from each other. the presence of a lymph node, number of nodes, maximal length and maximal thickness (perpendicular to the length measurement) were recorded. abdominal lymph nodes were ultrasonographically searched for and identified based on canine and feline anatomic references (bezuidenhout, d'anjou, schreurs et al., constantinescu & schaller, wava-icvgan).

fig . (a) ultrasound images of the spleen in a ferret with a homogeneous splenic echotexture; maximal splenic thickness measurements (callipers) were performed on images of the spleen parallel to its longitudinal axis and extended from the mesenteric margin to the anti-mesenteric margin. (b) ultrasound image of the spleen in a ferret with a nodular echotexture; small, round, ill-defined, hypoechoic regions were identifiable throughout the splenic parenchyma.

lymph nodes evaluated included the hepatic, pancreaticoduodenal, splenic, gastric, jejunal (mesenteric or cranial mesenteric), caudal mesenteric (left colic), colic, ileocolic (right colic), lumbar aortic (peri-aortic) and medial iliac lymph nodes (fig ). localisation of the renal, internal iliac (hypogastric) and sacral nodes was attempted. the presence of a lymph node, number of nodes, maximal length, maximal thickness (perpendicular to the length measurement), echogenicity, identification of a hyperechoic hilus and cyst-like regions within lymph nodes were recorded. echogenicity was classified as homogeneous, hyperechoic hilus with a hypoechoic rim, or heterogeneous (with or without a discernable hilus). the short-to-long axis ratio (s:l) was calculated for each lymph node. lymph node shape was classified as rounded (s:l > . ) or elongate (s:l ≤ . ), similar to prior studies (de swarte et al., beukers et al.).
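the s:l shape classification just described amounts to a one-line check. the sketch below illustrates it; the function name is hypothetical and the 0.5 cut-off is a placeholder assumption, since the exact threshold digits are not legible in this copy of the text.

```python
def classify_node_shape(length_mm, thickness_mm, threshold=0.5):
    """Classify a lymph node as 'round' or 'elongate' from its S:L ratio.

    S:L = thickness (short axis) / length (long axis).  Nodes with S:L
    above the threshold are called round, otherwise elongate.  The 0.5
    default threshold is a placeholder, not the study's verified value.
    """
    sl = thickness_mm / length_mm
    shape = 'round' if sl > threshold else 'elongate'
    return shape, round(sl, 2)
```

enlarging nodes tend to thicken faster than they lengthen, so a rising s:l pushes a node from 'elongate' toward 'round'.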
due to the u-shape or further serpentine shape of the large jejunal lymph nodes, length measurements for those lymph nodes were achieved by performing segmental linear measurements along the entire length of the node, then summing these values to obtain a total length (fig ). all measurements were performed in duplicate; measurements for each individual were averaged for analyses. for the procedures, ferrets were manually restrained or sedated. manually restrained ferrets were given a liquid oil supplement (ferre-tone skin and coat supplement, in pet products) as a distraction and as a treat. ferrets that resisted manual restraint or demonstrated escape behaviours during manual restraint were sedated. for sedation, . mg/kg midazolam (sagent pharmaceuticals) and . mg/kg butorphanol tartrate (torbugesic; fort dodge animal health) were administered im (intramuscularly), and reversed respectively with im . mg/kg flumazenil (hikma farmaceutica) and . mg/kg naloxone (hospira inc) upon completion of imaging. all procedures were first discussed with owners and informed consent was obtained. this study was approved by the university of pennsylvania institutional animal care and use committee. statistical analyses employed were predominantly descriptive. the mean, standard deviation (sd) and % ci were calculated.

fifty-five ferrets were included in this study. forty-two ferrets ( %) originated from commercial breeders and ( %) from private breeders. thirty-four ferrets were male (eight intact and neutered males) and were female (one intact and neutered females). all sexually intact ferrets ( / ) were from private breeders. of the neutered ferrets, out of neutered females were from private breeders and were neutered at a relatively later age (up to · years old) than those from commercial breeders (typically neutered before six weeks of age). with regards to body weight, intact males generally weighed more than neutered males, which generally weighed more than neutered females (table ). the single intact female in this study weighed more than the neutered females. age at presentation ranged from four months to · years (mean ± sd: · ± . years). sedatives were administered to out of ferrets ( %) to facilitate the procedures. there were no complications associated with sedation or any of the procedures.

fig . schematic illustration of intra-abdominal lymph nodes and major vessels. lymph nodes: = hepatic, = pancreaticoduodenal, = gastric, = splenic, = cranial mesenteric group, = jejunal, = caudal mesenteric, = lumbar aortic, and ´ = medial iliac. vessels: ao = aorta, cmv = cranial mesenteric vein, cvc = caudal vena cava, dci = deep circumflex iliac vessels, ei = external iliac vessels, pv = portal vein, sv = splenic vein. other landmarks: c = colon, d = duodenum, j = jejunoileum, s = stomach, sp = spleen, lk = left kidney, rk = right kidney. (from: atlas of small animal ultrasonography by penninck & d'anjou. reproduced with permission of blackwell pub in the format journal/magazine via copyright clearance center. minor changes were made to the original image.)

fig . ultrasound image of a jejunal lymph node. a single large jejunal lymph node was identified in all ferrets. the lymph node has a hyperechoic hilus and a hypoechoic rim. because of the non-linear shape, length measurements were obtained by adding segmental linear measurements (callipers) along the long axes of the lymph node.

the spleen was radiographically identifiable in all ferrets. on ventrodorsal radiographs, the craniodorsal extremity was seen as a triangular soft-tissue opaque structure in the left side of the abdomen along the body wall, caudal to the stomach and craniolateral to the left kidney, and partly summating with the stomach and left kidney. the spleen then variably extended caudally or caudomedially as a broad curvilinear to crescentic soft-tissue structure.
on lateral radiographic projections, the craniodorsal extremity of the spleen could be seen as a triangular soft-tissue opaque structure in the craniodorsal abdomen, caudal to the gastric fundus. the spleen could then sometimes be seen extending caudoventrally in the mid-abdomen as a broad, curvilinear soft-tissue opaque structure summating with the intestines. on lateral radiographic projections, the spleen could be seen along the ventral abdomen in out of ( %) ferrets; in out of ( %) this was seen only on the right lateral projection and in out of ( %) only on the left lateral projection. the spleen was more frequently visible along the ventral abdomen in male ferrets ( out of ; % males) than female ferrets ( out of ; % females) and in sedated ( out of ; %) than nonsedated ( out of ; %) ferrets. on ultrasound, the spleen was identified in the left lateral abdomen. the spleen was hyperechoic relative to the liver ( out of ). the spleen was mildly hyperechoic to the renal cortices in out of ( %) ferrets, isoechoic in out of ( %) and mildly hypoechoic in out of ( %). it had a homogeneous echotexture in out of ( %), a mildly mottled echotexture in out of ( %), and had ill-defined, round, hypoechoic nodules in out of ( %) (fig ) . the three ferrets with presumptively incidental cutaneous mast cell tumours all had a homogeneous splenic echotexture. with ultrasound the mean splenic thickness measurement was · ± · mm ( % ci: · to · mm; range · to · mm). the mean thickness was a little more in ferrets in which the spleen was radiographically detected along the ventral abdomen ( · ± · mm) compared to those in which it was not ( · ± · mm). lymph nodes were radiographically discernible in out of ( %) ferrets. the radiographic frequency of lymph node detection and measurements are summarised in table . caudal mesenteric lymph nodes were seen on lateral abdominal radiographs in out of ( %) ferrets. 
caudal mesenteric lymph nodes were identified as well-defined, oblong, soft-tissue opaque structures in the caudal abdomen dorsal and immediately adjacent to the descending colon at the level of l to l (fig ). one caudal mesenteric lymph node was distinguishable in ferrets and two nodes were distinguishable in three ferrets. sublumbar lymph nodes were seen on lateral abdominal radiographs in out of ( %) ferrets. sublumbar lymph nodes were identified as well-defined, oblong, soft-tissue opaque structures in the caudal retroperitoneal space ventral to l and l (fig ). one sublumbar lymph node was distinguishable in five ferrets and two nodes were distinguishable in one ferret. as the medial iliac lymph node was the only iliosacral lymphocentre node that was ultrasonographically identified, the sublumbar lymph nodes that were radiographically detected presumably represent medial iliac lymph nodes. lymph nodes were found with ultrasound in the expected locations and at the corresponding anatomic landmarks (fig ) previously reported for dogs and cats (bezuidenhout, d'anjou, schreurs et al., constantinescu & schaller). detected lymph nodes included jejunal, pancreaticoduodenal, hepatic, caudal mesenteric, splenic, gastric, medial iliac and lumbar aortic lymph nodes. small lymph nodes were also seen in the mid-abdomen and were difficult to differentiate as ileocolic lymph nodes, colic lymph nodes or additional smaller jejunal lymph nodes. these were often seen in close proximity and slightly cranial to the single large jejunal lymph node. because these small nodes were not distinguishable as specific lymph nodes, they were grouped and termed cranial mesenteric lymph nodes. lymph node detection frequencies and measurements are summarised in tables and , respectively. lymph node thickness measurements are also graphically depicted in fig . an average and sd of ± lymph nodes (mode lymph nodes; range to lymph nodes) were identified in each ferret. a single large jejunal lymph node was identified in all ferrets.
all lymph nodes were oblong, with the exception of the single large jejunal lymph node, which was u-shaped or serpentine (fig ). most lymph nodes were hypoechoic relative to the surrounding mesenteric fat with a more echogenic central hilus (fig ). homogeneous lymph nodes were either hypoechoic or mildly hyperechoic. heterogeneous lymph nodes were mildly heterogeneous and mildly hyperechoic. several of the heterogeneous lymph nodes were predominantly hyperechoic with discontinuous hypoechoic marginal regions, not forming a complete hypoechoic rim. anechoic cyst-like regions (fig ) were identified in out of a total of lymph nodes ( . %) evaluated, and were most frequently detected in the pancreaticoduodenal lymph nodes ( out of ; · % of cystic lymph nodes) followed by the hepatic lymph nodes ( out of ; · % of cystic lymph nodes). lymph nodes with cyst-like changes were present in out of ( · %) ferrets. cyst-like changes were present in one lymph node in seven ferrets, two lymph nodes in two ferrets, three lymph nodes in four ferrets and eight lymph nodes in one ferret. the mean age of ferrets with cyst-like regions was · ± · years (range: · to · years), compared to a mean age of · ± · years (range: months to · years) for ferrets without detected cyst-like regions. the results presented in this study provide the most comprehensive evaluation of the spleen and abdominal lymph nodes with radiographs and ultrasound in clinically healthy ferrets to date. table . ultrasonographic features of abdominal lymph nodes in ferrets. lymph node shape was recorded as round or elongate based on the short-to-long axis ratio (s:l); the mode shape is provided in the table. lymph nodes with an s:l greater than . were characterised as round, while those with an s:l less than or equal to . were characterised as elongate. the number of ferrets in which cyst-like changes were detected or in which a hyperechoic hilus was appreciable is also reported.
lymph node echogenicity was recorded as homogeneous (either hypoechoic or mildly hyperechoic), hilar (having a hypoechoic rim and a hyperechoic hilus) or heterogeneous (mildly heterogeneous and hyperechoic, with or without a discernable hilus). the spleen was radiographically detectable in all ferrets and had a similar appearance to that previously described in dogs and cats (armbrust). in cats, the spleen is considered enlarged if the body or ventral extremity is visible along the ventral abdomen on lateral radiographs. aside from this guideline, radiographic assessment of splenic size is subjective in cats and dogs (armbrust). the spleen was radiographically visible along the ventral abdomen in % of the clinically healthy ferrets in this study, so ventral visibility cannot be used as a general guideline to determine whether the spleen is enlarged. additionally, as the body and ventral extremity of the spleen are mobile, splenic position within the abdomen may affect the radiographic appearance. specific radiographic guidelines to determine enlargement were not derived in this study; subjective radiographic assessment of the spleen is therefore warranted in ferrets. gross splenic measurements have been previously reported as · cm length × · cm width × · cm thickness (evans & an). the thickness measurements obtained in this study using ultrasound were all greater than previously reported, with the smallest thickness measurement in this study being · mm. these discrepancies in measurements may result from differences between in vivo and post-mortem sampling. signalment and body weight differences may also contribute, but this information was not available for the ferrets from which the gross measurements were derived. on ultrasound, the spleen was hyperechoic to the liver. the relative echogenicity of the spleen compared to the renal cortices was variable, but most frequently the spleen was hyperechoic to the renal cortex.
in cats, fat may be deposited in the renal cortex and results in increased cortical echogenicity; this is associated with sex hormones in cats and is not associated with body weight (yeager & anderson, maxie). it is unknown whether renal cortical fat deposition also occurs in ferrets. the spleen had a homogeneous echotexture in % of ferrets. a mildly mottled echotexture or ill-defined small hypoechoic regions were seen in % of ferrets. causes for a non-homogeneous echogenicity may include potentially incidental etiologies such as nodular hyperplasia or extramedullary hematopoiesis, which commonly occur in adult ferrets, although other subclinical pathologies, such as lymphoma or splenitis, cannot be excluded. ferrets with a non-homogeneous splenic echogenicity were not excluded, as these ferrets remained clinically healthy throughout the study and follow-up period (levaditi et al., mayer et al.). this is the first study to describe the radiographic appearance of presumed normal lymph nodes in ferrets; those detected included the caudal mesenteric and sublumbar lymph nodes. at least one lymph node was radiographically discernible in % of ferrets in this study. radiographic lymph node measurements were generally greater than ultrasound measurements. the difference in measurements between imaging modalities is likely due to magnification on radiographs. silhouetting or superimposition of adjacent lymph nodes may also contribute to larger measurements on radiographs. additionally, measurements may be affected by patient positioning. the provided measurements are intended as a descriptor for lymph node dimensions. the potential clinical utility of radiographic lymph node measurements is questionable, as the use of measurements for radiographic image interpretation has not been found to be more accurate than subjective image interpretation (lamb & nelson). multiple lymph node groups were detectable with ultrasound.
similar to previous studies, a single large jejunal lymph node (also known as the mesenteric or cranial mesenteric lymph node) was detected in all ferrets (paul-murphy et al., garcia et al.). the detection frequencies of the hepatic, pancreaticoduodenal, splenic and gastric lymph nodes were much higher than previously reported (garcia et al.). because of the overall small patient size, detection and differentiation of lymph nodes can be difficult. with the small size of ferrets, structures in the abdomen are relatively close together; additionally, compression of the abdomen and abdominal viscera during ultrasonography may further compound this. small lymph nodes in the mid-abdomen just cranial to the single large jejunal lymph node were difficult to differentiate as ileocolic lymph nodes, additional colic lymph nodes (possibly middle colic lymph nodes), additional smaller jejunal lymph nodes or a combination thereof. some of these nodes were suspected to be ileocolic lymph nodes, although the ileocolic junction is not readily identifiable in ferrets, which complicates identification of lymph nodes as ileocolic lymph nodes (evans & an). as the specific lymph node location could not be determined for these small lymph nodes and they likely belonged to the cranial mesenteric lymphocentre, they were termed cranial mesenteric lymph nodes. in carnivores, the cranial mesenteric lymphocentre is comprised of the ileocolic, colic and jejunal lymph nodes (bezuidenhout, constantinescu & schaller). the clinical relevance and importance of separating these small mid-abdominal lymph nodes into ileocolic, colic and jejunal lymph nodes is not known.

fig . anechoic cyst-like changes in a gastric lymph node. the lymph node has a hyperechoic hilus and a hypoechoic rim and contains a lobular, anechoic cyst-like region.
based on the results of this study, the jejunal, pancreaticoduodenal, hepatic and caudal mesenteric lymph nodes can be routinely detected with ultrasound in most ferrets. the pancreaticoduodenal and hepatic lymph nodes were detected in · % ( out of ) and · % ( out of ) of ferrets, respectively. it is possible that these lymph nodes were not detected in those two and three ferrets, respectively, due to patient disposition and human error. for the caudal mesenteric lymph nodes, out of individuals in which the caudal mesenteric lymph nodes were not ultrasonographically detected were imaged at the initiation of this study. it is suspected that the caudal mesenteric lymph nodes were not detected, as opposed to not present, in those individuals, and after a steep learning curve these lymph nodes were routinely identified. the medial iliac lymph nodes were detected at a relatively high frequency as well; however, their small size (specifically thickness) made them more difficult to detect than others. with the exception of the single large jejunal lymph node, the mean values plus one sd for thickness measurements were all less than · mm, and the upper ranges of the % ci were less than or equal to · mm. although previous studies using ultrasound reported detection of a single large jejunal lymph node, an anatomical reference for ferrets mentions both a left and right lymph node (paul-murphy et al., garcia et al., evans & an). in this study, a single large jejunal lymph node was identified with ultrasound in all ferrets; additional smaller lymph nodes in the vicinity of the jejunal lymph node may have represented smaller jejunal lymph nodes, ileocolic lymph nodes and/or colic lymph nodes. the mean jejunal lymph node thickness in this study ( · ± · mm) was similar to the previously reported mean of · mm by garcia et al. ( ), but was less than the previously reported mean of . mm in the study by paul-murphy et al.
( ), cytology was performed on a portion of the jejunal lymph nodes and demonstrated relatively high numbers of eosinophils in % of sampled nodes, which may represent a normal variant in ferrets or may have represented occult disease (paul-murphy et al.). cytology was not performed in this study or in that by garcia et al. ( ). the mean jejunal lymph node length measurement ( · ± · mm) was greater than the previously reported mean values ( · ± · and · ± · mm) (paul-murphy et al., garcia et al.); this is suspected to be due to differences in methodology. because the jejunal lymph node was serpentine, segmented linear measurements were summed to determine the total length. although the specific technique was not described in the previous studies, presumably a single linear measurement was previously performed along the long axis of the largest part of the lymph node. this presumed difference in technique would account for the length measurements in this study being greater than previously reported. in general, lymph node thickness measurements and short-to-long axis ratios may have more clinical utility than evaluation of length measurements alone. when lymph nodes enlarge, they tend to become more rounded, having a greater short-to-long axis ratio; this change is attributed to a greater increase in thickness measurements compared to length measurements (de swarte et al.). in humans and dogs, evaluation of the short-to-long axis ratio may assist in differentiating benign versus malignant neoplastic lymphadenopathies; a greater short-to-long axis ratio is associated with neoplastic lymphadenopathy, while a lesser short-to-long axis ratio is associated with normal and reactive or inflammatory lymphadenopathies (nyman et al., de swarte et al.). cyst-like changes were identified in · % ( out of ) of all lymph nodes evaluated in this study and were more frequently identified in older ferrets.
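the segmented length summation and short-to-long axis ratio discussed above can be sketched as a small helper (purely illustrative, not from the study; the function names and the point-list representation of the serpentine long axis are assumptions):

```python
from math import dist

def segmented_length(points):
    """Total lymph node length as the sum of consecutive linear
    segments measured along a serpentine long axis."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))

def short_to_long_ratio(thickness_mm, length_mm):
    """Short-to-long axis ratio: rounder (e.g. enlarged) nodes trend
    toward 1, elongated normal nodes toward 0."""
    return thickness_mm / length_mm
```

for a node traced through (0, 0), (3, 0) and (3, 4) millimetres, the summed length is 7 mm, mirroring how the serpentine jejunal node was measured in segments rather than with one linear caliper placement.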
this finding is of unknown clinical significance, particularly since cyst-like changes were recognised in · % ( out of ) of the clinically healthy ferrets of this study. cyst-like changes are suspected to represent lymphatic sinus ectasia (lymphangiectasia, lymphatic cysts, cystic lymphatic ectasia or sinus dilation). hyperplastic lymph nodes are common in older ferrets secondary to underlying gastrointestinal inflammation (antinoff & williams). overt gastrointestinal abnormalities were not ultrasonographically detected in ferrets included for data analysis, and ferrets with clinical signs referable to the gastrointestinal tract were excluded; however, lymph node hyperplasia secondary to subclinical gastrointestinal or non-gastrointestinal pathology cannot be excluded. in rats and mice, lymphatic sinus ectasia is associated with lymphoid atrophy and can be seen in ageing animals (sainte-marie et al., elmore). in humans, cyst-like areas in lymph nodes can be seen with metastatic neoplasia, particularly secondary to necrosis. cystic necrosis results in an anechoic area within the lymph node and is commonly found in metastatic lymph nodes in humans. coagulative nodal necrosis is uncommonly seen, results in an echogenic area and can be seen in malignant and inflammatory lymph nodes (ahuja & ying). nodules comprised of neoplastic cells may also have a pseudocystic appearance on ultrasound (ahuja & ying). less frequently, nodal metastasis may produce a true cyst with an epithelial lining (verma et al.). further studies with histopathologic evaluation of cystic lymph nodes, and studies evaluating the clinical significance of this change, are warranted. because of the small patient size, adrenal glands can be mistaken for lymph nodes and vice versa. the hepatic lymph nodes may be mistaken for the right adrenal gland, for example. the portal vein and caudal vena cava can be seen relatively close together in the cranial abdomen of ferrets.
additionally, the caudal vena cava is easily compressed by pressure from the ultrasound transducer during ultrasonography. although both the portal vein and caudal vena cava may be seen adjacent to the hepatic lymph nodes, the hepatic lymph nodes are more closely associated with the portal vein, while the right adrenal gland is in close apposition to the caudal vena cava. careful evaluation of lymph nodes relative to their anatomic landmarks is warranted, as these landmarks are crucial for differentiating lymph nodes both from each other and from the adrenal glands. the major limitations of the study included external validity (generalisability), selection bias, misclassification bias and human error (reliability and internal consistency). the ferrets in this study may not be representative of the general population; most ferrets in this study, in common with most in the usa, originated from a single large commercial breeder. additionally, there were very few sexually intact ferrets. the generalisability of the findings in this study to ferrets from geographic locations outside of the usa is difficult to judge. another limitation is that there was no gold standard to confirm a disease-free status. organ sampling was not within the scope of this study. as an imperfect surrogate, we used the individual's history, physical exam, blood work, urinalysis and follow-up owner contact. diagnostic imaging findings did result in some patients being excluded, which may bias the data and result in exclusion of normal variants; however, based on our experiences with ultrasound in ferrets and in other species, those individuals were considered very likely to be abnormal. the included ferrets were clinically healthy throughout the study and follow-up interval. inter-observer and intra-observer differences were not evaluated.
in summary, the information provided in this study may act as a baseline for evaluation of the spleen and lymph nodes in ferrets. on radiographs the spleen was visible in all ferrets, and the sublumbar or caudal mesenteric lymph nodes were discernible in % of ferrets. with ultrasound the spleen was hyperechoic to the liver and most often had a homogeneous or mildly mottled echotexture; additionally, multiple lymph nodes were identified. the jejunal, pancreaticoduodenal, hepatic and caudal mesenteric lymph nodes can be routinely detected with ultrasound. the jejunal lymph node was seen in all ferrets and had a mean thickness of · ± · mm. the mean thickness measurements plus one sd for all other lymph nodes were less than . mm. additional studies evaluating the clinical utility and predictive validity of the provided measurements are warranted.
references:
- abdominal cavity, lymph nodes, and great vessels
- sonography of neck lymph nodes. part ii: abnormal lymph nodes
- neoplasia. in: ferrets, rabbits, and rodents
- the spleen. in: bsava manual of canine and feline abdominal imaging
- computed tomographic characteristics of presumed normal canine abdominal lymph nodes
- the lymphatic system. in: miller's anatomy of the dog
- illustrated veterinary anatomical nomenclature
- abdominal radiographic and ultrasonographic findings in ferrets (mustela putorius furo) with systemic coronavirus infection
- histopathology of the lymph nodes
- disseminated, histologically confirmed cryptococcus spp infection in a domestic ferret
- radiographic kidney measurements in north american pet ferrets (mustela furo)
- anatomy of the ferret
- idiopathic hypersplenism in a ferret
- normal clinical and biological parameters
- anatomia ultrassonográfica dos linfonodos abdominais de furões europeus hígidos
- appendix d: sample size for a descriptive study
funding was provided by an institutional grant from the university of pennsylvania and a donation from abaxis.
the authors would like to acknowledge and thank bruce williams for consultation on histopathology; thomas tyson for medical illustration; mary baldwin, alisa rassin and max emanuel for technical assistance; and scales and tails rescue for assistance with recruitment. no conflicts of interest have been declared.

key: cord- -f icyt
authors: sharma, ujjwal; rudinac, stevan; worring, marcel; demmers, joris; van dolen, willemijn
title: semantic path-based learning for review volume prediction
date: - -
journal: advances in information retrieval
doi: . / - - - - _
sha:
doc_id: cord_uid: f icyt

graphs offer a natural abstraction for modeling complex real-world systems where entities are represented as nodes and edges encode relations between them. in such networks, entities may share common or similar attributes and may be connected by paths through multiple attribute modalities. in this work, we present an approach that uses semantically meaningful, bimodal random walks on real-world heterogeneous networks to extract correlations between nodes and bring together nodes with shared or similar attributes. an attention-based mechanism is used to combine multiple attribute-specific representations in a late fusion setup. we focus on a real-world network formed by restaurants and their shared attributes and evaluate performance on predicting the number of reviews a restaurant receives, a strong proxy for popularity. our results demonstrate the rich expressiveness of such representations in predicting review volume and the ability of an attention-based model to selectively combine individual representations for maximum predictive power on the chosen downstream task. multimodal graphs have been extensively used in modeling real-world networks where entities interact and communicate with each other through multiple information pathways or modalities [ , , ]. each modality encodes a distinct view of the relation between nodes.
for example, within a social network, users can be connected by their shared preference for a similar product or by their presence in the same geographic locale. each of these semantic contexts links the same user set with a distinct edge set. such networks have been extensively used for applications like semantic proximity search in existing interaction networks [ ], augmenting semantic relations between entities [ ], learning interactions in an unsupervised fashion [ ] and augmenting traditional matrix factorization-based collaborative filtering models for recommendation [ ]. each modality within a multimodal network encodes a different semantic relation and exhibits a distinct view of the network. while such views contain relations between nodes based on interactions within a single modality, observed outcomes in the real world are often a complex combination of these interactions. therefore, it is essential to compose these complementary interactions meaningfully to build a better representation of the real world. in this work, we examine a multimodal approach that attempts to model the review-generation process as the end product of complex interactions within a restaurant network. restaurants share a host of attributes with each other, each of which may be treated as a modality. for example, they may share the same neighborhood, the same operating hours, a similar kind of cuisine, or the same 'look and feel'. furthermore, each of these attributes only uncovers a specific type of relation. for example, a view that only uses the location modality will contain venues connected only by their colocation in a common geographical unit and will prioritize physical proximity over any other attribute. broadly, each of these views is characterized by a semantic context and encodes modality-specific relations between restaurants. these views, although informative, are complementary and only record associations within the same modality.
while each of these views encodes a part of the interactions within the network, performance on a downstream task relies on a suitable combination of views pertinent to the task [ ]. in this work, we use metapaths as a semantic interface to specify which relations within a network may be relevant or meaningful and worth investigating. we generate bimodal low-dimensional embeddings for each of these metapaths. furthermore, we conjecture that their relevance on a downstream task varies with the nature of the task and that this task-specific modality relevance should be learned from data. in this work,
- we propose a novel method that incorporates restaurants and their attributes into a multimodal graph and extracts multiple bimodal low-dimensional representations for restaurants based on available paths through shared visual, textual, geographical and categorical features.
- we use an attention-based fusion mechanism for selectively combining representations extracted from multiple modalities.
- we evaluate and contrast the performance of modality-specific representations and joint representations for predicting review volume.
the principal challenge in working with multimodal data revolves around the task of extracting and assimilating information from multiple modalities to learn informative joint representations. in this section, we discuss prior work that leverages graph-based structures for extracting information from multiple modalities, focusing on the auto-captioning task that introduced such methods. we then examine prior work on network embeddings that aim to learn discriminative representations for nodes in a graph. graph-based learning techniques provide an elegant means for incorporating semantic similarities between multimedia documents. as such, they have been used for inference in large multimodal collections where a single modality may not carry sufficient information [ ].
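the metapath-constrained bimodal walks introduced above can be sketched as follows (an illustrative toy, not the authors' implementation; the graph layout and node-type labels are assumptions):

```python
import random

def metapath_walk(graph, node_type, start, metapath, length, rng=random):
    """One unbiased random walk that only visits nodes whose types match
    the repeating pattern in `metapath`, e.g. ('venue', 'cuisine')."""
    walk = [start]
    for step in range(length - 1):
        wanted = metapath[(step + 1) % len(metapath)]
        # restrict the next hop to neighbors of the required type
        options = [n for n in graph[walk[-1]] if node_type[n] == wanted]
        if not options:  # dead end for this metapath
            break
        walk.append(rng.choice(options))
    return walk
```

a walk over a venue - cuisine - venue metapath alternates venue and cuisine nodes, so venues co-occurring in a short window share that exact attribute, which is the semantics the embeddings are meant to capture.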
initial work in this domain was structured around the task of captioning unseen images using correlations learned over multiple modalities (tag-propagation or auto-tagging). pan et al. use a graph-based model to discover correlations between image features and text for automatic image-captioning [ ]. urban et al. use an image-context graph consisting of captions, image features and images to retrieve relevant images for a textual query [ ]. stathopoulos et al. [ ] build upon [ ] to learn a similarity measure over words based on their co-occurrence on the web and use these similarities to introduce links between similar caption words. rudinac et al. augment the image-context graph with users as an additional modality and deploy it for generating visual summaries of geographical regions [ ]. since we are interested in discovering multimodal similarities between restaurants, we use a graph layout similar to the one proposed by pan et al. [ ] for the image auto-captioning task but replace images with restaurants as central nodes. other nodes containing textual features, visual features and users are retained. we also add categorical information like cuisines as a separate modality, allowing them to serve as semantic anchors within the representation. graph representation learning aims to learn mappings that embed graph nodes in a low-dimensional compressed representation. the objective is to learn embeddings where geometric relationships in the compressed embedding space reflect structural relationships in the graph. traditional approaches generate these embeddings by finding the leading eigenvectors from the affinity matrix for representing nodes [ , ]. with the advent of deep learning, neural networks have become increasingly popular for learning such representations, jointly, from multiple modalities in an end-to-end pipeline [ , , , , ]. existing random walk-based embedding methods are extensions of the random walks with restarts (rwr) paradigm.
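the random-walk-with-restart affinity underlying these methods can be sketched via power iteration (a generic textbook formulation, not code from any of the cited papers; the restart probability is an illustrative default):

```python
import numpy as np

def rwr_affinity(adj, seed, restart=0.15, tol=1e-9, max_iter=1000):
    """Steady-state visiting probabilities of a random walk that,
    at each step, restarts at `seed` with probability `restart`."""
    adj = np.asarray(adj, dtype=float)
    # column-normalize the adjacency matrix into a transition matrix
    col_sums = adj.sum(axis=0)
    P = adj / np.where(col_sums == 0, 1, col_sums)
    e = np.zeros(adj.shape[0])
    e[seed] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = (1 - restart) * (P @ r) + restart * e
        if np.abs(r_next - r).sum() < tol:
            break
        r = r_next
    return r
```

the resulting vector sums to one, and entries for nodes near the seed are larger, which is exactly the steady-state affinity the next paragraph describes.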
traditional rwr-based techniques compute an affinity between two nodes in a graph by ascertaining the steady-state transition probability between them. they have been extensively used for the aforementioned auto-captioning tasks [ , , , ], tourism recommendation [ ] and web search as an integral part of the pagerank algorithm [ ]. deep learning-based approaches build upon the traditional paradigm by optimizing the co-occurrence statistics of nodes sampled from these walks. deepwalk [ ] uses nodes sampled from short truncated random walks as phrases to optimize a skip-gram objective similar to word2vec [ ]. similarly, node2vec augments this learning paradigm with second-order random walks parameterized by exploration parameters p and q, which control the trade-off between homophily and structural equivalence in the learnt representations [ ]. for a homogeneous network, random walk-based methods like deepwalk and node2vec assume that while the probabilities of transitioning from one node to another can be different, every transition still occurs between nodes of the same type. for heterogeneous graphs, this assumption may be fallacious, as all transitions do not occur between nodes of the same type and, consequently, do not carry the same semantic context. indeed, our initial experiments with the node2vec model suggest that it is not designed to handle highly multimodal graphs. clements et al. [ ] demonstrated that in the context of content recommendation, the importance of modalities is strongly task-dependent, and treating all edges in heterogeneous graphs as equivalent can discard this information. metapath2vec [ ] remedies this by introducing unbiased walks over the network schema specified by a metapath [ ], allowing the network to learn the semantics specified by the metapath rather than those imposed purely by the topology of the graph. metapath-based approaches have been extended to a variety of other problems. hu et al.
use an exhaustive list of semantically-meaningful metapaths for extracting top-n recommendations with a neural co-attention network [ ]. shi et al. use metapath-specific representations in a traditional matrix factorization-based collaborative filtering mechanism [ ]. in this work, we perform random walks on sub-networks of a restaurant-attribute network containing restaurants and attribute modalities. these attribute modalities may contain images, text or categorical features. for each of these sub-networks, we perform random walks and use a variant of the heterogeneous skip-gram objective introduced in [ ] to generate low-dimensional bimodal embeddings. bimodal embeddings have several interesting properties. training relations between two modalities provides us with a degree of modularity, where modalities can be included in or held out from the prediction model without affecting others. it also makes training inexpensive, as the number of nodes when only considering two modalities is far lower than in the entire graph. in this section, we begin by providing a formal introduction to graph terminology that is frequently referenced in this paper. we then move on to detail our proposed method illustrated in fig. . formally, a heterogeneous graph is denoted by $G = (V, E, \phi, \sigma)$, where $V$ and $E$ denote the node and edge sets respectively. for every node and edge, there exist mapping functions $\phi(v) \rightarrow A$ and $\sigma(e) \rightarrow R$, where $A$ and $R$ are the sets of node types and edge types respectively, such that $|A| + |R| > 2$. for a heterogeneous graph $G = (V, E, \phi, \sigma)$, a network schema is a metagraph $M_G = (A, R)$, where $A$ is the set of node types in $V$ and $R$ is the set of edge types in $E$. a network schema enumerates the possible node types and edge types that can occur within a network. a metapath $M(a_1, a_n)$ is a path on the network schema $M_G$ consisting of a sequence of ordered edge transitions $a_1 \xrightarrow{r_1} a_2 \xrightarrow{r_2} \cdots \xrightarrow{r_{n-1}} a_n$. we use tripadvisor to collect information for restaurants in amsterdam.
each venue characteristic is then embedded as a separate node within a multimodal graph. in the figure above, r nodes denote restaurants, i nodes denote images for a restaurant, d nodes are review documents, a nodes are categorical attributes for restaurants and l nodes are locations. bimodal random walks are used to extract pairwise correlations between nodes in separate modalities, which are embedded using a heterogeneous skip-gram objective. finally, an attention-based fusion model is used to combine multiple embeddings together to regress the review volume for restaurants. let $G = (V, E)$ be the heterogeneous graph with a set of nodes $V$ and edges $E$. we assume the graph to be undirected, as linkages between venues and their attributes are inherently symmetric. below, we describe the node types used to construct the graph (cf. figs. and ). for images, we use the penultimate layer output of the image-feature extraction network as a compressed low-dimensional representation for the image. since the number of available images for each venue may vary dramatically depending on its popularity, adding a node for every image can lead to an unreasonably large graph. to mitigate this issue, we cluster image features for each restaurant using the k-means algorithm and use the cluster centers as representative image features for a restaurant, similar to zahálka et al. [ ]. we chose k = as a reasonable trade-off between the granularity of our representations and the tractability of generating embeddings for this modality. the way patrons write about a restaurant and the usage of specialized terms can contain important information about a restaurant that may be missing from its categorical attributes. for example, usage of the indian cottage cheese 'paneer' can be found in similar cuisine types like nepali, surinamese, etc., and user reviews talking about dishes containing 'paneer' can be leveraged to infer that indian and nepali cuisines share some degree of similarity. to model such effects, we collect reviews for every restaurant.
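the per-restaurant k-means reduction of image features described above can be sketched as follows (a minimal NumPy implementation written for illustration; the paper's actual k and clustering details are not reproduced here):

```python
import numpy as np

def representative_image_features(features, k=5, iters=20, seed=0):
    """Reduce a venue's (n_images x dim) image-feature matrix to k
    cluster centers, a fixed-size stand-in for all of its images."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    if len(X) <= k:            # fewer images than clusters: keep as-is
        return X
    # initialize centers from k distinct images, then run Lloyd's steps
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each image to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers
```

replacing a variable-size image set with k centers keeps the graph size bounded while preserving the dominant visual modes of each venue.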
since individual reviews may not provide a comprehensive, unbiased picture of the restaurant, we chose not to treat them individually, but to consider them as a single document. we then use a distributed bag-of-words model from [ ] to generate low-dimensional representations of these documents for each restaurant. since the reviews of a restaurant can vary widely based on its popularity, we only consider the most recent reviews for each restaurant to prevent biases from document length getting into the model.
- users: since tripadvisor does not record check-ins, we can only leverage explicit feedback from users who chose to leave a review. we add a node for each of the users who visited at least two restaurants in amsterdam and left a review.
similar to [ , , ], we introduce two kinds of edges in our graph:
- attribute edges: these are heterogeneous edges that connect a restaurant node to the nodes of its categorical attributes, image features, review features and users. in our graph, we instantiate them as undirected, unweighted edges.
- similarity edges: these are homogeneous edges between the feature nodes within a single modality. for image features, we use a radial basis function as a non-linear transformation of the euclidean distances between image feature vectors. for document vectors, we use cosine similarity to find restaurants with similar reviews. adding a weighted similarity edge between every node pair in the same modality would yield an extremely dense adjacency matrix. to avoid this, we only add similarity links between a node and its k nearest neighbors in each modality. by choosing the nearest k neighbors, we make our similarity threshold adaptive, allowing it to adjust to varying scales of distance in multiple modalities.
metapaths can provide a modular and simple interface for injecting semantics into the network. since metapaths, in our case, are essentially paths over the modality set, they can be used to encode inter-modality correlations.
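the k-nearest-neighbor similarity edges described above can be sketched as follows (illustrative only; the paper's k, the RBF bandwidth gamma and the exact edge weighting are assumptions):

```python
import numpy as np

def knn_similarity_edges(X, k=3, kind="rbf", gamma=0.5):
    """Connect each node to its k most similar nodes within one
    modality: 'rbf' uses exp(-gamma * squared euclidean distance),
    'cosine' uses cosine similarity."""
    X = np.asarray(X, dtype=float)
    if kind == "rbf":
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        S = np.exp(-gamma * d2)
    else:  # cosine
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        S = Xn @ Xn.T
    edges = []
    for i in range(len(X)):
        order = np.argsort(-S[i])                # most similar first
        neighbors = [j for j in order if j != i][:k]
        edges += [(i, j, S[i, j]) for j in neighbors]
    return edges
```

because each node keeps only its k strongest links, the effective similarity threshold adapts per node, matching the adaptive-threshold argument above.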
in this work, we generate embeddings with two specific properties:
- all metapaths are binary and only include transitions over modalities. since venues/restaurants are always a part of the metapath, we only include one other modality.
- during optimization, we only track the short-range context by choosing a small window size. window size is the maximum distance between the input node and a predicted node in a walk. in our model, walks over the metapath only capture short-range semantic contexts, and the choice of a larger window can be detrimental to generalization.
for example, consider a random walk over the restaurant - cuisine - restaurant metapath. in the sampled nodes (omitted here), restaurants are in red while cuisines are in blue. optimizing over a large context window can lead to mcdonald's (fast-food cuisine) and kediri (indonesian cuisine) being placed close in the embedding space. this is erroneous and does not capture the intended semantics, which should bring restaurants closer only if they share the exact attribute. we use the metapaths in table to perform unbiased random walks on the graph detailed in sect. . each of these metapaths enforces similarity based on certain semantics. we train separate embeddings using the heterogeneous skip-gram objective similar to [ ]. for every metapath, we maximize the probability of observing the heterogeneous context $N_a(v)$ given the node $v$:

$\arg\max_{\theta} \sum_{v \in V_m} \sum_{a \in A_m} \sum_{c_a \in N_a(v)} \log p(c_a \mid v; \theta)$ ( )

in eq. ( ), $A_m$ is the node type-set and $V_m$ is the node-set for metapath $m$. the original metapath2vec model [ ] uses multiple metapaths [ ] to learn separate embeddings, some of which perform better than the others. on the dblp bibliographic graph, which consists of authors (a), papers (p) and venues (v), the performance of their recommended metapath 'a-p-v-p-a' was empirically better than the alternative metapath 'a-p-a' on the node classification task.
[fig.: attention-weighted modality fusion. metapath-specific embeddings are fed into a common attention mechanism that generates an attention vector. each modality is then reweighted with the attention vector and concatenated. this joint representation is then fed into a ridge regressor to predict the volume of ratings for each restaurant.]
at this point, it is important to recall that in our model, each metapath extracts a separate view of the same graph. these views may contain complementary information, and it may be disadvantageous to only retain the best-performing view. for an optimal representation, these complementary views should be fused. in this work, we employ an embedding-level attention mechanism similar to the attention mechanism introduced in [ ] that selectively combines embeddings based on their performance on a downstream task. assuming $S$ to be the set of metapath-specific embeddings for metapaths $m_1, m_2, \ldots, m_n$, following the approach outlined in the figure, we can denote it as $S = \{E_{m_1}, E_{m_2}, \ldots, E_{m_n}\}$. we then use a two-layer neural network to learn an embedding-specific attention score $a_{m_n}$ for metapath $m_n$. further, we perform a softmax transformation of the attention network outputs to obtain an embedding-specific weight $w_{m_n} = \exp(a_{m_n}) / \sum_{i=1}^{n} \exp(a_{m_i})$. finally, we concatenate the attention-weighted metapath-specific embeddings to generate a fused embedding. we evaluate the performance of the embedding fusion model on the task of predicting the volume (total count) of reviews received by a restaurant. we conjecture that the volume of reviews is an unbiased proxy for the general popularity and footfall of a restaurant and is more reliable than indicators like ranking or ratings, which may be biased by tripadvisor's promotion algorithms. we use the review volume collected from tripadvisor as the target variable and model this task as a regression problem. data collection. we use publicly-available data from tripadvisor for our experiments. to build the graph detailed in sect.
we collect data for , restaurants in amsterdam, the netherlands, that are listed on tripadvisor. we additionally collect , user-contributed restaurant reviews made by , unique users, of which only , users visit more than restaurants in the city. we only retain these , users in our graph and drop the others. we also collect , user-contributed images for these restaurants. we construct the restaurant network by embedding venues and their attributes listed in table as nodes. bimodal embeddings. we train separate bimodal embeddings by optimizing the heterogeneous skip-gram objective from eq. ( ) using stochastic gradient descent, and train embeddings for all metapaths enumerated in table . we use restaurant nodes as root nodes for the unbiased random walks and perform walks per root node, each with a walk length of . each embedding has a dimensionality of , uses a window-size of and is trained for epochs. embedding fusion models. we chose two fusion models in our experiments to analyze the efficacy of our embeddings:
- simple concatenation model: we use a model that performs a simple concatenation of the individual metapath-specific embeddings detailed in sect. to exhibit the baseline performance on the tasks detailed in sect. . simple concatenation is a well-established additive fusion technique in multimodal deep learning [ , ].
each of the models uses a ridge regression algorithm to estimate the predictive power of each metapath-specific embedding on the volume regression task. this regressor is jointly trained with the attention model in the attention-weighted model. all models are optimized using stochastic gradient descent with the adam optimizer [ ] with a learning rate of . . in table , we report the results from our experiments on the review-volume prediction task. we observe that metapaths with nodes containing categorical attributes perform significantly better than vector-based features.
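the attention-weighted fusion and ridge regressor described above can be sketched numerically (forward pass only; `score_fn` is a placeholder for the learned two-layer attention network, and the closed-form ridge solution shown here for compactness stands in for the paper's joint SGD training):

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def attention_fuse(embeddings, score_fn):
    """Reweight each metapath-specific embedding by a softmax over
    attention scores, then concatenate them (late fusion)."""
    w = softmax([score_fn(e) for e in embeddings])
    fused = np.concatenate([wi * e for wi, e in zip(w, embeddings)])
    return fused, w

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha I)^-1 X^T y."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
```

the softmax weights sum to one, so an uninformative modality is down-weighted rather than discarded, which is the behavior credited to the attention module in the results below.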
in particular, categorical attributes like cuisines, facilities, and price have a significantly higher coefficient of determination (r²) as compared to visual feature nodes. it is interesting to observe here that nodes like locations, images, and textual reviews are far more numerous than categorical nodes, and part of their decreased performance may be explained by the fact that our method of short walks may not be sufficiently expressive when the number of feature nodes is large. in addition, as mentioned in related work, we performed these experiments with the node2vec model, but since it is not designed for heterogeneous multimodal graphs, it yielded performance scores far below the weakest single modality. a review of the fusion models indicates that taking all the metapaths together can improve performance significantly. the baseline simple concatenation fusion model, commonly used in the literature, is considerably better than the best-performing metapath (venues - facilities - venues). the attention-based model improves significantly over the baseline performance, and while it employs a concatenation scheme similar to the baseline model's, the introduction of the attention module allows it to handle noisy and unreliable modalities. the significant increase in the predictive ability of the attention-based model can be attributed to the fact that while all modalities encode information, some of them may be less informative or reliable than others, and therefore contribute less to the performance of the model. our proposed fusion approach is, therefore, capable of handling weak or noisy modalities appropriately. in this work, we propose an alternative, modular framework for learning from multimodal graphs. we use metapaths as a means to specify semantic relations between nodes, and each of our bimodal embeddings captures similarities between restaurant nodes on a single attribute.
our attention-based model combines separately learned bimodal embeddings using a late-fusion setup for predicting the review volume of the restaurants. while each of the modalities can predict the volume of reviews to a certain extent, a more comprehensive picture is built only by combining complementary information from multiple modalities. we demonstrate the benefits of our fusion approach on the review-volume prediction task and show that a fusion of complementary views provides the best way to learn from such networks. in future work, we will investigate how the technique generalises to other tasks and domains.

key: cord- - tj eve authors: porter, mason a. title: nonlinearity + networks: a vision cord_uid: tj eve

i briefly survey several fascinating topics in networks and nonlinearity. i highlight a few methods and ideas, including several of personal interest, that i anticipate to be especially important during the next several years. these topics include temporal networks (in which the entities and/or their interactions change in time), stochastic and deterministic dynamical processes on networks, adaptive networks (in which a dynamical process on a network is coupled to dynamics of network structure), and network structure and dynamics that include "higher-order" interactions (which involve three or more entities in a network). i draw examples from a variety of scenarios, including contagion dynamics, opinion models, waves, and coupled oscillators. in its broadest form, a network consists of the connectivity patterns and connection strengths in a complex system of interacting entities [ ] .
the most traditional type of network is a graph g = (v, e) (see fig. a) , where v is a set of "nodes" (i.e., "vertices") that encode entities and e ⊆ v × v is a set of "edges" (i.e., "links" or "ties") that encode the interactions between those entities. however, recent uses of the term "network" have focused increasingly on connectivity patterns that are more general than graphs [ ] : a network's nodes and/or edges (or their associated weights) can change in time [ , ] (see section ), nodes and edges can include annotations [ ] , a network can include multiple types of edges and/or multiple types of nodes [ , ] , it can have associated dynamical processes [ ] (see sections , , and ) , it can include memory [ ] , connections can occur between an arbitrary number of entities [ , ] (see section ) , and so on. associated with a graph is an adjacency matrix a with entries a i j . in the simplest scenario, edges either exist or they don't. if edges have directions, a i j = 1 when there is an edge from entity j to entity i and a i j = 0 when there is no such edge. when a i j = 1, node i is "adjacent" to node j (because we can reach i directly from j), and the associated edge is "incident" from node j and to node i. the edge from j to i is an "out-edge" of j and an "in-edge" of i. the number of out-edges of a node is its "out-degree", and the number of in-edges of a node is its "in-degree". for an undirected network, a i j = a ji , and the number of edges that are attached to a node is the node's "degree". one can assign weights to edges to represent connections with different strengths (e.g., stronger friendships or larger transportation capacity) by defining a function w : e −→ r. in many applications, the weights are nonnegative, although several applications [ ] (such as in international relations) incorporate positive, negative, and zero weights. in some applications, nodes can also have self-edges and multi-edges.
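With the convention above (a_ij = 1 for an edge from entity j to entity i), in-degrees are row sums and out-degrees are column sums of the adjacency matrix. A minimal sketch (the function name is ours):

```python
def degrees(adjacency):
    """In- and out-degrees of every node from a binary adjacency matrix,
    using the convention that adjacency[i][j] = 1 encodes an edge from
    node j to node i (so row sums are in-degrees, column sums out-degrees)."""
    n = len(adjacency)
    in_deg = [sum(row) for row in adjacency]
    out_deg = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    return in_deg, out_deg
```

For an undirected network the matrix is symmetric, so both lists coincide with the ordinary degree sequence.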
the spectral properties of adjacency (and other) matrices give important information about their associated graphs [ , ] . for undirected networks, it is common to exploit the beneficent property that all eigenvalues of symmetric matrices are real. traditional studies of networks consider time-independent structures, but most networks evolve in time. for example, social networks of people and animals change based on their interactions, roads are occasionally closed for repairs and new roads are built, and airline routes change with the seasons and over the years. to study such time-dependent structures, one can analyze "temporal networks". see [ , ] for reviews and [ , ] for edited collections. the key idea of a temporal network is that networks change in time, but there are many ways to model such changes, and the time scales of interactions and other changes play a crucial role in the modeling process. there are also other important modeling considerations.

[figure caption: an example of a multilayer network with three layers, with each layer's state nodes and edges drawn in a different colour. each state node (i.e., node-layer tuple) has a corresponding physical node and layer. intralayer edges are drawn as solid arcs and interlayer edges as broken arcs; an interlayer edge is dashed if it connects corresponding entities and dotted if it connects distinct ones, and arrowheads represent unidirectional edges. the network was drawn with tikz-network (by jürgen hackl, available at https://github.com/hackl/tikz-network), which allows one to draw networks (including multilayer networks) directly in a latex file. panel (b) is inspired by fig. of [ ] . panel (d), which is in the public domain, was drawn by wikipedia user cflm and is available at https://en.wikipedia.org/wiki/simplicial_complex.]

to illustrate potential complications, suppose that an edge in a temporal network represents close physical proximity between two people in a short time window (e.g., with a duration of two minutes). it is relevant to consider whether there is an underlying social network (e.g., the friendship network of mathematics ph.d. students at ucla) or if the people in the network do not in general have any other relationships with each other (e.g., two people who happen to be visiting a particular museum on the same day). in both scenarios, edges that represent close physical proximity still appear and disappear over time, but indirect connections (i.e., between people who are on the same connected component, but without an edge between them) in a time window may play different roles in the spread of information. moreover, network structure itself is often influenced by a spreading process or other dynamics, as perhaps one arranges a meeting to discuss a topic (e.g., to give me comments on a draft of this chapter). see my discussion of adaptive networks in section . for convenience, most work on temporal networks employs discrete time (see fig. (b) ). discrete time can arise from the natural discreteness of a setting, discretization of continuous activity over different time windows, data measurement that occurs at discrete times, and so on. one way to represent a discrete-time (or discretized-time) temporal network is to use the formalism of "multilayer networks" [ , ] .
one can also use multilayer networks to study networks with multiple types of relations, networks with multiple subsystems, and other complicated networked structures. a multilayer network m (see fig. (c)) has a set v of nodes; these are sometimes called "physical nodes", and each of them corresponds to an entity, such as a person. the physical nodes have instantiations as "state nodes" (i.e., node-layer tuples, which are elements of the set v m ) on layers in l. one layer in the set l is a combination, through the cartesian product l 1 × · · · × l d , of elementary layers. the number d indicates the number of types of layering; these are called "aspects". a temporal network with one type of relationship has one type of layering, a time-independent network with multiple types of social relationships also has one type of layering, a multirelational network that changes in time has two types of layering, and so on. the set of state nodes in m is v m ⊆ v × l 1 × · · · × l d , and the set of edges is e m ⊆ v m × v m ; an edge ((i, α), (j, β)) ∈ e m indicates that there is an edge from node j on layer β to node i on layer α (and vice versa, if m is undirected). for example, in fig. (c) , there is a directed intralayer edge from (a, ) to (b, ) and an undirected interlayer edge between (a, ) and (a, ). the multilayer network in fig. (c) has three layers, |v | = physical nodes, d = aspect, |v m | = state nodes, and |e m | = edges. to consider weighted edges, one proceeds as in ordinary graphs by defining a function w : e m −→ r. as in ordinary graphs, one can also incorporate self-edges and multi-edges. multilayer networks can include both intralayer edges (which have the same meaning as in graphs) and interlayer edges. the multilayer network in fig. (c) has directed intralayer edges, undirected intralayer edges, and undirected interlayer edges.
in most studies thus far of multilayer representations of temporal networks, researchers have included interlayer edges only between state nodes in consecutive layers and only between state nodes that are associated with the same entity (see fig. (c)). however, this restriction is not always desirable (see [ ] for an example), and one can envision interlayer couplings that incorporate ideas like time horizons and interlayer edge weights that decay over time. for convenience, many researchers have used undirected interlayer edges in multilayer analyses of temporal networks, but it is often desirable for such edges to be directed to reflect the arrow of time [ ] . the sequence of network layers, which constitute time layers, can represent a discrete-time temporal network at different time instances or a continuous-time network in which one bins (i.e., aggregates) the network's edges to form a sequence of time windows with interactions in each window. each d-aspect multilayer network with the same number of nodes in each layer has an associated adjacency tensor a of order (d + ). for unweighted multilayer networks, each edge in e m is associated with a 1 entry of a, and the other entries (the "missing" edges) are 0. if a multilayer network does not have the same number of nodes in each layer, one can add empty nodes so that it does, but the edges that are attached to such nodes are "forbidden". there has been some research on tensorial properties of a [ ] (and it is worthwhile to undertake further studies of them), but the most common approach for computations is to flatten a into a "supra-adjacency matrix" a m [ , ] , which is the adjacency matrix of the graph g m that is associated with m. the entries of diagonal blocks of a m correspond to intralayer edges, and the entries of off-diagonal blocks correspond to interlayer edges.
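A minimal sketch of flattening time layers into a supra-adjacency matrix, assuming the common restriction described above: uniform interlayer coupling of weight omega between copies of the same node in consecutive layers. The function name and conventions are ours.

```python
def supra_adjacency(layers, omega):
    """Flatten a list of n-by-n intralayer adjacency matrices (one per time
    layer) into an (n*t)-by-(n*t) supra-adjacency matrix: diagonal blocks
    hold the intralayer adjacencies, and off-diagonal identity blocks of
    weight omega couple each node to its copy in the adjacent layers."""
    n = len(layers[0])
    t = len(layers)
    size = n * t
    supra = [[0.0] * size for _ in range(size)]
    for l, a in enumerate(layers):          # intralayer (diagonal) blocks
        for i in range(n):
            for j in range(n):
                supra[l * n + i][l * n + j] = float(a[i][j])
    for l in range(t - 1):                  # interlayer (off-diagonal) blocks
        for i in range(n):
            supra[l * n + i][(l + 1) * n + i] = omega
            supra[(l + 1) * n + i][l * n + i] = omega
    return supra
```

Directed interlayer edges (to reflect the arrow of time) would keep only one of the two symmetric assignments in the interlayer loop.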
following a long line of research in sociology [ ] , two important ingredients in the study of networks are examining (1) the importances ("centralities") of nodes, edges, and other small network structures and the relationship of measures of importance to dynamical processes on networks and (2) the large-scale organization of networks [ , ] . studying central nodes in networks is useful for numerous applications, such as ranking web pages, football teams, or physicists [ ] . it can also help reveal the roles of nodes in networks, such as those that experience high traffic or help bridge different parts of a network [ , ] . mesoscale features can impact network function and dynamics in important ways. small subgraphs called "motifs" may appear frequently in some networks [ ] , perhaps indicating fundamental structures such as feedback loops and other building blocks of global behavior [ ] . various types of larger-scale network structures, such as dense "communities" of nodes [ , ] and core-periphery structures [ , ] , are also sometimes related to dynamical modules (e.g., a set of synchronized neurons) or functional modules (e.g., a set of proteins that are important for a certain regulatory process) [ ] . a common way to study large-scale structures is inference using statistical models of random networks, such as through stochastic block models (sbms) [ ] . much recent research has generalized the study of large-scale network structure to temporal and multilayer networks [ , , ] . various types of centrality - including betweenness centrality [ , ] , bonacich and katz centrality [ , ] , communicability [ ] , pagerank [ , ] , and eigenvector centrality [ , ] - have been generalized to temporal networks using a variety of approaches. such generalizations make it possible to examine how node importances change over time as network structure evolves.
in recent work, my collaborators and i used multilayer representations of temporal networks to generalize eigenvector-based centralities to temporal networks [ , ] . one computes the eigenvector-based centralities of nodes for a time-independent network as the entries of the "dominant" eigenvector, which is associated with the largest positive eigenvalue (by the perron-frobenius theorem, the eigenvalue with the largest magnitude is guaranteed to be positive in these situations) of a centrality matrix c(a). examples include eigenvector centrality (by using c(a) = a) [ ] , hub and authority scores (by using c(a) = aa t for hubs and a t a for authorities) [ ] , and pagerank [ ] . given a discrete-time temporal network in the form of a sequence of adjacency matrices a (t) , where the entry a (t) i j denotes a directed edge from entity i to entity j in time layer t, we construct a "supracentrality matrix" c(ω), which couples centrality matrices c(a (t) ) of the individual time layers. we then compute the dominant eigenvector of c(ω), where ω is an interlayer coupling strength. in [ , ] , a key example was the ranking of doctoral programs in the mathematical sciences (using data from the mathematics genealogy project [ ] ), where an edge from one institution to another arises when someone with a ph.d. from the first institution supervises a ph.d. student at the second institution. by calculating time-dependent centralities, we can study how the rankings of mathematical-sciences doctoral programs change over time and the dependence of such rankings on the value of ω. larger values of ω impose more ranking consistency across time, so centrality trajectories are less volatile for larger ω [ , ] .
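The dominant eigenvector at the heart of these eigenvector-based centralities can be computed by power iteration; a minimal sketch (the function name is ours), which one would apply to a centrality matrix c(a) or, in the supracentrality setting, to the coupled matrix c(ω):

```python
def dominant_eigenvector(matrix, iterations=200):
    """Power iteration for the dominant eigenvector of a nonnegative
    square matrix: repeatedly apply the matrix to a positive vector and
    renormalize; the iterates converge to the Perron-Frobenius vector."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(x) for x in w)
        v = [x / norm for x in w]
    return v
```

For a supracentrality matrix, the entry of the dominant eigenvector for state node (i, t) gives the centrality of entity i in time layer t.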
numerous methods for community detection - including inference via sbms [ ] , maximization of objective functions (especially "modularity") [ ] , and methods based on random walks and bottlenecks to their traversal of a network [ , ] - have been generalized from graphs to multilayer networks. they have yielded insights in a diverse variety of applications, including brain networks [ ] , granular materials [ ] , political voting networks [ , ] , disease spreading [ ] , and ecology and animal behavior [ , ] . to assist with such applications, there are efforts to develop and analyze multilayer random-network models that incorporate rich and flexible structures [ ] , such as diverse types of interlayer correlations. activity-driven (ad) models of temporal networks [ ] are a popular family of generative models that encode instantaneous time-dependent descriptions of network dynamics through a function called an "activity potential", which encodes the mechanism to generate connections and characterizes the interactions between entities in a network. an activity potential encapsulates all of the information about the temporal network dynamics of an ad model, making it tractable to study dynamical processes (such as ones from section ) on networks that are generated by such a model. it is also common to compare the properties of networks that are generated by ad models to those of empirical temporal networks [ ] . in the original ad model of perra et al. [ ] , one considers a network with n entities, which we encode by the nodes. we suppose that node i has an activity rate a i = ηx i , which gives the probability per unit time to create new interactions with other nodes. the scaling factor η ensures that the mean number of active nodes per unit time is η⟨x⟩n. we define the activity rates such that x i ∈ [ε, 1], where ε > 0, and we assign each x i from a probability distribution f(x) that can either take a desired functional form or be constructed from empirical data.
the model uses the following generative process:
• at each discrete time step (of length ∆t), start with a network g t that consists of n isolated nodes.
• with a probability a i ∆t that is independent of other nodes, node i is active and generates m edges, each of which attaches to other nodes uniformly (i.e., with the same probability for each node) and independently at random (without replacement). nodes that are not active can still receive edges from active nodes.
• at the next time step t + ∆t, we delete all edges from g t , so all interactions have a constant duration of ∆t. we then generate new interactions from scratch.
this is convenient, as it allows one to apply techniques from markov chains. because entities in time step t do not have any memory of previous time steps, f(x) encodes the network structure and dynamics. the ad model of perra et al. [ ] is overly simplistic, but it is amenable to analysis and has provided a foundation for many more general ad models, including ones that incorporate memory [ ] . in section . , i discuss a generalization of ad models to simplicial complexes [ ] that allows one to study instantaneous interactions that involve three or more entities in a network. many networked systems evolve continuously in time, but most investigations of time-dependent networks rely on discrete or discretized time. it is important to undertake more analysis of continuous-time temporal networks. researchers have examined continuous-time networks in a variety of scenarios. examples include a compartmental model of biological contagions [ ] , a generalization of katz centrality to continuous time [ ] , generalizations of ad models (see section . . ) to continuous time [ , ] , and rankings in competitive sports [ ] . in a recent paper [ ] , my collaborators and i formulated a notion of "tie-decay networks" for studying networks that evolve in continuous time.
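One step of the generative process of the original ad model described above can be sketched as follows (the function name is ours; for simplicity, we assume the activation probabilities a_i ∆t are at most 1):

```python
import random

def activity_driven_step(activities, m, dt=1.0):
    """One time step of an activity-driven model: each node i activates
    with probability a_i * dt and attaches m edges to uniformly random
    distinct other nodes; all edges are discarded at the end of the step."""
    n = len(activities)
    edges = set()
    for i, a in enumerate(activities):
        if random.random() < a * dt:
            targets = random.sample([j for j in range(n) if j != i], m)
            for j in targets:
                # store undirected edges with sorted endpoints to deduplicate
                edges.add((min(i, j), max(i, j)))
    return edges
```

Running this step repeatedly (with fresh edges each time) produces the memoryless sequence of instantaneous networks that makes the model tractable.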
they distinguished between interactions, which they modeled as discrete contacts, and ties, which encode relationships and their strength as a function of time. for example, perhaps the strength of a tie decays exponentially after the most recent interaction. more realistically, perhaps the decay rate depends on the weight of a tie, with strong ties decaying more slowly than weak ones. one can also use point-process models like hawkes processes [ ] to examine similar ideas using a node-centric perspective. suppose that there are n interacting entities, and let b(t) be the n × n time-dependent, real, non-negative matrix whose entries b i j (t) encode the tie strength between agents i and j at time t. in [ ] , we made the following simplifying assumptions:
1. as in [ ] , ties decay exponentially when there are no interactions: db i j /dt = −α b i j , where α ≥ 0 is the decay rate.
2. if two entities interact at time t = τ, the strength of the tie between them grows instantaneously by 1.
see [ ] for a comparison of various choices, including those in [ ] and [ ] , for tie evolution over time. in practice (e.g., in data-driven applications), one obtains b(t) by discretizing time, so let's suppose that there is at most one interaction during each time step of length ∆t. this occurs, for example, in a poisson process. such time discretization is common in the simulation of stochastic dynamical systems, such as in gillespie algorithms [ , , ] . consider an n × n matrix a(t) in which a i j (t) = 1 if node i interacts with node j at time t and a i j (t) = 0 otherwise. for a directed network, a(t) has exactly one nonzero entry during each time step when there is an interaction and no nonzero entries when there isn't one. for an undirected network, because of the symmetric nature of interactions, there are exactly two nonzero entries in time steps that include an interaction. we write b(t + ∆t) = e^{−α ∆t} b(t) + a(t + ∆t). equivalently, if interactions between entities occur at times τ^{(k)} such that 0 ≤ τ^{(1)} < τ^{(2)} < · · · < τ^{(T)}, then at time t ≥ τ^{(T)}, we have b(t) = Σ_{k=1}^{T} e^{−α(t − τ^{(k)})} a(τ^{(k)}). in [ ] , my coauthors and i generalized pagerank [ , ] to tie-decay networks. one nice feature of this tie-decay pagerank is that it is applicable not just to data sets, but also to data streams, as one updates the pagerank values as new data arrives. by contrast, one problematic feature of many methods that rely on multilayer representations of temporal networks is that one needs to recompute everything for an entire data set upon acquiring new data, rather than updating prior results in a computationally efficient way. a dynamical process can be discrete, continuous, or some mixture of the two; it can also be either deterministic or stochastic. it can take the form of one or several coupled ordinary differential equations (odes), partial differential equations (pdes), maps, stochastic differential equations, and so on. a dynamical process requires a rule for updating the states of its dependent variables with respect to one or more independent variables (e.g., time), and one also has (one or a variety of) initial conditions and/or boundary conditions. to formalize a dynamical process on a network, one needs a rule for how to update the states of the nodes and/or edges. the nodes (of one or more types) of a network are connected to each other in nontrivial ways by one or more types of edges. this leads to a natural question: how does nontrivial connectivity between nodes affect dynamical processes on a network [ ] ? when studying a dynamical process on a network, the network structure encodes which entities (i.e., nodes) of a system interact with each other and which do not. if desired, one can ignore the network structure entirely and just write out a dynamical system. however, keeping track of network structure is often a very useful and insightful form of bookkeeping, which one can exploit to systematically explore how particular structures affect the dynamics of particular dynamical processes.
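The tie-decay bookkeeping described above (exponential decay of tie strengths plus a unit increment per interaction) can be sketched as a discrete-time update; the function name is ours.

```python
import math

def tie_decay_update(b, interactions, alpha, dt):
    """One discrete-time tie-decay step: every tie strength decays by a
    factor exp(-alpha * dt), and each interaction (i, j) in the current
    step adds 1 to the corresponding tie strength."""
    decay = math.exp(-alpha * dt)
    n = len(b)
    new_b = [[decay * b[i][j] for j in range(n)] for i in range(n)]
    for i, j in interactions:
        new_b[i][j] += 1.0
    return new_b
```

Because the update only needs the previous matrix and the new interactions, it is naturally suited to data streams, in contrast to methods that must reprocess an entire data set.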
prominent examples of dynamical processes on networks include coupled oscillators [ , ] , games [ ] , and the spread of diseases [ , ] and opinions [ , ] . there is also a large body of research on the control of dynamical processes on networks [ , ] . most studies of dynamics on networks have focused on extending familiar models -such as compartmental models of biological contagions [ ] or kuramoto phase oscillators [ ] -by coupling entities using various types of network structures, but it is also important to formulate new dynamical processes from scratch, rather than only studying more complicated generalizations of our favorite models. when trying to illuminate the effects of network structure on a dynamical process, it is often insightful to provide a baseline comparison by examining the process on a convenient ensemble of random networks [ ] . a simple, but illustrative, dynamical process on a network is the watts threshold model (wtm) of a social contagion [ , ] . it provides a framework for illustrating how network structure can affect state changes, such as the adoption of a product or a behavior, and for exploring which scenarios lead to "virality" (in the form of state changes of a large number of nodes in a network). the original wtm [ ] , a binary-state threshold model that resembles bootstrap percolation [ ] , has a deterministic update rule, so stochasticity can come only from other sources (see section . ). in a binary state model, each node is in one of two states; see [ ] for a tabulation of well-known binary-state dynamics on networks. the wtm is a modification of mark granovetter's threshold model for social influence in a fully-mixed population [ ] . see [ , ] for early work on threshold models on networks that developed independently from investigations of the wtm. 
threshold contagion models have been developed for many scenarios, including contagions with multiple stages [ ] , models with adoption latency [ ] , models with synergistic interactions [ ] , and situations with hipsters (who may prefer to adopt a minority state) [ ] . in a binary-state threshold model such as the wtm, each node i has a threshold r i that one draws from some distribution. suppose that r i is constant in time, although one can generalize it to be time-dependent. at any time, each node can be in one of two states: 0 (which represents being inactive, not adopted, not infected, and so on) or 1 (active, adopted, infected, and so on). a binary-state model is a drastic oversimplification of reality, but the wtm is able to capture two crucial features of social systems [ ] : interdependence (an entity's behavior depends on the behavior of other entities) and heterogeneity (as nodes with different threshold values behave differently). one can assign a seed number or seed fraction of nodes to the active state, and one can choose the initially active nodes either deterministically or randomly. the states of the nodes change in time according to an update rule, which can either be synchronous (such that it is a map) or asynchronous (e.g., as a discretization of continuous time) [ ] . in the wtm, the update rule is deterministic, so this choice affects only how long it takes to reach a steady state; it does not affect the steady state itself. with a stochastic update rule, the synchronous and asynchronous versions of ostensibly the "same" model can behave in drastically different ways [ ] . in the wtm on an undirected network, to update the state of a node, one compares its fraction s i /k i of active neighbors (where s i is the number of active neighbors and k i is the degree of node i) to the node's threshold r i . an inactive node i becomes active (i.e., it switches from state 0 to state 1) if s i /k i ≥ r i ; otherwise, it stays inactive.
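The wtm update rule can be sketched with a synchronous update iterated to a steady state, which is well defined because activation is monotonic; the function names and the toy graph in the test are ours.

```python
def watts_threshold_step(graph, thresholds, active):
    """One synchronous update of the Watts threshold model: an inactive
    node becomes active when the fraction of its active neighbors reaches
    its threshold; active nodes stay active forever."""
    new_active = set(active)
    for node, neighbors in graph.items():
        if node not in active and neighbors:
            frac = sum(1 for nb in neighbors if nb in active) / len(neighbors)
            if frac >= thresholds[node]:
                new_active.add(node)
    return new_active

def run_wtm(graph, thresholds, seeds):
    """Iterate synchronous updates from a seed set until a fixed point."""
    active = set(seeds)
    while True:
        new_active = watts_threshold_step(graph, thresholds, active)
        if new_active == active:
            return active
        active = new_active
```

Because the deterministic update rule is monotonic, the fixed point does not depend on whether the updates are synchronous or asynchronous, only the time to reach it does.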
the states of nodes in the wtm are monotonic, in the sense that a node that becomes active remains active forever. this feature is convenient for deriving accurate approximations for the global behavior of the wtm using branching-process approximations [ , ] or when analyzing the behavior of the wtm using tools such as persistent homology [ ] . a dynamical process on a network can take the form of a stochastic process [ , ] . there are several possible sources of stochasticity: (1) choice of initial condition, (2) choice of which nodes or edges to update (when considering asynchronous updating), (3) the rule for updating nodes or edges, (4) the values of parameters in an update rule, and (5) selection of particular networks from a random-graph ensemble (i.e., a probability distribution on graphs). some or all of these sources of randomness can be present when studying dynamical processes on networks. it is desirable to compare the sample mean of a stochastic process on a network to an ensemble average (i.e., to an expectation over a suitable probability distribution). prominent examples of stochastic processes on networks include percolation [ ] , random walks [ ] , compartmental models of biological contagions [ , ] , bounded-confidence models with continuous-valued opinions [ ] , and other opinion and voter models [ , , , ] . compartmental models of biological contagions are a topic of intense interest in network science [ , , , ] . a compartment represents a possible state of a node; examples include susceptible, infected, zombified, vaccinated, and recovered. an update rule determines how a node changes its state from one compartment to another. one can formulate models with as many compartments as desired [ ] , but investigations of how network structure affects dynamics typically have employed examples with only two or three compartments [ , ] .
researchers have studied various extensions of compartmental models, contagions on multilayer and temporal networks [ , , ] , metapopulation models on networks [ ] for simultaneously studying network connectivity and subpopulations with different characteristics, non-markovian contagions on networks for exploring memory effects [ ] , and explicit incorporation of individuals with essential societal roles (e.g., health-care workers) [ ] . as i discuss in section . , one can also examine coupling between biological contagions and the spread of information (e.g., "awareness") [ , ] . one can also use compartmental models to study phenomena, such as dissemination of ideas on social media [ ] and forecasting of political elections [ ] , that are much different from the spread of diseases. one of the most prominent examples of a compartmental model is a susceptible-infected-recovered (sir) model, which has three compartments. susceptible nodes are healthy and can become infected, and infected nodes can eventually recover. the steady state of the basic sir model on a network is related to a type of bond percolation [ , , , ] . there are many variants of sir models and other compartmental models on networks [ ] . see [ ] for an illustration using susceptible-infected-susceptible (sis) models. suppose that an infection is transmitted from an infected node to a susceptible neighbor at a rate of λ. the probability of a transmission event on one edge between an infected node and a susceptible node in an infinitesimal time interval dt is λ dt. assuming that all infection events are independent, the probability that a susceptible node with s infected neighbors becomes infected (i.e., for a node to transition from the s compartment to the i compartment, which represents both being infected and being infective) during dt is 1 − (1 − λ dt)^s. if an infected node recovers at a constant rate of µ, the probability that it switches from state i to state r in an infinitesimal time interval dt is µ dt.
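A minimal discrete-time sketch of these transition probabilities (the function name is ours): a susceptible node with s infected neighbors becomes infected with probability 1 − (1 − λ dt)^s, and an infected node recovers with probability µ dt.

```python
import random

def sir_step(graph, states, lam, mu, dt):
    """One discrete time step of a stochastic SIR model on a network.
    `states` maps each node to 'S', 'I', or 'R'; each S-I edge transmits
    independently with probability lam*dt, and each infected node recovers
    with probability mu*dt. All updates use the states at the start of
    the step (synchronous update)."""
    new_states = dict(states)
    for node, state in states.items():
        if state == 'S':
            s = sum(1 for nb in graph[node] if states[nb] == 'I')
            if random.random() < 1 - (1 - lam * dt) ** s:
                new_states[node] = 'I'
        elif state == 'I':
            if random.random() < mu * dt:
                new_states[node] = 'R'
    return new_states
```

For small dt this approximates the continuous-time process; exact simulation would instead use a Gillespie-style algorithm.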
when there is no source of stochasticity, a dynamical process on a network is "deterministic". a deterministic dynamical system can take the form of a system of coupled maps, odes, pdes, or something else. as with stochastic systems, the network structure encodes which entities of a system interact with each other and which do not. there are numerous interesting deterministic dynamical systems on networks - just incorporate nontrivial connectivity between entities into your favorite deterministic model - although it is worth noting that some stochastic features (e.g., choosing parameter values from a probability distribution or sampling choices of initial conditions) can arise in these models. for concreteness, let's consider the popular setting of coupled oscillators. each node in a network is associated with an oscillator, and we want to examine how network structure affects the collective behavior of the coupled oscillators. it is common to investigate various forms of synchronization (a type of coherent behavior), such that the rhythms of the oscillators adjust to match each other (or to match a subset of the oscillators) because of their interactions [ ] . a variety of methods, such as "master stability functions" [ ] , have been developed to study the local stability of synchronized states and their generalizations [ , ] , such as cluster synchrony [ ] . cluster synchrony, which is related to work on "coupled-cell networks" [ ] , uses ideas from computational group theory to find synchronized sets of oscillators that are not synchronized with other sets of synchronized oscillators. many studies have also examined other types of states, such as "chimera states" [ ] , in which some oscillators behave coherently but others behave incoherently. (analogous phenomena sometimes occur in mathematics departments.)
a ubiquitous example is coupled kuramoto oscillators on a network [ , , ] , which is perhaps the most common setting for exploring and developing new methods for studying coupled oscillators. (in principle, one can then build on these insights in studies of other oscillatory systems, such as in applications in neuroscience [ ] .) coupled kuramoto oscillators have been used for modeling numerous phenomena, including jetlag [ ] and singing in frogs [ ] . indeed, a "snowbird" (siam) conference on applied dynamical systems would not be complete without at least several dozen talks on the kuramoto model. in the kuramoto model, each node i has an associated phase θ_i(t) ∈ [0, 2π). in the case of "diffusive" coupling between the nodes, the dynamics of the ith node is governed by the equation dθ_i/dt = ω_i + Σ_j b_ij a_ij f_ij(θ_j − θ_i), where one typically draws the natural frequency ω_i of node i from some distribution g(ω), the scalar a_ij is an adjacency-matrix entry of an unweighted network, b_ij is the coupling strength on oscillator i from oscillator j (so b_ij a_ij is an element of an adjacency matrix w of a weighted network), and f_ij(y) = sin(y) is the coupling function, which depends only on the phase difference between oscillators i and j because of the diffusive nature of the coupling. once one knows the natural frequencies ω_i, the model ( ) is a deterministic dynamical system, although there have been studies of coupled kuramoto oscillators with additional stochastic terms [ ] . traditional studies of ( ) and its generalizations draw the natural frequencies from some distribution (e.g., a gaussian or a compactly supported distribution), but some studies of so-called "explosive synchronization" (in which there is an abrupt phase transition from incoherent oscillators to synchronized oscillators) have employed deterministic natural frequencies [ , ] . the properties of the frequency distribution g(ω) have a significant effect on the dynamics of ( ).
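this diffusively coupled kuramoto system can be integrated with a forward-euler scheme, and the degree of phase coherence can be tracked with the standard order parameter r = |(1/n) Σ_j e^{iθ_j}|. the sketch below uses plain python lists and my own function names; it is an illustration, not an optimized solver:

```python
import math

def kuramoto_step(theta, omega, A, B, dt):
    """forward-euler step of dtheta_i/dt = omega_i + sum_j B_ij A_ij sin(theta_j - theta_i),
    with phases wrapped into [0, 2*pi)."""
    n = len(theta)
    new = [0.0] * n
    for i in range(n):
        coupling = sum(B[i][j] * A[i][j] * math.sin(theta[j] - theta[i])
                       for j in range(n))
        new[i] = (theta[i] + dt * (omega[i] + coupling)) % (2 * math.pi)
    return new

def order_parameter(theta):
    """r = |(1/n) sum_j exp(i*theta_j)|; r = 1 indicates full phase synchrony."""
    n = len(theta)
    re = sum(math.cos(t) for t in theta) / n
    im = sum(math.sin(t) for t in theta) / n
    return math.hypot(re, im)
```

with identical natural frequencies and all-to-all attractive coupling, r approaches 1 (full synchrony) from generic initial phases.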
important features of g(ω) include whether it has compact support or not, whether it is symmetric or asymmetric, and whether it is unimodal or not [ , ] . the model ( ) has been generalized in numerous ways. for example, researchers have considered a large variety of coupling functions f_ij (including ones that are not diffusive) and have incorporated an inertia term θ̈_i to yield a second-order kuramoto oscillator at each node [ ] . the latter generalization is important for studies of coupled oscillators and synchronized dynamics in electric power grids [ ] . another noteworthy direction is the analysis of the kuramoto model on "graphons" (see, e.g., [ ] ), an important type of structure that arises in a suitable limit of large networks. an increasingly prominent topic in network analysis is the examination of how multilayer network structures - multiple system components, multiple types of edges, co-occurrence and coupling of multiple dynamical processes, and so on - affect qualitative and quantitative dynamics [ , , ] . for example, can certain types of multilayer structures induce unexpected instabilities or phase transitions in certain types of dynamical processes? there are two categories of dynamical processes on multilayer networks: (1) a single process can occur on a multilayer network; or (2) processes on different layers of a multilayer network can interact with each other [ ] . an important example of the first category is a random walk, where the relative speeds and probabilities of steps within layers versus steps between layers affect the qualitative nature of the dynamics. this, in turn, affects methods (such as community detection [ , ] ) that are based on random walks, as well as anything else in which the diffusion is relevant [ , ] . two other examples of the first category are the spread of information on social media (for which there are multiple communication channels, such as facebook and twitter) and multimodal transportation systems [ ] .
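for the first category, a random walk on a multiplex network is often encoded with a "supra-adjacency" matrix: the layers' adjacency matrices sit in diagonal blocks, and an interlayer weight ω couples each node to its replica in the other layer, so the size of ω controls how often a walker switches layers. the construction is standard, but the two-layer function below (its name and list-of-lists representation are my own choices) is only a simplified sketch:

```python
def supra_transition_matrix(layers, omega):
    """row-stochastic transition matrix for a random walk on a two-layer
    multiplex. `layers` is a pair of adjacency matrices (lists of lists) over
    the same node set; each node is coupled to its replica in the other layer
    with weight omega."""
    n = len(layers[0])
    size = 2 * n
    # supra-adjacency: block-diagonal intralayer blocks plus replica couplings
    W = [[0.0] * size for _ in range(size)]
    for ell, A in enumerate(layers):
        off = ell * n
        for i in range(n):
            for j in range(n):
                W[off + i][off + j] = float(A[i][j])
    for i in range(n):
        W[i][n + i] = omega
        W[n + i][i] = omega
    # row-normalize the supra-adjacency matrix to get transition probabilities
    T = []
    for row in W:
        s = sum(row)
        T.append([x / s if s > 0 else 0.0 for x in row])
    return T
```

for example, with ω equal to the intralayer edge weight, a walker at a degree-1 node steps to its neighbor or jumps to its replica with equal probability 1/2.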
for instance, a multilayer network structure can induce congestion even when a system without coupling between layers is decongested in each layer independently [ ] . examples of the second category of dynamical process are interactions between multiple strains of a disease and interactions between the spread of disease and the spread of information [ , , ] . many other examples have been studied [ ] , including coupling between oscillator dynamics on one layer and a biased random walk on another layer (as a model for neuronal oscillations coupled to blood flow) [ ] . numerous interesting phenomena can occur when dynamical systems, such as spreading processes, are coupled to each other [ ] . for example, the spreading of one disease can facilitate infection by another [ ] , and the spread of awareness about a disease can inhibit spread of the disease itself (e.g., if people stay home when they are sick) [ ] . interacting spreading processes can also exhibit other fascinating dynamics, such as oscillations that are induced by multilayer network structures in a biological contagion with multiple modes of transmission [ ] and novel types of phase transitions [ ] . a major simplification in most work thus far on dynamical processes on multilayer networks is a tendency to focus on toy models. for example, a typical study of coupled spreading processes may consider a standard (e.g., sir) model on each layer, and it may draw the connectivity pattern of each layer from the same standard random-graph model (e.g., an erdős-rényi model or a configuration model). however, when studying dynamics on multilayer networks, it is particularly important in future work to incorporate heterogeneity in network structure and/or dynamical processes. for instance, diseases spread offline but information spreads both offline and online, so investigations of coupled information and disease spread ought to consider fundamentally different types of network structures for the two processes.
network structures also affect the dynamics of pdes on networks [ , , , , ] . interesting examples include a study of a burgers equation on graphs to investigate how network structure affects the propagation of shocks [ ] and investigations of reaction-diffusion equations and turing patterns on networks [ , ] . the latter studies exploit the rich theory of laplacian dynamics on graphs (and concomitant ideas from spectral graph theory) [ , ] and examine the addition of nonlinear terms to laplacians on various types of networks (including multilayer ones). a mathematically oriented thread of research on pdes on networks has built on ideas from so-called "quantum graphs" [ , ] to study wave propagation on networks through the analysis of "metric graphs". metric graphs differ from the usual "combinatorial graphs", which in other contexts are usually called simply "graphs". in metric graphs, in addition to nodes and edges, each edge e has a positive length l_e ∈ (0, ∞]. for many experimentally relevant scenarios (e.g., in models of circuits of quantum wires [ ] ), there is a natural embedding into space, but metric graphs that are not embedded in space are also appropriate for some applications. as the nomenclature suggests, one can equip a metric graph with a natural metric. if a sequence {e_j}_{j=1}^{m} of edges forms a path, the length of the path is Σ_j l_{e_j}. the distance ρ(v_1, v_2) between two nodes, v_1 and v_2, is the minimum path length between them. we place coordinates along each edge, so we can compute a distance between points x_1 and x_2 on a metric graph even when those points are not located at nodes. traditionally, one assumes that the infinite ends (which one can construe as "leads" at infinity, as in scattering theory) of infinite edges have degree 1. it is also traditional to assume that there is always a positive distance between distinct nodes and that there are no finite-length paths with infinitely many edges. see [ ] for further discussion.
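between nodes, the metric-graph distance ρ(v_1, v_2) is an ordinary shortest-path computation in which each edge contributes its length l_e, so dijkstra's algorithm applies directly. a small sketch (function name mine) for finite edge lengths; points in the interiors of edges would need an extra coordinate along their edge but follow the same idea:

```python
import heapq

def metric_graph_distance(edges, u, v):
    """minimum path length rho(u, v) on a metric graph whose edges are given
    as (node_a, node_b, length) triples with positive finite lengths."""
    adj = {}
    for a, b, ell in edges:
        assert ell > 0  # edge lengths on a metric graph are positive
        adj.setdefault(a, []).append((b, ell))
        adj.setdefault(b, []).append((a, ell))
    dist = {u: 0.0}
    heap = [(0.0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == v:
            return d
        if d > dist.get(x, float("inf")):
            continue  # stale heap entry
        for y, ell in adj.get(x, []):
            nd = d + ell
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return float("inf")
```

on a star graph with legs of lengths 1.0 and 2.5, for instance, the distance between the two leaves is 3.5.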
to study waves on metric graphs, one needs to define operators, such as the negative second derivative or more general schrödinger operators. this exploits the fact that there are coordinates for all points on the edges - not only at the nodes themselves, as in combinatorial graphs. when studying waves on metric graphs, it is also necessary to impose boundary conditions at the nodes [ ] . many studies of wave propagation on metric graphs have considered generalizations of nonlinear wave equations, such as the cubic nonlinear schrödinger (nls) equation [ ] and a nonlinear dirac equation [ ] . the overwhelming majority of studies of metric graphs (with both linear and nonlinear waves) have focused on networks with a very small number of nodes, as even small networks yield very interesting dynamics. for example, marzuola and pelinovsky [ ] analyzed symmetry-breaking and symmetry-preserving bifurcations of standing waves of the cubic nls equation on a dumbbell graph (with two rings attached to a central line segment and kirchhoff boundary conditions at the nodes). kairzhan et al. [ ] studied the spectral stability of half-soliton standing waves of the cubic nls equation on balanced star graphs. sobirov et al. [ ] studied scattering and transmission at nodes of sine-gordon solitons on networks (e.g., on a star graph and a small tree). a particularly interesting direction for future work is to study wave dynamics on large metric graphs. this will help extend investigations, as in odes and maps, of how network structures affect dynamics on networks to the realm of linear and nonlinear waves. one can readily formulate wave equations on large metric graphs by specifying relevant boundary conditions and rules at each junction. for example, joly et al. [ ] recently examined propagation of the standard linear wave equation on fractal trees.
because many natural real-life settings are spatially embedded (e.g., wave propagation in granular materials [ , ] and traffic-flow patterns in cities), it will be particularly valuable to examine wave dynamics on (both synthetic and empirical) spatially embedded networks [ ] . therefore, i anticipate that it will be very insightful to undertake studies of wave dynamics on networks such as random geometric graphs, random neighborhood graphs, and other spatial structures. a key question in network analysis is how different types of network structure affect different types of dynamical processes [ ] , and the ability to take a limit as model synthetic networks become infinitely large (i.e., a thermodynamic limit) is crucial for obtaining many important theoretical insights. dynamics of networks and dynamics on networks do not occur in isolation; instead, they are coupled to each other. researchers have studied the coevolution of network structure and the states of nodes and/or edges in the context of "adaptive networks" (which are also known as "coevolving networks") [ , ] . whether it is sensible to study a dynamical process on a time-independent network, a temporal network with frozen (or no) node or edge states, or an adaptive network depends on the relative time scales of the dynamics of network structure and the states of nodes and/or edges of a network. see [ ] for a brief discussion. models in the form of adaptive networks provide a promising mechanistic approach to simultaneously explain both structural features (e.g., degree distributions) and temporal features (e.g., burstiness) of empirical data [ ] . incorporating adaptation into conventional models can produce extremely interesting and rich dynamics, such as the spontaneous development of extreme states in opinion models [ ] .
most studies of adaptive networks that include some analysis (i.e., that go beyond numerical computations) have employed rather artificial adaptation rules for adding, removing, and rewiring edges. this aids mathematical tractability, but it is important to go beyond these limitations by considering more realistic types of adaptation and coupling between network structure (including multilayer structures, as in [ ] ) and the states of nodes and edges. when people are sick, they stay home from work or school. people also form and remove social connections (both online and offline) based on observed opinions and behaviors. to study these ideas using adaptive networks, researchers have coupled models of biological and social contagions with time-dependent networks [ , ] . an early example of an adaptive network of disease spreading is the susceptible-infected (si) model in gross et al. [ ] . in this model, susceptible nodes sometimes rewire their incident edges to "protect themselves". suppose that we have an n-node network with a constant number of undirected edges. each node is either susceptible (i.e., of type s) or infected (i.e., of type i). at each time step, and for each edge between nodes of different types - so-called "discordant edges" - the susceptible node becomes infected with probability λ. for each discordant edge, with some probability κ, the incident susceptible node breaks the edge and rewires to some other susceptible node. this is a "rewire-to-same" mechanism, to use the language from some adaptive opinion models [ , ] . (in this model, multi-edges and self-edges are not allowed.) during each time step, infected nodes can also recover to become susceptible again. gross et al. [ ] studied how the rewiring probability affects the "basic reproductive number", which measures how many secondary infections on average occur for each primary infection [ , , ] .
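a minimal simulation in the spirit of this model fits in a few lines. the sketch below is my own simplified, sequential-sweep variant (not the implementation of gross et al.): infection and rewire-to-same decisions are made per discordant edge, and recovery happens at the end of each sweep.

```python
import random

def adaptive_sis_step(edges, state, lam, kappa, mu, rng):
    """one sweep of a simplified adaptive si/sis model: across each discordant
    (s-i) edge, the susceptible endpoint is infected with probability lam;
    otherwise, with probability kappa, it rewires the edge to another
    susceptible node ("rewire-to-same"), avoiding self- and multi-edges.
    infected nodes then recover with probability mu."""
    nodes = list(state)
    new_edges = set()
    for (a, b) in edges:
        if {state[a], state[b]} == {"S", "I"}:
            s_node = a if state[a] == "S" else b
            if rng.random() < lam:
                state[s_node] = "I"          # infection along the discordant edge
                new_edges.add((a, b))
            elif rng.random() < kappa:
                candidates = [v for v in nodes
                              if state[v] == "S" and v != s_node
                              and (s_node, v) not in edges and (v, s_node) not in edges
                              and (s_node, v) not in new_edges and (v, s_node) not in new_edges]
                if candidates:               # rewire to a random susceptible node
                    new_edges.add((s_node, rng.choice(candidates)))
                else:                        # nowhere to rewire; keep the edge
                    new_edges.add((a, b))
            else:
                new_edges.add((a, b))
        else:
            new_edges.add((a, b))
    for v in nodes:                          # recovery back to susceptible
        if state[v] == "I" and rng.random() < mu:
            state[v] = "S"
    return new_edges, state
```

raising kappa in such simulations isolates infected nodes more quickly, which is the mechanism behind the increased epidemic threshold discussed below.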
this scalar quantity determines the size of a critical infection probability λ* for maintaining a stable epidemic (as determined traditionally using linear stability analysis of an endemic state). a high rewiring rate can significantly increase λ* and thereby significantly reduce the prevalence of a contagion. although results like these are perhaps intuitively clear, other studies of contagions on adaptive networks have yielded potentially actionable (and arguably nonintuitive) insights. for example, scarpino et al. [ ] demonstrated using an adaptive compartmental model (along with some empirical evidence) that the spread of a disease can accelerate when individuals with essential societal roles (e.g., health-care workers) become ill and are replaced with healthy individuals. another type of model with many interesting adaptive variants are opinion models [ , ] , especially in the form of generalizations of classical voter models [ ] . voter dynamics were first considered in the 1970s by clifford and sudbury [ ] as a model for species competition, and the dynamical process that they introduced was dubbed "the voter model" by holley and liggett shortly thereafter [ ] . voter dynamics are fun and are popular to study [ ] , although it is questionable whether it is ever possible to genuinely construe voter models as models of voters [ ] . holme and newman [ ] undertook an early study of a rewire-to-same adaptive voter model. inspired by their research, durrett et al. [ ] compared the dynamics from two different types of rewiring in an adaptive voter model. in each variant of their model, one considers an n-node network and supposes that each node is in one of two states. the network structure and the node states coevolve. pick an edge uniformly at random. if this edge is discordant, then with probability 1 − κ, one of its incident nodes adopts the opinion state of the other.
otherwise, with complementary probability κ, a rewiring action occurs: one removes the discordant edge, and one of the associated nodes attaches to a new node either through a rewire-to-same mechanism (choosing uniformly at random among the nodes with the same opinion state) or through a "rewire-to-random" mechanism (choosing uniformly at random among all nodes). as with the adaptive si model in [ ] , self-edges and multi-edges are not allowed. the models in [ ] evolve until there are no discordant edges. there are several key questions. does the system reach a consensus (in which all nodes are in the same state)? if so, how long does it take to converge to consensus? if not, how many opinion clusters (each of which is a connected component, perhaps interpretable as an "echo chamber", of the final network) are there at steady state? how long does it take to reach this state? the answers and analysis are subtle; they depend on the initial network topology, the initial conditions, and the specific choice of rewiring rule. as with other adaptive network models, researchers have developed some nonrigorous theory (e.g., using mean-field approximations and their generalizations) on adaptive voter models with simplistic rewiring schemes, but they have struggled to extend these ideas to models with more realistic rewiring schemes. there are very few mathematically rigorous results on adaptive voter models, although there do exist some, under various assumptions on initial network structure and edge density [ ] . researchers have generalized adaptive voter models to consider more than two opinion states [ ] and more general types of rewiring schemes [ ] . as with other adaptive networks, analyzing adaptive opinion models with increasingly diverse types of rewiring schemes (ideally with a move towards increasing realism) is particularly important. 
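the two rewiring variants can be sketched as follows. two simplifications, noted in comments, keep the example short and are my own choices rather than the original specification: the sketch picks uniformly among discordant edges (rather than among all edges), and the first endpoint of the chosen edge is always the one that adopts or rewires.

```python
import random

def adaptive_voter(edges, opinion, kappa, rewire_to_random=True,
                   rng=None, max_iter=10000):
    """simplified adaptive voter model: repeatedly pick a discordant edge;
    with probability 1 - kappa one endpoint adopts the other's opinion, and
    with probability kappa the edge is rewired, either to a uniformly random
    node ("rewire-to-random") or to a node sharing the endpoint's opinion
    ("rewire-to-same"). stops when no discordant edges remain."""
    rng = rng or random.Random(0)
    edges = [tuple(e) for e in edges]
    nodes = list(opinion)
    for _ in range(max_iter):
        discordant = [k for k, (a, b) in enumerate(edges)
                      if opinion[a] != opinion[b]]
        if not discordant:
            break
        k = rng.choice(discordant)  # simplification: sample discordant edges only
        a, b = edges[k]
        if rng.random() < 1.0 - kappa:
            opinion[a] = opinion[b]  # simplification: a always adopts from b
        else:
            if rewire_to_random:
                pool = [v for v in nodes if v != a
                        and (a, v) not in edges and (v, a) not in edges]
            else:
                pool = [v for v in nodes if v != a and opinion[v] == opinion[a]
                        and (a, v) not in edges and (v, a) not in edges]
            if pool:  # avoid self- and multi-edges
                edges[k] = (a, rng.choice(pool))
    return edges, opinion
```

with kappa = 0 this reduces to pure voter adoption, which on a small connected graph quickly reaches consensus.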
in [ ] , yacoub kureh and i studied a variant of a voter model with nonlinear rewiring (where the probability that a node rewires or adopts is a function of how well it "fits in" within its neighborhood), including a "rewire-to-none" scheme to model unfriending and unfollowing in online social networks. it is also important to study adaptive opinion models with more realistic types of opinion dynamics. a promising example is adaptive generalizations of bounded-confidence models (see the introduction of [ ] for a brief review of bounded-confidence models), which have continuous opinion states, with nodes interacting either with nodes or with other entities (such as media [ ] ) whose opinion is sufficiently close to theirs. a recent numerical study examined an adaptive bounded-confidence model [ ] ; this is an important direction for future investigations. it is also interesting to examine how the adaptation of oscillators - including their intrinsic frequencies and/or the network structure that couples them to each other - affects the collective behavior (e.g., synchronization) of a network of oscillators [ ] . such ideas are useful for exploring mechanistic models of learning in the brain (e.g., through adaptation of coupling between oscillators to produce a desired limit cycle [ ] ). one nice example is by skardal et al. [ ] , who examined an adaptive model of coupled kuramoto oscillators as a toy model of learning. first, we write the kuramoto system as dθ_i/dt = ω_i + Σ_j b_ij a_ij f_ij(θ_j − θ_i), where f_ij is a 2π-periodic function of the phase difference between oscillators i and j. one way to incorporate adaptation is to define an "order parameter" r_i (which, in its traditional form, quantifies the amount of coherence of the coupled kuramoto oscillators [ ] ) for the ith oscillator and to consider an associated adaptive dynamical system in which the coupling strengths b_ij evolve, where re(ζ) denotes the real part of a quantity ζ and im(ζ) denotes its imaginary part.
in the model ( ), λ_d denotes the largest positive eigenvalue of the adjacency matrix a, the variable z_i(t) is a time-delayed version of r_i with time parameter τ (with τ → 0 implying that z_i → r_i), and z_i* denotes the complex conjugate of z_i. one draws the frequencies ω_i from some distribution (e.g., a lorentz distribution, as in [ ] ), and we recall that b_ij is the coupling strength on oscillator i from oscillator j. the parameter t gives an adaptation time scale, and α ∈ r and β ∈ r are parameters (which one can adjust to study bifurcations). skardal et al. [ ] interpreted scenarios with β > 0 as "hebbian" adaptation (see [ ] ) and scenarios with β < 0 as anti-hebbian adaptation, as they observed that oscillator synchrony is promoted when β > 0 and inhibited when β < 0. most studies of networks have focused on networks with pairwise connections, in which each edge (unless it is a self-edge, which connects a node to itself) connects exactly two nodes to each other. however, many interactions - such as playing games, coauthoring papers and other forms of collaboration, and horse races - often occur between three or more entities of a network. to examine such situations, researchers have increasingly studied "higher-order" structures in networks, as they can exert a major influence on dynamical processes. perhaps the simplest way to account for higher-order structures in networks is to generalize from graphs to "hypergraphs" [ ] . hypergraphs possess "hyperedges" that encode a connection between an arbitrary number of nodes, such as between all coauthors of a paper. this allows one to make important distinctions, such as between a k-clique (in which there are pairwise connections between each pair of nodes in a set of k nodes) and a hyperedge that connects all k of those nodes to each other, without the need for any pairwise connections.
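this distinction is easy to demonstrate computationally: projecting a hypergraph to its "2-section" graph (replacing each hyperedge by a clique on its nodes) can collapse distinct hypergraphs onto the same graph, which is precisely the information that a purely pairwise representation discards. a small sketch (the function name is mine):

```python
from itertools import combinations

def two_section(hyperedges):
    """project a hypergraph to its '2-section' graph: each hyperedge of size k
    is replaced by a k-clique. distinct hypergraphs can project to the same
    graph, which is exactly what a pairwise representation loses."""
    edges = set()
    for he in hyperedges:
        for u, v in combinations(sorted(he), 2):
            edges.add((u, v))
    return edges

# a single 3-way collaboration...
h1 = [frozenset({"a", "b", "c"})]
# ...versus three separate pairwise collaborations
h2 = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"a", "c"})]
```

here h1 and h2 are different hypergraphs, yet their 2-sections are the same triangle.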
one way to study a hypergraph is as a "bipartite network", in which nodes of a given type can be adjacent only to nodes of another type. for example, a scientist can be adjacent to a paper that they have written [ ] , and a legislator can be adjacent to a committee on which they sit [ ] . it is important to generalize ideas from graph theory to hypergraphs, such as by developing models of random hypergraphs [ , , ] . another way to study higher-order structures in networks is to use "simplicial complexes" [ , , ] . a simplicial complex is a space that is built from a union of points, edges, triangles, tetrahedra, and higher-dimensional polytopes (see fig. d ). simplicial complexes approximate topological spaces and thereby capture some of their properties. a p-dimensional simplex (i.e., a p-simplex) is a p-dimensional polytope that is the convex hull of its p + 1 vertices (i.e., nodes). a simplicial complex is a set s of simplices such that (1) every face of a simplex from s is also in s and (2) the intersection of any two simplices σ_1, σ_2 ∈ s is a face of both σ_1 and σ_2. an increasing sequence k_1 ⊂ k_2 ⊂ · · · ⊂ k_l of simplicial complexes forms a filtered simplicial complex; each k_i is a subcomplex. as discussed in [ ] and references therein, one can examine the homology of each subcomplex. in studying the homology of a topological space, one computes topological invariants that quantify features of different dimensions [ ] . one studies "persistent homology" (ph) of a filtered simplicial complex to quantify the topological structure of a data set (e.g., a point cloud) across multiple scales of such data. the goal of such "topological data analysis" (tda) is to measure the "shape" of data in the form of connected components, "holes" of various dimensionality, and so on [ ] . from the perspective of network analysis, this yields insight into types of large-scale structure that complement traditional ones (such as community structure).
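the closure condition (every face of every simplex must itself be in the collection) can be checked and enforced directly when simplices are represented as vertex sets; for such "abstract" simplicial complexes, the intersection condition then holds automatically. a small sketch (function names mine):

```python
from itertools import combinations

def is_simplicial_complex(simplices):
    """check the closure condition: every nonempty proper face of every
    simplex (given as an iterable of vertex sets) is itself in the collection."""
    S = {frozenset(s) for s in simplices}
    for s in S:
        for r in range(1, len(s)):
            for face in combinations(s, r):
                if frozenset(face) not in S:
                    return False
    return True

def closure(simplices):
    """smallest simplicial complex containing the given simplices: add every
    nonempty proper face of every simplex."""
    S = set()
    for s in simplices:
        s = frozenset(s)
        S.add(s)
        for r in range(1, len(s)):
            S.update(frozenset(f) for f in combinations(s, r))
    return S
```

for example, a lone triangle {a, b, c} is not a simplicial complex, but its closure (the triangle plus its three edges and three vertices, seven simplices in total) is.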
see [ ] for a friendly, nontechnical introduction to tda. a natural goal is to generalize ideas from network analysis to simplicial complexes. important efforts include generalizing configuration models of random graphs [ ] to random simplicial complexes [ , ] ; generalizing well-known network growth mechanisms, such as preferential attachment [ ] ; and developing geometric notions, like curvature, for networks [ ] . an important modeling issue when studying higher-order network data is the question of when it is more appropriate (or convenient) to use the formalisms of hypergraphs or simplicial complexes. the computation of ph has yielded insights on a diverse set of models and applications in network science and complex systems. examples include granular materials [ , ] , functional brain networks [ , ] , quantification of "political islands" in voting data [ ] , percolation theory [ ] , contagion dynamics [ ] , swarming and collective behavior [ ] , chaotic flows in odes and pdes [ ] , diurnal cycles in tropical cyclones [ ] , and mathematics education [ ] . see the introduction to [ ] for pointers to numerous other applications. most uses of simplicial complexes in network science and complex systems have focused on tda (especially the computation of ph) and its applications [ , , ] . in this chapter, however, i focus instead on a somewhat different (and increasingly popular) topic: the generalization of dynamical processes on and of networks to simplicial complexes to study the effects of higher-order interactions on network dynamics. simplicial structures influence the collective behavior of the dynamics of coupled entities on networks (e.g., they can lead to novel bifurcations and phase transitions), and they provide a natural approach to analyze p-entity interaction terms, including for p ≥ , in dynamical systems. 
existing work includes research on linear diffusion dynamics (in the form of hodge laplacians, such as in [ ] ) and generalizations of a variety of other popular types of dynamical processes on networks. given the ubiquitous study of coupled kuramoto oscillators [ ] , a sensible starting point for exploring the impact of simultaneous coupling of three or more oscillators on a system's qualitative dynamics is to study a generalized kuramoto model. for example, to include both two-entity ("two-body") and three-entity interactions in a model of coupled oscillators on networks, we write [ ] dx_i/dt = f_i(x_i) + Σ_j Σ_k w_ijk(x_i, x_j, x_k), where f_i describes the intrinsic dynamics of oscillator i and the three-oscillator interaction term w_ijk includes two-oscillator interaction terms w_ij(x_i, x_j) as a special case. an example of n coupled kuramoto oscillators with three-term interactions is given in [ ] , where one draws the coefficients a_ij, b_ij, c_ijk, α_ij, and α_ijk from various probability distributions. including three-body interactions leads to a large variety of intricate dynamics, and i anticipate that incorporating the formalism of simplicial complexes will be very helpful for categorizing the possible dynamics. in the last few years, several other researchers have also studied kuramoto models with three-body interactions [ , , ] . a recent study [ ] , for example, discovered a continuum of abrupt desynchronization transitions with no counterpart in abrupt synchronization transitions. there have been mathematical studies of coupled oscillators with interactions of three or more entities using methods such as normal-form theory [ ] and coupled-cell networks [ ] . an important point, as one can see in the above discussion (which does not employ the mathematical formalism of simplicial complexes), is that one does not necessarily need to explicitly use the language of simplicial complexes to study interactions between three or more entities in dynamical systems.
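a concrete sketch of phase oscillators with both pairwise and three-body terms is below. the specific three-body coupling function sin(θ_j + θ_k − 2θ_i) is one form that appears in the literature on higher-order kuramoto models; it, the sparse edge/triangle lists, and the function name are assumptions of this illustration rather than the exact model of the cited works.

```python
import math

def kuramoto_three_body_step(theta, omega, pairs, triples, dt):
    """forward-euler step for phase oscillators with pairwise terms
    k2*sin(theta_j - theta_i) and three-body terms
    k3*sin(theta_j + theta_k - 2*theta_i). `pairs` holds (i, j, k2) and
    `triples` holds (i, j, k, k3) tuples; phases wrap into [0, 2*pi)."""
    n = len(theta)
    dtheta = [omega[i] for i in range(n)]
    for i, j, k2 in pairs:
        dtheta[i] += k2 * math.sin(theta[j] - theta[i])
    for i, j, k, k3 in triples:
        dtheta[i] += k3 * math.sin(theta[j] + theta[k] - 2 * theta[i])
    return [(theta[i] + dt * dtheta[i]) % (2 * math.pi) for i in range(n)]
```

note that a fully synchronized state (equal phases) is a fixed point of both coupling terms when the natural frequencies are identical, since every sine argument vanishes there.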
nevertheless, i anticipate that explicitly incorporating the formalism of simplicial complexes will be useful both for studying coupled oscillators on networks and for other dynamical systems. in upcoming studies, it will be important to determine when this formalism helps illuminate the dynamics of multi-entity interactions in dynamical systems and when simpler approaches suffice. several recent papers have generalized models of social dynamics by incorporating higher-order interactions [ , , , ] . for example, perhaps somebody's opinion is influenced by a group discussion of three or more people, so it is relevant to consider opinion updates that are based on higher-order interactions. some of these papers use some of the terminology of simplicial complexes, but it is mostly unclear (except perhaps for [ ] ) how the models in them take advantage of the associated mathematical formalism, so arguably it often may be unnecessary to use such language. nevertheless, these models are very interesting and provide promising avenues for further research. petri and barrat [ ] generalized activity-driven models to simplicial complexes. such a simplicial activity-driven (sad) model generates time-dependent simplicial complexes, on which it is desirable to study dynamical processes (see section ), such as opinion dynamics, social contagions, and biological contagions. the simplest version of the sad model is defined as follows. • each node i has an activity rate a_i that we draw independently from a distribution f(x). • at each discrete time step (of length ∆t), we start with n isolated nodes. each node i is active with a probability of a_i ∆t, independently of all other nodes. if it is active, it creates a (p − 1)-simplex (forming, in network terms, a clique of p nodes) with p − 1 other nodes that we choose uniformly and independently at random (without replacement). one can either use a fixed value of p or draw p from some probability distribution.
• at the next time step, we delete all edges, so all interactions have a constant duration. we then generate new interactions from scratch. this version of the sad model is markovian, and it is desirable to generalize it in various ways (e.g., by incorporating memory or community structure). iacopini et al. [ ] recently developed a simplicial contagion model that generalizes an si process on graphs. consider a simplicial complex k with n nodes, and associate each node i with a state x_i(t) ∈ {0, 1} at time t. if x_i(t) = 0, node i is part of the susceptible class s; if x_i(t) = 1, it is part of the infected class i. the density of infected nodes at time t is ρ(t) = (1/n) Σ_{i=1}^{n} x_i(t). suppose that there are D parameters β_1, . . . , β_D (with D ∈ {1, . . . , n − 1}), where β_d represents the probability per unit time that a susceptible node i that participates in a d-dimensional simplex σ is infected from each of the faces of σ, under the condition that all of the other nodes of the face are infected. that is, β_1 is the probability per unit time that node i is infected by an adjacent node j via the edge (i, j). similarly, β_2 is the probability per unit time that node i is infected via the 2-simplex (i, j, k) in which both j and k are infected, and so on. the recovery dynamics, in which an infected node i becomes susceptible again, proceeds as in the sir model that i discussed in section . . one can envision numerous interesting generalizations of this model (e.g., ones that are inspired by ideas that have been investigated in contagion models on graphs). the study of networks is one of the most exciting and rapidly expanding areas of mathematics, and it touches on myriad other disciplines in both its methodology and its applications. network analysis is increasingly prominent in numerous fields of scholarship (both theoretical and applied), it interacts very closely with data science, and it is important for a wealth of applications.
My focus in this chapter has been a forward-looking presentation of ideas in network analysis. My choices of which ideas to discuss reflect their connections to dynamics and nonlinearity, although I have also mentioned a few other burgeoning areas of network analysis in passing. Through its exciting combination of graph theory, dynamical systems, statistical mechanics, probability, linear algebra, scientific computation, data analysis, and many other subjects, and through a comparable diversity of applications across the sciences, engineering, and the humanities, the mathematics and science of networks has plenty to offer researchers for many years.
key: cord- -fgeudou authors: Leung, Alexander K. C.; Davies, H. Dele title: Cervical lymphadenitis: etiology, diagnosis, and management date: - - journal: Curr Infect Dis Rep doi: . /s - - - sha: doc_id: cord_uid: fgeudou

Cervical lymphadenopathy is a common problem in children.
The condition most commonly represents a transient response to a benign local or generalized infection. Acute bilateral cervical lymphadenitis is usually caused by a viral upper respiratory tract infection or streptococcal pharyngitis. Acute unilateral cervical lymphadenitis is caused by streptococcal or staphylococcal infection in % to % of cases. Common causes of subacute or chronic lymphadenitis include cat-scratch disease and mycobacterial infection. Generalized lymphadenopathy is often caused by a viral infection, and less frequently by malignancies, collagen vascular diseases, and medications. Laboratory tests are not necessary in most children with cervical lymphadenopathy. Most cases of cervical lymphadenitis are self-limited and require no treatment. The treatment of acute bacterial cervical lymphadenitis without a known primary source should provide adequate coverage for both Staphylococcus aureus and Streptococcus pyogenes.

Enlarged cervical lymph nodes are common in children [ ]. About % to % of otherwise normal children have palpable cervical lymph nodes [ ]. Cervical lymphadenopathy is usually defined as cervical lymph nodal tissue measuring more than cm in diameter [ ]. Cervical lymphadenopathy most commonly represents a transient reactive response to a benign local or generalized infection, but occasionally it might herald the presence of a more serious disorder (eg, malignancy). Lymphadenitis specifically refers to lymphadenopathies that are caused by inflammatory processes [ •• ]. This article reviews the pathophysiology, etiology, differential diagnosis, clinical and laboratory evaluation, and management of children with cervical lymphadenitis.

The superficial cervical lymph nodes lie on top of the sternomastoid muscle and include the anterior group, which lies along the anterior jugular vein, and the posterior group, which lies along the external jugular vein [ •• ].
The deep cervical lymph nodes lie deep to the sternomastoid muscle along the internal jugular vein and are divided into superior and inferior groups. The superior deep nodes lie below the angle of the mandible, whereas the inferior deep nodes lie at the base of the neck. The superficial cervical lymph nodes receive afferents from the mastoid, tissues of the neck, and the parotid (preauricular) and submaxillary nodes [ •• ]. The efferent drainage terminates in the superior deep cervical lymph nodes [ •• ]. The superior deep cervical nodes drain the palatine tonsils and the submental nodes. The lower deep cervical nodes drain the larynx, trachea, thyroid, and esophagus.

Offending organisms usually first infect the upper respiratory tract, anterior nares, oral cavity, or skin in the head and neck area before spreading to the cervical lymph nodes. The lymphatic system in the cervical area serves as a barrier to prevent further invasion and dissemination of these organisms. The nodal enlargement occurs as a result of proliferation of cells intrinsic to the node (eg, lymphocytes, plasma cells, monocytes, and histiocytes) or by infiltration of cells extrinsic to the node (eg, neutrophils). Because infections involving the head and neck areas are common in children, cervical lymphadenitis is common in this age group [ ]. Causes of cervical lymphadenopathy are listed in Table [ ]. The most common cause is reactive hyperplasia resulting from an infectious process, typically a viral upper respiratory tract infection [ ]. [ ]. Anaerobic bacteria can cause cervical lymphadenitis, usually in association with dental caries and periodontal disease. Group B streptococci and Haemophilus influenzae type b are less frequent causal organisms. Diphtheria is a rare cause.
Bartonella henselae (cat-scratch disease), nontuberculosis mycobacteria (eg, Mycobacterium avium-intracellulare, Mycobacterium scrofulaceum), and Mycobacterium tuberculosis ("scrofula") are important causes of subacute or chronic cervical lymphadenopathy [ ]. Chronic posterior cervical lymphadenitis is the most common form of acquired toxoplasmosis and is the sole presenting symptom in % of cases [ ]. More than % of malignant tumors in children occur in the head and neck, and the cervical lymph nodes are the most common site [ ]. During the first years of life, neuroblastoma and leukemia are the most common tumors associated with cervical lymphadenopathy, followed by rhabdomyosarcoma and non-Hodgkin's lymphoma [ ]. After years of age, Hodgkin's lymphoma is the most common tumor associated with cervical lymphadenopathy, followed by non-Hodgkin's lymphoma and rhabdomyosarcoma.

The presence of cervical lymphadenopathy is an important diagnostic feature for Kawasaki disease. The other features include fever lasting days or more, bilateral bulbar conjunctival injection, inflammatory changes in the mucosa of the oropharynx, erythema or edema of the peripheral extremities, and polymorphous rash. Generalized lymphadenopathy might be a feature of systemic-onset juvenile rheumatoid arthritis, systemic lupus erythematosus, or serum sickness. Certain drugs, notably phenytoin, carbamazepine, hydralazine, and isoniazid, might cause generalized lymphadenopathy. Cervical lymphadenopathy has been reported after immunization with diphtheria-pertussis-tetanus, poliomyelitis, or typhoid fever vaccine [ ]. Rosai-Dorfman disease is a benign form of histiocytosis characterized by generalized proliferation of sinusoidal histiocytes.
The disease usually manifests in the first decade of life with massive and painless cervical lymphadenopathy, often accompanied by fever, malaise, weight loss, neutrophilic leukocytosis, elevated erythrocyte sedimentation rate, and polyclonal hypergammaglobulinemia. Kikuchi-Fujimoto disease (histiocytic necrotizing lymphadenitis) is a benign cause of lymph node enlargement, usually in the posterior cervical triangle [ ]. The condition primarily affects young females. Fever, nausea, weight loss, night sweats, arthralgia, myalgia, or hepatosplenomegaly might be present. The etiology of Kikuchi-Fujimoto disease is unknown, but a viral cause has been implicated [ ]. Classical pathologic findings include patchy areas of necrosis in the cortical and paracortical areas of the enlarged lymph nodes and a histiocytic infiltrate [ ].

The differential diagnosis of neck masses is different in children due to a higher incidence of infectious diseases and congenital anomalies and the relative rarity of malignancies in the pediatric age group. Cervical masses in children might be mistaken for enlarged cervical lymph nodes. In general, congenital lesions are painless and are present at birth or identified soon thereafter [ ]. Clinical features that may help distinguish the various conditions from cervical lymphadenopathy are as follows. The swelling of mumps parotitis crosses the angle of the jaw; cervical lymph nodes, on the other hand, are usually below the mandible [ ]. A thyroglossal cyst is a mass that can be distinguished by its midline location between the hyoid bone and suprasternal notch and the upward movement of the cyst when the child swallows or sticks out his or her tongue. A branchial cleft cyst is a smooth and fluctuant mass located along the lower anterior border of the sternomastoid muscle.
A sternocleidomastoid tumor is a hard, spindle-shaped mass in the sternocleidomastoid muscle, possibly resulting from perinatal hemorrhage into the muscle with subsequent healing by fibrosis [ ]. The tumor can be moved from side to side but not upward or downward. Torticollis is usually present. Cervical ribs are orthopedic anomalies that are usually bilateral, hard, and immovable; diagnosis is established with a radiograph of the neck. A cystic hygroma is a multiloculated, endothelial-lined cyst that is diffuse, soft, and compressible, contains lymphatic fluid, and typically transilluminates brilliantly. A hemangioma is a congenital vascular anomaly that often is present at birth or appears shortly thereafter; the mass is usually red or bluish. A laryngocele is a soft, cystic, compressible mass that extends out of the larynx and through the thyrohyoid membrane and becomes larger with the Valsalva maneuver. There might be associated stridor or hoarseness, and a radiograph of the neck might show an air-fluid level in the mass. A dermoid cyst is a midline cyst that contains solid and cystic components. It seldom transilluminates as brilliantly as a cystic hygroma, and a radiograph might show that it contains calcifications.

A detailed history and a thorough physical examination are essential in the evaluation of the child with cervical lymphadenopathy.

Age of the child. Some organisms have a predilection for specific age groups. S. aureus and group B streptococci have a predilection for neonates; S. aureus, group B streptococci, and Kawasaki disease for infants; viral agents, S. aureus, group A β-hemolytic streptococci, and atypical mycobacteria for children from to years of age; and anaerobic bacteria, toxoplasmosis, cat-scratch disease, and tuberculosis for children from to years of age. Most children with cervical lymphadenitis are to years of age. The prevalence of various childhood neoplasms changes with age.
In general, lymphadenopathy secondary to neoplasia increases in the adolescent age group [ •• ]. Acute bilateral cervical lymphadenitis is usually caused by a viral upper respiratory tract infection or pharyngitis due to S. pyogenes [ , ]. Acute unilateral cervical lymphadenitis is caused by S. pyogenes or S. aureus in % to % of cases [ , ]. The classical cervical lymphadenopathy in Kawasaki disease is usually acute and unilateral. Typically, acute suppurative lymphadenitis is caused by S. aureus or S. pyogenes [ ]. Subacute or chronic cervical lymphadenitis is often caused by B. henselae, Toxoplasma gondii, EBV, CMV, nontuberculosis mycobacteria, and M. tuberculosis [ , ]. Less common causes include syphilis, Nocardia brasiliensis, and fungal infection.

Fever, sore throat, and cough suggest an upper respiratory tract infection. Fever, night sweats, and weight loss suggest lymphoma or tuberculosis. Recurrent cough and hemoptysis are indicative of tuberculosis. Unexplained fever, fatigue, and arthralgia raise the possibility of collagen vascular disease or serum sickness. Preceding tonsillitis suggests streptococcal infection. Recent facial or neck abrasion or infection suggests staphylococcal infection. Periodontal disease might indicate infections caused by anaerobic organisms. A history of cat scratch raises the possibility of B. henselae infection. A history of dog bite or scratch suggests specific causative agents such as Pasteurella multocida and S. aureus. Lymphadenopathy resulting from CMV, EBV, or HIV might follow a blood transfusion. The immunization status of the child should be determined; immunization-related lymphadenopathy might follow diphtheria-pertussis-tetanus, poliomyelitis, or typhoid fever vaccination. The response of cervical lymphadenopathy to specific antimicrobial therapies might help to confirm or exclude a diagnosis. Lymphadenopathy might follow the use of medications such as phenytoin and isoniazid.
Exposure to a person with an upper respiratory tract infection, streptococcal pharyngitis, or tuberculosis suggests the corresponding disease. A history of recent travel should be sought. General malnutrition or poor growth suggests chronic disease such as tuberculosis, malignancy, or immunodeficiency.

All accessible node-bearing areas should be examined to determine whether the lymphadenopathy is generalized. The nodes should be measured for future comparison [ ]. Fluctuation in size of the nodes suggests a reactive process, whereas relentless increase in size indicates a serious pathology [ , ]. Tenderness, erythema, warmth, mobility, fluctuance, and consistency should be assessed. The location of involved lymph nodes often gives clues to the entry site of the organism and should prompt a detailed examination of that site. Submandibular and submental lymphadenopathy is most often caused by an oral or dental infection, although this feature may also be seen in cat-scratch disease and non-Hodgkin's lymphoma. Acute posterior cervical lymphadenitis is classically seen in persons with rubella and infectious mononucleosis [ , ]. Supraclavicular or posterior cervical lymphadenopathy carries a much higher risk for malignancy than does anterior cervical lymphadenopathy. Cervical lymphadenopathy associated with generalized lymphadenopathy is often caused by a viral infection. Malignancies (eg, leukemia or lymphoma), collagen vascular diseases (eg, juvenile rheumatoid arthritis or systemic lupus erythematosus), and some medications are also associated with generalized lymphadenopathy. In lymphadenopathy resulting from a viral infection, the nodes are usually bilateral and soft and are not fixed to the underlying structure. When a bacterial pathogen is present, the nodes can be either unilateral or bilateral, are usually tender, might be fluctuant, and are not fixed.
The presence of erythema and warmth suggests an acute pyogenic process, and fluctuance suggests abscess formation. A "cold" abscess is characteristic of infection caused by mycobacteria, fungi, or B. henselae. In patients with tuberculosis, the nodes might be matted or fluctuant, and the overlying skin might be erythematous but is typically not warm [ ]. Clinical features that help differentiate nontuberculosis mycobacterial cervical lymphadenitis from M. tuberculosis cervical lymphadenitis are summarized in Table [ , ]. Approximately % of patients with lymphadenitis caused by nontuberculosis mycobacteria develop fluctuance of the lymph node and spontaneous drainage; sinus tract formation occurs in % of affected patients [ •• , •• ]. In lymphadenopathy resulting from malignancy, signs of acute inflammation are absent, and the lymph nodes are hard and often fixed to the underlying tissue.

A thorough examination of the ears, eyes, nose, oral cavity, and throat is necessary. Acute viral cervical lymphadenitis is variably associated with fever, rhinorrhea, conjunctivitis, pharyngitis, and sinus congestion [ •• ]. A beefy red throat, exudate on the tonsils, petechiae on the hard palate, and a strawberry tongue suggest infection caused by S. pyogenes [ ]. Unilateral facial or submandibular swelling, erythema, tenderness, fever, and irritability in an infant suggest group B streptococcal infection [ ]. Diphtheria is associated with edema of the soft tissues of the neck, often described as a "bull-neck" appearance. The presence of gingivostomatitis suggests infection with HSV, whereas herpangina suggests infection with coxsackievirus [ ]. Rash and hepatosplenomegaly suggest EBV or CMV infection [ •• ]. The presence of pharyngitis, maculopapular rash, and splenomegaly suggests EBV infection [ ]. Conjunctivitis and Koplik spots are characteristic of rubeola.
The presence of pallor, petechiae, bruises, sternal

Laboratory tests are not necessary in most children with cervical lymphadenopathy. A complete blood cell count might help to suggest a bacterial lymphadenitis, which is often accompanied by leukocytosis with a shift to the left and toxic granulations. Atypical lymphocytosis is prominent in infectious mononucleosis [ ]. Pancytopenia, leukocytosis, or the presence of blast cells suggests leukemia. The erythrocyte sedimentation rate and C-reactive protein are usually significantly elevated in persons with bacterial lymphadenitis. Blood culture should be obtained if the child appears toxic. A rapid streptococcal antigen test or a throat culture might be useful to confirm a streptococcal infection [ ]. An electrocardiogram and echocardiogram are indicated if Kawasaki disease is suspected. Skin tests for tuberculosis should be performed in patients with subacute or chronic adenitis. Chest radiography should be performed if the tuberculin skin test is positive or if an underlying chest pathology is suspected, especially in the child with chronic or generalized lymphadenopathy. Serologic tests for B. henselae, EBV, CMV, brucellosis, syphilis, and toxoplasmosis should be performed when indicated. If the serology is positive, the diagnosis can be established and excision biopsy can be avoided [ • ].

Ultrasonography (US) is the most useful diagnostic imaging modality in the assessment of cervical lymph nodes. US may help to differentiate a solid mass from a cystic mass and to establish the presence and extent of suppuration or infiltration. High-resolution and color US can provide detailed information on the longitudinal and transverse diameter, morphology, texture, and vascularity of the lymph node [ •• , ]. A long-to-short axis ratio greater than suggests benignity, whereas a ratio less than suggests malignancy [ ].
In lymphadenitis caused by an inflammatory process, the intranodal vasculature is dilated, whereas in lymphadenopathy secondary to neoplastic infiltration, the intranodal vasculature is usually distorted. Absence of an echogenic hilus and overall lymph node hyperechogenicity are suggestive of malignancy [ • ]. US can also be used to guide core-needle biopsy for diagnosing the cause of cervical lymphadenopathy in patients without known malignancy and may obviate unnecessary excisional biopsy [ • ]. Advantages of US include cost-effectiveness, noninvasiveness, and absence of radiation hazard. A potential drawback is its lack of absolute specificity and sensitivity in ruling out neoplastic processes as the cause of lymphadenopathy [ •• ]. Diffusion-weighted MRI with apparent diffusion coefficient mapping can be helpful to differentiate malignant from benign lymph nodes and to delineate the solid, viable part of the lymph node for biopsy [ ]. The technique also allows detection of small lymphadenopathies.

Fine-needle aspiration and culture of a lymph node is a safe and reliable procedure to isolate the causative organism and to determine the appropriate antibiotic when bacterial infection is the cause [ ]. Failure to improve or worsening of the patient's condition while on antibiotic treatment is an indication for fine-needle aspiration and culture [ •• ]. All aspirated material should be sent for Gram and acid-fast stain and cultures for aerobic and anaerobic bacteria, mycobacteria, and fungi [ •• , ]. If the Gram stain is positive, only bacterial cultures are mandatory. Polymerase chain reaction testing is a fast and useful technique for the demonstration of mycobacterial DNA fragments [ ].
An excisional biopsy with microscopic examination of the lymph node might be necessary to establish the diagnosis if symptoms or signs of malignancy are present, or if the lymphadenopathy persists or enlarges in spite of appropriate antibiotic therapy and the diagnosis remains in doubt [ ]. The biopsy should be performed on the largest and firmest node that is palpable, and the node should be removed intact with the capsule [ , ].

Treatment of cervical lymphadenopathy depends on the underlying cause. Most cases are self-limited and require no treatment other than observation. This applies especially to small, soft, and mobile lymph nodes associated with upper respiratory infections, which are often viral in origin. These children require follow-up in to weeks. The treatment of acute bacterial cervical lymphadenitis without a known primary infectious source should provide adequate coverage for both S. aureus and S. pyogenes, pending the results of the culture and sensitivity tests [ ]. Appropriate oral antibiotics include cloxacillin, cephalexin, cefprozil, or clindamycin [ ]. Children with cervical lymphadenopathy and periodontal or dental disease should be treated with clindamycin or a combination of amoxicillin and clavulanic acid, which provide coverage for anaerobic oral flora [ , ]. Referral to a pediatric dentist for treatment of the underlying periodontal or dental disease is warranted. Antimicrobial therapy may have to be modified once a causative agent is identified, depending on the clinical response to the existing treatment. Because of its proven efficacy, safety, and narrow spectrum of antimicrobial activity, penicillin remains the drug of choice for adenitis caused by S. pyogenes, except in patients allergic to penicillin [ ]. Methicillin-resistant S. aureus is resistant to many kinds of antibiotics.
Currently, vancomycin is the drug of choice for complicated cases, although trimethoprim-sulfamethoxazole or clindamycin is often adequate for uncomplicated outpatient management [ ]. In most patients, symptomatic improvement should be noted after to hours of therapy. Fine-needle aspiration and culture should be considered if there is no clinical improvement or if the patient's condition deteriorates. If the lymph nodes become fluctuant, incision and drainage should be performed. Failure of regression of the lymphadenopathy after to weeks might be an indication for a diagnostic biopsy [ ]. Indications for early excision biopsy for histology include a lymph node in the supraclavicular area, a lymph node larger than cm, lymph nodes in children with a history of malignancy, and clinical findings of fever, night sweats, weight loss, and hepatosplenomegaly [ • ]. Toxic or immunocompromised children and those who do not tolerate, will not take, or fail to respond to oral medication should be treated with intravenous nafcillin, cefazolin, or clindamycin [ ]. Oral analgesia with medication such as acetaminophen might help to relieve associated pain. The current recommendation for the treatment of isolated cervical tuberculosis lymphadenitis is months of isoniazid, rifampin, and pyrazinamide, followed by months of isoniazid and rifampin by directly observed therapy for drug-susceptible M. tuberculosis [ ]. If possible drug resistance is a concern, ethambutol or an aminoglycoside should be added to the initial three-drug combination until drug susceptibilities are determined, and an infectious disease specialist should be consulted [ ]. Nontuberculous mycobacterial lymphadenitis is best treated with surgical excision of all visibly infected nodes [ •• ]. A recent randomized, controlled trial enrolled children with nontuberculous cervical adenitis to receive surgical excision ( n = ) or antibiotic therapy with clarithromycin and rifabutin ( n = ) [ •• ].
Based on intention-to-treat analysis, the surgical cure rate was % versus % in the medical arm after months. Furthermore, there were more adverse events in the medical arm. However, the major complication of surgery is permanent damage to the facial nerve, which occurred in about % of patients; transient facial nerve involvement occurred in another % [ •• ]. Thus, careful consideration must be given to the location of the adenitis in the determination of node removal. When surgery is not feasible due to risk to the facial nerve, a two-drug antimycobacterial regimen that includes a macrolide should be considered [ , ••, ]. Failure of medical therapy usually cannot be explained as a result of the development of resistant organisms [ •• ]. Cervical lymphadenopathy is a common and usually benign finding in children. In most cases, it is infectious in origin, secondary to a viral upper respiratory tract infection. A good history and thorough physical examination are usually all that is necessary to establish a diagnosis. Most children with cervical lymphadenopathy require no specific treatment, but do need follow-up in to weeks. The treatment of acute bacterial cervical lymphadenitis without a known primary infectious source should provide adequate coverage for both S. aureus and S. pyogenes.
Papers of particular interest, published recently, have been highlighted as: • Of importance; •• Of major importance.

Childhood cervical lymphadenopathy
Palpable lymph nodes of the neck in Swedish schoolchildren
Lymphadenopathy, lymphadenitis and lymphangitis
• Acute, subacute, and chronic cervical lymphadenitis in children. (This is an excellent article that addresses the current approaches to the diagnosis and management of cervical lymphadenitis in children.)
Cervical lymphadenopathy in children
Cervical lymphadenopathy and adenitis
Group A β-hemolytic streptococcal pharyngitis in children
Mycobacterial cervical lymphadenitis in children: clinical and laboratory factors of importance for differential diagnosis
Kikuchi's disease: an important cause of cervical lymphadenopathy
Assessment of lymphadenopathy in children
Cervical lymphadenitis, suppurative parotitis, thyroiditis, and infected cysts
Cervical lymphadenopathy
Management of common head and neck masses
Cervical lymphadenopathy in children: incidence and diagnostic management
Mycobacterial cervical lymphadenitis
•• Surgical excision versus antibiotic treatment for nontuberculous mycobacterial cervicofacial lymphadenitis in children: a multicenter, randomized, controlled trial. (This multicenter, randomized, controlled trial compared surgical excision versus antibiotic treatment for nontuberculous mycobacterial cervicofacial lymphadenitis in children.)
Infectious mononucleosis
Rapid antigen detection testing in diagnosing group A β-hemolytic streptococcal pharyngitis
• A child with cervical lymphadenopathy. (This excellent article offers practical guidelines on the management of childhood cervical lymphadenopathy.)
Ultrasonography of abnormal neck lymph nodes
• Sonographically guided core needle biopsy of cervical lymphadenopathy in patients without known malignancy. (This retrospective study showed a high yield and accuracy of sonographically guided core-needle biopsy for diagnosing the cause of cervical lymphadenopathy.)
Tawfik A: Role of diffusion-weighted MR imaging in cervical lymphadenopathy
Fine needle aspiration in the evaluation of children with lymphadenopathy
Cervical lymphadenopathy in children
Microbiology of cervical lymphadenitis in adults
Methicillin-resistant Staphylococcus aureus: how best to treat now?
Report of the Committee on Infectious Diseases
Lymphadenitis due to nontuberculous mycobacteria in children: presentation and response to therapy

This article was published in part by Leung and Robson [ ] in the Journal of Pediatric Health Care, with permission from Elsevier. It has been significantly updated for the current article. No potential conflicts of interest relevant to this article were reported.

key: cord- -p a zjy authors: Backhausz, Ágnes; Bognár, Edit title: Virus spread and voter model on random graphs with multiple type nodes date: - - journal: nan doi: nan sha: doc_id: cord_uid: p a zjy

When modelling epidemics or the spread of information on online social networks, it is crucial to include not just the density of the connections through which infections can be transmitted, but also the variability of susceptibility. Different people have different chances of being infected by a disease (due to age or general health conditions), or, in the case of opinions, some are more easily convinced by others, or stronger at sharing their own opinions. The goal of this work is to examine the effect of multiple types of nodes on various random graphs such as Erdős–Rényi random graphs, preferential attachment random graphs and geometric random graphs. We used two models for the dynamics: an SEIR model with vaccination, and a version of the voter model for exchanging opinions. In the first case, among others, various vaccination strategies are compared to each other, while in the second case we studied several initial configurations to find the key positions where the most effective nodes should be placed to disseminate opinions. freedom in choosing the position of these vertices.
One of our main interests is epidemic spread. The accurate modelling, regulation or prevention of a possible epidemic is still a difficult problem of the st century. (As of the time of writing, a novel strain of coronavirus has spread to at least other countries from China, although authorities have been taking serious actions to prevent a worldwide outbreak.) As for mathematical modelling, there are several approaches to model these processes, for example, using differential equations, the theory of random graphs or other probabilistic tools [ , , ]. As is widely studied, the structure of the underlying graph can have an important impact on the course of the epidemic. In particular, structural properties such as the degree distribution and clustering are essential to understand the dynamics and to find the optimal vaccination strategies [ , ]. From the point of view of random graphs, in the case of preferential attachment graphs, it is also known that the initial set of infected vertices can have a huge impact on the outcome of the process [ ]: a small proportion of infected vertices is enough for a large outbreak if the positions are chosen appropriately. On the other hand, varying susceptibility of the vertices also has an impact, for example on the minimal proportion of vaccinated people needed to prevent an outbreak [ , ]. In the current work, by computer simulations, we study various cases where these effects are combined in an SEIR model with vaccination: we have a multitype random graph, and the vaccination strategies may depend on the structure of the graph and on the types of the vertices as well. The other family of models which we studied is a variant of the voter model. The voter model is also a common model of interacting particle systems and population dynamics; see e.g. the book of Liggett [ ]. This model is related to epidemics as well: Durrett and Neuhauser [ ] applied the voter model to study virus spread.
The two processes can be connected by the following idea: we can see virus spread as a special case of the voter model with two different opinions (healthy and infected), where only one of the opinions (infected) can be transmitted, while any individual with the infected opinion switches to the healthy opinion after a period of time. Also, the virus can spread only through direct contacts between individuals (edges of the graph), while in the voter model it is possible for the particles to influence one another without being neighbours in the graph. Similarly to the case of epidemics, the structure of the underlying graph has an important impact on the dynamics of the process [ , ]. Here we study a version of this model with various underlying random graphs and multiple types of nodes. We examined virus spread with vaccination and the voter model on random graphs of different structures, where in some cases the nodes of the graph, corresponding to the individuals of the network, are divided into groups representing significantly distinct properties for the process. We studied the possible differences between the processes on different graphs, regarding the nature and magnitude of the distinct results, and tried both to find the reasons for them and to understand how the structure of an underlying network can affect the outcomes. The outline of the paper is as follows. In the second section we give a description of the virus spread in continuous time, and of the discretized model. Parameters are chosen such that they match the real-world data from [ ]. We compare outcomes on different random graphs and the numerical solutions of the differential equations originating from the continuous-time counterpart of the process. We also study different possible choices of the reproduction number r, corresponding to the seriousness of the disease. We examine different vaccination strategies (beginning at the start of the disease or a few days before), and a model with weights on the edges is also mentioned.
In the third section we study the discretized voter model on Erdős–Rényi and Barabási–Albert graphs, first without, then with multiple types of nodes. Later we run the process on random graphs with a geometric structure on the plane. The dynamics of virus spread can be described by differential equations, and therefore it is usually studied from this approach. However, differential equations use only transmission rates calculated from the numbers of contacts in the underlying network, while the structure of the whole graph and other properties are not taken into account. Motivated by the paper "Modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks" of Diána H. Knipl and Gergely Röst [ ], we modelled the process on random graphs of different kinds. In this section we use the same notions and the same sets for most of the parameters; ideas for the vaccination strategies are also derived from there. We examined a model in which individuals experience an incubation period, delaying the process. The dynamics are also affected by a vaccination campaign started at the outbreak of the virus, or a vaccination campaign starting a few days before the outbreak. In the classical SEIR model, each individual is in exactly one of the following compartments during the virus spread: • Susceptible: individuals are healthy, but can be infected. • Exposed: individuals are infected but not yet infectious. • Infectious: individuals are infected and infectious. • Recovered: individuals are not infectious anymore, and immune (cannot be infected again). Individuals can move through the compartments only in the order defined above (it is not possible to skip one in the line). The rates at which individuals leave the compartments are described by probabilities (transmission rates) and the parameters of the model (incubation rate, recovery rate). Individuals in R are immune, so R is a terminal point. SEIR with vaccination: we mix the model with a vaccination campaign.
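Before vaccination enters the picture, the basic compartment flow just described can be sketched numerically. The following forward-Euler integration of the standard SEIR equations is our own illustration: every numeric default is a placeholder (the paper's parameter values were lost in extraction), and the function name is ours.

```python
def seir_euler(n=100_000, i0=10, beta=0.5, nu_e=0.8, nu_i=1/3,
               days=120, dt=0.1):
    """Forward-Euler integration of dS = -beta*S*I/n, dE = beta*S*I/n - nu_e*E,
    dI = nu_e*E - nu_i*I, dR = nu_i*I; returns one (S, E, I, R) tuple per day."""
    s, e, i, r = float(n - i0), 0.0, float(i0), 0.0
    history = []
    for _ in range(days):
        for _ in range(int(round(1 / dt))):
            new_e = beta * s * i / n * dt   # S -> E flow in this small step
            new_i = nu_e * e * dt           # E -> I flow
            new_r = nu_i * i * dt           # I -> R flow
            s -= new_e
            e += new_e - new_i
            i += new_i - new_r
            r += new_r
        history.append((s, e, i, r))
    return history
```

Since the four flows only move mass between compartments, S + E + I + R stays (numerically) constant, which is a convenient sanity check on any implementation.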
The campaign lasts for days, and we vaccinate individuals according to some strategy (described later) so that at the end of the campaign % of the population is vaccinated (if possible). We vaccinate individuals only in S, but the vaccination ensures immunity only with probability q, and only after days. We vaccinate individuals at most once, irrespective of the success of the vaccination. However, vaccinated individuals can be infected within the first days; in this case, nothing differs from the process without vaccination. To describe the underlying network, we use real-life data. We distinguish individuals according to their age; in particular, we consider age groups, since they have different social contact profiles. To describe the social relationships of the different age groups, we used the contact matrix obtained in [ ], where the elements c_{i,j} represent the average number of contacts an individual in age group i has with individuals in age group j. In the sequel, the number of individuals in a given group is denoted by the label of the group, according to figure . The model is specified by the following family of parameters. • r = . : basic reproduction number. It characterizes the intensity of the epidemic. Its value is the average number of infections an infectious individual causes during its infectious period in a population of only susceptible individuals (without vaccination). Later we also study less severe cases with r = . − . . • β_{i,j}: transmission rates. They control the rate of infection between a susceptible individual in age group i and an infectious individual in age group j. They can be derived from r and the contact matrix c. According to [ ], we used β_{i,j} = β · c_{i,j} / n_j, where β = . for r = . . • Latent period: ν_e is the rate at which exposed individuals become infectious. Each individual spends an average of 1/ν_i days in I. • ν_w = : time to develop antibodies after vaccination. • q_i = . for i = , . . . , and q = .
: vaccine efficacy. The probability that a vaccinated individual develops antibodies and becomes immune. • δ = . : reduction in infectiousness. The rate by which the infectiousness of unsuccessfully vaccinated individuals is reduced. • λ_i = Σ_j β_{j,i} · (I_j + δ · I_j^V) is the total rate at which individuals of group i get infected and become exposed. • v_i: vaccination rate functions, determined by a strategy. This describes the rate of vaccination in group i. The dynamics of the virus spread and the vaccination campaign can be described by differential equations ( for each age group), according to [ ]. We would like to create an underlying network and examine the outcome of the virus spread on this given graph. We generated random graphs of different structures with n = nodes, such that each node has a type corresponding to the age of the individual. The age distributions and the numbers of contacts between age groups in the graph comply with the statistical properties detailed above. Since the contact matrix c describes only the average numbers of contacts, the variances can be different. • Erdős–Rényi graphs: we create the nodes, and their types are defined immediately, such that the numbers of each type comply exactly with the age distribution numbers. The relationships within each age group and the connections between different age groups are both modelled with an Erdős–Rényi graph in the following sense: we create an edge between every node in age group i and every node in age group j independently with probability p_{i,j} = c_{i,j} / n_j. • Preferential attachment graphs: initially we start from an Erdős–Rényi graph of size , then we keep adding nodes to the graph sequentially. Every new node chooses its type randomly, with probabilities given by the observed age distribution. After that we create edges between the new node and the old ones with preferential attachment.
If the new node is of type i, then we connect it with an edge independently to an old node v of type j with probability c_{i,j} · d(v) / D_j, where d(v) denotes the actual degree of v, and D_j is the sum of the degrees belonging to nodes of type j. Thus the new node is more likely to attach to nodes with a high degree, resulting in a few enormous degrees in each age group. On the other hand, the connection matrix c is used to ensure that the density of edges between different age groups is different. • Preferential attachment mixed with Erdős–Rényi graphs: we create the nodes again with their types exactly according to the age distribution numbers. First we create five preferential attachment graphs, the i-th of size n_i, so that every node has an average of c_{i,i} neighbours. In particular, the endpoints of the new edges are chosen independently, and the attachment probabilities are proportional to the degrees of the old vertices. Then we attach nodes in different age groups independently with the corresponding probabilities p_{i,j} defined above. • Random graphs of minimal degree variance with the configuration model: we prescribe not only a degree sequence of the nodes, but the degree of each node broken down into parts regarding the age groups, in such a way that the expectations comply with the contact matrix c, but the degrees also have a small variance. The distribution is chosen such that the variance is minimal among distributions supported on the integers, given the expectation. For example, in the case of c_{ , } = , every node in age group has exactly or neighbours in age group , and the average number is , . Our configuration model creates a random graph with the given degree sequence. According to [ ], the expected value of the number of loops and multiple edges divided by the number of nodes tends to zero; thus for n = it is suitable to neglect them and to represent a network of social contacts with this model.
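The first of the constructions above can be made concrete. The sketch below builds a multitype Erdős–Rényi graph in which a group-i node is joined to each group-j node independently with probability c_{i,j}/n_j, so a group-i node has c_{i,j} neighbours in group j on average. This is our reading of the construction; the small contact matrix used in testing is made up (the paper's entries were lost), and for the pairwise probability to be consistent the matrix should satisfy n_i · c_{i,j} = n_j · c_{j,i}.

```python
import random

def multitype_er(group_sizes, contact, rng):
    """Multitype ER graph: returns (types, adjacency) where types[v] is the
    group of node v and adjacency is a list of neighbour sets."""
    # lay out the nodes group by group, matching the prescribed group sizes
    types = [g for g, size in enumerate(group_sizes) for _ in range(size)]
    n = len(types)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            i, j = types[u], types[v]
            # edge probability chosen so that a group-i node has an average
            # of contact[i][j] neighbours among the group_sizes[j] candidates
            p = contact[i][j] / group_sizes[j]
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return types, adj
```

With two groups of 50 nodes and a matrix like [[5, 2], [2, 5]], each node ends up with roughly 5 within-group and 2 cross-group neighbours in expectation.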
In this section we detail how we implemented the discretization of the process on the generated random graphs. Most of the parameters remained the same as in the differential equations; however, to add more realism we toned them with a little variance. As for the transmission rates, we needed to find different numbers to describe the probability of infection, since β was derived from the matrix c and the basic reproduction number r. Since c is built into the structure of our graphs, using the different parameters β_{i,j} would result in adding the same effect of contacts twice to the process. Therefore, instead of β_{i,j}, we determined a universal β according to the definition of r. We set the disease transmission probability to β = r · . , under the assumption that the contact profiles of the age groups are totally implemented in the graph structure; only the average density of the graph (without age groups), the severity of the disease and the average time spent in the infectious period can affect the parameters. The parameters ν_w, δ, q_i remained exactly the same, while ν_e = . and ν_i = hold only in expected value; the exact distributions are given as follows: we built the reduction in infectiousness δ into the process in such a way that an unsuccessfully vaccinated individual spends · . = . days on average in I, instead of modifying β. In the discretized process, we start with infectious nodes chosen randomly and independently from the age groups. We observe a day period with vaccination plus days without it. (In the basic scenarios we start the vaccination at day ; however, we later examine the process with vaccination starting a few days before the outbreak.) At each time step, first the infectious nodes can transmit the disease to their neighbours. Only nodes in S can be infected, and they cannot be infected ever again. When a node becomes infected, its position is set immediately to E, and the number of days it spends in E is also generated.
Second, we check whether a node has reached the end of its latent/infectious period, and we set its position to I or R. (As soon as a node becomes infectious, the number of days it spends in I is also calculated.) Then, at the end of each iteration, we vaccinate . % of the whole population according to some strategy (if possible). Only nodes in S get the vaccination (at most once); whether the vaccination is successful is generated immediately (with probability q_i, according to the node's type). In case of success, the day on which it could become immune without any infection is also noted; if it reaches the th day and is still in S, its position is set to immune. The first question is whether the structure of the underlying graph can affect the process when the edge densities are described by the same contact matrix c. We can ask how it affects the overall outcome and other properties, and how we can explain and interpret these differences with regard to the structure of the graph. We compare the results on the different graphs with each other, and also with the numerical solution of the differential equations describing the process. In this section we study the basic scenario: vaccination starts at day , and we vaccinate by the uniform strategy. This strategy does not distinguish age groups: every day we vaccinate . % of each age group randomly (as long as it is possible). We set r = . . Compared with the numerical solution of the differential equation system from [ ], giving structure to the underlying social network boosted these numbers in every case; however, the differences between the random graphs of different properties are still significant. To study the results in the discretized model, we generated random graphs with n = nodes for each graph structure, and ran the process times on each random graph with independent initial conditions. (In the case of most of the structures we can get rather different outcomes on the same graph with different initial values, concerning the peak of the virus; therefore using the same graph several times is acceptable.)
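The daily loop described in the last two paragraphs can be sketched compactly: transmission along the edges of infectious nodes, then the E→I and I→R transitions when a node's generated period expires, then vaccination of a small fraction of never-vaccinated susceptibles, with immunity arriving only after a delay. All numeric defaults below are placeholders for the extracted-away values, and the function and variable names are ours.

```python
import random

def simulate(adj, beta=0.05, mean_e=1.25, mean_i=3.0, vacc_frac=0.005,
             vacc_days=60, days=90, n_seed=10, q=0.8, delay=14, rng=None):
    """Discretized SEIR with vaccination on adjacency list `adj`; returns
    the final states and the daily count of infectious nodes."""
    rng = rng or random.Random(0)
    n = len(adj)
    state = ['S'] * n                   # S, E, I, R, V (immune via vaccine)
    clock = [0] * n                     # days left in the current E/I period
    vaccinated = [False] * n            # each node is vaccinated at most once
    immune_day = [None] * n             # day a successful shot takes effect
    for v in rng.sample(range(n), n_seed):
        state[v] = 'I'
        clock[v] = max(1, round(rng.expovariate(1 / mean_i)))
    history = []
    for day in range(days):
        # 1) infectious nodes try to infect each susceptible neighbour
        hits = [u for v in range(n) if state[v] == 'I'
                for u in adj[v] if state[u] == 'S' and rng.random() < beta]
        for u in hits:
            if state[u] == 'S':
                state[u] = 'E'
                clock[u] = max(1, round(rng.expovariate(1 / mean_e)))
        # 2) E -> I and I -> R when the generated period ends
        for v in range(n):
            if state[v] in ('E', 'I'):
                clock[v] -= 1
                if clock[v] <= 0:
                    if state[v] == 'E':
                        state[v] = 'I'
                        clock[v] = max(1, round(rng.expovariate(1 / mean_i)))
                    else:
                        state[v] = 'R'
        # 3) vaccinate a fixed fraction of the never-vaccinated S nodes
        if day < vacc_days:
            pool = [v for v in range(n) if state[v] == 'S' and not vaccinated[v]]
            for v in rng.sample(pool, min(len(pool), int(vacc_frac * n))):
                vaccinated[v] = True
                if rng.random() < q:    # the shot succeeds with probability q
                    immune_day[v] = day + delay
        for v in range(n):
            if immune_day[v] is not None and day >= immune_day[v] and state[v] == 'S':
                state[v] = 'V'
        history.append(state.count('I'))
    return state, history
```

Note that a successfully vaccinated node is only moved to the immune state if it is still susceptible when the delay elapses, matching the rule that early infection of a vaccinated node proceeds as usual.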
As we can see in figure (compared to figure ), the random graphs from the configuration model were the closest to the numerical solution of the differential equations. However, the difference in outcomes can be clearly seen from every perspective: almost % of the population ( . % more) was infected by the virus at the end of the time period, the infection peaked almost days sooner (at day ), and the number of infectious cases at the peak is almost twice as large. We got similar, but more severe, results on Erdős–Rényi graphs; however, still only a maximum of . % of the population was infected at the same time. The outcome in the case of graphs with a (partial) preferential attachment structure shows that the distribution of degrees does matter in this process. (This observation initially gave us the idea to model a graph with minimal degree deviation with the help of the configuration model; we were curious whether we could get results closer to the differential equations on such a graph.) On preferential attachment graphs, . % of the individuals came through the disease. What is more, % of the population was infected at the same time at the peak of the virus, already at day ; however, after day the infection was substantially over. With a preferential attachment structure it is very likely that a node with a huge degree gets infected in the early days of the process, irrespective of the choice of the initially infectious individuals, resulting in an epidemic really fast. However, after the dense part of the graph has passed through the virus around day , even though % of the population is still in S, the number of infectious cases is really low. The process on preferential attachment graphs mixed with Erdős–Rényi graphs reflects something in between, yet the preferential properties dominate. It was possible to reach the % vaccination rate during the process, except in the case of preferential attachment graphs. At the end of the th day, a . − . proportion of individuals could acquire immunity after vaccination.
The basic reproduction number is a representative measure of the seriousness of a disease. Generally, diseases with a reproduction number greater than should be taken seriously; however, the number is a measure of potential transmissibility, and it does not actually tell how fast a disease will spread. Seasonal flu has an r of about . , while HIV and SARS are around − [ ]. In this section we investigate how different vaccination strategies can affect the attack rates. We study three very different strategies based on age groups or other properties of the graph. In each strategy, . % of the population is vaccinated at each time step (sometimes exactly, sometimes only in expected value). After a -day vaccination campaign, % of the population should be vaccinated from each age group (if possible). We still start our vaccination campaign at day , and we vaccinate individuals at most once, irrespective of the success of the vaccination. • Uniform strategy: this strategy does not distinguish age groups; every day we vaccinate . % of each age group, chosen randomly. • Contacts strategy: we prioritize age groups with bigger contact numbers, corresponding to the denser parts of the graph (concerning the groups). We vaccinate the second age group for days, then the third age group for days, the first age group for days, the fourth group for days, and at last the age group with the smallest number of contacts for days. This strategy turned out to be the best in the case without any graph structure [ ]. However, in conventional vaccination strategies, in the first days of the campaign, among others, health care personnel are vaccinated, which certainly makes sense, but they can also be interpreted as nodes of the graph not only with high degree, but also with a high probability of getting infected. The effect of vaccination by degrees can also be noticed in the shape of the curves of infected individuals in the age groups developing in time (see figure ).
Not only did the magnitude decrease, but the vaccination also increased the skewness, especially for age group . Vaccination by contacts totally distorted the curve of age group , while the others did not change much. We examine whether vaccination before the outbreak of a virus (only a few, − days, before) could influence the epidemic spread significantly. The delay in the development of immunity after vaccination is one of the key factors of the model; thus pre-vaccination could counterbalance this effect. So far, the edges of the graph represented only the existence of a social contact; however, relationships between individuals can be of different quality. It is also a natural idea to make a connection between the types of the nodes (the age groups of the individuals) and the features of the edges between them. For example, we can generally assume that children of age − (age group one) are more likely to catch a disease from, or transmit it to, any other individual regardless of age, since the nature of contacts with children is usually more intimate. So, on the one hand, creating weights on the edges of the graph can be strongly connected with the types of the given nodes. On the other hand, regardless of age groups, individuals tend to have a few relationships considered more significant from the aspect of a virus spread (individuals sharing a household), while many social contacts are less relevant. For the reasons above, we upgrade our random graphs with a weighting on the edges, taking into account the age groups of the individuals. Regardless of age, the relationships are divided into two types: close and distant. Only % of the contacts of an individual can be close; transmission rates on these edges are much higher, while on distant edges they are reduced. We first examine a model in which the age groups do not affect the weights of the edges: we double the probability of transmitting the disease on edges representing close contacts, and decrease the probabilities on the other edges at a . rate.
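The close/distant weighting can be sketched as follows. This is our own construction: the fraction of close contacts per node, the doubling factor and the reduction factor below are placeholders for the paper's lost numbers, and the function name is ours.

```python
import random

def weight_edges(adj, beta, close_frac=0.2, up=2.0, down=0.8, rng=None):
    """Mark a fraction of each node's edges as 'close' and return a map
    {(u, v): transmission probability} with u < v: close edges get
    beta * up, the remaining (distant) edges get beta * down."""
    rng = rng or random.Random(0)
    close = set()
    for u, neigh in enumerate(adj):
        if not neigh:
            continue
        k = max(1, int(close_frac * len(neigh)))   # at least one close contact
        for v in rng.sample(sorted(neigh), k):
            close.add((min(u, v), max(u, v)))      # an edge is close if either
    # collect each undirected edge once                endpoint marked it
    edges = {(min(u, v), max(u, v)) for u, nb in enumerate(adj) for v in nb}
    return {e: beta * (up if e in close else down) for e in edges}
```

In the paper the factors are balanced so that the expected total r is unchanged; here the two factors are fixed placeholders, so the balance only holds approximately for a given edge split.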
In expected value, the total r of the disease has not changed. However, the results on the graphs can differ from the unweighted cases. With the basic scenario, we experience the biggest difference on Erdős–Rényi graphs; however, the models with edge weights give attack rates bigger by only . . Only on the configuration model do we get a less severe virus spread with edge weights. In this section we study the discretized voter model, in which particles exchange opinions from time to time, depending on the relationships between them. We created a simplified process to be able to examine the outcome on larger graphs. First, we examine this simplified process on Erdős–Rényi and Barabási–Albert graphs, then multiple types of nodes are introduced. With a possible interpretation of the different types of nodes in the graphs, we generalize the voter model. Later we examine the "influencer" model, in which our aim is, in contrast to the SEIR model, to spread one of the opinions. The process in continuous time can be modelled with a family of independent Poisson processes. For each pair of vertices (x, y) we have a Poisson process of rate q(x, y), which describes the moments at which x convinces y. The rate q(x, y) increases as the distance d(x, y) decreases. In this case, every time a vertex is influenced by another one, it changes its opinion immediately. In our discretized voter process, there are two phases at each time step. First, the nodes try to share their opinions and influence each other; a node's message is successful with a probability depending on the distance between the two vertices. More precisely, vertices that are closer to each other have a higher chance that their opinion "reaches" the other one. Still, every vertex can "hear" different opinions from many other vertices. In the second phase, if a node v receives the messages of m_1 nodes with opinion 1 and m_2 nodes with opinion 2, then v will represent opinion 1 with probability m_1 / (m_1 + m_2) during the next step, and opinion 2 otherwise.
If a node v does not receive any opinions from others at a time step, then its opinion remains the same. This way, the order of the influencing messages in the first phase can be arbitrary, and it is also possible for two nodes to exchange opinions. Now we specify the probability that a vertex x manages to share its opinion with vertex y in the first phase. We transform the graph distances d(x, y) into a matrix of transmission probabilities with the choice q(x, y) = e^{−c·d(x,y)}, where c is a constant. This is not a direct analogue of the continuous case, but it is still a natural choice of a decreasing function of d. (Usually we use c = ; however, later we also investigate the cases c ∈ { . , , , }. Decreasing c escalates the process.) In the model above, on a graph on n nodes, at every time step our algorithm consists of O(n^2) steps, which can be problematic for bigger graphs if our aim is to make samples with viter = or iterations of the voter model (in the sequel, viter denotes the number of steps of the voter model). However, with c = , a node x convinces a vertex y with d(x, y) = only with a probability of e^− = , . Thus we used the following simplified model: when we created a graph, we stored the list of edges and also calculated for each node the neighbours at distance . The simplified voter model spreads opinions only on this reduced set of connections, with the proper probabilities. We were able to run the original discretized model only on graphs with n = , while the simplified version can deal with n = nodes. We made the assumption that neglecting those tiny probabilities cannot significantly change the outcome of the process. From now on, we only model the simplified version of the process. First, we study the voter model on Erdős–Rényi(n, p) and Barabási–Albert(n, m) random graphs. • ER(n, p): we create n nodes, and connect every possible pair x, y ∈ V independently with probability p. • BA(n, m): initially we start with a graph G_0.
At every time step we add a new node v to the graph and attach it to the old nodes with exactly m edges, using preferential attachment probabilities. Let D denote the sum of degrees in the graph before adding the new node; then we attach an edge independently to node u with probability d(u)/D. We generated graphs starting from a g_0 = ER( , m( − )) graph of matching density. Multiple edges can be created by the algorithm, but loops cannot occur. Attachment probabilities are not updated during a time step. Multiple edges do matter in the voter model, since they represent a stronger relationship between individuals: opinion on a k-multiple edge is transmitted with a k-times bigger probability.

First, we examine the voter model on graphs without any nodes of multiple types, to understand the pure differences in the process that result from the structure. We compare graphs with the same density: BA( , m) graphs with m ∈ { , , . . . , } and ER( , p), where p ∈ [ . , . ]. The initial probability of opinion is set to in both graphs. We compare the probability that the opinion disappears within viter = iterations of the voter model. We generated different graphs from each structure and ran the voter model on each times with independent initial opinions; altogether the results of trials were averaged. Figure shows the results. Before the phase transition of Erdős–Rényi graphs, that is, with p < ln n / n ≈ for n = nodes (BA graphs of the same density belong to m ≤ ), the graph consists of several components.

As mentioned before, in what follows we investigate extreme outcomes of the process caused by one of the most important properties of Barabási–Albert graphs. Since nodes do not play a symmetrical role in Barabási–Albert graphs, fixing the proportion of nodes representing opinion (we usually use v = , so nodes represent opinion in expected value) but changing the position of these nodes in the graph can lead to different results.
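The preferential attachment rule described above, with multi-edges allowed, loops excluded, and attachment probabilities frozen during a step, could be sketched as follows (function and variable names are ours):

```python
import random

def ba_multigraph(n, m, init_edges, rng):
    """Grow a preferential-attachment multigraph: each new node draws m
    endpoints (with replacement, so multi-edges can occur) proportionally
    to the degrees in the graph *before* the node is added."""
    edges = list(init_edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    next_id = max(degree) + 1
    for _ in range(n):
        nodes = list(degree)                      # new node excluded: no loops
        weights = [degree[u] for u in nodes]      # frozen during this step
        targets = rng.choices(nodes, weights=weights, k=m)
        for t in targets:
            edges.append((next_id, t))
        for t in targets:                         # degrees updated afterwards
            degree[t] += 1
        degree[next_id] = m
        next_id += 1
    return edges, degree
```

Because endpoints are drawn with replacement, a node can receive several of the m edges at once, which is exactly the stronger-relationship multi-edge case the text mentions.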
We examined the following three ways of setting the initial opinions:
• Randomly: each individual chooses opinion with probability v.
• "Oldest nodes": we deterministically set the first nodes of the graph to represent opinion . These nodes usually have the largest degrees, so they play a crucial part in the process. Not only do they have large degrees, but they are also very likely to be connected to each other (this is the densest part of the graph).
• "Newest nodes": we deterministically set the last nodes of the graph to represent opinion . These nodes usually have only m edges, and with high probability they are not connected to each other.

The histogram in figure shows the distribution of nodes with opinion for the three different choices of the vector l after viter = iterations of the voter model on BA( , ) graphs. We observe differences in the probability that opinion disappears: with the random opinion distribution it is %, with l_new almost one third of the cases resulted in the extinction of opinion , while for l_old this probability was negligible ( . %). In fact, for l_old, after only one iteration of the voter model it is impossible to see any structure in the distribution of individuals with opinion : the vector of opinions has become essentially random, though with a probability of . . Indeed, in a single step of the voter model the individuals with opinion could double in number; however, opinion can no longer take advantage of any special positions in the graph. All in all, giving a certain opinion to individuals who are more likely to be connected in the graph reduces the probability of its disappearing, since such nodes can keep their opinion with high probability, while with an opinion scattered across the graph (in the case of l_new as well as l_rand), with a dynamic parameter setting of c, the number of individuals holding that opinion can drop drastically even in a few time steps.
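The three placements above rely only on the insertion order of the nodes, so they are easy to reproduce; a minimal sketch (the function name and the 0/1 opinion coding are our own):

```python
import random

def init_opinions(n, k, mode, rng):
    """Assign opinion 1 to k of n nodes; nodes are indexed in the order they
    joined the graph, so low indices are the 'oldest' (high-degree core)."""
    if mode == "random":
        chosen = set(rng.sample(range(n), k))
    elif mode == "oldest":
        chosen = set(range(k))            # densest, mutually connected part
    elif mode == "newest":
        chosen = set(range(n - k, n))     # typically only m edges each
    else:
        raise ValueError(mode)
    return [1 if i in chosen else 0 for i in range(n)]
```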
It is a natural idea to divide the nodes of a network into separate groups according to some aspect, where the properties of the different groups can affect processes on the graph. There are various ways to classify nodes into different types; we examined a simple one and another widely used method. In the following section we only have nodes of two types, though the definitions still hold for the multiple-type case. From now on, for purposes of discussion, we refer to the types as red and blue. We consider two different ways to assign types to the nodes:
• Each node, independently of the others, chooses to be red with probability p_r and blue with probability 1 − p_r. (Here the index r corresponds to random.)
• Since preferential attachment graphs are dynamic models, they enable another very natural and logical way of choosing types: after a new node has connected to the graph with some edges, the node chooses its type with probabilities corresponding to the proportions of its neighbours' types (see also [ , , ]). This way nodes with the same type tend to connect to each other with a higher probability, forming a "cluster" in the graph. We only examined linear models.

According to [ ], a few properties of the initial graph g_0 and the initial types of nodes determine the asymptotic behaviour of the proportion of types. Let g_n denote the graph when n nodes have been added to the initial graph g_0, and let a_n and b_n denote the number of red and blue nodes in g_n. The following theorem holds for the asymptotic proportions of red and blue nodes, A_n = a_n / (a_n + b_n) and B_n = b_n / (a_n + b_n). Let x_n and y_n denote the sums of the degrees of red and blue nodes in g_n, respectively, and suppose that x_0, y_0 ≥ . Then A_n converges almost surely as n → ∞. Furthermore, the limiting distribution of A := lim_{n→∞} A_n has full support on the interval [0, 1], has no atoms, and depends only on x_0, y_0, and m.
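The second, neighbour-proportional type choice (the linear model) can be sketched for a single freshly attached node; the function name is ours, and the red/blue labels follow the text:

```python
import random

def assign_type(new_node_targets, types, rng):
    """Linear-model type choice: a freshly attached node turns red with
    probability equal to the fraction of red endpoints among its m edges
    (endpoints are counted with multiplicity, so multi-edges weigh more)."""
    reds = sum(1 for t in new_node_targets if types[t] == "red")
    p_red = reds / len(new_node_targets)
    return "red" if rng.random() < p_red else "blue"
```

Calling this right after each attachment step of the graph-growing loop yields the clustered type pattern described above, since nodes surrounded by red neighbours are themselves likely red.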
This property has great significance, since we would like to compare graphs with the same proportion of red and blue nodes; the theorem guarantees the existence of such a limiting proportion. What is more, when generating Barabási–Albert graphs with multiple edges, we can examine the speed of convergence. We set the types of the nodes in the initial graph g_0 in such a way that not necessarily half of the nodes are blue, but the total degree of the blue nodes is approximately half of the whole degree sum. (Of course, in the case of an initial Erdős–Rényi graph these coincide in expected value; however, the second method gives a more stable proportion of types. In this case, by stable we mean that the proportions stay closer to .)

In the voter model we can use nodes with multiple types, defined in the last section, with the following interpretation. Each node (individual) has two types according to two different aspects: each node chooses a type from both aspects, and the choices according to the different aspects are independent. (Since four combinations of these are possible, we could equivalently say that each node chooses one of the possible pairs of types.) During the voter model, the interaction of nodes with different types influences the process in the following way. In line with the names of the types, we expect that a good reasoner node can convince any node with a higher probability than a bad reasoner node can, and that any node can convince a node of unstable type with a higher probability than a node of stable type. In a step of the voter model, when node x influences node y, the probability of success should depend only on node x's ability to convince (good/bad reasoner type) and node y's stability (stable/unstable type). We investigated the model with a symmetric parameter set: the probability of a good reasoner node convincing a stable one is equal to the probability of a bad reasoner node convincing an unstable one.
We also made the assumption that a bad reasoner node can convince a stable node with probability . The voter model was examined with different parameter sets c( ) ≥ c( ) and different possible choices of types in the graph. In what follows we examine a special case of the voter model with multiple node types, in which the aim is to spread an initially underrepresented opinion. This problem can be related to finding good marketing strategies on online social networks, where the "opinion" might concern a commercial product or a certain political conviction. We investigate the following "influencer" model: the types of a node according to the different aspects are not independent, nor is the vector l of initial opinions. The nodes of the graph are divided into two groups, influencers and non-influencers. Influencers usually form the smaller population; they represent opinion , which we want to spread across the graph. They are good reasoners and also stable, while non-influencers are bad reasoners according to the ability to convince, and can be stable as well as unstable. By the definition of the c values, it is impossible for a bad reasoner node to convince a stable one, so influencers represent opinion for the whole process.

First, we study a case in which the nodes of a BA graph get a type randomly or deterministically, not according to preferential attachment. We study the equivalent of the case in subsection with multiple node types. In each graph, % of the individuals ( nodes) are influencers. In BA graphs the influencers are placed randomly, on the "oldest nodes", or on the "newest nodes" of the graph. In ER graphs the influencers are placed randomly (since the roles of the nodes are symmetric, they could be placed anywhere with no difference in the outcome). We would like to examine the differences in opinion spread.
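The type-dependent success probability can be captured by a 2x2 table. A sketch follows; the numeric values 0.5 and 1.0 are placeholders of our own (the paper's values are not preserved), except that bad reasoners convince stable nodes with probability zero, as stated in the text:

```python
def convince_prob(sender_reasoner, receiver_stability, c):
    """Success probability of x influencing y, depending only on x's
    good/bad-reasoner type and y's stable/unstable type; c is a
    table c[reasoner][stability] of probabilities."""
    return c[sender_reasoner][receiver_stability]

# Symmetric parameter set: P(good convinces stable) equals
# P(bad convinces unstable); bad reasoners never convince stable nodes.
c = {
    "good": {"stable": 0.5, "unstable": 1.0},   # placeholder values
    "bad":  {"stable": 0.0, "unstable": 0.5},   # placeholder values
}
```

With this table, an influencer (good reasoner, stable) can never be converted by a non-influencer (bad reasoner), so influencers keep opinion for the whole process, as described above.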
We are also interested in whether it is possible to convince all the nodes of the graph of opinion , and if so, we calculate the average time needed to do so. We observed differences in the outcome over runs (on different random graphs), with the parameter set c = [ , ] and m = (see figure ). (We wanted to exclude cases in which the proportion of one of the types is negligibly small.) For these reasons, we created ER and BA graphs with multiple node types, where the proportion of good reasoners is , chosen according to preferential attachment in BA graphs, and we set these nodes to be the influencers (they are stable, while non-influencer individuals can be stable or unstable with probability ). So in expected value half of the nodes are influencers, but in BA graphs we can experience greater deviation (in expected value half of the nodes are good reasoners according to the ability to convince, and of the nodes have stable type). For ER graphs the only meaningful way to create types is the random choice, but the same proportions hold (proportions in BA graphs are still a bit greater). However, in terms of disappearing opinions the results are rather different. On ER graphs opinion could not disappear in any of the cases, while on BA graphs the outcome strongly depended on the exact initial proportion of influencers: on the same graph (and hence with the same proportion of influencers) opinion either disappears within the first iterations of the voter model, or holds a high proportion yet never reaches the limit. This main difference results from the fact that we cannot set the proportion of types in BA graphs exactly; thus the co-existence of opinions is rather sensitive to changes in the number of influencers in BA graphs. (In ER graphs only % of the examined runs resulted in the disappearance of opinion , even with influencers.)
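The disappearance probabilities reported throughout these experiments are Monte-Carlo estimates over repeated runs; a small generic harness for that (the function name and interface are our own):

```python
import random

def extinction_probability(run_once, trials, rng):
    """Monte-Carlo estimate of the probability that an opinion dies out.
    run_once(rng) should simulate one full voter-model run on a freshly
    generated graph and return True iff the tracked opinion disappeared."""
    return sum(run_once(rng) for _ in range(trials)) / trials
```

Here `run_once` would wrap graph generation, initial opinion placement, and viter voter-model steps; averaging over independently generated graphs and independent initial opinions matches the trial scheme described in the text.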
In this section we examine the voter model on a random graph with a geometric structure in the plane. Since this graph model is not dynamic, nodes can only choose their type randomly (or according to some deterministic strategy related to the positions of the nodes in the plane). First, however, we study the model without multiple types, with constant c ∈ { . , . }. The voter model is rather time-consuming, and even in the case of parameter c = . , the probability of conviction q(x, y) = e^(−c·d(x,y)) for d(x, y) = is ≈ . Thus we create a reduced graph from RP(n) by erasing the edges with d(x, y) > . We can assume that results on the reduced graph approximate the outcome on the original one, since the transmission of opinions on the erased edges is negligible. The average degree in the reduced graph is still . Modifying the voter model to spread opinions only on these edges makes the algorithm considerably faster, making it manageable to run the process on graphs with many (n = ) nodes.

First, we would like to understand the behaviour of the process without multiple types on the graph. In this section we take advantage of the geometric structure of the graph and examine different deterministic and random choices for the initial opinion vector l. We study how these alternative options influence the outcome (the probability of the disappearance of an opinion, the expected time needed for extinction). Another interesting question is whether, after a given number of iterations t of the voter model, we can still observe any nice shape in the arrangement of the opinions. In each of the following choices of initial opinions, in expected value % of the individuals are given opinion , and the rest represent opinion . The discretized voter model with the different initial opinion vectors l was performed on different graphs for viter = steps. With n = nodes, c = . , and only % of the population representing opinion , the opinion disappeared only in a few (negligible) cases for any examined l.
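The reduced geometric graph, with long edges erased because their transmission probability is negligible, could be built as follows (names and the unit-square sampling are our own assumptions about the RP(n) model):

```python
import math
import random

def random_plane_graph(n, radius, c, rng):
    """Drop n points uniformly on the unit square; keep only edges with
    Euclidean distance <= radius (the 'reduced' graph), and attach the
    transmission probability q(x, y) = exp(-c * d(x, y)) to each edge."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pts[i], pts[j])
            if d <= radius:
                edges.append((i, j, math.exp(-c * d)))
    return pts, edges
```

The node positions returned here also make the deterministic placements of the next paragraph (opinion in a corner or in the center) straightforward to implement.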
According to figure , there is no doubt that after time steps the deterministic positions of the initial opinions are still recognizable (even after viter = steps). We can generally state that clustering individuals with the same opinion in a group makes the proportion of opinions more stable during the process: across the different runs we observed that the proportion of opinions (from the initial . ) stayed within [ . , . ] with probability more than . when opinion was placed in a corner of the graph, while in all other cases this probability was significantly lower (less than . ). With this placement of opinion , the average distance between individuals with opinion was the smallest, while the average distance between the different opinion groups was the largest among the examined cases, resulting in a moderate change of opinions. The number of individuals representing opinion decreased below only with probability . , while when placing opinion in the center this probability is . . Opinion is most likely to disappear (with probability . ), or to be reduced to an insignificant amount, under the random placement of opinion . However, the inverse extreme cases are also more likely to occur, since the probability of the proportion of opinion exceeding . is outstandingly high in this scenario. Moreover, despite the high probability of extinction, in expected value we get the highest proportion of opinion after viter = iterations of the voter model with the random initial configuration.

We also examined random graphs in the plane with a random or deterministic type choice for the nodes, corresponding to the two different aspects as before. We set the type pairs to form the influencer model defined before. Due to the fact that average distances in this random graph model are significantly larger than in ER and BA graphs, which have the small-world property, a small proportion of influencers fails to spread its opinion in most of the cases, and not even setting all non-influencer individuals to the unstable type helps the problem.
Even with random influencer positions, calculating the average time needed to convince all nodes of the graph is challenging because of its time cost. After increasing the number of influencers to , in half of the runs opinion was able to reach all nodes of the graph within time steps, sometimes after only a relatively small number of iterations, suggesting that the exact positions in the plane of the randomly chosen individuals do affect the process significantly.

References
• coexistence in preferential attachment networks
• evolving voter model on dense random graphs
• on the spread of viruses on the internet
• a trust model for spreading gossip in social networks
• random graphs, second edition
• on critical vaccination coverage in multitype epidemics
• graphs with specified degree distributions, simple epidemics, and local vaccination strategies
• the noisy voter model on complex networks
• coexistence results for some competition models
• random graph dynamics
• sir epidemics and vaccination on random graphs with clustering
• random graphs and complex networks
• preferential attachment graphs with co-existing types of different fitnesses
• gergely röst, modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks
• interacting particle systems
• daihai he, preliminary estimation of the basic reproduction number of novel coronavirus ( -ncov) in china, from to : a data-driven analysis in the early phase of the outbreak

key: cord- -r wnz rq
authors: wang, yubin; zhang, zhenyu; liu, tingwen; guo, li
title: slgat: soft labels guided graph attention networks
date: - -
journal: advances in knowledge discovery and data mining
doi: . / - - - - _
sha:
doc_id: cord_uid: r wnz rq

Graph convolutional neural networks have been widely studied for semi-supervised classification on graph-structured data in recent years. They usually learn node representations by transforming, propagating, and aggregating node features, and by minimizing the prediction loss on labeled nodes.
However, the pseudo labels generated on unlabeled nodes are usually overlooked during the learning process. In this paper, we propose a soft labels guided graph attention network (SLGAT) to improve node representation learning by leveraging generated pseudo labels. Unlike prior graph attention networks, our SLGAT uses soft labels as guidance to learn different weights for neighboring nodes, which allows SLGAT to pay more attention to the features closely related to the central node labels during the feature aggregation process. We further propose a self-training based optimization method to train SLGAT on both labeled and pseudo labeled nodes. Specifically, we first pre-train SLGAT on labeled nodes and generate pseudo labels for unlabeled nodes. Next, for each iteration, we train SLGAT on the combination of labeled and pseudo labeled nodes, and then generate new pseudo labels for further training. Experimental results on semi-supervised node classification show that SLGAT achieves state-of-the-art performance.

In recent years, graph convolutional neural networks (GCNs) [ ], which can learn from graph-structured data, have attracted much attention. The general approach with GCNs is to learn node representations by passing, transforming, and aggregating node features across the graph. The generated node representations can then be used as input to a prediction layer for various downstream tasks, such as node classification [ ], graph classification [ ], link prediction [ ], and social recommendation [ ]. Graph attention networks (GAT) [ ], one of the most representative GCNs, learn the weights for neighborhood aggregation via a self-attention mechanism [ ] and achieve promising performance on the semi-supervised node classification problem. The model is expected to learn to pay more attention to the important neighbors; it calculates importance scores between connected nodes based solely on the node representations.
However, the label information of nodes is usually overlooked. Besides, the cluster assumption [ ] for semi-supervised learning states that the decision boundary should lie in regions of low density. This means that aggregating the features from nodes with different classes could reduce the generalization performance of the model. This motivates us to introduce label information to improve the performance of node classification in the following two ways: (1) we introduce soft labels to guide the feature aggregation for generating discriminative node embeddings for classification; (2) we use SLGAT to predict pseudo labels for unlabeled nodes and further train SLGAT on the combination of labeled and pseudo labeled nodes. In this way, SLGAT can benefit from unlabeled data.

In this paper, we propose soft labels guided attention networks (SLGAT) for semi-supervised node representation learning. The learning process consists of two main steps. First, SLGAT aggregates the features of neighbors using convolutional networks and predicts soft labels for each node based on the learned embeddings. Then, it uses the soft labels to guide the feature aggregation via an attention mechanism. Unlike prior graph attention networks, SLGAT can pay more attention to the features closely related to the central node labels. The weights for neighborhood aggregation are learned by a feedforward neural network based on both the label information of central nodes and the features of neighboring nodes, which can lead to more discriminative node representations for classification. We further propose a self-training based optimization method to improve the generalization performance of SLGAT using unlabeled data. Specifically, we first pre-train SLGAT on labeled nodes using a standard cross-entropy loss. Then we generate pseudo labels for unlabeled nodes using SLGAT.
Next, for each iteration, we train SLGAT using a combined cross-entropy loss on both labeled and pseudo labeled nodes, and then generate new pseudo labels for further training. In this way, SLGAT can benefit from unlabeled data by minimizing the entropy of predictions on unlabeled nodes. We conduct extensive experiments on semi-supervised node classification to evaluate our proposed model, and experimental results on several datasets show that SLGAT achieves state-of-the-art performance. The source code of this paper can be obtained from https://github.com/jadbin/slgat.

Graph-based semi-supervised learning. A large number of methods for semi-supervised learning using graph representations have been proposed in recent years, most of which can be divided into two categories: graph regularization-based methods and graph embedding-based methods. Different graph regularization-based approaches have different variants of the regularization term, with the graph Laplacian regularizer being the most common in previous studies, including label propagation [ ], local and global consistency regularization [ ], manifold regularization [ ], and deep semi-supervised embedding [ ]. Recently, graph embedding-based methods inspired by the skip-gram model [ ] have attracted much attention. DeepWalk [ ] samples node sequences via uniform random walks on the network, and then learns embeddings via prediction of the local neighborhood of each node. Afterward, a large number of works, including LINE [ ] and node2vec [ ], extended DeepWalk with more sophisticated random walk schemes. For such embedding-based methods, a two-step pipeline of embedding learning and semi-supervised training is required, where each step has to be optimized separately. Planetoid [ ] alleviates this by incorporating label information into the process of learning embeddings.
Graph convolutional neural networks. Recently, graph convolutional neural networks (GCNs) [ ] have been successfully applied in many applications. Existing GCNs are often categorized as spectral methods or non-spectral methods. Spectral methods define graph convolution based on spectral graph theory. The early studies [ , ] developed convolution operations based on the graph Fourier transform. Defferrard et al. [ ] used polynomial spectral filters to reduce the computational cost. Kipf & Welling [ ] then simplified the previous method by using a linear filter operating on one-hop neighboring nodes. Wu et al. [ ] used graph wavelets to implement localized convolution. Xu et al. [ ] used a heat kernel to enhance low-frequency filters and enforce smoothness of the signal variation on the graph. Alongside spectral graph convolution, defining graph convolution in the spatial domain was also investigated by many researchers. GraphSAGE [ ] applies various aggregators, such as mean-pooling, over a fixed-size neighborhood of each node. Monti et al. [ ] provided a unified framework that generalizes various GCNs. GraphSGAN [ ] generates fake samples and trains generator-classifier networks in an adversarial learning setting. Instead of fixed weights for aggregation, graph attention networks (GAT) [ ] adopt an attention mechanism to learn the relative weights between two connected nodes. Wang et al. [ ] generalized GAT to learn representations of heterogeneous networks using meta-paths. The shortest path graph attention network (SPAGAN) [ ] was proposed to explore high-order path-based attention. Our method is based on spatial graph convolution. Unlike existing graph attention networks, we introduce soft labels to guide the feature aggregation of neighboring nodes, and experiments show that this can further improve semi-supervised classification performance.

In this paper, we focus on the problem of semi-supervised node classification. Many other applications can be reformulated into this fundamental problem.
Let g = (V, E) be a graph, in which V is the set of nodes and E the set of edges. Each node u ∈ V has an attribute vector x_u. Given a few labeled nodes V_L ⊂ V, where each node u ∈ V_L is associated with a label y_u ∈ Y, the goal is to predict the labels for the remaining unlabeled nodes.

In this section, we give more details of SLGAT. The overall structure of SLGAT is shown in fig. . The learning process of our method consists of two main steps. We first use a multi-layer graph convolutional network to generate soft labels for each node based on the node features. We then leverage the soft labels to guide the feature aggregation via an attention mechanism, to learn better representations of the nodes. Furthermore, we develop a self-training based optimization method to train SLGAT on the combination of labeled and pseudo labeled nodes. This ensures that SLGAT can further benefit from the unlabeled data under the semi-supervised learning setting.

In the initial phase, we need to first predict pseudo labels for each node based on the node features X. The pseudo labels can be soft (a continuous distribution) or hard (a one-hot distribution). In practice, we observe that soft labels are usually more stable than hard labels, especially when the model has low prediction accuracy. Since the labels predicted by the model are not absolutely correct, the error from hard labels may propagate to the inference on other labels and hurt the performance, while using soft labels can alleviate this problem. We use a multi-layer graph convolutional network [ ] to aggregate the features of neighboring nodes. The layer-wise propagation rule of the feature convolution is F^(l+1) = σ(Ã F^(l) W_f^(l)). Here, Ã = A + I is the adjacency matrix with added self-connections, where I is the identity matrix; W_f^(l) is a layer-specific trainable transformation matrix; σ(·) denotes an activation function such as ReLU; and F^(l) denotes the hidden representations of the nodes in the l-th layer.
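One such feature-convolution layer can be sketched in plain Python. This is a minimal illustration, not the paper's implementation; we row-normalize the self-looped adjacency for numerical stability, whereas the paper may use a different normalization:

```python
def matmul(a, b):
    """Naive dense matrix product for small illustrative examples."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One feature-convolution layer: F_next = ReLU(A_hat @ F @ W), where
    A_hat is the row-normalized adjacency with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    a_hat = [[v / sum(row) for v in row] for row in a_hat]   # row-normalize
    h = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in h]
```

Stacking L such layers and applying a softmax to the output yields the soft labels used as guidance in the next step.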
The representations of nodes F^(l+1) are obtained by aggregating information from the features of their neighborhoods F^(l); initially, F^(0) = X. After going through L layers of feature convolution, we predict the soft labels for each node u based on the output embeddings of the nodes.

We now present how to leverage the previously generated soft labels for each node to guide the feature aggregation via an attention mechanism. The attention network consists of several stacked layers. In each layer, we first aggregate the label information of the neighboring nodes; then we learn the weights for neighborhood aggregation based on both the aggregated label information of the central nodes and the feature embeddings of the neighboring nodes. We use a label convolution unit to aggregate the label information of neighboring nodes, with the layer-wise propagation rule G^(l+1) = σ(Ã G^(l) W_g^(l)), where W_g^(l) is a layer-specific trainable transformation matrix and G^(l) ∈ R^(|V|×d^(l)_g) denotes the hidden representations of the label information of the nodes. The label information G^(l+1) is obtained by aggregating the label information G^(l) of the neighboring nodes. Initially, G^(0) = softmax(F^(L)) according to eq. .

Then we use the aggregated label information to guide the feature aggregation via the attention mechanism. Unlike prior graph attention networks [ , ], we use label information as guidance to learn the weights of the neighboring nodes for feature aggregation; we enforce the model to pay more attention to the features closely related to the labels of the central nodes. A single-layer feedforward neural network is applied to calculate the attention scores between connected nodes based on the central node label information G^(l+1) and the neighboring node features H^(l), of the form e_ij = a^T [W_g^(l) g_i^(l+1) || W_h^(l) h_j^(l)], where W_g^(l) and W_h^(l) are layer-specific trainable transformation matrices, H^(l) ∈ R^(|V|×d^(l)_h) denotes the hidden representations of the node features, ^T represents transposition, and || is the concatenation operation.
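The label-guided scoring and its softmax normalization over a neighborhood can be sketched as follows. This is a simplified illustration: the learned projections and any nonlinearity (e.g. LeakyReLU) on the scores are folded away, and all names are ours:

```python
import math

def attention_weights(g_center, h_neighbors, a_vec):
    """Score each neighbor j by a . [g_i || h_j] (a single-layer feedforward
    net), then softmax-normalize over the neighborhood. g_center carries the
    central node's label information; h_neighbors the neighbor features."""
    scores = []
    for h in h_neighbors:
        concat = list(g_center) + list(h)
        scores.append(sum(a * z for a, z in zip(a_vec, concat)))
    mx = max(scores)                         # stabilize the softmax
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because `g_center` enters every score, two neighbors with identical features can still receive different weights for two different central nodes, which is exactly the label-guided behavior described above.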
Then we obtain the attention weights by normalizing the attention scores with the softmax function, α_ij = exp(e_ij) / Σ_{k∈N_i} exp(e_ik), where N_i is the neighborhood of node i in the graph. The embedding of node i can then be aggregated from the projected features of its neighbors with the corresponding coefficients, h_i^(l+1) = σ(Σ_{j∈N_i} α_ij W_h^(l) h_j^(l)). Finally, we can achieve better predictions for the labels of each node u by replacing eq. , where ⊕ is the mean-pooling aggregator.

Grandvalet & Bengio [ ] argued that adding an extra loss to minimize the entropy of predictions on unlabeled data can further improve the generalization performance of semi-supervised learning. Thus we estimate pseudo labels for unlabeled nodes based on the learned node representations, and develop a self-training based optimization method to train SLGAT on both labeled and pseudo labeled nodes. In this way, SLGAT can further benefit from the unlabeled data. For semi-supervised node classification, we can minimize the cross-entropy loss over all labeled nodes between the ground truth and the predictions, L_label = − Σ_{u∈V_L} Σ_{c=1}^{C} Y_uc ln Ŷ_uc, where C is the number of classes. To achieve training on the combination of labeled and unlabeled nodes, we first estimate the labels of the unlabeled nodes using the learned node embeddings, sharpened with an annealing parameter τ; we can set τ to a small value (e.g. . ) to further reduce the entropy of the pseudo labels. The loss L_unlabel for minimizing the entropy of the predictions on unlabeled data is then defined analogously on the pseudo labeled nodes. The joint objective function is a weighted linear combination of the losses on labeled and unlabeled nodes, L = L_label + λ L_unlabel, where λ is a weight balance factor.

We give a self-training based method to train SLGAT, which is listed in algorithm . The inputs to the algorithm are both the labeled and the unlabeled nodes. We first use the labeled nodes to pre-train the model using the cross-entropy loss. Then we use the model to generate pseudo labels for the unlabeled nodes.
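The pieces of this self-training objective, temperature-sharpened pseudo labels, the weighted joint loss, and the iterate-and-retrain loop, can be sketched together. All function names are ours, and `model` stands for any object with hypothetical fit/predict methods:

```python
import math

def sharpen(logits, tau):
    """Temperature-scaled softmax: a small tau (e.g. 0.1) pushes the pseudo
    label towards a low-entropy, near one-hot distribution."""
    mx = max(logits)
    exps = [math.exp((z - mx) / tau) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def joint_loss(loss_labeled, loss_unlabeled, lam):
    """Weighted combination of the two losses: L = L_label + lam * L_unlabel."""
    return loss_labeled + lam * loss_unlabeled

def self_train(model, labeled, unlabeled, rounds):
    """Self-training skeleton: pre-train on labeled nodes, then alternately
    regenerate pseudo labels and retrain on the combined node set."""
    model.fit(labeled)
    for _ in range(rounds):
        pseudo = [(x, model.predict(x)) for x in unlabeled]
        model.fit(labeled + pseudo)
    return model
```

With a small τ the pseudo labels are nearly one-hot, so retraining on them approximates entropy minimization on the unlabeled nodes; λ then trades off fitting labeled data against that entropy term.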
Afterward, we train the model by minimizing the combined cross-entropy loss on both labeled and unlabeled nodes. Finally, we iteratively generate new pseudo labels and further train the model.

In this section, we evaluate our proposed SLGAT on the semi-supervised node classification task using several standard benchmarks. We also conduct an ablation study on SLGAT to investigate the contributions of the various components to the performance improvements. We follow existing studies [ , , ] and use three standard citation network benchmark datasets for evaluation: Cora, Citeseer, and Pubmed. In all these datasets, the nodes represent documents and the edges are citation links. Node features correspond to elements of a bag-of-words representation of a document. Class labels correspond to research areas, and each node has a class label. In each dataset, nodes from each class are treated as labeled data. The statistics of the datasets are summarized in table . (Table : training/validation/test split sizes for Cora, Citeseer, and Pubmed.)

We compare against several traditional graph-based semi-supervised classification methods, including manifold regularization (ManiReg) [ ], semi-supervised embedding (SemiEmb) [ ], label propagation (LP) [ ], graph embeddings (DeepWalk) [ ], the iterative classification algorithm (ICA) [ ], and Planetoid [ ]. Furthermore, since graph neural networks have proved effective for semi-supervised classification, we also compare with several state-of-the-art graph neural networks, including ChebyNet [ ], MoNet [ ], graph convolutional networks (GCN) [ ], graph attention networks (GAT) [ ], graph wavelet neural network (GWNN) [ ], shortest path graph attention network (SPAGAN) [ ], and graph convolutional networks using heat kernel (GraphHeat) [ ]. We train a two-layer SLGAT model for semi-supervised node classification and evaluate the performance using prediction accuracy.
the partition of the datasets is the same as in previous studies [ , , ], with an additional validation set of labeled samples used to determine hyper-parameters. weights are initialized following glorot and bengio [ ]. we adopt the adam optimizer [ ] for parameter optimization, with initial learning rate . and weight decay . . we set the hidden layer size of features to for cora and citeseer and to for pubmed. we set the hidden layer size of soft labels to for cora and citeseer and to for pubmed. we apply dropout [ ] with p = . to both layers' inputs, as well as to the normalized attention coefficients.

the proper setting of λ in eq. affects the semi-supervised classification performance. if λ is too large, it disturbs training on the labeled nodes, whereas if λ is too small, we cannot benefit from the unlabeled data. in our experiments, we set λ = . we anticipate that the results could be further improved by using sophisticated scheduling strategies such as deterministic annealing [ ], and we leave this as future work. furthermore, inspired by dropout [ ], we ignore the loss in eq. with probability p = . during training to prevent overfitting on pseudo-labeled nodes.

we now validate the effectiveness of slgat on the semi-supervised node classification task. following previous studies [ , , ], we use the classification accuracy metric for quantitative evaluation. experimental results are summarized in table . we report the mean classification accuracy (with standard deviation) of our method over runs, and we reuse the results already reported in [ , , , , ] for the baselines. we can observe that slgat achieves consistently better performance than all baselines. when directly compared to gat, slgat gains . %, . % and . % improvements for cora, citeseer and pubmed, respectively. the performance gain is twofold. first, slgat uses soft labels to guide the feature aggregation of neighboring nodes. this indeed leads to more discriminative node representations.
second, slgat is trained on both labeled and pseudo-labeled nodes using our proposed self-training based optimization method; slgat benefits from unlabeled data by minimizing the entropy of predictions on unlabeled nodes.

following shchur et al. [ ], we further validate the effectiveness and robustness of slgat on random data splits. we created random splits of cora, citeseer and pubmed with the same training, validation and test set sizes as the standard split from yang et al. [ ]. we compare slgat with the most closely related competitive baselines, gcn [ ] and gat [ ], on those random data splits. we run each method with random seeds on each data split and report the overall mean accuracy in table . we can observe that slgat consistently outperforms gcn and gat on all datasets, which demonstrates the effectiveness and robustness of slgat.

in this section, we conduct an ablation study to investigate the effectiveness of the proposed soft-label-guided attention mechanism and the self-training based optimization method. we compare several variants of slgat on node classification, and the results are reported in table . we observe that slgat performs better than the variants without soft-label-guided attention in most cases. this demonstrates that using soft labels to guide the aggregation of neighboring nodes is effective for generating better node embeddings. note that the attention mechanism seems to contribute little to performance on pubmed when self-training is used. the reason behind this phenomenon is still under investigation; we presume that it is due to the label sparsity of pubmed. a similar phenomenon is reported in [ ], where gat shows little improvement over gcn on pubmed. we also observe that slgat significantly outperforms all variants without self-training.
this indicates that our proposed self-training based optimization method is effective in improving the generalization performance of the model for semi-supervised classification.

in this work, we propose slgat for semi-supervised node representation learning. slgat uses soft labels to guide the feature aggregation of neighboring nodes in order to generate discriminative node representations. a self-training based optimization method is proposed to train slgat on both labeled and pseudo-labeled data, which is effective in improving the generalization performance of slgat. experimental results demonstrate that slgat achieves state-of-the-art performance on several semi-supervised node classification benchmarks. one direction of future work is to make slgat deeper so as to capture the features of long-range neighbors, which may help to improve performance on datasets with sparse labels.

references:
manifold regularization: a geometric framework for learning from labeled and unlabeled examples
spectral networks and locally connected networks on graphs
cluster kernels for semi-supervised learning
convolutional neural networks on graphs with fast localized spectral filtering
semi-supervised learning on graphs with generative adversarial nets
understanding the difficulty of training deep feedforward neural networks
semi-supervised learning by entropy minimization
node2vec: scalable feature learning for networks
inductive representation learning on large graphs
deep convolutional networks on graph-structured data
adam: a method for stochastic optimization
semi-supervised classification with graph convolutional networks
link-based classification
efficient estimation of word representations in vector space
geometric deep learning on graphs and manifolds using mixture model cnns
deepwalk: online learning of social representations
schnet: a continuous-filter convolutional neural network for modeling quantum interactions
pitfalls of graph neural network evaluation
deep collaborative filtering with multi-aspect information in heterogeneous networks
dropout: a simple way to prevent neural networks from overfitting
line: large-scale information network embedding
attention is all you need
graph attention networks
heterogeneous graph attention network
deep learning via semi-supervised embedding
a comprehensive survey on graph neural networks
graph wavelet neural network. in: international conference on learning representations (iclr)
spagan: shortest path graph attention network
revisiting semi-supervised learning with graph embeddings
an end-to-end deep learning architecture for graph classification
learning with local and global consistency
semi-supervised learning using gaussian fields and harmonic functions

key: cord- -mkmrninv
authors: lepskiy, alexander; meshcheryakova, natalia
title: belief functions for the importance assessment in multiplex networks
date: - -
journal: information processing and management of uncertainty in knowledge-based systems
doi: . / - - - - _
sha: 
doc_id: cord_uid: mkmrninv

we apply dempster-shafer theory in order to reveal important elements in undirected weighted networks. we estimate the cooperation of each node with different groups of surrounding vertices via the construction of belief functions. the obtained intensities of cooperation are further redistributed over all elements of a particular group of nodes, which results in pignistic probabilities of node-to-node interactions. finally, the pairwise interactions can be aggregated into a centrality vector that ranks nodes with respect to the derived values. we also adapt the proposed model to multiplex networks. in this type of network, nodes can be connected with each other differently on several levels of interaction. various combination rules help to analyze such systems as a single entity, which has many advantages in the study of complex systems.
in particular, dempster's rule takes into account the inconsistency in the initial data, which has an impact on the final centrality ranking. we also provide a numerical example that illustrates the distinctive features of the proposed model. additionally, we establish analytical relations between the proposed measure and classical centrality measures for particular graph configurations.

dempster-shafer theory of belief functions [ , ] is a widely used tool to measure belief or conflict between elements in a considered system [ , ]. recently it has also found use in the field of social network analysis [ ]. social networks represent interactions that occur between people, between countries, in transportation systems, etc. one of the core problems in network science is the detection of central elements. in [ ] a modified evidential centrality and an evidential semi-local centrality in weighted networks are proposed. these measures combine "high", "low" and "(high, low)" probabilities of influence, based on weighted and unweighted degrees of nodes, via dempster's rule. in [ ] the same rule is applied in order to combine different node-to-node interactions in a network; the proposed measures, which are able to detect social influencers, were applied to twitter data. the theory of belief functions can also be adapted to the problem of community detection, i.e. the partition of nodes into tightly connected groups. for instance, in [ ] the author proposed a novel method based on local density measures assigned to each node, which are further used for the detection of density peaks in a graph.

in this work we mostly focus on the problem of detecting the most influential as well as the most affected elements in networks. knowledge about the position of nodes plays a significant role in understanding the structural properties of complex systems. several network-based approaches exist that aim to assess the importance of nodes in graphs.
the first class of methods comprises the classical centrality measures [ ]. it includes the degree centrality measure, which prioritizes nodes with the largest number of neighbors or with the largest sum of incoming/outgoing weights. the eigenvector group of centralities, which includes eigenvector centrality itself, bonacich, pagerank, katz, hubs and authorities, alpha centrality, etc., takes into account the importance of the neighbors of a node, i.e. the centrality of a vertex depends on the centralities of the adjacent nodes [ ] [ ] [ ] [ ] [ ]. closeness and betweenness centralities consider the distances between nodes and the number of shortest paths that go through nodes in a network [ , ].

another class of measures for detecting the most important elements employs a cooperative game-theoretic approach. it includes the estimation of myerson values, which is similar to the calculation of the shapley-shubik index [ ]. it also requires the introduction of set functions on nodes, which can vary depending on the problem statement. in [ ] the hoede-bakker index is adjusted to the estimation of influential elements in social networks. in [ ] long-range interaction centrality (lric) is proposed, which estimates node-to-node influence with respect to individual attributes of nodes, the possibility of group influence and indirect interactions through intermediate nodes.

however, all the approaches described above are designed for so-called monoplex networks and require adaptation to complex structures with many types of interactions between adjacent nodes (so-called multilayer networks [ ]). in recent years multilayer networks have become one of the central topics in the field of network science. a multilayer network where the set of nodes (or a part of the nodes) remains the same through all layers is called a multiplex network, which is the object of research in this work. there exist several ways to assess central elements in multiplex networks.
firstly, one can calculate centralities for each layer separately and further aggregate the obtained values across all considered networks. secondly, one can aggregate the connections between pairs of nodes to obtain a monoplex network and then apply centrality measures to the new weighted graph. the modification of classical centrality measures for interconnected multilayer networks is described in [ , ]. in [ ] social choice theory rules are applied to multiplex networks in order to detect key elements. however, the final results of these approaches are calculated from secondary data.

in this work we propose a novel technique for the assessment of key elements. we construct a mapping between each node and sets of other nodes, which is a mass function. in case of several layers we combine the mass functions of each layer into a unique function that can be used for the centrality estimation in the whole system. the key advantages of our approach are that we take into account interactions with different groups of nodes and that we are able to estimate node-to-node influence within the whole network structure. we also take into account the consistency of connections on different network layers.

this paper is organized as follows: in sect. we describe some basic concepts from the theory of belief functions. in sect. we propose a centrality measure for a one-layer network and apply it to a toy example. in sect. we develop an approach to elucidate important elements in networks with several layers; in the same section we apply the proposed method to a two-layer network. section contains a discussion of our approach as well as a conclusion to the work.

in this section we recall some basic definitions and notions from the dempster-shafer theory of belief functions [ , ] that are employed later in this work. let x be a finite set, called the frame of discernment, and let 2^x be the set of all subsets of x. a function m : 2^x → [0; 1] that meets the normalization conditions, i.e.
m(∅) = 0 and Σ_{a∈2^x} m(a) = 1, is called a basic probability assignment or a mass function. all a ∈ 2^x such that m(a) > 0 are called focal elements, and the family of all focal elements is called the body of evidence. a mass function m can be associated with two set functions, namely a belief function g(a) = Σ_{b⊆a} m(b) and a plausibility function ḡ(a) = Σ_{b: a∩b≠∅} m(b), which is dual to the belief function g(a). these two functions can be considered as lower and upper bounds for the probability estimation of event a: g(a) ≤ p(a) ≤ ḡ(a), a ∈ 2^x. the value of g(a) reflects the level of belief in the fact that x ∈ a ⊆ x, where x is from x. we denote by bel(x) the set of all belief functions g on the set x.

a belief function g can also be represented as a convex combination of categorical belief functions. note that η_x describes the vacuous evidence that x ∈ x; thus, we call this function the vacuous belief function. additionally, the mass function m(a) can be expressed from the belief function g via the möbius transformation as m(a) = Σ_{b⊆a} (−1)^{|a\b|} g(b).

in this work we mainly focus on combination techniques adopted from dempster-shafer theory. by a combination we mean an operator r : bel(x) × bel(x) → bel(x) that transforms two belief functions into one belief function. we denote by m = m1 ⊗_r m2 the combination of two mass functions m1 and m2 under rule r. there exist various combination rules that are widely used in the theory and applications of belief functions. for instance, dempster's rule [ ], which is regarded as the pioneering and the most popular combination technique in dempster-shafer theory, is calculated as follows:

m(a) = (1 / (1 − k)) Σ_{b∩c=a} m1(b)·m2(c) for a ≠ ∅, m(∅) = 0, ( )

where k = Σ_{b∩c=∅} m1(b)·m2(c) indicates the level of conflict between the two evidences. if k = 1 then the level of conflict is the highest and rule ( ) is not applicable in this case. another combination technique that is similar to dempster's rule is yager's combination rule [ ], which is defined as

m(a) = Σ_{b∩c=a} m1(b)·m2(c) for a ≠ ∅, a ≠ x, and m(x) = Σ_{b∩c=x} m1(b)·m2(c) + k.

according to this rule, the value of conflict k is reallocated to the mass of ignorance m(x).
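the belief and plausibility functions defined above can be computed directly from a mass function. a small python sketch, representing a mass function as a dict from frozensets (subsets of the frame x) to masses; this representation is our own illustration, not from the paper:

```python
def belief(m, a):
    """g(a): total mass of focal elements b contained in a (b ⊆ a)."""
    return sum(mass for b, mass in m.items() if b <= a)

def plausibility(m, a):
    """ḡ(a): total mass of focal elements b intersecting a (a ∩ b ≠ ∅)."""
    return sum(mass for b, mass in m.items() if b & a)
```

for any event a these satisfy belief(m, a) ≤ plausibility(m, a), matching the lower/upper probability bounds g(a) ≤ p(a) ≤ ḡ(a) stated above.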
other combination rules are also described in [ ]; some generalizations can be found in [ , ], and axiomatics and the description of conflict rules are reviewed in [ ] [ ] [ ] [ ]. additionally, the discounting technique proposed in [ ] can be applied to mass functions when various sources of information, determined by their belief functions, have different levels of reliability or different priorities. discounting of a mass function can be performed with the help of a parameter α ∈ [0; 1] as follows:

m_α(a) = α·m(a) for a ≠ x, and m_α(x) = α·m(x) + (1 − α).

if α = 1 then the source of information is regarded as thoroughly reliable and m_α(a) = m(a) ∀a ∈ 2^x. conversely, if α = 0 then m_α(x) = 1 and the related belief function is vacuous.

in this section we describe a graph model with one layer of interaction as well as the construction of a centrality measure based on a mass function for a network. we consider a connected graph as a tuple g = (v, e, w), where v = {v_1, ..., v_n} is a set of nodes, |v| = n, and e = {e(v_i, v_j)} is a set of edges. for simplicity, we associate v_k with the number k, k = 1, ..., n, and denote e(v_i, v_j) as e_ij. in this work we consider undirected networks, i.e. e_ij ∈ e implies that e_ji ∈ e. we also analyze weighted networks, i.e. each edge e_ij in network g is associated with a weight w_ij ∈ w. without loss of generality, we assume that all weights w_ij ∈ [0; 1] and that w_ij = 0 implies that e_ij ∉ e. the weight w_ij between nodes v_i and v_j indicates the degree of interaction between the corresponding nodes.

our main focus is to rank nodes with respect to their importance in a network. we assume that a node is considered pivotal if it actively interacts with other nodes in the graph. in our analysis we take into account connections with distant nodes as well as cooperation with groups of other nodes. more precisely, we suppose that the centrality of a node depends on the relative aggregated weight of the subgraphs adjacent to the considered node.
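dempster's rule and the discounting operation can be sketched under the same dict-of-frozensets representation of a mass function. the helper names are our own; the α convention follows the text, with α = 1 meaning a fully reliable source:

```python
def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions (dicts mapping
    frozensets to masses), normalizing away the conflict mass k."""
    joint, k = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            c = a & b
            if c:
                joint[c] = joint.get(c, 0.0) + ma * mb
            else:
                k += ma * mb  # mass assigned to the empty set: conflict
    if 1.0 - k <= 0.0:
        raise ValueError("total conflict (k = 1): rule not applicable")
    return {c: v / (1.0 - k) for c, v in joint.items()}

def discount(m, alpha, frame):
    """Discounting: scale all masses by alpha and move the remaining
    1 - alpha onto the whole frame x (alpha = 0 gives the vacuous case)."""
    out = {a: alpha * v for a, v in m.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - alpha)
    return out
```

yager's rule differs only in the conflict handling: instead of normalizing by 1 − k, the conflict mass k is added to the mass of the whole frame m(x).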
at the same time, the aggregated weight of a subgraph can be estimated with the help of monotonic measures, including belief functions. we consider a family of belief functions we denote by |w | = i