key: cord-0482232-489gaxlb
authors: Francisquini, Rodrigo; Lorena, Ana Carolina; Nascimento, Mari'a C. V.
title: Community-based anomaly detection using spectral graph filtering
date: 2022-01-24
journal: nan
DOI: nan
sha: f6abd6a644bcae5eb8abe36519ace31f73f14e3e
doc_id: 482232
cord_uid: 489gaxlb

Several applications have a community structure where the nodes of the same community share similar attributes. Anomaly or outlier detection in networks is a relevant and widely studied research topic with applications in various domains. Despite a significant amount of anomaly detection frameworks, there is a dearth on the literature of methods that consider both attributed graphs and the community structure of the networks. This paper proposes a community-based anomaly detection algorithm using a spectral graph-based filter that includes the network community structure into the Laplacian matrix adopted as the basis for the Fourier transform. In addition, the choice of the cutoff frequency of the filter considers the number of communities found. In computational experiments, the proposed strategy, called SpecF, showed an outstanding performance in successfully identifying even discrete anomalies. SpecF is better than a baseline disregarding the community structure, especially for networks with a higher community overlapping. Additionally, we present a case study to validate the proposed method to study the dissemination of COVID-19 in the different districts of S~ao Jos'e dos Campos, Brazil.

The explosive growth of technology has led to a substantial increase in the amount of data collected from several applications. They include sensor measurements, transportation, internet, biological data, financial transactions, among others. Therefore, data analysis and processing techniques to manage massive amounts of data are of paramount importance. Data structure is usually irregular and relational data is commonly represented through graphs or networks (Ma et al., 2021; Dong et al., 2019) .

A great number of applications consists of networks with community structure. Some examples are in internet of things (IoT) (Chen et al., 2021) , protein-protein interactions (Francisquini et al., 2021) and COVID-19 related data (Francisquini et al., 2021) . Anomaly detection in this type of network is a specially relevant task. Anomaly detection in networks with community structure consists of identifying nodes, called anomalous nodes, that significantly differ from the standard observed in their community they belong to (Akoglu et al., 2015) . These are also known as context anomaly.

Despite being the target of intense investigation, there are networks for which graph anomaly detection tools are limited. There is a dearth of literature on data anomaly approaches that deal with data with relational properties and temporal information. Moreover, the topological graph structure is not usually taken into consideration on time series graph anomaly detection algorithms. According to Chen et al. Chen et al. (2021) , applications such as sensor networks, for which there is a strong geographical and temporal dependency, can benefit from frameworks that consider the topological graph structure.

Anomaly detection in networks has attracted a lot of attention in the last five years with a soar on the number of studies (Ma et al., 2021) . The reason behind this phenomena is not only the increase on the amount of applications but also the advent of sophisticate deep learning tools to perform such a task (Ma et al., 2021; Chen et al., 2021) . Most of the existing deep learning-based methods to time series anomaly detection uses semi-supervised learning, requiring some labeled data. In addition, according to Choi et al. Choi et al. (2021) , the existing methods are too case-specific, demanding domain knowledge.

The goal of this paper is to give new insights into anomaly detection in of the original unweighted graph to a weighted network through the inclusion of new edges. The method marks vertices whose signal values are outside the expected behavior for the community they belong as potential anomalies.

Computational experiments comparing the accuracy of the novel method with a counterpart using the classical adjacency matrix evidence the superiority of our community detection-based strategy in recovering anomalies, even for the most discrete cases. In addition to experiments with labeled artificial networks proposed in this paper, a comparative analysis with stateof-the-art time series anomaly detection algorithms on two IoT databases publicly available is performed. This paper also shows an experiment with the proposed strategy on a COVID-19 dataset. The results of this experiment indicates an anomalous growth in the number of COVID-19 cases in the different districts of an upstate city of São Paulo, Brazil. The main contributions of this paper are presented next.

• We propose an anomaly detection algorithm that addresses attributed networks for which the literature is scarce of anomaly detection algorithms;

• We introduce an anomaly detection algorithm, SpecF, based on graph signal processing theory which defines anomalous objects as those nodes whose signals required the most significant correction by a low-pass filter;

• We propose an anomaly detection algorithm, SpecF, which differs from other community-based anomaly detection methods by explicitly using the community structure in an extended adjacency matrix;

• We introduce a methodology to add normal and anomalous signals to networks with community structure, to better assess the proposed framework;

• We apply SpecF to COVID-19 data to analyze the dissemination of the COVID-19 virus by investigating the anomalous districts of an upstate city from Brazil with approximately 730 thousand inhabitants.

The remainder of this paper is organized as follows. Section 2 presents fundamental concepts of the GSP theory relevant for the understanding of the proposed tool. Section 3 shows a brief literature review on related anomaly detection algorithms. Section 4 introduces the proposed anomaly detection algorithm, SpecF. Section 5 presents the computational experiments, including a thorough analysis of the anomaly detection algorithm in the COVID-19 dataset compiled in this study. Finally, Section 6 presents final remarks and future research directions.

Data from several applications can be represented by graphs. Let G = (V, E, A) be a weighted graph, where V and E are its respective sets of n nodes and m edges, and A ∈ R n×n is the weighted adjacency matrix. Each node v i ∈ V describes an instance of the dataset, and the weight a ij of an undirected 1 edge carries the strength of the relation between a pair of vertices v i and v j . The degree of a vertex v i is quantified by the sum of the weights of the edges incident to it. Let D ∈ R n×n be a diagonal matrix, called degree matrix, where its i th diagonal element d ii receives the degree of node v i . Moreover, we denote here the set of neighbors of a vertex v i by N i , which means that this set contains all vertices adjacent to v i .

The graph signal, defined by function f : V → R, is represented by a vector f ∈ R n , where each element f i corresponds to the signal of node v i ∈ V , i.e, f (v i ).

The graph Laplacian, also known as the non-normalized graph Laplacian, is the matrix L = D − A, corresponding to a real-valued symmetric matrix (Von Luxburg, 2007) . Let U = [u ij ] n×n be the set of eigenvectors of L. Without loss of generality, let G be a connected component. Therefore U is orthonormal, since L is a real-valued symmmetric matrix. Moreover, the associated non-negative eigenvalues are referred here to as {λ l } l=0,1,...,n−1 ,

The graph Fourier transformf of a signal f ∈ R n given a symmetric shift operator S = RΛR * isf = R * f (Shuman et al., 2013) . As L = U ΛU T , U * = U T , the graph Fourier transform given L is calculated byf = U T f . In addition, the l-th row of U T corresponds to the eigenvector associated with λ l and each component off can be written as:

Let if = Uf be the inverse graph Fourier transform off considering the same shift operator L. It is possible to return to the signal f by the inverse graph Fourier transform of f , since

(2)

In classical Fourier analysis, eigenvalues {(2πξ) 2 } carry the notion of high and low frequencies. Low frequencies are associated with smooth complex exponential eigenfunctions that oscillate slowly, whereas high frequencies are related to complex exponential eigenfunctions that oscillate more rapidly.

The definition of high and low frequencies for signals indexed by graphs takes into account graph Fourier theory (GFT). According to the GFT, the eigevectors of graph Laplacians associated to the first eigenvalues vary more sharply. Consequently, the eigenvectors of end vertices of heavier edges are more likely to be similar.

The process of frequency filtering transforms and input signal into a linear combination of complex exponentials. As a consequence, filtering amplifies or attenuates the contributions of some frequencies. Graph spectral filtering can be directly generalized as

where h(•) is the filter transfer function and f in is the Fourier transform of an input signal function. Several well-known continuous filtering techniques can be implemented as discrete, considering filtering functions that satisfy Equation (3), such as Gaussian smoothing, bilateral filtering and non-local means filtering (Buades et al., 2005) .

A filter is said to be low-pass if it does not significantly affect the frequency content of low-frequency signals but attenuate the magnitude of highfrequency signals. An ideal low-pass filter keeps the magnitude of the spectrum at low frequencies unchanged and attenuates it at high frequencies.

The frequency response of these filters is defined as

Sandryhaila and Moura Sandryhaila and Moura (2014) demonstrated that the design of these filters is a linear problem and the construction of a filter with frequency response h(λ l ) = α l corresponds to solving a system of n l non-linear equations:

. . .

h 0 + h 1 λ n l −1 + ... + h dp λ dp n l −1 = α n l −1

where d p is the degree of the polynomial. We can find an approximate solution by, for example, the least squares method, to get around the overdetermination of the system when n l ≥ d p + 1. The complexity of the least squared method for determining a triangular n l -dimensional matrix inverse is O(n 2 l ).

An important task in data mining is finding instances with unexpected behavior, which are more likely to be anomalous observations. Although several techniques have been developed over the last years to identify anomalies in data (Li et al., 2021) , there are few techniques capable of efficiently dealing with graph-structured data. Graph structured data have complex correlations and require techniques capable of analyzing not only the data itself but the relations between the elements. This paper is particularly interested in graphs with node attributes to find community outliers. A community outlier is a node whose attribute values deviate significantly from its community members.

In a recent survey on graph-based anomaly detection, Akoglu et al. Akoglu et al. (2015) pointed out only two community-based methods for detecting anomalies in graphs with attributes. The first, introduced in (Gao et al., 2010) , is a probabilistic model that considers both relational and raw data information to find more meaningful outliers. On the one hand, the information regarding the relation is obtained from the topological structure of the network and describes the relationships that exist between instances of the dataset. On the other hand, the network node attributes store the data. According to the authors, the algorithm, called community outlier detection algorithm (CODA), can identify meaningful community outliers. The second method is a node outlier ranking technique in attributed graphs, called GOutRank, developed by Müller et al. Müller et al. (2013) . GOutRank ranks the graph nodes according to their degree of deviation in both graphrelational data and node attribute properties.

In a particular case of attributed graphs, where the attributes of the nodes can be interpreted as a signal indexed by the graph, techniques that extend signal processing concepts to the graph domain can be used to identify anomalies. For example, attributed graphs can represent wireless sensor networks, where each (sensor) node stores the value measured by the corresponding sensor in an instant of time. In this case, to identify an anomaly, it is necessary to consider the geographical proximity between the sensors and the values sensed by them. A sensor measurement significantly different from neighboring sensors' represents a potential anomaly. For this type of application, classical signal processing techniques, such as filters, can be extended to the graph domain.

Several studies use graph-based filters for data compression, data recovery, classification, noise removal, signal recovering, among others. However, to the best of our knowledge, few studies have adopted graph-based filtering to detect anomalies. Sandryhaila and Moura Sandryhaila and Moura (2014), for example, introduced the concept of total variation to frequency sorting. The authors developed a filter using the proposed ordering method to identify malfunctions in sensor networks by extracting high-frequency components. Egilmez and Ortega Egilmez and Ortega (2014) introduced a spectral anomaly detection method that also uses a graph-based filter. They considered collective anomalies which occurred locally in time and space and applied the proposed method to sensor networks.

Both Sandryhaila and Moura (2014) and Egilmez and Ortega (2014) adopted spectral filters and presented experiments with sensor network data where the adjacency relationships may be sufficient to detect nodes whose attributes deviate from the attributes of other nodes from the same community. However, in applications where the adjacency relationships are not only based on the physical distance, identifying a community outlier may consider the network's community structure. Neither of the previous work takes such information into account in their analysis.

In protein-protein interaction networks, for example, adjacency relationships define the biological strength of the interaction between a pair of proteins. Proteins, represented by the graph nodes, may have quantitative attributes that correspond to the expression level of the protein in the network. In these networks, a community can be interpreted as a group of functionally related proteins (Francisquini et al., 2021 ) whose attributes have similar characteristics. For this type of application, analyzing only the adjacency affinities may not reveal relevant information. Then, the data analysis must explicitly consider the community structure. To the extent of our knowledge, there is no graph-based filter method to detect anomalies that also takes the community structure into account, as proposed here.

The anomaly studied in this paper is defined as a vertex v i whose signal f (v i ) is far from the expected standard found in the community to where v i belongs. In this case, most of the signal's energy is concentrated in the low-frequency signals of the Fourier transform. A node with a signal value that considerably differs from the neighborhood average will probably be anomalous.

This type of anomaly is observed, for example, in temperature sensor networks, in which neighboring sensors are expected to have close measured values. In this case, a potential anomaly is a sensor measurement that deviates significantly from the values sensed by neighboring nodes, such as a sensor failure. In protein-protein interaction networks, expression levels of neighboring nodes that represent proteins belonging to the same biological process usually vary proportionately. For example, if the expression level of a protein increases, other proteins that belong to the same biological process usually have their expression level increased or decreased accordingly. In this case, where proteins from the same biological process generally belong to the same community, a potential anomaly is a node with a variation of expression level significantly different from the nodes in the same community, e.g., a protein involved in cancerous processes.

The reasoning behind the proposed strategy is that the attenuation of the magnitude of the signal spectrum at high frequencies can correct the abrupt variations that define the studied anomalous behavior. Thus, the anomaly detection method introduced in this paper, SpecF, relies on the idea that vertices whose signal value needed a substantial correction after the filtering process are more likely to be anomalous. For this, SpecF uses a low-pass filter to attenuate the magnitude of the high frequencies of the spectrumf of a signal f . The inverse Fourier transform, defined in Equation (2), is applied to the filtered spectrum to obtain a filtered signal f . The filtered signal f is compared to the original signal f to determine to which vertices the signal value has been severely attenuated, to obtain the set of potentially anomalous vertices. Algorithm 1 presents a basic pseudocode of SpecF, whose main steps will be thoroughly described in the next sections. Besides the graph and its signals, the input data required by this algorithm are a matrix representing G, which can be the adjacency matrix, and a partition C = {C 1 , C 2 , . . . , C |C| } representing the set of communities of G. To evaluate the proposed strategy, this paper also introduces an approach to generate synthetic anomalous signals similar to the signals considered in the hypothesis. The next section discusses the details of the proposed anomaly detection algorithm.

Data: A graph G, a matrix M G representing G, signal B,

In GSP theory, the first k eigenvalues of L correspond to the k-lowest frequencies in the spectrum of a signal. Low frequencies carry the information of signals that vary slightly across the nodes of the network. Moreover, the proposed strategy focuses on a signal whose intra-community variation is expected to be low. Therefore, the introduced method considers the number of communities k in the network as the cut-off frequency, so that λ cut = λ k . A low-pass filter, as defined in Section 2.3, attenuates the magnitude of high-frequency signals and keeps low frequencies unchanged. In this case, a high-frequency is defined as a frequency λ l higher than the cut-off frequency λ cut , that is, λ l > λ cut .

In cases where the expected partition is known beforehand, the choice of λ k is trivial, since k is the number of communities. On the other hand, when the number of communities is unknown, two approaches to estimate the number of communities in the network can be used. The first approach consists of applying a community detection algorithm to the network to estimate the number of communities k -algorithms that do not require the number of communities to be informed a priori. The second approach comes from graph theory and consists of finding the value of k by analyzing the eigenvalues of L and choosing k so that λ 1 , ..., λ k are very small, but λ k+1 is relatively large. In spectral graph theory, when a graph has k completely disconnected components, the first k eigenvalues of L will have the value 0 and then there is a gap to the eigenvalue in position k + 1 (Von Luxburg, 2007) . Algorithm 2 shows a pseudocode to determine the low-pass filter for the proposed anomaly detection algorithm. The complexity of Algorithm 2 is dominated by the least square method to the system (5), which has been discussed earlier, O(n 2 ).

Data: A graph G, the Laplacian matrix L, matrix U , spectrumB Result: Filtered signal B Estimate the number of communities k of the input graph G by analysing the eigenvalues of L (second approach) Find the h l values by solving the system (5), where n l and d p are n Define a diagonal matrix F where its i-th diagonal element is the approximate α i -according to system (5) TheB spectrum is submitted to the proposed low-pass filter to obtain a filtered spectrumB :B = FB Then, the inverse Fourier transform is applied inB to obtain a filtered signal B : B = UB Figure 1 plots the filtered signal B of the signal presented in Figure 4 .

By comparing the signal B to the signal B, it is possible to observe that the groups of vertices are more cohesive in signal B , demonstrating that the filter brought the signal values even closer to nodes from the same community. In addition, the anomalies, which previously diverged from other nodes in the same community, are now within the expected standard. Figure 1 shows the ability of the SpecF to correct anomalies and normalize the values of a signal according to the network's community structure. The filtered signal B is contrasted to the original anomalous signal B to identify in which nodes the correction of the signal was more significant. The intuition behind this idea is that, if anomalous nodes differ from what is expected for their community, the normalization applied to the signal considering the filter will be more intense in these nodes. Thus, when identifying the anomalous nodes, a set of nodes with the greatest anomalous potential is also detected. For such, let Y = |B − B | be a signal, and its i-th element be referred to as y i , corresponding to the difference signal at vertex v i . In general, vertices v i with a high y i value are more likely to be anomalous and the vector Y is deemed an abnormality quantifier.

To classify the nodes as anomalous or normal in a binary way, SpecF applies to Y a threshold to distinguish which y i values are considered normal and which are identified as abnormal observations. The threshold values employed in SpecF are regarded for each community, taking the mean and standard deviation of Y into account, as presented in Equation (9).

are, respectively, the mean, standard deviation and maximum of the values of y i 's that represent the nodes of community C k . Algorithm 3 presents a pseudocode of the strategy that defines the potentially anomalous nodes. In this algorithm, let c v i be the community C k where vertex v i belongs to. The complexity of Algorithm 3 is O(n). Figure 2 illustrates the relation between y i values and the vertices of a network. Moreover, it presents the threshold values T D(Y, C k ) -dotted curve -that separate anomalous nodes from normal nodes. It is possible to notice that the vast majority of nodes above the red curve are true positives and, therefore, are in the set of anomalous nodes defined by the generator. On the other hand, most of the nodes below the red curve are true negatives and, thus, are labeled normal. There are also false positives and false negatives and they are usually close to the threshold curve.

The computational complexity of SpecF, described in Algorithm 1, is O(n 2 ) since to calculate the Fourier transform a matrix multiplication operation is required.

To embed to a matrix information about its community structure, this paper proposes the use of an expanded adjacency matrix to represent a graph G -one of the forms to define matrix M G . Let W ∈ R n×n be the so-called expanded matrix that incorporates the community structure of a graph to represent the pairwise relationship between vertices v i and v j ∈ V , referred to as w ij . The values w ij are defined in Equation (10), where c v i is the community where vertex v i belongs to.

According to the definition of the expanded matrix, if vertices v i and v j are adjacent and in the same community, w ij will receive the highest weight, the value 5. If they are adjacent but belong to distinct communities, w ij will receive the intermediary value 3. If the two vertices are not adjacent but are in the same community, w ij will receive the value of 1. As a consequence, an explicit affinity between these vertices is defined in such a matrix. Other edges assume null weight and are disregarded.

This section introduces the methodology to generate anomalous and normal signals in networks with community structure.

Let G c be a graph whose nodes v c i represent communities C i of G and the edge weights w c ij are defined as the number of edges between communities C i and C j . To define a signal S c , the nodes of G c are sorted according to the sum of the edges' weights w c ij and s c i is then defined as

where N c i is the set of nodes adjacent to v c i . To properly evaluate the anomaly detection strategy proposed in this paper, we developed an artificial signal generator. The generator produces a synthetic signal S similar to the signal observed in the investigated applications. As a result, it creates signals whose values for nodes of the same community are similar.

Algorithm 4 describes the process employed by the generator to obtain the signal S from a graph G and a given signal S c , with elements defined by Equation (11). First, all nodes are marked and an auxiliary n-dimensional vector S x is initialized as empty. For every community C k of G, the algorithm assigns the value of s c k to the position of S x that corresponds to the highest degree node in the community C k . In the case of a tie, a node is randomly selected among the highest degree nodes. These nodes are regarded as community heads. Then, the algorithm starts a propagation process from the community heads. Each node propagates a percentage of its value to its neighbors. This percentage is lower (10%) if the nodes involved belong to different communities, and higher (at least 25%) if they are in the same community. For nodes from the same community, this percentage also depends on their degree, so that nodes with higher degrees have a greater influence on lower degree nodes. After the propagation process, every node that propagated values have their value reduced by 5%. This process is repeated until all nodes have propagated a number of their signal values. The last step consists of a normalization process to define the signal of each node as that considers the weight of the edge between neighbor nodes to carry out a weighted average. Figure 3 presents box-plots of the values of the S signal, obtained through Algorithm 4 applied to a 500-node LFR network (Lancichinetti and Fortu-

Create a list F with all unmarked nodes sorted by index values;

Unmark v j and include v j at the end of F ;

Remove v i from F ;

nato, 2009) with the parameters defined in Table 1 and low overlapping between communities (mixture degree 0.1). The network has 10 planted communities. Section 5.1 describes the parameters and the software to generate the networks. We show a box-plot for each of the expected communities, sorted by increasing order of intra-community average signal. Figure 3 shows that, although the average value of signal S in each community is different, vertices in the same community have similar values. 

In addition to the normal signal generator, a strategy to create anomalous signals in an attributed network is introduced in this paper.

For such, consider a signal S of a graph G. We generate an anomalous signal B from signal S by increasing the value of s i in some vertices of G. This process is detailed in Algorithm 5, which randomly selects a set of nodes to corrupt. The anomaly intensity θ defines how much greater the value of an anomalous node will be when compared to the rest of the community. An anomaly intensity value of 0.1, for example, means that an anomalous node will have a signal value between 5% and 10% higher than the largest signal value of that community. Figure 4 illustrates the values of an anomalous signal B at each node. Anomalies are highlighted (when 'Anomaly' is 1). Moreover, the vertices are represented in the x-axis, sorted according to the intra-community average signal, as in Figure 3 . The vertices within communities are ordered by index. It is possible to see that, when sorted by the community average, the anomalous nodes become more evident. However, they are within the mean and standard deviation values when considering the complete signal, which makes them difficult to detect using techniques that do not consider The next section presents the computational experiments performed to evaluate the efficacy of SpecF.

This section presents five experiments carried out to attest the efficiency of the anomaly detection method proposed in this paper. The first three experiments were carried out with artificial networks, whereas the forth and fifth consist of tests performed with real-world datasets. The first experiment assesses the behavior of SpecF by varying the number of anomalies present in the network. The second experiment analyzes SpecF when faced with variations in the intensity of the anomalies. The third experiment evaluates the effectiveness of the strategy in multiple executions with varied parameter values. Also an experiment with labeled data, the forth experiment employs two publicly available IoT datasets, where a comparative analysis with stateof-the-art algorithms is performed. The last experiment presents a thorough analysis of an unlabeled COVID-19 dataset.

Before going into detail about the experiments with artificial networks, the generated networks and employed evaluation metrics are discussed.

A set of artificial networks was generated using the software introduced in (Lancichinetti and Fortunato, 2009), referred to as LFR networks. By using this software, a set of undirected and unweighted benchmark graphs, with heterogeneous distributions of node degree and community sizes is created. The nodes of the generated LFR networks have an average degree d G of 10 and a maximum degree max d G of 50. The parameters related to the exponent of the distribution of degrees (neg. exp. d G ) and community vertex count (neg. exp. | C |) are 2 and 1, respectively. The mixing parameter (µ) was set to be valued between 0.1 and 0.8, with a step size of 0.1. The mixture degree of the communities reflects how well separated the communities are since µ specifies the amount of inter-community edges. Therefore, low values for the mixture parameter produce networks with a more evident division in communities. The strategy introduced to generate normal and anomalous signals discussed in Section 4.4 was applied to the LFR networks. More details regarding the anomaly intensity (θ) and percentage of anomalies (AN ) values are approached in the experiments. Table 1 summarizes the LFR and signal parameters used to generate the networks. Figure 5 illustrates an LFR network with 1000 vertices colored according to the signal values. The bluer a node, the higher its signal value. On the other hand, the redder the nodes, the lower their signal values. Larger nodes are those with a higher number of neighbors. The vertices are separated into different groups that represent the communities to which they belong. One may observe that intra-community vertices have similar colors, for example. However, there are some anomalous vertices, like the blue ones, that subtly clash with the standard of the vertices of their community.

Classic measures were used to evaluate and interpret the results obtained by SpecF.

The Receiver Operating Characteristic curve, or ROC curve, summarizes the trade-off between the false positive rate and true positive rate (also known as recall). The ROC curve allows the comparison of different models directly, and the area under the curve (AUC) can be used as a model quality quantifier. On the one hand, a random classifier, for example, could be represented by a diagonal curve that has an area of 0.5, starting at the bottom left and ending at the top right of the ROC space. On the other, a curve that starts in the lower-left corner, moves up to the upper left corner, and then advances to the upper right corner, adding up to an area of 1 would represent an ideal classifier.

In cases of binary classification problems with a skewed distribution of numbers of observations per class, Saito and Rehmsmeier Saito and Rehmsmeier (2015) point out the precision-recall curve as a more informative metric. Since anomaly detection is often characterized by a skewed distribution (there are far more normal cases than anomalies), the precision-recall curve in the evaluation of the anomaly detection performance is also presented here. The precision-recall curve is similar to the ROC curve, but it compares precision (Saito and Rehmsmeier, 2015) .

The first experiment seeks to evaluate the performance of SpecF by varying the number of anomalies and keeping the intensity θ at 5%. To assess the robustness of the method, this experiment employs a total of 250 networks generated by considering the following methodology:

• Generate five different LFR networks with µ = 0.1, 500 nodes and the remaining LFR parameters at the values presented in Table 1 ; • Apply the anomalous signal generator algorithm (Algorithm 4) to each network ten times for each of the AN values presented in Table 1 and fixing θ at value 5%. Therefore, this step produces 50 different networks (one for each execution of the anomalous signal generator) for each AN value, totaling 250 networks. Table 2 presents the mean and standard deviation of the AUC-ROC and AP values considering the 50 networks per AN, when a standard adjacency matrix is used in SpecF. These results show that the performance of SpecF improves as AN increases. Similarly, Table 3 presents the results of the same experiment, but using the expanded adjacency matrix W instead of the adjacency matrix A to calculate the Fourier transform defined in Section 2.1. A comparison of the results with both matrices reveals that in all cases the anomaly detection strategy performed much better using the expanded adjacency matrix W , for all AN values and both AUC-ROC and AP metrics.

The second experiment compares the accuracy of SpecF considering different anomaly intensities and keeping AN fixed at 5%. This experiment was carried out using the same networks generated in the second step of the methodology presented in the earlier section. Therefore, for this experiment, the third step of the methodology, the one that produces the anomalous signals, consists in following the procedure: • Apply the anomalous signal generator algorithm (Algorithm 4) to each network ten times for each of the θ values presented in Table 1 and fixing AN at 5%. Table 4 reports the results of SpecF when the standard adjacency matrix is used. In cases where the anomaly is extremely hard to identify, as in cases where the value of the signal at the anomalous nodes are only 1% greater than the maximum signal in their community, the accuracy of the model is poor. In the case with anomaly intensity θ = 1%, for example, the mean AP was approximately 0.43, which is very close to the values obtained by a random classifier. Table 5 presents the results of this experiment using the expanded adjacency matrix W instead. Again, the use of the matrix W improves the results in anomaly detection in all scenarios, despite the anomaly intensity value and performance metric considered.

This experiment evaluates the performance of SpecF in a considerably larger set of networks. The methodology to generate the set of 7200 networks • Generate five different networks for each pair (µ, n) of values presented in Table 1 , totaling 80 different networks;

• Apply the normal signal generator (Algorithm 4) once to each of the 80 networks (40 networks with 500 nodes and 40 networks with 1000 nodes);

• Apply the anomalous signal generator (Algorithm 5) to each network ten times for each of the nine possible pairs (θ, AN ), where θ, AN ∈ {1, 5, 10}, totaling 400 networks for each triple (n, θ, AN ).

Tables 6 and 7 report the mean AUC-ROC and AP for each triplet (n, θ, AN ), considering SpecF with the adjacency (A) and expanded (W ) matrices. Therefore, each row of these tables corresponds to the average results of 400 different networks. It is possible to observe that, in all cases, better results are obtained when the expanded adjacency matrix W is used instead of the adjacency matrix A in SpecF. Figure 6 illustrates the mean AP values for different values of µ and θ comparing SpecF with A and W as input matrices. It is evident that, in all cases, SpecF performed better when it adopted matrix W instead of matrix A. Besides, the results are better for larger values of θ, as the anomalies are more evident. The primary weakness of SpecF considering both matrices is apparently in cases where the anomaly is very slight and difficult to be identified.

The anomaly detection strategy using the adjacency matrix A presents worse results as the mixture parameter µ increases. In contrast, the performance improves as the value of µ increases when SpecF employs matrix W . This behavior can be explained by the characteristics of the anomalous signal studied in this paper. We analyzed a signal whose anomaly is defined as a vertex whose signal value is different from that of the others in the same community. If the percentage of inter-community edges is low, the average signal values between communities are more distant. Therefore, even if the signal at a node differs from the community where it is located, it will still be closer to the average value of that community than to any other community's average value. When the mixture parameter is higher, the average values of the communities are closer. Moreover, a node whose signal value differs from the signal of its community may be close to the average values The results for experiments I, II and III are clearly superior for all compared scenarios in favor of the usage of an expanded adjacency matrix. Therefore, statistical tests like Wilcoxon signed-rank test Demšar (2006) attest the superior performance of the expanded adjacency matrix when compared to the standard counterpart, at 99% of significance level. This experiment employs two publicly available databases. The first, SWaT, contains a water treatment testbed collected by a sensor-actuator network with 51 nodes (Mathur and Tippenhauer, 2016) . The SWaT dataset regards data collected of a total of 15 days of the water distribution operation, being 11 of normal operations and 4 of day operations with cyber-attacks. This dataset has an anomaly rate of 5.77%. The second, called WADI, contains a water distribution testbed collected by a sensor-actuator network with 123 nodes (Ahmed et al., 2017) . The duration of the attacks was from 2 to 25 minutes. The WADI dataset regards data collected from a total of 16 days of the water distribution operation, being 14 of normal operation days and 2 with cyber-attacks. The duration of the attacks was from 1.5 to 30 minutes. This dataset has an anomaly rate of 5.75%.

A comparative analysis is carried out by presenting the results reported in Chen et al. (2021) . The authors tested their deep learning-based algorithm, GTA, designed to detect anomalies in multivariate time series data. GTA defines anomalies according to a novel graph learning strategy proposed by the authors referred to as Influence Propagation convolution. They compared GTA with eight state-of-the-art anomaly algorithms in multivariate time series data: Principal Component Analysis (PCA), k-Nearest Neighbor (KNN) (Angiulli and Pizzuti, 2002) , Feature Bagging (FB) (Lazarevic and Kumar, 2005) , Autoencoders (AE) (Aggarwal, 2013) , Long Short Term Memory-based Variational Autoencoder (LSTM-VAE) (Park et al., 2018) , Multivariate Anomaly Detection with Generative Adversarial Networks (MAD-GAN) (Li et al., 2019) , Deep Autoencoding Gaussian Mixture Model (DAGMM) (Zong et al., 2018) and Graph Deviation Network (GDN) (Deng and Hooi, 2021) .

WADI and SWaT were modeled in two graphs with 112 and 51 vertices, respectively, where the vertices correspond to the sensor nodes and the edges between them contain the Pearson correlation between the values measured by the sensors over time. An edge only exists if the pairwise correlation is greater than 0.5.

As SpecF is not specifically designed to approach time series data, a methodology to define the anomalous signals according to an analysis of pairwise consecutive 30-second time windows of the time series is followed as explained next.

• Given a pair of consecutive time windows, calculate their distance using the Dynamic Time Warping (DTW) Sakoe (1971) with the implementation proposed by (Meert et al., 2020) .

• For the two time windows, consider the DTW the signal on the node and apply SpecF.

• The anomalous nodes returned by SpecF are deemed anomalous in the entire analyzed time windows.

Tables 8 and 9 present the Precision, Recall and F1-Score of the results obtained by these algorithms and SpecF on SWaT and WADI datasets, respectively. The F1-score measure is given Equation (12).

According to the F1-score, SpecF outperformed the PCA, KNN, FB and DAGMM methods in both datasets. In the WADI dataset, SpecF also outperformed the LSTM-VAE method. In contrast, SpecF was outperformed by MAD-GAN, GDN and GTA. Nonetheless, different from these methods, SpecF does not need any training steps to obtain information on the nonanomalous state of the system. Therefore, we believe that SpecF presented a good performance and was able to produce satisfactory results with a low computational cost. Although it was not specifically designed to deal with 

Here an experiment with real data to validate the performance of SpecF was carried out. The data set used in the experiment contains the number of daily COVID-19 cases in each of the districts of São José dos Campos, an upstate city of São Paulo -Brazil. To model the data, let G be a graph where the nodes represent districts of the city. A pair of nodes v i and v j is connected if the distance between the corresponding districts is less than 5 km. In this case, the weight w ij of the edge is proportional to the distance between districts represented by nodes v i and v j . Figure 7 illustrates the graph G, where the nodes were colored according to their communities (meaning that nodes with identical color belong to the same community). The case study graph has 366 nodes and the community structure of the network was identified using a community detection algorithm named Louvain (Blondel et al., 2008) . The Louvain method is a benchmark community detection heuristic that identifies the community structure through the modularity optimization (Newman and Girvan, 2004) .

The data used was collected from March 5, 2020 to March 25, 2021, totaling 385 days (equivalent to 55 weeks). For each week, only the nodes is the ratio between the sum of the total number of cases registered between days d − 7, d − 6, . . . , d − 1 and days d, d + 1, . . . , d + 6 in the district represented by node v i . The proposed anomaly detection strategy was applied to signal B (d) to identify anomalies in the weekly variation of cases. The hypothesis is that if the number of cases in a given district increases, the number of cases in adjacent districts must increase as well. As a consequence, SpecF may identify neighbor districts whose variation in the number of cases differs from a given district in a certain week.

The upper part of Figure 8 illustrates the weekly variation in the number of cases in each of the districts. The 366 different districts are represented by the y-axis whereas the x-axis represents the weeks. The signal corresponding to week 2, for example, is represented by B (14) , since d is reached by multiplying the week number by 7. The intensity of the colors represents the weekly variation in the number of cases. The redder the greater the increase in the weekly number of cases. The modeled graph is subject to SpecF to find the Y (d) signal that stores the abnormality of the nodes. The lower part of Figure 8 illustrates the abnormality in the variation on the number of cases for each district. It is possible to notice that the amount of light dots at the bottom is considerably smaller. This is because SpecF filters the signal and highlights the abnormalities, so that only potentially anomalous variations are maintained in the Y (d) signal.

To validate the hypothesis behind the introduced anomaly detection algorithm and attest to the veracity of the result, we compared the values of the nodes identified as anomalous with the values of the other nodes of the same community for every day d = {0, 7, 14, . . . , 378, 385}. For example, if y (d) i is pointed out as an anomaly, it means that the variation in the number of cases between the days d−7, d−1 and d, d+6 in the district represented by node v i differs from the standard of the community to which v i node belongs.

This behavior is best seen in Figure 9 , which illustrates a map with the weekly variation in the number of cases within the districts, with a single anomaly. Shades closer to blue represent a drop in the number of cases, while shades closer to yellow represent an increase in the total of COVID-19 cases. It is possible to observe that most districts experience a reduction in the number of cases, while only one district has an increase. This increase is considered an abnormal incident since the adjacent districts have the number of COVID-19 cases reduced.

A total of 117 values y (d) i

were identified as anomalous. Approximately 95% of these anomalies have the same characteristic: the node identified as anomalous in a time window was the only one that had an increase in the number of cases in its community. This highlights neighborhoods and routes of dissemination of the virus that may require public health interventions. Figure 10 shows this behavior for a sample of 50 of the 117 anomalies identified. Each box-plot illustrates the variation values of nodes from the same community to which the anomalous node belongs. The variation values of the anomalous nodes are highlighted in blue dots. The anomalous node presents an increase in the number of COVID-19 cases, in contrast to the other nodes of the same community, that showed a reduction in the number of cases.

We have identified that new anomalies usually occur in districts that are adjacent to districts where abnormal variations were observed in the previous two weeks. In 80.3% of the cases, if a node v i is indicated as anomalous in the signal y (d) i , then there is a node v j , neighbor of v i , that is pointed out as anomalous in the signal y strategy can identify nodes whose characteristics differ from the others from the same community, even in real-world scenarios. Furthermore, it shows that the anomaly trails through the network, such that an anomalous node hits adjacent nodes that become anomalous a few weeks later. Such analyzes can support anticipated public health decisions for a better management of the pandemics evolution in the city.

This paper proposed a method to detect anomalies in signals indexed by graphs, called SpecF. SpecF uses an extended Laplacian matrix that considers the community structure as the basis for the Fourier transform and a lowpass filter designed to attenuate high frequencies in the signal spectrum. The proposed filter uses as cut-off frequency the eigenvalue whose position is the number of communities in the network. This position can be passed as a parameter or automatically calculated through eigenvalue analysis of the adopted matrix.

Computational experiments attest to the accuracy of SpecF in the task of detecting anomalies. Moreover, the automatic calculation of the cut-off frequency using the network community structure makes the method independent of parameters and more robust when dealing with real-world networks. In addition, the experiments show that the proposed method performed better when adopting the expanded adjacency matrix W in the Fourier transform instead of the standard adjacency matrix A.

The results suggest that SpecF is an interesting solution to detect anomalies based on signals indexed by graphs. More specifically, the method can successfully identify nodes in the networks whose signal values differ from the expected according to the community where they belong. Therefore, the proposed method effectively finds anomalies in signals whose expected variation is low in nodes from the same community. Moreover, a comparative analysis with state-of-the-art algorithms in two publicly available real world datasets with known anomalies show a good performance of SpecF. SpecF outperformed the classic algorithms, being competitive with some recent reference methods, even though it has not any learning step and has not been designed for the time series anomaly detection task.

Additionally, a case study was presented to validate the proposed strategy. It consists of data related to the number of COVID-19 cases in the different districts of São José dos Campos, an upstate city of São Paulo, Brazil. By using SpecF, it was possible to detect districts whose number of cases soared while neighboring districts showed a reduction in the number of cases in the same week. The number of COVID-19 cases in a district in a given week was evaluated abnormally high or low. The number of COVID-19 cases in some neighbor districts in the former one or two weeks is also anomalous in most of the cases. This gives evidence that the uncontrolled dissemination of the virus occurs following a "path" in the city.

Outlier Analysis

Wadi: a water distribution testbed for research in the design of secure cyber physical systems

Graph based anomaly detection and description: a survey

Fast outlier detection in high dimensional spaces

Fast unfolding of communities in large networks

A review of image denoising algorithms, with a new one

Signal denoising on graphs via graph filtering

Learning graph structures with transformer for multivariate time series anomaly detection in IoT

Deep learning for anomaly detection in time-series data: Review, analysis, and guidelines

Spectral graph theory

Statistical comparisons of classifiers over multiple data sets

Graph neural network-based anomaly detection in multivariate time series

Learning graphs from data: A signal representation perspective

Spectral anomaly detection using graphbased filtering for wireless sensor networks

Community-based network analyses reveal emerging connectivity patterns of protein-protein interactions in murine melanoma secretome

On community outliers and their efficient detection in information networks

Fault diagnosis of rolling bearing based on laplacian regularization

Community detection algorithms: a comparative analysis

Feature bagging for outlier detection

Mad-gan: Multivariate anomaly detection for time series data with generative adversarial networks

Clustering-based anomaly detection in multivariate time series data

A comprehensive survey on graph anomaly detection with deep learning

Swat: A water treatment testbed for research and training on ics security

Ranking outlier nodes in subspaces of attributed graphs

Finding and evaluating community structure in networks

Graph signal processing: Overview, challenges, and applications

A multimodal anomaly detector for robot-assisted feeding using an lstm-based variational autoencoder

Algebraic signal processing theory: 1-D space

Algebraic signal processing theory: Foundation and 1-D time

The precision-recall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets

Dynamic-programming approach to continuous speech recognition

Discrete signal processing on graphs: Graph fourier transform

Discrete signal processing on graphs: Frequency analysis

The emerging field of signal processing on graphs: Extending highdimensional data analysis to networks and other irregular domains

A tutorial on spectral clustering

Deep autoencoding gaussian mixture model for unsupervised anomaly detection

The authors of this paper are grateful to Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) (Grant Number: 2017/24185-0), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -Brasil (CAPES) (Grant: 88887.507037/2020-00) and to the health department of São José dos Campos for providing the COVID-19 data.