key: cord-0750600-rs1u9lox
authors: Lopes, António M.; Andrade, José P.; Tenreiro Machado, J.A.
title: Multidimensional scaling analysis of virus diseases
date: 2016-04-07
journal: Comput Methods Programs Biomed
DOI: 10.1016/j.cmpb.2016.03.029
sha: 0c88cb2b06c8a313c0ac6ee8a320fdae3abecc26
doc_id: 750600
cord_uid: rs1u9lox

BACKGROUND AND OBJECTIVE: Viruses are infectious agents that replicate inside organisms and reveal a plethora of distinct characteristics. Viral infections spread in many ways, but often have devastating consequences and represent a huge danger for public health. It is important to design statistical and computational techniques capable of handling the available data and highlighting the most important features.

METHODS: This paper reviews the quantitative and qualitative behaviour of 22 infectious diseases caused by viruses. The information is compared and visualized by means of the multidimensional scaling technique.

RESULTS: The results are robust to uncertainties in the data and proved to be consistent with clinical practice.

CONCLUSIONS: The paper shows that the proposed methodology may represent a solid mathematical tool to tackle a larger number of viruses and additional information about these infectious agents.

Viruses exert enormous damage on humans worldwide and are the single most important cause of infectious morbidity and mortality. Viral diseases have shaped history since ancient times and continue to do so. These diseases began to be characterized in the 19th century, leading to the identification and differentiation of many viral illnesses [1]. The first viruses were identified at the end of the 19th century and since then the process of discovery has continued steadily, gaining momentum in recent years.
In fact, in recent years it has become possible to visualize viral structure at an atomic level of resolution, nucleotide sequences of viral genomes are known, and functional domains of numerous viruses and enzymes have been established [1, 2]. This information is now being applied to the development of diagnostic tools and effective antiviral therapies.

The classification of viruses has also evolved. Firstly, subclassifications were based on pathologic features such as the preference for a specific organ (for example, the liver in viral hepatitis). Secondly, some epidemiologic characteristics were used, such as transmission by arthropods (arbovirus, for example) [1]. The current classifications are based on the type and structure of the viral nucleic acid and its replication strategy, the symmetry type of the capsid of the virus, and the presence or absence of a lipid envelope [1, 2].

More than 2000 species of viruses have been identified and approximately 650 are capable of infecting humans and animals [2]. Diseases can range from the common cold to fatal events such as Ebola, Smallpox or Rabies [2]. Globally, viral diseases are very diverse and present several degrees of complexity.

In this study we adopt multidimensional scaling (MDS) to visualize the relationships between 22 selected human viral infectious diseases. Some viruses were selected based on recent viral outbreaks and presence in the media (for example, Influenza A virus subtype H5N1, Ebola and Chikungunya), others were chosen for historical reasons (for example, Rabies, Poliomyelitis and Smallpox), and still others due to their prevalence and incidence in human populations (for example, Influenza, Rhinovirus and Norovirus). In two viral diseases (Human Immunodeficiency Virus and Rabies) we consider both the treated and untreated paradigms of the disease due to the huge discrepancy in mortality.
MDS has proven able to provide a new perspective for visualizing global data associated with human pathologies. MDS is a set of techniques used to analyse similarities in data that produce spatial or geometric representations of complex objects [3-5]. MDS had its origin in the behavioural sciences, where it helped in understanding the judgements of individuals (such as preference, or relatedness) concerning elements in a set of objects [6-8]. Nowadays, MDS is used with a large variety of real data, such as biological taxonomy [9-12], finance [13, 14], marketing [15], sociology [16], physics [17], geophysics [18-20], communication networks [21, 22], biology and biomedics [23, 24], among others [25, 26].

Bearing these ideas in mind, the paper is organized as follows. In Section 2 we present the MDS technique. In Section 3 we study and compare data regarding 22 virus diseases. Finally, in Section 4 we draw the main conclusions.

Given $s$ objects in an $m$-dimensional space and a measure of proximity, $\delta_{ij}$, between objects $i$ and $j$, a symmetric $s \times s$ matrix, $C = [\delta_{ij}]$, of item to item (dis)similarities is calculated in a first step. The MDS algorithm produces an $s \times q$ ($q < m$) configuration, $X$, representing point coordinates (items), where $q$ is specified by the user. Thus, row $i$ of matrix $X$ gives the coordinates of object $i$ in the $q$-dimensional embedding space. Configuration $X$ preserves, as well as possible, the proximities between pairwise elements in the higher $m$-dimensional space and unveils the underlying data structure. MDS is, consequently, different from other similar techniques, such as factor and cluster analysis, because there are no assumptions concerning which factors might drive each dimension. Additionally, MDS is able to treat distinct types of data, has better convergence rates, and is less complex than other methods [3, 27].
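As a rough illustration of the mapping from an $s \times s$ dissimilarity matrix to an $s \times q$ configuration $X$, the sketch below uses classical (Torgerson) scaling. This is an assumption made for brevity: it is not the Sammon criterion adopted later in the paper, and the names `classical_mds` and `pairwise_distances` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all rows of X (s x s matrix)."""
    X = np.asarray(X, dtype=float)
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

def classical_mds(D, q=2):
    """Classical (Torgerson) MDS: s x s dissimilarities -> s x q coordinates.

    Assumption: D behaves like a Euclidean distance matrix; negative
    eigenvalues, if any, are clipped to zero.
    """
    D = np.asarray(D, dtype=float)
    s = D.shape[0]
    J = np.eye(s) - np.ones((s, s)) / s        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred Gram matrix
    eigval, eigvec = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:q]       # keep the q largest
    lam = np.clip(eigval[order], 0.0, None)
    return eigvec[:, order] * np.sqrt(lam)     # row i = coordinates of object i

# Hypothetical toy data: four objects whose dissimilarities are planar distances.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = pairwise_distances(points)
X = classical_mds(D, q=2)
```

The recovered coordinates differ from the original points by a rotation, reflection or translation, which is harmless because only the inter-point distances are interpreted.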
In order to arrive at the best configuration $X$, MDS evaluates different alternative configurations while minimizing a goodness-of-fit function. This problem, equivalent to minimizing the raw stress function, $\sigma^2$, can be formulated as [28]:

$$\sigma^2 = \sum_{i<j\le s} z_{ij}\,\left(d_{ij} - \delta_{ij}\right)^2,$$

where $z_{ij}$ is a user chosen non-negative weight and $d_{ij}$ is a measure of the (dis)similarities among the items in the embedding space. Therefore, $d_{ij}$ is usually a distance measure. Smaller (larger) distances between two objects translate into more (less) similarity between them. For example, the Minkowski distance provides a general way to specify distance for quantitative data in a multidimensional space:

$$d_{ij} = \left( \sum_{k} \alpha_k \left| x_{ik} - x_{jk} \right|^{r} \right)^{1/r},$$

where $x_{ik}$ is the value of dimension $k$ for object $i$ and $\alpha_k$ is a weight factor. When $\alpha_k = 1$, the Euclidean and the city-block distances are obtained for $r = 2$ and $r = 1$, respectively. Nevertheless, the MDS technique allows users to choose other metrics for the comparison of objects that may be better suited to their data. In the sequel we adopt the Canberra distance and the cosine correlation.

There are different stress measures, such as the normalized raw stress, which is $\sigma^2$ divided by the sum of squared dissimilarities. Possible alternatives are Kruskal's stress-1 and Kruskal's stress-2, which normalize $\sigma^2$ by the sum of squared distances, or by a function of the variances of the distances, respectively. Another example is the S-stress measure given by the sum of squared errors between squared distances and squared dissimilarities [29, 30].

The Shepard diagram is used to infer the quality of the MDS solution. Let $p_{ij}$ denote the similarities between objects $i$ and $j$ …

The stress plot represents $\sigma^2$ versus the number of dimensions $q$ of the MDS maps. Usually, we get a monotonically decreasing chart and we choose $q$ as a compromise between reducing $\sigma^2$ and having a low dimension for the MDS charts.
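The raw stress and the weighted Minkowski distance discussed above can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the function names `minkowski_distance` and `raw_stress` are hypothetical, the weights default to one, and the sum runs over pairs $i < j$ as in the usual definition of raw stress.

```python
import numpy as np

def minkowski_distance(xi, xj, r=2.0, alpha=None):
    """Weighted Minkowski distance; r=2 is Euclidean, r=1 is city-block."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    if alpha is None:
        alpha = np.ones_like(xi)               # unit weights by default
    return float(np.sum(alpha * np.abs(xi - xj) ** r) ** (1.0 / r))

def raw_stress(X, delta, z=None, r=2.0):
    """Raw stress: sum over i < j of z_ij * (d_ij - delta_ij)^2."""
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    s = X.shape[0]
    if z is None:
        z = np.ones((s, s))                    # unweighted model
    total = 0.0
    for i in range(s):
        for j in range(i + 1, s):
            d_ij = minkowski_distance(X[i], X[j], r=r)
            total += z[i, j] * (d_ij - delta[i, j]) ** 2
    return total
```

For instance, a configuration whose distances reproduce the dissimilarities exactly gives zero stress, while `raw_stress([[0, 0], [3, 4]], [[0, 4], [4, 0]])` evaluates the single pair with $d_{12} = 5$ and $\delta_{12} = 4$.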
MDS can be divided according to the classification of data similarities, the number of similarity matrices and the nature of the MDS model. We thus have non-metric or metric MDS, according to whether the similarity data are qualitative or quantitative, respectively. Concerning the number of similarity matrices and the nature of the model, we have classical MDS (i.e., with one matrix and unweighted models), replicated MDS (i.e., with several matrices and unweighted models) and weighted MDS (i.e., with several matrices and weighted models).

The MDS interpretation is based on the emerging clusters and distances between points in the map, rather than on their absolute coordinates, or the geometrical form of the locus. Thus, we can rotate or translate the MDS chart since the distances between points remain identical. Usually, two or three dimensional charts are selected, because they allow a direct graphical representation.

MDS has advantages over other methods, such as principal component analysis (PCA), since MDS can handle similarity/dissimilarity matrices based on several distinct metrics. MDS uses the inter-object distances rather than the coordinates of the objects and, therefore, MDS is a more general method than PCA [18, 31].

In this section we use MDS tools to visualize the relationships between s = 22 virus diseases described in an m = 7 dimensional space of attributes. We start by preprocessing the quantitative and the qualitative data, yielding a new equivalent µ-dimensional space (to be defined in the sequel) for disease comparison. As quantitative features (Table 1) we consider the disease fatality rate, the average basic reproductive number, the average serial interval, the incubation period and the virus survival time outside the host. Tables 2 and 3 summarize the qualitative features, namely the transmission mode, and the main symptoms of the disease. The transmission mode dimension is represented as binary data $V = [v_{ih}]$ (Table 2), $i = 1, \ldots, s$, $h = 1, \ldots, t$.
This means that we consider t = 7 conditions, specifically animal-human, airborne droplet, bites, body fluids, sexual contact, surfaces and faecal-oral. The value $v_{ih} = 1$ means that disease $i$ has the transmission mode $h$, and $v_{ih} = 0$ means that disease $i$ does not have the transmission mode $h$. In a similar way, for the symptoms dimension we consider y = 28 indicators, represented as binary data $W = [w_{il}]$ (Table 3). The value $w_{il} = 1$ (or 0) means that disease $i$ has (or has not) symptom $l$.

Before applying the MDS algorithm we start by "normalizing" the quantitative data to the interval [0, 1]. In this way we avoid having some features saturating the numerical values. We proceed by constructing the vectors of features that embed all quantitative and qualitative data. This is equivalent to the m-dimensional space defined previously for disease comparison.

In the next subsections we use two indices to compare the preprocessed data, namely the Canberra distance, $c_{ij}$, and the cosine correlation, $cc_{ij}$. Other indices can be adopted, but these two are sufficient to explain the working concepts. We then apply the MDS technique and interpret the generated maps. Fig. 1 depicts a synoptic diagram of the disease's characteristics and quantification method.
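The preprocessing and the two comparison indices named above can be sketched as follows. Several points are assumptions rather than the paper's exact procedure: the normalization divides each column by its maximum (the original formula is not recoverable from the text), the Canberra distance and cosine correlation are written in their standard textbook forms, and the function names and the toy matrix `U` are hypothetical.

```python
import numpy as np

def normalize_columns(U):
    """Rescale each quantitative feature (column) to [0, 1] by its maximum.

    Assumption: features are non-negative; all-zero columns are left as is.
    """
    U = np.asarray(U, dtype=float)
    col_max = U.max(axis=0)
    col_max[col_max == 0] = 1.0
    return U / col_max

def canberra_matrix(X):
    """Standard Canberra distance between all pairs of rows of X."""
    X = np.asarray(X, dtype=float)
    num = np.abs(X[:, None, :] - X[None, :, :])
    den = np.abs(X[:, None, :]) + np.abs(X[None, :, :])
    # Terms of the form 0/0 are taken as 0 (a common convention).
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return terms.sum(axis=2)

def cosine_matrix(X):
    """Standard cosine correlation between all pairs of (non-zero) rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    return (X @ X.T) / np.outer(norms, norms)

# Hypothetical toy data: 3 "diseases" described by 2 quantitative features.
U = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 0.0]])
Xn = normalize_columns(U)
C = canberra_matrix(Xn)     # dissimilarities: larger means less alike
CC = cosine_matrix(Xn)      # similarities: 1 means identical direction
```

Note that the Canberra index is a dissimilarity while the cosine correlation is a similarity, so the two matrices have opposite orientations when fed to an MDS routine.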
In constructing Tables 1-3, data were collected from the following sources: Influenza A virus subtype H5N1 commonly known as "Bird Flu" [32-35]; Chicken Pox (varicella-zoster virus infection) [36-39]; Chikungunya [40-42]; Dengue Fever [43, 44]; Ebola [45-47]; Hepatitis B [48-50]; Human Immunodeficiency Virus (HIV) and HIV-untreated [51-54]; Marburg haemorrhagic fever [47, 55]; Measles [56-60]; Middle East Respiratory …

In this subsection we consider the construction of matrix $X$ using a measure based on the Canberra distance, $c_{ij}$, between diseases $i$ and $j$ ($i, j = 1, \ldots, s$). Given this index, the $s \times s$ symmetric matrix, $C = [c_{ij}]$, is then computed and the MDS tool applied. While several MDS criteria were tested, the Sammon criterion revealed good results and was adopted in all calculations. It should be noted that this criterion tries to optimize a cost function that describes how well the pairwise distances in a data set are preserved [88, 89].

Fig. 2 depicts the 2- and 3-dimensional (2D and 3D) maps produced by MDS. Each point represents a disease, denoted by the corresponding label as shown in Tables 1-3. We can observe that the Canberra index leads to poor clustering. Nonetheless, we should note that MDS is merely a mathematical clustering and visualization tool and that a physical perspective of the reported results must be found in the light of the comparison index [90]. Therefore, a further explanation of the physical mechanisms associated with the results must be sought by standard complementary procedures.

Figs. 3 and 4 depict the Shepard and stress plots, respectively, which represent standard tools for the assessment of the MDS results. The Shepard diagram shows a good distribution of points around the 45 degree line, particularly for the 3D representation, which means a good fit of the distances to the dissimilarities.
The stress plot reveals that a three dimensional space describes well the locus of the s = 22 diseases. In fact, the stress diminishes strongly until q = 2, moderately towards q = 3 and weakly thereafter. The maximum curvature point of the stress plot is often adopted as the criterion for deciding the dimensionality of the MDS maps. This means that, although four or more dimensions would represent the data slightly more accurately, 3D maps represent a good compromise between accuracy and ease of visualization.

In this subsection we adopt the cosine correlation, $cc_{ij}$, to construct the matrix $X$. … are trustworthy. We now observe the emergence of a different pattern, but the main idea of clustering remains. This observation is usual in MDS charts, where alternative indices capture different characteristics of the phenomena and lead to distinct plots, while allowing the same conclusions. The "best" index is simply the one that produces an MDS map where clusters reflect the real world more directly.

The standard MDS analysis is based on the object groups in the final map. We can rely either on the direct visualization of the plot, or on the implementation of some extra algorithm to extract the clusters. Bearing this idea in mind, in subsection 3.3.1 we adopt the non-hierarchical clustering algorithm K-means to identify clusters in the MDS map. In subsection 3.3.2 we use hierarchical clustering to confirm the results obtained. In subsection 3.3.3 we analyse the sensitivity of the MDS maps. In subsection 3.3.4 we discuss the results. We restrict the analysis to the cosine correlation metric since it revealed better results.

Clustering consists in grouping objects that are, in some sense, similar to each other. The K-means is a non-hierarchical clustering method commonly used in machine learning and data mining [91].
The algorithm starts with a collection of $s$ objects, where each object is a point in a $q$-dimensional space, and a given number of clusters, $K$, specified in advance by the user. The K-means groups the $s$ objects into $K \le s$ clusters, so as to minimize the objective function given by the sum of distances between the points and the centres of their clusters. The K-means arrives at a solution in which objects within each cluster are as close to each other as possible, and as far from objects in other clusters as possible.

A key issue in K-means is how to determine the correct number of clusters, $K$. It should be noted that the very notion of "good clustering" is subjective and depends on the point of view. However, we can rely on different indices to measure the quality of the clustering, namely the Davies-Bouldin, the Caliński-Harabasz and the silhouette indices [92-94]. In this work we adopt the silhouette to compare different solutions. The silhouette value, $S$, of each object is a measure of how well that object lies within its cluster. Silhouette values vary in the interval $S \in [-1, 1]$. Silhouette values close to $S = 1$ correspond to objects that are very distant from neighbouring clusters and, therefore, are assigned to the right cluster. If $S = 0$, then the objects lie on the border between clusters and could equally be assigned to another cluster. When $S = -1$, the objects are assigned to the wrong cluster.

Given the coordinates of the s = 22 objects in the q = 3 dimensional space generated by the MDS, we evaluate the clusters identified by the K-means algorithm when varying the number of clusters in the interval $K \in [2, 7]$. Fig. 6 depicts the silhouette average values versus the number of clusters, $K$, leading to the optimum value K = 4. Fig. 7 illustrates the shape of the silhouettes obtained for $K = 3, \ldots, 6$, where we can see that the best shape is obtained for K = 4.
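The cluster-selection procedure of this subsection can be sketched with a plain Lloyd-type K-means and a direct silhouette computation. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the K-means variant shown minimizes squared Euclidean distances to the cluster centres, and the toy points merely stand in for the MDS coordinates.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Lloyd's algorithm: returns (labels, centres) for K clusters."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    centres = X[rng.choice(len(X), size=K, replace=False)]   # random initial centres
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)                         # nearest-centre assignment
        new_centres = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centres[k]
            for k in range(K)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

def silhouette_values(X, labels):
    """Silhouette S in [-1, 1] per object; assumes at least two clusters."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    S = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue                                         # singleton cluster: S = 0
        a = D[i, same].sum() / (n_same - 1)                  # mean distance within own cluster
        b = min(D[i, labels == k].mean()                     # nearest other cluster
                for k in set(labels.tolist()) if k != labels[i])
        S[i] = (b - a) / max(a, b)
    return S

# Hypothetical toy data: two well-separated groups of three points.
pts = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centres = kmeans(pts, K=2)
S = silhouette_values(pts, labels)
```

To choose $K$ as described in the text, one would repeat the two calls for each candidate value and keep the $K$ with the largest average silhouette.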
As an alternative approach, not involving the MDS, we use a hierarchical clustering algorithm that is fed directly with matrix $C = [cc_{ij}]$. Fig. 8 depicts the dendrogram generated by successive (agglomerative) clustering and the average-linkage method [95, 96]. We cut the tree at the level 0.27, since below this value the clusters become too close to each other. We see that four clusters emerge with this method, confirming the results obtained in the previous subsection. Nevertheless, MDS uses the space more efficiently and produces a more informative map of the objects.

The s = 22 virus diseases were compared based on quantitative and qualitative features. As qualitative characteristics are subjective, their influence upon the final results needs to be analysed, so as to prevent biased conclusions. In this line of thought, we vary the weights, $\{\alpha_p, \alpha_q\}$, of the two qualitative features in the interval … We use the resulting matrices as the input for the MDS algorithm, which generates $r \times r$ intermediate maps of "points" (i.e., one map per $\{\alpha_p, \alpha_q\}$ pair). Finally, the charts are processed by means of Procrustes analysis in order to obtain a single global plot of "shapes", where the "points" of the original maps are optimally superimposed [97].

Procrustes analysis performs linear transformations, namely translation, reflection, orthogonal rotation and scaling, with the objective of minimizing a measure of the difference between the "points" in the original maps. The algorithm (i) chooses one MDS map for reference (by selecting one of the available instances); (ii) superimposes all other MDS instances onto the current reference; (iii) computes the mean form of the current set of superimposed maps; (iv) compares the distance between the mean and the reference instances to a given threshold value and, if above, sets the reference to the mean form and returns to step (ii).

… the clusters {A, B, C, D} identified previously.
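Steps (i) to (iv) of the Procrustes procedure can be sketched as below. This is a minimal illustration under assumptions: each pairwise superimposition uses the standard SVD solution of the orthogonal Procrustes problem, the mean form is rescaled to the size of the initial reference at every pass (a stabilizing choice not stated in the text), and the names `align` and `generalized_procrustes` are hypothetical.

```python
import numpy as np

def align(reference, target):
    """Superimpose `target` onto `reference` by translation, scaling and
    orthogonal rotation/reflection (least-squares optimal)."""
    ref_mean = reference.mean(axis=0)
    A = reference - ref_mean
    B = target - target.mean(axis=0)
    U, sing, Vt = np.linalg.svd(B.T @ A)
    R = U @ Vt                               # optimal orthogonal matrix
    scale = sing.sum() / (B ** 2).sum()      # optimal isotropic scaling
    return scale * B @ R + ref_mean

def generalized_procrustes(maps, tol=1e-9, max_iter=100):
    """Iteratively superimpose several maps of the same s objects."""
    maps = [np.asarray(m, dtype=float) for m in maps]
    reference = maps[0]                                        # step (i)
    size = np.linalg.norm(reference - reference.mean(axis=0))
    for _ in range(max_iter):
        aligned = [align(reference, m) for m in maps]          # step (ii)
        mean_form = np.mean(aligned, axis=0)                   # step (iii)
        centre = mean_form.mean(axis=0)
        # Assumption: keep the mean form at a fixed size to avoid shrinkage.
        mean_form = centre + (mean_form - centre) * (size / np.linalg.norm(mean_form - centre))
        if np.linalg.norm(mean_form - reference) < tol:        # step (iv)
            break
        reference = mean_form
    return aligned, mean_form

# Hypothetical check: the same shape, rotated, scaled, shifted and reflected.
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
moved = 2.5 * base @ rot + np.array([4.0, -1.0])
mirrored = base * np.array([-1.0, 1.0])
aligned, mean_form = generalized_procrustes([base, moved, mirrored])
```

In the sensitivity study each input map would be one MDS instance, and the scatter of the superimposed copies of each object indicates how stable its position is.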
The resulting global plot shows that the results are quite robust to large variations in the qualitative features, since the $r \times r$ points corresponding to each original object (disease) deviate somewhat, but the clusters remain.

In a complementary perspective we address in the sequel the sensitivity of the MDS results to quantitative features. In fact, the quantitative values found in the literature diverge slightly, since they depend on the time of the study and on the conditions observed in each particular case, namely environmental conditions (e.g., temperature, humidity, pressure), geographic region, human development, medical assistance, among others. To assess the sensitivity we add random noise to the values of the quantitative features, with amplitude in the interval ±10% of the values shown in Table 1 (values are clipped at zero to avoid negative numbers). In these conditions, we perform ten experiments, generating one MDS map per trial, and then the individual MDS maps are combined using Procrustes analysis.

The … infections. Accordingly, these viruses are manipulated only in Biosafety Level 4 conditions due to their high individual and community risk. They are also considered biological agents with material threat determinations in the scope of bioterrorism, in the USA [98]. Also, the highly infectious agents responsible for MERS (from the Coronaviridae Subfamily) and Bird flu (Influenza A virus subtype H5N1), classified as Risk Group 3, are mapped in D. These viruses are recommended to be manipulated with Biosafety Level 3 precautions, indicated for agents that may cause serious or potentially lethal disease. In this cluster, untreated HIV and untreated rabies infections are also present.

In contrast, and distant in the MDS map, we have cluster A. Most of the viruses are in Risk Group 2, because they generally do not cause serious or life threatening illness and most of them are readily treated or easily prevented with vaccines.
They are manipulated, as most viruses, in Biosafety Level 2 environments. The exception is Chikungunya, an important cause of febrile illness in the world, now reemerging as a cause of large outbreaks of human disease [99]. The arbovirus (arthropod-borne) alphavirus responsible for this pathology is considered a Risk Group 3 pathogen and requires Biosafety Level 3 precautions [100].

Cluster C presents several virus species of different Risk Groups. Polio virus is a Risk Group 2 pathogen, as is the dengue fever virus, an arbovirus. On the other hand, the SARS-associated coronavirus is a Risk Group 3 pathogen. Furthermore, smallpox, the disease caused by the variola virus, is also present in C. Variola is considered a life-threatening disease posing the highest risk to national security because of its potential use as a biological weapon, given the high mortality rates and the major public health impact [98]. The reason is that smallpox was declared eradicated by the World Health Organization (WHO) in 1980, and vaccination, once widely practised, stopped in the same year [98]. Therefore, cluster C can be considered as a transition cluster from A to D. In other words, it is located in the MDS map "equidistant" from A and D.

Identical reasoning can be applied to cluster B. In this cluster are the lentivirus (a subgroup of retrovirus) that causes HIV infection and acquired immunodeficiency syndrome (AIDS), a Risk Group 3 pathogen but generally manipulated with Biosafety Level 2 precautions, and the hepatitis B virus, considered as belonging to Risk Group 2, equally transmitted by body fluids, and requiring a Biosafety Level 2 environment. Also considered in Risk Group 2 is the third pathogen present in B, i.e., the virus of the Lyssavirus genus of the Rhabdoviridae family that causes rabies. Note that if the diseases caused by HIV and rabies are untreated, then there is a high lethality in humans. They emerge in cluster D of the MDS map.
In conclusion, the MDS map resulted in a new visualization of the complex quantitative and qualitative data of several diseases caused by viruses, and several clusters emerged that have some medical and epidemiological interest. In particular, a cluster emerged with viruses like Ebola and MERS, which are responsible for some recent viral outbreaks. In contrast, in the same MDS map, and distant from the previous group, there is a cluster of viruses associated with human diseases for which preventive and therapeutic interventions are generally available. The development of this methodology may help in understanding the dynamics of viral diseases.

This paper addressed the clinical characteristics of 22 viruses. A significant number of quantitative and qualitative characteristics were considered. When handling a large volume of information, we are confronted with the problem of comparing all details while highlighting the most important properties. Discarding information a priori may lead to incomplete or even biased results. Therefore, embedding all details requires adequate statistical, computational and visualization techniques capable of revealing the main aspects while "filtering" the information with low relevance. The MDS technique adopted in this study proved to produce solid results in accordance with present day knowledge about those infectious agents.
Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Modern Multidimensional Scaling: Theory and Applications Multidimensional Scaling Multidimensional scaling and tourism research Multidimensional scaling of sorting data applied to cheese perception Theory and Methods of Scaling Lassa and Marburg viruses elicit distinct host transcriptional responses early after infection A classification approach for genotyping viral sequences based on multidimensional scaling and linear discriminant analysis Application of multidimensional scaling in numerical taxonomy: analysis of isoenzyme types of Candida species Phylogenetic analysis of the polyprotein coding region of an infectious South African bursal disease virus (IBDV) strain Analysis of financial data series using fractional Fourier transform and multidimensional scaling Multidimensional scaling visualization using parametric similarity indices Consumption universes based supermarket layout through association rule mining and multidimensional scaling Visualization of social networks in Stata using multidimensional scaling Matrixassisted laser desorption/ionization time-of-flight mass spectrometry combined with multidimensional scaling, binary hierarchical cluster tree and selected diagnostic masses improves species identification of Neolithic keratin sequences from furs of the Tyrolean Iceman Oetzi Analysis of temperature time-series: embedding dynamics into the MDS method Fractional dynamics and MDS visualization of earthquake phenomena Analysis and visualization of seismic data using mutual information Efficient weighted multidimensional scaling for wireless sensor network localization Sensor positioning in wireless ad-hoc sensor networks using multidimensional scaling The Human Respiratory System: An Analysis of the Interplay between Anatomy, Structure, Breathing and Fractal Dynamics Cerebral cortical mechanisms of copying geometrical shapes: a multidimensional scaling analysis of fMRI 
patterns of activation Dynamical analysis of compositions The persistence of memory Multigrid multidimensional scaling Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis Introduction to Multidimensional Scaling: Theory, Methods, and Applications Multidimensional Scaling: History, Theory, and Applications Dolas-Reyes, Statistical methods for interpreting Monte Carlo ensemble forecasts H5N1 influenza in Hong Kong: virus characterizations Modeling highly pathogenic avian influenza transmission in wild birds and poultry in West Bengal Influenza (Including Avian Influenza and Swine Influenza), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Environmental persistence of a highly pathogenic avian influenza (H5N1) virus Varicella in Europe-a review of the epidemiology and experience with vaccination Modification of chicken pox in family contacts by administration of gamma globulin Ventilation in the flow of measles and chickenpox through a community: progress report Chickenpox and herpes zoster (varicella-zoster virus), in: Mandell, Douglas, and Bennett's principles and Practice of Infectious Diseases Chikungunya outbreaks-the globalization of vectorborne diseases Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Chikungunya: evolutionary history and recent epidemic spread Dengue human infection model performance parameters Estimating the basic reproduction number for single-strain dengue fever epidemics Estimating the reproduction number of Ebola virus (EBOV) during the 2014 outbreak in West Africa Ebola: the real lessons from HIV scale-up Marburg and Ebola Hemorrhagic Fevers (Filovirus), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Stochastic model for hepatitis B virus infection through maternal (vertical) and environmental (horizontal) transmission with applications to basic reproductive number estimation and economic appraisal of 
preventive strategies Modelling the epidemiology of hepatitis B in New Zealand Hepatitis B virus and hepatitis delta virus, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Understanding HIV infection for the design of a therapeutic vaccine. Part I: epidemiology and pathogenesis of HIV infection Global epidemiology of HIV Minimum infective dose of HIV for parenteral dosimetry Human immunodeficiency virus, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Ebola and Marburg haemorrhagic fever Measles-the epidemiology of elimination Measles virus (Rubeola), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Rubella virus (German measles), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Molecular epidemiology of measles viruses in the United States Measles herd immunity: the association of attack rates with immunization rates in preschool children Coronaviruses, including Severe Acute Respiratory Syndrome (Sars) and Middle East Respiratory Syndrome (MERS), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Middle East respiratory syndrome corona virus, MERS-CoV Stability of Middle East respiratory syndrome coronavirus in milk Mumps virus, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Molecular biology, pathogenesis and pathology of mumps virus Global prevalence of norovirus in cases of gastroenteritis: a systematic review and meta-analysis Noroviruses and Sapoviruses (Caliciviruses), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Virus survival in the environment Poliovirus, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases A comparison of the clinical features of poliomyelitis in adults and in children Transmission dynamics and prospects for the elimination of canine rabies Epidemiology of human 
rabies in the United States Rabies (Rhabdoviruses), in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Aerosol transmission of rhinovirus colds Mechanisms of transmission of rhinovirus infections Rhinovirus, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Global seasonality of rotavirus disease Direct and indirect effects of rotavirus vaccination: comparing predictions from transmission dynamic models estimate of worldwide rotavirus-associated mortality in children younger than 5 years before the introduction of universal rotavirus vaccination programmes: a systematic review and metaanalysis The epidemiological profile of rubella and congenital rubella syndrome in the United States, 1998-2004: the evidence for absence of endemic transmission The aetiology, origins, and diagnosis of severe acute respiratory syndrome A complete analysis of HA and NA genes of influenza A viruses Transmission of influenza: implications for control in health care settings What was the primary mode of smallpox transmission? 
Implications for biodefense, Front The rediscovery of smallpox Distance in spatial interpolation of daily rain gauge data Dynamics of the Dow Jones and the NASDAQ stock indexes Multidimensional scaling visualization of earthquake phenomena Data clustering: 50 years beyond k-means A dendrite method for cluster analysis A cluster separation measure Silhouettes: a graphical aid to the interpretation and validation of cluster analysis Dynamic analysis of earthquake phenomena by means of pseudo phase plane A survey of recent advances in hierarchical clustering algorithms Generalized Procrustes analysis Bioterrorism: an overview, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases Chikungunya: a re-emerging virus Emerging and reemerging infectious disease threats, in: Mandell, Douglas, and Bennett's Principles and Practice of Infectious Diseases The authors declare no conflict of interest.