key: cord-323202-kcy8xoos
authors: Burns, Jane C.; DeHaan, Laurel L.; Shimizu, Chisato; Bainto, Emelia V.; Tremoulet, Adriana H.; Cayan, Daniel R.; Burney, Jennifer A.
title: Temporal Clusters of Kawasaki Disease Cases Share Distinct Phenotypes That Suggest Response to Diverse Triggers
date: 2020-09-22
journal: J Pediatr
DOI: 10.1016/j.jpeds.2020.09.043
sha:
doc_id: 323202
cord_uid: kcy8xoos

OBJECTIVE: To test the hypothesis that cases of Kawasaki disease within a temporal cluster have a similar pattern of host response that is distinct from cases of Kawasaki disease in different observed clusters and randomly constructed clusters.

STUDY DESIGN: We designed a case-control study to analyze 47 clusters derived from 1332 patients with Kawasaki disease over a 17-year period (2002-2019) from a single clinical site and compared the cluster characteristics with two control groups of synthetic KD clusters. We defined a "true" KD cluster as at least 5 patients within a 7-day moving window. The observed and synthetic KD clusters were compared with respect to demographic and clinical characteristics and median values for standard laboratory data using univariate analysis and a multivariate rotated empirical orthogonal function (REOF) analysis.

RESULTS: In a univariate analysis, the median values for age, coronary artery Z score, white blood cell count, erythrocyte sedimentation rate, C-reactive protein, and age-adjusted hemoglobin for several of the true KD clusters exceeded the 95th percentile for the two groups of synthetic clusters. REOFs revealed distinct patterns of demographic and clinical measures within clusters.

CONCLUSIONS: Cases of Kawasaki disease within a cluster were more similar with respect to demographic and clinical features and levels of inflammation than would be expected by chance. These observations suggest that different triggers and/or different intensity of exposures result in clusters of cases of Kawasaki disease that share a similar response pattern.
Analyzing cases within clusters or cases who share demographic and clinical features may lead to new insights into the etiology of KD.

Kawasaki disease, a pediatric self-limited vasculitis that affects the coronary arteries, has eluded attempts to discover its etiology for over four decades [1]. Epidemiologic clues have provided valuable insights into the nature of the disease, including the genetic predisposition that underlies susceptibility [2]. The elucidation of the distinct seasonality, the lack of documented person-to-person spread, and the spatiotemporal clustering of cases all suggest a fluctuating exposure that triggers the disease [3]. Clinical features of the illness, including mucosal inflammation of the lips, tongue, and upper airway coupled with cervical lymphadenopathy and hoarseness, all point to a trigger that enters through the nasopharynx [4, 5]. Specific atmospheric patterns that may increase exposure to the causative trigger have been linked to clusters of cases of Kawasaki disease [3].

Although KD is conceptualized as a monomorphic disease, it may be more accurate to characterize it as a syndrome with subtle variation among clinical sub-groups. Certain clinical features are not universal among patients with Kawasaki disease. These include the "node-first" presentation with fever and cervical lymphadenopathy, the specific injury pattern to the tongue usually associated with bacterial toxins (strawberry tongue), and the characteristic periungual desquamation in the convalescent phase [5, 6]. The prospective collection of detailed demographic and clinical details on cases of Kawasaki disease treated at Rady Children's Hospital San Diego allowed statistical analysis of the temporal pattern of cases of Kawasaki disease over many years, revealing clustering of cases beyond what would be expected due to KD seasonality and trends alone.
Demographic and clinical characteristics of cases within these temporal clusters were compared with those of synthetic control clusters derived from either re-shuffling members of clusters or creating clusters of same-season cases that did not occur within observed clusters. The results highlighted the shared features within a cluster and suggest that future research on etiology should prioritize the separate analysis of cases that share similar sub-phenotypes.

We enrolled 1,332 patients with KD who met American Heart Association (AHA) guidelines for complete or incomplete KD and were diagnosed and treated at Rady Children's Hospital San Diego (RCHSD) between January 1, 2002 and March 31, 2019 [7]. The worst CA Z score was defined as previously published [8]. The collection of data was approved by the Institutional Review Board at the University of California San Diego, and parents and participants gave signed informed consent or assent as appropriate.

We computed the distribution density of case onsets (cases per 7-day period) across the entire sample. We took the value associated with the high tail of the 95% confidence interval (97.5th percentile) of the density distribution as the starting definition for a temporal cluster, which was 5 or more cases in 7 days (Figure 1). Across the time series, any day that was part of a moving window in which case density equaled or exceeded 5 cases in 7 days was classified as a cluster day (irrespective of whether or not it had a case onset), and any KD onset falling in those windows was classified as a "cluster" case. Cases outside of those windows were classified as "non-cluster" cases of Kawasaki disease.

We used a set of 14 clinical and demographic quantities to calculate cluster-level characteristics (names used in figures shown in bold).
Patient residence longitude and latitude: due to the very large geographic catchment area, we used geospatial coordinates to understand potential spatial dependence of clinical clusters. For cluster-level summaries, we used median longitude and latitude. Because of the uneven distribution of ethnic/racial groups in our region, we considered these geospatial coordinates as a surrogate for race/ethnicity.

Patient age at date of onset, with the first day of fever designated as illness day 1. For cluster-level summaries, we used median age at onset.

Laboratory data prior to IVIG infusion and within the first 15 days after fever onset: white blood cell count (WBC), platelet count (PLT), erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), and age-adjusted hemoglobin (Zhemo). Maximal coronary artery Z score for the right coronary artery (RCA) and left anterior descending artery (LAD) was determined by echocardiography (Zworst). For cluster-level analysis, we used median values for each of these quantities.

Hepatic enzyme elevations were analyzed as a dichotomous variable (alt_abnormal), with normal defined as below the age-adjusted upper limit of normal for alanine aminotransferase (ALT) (< 60 IU), and abnormal values defined as >100 IU. For cluster-level analysis, mean values were used.

Patient sex and the presence of a strawberry tongue (strawberry), enlarged cervical lymph nodes >1.5 cm (lymphnode), and convalescent peeling fingers or toes (peeling). For cluster-level summaries of these characteristics, we used mean values.

We also computed the variance of each attribute for each cluster as a metric of similarity of patients within a cluster. For any clusters that included cases who presented beyond Illness Day 15, we calculated cluster-level values excluding these cases to prevent skewing of the data from patients diagnosed later in the illness, when laboratory values of inflammation might be normalizing.
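The moving-window cluster definition and the cluster-level summaries described above can be illustrated with a short Python sketch. This is not the authors' code: the record layout (dicts with hypothetical `onset` and `age` keys), the function names, and the rule that one cluster is a run of consecutive cluster days are assumptions made for illustration. Only the threshold of 5 cases in 7 days comes from the text, and the sample variance stands in for the unspecified variance estimator.

```python
from datetime import date, timedelta
from statistics import median, variance

WINDOW_DAYS = 7  # window length stated in the text
MIN_CASES = 5    # density threshold stated in the text


def cluster_days(onsets, window=WINDOW_DAYS, min_cases=MIN_CASES):
    """Return every calendar day lying in a window with >= min_cases onsets."""
    counts = {}
    for d in onsets:
        counts[d] = counts.get(d, 0) + 1
    flagged = set()
    if not counts:
        return flagged
    day = min(counts) - timedelta(days=window - 1)
    last = max(counts)
    while day <= last:
        span = [day + timedelta(days=k) for k in range(window)]
        if sum(counts.get(s, 0) for s in span) >= min_cases:
            flagged.update(span)  # days without an onset are flagged too
        day += timedelta(days=1)
    return flagged


def assign_clusters(cases):
    """Group cases into clusters (assumed: runs of consecutive cluster days)."""
    flagged = sorted(cluster_days([c["onset"] for c in cases]))
    clusters, current, previous = [], [], None
    for day in flagged:
        if previous is not None and (day - previous).days > 1:
            if current:
                clusters.append(current)
            current = []
        current.extend(c for c in cases if c["onset"] == day)
        previous = day
    if current:
        clusters.append(current)
    return clusters


def summarize(cluster, key):
    """Cluster-level median and sample variance of one attribute."""
    values = [c[key] for c in cluster]
    return {"n": len(values), "median": median(values),
            "variance": variance(values)}
```

Binary attributes such as strawberry tongue would use the mean instead of the median, as the text specifies.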
Overall, there were 18 cases whose laboratory data were excluded from analysis because of diagnosis beyond Day 15.

We created 100 sets of synthetic clusters with the same size distribution as the observed clusters (hereafter called true clusters). Shuffled clusters were created by shuffling cluster membership among the 315 cluster cases of Kawasaki disease: the universe of cluster cases was randomly re-allocated to 47 groups of the same sizes as the true observed temporal clusters. We also created a set of synthetic clusters drawn from non-cluster cases of Kawasaki disease that occurred in any year, but within the same season (control clusters).

Individual clusters as outliers: We assessed the degree to which each individual cluster was anomalous relative to both sets of synthetic clusters by testing whether the cluster-average value for each clinical characteristic lay outside the range of averages among the comparison groups (e.g., we compared median age of true cluster #7 to the distribution of median age in 100 shuffled cluster #7s and 100 control cluster #7s). We calculated a p-value for each true cluster compared to these groups. We also compared the variances of attributes in the true clusters with those in the synthetic comparison clusters (e.g., we compared variance in age of true cluster #7 to the variance in age in 100 shuffled cluster #7s and 100 control cluster #7s).

We also compared the full distribution of average values and variances in the true clusters with a sample drawn from each of the comparison groups. This allowed us to assess whether the true clusters as a whole represented a departure from the comparison groups: that is, does the temporal clustering overall produce a different distribution of traits, beyond individual clusters?
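The shuffled-cluster comparison described above can be sketched as follows. The re-allocation of pooled cluster cases to groups of fixed sizes follows the text. The two-sided empirical p-value (distance from the null median, with a +1 correction) is an assumption, since the text does not say how its p-values were computed, and all function names are hypothetical. Control clusters drawn from same-season non-cluster cases would need onset dates and are not shown.

```python
import random
from statistics import median


def shuffled_clusters(cluster_cases, sizes, rng):
    """Randomly re-allocate the pooled cluster cases to groups of given sizes."""
    pool = list(cluster_cases)
    rng.shuffle(pool)
    groups, start = [], 0
    for n in sizes:
        groups.append(pool[start:start + n])
        start += n
    return groups


def null_medians(values, sizes, k, n_sets=100, seed=0):
    """Median of synthetic cluster k in each of n_sets shuffled sets."""
    rng = random.Random(seed)
    return [median(shuffled_clusters(values, sizes, rng)[k])
            for _ in range(n_sets)]


def empirical_p(true_value, null_values):
    """Assumed two-sided empirical p-value against a synthetic distribution."""
    center = median(null_values)
    distance = abs(true_value - center)
    extreme = sum(1 for v in null_values if abs(v - center) >= distance)
    return (1 + extreme) / (1 + len(null_values))
```

With 100 synthetic sets, the smallest p-value this estimator can return is 1/101, so it can only flag a true cluster as more extreme than every synthetic counterpart.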
We took a random draw of 1 of each of the 47 groups of 100 shuffled clusters and 1 of each of the 47 groups of 100 control clusters, and compared, for example, the distribution of 47 true cluster median ages to the distributions of these two groups of 47 synthetic clusters.

A key question is whether existing sets of clusters differ repeatedly from other clusters by combinations of clinical or demographic characteristics, either in addition to or in lieu of individual characteristics. To assess this, we conducted a rotated empirical orthogonal function (REOF) analysis [9]. This is similar to principal component analysis, extracting statistically independent "modes" of KD, wherein the first few explain a relatively large portion of the variance of the entire data sample. Varimax rotation was used to identify patterns of unusually high or low expression in the demographic or clinical characteristics in subsets of the clusters. Details of REOF generation are in Supplemental Methods.

The dataset contained 47 clusters consisting of 315 cases of Kawasaki disease (Figure 1). To determine if more cases of Kawasaki disease had their onset in clusters than would be expected at random, we used a Monte Carlo analysis to create 100 time series with an equivalent number of cases from all days in the study period, which would account for seasonality and diagnosis trends over time. We tallied the clusters in these time series according to the same cluster definition and calculated the distribution of clusters. Although a certain number of clusters of 5 or more cases in 7 days occurred at random, a larger number of clusters of 6-7 cases in 7 days occurred in the true data compared with the Monte Carlo simulation (Figure 1, E). This indicated that more cases of Kawasaki disease were included in clusters than would be expected by chance.
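A minimal sketch of the Monte Carlo comparison of cluster counts is given below. The text does not describe how the simulated dates were drawn, so this sketch assumes that each day carries a hypothetical weight (`day_weights`) encoding the expected trend and seasonality, and that a cluster is tallied by the peak 7-day density it reaches. Days are integer indices rather than calendar dates, and all names are illustrative.

```python
import random
from collections import Counter


def simulate_counts(day_weights, n_cases, rng):
    """Draw n_cases onset days with probability proportional to day_weights."""
    counts = [0] * len(day_weights)
    for day in rng.choices(range(len(day_weights)), weights=day_weights,
                           k=n_cases):
        counts[day] += 1
    return counts


def window_sums(counts, window=7):
    """Case totals for every run of `window` consecutive days."""
    total = sum(counts[:window])
    sums = [total]
    for i in range(window, len(counts)):
        total += counts[i] - counts[i - window]
        sums.append(total)
    return sums


def cluster_peaks(counts, window=7, min_cases=5):
    """Peak window density of each cluster (run of windows >= min_cases)."""
    peaks, current = [], 0
    for s in window_sums(counts, window):
        if s >= min_cases:
            current = max(current, s)
        elif current:
            peaks.append(current)
            current = 0
    if current:
        peaks.append(current)
    return peaks


def null_peak_tallies(day_weights, n_cases, n_sims=100, seed=0):
    """Tally clusters by peak density in each simulated time series."""
    rng = random.Random(seed)
    return [Counter(cluster_peaks(simulate_counts(day_weights, n_cases, rng)))
            for _ in range(n_sims)]
```

Comparing `Counter(cluster_peaks(observed_counts))` with the simulated tallies shows whether clusters with a given peak density occur more often in the observed series than in the simulations.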
However, when the 315 cluster cases of Kawasaki disease were grouped together and their demographic and clinical characteristics were compared with those of the 1,017 non-cluster cases of Kawasaki disease, no significant differences were noted (Table).

In the univariate analysis, cluster-level averages of demographic and clinical characteristics were compared between the set of 47 true KD clusters and 100 sets of 47 synthetic clusters of equal size created either by randomly shuffling membership of cluster cases (referred to as shuffled clusters) or by randomly creating clusters from non-cluster cases of Kawasaki disease within the same season as the true cluster (referred to as control clusters) (Figure 2; available at www.jpeds.com). Striking differences were noted between individual true clusters and synthetic clusters for the following variables: age, ESR, and the presence of enlarged lymph nodes or strawberry tongue. For all of the characteristics, the difference in the true cluster distribution versus either set of synthetic clusters was highly significant (Appendix Figure 1; available at www.jpeds.com).

We next analyzed the coherence of characteristics of cases within true clusters. We calculated the Intra-Cluster Correlation (ICC) coefficient, which shows the percentage of the variance that occurs between versus within clusters (Appendix Table; available at www.jpeds.com). In this analysis, lymph node, WBC, ESR, age, and PLT were significantly more similar within true clusters than between true clusters (p=0.007, 0.008, 0.009, 0.035, and 0.045, respectively). To help visualize the variation within and between clusters, individual cluster distributions for the true clusters are shown for several characteristics in Appendix Figure 2 (available at www.jpeds.com).
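To make the intra-cluster correlation concrete, the sketch below computes a one-way ANOVA estimator of the ICC for clusters of unequal size. The text does not state which ICC estimator or significance test was used, so this form, and the function name `icc_oneway`, are assumptions chosen because they are common for this kind of grouped data.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way ANOVA intra-cluster correlation for unequal cluster sizes.

    groups: list of lists, one list of patient-level values per cluster.
    Values near 1 mean most variance lies between clusters; values near 0
    (or below) mean patients within a cluster are no more alike than
    patients drawn from different clusters.
    """
    g = len(groups)
    n_total = sum(len(x) for x in groups)
    grand = sum(sum(x) for x in groups) / n_total
    means = [mean(x) for x in groups]
    # Between-cluster and within-cluster sums of squares
    ssb = sum(len(x) * (m - grand) ** 2 for x, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    msb = ssb / (g - 1)
    msw = ssw / (n_total - g)
    # Adjusted average cluster size for unbalanced designs
    k0 = (n_total - sum(len(x) ** 2 for x in groups) / n_total) / (g - 1)
    return (msb - msw) / (msb + (k0 - 1) * msw)
```

For example, `icc_oneway([[1, 1, 1], [5, 5, 5]])` returns 1.0, because all of the variance lies between the two clusters.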
In the multivariable analysis, the first four REOFs together explained 55% of the total variance of the true clusters: REOF1 explained 18%, REOF2 explained 13%, and REOF3 and REOF4 each explained 12%. For each of these four REOFs, the highest weightings on the 14 characteristics were identified to determine the defining characteristics presented in the true clusters (Figure 3, A). Several of the defining characteristics in the multivariable analysis also were identified as strongly expressed features in the univariate analysis.

For REOF1, the defining characteristics (the characteristics with a weight larger than one standard deviation) were ESR, CRP, and Zhemo. Clusters that loaded strongly (positively or negatively) onto this REOF had a higher than average ESR and CRP and a lower than average Zhemo, or had a lower than average ESR and CRP and a higher than average Zhemo (Appendix [...]). The prevalence of these defining characteristics can be seen in Figure 3, B, which compares how often this pattern appeared among REOFs obtained in a Monte Carlo sampling of the true clusters with how often it appeared among REOFs from the shuffled and control clusters. From the true set of Monte Carlo samples, an REOF was found with the defining characteristics ESR, CRP, and Zhemo almost 50% of the time, whereas from the shuffled clusters this relationship occurred less than 25% of the time, and from the control clusters it occurred less than 15% of the time. This indicated that the true clusters expressed a much stronger relationship amongst these three characteristics than would be expected by chance.

The defining characteristics of REOF2 were age, sex, and WBC, with age and fraction male presenting in the same direction and WBC in the opposite direction. Thus, some clusters had older patients with a higher percentage of males and lower white blood count, and other clusters had younger patients with a lower percentage of males and higher white blood count.
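The rotation step behind the REOFs can be sketched with Kaiser's pairwise varimax procedure. This is an illustration under stated assumptions, not the procedure from the Supplemental Methods: it starts from a loading matrix that is assumed to come from a prior principal component (EOF) decomposition, it applies raw varimax without Kaiser row normalization, and the helper `defining_characteristics` encodes one possible reading of "weight larger than one standard deviation" (absolute weight above the standard deviation of the weights within that mode).

```python
import math
from statistics import pstdev


def varimax(loadings, sweeps=50, tol=1e-10):
    """Raw varimax rotation by successive planar rotations of column pairs.

    loadings: list of rows (characteristics) by columns (modes).
    """
    L = [list(row) for row in loadings]
    n, m = len(L), len(L[0])
    for _ in range(sweeps):
        rotated = False
        for p in range(m - 1):
            for q in range(p + 1, m):
                u = [r[p] ** 2 - r[q] ** 2 for r in L]
                v = [2 * r[p] * r[q] for r in L]
                a, b = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the varimax criterion for this pair
                phi = 0.25 * math.atan2(d - 2 * a * b / n,
                                        c - (a * a - b * b) / n)
                if abs(phi) < tol:
                    continue
                cos_p, sin_p = math.cos(phi), math.sin(phi)
                for r in L:
                    x, y = r[p], r[q]
                    r[p] = x * cos_p + y * sin_p
                    r[q] = -x * sin_p + y * cos_p
                rotated = True
        if not rotated:
            break
    return L


def defining_characteristics(weights, names):
    """Names whose absolute weight exceeds one SD of the mode's weights."""
    sd = pstdev(weights)
    return [name for name, w in zip(names, weights) if abs(w) > sd]
```

The rotation is orthogonal, so the sum of squared loadings in each row is unchanged; only the way that variance is spread across the modes changes.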
Because REOF2 captured less variability in the overall cluster universe than REOF1, these defining characteristics were less prevalent than the REOF1 defining characteristics, but still occurred over 20% of the time in the true cluster population. As with REOF1, this relationship was much stronger in the true clusters than in the synthetic clusters: both shuffled clusters and control clusters produced this combination in only about 3% of the 3,000 Monte Carlo trials.

REOF3 described a relationship between age, lymph nodes, and alt_abnormal in one direction and PLT in the other. REOF4 described a relationship between lymph nodes, periungual peeling, and location of patient residence (positive latitude with negative longitude indicated clusters from the northwest portion of the catchment area). Although these REOFs were less prevalent than REOF1 and REOF2, the Monte Carlo trials yielded such patterns significantly more frequently in the true clusters than in either the control or shuffled clusters.

Historically, KD has been thought of as a monomorphic disease, with research efforts focused on these patients as a homogeneous group. However, our clinical experience in diagnosing and treating patients with Kawasaki disease has suggested that many of the "classic" features of KD are not universal and that patients with certain sub-phenotypes are not evenly distributed over time, even after accounting for seasonality. Our analysis focused on temporal KD clusters and examined the characteristics of cases within and across clusters, as well as non-cluster cases, in the RCHSD time series.

A series of tests indicated that individual clusters exhibited distinctive clinical and demographic characteristics. First, temporal clusters occurred more often than would be expected by chance.
Second, in assessing whether cases of Kawasaki disease within these true clusters were different from cases of Kawasaki disease not in clusters, we found that the overall central tendencies of cluster and non-cluster cases were similar (Table). Thus, it was not simply a question of clustered cases being different from non-clustered cases. Third, the data revealed important [...]

In the univariate analysis, ESR and CRP, both markers of systemic inflammation, were markedly elevated in certain clusters but not others. In the multivariable REOF analysis, it was further revealed that these measures of inflammation, when elevated, were more likely to be associated with low hemoglobin levels and vice versa (lower measures of inflammation associated with normal hemoglobin). Although this makes sense biologically, the important finding was that some clusters had this composite feature whereas other clusters did not.

With respect to clinical features, cases manifesting "strawberry tongue" or "lymph node first presentation" also clustered, as did cases without these clinical features (Figure 2). Strawberry tongue is a specific mucosal injury pattern that involves the sloughing of the cornified tips of the filiform papillae and is associated with bacterial toxin-mediated disease [11]. Overall in our time series, only 477/805 (59.3%) patients had this phenotype at the time of diagnosis, but strawberry tongue registered quite strongly as one of the important characteristics in true clusters, wherein its distribution differed from that in synthetic clusters (Appendix Figure 1, M). The lymph node first presentation suggests an exaggerated immune response in regional lymph nodes to an antigenic stimulus that enters through the mucosal surfaces of the posterior oropharynx. Children presenting with only fever and enlarged cervical lymph nodes during the first week of illness have been described and represented 32.6% of patients from our center [5]. The clustered presence or absence of these phenotypes in patients with Kawasaki disease differed significantly from the two groups of synthetic clusters, which might suggest that the patients with Kawasaki disease within these clusters were responding to different stimuli leading to these different clinical presentations. Similarly, laboratory evidence of inflammation was either high or low in different clusters, again suggesting a non-random distribution of these features.

In the REOF analysis, certain laboratory and clinical features were associated within the true clusters more often than expected by chance. Some of these statistical findings were consistent with our understanding of the biology of KD. For example, higher inflammatory markers (ESR and CRP) were linked to more pronounced anemia and vice versa. Similarly, in REOF5, higher Z max was linked to higher ESR and platelet count. However, other relationships, such as REOF3, which described a relationship between older age, enlarged lymph nodes, and elevated liver enzymes with lower platelet count, were unexpected. Thus, new relationships of features revealed by the analysis could be used to group patients for study of genetic susceptibility and molecular investigations of potential etiologic agents.

These observations suggest that different triggers or different intensity of exposures may operate to produce cases of Kawasaki disease that cluster temporally and share a similar response pattern. Important insight into the different etiologies of KD may be gained by focusing on patients who share the demographic and clinical phenotypes identified in these analyses.

Figure 1, E: Monte Carlo simulation drawing 100 time series of 1,332 randomly selected dates with the same time trend and seasonality as the true data (blue squares) and comparing that to the true data (black dots) shows more clusters of 6-7 cases in the true data than would be expected from a random distribution of established trends and seasonality.
Table (fragment):
Illness day of lab data: 5 (4-7); 6 (4-7)
Laboratory data
[...] of subjects with the finding/No. of subjects for whom the observation of the presence or absence of the finding was made
Abbreviations: LN: lymph node; WBC: white blood cell count; Z-hemoglobin: hemoglobin concentration normalized for age; ALT: alanine aminotransferase; ESR: erythrocyte sedimentation rate; CRP: C-reactive protein; NS: not significant

References:
1. A new infantile acute febrile mucocutaneous lymph node syndrome (MLNS) prevailing in Japan
2. The genetics of Kawasaki disease
3. Clustering and climate associations of Kawasaki Disease in San Diego County suggest environmental triggers
4. Hoarseness as a Presenting Sign in Children with Kawasaki Disease
5. Lymph-node-first presentation of Kawasaki disease compared with bacterial cervical adenitis and typical Kawasaki disease
6. Periungual desquamation in patients with Kawasaki disease
7. [...] Term Management of Kawasaki Disease: A Scientific Statement for Health Professionals From the American Heart Association
8. Kawasaki Disease Outcomes and Response to Therapy in a Multiethnic Community: A 10-Year Experience
9. Rotation of principal components
10. Clinical Characteristics of 58 Children With a Pediatric Inflammatory Multisystem Syndrome Temporally Associated With SARS-CoV-2