key: cord-0818616-3juj7uw4 authors: Kempker, Jordan A.; Martin, Greg S.; Rondina, Matthew T.; Cannon-Albright, Lisa A. title: Evidence for an Inherited Contribution to Sepsis Susceptibility Among a Cohort of U.S. Veterans date: 2022-01-11 journal: Crit Care Explor DOI: 10.1097/cce.0000000000000603 sha: 71401b1bfcf833eb89c2d1410241339a4d071c7b doc_id: 818616 cord_uid: 3juj7uw4

OBJECTIVES: Analyze a unique clinical and genealogical resource for evidence of familial clustering of sepsis to test for an inherited contribution to sepsis predisposition. DESIGN: Observational study. SETTING: Veteran’s Health Affairs (VHA) Genealogy/Phenotype resource, a U.S. genealogy database with veterans individually linked to VHA electronic health records. PATIENTS: Sepsis was identified using International Classification of Disease, 9th Edition and 10th Edition codes. There were two comparison groups: one composed of all veterans with linked data and deep genealogy, and the other composed of 1,000 sets of controls, each set randomly sampled from the entire cohort after matching on sex and 10-year birth year range on a 1:1 ratio with cases. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: There were 4,666 cases of sepsis from 2001 to 2018, of which 96% were male and 80% were greater than or equal to 65 years old. Utilizing the Genealogical Index of Familiality, there was a significant excess of pairwise relatedness among sepsis cases over that in the control sets sampled from the VHA population (p = 0.03). The relative risk (RR) of sepsis among identified relatives compared with the larger linked VHA cohort demonstrated an excess of sepsis cases in the first-degree (RR, 1.39; 95% CI, 1.03–1.92; p = 0.05) and second-degree (RR, 1.50; 95% CI, 1.07–2.17; p = 0.04) relatives that was not demonstrated in higher degree relatives. The sepsis cases clustered into 1,876 pedigrees, of which 628 had a significant excess of sepsis cases among the descendants (p < 0.05).
CONCLUSIONS: The data from this cohort of nearly all male U.S. veterans demonstrate evidence for contribution of an inherited predisposition to sepsis and the existence of pedigrees with a significant excess of diagnoses that provide a valuable resource for identification of the predisposition genes and variants responsible. This complements studies on individual genetic variants toward estimating the heritability patterns and clinical relevance of genetic sepsis predisposition.

Sepsis is defined as a dysregulated immune response to infection that leads to organ dysfunction, with mortality estimates ranging from 15% to 50% (1–3). Sepsis is a common problem, with estimates of annual incidence in high-income countries ranging from 80 to 300 cases per 100,000 adults and estimates in low- and middle-income countries incomplete but believed to be higher (2, 3). The overall attempt to understand sepsis in order to surveil, prevent, and treat can perhaps be simplified into two approaches: 1) sepsis as a myriad of analogous afflictions specified by pathogen, initial organ system infected, and subsequent organ systems affected; and 2) sepsis as a specific entity in and of itself defined by homologous pathways of immune dysregulation. Although these approaches can be complementary, an emphasis on sepsis as a common, identifiable pathway informs a research strategy to look for common causes identifiable in human biology. It is within this framework that we seek to examine the potential familial clustering of sepsis risk as evidence that may support its heritability. Although there has been a recent meta-analysis examining the associations of specific genetic variants with the risk of sepsis, in this work, we take a different approach in utilizing a U.S.
national genealogy dataset linked to detailed medical diagnosis data to define the familial clustering observed for sepsis and to test hypotheses of an inherited contribution to the development of sepsis-causative infection and related host responses (4, 5). This work adds to the existing rich literature on specific genetic variants associated with sepsis by potentially revealing patterns of inheritance and indicating whether, outside of specific genetic testing, a more easily obtained familial history could potentially contribute to the assessment of sepsis risk.

This study used the Veteran's Health Affairs (VHA) Genealogy/Phenotype resource, which represents a U.S. genealogy data resource that has been record-linked to nationwide medical data for the population of U.S. veterans utilizing the VHA system for medical care. This resource was created in part to allow analyses of evidence for a genetic contribution to health-related conditions (5, 6). The parent genealogy data resource extends to the early 1700s, includes individuals born in all 50 states, and currently represents approximately 71 million individuals with linked relatives, whereas the subcohort-linked VHA medical data date to the mid-1990s and include a growing total of 1.04 million VHA patients with demographic data who were record-linked to a unique individual in the genealogy data (6). For this analysis, we included the 273,227 of these VHA patients with linked genealogy data including at least both parents, all four grandparents, and two of eight great grandparents, allowing the identification of close and distant relatives. Once VHA patients were linked to the genealogy data, only identification numbers were used to identify cases and analyze clustering.

International Classification of Disease (ICD) codes with accompanying dates for all VHA patients with linked genealogy data were used to assign sepsis cases.
Given the complexity of using administrative codes to define sepsis cases, a priori, we designated a principal case definition utilizing explicit sepsis codes as well as two more liberal case definitions to be used in preplanned sensitivity analyses examining the robustness of results against different case definitions. The codes used in these three definitions are listed in Table 1, with the specific sepsis case definitions provided in Table 2.

The analyses described below require different comparison groups. For the Genealogical Index of Familiality (GIF) analyses, we selected 1,000 sets of controls from the entire population of VHA patients with linked deep genealogy. For each set, for each sepsis case, we randomly selected one VHA control patient who was matched by sex and 10-year birth year range. For the relative risk (RR) analyses, population rates were estimated using the entire population of 273,227 VHA patients with linked genealogy (at least eight immediate ancestors) and medical data.

Familial clustering was analyzed using several well-described methodologies that are summarized briefly here. The GIF was used to test for excess relatedness among the individuals diagnosed with sepsis compared with the expected relatedness of 1,000 similar groups of individuals from this population. This test was originally designed for use with the Utah genealogy data resource and has been used on earlier versions of the VHA resource (4–6). The relatedness of each pair of sepsis patients is measured with the Malécot coefficient of kinship, which is the probability that a homologous allele in the two individuals was identical by descent from a common ancestor. The GIF statistic for all sepsis cases is the mean of all pairwise kinships. The majority of pairs of cases are not related, resulting in many coefficients of 0; therefore, the statistic is multiplied by 10^5 for presentation.
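As an illustration of the GIF statistic just defined, the following is a minimal sketch rather than the authors' implementation. It computes kinship coefficients with the standard recursive rule on a small hypothetical pedigree and then averages them over all pairs of cases. The `PEDIGREE` table, the ID numbering, and the function names `kinship` and `gif` are assumptions made only for this example.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy pedigree: id -> (father, mother); None marks an unknown parent.
# IDs are assigned so that parents always have smaller IDs than their children.
PEDIGREE = {
    1: (None, None), 2: (None, None),  # founders
    3: (1, 2), 4: (1, 2),              # full siblings
    5: (None, None), 6: (None, None),  # unrelated spouses
    7: (3, 5), 8: (4, 6),              # first cousins
}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient between individuals a and b (0.0 if a parent is unknown)."""
    if a is None or b is None:
        return 0.0
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    if a < b:
        a, b = b, a  # always recurse through the parents of the larger (younger) ID
    father, mother = PEDIGREE[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))


def gif(case_ids):
    """Mean pairwise kinship over all pairs of cases, scaled by 10^5."""
    pairs = list(combinations(case_ids, 2))
    return 1e5 * sum(kinship(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(kinship(3, 4))      # full siblings
    print(kinship(7, 8))      # first cousins
    print(gif([3, 4, 7, 8]))  # GIF for a hypothetical set of four cases
```

In a real genealogy most pairs are unrelated and contribute 0 to the sum, which is why the scaling by 10^5 makes the statistic easier to read.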
A GIF statistic was calculated for 1,000 independent sets of matched VHA patient controls with genealogy data as described previously to estimate the average pairwise relatedness expected in the VHA patient population from which the sepsis cases were selected. The empirical significance of the GIF test is measured by comparison of the GIF statistic for the sepsis cases to the distribution of the 1,000 control GIF statistics.

In addition to this GIF comparison, we also calculated a distant GIF (dGIF) statistic that is calculated similarly to the GIF but only considers more distant relationships. This is performed by ignoring close pairwise relationships (first and second degree), among whom common environment and risk factors could hypothetically play a role in the observed familial clustering. Specifically, although the GIF test includes evidence for familial clustering that may not be due to inheritance, the dGIF test hypothetically reduces the impact of such environmental contributions and provides more robust evidence for an inherited contribution to familial clustering. The contributions to the GIF statistic from various genetic relationships observed among cases and controls can be visualized by pairwise genetic distance and can be compared for cases and the 1,000 control sets (Fig. 1): a pairwise genetic distance of 1 represents a parent/offspring relationship; a distance of 2 represents siblings, half siblings, or grandparent/grandchild; a distance of 3 represents avunculars or great grandparent/great grandchild; a distance of 4 primarily represents first cousins; a distance of 6 primarily represents second cousins; and so forth.

We also estimated RRs of sepsis in relatives as a more traditional test for a genetic contribution to a phenotype. To estimate RRs, we first estimated the cohort-specific rate of sepsis in the entire population of all VHA patients with linked genealogy.
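To make the empirical GIF test described above concrete, the following is a minimal sketch under stated assumptions, not the authors' code. It draws one matched control per case by sex and birth decade and computes a one-sided empirical p value as the fraction of control statistics at least as large as the case statistic. Binning birth years by calendar decade, the one-sided definition of the p value, and the names `stratum`, `draw_matched_set`, and `empirical_p` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratum(sex, birth_year):
    """Matching stratum: sex plus the calendar decade of birth (an assumed binning)."""
    return (sex, birth_year // 10 * 10)


def draw_matched_set(case_ids, population, rng):
    """Draw one random control per case from the same sex/birth-decade stratum.

    population maps id -> (sex, birth_year). Controls are drawn from the whole
    population, so a case may be selected as a control.
    """
    by_stratum = defaultdict(list)
    for pid, (sex, year) in population.items():
        by_stratum[stratum(sex, year)].append(pid)
    return [rng.choice(by_stratum[stratum(*population[c])]) for c in case_ids]


def empirical_p(case_stat, control_stats):
    """One-sided empirical p: share of control statistics >= the case statistic."""
    exceed = sum(1 for s in control_stats if s >= case_stat)
    return exceed / len(control_stats)


if __name__ == "__main__":
    # Hypothetical population and cases, for illustration only.
    population = {1: ("M", 1941), 2: ("M", 1948), 3: ("F", 1952), 4: ("F", 1955)}
    rng = random.Random(0)
    print(draw_matched_set([1, 3], population, rng))
    print(empirical_p(3.0, [1.0, 2.0, 3.0, 4.0]))
```

In the study design, the statistic passed to `empirical_p` would be the GIF (or dGIF) of the cases, and `control_stats` would hold the same statistic computed on each of the 1,000 matched control sets.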
All 273,227 of the VHA patients with deep genealogy were assigned to a sex and 10-year birth year range cohort. Cohort-specific rates of sepsis were estimated by dividing the total number of sepsis cases in each cohort by the total number of VHA patients with deep genealogy in the cohort. The RRs for sepsis for each type of relative were estimated as the observed number of sepsis cases in the set of relatives, divided by the expected number of sepsis cases in the set of relatives. The expected number of sepsis cases in the set of relatives was calculated by summing the cohort-specific rate of sepsis for all relatives. CIs and significance were estimated as described in Agresti (7).

Finally, we constructed high-risk sepsis pedigrees by analysis of ancestral vectors for all VHA sepsis patients; this allowed the identification of all clusters of two or more related sepsis patients with a common ancestor. The subset of such pedigrees that exhibit a significant excess of descendants diagnosed with sepsis over that expected represents high-risk sepsis pedigrees. For each sepsis pedigree identified, the observed number of sepsis cases among the VHA-linked descendants was compared with the expected number of sepsis cases among the VHA-linked descendants (estimated as described above using cohort-specific VHA sepsis rates applied to VHA-linked descendants). Those pedigrees with a significant excess of sepsis cases among the descendants (p < 0.05) were termed high risk.

The University of Utah Institutional Review Board reviewed and approved this study (approval number 00031242).

There were 4,666 patients with sepsis in the VHA data linked with deep genealogy for analyses. The sepsis cases spanned 2001 to 2018 and were predominantly male (96%) and 65 years old or older (80%). Table 3 shows the results of the GIF analysis for excess familial clustering.
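The observed-over-expected RR calculation described in the methods can be illustrated with a minimal sketch, assuming each patient is already assigned to a sex and birth-cohort key. It is not the authors' implementation, and the CI and significance step attributed to Agresti is deliberately left out. The names `cohort_rates` and `relative_risk` and the toy data are hypothetical.

```python
from collections import Counter


def cohort_rates(population, case_ids):
    """Cohort-specific sepsis rate: cases in the cohort / patients in the cohort.

    population maps id -> cohort key, e.g. (sex, birth-decade).
    """
    totals, cases = Counter(), Counter()
    for pid, key in population.items():
        totals[key] += 1
        if pid in case_ids:
            cases[key] += 1
    return {key: cases[key] / totals[key] for key in totals}


def relative_risk(relatives, population, case_ids, rates):
    """Observed cases among relatives divided by the expected number.

    The expected number is the sum of each relative's cohort-specific rate.
    """
    observed = sum(1 for r in relatives if r in case_ids)
    expected = sum(rates[population[r]] for r in relatives)
    return observed / expected


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    population = {
        1: ("M", 1940), 2: ("M", 1940), 3: ("M", 1940), 4: ("M", 1940),
        5: ("F", 1950), 6: ("F", 1950),
    }
    case_ids = {1, 5}
    rates = cohort_rates(population, case_ids)
    print(rates)
    print(relative_risk([1, 2, 6], population, case_ids, rates))
```

The same observed-versus-expected comparison, applied to the VHA-linked descendants of a common ancestor instead of to a class of relatives, corresponds to the pedigree screen described in the methods.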
The VHA patients diagnosed with sepsis had significantly higher pairwise relatedness than expected in the VHA-linked population (GIF p = 0.028). Although the mean pairwise relatedness was also higher for cases than matched controls when close relationships (first and second degree) were ignored, the difference was not significant (dGIF p = 0.217). In our preplanned sensitivity analyses, where two more liberal definitions of sepsis were used, similar results were observed (Table 3).

Figure 1 shows the graphical representation of the contribution to the GIF statistic by pairwise genetic distance for case pairs compared with control pairs for the principal sepsis definition. The histogram for the controls shows the expected distribution of the familial clustering of individuals who are similar to the VHA sepsis cases (matched for sex and birth year) from this population of VHA patients with genealogy data available. In the absence of excess familial clustering of the sepsis cases, the histogram for the sepsis cases would align with that for the controls. Instead, an excess of pairwise relationships is observed for cases over controls for most genetic distances, and the statistical comparison of this relatedness concluded a significant excess for cases (p = 0.028). Of note, because sepsis phenotypes are only available in a narrow generational time window for which data were available (from the mid-1990s to 2016), close relationships that cross generations (e.g., great grandfather/great grandson) are not as commonly observed as those in the same generation in the dataset (e.g., second cousins).

Table 4 shows the estimated observed versus expected RRs for sepsis in the first- to fifth-degree relatives of VHA sepsis patients who are VHA patients as well.
For our principal sepsis case definition, significantly elevated risks were observed for sepsis among the first- and second-degree relatives, but no significant risk differences were observed for the third-, fourth-, and fifth-degree relatives. In our preplanned sensitivity analyses, where two more liberal definitions of sepsis were used, there is a similar pattern of RR results, although significantly elevated risk was observed only for the fourth-degree relatives for both sensitivity groups examined.

The 4,666 VHA patients diagnosed with sepsis cluster into 1,876 clusters/pedigrees, each including between two and 12 related sepsis cases descending from the same ancestor. None of the 1,876 sepsis pedigrees completely overlaps, but some sepsis cases may be members of more than one pedigree through different ancestors. Of the 1,876 sepsis pedigrees identified, 628 have a significant excess of sepsis cases among the descendants (p < 0.05). Figure 2 shows an example of a mid-sized, high-risk VHA sepsis pedigree.

In this analysis, we used a large database of U.S. genealogy linked to VHA health records, identifying 4,666 VHA patients diagnosed with sepsis whose deep ancestry is also known. The data demonstrate a familial clustering of sepsis cases with three types of analyses. First, the GIF tests demonstrated an excess of relatedness of sepsis cases when compared with randomly selected controls. Second, the rates of sepsis among the first- and second-degree relatives were higher than sex- and birth year-specific rates for the linked VHA population examined. Third, we identified a substantial number of pedigrees with a significant excess of sepsis cases among the descendants. These three analyses are complementary and, combined, provide strong evidence for a shared genetic component predisposing to infection and/or response to infection.

This analysis has both limitations and strengths. Regarding limitations, the resource represents a population of U.S.
veterans with available genealogy data, which was primarily male, older, and of Northern European ancestry (a result of reduced availability of translated genealogy data for other populations). Race/ethnicity was not available for the VHA patients for this study. Data for race/ethnicity and other social determinants would be a valuable addition to the VHA genealogy resource and might be possible, for example, from combination with the Million Veteran Program, which is collecting additional data and generating genetic data for approximately 1 million VHA patients. Additional limitations include that the genealogy data have some censoring: the entire U.S. population is not yet represented in the genealogy, and it must be noted that genealogy data do not always represent biological relationships. The sepsis phenotype data must also be considered censored; diagnoses in relatives who were not veterans or who did not use the VHA health system, diagnoses in relatives whose genealogy data were not linked, diagnoses that were not noted in the medical record, and diagnoses before the VHA data collection began would all be censored. Although this censoring is not expected to result in biased estimates, the RR estimates likely represent underestimates of risk. Finally, we relied on ICD coding to identify sepsis cases, a method that is specific but not sensitive when compared with clinical data (3). Although speculative, we would hypothesize that such missed cases did not differ between cases and controls and, therefore, introduced random error that would tend to bias the results toward the null value.

There are also many strengths to this resource and analysis. The extensive genealogy data and the identification of sepsis cases across the national VHA system allowed identification of both close and distant relationships among cases. Recall and ascertainment biases typical of most studies of disease risk in relatives were absent here.
This resource and a similar resource representing the genealogy of Utah, as well as the analysis methods used, have been well validated, with numerous analyses of health-related phenotypes and identification of high-risk pedigrees already leading to predisposition gene identifications for some disorders.

In the context of these strengths and limitations of the dataset, we interpret these findings as robust evidence of familial clustering of sepsis. The evidence was strongest among close relatives. Specifically, our GIF statistic demonstrated excess relatedness among sepsis cases, whereas the dGIF, which excludes close relatives, only demonstrated a nonsignificant trend toward excess relatedness. Again, in our RR analyses, the data demonstrated excess risk among the first- and second-degree relatives. This clustering can be due to shared environmental factors, genetic factors, or a combination of both. In the existing literature, there is evidence for both of these mechanisms, with known genetic variants as well as social determinants of health associated with sepsis risk.

In regard to the genetic variants associated with sepsis risk, there is a 2019 meta-analysis that summarizes the current literature (8). In summary, the authors performed 204 meta-analyses on 76 genetic variants, identifying 29 variants of 23 genes associated with sepsis risk. These variants involved genes for pathogen-recognition receptor molecules, pro- and anti-inflammatory cytokines, and other immune-related molecules (8). A similar finding was demonstrated in a more recent study in the Norwegian population (9). Additionally, an older article examining risks in biological and adoptive children supported heritability of the risk of death from infection (10). In regard to social determinants of sepsis, there are several studies that have demonstrated links between socioeconomic status and infections, and we conducted a large U.S.
longitudinal study demonstrating that socioeconomic factors were among some of the strongest risk factors for septicemia death (11–15). This link has been further highlighted during the current COVID-19 pandemic with data demonstrating higher case and fatality rates associated with areas of lower socioeconomic status (16).

Overall, with the totality of current knowledge about sepsis, it is most likely that causality is a complicated web including genetic, socioeconomic, and behavioral risk factors. This is to be expected in a condition with a complicated chain of events starting with a myriad of diverse pathogens contagious through different pathways that can infect various organ systems and cause multiple patterns of distant organ dysfunctions through dysregulated systemic inflammatory responses. Although this web will not be untangled by one study, our current analysis adds to the expanding literature regarding sepsis risk by demonstrating that there is familial clustering, which can be another step toward identifying more specific causal pathways.

The identification of numerous pedigrees exhibiting a significant excess of sepsis among the descendants provides a powerful and informative resource for predisposition gene identification that would be complementary to genome-wide association studies. Studies of high-risk pedigrees recruited in the similar Utah genealogy resource have allowed the identification of multiple disease predisposition genes and variants (17–27). Identification of genes responsible for sepsis could allow for significantly reduced morbidity and mortality and provide knowledge to improve screening, prevention, and treatment.

Our study provides robust evidence for familial clustering of sepsis.
This describes important foundational information and identifies a powerful resource of high-risk pedigrees for further investigations into the genetic variants responsible and untangling their inheritance patterns in order to discover information that may help in the surveillance, prevention, and early treatment of sepsis in particular risk groups.

1. The third international consensus definitions for sepsis and septic shock (Sepsis-3)
2. The changing epidemiology and definitions of sepsis
3. CDC Prevention Epicenter Program: Incidence and trends of sepsis in US hospitals using clinical vs claims data
4. Utah family-based analysis: Past, present and future
5. Creation of a national resource with linked genealogy and phenotypic data: The Veterans Genealogy Project
6. Population genealogy resource shows evidence of familial clustering for Alzheimer disease
7. Categorical Data Analysis
8. Host genetic variants in sepsis risk: A field synopsis and meta-analysis
9. Genome-wide linkage analysis of the risk of contracting a bloodstream infection in 47 pedigrees followed for 23 years assembled from a population-based cohort (the HUNT Study)
10. Genetic and environmental influences on premature death in adult adoptees
11. Risk factors for septicemia deaths and disparities in a longitudinal US cohort
12. The relationship between census tract poverty and Shiga toxin-producing E. coli risk, analysis of FoodNet data
13. Socioeconomic and racial disparities of pediatric invasive pneumococcal disease after the introduction of the 7-valent pneumococcal conjugate vaccine
14. Changing disparities in invasive pneumococcal disease by socioeconomic status and race/ethnicity in Connecticut
15. Active Bacterial Core Surveillance/Emerging Infections Program Network: Socioeconomic and racial/ethnic disparities in the incidence of bacteremic pneumonia among US adults
16. Association between state-level income inequality and COVID-19 cases and mortality in the USA
17. Mapping of Alport syndrome to the long arm of the X chromosome
18. Analysis of the p16 gene (CDKN2) as a candidate for the chromosome 9p melanoma susceptibility locus
19. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1
20. Alzheimer's Disease Genetics Consortium: Identification and genomic analysis of pedigrees with exceptional longevity identifies candidate rare variants
21. Alzheimer's Disease Sequencing Project: Association of rare coding mutations with Alzheimer disease and other dementias among adults of European ancestry
22. Alzheimer's Disease Neuroimaging Initiative: Linkage, whole genome sequence, and biological data implicate variants in RAB10 in Alzheimer's disease resilience
23. A nonsynonymous variant in the GOLM1 gene in cutaneous malignant melanoma
24. Localization of a breast cancer susceptibility gene, BRCA2, to chromosome 13q12-13
25. An intronic variant in the CELF4 gene is associated with risk for colorectal cancer
26. A role for the MEGF6 gene in predisposition to osteoporosis
27. A novel ribosomal protein S20 variant in a family with unexplained colorectal cancer and polyposis