key: cord-0254275-qkl8ndif authors: Bengtsson, Rebecca J.; Simpkin, Adam J.; Pulford, Caisey V.; Low, Ross; Rasko, David A.; Rigden, Daniel J.; Hall, Neil; Barry, Eileen M.; Tennant, Sharon M.; Baker, Kate S. title: Informing shigellosis prevention and control through pathogen genomics date: 2021-06-09 journal: bioRxiv DOI: 10.1101/2021.06.09.447709 sha: 6d056190b1e5b4392525cdff87e33a49bf692905 doc_id: 254275 cord_uid: qkl8ndif Shigella spp. are the leading bacterial cause of severe childhood diarrhoea in low- and middle-income countries (LMIC), are increasingly antimicrobial resistant and have no licensed vaccine. We performed genomic analyses of 1246 systematically collected shigellae from seven LMIC to inform control and identify factors that could limit the effectiveness of current approaches. We found that S. sonnei contributes ≥20-fold more disease than other Shigella species relative to its genomic diversity and highlight existing diversity and adaptative capacity among S. flexneri that may generate vaccine escape variants in <6 months. Furthermore, we show convergent evolution of resistance against the current recommended antimicrobial among shigellae. This demonstrates the urgent need to integrate existing genomic diversity into vaccine and treatment plans for Shigella, and other pathogens. Shigellosis is a diarrhoeal disease responsible for approximately 212,000 annual deaths and 38 accounting for 13.2% of all diarrhoeal deaths globally (1). The Global Enteric Multicenter Study (GEMS) 39 was a large case-control study conducted between 2007 and 2011, investigating the aetiology and burden 40 of moderate-to-severe diarrhoea (MSD) in children less than five years old in low-and middle-income 41 countries (LMICs) (2). GEMS revealed shigellosis as the leading bacterial cause of diarrhoeal illness in 42 children, who represent a major target group for vaccination (3). The aetiological agents are Shigella, a 43 Gram-negative genus comprised of S. flexneri, S. sonnei, S. boydii and S. dysenteriae, with the former two 44 serotypes causing the majority (90%) of attributable shigellosis in children in LMICs (3). Currently, the 45 disease is primarily managed through supportive care and antimicrobial therapy. However, there has been 46 an increase in antimicrobial resistance (AMR) among Shigella (4). Particularly concerning is the rise of 47 resistance against the fluoroquinolone antimicrobial ciprofloxacin, the current World Health Organisation 48 (WHO) recommended treatment, such that fluoroquinolone-resistant (FQR) Shigella is one of a dozen 49 pathogens for which WHO notes new antimicrobial therapies are urgently needed (5). The high disease 50 burden and increasing AMR of Shigella call for improvements in treatment and management options for 51 shigellosis, and significant momentum has built to rise to this challenge. 52 However, there is still no licenced vaccine available for Shigella and one of the main challenges in 53 its development is the considerable genomic and phenotypic diversity of the organisms (6). The distinct 54 lipopolysaccharide O-antigen structures of Shigella determine its serotype and is responsible for conferring 55 of shigellosis (i.e., isolates derived from case patients rather than from controls, as defined in GEMS). The 119 dominant genomic subtype was PG3, which comprised the majority (47%, 378/806) of total isolates, as 120 well as case (50%, 341/687) isolates, with some regional variation (Fig. 2) . This resulted in an increased 121 odds of cases (OR = 2.3, 95% CI = 1.5-3.6, p = 0.0001) for PG3 compared with other genomic subtypes 122 (PGs and Sf6) (methods, table S3). The association of cases with the dominant serotype, S. flexneri serotype 123 2a (accounting for 29% (234/806) of total isolates and 31% (210/687) of case isolates) also resulted in an 124 increased odds of cases (OR = 1.9, 95% CI = 1.7-3.2, p = 0.0099) (table S3). But the higher prevalence and 125 larger effect size of PG3 relative to serotype 2a on case status offers compelling evidence that targeting 126 vaccination by phylogroup might offer broader coverage per licenced vaccine relative to, or in combination 127 with, a serotype-specific approach. 128 129 The development of serotype-targeted vaccines is complicated by the diversity and distribution of 132 serotypes, which are heterogenous over time and place (8, 19, 26, 27) . Furthermore, genetic determinants 133 of O-antigen modification are often encoded on mobile genetic elements (28, 29) that can move horizontally 134 among bacterial populations, causing the recognised, but poorly quantified phenomenon of serotype 135 switching (20, 27, 28) , which may result in the rapid escape of infection induced immunity against 136 homologous serotypes. For our analyses of serotype switching, we focused on S. flexneri owing to high 137 disease burden and serotypic diversity. Phenotypic serotyping data were overlaid onto the phylogeny and 138 revealed that while generally there was a strong association of genotype (i.e. PG/Sf6) with serotype 139 (Fisher's exact test; p<2. 20E-16) , multiple serotypes were observed for each genotype (Fig. 3 ). The greatest 140 serotype diversity was observed in PG3, comprised of seven distinct serotypes and two subserotypes. 141 Correlation of serotypic diversity (number of serotypes) and genomic diversity (maximum pairwise SNP 142 distance within genotype) revealed no evidence for an association, but a significant positive correlation of 143 serotypic diversity with the number of isolates in each genotype was found ( fig. S6 ), indicating that serotype 144 diversity scales with prevalence. 145 To qualitatively and quantitatively determine serotype switching across S. flexneri, we examined 146 the number of switches occurring within each genotype. A switching event was inferred when a serotype 147 emerged (either as a singleton or monophyletic clade) that was distinct from the majority (>65%) serotype 148 within a genotype ( Fig. 3 and fig. S7 ). PG6 was excluded from the analysis, as only three isolates from 149 GEMS belonged to this genotype and a dominant serotype could not be inferred. Quantitatively, this 150 revealed serotype switching was infrequent, with only 26 independent switches (3.3% of isolates) identified 151 across the five S. flexneri genotypes. Although the frequency of switching varied across the genotypes, 152 statistical support for an association of serotype switching with genotype fell short of significance (Fisher's 153 exact test; p = 0.09). Qualitatively, the majority (22/26) of switching resulted in a change of serotype, with 154 few (4/26) resulting in a change of subserotype. Examination of O-antigen modification genes revealed that 155 serotype switching was facilitated by changes in the composition of phage-encoded gtr and oac genes in 156 the genomes, as well as point mutations in these genes (table S4) . Our data also revealed that few (4/26) 157 switching events resulted in more than two descendant isolates ( fig. S7 ). This indicates that while natural 158 immunity drives the fixation of relatively few serotype-switched variants in the short term, the potential 159 pool of variants that could be driven to fixation by vaccine-induced selective pressure following a serotype-160 targeted vaccination program is much larger. 161 In order to estimate the likely timeframe over which serotype switching events might be expected 162 to occur, we estimated the divergence time of the phylogenetic branch giving rise to each switching event. 163 To streamline the analysis, we focused on two subclades of PG3, the most prevalent phylogroup, in which 164 seven independent serotype switching events were detected ( fig. S8) have demonstrated protection in animal models (Table 1) . First, we assessed the distribution of the 180 candidates among GEMS Shigella isolates which revealed that the proportional presence of antigens varied 181 across species and with genetic context. Specifically, genes encoded on the virulence plasmid (ipaB, ipaC, 182 ipaD, icsP) were present in >85% of genomes for each species with the exception of S. sonnei ( fig. S9 ). 183 The low proportion (≤5%) of virulence plasmid encoded genes detected among S. sonnei was caused by a 184 similarly low detection of the virulence plasmid among S. sonnei (6%), which likely arose due to loss during 185 sub-culture (32). In contrast, the chromosomally encoded ompA was present in >98% of all isolates, while 186 the sigA gene (carried on the chromosomally integrated SHI-1 pathogenicity island (17)) was present in 187 99% of S. sonnei genomes, but only 63% of S. flexneri genomes. Notably, among S. flexneri genomes, the 188 sigA gene was exclusively found in PG3 and Sf6, and present in >96% of isolates in each genotype) ( fig. 189 S2), indicating an appropriate distribution for targeting the two genotypes. Second, we assessed the antigens 190 for amino acid variation and modelled the likely impact of detected variants, as antigen variation may also 191 lead to vaccine escape, as demonstrated for the P1 variant of 34) . We determined the 192 distribution of pairwise amino acid (aa) sequence identities per antigen against S. flexneri vaccine strains 193 for each species (methods). Overall, sequence identities were >90% but varied with antigen ( fig. S9 ). For 194 example, OmpA was present in the highest proportion of genomes, but showed ~5% sequence divergence, 195 while SigA was present in fewer genomes, but exhibited little divergence (<0.5%) among species. The least 196 conserved sequence was IpaD, ranging from 3 to 7% divergence within species. 197 Not all antigenic variation will affect antibody binding, so we performed in silico analyses of the 198 detected variants to assess whether they may compromise the antigens as vaccine targets. Again, we focused 199 our analyses on S. flexneri owing to its high disease burden and the likely complication of serotype-based 200 vaccination strategies for this species. We detected 121 variants across the six antigens, the majority (79%) 201 of which correlated with genotype (i.e. belonging to either PGs 1-5 or Sf6, fig. S11 ). We then determined 202 if amino acid variants were located in immunogenic regions (i.e. epitope/peptide fragment) (fig. S10) and 203 assessed their potential destabilization of protein structure through in silico protein modelling. For IpaB, 204 IpaC and IpaD, the epitopes have been empirically determined (35, 36). The sequence and location of 205 peptide fragments of SigA, IcsP and OmpA used in vaccine development are available (37, 38). Variants 206 located within the immunogenic regions were identified for all antigens relative to PG3 reference sequences 207 (methods, Fig. 4 ). Only 4 of 121 variants were predicted to be highly destabilising to protein structure, and 208 these occurred in: OmpA (residue 89) at a periplasmic turn, SigA (residues 1233 and 1271) in adjacent 209 extracellular turns in the translocator domain (fig. S12), and in IpaD (residue 247) within a beta-turn-beta 210 motif flanking the intramolecular coiled-coil (Fig. 4) . While it remains possible that these mutations could 211 affect antigenicity through the disruption of folding or global stability, it is less likely than if they occurred 212 in immunogenic regions. These results thus indicate that it is less likely that existing natural variation will 213 compromise antigen-based vaccine candidates for Shigella compared with serotype-based vaccines. 214 However, our approach is limited and the knowledge base incomplete. For example, there was no suitable 215 template available for IpaC, and some epitopes were predicted to be in membrane regions which should be 216 inaccessible to antibodies, indicating the need for more accurate publicly available protein structures to be 217 developed for many of the vaccine antigen candidates. Overall, a high frequency of AMR genes conferring resistance against aminoglycoside, 236 tetracycline, trimethoprim, and sulphonamide antimicrobials was observed, while resistance against other 237 antimicrobial classes varied with region and species (Fig. 5B ). The extended spectrum beta-lactamase gene 238 blaCTX-M-15 was detected in a small (9/1246) percentage of isolates, and genes conferring resistance to 239 macrolides and lincosamides were also infrequent (fig. S13), indicating that the recommended second-line 240 treatments likely remain effective antimicrobials (43). 241 However, higher rates of resistance were found against the first-line treatment. FQR in Shigella can 242 be conferred through the acquisition of FQR-genes or, more typically, by point mutations in the 243 chromosomal Quinolone Resistance Determining Region (QRDR) within the DNA gyrase (gryA) and the 244 topoisomerase IV (parC) genes. Single and double QRDR mutations are known to confer reduced 245 susceptibility to ciprofloxacin and are evolutionary intermediates on the path to resistance, conferred by 246 triple mutations in this region (41, 44). Overall, FQR-genes were uncommon in S. flexneri (4%, 33/806), S. 247 sonnei (1%, 3/305) and S. dysenteriae (7%, 4/60), but were present in 32% (24/75) of S. boydii. QRDR 248 mutations were identified in all species (fig. S13), but were more common among S. sonnei (65%, 199/305) 249 and S. flexneri (54%, 435/806) than compared with S. boydii (15%, 11/75) and S. dysenteriae (30%, 18/60). 250 Among these, triple QRDR mutations were identified in 13% (106/806) of S. flexneri and 14% (44/305) of 251 S. sonnei. Analysis of the QRDR mutants across the phylogenies indicate marked convergent evolution 252 toward resistance across the genus. Specifically, all triple QRDR mutant S. sonnei belonged to one 253 monophyletic subtype (previously described as globally emerging from Southeast Asia (45)), while three 254 distinct triple QRDR mutational profiles were found across three polyphyletic S. flexneri genotypes (Fig. 255 5C). Thus, the polyphyletic distribution of single, double, and triple QRDR mutants indicates continued 256 convergent evolution of lineages with reduced susceptibility or resistant to FQR. 257 We then stratified the dataset by geographic region which revealed that FQR were largely 258 associated with isolates from Asia where fluoroquinolones are more frequently used compared to African 259 sites ( Fig. 5B ) (46), which is consistent with trends observed in atypical enteropathogenic Escherichia coli 260 isolated from GEMS (46). Our analyses thus suggest that for the period of GEMS trial (2007 -2011), 17% 261 (150/881) of Shigella isolates from Asia were resistant and 58% (508/881) had reduced susceptibility to the 262 WHO recommended antimicrobial. The high level of reduced susceptibility together with marked 263 convergent evolution toward resistance suggests that management of shigellosis with fluroquinolones at 264 these sites may soon be ineffective and regional antimicrobial treatment guidelines may require updating. 265 These results indicate the value of AMR and microbiological surveillance in LMICs and the control and 266 management of shigellosis will be improved by initiatives such as the Africa Pathogen Genomics Initiative 267 (47) and the WHO Global Antimicrobial Resistance Surveillance System (48). 268 Pathogen genomics is a powerful tool that has a wide range of applications to help combat 271 infectious diseases. Here, we have applied this tool to an unparalleled systematically collected Shigella 272 dataset to characterise the relevant population diversity of this pathogen across LMICs in a pre-vaccine era. 273 Our results revealed that current antimicrobial treatment guidelines for shigellosis should be updated, and 274 that improved surveillance will be essential to guide antimicrobial stewardship. This study has also 275 highlighted the urgent need to continue the development of Shigella vaccines for children in endemic areas. Sequencing reads for isolates sequenced using the LITE pipeline and re-sequenced at CGR were combined 318 to increase overall sample depth of coverage. Sequence variants were identified against reference using 319 SAMtools v1.9-47 mpileup and bcftools v1.9-80 (53). Low quality SNPs were filtered if mapping quality 320 <60, Phred-scaled quality score <30 and read depth <4. 321 322 Filtered SNP variants were used to generate a reference-based pseudogenome for each sample, 325 where regions with depth of coverage >4x were masked in the pseudogenome. Additionally, regions 326 containing phage (identified using PHASTER (55)) and insertion sequences were identified from the 327 reference genomes, and co-ordinates were used to mask these sites on the pseudogenomes using BEDTools 328 v2.28.0 maskfasta (56). For each species, chromosome sequences from the masked pseudogenomes were 329 extracted and concatenated. Gubbins v2.3.4 (57) was used to remove regions of recombination and invariant 330 sites from the concatenated pseudogenomes. This generated a chromosomal SNP alignment length of 331 78,251 bp for S. flexneri (n=806), 5,081 bp for S. sonnei (n=305), 98,842 bp for S. boydii (n=75) and 45,031 332 bp for S. dysenteriae (n=60). Maximum-likelihood phylogenetic reconstruction was performed 333 independently for each species and inferred with IQ-TREE v2.0-rc2 (58) To measure the extent of shigella genomic diversity among GEMS population, pairwise SNP 343 distance was determined from the alignment of core genome SNPs identified outside regions of 344 recombination using snp-dists v0.7.0 (https://github.com/tseemann/snp-dists). For each species, the 345 genomic diversity, measured by SNPs per kbp, was determined by dividing the core genome SNP alignment The pangenome of each species was defined using Roary v3.12.0 (68) without splitting paralogues. 388 The pangenome accumulation curves were generated separately for each species using the specaccum 389 function from Vegan v2.5-7 (https://github.com/vegandevs/vegan/), with 100 permutations and random Medicine (Baltimore, Maryland), serotyping was performed as previously describe (19). In silico serotyping 397 of S. flexneri genomes was performed using ShigaTyper v1.0.6 (69) which detects the presence of serotype-398 determining genetic elements from sequencing reads to predict serotype. ShigaTyper predictions were 84% 399 concordant to the serotype data provided. SRST2 v2 (70) To determine the presence of antigen vaccine candidates among GEMS Shigella isolates, genes of 405 the antigen vaccine candidates was screened against draft genome assemblies using screen_assembly (17) 406 with a threshold of ≥80% identity and ≥70% coverage to the reference sequence. Reference sequences for 407 ipaB, ipaC, ipaD and icsP were derived from S. flexneri 5a strain M90T (accession GCA_004799585) and In order to assess the effect of point mutations on protein stability and vaccine escape, six antigen 417 candidates from S. flexneri PG3 were modelled: OmpA, SigA, IcsP, IpaB, IpaC and IpaD (Table 1) . PG3 418 was selected as it is the most prevalent phylogroup and is therefore the target of current vaccine 419 development. To model the antigen targets, we first searched for a suitable template using HHPred (73, 74) . 420 Five of the six proteins (OmpA, SigA, IcsP, IpaB and IpaD) had suitable homologues available. To improve 421 the performance of the comparative modelling, the signal peptides for OmpA, SigA and IcsP were removed 422 and OmpA, SigA and IpaB were modelled in two parts to make use of optimal templates. RosettaCM (75) The table below the plots 492 demonstrates for each species the genomic diversity (as measure by total number of SNPs per kbp 493 [methods]), the contribution to GEMS shigellosis burden and the shigellosis burden relative to genomic 494 diversity. For S. sonnei, the genomic diversity and shigellosis burden relative to genomic diversity that was 495 calculated excluding the two outliers are shown in bracket. 496 phylogenetic tree of S. flexneri genomes identified six distinct genomic subtypes, each highlighted in a 499 different colour according to the inlaid key displayed above the tree. The bar plot below the tree 500 demonstrates the relative frequencies of the subtypes at each study site. 501 Frequencies of AMR genotypic profiles among Shigella spp. Each point in the scatterplot represents a 516 unique AMR genotype profile: the proportion of isolates with a particular profile is displayed along the y-517 axis. Profiles identified in only a single isolate are not displayed. MDR genotypic profile conferring 518 resistance or reduced suppressibility to three or more drug classes are highlighted in pink, and normal AMR 519 genotype profile conferring resistance or reduced suppressibility in fewer than three drug classes are in 520 grey. Numbers displayed above the plot represents the number of AMR genotype profiles plotted for each 521 species. (B) Detection of known AMR genetic determinants associated with drug class grouped by country. 522 Each cell in the heatmap represents the percentage of isolates from a region containing genetic determinants 523 associated with resistance to a drug class. Genetic determinant conferring reduced susceptibility to 524 quinolone is indicated with an asterisk. (C) The genetic convergent evolution of ciprofloxacin resistance in 525 S. flexneri and S. sonnei. The presence of multiple monophyletic clades of QRDR mutations (single, double, 526 or triple according to the inlaid key) conferring reduced susceptibility or resistance to ciprofloxacin is 527 shown in the outer ring. B and C for S. boydii and S. dysenteriae are shown elsewhere (fig. S15). 528 the number of unique protein coding genes in the pangenome as a new genome is randomly added, with the 534 number of genomes plotted along the x-axis. Random permutation of the data were subsampled 100 times, 535 in which genomes are subsampled without replacement at each iteration. The y-axis shows the minimum 536 and maximum range of unique gene count after each iteration in (A) and the mean value in (B). 537 Phylogeny of S. flexneri population from GEMS. ML phylogenetic tree constructed using core genome 540 SNPs from alignments of 817 S. flexneri genomes from GEMS and publicly available genomes. Tree was 541 rooted using E. coli genome. The outer concentric rings illustrate different genotypic and epidemiological 542 data according to the numbered inlaid keys displayed next to the tree. Scale bars represents the number of 543 SNPs. 544 Phylogeny of S. sonnei population from GEMS. Midpoint rooted ML phylogenetic tree constructed using 547 core genome SNPs from alignments of 308 S. sonnei genomes from GEMS and publicly available genomes. diversity of the subtype, as measured by pairwise core SNP distance and plotted along the x-axis. Linear 565 regression analysis was performed to assess the association between serotype diversity and the different 566 properties of subtypes. The regression coefficient of determination (R 2 ) and p-value are displayed on the 567 top left of each plot. 568 Serotype switching events across S. flexneri genomic subtypes. ML phylogenetic tree of each subtype was 571 generated based on core genome SNPs. Serotypes determined through biochemical serotyping are displayed 572 on the right-hand side of each tree, and coloured according to the inlaid key. The 26 inferred serotype 573 switching events occurring along the phylogenetic branches are labelled accordingly. Numbers inside each 574 backets represents switch IDs, with further details provided in table S3. Where the dominant serotype 575 cannot be determined, a question mark is displayed, indicating switch from unknown ancestral type. 576 Serotype switching events resulting in more than two descendant isolates are highlighted in red. 577 Estimation of time frame for serotype switching among S. flexneri PG3 isolates. ML phylogenetic tree of 580 S. flexneri PG3 (n=384) generated using core genome SNPs is displayed on the right. Isolate serotype is 581 displayed on the outer ring, coloured according to the inlaid key displayed next to the tree. Two subclades 582 with branches highlighted in red were selected for BEAST analysis. Maximum clade credibility trees based 583 on two subclades within PG3 are displayed on the left. Independent switching events occurring along the 584 various phylogenetic branches are highlighted in black, labelled and annotated. BEAST estimated time 585 frame of divergence along the branches of the seven isolates that have undergone serotype switching are 586 shown in table S5. 587 The distribution of vaccine antigen candidate and protein sequence identity among Shigella spp. (A) 590 Lefthand y-axis refers to the grouped bar plot displaying presence of vaccine candidate genes identified 591 among Shigella isolates from GEMS. Bars are grouped by genes and coloured according to species. 592 Righthand y-axis (blue) refers to the boxplot displaying the interquartile range, median (red) and 593 minimum/maximum pairwise percentage identity of the amino acid sequences of antigen vaccine 594 candidates among GEMS, compared against the reference sequences. Presence of genes were identified 595 using BLASTn search against draft genome assemblies and amino acid sequence percentage identity were 596 inferred using BLASTp. (B) Mapping coverage of Shigella spp. virulence plasmid. Low percentage of 597 virulence plasmid were detected among S. sonnei isolates, likely contributed by the fact that S. sonnei 598 virulence plasmid is comparatively unstable and often lost during subculturing. 599 Temporal phylogenetic signal for S. flexneri. Correlation between isolate sampling time in months (x-axis) 649 and phylogenetic root-to-tip divergence (y-axis), as estimated by TempEst based on ML phylogeny of each 650 subclade. The two datasets correspond to S. flexneri 2a isolates belonging to node A (left) and S. flexneri 651 2b isolates belonging to node B (right) from PG3 in fig. S8 . The linear regression line is coloured in red, 652 with the coefficient of determination (R 2 ) and p-value displayed for each plot. 653 Morbidity and mortality 9 due to shigella and enterotoxigenic Escherichia coli diarrhoea: the Global Burden of Disease Study 1990-10 2016. The Lancet infectious diseases Burden and 12 aetiology of diarrhoeal disease in infants and young GEMS): a prospective, case-control study Use of quantitative molecular 15 diagnostic methods to identify causes of diarrhoea in children: a reanalysis of the GEMS case-control 16 study World health organization releases global priority 20 list of antibiotic-resistant bacteria to guide research, discovery, and development of new antibiotics Progress and pitfalls in 23 Shigella vaccine research Prospective study of the association between 25 serum antibodies to lipopolysaccharide O antigen and the attack rate of shigellosis Epidemiologic patterns of 28 acute diarrhea and endemic Shigella infections in children in a poor periurban setting in Effect of prior 31 infection with virulent Shigella flexneri 2a on the resistance of monkeys to subsequent infection with 32 Shigella sonnei Shigella volunteer challenge model in which the inoculum is administered with bicarbonate buffer: 35 clinical experience and implications for Shigella infectivity Clinical trials of Shigella vaccines: 37 two steps forward and one step back on a long, hard road Human 41 challenge study with a Shigella bioconjugate vaccine: Analyses of clinical efficacy and correlate of 42 protection Age-44 related efficacy of Shigella O-specific polysaccharide conjugates in 1-4-year-old Israeli children. 45 Vaccine Immunogenicity and efficacy of highly purified invasin 47 complex vaccine from Shigella flexneri 2a Broadly protective Shigella vaccine based on type III secretion apparatus proteins Atlas of group A 52 streptococcal vaccine candidates compiled using large-scale comparative genomics Bacterial genome variability and its impact on vaccine design Shigella 57 isolates from the global enteric multicenter study inform vaccine development Species-wide whole 60 genome sequencing reveals historical global spread and recent local persistence in Shigella flexneri Shigella sonnei genome 63 sequencing and phylogenetic analysis indicate recent global dissemination from Europe Genome diversity of Shigella boydii Global 70 population structure and genotyping framework for genomic surveillance of the major dysentery 71 pathogen, Shigella sonnei Defining the 73 phylogenomics of Shigella species: a pathway to diagnostics Shigella diarrhoea in six Asian countries: disease burden, clinical manifestations, and microbiology serotype X variant in an epidemic clone of Shigella flexneri Serotype-converting bacteriophages and O-antigen modification in 80 Shigella flexneri A novel plasmid-encoded 82 serotype conversion mechanism through addition of phosphoethanolamine to the O-antigen of Shigella 83 flexneri Vaccine escape recombinants emerge after 87 pneumococcal vaccination in the United States Deletion of toxin-antitoxin systems in the evolution of Shigella sonnei as 89 a host-adapted pathogen Multiple 91 SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity Evidence of escape of 93 SARS-CoV-2 variant B.1.351 from natural and vaccine-induced sera Shigella flexneri invasion plasmid antigens B and C: epitope 95 location and characterization with monoclonal antibodies Identification of epitope and surface-exposed 97 domains of Shigella flexneri invasion plasmid antigen D (IpaD) Outer membrane protein A (OmpA) of Shigella 100 flexneri 2a, induces protective immune response in a mouse model Guidelines for the control of shigellosis, including epidemics due to Shigella 102 dysenteriae type 1 Out of Asia: the independent rise and global spread of fluoroquinolone-104 resistant Shigella Comparison 106 of phenotypic and WGS-derived antimicrobial resistance profiles of Shigella sonnei isolated from cases 107 of diarrhoeal disease in England and Wales Guidelines for the treatment of dysentery (shigellosis): a systematic 112 review of the evidence Asia as a Reservoir for the Global Spread of Ciprofloxacin-Resistant Shigella sonnei: A Cross-Sectional 115 Study Dissecting the 117 molecular evolution of fluoroquinolone-resistant Shigella sonnei AMR NGHRUoGSo. Whole-genome sequencing as part of national and international 124 surveillance programmes for antimicrobial resistance: a roadmap Trimmomatic: a flexible trimmer for Illumina sequence data MultiQC: summarize analysis results for multiple 132 tools and samples in a single report Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM Alignment/Map format and SAMtools Qualimap: 138 evaluating next-generation sequencing alignment data PHASTER: a better, faster version 140 of the PHAST phage search tool The Swiss-Army Tool for Genome Feature Analysis Rapid phylogenetic 144 analysis of large samples of recombinant bacterial whole genome sequences using Gubbins IQ-TREE: a fast and effective stochastic 147 algorithm for estimating maximum-likelihood phylogenies Exploring the temporal structure of 151 heterochronous sequences using TempEst (formerly Path-O-Gen) Posterior Summarization in Bayesian 157 Phylogenetics Using Tracer 1.7 159 BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis Unicycler: resolving bacterial genome assemblies from 162 short and long sequencing reads Prokka: rapid prokaryotic genome annotation Roary: rapid large-scale 167 prokaryote pan genome analysis SRST2: Rapid 171 genomic surveillance for public health and hospital microbiology labs Gapped BLAST and 173 PSI-BLAST: a new generation of protein database search programs AliView: a fast and lightweight alignment viewer and editor for large datasets Fast and accurate automatic structure prediction 178 with HHpred Toolkit with a New HHpred Server at its Core High-resolution comparative 183 modeling with RosettaCM Improved protein structure 185 prediction using predicted interresidue orientations QMEANDisCo-188 distance constraints applied on model quality estimation Assessing the local structural quality of transmembrane protein 190 models using statistical potentials (QMEANBrane) Predicting the impact of missense 192 mutations on protein stability Validating the 194 AMRFinder Tool and Resistance Gene Database by Using Antimicrobial Resistance Genotype-Phenotype 195 Correlations in a Collection of Isolates UpSetR: an R package for the visualization of intersecting 197 sets and their properties Bauer disk diffusion susceptibility test protocol