key: cord-0840064-rksu8qll authors: Spurbeck, Rachel R.; Minard-Smith, Angela; Catlin, Lindsay title: Feasibility of neighborhood and building scale wastewater-based genomic epidemiology for pathogen surveillance date: 2021-10-01 journal: Sci Total Environ DOI: 10.1016/j.scitotenv.2021.147829 sha: fd6322c02cfae1882abf98b4d2f4da99f1289ad8 doc_id: 840064 cord_uid: rksu8qll The benefits of wastewater-based epidemiology (WBE) for tracking the viral load of SARS-CoV-2, the causative agent of COVID-19, have become apparent since the start of the pandemic. However, most sampling occurs at the wastewater treatment plant influent and therefore monitors the entire catchment, encompassing multiple municipalities, and is conducted using quantitative polymerase chain reaction (qPCR), which only quantifies one target. Sequencing methods provide additional strain information and also can identify other pathogens, broadening the applicability of WBE to beyond the COVID-19 pandemic. Here we demonstrate feasibility of sampling at the neighborhood or building complex level using qPCR, targeted sequencing, and untargeted metatranscriptomics (total RNA sequencing) to provide a refined understanding of the local dynamics of SARS-CoV-2 strains and identify other pathogens circulating in the community. We demonstrate feasibility of tracking SARS-CoV-2 at the neighborhood, hospital, and nursing home level with the ability to detect one COVID-19 positive out of 60 nursing home residents. The viral load obtained was correlative with the number of COVID-19 patients being treated in the hospital. Targeted wastewater-based sequencing over time demonstrated that nonsynonymous mutations fluctuate in the viral population. Clades and shifts in mutation profiles within the community were monitored and could be used to determine if vaccine or diagnostics need to be adapted to ensure continued efficacy. Furthermore, untargeted RNA sequencing identified several other pathogens in the samples. 
Therefore, untargeted RNA sequencing could be used to identify new outbreaks or emerging pathogens beyond the COVID-19 pandemic. Wastewater based epidemiology (WBE) is a rapidly growing field for the surveillance of pathogen load within human populations. Since it was discovered that SARS-CoV-2, the causative agent of COVID-19, can be quantified from feces (Wu et al., 2020b) , quantitative polymerase chain reaction (qPCR)-based wastewater testing has been implemented by several countries and states to track the COVID-19 pandemic by monitoring the viral load in wastewater (Achak et al., 2021; Ahmed et al., 2020; Bivins et al., 2020; Crits-Christoph et al., 2021; Foladori et al., 2020; Gonzalez et al., 2020; Gormley et al., 2020; Hamouda et al., 2020; Haramoto et al., 2020; Hart and Halden, 2020; Kitajima et al., 2020; Medema et al., 2020; Murakami et al., 2020; Weidhaas et al., 2020; Westhaus et al., 2021; Wu et al., 2020a) . However, while qPCR provides viral load data, it is not maximizing the information that could be gathered from wastewater. Next generation sequencing techniques can be used to provide full SARS-CoV-2 genomic sequences from wastewater samples to identify the dominant viral strain in circulation, including mutations that could affect diagnostic or therapeutic strategies. With vaccines being rolled out worldwide, it is imperative to monitor newly emerging viral variants which could bear mutations affecting vaccine efficacy (England, 2020; Kemp et al., 2020; Tu et al., 2021) . Furthermore, monitoring COVID-19 prevalence in smaller regions than an entire catchment, such as at the level of a neighborhood or building enables localized and swift reactions to trends in COVID-19. For instance, the University of Arizona was able to identify and isolate two infected students via monitoring wastewater at the level of student dormitory buildings (Peiser, 2020) . 
While other universities without a dormitory wastewater monitoring program have had to shut down in person classes periodically due to large COVID-19 outbreaks, the University of Arizona was able to quarantine the two students and continue classes. More than one pathogen can be monitored at a time when utilizing next generation sequencing techniques such as RNA sequencing which sequences all of the RNA present in a sample. By broadening the net to include all RNA in a sample, genomic WBE can monitor other communicable diseases such as influenza, norovirus, Salmonella, or nosocomial infections. This technique has been commonly utilized in environmental microbiology and virology to study the microbiome or virome of soil, water, and other niches such as the human gut. Recently, RNA sequencing was utilized to determine the impact of wastewater effluent on river water, showing that wastewater impacts the RNA virus diversity in rivers, and identified norovirus (Adriaenssens et al., 2018) . By utilizing these metagenomic techniques on wastewater over time, pathogen dynamics in a community will be identified and can be used to predict or monitor for local outbreaks of foodborne illness, influenza, or identify an emerging pathogen. With governments and nongovernment organizations calling for building a pandemic surveillance network, metagenomic wastewater-based epidemiology would be an ideal method to identify emerging pathogens, especially as pathogens circulating in populations can be monitored using non-invasive sampling techniques (Murakami et al., 2020) . Here we demonstrate that WBE can be utilized at the local level to monitor SARS-CoV-2 viral load, SARS-CoV-2 viral variants circulating in the population, and other pathogens of interest to public health. The sampling locations in this study were chosen specifically to enable comparison of viral load in wastewater to known case load or resident population numbers mapped to the sewer junction points. 
Two manholes used for sampling were located directly outside of major hospitals known to be treating COVID-19 patients, one was located outside of a nursing home, and a fourth manhole was located downstream of a residential neighborhood within Toledo, OH, USA. Our results show that our methods were able to detect SARS-CoV-2 when 1 out of 60 residents was infected. Multiple mutations were identified, with strains belonging to clades GH and GR. Untargeted RNAseq identified several other communicable diseases fluctuating within the samples over time, including uropathogenic E. coli, tetanus, and bacteria known to cause nosocomial pneumonia. Furthermore, two different RNA extraction methods were evaluated in conjunction with qPCR to determine the most robust method for COVID-19, demonstrating that the Qiagen Viral RNA Kit produced more reproducible results than TRIzol™ extraction by qPCR. Two hospitals known to be treating COVID-19 patients (Hospitals V and P), a Nursing Home (N) that had previously had COVID-19 positive residents, and a Residential Neighborhood Community (C) within Toledo, OH, USA, in the zip code with the highest COVID-19 positive case load in June, were identified by the Toledo-Lucas County Public Health Department. Manhole sewer access points were identified by the City of Toledo, Division of Environmental Services to ensure that the effluent sampled was only from the buildings or neighborhood of interest to the study. Positive case numbers for the nursing home were obtained for each week from the Toledo-Lucas County Public Health Department and the management of the nursing home. Positive case counts were obtained directly from the management at Hospital V; Hospital P declined to provide this information. No specific case count data were available for the residential neighborhood site. One-liter sample collections were conducted at four specific locations in Toledo, OH, USA from manhole access once a week for three weeks. 
24-hour composite samples were collected using an autosampler (ISCO 3700C), with some exceptions due to low flow. In low flow situations, a bucket was used to collect wastewater prior to use of the autosampler to ensure enough volume was present for the composite collection. Table 1 shows which samples were grab or composite samples, and other metadata such as sample collection time, temperature and pH. All samples were collected between 10 AM and 12 PM EST and placed on ice in coolers during transportation to the analytical laboratory. As back up during analysis, 250 mL of each wastewater sample was stored. The following sample metadata was gathered each week during each sample collection: time of collection, method of collection, pH of samples, ambient temperature, and water temperature at time of collection. To encourage detachment of virions from organic material in the wastewater, 107 mL of glycine buffer (0.05 M glycine, 3% beef extract, pH 9.6) was added to 750 mL of wastewater and mixed for 1 min (Vlok et al., 2019) . The samples were then centrifuged at 8000 ×g for 30 min. The supernatant was filtered through 0.22 micron filters to remove bacteria, fungi, or eukaryotic cells present in the samples. The virions were precipitated from the filtrate using a PEG-8000/NaCl solution ((PEG/NaCl) 320 g/L PEG-8000, 70 g/L (w/v) NaCl) at a ratio of one-part PEG/NaCl to three parts filtrate (Wu et al., 2020a) . Mixtures were gently stirred for 12 h at 4°C. Precipitated virions were collected by centrifugation for 90 min at 13,000 ×g at 4°C. Pellets were resuspended in 600 μL 1× PBS. The volume changes throughout processing were captured to enable back calculation of total viral load. Two methods were used for RNA extraction. First, RNA was extracted from the collected viral preparations using TRIzol Reagent (ThermoFisher, Cat# 15596026) following the manufacturer's instructions. RNA quantity and quality were measured by NanoDrop. 
Second, viral RNA was extracted by QIAamp Viral RNA Mini Kit (Qiagen, Cat# 52904) following the manufacturer's protocol. Each RNA sample was analyzed by reverse transcription quantitative polymerase chain reaction (RT-qPCR) targeting the nucleocapsid gene segment (N1) to determine the SARS-CoV-2 viral load. Briefly, 5 μL of 4× TaqMan Fast Viral One-Step Master Mix (ThermoFisher, Cat# 4444432), 0.33 μL of 60× custom TaqMan Gene Expression Assay primer/probe mix (Forward primer 5′-GAC CCC AAA ATC AGC GAA AT-3′; Reverse primer 5′-TCT GGT TAC TGC CAG TTG AAT CTG-3′; Probe 5′-FAM-ACC CCG CAT TAC GTT TGG TGG ACC-NFQ-MGB-3′) (ThermoFisher, Cat# 4331348), 9.67 μL of molecular biology grade water, and 5 μL of sample were combined for a 20 μL reaction volume. Each plate contained an eight-point serial dilution of the standard (SARS-CoV-2 synthetic RNA from Biosynthesis, Inc.; 1E7 to 1E0) run in triplicate, a no template control in triplicate, and samples in duplicate. Plates were run according to the following thermal cycling parameters: 50°C for 5 min, 95°C for 20 s, and 40 cycles of 95°C for 3 s followed by 60°C for 30 s in an ABI 7500 fast quantitative thermocycler. This method is sensitive down to one genome copy per μL. Total RNA was sequenced from each sample. RNA libraries were prepared using the Rapid RNAseq kit (Swift, Cat# R2096) from Swift Biosciences and sequenced on an Illumina NextSeq using NextSeq 500/550 High Output Kit v2.5 (150 Cycles, 75 × 75 bp). No template control libraries were prepared alongside RNAseq samples but were not sequenced. Data were analyzed following the pipeline outlined in the flowchart depicted in Fig. 1. Samples were tested specifically for SARS-CoV-2 by amplicon sequencing using the SNAP SARS-CoV-2 Kit (Swift, Cat# SN-5X296 and COVG1-96) from Swift Biosciences. 
Briefly, first strand cDNA was synthesized from 11 μL of total RNA from each sample using the Superscript IV First Strand Synthesis System with random hexamers (Invitrogen, Cat# 18091050) following the manufacturer's instructions. The resulting cDNA (10 μL) was used as template for the SARS-CoV-2 Kit following the manufacturer's instructions. Due to adapter dimer issues encountered with SNAP SARS-CoV-2 Kit, the manufacturer's protocol had to be modified to include a second bead clean up prior to the final library amplification step to enable the SARS-CoV-2 sample libraries to be sequencable from wastewater. This extra step reduced the adapter dimers prior to amplification. Prior to sequencing, the samples were quantified by Qubit HS DNA kit (Invitrogen, Cat# Q32851), pooled, and sequenced on an Illumina MiSeq 300 cycle V3 flow cell (Illumina, Cat# MS-102-2002) . Samples were batched by week to enable high depth of coverage for single nucleotide polymorphism analysis. The amplicon data were aligned to the NCBI Reference Sequence NC_045512.2 (Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome) and analyzed using the pipeline provided by Swift Biosciences. a Partial composite obtained. Grab sampling was used to increase the volume to the 1 L required. Processing flowchart for Total RNA-Seq data. RNA-Seq data was obtained from runs on an Illumina NextSeq instrument. Files were demultiplexed, quality checked, and trimmed. Sequences were assembled using SPAdes into contigs and then compared to known reference sequences in the refseq database using BlastN or DIAMOND BlastX to identify pathogens in the wastewater samples. A positive control and a no template control were processed and quantified. The positive control (SARS-CoV-2 Standard from Exact Diagnostics) was sequenced demonstrating full genome coverage of reference sequence NC_045512.2. 
Due to high levels of adapter dimer present in the no template controls even after the addition of another clean up step, sequencing the no template control was impossible. The high level of adapter dimers would cause the flow cell to overcluster and abort sequencing mid run. Sequencing was achieved for 9 out of 12 samples by only sequencing the positive control and the samples known to contain SARS-CoV-2 based on quantitative PCR results. The manufacturer claims ability to obtain full genome coverage from 10 genome copies from clinical samples; however, there was no information on ability to sequence from wastewater in July-August of 2020 when these samples were assessed. Unlike most wastewater epidemiology studies, which sample at the influent going into the wastewater treatment plants servicing a particular region of a city, our project studied the biological components in raw sewage derived from manholes in four specific locations in the city of Toledo, Ohio. The sites were carefully chosen through discussion with the city wastewater management team and the Toledo-Lucas County Public Health Department to ensure the sites were in locations with a high probability of SARS-CoV-2 circulation. Two sites were the direct effluent from hospitals known to be actively treating COVID-19 patients at the time of the study, Hospital P (P, Fig. 2A) and Hospital V (V, Fig. 2B). A third site was outside a Nursing Home N (N, Fig. 2C) and a fourth was outside a residential neighborhood, Community C (C, Fig. 2D). The sampling method, sampling time, pH, ambient temperature, and water temperature were recorded (Table 1). During the first collection on 7/14/2020, the ambient temperature ranged from a high of 32.2°C to a low of 17.8°C.
3.2. Known case load at sampling sites by week and qPCR detection of SARS-CoV-2
The RNA extracted by the TRIzol reagent did not show evidence of any SARS-CoV-2 in the samples; however, the QIAamp Viral RNA Mini Kit preparations produced detectable levels of SARS-CoV-2 by qPCR in 9 of the 12 samples (Table 2). For comparison, the percentage of known COVID-19 cases in Hospital V and Nursing Home N was determined by dividing the number of known COVID-19 positive patients by the total number of inpatients, and a weak positive correlation was observed between the number of infected individuals and the viral load in the wastewater. Hospital P did not provide the data to enable the calculation of percent positive cases in that hospital. The percent positive cases in Community C were also unknown, as this residential neighborhood is smaller than the smallest reportable case counts for the region. In Lucas County, Ohio, USA, where these samples were taken, data were reported for community positive cases at the zip code level. In theory, RNAseq should afford direct SARS-CoV-2 signature detection in wastewater without requiring enrichment. However, despite following viral concentration protocols, a large amount of the sequence data was found to map to other organisms such as bacteria, bacteriophage, and eukaryotic organisms (Fig. 3). Reads mapping to SARS-CoV-2 were detected in only five out of the twelve samples, with at most 3 reads mapping to the viral genome, despite there being an average of 11 M raw reads per sample and 3 M filtered reads per sample. SARS-CoV-2 was detected in Community C samples collected week 1 (3 reads) and week 2 (3 reads), and Hospital V samples collected in week 1 (1 read), week 2 (2 reads), and week 3 (2 reads). This low coverage is not sufficient for confidence in reporting presence of the pathogen at the local level. Due to the relatively small genome and other larger genomes present in the wastewater, the SARS-CoV-2 reads were likely obscured. 
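To make the scale of the read counts above concrete, the following minimal sketch computes the fraction of filtered reads that mapped to SARS-CoV-2 and the implied mean depth over the genome. It is an illustration, not part of the study's pipeline: the function names are hypothetical, the genome length is that of reference NC_045512.2 (29,903 nt), and it assumes each counted read is a single 75 bp read that aligns over its full length.

```python
# Back-of-the-envelope illustration; function names are hypothetical.
GENOME_LENGTH_BP = 29_903   # length of reference NC_045512.2
READ_LENGTH_BP = 75         # assumed from the 75 x 75 bp run configuration


def mapped_fraction(mapped_reads: int, total_reads: int) -> float:
    """Fraction of reads in a sample that mapped to the target genome."""
    return mapped_reads / total_reads


def mean_depth(mapped_reads: int,
               read_length: int = READ_LENGTH_BP,
               genome_length: int = GENOME_LENGTH_BP) -> float:
    """Mean depth of coverage, assuming every read aligns over its full length."""
    return mapped_reads * read_length / genome_length


# Best case reported in the text: 3 mapped reads, ~3 M filtered reads (an average).
fraction = mapped_fraction(3, 3_000_000)
depth = mean_depth(3)
print(f"mapped fraction: {fraction:.1e}")
print(f"mean depth: {depth:.4f}x")
```

Under these assumptions, even the best sample covers well under 1% of the genome on average, which is consistent with the statement that the coverage is insufficient for confident reporting.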
Despite removal of microorganisms and enrichment for viral particles, a large portion of the sequencing data (51-90%) mapped to bacteria, and a fraction of the data (1-19%) mapped to macroorganisms, which demonstrates our viral preparation also included RNA from lysed cells (Fig. 3). Although not the target of this study, the sequenced microbial RNA identified several commensal species that are nonpathogenic to their hosts, as well as infectious and opportunistic pathogens (Fig. 4). While not pertaining to the COVID-19 pandemic, this finding demonstrates that other pathogenic organisms can be identified and tracked using wastewater sequencing, and therefore non-targeted wastewater sequencing could be continued after the pandemic wanes to improve our understanding of pathogen distribution within local communities and healthcare facilities. Several pathogenic categories of interest were detected from assembled contigs in the hospitals, nursing home, and community, including microbes associated with nosocomial infections, gastroenteritis, cystitis, pneumonia, septicemia, and tetanus (Fig. 5). The etiological agents identified for nosocomial infections included the Acinetobacter species A. baumannii, A. junii SH205, and A. lwoffii SH145. Campylobacter jejuni subsp. jejuni BH-01-0142, a known cause of gastroenteritis, was identified in the Community C neighborhood, Nursing Home N, and Hospital V. The causative agent of cystitis identified in the first week of testing at Nursing Home N was Escherichia coli UTI89. These data suggest that WBE can be utilized to track contagions outside of a pandemic and could be a beneficial tool for future epidemiological and biosurveillance studies. The whole genome of SARS-CoV-2 was enriched from the wastewater samples by targeted amplicon sequencing, enabling single nucleotide polymorphisms (SNPs) and insertion/deletion mutations (indels) to be identified when aligned to the reference strain SARS-CoV-2 Wuhan-Hu-1. 
Table 3 presents non-synonymous mutations identified in the Toledo samples that are unique (previously unobserved) or existing mutations (previously observed). Of the twelve samples analyzed in this study, nine were COVID-19 positive based on qPCR. Sequencing the positive samples, only five produced full genomes with enough coverage for confident calls: weeks 1 and 2 from Community C and weeks 1, 2, and 3 from Hospital V (Table 3). The mutations identified enabled Clade classification. Community C samples fell into Clade GH, which carries the Spike D614G mutation. Hospital V data showed several mutations at variable penetrance indicating multiple strains are present in these samples, therefore the Clade was described as Other. Despite having multiple strains present, all three weeks had SARS-CoV-2 sequences carrying the Spike D614G mutation. A week-to-week comparison of Community C samples yielded highly similar existing nonsynonymous mutation loads with ten mutations observed in both weeks. In addition, different unique nonsynonymous mutations were observed, and one existing nonsynonymous mutation was lost from week 1 to week 2. All three weeks from Hospital V had five existing nonsynonymous mutations in common (NSP2_T85I, NSP3_M1788I, NSP12_P323L, NSP16_R216C, and Spike_D614G). However, Hospital V weeks 1 and 2 had an additional five existing nonsynonymous mutations in common (NSP5_L89F, NSP14_N129D, NS3_Q57H, NS8_S24L, and N_P67S), demonstrating that nonsynonymous mutation load has changed over time. Similarly, the number of SNPs and indels in the whole genome sequencing data changed over time at each site (Table 4). The three genes most targeted by diagnostics were prone to both SNPs and indels, demonstrating the importance of continuous monitoring by sequencing as the pandemic continues. Over the three weeks of this study, the viral population present in the effluent from Hospital V underwent the most striking changes, with the total number of SNPs and indels increasing each week. Importantly, two deletions were found in the S protein in the third week of sampling (7/28/20) at Hospital V. One was a deletion of a full codon, ΔACC23493-23496, which was found in 9% of the data. The other S protein deletion, ΔG24696, was found in 25% of the data.
Fig. 5. The abundance of pathogens identified by total RNA sequencing varied by sampling site and over time. Notably, pathogens associated with nosocomial infections were found in the effluent from the Nursing Home (N) and Hospital V (V), but not in Hospital P (P) or the Community C (C). Weeks of collection are indicated as 1 (7/14/20), 2 (7/21/20), or 3 (7/28/20).
Table 3. Consensus sequences identified nonsynonymous mutations and categorized the strains by clade. Gray rows indicate samples with partial genome coverage, which may not reliably call the clade. Bold mutations are ones that were not present in all weeks at the same site.
The objective of this work was to first demonstrate the utility of wastewater surveillance at the neighborhood or hospital level and then to identify the best methods for tracking SARS-CoV-2 and other pathogens to enable a population-based approach to pandemic biosurveillance. Our data have demonstrated that while total RNAseq, targeted whole genome sequencing, and qPCR methodologies can be used to detect SARS-CoV-2 in wastewater, a combination of qPCR to identify viral load and targeted whole genome sequencing to detect mutations provides deeper insights to support public health decisions during pandemic surveillance than qPCR alone. RNAseq, while identifying other pathogens present in the samples such as microbes that cause nosocomial infections, periodontitis, gastroenteritis, pneumonia, or cystitis, did not recover enough reads mapping to SARS-CoV-2 to detect the viral pathogen with any confidence. Therefore, in a targeted effort to understand the viral variants in circulation during the pandemic, RNAseq would be limited in utility. 
However, the non-targeted surveillance technique could alert hospitals and long-term care facilities to potential issues with nosocomial pathogens, or to residents with infections that need treatment but can otherwise go undiagnosed because they are asymptomatic or recurrent, such as cystitis. Use in community settings could alert neighborhoods to potential norovirus or Salmonella outbreaks and help public health officials pinpoint restaurants or grocery stores selling contaminated products. Further method development will be necessary to increase the utility of untargeted RNAseq for surveillance of emerging pathogens. Utilizing amplicon sequencing to target and sequence the full SARS-CoV-2 genome enabled mutation analysis of the viral population identified in the wastewater samples. As demonstrated recently, new strains of SARS-CoV-2 are emerging which are hyper transmissible and could potentially carry mutations in the Spike protein that allow escape from vaccines or convalescent plasma therapy (England, 2020; Kemp et al., 2020; Tu et al., 2021). Also, the Spike, N, and Orf1ab genes are diagnostic targets, so any SNP or indel identified in these genes that could affect primer or probe binding of a diagnostic test is of high interest and needs to be reported to enable improved diagnostics. Indeed, several mutations have been identified that affect different COVID-19 diagnostic tests (Wang et al., 2020). Furthermore, nonsynonymous mutations in the Spike protein could indicate the potential for vaccine escape and will need to be characterized to understand their impact on vaccine efficacy. Sequencing from wastewater samples within a city or at hospital effluent can aid in surveillance of these mutations and reduces the burden on supplies: instead of sampling, quantifying, and sequencing SARS-CoV-2 in individuals, wastewater enables population sampling, reducing the number of assays and reagents used during surveillance. 
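The point about mutations under primer or probe binding sites can be illustrated with a minimal sketch that flags whether a variant position falls inside the footprint of a diagnostic oligonucleotide. This is an assumption-laden illustration rather than the analysis used in the study: the helper names are hypothetical, the oligo sequences are the N1 primers and probe quoted in the methods, matching is exact-string only, and the reference here is a short toy sequence built for demonstration rather than the SARS-CoV-2 genome.

```python
# Hypothetical helpers; oligo sequences are the N1 set quoted in the methods.
# "+" oligos match the reference strand; "-" oligos match its reverse complement.
N1_OLIGOS = {
    "forward": ("GACCCCAAAATCAGCGAAAT", "+"),
    "reverse": ("TCTGGTTACTGCCAGTTGAATCTG", "-"),
    "probe": ("ACCCCGCATTACGTTTGGTGGACC", "+"),
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def binding_sites(reference: str, oligos: dict = N1_OLIGOS) -> dict:
    """Map oligo name -> (start, end), 1-based inclusive, on the reference strand.

    Only the first exact match is reported; oligos without a match are omitted.
    """
    sites = {}
    for name, (seq, strand) in oligos.items():
        target = seq if strand == "+" else reverse_complement(seq)
        index = reference.find(target)
        if index >= 0:
            sites[name] = (index + 1, index + len(target))
    return sites


def oligos_hit(position: int, sites: dict) -> list:
    """Names of oligos whose footprint contains a 1-based variant position."""
    return sorted(n for n, (start, end) in sites.items() if start <= position <= end)


# Toy reference (NOT the viral genome): oligo targets joined by short spacers.
toy_reference = (
    "TTTT" + N1_OLIGOS["forward"][0]
    + "GG" + N1_OLIGOS["probe"][0]
    + "GG" + reverse_complement(N1_OLIGOS["reverse"][0])
    + "TTTT"
)
sites = binding_sites(toy_reference)
print(sites)
print(oligos_hit(30, sites))   # a position inside the probe footprint
print(oligos_hit(26, sites))   # a spacer position, outside every oligo
```

In practice the same check would be run against coordinates on the actual reference genome, and a tolerant alignment would be preferable to exact matching, since the mutations of interest are precisely those that break an exact match.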
With both convalescent plasma and the vaccines targeting the spike protein, it is imperative that nonsynonymous mutations be monitored in this gene. Indeed, it has been demonstrated that spike protein mutations increase in SARS-CoV-2 as a patient is treated with convalescent plasma, indicating the antibodies are placing evolutionary pressure on the virus that selects for mutations in this therapeutic target (Kemp et al., 2020). All Toledo wastewater samples sequenced carried the Spike D614G mutation, which may confer higher transmissibility to the virus and is characteristic of clade G. Indeed, three of the sequenced samples were categorized as GH and one as GR. In a study in Asia, clades GH and GR were found to contain high levels of SNPs, indicating substantial diversity in these strains (Sengupta et al., 2020). Mutation analysis of wastewater samples collected in Toledo, OH during July 2020 did not find the signatures for the two SARS-CoV-2 clade 20C/G variants that were later identified in Columbus, OH (Tu et al., 2021). Likewise, the strain UK-B.1.1.7 (clade 20I/501Y.V1) (England, 2020; Galloway et al., 2021), also thought to be hyper transmissible and bearing mutations in the S protein, was not present in July 2020 Toledo, OH wastewater samples. However, another spike mutation, Spike_N370K, was observed in week 1 of Community C, and two deletions (ΔACC23493-23496 and ΔG24696) were observed in week 3 at Hospital V. Therefore, wastewater-based sequencing of SARS-CoV-2 can identify spike protein mutations which may impact transmission and response to the vaccine or convalescent plasma therapy. Most wastewater-based epidemiology takes place at the influent to a wastewater treatment facility, providing a large catchment that encompasses multiple municipalities. In contrast, our work focused on sampling effluent from a neighborhood, two hospitals, and a nursing home. 
One limitation of local sampling was flow rate issues (some of our sample locations could not provide 24-hour composites due to low flow); however, sampling at regions within a city provided more refined information on where strains are circulating in the community. For example, strains circulating in Community C were all in the clade GH, while a strain identified in Hospital P belonged to clade GR, and strains at Hospital V were categorized as Other. While all strains bore some mutations in common, each week's sample had mutations unique to that sampling, demonstrating that mutations are common in SARS-CoV-2, and several strains are present in one city. While population size is a concern, qPCR from effluent of our smallest population, that of Nursing Home N, was able to detect the single case of COVID-19 in week 3, from a population of 60 residents. Since Ohio has implemented weekly testing of nursing home residents, this detection was confirmed by the public health department's data. The case load data from Hospital V demonstrated an increase in hospitalized COVID-19 cases over time, and our qPCR data correlates with this trend, indicating that with more sampling, a predictive model could be developed and validated to identify an approximate number of infected people present in the catchment over time. Using data from the ongoing study, we are currently working to develop and validate candidate models. Another highly discussed aspect of wastewater-based epidemiology is the degradation of target viral RNA during its transit through the sewer system. While SARS-CoV-2 RNA is intact enough at the inlet to the wastewater treatment plant for qPCR identification, sequencing samples taken closer to the source, such as the effluent from a building or a neighborhood may provide less degraded samples for whole genome sequencing, and therefore provide a better representation of the viral mutations in circulation. 
Future work will explore the impact of the sewer system on RNA sample quality from different locations within a catchment to optimize viral sequencing. This study has demonstrated that wastewater epidemiology at the local level can be used to determine the viral load in a neighborhood or building and identify strain variants in circulation. Total RNAseq from wastewater samples, while effective in identifying bacterial pathogens in circulation, would require more sequencing depth to consistently detect SARS-CoV-2 for biosurveillance purposes, due to the low abundance of the viral RNA in comparison with all other RNA in the sample. Therefore, it is recommended that a combination of qPCR and targeted sequencing of SARS-CoV-2 be utilized to track the pandemic and provide early warning of new strains circulating in the population which could otherwise evade detection by diagnostics or indicate vaccine escape mutants. The use of wastewater to track mutations is limited by the methods available to bioinformatically deconvolute the viral sequences, and therefore in this work we presented the mutations identified without separating the sequences into individual genomes. Future directions will incorporate long read sequencing or linked read sequencing to enable deconvolution into individual viral strains. Furthermore, a comparison of genomic sequences from samples at wastewater treatment plants versus the local manholes utilized in this study would determine whether the time between viral shedding in waste and collection impacts viral RNA integrity, which would limit high quality full genome sequencing to building or neighborhood wastewater samples. 
Rachel Spurbeck: Conceptualization, Methodology, Wastewater processing, Amplicon sequencing, RNA sequencing, Writing-Original Draft Preparation. Lindsay Catlin: RNA extraction, Quantitative PCR, RNA sequencing, Amplicon sequencing, Writing-Reviewing and Editing. Angela Minard-Smith: Bioinformatics, Data Visualization, Writing-Reviewing and Editing.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
SARS-CoV-2 in hospital wastewater during outbreak of COVID-19: a review on detection, survival and disinfection technologies
Viromic analysis of wastewater input to a river catchment reveals a diverse assemblage of RNA viruses
First confirmed detection of SARS-CoV-2 in untreated wastewater in Australia: a proof of concept for the wastewater surveillance of COVID-19 in the community
Wastewater-based epidemiology: global collaborative to maximize contributions in the fight against COVID-19
Genome sequencing of sewage detects regionally prevalent SARS-CoV-2 variants
Investigation of Novel SARS-CoV-2 Variant: Variant of Concern 202012/01, Technical Briefing 3
SARS-CoV-2 from faeces to wastewater treatment: what do we know? A review
Emergence of SARS-CoV-2 B.1.1.7 lineage-United States
COVID-19 surveillance in Southeastern Virginia using wastewater-based epidemiology
COVID-19: mitigating transmission via wastewater plumbing systems
Wastewater surveillance for SARS-CoV-2: lessons learnt from recent studies to define future applications
First environmental surveillance for the presence of SARS-CoV-2 RNA in wastewater and river water in Japan
Computational analysis of SARS-CoV-2/COVID-19 surveillance by wastewater-based epidemiology locally and globally: feasibility, economy, opportunities and challenges
Neutralising antibodies in Spike mediated SARS-CoV-2 adaptation
SARS-CoV-2 in wastewater: state of the knowledge and research needs. Sci. Total Environ
Presence of SARS-Coronavirus-2 RNA in sewage and correlation with reported COVID-19 prevalence in the early stage of the epidemic in the Netherlands
Letter to the editor: wastewater-based epidemiology can overcome representativeness and stigma issues related to COVID-19
The University of Arizona Says It Caught a Dorm's Covid-19 Outbreak Before It Started. Its Secret Weapon: Poop
Clade GR and Clade GH Isolates in Asia Show Highest Amount of SNPs
Distinct Patterns of Emergence of SARS-CoV-2 Spike Variants Including N501Y in Clinical Samples in Columbus Ohio
Marine RNA virus quasispecies are distributed throughout the oceans
Mutations on COVID-19 diagnostic targets
Correlation of SARS-CoV-2 RNA in Wastewater With COVID-19 Disease Burden in Sewersheds
Detection of SARS-CoV-2 in raw and treated wastewater in Germany - suitability for COVID-19 surveillance and potential transmission risks
SARS-CoV-2 titers in wastewater are higher than expected from clinically confirmed cases
Prolonged presence of SARS-CoV-2 viral RNA in faecal samples
The authors would like to thank Dennis McIntyre, Great Lakes Environmental Center for sample collection; Angela Tucker, City of Toledo, Division of Environmental Services for help with sample site identification and access to sewers; and Dr. Jennifer Gottschalk, Toledo-Lucas County Public Health Department, and Hospital V for providing COVID-19 case data. We would also like to thank Drs. Jared Schuetter and Trevor Petrel for assistance with critique and editing the manuscript. This work was supported by the National Science Foundation [grant number: 2033137].