key: cord-0737020-yw6tngvy authors: Hilt, Evann E; Boocock, James; Trejo, Marisol; Le, Catherine Q; Guo, Longhua; Zhang, Yi; Sathe, Laila; Arboleda, Valerie A; Yin, Yi; Bloom, Joshua S; Wang, Pin-Chieh; Elmore, Joann G; Kruglyak, Leonid; Shrestha, Lasata; Bakhash, Shah A Mohamed; Lin, Michelle; Xie, Hong; Huang, Meei-Li; Roychoudhury, Pavitra; Greninger, Alexander; Chandrasekaran, Sukantha; Yang, Shangxin; Garner, Omai B title: Retrospective Detection of SARS-CoV-2 in Symptomatic Patients prior to Widespread Diagnostic Testing in Southern California date: 2021-05-03 journal: Clin Infect Dis DOI: 10.1093/cid/ciab360 sha: b55595964443e829c77d7befbb50725667b835c4 doc_id: 737020 cord_uid: yw6tngvy BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused one of the worst pandemics in recent history. Few reports have revealed that SARS-CoV-2 was spreading in the United States as early as the end of January. In this study, we aimed to determine if SARS-CoV-2 had been circulating in the Los Angeles (LA) area at a time when access to diagnostic testing for coronavirus disease 2019 (COVID-19) was severely limited. METHODS: We used a pooling strategy to look for SARS-CoV-2 in remnant respiratory samples submitted for regular respiratory pathogen testing from symptomatic patients from November 2019 to early March 2020. We then performed sequencing on the positive samples. RESULTS: We detected SARS-CoV-2 in 7 specimens from 6 patients, dating back to mid-January. The earliest positive patient, with a sample collected on January 13, 2020 had no relevant travel history but did have a sibling with similar symptoms. Sequencing of these SARS-CoV-2 genomes revealed that the virus was introduced into the LA area from both domestic and international sources as early as January. CONCLUSIONS: We present strong evidence of community spread of SARS-CoV-2 in the LA area well before widespread diagnostic testing was being performed in early 2020. 
These genomic data demonstrate that SARS-CoV-2 was being introduced into Los Angeles County from both international and domestic sources in January 2020.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused one of the worst pandemics in recent history since it first emerged in Wuhan, China, in December 2019 [1, 2]. Almost a year later, the United States surpassed 28 million diagnosed cases of coronavirus disease 2019 (COVID-19) and 500,000 deaths [3]. The ability to quickly test symptomatic individuals for SARS-CoV-2 is critical to fighting the pandemic. It is no secret that the rollout of diagnostic testing in the US was delayed for many reasons. The initial delay was caused by the need for clinical laboratories to submit their testing protocols to the US Food and Drug Administration (FDA) for Emergency Use Authorization (EUA). The other delay was from a lack of understanding of who should be tested [4]. The Centers for Disease Control and Prevention (CDC) testing guidelines of February 1, 2020, only recommended testing for SARS-CoV-2 in patients under investigation (PUI) with lower respiratory tract symptoms or fever, and travel history to China or contact with a symptomatic person with travel to China [5]. By the end of February, the CDC released an expanded definition of PUI that included patients who traveled to areas of high COVID-19 endemicity, including Iran, South Korea, and Italy, and patients admitted with acute respiratory distress syndrome (ARDS) with no known exposure [6]. Even though the PUI definition expanded, the ability to test these patients outside the CDC or local public health labs was extremely limited. There was evidence that SARS-CoV-2 was circulating in the US as early as the middle of January [7-13].
Notably, in Santa Clara County, California, there were two deaths at the beginning of February for which SARS-CoV-2 RNA was detected by real-time PCR testing at the CDC from postmortem tissue specimens [8]. In addition, another study searched electronic medical records from our health system and reported a significant increase in patients presenting with symptoms suggestive of COVID-19 during winter 2019/2020 compared to the previous five years in the Los Angeles (LA) area [14]. All these data suggested that SARS-CoV-2 could have been circulating in LA County earlier than initially realized and prompted us to look retrospectively for missed SARS-CoV-2 in our patient population. We aimed to determine if SARS-CoV-2 had been circulating in the LA area at a time when access to diagnostic testing was severely limited.
The remnant clinical samples were originally collected from UCLA health system patients
We utilized a modified pooling strategy described previously [15]. Testing pools were created by combining 200µL of five samples into one microcentrifuge tube (Figure 1) [16]. If either target (N1 or N2) was detected, then we called the pool "Detected". We set a cycle threshold (Ct) of 45 as the cutoff for calling a detected result in the initial testing of the pools because pooling can cause a three to five cycle increase in Ct value [15]. The 45 Ct threshold would allow us to catch possible individual specimens within a pool that have high Ct values between 35 and 40. Additional inquiry into 25 individual specimens from five indeterminate pools was performed with the ThermoFisher TaqPath COVID-19 Combo Kit. This kit uses the KingFisher Flex Purification system followed by a run on the ABI 7500 to amplify and detect three regions of the SARS-CoV-2 single-stranded RNA genome: the ORF1ab, N, and S genes. We followed the FDA EUA protocol using the Ct threshold cutoff of 37 when individual (de-convoluted) specimens were tested.
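The pooled calling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the laboratory's actual software: the names `PoolResult`, `call_pool`, and `ideal_dilution_shift` are hypothetical, the pool identifiers in the usage lines are made up, and treating the cutoff as inclusive (Ct <= 45) is an assumption, since the text does not say. The helper `ideal_dilution_shift` gives the idealized Ct increase expected when one positive sample is diluted among five, assuming perfect doubling per cycle; the three to five cycle increase cited in the text is larger than this idealized value.

```python
from dataclasses import dataclass
from math import log2
from typing import Optional

POOL_SIZE = 5           # five samples per pool, as stated in the text
POOL_CT_CUTOFF = 45.0   # Ct cutoff used for pooled testing, as stated in the text


@dataclass
class PoolResult:
    """Hypothetical record of one pooled RT-PCR run (not from the paper)."""
    pool_id: str
    n1_ct: Optional[float]  # None means the N1 target did not amplify
    n2_ct: Optional[float]  # None means the N2 target did not amplify


def target_detected(ct: Optional[float], cutoff: float = POOL_CT_CUTOFF) -> bool:
    # Assumption: a target counts as detected when it amplified at or below the cutoff.
    return ct is not None and ct <= cutoff


def call_pool(result: PoolResult) -> str:
    # Rule from the text: detection of either target (N1 or N2) makes the pool "Detected".
    if target_detected(result.n1_ct) or target_detected(result.n2_ct):
        return "Detected"
    return "Not detected"


def ideal_dilution_shift(pool_size: int = POOL_SIZE) -> float:
    # Idealized Ct increase for a 1-in-pool_size dilution with perfect PCR efficiency.
    return log2(pool_size)


if __name__ == "__main__":
    print(call_pool(PoolResult("example-1", 38.0, None)))   # one target amplified
    print(call_pool(PoolResult("example-2", None, None)))   # neither target amplified
    print(round(ideal_dilution_shift(), 2))
```

A pool called "Detected" under this rule is then de-convoluted, and each of its five specimens is re-extracted and tested individually against the stricter individual-specimen cutoff.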
We sequenced four SARS-CoV-2 genomes utilizing the commercial kit from New England Biolabs (NEBNext® Ultra™ II RNA) and Illumina NextSeq 500 technology, as explained in depth previously [17]. In brief, extracted RNA from the MagNA Pure (as described above) was processed following the NEBNext® Ultra™ II RNA protocol. The library from this protocol was sequenced on the Illumina NextSeq 500 sequencing system. We mapped the reads from each library to a composite reference genome consisting of human (hg38) and SARS-CoV-2 (NC_045512) using BWA-MEM [18]. For the NEB libraries, PCR duplicates were removed using MarkDuplicates from the Picard tool suite [19]. We calculated the number of reads that mapped to human rRNA, other regions of the human genome, and SARS-CoV-2 before and after deduplication. We assigned lineages to these four genomes according to a proposed nomenclature with Pangolin [20].
Confirmation testing was performed on one sample (P404-D) at the University of Washington (UW). In brief, RNA was extracted on a Qiagen Bio Robot using the QIAamp Virus Bio Robot MDx Kit following the manufacturer's instructions. A total sample volume of 140µL was extracted and eluted to 100µL, and three extraction replicates were pooled to a total volume of 300µL. Extracted RNA was then concentrated to 100µL using the Savant SpeedVac DNA 130 at a "no" temperature setting for 5 hours. RT-PCR was performed using the AgPath-ID One-Step RT-PCR kit with CDC primers and probes for N1/N2 following the protocol and concentrations defined in Corman et al [21]. PCR was performed on both
Using 11µL of extracted RNA, single-strand complementary DNA (sscDNA) was synthesized using the SuperScript IV First-Strand Synthesis System. Sequencing libraries were prepared using the Swift Normalase Amplicon Panel (SNAP) for SARS-CoV-2 as described previously [22].
Libraries were quantified using fluorometric methods (Quant-iT dsDNA High Sensitivity Kit), normalized, and sequenced on the Illumina NextSeq 500. Raw reads were processed using a custom bioinformatics pipeline (https://github.com/greninger-lab/covid_swift_pipeline) described previously [22].
A total of 2321 remnant samples from 1985 patients were screened in 465 pools (Figure 2). The majority of these remnant samples were nasopharyngeal swabs (1682/2321, 72%) that were previously submitted for either Flu A/B/RSV or RPP testing. Based on the pooled testing criteria, we detected SARS-CoV-2 in 8/274 (3%) pools in Group A and in three pools in Group B (Figure 2). We took the 11 SARS-CoV-2 detected pools and de-convoluted them into the 55 individual samples, re-extracting and testing each sample individually. The deconvolution of these SARS-CoV-2 detected pools resulted in the detection of SARS-CoV-2 in 6 of the 55 samples. In deconvolution testing, we saw a decrease of four to five cycles in Ct value, consistent with our control samples. There were five pools in which we detected SARS-CoV-2 in the initial pool testing (P248, P253, P286, P398, P404) but whose individual specimens were negative in the deconvolution testing (Table 1). We decided to re-extract and re-test the 25 individual specimens within these five indeterminate pools using another SARS-CoV-2 PCR assay (Supplementary Table 1). There was one individual specimen (P404-D) for which one of the three targets crossed the cycle threshold just above the cutoff of 37 (N gene Ct=38.2). According to the FDA EUA protocol for this assay, this result is considered negative. However, we deemed this result inconclusive since the pool was positive for both targets in the pool testing (Table 1) and the specimen was positive for 1/3 of the targets in the de-convoluted testing (Supplementary Table 1).
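The screening counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only figures quoted in the text. The variable names and the `pct` helper are illustrative, and rounding percentages to whole numbers is an assumption about how the reported values were derived.

```python
from math import ceil

POOL_SIZE = 5                  # samples per pool, as stated in the text

total_samples = 2321           # remnant samples screened
total_pools = 465              # pools screened
np_swabs = 1682                # nasopharyngeal swabs among the samples
group_a_detected, group_a_pools = 8, 274
group_b_detected = 3
indeterminate_pools = 5        # detected pools with no positive individual specimen


def pct(numerator: int, denominator: int) -> int:
    # Whole-number percentage (rounding convention is an assumption).
    return round(100 * numerator / denominator)


detected_pools = group_a_detected + group_b_detected

# 2321 samples in pools of five need at least ceil(2321 / 5) pools.
print(ceil(total_samples / POOL_SIZE), "pools needed;", total_pools, "reported")
print(pct(np_swabs, total_samples), "% nasopharyngeal swabs")
print(pct(group_a_detected, group_a_pools), "% of Group A pools detected")
print(detected_pools * POOL_SIZE, "specimens to de-convolute")
print(indeterminate_pools * POOL_SIZE, "specimens re-tested on the second assay")
```

Each derived figure (465 pools, 72%, 3%, 55 de-convoluted specimens, 25 re-tested specimens) agrees with the values stated in the text.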
Confirmation Testing of the January 2020 Sample
One specimen (P404-D) provided inconclusive results with our in-house testing methods. This sample was collected on January 13, 2020, and given its high Ct values, sequencing was expected to be challenging. We sent the sample to the Clinical Virology Laboratory at UW, where a highly sensitive amplicon sequencing protocol was performed [22]. Although a full genome was not obtained, 378 high-quality reads were detected that mapped to the SARS-CoV-2 genome (Table 2).
The seven individual positive SARS-CoV-2 specimens came from six individual patients (Table 1). We performed an in-depth chart review for these six patients to determine whether they had a COVID test and/or any relevant travel or social history documented in their medical records. The date range for when these seven individual specimens were submitted for original testing was January 13, 2020 to March 28, 2020. Five of the seven individual specimens were from four individuals and were submitted prior to the implementation of our in-house SARS-CoV-2 PCR testing on March 10, 2020 (Table 1).
For four of these six positive patients, we acquired the full SARS-CoV-2 genomes directly from the extracted RNA using an NEB sequencing approach described previously (Table 1) [17]. The viral lineages were described previously [20] and each has a destination description that contains the regions from which it is derived [23]. We looked specifically at two of the genomes, from individuals who were never tested for the presence of SARS-CoV-2 and who had a relevant recent travel history (P113-B and P224-B). Patient P113-B had just returned from Seattle, WA, and carried a USA lineage (Lineage B.1.43) that is very closely related to other published genomes from Seattle, WA, from the same time period.
The genome from Patient P224-B was assigned the B.4 lineage, which is specifically tied to an outbreak in Iran that occurred earlier in 2020 [24]. This individual had just returned from Iran three days prior to seeking medical care. These genomic data demonstrate that SARS-CoV-2 was being introduced into LA County from both domestic and international sources. For the earliest patient (P404-D, January 13, 2020), the sample was weakly positive (Ct>35), and fewer than 400 sequence reads were acquired, which covered 16.5% of the viral genome (Supplementary Figure 1). Despite this, we detected one high-confidence SNP, C1059T (18X depth and 100% allele frequency), which caused a T265I mutation in the ORF1ab gene. This mutation emerged in the US in early 2020 and was unique to the US [25].
Here we present the first evidence of community circulation of SARS-CoV-2 in the LA area as early as January 2020, well before widespread diagnostic testing was being performed. These data highlight the missed window of opportunity to implement widespread diagnostic testing instead of sticking with an epidemiologic-based testing approach in the early phase of the outbreak in the US. Widespread diagnostic testing early on could have helped to slow the spread of the virus. We identified two individuals who had recently traveled to areas of known endemicity toward the end of February 2020 but did not meet the CDC guidelines for SARS-CoV-2 testing [6]; therefore, their samples were not eligible to be sent to the CDC for testing. Our second earliest patient (Patient P113-B, February 21, 2020) traveled from Seattle, WA, where the first case of COVID-19 with documented community transmission was detected through the Seattle Flu Study in a specimen collected February 24, 2020 [26, 27]. Our data demonstrate that SARS-CoV-2 was detected in a patient who had returned from Seattle with symptoms three days prior to the collection of this Seattle Flu Study specimen.
In addition, we have evidence that another patient (P224-B) with travel history abroad to Iran was infected with the B.4 lineage, which is directly tied to an outbreak in Iran [24]. These two individuals with relevant travel history were among the four individuals who presented to our health care system with symptoms prior to the implementation of in-house testing and were never tested for COVID-19. These two patients did not appear to have an advanced state of disease that required hospitalization. However, they were evaluated and sent home with a diagnosis of minor upper respiratory infection (URI) and no instructions on self-isolation or quarantine since they did not meet the CDC guidelines. More importantly, the third individual, with no relevant social or travel history, was admitted to the hospital in the psychiatric ward and potentially exposed many individuals on the hospital staff, as well as other patients.
Our pooling strategy to screen for SARS-CoV-2 is consistent with the literature that utilized pooling of specimens [15, 28, 29]. Our data show a gain of four to five Ct values when positive samples are pooled with negative samples. Four of the five pools that were de-convoluted without yielding a positive individual sample can be explained by the fact that only one target (either N1 or N2) was positive during the initial pooled testing. This single-target positive would have been reported as inconclusive according to the CDC protocol [16]. A limitation of this work is that we relied solely on the medical records to gather relevant travel and social history. Prior to this pandemic, physicians usually did not adequately document full travel and social history in medical records. We were not able to contact these six patients to gather any additional information.
One pool (P404) was positive for both targets (N1 and N2) during the initial pooled testing and had one individual specimen (P404-D) test positive for one of three targets, with a Ct above the cutoff of 37 for the ThermoFisher COVID-19 Combo Kit assay. Confirmatory testing at a different facility showed an additional positive PCR for one of two targets and detection of viral reads via sequencing (Table 2). These data point to a positive result and suggest that this patient was infected with SARS-CoV-2 as early as January 2020. This would be consistent with published data confirming cases of SARS-CoV-2 in the middle of January 2020 in the US [10, 11]. However, these previously reported and confirmed cases in January 2020 had a clear relevant travel history, unlike our patient, who had no relevant travel history. The reason for such high Ct values in all PCRs may be that the sample was collected 2-3 weeks after symptom onset and was obtained via nasal washing rather than nasopharyngeal swab [30]. Notably, this sample was also positive for rhinovirus. These results suggest that community spread was occurring in the LA area before the middle of January 2020. Consistent with this, a serological study of blood donors detected SARS-CoV-2 antibodies in December 2019, particularly on the west coast [31].
Our clinical microbiology laboratory at UCLA was one of the first in Southern California to initiate in-house testing for SARS-CoV-2 in early March 2020. We had the tools and ability to bring on the test and scale up rapidly. However, our providers and laboratories were restricted from testing patients based on the CDC guidelines, which recommended against testing symptomatic patients without specific travel history. Here we present strong evidence that SARS-CoV-2 was already circulating in the LA area in January 2020, at a time when access to diagnostic testing was severely limited.
Our data suggest that in hindsight, testing guidelines should have been expanded to include all patients with influenza-like illness as early as January 2020.
None of the authors have any potential conflicts.
References
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A new coronavirus associated with human respiratory disease in China
An interactive web-based dashboard to track COVID-19 in real time
Diagnostic Testing for Severe Acute Respiratory Syndrome-Related Coronavirus 2: A Narrative Review
HAN00427-Update and Interim Guidance on Outbreak of 2019 Novel Coronavirus (2019-nCoV). Available at
Update and Interim Guidance on Outbreak of Coronavirus Disease 2019 (COVID-19). Available at
Severe Acute Respiratory Syndrome Coronavirus 2 from Patient with Coronavirus Disease, United States
Evidence for Limited Early Spread of COVID-19 Within the United States
Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States
Clinical and virologic characteristics of the first 12 patients with coronavirus disease 2019 (COVID-19) in the United States
Rapid Sentinel Surveillance for COVID-19
Community Prevalence of SARS-CoV-2 Among Patients With Influenzalike Illnesses Presenting to a Los Angeles Medical Center
Excess Patient Visits for Cough and Pulmonary Disease at a Large US Health System in the Months Prior to the COVID-19 Pandemic: Time-Series Analysis
CDC 2019-novel coronavirus (2019-nCoV) real-time RT-PCR diagnostic panel for emergency use only instructions for use
Rapid cost-effective viral genome sequencing by Vseq
Aligning sequence reads, clone sequences and assembly contigs with
Broad Institute, GitHub Repository: Broad Institute
A dynamic nomenclature proposal for SARS-CoV-2 to assist genomic epidemiology
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
Sensitive Recovery of Complete SARS-CoV-2 Genomes from Clinical Samples by Use of Swift Biosciences' SARS-CoV-2 Multiplex Amplicon Sequencing Panel
SARS-CoV-2 lineages: Lineage descriptions
An emergent clade of SARS-CoV-2 linked to returned travellers from Iran
Mutational spectra of SARS-CoV-2 orf1ab polyprotein and signature mutations in the United States of America
Early Detection of Covid-19 through a Citywide Pandemic Surveillance Platform
Cryptic transmission of SARS-CoV-2 in Washington state
Pooling of SARS-CoV-2 samples to increase molecular testing throughput
Pooled RNA sample reverse transcriptase real time PCR assay for SARS CoV-2 infection: A reliable, faster and economical method
Evaluation of Nasopharyngeal Swab Collection Techniques for Nucleic Acid Recovery and Participant Experience: Recommendations for COVID-19 Diagnostics
Serologic testing of U.S. blood donations to identify SARS-CoV-2-reactive antibodies