key: cord-1015051-hsht7fg5
authors: Crawford, Dana C.; Williams, Scott M.
title: Global variation in sequencing impedes SARS-CoV-2 surveillance
date: 2021-07-15
journal: PLoS Genet
DOI: 10.1371/journal.pgen.1009620
sha: 1576752e5c448e9204ea96b9685b8819aa5faec8
doc_id: 1015051
cord_uid: hsht7fg5

Funding: This publication was made possible by the Clinical and Translational Science Collaborative of Cleveland (to DCC), UL1TR002548 from the National Center for Advancing Translational Sciences (NCATS) component of the National Institutes of Health and NIH Roadmap for Medical Research (to DCC). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: SMW is a section editor of PGEN.

respect to public health. With the real-time evolution of SARS-CoV-2 and the resulting impact on disease transmission and disease severity, this bias has created a gaping hole in our understanding of the trajectory of COVID-19. Recently, this has been at least partially addressed by new initiatives and funding through programs such as the National SARS-CoV-2 Strain Surveillance (NS3) system. Initiated in November 2020, this program has, as of early 2021, partnered the CDC with state health departments to process and sequence 750 samples per week and with commercial diagnostic labs to sequence 6,000 samples per week [12]. The CDC is also working with 7 universities to conduct genomic surveillance research. In addition, the CDC-led SARS-CoV-2 Sequencing for Public Health Emergency Response, Epidemiology, and Surveillance (SPHERES) consortium has been developed to coordinate among more than 160 institutions active in SARS-CoV-2 sequencing.
In stark contrast to the relatively uncoordinated and antiquated approach to SARS-CoV-2 genomic surveillance in America, successful public health sequencing surveillance programs outside the US embraced genomic technology early in the pandemic. China initiated this early by producing the first SARS-CoV-2 sequence [13], whose widespread public dissemination enabled near real-time worldwide sequence comparisons and the unprecedentedly rapid development of successful vaccines. The UK, ranked ninth in SARS-CoV-2 sequencing as of late January 2021, formed the COVID-19 Genomics UK (COG-UK) Consortium in April 2020 and has since sequenced more than 200,000 viral genomes, representing approximately 6% of reported COVID-19 cases in the UK [2, 14]. As of April 2021, the UK had climbed to fifth in the rankings, with more than 8% of cases having been sequenced [4]. The COG-UK Consortium is financially supported by the Department of Health and Social Care, UK Research and Innovation, Wellcome, and the Wellcome Sanger Institute, and it includes several public health groups, universities, and others as its scientific partners [14]. Many of the same groups are core supporters of the UK Biobank, a cohort of 500,000 participants with genome-wide and health data available as a major worldwide research resource for health outcomes of interest, now including COVID-19 [15]. Meanwhile, the US had sequenced less than 0.5% of confirmed cases [2], with plans to ramp up sequencing [6, 16], but this effort coalesced only a year after the first US case of COVID-19 was confirmed in Washington State [17]. Although finances and limited supplies represented key impediments early on in the US, this is not the case at present [4]. The CDC has recently committed more than $200 million to enhance sequencing.
Rather, the major issue now may be that the US does not have an organized, ongoing population-based research cohort that can be leveraged for COVID-19 studies, genetic or otherwise, forcing investigators to scramble to form ad hoc consortia for the collection of data from electronic health records [18] or to augment existing public [19] and private [20] genomic collections with COVID-19 data. Data access is siloed, and samples are held (or discarded [4]) by a plethora of disconnected labs, both public and private. This balkanization of the public health and testing efforts has not only slowed the process; it has substantially increased expenses. The White House recently announced a $1 billion influx to increase sequencing capacity [21]. In comparison, the UK has had 2 major influxes of money into SARS-CoV-2 sequencing efforts: 20 million pounds in March 2020, which produced more than 200,000 sequences [22], and an additional 12 million pounds to produce sequence data from at least 20,000 cases per week. The results are clear: as of March 2021, the UK has generated approximately 40% of the SARS-CoV-2 sequences contributed to the global surveillance effort [23] for a fraction of the investment expected in the US. Direct comparisons between successful non-US SARS-CoV-2 sequencing surveillance efforts and the US efforts are difficult and somewhat unfair, given that the federal response to the pandemic was initiated under an administration that has since been replaced. Also, the American healthcare system and associated governmental agencies are mostly patchwork and disparate. The CDC, part of the US Department of Health and Human Services (HHS), typically leads disease surveillance and works in conjunction with other HHS agencies, such as the Indian Health Service, as well as public health agencies organized at the state level. The latter rely primarily on healthcare organizations for data on reportable diseases.
Financial and technical resources at the state and local level can vary substantially, explaining in part why Washington State has sequenced 4.84% of its confirmed cases compared with just 0.45% in Ohio. The difference in sequencing observed between Washington State and practically every other US state may also be due to both the history of SARS-CoV-2 in the US and the existing public health genomics research activities [17, 24-27]. Given that SARS-CoV-2 is a novel zoonotic disease with no prior human infections, sequencing and analysis inform both the trajectory of the outbreak and its evolution [28-30]. The opening of a new niche for the evolution of the virus makes tracking human-borne mutations critical to our surveillance and control, as many of these mutations may not have been beneficial to the virus in other hosts and hence would not have survived earlier. This is of particular importance in areas with high incidence. For example, even though, as of this writing, the rate of infection is waning in the US, due in part to vaccinations, it is raging in other parts of the world, such as India, where there is little to no access to vaccines. The current crisis in India and the past year's tragedy in America have created two of the largest viral populations in the world that can mutate into more transmissible [31] and more severe [32, 33] versions of the original virus [34]. The emergence of B.1.1.7, B.1.351 [35], and P.1 [36], among others, is a reminder that investments in SARS-CoV-2 genomics need to continue and be expanded, as other variants are probably not far behind given the worldwide variability in vaccination rates and adherence to COVID-19 precautions. Even though the US has set into motion funding and efforts to correct for its initial dearth of sequencing, the pipeline both in the US and globally will require additional and sustained support as the pandemic moves from locale to locale.
Apart from increased capacity for sequencing and analysis, provisions are also sorely needed to link genetic data to clinical and epidemiological data sources for public health research. These critical data linkages remain problematic in the US and resource-limited countries, but they are essential [37, 38]. For countries with adequate resources, increased sequencing capacity and the development of informatics and bioinformatics pipelines and workflows need to be adapted and adopted via international efforts. The pandemic is an evolving phenomenon, requiring worldwide genomic expertise and technology as part of effective SARS-CoV-2 surveillance. When linked to clinical and epidemiological data, the same expertise will help in understanding the factors relevant to variable host susceptibility and response to infection pre- or postvaccination, independent of and interacting with the genetic code of the evolving virus that knows no zip code or international boundaries.

[1] The Past, Present, and Future of Public Health Surveillance
[2] United States rushes to fill void in viral sequencing
[3] Why America is 'Flying Blind' to Mutations. Washington Post
[4] Why US coronavirus tracking can't keep up with concerning variants
[5] The COVID Tracking Project: The Atlantic
[6] Genomic Surveillance for SARS-CoV-2 Variants
[7] COVID-19
[8] Want to track pandemic variants faster? Fix the bioinformatics bottleneck
[9] Danish scientists see tough times ahead as variant rises
[10] Million Veteran Program: A mega-biobank to study genetic influences on health and disease
[11] Potentially Preventable Deaths from the Five Leading Causes of Death
[12] How the US Failed to Prioritize SARS-CoV-2 Variant Surveillance
[13] A new coronavirus associated with human respiratory disease in China
[14] COVID-19 Genomics UK (COG-UK) Consortium 2020
[15] Dynamic linkage of COVID-19 test results between Public Health England's Second Generation Surveillance System and UK
[16] Multitude of coronavirus variants found in the US-but the threat is unclear
[17] First Case of 2019 Novel Coronavirus in the United States
[18] The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction
[19] All of Us Research Program launches COVID-19 research initiatives
[20] Trans-ancestry analysis reveals genetic and nongenetic associations with COVID-19 susceptibility and severity
[21] The White House
[22] Tracking the UK SARS-CoV-2 outbreak
[23] UK variant hunters lead global race to stay ahead of COVID
[24] Evaluating Specimen Quality and Results from a Community-wide, Home-Based Respiratory Surveillance Study
[25] 'It's Just Everywhere Already': How Delays in Testing Set Back the U.S. Coronavirus Response. The New York Times
[26] Early Detection of Covid-19 through a Citywide Pandemic Surveillance Platform
[27] Seattle Coronavirus Assessment Network (SCAN). The SCAN Dashboard 2021
[28] Cryptic transmission of SARS-CoV-2 in Washington state
[29] Viral genomes reveal patterns of the SARS-CoV-2 outbreak in Washington State
[30] Establishment and lineage dynamics of the SARS-CoV-2 epidemic in the UK
[31] Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England
[32] Genetic mechanisms of critical illness in COVID-19
[33] Risk of mortality in patients infected with SARS-CoV-2 variant of concern 202012/1: matched cohort study
[34] SARS-CoV-2 Viral Variants-Tackling a Moving Target
[35] Detection of a SARS-CoV-2 variant of concern in South Africa
[36] Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil. Science. 2021:eabh2644
[37] SARS-CoV-2 Variants of Concern in the United States-Challenges and Opportunities
[38] Tracking COVID-19 Variants Act, House of Representatives