key: cord-0909872-72v12ftb
authors: Wylezich, Claudia; Calvelage, Sten; Schlottau, Kore; Ziegler, Ute; Pohlmann, Anne; Höper, Dirk; Beer, Martin
title: Next-generation diagnostics: virus capture facilitates a sensitive viral diagnosis for epizootic and zoonotic pathogens including SARS-CoV-2
date: 2020-07-01
journal: bioRxiv
DOI: 10.1101/2020.06.30.181446
sha: 0cd119a26d713dc8409bfd369da6a08e08d629fe
doc_id: 909872
cord_uid: 72v12ftb

Background The detection of pathogens in clinical and environmental samples using high-throughput sequencing (HTS) is often hampered by large amounts of background information, which is especially true for viruses with small genomes. Enormous sequencing depth can be necessary to compile sufficient information for the identification of a given pathogen. Combining generic HTS with in-solution capture enrichment can markedly increase the sensitivity of virus detection in complex diagnostic samples.

Methods A virus panel based on the principle of biotinylated RNA baits was developed for the specific capture enrichment of epizootic and zoonotic viruses (VirBaits). The VirBaits set was supplemented with a predesigned SARS-CoV-2 bait set for testing recent SARS-CoV-2-positive samples. Libraries generated from complex samples were sequenced via generic HTS and afterwards enriched with the VirBaits set. For validation, an internal proficiency test for emerging epizootic and zoonotic viruses (African swine fever virus, Ebolavirus, Marburgvirus, Nipah henipavirus, Rift Valley fever virus) was conducted.

Results The VirBaits set consists of 177,471 RNA baits (80-mer) based on about 18,800 complete viral genomes and targets 35 epizootic and zoonotic viruses. In all tested samples, viruses with both DNA and RNA genomes were clearly enriched, by about 10-fold to 10,000-fold. This included distantly related viruses with at least 72% overall identity to viruses represented in the bait set.
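As a rough illustration of how a fold enrichment such as the 10-fold to 10,000-fold range above can be quantified, the following Python sketch compares the fraction of virus reads before and after capture. The definition used here (ratio of read fractions), the function name `fold_enrichment` and the read counts are assumptions made for illustration; they are not taken from the study.

```python
def fold_enrichment(target_enriched, total_enriched,
                    target_untargeted, total_untargeted):
    """Ratio of the virus read fraction after capture to the fraction before.

    All arguments are read counts. This definition is an assumption for
    illustration, not necessarily the one used by the authors.
    """
    if total_enriched <= 0 or total_untargeted <= 0:
        raise ValueError("total read counts must be positive")
    if target_untargeted <= 0:
        raise ValueError("no target reads before capture: ratio is undefined")
    # (target_enriched / total_enriched) / (target_untargeted / total_untargeted),
    # rearranged so that only a single division is performed.
    return (target_enriched * total_untargeted) / (target_untargeted * total_enriched)


# Hypothetical counts: 50 virus reads per million before capture,
# 50,000 virus reads per million after capture.
print(fold_enrichment(50_000, 1_000_000, 50, 1_000_000))  # 1000.0
```

Under this definition, a virus that yields no reads at all in the untargeted dataset has no defined enrichment factor, which is why the sketch raises an error in that case instead of returning a number.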
Viruses showing a lower overall identity (38% and 46%) to them were not enriched but could nonetheless be detected based on the capture of conserved genome regions. The internal proficiency test supports the improved virus detection achieved by combining HTS with targeted enrichment but also points to the risk of carryover between samples.

Conclusions The VirBaits approach showed a high diagnostic performance, also for distantly related viruses. The bait set is modular and expandable according to the favored diagnostics, health sector or research question. The risk of carryover needs to be taken into consideration. The application of the RNA baits principle turned out to be user-friendly, and even non-experts (without sophisticated bioinformatics skills) can easily use the VirBaits workflow. The established VirBaits set can be extended rapidly and without difficulty in response to current outbreak events, as shown for SARS-CoV-2.

Background Disease control encompasses several fields, such as powerful diagnostics for early detection, efficient therapy and disease prophylaxis, for example vaccination. The present study focused on broad and powerful diagnostics for viral pathogens including evolved, newly emerging or unrecognized viruses. The detection of the latter can be challenging for conventional virus diagnostics relying on specific quantitative PCR assays. Sequence information for such newly emerging pathogens is often scarce or nonexistent.
In addition, PCR-based diagnosis of co-infections with several pathogens is complicated, can be time-consuming and expensive, and co-infections can therefore easily be overlooked. Moreover, pathogen detection in immunocompromised patients might be hampered by a deviating course of the infection or its ectopic location [1]. For all the mentioned challenges, untargeted metagenomics using high-throughput sequencing (HTS) offers a swift and broad solution. It also enables the simultaneous detection of the genetic information of pathogens of all taxa including viruses, bacteria and eukaryotic pathogens like parasites or fungi [2]. However, poor sample quality, low pathogen loads and high levels of background consisting of nucleic acids of the host or accompanying bacteria often lead to an unfavorable pathogen/background nucleic acid ratio in the final sequencing libraries [3]. Analyses of the resulting sequence datasets can be laborious, time-consuming and difficult to interpret, even for experts. This is especially true for pathogenic viruses with rather small RNA genomes that can get lost in datasets generated from complex samples. Extremely large sequence datasets and intensive, time-consuming analyses would be necessary to compile enough meaningful information to finalize the genome of a given pathogen [4]. This significantly impedes pathogen detection, causing a diagnostic gap and making HTS difficult to implement in the daily routine of diagnostic laboratories.

The enrichment of certain pathogens prior to HTS using capture enrichment methods can help to increase the target sequence information considerably, along with avoiding extensive sequencing efforts, as recently reviewed by Gaudin

For the purpose of improved virus diagnostics, we used RNA baits for an in-solution capture assay with subsequent HTS (Fig.
1) The performance of the VirBaits set was tested for selected viruses, including viruses with RNA and DNA genomes. We mainly used infected sample material from real cases that had already been sequenced for diagnostic purposes, but also a spiked salmon sample from a laboratory proficiency test (Table 2) Table 4) only declaring sample number and type (tissue sample with Trizol or already extracted RNA), whereas the sample processor (C.W.) did not receive any information on the samples such as host, virus, kind of organ or tissue, or pre-diagnosis.

The aim of the present study was the design and testing of a virus enrichment panel whose application markedly enhances the virus signal in diagnostic metagenomics by targeted capture enrichment. In general, the VirBaits approach with custom-designed kits using 80-mer RNA probes was easy to apply for our purpose by just compiling the relevant virus genomes and providing them to the supplier. In our opinion, it is applicable for diagnosticians who need bait sets to capture specific pathogens but lack the bioinformatics skills to translate the underlying genomes into meaningful oligonucleotides.

As a basis for this HTS plus capture enrichment approach, we used a generic HTS workflow (Fig. 3, Table 4) were intending to design, improving diagnostic sensitivity but not necessarily leading to the assembly of whole genomes. Typically, one specific read in a sequence dataset indicating a pathogenic virus would be enough to raise suspicion and trigger follow-up analyses. However, as shown here, the tiling density used allows in many cases for the assembly of nearly complete genomes, emphasizing the sensitivity of the approach. In addition, the detected SNVs of RABV and KBLV genomes point to equal frequencies (Fig. 4) Table 2).
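To illustrate what tiling 80-mer baits along a virus genome means in practice, here is a minimal Python sketch. Only the bait length of 80 nt comes from the text; the step size of 40 nt, the function name `tile_baits` and the toy sequence are hypothetical, and the actual bait design procedure and tiling density are not described in this section.

```python
def tile_baits(genome, bait_len=80, step=40):
    """Return (start, sequence) pairs of fixed-length baits tiled along a genome.

    step=40 is a hypothetical value giving roughly 2x coverage of each
    position; it is not the tiling density used in the study.
    """
    if bait_len <= 0 or step <= 0:
        raise ValueError("bait_len and step must be positive")
    genome = genome.upper()
    last_start = len(genome) - bait_len
    if last_start < 0:
        return []  # genome is shorter than a single bait
    starts = list(range(0, last_start + 1, step))
    # Add a final bait so that the 3' end of the genome is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, genome[s:s + bait_len]) for s in starts]


# Toy 210-nt sequence (hypothetical, not a real virus genome).
toy_genome = ("ACGT" * 53)[:210]
baits = tile_baits(toy_genome)
print([start for start, _ in baits])  # [0, 40, 80, 120, 130]
```

A smaller step yields more baits per genome and more overlapping coverage of each position, at the cost of a larger bait set. A real design over thousands of genomes would also need to collapse identical or near-identical baits, which this sketch does not attempt.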
In sample X3, some reads of the NiV were found while they were not supposed to be in this

Availability of data and materials The bait set generated and tested in this study, including the GenBank accession numbers of the respective virus genomes, is provided in Additional file 1 (fasta format).

given for the dataset processed with the untargeted workflow (grey bars) and for the dataset after VirBaits treatment (white bars). Four and two point mutations deviating from the consensus were detected along the RABV genome (at positions 2,926, 7,419, 11,245 and 11,790) and the KBLV genome (at positions 2,547 and 6,172), respectively, by applying a strand bias threshold of less than 70%.

Lethal invasive cestodiasis in immunosuppressed patients
A Versatile Sample Processing Workflow for Metagenomic Pathogen Detection
Hybrid capture-based next generation sequencing and its application to human infectious diseases
Metagenomic approaches to identifying infectious agents
Specific capture and whole-genome sequencing of viruses from clinical samples
Comparative analysis of whole-genome sequence of African swine fever virus Belgium 2018/1
Metavirome sequencing to evaluate Norovirus diversity in sewage and related bioaccumulated oysters
Rapid whole-genome sequencing of Mycobacterium tuberculosis isolates directly from clinical samples
Hybrid selection for sequencing pathogen genomes from clinical samples
RNA enrichment method for quantitative transcriptional analysis of pathogens in vivo applied to the fungus Candida albicans
Enhanced virome sequencing using targeted sequence capture
Virome capture sequencing enables sensitive viral diagnosis and comprehensive virome analysis
Capturing sequence diversity in metagenomes with comprehensive and scalable probe design
RNA:DNA hybrids are more stable than DNA:DNA duplexes in concentrated perchlorate and trichloroacetate solutions
Relative thermodynamic stability of DNA, RNA, and
DNA:RNA hybrid duplexes: relationship with base composition and structure
A pneumonia outbreak associated with a new coronavirus of probable bat origin
ViralZone: a knowledge resource to understand virus diversity
RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets
Novel picornavirus in lambs with severe encephalomyelitis
A novel alphaherpesvirus associated with fatal diseases in banded Penguins
A novel astrovirus associated with encephalitis and ganglionitis in domestic sheep
A variegated squirrel bornavirus associated with fatal human encephalitis
West Nile virus epidemic in Germany triggered by epizootic emergence
Target-enrichment strategies for next-generation sequencing
The paradox of HBV evolution as revealed from a 16th century mummy
Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics
Riems influenza a typing array (RITA): An RT-qPCR-based low density array for subtyping avian and mammalian influenza a viruses
West Nile virus epizootic in Germany
Proficiency testing of virus diagnostics based on bioinformatics analysis of simulated in silico high-throughput sequencing datasets
Proteogenomics uncovers critical elements of host response in bovine soft palate epithelial cells following in vitro infection with Foot-And-Mouth Disease Virus. Viruses
Experimental transmission studies of SARS-CoV-2 in fruit bats, ferrets, pigs and chicken
Application of shotgun metagenomics to smoked salmon experimentally spiked: Comparison between sequencing and microbiological data using different bioinformatic approaches

The authors declare that they have no competing interests.