key: cord-0724066-4wlxbbh4 authors: Van Borm, Steven; Belák, Sándor; Freimanis, Graham; Fusaro, Alice; Granberg, Fredrik; Höper, Dirk; King, Donald P.; Monne, Isabella; Orton, Richard; Rosseel, Toon title: Next-Generation Sequencing in Veterinary Medicine: How Can the Massive Amount of Information Arising from High-Throughput Technologies Improve Diagnosis, Control, and Management of Infectious Diseases? date: 2014-08-11 journal: Veterinary Infection Biology: Molecular Diagnostics and High-Throughput Strategies DOI: 10.1007/978-1-4939-2004-4_30 sha: 2f4eea79924fb1aacc3bc3cb22b75064a0cf693d doc_id: 724066 cord_uid: 4wlxbbh4 The development of high-throughput molecular technologies and associated bioinformatics has dramatically changed the capacities of scientists to produce, handle, and analyze large amounts of genomic, transcriptomic, and proteomic data. A clear example of this step-change is represented by the amount of DNA sequence data that can be now produced using next-generation sequencing (NGS) platforms. Similarly, recent improvements in protein and peptide separation efficiencies and highly accurate mass spectrometry have promoted the identification and quantification of proteins in a given sample. These advancements in biotechnology have increasingly been applied to the study of animal infectious diseases and are beginning to revolutionize the way that biological and evolutionary processes can be studied at the molecular level. Studies have demonstrated the value of NGS technologies for molecular characterization, ranging from metagenomic characterization of unknown pathogens or microbial communities to molecular epidemiology and evolution of viral quasispecies. Moreover, high-throughput technologies now allow detailed studies of host-pathogen interactions at the level of their genomes (genomics), transcriptomes (transcriptomics), or proteomes (proteomics). 
Ultimately, the interaction between pathogen and host biological networks can be interrogated by analytically integrating these levels (integrative OMICS and systems biology). The application of high-throughput biotechnology platforms in these fields and their typically low cost per information content have revolutionized the resolution with which these processes can now be studied. The aim of this chapter is to provide a current and prospective view on the opportunities and challenges associated with the application of massive parallel sequencing technologies to veterinary medicine, with particular focus on applications that have a potential impact on disease control and management.

Genetic characterization of infectious agents plays a central role in the diagnosis, monitoring, and control of infectious diseases. The development of rapid DNA sequencing methods based on the selective incorporation of chain-terminating dideoxynucleotides ([1]; later termed "first-generation sequencing technologies") and the polymerase chain reaction (PCR) DNA amplification technologies ([2]; reviewed in [3]) has paved the way for the study of biological and evolutionary processes at the molecular level. Such technologies have been extensively applied to the diagnosis and molecular epidemiology of infectious diseases of livestock and have become important tools for targeted research on host-pathogen interactions. The most recent versions of these first-generation sequencing technologies are widely accessible and provide high-quality data. However, their application to projects such as whole genome sequencing is expensive and time-consuming, often requiring prior knowledge of the target genome for specific template amplification. These limitations have been particularly problematic for large sequencing projects and have motivated the development of alternative, post-Sanger sequencing technologies ("next-generation sequencing" or NGS).
Next-generation sequencing platforms provide unprecedented throughput, generating hundreds of gigabases of data in a single experiment. Although the initial capital investment and cost per experiment remain high, the price per information unit (nucleotide) has been dramatically reduced in comparison with first-generation sequencing. Moreover, these technologies allow unbiased sequencing without prior knowledge of the complete DNA content in a sample while retaining the flexibility to allow for targeted sequencing. This paradigm shift in the scale of DNA sequence data has revolutionized the way biological and evolutionary processes can be studied at the molecular level, enabling genome projects previously restricted to high-profile model organisms and human pathogens to target pathogens of lesser economic and medical significance. Such advancements are now being increasingly applied to veterinary medicine. As a result, the increasing availability of these technologies combined with the rapid development of applied tools and protocols has provided a diverse array of applications for use in genomics and transcriptomics and even routine diagnostics. In this chapter, we review recent advances in NGS technologies that are becoming commonplace in many laboratories, with an emphasis on the applications that have the potential to significantly impact the diagnosis, prevention, and control of infectious diseases in animals.

A number of different NGS platforms are currently available, each utilizing different sequencing chemistries and detection strategies. As a result, individual systems have their own strengths and limitations (reviewed in [4-6]). Second-generation sequencing platforms vary in the technology and chemistry used but have the following properties in common:

• A DNA library is made from the sample. This library either represents all DNA in the sample, prepared without prior knowledge, or is a targeted library produced by PCR amplification or alternative enrichment methods. Adapter sequences are joined to the DNA molecules (by ligation or amplification) and can include a barcode sequence that allows multiplexing of several samples in an experiment.
• Individual DNA molecules in each library are clonally amplified.
• Clonal DNAs are sequenced by massive parallel sequencing.
• Hundreds of thousands of DNA sequence reads result and need to be processed.

The second-generation sequencing platforms first emerged on the market with an emphasis on extreme high-throughput sequencing applications and initially were restricted to genome sequencing centers or core facilities. These technologies use different detection principles including pyrosequencing (454 Life Sciences, acquired by Roche, available since 2005, but planned to be discontinued by mid-2016), Illumina's sequencing by synthesis (previously Solexa, available since 2007), SOLiD ligation-based sequencing (Life Technologies, available since 2006), and, more recently, the Ion Torrent semiconductor sequencing technology. Over the last 5-7 years, all of the major platforms have made significant improvements, with notable advancements in terms of protocol complexity, overall performance (including read length, fidelity, and lower input DNA), and cost efficiency. More recently, smaller benchtop sequencers [7] have become available, making the technology more accessible for use in routine microbiology laboratories, while academic core facilities and commercial service providers have focused increasingly on providing users with access to a wider diversity of the sequencing technologies available.
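The barcode-based multiplexing mentioned in the library preparation step can be illustrated with a short sketch. It assumes, purely for illustration, that each read starts with a fixed-length barcode and that exact matching is sufficient; the barcode sequences, sample names, and the function name `demultiplex` are hypothetical and do not correspond to any specific platform or kit.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, barcode_len):
    """Assign each read to a sample by its leading barcode (exact match).

    reads: iterable of sequence strings (barcode followed by the insert).
    barcodes: dict mapping barcode sequence -> sample name (hypothetical).
    Returns a dict: sample name -> list of reads with the barcode trimmed.
    Reads whose barcode is not recognized are kept untrimmed under
    the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag = read[:barcode_len]
        if tag in barcodes:
            bins[barcodes[tag]].append(read[barcode_len:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)


# Toy data: two samples pooled in one run, plus one unreadable barcode.
barcodes = {"ACGT": "cattle_01", "TGCA": "cattle_02"}
reads = ["ACGTGGATTC", "TGCATTAGGC", "ACGTCCCGTA", "NNNNGGGGAA"]
result = demultiplex(reads, barcodes, barcode_len=4)
```

Real demultiplexing software usually tolerates a small number of barcode mismatches; exact matching is used here only to keep the sketch short.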
These developments will bring NGS technologies within the reach of many more research groups and diagnostic laboratories, where NGS analysis of a single isolate will generate significant quantities of data, many orders of magnitude greater than that generated by other typing methods.

In addition to the continuous improvement of existing platforms, newer methodologies are being developed. Third-generation sequencing technologies are defined as single-molecule sequencers (reviewed in [6, 8]). These approaches promise additional advantages such as scalability, simplicity, long read length, and low operational costs and do not require clonal amplification of template DNA molecules, thereby removing potential errors associated with clonal amplification. A single third-generation platform has been available on the market since 2011 (PacBio, Pacific Biosciences); it sequences long single DNA molecules in real time, an approach known as SMRT sequencing. Other technologies are still under development (e.g., [4]), such as DNA sequencing in nanopores, which offers the potential of simple, inexpensive, single-molecule sequencing in miniaturized or highly scalable devices [9]. Although substantial validation data are still required, these technologies have the potential to make NGS even more widely available in diagnostic labs.

While the advantages of NGS are numerous (unprecedented scale of genomic information, scalability, low cost per information content, and high throughput), several challenges remain to be addressed. Several processes in the NGS workflow, from sample selection to data interpretation, are potentially vulnerable to bias and/or error introduction (Fig. 1). This includes the error rates of the sequencing chemistry and library construction, as well as point mutations and insertions/deletions that may arise during reverse transcription and PCR amplification.

Fig. 1 Steps in next-generation sequencing and data analysis workflows where error and bias introduction may occur

The amplification of DNA by PCR to obtain clonal template sequences is subject to error introduction [8] and may result in an amplification bias impacting the relative frequency of sequence variants present in the sample. Sampling bias can be introduced when a relatively small number of samples are analyzed per epidemiological unit (e.g., single animal or herd) because financial constraints restrict the thorough use of NGS. Technical sampling bias occurs when only a small proportion of the nucleic acids in a single sample is subjected to sequence analysis. While NGS data provide a high resolution of an individual sample, the resolution at higher epidemiological scales (Fig. 2) may thus be compromised by insufficient sampling. Moreover, as a minimum amount of genetic material is needed as input for NGS workflows, this can result in bias towards samples with the highest pathogen titers. Errors and bias may also be introduced by methods to increase the sensitivity of the workflow, such as targeted pathogen genome amplification [10-12] or enrichment (e.g., [13]).

Fig. 2 High-throughput technologies can be applied to numerous aspects of animal infectious diseases

Furthermore, reagent contamination has been reported to interfere with metagenomic analyses [14, 15]. Errors can also be made during the sequencing process itself via base miscalls by NGS machines. For example, the loss of synchronicity (dephasing) in a percentage of the clonally amplified DNA template [16] results in increased noise and sequencing errors [8, 17].
Each of the different NGS platforms available has its own distinct characteristics in terms of read and error profiles, with the Illumina platform often regarded as having the lowest error rate, while other platforms can produce longer reads [7, 18, 19]. In addition, certain biases and errors can be introduced during the analysis of NGS datasets, due to the limitations of the algorithms or reference data used [20, 21]. For example, genomic repeat regions are a well-known problem for sequence assembly algorithms [22]. However, software advancements combined with platform and chemistry developments are expected to further reduce operational costs, error rates, and input DNA quantities, allowing more careful sampling strategies and reducing experimentally introduced error and bias in the future.

As a result of errors and bias introductions, NGS data need to be "cleaned". This includes sequence filtering (removing low-quality sequences) and alignment followed by variant calling and error correction. Discriminating true biological variants from those due to experimental noise is an important issue when trying to identify low-frequency variants in a population, for example, in viral quasispecies or metagenomic analyses, and there are currently a number of bioinformatics tools to aid in this (e.g., [19, 23-26]).

Currently, a multitude of software has been developed to address different aspects of NGS analyses [27, 28]. However, the available algorithms for both genome assembly and amplicon analysis can present some limitations [29], meaning that custom-made scripting and in-house resolution of bioinformatic problems are often needed to investigate novel datasets and specific hypotheses. In this context, researchers are frequently faced with the need to acquire computer skills and bioinformatics expertise.
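As an illustration of the kind of custom scripting referred to above, the following sketch shows two of the "cleaning" steps in their simplest form: discarding reads with a low mean base quality, and asking whether a minority variant is observed more often than sequencing error alone would explain. It is a deliberately naive example and not the method of any of the cited tools; the Phred+33 quality encoding, the uniform per-base error rate of 1 %, the significance threshold, and all function names are assumptions made for this example.

```python
from math import exp, lgamma, log


def mean_phred(quality_string, offset=33):
    """Mean Phred score of a read, assuming Phred+33 ASCII encoding."""
    return sum(ord(c) - offset for c in quality_string) / len(quality_string)


def filter_reads(records, min_mean_q=20):
    """Keep (sequence, quality) pairs whose mean quality is high enough."""
    return [(s, q) for s, q in records if mean_phred(q) >= min_mean_q]


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is computed in log space so that deep coverage does not
    overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return sum(exp(log_pmf(i)) for i in range(k, n + 1))


def is_probable_variant(alt_count, depth, error_rate=0.01, alpha=0.001):
    """True if alt_count non-reference bases out of depth are unlikely
    to arise from sequencing error alone (assumed uniform error rate)."""
    return binomial_tail(alt_count, depth, error_rate) < alpha


# Toy data: 'I' encodes Phred 40 and '!' encodes Phred 0.
records = [("ACGT", "IIII"), ("ACGT", "!!!!")]
kept = filter_reads(records)
strong = is_probable_variant(30, 1000)  # 3 % observed vs 1 % expected error
weak = is_probable_variant(12, 1000)    # 1.2 % observed, close to the noise
```

In practice, error rates differ between platforms, sequence contexts, and positions in the read, which is one reason why dedicated variant callers model errors in more detail than a single binomial rate.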
To evaluate the potential of NGS for a wider group of scientists and diagnosticians, there is a real need to develop flexible and practical bioinformatics workflows that provide user-friendly tools for the analysis of massive datasets and that are publicly available. Although some software with a menu-driven approach is available (e.g., Geneious, CLC Workbench, Galaxy), most applications are optimized for UNIX-based operating systems and require some bioinformatics expertise. Although less user-friendly, UNIX-based pipelines are typically freely available to the NGS user community and are equipped with algorithms that track the high pace of innovation in the NGS field.

A further issue is the scale of genetic data produced by NGS technologies, which presents a physical constraint in terms of data storage and analysis. Although limited datasets (e.g., resulting from the desktop-range second-generation sequencers) can be managed using modest computing resources, such as high-end desktop computers running virtual Linux machines [4], larger datasets typically require high-performance computational clusters, which represent a considerable investment and require sufficient information technology (IT) support. Cloud computational resources (i.e., renting time from commercial high-performance computational clusters) may be a solution [30], although further developments are needed, given the data transfer issues resulting from huge file sizes [4]. Another issue for diagnostic laboratories is the protection of data from unauthorized access: data from diagnostic examinations need to be kept confidential, which cannot be guaranteed in the cloud. For labs frequently producing NGS data, data storage and backup costs can be substantial.
Ideally, these huge genetic datasets should be made publicly available to the scientific community, as they provide a source of information applicable to better understanding disease, design of targeted assays, systems biology, and integrated OMICS analysis approaches. To this end, online repositories such as the Sequence Read Archive (SRA; [31]) have been created to store both raw NGS and intermediate analysis files. It will also be important to consider how results from complex and massive NGS datasets will be communicated to policy groups and the public and become a decision-supporting tool. To this end, it is necessary that scientists and diagnosticians develop and agree on data formats for the communication of NGS results for analyses that go beyond simple genome sequences, for instance, for reporting quasispecies compositions.

NGS technology is now being increasingly applied to study the etiology, genomics, evolution, and epidemiology of animal infectious diseases as well as host-pathogen interactions (Fig. 2). These applications have provided novel insights and illustrate the potential of this new technology to directly impact our understanding of, and control strategies for, animal infectious disease.

NGS platforms have been instrumental in the completion of large animal genomes and the documentation of genomic variation (reviewed in [32]). Available livestock genomes now include bovine, pig, sheep, equine, and avian [32], which provide an important source of knowledge for understanding food production and animal interaction with infectious pathogens. Additional livestock genome sequencing efforts have documented genomic variation, providing information for the development of genetic markers applicable to animal breeding genetics [33-35], including traits related to pathogen resistance and interaction with microbial communities in poultry [36].
Others have used novel sequencing technologies for the targeted study of specific gene families occupying key roles in host immunology (e.g., the Toll-like receptor (TLR) gene family [37]).

The high variability and large size of the mitochondrial genome (mtDNA) of eukaryotic parasites have recently been explored using NGS (reviewed by [38]). mtDNA sequences have proved very informative in epidemiological studies [38], and applications also include comparative mtDNA sequencing of parasites with low and high zoonotic potential [39]. By targeting specific polymorphic genes in the Cryptosporidium parvum genome using NGS, extensive intra-host genetic diversity was documented [40]. Studies of the transcriptome (all mRNA transcripts in an organism, tissue, or cell; the sequencing approach is also called RNA-Seq) of different parasite species and/or developmental stages provide insights into aspects of gene expression, regulation, and function, which are major steps to understanding their biology (reviewed in [41]).

Over the last five years, NGS has been used as an extremely important tool in the tracing of transmission, genome characterization, and outbreak management of both viral and bacterial diseases. The sequencing of these two pathogen types poses very different sets of challenges and issues, where the large data output, typically expressed in the Mb (megabase) to Gb (gigabase) range [4], is particularly suited to the sequencing of larger bacterial genomes. The high plasticity of some microbial genomes, with large mobile elements, gene-coding plasmids, chromosomal genes, and regions of extensive genetic variability, can frequently complicate genome assembly [46]. While most viral genomes are significantly smaller than their bacterial counterparts, the viral replication biology (particularly that of RNA viruses) poses its own unique problems. These involve the inherent variability of many viral genomes due to replication machinery lacking efficient proofreading mechanisms.
This, combined with a short generation time and high replication rate, results in a complex mix of differing genomes (a "swarm" of closely related viruses) within a single host, often termed a "quasispecies" (reviewed in [47]). In addition, recombination and reassortment of segmented viral genomes frequently occur. NGS techniques offer an unprecedented "step-change" increase in the amount of sequence data that can be generated from both types of these samples. Figure 3 (different scales of sequence analysis) highlights where genetic analyses can target different biological scales and whether these are within an individual host, or between hosts, resulting in either host variation or inter-herd diversity/outbreak transmission.

Fig. 3 The differing levels of intra- and inter-host variation that can be explored using NGS technologies range from intracellular dynamics to epidemiological applications

At the level of the quasispecies, NGS technologies can now determine complete viral genomes to a fine-point resolution, allowing the quantification of viral diversity within samples [48] and making the sequencing of large numbers of samples economically feasible. The technology will allow the comparison of genetically diverse populations from different replication sites within a host [49, 50]. Wright and colleagues investigated the genetic diversity and resulting quasispecies population after inoculation of foot-and-mouth disease virus (FMDV) into a single animal and identified genetically distinct populations originating from different lesions [50]. Morelli and colleagues [51] studied the evolution of FMDV intra-sample sequence diversity during serial transmission in bovine hosts, providing novel insights into the fine-scale evolution of an RNA virus. NGS can also provide insights on microevolutionary processes of viruses at different scales, including the fine-point resolution molecular epidemiology analysis of outbreaks [52].
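One common way to express the within-sample viral diversity discussed above as a number is the Shannon entropy of the base frequencies at each genome position. The sketch below is only an illustration of that idea, under the assumption that per-position base counts have already been extracted from aligned reads; it is not taken from the cited studies, and the function names and toy counts are hypothetical.

```python
from math import log2


def site_entropy(base_counts):
    """Shannon entropy (in bits) of the base frequencies at one position.

    base_counts: dict mapping base -> number of reads carrying that base.
    Returns 0.0 for a position without variation (or without coverage)
    and at most 2.0 when A, C, G, and T are equally frequent.
    """
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in base_counts.values():
        if count > 0:
            freq = count / total
            entropy -= freq * log2(freq)
    return entropy


def mean_entropy(pileup):
    """Average per-site entropy over a list of base-count dicts."""
    if not pileup:
        return 0.0
    return sum(site_entropy(site) for site in pileup) / len(pileup)


# Toy pileup of three positions (counts are invented for illustration).
pileup = [
    {"A": 1000},                               # no variation
    {"A": 500, "G": 500},                      # two equally frequent variants
    {"A": 250, "C": 250, "G": 250, "T": 250},  # maximal diversity
]
per_site = [site_entropy(site) for site in pileup]
average = mean_entropy(pileup)
```

Because sequencing errors also contribute apparent variation, such a diversity estimate is best computed after the filtering and error-correction steps described earlier in the chapter.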
Recent studies on influenza A viruses have demonstrated that minority variants present in the donor population can be successfully transmitted to the recipient host and become prevalent, with an unpredictable impact on the biological properties of the virus [53, 54]. These findings suggest that the use of NGS approaches in RNA virus surveillance will be strategic to promptly detect biologically relevant viral quasispecies and will help in expanding our understanding of viral dynamics and emergence and the possible implications of mutation emergence for studies done using isolated viruses [55, 56].

The study of the viral swarm within individual hosts also has implications for understanding the evolutionary dynamics of viral populations under selection pressures, e.g., antiviral drugs. This has been a particularly active field in human medicine, e.g., with regard to human immunodeficiency virus (HIV) antiviral drug response, drug resistance, and viral tropism (reviewed in [57-59]) and human influenza A (e.g., [60]) studies. The application of these technologies to personalize antiviral treatment according to genetic marker makeup is imminent in human medicine [61]. Although at present only an emerging field in veterinary science, the development of antiviral drugs has the potential to translate into efficient animal infectious disease control strategies (e.g., [62, 63]).

The majority of the papers using NGS to investigate animal infectious disease focus specifically on the level of animal-to-animal transmission and the characterization of pathogens within a single host, as this yields the most useful data in terms of outbreak management and identifying mechanisms/sources of disease transmission.

NGS technology has also allowed the characterization of complete microbial communities without prior knowledge.
For instance, the unbiased characterization of conserved bacterial ribosomal RNA-encoding sequences (rRNA profiling) has been applicable to whole microbial community characterization (e.g., [94, 95]) and to molecular characterization of (uncultured) bacteria [96]. Metagenomics is the determination of the sequence content of a complete microbial community (reviewed in [97]). The analysis of the resulting data can be taxonomy oriented (identification and quantification of species diversity; [98]) or function based (identification of coding gene diversity, e.g., [99]). The latter has significant potential, e.g., in the screening for virulence-associated genes, antibiotic resistance genes, and vitamin production-associated genes in microbial communities [100]. NGS also offers the potential of unbiased sequencing of the nucleic acid content of a sample and has been applied to the characterization of the viral metagenome in samples [101] or the identification of unknown or unexpected viruses in diseased animals or insect vectors. Furthermore, metagenomic NGS workflows allow the study of the interaction of treatment with an animal's microbiome [102].

In the microbiology lab, NGS has the potential for greater diagnostic resolution than any other typing method, and clinical microbiology labs are currently investigating its potential for routine diagnosis [103, 104]. Using NGS-based metagenomic approaches, multiple potential disease agents have been identified in a wide range of both domestic and wild animals (reviewed in [105-109]).
Although the common goal is to identify potential pathogens, the studies can roughly be divided into three categories: (1) investigations of outbreaks of unknown etiology, (2) investigations of well-known disorders presumed to be of multifactorial etiology, and (3)

Although it is an important first step, the identification and genetic characterization of candidate pathogens are not enough to establish causal relationships or understand how they may be associated with disease. It is therefore necessary to use a synergistic approach combining molecular diagnostic tools, such as NGS-based metagenomics and follow-up PCR-based assays targeting detected pathogen sequences, with more conventional diagnostic methods, including isolation and characterization. This is crucially important in situations where metagenomic data indicate the potential presence of multiple pathogens. While PCR-based prevalence studies in matching disease cases and healthy controls can provide further evidence for disease association, isolation of candidate pathogens is required to assign causality by addressing Koch's postulates [120]. The assembled data from such a multidisciplinary approach (pathology, epidemiology, metagenomic data, PCR prevalence studies, isolation, characterization, etc.) should be used to identify the most likely candidate etiologic agent and to make informed intervention decisions. The synergistic and parallel use of molecular and classical methods not only results in the detection of infectious agents and the development of targeted diagnostic tests but also has the potential to make isolates or strains available shortly after the occurrence of outbreaks. The availability of isolates or strains is of special importance to allow the design of effective vaccines or antimicrobial drugs.
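The PCR-based prevalence comparison between disease cases and healthy controls mentioned above is typically summarized as a 2 x 2 table. As a hedged illustration of how the strength of such an association might be assessed, the sketch below computes a one-sided Fisher's exact test using only the standard library. The choice of this test, the function name, and the counts are assumptions for the example and are not taken from the studies cited in this chapter; a statistical association of this kind supports, but does not establish, causality.

```python
from math import comb


def fisher_exact_one_sided(case_pos, case_neg, ctrl_pos, ctrl_neg):
    """One-sided Fisher's exact test for enrichment of positives in cases.

    Returns the probability, under the hypergeometric null model with all
    table margins fixed, of observing at least case_pos PCR-positive
    animals among the cases.
    """
    n_cases = case_pos + case_neg
    n_ctrls = ctrl_pos + ctrl_neg
    n_pos = case_pos + ctrl_pos
    total = n_cases + n_ctrls
    upper = min(n_cases, n_pos)
    favourable = sum(
        comb(n_cases, x) * comb(n_ctrls, n_pos - x)
        for x in range(case_pos, upper + 1)
    )
    return favourable / comb(total, n_pos)


# Invented example: 18 of 20 diseased animals are PCR positive,
# compared with 3 of 20 healthy controls.
p_enriched = fisher_exact_one_sided(18, 2, 3, 17)

# Invented counter-example: identical prevalence in both groups.
p_balanced = fisher_exact_one_sided(5, 5, 5, 5)
```

A small value of `p_enriched` indicates that the candidate agent is found in cases more often than expected by chance, which would justify prioritizing it for isolation attempts.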
The power of NGS to boost the veterinary laboratory community's responsiveness to emerging diseases was demonstrated through the discovery in 2011 of a novel Orthobunyavirus associated with fever, decreased milk production, and diarrhea in dairy cattle. Metagenomics, using 454 technology, allowed the identification of a novel virus, subsequently named Schmallenberg virus (SBV), in an epidemiological cluster of diseased cattle in Germany [121]. These viral sequences were used to rapidly design targeted molecular tests that were used to confirm a clear association between the presence of the virus and affected animals [110]. International adoption of these molecular tests identified a widespread occurrence of SBV in European countries (http://www.efsa.europa.eu/en/supporting/pub/429e.htm) and its detection in stillborn and malformed lambs [122, 123], as well as in insect vectors [124, 125]. The molecular tests were also helpful in targeting samples for isolation of the virus, which ultimately led to the development of a prototype vaccine currently under evaluation [126].

Metagenomic NGS workflows also have potential use for quality control of biological products [127] and vaccines [128-132] and provide a powerful approach for the identification and characterization of unexpected or highly divergent pathogen variants [85, 133] that may remain undetected using targeted diagnostic tests.

The technological possibility to study both the host and the pathogen with high resolution at the level of their genome, transcriptome, or proteome opens opportunities to study host/pathogen interactions at several levels (genomics, transcriptomics, microRNAs (miRNA)) and ultimately to analytically integrate these levels (integrative omics or systems biology), with the aim of studying the interaction of pathogen, microbiome, and host biological networks. Many examples exist in veterinary science.
Nordentoft and colleagues [134] used NGS metagenomics to study the influence of livestock management parameters and infection with Salmonella enteritidis on the microbial community in the chicken intestinal tract. Another study [135] documented the effect of Campylobacter jejuni infection on the chicken fecal microbiome. The application of metagenomic techniques in poultry production could lead to the development of novel alternatives to antibiotic growth promoters and better understanding of the colonization of food production animals by foodborne pathogens such as Salmonella enterica and Campylobacter spp. [36].

Other studies investigated the host response to pathogen infection. Glass and colleagues [136] used NGS transcriptomics to document bovine resistance and tolerance traits to parasitic infection. The technology was also used to study the ferret transcriptome response to influenza infection [137], the chicken transcriptome response to Marek's disease [138], the swine response to porcine reproductive and respiratory syndrome virus infection [139], and the changes in the mouse transcriptome after Brucella sp. infection [140].

microRNAs are considered to be a key mechanism of gene regulation in both parasites and viruses. Their characterization contributes to better understanding the complex biology of pathogens. Wang and coworkers [141] characterized microRNA sequences from Orientobilharzia turkestanicum, a fluke with zoonotic potential infecting sheep, and identified key target miRNAs for parasite energy metabolism, transcription initiation factors, signal transduction, and growth factor receptors. Virus-encoded microRNAs (vmiRNA) regulating viral or cellular transcripts can be targeted for virus discovery [142, 143]. miRNAs also play important roles in regulating host-pathogen interactions.
NGS has been applied to investigate whether infection can modulate miRNA biogenesis and has also been used to identify miRNAs that influence pathogen replication, tropism, and pathogenic potential [144-149]. In particular, cellular miRNAs have been shown to interact with the viral genomic RNA or mRNA, facilitating or inhibiting the virus life cycle. These molecules have demonstrated immense potential as a source of antiviral therapeutics effective against a number of viruses (adenovirus, rabies, Venezuelan equine encephalitis, porcine reproductive and respiratory syndrome virus [150-153]) or for the design of live-attenuated virus vaccines based on miRNA-mediated gene silencing [147, 154, 155].

Next-generation sequencing technologies have the potential to revolutionize our understanding of the complex dimensions of animal infectious disease and infection biology (Fig. 2), ranging from intracellular interactions to disease epidemiology. The application of high-throughput biotechnology platforms in these fields and their typically low cost per information content have increased the resolution with which these processes can now be studied. We now have high-resolution tools that provide veterinary diagnostic laboratories with the ability to undertake swift and flexible responses to emerging infectious diseases and unexpected pathogen variants. Moreover, these tools provide an increased resolution for the characterization of pathogens and provide important assets to improve our understanding. Fundamental research on pathogen evolution, adaptation, and virulence determinants can now be conducted on a scale allowing within- and between-host dissection of genetic variability. Moreover, high-throughput tools open new perspectives to study the complex interaction between pathogen, host, and microbiome with very high resolution and to deepen our understanding of the key biological processes leading to protective immunity.
Not only will our increased understanding of pathogens and their interaction with livestock impact future disease prevention, control, and management strategies, but the technologies may themselves become part of the intervention strategies, providing high-resolution data for molecular epidemiology to rapidly trace the origin and spread of outbreaks, for molecular typing, and for predicting and optimizing the outcome of targeted treatment with antibiotics, antivirals, and anthelmintics. The ready availability of high-resolution genomic and transcriptomic data will impact the targeted development of novel vaccines and drugs [156, 157], while NGS has the potential to become a powerful tool for the control of vaccines and other biological products.

As with any new technology, challenges remain. In the case of NGS, these include the requirement for expertise both in the laboratory and in the analysis of huge datasets and the current need for high investment in laboratory and data analysis hardware. As the technology is ever evolving towards lower cost, user-friendliness, and accessibility for smaller research and diagnostic labs, efforts are needed to make the data analysis more accessible to nonexpert users. This includes proper modeling of the sources of error introduction, solutions for public data storage, development of user-friendly but high-standard analysis pipelines for routine applications, etc. Both the industry and the NGS user community can play a role in this evolution.

Similarly, recent improvements in protein and peptide separation efficiencies and highly accurate mass spectrometry have promoted the identification and quantification of proteins in a given sample [158].
By directly targeting the peptide and protein content of a sample, proteomic approaches provide important additional information and account for known issues such as the quantitative discrepancy between mRNA transcript levels and final protein levels and posttranslational modification [159]. Novel proteomic approaches have been applied to animal infectious disease research, including the study of the E. coli response to chicken sera [160], proteomic profiling of porcine sera after FMDV infection [161], host-pathogen interaction during bovine mastitis [159], and metaproteomic studies characterizing the collective proteome of microbial communities [162].

This section contains excellent contributions exploring the application of high-throughput technologies to animal infectious diseases, including functional genomics of tick vectors infected with eukaryotic parasites, metagenomic approaches to detect bee viral pathogens, proteomics of vector-host-pathogen interactions, and NGS applications exploring parasites and intervention strategies.

Sequence-based identification of microbial pathogens: a reconsideration of Koch's postulates
Novel orthobunyavirus in Cattle
Diagnosis of Schmallenberg virus infection in malformed lambs and calves and first indications for virus clearance in the fetus
Epizootic of ovine congenital malformations associated with Schmallenberg virus infection
Detection of Schmallenberg virus in different Culicoides spp. by real-time RT-PCR
Schmallenberg virus in Culicoides spp. biting midges, the Netherlands
Inactivated Schmallenberg virus prototype vaccines
What's in a strain? Viral metagenomics identifies genetic variation and contaminating circoviruses in laboratory isolates of pigeon paramyxovirus type 1
Analysis of porcine circovirus type 1 detected in Rotarix vaccine
Extraneous agent detection in vaccines - a review of technical aspects
Massively parallel sequencing for monitoring genetic consistency and quality control of live viral vaccines
Massively parallel sequencing, a new method for detecting adventitious agents
Viral nucleic acids in live-attenuated vaccines: detection of minority variants and an adventitious virus
Evidence for a new avian paramyxovirus serotype 10 detected in rockhopper penguins from the Falkland Islands
The influence of the cage system and colonisation of Salmonella Enteritidis on the microbial gut flora of laying hens studied by T-RFLP and 454 pyrosequencing
Comparative metagenomics reveals host specific metavirulomes and horizontal gene transfer elements in the chicken cecum microbiome
Living with the enemy or uninvited guests: functional genomics approaches to investigating host resistance or tolerance traits to a protozoan parasite, Theileria annulata, in cattle
Sequencing, annotation, and characterization of the influenza ferret infectome
Genome-wide identification of allele-specific expression (ASE) in response to Marek's disease virus infection using next generation sequencing
Analysis of the swine tracheobronchial lymph node transcriptomic response to infection with a Chinese highly pathogenic strain of porcine reproductive and respiratory syndrome virus
Deep-sequencing analysis of the mouse transcriptome response to infection with Brucella melitensis strains of differing virulence
Characterization of microRNAs from Orientobilharzia turkestanicum, a neglected blood fluke of human and animal health significance
Identification of virus encoding microRNAs using 454 FLX sequencing platform
Discovery of DNA viruses in wild-caught mosquitoes using small RNA high throughput sequencing
Viruses and microRNAs
The mammalian microRNA response to bacterial infections
Viral and cellular microRNAs as determinants of viral pathogenesis and immunity
MicroRNA-mediated species-specific attenuation of influenza A virus
Integrated analysis of microRNA expression and mRNA transcriptome in lungs of avian influenza virus infected broilers
Identification of differentially expressed miRNAs in chicken lung and trachea with avian influenza virus infection by a deep sequencing approach
Artificial microRNAs can effectively inhibit replication of Venezuelan equine encephalitis virus
An adenoviral vector-based expression and delivery system for the inhibition of wildtype adenovirus replication by artificial microRNAs
Inhibition of rabies virus replication by multiple artificial microRNAs
Efficient inhibition of porcine reproductive and respiratory syndrome virus replication by artificial microRNAs targeting the untranslated regions
Harnessing endogenous miRNAs to control virus tissue tropism as a strategy for developing attenuated virus vaccines
Engineering microRNA responsiveness to decrease virus pathogenicity
Next generation deep sequencing and vaccine design: today and tomorrow
The application of next-generation sequencing technologies to drug discovery and development
Analysis of proteins and proteomes by mass spectrometry
Proteomic analyses of host and pathogen responses during bovine mastitis
Proteome response of an extraintestinal pathogenic Escherichia coli strain with zoonotic potential to human and chicken sera
Proteomics analysis of porcine serum proteins by LC-MS/MS after foot-and-mouth disease virus (FMDV) infection
Metaproteomics of our microbiome - Developing insight in function and activity in man and model systems

The collaboration between the authors was supported by Epi-SEQ, a research project funded under the 2nd joint call for transnational research projects by EMIDA ERA-NET (FP7 project nr 219235).
Additional support for this work in the United Kingdom was obtained from the Department for Environment, Food and Rural Affairs (Defra project SE2940) and BBSRC (BB/I014314/1).