key: cord-0796334-81t59up3
authors: Woo, Patrick C.Y.; Lau, Susanna K.P.; Teng, Jade L.L.; Tsang, Alan K.L.; Joseph, Marina; Wong, Emily Y.M.; Tang, Ying; Sivakumar, Saritha; Bai, Ru; Wernery, Renate; Wernery, Ulrich; Yuen, Kwok-Yung
title: Metagenomic analysis of viromes of dromedary camel fecal samples reveals large number and high diversity of circoviruses and picobirnaviruses
date: 2014-10-29
journal: Virology
DOI: 10.1016/j.virol.2014.09.020
sha: 36f51186e737161db5cf8174b84ec18b6d5b6fd0
doc_id: 796334
cord_uid: 81t59up3

The recent discovery of Middle East respiratory syndrome coronavirus and another novel dromedary camel coronavirus, UAE-HKU23, in dromedaries has boosted interest in the search for novel viruses in dromedaries. In this study, fecal samples of 203 dromedaries in Dubai were pooled and deep sequenced. Among the 7330 assembled viral contigs, 1970 were assigned to mammalian viruses. The largest groups of these contigs matched Picobirnaviridae, Circoviridae, Picornaviridae, Parvoviridae, Astroviridae and Hepeviridae. Many of these viral families were previously unknown in dromedaries. In addition to the high abundance of contigs from Circoviridae (n = 598, with 14 complete genomes) and Picobirnaviridae (n = 1236), a high diversity of contigs from these two families was found, with the 14 Circoviridae complete genomes forming at least five clusters and with contigs from potentially novel picobirnaviruses of both genogroup I and genogroup II. Further studies comparing the incidence of these viral families in healthy and sick dromedaries will reveal their pathogenic potential.

Camels are among the most unique mammals on earth. In particular, they are well adapted to desert life, where the daytime temperature is very high, the diurnal temperature range is large and the supply of food and water is scarce.
Such adaptations are made possible by their distinct anatomical and physiological properties, such as short but thick fur, long legs, water conservation and unique fat metabolism. Therefore, camels were used for transportation of people and goods as well as for military purposes in the past. In addition, they provide a good source of meat, milk and wool. They are also important recreational animals in the Middle East, where they are used for camel racing. Having been associated with humans for at least 5000 years, camels usually pose little physical danger to humans. Occasionally, infectious diseases, such as brucellosis, can be transmitted from camels to humans. There are two surviving Old World camel species: Camelus dromedarius (dromedary or one-humped camel), which inhabits the Middle East and North and Northeast Africa, and Camelus bactrianus (Bactrian or two-humped camel), which inhabits Central Asia. Among the 20 million camels on earth, 90% are dromedaries.

The recent emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) from the Middle East and the presence of neutralizing antibodies against MERS-CoV in dromedaries in the Middle East have boosted interest in the search for novel viruses in dromedaries (de Groot et al., 2013; Lau et al., 2013; Perera et al., 2013; Reusken et al., 2013; Zaki et al., 2012). Viruses of at least eight families, including Paramyxoviridae, Flaviviridae, Herpesviridae, Papillomaviridae, Picornaviridae, Poxviridae, Reoviridae and Rhabdoviridae, have been found to infect camels (Al-Ruwaili et al., 2012; Intisar et al., 2009; Khalafalla et al.; Ure et al., 2011; Wernery et al., 2014; Wernery et al., 2008; Wernery and Zachariah, 1999; Yousif et al., 2004). Recently, we discovered a novel coronavirus, named dromedary camel coronavirus UAE-HKU23 (DcCoV UAE-HKU23), in dromedaries (Woo et al., 2014b).
As camels are closely associated with humans, knowledge of the variety of viruses present in this hardy group of animals is important for understanding their potential for emergence. In this study, we analyzed the viromes of fecal samples of dromedaries in the Middle East. This is the first metagenomic study on animals of the family Camelidae. The large number and high diversity of contigs from the families Circoviridae and Picobirnaviridae are also discussed.

Fecal samples of 203 dromedaries were pooled and deep sequenced using the Illumina HiSeq 2500 instrument, generating 29,247,514 paired-end 151-bp sequence reads. De novo assembly of the metagenome was performed using IDBA-UD with default parameters and a minimum contig length of 200 bp to confirm isolation of viral genomes. There were 159,388 contigs ranging in size from 200 to 14,611 bp, with a mean contig length of 540 bp. Among the 159,388 contigs, 7330 were viral sequences. The most abundant fraction of viral contigs matched bacteriophages, including those of the order Caudovirales (n = 3805), the family Microviridae (n = 509) and unclassified phages (n = 319) (Fig. 1). Viral contigs related to plant viruses included those of the families Geminiviridae (n = 17), Betaflexiviridae (n = 15), Totiviridae (n = 2), Nanoviridae (n = 1) and Partitiviridae (n = 4); those related to insect viruses included those of the families Iflaviridae (n = 5), Dicistroviridae (n = 3), Poxviridae (n = 3) and Nodaviridae (n = 2) and the subfamily Densovirinae (n = 2) (Fig. 1). One thousand nine hundred and seventy (26.9%) of the 7330 viral contigs were assigned to mammalian viruses (Fig. 1). The largest group of these contigs matched double-stranded RNA viruses in the family Picobirnaviridae (n = 1236), followed by single-stranded DNA viruses in the family Circoviridae (n = 598).
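The proportions implied by the counts reported so far can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the study's pipeline; the function and variable names are illustrative assumptions, and only the counts themselves come from the text.

```python
# Contig counts as reported in the text; all names here are illustrative.
TOTAL_CONTIGS = 159_388
VIRAL_CONTIGS = 7_330
MAMMALIAN_VIRAL_CONTIGS = 1_970
FAMILY_COUNTS = {"Picobirnaviridae": 1_236, "Circoviridae": 598}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize() -> dict:
    """Collect the proportions implied by the reported counts."""
    summary = {
        "viral_of_all_contigs": percent(VIRAL_CONTIGS, TOTAL_CONTIGS),
        "mammalian_of_viral": percent(MAMMALIAN_VIRAL_CONTIGS, VIRAL_CONTIGS),
    }
    # Share of each family among contigs assigned to mammalian viruses.
    for family, count in FAMILY_COUNTS.items():
        summary[f"{family}_of_mammalian"] = percent(count, MAMMALIAN_VIRAL_CONTIGS)
    return summary


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value}%")
```

With these counts, mammalian viruses account for 26.9% of viral contigs, as stated in the text, and Picobirnaviridae and Circoviridae account for roughly 63% and 30% of the mammalian virus contigs respectively.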
The remaining contigs with homology to the most represented families of mammalian viruses were, in order of decreasing abundance, Picornaviridae [kobuviruses (n = 17), enteroviruses (n = 26), hunnivirus (n = 14), encephalomyocarditis virus (n = 4)]; Parvoviridae [porcine bocavirus (n = 22), human bocavirus (n = 5), feline bocavirus (n = 1), gorilla bocavirus (n = 1)]; Astroviridae [porcine astrovirus (n = 8), feline astrovirus (n = 7)]; Hepeviridae [HEV (n = 3)]; Reoviridae [rotavirus (n = 3)] and Caliciviridae [feline norovirus (n = 2)]. These viral contigs showed a wide range of sequence identity to known viruses, suggesting that some of these sequences might be derived from novel viruses.

The 598 and 1236 contigs that belonged to the families Circoviridae and Picobirnaviridae respectively were analyzed by BLASTx. In both families, contigs that encoded the corresponding RdRp, capsid proteins and hypothetical proteins were observed (Supplementary Fig. 1). Repeated terminal sequences in a contig indicated a circular genome. Fourteen contigs contained complete circular genomes of the Circoviridae family, ranging in size from 2516 to 2977 bp (Fig. 2). Overall, nucleotide identities to known members of the Circoviridae family were less than 75% for all the genomes. Therefore, according to the ICTV criteria (www.ictvdb.org), which state that circoviruses of the same species should share >75% and >70% nucleotide identity in their complete genome and capsid protein sequences respectively, these viruses found in dromedaries should be novel species in the Circoviridae family. A phylogenetic tree of these 14 complete genomes was constructed with representative complete genomes of circovirus, cyclovirus and circo-like virus sequences, with each genome starting at the Rep ATG.
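The idea that repeated terminal sequences indicate a circular genome, and that genomes are rotated to a common start before alignment, can be illustrated as follows. This is a minimal Python sketch under stated assumptions, not the authors' procedure: the function names, the exact-match criterion and the `min_overlap` default of 10 bases are all hypothetical, and the coordinate of the Rep ATG is assumed to be supplied by the caller from a separate annotation step.

```python
from typing import Optional


def find_terminal_repeat(contig: str, min_overlap: int = 10) -> int:
    """Return the length of the longest prefix of `contig` that also ends it.

    Only exact matches of at least `min_overlap` bases (an assumed threshold)
    and at most half the contig length are considered; 0 means none found.
    """
    seq = contig.upper()
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circularize(contig: str, min_overlap: int = 10) -> Optional[str]:
    """Trim the duplicated end of a circular contig; None if no repeat is found."""
    k = find_terminal_repeat(contig, min_overlap)
    if k == 0:
        return None
    return contig[:-k]


def rotate_to(genome: str, start: int) -> str:
    """Rotate a circular genome so that index `start` becomes position 0."""
    start %= len(genome)
    return genome[start:] + genome[:start]


if __name__ == "__main__":
    unit = "ATGCGTACGTTAGC"        # toy 14-base "genome"
    contig = unit + unit[:10]      # assembler read through the junction
    genome = circularize(contig)
    print(genome == unit)          # True
    print(rotate_to(genome, 3))    # genome re-linearized from index 3
```

A real pipeline would likely tolerate mismatches in the overlap and confirm circularity by read mapping across the junction; the exact-match test here only conveys the principle.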
These 14 complete genomes formed at least five clusters, including one related to porcine circovirus-like virus, five related to bovine stool-associated circular DNA virus (BoSCV), two related to fur seal feces-associated circular DNA virus (FSfaCV), two related to rodent stool-associated circular genome virus (RodSCV) and four related to other circovirus-like viruses (Fig. 3). The 12 and 21 contigs that belonged to the family Picobirnaviridae with complete RdRp and capsid genes respectively were further aligned with all available complete RdRp and capsid genes of picobirnaviruses. Phylogenetic trees of the complete RdRp and capsid genes of the picobirnavirus genome are shown in Fig. 4. The contigs were highly diverse. Contigs that belonged to both genogroup I and genogroup II picobirnaviruses were observed. In addition, distinct branches that did not cluster with either genogroup I or genogroup II picobirnaviruses were also observed, suggesting that there may be one or more additional genogroups of picobirnaviruses.

In this first metagenomic study on viromes in animals of the family Camelidae, more than 500 contigs (including 14 complete genomes), or around 30% of the mammalian virus contigs observed in dromedary fecal samples, belonged to the Circoviridae family. Members of the Circoviridae family are small non-enveloped circular single-stranded DNA viruses found in a wide variety of mammals and birds. Circovirus infections are very common and geographically widely distributed. Although subclinical infections are common, circovirus infections have been suggested to be associated with psittacine beak and feather disease, infectious chicken anemia, circovirus disease of pigeons, and the postweaning multisystemic wasting syndrome of pigs (Biagini et al., 2011).
Among the metagenomic studies on fecal samples of other mammals, only one study on fecal samples of pigs showed a comparably high number of sequences from the Circoviridae family (Table 1) (Sachsenroder et al., 2014). At least five metagenomic studies did not show any circovirus sequence (Lager et al., 2012; Li et al., 2011a; Li et al., 2011b; Smits et al., 2013; van den Brand et al., 2012). As for sequence diversity, the 14 complete genomes formed at least five clusters related to different known members of the Circoviridae family, including porcine circovirus-like virus, BoSCV, FSfaCV, RodSCV and other circovirus-like viruses (Fig. 3). Such high diversity of sequences from the Circoviridae family has rarely been seen in other metagenomic studies.

The dromedary fecal samples also contained a large number (more than 1000 contigs and more than half of all mammalian virus contigs) and high diversity of picobirnavirus sequences. Picobirnaviruses are small non-enveloped bisegmented double-stranded RNA viruses found in humans and a wide variety of mammals and birds. Since their first discovery in fecal samples of humans and rats in 1988 (Pereira et al., 1988a; Pereira et al., 1988b), picobirnaviruses have been reported in other mammals and birds (Browning et al., 1991; Gallimore et al., 1993). The pathogenicity of picobirnaviruses has not been established. Although picobirnaviruses have been detected in fecal samples from children with diarrhea and in immunocompromised patients, they have also been found in individuals without diarrhea (Delmas, 2011). Although >500 sequences (most of them partial) of picobirnaviruses are available in the GenBank database, phylogenetic and evolutionary studies have been hampered by the small number of complete picobirnavirus genome sequences available.
So far, there are only four complete genome sequences of genogroup I picobirnaviruses, including one from a human picobirnavirus, another from a porcine picobirnavirus, the third from a turkey picobirnavirus and the fourth from a novel picobirnavirus, named otarine picobirnavirus, we discovered in the fecal sample of a California sea lion in an oceanarium in Hong Kong recently (Woo et al., 2012) . For these four available genomes, the larger genome segment, segment 1, is 2.3-2.5 kb in length and encodes the capsid protein and one to two putative proteins of unknown function, while the smaller genome segment, segment 2, is 1.6-1.7 kb in length and encodes the viral RdRp. Moreover, no genogroup II picobirnavirus complete genome sequence is available. This large number of picobirnavirus sequences observed in dromedary fecal samples has never been found in metagenomic studies in other animals. For example, pigs are well-known to have a relatively high prevalence of picobirnavirus infections (Banyai et al., 2008; Chen et al., 2014; Ganesh et al., 2012; Giordano et al., 2011; Martinez et al., 2010; Smits et al., 2011) , but in the various metagenomic studies on fecal samples of pigs, picobirnavirus sequences only constituted 0-7% of all mammalian virus sequences (Cheung et al., 2013; Lager et al., 2012; Sachsenroder et al., 2012; Sachsenroder et al., 2014; Shan et al., 2011) . In fact, in some metagenomic studies, such as those on bats, pine martens, European badgers and dogs, no picobirnavirus sequence was observed (Table 1 ) (Ge et al., 2012; Li et al., 2011a; Li et al., 2010; van den Brand et al., 2012; Wu et al., 2012) . As for the diversity, data from the present metagenomic study suggested the presence of large numbers of both genogroup I and genogroup II picobirnavirus contigs with high diversity in dromedaries (Fig. 4) . 
This is different from results obtained from metagenomic studies on fecal samples of other animals, in which mainly genogroup I picobirnavirus sequences were observed (Cheung et al., 2013; Lager et al., 2012; Li et al., 2011b; Phan et al., 2011; Sachsenroder et al., 2012; Sachsenroder et al., 2014; Smits et al., 2013). Complete genome sequencing of these picobirnaviruses in dromedaries would be invaluable in understanding genome evolution in this understudied family of viruses.

Fig. 3 (caption fragment). Solid circles mark the contigs determined in this study. Porcine circovirus-like virus, fur seal feces-associated circular DNA virus, bovine stool-associated circular DNA virus, circoviruses, cycloviruses, circovirus-like virus and rodent stool-associated circular genome virus are represented by cyan, orange, red, green, blue, purple and citron branches respectively.

In addition to circoviruses and picobirnaviruses, the present metagenomic analysis also revealed other viral families in dromedaries, many of which were not previously known in dromedaries, such as hepatitis E virus, picornaviruses, parvoviruses and astroviruses. The use of centrifugation steps and filters in metagenomic studies may result in the loss of viruses, particularly those with large virions. Unlike three recent studies which detected MERS-CoV in nose swabs of dromedaries in Qatar, Saudi Arabia and Egypt (Alagaili et al., 2014; Chu et al., 2014; Haagmans et al., 2014), no MERS-CoV was found in the present metagenomic study on fecal samples. This is probably due to the inherent tissue tropism of MERS-CoV. Similarly, our recently discovered DcCoV UAE-HKU23 was not found in the present metagenomic study on fecal samples collected from adult dromedaries (Woo et al., 2014b). This is probably because DcCoV UAE-HKU23 was mainly detected in dromedary calves, whereas its prevalence in adult dromedaries was very low (Woo et al., 2014b).
The plant and insect viral sequences observed in fecal samples of dromedaries were probably the result of ingestion of plants and insects by the dromedaries. As for the mammalian virus families, hepatitis E virus had not previously been reported in camels; its prevalence, genome sequence and phylogenetic analysis are reported elsewhere (Woo et al., 2014a). Picornaviruses, astroviruses and parvoviruses are also commonly observed in metagenomic studies using fecal samples of other animals (Phan et al., 2011; Shan et al., 2011). Parvoviruses and astroviruses are important pathogens in various groups of mammals, and these two families of viruses have not previously been described in dromedaries. Since some contigs were found to be most closely related to human bocavirus, complete genome sequencing and further phylogenetic analysis would be useful for understanding the relationship between these "dromedary bocaviruses" and human bocaviruses. Although some picornaviruses are able to infect Bactrian camels and cause diseases such as foot-and-mouth disease, dromedaries are known not to be susceptible to these agents and camels are not the reservoir of these picornaviruses (Wernery et al., 2014). Further studies comparing the incidence of these three important families of viruses in healthy and sick dromedaries will reveal their importance in this unique group of animals.

All fecal samples of dromedaries (Camelus dromedarius) were left-over specimens submitted for coprological studies to the Central Veterinary Research Laboratory in Dubai, United Arab Emirates, from January to July 2013. The samples originated from different camel premises where dromedaries were kept for racing, and included samples from dromedaries undergoing routine check-up (n = 200) and from those with diarrhea (n = 3). A total of 203 adult dromedaries were tested in this study.
Viral transport medium containing the 203 fecal samples (~0.2 g feces per 2 ml viral transport medium) was pooled, 200 μl per sample, and centrifuged at 10,000 × g for 10 min. The supernatant was then filtered through a 0.45-μm filter (Millipore, Massachusetts, USA) to remove eukaryotic and bacterial cell-sized particles. The filtrate was treated with a cocktail of nucleases consisting of 14 U of turbo DNase (Ambion, Austin, TX, USA), 20 U of benzonase (Novagen, Madison, WI, USA) and 20 U of RNase One (Promega, Wisconsin, USA) at 37 °C for 60 min in 1× DNase buffer (Ambion, Austin, TX, USA) to digest unprotected nucleic acids. Total RNA from the sample was extracted using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany). Reverse transcription was performed using SuperScript III reverse transcriptase (Invitrogen, San Diego, CA, USA) and a random primer containing a 20-base arbitrary sequence at the 5′ end followed by a randomized octamer (8 N) at the 3′ end. A single round of priming and extension was performed using Klenow fragment polymerase (New England Biolabs, Massachusetts, USA). PCR amplification with a primer consisting of only the 20-base arbitrary sequence of the random primer was performed in 20 cycles of 94 °C for 15 s, 60 °C for 30 s and 68 °C for 1 min, with a final extension at 68 °C for 7 min, in an automated thermal cycler (Applied Biosystems, Foster City, CA, USA). Standard precautions were taken to avoid PCR contamination and no amplified PCR product was observed in the negative control. The PCR product was purified using the MinElute PCR Purification Kit (Qiagen, Hilden, Germany) following the manufacturer's protocol with slight modification. The purified DNA was eluted in 15 μl of EB buffer and used as the template for library construction.
The metagenomic library was prepared using the Nextera XT DNA Sample Preparation Kit (Illumina, San Diego, CA, USA). Illumina sequence reads were adapter- and quality-trimmed using Trimmomatic with the Nextera adapter FASTA sequences and the following parameters: leading 3, trailing 3, sliding window 4:15, minimum length 36 bp (Lohse et al., 2012). Trimmed paired-end reads were de novo assembled in silico with IDBA-UD 1.1.1 with default parameters and a fixed threshold for a minimum contig length of 200 bp (Peng et al., 2012). The IDBA-UD algorithm is based on the de Bruijn graph approach for assembling reads. Contigs that aligned to rRNA sequences from the SILVA rRNA database were first removed using Bowtie2 (Langmead and Salzberg, 2012). The remaining contigs were compared to the non-redundant protein sequences (nr) database from NCBI (http://www.ncbi.nlm.nih.gov), which contains non-redundant sequences from GenBank translations (i.e. GenPept) and sequences from other databanks (Refseq, PDB, SwissProt, PIR and PRF), using BLASTx with an E-value cutoff of 10⁻⁵. BLAST results were parsed to save the best hit for each sequence. The best-hit sequences were individually annotated to note the sources of the matching sequences (virus, phage, bacteria, archaea and eukaryotes). Sequences were also analyzed using a metagenomic annotation tool, MEGAN version 4.70.4, to assign each sequence to the taxa present in the metagenomic sequences using the NCBI taxonomic database (Huson et al., 2011).

After de novo assembly, there were 598 and 1236 contigs belonging to the families Circoviridae and Picobirnaviridae respectively. Fourteen contigs containing complete circular genomes of the Circoviridae family were used for phylogenetic analysis. The 14 complete genome sequences have been submitted to GenBank with accession numbers KM573763-KM573776.
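The step described above, in which BLAST results are parsed to keep the best hit for each contig under the E-value cutoff, could look roughly like the following Python sketch. It assumes, without confirmation from the text, that the BLASTx results were available in the standard 12-column tabular format (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and that "best" means highest bit score. The function name and the example IDs are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cutoff stated in the text


def best_hits(tabular_text: str, max_evalue: float = E_VALUE_CUTOFF) -> dict:
    """Map each query ID to (subject ID, E-value, bit score) of its top hit.

    Assumes 12-column BLAST tabular output; hits with an E-value above
    `max_evalue` are discarded, and ties keep the first hit encountered.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Made-up rows for illustration only; none of these IDs come from the study.
EXAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["contig_1", "protein_A", "45.0", "200", "110", "0", "1", "600", "1", "200", "1e-20", "95.0"],
        ["contig_1", "protein_B", "62.0", "210", "80", "0", "1", "630", "5", "214", "3e-40", "160.0"],
        ["contig_2", "protein_C", "30.0", "60", "42", "1", "10", "189", "3", "62", "0.002", "35.0"],
    ]
)

if __name__ == "__main__":
    for query, (subject, evalue, bitscore) in best_hits(EXAMPLE).items():
        print(query, subject, evalue, bitscore)
```

In this toy input, `contig_1` keeps the higher-scoring of its two hits and `contig_2` is dropped because its only hit exceeds the cutoff. Taxonomic labelling of the retained hits (virus, phage, bacteria, archaea, eukaryotes) would be a separate lookup step.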
Twelve and 21 contigs that covered complete RdRp and capsid genes respectively in the picobirnavirus genome were used for phylogenetic analysis. The 12 RdRp and 21 capsid sequences have been submitted to GenBank with accession numbers KM573798-KM573809 and KM573777-KM573797, respectively. Phylogenetic analysis was performed by the neighbor-joining method, using the Jukes-Cantor substitution model with gamma-distributed rate variation and bootstrap values calculated from 1000 trees.

Viral and bacterial infections associated with camel (Camelus dromedarius) calf diarrhea in North Province, Saudi Arabia
Middle East respiratory syndrome coronavirus infection in dromedary camels in Saudi Arabia
Genogroup I picobirnaviruses in pigs: evidence for genetic diversity and relatedness to human strains
Virus Taxonomy, Classification and Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses, International Union of Microbiological Societies, Virology Division
Identification of multiple novel viruses, including a parvovirus and a hepevirus, in feces of red foxes
The prevalence of enteric pathogens in diarrhoeic thoroughbred foals in Britain and Ireland
Molecular detection of genogroup I and II picobirnaviruses in pigs in China
A divergent clade of circular single-stranded DNA viruses from pig feces
MERS coronaviruses in dromedary camels
Middle East respiratory syndrome coronavirus (MERS-CoV): announcement of the coronavirus study group
Virus Taxonomy, Classification and Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses, International Union of Microbiological Societies, Virology Division
Detection and characterization of a novel bisegmented double-stranded RNA virus (picobirnavirus) from rabbit faeces
Detection and molecular characterization of porcine picobirnavirus in feces of domestic pigs from Kolkata
Metagenomic analysis of viruses from bat fecal samples reveals many novel viruses in insectivorous bats in China
Evidence of closely related picobirnavirus strains circulating in humans and pigs in Argentina
Middle East respiratory syndrome coronavirus in dromedary camels: an outbreak investigation
Evaluation of pepper mild mottle virus, human picobirnavirus and Torque teno virus as indicators of fecal contamination in river water
Integrative analysis of environmental sequences using MEGAN4
Natural exposure of Dromedary camels in Sudan to infectious bovine rhinotracheitis virus (bovine herpes virus-1)
An outbreak of peste des petits ruminants (PPR) in camels in the Sudan
Diversity of viruses detected by deep sequencing in pigs from a common background
Fast gapped-read alignment with Bowtie 2
Genetic characterization of betacoronavirus lineage C viruses in bats reveals marked sequence divergence in the spike protein of Pipistrellus Bat Coronavirus HKU5 in Japanese Pipistrelle: implications for the origin of the novel Middle East respiratory syndrome coronavirus
Viruses in diarrhoeic dogs include novel kobuviruses and sapoviruses
The fecal viral flora of California sea lions
Bat guano virome: predominance of dietary viruses from insects and plants plus novel mammalian viruses
RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics
Identification in porcine faeces of a novel virus with a bisegmented double stranded RNA genome
Picobirnavirus causes persistent infection in pigs
Picobirnavirus (PBV) natural hosts in captivity and virus excretion pattern in infected animals
The picobirnavirus: an integrated view on its biology, epidemiology and pathogenic potential
IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
Novel viruses in human faeces
A virus with a bisegmented double-stranded RNA genome in rat (Oryzomys nigripes) intestines
Seroepidemiology for MERS coronavirus using microneutralisation and pseudoparticle virus neutralisation assays reveal a high prevalence of antibody in dromedary camels in Egypt
The fecal viral flora of wild rodents
Middle East respiratory syndrome coronavirus neutralising serum antibodies in dromedary camels: a comparative serological study
Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing
The general composition of the faecal virome of pigs depends on age, but not on feeding with a probiotic bacterium
The fecal virome of pigs on a high-density farm
Genogroup I and II picobirnaviruses in respiratory tracts of pigs
Metagenomic analysis of the ferret fecal viral flora
Characterization of the complete genomes of Camelus dromedarius papillomavirus types 1 and 2
Infectious disorders of camelids
Abortions in dromedaries (Camelus dromedarius) caused by equine rhinitis A virus
Experimental camelpox infection in vaccinated and unvaccinated dromedaries
Complete genome sequence of a novel picobirnavirus, otarine picobirnavirus, discovered in California sea lions
New hepatitis E virus genotype in camels, the Middle East
Virome analysis for identification of novel mammalian viruses in bat species from Chinese provinces
Cytopathic genotype 2 bovine viral diarrhea virus in dromedary camels
Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.virol.2014.09.020.