key: cord-0882957-k0f9fser authors: Shi, Weifeng; Li, Juan; Zhou, Hong; Gao, George F. title: Pathogen genomic surveillance elucidates the origins, transmission and evolution of emerging viral agents in China date: 2017-11-28 journal: Sci China Life Sci DOI: 10.1007/s11427-017-9211-0 sha: bef8702c9771266cbcebb9b0e9fd72537027d949 doc_id: 882957 cord_uid: k0f9fser In the past twenty years, numerous novel zoonotic viral agents with pandemic potential have emerged in China, such as the severe acute respiratory syndrome (SARS) coronavirus and, more recently, the avian-origin influenza A/H7N9 virus, which have caused outbreaks among humans with high morbidity and mortality. In addition, several emerging and re-emerging viral pathogens have also been imported into China from travelers, e.g. the Middle East respiratory syndrome (MERS) coronavirus and Zika virus (ZIKV). Herein, we review these emerging viral pathogens in China and focus on how surveillance by pathogen genomics has been employed to discover and annotate novel pathogenic agents, identify natural reservoirs, monitor the transmission events and delineate their evolution and adaption to the human host. We also highlight the application of genomic sequencing in the recent Ebola epidemics in Western Africa. In summary, genomic sequencing has become a standard research tool in the field of emerging infectious diseases which has been proven invaluable in containing these viral infections and reducing burden of disease in humans and animals. Genomic surveillance of pathogenic agents will serve as a key epidemiological and research tool in the modern era of precision infectious diseases and in the future studies of virosphere. With the advances in DNA sequencing technologies, the continuing decreases of the cost per base, application of automation and increasingly sophisticated bioinformatics, far greater numbers of biomedical scientists apply Sanger and, particularly, next-generation sequencing in their research. 
*Corresponding author (email: shiwf@ioz.ac.cn; wfshi@tsmc.edu.cn)
In the field of emerging infectious diseases, pathogen sequencing has become a standard and indispensable research tool, owing to the critical role of genomic surveillance in the prevention and control of infectious diseases (Holmes, 2007; Haagmans et al., 2009; McHardy and Adams, 2009). In the preceding decades, there have been numerous emerging and re-emerging viral pathogens globally, such as the pandemic influenza A/H1N1 virus in 2009 (Neumann et al., 2009), the unprecedented, sustained Ebola virus (EBOV) epidemics in Western Africa during 2013-2015 (Baize et al., 2014; Gire et al., 2014) and the ZIKV outbreaks in the Americas from 2015 to date (Faria et al., 2016), which have caused millions of infections and tens of thousands of fatal cases. Several emerging viral pathogens have also been identified in China (Figure 1), one of the most populous countries on Earth, and have posed a serious threat to public health. In this article, we review the emerging viral pathogens in China and the available methods for identifying pathogenic agents, and highlight the critical role of genomic surveillance in containing these viral infections. Several approaches have been developed and widely used for the detection and identification of causative pathogenic agent(s). They are based on different methodologies, and each has its own distinct advantages and disadvantages (Table 1). For example, many kits based on real-time polymerase chain reaction have been commercialized and are widely used in hospitals and research institutes to identify pathogenic agents. These kits are reliable and have high sensitivity and specificity for the target agent. However, these approaches generally cannot provide pathogen genomic information, such as mutations associated with pathogenicity and drug resistance, which is of vital importance for clinical therapy and is provided by unbiased sequencing. 
In addition, apart from next-generation sequencing, all the remaining methods need prior knowledge about the potential pathogenic agents for assay design. Therefore, they cannot identify novel pathogenic agents and are therefore unsuitable for pathogen discovery. In particular, only next-generation sequencing is able to identify the majority and, conceivably, all microbial agents existing in the sample. When applying genomic sequencing in the identification of infectious agents, the sample treatment process is straightforward and commercialized kits, including sample treatment and quality control, are available for each step (Figure 2 ). In particular, for next-generation sequencing no prior knowledge is needed with respect to the suspected etiological agent and researchers do not have to perform preliminary identification using traditional serological or molecular diagnostic methods. Although next-generation sequencing has advantages over routine diagnostics, it has not been widely used for surveillance of pathogenic microbes. The major reasons for this include: first, high throughput sequencing data needs sophisticated computational processing and annotation, and, thus far, relatively few researchers with sufficient bioinformatics expertise engage in near-patient or disease surveillance activities. In addition, the potential causative agent usually has many sequence reads, particularly if an isolate or a sample type enriched in the pathogen is available, which renders diagnosis and confirmation relatively straightforward. However, if there is only a limited number of sequence reads covering the true etiological agent, it is far harder to identify without bioinformatics training. In either case, the real causative agent should fulfill Koch's postulates, and therefore confirmatory experiments are needed to validate the discoveries made by next-generation sequencing. 
Second, the whole process, including sample treatment, data processing and annotation, needs standardization, which is extremely important for clinicians when a diagnosis based on high throughput sequencing data is required to guide treatment. Third, the cost of next-generation sequencing is still high, including library construction, patterned flow cells, DNA sequencing platforms and high performance computer servers. Finally, the routine next-generation sequencing process is still relatively laborious and time-consuming, usually requiring a few days from sample receipt, which makes it unsuitable for emergency cases. In November 2002, an "infectious atypical pneumonia", subsequently named severe acute respiratory syndrome (SARS), first emerged in Guangdong province in southern China (Peiris et al., 2004). In late February 2003, a physician incubating the virus traveled to Hong Kong and spread the disease to local residents and to travelers, who further disseminated the disease via international air travel and precipitated one of the most significant public health events in recent history (Breiman et al., 2003). At the same time, the disease spread rapidly across mainland China, and one of the most severely stricken regions was Beijing, which generated widespread public alarm. This infectious disease eventually caused 8,098 human infections and 774 deaths in 37 countries, with a case-fatality rate of 9.6% (Chan-Yeung and Xu, 2003). In early March 2003, when the outbreak had already continued for three months, debate regarding the potential causative agent was still ongoing in China (Enserink, 2003), and this indecisiveness hindered, to some degree, the correct containment and treatment of the disease. In late March 2003, scientists from the United States Centers for Disease Control and Prevention first linked SARS to a previously uncharacterized novel coronavirus (subsequently termed SARS-CoV) (https://www.cdc.gov/mmwr/preview/mmwrhtml/mm5212a1.htm). 
Two independent research groups soon sequenced the viral genome and confirmed that the coronavirus was indeed the etiological agent responsible for the SARS pandemic (Marra et al., 2003; Rota et al., 2003). Subsequently, this virus was detected in Himalayan palm civets (Paguma larvata), which are sold in live animal markets in Guangdong as a delicacy, in a raccoon dog (Nyctereutes procyonoides) and even in humans working at the same market (Guan et al., 2003). During the SARS outbreak, both the Chinese Ministry of Health and Chinese scientists were initially passive. However, public health officials and scientists soon woke up to the situation: they confirmed the causative agent, sequenced the viruses (Qin et al., 2003a; Qin et al., 2003b), performed a large-scale molecular epidemiological study of SARS-CoV in China (He et al., 2004) and demonstrated, for the first time, a coronavirus reservoir in horseshoe bats (family Rhinolophidae) (Li et al., 2005). These findings underscored the critical significance of genomic surveillance and pathogen discovery in emerging diseases as a strategy to increase preparedness for future pandemic events. In retrospect, the SARS outbreak marked a turning point both for the Chinese Center for Disease Control and Prevention (China CDC) system and for the scientists, and vital lessons were learned. Subsequently, China began to reform the China CDC system by providing more research funding, procuring more modern research instruments, training the staff working in the CDC system, and collaborating extensively with the World Health Organization (WHO) and international scientists. More important still was the establishment of a national direct reporting system from sentinel laboratories for infectious diseases, which has played a vital role in the containment and control of subsequent emerging infectious disease outbreaks. 
In March 2013, three urban residents from Shanghai and Anhui in China who presented with rapidly progressing lower respiratory tract infections deteriorated rapidly and died . Respiratory specimens were tested positive for influenza A virus, but were not subtypeable to currently circulating strains which prompted further investigation. Genomic sequencing demonstrated that the causative agent was a novel, previously undescribed H7N9 virus . To trace the potential origin of the viral agent, we performed a phylogenetic and coalescent analysis for the three H7N9 viruses from these human cases. Our results confirmed that the H7N9 agent was a reassortant with viral gene segments derived from multiple avian influenza virus (AIV) sources . In addition, although a number of the internal genes were closely related to a bird-derived strain, A/brambling/Beijing/16/2012(H9N2), we considered it likely on epidemiological grounds and contact tracing and proposed that the patients might be infected by a novel virus from domestic poultry . This was supported by an independent study, in which the researchers directly isolated the virus from chickens in live poultry markets (LPMs) . Therefore, closure of LPMs in the outbreak regions has served as one of the best means of reducing human infections (Bao et al., 2013; Gao, 2014) . The novel avian-origin H7N9 agent subsequently spread to other regions in the Yangtze River delta and caused further human infections during April and May, 2013. Almost all of the laboratory-confirmed human infections in the first H7N9 infection wave were completely sequenced. Phylogenetic analyses of these genome sequences showed that the internal genes of the H7N9 virus had become highly diversified concomitantly with its more widespread dissemination , whereas the HA and NA gene segments were relatively stable. Based on these observations, we proposed a dynamic reassortment model to describe the continuing evolution of H7N9 (Cui et al., 2014) . 
In this model, the two surface proteins, HA and NA, were conserved and constituted the genetic backbone of the virus. After the virus was transmitted to a new region, it dynamically recruited internal genes from H9N2 AIVs circulating locally via genetic reassortment. Subsequently, the novel virus was identified both in poultry and in humans in Guangdong province, a major region in the Pearl River delta, which has also been proposed to be the epicenter of novel highly pathogenic H5N1 influenza variants (Smith et al., 2006; Wu et al., 2010; Li et al., 2014) . Genetic analysis of the H7N9 strains from Guangdong showed that the HA and NA genes stemmed from those previously circulating in the Yangtze River delta, rather than from an independent origin (Qi et al., 2014a) . In particular, genomic sequencing provided strong support for the validity of our previously proposed dynamic reassortment model: the HA and NA gene segments of the Guangdong strains shared the same origin with the Yangtze River delta derived strains; however, their internal genes originated from H9N2 influenza viruses circulating locally in Guangdong province (Qi et al., 2014a) . However, as a low pathogenic influenza virus, poultry infected with H7N9 did not present clinical signs or just mild disease which raised concerns of widespread silent dissemination in avian populations. Along with poultry transportation, H7N9 has subsequently spread in eastern China since 2013 (Lam et al., 2015) . Genomic surveillance showed that the low pathogenic H7N9 virus has become endemic in eastern China and has established two major lineages, the Yangtze River delta lineage and the Pearl River delta lineage . In early 2017, human infections with a novel H7N9 virus were reported in mainland China and Taiwan . Genomic sequencing revealed that the HA and NA genes of the novel H7N9 variant likely originated from the Yangtze River delta lineage . 
However, distinct from previously circulating H7N9 strains, the novel isolates possessed multiple basic amino acids at the cleavage site of the HA protein, suggesting they might be highly pathogenic (HP) to chickens (Zhang et al., 2017; Zhu et al., 2017b). Based on current data, there have been at least two different motifs inserted at the cleavage site, KRKRTAR/G and KGKRIAR. There was one amino acid difference between the two motifs and the former had one more basic amino acid, suggesting that they might have slight differences in pathogenesis. In addition, and of greater concern, the R292K mutation in the NA protein, which is associated with enhanced resistance to oseltamivir and other neuraminidase inhibitors, and the E627K mutation in the PB2 protein, which is associated with increased replication efficiency in mammalian hosts, were also detected in some of the human-derived strains. To date, the HP H7N9 virus has only been identified in Guangdong province in mainland China; however, suspected HP H7N9 infections have also been reported in other provinces. Therefore, the potential for transmission to other regions and the potential adaptation of this novel HP virus to human populations warrant further investigation and ongoing surveillance. Although AIVs of the H6 subtype have been frequently isolated worldwide from poultry and wild birds, no human infections with this subtype had been reported prior to 2013. In May 2013, throat swabs from a 20-year-old woman who presented with pneumonia caused by an unconfirmed type of influenza virus were sent to the Centers for Disease Control, Taiwan, and the virus was subsequently found to be of the H6N1 subtype (http://www.cdc.gov.tw/english/infoaspx?treeid=bc2d4e89b154059b&nowtreeid=ee0a2987c-fba3222&tid=E36A5E9AB3D3A216). The patient fully recovered and all close contacts were found to be negative for H6N1, suggesting no human-to-human infection or onward transmission events. 
By phylogenetic analysis of the virus genome, several research groups independently studied the genetic origin of this virus, A/Taiwan/2/2013(H6N1) (abbreviated as TW2 thereafter). Yuan and co-workers proposed that TW2 was reassorted from multiple sources, with PB2, PA and M genes from H5N2, and the remaining genes from two different H6N1 avian lineages . We also performed a phylogenetic and coalescent analysis of TW2 and closely related viruses. Our results also revealed that TW2 was a novel reassortant; however, in disagreement with the previous study, we found that H5N2 did not establish a persistent lineage in Taiwan and they were more recent strains than the ancestor of the novel human-infecting H6N1 virus . Therefore, we proposed that TW2 genetically reassorted with different H6N1 lineages circulating in Taiwan, evidence for which has been supported by a subsequent study (Wei et al., 2013) . Influenza A (H10N8) virus has been previously identified in poultry in several countries, including China (Zhang et al., 2011; Jiao et al., 2012) . In December 2013, the first human infection with an avian-origin H10N8 influenza virus was reported in China . The patient was a 73-year-old woman from Jiangxi province and presented with fever. Unfortunately, she developed multiple organ failure and died nine days after illness onset. Deep sequencing of the virus (A/Jiangxi-Donghu/346/2013(H10N8)) (abbreviated as JX346) found that it was a novel reassortant distinct from the previously described Guangdong strain. In detail, the HA gene of the virus originated from an H10 lineage circulating in Eurasia at a low level, the NA gene belonged to a North American N8 lineage, and all six internal genes stemmed from contemporaneously circulating avian H9N2 influenza viruses in China. In addition, the patient had visited a LPM four days before illness onset; therefore, LPM was considered the most probable infection source. 
From poultry samples from the same LPM that the patient had visited, we successfully isolated an H10N8 virus, A/chicken/Jiangxi/102/2013(H10N8) (abbreviated as JX102) (Qi et al., 2014b). JX102 was closely related to JX346 in six genes, the exceptions being the PB1 and PB2 genes, which were more similar to some H7 AIVs circulating in eastern China. Based on this evidence, we considered that the genesis and evolution of the novel human-infecting H10N8 was similar to that of H7N9 and that both of these agents had undergone a dynamic reassortment process. After the virus obtained the H10 and N8 genes, it dynamically recruited internal genes from the influenza gene pool existing in poultry in LPMs and farms and formed multiple co-circulating genotypes. In particular, we noted that both the novel human-infecting H7N9 and H10N8 viruses acquired internal genes from poultry harboring H9N2 viruses. Therefore, we proposed that poultry carrying H9N2 viruses may act as genetic incubators for the genesis of novel human-infecting AIVs, giving rise to the dynamic reassortment process we have identified, a model which warrants further study. In April 2014, China reported the first human infection with an influenza A (H5N6) virus, A/Sichuan/26221/2014(H5N6) (abbreviated as SC26221 hereafter) (Pan et al., 2016). However, an independent study reported that a 5-year-old girl from Changsha, Hunan province was infected with an H5N6 virus in February 2014, which predated the SC26221 case. As of November 2016, a total of 17 human infections with H5N6 had been confirmed in China. Phylogenetic analysis revealed that SC26221 belonged to clade 2.3.4.4 and was closely related to a previous strain, A/environment/Zhenjiang/C13/2013(H5N6) (abbreviated as C13 hereafter) (Pan et al., 2016). We performed AIV surveillance in April 2014 in Nanbu County, Sichuan province, where the first human infection with H5N6 was reported. A total of six H5N6 viruses were successfully isolated and sequenced. 
These strains were more similar with SC26221 in eight gene segments than C13, suggesting direct poultry-to-human transmission events . Subsequent studies further identified SC26221-like H5N6 viruses in migratory birds and also revealed that these emerging H5N6 viruses possessed enhanced affinity for human type sialic acid receptors and in-contact transmission ability in ferrets , highlighting a potential higher threat which these viruses pose. To systematically investigate the prevalence and evolution of H5N6, we performed a surveillance of H5N6 AIV across eastern China during a greater than one year time period (Bi et al., 2016a) . Surprisingly, H5N6 had replaced H5N1 as a dominant AIV subtype in southern China, especially in ducks, although H9N2 was still the dominant subtype in chickens. Full-length genome sequences of an extremely large number of H5N6 viruses (n=505) were obtained and phylogenetic analyses showed that they have evolved into at least 34 distinct genotypes via dynamic reassortment, although the HA and NA demonstrated a clear matching pattern (Bi et al., 2016a) . These results provided another example in support of our dynamic reassortment model and highlighted the unexpectedly high prevalence and widespread nature of H5N6 viruses, which warrants increased surveillance of domesticated and wild birds to provide a greater evidence base so as to be able to mitigate future outbreak events. Apart from viral pathogens emerging and autochthonously circulating in China, there have been several newly emerging and re-emerging viruses imported into China, which also pose a potential threat to the public health. A good example is the Middle East respiratory syndrome coronavirus (MERS-CoV). MERS-CoV was first detected in the Kingdom of Saudi Arabia (KSA) in September 2012 (Su et al., 2016) . Genetically related to SARS-CoV, MERS-CoV can also cause severe acute respiratory tract infection with high case fatality rates in humans. 
Although 2,040 confirmed infections and 712 fatal cases have been reported thus far, the majority of the infections (ca. 82%) were confined to the KSA (http://www.who.int/emergencies/mers-cov/en/). However, in May 2015, a 68-year-old Korean man returning from travel in the Middle East was infected by MERS-CoV and caused 185 secondary cases, including 38 deaths, in the subsequent two months, which is considered the largest outbreak outside the Middle East (Lee et al., 2017). On 26 May 2015, a 44-year-old Korean man who had been in close contact with confirmed MERS-CoV cases in South Korea travelled to Guangdong province, and was diagnosed as the first imported MERS-CoV case in China (strain ChinaGD01) (Wu et al., 2015; Xie et al., 2015). Collaborating with China CDC, we obtained the full-length genome sequence and performed a comprehensive bioinformatics and phylogenetic analysis of ChinaGD01. We found that ChinaGD01 was genetically related to the South Korean MERS-CoV index strain and that both clustered within group 3 of clade B. In particular, the two strains formed an independent cluster within group 3, together with a few KSA strains isolated in early 2015. These findings were consistent with the epidemiological investigation. However, further phylogenetic analyses of different coding regions revealed different phylogenetic relationships for the 2015 cluster: the ChinaGD01 strain fell within group 5 rather than group 3 in the S gene tree, indicative of a potential recombination event, which was also supported by our bootscanning analysis and, additionally, by an independent study from South Korea (Kim et al., 2016). We further dated the recombination event and found that it might have occurred in late 2014, several months before the South Korea outbreak. 
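The reasoning behind the recombination signal described above (a strain that groups with one lineage over most of the genome but with another in the S gene) can be illustrated with a much simpler procedure than bootscanning. The following is a minimal sketch, not the method used in the study: it slides a window along a pre-aligned query and reports which reference is most similar in each window. The function names, the window settings and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of aligned, gap-free columns at which a and b agree."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def window_scan(query: str, refs: Dict[str, str],
                window: int, step: int) -> List[Tuple[int, str]]:
    """For each window, return (window start, name of the closest reference).

    All sequences are assumed to come from the same multiple alignment,
    so that column i means the same genome position in every sequence.
    """
    result = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        closest = max(
            refs,
            key=lambda name: identity(query[start:end], refs[name][start:end]),
        )
        result.append((start, closest))
    return result


def breakpoints(scan: List[Tuple[int, str]]) -> List[int]:
    """Window starts at which the closest reference changes."""
    return [s for (_, prev), (s, cur) in zip(scan, scan[1:]) if prev != cur]


# Toy alignment (hypothetical, not real MERS-CoV data): the query matches
# the "group3" reference in its first half and "group5" in its second half.
toy_refs = {"group3": "A" * 40, "group5": "C" * 40}
toy_query = "A" * 20 + "C" * 20

toy_scan = window_scan(toy_query, toy_refs, window=10, step=10)
toy_breaks = breakpoints(toy_scan)
print(toy_scan)
print(toy_breaks)
```

A change in the closest reference along the genome is only a hint. In practice, bootstrap support from per-window trees (as in bootscanning) and confirmation from gene-specific phylogenies are needed before a recombination event is inferred.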
However, it should be noted that although the South Korean MERS outbreak has been successfully contained, several puzzling questions remain and warrant further investigation, one of which is the phenomenon of superspreading (Oh et al., 2015; Wong et al., 2015). Although diagnostic delays and a longer incubation period have been proposed to explain the nosocomial outbreaks of MERS (Chowell et al., 2015), the underlying viral factors have not been investigated. Whether certain amino acid substitutions of the virus or genetic recombination within the S gene contribute to the enhanced transmissibility of MERS-CoV, and whether differences in host genetics play a role, deserve deeper attention in order to shed light on the pathogenic mechanism(s) and, potentially, to support the development of prophylactics and antivirals to mitigate disease spread. In contrast to MERS-CoV, ZIKV is not a novel emerging virus, but a re-emerging one. It was first discovered in Uganda in 1947 in febrile sentinel rhesus macaques (Dick et al., 1952). ZIKV is a mosquito-borne flavivirus that circulated at low prevalence in much of Africa and Asia before 2013 (Haddow et al., 2012). ZIKV infections in humans are mostly asymptomatic; however, a small percentage of patients may show clinical symptoms such as fever and rash, which resolve within a week or less. Prior to 2013, no large-scale outbreaks caused by ZIKV had been reported worldwide and no endemic or imported ZIKV cases were reported in mainland China (Hayes, 2009). However, an unprecedented large-scale ZIKV outbreak swept South America in 2015 (Faria et al., 2016). Of even greater concern, numerous studies have established the association between ZIKV infection and congenital abnormalities (Calvet et al., 2016; Li et al., 2016a; Mlakar et al., 2016; Ventura et al., 2016), e.g. 
in utero growth restriction, placental insufficiency, microcephaly and fetal death, as well as neurological conditions, such as Guillain-Barré syndrome, which raised global concern and led to the declaration of a public health emergency of international concern by WHO in early 2016. Since the first confirmed imported ZIKV case in February 2016 (Zhang et al., 2016b), a total of 28 cases have been reported in China as of September 2016 (Deng et al., 2016; Li et al., 2016b; Sun et al., 2017). Collaborating with China CDC and Zhejiang Provincial CDC, we sequenced the full-length genomes of the first three ZIKV cases imported into Zhejiang province and performed a phylogenetic analysis of the first 13 imported ZIKV cases in China. The 13 infected cases were travelers returning from South America (n=10) and Oceania (n=3). Phylogenetic analysis revealed that the imported ZIKVs fell within the Asian lineage and were closely related to those causing the outbreak in the Americas. In particular, ZIKVs from travelers returning from Fiji/Samoa formed a minor, but well-supported, separate lineage, whereas viruses isolated from travelers returning from South America were scattered within the Latin American lineage, suggesting that the imported ZIKVs have become highly genetically diversified. An independent analysis of more imported ZIKV cases (n=19) in Guangdong, China subsequently confirmed our conclusions (Sun et al., 2017). Rapid genomic sequencing facilitated the timely diagnosis and treatment of the imported ZIKV cases, which helped the prevention and control of ZIKV infections in China and served to alleviate public concern. Recently, the research group led by Gong Cheng at Tsinghua University found that ZIKV infectivity was enhanced by a single A188V substitution in the NS1 protein of ZIKV. ZIKV possessing this amino acid substitution has both increased infectivity and increased prevalence in mosquito hosts, which might account for the wide spread of ZIKV across the Americas. 
More recently, another single mutation event resulting in an S139N substitution in the prM protein of ZIKV, which we showed by evolutionary analysis to have arisen prior to the large-scale outbreaks in French Polynesia, has been demonstrated to be a functional adaptation associated with higher neurovirulence in human (and murine) neural progenitor cells (Yuan et al., 2017). Similar to ZIKV, Rift Valley fever virus (RVFV) is also a neglected tropical virus; it was first described in the Rift Valley province in Kenya in 1930 (Daubney et al., 1931). Prior to 2000, RVF-associated outbreaks in livestock and humans were contained in Africa, especially sub-Saharan Africa (Nanyingi et al., 2015). However, since 2000, RVFV has spread to the Arabian Peninsula, including Saudi Arabia and Yemen, and caused more than 2,000 human infections with approximately 400 deaths (Nanyingi et al., 2015). In July 2016, a Chinese expatriate returning from Angola was confirmed to be the first imported RVFV case in China (Liu et al., 2017a). He was suspected to have yellow fever when he went to hospital in Angola; however, after the patient was admitted, we tested for yellow fever, malaria, chikungunya, dengue fever, haemorrhagic fever with renal syndrome, and hepatitis A to E using serological and molecular approaches, and all tests were negative. Finally, acute serum and saliva samples from the patient were both positive for RVFV. An independent group also found that the patient was yellow fever negative and, by directly applying deep sequencing, successfully identified RVFV in the serum sample. Phylogenetic analysis of the L, M and S gene segments of the virus showed that it was a reassortant, with the L and M genes from lineage E and the S gene from lineage A (Liu et al., 2017a). 
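The logic by which a segmented virus is called a reassortant can be sketched in a few lines: once each genome segment has been assigned to a lineage from its own phylogeny, a genome whose segments fall into more than one lineage is flagged. The sketch below assumes the per-segment lineage assignments are already available; the function names are hypothetical, and the example input simply restates the RVFV assignments reported above (L and M from lineage E, S from lineage A).

```python
from typing import Dict, List


def group_segments_by_lineage(segment_lineage: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert a {segment: lineage} mapping into {lineage: [segments]}."""
    grouped: Dict[str, List[str]] = {}
    for segment, lineage in sorted(segment_lineage.items()):
        grouped.setdefault(lineage, []).append(segment)
    return grouped


def is_reassortant(segment_lineage: Dict[str, str]) -> bool:
    """True when the segments of one genome belong to more than one lineage."""
    return len(set(segment_lineage.values())) > 1


# Lineage assignments as reported in the text for the imported RVFV strain.
rvfv_case = {"L": "E", "M": "E", "S": "A"}
# Hypothetical comparison genome whose segments all share one lineage.
uniform_case = {"L": "E", "M": "E", "S": "E"}

print(group_segments_by_lineage(rvfv_case))
print(is_reassortant(rvfv_case), is_reassortant(uniform_case))
```

The same check extends to the eight-segment influenza A genomes discussed earlier, where the mapping would hold one entry per gene segment. The difficult part in practice is the lineage assignment itself, which depends on well-supported segment phylogenies.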
Both the ZIKV and RVFV cases demonstrate that on one hand, these neglected tropical diseases could pose a potential threat to the public health of China; on the other hand, these neglected viruses have been undergoing evolution and adapting to their respective natural hosts and to humans. In some cases, a single amino acid substitution in the virus coding protein could be able to significantly enhance its infectivity and lead to a large-scale outbreak (Liu et al., 2017a; Tsetsarkin et al., 2007; Yuan et al., 2017) , which warrants long-term surveillance to mitigate future pandemic events. Ebola virus (EBOV) is a notorious high-consequence pathogen due to its high mortality rate. Although different Ebola virus species have caused more than twenty outbreaks in sub-Saharan Africa, most were contained within a limited geographic region and at most 425 cases were infected prior to (del Rio et al., 2014 . However, the 2013-2015 EBOV outbreak in West Africa was unprecedented, with 28,616 confirmed, probable and suspected cases and 11,310 deaths as of June 10, 2016 (http://who.int/csr/disease/ebola/en/). A preliminary study revealed that the outbreak of a communicable disease characterized by fever, severe diarrhea, vomiting and a high fatality rate in Guinea was caused by EBOV (Baize et al., 2014) . However, it was initially thought that the causative agent was a new EBOV strain different from EBOVs causing previous outbreaks in the Democratic Republic of Congo and Gabon (Baize et al., 2014) . However, a subsequent study suggested the currently circulating EBOVs likely stemmed and diverged from previous Middle African lineage EBOVs and had undergone rapid inter-and intrahost genetic variation after they infected humans from an unknown animal reservoir (Gire et al., 2014) . 
Shortly after the beginning of the outbreak, we also performed a bioinformatic analysis of EBOV genomes publicly available from GenBank and found that non-coding intergenic regions of the EBOV genome contained indispensable phylogenetic and evolutionary information. Without genetic data from non-coding regions of EBOV, phylogenetic analysis could result in misleading phylogenies . Although no human infection with EBOV has been reported in mainland China, in September 2014, upon the request of the Sierra Leone government, the Chinese government dispatched the China mobile laboratory testing team helping Sierra Leone fight against EBOV, which was led by one of us (George F. Gao). From 28 September to 11 November 2014, a total of 823 samples were laboratory confirmed to be EBOV-positive by the Chinese group and 175 full-length EBOV genomes were obtained (Tong et al., 2015) . Phylogenetic analysis of these virus genomes showed after the virus entered Sierra Leone, it continued to evolve and diversify into distinct lineages, suggesting that the virus is continuing to adapt to propagate in humans. After a complete analysis of all available EBOV genome sequences in the current outbreak, we found that the substitution rate of EBOV has slowed down compared with the strains isolated earlier in the outbreak and the substitution rate was close to the value estimated using all EBOV since 1976 (Tong et al., 2015) . A commentary in Nature remarked that new evidence ruled out rapid mutation of EBOV (Check Hayden, 2015) , which would be beneficial to eliminate the panic caused by this high-consequence pathogen. A BEAST analysis revealed that the population size of EBOV was continually increasing in our study period, suggesting the outbreak was still expanding. 
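For readers unfamiliar with how a substitution rate is obtained from dated genomes, the idea can be illustrated with a root-to-tip regression, a much cruder approach than the Bayesian (BEAST) analysis used in the study: the genetic distance of each sampled genome from the tree root is regressed on its sampling date, and the slope is read as substitutions per site per year. The sketch below is an illustration under that assumption only; the function name and the toy numbers are hypothetical and are not EBOV data.

```python
from typing import Sequence, Tuple


def root_to_tip_rate(dates: Sequence[float],
                     distances: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept). The slope approximates the substitution
    rate (substitutions/site/year) when dates are given in decimal years.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs of equal length")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    return slope, intercept


# Toy data (hypothetical): distances generated from an exact linear clock
# of 1e-3 substitutions/site/year with a root dated at 2013.8.
toy_dates = [2014.2, 2014.4, 2014.6, 2014.8]
toy_distances = [1e-3 * (t - 2013.8) for t in toy_dates]

toy_rate, toy_intercept = root_to_tip_rate(toy_dates, toy_distances)
toy_root_date = -toy_intercept / toy_rate  # date at which distance is zero
print(toy_rate, toy_root_date)
```

Comparing slopes fitted to early and later samples gives a rough sense of whether the apparent rate changes over an outbreak, but it ignores phylogenetic non-independence and rate variation among lineages, which is why model-based methods are preferred for published estimates.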
We further performed a phylogeographic analysis of the available EBOV genomes and reconstructed the EBOV transmission networks in western Sierra Leone, in which Freetown and Waterloo acted as two major transmission nodes (Tong et al., 2015). We provided evidence of frequent virus transmission between the two nodes and of onward dissemination from them to adjacent regions. This information helped the Government of Sierra Leone to formulate practical and feasible measures to mitigate the epidemic. A previous study first reported hundreds of intra-host single nucleotide variations (iSNVs) in EBOV genomes and found that iSNV frequencies were stable across samples from multiple time points (Gire et al., 2014). We also detected iSNVs when we analyzed our high-throughput sequencing data. We then analyzed all publicly available sequencing data with our established bioinformatics pipeline to study the distribution and potential function of these iSNVs systematically, and found that their distribution was not associated with genomic coverage, sequencing depth, Ct value or sample collection date (Ni et al., 2016). In addition, apart from a higher density in the 3′-untranslated region, the distribution of iSNVs was comparable across the other non-coding and coding regions. However, in contrast to other genes, 20 of the 30 iSNVs in the VP40 gene occurred at the third codon position and 19 were synonymous, suggesting that VP40 is highly conserved (Ni et al., 2016). We also explored the potential function of a coupled T>C substitution (positions 3,008 and 3,011) using a mini-genome reporter system and found that it upregulated transcription of the NP gene (Ni et al., 2016).
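The codon-position and synonymy counts reported for VP40 rest on a simple classification of each variant. The sketch below illustrates that classification under stated assumptions: the coding sequence is supplied in the mRNA (coding) sense, in frame, with a 0-based offset, and the standard genetic code applies. The function `classify_variant` and the nine-base toy sequence are hypothetical and are not taken from the published pipeline or from VP40.

```python
from typing import Tuple

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_variant(cds: str, offset: int, alt: str) -> Tuple[int, bool]:
    """Return (codon_position, is_synonymous) for a single-base variant.

    cds    : in-frame coding sequence in the mRNA sense (assumption)
    offset : 0-based position of the variant within cds
    alt    : alternative base observed at that position
    """
    cds = cds.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    if alt not in BASES or len(alt) != 1:
        raise ValueError("alt must be a single base")
    if not 0 <= offset < len(cds):
        raise ValueError("offset lies outside the coding sequence")
    in_codon = offset % 3                      # 0, 1 or 2
    start = offset - in_codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("variant falls in an incomplete trailing codon")
    alt_codon = ref_codon[:in_codon] + alt + ref_codon[in_codon + 1:]
    synonymous = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return in_codon + 1, synonymous


# Hypothetical toy sequence (Met-Ala-stop), not a real viral gene.
toy_cds = "ATGGCTTAA"
print(classify_variant(toy_cds, 5, "C"))  # GCT -> GCC, third codon position
print(classify_variant(toy_cds, 3, "A"))  # GCT -> ACT, first codon position
```

Tallying these two values over all iSNVs in a gene gives the kind of summary quoted above. An excess of third-position, synonymous changes is the usual signature of purifying selection on the encoded protein.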
However, the occurrence of such serial T>C substitutions is not consistent with the prevailing neutral theory of molecular evolution and might reflect a previously undescribed EBOV-host interaction, which deserves further investigation. Although no endemic or imported EBOV cases were reported in China, the Chinese government actively helped Sierra Leone combat the EBOV outbreak. Even after the outbreak ended, the Chinese government continued to dispatch medical experts to Sierra Leone to improve the local medical environment and to train local medical workers. In particular, a recombinant adenovirus type 5 vector-based Ebola vaccine developed by Chinese scientists has completed a phase 1 trial in China and a phase 2 trial in Sierra Leone, and both trials demonstrated that the candidate vaccine was safe and highly immunogenic (Zhu et al., 2017a). Together, these events showed China to be a responsible country fully engaged in mitigating public health crises and reflected its greatly improved research capacity and capability in dealing with emerging infectious diseases.
In his State of the Union address in January 2015, former US President Obama announced the launch of the Precision Medicine Initiative to: "…enable a new era of medicine through research, technology, and policies that empower patients, researchers, and providers to work together toward development of individualized care". Subsequently, the Chinese government also announced the launch of similar initiatives. In our view, precision is both the core of these initiatives and what makes them feasible.
For infectious diseases, precision involves, at each step of the process, a precise diagnosis of the causative agent, precise identification of the origin of the agent, precise identification of the transmission pathway of the virus, precise follow-up and delineation of the evolution and variation of the agent, precise development of safe and efficacious vaccines, precise treatment with antiviral drugs, and precise formulation of prevention and control measures. Although there is a long way to go before all these tasks can be completed in a systematic and timely manner, scientists have long been working towards this ultimate goal. A case in point is the recent EBOV epidemic. During the 2013-2015 EBOV outbreak, viruses from over 1,500 infected people were sequenced, accounting for approximately 5% of all confirmed infections, which makes it one of the most densely sampled infectious disease outbreaks and something that would have been almost inconceivable even a decade ago (Holmes et al., 2016). Scientists have also developed a real-time, portable genome sequencing instrument (MinION, Oxford Nanopore Technologies, UK) for Ebola surveillance in the field (Quick et al., 2016). A second case in point is human infection with H7N9. As of 7 August 2017, a total of 1,557 human infections with H7N9 AIV had been confirmed (http://www.who.int/csr/don/07-august-2017-ah7n9-china/en/). Scientists have sequenced over 650 strains of this reassortant and have made the sequence data available in public databases. In each case, rapid genome sequencing has played a critical role in tracing the transmission and evolution of the viruses.
The term "virome" was proposed in the early 2000s, accompanied by shotgun libraries and techniques for random amplification.
Since then, hundreds of virome-associated studies have been reported, covering an extremely broad range of hosts, including various wild (Bodewes et al., 2014) and domestic animals (Amimo et al., 2016), insects (Bolling et al., 2015), infected and healthy people (Linsuwanon et al., 2015; Hannigan et al., 2017), plants (Mushegian et al., 2016), and different environments (Aguirre de Cárcer et al., 2015; Bellas et al., 2015). These studies have identified thousands of previously undescribed viruses, some of which have been classified into new viral families (Brum et al., 2015; Paez-Espino et al., 2016; Roux et al., 2016; Shi et al., 2016a). The ultimate goal of these efforts is to re-shape the "virosphere", a term proposed by the microbiologist Curtis Suttle in 2005 (Ash et al., 2006). As more and more viruses are identified, virologists will likely be able to screen systematically for those that infect, or could potentially infect, humans and for those that are pathogenic to humans and/or animals. For these viruses, scientists can develop precise diagnostic methods, vaccines and antiviral drugs before they cause an outbreak. If so, a world with a much lower risk of emerging infectious diseases is conceivable.
The author(s) declare that they have no conflict of interest.
Biodiversity and distribution of polar freshwater DNA viruses
Metagenomic analysis demonstrates the diversity of the fecal virome in asymptomatic pigs in East Africa
Global screening for human viral pathogens
Paradigms in the virosphere
Emergence of Zaire Ebola virus disease in Guinea
Live-animal markets and influenza A (H7N9) virus infection
Analysis of virus genomes from glacial environments reveals novel virus groups with unusual host interactions
Genesis, evolution and prevalence of H5N6 avian influenza viruses in China
Novel avian influenza A (H5N6) viruses isolated in migratory waterfowl before the first human case reported in China
Two novel reassortants of avian influenza A (H5N6) virus in China
Viral metagenomic analysis of feces of wild small carnivores
Insect-specific virus discovery: significance for the Arbovirus community
Methods in virus diagnostics: from ELISA to next generation sequencing
Microarrays for rapid identification of plant viruses
Role of China in the quest to define and control severe acute respiratory syndrome
Detection and sequencing of Zika virus from amniotic fluid of fetuses with microcephaly in Brazil: a case study
Latest Ebola data rule out rapid mutation
A H10N8 virus infection: a descriptive study
Human infections with the emerging avian influenza A H7N9 virus from wet market poultry: clinical analysis and characterisation of viral genome
Transmission characteristics of MERS and SARS in the healthcare setting: a comparative study
Enzootic hepatitis or Rift Valley Fever. An undescribed virus disease of sheep cattle and man from East Africa
Ebola hemorrhagic fever in 2014: the tale of an evolving epidemic
Isolation, identification and genomic characterization of the Asian lineage Zika virus imported to China
Zika Virus (I). Isolations and serological specificity
SARS in CHINA: China's missed chance
Zika virus in the Americas: early epidemiological and genetic findings
Influenza and the live poultry trade
Human infection with a novel avian
Characteristics of traveler with middle east respiratory syndrome
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
The application of genomics to emerging zoonotic viral diseases
Genetic characterization of Zika virus strains: geographic expansion of the Asian lineage
Evolutionary and functional implications of hypervariable loci within the skin virome
Zika virus outside Africa
The evolution of Ebola virus: insights from the 2013-2016 epidemic
Differentiation and identification of Shigella spp. and enteroinvasive Escherichia coli in environmental waters by a molecular method and biochemical test
Preliminary epidemiologic assessment of human infections with highly pathogenic avian influenza A(H5N6) virus, China
Non-coding regions of the Ebola virus genome contain indispensable phylogenetic and evolutionary information
Complete genome sequence of an H10N8 avian influenza virus isolated from a live bird market in southern China
Human infection with highly pathogenic avian influenza A(H7N9) virus
The recent ancestry of Middle East respiratory syndrome coronavirus in Korea has been shaped by recombination
Role of molecular diagnostics in the management of infectious disease emergencies
Dissemination, divergence and establishment of H7N9 influenza viruses in China
A dynamic compartmental model for the Middle East respiratory syndrome outbreak in the Republic of Korea: a retrospective analysis on control interventions and superspreading events
The clinical and virological features of the first imported case causing MERS-CoV outbreak in South Korea
Zika virus disrupts neural progenitor development and leads to microcephaly in mice
Bats are natural reservoirs of SARS-like coronaviruses
Global and local persistence of influenza A(H5N1) virus
Zika virus: a new threat from mosquitoes
The fecal virome of children with hand, foot, and mouth disease that tested PCR negative for pathogenic enteroviruses
Origin and diversity of novel avian influenza A H7N9 viruses causing human infection: phylogenetic, structural, and coalescent analyses
The first imported case of Rift Valley fever in China reveals a genetic reassortment of different viral lineages
Complete genome sequence of Zika virus from the first imported case in mainland China
Evolutionary enhancement of Zika virus infectivity in Aedes aegypti mosquitoes
The role of genomics in tracking the evolution of influenza A virus
Zika virus associated with microcephaly
Changes in the composition of the RNA virome mark evolutionary transitions in green plants
A systematic review of Rift Valley Fever epidemiology 1931-2014
Emergence and pandemic potential of swine-origin H1N1 influenza virus
Intra-host dynamics of Ebola virus during
Middle east respiratory syndrome coronavirus superspreading event involving 81 persons
Uncovering Earth's virome
Human infection with a novel
Severe acute respiratory syndrome
Continuous reassortments with local chicken H9N2 virus underlie the human-infecting influenza A (H7N9) virus in the new influenza season
Genesis of the novel human-infecting influenza A(H10N8) virus and potential genetic diversity of the virus in poultry
A genome sequence of novel SARS-CoV isolates: the genotype
Real-time, portable genome sequencing for Ebola surveillance
New microbiology tools for public health and their implications
Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses
Redefining the invertebrate RNA virosphere
Origin and molecular characterization of the human-infecting H6N1 influenza virus in Taiwan
Increasing genetic diversity of Zika virus in the Latin American outbreak
Emergence and predominance of an H5N1 influenza variant in China
Epidemiology, genetic recombination, and pathogenesis of coronaviruses
Highly pathogenic avian influenza H5N6 viruses exhibit enhanced affinity for human type sialic acid receptor and in-contact transmission in model ferrets
Returning ex-patriot Chinese to Guangdong, China, increase the risk for local transmission of Zika virus
A single mutation in chikungunya virus affects vector specificity and epidemic potential
Genetic diversity and evolutionary dynamics of Ebola virus in Sierra Leone
Zika virus in Brazil and macular atrophy in a child with microcephaly
Two outbreak sources of influenza A (H7N9) viruses have been established in China
Different outcomes of infection of chickens and ducks with a duck-origin H9N2 influenza A virus
Origin and possible genetic recombination of the middle east respiratory syndrome coronavirus from the first imported case in China: phylogenetics and coalescence analysis
Human infection with avian influenza A H6N1 virus: an epidemiological analysis
MERS, SARS, and Ebola: the role of super-spreaders in infectious disease
New evidence suggests southern China as a common source of multiple clusters of highly pathogenic H5N1 Avian influenza virus
Imported case of MERS-CoV infection identified in China
Genomic sequencing and analysis of the first imported Middle East Respiratory Syndrome Coronavirus (MERS CoV) in China
Human infection caused by an avian influenza A (H7N9) virus with a polybasic cleavage site in Taiwan
Origin and molecular characteristics of a novel 2013 Avian influenza A(H6N1) virus causing human infection in Taiwan
A single mutation in the prM protein of Zika virus contributes to fetal microcephaly
Human infections with recently-emerging highly pathogenic H7N9 avian influenza virus in China
Characterization of an H10N8 influenza virus isolated from Dongting lake wetland
Substitution rates of the internal genes in the novel Avian H7N9 influenza virus
Clinical, epidemiological and virological characteristics of the first detected human case of avian influenza A(H5N6) virus
Highly diversified Zika viruses imported to China
Safety and immunogenicity of a novel recombinant adenovirus type-5 vector-based Ebola vaccine in healthy adults in China: preliminary report of a randomised, double-blind
Safety and immunogenicity of a recombinant adenovirus type-5 vector-based Ebola vaccine in healthy adults in Sierra Leone: a single-centre, randomised, double-blind
Biological characterisation of the emerged highly pathogenic avian influenza (HPAI) A(H7N9) viruses in humans