key: cord-0790212-fw9jpkdq authors: Lai, Alessia; Bergna, Annalisa; Acciarri, Carla; Galli, Massimo; Zehender, Gianguglielmo title: Early phylogenetic estimate of the effective reproduction number of SARS‐CoV‐2 date: 2020-03-03 journal: J Med Virol DOI: 10.1002/jmv.25723 sha: 237ea115852b2581ae3b79f5c5ed3c1991a37f5d doc_id: 790212 cord_uid: fw9jpkdq

To reconstruct the evolutionary dynamics of the 2019 novel coronavirus that recently caused an outbreak in Wuhan, China, 52 SARS-CoV-2 genomes available on 4 February 2020 at the Global Initiative on Sharing All Influenza Data were analyzed. The two models used to estimate the reproduction number (coalescent-based exponential growth and a birth-death skyline method) indicated an estimated mean evolutionary rate of 7.8 × 10⁻⁴ subs/site/year (range, 1.1 × 10⁻⁴ to 15 × 10⁻⁴) and a mean tMRCA of the tree root of 73 days. The estimated R value was 2.6 (range, 2.1-5.1), and increased from 0.8 to 2.4 in December 2019. The estimated mean doubling time of the epidemic was between 3.6 and 4.1 days. This study shows the usefulness of phylogeny in supporting the surveillance of emerging new infections even as the epidemic is growing.

On 30 January 2020, the World Health Organization declared that the outbreak of an infection due to a novel coronavirus (SARS-CoV-2) was a "Public Health Emergency of International Concern" (https://www.who.int/news-room/detail/30-01-2020-statement-on-the-second-meeting-of-the-international-health-regulations-(2005)-emergency-committee-regarding-the-outbreak-of-novel-coronavirus-(2019-nCoV)). Emerging as a human pathogen in the Chinese city of Wuhan, SARS-CoV-2 (https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200121-sitrep-1-2019-ncov.pdf?sfvrsn=20a99c10_4) has caused a widespread outbreak of febrile respiratory illness and, as of 13 February 2020, there were 60 349 confirmed cases (including 527 outside mainland China) and a total of 1360 fatalities (https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6). Belonging to the β-coronavirus genus of the Coronaviridae family, SARS-CoV-2 is closely related to SARS-CoV, as there is more than 70% nucleotide similarity in their approximately 30 kb long genomes [1]. A recent study has supported the view that, like other β-coronaviruses causing human infections such as SARS-CoV and MERS-CoV, SARS-CoV-2 originated from bats, and reported 96% genomic identity with a previously detected SARS-like bat coronavirus [2, 3]. However, it remains unclear whether the spillover also involved a different intermediary animal host. In the case of such an epidemic, it is important to estimate as reliably as possible the basic reproductive number (R0, the number of cases generated from a single infected person) and the dynamics of transmission. The aim of this study was to investigate the temporal origin, rate of viral evolution and population dynamics of SARS-CoV-2 using the 52 full genomes of viral strains, sampled in different countries on known dates, that were available when the study was performed.

In memory of Li Wenliang, Carlo Urbani, and of all the doctors and health workers who endangered their lives in the fight against epidemics.

The analysis was based on 52 SARS-CoV-2 sequences publicly available at the Global Initiative on Sharing All Influenza Data (GISAID) on 4 February 2020 (https://www.gisaid.org/). The accession IDs, sampling dates and locations are summarized in Table S1.
The sequences were aligned using the ClustalW Multiple Alignment program included in the accessory applications of the BioEdit software, manually checked, and cropped to a final length of 29 774 bp using BioEdit v.7.2.6.1 (http://www.mbio.ncsu.edu/bioedit/bioedit.html). The simplest evolutionary model best fitting the sequence data was selected using jModelTest v.2.1.7 [4].

The basic reproductive number (R0) was calculated on the basis of the exponential growth rate (r) using the equation $R_0 = rD + 1$, where D is the average duration of infectiousness estimated as described below [8]. The doubling time of the epidemic was estimated directly by setting the tree prior to the coalescent exponential growth model with doubling-time parameterization. The birth-death skyline model implemented in BEAST 2.48 was used to infer changes in the effective reproductive number (Re) and other epidemiological parameters such as the death/recovery rate (δ), the transmission rate (λ), the origin of the epidemic, and the sampling proportion (ρ) [9]. Given that the samples were collected during a short period of time, a "birth-death contemporary" model was used. The analyses were based on the previously selected HKY substitution model and the evolutionary rate was set to the value of

The Bayesian skyline plot (BSP) showed a rapid increase in the number of infections in a period between approximately 45 and 30 days before the end of January 2020 (Figure 1A). The IDs and available data of the sequences involved in the clades are shown in Table S1. The estimated growth rate under the exponential growth model was 0.218 days⁻¹, corresponding to an R0 estimate of 2.6 (CI, 2.1-5.1). The direct estimation of the doubling time of the epidemic gave a mean of 3.6 days (varying from 1.0 to 7.7).
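The relation between the exponential growth rate, R0 and the doubling time can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: it uses the linear relation R0 = rD + 1 commonly applied to coalescent growth rates, and the duration of infectiousness `D_ASSUMED` = 7.5 days is a hypothetical value chosen for illustration rather than a figure taken from the text. The function names are likewise illustrative.

```python
import math


def r0_from_growth_rate(r: float, d: float) -> float:
    """Return R0 = r * D + 1 (r in 1/days, D in days)."""
    return r * d + 1.0


def doubling_time(r: float) -> float:
    """Return the doubling time (days) of exponential growth at rate r (1/days)."""
    if r <= 0:
        raise ValueError("growth rate must be positive for a doubling time")
    return math.log(2) / r


# Growth rate reported in the text (per day).
R_REPORTED = 0.218
# ASSUMPTION: hypothetical duration of infectiousness, for illustration only.
D_ASSUMED = 7.5

if __name__ == "__main__":
    print(f"R0 under the assumed D: {r0_from_growth_rate(R_REPORTED, D_ASSUMED):.2f}")
    print(f"Doubling time implied by r: {doubling_time(R_REPORTED):.2f} days")
```

With these inputs the sketch gives an R0 close to the reported 2.6. The doubling time implied by ln(2)/r is somewhat shorter than the directly estimated mean of 3.6 days, which came from a separate doubling-time parameterization and so need not coincide with it.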
Figure 1B shows the Bayesian birth-death skyline plot of the Re estimates with 95% HPD and indicates that Re increased from less than 1 (mean, 0.8; 95% HPD, 0.3-1.3) to a mean value of 2.4 (95% HPD, 1.5-3.5) in December 2019, and has since remained at this value. The estimation allowing a single Re gave a mean value of 1.85 (95% HPD, 1.37-2.4). Table 1 shows the parameters estimated using the birth-death skyline plot. The epidemic originated an estimated mean of 3.7 months (CI, 3-4) before the present (BP), corresponding to October to November 2019, before the root of the tree (3.6 months BP). The estimated recovery rate (the time to becoming noninfectious) was

One of the most important epidemiological parameters when monitoring an epidemic is R0 (ie, the number of secondary cases induced by a single infected individual in a totally susceptible population) because it is fundamental for assessing the potential spread of a microorganism. Its value changes during an epidemic, when it is called the effective reproduction number (Re). R0 is usually estimated on the basis of the growth rate of the number of cases. The available epidemiological estimates of SARS-CoV-2 R0 range from 2.2 to 2.9, although they changed from 1.4 to more than 7 during the first phases of the epidemic [10, 14]. Recently developed evolutionary models have made it possible to estimate epidemiological parameters on the basis of phylogenesis [9, 15], and coalescent and birth-death methods were used here to estimate R0 and the changes in the Re of the SARS-CoV-2 epidemic during a short period of time. This has allowed us to make a preliminary estimate that the mean R0 from the beginning of the epidemic to the first days of February 2020 was 2.6 (range, 2.1-5.1), and the birth-death skyline analysis showed an increase in Re from less than 1 to 2.4 (CI, 1.5-3.5) during December 2019. This agrees with the BSP analysis showing an increase in the number of infections in the same period of time.
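The birth-death quantities discussed here are linked by simple relations: in the birth-death skyline framework, Re is the ratio of the transmission rate λ to the becoming-noninfectious rate δ, and the epidemic growth rate is λ − δ. The Python sketch below illustrates these relations. The numerical values of `DELTA_ASSUMED` and `LAM_ASSUMED` are hypothetical placeholders, because the estimated rates are not reproduced in the text, and the function names are illustrative.

```python
import math


def effective_r(lam: float, delta: float) -> float:
    """Return Re = lambda / delta (both rates in 1/days)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return lam / delta


def growth_rate(lam: float, delta: float) -> float:
    """Return the epidemic growth rate lambda - delta (1/days)."""
    return lam - delta


def doubling_time_bd(lam: float, delta: float) -> float:
    """Return the doubling time ln(2) / (lambda - delta), defined only when Re > 1."""
    r = growth_rate(lam, delta)
    if r <= 0:
        raise ValueError("epidemic is not growing (lambda <= delta)")
    return math.log(2) / r


# ASSUMPTION: hypothetical rates chosen so that Re equals the reported 2.4.
DELTA_ASSUMED = 0.10               # becoming-noninfectious rate, per day
LAM_ASSUMED = 2.4 * DELTA_ASSUMED  # transmission rate, per day

if __name__ == "__main__":
    print(f"Re: {effective_r(LAM_ASSUMED, DELTA_ASSUMED):.2f}")
    print(f"Doubling time: {doubling_time_bd(LAM_ASSUMED, DELTA_ASSUMED):.2f} days")
```

Because Re below 1 implies λ < δ, the doubling-time function deliberately raises an error in that regime, which corresponds to the early phase in which Re was estimated at 0.8.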
Commonly, Re decreases during an epidemic because of the decrease in the number of susceptible individuals. However, an increase in Re could be due to an increase in the transmissibility of the virus or in the contact rates within the population [16]. It is, therefore, possible to hypothesize, on the basis of our data, that the first passage of the virus from animal to human occurred through rather inefficient and still unknown transmission modes, causing relatively few cases in the early period (before December). In December, the virus may then have acquired a more efficient mode of human-to-human transmission (ie, through droplets), causing the exponential growth that was also detected by the skyline plot. On the same basis, the estimated epidemic doubling time was 3.6 days with a CI between 1 and 7 days. We also tried to calculate it on the basis of the transmission (λ) and recovery rate (δ) estimated using days [10]. The difference in the estimate obtained here may be due to the increased epidemic growth rate observed during the last days of January, or to the initial delay in recognizing and reporting new cases.

This preliminary study has some limitations. The R values and doubling times were estimated phylogenetically using all of the whole genomes available in a public database at the time the study was carried out (https://www.gisaid.org/). Given the small number of sequences and the relatively short sampling period, the CIs are wide and limit the precision of the estimates. Moreover, the analysis included isolates collected outside mainland China, as it is assumed that they all belong to the same epidemic originating in Wuhan. Serial intervals were used to estimate the duration of infectiousness, although we do not yet have any information concerning the possible existence and duration of a latent (preinfectious) period that would contribute to the serial interval.
In conclusion, these results allowed us to make a phylogenetic estimate of the R0 of SARS-CoV-2 infection that is similar to that obtained using conventional epidemiological methods [18].

We acknowledge the authors and the originating and submitting laboratories of the sequences from GISAID.

1. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
2. Full-genome evolutionary analysis of the novel corona virus (2019-nCoV) rejects the hypothesis of emergence as a result of a recent recombination event
3. A pneumonia outbreak associated with a new coronavirus of probable bat origin
4. jModelTest: phylogenetic model averaging
5. Bayesian phylogenetics with BEAUti and the BEAST 1.7
6. Bayesian selection of continuous-time Markov chain evolutionary models
7. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty
8. The epidemic behavior of the hepatitis C virus
9. Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV)
10. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
11. Comparative population dynamics of HIV-1 subtypes B and C: subtype-specific differences in patterns of epidemic growth
12. Transmission dynamics and control of severe acute respiratory syndrome
13. Hospital outbreak of Middle East respiratory syndrome coronavirus
14. Transmission dynamics of 2019 novel coronavirus (2019-nCoV)
15. Evolutionary dynamics of the lineage 2 West Nile virus that caused the largest European epidemic: Italy
16. Temporal variations in the effective reproduction number of the 2014 west Africa ebola outbreak
17. Estimating the unreported number of novel coronavirus (2019-nCoV) cases in China in the first half of January 2020: a data-driven modelling analysis of the early outbreak
18. Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak

The authors declare that there are no conflicts of interest.

AL, GZ, and MG conceived and designed the study. AB, CA, and AL collected data and prepared the datasets. GZ, AL, and AB participated in the phylogenetic analyses. AL, GZ, AB, and MG wrote the first draft of the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.

http://orcid.org/0000-0002-3174-5721
Massimo Galli http://orcid.org/0000-0001-8887-6215
Gianguglielmo Zehender http://orcid.org/0000-0002-1886-2915