key: cord-0772161-7zkuftr3 authors: Wang, Liang; Didelot, Xavier; Bi, Yuhai; Gao, George F. title: Assessing the extent of community spread caused by mink-derived SARS-CoV-2 variants date: 2021-06-07 journal: Innovation (N Y) DOI: 10.1016/j.xinn.2021.100128 sha: eea366e765a8226a6bc6d1a9e824a4d1d2a5d486 doc_id: 772161 cord_uid: 7zkuftr3

SARS-CoV-2 has recently been found to have spread from humans to minks and then to have been transmitted back to humans. However, the extent of human-to-human transmission caused by the variant is unknown. Here, we used publicly available SARS-CoV-2 genomic sequences from both humans and minks collected in Denmark and the Netherlands, and combined phylogenetic analysis with Bayesian inference under an epidemiological model to assess the possibility of person-to-person transmission. The results showed that at least 12.5% of all people infected with the dominant mink-derived SARS-CoV-2 variants in Denmark and the Netherlands were infected through human-to-human transmission, indicating that this “back-to-human” SARS-CoV-2 variant has already caused human-to-human transmission. Our study also indicates the need to monitor this mink-derived and other animal-source “back-to-human” SARS-CoV-2 in the future, and that prevention and control measures should be tailored to avoid large-scale community transmission caused by viruses that jump between animals and humans.

Coronavirus disease 2019 (COVID-19) is caused by a novel coronavirus (known as SARS-CoV-2, 2019-nCoV, or HCoV-19) 1-3 that had led to more than 100 million infections and at least 1.2 million deaths worldwide as of 10 November 2020, posing a global public health concern. 4 Apart from humans, natural infection with SARS-CoV-2 has been found in several other mammalian species through contact with COVID-19 patients, such as cats, 5 lions, 6 tigers, 6 dogs, 7 and minks.
8 Other animals (e.g., rabbit, pig, fox, mink, and civet) have also been considered possibly susceptible hosts of SARS-CoV-2, based on entry assays with pseudotyped virus carrying the S gene of SARS-CoV-2 and on the binding affinity between the receptor binding domain (RBD) of S and the host ACE2 protein. 9 In addition to human-to-animal transmission, SARS-CoV-2 in minks (Neovison vison), into which it was initially introduced from humans, could also be transmitted back to humans. 10 The virus was also shown to have acquired mink-adapted mutations such as Y453F, F486L, and N501T. 11 Since cross-species transmission has occurred and SARS-CoV-2 can be transmitted back to humans from minks, it is important to clarify whether the "back-to-human" SARS-CoV-2 with mink-adapted mutations could further lead to transmission among humans. However, the reported study did not reach a conclusion on this point, but instead speculated that person-to-person transmission may have occurred. 10 Genomic sequences can be used to trace person-to-person transmission of SARS-CoV-2, 12 which offers an opportunity to confirm whether there was person-to-person transmission of the "back-to-human" SARS-CoV-2, even when epidemiological tracing information is unavailable or incomplete.

The main mink fur producing countries are Denmark, the Netherlands, Poland, and China. 13 Europe is thus the main mink production area. Furthermore, mink fur delivered from European farms and sold at auction was worth 1.2 billion Euros in 2016. 14 Since tens of millions of minks have been culled to prevent further mutation and spread of the virus, the mink-derived SARS-CoV-2 variants (defined as those isolated from minks) have dealt a catastrophic blow to the mink farming industry.
In this study, we used publicly available SARS-CoV-2 genomic sequences and combined phylogenetic analysis with Bayesian inference under an epidemiological model to infer the probability of direct transmission between patients infected with mink-derived SARS-CoV-2 variants in Denmark and the Netherlands, and thereby to evaluate the extent of human-to-human transmission caused by these variants.

We retrieved genomic data from GISAID 15 on Jan 6 2021. We discarded genomic data with no exact collection date (accurate to the day). Mink-derived sequences were defined as SARS-CoV-2 genomes isolated from minks. Since the dominant mink-derived SARS-CoV-2 genomes belonged to lineages B.1.298 and B.1.8 for Denmark and the Netherlands, only human-derived and mink-derived SARS-CoV-2 genomes from these 2 lineages for Denmark and the Netherlands were used. Genomic sequences were aligned using MAFFT v7.310. 16 We then trimmed the uncertain regions at the 3′ and 5′ termini and masked 30 sites (Supplementary Table 1) that are highly homoplastic and carry no phylogenetic signal, as previously noted (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473). As recombination could affect the evolutionary signal, we searched for recombination events in these SARS-CoV-2 genomes using RDP4. 17 No evidence of recombination was found in our dataset. We used jModelTest v2.1.6 18 to find the best substitution model for each country's dataset according to the Bayesian Information Criterion. The best substitution models for the datasets from Denmark and the Netherlands were HKY and GTR+I, respectively. The genomic sequences used in this study are listed in Supplementary Table 2 and were openly shared via the GISAID initiative. 19 We then used the Bayesian Markov Chain Monte Carlo (MCMC) approach implemented in BEAST v1.10.4 20 to derive a dated phylogeny for SARS-CoV-2.
Three replicate runs of 100 million MCMC steps each were performed, sampling parameters and trees every 10,000 steps. As the genomic sequences used in each dataset were all from the same lineage, we assumed that they followed a strict molecular clock. The most appropriate coalescent model for Bayesian phylogenetic analysis was determined using both path sampling and stepping-stone sampling. 21 The best-fitting coalescent prior was the Bayesian skyline tree prior for both datasets (Supplementary Table 3). Tracer 1.7.1 22 was then used to check the convergence of the MCMC chains (effective sample size >200) and to compute marginal posterior distributions of parameters, after discarding 10% of the MCMC chain as burn-in. We also reconstructed the host for each ancestral branch by using the Bayesian asymmetric discrete trait evolution model 23 Table 4 ).

As viral genomes were incompletely sampled and the pandemic is currently ongoing, TransPhylo v1.4.4 25 was used to infer the transmission tree using the dated phylogeny generated above as input. The generation time (i.e., the time gap from infection to onward transmission, denoted as G) of COVID-19 was previously estimated as 4.8±1.7 days, 26 and we used these values to compute the shape and scale parameters of a gamma distribution of G using the R package epitrix. 27 The distribution of sampling time (i.e., the time gap from infection to detection and sampling) was set equal to the distribution of generation time. We performed the TransPhylo analysis with at least 500,000 iterations, simultaneously estimating the transmission tree, the sampling proportion, the within-host coalescent time Neg, and the two parameters of the negative binomial offspring distribution (which represents the number of secondary cases caused by each infection).
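The conversion of the reported generation time (4.8±1.7 days) into gamma shape and scale parameters follows from the method of moments. The analysis itself used the R package epitrix for this step; the Python sketch below only illustrates the same arithmetic, assuming that 4.8 and 1.7 are the mean and standard deviation. The function name `gamma_shape_scale` is hypothetical.

```python
def gamma_shape_scale(mean, sd):
    """Method-of-moments gamma parameters from a mean and standard deviation.

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2, which gives:
        shape = (mean / sd)**2
        scale = sd**2 / mean
    """
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must be positive")
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


# Generation time from the text: 4.8 +/- 1.7 days
# (assumption: mean +/- standard deviation).
shape, scale = gamma_shape_scale(4.8, 1.7)
print(f"shape = {shape:.3f}, scale = {scale:.3f} days")
```

Under these assumptions the shape is about 7.97 and the scale about 0.60 days, and their product recovers the mean of 4.8 days.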
All results were generated after discarding the first part of the MCMC chains as burn-in (Supplementary Table 5). Since the inference of the transmission tree and the subsequent estimation of the probability of direct transmission were based solely on a dated phylogeny, we then tested whether uncertainty in the phylogeny affected the result. Ten dated phylogenetic trees were randomly selected from the MCMC chains for TransPhylo analysis. The parameter settings were the same as above. The estimated bidirectional probability of direct transmission for each patient pair was visualized with Cytoscape v3.8.2. 28

As of Jan 6 2021, a total of 761 mink-derived SARS-CoV-2 genomes were available. They came from four countries: Canada (4 genomes), Denmark

We used a discrete trait analysis to infer the ancestral host for each branch. An independent cross-species transmission event was considered to have occurred only if a clade met both of the following criteria: i) the two branches directly descending from the root of the clade have different hosts, and ii) the posterior probabilities of both the branch and the ancestral host at the root of the clade are > 0.8. In the Danish dataset, we found three independent cross-species transmission events (Figure 2). All of them were caused by human-to-mink transmission. In addition, we found six human SARS-CoV-2 genomes (in Clade I) that were close to mink-derived viral genomes, indicating that they were highly likely to have been transmitted from mink to human. However, we could not determine how many independent cross-species transmission events occurred because of the low posterior probability of the branches. In the Netherlands, three independent cross-species transmission events occurred in lineage B.1.298 (Figure 3). One was transmitted from human to mink; the other two were caused by transmission from mink to human and contained one and five cases, respectively.
We also found a cluster, denoted Clade I, containing nine human SARS-CoV-2 genomes that were close to mink-derived viral genomes, indicating that they were highly likely to have been transmitted from mink to human. However, we could only be sure that at least one independent cross-species transmission event had occurred among them. We also found four additional human SARS-CoV-2 genomes scattered within Clade III, indicating that these four patients were also infected with mink-derived SARS-CoV-2 variants. In total, we identified 18 patients infected with mink-derived SARS-CoV-2 variants.

We further tested whether there were human-to-human transmission events among those infected with mink-derived SARS-CoV-2 variants by calculating the probability of direct transmission between them. To reduce the computational burden, only clades whose root had a high posterior probability and that contained humans infected with mink-derived SARS-CoV-2 variants were used for further analysis. In the Danish dataset, there were three patient pairs (D2/D3, D5/D6, and D1/D3) with a bidirectional probability of direct transmission >0.5 (0.998, 0.731, and 0.607, respectively) (Figure 4A). Meanwhile, the numbers of intermediates between D2/D3, D5/D6, and D1/D3 were estimated as 0.002, 0.271, and 0.412, respectively (Figure 4B). These results suggested that transmission within each of these three patient pairs was more likely to have been direct. In the Dutch dataset, we also found two pairs of patients in Clade I (N7/N8 and N3/N4) with a bidirectional probability of direct transmission >0.5 (0.95 and 0.931, respectively) (Figure 4C). In addition, the numbers of intermediates between N7/N8 and N3/N4 were estimated as 0.05 and 0.069, respectively (Figure 4D).
There were also two pairs of patients in Clade II (N10/N11 and N13/N14) with a bidirectional probability of direct transmission >0.5 (0.989 and 0.978, respectively) (Figure 4E). In addition, the numbers of intermediates between N10/N11 and N13/N14 were estimated as 0.011 and 0.022, respectively (Figure 4F).

Since the limited variation detected in mink-derived SARS-CoV-2 variants could not yield a highly resolved phylogeny, we next tested how phylogenetic uncertainty affected the result by repeating the analysis on 10 trees randomly selected from the MCMC chain. In the Danish dataset, the cluster with D1, D2, and D3 always contained a patient pair with a high bidirectional probability of direct transmission (Figure 5). However, the bidirectional probability of direct transmission for D5 and D6 was lower than 0.5 in 2 randomly selected trees, indicating that the inference of direct transmission between D5 and D6 could be affected by phylogenetic uncertainty (Figure 5). In this case, we could conclude that only one person-to-person transmission event occurred in the Danish dataset. For the Dutch dataset, we found a pattern similar to that in the Danish dataset: phylogenetic uncertainty strongly affected the inference of who infected whom. However, at least one direct transmission event with a high bidirectional probability occurred in each cluster across the 10 randomly selected phylogenies (Figure 6). In addition, N10-N14 were all employees of the same mink farm, which makes direct transmission between them more likely. In this case, we could conclude that at least two person-to-person transmission events occurred in the Dutch dataset. In summary, we identified at least 3 direct transmission events with high bidirectional probability among humans infected with mink-derived SARS-CoV-2 variants in Denmark and the Netherlands.
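The selection of patient pairs by bidirectional probability can be illustrated with a small sketch. It assumes that the bidirectional probability for a pair is the sum of the two directed probabilities (A infected B plus B infected A), which the text does not state explicitly. The directed values and patient labels P1 to P3 are made up for illustration, the function name `bidirectional_pairs` is hypothetical, and only the 0.5 threshold comes from the text.

```python
from itertools import combinations


def bidirectional_pairs(directed, threshold=0.5):
    """Return pairs whose summed two-way probability exceeds the threshold.

    `directed` maps (infector, infectee) to the posterior probability of
    direct transmission in that direction. Missing entries count as 0.
    """
    patients = sorted({p for pair in directed for p in pair})
    selected = {}
    for a, b in combinations(patients, 2):
        prob = directed.get((a, b), 0.0) + directed.get((b, a), 0.0)
        if prob > threshold:
            selected[(a, b)] = prob
    return selected


# Made-up directed probabilities for three hypothetical patients.
directed = {
    ("P1", "P2"): 0.40,
    ("P2", "P1"): 0.35,
    ("P1", "P3"): 0.10,
    ("P3", "P2"): 0.05,
}

for (a, b), prob in bidirectional_pairs(directed).items():
    print(f"{a}/{b}: {prob:.2f}")
```

Summing both directions makes the selection insensitive to who infected whom, which is the quantity that remains informative when few mutations separate the genomes and direction is hard to resolve.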
These events accounted for 12.5% of all people infected with mink-derived SARS-CoV-2 variants in this study.

We also found some mutations arising in the "back-to-human" SARS-CoV-2 genomes compared with their closely related mink-derived ones. In the Danish dataset, C2062T (located at the 5' terminal of the SARS-CoV-2 genome) was detected in D4. However, this mutation was not detected in other closely related human SARS-CoV-2 genomes or in the closest mink-derived variant. A nonsynonymous mutation (C12008T, resulting in Leu3915Phe in ORF1ab) was lost in both D5 and D6 compared with their closest related mink-derived variant. In the Dutch dataset, more mutations were detected, but no mutation was shared by all human genomes. Given also the limited number of mink-derived SARS-CoV-2 genomes from humans, we are currently unable to determine whether there are human-adaptive mutations.

Since the SARS-CoV-2 carried by mink could be transmitted back to humans, 10 it has led to the mass culling of infected animals, posing a huge threat to public health and the economy. The first things to evaluate are whether the variant can continue to spread from person to person and the extent of current human-to-human transmission. We found that the lineages of the dominant strains differed between countries (Figure 1), indicating that cross-species transmission of SARS-CoV-2 from human to mink was not lineage specific. In other words, many phylogenetic subtypes of SARS-CoV-2 can be transmitted from human to mink. However, whether all subtypes of SARS-CoV-2 can be transmitted to mink is still unknown and needs further research. Several independent cross-species transmission events were identified in this study, in both the human-to-mink and mink-to-human directions. We also detected at least three human-to-human transmission events with high bidirectional probability in this study.
However, we are not sure who infected whom, mainly because of the phylogenetic uncertainty caused by limited mutations. The phylogenetic uncertainty also led to different numbers of direct transmission events for each dataset (Figures 5 and 6). Yet there was always at least one direct transmission event with high bidirectional probability in each dataset. Under these circumstances, we could conclude that at least three direct transmission events were identified in Denmark and the Netherlands, accounting for at least 12.5% of all people infected with mink-derived SARS-CoV-2 variants in this study. However, the extent of human-to-human transmission caused by mink-derived SARS-CoV-2 variants is likely underestimated, for two reasons. First, not all viral genomes from patients infected with mink-derived SARS-CoV-2 variants are currently available. Second, the criteria for identifying direct transmission events were strict in this study, leading to a low true positive rate. Therefore, mink-derived SARS-CoV-2 variants in humans, and the extent of person-to-person transmission caused by them, should be continuously monitored. Although mink is so far the only species that can be easily infected by humans with SARS-CoV-2 and then spill mutants back to humans, this phenomenon might also exist in other non-human mammals that can be infected by SARS-CoV-2 and are in frequent contact with humans. Under these circumstances, contact between humans and susceptible animals should be handled with caution to prevent humans from transmitting SARS-CoV-2 to animals, and thus to prevent the virus from continuously circulating and evolving in animals.
This will not only minimize the impact of SARS-CoV-2 on the breeding industry, as increased mortality was detected in farmed minks that were positive for SARS-CoV-2 RNA, 29 but also decrease the probability of generating novel and unpredictable mutants of SARS-CoV-2 within animals that could threaten public health.

A novel coronavirus from patients with pneumonia in China
A distinct name is needed for the new coronavirus
A novel coronavirus genome identified in a cluster of pneumonia cases - Wuhan
A novel coronavirus outbreak of global health concern
SARS-CoV-2 natural transmission from human to cat
From people to panthera: natural SARS-CoV-2 infection in tigers and lions at the Bronx Zoo
Recurrent mutations in SARS-CoV-2 genomes isolated from mink point to rapid host-adaptation
Inference of person-to-person transmission of COVID-19 reveals hidden super-spreading events during the early outbreak phase
Mink, SARS-CoV-2, and the human-animal interface
European mink industry - socio-economic impact assessment
GISAID: global initiative on sharing all influenza data - from vision to reality
MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
RDP4: detection and analysis of recombination patterns in virus genomes
jModelTest 2: more models, new heuristics and parallel computing
Data, disease and diplomacy: GISAID's innovative contribution to global health
Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol
Accurate model selection of relaxed molecular clocks in Bayesian phylogenetics
Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst Biol
Bayesian phylogeography finds its roots
Bayesian evaluation of temporal signal in measurably evolving populations
Genomic infectious disease epidemiology in partially sampled and ongoing outbreaks
Meta-analysis of the SARS-CoV-2 serial interval and the impact of parameter uncertainty on the COVID-19 reproduction number
epitrix: small helpers and tricks for epidemics analysis
Cytoscape: A software environment for integrated models of biomolecular interaction networks
SARS-CoV-2 infection in farmed minks, the Netherlands