key: cord-1027748-immehelw
authors: Feng, Zhaomin; Cui, Shujuan; Lyu, Bing; Liang, Zhichao; Li, Fu; Shen, Lingyu; Xu, Hui; Yang, Peng; Wang, Quanyi; Zhang, Daitao; Pan, Yang
title: Genomic characteristics of SARS-CoV-2 in Beijing, 2021
date: 2022-05-12
journal: Biosaf Health
DOI: 10.1016/j.bsheal.2022.04.006
sha: 549bdaef9bc92816ea2e9a79930d7bd30047cf80
doc_id: 1027748
cord_uid: immehelw
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread worldwide, giving rise to multiple variants and posing a global threat to public health. To analyze the genomic characteristics and variations of SARS-CoV-2 imported into Beijing, we collected respiratory tract specimens from 112 cases of coronavirus disease 2019 (COVID-19) in Beijing from January to September 2021, including 40 local cases and 72 imported cases. Whole-genome sequences of the viruses were obtained by next-generation sequencing, and variant markers and phylogenetic features of SARS-CoV-2 were analyzed. Across the 112 sequences, mutations were concentrated in the spike protein. D614G was found in all sequences, and mutations including L452R, T478K, P681R/H, and D950N were found in some cases. Phylogenetic analysis assigned the 112 sequences to 23 lineages, among which B.1.1.7 (Alpha) and B.1.617.2 (Delta) were dominant. Our study provides a picture of SARS-CoV-2 variation and could help evaluate the potential risk of COVID-19 for pandemic preparedness and response.
in December 2019, which was highly transmissible and pathogenic and spread rapidly around the world [1, 2]. To date, over 481 million confirmed cases of coronavirus disease and over 6.1 million deaths have been reported globally by the World Health Organization (WHO) [3]. The COVID-19 global pandemic has been a massive disaster for humanity.
SARS-CoV-2 is a member of the family Coronaviridae, subfamily Orthocoronavirinae, and genus Betacoronavirus; it is an enveloped virus with a genome of 29,903 bases [4]. The SARS-CoV-2 genome encodes four structural proteins: nucleocapsid protein (N), envelope protein (E), membrane protein (M), and spike protein (S) [5]. The genome is similar to that of a typical coronavirus and contains at least ten open reading frames (ORFs) [5]. The S protein plays an essential role in receptor binding and subsequent viral entry [6]. Moreover, some significant amino acid substitutions in the S protein might increase infectivity and immune escape [7] [8]. Emerging variants of SARS-CoV-2 therefore pose an increased risk to global public health. There are five variants of concern (VOCs) worldwide, namely Alpha, Beta, Gamma, Delta, and Omicron, which could increase transmissibility, virulence, or the risk of reinfection [9]. Soon after the COVID-19 outbreak, China quickly took strict measures to control the
(43.8%), G215C (33%), S235F (35.7%), and D377Y (43.8%) (Table 3). Some mutations in the membrane protein were observed, such as V23L, M64F, I82T, and L87F; I82T accounted for 49.1%. In the ORF1a protein, T1001I, A1708D, and I2230T each accounted for 35.7%. The P314L mutation in the ORF1b protein was already widespread in SARS-CoV-2 and accounted for 99.1%. T60A in the ORF9b protein accounted for 43.8%.
Phylogenetic analyses were performed to determine the evolution of SARS-CoV-2 in Beijing in 2021. All sequences were classified with the Pangolin COVID-19 Lineage Assigner web application (https://pangolin.coguk.io/). The whole-genome sequences belonged to 23 lineages.
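The percentages reported here are shares of the 112 analyzed genomes, and they can be related back to counts with simple arithmetic. The sketch below is illustrative only: the function names and the toy input are hypothetical, and pairing the 43.8% and 35.7% figures with the 49 Delta and 40 Alpha strains reported in this study is an assumption based on the numbers agreeing, not something the text states.

```python
from collections import Counter

TOTAL_GENOMES = 112  # number of genomes analyzed in the study


def percent_of_total(count, total=TOTAL_GENOMES):
    """Return count expressed as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def mutation_percentages(genome_mutations):
    """Map each mutation to the percentage of genomes that carry it.

    genome_mutations: dict of genome id -> iterable of mutation labels.
    """
    counts = Counter()
    for mutations in genome_mutations.values():
        counts.update(set(mutations))  # count a mutation once per genome
    total = len(genome_mutations)
    return {m: percent_of_total(c, total) for m, c in counts.items()}


# Strain counts reported in the study; linking them to specific
# mutation percentages is an assumption of this sketch.
print(f"{percent_of_total(49):.1f}%")  # 49 of 112 genomes
print(f"{percent_of_total(40):.1f}%")  # 40 of 112 genomes

# Hypothetical toy input, not data from the study.
toy = {
    "genome_1": {"D614G", "L452R"},
    "genome_2": {"D614G"},
    "genome_3": {"D614G", "L452R"},
    "genome_4": {"D614G", "N501Y"},
}
print(mutation_percentages(toy))
```

With real data, `genome_mutations` would come from a variant-calling step that is not described in the surviving text, so its input format here is a guess.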
Forty SARS-CoV-2 strains were grouped into the B.1.1.7 lineage, designated the Alpha variant of concern by the World Health Organization (WHO). Forty-nine strains were classified as the B.1.617.2 lineage, the Delta variant of concern. Four strains were from the B.1.351 lineage, the Beta variant of concern. The phylogenetic tree showed that the strain sequences belonged to different lineages, indicating that the strains originated from different countries (Figure 1).
SARS-CoV-2 undergoes extensive genetic variation during transmission, with a mutation rate of ~10⁻⁶ in each round of replication [13]. As SARS-CoV-2 continues to evolve, many variants emerge around the world. Based on a comparative assessment of variant characteristics and public health risks, WHO designates variants of concern, variants of interest, and variants under monitoring. These variants have caused great concern. Furthermore, the large RNA genome of coronaviruses allows for extra plasticity in genome modification by mutation and recombination, thereby increasing the probability of intraspecies variability and of novel variants emerging under the right conditions [14]. In our study, we analyzed 112 whole-genome sequences of SARS-CoV-2. They belonged to 23 different lineages and contained different mutation sites. The variants of concern Alpha, Beta, and Delta were found, and the Delta variants were dominant.
Some necessary molecular signatures were analyzed. The spike protein is an important structural protein of SARS-CoV-2. Its primary function is to bind, through its receptor-binding domain, to angiotensin-converting enzyme 2 on host cells and to mediate fusion of the virus with the host cell. The receptor-binding domain spans amino acids 319-541 of the spike protein. The majority of SARS-CoV-2 strains carried one to three mutations in the receptor-binding domain, which might increase infectivity and immune escape [15].
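Deciding whether a spike mutation lies in the receptor-binding domain comes down to comparing its residue position with the 319-541 range given above. The following is a minimal sketch of that check, not the authors' pipeline: the helper names and the regular expression are assumptions, and the example labels are spike mutations named in this paper.

```python
import re

# Receptor-binding domain boundaries as stated in the text (inclusive).
RBD_START, RBD_END = 319, 541

# One reference residue, a position, then one alternate residue or "-" (deletion).
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z-])$")


def parse_mutation(label):
    """Split a label such as 'L452R' into (reference, position, alternate)."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def in_rbd(label):
    """Return True if the mutated residue lies inside the receptor-binding domain."""
    _, position, _ = parse_mutation(label)
    return RBD_START <= position <= RBD_END


spike_mutations = ["K417N", "L452R", "T478K", "E484K", "N501Y", "D614G", "D950N"]
rbd_mutations = [m for m in spike_mutations if in_rbd(m)]
print(rbd_mutations)  # ['K417N', 'L452R', 'T478K', 'E484K', 'N501Y']
```

Combined labels such as P681R/H are deliberately rejected by this parser; they would need to be split into P681R and P681H before the check.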
D614G was the earliest mutation to be recognized and to attract attention, and it can enhance infectivity [16]. All SARS-CoV-2 sequences in our study contained the D614G mutation. The furin protease cleavage site (amino acids 681-685) is located between the two subunits of the spike protein; it is key for the virus to enter human host cells and can enhance viral infectivity [17]. The Delta variant carried the P681R mutation at the furin cleavage site, whereas the Alpha variant bore the P681H mutation. The
10.1016/j.chom.2021.11.005.
*The general information on imported cases was collected from 54 cases of COVID-19.
Table 2 The mutations in spike protein from the whole genome sequence. *Representative strains belonging to the 23 lineages were selected. For lineages with multiple strains, we selected the strain with the largest number of mutations in spike protein.
Table 3 The mutations in nucleocapsid protein from the whole genome sequence. *Representative strains belonging to the 23 lineages were selected. For lineages with multiple strains, we selected the strain with the largest number of mutations in spike protein.
Figure 1 The phylogenetic tree based on the whole genome sequences of the SARS- A total of 112 SARS-CoV-2 sequences were analyzed. All sequences belonged to 23 different lineages. Strains from local clusters are shown in red font, and imported strains are shown in green font. The phylogenetic tree was generated with MEGA7.0 using the neighbor-joining method with 1000 bootstrap replicates.
SARS-CoV-2 has spread around the world, giving rise to multiple variants and posing a global threat to public health. In response to the COVID-19 outbreak, surveillance of genomic variation and genetic evolution is of great significance. Emerging variants of SARS-CoV-2 pose an increased risk to global public health.
At present, there are five variants of concern (VOCs) around the world, namely Alpha, Beta, Gamma, Delta, and Omicron, which could increase transmissibility, virulence, or the risk of reinfection. Beijing had a high risk of local outbreaks caused by domestic and international importation. The first cluster outbreak in Beijing was caused by the B.1.1.7 lineage of SARS-CoV-2 in January 2021. Therefore, it is necessary to carry out long-term genetic surveillance. In this study, we analyzed 112 whole genomes of SARS-CoV-2 collected from January
Table 2 Spike mutations: T19R, T22I, L24S, A27S, T29I, A67V, H69-, V70-, D80A, T95I, R102S, D138Y, G142D, Y144-, H146Y, E156-, F157-, R158G, D215G, A222V, L241-, L242-, A243-, T376I, K417N, L452R, S477N, T478K, V483F, E484K, N501Y, A522S, T547I, A570D, D614G, Q675R, P681R/H, A701V, T716I, V772I, D796H, A845S, S937T, L938F, D950N, S982A, G1099D, I1114V, D1118H, K1181R
Table 3 Nucleocapsid mutations: D3L, P20L, N29S, D63G, P67S, E136Q, S194L, R195I, P199L, R203K/M, G204R, T205I, R209G, A211V, G214S, G215C, T205I, M234I, S235F, S327L, D377Y, Q389L, D402Y
[1] A Novel Coronavirus from Patients with Pneumonia in China
[2] A pneumonia outbreak associated with a new coronavirus of probable bat origin
[3] (COVID-19) Weekly Epidemiological Update and Weekly Operational Update
[4] A new coronavirus associated with human respiratory disease in China
[5] The Architecture of SARS-CoV-2 Transcriptome
[6] Emerging coronaviruses: Genome structure, replication, and pathogenesis
[7] SARS-CoV-2 501Y.V2 variants lack higher infectivity but do have immune escape
[8] The effect of spike mutations on SARS-CoV-2 neutralization
[9] WHO, Tracking SARS-CoV-2 variants
[10] Genomic characteristics of SARS-CoV-2 from the first outbreak in clusters caused by VOC202012/01-like variant in China
[11] Tracing infection source of an outbreak in Beijing caused by an imported asymptomatic case of COVID-19
[12] Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability
[13] SARS-CoV-2 (COVID-19) by the numbers, eLife
[14] Genetic Recombination, and Pathogenesis of Coronaviruses
[15] Antibody-Mediated Neutralization of Authentic SARS-CoV-2 B Variants Harboring L452R and T478K/E484Q
[16] SARS-CoV-2 spike D614G change enhances replication and transmission
[17] Functional evaluation of the P681H mutation on the proteolytic activation of the SARS-CoV-2 variant B.1.1.7 (Alpha) spike, iScience
[18] Nucleocapsid mutations R203K/G204R increase the infectivity, fitness, and virulence of SARS-CoV-2