key: cord-0962289-ivazy62l
authors: Hossain, Mohammad Uzzal; Ahammad, Ishtiaque; Bhattacharjee, Arittra; Chowdhury, Zeshan Mahmud; Hossain Emon, Md. Tabassum; Chandra Das, Keshob; Keya, Chaman Ara; Salimullah, Md.
title: Whole genome sequencing for revealing the point mutations of SARS-CoV-2 genome in Bangladeshi isolates and their structural effects on viral proteins
date: 2021-12-03
journal: RSC Advances
DOI: 10.1039/d1ra05327b
sha: 48b4a057b154918b377dc21d7b798838afa6285c
doc_id: 962289
cord_uid: ivazy62l

Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has already killed more than one million people worldwide. Since the novel coronavirus is a new virus, mining its genome sequence is of crucial importance for drug and vaccine development. Whole genome sequencing is a helpful tool for identifying the genetic changes that occur in a virus as it spreads through the population. In this study, we performed complete genome sequencing of SARS-CoV-2 to unveil genomic variations and indels, if present. We discovered thirteen (13) mutations in the Orf1ab, S and N genes, of which seven (7) turned out to be novel mutations in our sequenced isolate. Besides, the indel analysis found one (1) insertion and seven (7) deletions among the 323 Bangladeshi isolates. However, the indels did not show any effect on proteins. Our energy minimization analysis showed both stabilizing and destabilizing impacts on viral proteins, depending on the mutation. Interestingly, all the variants were located in the binding sites of the proteins. Furthermore, drug binding analysis revealed marked differences in the interacting residues of the mutants compared to the wild type. Our analysis also suggested that eleven (11) mutations could exert damaging effects on their corresponding protein structures.
COVID-19 can currently be considered a menace to humankind. It is caused by the novel Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), which began its journey from the city of Wuhan in Hubei province, People's Republic of China. [1-8] The infection primarily targets the respiratory system of its host, causing influenza-like illness with symptoms such as cough, fever and, in more serious cases, difficulty breathing. [9-15] According to the available data, mortality is higher in individuals of advanced age (>60 years) and in those with comorbidities. [16-21] Apart from severe respiratory problems, COVID-19 has been shown to cause systemic inflammation leading to sepsis, [22-25] acute cardiac injury, [26-31] heart failure [26, 32-35] and multiorgan failure in critical patients. [36] COVID-19 was declared a global pandemic by the World Health Organization (WHO), having spread to more than 200 countries and territories around the world. [37-42]

Coronaviruses (CoVs) are enveloped, single-stranded, positive-sense (+) RNA viruses that are pathogenic to their hosts. [43-47] SARS-CoV-2 is the causative agent of COVID-19 and is more pathogenic than the previously observed SARS-CoV (2002) and Middle East respiratory syndrome coronavirus (MERS-CoV, 2012). [48-57] There is a dire need to examine the virus more comprehensively in order to understand its pathogenesis and destructiveness and to develop effective therapeutic measures. [58]

CoVs belong to the Coronaviridae family under the order Nidovirales. They are grouped into four genera: α-, β-, γ-, and δ-coronaviruses. [59] Among them, α- and β-CoVs infect mammals, γ-coronaviruses infect avian species, while δ-coronaviruses infect both.
SARS-CoV, mouse hepatitis coronavirus (MHV), MERS-CoV, bovine coronavirus (BCoV), bat coronavirus HKU4, and human coronavirus OC43, as well as SARS-CoV-2, are β-coronaviruses. [60] Zoonotic transmission is the route of origin for each of the three CoVs (SARS-CoV, MERS-CoV, and SARS-CoV-2), and they spread through close contact. The basic reproduction number (R0) for person-to-person spread of SARS-CoV-2 is around 2.2-2.7, which implies that the number of confirmed cases grows at a striking exponential rate. [61]

At 26 to 32 kb, CoVs have the largest genomes among RNA viruses. [62] The essential enzymes and structural proteins of SARS-CoV-2 share approximately 90% identity with those of SARS-CoV. SARS-CoV-2 contains four structural proteins, known as the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. These proteins share high sequence similarity with the corresponding proteins of SARS-CoV and MERS-CoV. Hence, it is vital to scrutinize the SARS-CoV-2 genome to determine why this infection tends to be more infectious and lethal than its predecessors. Using Sanger sequencing and next-generation whole genome sequencing of SARS-CoV-2 isolates from oropharyngeal samples, we described the genomic portraits of two genomes alongside other Bangladeshi strains. [63] In this study, we analyzed the genomic arrangement of SARS-CoV-2 to identify the mutations within the genomes and to anticipate their effects on protein structure from a structural biology perspective, in order to shed light on suitable therapeutics against this deadly virus.

Oropharyngeal samples were collected from two COVID-19 patients.
The specimens that yielded the two viral genomes, SARS-CoV-2/human/BGD/NIB_01/2020 and SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020, were obtained using the UTM™ kit containing 1 mL of viral transport medium (Copan Diagnostics Inc., Murrieta, CA, USA) on day 7 of the patient's illness, with symptoms of cough, mild fever, and throat congestion. The specimens tested positive for SARS-CoV-2 by real-time reverse transcriptase PCR (rRT-PCR). The viral RNA was then extracted directly from the patient's swab using the PureLink Viral RNA/DNA Mini kit (Invitrogen) and converted into cDNA using the SuperScript™ VILO™ cDNA synthesis kit (Invitrogen) according to the manufacturer's instructions.

Sanger dideoxy based sequencing. Forty-eight (48) primer pairs were designed to cover the whole genome of the virus under two conditions: (1) their sequence is conserved among all the available SARS-CoV-2 isolates and (2) the ends of each amplicon overlap with the neighboring amplicons. The polymerase chain reaction (PCR) was performed, and the 48 primers generated 47 amplicons, which were visualized by 1.5% agarose gel electrophoresis. The amplicons were further purified using the PureLink PCR purification kit (ThermoFisher Scientific, USA). The purified amplicons were sequenced by the Sanger dideoxy method on an ABI 3500 with the BigDye Terminator version 3.1 cycle sequencing kit (Applied Biosystems, USA). The raw reads were assembled with DNA Baser (https://www.dnabaser.com) and verified with SeqMan Pro® version 14.1 (DNASTAR, Madison, WI). [64] The overlapping regions were visualized with CLC Genomics Workbench 20.0.4 (https://digitalinsights.qiagen.com) and merged with EMBOSS merger (https://www.bioinformatics.nl/cgi-bin/emboss/merger).

Illumina NextSeq 550 next-generation sequencing technology was used to sequence the complete genome of the SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020 virus, with Nextera DNA Flex as the library preparation kit.
[65] To cover the 300 cycles, the NextSeq High Output kit was used as the reagent cartridge. To generate the FASTQ data workflow, the run mode was set to Local Run Manager with the NextSeq 4-channel chemistry. Analysis and quality checks were performed using a customized version of the DRAGEN RNA pipeline, which was also available on local DRAGEN server hardware. The Illumina® DRAGEN RNA Pathogen Detection App uses a combined human and virus reference to analyze pathogen data. The raw reads were cleaned by trimming low-quality bases with Trimmomatic 0.36 (-phred33, LEADING:20, TRAILING:20, SLIDCitation). The assembly was performed with SPAdes using default parameters and was also used to cross-validate the reference-based method as an internal control. The assembly statistics were computed with QUAST. [66]

The Basic Local Alignment Search Tool (BLAST) was employed to identify possible mutations in the Sanger-sequenced nucleotide sequences. The nucleotide BLAST program was selected for this identification. The mapped polymorphisms were investigated for their worldwide frequency and checked for their profile in the China National Center for Bioinformation (CNCB) resource (https://www.cncb.ac.cn/). Chimera was used to visualize the mapped polymorphisms. Besides, all the available Bangladeshi strains (n = 323) of SARS-CoV-2 were retrieved from GISAID [67] and further explored to find the most common mutations. [68]

To observe the effect of the polymorphisms on the three-dimensional (3D) structure, homology modeling was carried out using the deep learning based RoseTTAFold algorithm on the ROBETTA server. [69] Later, the energy difference between the wild type and mutant 3D structures was calculated with GROMOS96 to estimate structural abnormality and change in stability. [70] The binding sites of both the wild type and mutant structures were analyzed to check whether the affected amino acid residues lie within the binding site region.
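The LEADING:20 and TRAILING:20 steps quoted above remove bases from the two ends of a read while their Phred quality is below 20. The following is a minimal sketch of that end-trimming logic only, written here for illustration; it is not the Trimmomatic implementation, it ignores the truncated sliding-window step, and the function name `trim_leading_trailing` is hypothetical.

```python
def phred33_scores(quality):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality]


def trim_leading_trailing(seq, qual, leading=20, trailing=20):
    """Drop bases from both ends of a read while their quality is below the threshold.

    Illustrative re-implementation of the idea behind LEADING/TRAILING trimming;
    the thresholds default to the values quoted in the text.
    """
    scores = phred33_scores(qual)
    start = 0
    # Advance past low-quality bases at the 5' end.
    while start < len(scores) and scores[start] < leading:
        start += 1
    end = len(scores)
    # Step back over low-quality bases at the 3' end.
    while end > start and scores[end - 1] < trailing:
        end -= 1
    return seq[start:end], qual[start:end]


# '#' encodes Q2 and 'I' encodes Q40 in Phred+33.
trimmed_seq, trimmed_qual = trim_leading_trailing("ACGTACGT", "##IIII##")
print(trimmed_seq, trimmed_qual)
```

A read whose bases are all below the threshold is reduced to an empty string, which a real pipeline would then discard.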
Remdesivir and Ivermectin were selected since these drugs were suggested by the DrugBank protein Basic Local Alignment Search Tool (BLASTp) (https://go.drugbank.com/structures/search/bonds/sequence). We retrieved the structures of all the interacting drugs (.pdb files) by virtual screening of the DrugBank database. We performed molecular docking simulations using AutoDock Vina [71] to analyze the residues interacting with the druggable targets. At first, we generated the .pdbqt files of the targets (both mutant and wild type) for the docking experiments. After that, blind docking was performed to identify the most effective binding site of these drugs. The grid box covered the whole protein in all docking runs. Finally, AutoDock Vina predicted the drug-receptor interactions. These interactions were visualized with UCSF Chimera, PyMOL and Discovery Studio Visualizer.

In order to evaluate the impact of the novel mutations on the stability of the SARS-CoV-2 proteins under physiological conditions, a 50 ns molecular dynamics simulation was carried out using the GROningen MAchine for Chemical Simulations (GROMACS, version 5.1.1). The GROMOS96 43a1 force field, a temperature of 300 K, pH 7.4, and 0.9% NaCl were used to build the system. It was then solvated in a triclinic box of the simple point charge water model with its edges at a distance of 0.5 nm from the protein surface. The overall charge of the system was neutralized with the necessary ions using the genion module. Energy minimization of the neutralized system was carried out using the steepest descent algorithm with the maximum number of minimization steps set at 50 000. The ligand was restrained before carrying out the isothermal-isochoric (NVT) equilibration of the system for 100 ps with a short-range electrostatic cutoff of 1.2 nm. Isothermal-isobaric (NPT) equilibration of the system was carried out for 100 ps following the NVT step, with the short-range van der Waals cutoff fixed at 1.2 nm.
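For the blind docking step described above, the search box has to enclose the whole receptor. One way to obtain such a box is sketched below, assuming the receptor atom coordinates (in Å) have already been read from the structure file. The helper names and the 5 Å padding are illustrative assumptions, not values from this study; the `center_*` and `size_*` keys are the standard AutoDock Vina search-space options.

```python
def blind_docking_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing all atoms.

    coords: iterable of (x, y, z) tuples in angstroms.
    padding: extra margin added on every side (assumed value, not from the paper).
    """
    xs, ys, zs = zip(*coords)
    center = tuple((max(axis) + min(axis)) / 2.0 for axis in (xs, ys, zs))
    size = tuple((max(axis) - min(axis)) + 2.0 * padding for axis in (xs, ys, zs))
    return center, size


def vina_box_config(center, size):
    """Format the box as lines for an AutoDock Vina configuration file."""
    lines = []
    for name, value in zip(("center_x", "center_y", "center_z"), center):
        lines.append("%s = %.3f" % (name, value))
    for name, value in zip(("size_x", "size_y", "size_z"), size):
        lines.append("%s = %.3f" % (name, value))
    return "\n".join(lines)


# Toy coordinates standing in for a receptor; real input would come from the .pdbqt file.
center, size = blind_docking_box([(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)])
print(vina_box_config(center, size))
```

Restricting the box to a known pocket, as done later for the site-specific runs, only changes the coordinates passed in.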
Finally, a 50 ns molecular dynamics simulation was run using periodic boundary conditions and a time integration step of 2 fs. The energy of the system was saved every 100 ps. The Particle Mesh Ewald (PME) method was applied to calculate the long-range electrostatic potential. The short-range van der Waals cutoff was kept at 1.2 nm. A modified Berendsen thermostat was used to control the simulation temperature, while the pressure was kept constant using the Parrinello-Rahman algorithm. The snapshot interval was set to 100 ps for analyzing the trajectory data. Finally, all of the trajectories were concatenated to calculate and plot the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and solvent accessible surface area (SASA). MD simulations were performed on the "bioinfo-server" running the Ubuntu 18.04.5 operating system, located at the Bioinformatics Division, National Institute of Biotechnology.

The RMSD calculation was performed in order to evaluate when a system attains equilibrium. The "rms" module built into the GROMACS software was used to extract RMSD information over the course of the simulation. The results were plotted using the Xmgrace package.

The RMSF is used to determine the flexibility of a given region of the protein. The higher the RMSF value, the higher the flexibility of an amino acid. RMSF calculations were carried out using the "rmsf" module and the figures were generated using Xmgrace.

The radius of gyration of our proteins was measured to determine their degree of compactness. A relatively steady radius of gyration indicates stable folding of a protein, whereas fluctuation of the radius of gyration implies unfolding. The "gyrate" module was used to generate the radius of gyration graphs for our proteins.
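The three trajectory metrics described above can be written out explicitly. The sketch below is a simplified illustration in plain Python, not the GROMACS implementation: it assumes the frames are already superimposed on the reference (the GROMACS "rms" module performs a least-squares fit first) and it weights all atoms equally, whereas the radius of gyration is normally mass-weighted. All function names are chosen here for illustration.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    squared = sum((a - b) ** 2
                  for p, q in zip(reference, frame)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


def rmsf(frames):
    """Per-atom root mean square fluctuation around the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for atom in range(n_atoms):
        mean = [sum(f[atom][i] for f in frames) / n_frames for i in range(3)]
        msf = sum(sum((f[atom][i] - mean[i]) ** 2 for i in range(3))
                  for f in frames) / n_frames
        values.append(math.sqrt(msf))
    return values


def radius_of_gyration(coords):
    """Equal-mass radius of gyration: RMS distance of atoms from their centroid."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    squared = sum(sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords)
    return math.sqrt(squared / n)


# Toy two-atom system: the second frame is shifted by 1 unit along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(ref, shifted), rmsf([ref, shifted]), radius_of_gyration(ref))
```

Read this way, a flat RMSD curve signals equilibration, high per-residue RMSF marks flexible regions, and a steady radius of gyration indicates a compact, stably folded protein.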
Hydrophobic interactions among non-polar amino acids are crucial for maintaining the stability of the hydrophobic core of proteins. They do so by burying the non-polar amino acids within the hydrophobic core and keeping them away from the solvent. Solvent Accessible Surface Area (SASA) is used in molecular dynamics simulations to predict the hydrophobic core stability of proteins. In this study, SASA was calculated using the "sasa" module and the resulting graph was visualized using Xmgrace. The workflow of this manuscript is shown in Fig. 1.

In the case of the NIB_01 virus, forty-eight (48) contigs with ninety-four (94) overlapping regions were obtained. The sequence had 2X coverage (both forward and reverse reads). It was then assembled with SeqMan Pro and EMBOSS merger. [72] The assembled viral genome consisted of a single-stranded positive-sense (+) RNA that is 29 724 nucleotides long, with 8882 adenosines (29.88%), 5455 cytosines (18.35%), 5836 guanines (19.63%), and 9551 thymines (32.13%). The GC content of the whole genome was 38%. A total of 17 822 898 reads were produced; in the reference-based alignment after trimming, 99% of them mapped to the SARS-CoV-2 reference genome. The complete nucleotide sequence of the SARS-CoV-2 isolate SARS-CoV-2/human/BGD/NIB_01/2020 from Sanger sequencing has been deposited in GenBank under the accession number MT509958 (https://www.ncbi.nlm.nih.gov/nuccore/MT509958). The complete nucleotide sequence of the SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020 isolate from NGS has also been deposited, under the accession number MT568643 (https://www.ncbi.nlm.nih.gov/nuccore/MT568643.1?report=genbank). The contig lengths for NIB_01/2020 and NIB-BCSIR_02/2020 were 29 724 and 29 737 bases, respectively. Detailed statistics are given in ESI File 1.†

We found thirteen (13) mutations in the MT509958 genome against the reference sequence (Table 1). However, the MT568643 whole genome showed no mutation against the reference sequence.
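The base composition reported above for the NIB_01 assembly can be checked with a few lines of arithmetic. The sketch below uses only the four base counts quoted in the text; the variable names are illustrative.

```python
# Base counts of the NIB_01 assembly as reported in the text.
counts = {"A": 8882, "C": 5455, "G": 5836, "T": 9551}

# The counts should add up to the reported genome length.
total = sum(counts.values())

# Percentage of each base, rounded to two decimals as in the text.
percent = {base: round(100.0 * n / total, 2) for base, n in counts.items()}

# GC content is the combined share of guanine and cytosine.
gc_content = 100.0 * (counts["G"] + counts["C"]) / total

print(total)
print(percent)
print(round(gc_content))
```

The counts sum to the reported genome length, and the combined G and C share rounds to the stated 38%.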
Of these, six (6) mutations, namely 93rd; C→T, 2889th; C→T, 23 255th; A→G, 28 733rd; G→A, 28 734th; G→A and 28 735th; G→C, were found in the CNCB resource, where the available mutations of SARS-CoV-2 are listed (Table 1). These mutations have already been found in different countries where SARS-CoV-2 has been sequenced, mostly in the United States of America (USA) and the United Kingdom (UK) (Table 1). The 93rd; C→T mutation is located in the 5′ UTR upstream region, and the 2889th; C→T mutation causes no change in the protein sequence. The other mutations, 23 255th; A→G, 28 733rd; G→A, 28 734th; G→A and 28 735th; G→C, alter the amino acid sequence and can have a missense effect on the protein (Table 1). Apart from these 6 mutations, seven (7) mutations appeared as unique variants against the reference sequence (Table 1 and Fig. 2). These mutations were not previously reported.

Besides, we analyzed our assembled genomes for insertions/deletions, but these two genomes contained none. However, we identified one (1) insertion and seven (7) deletions in eight (8) Bangladeshi strains: EPI_ISL_466692, EPI_ISL_450343, EPI_ISL_450344, EPI_ISL_468074, EPI_ISL_514614, EPI_ISL_445213, EPI_ISL_445217 and EPI_ISL_450842 (Table 2 and Fig. 3). We additionally scrutinized the deleted regions but did not find any domain or motif in them. Apart from our reported complete genomes, we also identified the most common mutations in Bangladesh from the complete Bangladeshi genomes reported in the GISAID database. These genomes showed three mutations, at positions 14 408, 23 403 and 28 878, compared to the reference genome (Table 1).

We analyzed the effect of all the mutations. To this end, 3D structures were built to explore the effect of each mutation on the protein structure (Fig. 4).
In this case, two types of 3D structure were built: (i) the structure with the wild type residue and (ii) the structure with the mutant residue. We performed energy minimization of both the wild type and the mutants and found significant differences in the stability of the structures upon mutation (Table 3). Mutants 479th; T→A and 1015th; A→T showed higher energy minimization, which predicted these proteins to be more stable than the wild type. All the other mutant protein models showed less energy minimization than the wild type protein model. Therefore, these protein structures could be less stable upon mutation. The largest difference was observed for the mutation at the 5642nd position; G→T (from −23 276.78 kJ mol⁻¹ to −22 377.976 kJ mol⁻¹) (Table 3).

Afterwards, the binding site was analyzed to determine whether the wild type and mutant residues fell within the ligand binding site. The binding site residues confirmed that all the mutations from the complete genome belonged to the binding site region (Fig. 5).

Later, we performed the drug binding analysis following virtual drug screening on the DrugBank server. Ivermectin and Remdesivir topped the list of potential drug candidates. We then prepared the protein structures and converted them to .pdbqt format for the molecular docking experiments. We identified the binding site region of each protein and set the grid box to allow the drugs to bind only to that specific region. The binding affinity analysis showed that, compared to the wild type, Ivermectin bound with a higher score to the proteins carrying mutations at positions 479 (T→A), 5642 (G→T), and 8023 (G→A), whereas it bound with a lower score to the proteins with mutations at positions 481 (C→A), 1015 (A→T), 5098 (G→T), and 5237 (C→T) (ESI Fig. 1†).
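As a quick arithmetic check on the largest energy shift reported above (the 5642nd; G→T mutant), the difference between the two minimized energies can be computed directly. The sketch below uses only the two values quoted in the text; the variable names are illustrative, and the reading of the sign follows the convention used in this paper (a less negative minimized energy is taken as a less stable model).

```python
# Minimized energies in kJ/mol, as quoted in the text for the 5642 G->T position.
wild_type_energy = -23276.78
mutant_energy = -22377.976

# A positive difference means the mutant model sits at a less negative energy.
delta = mutant_energy - wild_type_energy

# Size of the shift relative to the wild type energy, in percent.
relative_change = 100.0 * delta / abs(wild_type_energy)

print("delta = %.3f kJ/mol" % delta)
print("relative change = %.2f %%" % relative_change)
```

The shift is roughly 900 kJ/mol, or a little under 4% of the wild type value.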
Remdesivir bound with a higher score to the proteins with mutations at 28 733 (G→A), 28 734 (G→A), and 28 735 (G→C), and with a lower score to the protein with the 23 255 (A→G) mutation, compared to the wild type (Table 4). It should be noted that the interacting residues of the wild type proteins were found to differ from those of the mutant models. For example, the 479th; T→A mutant model, which acquired the V→D amino acid change, interacted with GLU37, GLU41, LEU177, GLY180, LEU104, VAL108, HIS110, GLU87, LEU88, LYS141 and TYR154, whereas the wild type interacted with LEU18, VAL28, GLU37, GLU41, HIS45, LEU53, VAL54, ILE71, ARG73, VAL86, VAL121, LEU122 and ASP139 (Fig. 4).

The results of the molecular dynamics simulation analysis are presented in ESI Fig. 2.† NSP3 V843F had a higher RMSD than the wild type NSP3, while for NSP3 A889V it was lower. In the case of NSP1, the mutant V121D exhibited a higher RMSD than its wild type. Radius of gyration analysis revealed that both NSP3 mutants, V843F and A889V, were more compact than their wild type counterpart. However, the NSP1 mutant V121D was more flexible than the wild type. A similar trend was observed in the SASA calculation. These results imply that the mutations V843F and A889V in the NSP3 protein of SARS-CoV-2 made it more stable, while the V121D mutation made NSP1 less stable.

Fig. 2 Novel mutations in the protein sequence in the NIB_01 whole genome. In total, seven novel mutations were identified.

COVID-19 is highly contagious, and the variation in its genome could be a leading reason for this feature. Besides, exploring the whole-genome sequencing (WGS) data of SARS-CoV-2 strains is necessary to understand the origin of the strains. [73] Insight into the mutations of SARS-CoV-2 is an important factor in developing therapeutics against the virus. [74] In this study, we investigated the variations, insertions, and deletions of the Bangladeshi SARS-CoV-2 strains.
We collected the samples SARS-CoV-2/human/BGD/NIB_01/2020 and SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020 from patients who had tested positive for COVID-19. We extracted the viral RNA from the samples and converted it to cDNA. We performed Sanger sequencing of SARS-CoV-2/human/BGD/NIB_01/2020 and next-generation sequencing of SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020. The total lengths of the genomes were 29 724 and 29 737 nucleotides, respectively. These two genomes were submitted to both the Global Initiative on Sharing All Influenza Data (GISAID) and the National Center for Biotechnology Information (NCBI) databases. The databases accepted the genomes and provided the accession number EPI_ISL_458133 from GISAID and MT509958 and MT568643 from NCBI.

We investigated the possible variations in these two genomes and found thirteen (13) mutations in SARS-CoV-2/human/BGD/NIB_01/2020 against the reference genome of SARS-CoV-2 (Table 1 and Fig. 2). The mutations belonged to different regions of the genome, but were mostly found in the Orf1ab gene. Eight (8) (Table 1). Among the thirteen (13) mutations, six (6), namely 93rd; C→T, 2889th; C→T, 23 255th; A→G, 28 733rd; G→A, 28 734th; G→A and 28 735th; G→C, have been reported in different countries according to the CNCB database (Table 1). All of these variations showed a missense effect on structure except the 2889th; C→T variant (Table 1). The other seven (7) mutations, 479th; T→A, 481st; C→A, 1015th; A→T, 5098th; G→T, 5237th; C→T, 5642nd; G→T and 8023rd; G→A, presented themselves as unique mutations in the MT509958 genome (Table 1 and Fig. 2). Surprisingly, we did not find any mutation in SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020.

We also examined the indel profile of our assembled genomes but did not find any insertion or deletion in them. We then looked for indels in the rest of the SARS-CoV-2 genomes from Bangladesh.
Seven (7) deletions were found in the Bangladeshi strains EPI_ISL_450343, EPI_ISL_450344, EPI_ISL_468074, EPI_ISL_514614, EPI_ISL_445213, EPI_ISL_445217 and EPI_ISL_450842. Among them, EPI_ISL_450343, EPI_ISL_450344, EPI_ISL_468074 and EPI_ISL_514614 shared common deletions. These deletions belonged to the ORF8 gene and spanned genome positions 27 913 to 28 254 (Table 2). We found another deletion in the Orf7a gene, spanning genome positions 27 476 to 27 668. This was found in an isolate from the Dhaka region of Bangladesh (EPI_ISL_450842).

Three-dimensional structures were built for both the wild type and the mutants in order to observe the impact of the mutations on their corresponding proteins (Fig. 4). We predicted the stability of the protein structure of each variant based on energy minimization analysis. The variants 479th; T→A (V121D Orf1ab) and 1015th; A→T (I120F Orf1ab) were locally more stable compared to the other variants. The structures of these variants consumed more energy than the wild type structure. The other variants exhibited a decrease in stability. A decrease in stability corresponds to a protective effect against SARS-CoV-2 and vice versa (Table 3).

The binding sites of the protein structures were analyzed to locate the relevant amino acid variants. We found that all the variants were located within the ligand binding site (Fig. 5). Therefore, these residues could be considered very important in terms of ligand/drug binding. Ivermectin and Remdesivir were selected for the drug binding analysis. We performed molecular docking of each wild type and mutant structure with these two drugs and analyzed the interactions. We observed that the residues interacting in the wild type were replaced by different residues in the mutants after molecular docking with the drugs (Table 4). The binding affinity was also found to differ from that of the wild type structure.
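The replacement of interacting residues can be made concrete with the one example for which the residue lists are given in the text, the 479th; T→A (V121D) model and its wild type. The sketch below simply compares the two lists as sets; the lists are copied from the reported docking result, while the variable names and the set-based comparison are an illustration rather than the authors' analysis.

```python
# Interacting residues reported for the 479 T->A (V121D) mutant model.
mutant_residues = {
    "GLU37", "GLU41", "LEU177", "GLY180", "LEU104", "VAL108",
    "HIS110", "GLU87", "LEU88", "LYS141", "TYR154",
}

# Interacting residues reported for the corresponding wild type model.
wild_type_residues = {
    "LEU18", "VAL28", "GLU37", "GLU41", "HIS45", "LEU53", "VAL54",
    "ILE71", "ARG73", "VAL86", "VAL121", "LEU122", "ASP139",
}

# Contacts kept in both models, and contacts unique to each model.
shared = mutant_residues & wild_type_residues
mutant_only = mutant_residues - wild_type_residues
wild_type_only = wild_type_residues - mutant_residues

print("shared:", sorted(shared))
print("mutant only:", sorted(mutant_only))
print("wild type only:", sorted(wild_type_only))
```

Only GLU37 and GLU41 appear in both lists, so most of the reported contacts differ between the two models.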
It should be emphasized that a single amino acid change from the wild type structure was responsible for these changes. From these analyses, it can be concluded that therapeutics applied against these variants might not work effectively, owing to the alteration of residues in the mutant proteins. From the 50 ns molecular dynamics simulation carried out in GROMACS, it was observed that the mutations had an impact on the stability of SARS-CoV-2 proteins such as NSP1 and NSP3.

To reiterate the core of our study, we performed whole genome sequencing of SARS-CoV-2 to identify genetic variations and then analyzed their impact on the structures of the corresponding proteins. We also identified the insertions/deletions among all the sequenced Bangladeshi SARS-CoV-2 strains. The energy minimization and drug binding analyses suggested that the identified mutations might have a significant impact on the structure and function of their target proteins. Therefore, the present study might be of great interest to researchers and companies working to develop therapeutics against SARS-CoV-2, as well as for gaining fundamental insights into the pathogenesis of the virus.

Appropriate international, national, and/or institutional guidelines were followed during sample collection from the patients. The ethical approval numbers are National Institute of Biotechnology record no. NIBREC2020-01 and NIBREC2020-02.

This study was funded internally by the National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka-1349, Bangladesh from its regular annual budget. No external or special funding was received for this project.

The complete nucleotide sequence of the SARS-CoV-2 isolate SARS-CoV-2/human/BGD/NIB_01/2020 from Sanger sequencing has been deposited in GenBank under the accession number MT509958 (https://www.ncbi.nlm.nih.gov/nuccore/MT509958).
The complete nucleotide sequence of the SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020 isolate from NGS has also been deposited, under the accession number MT568643 (https://www.ncbi.nlm.nih.gov/nuccore/MT568643.1?report=genbank). The rest of the data generated or analyzed during this study are included within this article and its ESI files.†

MUH, IA, MTHE, AB, and ZMC carried out the mutational impact analysis and wrote and edited the manuscript. CAK and MS also edited the manuscript. KCD led the whole genome sequencing. MS supervised the whole project. All authors read and approved the manuscript.

All authors declare no conflict of interest.

1. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
2. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
3. On the origin and continuing evolution of SARS-CoV-2
4. The origin and underlying driving forces of the SARS-CoV-2 outbreak
5. COVID-19: a novel zoonotic disease caused by a coronavirus from China: what we know and what we don't
6. China's Response to the COVID-19 Outbreak: A Model for Epidemic Preparedness and Management
7. Detection of SARS-CoV-2 in Different Types of Clinical Specimens
8. The origin of SARS-CoV-2
9. SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients
10. Modeling the Onset of Symptoms of COVID-19
11. Clinical Characteristics of Coronavirus Disease 2019 in China
12. Clinical Presentation of COVID-19: A Systematic Review Focusing on Upper Airway Symptoms
13. Quantifying additional COVID-19 symptoms will save lives
14. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
15. Features, Evaluation, and Treatment of Coronavirus (COVID-19), StatPearls
16. Age separation dramatically reduces COVID-19 mortality rate in a computational model of a large population
17. Demographic perspectives on the mortality of COVID-19 and other epidemics
18. Insights into the first wave of the COVID-19 pandemic in Bangladesh: Lessons learned from a high-risk country, medRxiv
19. The Effect of Age on Mortality in Patients With COVID-19: A Meta-Analysis With 611,583 Subjects
20. Residential context and COVID-19 mortality among adults aged 70 years and older in Stockholm: a population-based, observational study using individual-level data
21. COVID-19 mortality risk for older men and women
22. Viral sepsis is a complication in patients with Novel Corona Virus Disease (COVID-19)
23. Recovery From Severe COVID-19: Leveraging the Lessons of Survival From Sepsis
24. Coronavirus Disease 2019 Sepsis: A Nudge Toward Antibiotic Stewardship
25. Sepsis and Coronavirus Disease 2019: Common Features and Anti-Inflammatory Therapeutic Approaches
26. Pathological features of COVID-19-associated myocardial injury: a multicentre cardiovascular pathology study
27. COVID-19 and the heart: what we have learnt so far
28. COVID-19 cardiac injury: Implications for long-term surveillance and outcomes in survivors
29. Cardiac injuries in coronavirus disease 2019
30. COVID-19 and cardiac injury: clinical manifestations, biomarkers, mechanisms, diagnosis, treatment, and follow up
31. Coronavirus Disease 2019 (COVID-19) and Cardiac Injury-Reply
32. Cardiac inflammation in COVID-19: Lessons from heart failure
33. The Case Fatality Rate in COVID-19 Patients With Cardiovascular Disease: Global Health Challenge and Paradigm in the Current Pandemic
34. Fatal lymphocytic cardiac damage in coronavirus disease 2019 (COVID-19): autopsy reveals a ferroptosis signature
35. A current review of COVID-19 for the cardiovascular specialist
36. Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan, China
37. First Case of 2019 Novel Coronavirus in the United States
38. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study
39. China's practice to prevent and control COVID-19 in the context of large population movement
40. Impact of the COVID-19 Pandemic on Commodities Exports to China: UNCTAD Research Paper No. 44
41. Lockdown timing and efficacy in controlling COVID-19 using mobile phone tracking
42. Immediate impact of stay-at-home orders to control COVID-19 transmission on socioeconomic conditions, food insecurity, mental health, and intimate partner violence in Bangladeshi women and their families: an interrupted time series
43. Pathological findings of COVID-19 associated with acute respiratory distress syndrome
44. Effects of air temperature and relative humidity on coronavirus survival on surfaces
45. Genome Organization, Replication, and Pathogenesis of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)
46. Sars-CoV-2 Envelope and Membrane Proteins: Structural Differences Linked to Virus Characteristics?
47. Coronavirus envelope protein: current knowledge
48. SARS-CoV-2 and Coronavirus Disease 2019: What We Know So Far, Pathogens
49. COVID-19 and SARS-Cov-2 Infection: Pathophysiology and Clinical Effects on the Nervous System
50. COVID-19): Causative agent, mental health concerns, and potential management options
51. A close look at the biology of SARS-CoV-2, and the potential influence of weather conditions and seasons on COVID-19 case spread
52. Coronaviruses and SARS-CoV-2: A Brief Overview
53. SARS-CoV-2, SARS-CoV, and MERS-CoV viral load dynamics, duration of viral shedding, and infectiousness: a systematic review and meta-analysis
54. The SARS-CoV-2 outbreak: What we know
55. Comparative Review of SARS-CoV-2, SARS-CoV, MERS-CoV, and Influenza A Respiratory Viruses
56. Comparison of the COVID-2019 (SARS-CoV-2) pathogenesis with SARS-CoV and MERS-CoV infections
57. From SARS and MERS to COVID-19: a brief summary and comparison of severe acute respiratory infections caused by three highly pathogenic human coronaviruses
58. Emerging coronaviruses: Genome structure, replication, and pathogenesis
59. Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China
60. Coronaviruses: an overview of their replication and pathogenesis
61. Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts
62. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
63. Coding-Complete Genome Sequence of SARS-CoV-2 Isolate from Bangladesh by Sanger Sequencing
64. DNASTAR's Lasergene sequence analysis software
65. Sequencing single-stranded libraries on the Illumina NextSeq 500 platform
66. QUAST: quality assessment tool for genome assemblies
67. GISAID: Global initiative on sharing all influenza data - from vision to reality
68. Chimera - a visualization system for exploratory research and analysis
69. Protein structure prediction and analysis using the Robetta server
70. GROMACS 3.0: a package for molecular simulation and trajectory analysis
71. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading
72. Fragment Merger: An Online Tool to Merge Overlapping Long Sequence Fragments
73. Comparative genomic study for revealing the complete scenario of COVID-19 pandemic in Bangladesh, medRxiv
74. A SARS-CoV-2 vaccine candidate would likely match all currently circulating variants