key: cord-0938555-ukz1o1m9
authors: Nassir, Angus A.; Musanabaganwa, Clarisse; Mwikarago, Ivan
title: Mutation Landscape of SARS COV2 in Africa
date: 2020-12-21
journal: bioRxiv
DOI: 10.1101/2020.12.20.423630
sha: bc3b70547e7e2a97a719f9be63b7ac1a84517963
doc_id: 938555
cord_uid: ukz1o1m9

COVID-19 disease has had a relatively less severe impact in Africa. To understand the role of SARS CoV2 mutations in COVID-19 disease in Africa, we analysed 282 complete nucleotide sequences from African isolates deposited in the NCBI Virus Database. Sequences were aligned against the prototype Wuhan sequence (GenBank accession: NC_045512.2) in BWA v. 0.7.17. SAM and BAM files were created, sorted and indexed in SAMtools v. 1.10 and marked for duplicates using Picard v. 2.23.4. Variants were called with mpileup in BCFtools v. 1.11. Phylograms were created using MrBayes v. 3.2.6. A total of 2,349 single nucleotide polymorphism (SNP) profiles across 294 sites were identified. Clades associated with severe disease in the United States, France, Italy, and Brazil had low frequencies in Africa (L84S=2.5%, L3606F=1.4%, L3606F/V378I=0.35%, G251V=2%). Sub-Saharan Africa (SSA) accounted for only 3% of P323L and 4% of Q57H mutations in Africa. Comparatively low infections in SSA were attributed to the low frequency of the D614G clade in earlier samples (25% vs 67% global). Higher disease burden occurred in countries with higher D614G frequencies (Egypt=98%, Morocco=90%, Tunisia=52%, South Africa), where the first confirmed case carried D614G. The V367F, D364Y, V483A and G476S mutations, which are associated with efficient ACE2 receptor binding and severe disease, were not observed in Africa. Of all RdRp mutations, 95% were deaminations leading to CpG depletion and possible attenuation of virulence.
More genomic and experimental studies are needed to increase our understanding of the temporal evolution of the virus in Africa, clarify our findings, and reveal hot spots that may undermine successful therapeutic and vaccine interventions.

SARS CoV2 is a positive-sense single-stranded RNA (+ssRNA) coronavirus responsible for the COVID-19 pandemic (Asghari et al., 2020). Since the initial isolation and genomic characterization of SARS CoV2 in January 2020, numerous mutation studies have tracked the evolution of the virus globally (Chaw et al., 2020; Korber et al., 2020; Koyama et al., 2020; Mishra et al., 2020; Tang et al., 2020; van Dorp et al., 2020). Mutation studies are important because they describe the temporal evolution of the virus, identify suitable targets for drug, diagnostic, and vaccine design, and reveal hot spots that may undermine successful therapeutic and vaccine interventions (Kayla et al., 2018; Perales et al., 2011). They also help track changes in the infectivity and virulence of mutants, reveal new strains with possible implications for immune escape, and provide important clinico-epidemiological data (Abdullahi et al., 2020; Pachetti et al., 2020; Xi et al., 2020; Zou et al., 2020). Studies have since shown that SARS CoV2 is a moderately mutating virus with a median mutation rate of 1.12 × 10^-3 mutations per site-year (95% CI: 9.86 × 10^-4 to 1.85 × 10^-4) (95% CI: 4.8 to 5.52) (Koyama et al., 2020). This moderate mutation rate is lower than that of other +ssRNA viruses, which is attributable to the presence of a 3'-5' exonuclease that provides proofreading ability (Duffy et al., 2018; Minskaia et al., 2006). Mutation studies have also distinguished SARS CoV2 clades and mapped dominant strains. The six major SARS CoV2 clades are the basal clade, D614G, L84S, L3606F, D448del and G392D. D614G was first seen in China and is now the dominant strain worldwide (Korber et al., 2020).
The shift to the D614G variant occurs even in areas where the wild-type strain is established and is due to the increased fitness of the mutant over the wild type (Korber et al., 2020). Viral entry of SARS CoV2 is facilitated by cleavage of the S protein. The D614G mutation allows the S protein to be cleaved more efficiently by the host serine protease elastase-2, and this explains the high infectivity of D614G. Increased infectivity of D614G is supported by experimental studies showing elevated RNA levels in clinical samples with the D614G mutation and higher viral titers for D614G mutant pseudoviruses (Korber et al., 2020; Lorenzo-Redondo et al., 2020; Ozono et al., 2020; Wagner et al., 2020). However, there is no conclusive evidence that the variant is associated with more severe disease or increased hospitalizations (Wagner et al., 2020). Even so, the D614G strain has co-evolved with other mutations such as F106F, 14408 C->T (P323L), 241 C->T, 25563 G->T (Q57H), and 1059 C->T (T85I), and more studies are required to clarify the impact of these mutations on virulence. The D614G mutation is situated in a B-cell epitope within a highly immunodominant region, which may undermine vaccine effectiveness. However, experimental studies suggest that D614G mutants are sensitive to neutralization by polyclonal convalescent serum (Korber et al., 2020). Subclades of D614G include D614G/Q57H and D614G/Q57H/T265I, which were first identified in France, D614G/203_204delinsKR, first identified in Germany, and D614G/203_204delinsKR/T175M, first identified in Iceland and Portugal (Koyama et al., 2020). The L84S clade was first observed in China and has one subclade, L84S/P5828L, that was first observed in the United States. The L3606F clade was also first observed in China and has the L3606F/V378I subclade, first observed in Italy, and the L3606F/G251V subclade, observed in Brazil.
Other clades are D448del, which was first observed in France, and G392D, which was first observed in Germany. In general, US and European samples are highly similar to each other and show little similarity to East Asian samples, and European clades dominate among US samples (Koyama et al., 2020). Mutation studies have also revealed the mechanisms of SARS CoV2 mutations. The dominant mutations are C->T transitions (Chaw et al., 2020; Koyama et al., 2020; Mishra et al., 2020). Depletion of CpG dinucleotides is also a common mutation mechanism in SARS CoV2. CpG dinucleotide levels are inversely correlated with viral fitness: higher CpG content is associated with decreased virulence and replication. CpG depletion is therefore regarded as an immune escape strategy that evades host antiviral mechanisms (Theys et al., 2018). Depletion of CpG dinucleotides in SARS CoV2 is possibly mediated by the human zinc finger antiviral protein (hZAP) and the apolipoprotein B mRNA editing enzymes APOBEC1 and APOBEC3a. hZAP attaches to CpG dinucleotides in viral genomes to inhibit viral replication and mediate degradation of the viral genome (Nchioua et al., 2020; Takata et al., 2017; Meagher et al., 2019; Trus et al., 2020). APOBEC1 and APOBEC3a deplete CpG dinucleotides in RNA viruses by mediating cytidine-to-uridine changes, which are recorded as C->T in sequence data (Xia, 2020; DiGeorgio et al, 2020). In comparative terms, Africa has had lower COVID-19 morbidity and mortality numbers. This is a striking observation, especially when the vulnerabilities associated with weak or nonexistent health systems, poor sanitation, high HIV prevalence, and high poverty rates in Africa are taken into account (Anim & Ofor-Asenso, 2020; de Aranzabal et al., 2020; Patel et al., 2020). Despite this observation, and to the best of our knowledge, there have been no mutation studies focused on genomes sequenced from samples collected in Africa that seek to understand the evolution of this disease in Africa.
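CpG depletion of the kind described above is often summarized as an observed/expected CpG ratio, (number of CG × sequence length) / (number of C × number of G). The following is a minimal sketch of that calculation, not part of the original analysis. The function names `cpg_observed_expected` and `deaminate_c_to_t` and the ten-base toy sequence are assumptions made for illustration only.

```python
def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Values below 1 indicate fewer CpG dinucleotides than expected
    from the base composition alone.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cg_count = seq.count("CG")  # "CG" cannot overlap itself
    return cg_count * n / (c_count * g_count)


def deaminate_c_to_t(seq: str, index: int) -> str:
    """Return a copy of seq with the C at `index` (0-based) changed to T."""
    if seq[index].upper() != "C":
        raise ValueError(f"position {index} is not a C")
    return seq[:index] + "T" + seq[index + 1:]


# Hypothetical toy sequence, NOT taken from the SARS CoV2 genome.
toy = "ACGACGTCCA"
before = cpg_observed_expected(toy)
# A single C->T change inside a CpG removes one CG dinucleotide.
mutated = deaminate_c_to_t(toy, 1)
after = cpg_observed_expected(mutated)
print(f"before={before:.3f} after={after:.3f}")
```

In this toy case the ratio drops after one C->T change within a CpG, which is the direction of change that the deamination mechanisms described above would produce.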
In this study, we analyzed SARS-CoV-2 genomes from 282 samples collected in Africa in order to characterize the genetic variants circulating in Africa and understand the temporal evolution of the virus. We evaluate whether these mutations have clinically relevant outcomes, assessing their implications for viral infectivity and disease severity in Africa and their potential effect on vaccine and therapeutic development and efficacy. We clarify the role of SARS CoV2 mutations in COVID-19 disease in Africa relative to the rest of the world.

A total of 282 complete nucleotide sequences from African isolates were obtained from the NCBI Virus Database. The 282 sequences were aligned with the original Wuhan sequence (GenBank accession: NC_045512.2) (NCBI, 2020; Wang et al., 2020a) using the mem command in BWA v. 0.7.17 (Li, 2013). The SAM file was converted to a BAM file using SAMtools v. 1.10, followed by sorting and indexing (Li et al., 2009). Duplicate marking and addition of read groups were done using Picard v. 2.23.4 (Broad Institute, 2019). Variants were called using the mpileup command in BCFtools v. 1.11 and visualized in IGV v. 2.8.9 (Robinson et al., 2011). Mutant proteins and their respective positions were abstracted from the called variants and the reference genome using a custom PHP script. Multiple alignment was done in BLAST2 with the bat coronavirus RaTG13 (GenBank accession: MN996532.2) as the outgroup (NCBI, 2020; Zheng et al., 2000). The BLAST2 dump file was converted into a Nexus file using the European Bioinformatics Institute (EBI) platform (Madeira et al., 2019). MEGA X (Molecular Evolutionary Genetics Analysis across computing platforms) was used to select the general time reversible (GTR) evolutionary model (BIC score=340376.778) (Kumar et al., 2018). Phylogenetic analysis of the aligned sequences involved the maximum likelihood (ML) method in Mr.
Bayes: Bayesian Inference of Phylogeny v 3.2.6 (ML lset nst=6, Lset rates=invgamma, parsimony model, default priors, ngen=10000, burnin=82) (Huelsenbeck & Ronquist, 2001; Ronquist & Huelsenbeck, 2003). The Nexus translation tree was visualized using FigTree v 1.4.4 (Rambaut, 2010). The stability of point mutations was determined based on ddG values computed using I-Mutant 3.0, a support vector machine (SVM) tool that predicts protein stability changes upon single point mutations.

A total of 282 nucleotide sequences from African isolates were analyzed. The majority of the sequences analyzed (80%) were from Egypt. Of the sequences from the rest of Africa, 32.7% were wild type and 67.3% were mutated. All of the Egyptian sequences were mutated (Supplementary table S1).

[Figure: transition and transversion mutations]

For mutations with a frequency exceeding 1%, the commonest mutations were missense mutations (54%) (table 2) (Supplementary table S5). The majority of the mutations (~63%) occurred in the NSP3 (16%), N (11.1%), S (11.1%), RdRp (9.9%), NSP4 (8%), and NSP2 (6.8%) regions. The least mutated regions included s2m. Structural protein mutations exceeding 1% frequency occurred in the E, N, and S genes. More than half of all Egyptian samples (62.6%, n=141) had the 26257 G->T (E, V5F) mutation, which was not observed in any other African country. was observed in 4% of Egyptian samples. Co-occurrence of 14408 C->T (P323L) in samples from SSA was ~3% (Supplementary table S27 and Supplementary table S34).

[Figure: Q57H clade and wild type]

Mutations seen in Egypt and not observed elsewhere in Africa with a frequency of more than (Supplementary table S32). The following mutations were reported in other regions and not observed in Africa (table 3).
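The split between transitions and transversions can be derived directly from a list of called SNPs. The sketch below is illustrative only and is not the authors' script. The helper names `classify_snp` and `summarize` are hypothetical, and the example list contains only the five substitutions quoted in the text, not the full variant set.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}->{alt}")
    # Transition: purine<->purine or pyrimidine<->pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def summarize(snps):
    """Count substitution classes for (position, ref, alt) tuples."""
    return Counter(classify_snp(ref, alt) for _, ref, alt in snps)


# Substitutions quoted in the text, used here only as example input.
example_snps = [
    (241, "C", "T"),
    (1059, "C", "T"),
    (14408, "C", "T"),
    (25563, "G", "T"),
    (26257, "G", "T"),
]

counts = summarize(example_snps)
total = sum(counts.values())
for kind, n in sorted(counts.items()):
    print(f"{kind}: {n} ({100 * n / total:.0f}%)")
```

With real data, the `(position, ref, alt)` tuples would come from the VCF produced by the variant-calling step. Parsing is left out here so that the classification rule stays visible.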
In the present study, we compared SARS-CoV-2 genomes from 282 samples collected in Africa against the Wuhan reference genome NC_045512.2 with the aim of understanding the evolution of the virus in Africa and the impact of the mutations on morbidity and mortality in Africa. We observed that 6% of all samples collected in Africa were wild type and 94% were mutants, with 88% forming the D614G clade overall. Of the Egyptian samples, 98% were the D614G variant, and 98% of all D614G variants were from North Africa. If the Egyptian samples are excluded, wild-type samples formed 34% of all sequences from the rest of Africa, with the D614G clade comprising 25% of all SSA samples. This is in contrast to the earliest data demonstrating that the D614G variant formed 67% of all samples worldwide, with higher running counts in Europe, Asia, Oceania, and North America and lower counts in parts of South America and Africa (Korber et al., 2020). The wild-type variant dominated the earliest samples collected from Kenya, Zambia, Ghana and Sierra Leone. Taken together with infection numbers, the data suggest that countries in which the earliest cases were caused by the wild-type SARS CoV2 strain had relatively fewer infections. Conversely, the countries with the highest number of infections in Africa, such as South Africa, Morocco, Egypt, and Tunisia, started off with a larger proportion of the D614G variant. The predominance of the wild-type (D614) variant in the early months of the pandemic in Africa may partly explain the relatively low numbers of people infected with COVID-19 in many African countries. Predominance of the D614G variant during the early months of the pandemic may also explain the high number of infections reported in South Africa and the North African countries of Egypt, Morocco, and Tunisia.
This observation is based on findings that the D614G variant is associated with increased infectivity, as clinical samples with the D614G mutation have higher viral titers (Korber et al., 2020; Lorenzo-Redondo et al., 2020; Ozono et al., 2020; Wagner et al., 2020). D614G is more infectious than the wild-type sequence because it binds more efficiently to the human ACE2 receptor (Korber et al., 2020; Lorenzo-Redondo et al., 2020; Ozono et al., 2020; Wagner et al., 2020). The WHO recently raised an alarm over the surge in infections in parts of Africa where the number of infections had remained low since the first case in Africa was reported in February. As pointed out by Korber et al. (2020), the D614G variant has increased fitness and is under positive selection. Thus, predominance may shift over time from the wild type to the D614G variant, allowing the latter to establish itself in areas where the wild-type strain had previously been established (Korber et al., 2020; Koyama et al., 2020). Although there are limited data on samples from Africa sequenced in the past few weeks, it is possible that the recent surge in cases in African countries that had hitherto reported few cases is due to the shift to the D614G variant. Genomic studies of current samples will help to clarify this. The absence of other mutations on the spike glycoprotein appears to have influenced the course of the disease in Africa. Global studies have identified 4 other variants on the spike glycoprotein that appear to enhance virus pathogenicity. These variants are D364Y, V483A, G476S, and V367F, all of which affect the S1 RBD. Ou et al. (2020) observed that the V367F and D364Y variants confer more structural stability on the S protein, which enables the SARS CoV2 virus to bind more efficiently to the human ACE2 receptor. Experimental studies have demonstrated that V367F is associated with enhanced cell entry.
Other RBD mutants that have been identified include N354D and W436R. The N354D/D364Y, V367F, and W436R mutants had significantly lower ΔG and a significantly lower equilibrium dissociation constant (KD) than the reference strain, suggesting that these mutants have significantly increased affinity for human ACE2. In double mutants with N354D and D364Y, the latter provides the increased affinity, which implies that D364Y is the main contributor to the enhanced affinity. Experimental validation assays showed that V367F significantly lowers the ED50 concentration of S and ACE2 receptor-ligand binding (Ou et al., 2020). These mutants are proposed to bind ACE2 more stably due to the enhancement of the base rigidity (Ou et al., 2020). Our study did not identify any of these mutants in African samples, and this may also explain the relatively lower morbidity and mortality seen on the African continent. Another significant finding concerned the occurrence of the P323L mutation on the RdRp protein. Globally, P323L is one of the commonest SARS CoV2 mutations, with a frequency of over 90% in countries such as the United States (Koyama et al., 2020). In Africa, however, the P323L mutation had an overall frequency of 34%, with only 3% of these mutations occurring in SSA. Whereas Egypt had a relatively high frequency of P323L (30.4%), only 25% of these mutations co-occurred with D614G. Global data show a near 100% correlation between D614G and P323L (Kannan et al., 2020). P323L is thought to enhance viral infectivity together with D614G. According to Kannan et al. (2020), the location of P323L on the NSP12-NSP8 interface may position the leucine side chain closer to F396, leading to enhanced hydrophobic interactions between the NSP8 L122 residue and the NSP12 T323 (Cγ2) and L270 residues. This is thought to enhance viral replication through improved processivity of NSP12 (Kannan et al., 2020).
This observation suggests that the low frequency of P323L in Africa may contribute to the relatively less severe impact of COVID-19 on the continent compared to other regions, even in areas such as Egypt where the D614G variant predominates. We also observed a low frequency of the ORF8 L84S mutation in Africa. L84S was the first observed mutation and is one of the commonest mutations worldwide, with a frequency exceeding 50%, and is associated with severe disease in Italy (Koyama et al., 2020; Wang et al., 2020b). This mutation was present in 2.5% of African samples. The ORF8 protein aids in immune evasion through downregulation of major histocompatibility complex class I (MHC-I) molecules. The L84S mutation is thought to decrease protein stability (DdG=0.99 kcal/mol) and protein rigidity, a factor that may disfavour SARS CoV2, leading to increased immune surveillance and reduced viral titres (Wang et al., 2020b). The effect of this mutation on disease severity, and whether its low frequency in Africa contributes to the relatively low disease burden, merits more experimental studies. Other important findings related to the ORF3a protein. The first observation was that only 1 sample from SSA had the Q57H mutation on the ORF3a accessory protein. This mutation occurred in 81% of all Egyptian samples, 60% of Moroccan samples, and 38% of Tunisian samples. The second observation was that the G251V ORF3a mutation, which occurred first in Italy and Brazil and was associated with many infections, was present in less than 2% of all African samples (Koyama et al., 2020). Q57H is the commonest mutation worldwide (Koyama et al., 2020). Taken together with the earlier observations about the D614G variant, this observation suggests two things.
First, it seems to point to Europe and/or the USA as the origin of the virus in much of North Africa, since the D614G/Q57H subclade first occurred in France and has since predominated in the USA (Koyama et al., 2020). Secondly, it may have implications for the disease burden in Africa. ORF3a interacts with both the S and ORF8 proteins. According to Wu et al. (2020), the Q57H mutation results in increased binding affinity between the Q57H ORF3a and S (ΔΔG = 4.2 kcal/mol). This dramatic increase in binding affinity may have several consequences. First, it may cause treatment failure by shifting the protein-binding interface and destroying drug-targeting sites. Secondly, it introduces an early stop codon in orf3b after amino acid 13 (Δ3b), resulting in a truncated ORF3b protein with consequent loss of interferon antagonism (Lam et al., 2020). Other findings show that the Q57H variant does not seem to influence channel properties and does not differ significantly in function from the wild-type ORF3a. This may be attributed to the mutation being located on the N-terminal, which determines the subcellular localization of the virus without influencing channel properties (Kern et al., 2020). Further research on the clinical importance of Q57H is warranted. In the present study, we also observed 3 missense mutations in the E protein, V5F, V62F, and L19F, that may be of clinico-epidemiological importance. E is a 75-residue integral viroporin involved in viral replication, pathogenesis and assembly, activation of the host inflammasome, and virion release (Lim & Liu, 2001; Nieto-Torres et al., 2014; Ruch & Machamer, 2012; Weiss & Navas-Martin, 2005). E is highly conserved in coronaviruses, with very few observed mutations (Qingfu et al., 2003). Deletion of E is associated with attenuation in some coronaviruses.
Reduced virulence due to E mutations has been reported (De Diego et al., 2007; Nieto-Torres et al., 2014; Pervushin et al., 2009). E is hence a suitable target for drug and vaccine development, and channel activity may be optimally inhibited by targeting small-molecule drugs to the host cell Golgi and the endoplasmic reticulum-Golgi intermediate compartment (ERGIC) (Mandala et al., 2020). Structurally, E is made up of an N-terminal domain (NTD), an ion-conducting transmembrane domain (TMD), and a cytoplasmic domain (CTD) (Wu et al., 2003; Mandala et al., 2020). The V5F mutation affects the NTD and was present in more than half of all Egyptian samples (62.6%, n=141). To the best of our knowledge, this mutation has not been reported elsewhere. The L19F and V62F mutations affect the TMD and CTD respectively. These 2 mutations are characterized by 8-mer (TTTTTTTT) and 4-mer (TTTT) homopolymeric stretches. Mutation analysis using the I-Mutant Suite indicates DdG values of 0.77 kcal/mol, -1.21 kcal/mol, and -1.04 kcal/mol for V62F, L19F and V5F respectively. This suggests that the mutations result in highly unstable and temperature-sensitive E proteins. This observation is consistent with earlier work on E mutants reporting aberrant morphology (Fischer et al., 1998). The E protein is essential for induction of interferon synthesis and apoptosis, RNA replication, and production and release of membrane vesicles or virus-like particles (VLPs) in coronaviruses (An et al., 1999; Corse & Machamer, 2000; Maeda et al., 2001; De Diego et al., 2007; Nieto-Torres et al., 2014; Pervushin et al., 2009; Mandala et al., 2020; Wu et al., 2003). Observations about the mutations on the E protein also need to be discussed in conjunction with ORF3a, since ORF3a is also a viroporin-coding gene in coronaviruses (An et al., 1999; Jiang et al., 2005; Verdia-Baguena et al., 2012).
In SARS CoV2, ORF3a forms homotetrameric potassium-sensitive ion channels (viroporins) that mediate the activation of the NLRP3 inflammasome (Siu et al., 2019; Farag et al., 2020; Wozniak et al., 2010). Viroporin subunits undergo oligomerization, forming hydrophilic pores that allow ions to be shuttled across the membranes of host cells and that facilitate the cellular entry of viruses, viral replication and assembly, and the release of viruses from infected cells (Farag et al., 2020). Deletion of genes coding for viroporins leads to a significant reduction in viral progeny formation and reduces the pathogenicity of viruses (Farag et al., 2020). As noted previously, viroporins induce inflammasome activity. Inflammasomes can regulate the activity of caspase-1, which mediates the maturation of interleukin-1β (IL-1β) and interleukin-18 (IL-18) (Farag et al., 2020). E and ORF3a proteins are thought to contribute to NLRP3 inflammasome activity and are essential for maximal replication and virulence of SARS CoV (Farag et al., 2020). SARS CoV viruses lacking E and ORF3a are not viable. Even though it contributes to viral pathogenesis, ORF3a in SARS CoV is not essential for replication (Siu et al., 2020). Siu et al. (2019) showed that ORF3a-associated activation of the NLRP3 inflammasome is mediated through TNF receptor-associated factor 3 (TRAF3)-mediated ubiquitination of apoptosis-associated speck-like protein containing a caspase recruitment domain (ASC). Findings by Siu et al. (2020) also demonstrate that ORF3a up-regulates expression of the fibrinogen subunits FGA, FGB and FGG in host lung epithelial cells in SARS CoV. ORF3a is also involved in the induction of apoptosis in cell culture and in the downregulation of the type 1 interferon receptor, through induction of serine phosphorylation within the IFN alpha-receptor subunit 1 (IFNAR1) degradation motif and increased ubiquitination of IFNAR1 (Siu et al., 2019).
Based on the foregoing, the effect of the E and ORF3a mutations merits further investigation. The majority (>70%) of the mutations were transitions, with C->T transitions making up 44% of these. This finding is similar to observations in other studies that show a dominance of C->T mutations in the SARS CoV2 genome (Koyama et al., 2020; Mishra et al., 2020; Wang et al., 2020b; Badua et al., 2020). G->T transversions were the second most common mutations in our study, followed by A->G and T->C transitions. Together, C->T and G->T mutations made up 64% of all mutations, underlining the role of deamination in SARS CoV2 evolution. Unmethylated CpG dinucleotides stimulate TLR9 innate immune responses. Since CpG-rich codons have a lower transcription rate, CpG depletion also serves to enhance the virus transcription rate. Depletion of CpG dinucleotides may occur in response to selection pressure from the host immune system, from spontaneous deamination of methylated cytosines in CpG dinucleotides, and from deamination of unmethylated cytosines. Depletion of CpG is also a strategy that ensures the epigenetic silencing of the virus, leading to the establishment of latent viral infection (Bird, 1980; Chinnery et al., 2012; Medvedeva et al., 2010; Wiebauer et al., 1993). Depletion may be driven by host ZAP or APOBEC proteins. In the present study, we observed that 95% of all the mutations in the NSP12 (RdRp) protein are deamination mutations. The role of RdRp deamination mutations in attenuating viral virulence in SARS CoV2 needs to be investigated further. Findings reported in the present study may have clinico-epidemiological implications. We have noted the absence, or presence at low frequencies, of mutations associated with increased infectivity, virus fitness, and disease severity. Experimental studies will help to clarify the clinical impact of these mutations. On vaccines, the RBD has important epitopic antigens.
Some of the identified mutations may alter the binding affinity of vaccines raised against the prototype strain, leading to a reduction in vaccine efficacy (Ou et al., 2020). Further studies also need to investigate whether the CpG depletion that is widespread on the SARS CoV2 RdRp is a strategy for attenuating viral virulence (Ficarelli et al., 2020; Trus et al., 2020) and for immune escape, and determine whether it has a role in the relatively low morbidity and mortality numbers in Africa. Mutation analysis is a critical factor when selecting suitable targets for drug design. Currently, a number of drugs are in use, undergoing clinical trials, or proposed as candidates against SARS CoV2 genomic or sub-genomic RNA regions. Lopinavir and ritonavir were proposed for repurposing against the SARS CoV2 3-chymotrypsin-like (3CLpro) and papain-like (PLpro) proteases (Nutho et al., 2020; Xiaopan et al., 2020). Remdesivir and favipiravir target the RdRp (Goldman et al., 2020; Pandey et al., 2020), while ribavirin interferes with mRNA capping and viral replication (Khalili et al., 2020; Pandey et al., 2020; Tong et al., 2020). Oseltamivir, which targets the 3CLpro, was found to be ineffective in the treatment of SARS CoV2. Interferons induce production of Mx proteins that are thought to inhibit viral replication (Spiegel et al., 2004). Teicoplanin is thought to prevent viral entry through inhibition of cathepsin L. Baricitinib, which was discovered using artificial intelligence (AI) methods, prevents viral entry by binding to AP2-associated protein kinase 1 (AAK1) to block clathrin-dependent endocytosis, and modulates inflammatory cytokines through selective inhibition of Janus kinase (JAK) (Caputo et al., 2020). Other compounds that have been mentioned as possible drugs for COVID-19 include azithromycin (Echeverría-Esnal et al., 2020) and arbidol.
Except for remdesivir and corticosteroids, studies on many of these other drugs are still ongoing or have produced conflicting results (Echeverría-Esnal et al., 2020; Pandey et al., 2020). Mutations may affect the binding of SARS CoV2 antiviral drugs, which has implications for drug design and drug resistance. This has been noted for antiviral drugs targeting the RdRp regions, and the effectiveness of compounds such as remdesivir may be hampered by the mutations (Pandey et al., 2020). On drug design, regions such as PLpro, RdRp, and S that exhibited high mutation rates may be less suitable drug and diagnostic targets, and regions showing limited or no mutations such as ORF8, NSP8, NSP9, NSP11, NSP15, ORF6, 2'-O-ribose methyltransferase, and ORF9 may be more attractive targets. Assessment of viral mutations provides protein stability data and can be used to model protein folding as well as assess binding affinity around the mutation. This helps to assess possible drug resistance phenotypes, select optimal targets for lead candidate development, correlate mutations with disease severity, and map putative target sites. Further studies need to consider these mutations in this respect. Out of the 18,820 global SARS CoV2 sequences deposited in NCBI at the time of the analysis, African sequences accounted for 280, or just 1.5%, and Egypt accounted for 80% of the sequences collected in Africa. Sub-Saharan Africa accounted for 0.29% of all sequences. Only 9 African countries had any SARS CoV2 sequences; not a single sequence was available for 45 other countries. Evidently, very little effort is being made to sequence samples collected in Africa and understand the mutation patterns on this continent. The small sample size may not be sufficient to support sweeping generalizations. The genetic picture captured in this study is a temporal snapshot that describes the genetic variation present months ago.
The mutation landscape is a constantly changing mosaic that is temporal in nature and which requires constant genomic analysis for continuous tracking.

Implications of SARS-CoV-2 genetic diversity and mutations on pathogenicity of the COVID-19 and biomedical interventions Induction of apoptosis in murine coronavirus-infected cultured cells and demonstration of E protein as an apoptosis inducer Water scarcity and COVID-19 in sub-Saharan Africa The Novel Insight of SARS-CoV-2 Molecular Biology and Pathogenesis and Therapeutic Options V483a -an Emerging Mutation Hotspot of Sars-Cov-2 Genomic and proteomic mutation landscapes of SARS-CoV-2 DNA methylation and the frequency of CpG in animal DNA DeepDDG: Predicting the Stability Change of Protein Point Mutations Using Neural Networks I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure The origin and underlying driving forces of the SARS-CoV-2 outbreak Mutations Strengthened SARS-CoV Prediction of protein stability changes for single-site mutations using support vector machines TLR9 ligand CpG-ODN applied to the injured mouse cornea elicits retinal inflammation Molecular basis of base substitution hotspots in Escherichia coli & en representación del Grupo de Cooperación internacionalde la COVID-19 and Africa: Surviving between a rock and a hard place Azithromycin in the treatment of COVID-19: a review Viroporins and inflammasomes: A key to understand virus-induced inflammation CpG dinucleotides inhibit HIV-1 replication through zinc finger antiviral protein (ZAP)-dependent and independent mechanisms Clinical features and efficacy of antiviral drug COVID-19 patients from East-West-Lake Shelter Hospital in Wuhan: a retrospective case series Structure of the RNA-dependent RNA polymerase from COVID-19 virus Remdesivir for 5 or 10 Days in Patients with Severe Covid-19 Making Sense of Mutation: What D614G Means for the COVID-19 Pandemic Remains Unclear Implications of SARS-CoV-2
Mutations for Genomic RNA Structure and Host microRNA Targeting Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19 MRBAYES: Bayesian inference of phylogeny SCRATCH: a Protein Structure and Structural Feature Prediction Server The D614G mutation of SARS-CoV-2 spike protein enhances viral infectivity and decreases neutralization sensitivity to individual convalescent sera Characterization of cytokine/chemokine profiles of severe acute respiratory syndrome Virus strain from a mild COVID-19 patient in Hangzhou represents a new trend in SARS-CoV-2 evolution potentially related to Furin cleavage site Infectivity of SARS-CoV-2: there Is Something More than D614G? Role of conformational sampling in computing mutation-induced changes in protein structure and stability Cryo-EM structure of the SARS-CoV-2 3a ion channel in lipid nanodiscs. bioRxiv Novel coronavirus treatment with ribavirin: Groundwork for an evaluation concerning COVID-19 Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus Cell Variant analysis of SARS-CoV-2 genomes Loss of orf3b in the circulating SARS-CoV-2 strains Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM The Sequence Alignment/Map format and SAMtools The Missing Link in Coronavirus Assembly retention of the avian coronavirus infectious bronchitis virus envelope protein in the pre-golgi compartments and physical interaction between the envelope and membrane proteins Baricitinib: A chance to treat COVID-19? A Unique Clade of SARS-CoV-2 Viruses is Associated with Lower Viral Loads in Patient Upper Airways. 
medRxiv: the preprint server for health sciences Mesquite: a modular system for evolutionary analysis The EMBL-EBI search and sequence analysis tools APIs in 2019 Protein stability: computation, sequence statistics, and new experimental methods Structure and drug binding of the SARS-CoV-2 envelope protein transmembrane domain in lipid bilayers Structure of the zinc-finger antiviral protein in complex with RNA reveals a mechanism for selective targeting of CG-rich viral sequences Intergenic, gene terminal, and intragenic CpG islands in the human genome Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome. NCBI Reference Sequence: NC_045512.2. Bethesda: National Center for Biotechnology Information Severe acute respiratory syndrome coronavirus E protein transports calcium ions and activates the NLRP3 inflammasome Why Are Lopinavir and Ritonavir Effective against the Newly Emerged Coronavirus Emergence of RBD mutations in circulating SARS-CoV-2 strains enhancing the structural stability and human ACE2 receptor affinity of the spike protein Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant Potential therapeutic targets for combating SARS-CoV-2: Drug repurposing, clinical trials and recent advancements Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules Poverty, inequality and COVID-19: the forgotten vulnerable Lethal mutagenesis of viruses STRUM: structure-based prediction of protein stability changes upon single-point mutation FigTree v1.3.1. 
Institute of Evolutionary Biology SARS-CoV-2 Is Restricted by Zinc Finger Antiviral Protein despite Preadaptation to the Low-CpG Environment in Humans Integrative Genomics Viewer MRBAYES 3: Bayesian phylogenetic inference under mixed models Variant analysis of 1,040 SARS-CoV-2 genomes The coronavirus E protein: assembly and beyond Naturally mutated spike proteins of SARS-CoV-2 variants show differential levels of cell entry bioRxiv Mechanisms of viral mutation. Cellular and molecular life sciences : CMLS Severe acute respiratory syndrome coronavirus ORF3a protein activates the NLRP3 inflammasome by promoting TRAF3-dependent ubiquitination of ASC The antiviral effect of interferon-beta against SARS-coronavirus is not mediated by MxA protein Molecular Evolutionary Genetics Analysis across computing platforms CG dinucleotide suppression enables antiviral defence targeting non-self RNA Is oseltamivir suiSupplementary table Sfor fighting against COVID-19: In silico assessment, in vitro and retrospective study On the origin and continuing evolution of SARS-CoV-2. 
(2020) Ribavirin therapy for severe COVID-19: a retrospective cohort study CpGrecoding in zika virus genome causes host-age-dependent attenuation of infection with protection against lethal heterologous challenge in mice Coronavirus biology and replication: implications for SARS-CoV-2 Emergence of genomic diversity and recurrent mutations in SARS-CoV-2 Coronavirus E protein forms ion channels with functionally and structurallyinvolved membrane lipids Comparing viral load and clinical outcomes in Washington State across D614G mutation in spike protein of SARS-CoV-2 (2020) The establishment of reference sequence for SARS-CoV-2 and variation analysis Characterizing SARS-CoV-2 mutations in the United States Coronavirus pathogenesis and the emerging pathogen severe acute respiratory syndrome coronavirus The repair of 5-methylcytosine deamination damage Intracellular proton conductance of the hepatitis C virus p7 protein and its contribution to infectious virus production The E protein is a multifunctional membrane protein of SARS-CoV Crystal structure of SARS-CoV-2 papain-like protease Extreme Genomic CpG Deficiency in SARS-CoV-2 and Evasion of Host Antiviral Defense Positive selection, not negative selection, in the pseudogenization of rcsA in Yersinia pestis The ORF8 Protein of SARS-CoV-2 Mediates Immune Evasion through Potently Downregulating MHC-I. bioRxiv Teicoplanin potently blocks the cell entry of 2019-nCoV A greedy algorithm for aligning DNA sequences SARS-CoV-2 viral load in upper respiratory specimens of infected patients