key: cord-0758506-6kwppo31
authors: Okoh, Olayinka Sunday; Nii-Trebi, Nicholas Israel; Jakkari, Abdulrokeeb; Olaniran, Tosin Titus; Senbadejo, Tosin Yetunde; Kafintu-kwashie, Anna Aba; Dairo, Emmanuel Oluwatobi; Ganiyu, Tajudeen Oladunni; Akaninyene, Ifiokakaninyene Ekpo; Ezediuno, Louis Odinakaose; Adeosun, Idowu Jesulayomi; Ockiya, Michael Asebake; Jimah, Esther Moradeyo; Spiro, David J.; Oladipo, Elijah Kolawole; Trovão, Nídia S.
title: Epidemiology and genetic diversity of SARS-CoV-2 lineages circulating in Africa
date: 2022-02-05
journal: iScience
DOI: 10.1016/j.isci.2022.103880
sha: 6673458a802fd7a38f6e236d08650018811d97f0
doc_id: 758506
cord_uid: 6kwppo31

There is a dearth of information on COVID-19 disease dynamics in Africa. To fill this gap, we investigated the epidemiology and genetic diversity of SARS-CoV-2 lineages circulating in the continent. We retrieved 5229 complete genomes collected in 33 African countries from the GISAID database. We investigated the circulating diversity, reconstructed the viral evolutionary divergence and history, and studied the case and death trends in the continent. Almost a fifth (144/782, 18.4%) of Pango lineages found worldwide circulated in Africa, with five different lineages dominating over time. Phylogenetic analysis revealed that African viruses cluster more closely with those from Europe. We also identified two motifs that could function as integrin-binding sites and N-glycosylation domains. These results shed light on the epidemiological and evolutionary dynamics of the circulating viral diversity in Africa. They also emphasize the need to expand surveillance efforts in Africa to help inform and implement better public health measures.

Journal Pre-proof

replication.
The protein functions during host cell attachment and entry by primarily mediating binding to the extracellular domains of its receptor, the angiotensin-converting enzyme 2 (ACE2), a transmembrane protein that is also used by SARS-CoV for cell entry (Wan et al., 2020). SARS-CoV-2 binding to ACE2 and fusion with the cellular membrane are facilitated by the S1 receptor-binding domain (RBD) and the S2 subunit, respectively. This unique function makes the spike glycoprotein a target for the development of antibodies, therapeutics, and vaccines. Therefore, the [...] (Toyoshima et al., 2020). Most of the African sequences included in the genome variation analysis described above came from North African countries, including Egypt, rather than from sub-Saharan Africa.

The epidemiology of SARS-CoV-2 on the African continent calls for a comprehensive study of the genomic and evolutionary patterns of this virus. Comparative analysis of viral genome sequences is a very useful approach for gaining insight into pathogen emergence and evolution. This study therefore pursues an in-depth investigation into the epidemiology, evolution, and molecular motifs of SARS-CoV-2 in Africa to shed light on the pandemic dynamics and aid in informing [...]

Focusing only on sequence data will not give a true representation of the disease dynamics of SARS-CoV-2 in Africa, as only a fraction of the cases in Africa are sequenced. Therefore, we analyzed reported COVID-19 cases using data from OurWorldInData.org downloaded on January 8, 2021. As presented in Figure S1, North America has the highest number of COVID-19 cases (n = 25 million), followed by Europe (n = 19 million) and Asia (n = 18 million). Oceania has the fewest reported cases of the six continents (n = 20,575).
The absolute numbers of cases in Africa (n = 2 million) and South America (n = 6 million) are a small fraction of the cases in Asia, North America, and Europe. The global average of COVID-19 cases per 100,000 population (hereafter referred to as cases/pop) is 895, represented by the red line in Figure 1. We observed that the average numbers of COVID-19 cases per 100,000 persons in Oceania, Africa, and Asia are all below the global average. In terms of absolute case counts, Asia is more affected than South America. However, taking population into consideration reveals that South America (1287 cases/pop) has a higher burden of COVID-19 cases than Asia (390 cases/pop).

The number of deaths per 100,000 population (hereafter referred to as deaths/pop) followed the same trend as the number of cases per 100,000. Deaths per 100,000 in Oceania (2 deaths/pop), [...]

In Africa, an analysis of COVID-19 cases per 100,000 population (Figure 2) [...]

We looked specifically at the viral genetic diversity within Africa, as compared to the genetic diversity observed in other continents (Figure 9 and Data S3). Inspection of the continent-specific genetic distance distributions with a Wilcoxon signed-rank test revealed that the viral diversity circulating in Africa is significantly higher (p-value < 2.2e-16) than that estimated in Oceania and South America, but significantly lower than that in Asia, Europe, and North America. These [...]

We also investigated the genetic diversity across countries in the African continent (Figure S7). The lowest viral diversities were observed in Madagascar, Zambia, and Algeria, while the highest [...]

[...] with equal width to function as a target and align better to motifs in the database, thus producing a match score as good as or better than another target. The e-value shows the expected number of false positives in the matches, with a threshold of ten or less.
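The e-value threshold described above can be illustrated with a minimal sketch. It assumes the common convention that an e-value is the per-comparison p-value multiplied by the number of target motifs searched; the source does not state which tool or formula was used, and the function names, p-values, and database size below are hypothetical.

```python
def e_value(p_value: float, n_targets: int) -> float:
    """Expected number of false-positive matches when n_targets motifs are searched.

    Assumption: e-value = p-value * number of targets (not confirmed by the source).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_targets < 1:
        raise ValueError("n_targets must be a positive integer")
    return p_value * n_targets


def passes_threshold(p_value: float, n_targets: int, max_e: float = 10.0) -> bool:
    """Keep a match only if its e-value is at or below the threshold (ten in the text)."""
    return e_value(p_value, n_targets) <= max_e


if __name__ == "__main__":
    # Hypothetical match p-values against a hypothetical database of 500 motifs.
    hypothetical_hits = {"motif_A": 1e-4, "motif_B": 0.05}
    database_size = 500
    for name, p in hypothetical_hits.items():
        kept = passes_threshold(p, database_size)
        print(f"{name}: e-value = {e_value(p, database_size):.3g}, kept = {kept}")
```

With these illustrative inputs, the first match stays well under the threshold of ten and is retained, while the second exceeds it and is discarded.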
The motifs were matched with the study dataset but were not found in all isolates (data not shown).

[...] Tunisia. Compared to other continents, Africa appears to be relatively spared in terms of case fatality rate. Nonetheless, Egypt, Sudan, Chad, and Niger, all of which share borders, were found to have the highest numbers of COVID-19-related deaths, and thus further investigation is necessary to uncover the factors that led to this public health burden. We estimated that the impact of SARS-CoV-2 in Africa has been below the global average, both in terms of cases and mortality.

We acknowledge the authors and laboratories that generated and submitted sequences into GISAID's EpiFlu Database. A full table of sequence authors is available in Supplemental Table 2. We also thank Helix Biogen Institute, also known as Helix Biogen Consult, for their support. We [...]

The authors declare no competing interests.

Comparison of evolutionary divergence
We estimated the evolutionary divergence of several sequence datasets from each continent. Each [...]

References
Chloroquine and hydroxychloroquine for the prevention or treatment of COVID-19 in Africa: caution for [...]
Hydroxychloroquine: from malaria to autoimmunity
(2021) Package 'maptools'
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan. Emerging Microbes and Infections
R package 'sm': nonparametric smoothing methods
rio: A Swiss-army knife for data file I/O
(2020) Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Comprehensive mapping of neutralizing antibodies against SARS-CoV-2 variants induced by natural infection or vaccination
Chloroquine for the 2019 novel coronavirus SARS-CoV-2
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
Origin and evolution of pathogenic coronaviruses
Estimated transmissibility and severity of novel SARS-CoV-2 Variant of Concern 202012/01
qwraps2: Quick Wraps 2
Prospects for inferring very large phylogenies by using the neighbor-joining method
Sixteen novel lineages of SARS-CoV-2 in South Africa
Emergence of a SARS-CoV-2 variant of concern with mutations in spike glycoprotein
Accelerating genomics-based surveillance for COVID-19 response in Africa
Chloroquine cardiomyopathy - a review of the literature
SARS-CoV-2 genomic variations associated with mortality rate of COVID-19
Transmission of SARS-CoV-2 Lineage B Insights from linking epidemiological and genetic data. medRxiv
Receptor recognition by the novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS coronavirus
A novel coronavirus outbreak of global health concern. The Lancet
Exploitation of glycosylation in enveloped virus pathobiology
Site-specific glycan analysis of the SARS-CoV-2 spike
A genomic perspective on the origin and emergence of SARS-CoV-2
Virus-receptor interactions of glycosylated SARS-CoV-2 spike and human ACE2 receptor
Team of 10 researchers at the WIV (2020) Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro
aweek: Convert Dates to Arbitrary Week Definitions
A novel coronavirus from patients with pneumonia in China