key: cord-0687963-kzkt3u1h
authors: Kandeel, Mahmoud; Mohamed, Maged E. M.; Abd El-Lateef, Hany M.; Venugopala, Katharigatta N.; El-Beltagi, Hossam S.
title: Omicron variant genome evolution and phylogenetics
date: 2021-12-15
journal: J Med Virol
DOI: 10.1002/jmv.27515
sha: 714f693f8b92380749f4d3722037d8187892b973
doc_id: 687963
cord_uid: kzkt3u1h

Following the discovery of the SARS-CoV-2 Omicron variant (B.1.1.529), the global COVID-19 outbreak, which has spread relentlessly over the past 2 years, has resurged. This new variant showed a marked degree of mutation compared with previous SARS-CoV-2 variants. This study investigates the evolutionary links between the Omicron variant and recently emerged SARS-CoV-2 variants. Whole-genome sequences of SARS-CoV-2 variants were obtained and aligned using Clustal Omega. Pairwise comparisons were computed; differences, percent identity, gaps, and mutations were noted; and an identity matrix was generated. The phylogenetic relationships of the Omicron variant were determined using several evolutionary substitution models. Ultrametric and metric clustering methods, such as UPGMA and neighbor-joining (NJ), combined with nucleotide substitution models that distinguish transitions from transversions, such as the Kimura 80 model, revealed that the Omicron variant forms a new monophyletic clade that is distant from other SARS-CoV-2 variants. In contrast, the NJ method using a basic nucleotide substitution model such as Jukes-Cantor revealed a close relationship between the Omicron variant and the recently evolved Alpha variant. Based on percent sequence identity, the closest variants were, in order: Omicron, Alpha, Gamma, Delta, Beta, Mu, and then the SARS-CoV-2 USA isolate. Genome alignment with the other variants showed that the Omicron genome had the greatest number of gaps, ranging from 43 to 63.
It is possible, given its close relationship to the Alpha variant, that Omicron has been around for much longer than predicted, even though it forms a separate monophyletic group. Systematic and comprehensive sequencing initiatives are highly recommended to study the evolution and mutations of the virus.

Omicron was discovered in Botswana in early November.
South Africa reported it to the World Health Organization on November 24, 2021, and it was subsequently designated as a variant of concern (VOC). Because of the emergent nature of the Omicron variant, various concerns have been raised, including the source of its emergence, the effect of its mutations on the response to vaccination, the influence of mutations on the modulation of host immunity, clinical data, and Omicron's spreading potential and lethality. In this study, an attempt was made to trace the phylogenetic relationships of the Omicron genome. Several methodologies were used to achieve the best-fitting alignment of whole genomes.

The genomes of SARS-CoV-2 variants were retrieved from GISAID (https://www.gisaid.org/). 4 Basic information on the genomes used is provided in Table 1. The CLC Genomics Workbench 12.0 (QIAGEN) 5 and Geneious Prime 6 software were used to handle the sequences. FASTA files containing the complete genomes were uploaded to the Clustal Omega website at the European Bioinformatics Institute and aligned using the default parameters, and the results were analyzed. The output files were imported into in-house software, and the pairwise comparison matrix was produced. Differences and percent identity were calculated, gaps and mutations were noted, and the identity matrix was generated. Phylogenetic trees were constructed using two algorithms: the neighbor-joining (NJ) method and the UPGMA method. For distance estimation, the Jukes-Cantor (JC) and Kimura 80 substitution models 7 were employed. Bootstrap resampling with 100 replicates was applied.

The first sequenced genome of the Omicron variant was used to trace its phylogenetic relationships with other SARS-CoV-2 variants. Bioinformatics and phylogeny tools are gold standards in microbial evolution and drug discovery against selected molecular targets. [8] [9] [10] Analysis of SARS-CoV-2 genome constituents highlighted the forces affecting virus evolution.
11 In this study, we used a combination of tools to gain insight into the evolution of the Omicron variant. The UPGMA approach assumes that all lineages evolve at the same rate, so differences in mutation rate among lineages are not taken into account during tree construction; the tree is built solely from the pairwise distances. In contrast, the NJ method allows for differences in evolutionary rate among lineages during tree construction. The JC model of evolution assumes that all possible nucleotide changes occur at equal rates, whereas the Kimura model distinguishes transitions (e.g., changes of A to G or C to T) from transversions (changes between purines and pyrimidines). In virus evolution, a single evolutionary model cannot be assumed because of the complexity of virus evolution and the variation even within single genes. 12 The topology of an NJ tree is expected to be relatively insensitive to the use of the JC model, and an NJ-JC tree is thought to provide a good estimate of tree topology.

1. World Health Organization. Classification of Omicron (B.1.1.529): SARS-CoV-2 Variant of Concern
2. Omicron variant (B.1.1.529) of SARS-CoV-2, a global urgent public health alert!
3. Where did 'weird' Omicron come from?
4. Global initiative on sharing all influenza data - from vision to reality
5. CLC Genomics Workbench 12.0 (QIAGEN)
6. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data
7. Model selection in the estimation of the number of nucleotide substitutions
8. Insulin from human, camel and farm animals: comparative bioinformatics and molecular dynamics studies
9. Discovery of new potent anti-MERS CoV fusion inhibitors
10. MERS-CoV inhibitor peptides. United States Patent Office
11. From SARS and MERS CoVs to SARS-CoV-2: moving toward more biased codon usage in viral structural and nonstructural genes
12. Selecting models of nucleotide substitution: an application to human immunodeficiency virus 1 (HIV-1)

The authors declare that there are no conflicts of interest.
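
As a rough illustration of how the two substitution models discussed above differ, the following Python sketch computes the Jukes-Cantor and Kimura 80 distances between two aligned sequences using the standard closed-form expressions. It is not the software used in this study; the function names and the choice to skip any alignment column containing a gap or ambiguous base are assumptions made for illustration only.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq_a, seq_b):
    """Return (compared_sites, transitions, transversions) for two aligned sequences.

    Assumption: columns with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def jukes_cantor_distance(seq_a, seq_b):
    """JC distance: d = -(3/4) * ln(1 - 4p/3), p = proportion of differing sites."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    p = (ts + tv) / sites
    # math.log raises ValueError if the sequences are saturated (p >= 0.75).
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura80_distance(seq_a, seq_b):
    """K80 distance: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q),
    with P and Q the proportions of transitions and transversions."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    P = ts / sites
    Q = tv / sites
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)
```

Because the JC model pools all substitutions into a single proportion while the Kimura 80 model weights transitions and transversions separately, the two distances diverge as the transition/transversion ratio departs from what equal rates would give. This is one way in which the choice of model can change the distance matrix passed to NJ or UPGMA.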
All data are within the manuscript.

http://orcid.org/0000-0003-3668-5147
Hossam S. El-Beltagi http://orcid.org/0000-0003-4433-2034
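
The pairwise comparison step described in the methods (differences, percent identity, and gaps computed from the Clustal Omega alignment) can be sketched in Python as follows. The in-house software used in the study is not described, so this is only a minimal sketch under stated assumptions: percent identity is taken over columns where both sequences have a base, a gap is counted when exactly one of the two sequences has `-` in a column, and the sequence names in the usage note are hypothetical toy data, not the GISAID genomes.

```python
from itertools import combinations


def pairwise_stats(seq_a, seq_b):
    """Return identity (%), differences, and gaps for two aligned sequences.

    Assumptions: columns gapped in both sequences are ignored; identity is
    computed over columns where neither sequence has a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = differences = gaps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no pairwise information
        if a == "-" or b == "-":
            gaps += 1
        elif a == b:
            matches += 1
        else:
            differences += 1
    compared = matches + differences
    identity = 100.0 * matches / compared if compared else 0.0
    return {"identity": identity, "differences": differences, "gaps": gaps}


def identity_matrix(aligned):
    """Build a symmetric percent-identity matrix from {name: aligned_sequence}."""
    names = list(aligned)
    matrix = {name: {name: 100.0} for name in names}
    for a, b in combinations(names, 2):
        pct = pairwise_stats(aligned[a], aligned[b])["identity"]
        matrix[a][b] = matrix[b][a] = pct
    return matrix
```

With a toy alignment such as `{"ref": "ACGTACGT", "var1": "ACGTACGA", "var2": "AC--ACGT"}`, calling `identity_matrix` returns a nested dictionary that can be printed row by row, and `pairwise_stats` reports the gap count for any pair. Other definitions of percent identity (for example, dividing by the full alignment length) would give different values.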