key: cord-0776683-04kbzowc authors: Littera, R.; Melis, M. title: How many relevant SARS-CoV-2 variants might we expect in the future? date: 2021-11-21 journal: nan DOI: 10.1101/2021.11.17.21266463 sha: c513e4c0047ce85def19f89e5f95709d4ad56748 doc_id: 776683 cord_uid: 04kbzowc
Objectives: The emergence of new SARS-CoV-2 variants is a major challenge in the management of the Covid-19 pandemic. A crucial issue is to quantify the number of variants that may represent a potential risk for public health in the future. Methods: We fitted the data on the most relevant SARS-CoV-2 variants recorded by the World Health Organization (WHO). The fitting function depends on the total number of infected subjects in the world since the start of the epidemic. Results: We found that the number of relevant SARS-CoV-2 variants up to November 2021 was about 44. Moreover, the number of new relevant variants per ten million cases turned out to be 1.64 in November 2021, slightly lower than the value of 2.29 in March 2020. Conclusions: Our simple mathematical model can evaluate the number of relevant SARS-CoV-2 variants as the cumulative number of cases increases worldwide and may represent a useful tool in planning strategies to effectively counter the pandemic.
Most mutations in the genome of the severe acute respiratory syndrome coronavirus (SARS-CoV-2) are neutral or only mildly deleterious. However, a small proportion of mutations can increase infectivity and promote virus-host interactions that are critical to the establishment of persistent and more severe infection [1, 2]. For example, mutations in the spike protein, which mediates attachment of the virus to host cell-surface receptors [3], can have significant effects on virus behaviour. In order to effectively control the pandemic, it is imperative to investigate the emergence and spread of variants with an impact on disease transmission and human health [4].
SARS-CoV-2 sequences are shared daily on public databases such as the Global Initiative on Sharing All Influenza Data (GISAID) [5] or the European Centre for Disease Prevention and Control (ECDC) [6], which significantly contribute to surveillance of the pandemic. The World Health Organization has established that SARS-CoV-2 variants representing a possible risk to public health can be divided into three distinct categories [1]: variants under monitoring (VUMs), variants of interest (VOIs) and variants of concern (VOCs). VUMs are associated with genetic mutations which alter virus characteristics, although evidence of phenotypic or epidemiological impact is still unclear. VOIs are associated with: i) genetic mutations which affect transmissibility, disease course, diagnostic or therapeutic escape; ii) relevant community transmission with an emerging risk to global public health. VOCs are associated with one or more of the following characteristics: i) increase in transmissibility; ii) increase in virulence or change in disease severity; iii) decrease in effectiveness of social measures, diagnostics, vaccines and therapeutics. Given the continuous evolution of the SARS-CoV-2 virus, variants may be reclassified over time.
In the present study, we fitted the WHO data [1, 2] with a function that depends exclusively on the number of infected cases worldwide. Our fit allows for a fairly good estimate of the number of relevant variants that can be expected to appear for a given number of infected subjects throughout the world. Furthermore, our approach can predict the number of new relevant variants per ten million cases in any epidemiological situation.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. This version posted November 21, 2021. https://doi.org/10.1101/2021.11.17.21266463
The number of new relevant variants per ten million cases decreases very slowly as the cumulative number of Covid-19 cases increases. Therefore, it becomes crucial to carefully monitor and reduce virus circulation in order to avoid the emergence of new variants which may not be suitably covered by the vaccines and drugs currently available [7]. Despite the huge efforts put forth by healthcare services in many countries, vaccination campaigns have not achieved a population coverage that is high enough to prevent the spread of SARS-CoV-2. The WHO estimate [8] is that in Europe and Central Asia alone the current fourth wave of the Covid-19 pandemic is likely to cause more than half a million deaths.
Although it is obvious that the number of relevant variants increases with the number of cases throughout the world, it is extremely difficult to determine the precise relationship between these two variables. Our model is a simple attempt to make a fairly reliable estimate of the risk of new variants that can impact public health as the virus continues to spread.
By means of Wolfram Mathematica [9] we fitted the data on SARS-CoV-2 variants in order to evaluate the cumulative number of relevant SARS-CoV-2 variants, $V$, versus the cumulative number of infected subjects worldwide, $N$. The function $V(N)$ fitting the WHO data must satisfy the following conditions:
1) The function $V(N)$ varies from zero to infinity. If there is no infection, the number of variants is zero; vice versa, if the virus replicates infinitely many times, the cumulative number of variants is also infinite.
2) $V(N)$ increases monotonically, therefore the first derivative $V'(N)$ is positive. The cumulative number of variants increases with the number of infections, i.e. as the virus replications increase.
3) The first derivative of $V(N)$ decreases monotonically, therefore the second derivative $V''(N)$ is negative.
As the cumulative number of variants increases with the total cases in the world, the emergence of new virus mutations turns out to be slightly less frequent.
The fit of WHO data was obtained by means of the following function:
$$V(N) = a \, \frac{N}{\log N}$$
where $V$ is the cumulative number of relevant variants, $N$ is the cumulative number of cases in the world, $a$ is the constant of the numerical fit and "$\log N$" represents the natural logarithm of $N$. This function satisfies all the previous conditions, as shown below:
$$\lim_{N \to 0^{+}} V(N) = 0, \qquad \lim_{N \to \infty} V(N) = \infty$$
$$V'(N) = a \, \frac{\log N - 1}{(\log N)^{2}} > 0 \quad \text{for } N > e$$
$$V''(N) = a \, \frac{2 - \log N}{N \, (\log N)^{3}} < 0 \quad \text{for } N > e^{2}$$
The number of new relevant variants per ten million cases ($\Delta N = 10^{7}$) turns out to be:
$$\Delta V \simeq V'(N) \, \Delta N = a \, \Delta N \, \frac{\log N - 1}{(\log N)^{2}}$$
The relative variation of the number of new relevant variants per ten million cases decreases as the number of cases increases.
Instead of imposing the three conditions listed previously, the function used in the fit can also be justified through a more heuristic approach discussed in Appendix A.
We fitted the data recorded by WHO [1, 2] up to November 2021 by means of a specific code written with Wolfram Mathematica 12.1.3 [9]. Table 1 lists the characteristics of SARS-CoV-2 variants reported by WHO [1, 2]: date and country of the earliest detection, PANGO (Phylogenetic Assignment of Named Global Outbreak) and WHO classification, current relevance (variants of concern, variants of interest or under monitoring), total number of cases in the world at the end of the month of detection and cumulative number of variants. PANGO is a rule-based nomenclature system for naming and tracking SARS-CoV-2 genetic lineages [10].
The numerical fit of the WHO data was obtained by means of the function $V(N) = a \, N / \log N$, where the constant of the numerical fit is $a = 3.35 \cdot 10^{-6}$. The 95% confidence interval (CI) of $a$ is given by [...].
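Because the fitting function a · N / log N (with N the cumulative number of cases and a the fit constant) is linear in its single parameter, a least-squares estimate of a has a closed form. The Python sketch below illustrates that idea only: the authors used Wolfram Mathematica, their code is not shown in the text, and the WHO data points of Table 1 are not reproduced here. The function names and the input points are hypothetical; the points are synthetic, noise-free values generated from the model itself, so the example only checks that the estimator recovers the constant it was given.

```python
import math


def basis(n_cases: float) -> float:
    """Shape of the fitting function: N / log N (natural logarithm)."""
    return n_cases / math.log(n_cases)


def fit_constant(cases, variants):
    """Least-squares estimate of a in V = a * N / log N.

    Minimising sum((V_i - a * g_i)^2) with g_i = N_i / log N_i
    gives a = sum(V_i * g_i) / sum(g_i^2).
    """
    g = [basis(n) for n in cases]
    numerator = sum(v * gi for v, gi in zip(variants, g))
    denominator = sum(gi * gi for gi in g)
    return numerator / denominator


# Synthetic placeholder input (arbitrary round numbers, NOT WHO data),
# generated from the model with the constant quoted in the text.
A_TRUE = 3.35e-6
synthetic_cases = [1e6, 1e7, 5e7, 1e8, 1.7e8]
synthetic_variants = [A_TRUE * basis(n) for n in synthetic_cases]

a_hat = fit_constant(synthetic_cases, synthetic_variants)
print(f"recovered a = {a_hat:.3e}")
```

With real observations the recovered constant would differ from any preset value, and a confidence interval would have to be computed separately.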
The adjusted R-squared, measuring the goodness of the fit, turned out to be $R^{2} = 0.97$. Further technical details on the fit of WHO data are presented in Appendix B.
Table 1. Characteristics of SARS-CoV-2 variants recorded by WHO [1, 2]: date and country of the earliest detection, PANGO and WHO nomenclature, current relevance (concern, interest or under monitoring) and cumulative number of cases in the world at the end of the month of detection. The last column summarises the cumulative number of observed relevant variants.
Figure 1 represents the cumulative number of relevant SARS-CoV-2 variants ($V$) versus the cumulative number of cases in the world ($N$). The dots from 1 to 10 correspond to the data reported by WHO [1, 2] from March 2020 to May 2021; the solid line represents the function used in the fit: $V(N) = a \, N / \log N$. The total cases in the world up to the 14th of November 2021 were 252,826,597 [2]. The corresponding cumulative number of relevant SARS-CoV-2 variants was 43.7, i.e. almost 19 more than in the last WHO report of May 2021, when the relevant variants were 25 [1].
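The figures quoted in the text can be checked by evaluating the fitting function a · N / log N and its derivative at the reported case count. The following Python sketch does this with the rounded constant a = 3.35 · 10^-6; the function names are illustrative, and the derivative formula a · ΔN · (log N − 1) / (log N)^2 is obtained by differentiating the fitting function, which is an assumption about how the per-ten-million rate was computed.

```python
import math

A_FIT = 3.35e-6        # fit constant as quoted in the text (rounded)
DELTA_CASES = 1e7      # ten million cases


def cumulative_variants(n_cases: float, a: float = A_FIT) -> float:
    """Cumulative number of relevant variants: a * N / log N."""
    return a * n_cases / math.log(n_cases)


def new_variants_per_delta(n_cases: float, a: float = A_FIT,
                           delta: float = DELTA_CASES) -> float:
    """First-order estimate of new variants per `delta` additional cases.

    Uses the derivative of a * N / log N, i.e.
    a * (log N - 1) / (log N)^2, multiplied by `delta`.
    """
    log_n = math.log(n_cases)
    return a * delta * (log_n - 1.0) / log_n ** 2


n_nov_2021 = 252_826_597  # cumulative cases up to 14 November 2021 [2]
print(f"cumulative variants: {cumulative_variants(n_nov_2021):.2f}")
print(f"new variants per ten million cases: {new_variants_per_delta(n_nov_2021):.2f}")
```

The cumulative value comes out slightly above the reported 43.7, presumably because the constant is quoted to only three significant figures; the per-ten-million rate agrees with the reported 1.64.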
As discussed in the Methods section, the number of new relevant variants per ten million ($10^{7}$) [...]; for this reason, we could suppose that the parameter $a$ in the fit $V(N) = a \, N / \log N$ was constant, although it actually varies with the factors affecting the emergence of relevant variants.
Since the start of the Covid-19 pandemic there has been an impressive global effort in investigating every aspect of the coronavirus epidemic [11], including immunogenetic [12, 13] and epidemiological [14] issues. In this study, we built a simple mathematical model to calculate the number of relevant SARS-CoV-2 variants from the number of infected cases in the world. By fitting the WHO data listed in Table 1 [...] Analogously, we found that the number of new relevant variants per ten million cases was 1.64 on the 14th of November 2021, only 28.4% lower than in March 2020, when the number was 2.29.
Our method depends critically on the efficiency of the WHO in tracking the most relevant SARS-CoV-2 variants. A different approach would be to consider the total number of variants detected by genomic sequencing of SARS-CoV-2 and recorded in public databases such as GISAID [5].
This choice would be independent of the WHO targeting of the most relevant variants but would be less interesting from a clinical viewpoint, since only the variants affecting virus transmission or disease severity are important to the control and management of the pandemic.
As shown in Figure 2, the number of new relevant variants per ten million cases decreases very slowly as the cumulative number of cases increases. Therefore, the persistence of virus circulation will always cause the emergence of new relevant SARS-CoV-2 variants. Our model does not take into account the fact that the number of virus replications is different in each infected subject. However, the average number of replications can be assumed to be constant over a large number of cases, such as those recorded worldwide.
[...] new relevant variants may emerge anywhere in the world indicates that the winning strategy is not to leave any country behind in the battle against the virus. The ability to predict the number of new relevant SARS-CoV-2 variants will become increasingly important in the future to ensure optimal planning of vaccination campaigns by healthcare services, united in the awareness that new variants can change the characteristics of the virus and greatly influence the global management of the pandemic.
1. Tracking SARS-CoV-2 variants. Working Definitions and Actions Taken
2. Covid-19 Weekly Epidemiological and Operational Update
3. SARS-CoV-2 variants, spike mutations and immune escape
4. The origins and potential future of SARS-CoV-2 variants of concern in the evolving COVID-19 pandemic
5. GISAID (Global Initiative on Sharing All Influenza Data). hCoV-19 tracking of variants
6. ECDC (European Centre for Disease Prevention and Control) [...]
7. SARS-CoV-2 Variants and Vaccines
8. WHO (World Health Organization) Regional Office for Europe. Update on COVID-19: Europe and Central Asia again at the epicentre of the pandemic
9. Wolfram Mathematica 12.1.3 (Trial Version)
10. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
11. Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo'
12. Human Leukocyte Antigen Complex and Other Immunogenetic and Clinical Factors Influence Susceptibility or Protection to SARS-CoV-2 Infection and Severity of the Disease Course
13. Natural killer-cell immunoglobulin-like receptors trigger differences in immune response to SARS-CoV-2 infection
14. Undetected infectives in the Covid-19 pandemic
15. Evolution patterns of SARS-CoV-2: Snapshot on its genome variants
The authors are grateful to Anna Maria Koopmans for translations, professional writing assistance and preparation of the manuscript. The authors contributed equally to the article. The authors declare that no competing interests exist. The authors received no specific funding for this work. Not applicable.