key: cord-0747901-l0y2h8uh
authors: Fort, Hugo
title: A very simple model to account for the rapid rise of the alpha variant of SARS-CoV-2 in several countries and the world.
date: 2021-08-05
journal: Virus Res
DOI: 10.1016/j.virusres.2021.198531
sha: 5257ee488897685a38c71aa40abeff9fbf45a537
doc_id: 747901
cord_uid: l0y2h8uh

Since its first detection in the UK in September 2020, a highly contagious version of the coronavirus, the alpha or British variant a.k.a. B.1.1.7 SARS-CoV-2 virus lineage, rapidly spread across several countries and became the dominant strain in the outbreak. Here it is shown that a very simple evolutionary model can fit the observed change in frequency of B.1.1.7 for several countries, regions of countries and the whole world with a single parameter, its relative fitness f, which is almost universal, f ≈ 1.5. This is consistent with a 50% higher transmissibility than the local wild type and with the fact that the period in which this variant takes over has been around 22 weeks in all the studied cases.

In September 2020 a new variant of the SARS-CoV-2 virus, known as lineage B.1.1.7 (aka 20I/501Y.V1, Variant of Concern 202012/01, British or Alpha variant), was detected in the UK and it quickly displaced the other variants of this virus in several countries (Hodcroft 2021). Previous estimates of B.1.1.7 transmissibility have varied across different studies. For example, a preprint by Davis et al. (2020) reported that the variant was 56% (50%-74%) more transmissible than other variants across three regions in England (East of England, South East of England, and London), while another article concluded that it was 75% (70%-80%) more transmissible in the UK between October and November 2020 (Leung et al. 2021). Here a very simple evolutionary model is proposed for describing the change in frequency of B.1.1.7. This model distinguishes the B.1.1.7 variant from all the others (treated just as one "mean" lineage).
If we call x the fraction of the B.1.1.7 variant, a simple equation to model evolution by natural selection is given by (Nowak 2006, chapter 2):

$$\frac{dx}{dt} = x\,(f - \bar{f}), \qquad (1)$$

where f is the fitness of this lineage relative to the others (which are set to 1) and $\bar{f}$ is the weighted mean fitness over all the variants, i.e.

$$\bar{f} = f\,x + (1 - x). \qquad (2)$$

Substituting (2) into (1) gives

$$\frac{dx}{dt} = (f - 1)\,x\,(1 - x), \qquad (3)$$

i.e. a logistic equation with a growth rate of f − 1. Eqs. (1) to (3) are well-known equations to model natural selection. The solution of (3) is also well known and is given by (Fort 2020):

$$x(t) = \frac{x_0\,e^{(f-1)t}}{1 + x_0\left(e^{(f-1)t} - 1\right)}, \qquad (4)$$

where $x_0$ is the initial fraction x(0). For long times, if f > 1, x will converge to the asymptotic value x = 1.

Eq. (4) is then compared against empirical data for those countries with a minimum of 500 sequences in each measurement (Table 1). That is, countries are considered if they have at least N_min = 500 sequences, summing at each time all the variants that were tracked since the B.1.1.7 lineage was first reported in each country (Hodcroft 2021). It turns out that this formula can fit the observed change in frequency x of B.1.1.7, until it became 1 or close to 1, for several countries and regions of countries with a single parameter, the relative fitness f of this variant, which is almost universal. It is worth remarking that in this communication we are only addressing the "rise" of B.1.1.7. Its "fall", which has already occurred in countries like the UK, involves the effect of factors like massive vaccination and new non-native variants of coronavirus (e.g. the Delta variant), which lie beyond this simple modeling. The model is also compared against data for the whole world, where x decreased from a peak of x ≈ 0.77, reached on 5 May 2021, to below 0.2 since 8 June 2021 (Latif 2021).

In each case I varied the parameter f in steps of 0.1 and chose the value that produced the minimum mean absolute error (MAE). In the case of Switzerland (panel d) the frequency of B.1.1.7 reached a maximum of x = 0.9 and then started to decline.
So I also included a fit with an asymptotic carrying capacity K = 0.9 (dashed line) just to show that the logistic fit for the last part of the sequence greatly improves. Interestingly, the best fit for the four countries occurs for similar values of the relative fitness parameter, around 1.5. Additionally, the values f = 1.54 and f = 1.53 for Denmark and Switzerland, respectively, are in agreement with a reported 52% higher transmissibility compared to the wildtype in Denmark and a 51% higher transmissibility compared to the wildtype in Switzerland (ISPM 2021). This suggests that it is right to interpret f as the fitness relative to the local wildtype.

It seems interesting to find out whether the model still works at smaller spatial scales, for instance across different regions of a country. In fact, in many countries sampling may not be equal across the country: samples may only cover one area or certain areas (Hodcroft 2021). Fig. 2 shows the data for the five regions of Denmark. Notice that, except for Nordjylland, the model fits the data quite well.

Finally, let us now apply the model at the maximum spatial scale, i.e. the whole world. Fig. 3 shows the empirical data for the world (Latif 2021) and the fit provided by Eq. (4). At the beginning the fit is not very good, but then it again yields a remarkable agreement with the empirical data, for a smaller value of the relative fitness. This failure for the initial weeks can be understood, since the pace of the pandemic varied across countries, as did the quantity and quality of measurements (mainly for the initial stages in the rise of B.1.1.7).

I computed two metrics to quantitatively assess the quality of the fit: the p-value of a χ² goodness-of-fit test and the MAE. Table 1 shows these results. Notice that both indices support the quality of the fit.
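The fitting procedure described above (scanning f on a regular grid and keeping the value with the smallest MAE) can be sketched as follows. This is only a minimal illustration, not the code used for the paper: the function names are hypothetical, time is assumed to be measured in weeks, the initial fraction x0 is assumed to be known, and the data are synthetic values generated from Eq. (4) rather than the empirical sequences.

```python
import numpy as np


def logistic_fraction(t, f, x0):
    """Solution of dx/dt = (f - 1) x (1 - x) with x(0) = x0 (logistic curve)."""
    growth = np.exp((f - 1.0) * np.asarray(t, dtype=float))
    return x0 * growth / (1.0 + x0 * (growth - 1.0))


def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted fractions."""
    return float(np.mean(np.abs(np.asarray(observed) - np.asarray(predicted))))


def fit_relative_fitness(t, observed, x0, f_min=1.0, f_max=3.0, step=0.1):
    """Scan f on a regular grid and return (best f, its MAE)."""
    n_steps = int(round((f_max - f_min) / step))
    grid = f_min + step * np.arange(n_steps + 1)
    errors = [mean_absolute_error(observed, logistic_fraction(t, f, x0))
              for f in grid]
    best = int(np.argmin(errors))
    return round(float(grid[best]), 10), errors[best]


# Synthetic, noise-free illustration (NOT the paper's data): weekly samples
# generated with f = 1.5 and an assumed initial fraction x0 = 0.01.
weeks = np.arange(0, 23)
synthetic = logistic_fraction(weeks, 1.5, 0.01)
best_f, best_mae = fit_relative_fitness(weeks, synthetic, x0=0.01)
print(best_f, best_mae)
```

With real data, `synthetic` would be replaced by the observed weekly fractions of B.1.1.7, and the grid range and step would be chosen to suit the resolution wanted for f.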
In particular, the MAE varies from 0.023 to 0.059, which, for a variable like x in [0,1], corresponds to a quite small average relative error (between 4.6% and 11.8%).

In summary:
1. The change in frequency of SARS-CoV-2 virus lineage B.1.1.7 in time is well described by formula (4), resulting from an elementary evolutionary model with a single parameter, the relative fitness f. This parameter can be interpreted as the fitness relative to the local wildtype.
2. The value of this parameter f varies around 1.50 for most of the time series considered in this study. This would imply that the variant is 50% more transmissible than the local wild type.
3. The period in which this variant displaces all or most of the other variants has been around 22 weeks in all the studied cases. This is consistent with the above-mentioned almost universal relative fitness f.
4. The model underperforms for cases like Nordjylland, which exhibits a very 'jagged' frequency; this often indicates low sequencing numbers (Hodcroft 2021), and that seems to be the case here, since this Danish region is the one with the fewest viral sequences.
5. As Hodcroft (2021) stressed, in many countries sampling may not be equal across the country (samples may only cover one area or certain areas), so it is important not to assume that the frequencies shown are necessarily representative of the country. However, in the case of Denmark there seem to be no important variations across the different regions.

Let us enumerate the approximations and assumptions of the modeling used here.
1. All the spatial heterogeneities are neglected, so this is a mean field model (Fort 2020).
2. The model distinguishes the variant of concern (VOC), the B.1.1.7 variant, from all the others, which are treated just as one "mean" lineage. In community and population ecology this is the so-called focal species approximation (Fort 2021; Fort 2020).
3. The fitness of the VOC relative to the rest, f, is assumed to be an intrinsic property of this variant, taking a constant value.
4. In addition, the time frame considered is such that mutations changing f can be ignored.

Under these assumptions, Eq. (3) governs the dynamics of the VOC's fraction, x. This equation has two possible equilibria x* satisfying

$$(f - 1)\,x^*\,(1 - x^*) = 0. \qquad (i)$$

The stability of both equilibria is shown in Figure A1: x* = 0 is stable if f < 1 and x* = 1 is stable if f > 1. Since we are taking f > 1, the only possible stable equilibrium is x* = 1. This is consistent with what we observe for most of the studied cases. However, in cases like Switzerland, x* = k < 1. This implies that there is an additional equilibrium, i.e. instead of (i) we have

$$(k - x^*)\,x^*\,(1 - x^*) = 0, \qquad (ii)$$

which includes an extra factor (k − x*). This slightly more complex and, from an ecological viewpoint, more interesting situation, in which the VOC can coexist with other variants, corresponds to the phenomenon of frequency-dependent selection (Nowak 2006, chapter 4), that is, when the fitness of a phenotype or variant depends on its frequency, f = f(x). In other words, the fitness of a variant is determined by the whole "ecosystem" in which it lives: the combination of coexisting variants and the abiotic environment. Notice that if f(x) − 1 is taken proportional to (k − x), for instance f(x) = 1 + k − x, then eq. (i) transforms into eq. (ii), and fig. A1-b transforms into fig. A2. Therefore the equilibrium x* = 1 becomes unstable, and the only stable equilibrium is now x* = k.

Figure A2. The equilibria of equation (ii) and their stability. Now the stable equilibrium is x* = k.

The simplest way to model this frequency-dependent selection is through the well-known replicator equation (Nowak 2006, chapter 4), Eq. (iii), where a and b are the "payoffs" received by the VOC when interacting with itself and with the average of all the other variants, respectively.
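As a rough numerical illustration of this coexistence equilibrium, the sketch below assumes (this is not stated explicitly in the text) that the VOC fitness is linear in its frequency, f(x) = a·x + b·(1 − x), and that the fitness of the other variants stays fixed at 1, as in Eq. (3). Under these assumptions the dynamics are dx/dt = (f(x) − 1)·x·(1 − x), and the interior equilibrium is the root of f(x) = 1, namely k = (b − 1)/(b − a), which is stable when a < 1 < b. The payoff values and function names are hypothetical, chosen only so that k comes out near 0.9.

```python
def voc_fitness(x, a, b):
    """Assumed linear frequency-dependent fitness f(x) = a*x + b*(1 - x)."""
    return a * x + b * (1.0 - x)


def coexistence_fraction(a, b):
    """Interior root of f(x) = 1 under the assumed linear form: (b - 1)/(b - a)."""
    return (b - 1.0) / (b - a)


def integrate(x0, a, b, n_steps=40000, dt=0.01):
    """Forward-Euler integration of dx/dt = (f(x) - 1) * x * (1 - x)."""
    x = x0
    for _ in range(n_steps):
        x += dt * (voc_fitness(x, a, b) - 1.0) * x * (1.0 - x)
    return x


# Hypothetical payoffs with a < 1 < b, so that the interior equilibrium is stable.
a, b = 0.95, 1.45
k = coexistence_fraction(a, b)

# Trajectories started below and above k should both approach k.
from_below = integrate(0.01, a, b)
from_above = integrate(0.99, a, b)
print(k, from_below, from_above)
```

Starting the integration on either side of k is a simple way to check numerically that x* = 1 is no longer attracting once the fitness depends on frequency in this way.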
Thus it is straightforward to obtain that the VOC "carrying capacity", k, is given by Eq. (iv).

and the Center for Viral Systems Biology. B.1.1.7 Lineage Report
Danish Covid-19. 2021. Danish Covid-19 Genome Consortium
Estimated transmissibility and severity of novel SARS-CoV-2 Variant of Concern 202012/01 in England
Predicting the Yields of Species Occupying a Single Trophic Level with Incomplete Information: Two Approximations Based on the Lotka-Volterra Generalized Equations
Ecological Modelling and Ecophysics: Agricultural and environmental applications
CoVariants: SARS-CoV-2 Mutations and Variants of Interest
Transmission of SARS-CoV-2 variants in Switzerland. Institute of Social and Preventive Medicine (ISPM), University of Bern
Early transmissibility assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom
Evolutionary Dynamics: Exploring the Equations of Life. Belknap Press, Canada.

Since I am the only author, I did all the work.

I thank Juan Arbiza, Raúl Donangelo, Edgardo García-Álvarez, Santiago Mirazo and two anonymous reviewers for their comments on this analysis. I am also grateful to Jorge Pullin, who drew my attention to the issue of fixation of the British variant of concern.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.