key: cord-0969765-odp6fn5u
authors: Nyayanit, Dimpal A.; Yadav, Pragya D.; Kharde, Rutuja; Cherian, Sarah
title: Natural Selection Plays an Important Role in Shaping the Codon Usage of Structural Genes of the Viruses Belonging to the Coronaviridae Family
date: 2020-12-22
journal: Viruses
DOI: 10.3390/v13010003
sha: 615fe1e4b0827ebcebb0605e0c9c7e25097f45bc
doc_id: 969765
cord_uid: odp6fn5u

Viruses belonging to the Coronaviridae family have a single-stranded positive-sense RNA with a poly-A tail. The genome has a length of ~29.9 kb and encodes genes that are essential for cell survival and replication. Different evolutionary constraints constantly influence the codon usage bias (CUB) of different genes. A virus optimizes its codon usage to fit the host environment in which it thrives. This study is a comprehensive analysis of the CUB for the different genes encoded by viruses of the Coronaviridae family. Different methods, including relative synonymous codon usage (RSCU), the effective number of codons (ENc), the Parity rule 2 (PR-2) plot, and the neutrality plot, were adopted to analyze the factors responsible for the genetic evolution of the Coronaviridae family. Base composition and RSCU analyses demonstrated that A-ended and U-ended codons are preferred at the third codon position, which is suggestive of mutational selection. The lower ENc value for the spike 'S' gene suggests a higher bias in the codon usage of this gene compared to the other structural genes. The PR-2 and neutrality plot analyses demonstrate the role and the extent of mutational pressure and natural selection in the codon usage pattern. It was observed that the structural genes of the Coronaviridae family analyzed in this study were under at least 84% influence of natural selection, implying a major role of natural selection in shaping the codon usage.
The Coronaviridae family has four genera, Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus, that include 23 subgenera [1]. Members of the Coronaviridae family have a single-stranded positive-sense RNA genome of 26-32 kb in length, which is capped and polyadenylated [1]. The genome of the virus encodes structural, non-structural, and accessory proteins. The four structural proteins encoded by the genome are the envelope protein (E), membrane protein (M), nucleocapsid protein (N), and spike glycoprotein (S). The ORF1ab gene encodes the pp1ab and pp1a polyproteins, which are further cleaved to yield 15 different proteins [2]. The nucleocapsid protein plays an important role in keeping the RNA conformation stable for the replication, transcription, and translation of the viral genome, along with protecting the viral genome [1, 3]. It is highly immunogenic and capable of modulating the metabolism of an infected cell [3]. The envelope protein acts as a viroporin [4, 5] and plays multiple roles in viral replication [1] and in signaling pathways that affect inflammatory and type 1 INF gamma signaling [6]. The spike protein "S" is responsible for receptor recognition and membrane fusion [7], which leads to viral entry into the host cells [8]. The membrane protein is associated with the spike protein and is responsible for the virus budding process [9].

Different factors influence the gene translation process, the mechanism by which a protein is produced from its coding sequence. External factors (mutational pressure and natural selection), along with internal factors (the translational machinery of the organism), influence gene translation, leading to the uneven usage of codons [10]. Genetic code redundancy, the term used when a single amino acid can be encoded by different codons, leads to the biased usage of codons under the influence of different factors, as observed in most organisms [11] [12] [13].
Codon usage bias (CUB) arises from the redundancy of the genetic code. As a result, each organism favors a particular set of synonymous codons, so that an individual codon is either used optimally or used rarely. Evolution shapes codon bias, and exploring this bias may facilitate our understanding of viral genome evolution. Different genes within a single genome may have different CUB or may share the same codon usage bias [14]. Recently, a differential choice of codon bias was shown by Deka et al. for the matrix 1 and matrix 2 proteins encoded by the Influenza A virus [15]. Each gene in the Coronaviridae family plays an essential role in viral replication, survival, and infection, which makes it essential to analyze the evolutionary factors that determine its codon bias. A recent paper by Tort et al. suggested mutational pressure to be the major factor responsible for shaping the codon usage pattern (CUP) in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [16]. In another study by Anwar et al., natural selection and other factors were considered to shape the CUP of SARS-CoV-2 [17]. A study by Dilucca et al. demonstrated the influence of both mutational and selectional pressures in shaping the CUP of the SARS-CoV-2 genes [18]. In this study, we analyzed the synonymous CUP of representative Coronaviridae sequences available in the GenBank database. It was observed that the nucleotide composition influences the codon usage of the different genes to different extents. Further, we studied the role of mutational pressure and natural selection in the evolution of the codon usage of the different genes, as observed in the Coronaviridae family.

Complete genome sequences of the viruses from the Coronaviridae family were obtained from the GenBank database of the National Center for Biotechnology Information (NCBI, USA).
The list of virus names with their accession numbers is provided in Supplementary Table S1. Similar genes from different viruses were grouped into a single set, giving separate gene sets for the N, M, S, and E genes. Nucleotide composition for the different gene sets was obtained using MEGA software v. 7.0 [19]. Further, the nucleotide compositional analysis was carried out for the various sets of genes belonging to Coronaviridae in terms of (i) the overall percentage of each nucleotide in the gene (A%, G%, C%, and T%), (ii) G + C at the first, second and third positions of the codon for each gene (GC1, GC2, and GC3) and (iii) the overall AU and GC percentage in each gene. Factor analysis was carried out in R software to understand the overall nucleotide distribution for the different gene sets of Coronaviridae.

Relative Synonymous Codon Usage (RSCU) is the ratio of the observed codon frequency to the frequency expected if all the codons for a particular amino acid were used equally [20]. The RSCU value for each gene is calculated using Equation (1), as previously described by Sharp and Li [20]:

$$\mathrm{RSCU}_{ij} = \frac{g_{ij}}{\sum_{j=1}^{n_i} g_{ij}} \, n_i \qquad (1)$$

where $g_{ij}$ is the observed number of occurrences of the $j$th codon for the $i$th amino acid, which has $n_i$ synonymous codons.

The Effective Number of Codon (ENc) Analysis

The extent of variation in the codon usage bias for a gene can be determined using ENc. The ENc value ranges from 20 to 61 [24]. An ENc value of 20 depicts an extreme CUB (a single synonymous codon is used for each amino acid), whereas an ENc value of 61 is indicative of no bias (all possible synonymous codons are used equally). The ENc value is calculated using Equation (2):

$$\mathrm{ENc} = 2 + \frac{9}{\bar{F}_2} + \frac{1}{\bar{F}_3} + \frac{5}{\bar{F}_4} + \frac{3}{\bar{F}_6} \qquad (2)$$

where $\bar{F}_k$ (k = 2, 3, 4, 6) is the mean of the $F_k$ values for the k-fold degenerate amino acids. The values of $F_k$ can be calculated using Equation (3) (Wright, 1990):

$$F_k = \frac{n \sum_{j=1}^{k} \left( \frac{n_j}{n} \right)^2 - 1}{n - 1} \qquad (3)$$

where "n" is the total occurrence of the codons for that amino acid and $n_j$ is the total occurrence of the $j$th codon for that amino acid. The CodonW software was used to obtain the ENc values of all the gene sets.
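As a rough illustration of the RSCU and ENc definitions above (RSCU = observed codon count times the number of synonymous codons, divided by the total count for that amino acid; ENc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6), the following Python sketch computes both from an in-frame coding sequence. It is not the CodonW implementation used in the study. The function names (`count_codons`, `rscu`, `enc`) are hypothetical, the standard genetic code is assumed, and the refinements some tools apply for missing amino acids are left out.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> list of its synonymous sense codons.
FAMILIES = {}
for _codon, _aa in CODON_TABLE.items():
    if _aa != "*":
        FAMILIES.setdefault(_aa, []).append(_codon)


def count_codons(seq):
    """Count sense codons in an in-frame coding sequence (U is read as T)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    # Stop codons and codons with ambiguous bases are ignored.
    return Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")


def rscu(counts):
    """RSCU per codon; `counts` is the Counter returned by count_codons."""
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total:  # skip amino acids that never occur
            for c in codons:
                values[c] = counts[c] * len(codons) / total
    return values


def enc(counts):
    """Wright's effective number of codons (simplified, no gap filling)."""
    f_values = {2: [], 3: [], 4: [], 6: []}
    for codons in FAMILIES.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp offer no choice; F is undefined for n < 2
        f = (n * sum((counts[c] / n) ** 2 for c in codons) - 1) / (n - 1)
        if f > 0:
            f_values[k].append(f)
    if any(not v for v in f_values.values()):
        raise ValueError("a degeneracy class has no usable amino acid")
    mean_f = {k: sum(v) / len(v) for k, v in f_values.items()}
    value = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(value, 61.0)  # values above 61 are conventionally set to 61
```

A sequence that uses a single codon for every amino acid gives an ENc of 20, and a sequence that uses all synonymous codons equally gives an RSCU of 1 for every codon, matching the limits described in the text.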
The ENc values obtained were plotted against the GC3 values to identify the factor responsible for the bias in codon usage. The expected ENc value is calculated using Equation (4):

$$\mathrm{ENc}_{\mathrm{expected}} = 2 + a + \frac{29}{a^2 + (1 - a)^2} \qquad (4)$$

where "a" denotes the GC3s value. If the observed ENc value deviates substantially from the standard curve of expected ENc values, then natural selection plays a significant role in the codon bias of the gene. If the observed ENc value lies on or close to the expected curve, then the codon bias of the gene is controlled by mutational pressure [25].

The Parity rule 2 (PR-2) analysis was performed to determine the role of selection and mutational pressure in the codon usage of different genes. PR-2 is the plot of the purine and pyrimidine usage at the third position of the four-fold degenerate amino acids. The center of the plot, where both coordinates equal 0.5, corresponds to unbiased usage of the codons.

A neutrality plot was used to determine the extent to which natural selection and mutational pressure influence the codon usage bias of a gene. The neutrality plot was drawn using the average GC at the first and second positions of the codon (GC12) versus the GC at the third position of the codon (GC3) for each gene [26]. The distance between groups and within groups was calculated using MEGA software.

The nucleotide analysis demonstrates the highest overall mean for thymine/uracil (T/U), whereas cytosine (C) has the lowest mean percentage of occurrence in all the genes (Table 1A). The mean percentage occurrence of U was highest (37.7 ± 3.5) in the E gene and lowest (23.1 ± 2.4) in the N gene (Table 1A). The highest mean percentages of occurrence for C, A and G were 23.9 ± 2.3, 29.8 ± 1.9 and 23.0 ± 1.7, respectively, for the N gene. The lowest percentages of occurrence for C, A and G were observed for ORF1a (17.5 ± 2.5), M (25.3 ± 1.8) and ORF8 (18.2 ± 1.2), respectively.
Overall, the individual nucleotide frequencies indicate a preference for A and U nucleotides. Table 1B depicts the mean percentages for the sum of AU and GC nucleotides, along with their standard deviations, at the first, second and third codon positions. It is observed that either A or U is the preferred nucleotide at all three codon positions (Table 1B), and A or U is most preferred at the third codon position. The mean value for the sum of AU leads to the overall observation that viruses belonging to the Coronaviridae family have AU-rich genomes. A significant difference is observed between the GC content and the AU content at the second and third (wobble) positions for all the genes. The means of the nucleotides, as well as the sums of the nucleotide pairs at each codon position, indicate compositional codon bias in the genes of the Coronaviridae family (Table 1).

RSCU values were calculated to determine the preference of each nucleotide at the wobble position. Figure 1 shows the RSCU plots for the structural genes "S" and "N" of viruses belonging to the Coronaviridae family, in the form of heat maps. It is observed that the most abundantly used codons are different for each virus, indicating that each virus has optimized its codon usage. Overall, U-ended codons (UUU, CUU, AUU, UCU, ACU and GUU) are over-represented, while G-ended codons (CAG, CCG, GUG, AAG, CGG and GGG) are under-represented. The amino acids Leu and Val are encoded in higher numbers in the different genes analyzed, whereas Trp and His are the least encoded amino acids. However, it is interesting to observe that in almost all the viruses the AGA codon encoding Arginine is over-represented in the different genes analyzed.
The ENc value indicates the role of mutational pressure, along with other factors, in shaping the CUB. The standard deviations of the observed ENc values for the S, N, M and E genes are 3.88, 3.74, 5.45 and 7.29, respectively. The mean ENc value of the genes ranges from 46.45 ± 3.88 to 53.33 ± 5.45 (Table 2). The deviation of the ENc value is largest for the E gene (7.29), followed by the M gene (5.45) and the S gene (3.88), indicating that overall the ENc values are not conserved for these genes.

We further carried out the correlation analysis between the GC3 and GC12, GC and ENc for structural and non-structural genes along with ORF8 (Table 2). A positive correlation was observed between GC12 and GC for all the structural and non-structural genes. GC12 vs. GC of the ORF8 gene had a lower correlation: r = 0.44, p < 1.33 × 10^−7. GC12 has a non-significant positive correlation with GC3 for the different genes under comparison, except for the ORF1a and S genes, which have a correlation above 0.6 with p-value < 0.05. The correlation between the ENc and GC12 was positive, albeit the value of the correlation was low (Table 2). An insignificant positive correlation of GC12 vs. GC: r = 0.266, p = 0.796 (Table 2) was observed. A correlation value of more than 0.5 was observed between the ENc and GC3.
ORF8 had a non-significant positive correlation with the GCs and the ENc values but demonstrated a strong positive correlation with the GC3 nucleotides.

Parity rule 2 analysis (PR-2 plot) was used to determine the role of natural selection in influencing the codon usage of different viruses in the Coronaviridae family (Figure 3). It was observed that pyrimidine bases were used more than purine bases for the N and S genes, whereas the E and M genes have more uracil and guanine nucleotides. The unequal distribution of the purine and pyrimidine nucleotides for the four-fold degenerate amino acids suggests that factors other than compositional bias may affect the codon usage bias.

The GC3-ENc plot demonstrated the role of mutational pressure in shaping the codon usage pattern of the genes, as observed in Figure 4.
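The two diagnostics discussed above can be sketched in a few lines of Python. This is illustrative only, not the authors' pipeline. `expected_enc` evaluates the standard null curve 2 + a + 29 / (a^2 + (1 - a)^2) for a GC3s fraction a. `pr2_point` returns the PR-2 coordinates G3/(G3 + C3) and A3/(A3 + U3) over four-fold degenerate codons. Both function names are hypothetical, and which codon families count as four-fold degenerate (here the eight codon boxes for Ala, Gly, Pro, Thr, Val and the four-fold parts of Leu, Ser and Arg) is an assumption, since the source does not list them.

```python
from collections import Counter

# Assumed set of four-fold degenerate codon boxes, keyed by their first two bases
# (Ala GC, Gly GG, Pro CC, Thr AC, Val GT, Leu CT, Ser TC, Arg CG).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CT", "TC", "CG"}


def expected_enc(gc3s):
    """Expected ENc when codon usage is constrained by GC3s alone.

    gc3s is a fraction between 0 and 1.
    """
    a = gc3s
    return 2 + a + 29 / (a ** 2 + (1 - a) ** 2)


def pr2_point(seq):
    """Return (G3/(G3+C3), A3/(A3+U3)) over four-fold degenerate codons."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    third = Counter()
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            third[codon[2]] += 1
    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    x = third["G"] / gc if gc else float("nan")
    y = third["A"] / at if at else float("nan")
    return x, y
```

A gene whose observed ENc sits on the `expected_enc` curve is consistent with compositional constraint alone, and a PR-2 point at (0.5, 0.5) corresponds to equal use of G and C and of A and U at the third position.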
Further, a neutrality analysis was performed to determine the key factor (natural selection or mutational pressure) shaping the codon usage bias (Figure 4). In a neutrality plot, if the slope of the regression line is close to 1, mutational pressure governs the codon usage bias. The slopes of the regression lines calculated for the M, N, S and E genes were found to be 0.136, 0.103, −0.072 and 0.158, respectively. Thus, the slopes observed for the genes suggest that mainly natural selection acts at the codon positions. The slope of the E gene, followed by that of the N gene, indicated a mutational pressure of 7.2% and 10.3%, respectively. The relative neutrality (natural selection) was calculated to be 92.8% and 89.7%, indicating that natural selection plays a dominant role in shaping the codon usage bias. Likewise, the M gene and S gene have a mutational pressure of 13.6% and 15.8%, also indicating the role of natural selection in shaping the codon usage bias.
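A minimal Python sketch of the neutrality analysis follows, assuming each gene is summarized by one (GC12, GC3) pair and that GC12 is regressed on GC3 by ordinary least squares. The helper names are hypothetical, and the split into mutational pressure and natural selection follows the reading used in the text, where the absolute slope is taken as the mutational share. Inputs are assumed to be non-empty, with at least two distinct GC3 values.

```python
def gc_by_position(seq):
    """Return (GC12, GC3) as fractions for one in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every base at this codon position
        fractions.append(sum(b in "GC" for b in bases) / len(bases))
    gc1, gc2, gc3 = fractions
    return (gc1 + gc2) / 2, gc3


def neutrality_slope(points):
    """Least-squares slope of GC12 (y) on GC3 (x); points are (gc12, gc3)."""
    ys, xs = zip(*points)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pressure_split(slope):
    """Return (mutational pressure %, natural selection %) from a slope."""
    mutation = abs(slope) * 100
    return mutation, 100 - mutation
```

Under this reading, a slope of 0.158 corresponds to 15.8% mutational pressure and 84.2% natural selection.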
Genetic degeneracy leads to the usage of different codons for the same amino acid within a gene. The preference for specific codons in organisms leads to uneven use of the codon set and is specific to the organism [27]. Further, the shape of the codon usage bias for a gene is governed by evolutionary constraints, namely the balance between mutational pressure and natural selection pressure. This work demonstrates the role of evolutionary pressure on the structural genes and accessory genes of the viruses belonging to the Coronaviridae family. The structural genes of the viruses analyzed from the Coronaviridae family were found to possess >50% of pyrimidine nucleotides, except for the N gene. The genes had >60% of AU nucleotides, except for the M and N genes, which had 57.6% and 52.9% AU, indicating that the family possesses AU-rich genomes. The study of the mutation pattern between SARS-CoV-2 and Bat Coronavirus RaTG13 indicated a strong C > U bias, which might be under the influence of host factors, thereby changing the mutational profile [28]. Thus, it is not so surprising that the codons are skewed towards AU (as opposed to GC) in their genomes (Table 1B and Figure 1).
An earlier published study suggests that higher AU nucleotide content enhances the mutational selection pressure of the gene [29], indicating the role of mutational pressure in the selection of codon usage.
A recently published study suggests the role of natural selection in shaping transmembrane polypeptides that emerged from uracil-rich non-genic regions [30]; however, the effect of uracil-rich regions in genic regions needs to be examined further. Relative synonymous codon usage analysis demonstrated that the abundantly used codons for the genes differ for each virus under analysis, which indicates that each virus has a different set of codons optimized for its usage.
RSCU-based analysis led to the identification of a few over-represented U-ended codons and under-represented G-ended codons for the Coronaviridae viruses under study. Differential usage of codons is observed in the structural genes of the Coronaviridae family, and it can be proposed that the usage of over-represented codons might be influenced by the nucleotide composition of the codons. The S protein region has a higher amount of Ser, Thr, and Asn, which is indicative of the presence of glycosylation sites, and the N-linked glycosylation reported in the S gene of SARS-CoV-2 has been linked to immune evasion and protein folding [31]. Different studies reported that single nucleotide changes in the receptor-binding domain (RBD) region alter viral entry into cells expressing human angiotensin-converting enzyme 2 (ACE2). These mutations can either reduce the serological response, reduce the immunogenicity of the virus, or generate an escape mutant virus [32] [33] [34].

Different methods were used to demonstrate the role and the extent of variation in codon usage bias caused by the mutational pressure and/or natural selection acting on a gene. ENc has a negative correlation with CUB. Most of the genes studied here belonging to the Coronaviridae family have a high ENc (close to 50), indicating lesser codon usage bias, suggestive of lesser mutational pressure. A lower ENc value observed for ORF1a (data not shown) and the S gene suggests a higher bias in the codon usage of these two genes compared to the other structural genes. The ENc value also indicates that the bias observed in the codon usage is not related to gene expression [35]. Further, neutrality-based analysis demonstrated that both mutational pressure and natural selection had a role in influencing the codon usage pattern. Earlier work by Zhang et al. demonstrated adaptive evolution of the coronavirus spike gene due to positive selection pressure [36].
However, the selection pressures experienced by the different domains in the S gene are varied. It was observed that the RBD of the S gene has a higher number of positively selected sites in SARS-CoV-1 [36]. Analysis of SARS-CoV-2 and RaTG13 demonstrated the positive selection to be concentrated in the region that mediates host ACE2 receptor binding [37]. Analysis of the different functional proteins by Tang et al. demonstrated positive selection on the spike protein. The comparison carried out by that group clustered the other structural genes with the accessory genes and hence could not examine the effect of selection pressure on the other structural genes [38]. In contrast to the previous observations, the maximum mutational pressure in the structural genes was observed for the spike gene and the least for the E gene in this study. This indicates that the other structural genes are more influenced by natural selection than the S gene. The other genes analyzed, ORF1a and ORF8, had 20% and 3% mutational pressure (data not shown), indicating that natural selection has a minimum of 80% effect on shaping the codon usage of the different genes analyzed. It was observed that the ORF8 gene was under maximum selection pressure, which is in agreement with the work published by Velazquez-Salinas et al. [39]. They demonstrated that the ORF8 gene of SARS-CoV-2 had a higher dN/dS, and that the Leu84Ser substitution, which delineated the strains into the "S" and "L" lineages, indicated a directional selection pressure. The accessory ORF8 gene plays an important role in the innate immune response, influencing viral pathogenicity [39, 40]. On the other hand, Hughes and Hughes demonstrated purifying selection to be more prominent in RNA viruses than in DNA viruses [41].

Overall, the study demonstrated that the codon usage in the structural genes of viruses from the Coronaviridae family is biased. The major factor that shapes the codon usage is natural selection.
The ORF1a gene and S gene were noted to be under a combined effect of both mutational pressure and natural selection, while natural selection was the major contributor in shaping the codon usage of most of the structural genes.

Table S1. Accession numbers of the viruses from the Coronaviridae family analyzed in this study.

Author Contributions: Concept, data analysis, and manuscript writing, D.A.N.; data collection, R.K.; manuscript writing and supervision, P.D.Y., S.C. All authors have read and agreed to the published version of the manuscript.

Funding: Indian Council for Medical Research-National Institute of Virology provided intramural research funding.

The authors declare that they have no competing interests.

References:
1. Sense RNA Viruses-Positive Sense RNA Viruses
2. Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China
3. Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites
4. Coronavirus envelope protein: Current knowledge
5. Severe Acute Respiratory Syndrome Coronavirus Envelope Protein Ion Channel Activity Promotes Virus Fitness and Pathogenesis
6. Coronavirus virulence genes with main focus on SARS-CoV envelope gene
7. Structure, Function, and Evolution of Coronavirus Spike Proteins
8. The Coronavirus Spike Protein Is a Class I Virus Fusion Protein: Structural and Functional Characterization of the Fusion Core Complex
9. Membrane binding proteins of coronaviruses
10. Selection on Codon Bias
11. Mutation and Selection Cause Codon Usage and Bias in Mitochondrial Genomes of Ribbon Worms (Nemertea)
12. Codon usage suggests that translational selection has a major impact on protein expression in trypanosomatids
13. Evidence that Natural Selection on Codon Usage in Drosophila pseudoobscura Varies Across Codons. G3 (Bethesda)
14. Factors influencing codon usage bias in genomes
15. DNA compositional dynamics and codon usage patterns of M1 and M2 matrix protein genes in influenza A virus
16. A comprehensive analysis of genome composition and codon usage patterns of emerging coronaviruses
17. Insights into The Codon Usage Bias of 13 Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Isolates from Different Geo-locations
18. Codon Usage and Phenotypic Divergences of SARS-CoV-2 Genes
19. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets
20. Codon usage in regulatory genes in Escherichia coli does not reflect selection for "rare" codons
21. Codon Usage in the Iflaviridae Family Is Not Diverse Though the Family Members Are Isolated from Diverse Host Taxa
22. Analysis of Nipah Virus Codon Usage and Adaptation to Hosts
23. Genome-Wide Analysis of Codon Usage and Influencing Factors in Chikungunya Viruses
24. The 'effective number of codons' used in a gene
25. Hepatitis A virus evolution and the potential emergence of new variants escaping the presently available vaccines
26. Directional mutation pressure and neutral molecular evolution
27. The Yin and Yang of codon usage
28. Mutation Patterns of Human SARS-CoV-2 and Bat RaTG13 Coronavirus Genomes Are Strongly Biased Towards C > U Transitions, Indicating Rapid Evolution in Their Hosts
29. The Impact of Selection at the Amino Acid Level on the Usage of Synonymous Codons. G3 Genes Genomes Genet
30. De novo emergence of adaptive membrane proteins from thymine-rich genomic sequences
31. Site-specific glycan analysis of the SARS-CoV-2 spike
32. Human monoclonal antibody as prophylaxis for SARS coronavirus infection in ferrets
33. Single Amino Acid Substitutions in the Severe Acute Respiratory Syndrome Coronavirus Spike Glycoprotein Determine Viral Entry and Immunogenicity of a Major Neutralizing Domain
34. Natural Mutations in the Receptor Binding Domain of Spike Glycoprotein Determine the Reactivity of Cross-Neutralization between Palm Civet Coronavirus and Severe Acute Respiratory Syndrome Coronavirus
35. Analysis of synonymous codon usage in SARS Coronavirus and other viruses in the Nidovirales
36. Adaptive evolution of the spike gene of SARS coronavirus: Changes in positively selected sites in different epidemic groups
37. Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function
38. Differential Stepwise Evolution of SARS Coronavirus Functional Proteins in Different Host Species
39. Positive selection of ORF3a and ORF8 genes drives the evolution of SARS-CoV-2 during the 2020 COVID-19 pandemic. 2020
40. SARS-CoV-2 and COVID-19: The most important research questions
41. More effective purifying selection on RNA viruses than in DNA viruses