key: cord-0003357-nzxpo4sd authors: Zhang, Xu; Cai, Yuchen; Zhai, Xiaofeng; Liu, Jie; Zhao, Wen; Ji, Senlin; Su, Shuo; Zhou, Jiyong title: Comprehensive Analysis of Codon Usage on Rabies Virus and Other Lyssaviruses date: 2018-08-14 journal: Int J Mol Sci DOI: 10.3390/ijms19082397 sha: 0a5768ad6aeeada6cb31980fbf5e65fbb2db121a doc_id: 3357 cord_uid: nzxpo4sd Rabies virus (RABV) and other lyssaviruses can cause rabies and rabies-like diseases, which are a persistent public health threat to humans and other mammals. Lyssaviruses exhibit distinct characteristics in terms of geographical distribution and host specificity, indicative of a long-standing diversification to adapt to the environment. However, the evolutionary diversity of lyssaviruses, in terms of codon usage, is still unclear. We found that RABV has the lowest codon usage bias among lyssaviruses strains, evidenced by its high mean effective number of codons (ENC) (53.84 ± 0.35). Moreover, natural selection is the driving force in shaping the codon usage pattern of these strains. In summary, our study sheds light on the codon usage patterns of lyssaviruses, which can aid in the development of control strategies and experimental research. Biologists are devoted to exploring the complexity of evolutionary interactions among divergent viruses and their underlying reservoirs, and apply latent theoretical tenets to resolve practical cases. Viruses from the genus Lyssavirus, usually called lyssaviruses, belonging to Rhabdoviridae of the Mononegavirales order, present a classical case to study the emergence and cross-species transmission of infectious disease [1] . Rabies is an acute and almost invariably fatal encephalomyelitis in humans, usually caused by rabies virus (RABV) infection, which is a single-stranded, negative-sense, non-segmented RNA virus of approximately 12 kilo bases. 
The genome mainly encodes five proteins: The nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and the large protein (L) [2, 3] . RABV can infect a variety of mammalian hosts, especially bats and certain carnivores. It is distributed worldwide and has a high mortality, and remains a permanent threat to public health [4] [5] [6] ; nevertheless, it is still neglected. Lyssaviruses are mainly classified into 16 [7] [8] [9] [10] . Putative species, including Taiwan bat lyssavirus (TBLV) and Kotalahti bat lyssavirus (KBLV), have not yet been classified [11, 12] . Historically, each species of lyssavirus is associated with a specific geographical area and is detected in different hosts and vectors [13] . For instance, EBLV-1 and EBLV-2 are found in serotine bats and Daubenton's bats, respectively, in the United Kingdom, the Netherlands, Switzerland, and Norway, while ABLV is found in pteropid and insectivorous bats in Australia. In addition, RABV is distributed worldwide in dogs and several carnivores, except in Antarctica and a few islands, though it is commonly found in China and India [14] [15] [16] . However, the evolutionary relationship among these different viruses, caused by geographical isolation, is still not clear. The genetic code is degenerated, meaning that an amino acid can be encoded by more than one codon. Codon usage is unbalanced in prokaryotes, eukaryotes, and viruses [17] . The preferential usage of codons is referred to as codon usage bias and is a widespread phenomenon in nature [18, 19] . Mutation pressure and natural selection are the two main forces influencing codon usage patterns. Other factors include dinucleotide abundance, tRNA abundance, GC content, gene function, gene length, RNA structure, replication, and external environment, among others [20] [21] [22] [23] . 
In terms of virus infection, the codon usage pattern of the respective host might affect virus survival, adaptation, evolution, and control of the host immune system, given that the virus relies on host cell machinery [24] . Thus, the study of codon usage patterns can provide more detailed information regarding virus evolution and a more detailed understanding of the pathogenesis, which can aid the development of drug targets for more effective vaccines and reinforce control measures to prevent the spread of this severe zoonosis. RNA viruses have a high evolutionary rate; however, the evolution of lyssaviruses is relatively conserved [25] . Previous studies have mainly focused on pathogenesis or evolution to find amino acid sites under selection [26] [27] [28] . However, the evolutionary diversity among lyssaviruses, in terms of genome codon usage, is still unclear. In this study, we performed a large-scale and comprehensive codon usage analysis of lyssaviruses strains and determined the driving forces that influence the pattern of codon usage. Nucleotide composition constraints can influence the pattern of codon usage so we analyzed the composition of RABV coding sequences. The highest mean compositions of nucleotides were A (28.34 ± 0.15%) followed by U (26.03 ± 0.23%), C (22.93 ± 0.20%), and G (22.70 ± 0.16%). The mean nucleotides at the third positions of synonymous codons A 3s (32.69 ± 0.63%) and U 3s (31.58 ± 0.76%) were also higher than G 3s (31.31 ± 0.68%) and C 3s (30.33 ± 0.62%). The mean compositions of AU (54.37 ± 0.27%) were more than the GC compositions (45.63 ± 0.27%). Therefore, the AU content was higher than the GC in RABV. The same result was observed in the other lyssaviruses species, except for in EBLV-1, for which the contents of the third codon positions were G 3s 34.43 ± 0.27%, A 3s 31.41 ± 0.25%, U 3s 31.23 ± 0.32% and C 3s 29.65 ± 0.28%. These results indicated that the coding sequences of lyssaviruses are AU-rich (Table S1 ). 
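The composition figures above can be illustrated with a minimal sketch. It assumes a plain in-frame coding sequence and, as a simplification, counts third-position bases over all complete codons rather than only over synonymous codons as CodonW does for the A3s, U3s, G3s and C3s indices. The function names and the toy sequence are hypothetical and not taken from the study.

```python
from collections import Counter

def base_composition(cds):
    """Percentage of A, U, C and G over the whole coding sequence."""
    seq = cds.upper().replace("T", "U")  # accept DNA or RNA alphabets
    counts = Counter(b for b in seq if b in "AUCG")
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "AUCG"}

def third_position_composition(cds):
    """Percentage of each base at codon position 3 (all complete codons).

    Simplification: CodonW restricts A3s/U3s/G3s/C3s to synonymous codons;
    this sketch does not.
    """
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = Counter(seq[i + 2] for i in range(0, usable, 3))
    total = sum(thirds[b] for b in "AUCG")
    return {b: 100.0 * thirds[b] / total for b in "AUCG"}

# Hypothetical 6-codon sequence, not a real lyssavirus gene.
toy_cds = "ATGGCTAAAGGCTTTTAA"
comp = base_composition(toy_cds)
au_content = comp["A"] + comp["U"]
gc_content = comp["G"] + comp["C"]
third = third_position_composition(toy_cds)
print(f"AU = {au_content:.2f}%, GC = {gc_content:.2f}%")
```

A sequence is AU-rich in the sense used above when `au_content` exceeds `gc_content`.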
The ENC (effective number of codons) values were calculated to infer the degree of the codon usage bias of lyssaviruses. ENC values were calculated using complete coding sequences of strains from different lyssavirus species and were then compared to identify differences among these strains. We found a low codon usage bias overall, with the highest mean ENC value for RABV (53.84 ± 0.35) and the lowest mean ENC value for LBV (52.11 ± 0.69). Then, we calculated the mean ENC value of individual genes from different lyssavirus strains (Figure 1). More sequences were added for this analysis, as described in Table S2. The highest ENC value corresponded to the P gene of RABV (57.87 ± 1.93). For DUVV and ABLV, the highest ENC values also corresponded to the P gene. For the other genes, the ENC values of other lyssavirus strains differed from those of RABV. These results suggested that, during the evolution of lyssaviruses, codon usage is relatively conserved and species-specific.

To reveal the pattern of synonymous codons of RABV and other lyssaviruses, we performed RSCU (relative synonymous codon usage) analysis of the 59 codons. In RABV, among the 18 most used synonymous codons, 10 were G- and C-ended (6 G-ended; 4 C-ended) and the other 8 were A- and U-ended (6 U-ended; 2 A-ended), so the preferentially used codons were G- and U-ended. However, for other lyssaviruses the preferred codons were A- and U-ended (LBV: 6 A-ended and 8 U-ended; MOKV: 7 A-ended and 5 U-ended; DUVV: 5 A-ended and 5 U-ended; ABLV: 4 A-ended and 7 U-ended). Interestingly, the preferred codons of EBLV were evenly split between A- and U-ended and G- and C-ended codons (2 A-ended, 7 U-ended, 3 C-ended and 6 G-ended). Next, we found that 2 of the 18 preferred codons in RABV (UCU for Ser and AGA for Arg) had RSCU values >1.6, and the remaining preferred codons had RSCU values >0.6 and <1.6.
The numbers of over-represented codons in ABLV, EBLV and DUVV were the same as in RABV, whereas MOKV and LBV had 3 preferred codons with RSCU values >1.6 (Table 1). None of the preferred codons were under-represented (RSCU < 0.6), regardless of the virus strain. Overall, the patterns of synonymous codons of RABV and other lyssaviruses are similar, though there are some differences in the nucleotide preferred at the third position of synonymous codons.

To dissect the variations in the codon usage trends among different lyssaviruses, we carried out PCA (principal component analysis) with the RSCU values of the genome coding sequences and the individual coding sequences.
The first (f'1) and second (f'2) principal axes accounted for 26.1% and 12.5%, respectively, together covering 38.6% of the total variation in the codon usage of RABV. The third (f'3) and fourth (f'4) axes accounted for 8.4% and 6.6% of the total variation in the codon usage of RABV, respectively. The same downward trend in axis values was observed for the other lyssaviruses, indicating that the f'1 axis accounted for most of the codon usage variation (Figure S1). The plot of the first (f'1) axis against the second (f'2) axis showed that lyssaviruses are divided into six groups, although there was a degree of overlap, indicating that these lyssavirus strains may share a common ancestor. PCA also revealed that whole genome coding sequences of lyssavirus strains were frequently distributed along the first (f'1) and second (f'2) principal axes, except for LBV, while the individual coding sequences of lyssavirus strains were diffusely distributed (Figure 2).
To establish the forces shaping the codon usage patterns of RABV and other lyssaviruses, we constructed ENC-GC3s plots, performed PR2 (parity rule 2) bias analysis, and calculated correlations among the nucleotide compositions, codon compositions, Gravy, Aroma and principal axes. We found that the ENC values of all lyssavirus strains fell below the expected ENC curve and clustered together, except for LBV, in ENC-GC3s plots (Figure 3A), indicating that, in addition to mutation pressure, other factors, including natural selection, also drive the codon usage bias of RABV and other lyssavirus strains. However, in the plots constructed using individual gene coding sequences, some points fell on the expected curve, for instance the N, P and M genes of RABV and the M gene of LBV and MOKV (Figure 3B-F). Interestingly, LBV rarely clustered together with other lyssaviruses, regardless of whether genome or individual gene coding sequences were used (Figure 3), which is consistent with the plots of nucleotide distribution. To further analyze the restriction that highly biased genes impose on codon choice, the relationships between the AU contents and the GC contents in the fourfold degenerate codon families (alanine, arginine, glycine, leucine, proline, serine, threonine and valine) were analyzed by PR2 plots (Figure 4). We found that the distribution of nucleotides was unequal in whole genome or individual gene coding sequences. Additionally, we found that in the four-codon amino acid families A ≠ U and G ≠ C, indicating that there is more than one driving force and that the extent of their influence is not equal. We hypothesized that this may be due to a combination of mutation pressure and natural selection. Then we calculated the correlation of multiple factors. Several indices significantly correlated with the principal axes (Table 2 and Table S3), further confirming the above conclusion.
Overall, natural selection and mutation pressure have both contributed to the codon usage bias of lyssavirus strains.

Note: NS means non-significant (p > 0.05); * represents 0.01 < p < 0.05; ** represents p < 0.01.
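The PR2 coordinates discussed in this section can be sketched as follows. This is only an illustration under stated assumptions: the sequence is an in-frame coding sequence, the eight fourfold degenerate families are identified by their first two bases (for Arg, Leu and Ser only the four-codon box is used), and the function name and toy sequence are hypothetical.

```python
from collections import Counter

# First two bases of the fourfold degenerate boxes:
# Ala, Arg, Gly, Leu, Pro, Ser, Thr, Val
FOURFOLD_PREFIXES = {"GC", "CG", "GG", "CU", "CC", "UC", "AC", "GU"}

def pr2_point(cds):
    """Return the PR2 coordinates A3/(A3+U3) and G3/(G3+C3)."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third = Counter()
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGU":
            third[codon[2]] += 1
    au = third["A"] + third["U"]
    gc = third["G"] + third["C"]
    return {
        "A3/(A3+U3)": third["A"] / au if au else None,
        "G3/(G3+C3)": third["G"] / gc if gc else None,
    }

# Hypothetical alanine-only toy sequence: GCA GCA GCU GCG GCC
point = pr2_point("GCAGCAGCUGCGGCC")
print(point)
```

A point at (0.5, 0.5) corresponds to A = U and G = C, and any departure from that center reflects the A ≠ U or G ≠ C imbalance described above.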
To determine the main factor shaping the codon usage pattern of the lyssaviruses, we performed neutrality plot analysis. We found a significant positive correlation between the P12 (GC1,2s) and P3 (GC3s) values (p = 0.003) of RABV. The P12 and P3 values of LBV (p = 0.042) and DUVV (p = 0.030) were positively correlated, whereas for EBLV (p = 0.011) there was a significant negative correlation between the P12 and P3 values. For MOKV (p = 0.342) and ABLV (p = 0.404) there was no significant correlation between the P12 and P3 values. Then, we calculated the slope of the regression line for each lyssavirus species. The slope for RABV was 0.030, indicating that natural selection is the primary force influencing the codon usage patterns of RABV. The slopes for LBV, DUVV, EBLV-1, ABLV and MOKV were 0.075, 0.120, −0.080, −0.020 and 0.077, respectively. Thus, the contributions of mutation pressure were 7.5%, 12.0%, 8.0%, 2.0% and 7.7%, and those of natural selection were 92.5%, 88.0%, 92.0%, 98.0% and 92.3%, respectively, demonstrating the dominant influence of natural selection in all lyssavirus strains. Therefore, in comparison with mutation pressure, natural selection is the predominant force driving the codon usage of lyssaviruses (Figure 5).
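The arithmetic that turns a neutrality-plot slope into the percentages quoted in this section can be sketched as below. The sketch assumes that P3 is placed on the x-axis and P12 on the y-axis, that the slope comes from an ordinary least-squares fit, and that the share of mutation pressure is taken as the absolute slope expressed as a percentage, which matches the values reported in the text. The function names are hypothetical, and the study itself used GraphPad Prism rather than code like this.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def pressure_shares(slope):
    """Return (mutation pressure %, natural selection %) for a slope."""
    mutation = abs(slope) * 100.0
    return mutation, 100.0 - mutation

# Slopes as reported in the text.
reported = {
    "RABV": 0.030, "LBV": 0.075, "DUVV": 0.120,
    "EBLV-1": -0.080, "ABLV": -0.020, "MOKV": 0.077,
}
shares = {name: pressure_shares(s) for name, s in reported.items()}
for name, (mutation, selection) in shares.items():
    print(f"{name}: mutation {mutation:.1f}%, selection {selection:.1f}%")
```

For real data, `regression_slope` would be called with the per-strain GC3s values as `x` and the matching GC1,2s values as `y`.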
We calculated the abundance of the 16 dinucleotides in the coding sequences of lyssavirus strains to understand their possible effect on codon usage bias (Figure S2). We found that the dinucleotide frequencies were not equal: dinucleotides ApG, GpA and UpC were overrepresented, while dinucleotide CpG was underrepresented. Additionally, dinucleotide CpU was overrepresented in RABV and MOKV, while dinucleotides GpC and UpA were underrepresented in the coding sequences of all lyssavirus strains except MOKV. Furthermore, the RSCU values of the 8 CpG-containing codons (UCG, CCG, ACG, GCG, CGA, CGU, CGC, and CGG) were <1.6, indicating that dinucleotide CpG was suppressed. These results indicated that dinucleotide abundance influences the codon usage bias of lyssaviruses.

RABV, belonging to the genus Lyssavirus, causes an acute zoonotic infectious disease responsible for about 60,000 human deaths a year. Although the evolution of lyssaviruses, especially RABV, has been previously investigated, many gaps still exist due to a lack of deep and systematic investigation. Here, we used 498 lyssavirus sequences to perform a systematic and comprehensive analysis to understand the codon usage patterns during evolution and discriminate patterns of codon usage among different lyssavirus species.
The clustering of different lyssavirus species in the PCA plot demonstrates a significant correlation among these strains during evolution and suggests that they may have diversified from a common ancestor, as previously reported [29]. However, this is still controversial [30]; thus, increased surveillance is needed to resolve this question. In order to adapt to changes in the environment and the host, RNA viruses undergo evolutionary changes leading to genome divergence [31]. Codon usage bias is an important manifestation of gene evolution that can be influenced by many factors, the most common being natural selection and mutation pressure. We calculated ENC values and nucleotide composition and found that the highest mean ENC value was for RABV (53.84), indicating that the codon usage bias of RABV was the lowest. Previous studies have already reported low codon usage bias for RABV genes, including N [32] and G [33]. In addition, low codon usage bias has been identified in other RNA viruses, such as H5N1 influenza virus (50.91) [34], H3N8 equine influenza virus (52.09) [35], Ebola virus (57.23) [36] and hepatitis C virus (HCV) (52.62) [37]. Low codon usage bias can help overcome host defense mechanisms and reduce the barriers for virus replication [38] [39] [40]. Therefore, it allows persistent infection in the preferred host. The analysis of nucleotide composition can reveal the use of preferred codons and reflect the effect of mutation pressure on codon usage bias. In lyssaviruses, the AU content was comparatively higher than the GC content in the overall genomic composition, demonstrating that codon usage bias plays a role in evolution. For RABV, despite the AU content being higher than the GC content, the preferred codons ended in G or U. However, for LBV, MOKV and DUVV, the majority of preferred codons ended in A or U, consistent with the nucleotide content.
Overall, this imbalance in codon usage can well account for the effect of mutation pressure on codon usage bias. Moreover, we performed ENC-GC3s plot, PR2 and correlation analyses to study the forces that drive codon usage bias. ENC-plot analysis showed that all strains of lyssaviruses fall below the expected ENC curve, indicating that, in addition to mutation pressure, other factors including natural selection also drive the codon usage bias of RABV and other lyssavirus strains. Additionally, most points in the plot constructed using individual gene coding sequences also fall below the expected ENC curve. In conclusion, mutation pressure is important in shaping the codon usage of lyssaviruses. Furthermore, PR2 analysis revealed that there is more than one driving force and that the effects of mutation pressure and natural selection are not equal. In addition, the remarkable correlations between ENC, Gravy, Aroma and multiple factors revealed by correlation analysis indicated that natural selection contributes to the codon usage bias of lyssaviruses. We also constructed neutrality plots between the P12 and P3 values of complete genome and individual gene coding sequences and found that natural selection is the predominant force, consistent with a previous report [32]. Dinucleotide abundance is one factor influencing codon usage bias, as previously described [35]. We found that dinucleotides ApG, GpA and UpC were overrepresented in lyssaviruses, whereas dinucleotide CpG was underrepresented. Unmethylated CpG dinucleotides can activate the immune response through the intracellular pattern recognition receptor Toll-like receptor 9 (TLR-9) [41, 42]. Therefore, low CpG usage contributes to evading immune responses. In summary, we performed a comprehensive analysis of the codon usage bias of the genome coding sequences of six virus species from the genus Lyssavirus, collected from 1931 to the present, to further understand the evolution of lyssaviruses.
Our results revealed that the codon usage bias of lyssaviruses is slight and that natural selection is a major factor influencing codon usage. Additionally, dinucleotide bias partly contributed to lyssavirus codon usage patterns. Overall, these results will serve future lyssavirus surveillance and basic research.

The coding sequences of 498 lyssavirus genomes across different lineages reported worldwide between 1931 and 2016 were downloaded from the National Center for Biotechnology Information GenBank database (http://www.ncbi.nlm.nih.gov/genbank/) (accessed on 29 October 2017). The detailed information regarding collection date, country, host and accession number is provided in Table S4. Unlike many reported RNA viruses, which have a high rate of recombination [43] [44] [45], recombination in the rabies virus genome has rarely been reported [46], and so we excluded the effect of recombination on subsequent codon analysis in the screening of the database.

The codon compositions at the third position (A3s%, U3s%, C3s% and G3s%) were computed using CodonW 1.4.2. The frequencies of A, U, C and G (%) were calculated using BioEdit. The GC content and GC1s, GC2s and GC3s were calculated using EMBOSS cusp. The codon usage bias analysis excluded five codons: AUG and UGG, since they are the only codons encoding Met and Trp, respectively, and the termination codons UAA, UAG and UGA [39].

RSCU indicates the relative probability of synonymous codons encoding an amino acid, removing the effect of amino acid composition and coding sequence length. The RSCU index was calculated as follows:

$$\mathrm{RSCU}_{ij} = \frac{X_{ij}}{\frac{1}{n_i}\sum_{k=1}^{n_i} X_{ik}} \tag{1}$$

where $X_{ij}$ is the observed number of the $j$th codon for the $i$th amino acid, and $n_i$ is the number of synonymous codons that encode the $i$th amino acid. An RSCU value >1.0 represents positive codon usage bias, while an RSCU value <1.0 indicates negative bias. An RSCU value of 1.0 indicates no codon usage bias [47].
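The RSCU definition above can be illustrated with a minimal sketch. It assumes the standard genetic code and an in-frame coding sequence, and it applies the >1.6 and <0.6 cut-offs used in this study for over- and under-represented codons. The study itself used MEGA 7.0, so the function names and the toy sequence here are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous families, excluding stop codons, Met (AUG) and Trp (UGG).
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa not in "*MW":
        FAMILIES[aa].append(codon)

def rscu(cds):
    """RSCU for each of the 59 codons whose amino acid occurs in cds."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent, RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values

def classify(value):
    """Label a codon with the cut-offs used in this study."""
    if value > 1.6:
        return "over-represented"
    if value < 0.6:
        return "under-represented"
    return "neutral"

# Hypothetical alanine-only toy sequence: GCU GCU GCU GCC
values = rscu("GCUGCUGCUGCC")
print({codon: (v, classify(v)) for codon, v in values.items()})
```

Within one amino acid the RSCU values always sum to the number of synonymous codons, which is a quick sanity check on any implementation.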
Additionally, synonymous codons with RSCU values >1.6 and <0.6 are considered over-represented and under-represented, respectively. RSCU values were calculated using MEGA (version 7.0) [48].

PCA is a multivariate statistical method to analyze the relationship between variables and samples to identify major variation trends. PCA was used to identify clustering among the RSCU values of each strain, represented as a 59-dimensional vector (excluding AUG, UGG and the three termination codons) [49]. PCA was performed using GraphPad Prism 5.0 (GraphPad Software Inc., San Diego, CA, USA) against the classification based on different lyssaviruses [50].

The ENC value describes the degree to which codon usage deviates from random selection and reflects the extent of preference for the non-equilibrium use of synonymous codons in a codon family. The values range from 20 to 61 [51]. The smaller the ENC value, the stronger the bias [52]. The ENC value was calculated as follows:

$$\mathrm{ENC} = 2 + \frac{9}{\bar{F}_2} + \frac{1}{\bar{F}_3} + \frac{5}{\bar{F}_4} + \frac{3}{\bar{F}_6} \tag{2}$$

where $\bar{F}_i$ ($i$ = 2, 3, 4, 6) represents the mean value of $F_i$ for $i$-fold degenerate codon families. The $F_i$ value was calculated using the following formula:

$$F_i = \frac{N\sum_{j=1}^{i}\left(\frac{n_j}{N}\right)^2 - 1}{N - 1} \tag{3}$$

where $N$ is the total number of occurrences of the codons for that amino acid and $n_j$ is the number of occurrences of the $j$th codon for that amino acid. In order to explore the factors influencing codon usage bias and to determine the relationship between the GC3s and ENC values, the expected ENC was calculated as follows:

$$\mathrm{ENC}_{\mathrm{expected}} = 2 + s + \frac{29}{s^2 + (1 - s)^2} \tag{4}$$

where $s$ is the frequency of G + C at the third codon position of synonymous codons. In ENC-GC3s plots, a point that sits on the expected curve indicates that mutation pressure is the only factor influencing evolution, whereas a point that sits below the expected curve indicates that mutation pressure is not the sole evolutionary driving force [50].
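The ENC and expected-ENC formulas above can be sketched as follows. The sketch assumes the standard genetic code and an in-frame coding sequence. Two simplifications are its own and not part of the study: families with an F value of zero or below are dropped, and the result is capped at 61. It raises an error if a degeneracy class is entirely missing rather than applying any correction. All names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa not in "*MW":  # stops, Met and Trp are excluded
        FAMILIES[aa].append(codon)

# Number of amino acids in each degeneracy class: 9, 1, 5 and 3.
WEIGHTS = {2: 9, 3: 1, 4: 5, 6: 3}

def enc(cds):
    """Effective number of codons for one coding sequence."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    f_values = defaultdict(list)
    for codons in FAMILIES.values():
        n = [counts[c] for c in codons]
        total = sum(n)
        if total > 1:
            f = (total * sum((x / total) ** 2 for x in n) - 1) / (total - 1)
            if f > 0:  # simplification: skip families with F <= 0
                f_values[len(codons)].append(f)
    value = 2.0
    for degeneracy, n_families in WEIGHTS.items():
        fs = f_values[degeneracy]
        if not fs:
            raise ValueError(f"no usable {degeneracy}-fold family in sequence")
        value += n_families / (sum(fs) / len(fs))
    return min(value, 61.0)  # simplification: cap at the theoretical maximum

def expected_enc(s):
    """Expected ENC when GC3s (s, as a fraction) is the only constraint."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)

# Extreme bias: one codon per amino acid, each used twice.
one_codon_each = "".join(codons[0] for codons in FAMILIES.values()) * 2
print(enc(one_codon_each), expected_enc(0.5))
```

In an ENC-GC3s plot, each strain would be drawn at (`s`, `enc(cds)`) and compared against the curve traced by `expected_enc(s)` for `s` between 0 and 1.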
PR2 analysis, which explores the relationship between A3/(A3 + U3) and G3/(G3 + C3) in the four-codon amino acid families, was used to demonstrate the effects of mutation pressure and natural selection on the codon usage of specific genes. Points sitting at the center of the plot indicate A = U and G = C, and therefore that the effects of mutation and selection rates are equal [53, 54].

The correlations among A%, U%, G%, C%, the nucleotide contents at the third codon position (A3, U3, G3, C3 and GC3), GC12, ENC, Aroma, Gravy, Axis 1 and Axis 2 were calculated using GraphPad Prism (version 5.0). Significance was determined by the p value: p < 0.01 denotes a highly significant correlation and 0.01 < p < 0.05 denotes a significant correlation.

Neutrality analysis was performed to identify the effects of natural selection and mutation pressure on the codon usage patterns by plotting the P12 (GC1,2s) values of the synonymous codons against the P3 (GC3s) values using GraphPad Prism 5.0 (GraphPad Software Inc., San Diego, CA, USA) [55]. The influence of natural selection and mutation pressure is expressed as the slope of the regression line. If the slope of the regression line is close to ±0.5, it indicates no or weak external selection pressure. When the slope is close to 0 or 1, it indicates a very low correlation between GC1,2s and GC3s.

Dinucleotide frequency analysis was performed using the software DAMBE to estimate the effect of dinucleotide abundance on codon usage patterns [56]. The relative abundances of the 16 dinucleotides were calculated as follows:

$$P_{xy} = \frac{f_{xy}}{f_x f_y} \tag{5}$$

where $f_x$ and $f_y$ represent the frequencies of nucleotides X and Y, respectively, $f_{xy}$ represents the observed frequency of the dinucleotide XY, and $f_x f_y$ represents the expected frequency of the dinucleotide. The XY dinucleotide is considered overrepresented when $P_{xy}$ > 1.23 and underrepresented when $P_{xy}$ < 0.78 [57].
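The dinucleotide odds ratio defined above can be sketched as follows. The sketch assumes a sequence made only of unambiguous bases, counts overlapping dinucleotides along the whole sequence, and applies the 1.23 and 0.78 cut-offs stated in the text. The study used DAMBE for this step, so the function names and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

def dinucleotide_odds(seq):
    """Observed/expected ratio P_xy for each dinucleotide XY."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))  # overlapping pairs
    ratios = {}
    for x, y in product("ACGU", repeat=2):
        f_x = mono[x] / n
        f_y = mono[y] / n
        if f_x == 0 or f_y == 0:
            continue  # expected frequency is zero, ratio undefined
        f_xy = di[x + y] / (n - 1)
        ratios[x + y] = f_xy / (f_x * f_y)
    return ratios

def classify(p_xy):
    """Label a dinucleotide with the cut-offs given in the text."""
    if p_xy > 1.23:
        return "overrepresented"
    if p_xy < 0.78:
        return "underrepresented"
    return "normal"

# Hypothetical toy sequence, not a real lyssavirus gene.
ratios = dinucleotide_odds("AUAUAUAU")
print({pair: (round(r, 3), classify(r)) for pair, r in ratios.items()})
```

For a real coding sequence the CpG entry is `ratios["CG"]`, and a value below 0.78 would correspond to the CpG underrepresentation reported in this study.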
Variable evolutionary routes to host establishment across repeated rabies virus host shifts among bats Replication strategies of rabies virus Human rabies: Neuropathogenesis, diagnosis, and management Still Neglected after 125 Years of Vaccination Selected highlights from other journals: Estimating the global burden of canine rabies Public Health Responses to Reemergence of Animal Rabies Evidence of two Lyssavirus phylogroups with distinct pathogenicity and immunogenicity Taxonomy of the order Mononegavirales: Update Lyssavirus in Japanese Pipistrelle Tentative novel lyssavirus in a bat in Finland Ecology and evolution of rabies virus in Europe The role of viral evolution in rabies host shifts and emergence A perspective on lyssavirus emergence and perpetuation Molecular Epidemiology and Evolution of European Bat Lyssavirus 2 Codon usage in twelve species of Drosophila Analysis of codon usage bias of envelope glycoprotein genes in nuclear polyhedrosis virus (NPV) and its relation to evolution Evolutionary and genetic analysis of the VP2 gene of canine parvovirus Factors affecting the codon usage bias of SRY gene across mammals Analysis of phylogeny and codon usage bias and relationship of GC content, amino acid composition with expression of the structural nif genes Codon Usage Bias and Determining Forces in Taenia solium Genome Analysis of codon usage bias of mitochondrial genome in Bombyx mori and its relation to evolution The impact of host genetic diversity on virus evolution and emergence Recombination in the rabies virus and other lyssaviruses The Lyssavirus glycoprotein: A key to cross-immunity Production and neurotropism of lentivirus vectors pseudotyped with lyssavirus envelope glycoproteins Role of the glycoprotein G in lyssavirus pathogenicity Emergence of Lyssaviruses in the old world: The case of Africa The Global Phylogeography of Lyssaviruses-Challenging the "Out of Africa" Hypothesis Science & SciLifeLab Prize. 
From persistence to cross-species emergence of a viral zoonosis Codon usage bias in the N gene of rabies virus Synonymous codon usage pattern in glycoprotein gene of rabies virus Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses Revelation of Influencing Factors in Overall Codon Usage Bias of Equine Influenza Viruses Genome-wide analysis of codon usage bias in Ebolavirus The characteristic of codon usage pattern and its evolution of hepatitis C virus Characterization of the porcine epidemic diarrhea virus codon usage bias Evolution of codon usage in Zika virus genomes is host and vector specific Evolutionary characterization of Tembusu virus infection through identification of codon usage patterns CpG oligonucleotide activates Toll-like receptor 9 and causes lung inflammation in vivo Clinical application of CpG-, non-CpG-, and antisense oligodeoxynucleotides as immunomodulators Genetic Recombination, and Pathogenesis of Coronaviruses Epidemiology, Evolution, and Pathogenesis of H7N9 Influenza Viruses in Five Epidemic Waves since 2013 in China Novel Influenza D virus: Epidemiology, pathology, evolution and biological characteristics Phylogenetic analysis reveals a low rate of homologous recombination in negative-sense RNA viruses An evolutionary perspective on synonymous codon usage in unicellular organisms Codon usage bias and the evolution of influenza A viruses. 
Codon Usage Biases of Influenza Virus Theory and Applications of Correspondence-Analysis-Greenacre Genetic and codon usage bias analyses of polymerase genes of equine influenza virus and its relation to evolution The Effective Number of Codons Used In a Gene Comprehensive analysis of the overall codon usage patterns in equine infectious anemia virus Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position Intrastrand Parity Rules of DNA-Base Composition and Usage Biases of Synonymous Codons Directional mutation pressure and neutral molecular evolution Dinucleotide relative abundance extremes: A genomic signature Genomic analysis of codon usage shows influence of mutation pressure, natural selection, and host features on Marburg virus evolution The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.