key: cord-0072278-szhqlrwu
authors: Nair, Rahul Raveendran; Mohan, Manikandan; Rudramurthy, Gudepalya R.; Vivekanandam, Reethu; Satheshkumar, Panayampalli S.
title: Strategies and Patterns of Codon Bias in Molluscum Contagiosum Virus
date: 2021-12-20
journal: Pathogens
DOI: 10.3390/pathogens10121649
sha: 84a316f9c0cd297b0457c9606528890f8c8a5032
doc_id: 72278
cord_uid: szhqlrwu

Trends associated with codon usage in molluscum contagiosum virus (MCV) and factors governing the evolution of codon usage have not been investigated so far. In this study, attempts were made to decipher the codon usage trends and discover the major evolutionary forces that influence the patterns of codon usage in MCV with special reference to sub-types 1 and 2, MCV-1 and MCV-2, respectively. Three hypotheses were tested: (1) codon usage patterns of MCV-1 and MCV-2 are identical; (2) SCUB (synonymous codon usage bias) patterns of MCV-1 and MCV-2 slightly deviate from that of the human host to avoid affecting the fitness of the host; and (3) translational selection predominantly shapes the SCUB of MCV-1 and MCV-2. Various codon usage indices, viz., relative codon usage value, effective number of codons and codon adaptation index, were calculated to infer the nature of codon usage. Correspondence analysis and correlation analysis were performed to assess the relative contribution of silent base contents and the significance of codon usage indices in defining bias in codon usage. Among the tested hypotheses, only the second and third hypotheses were accepted.

In the universal genetic code, any given amino acid except tryptophan and methionine is encoded by a specific set of multi-fold degenerate codons called synonymous codons [1, 2]. Because a mutation that replaces one synonymous codon with another in a given coding region does not modify the amino acid sequence, such mutations are called 'silent' [3].
Although these synonymous changes are seemingly neutral, selection of synonymous codons occurs during the process of evolution as these 'silent' changes have many effects on the functioning of a living cell [3]. Due to selection, even though translational mechanisms in organisms are relatively conserved from pole to pole, patterns of synonymous codon usage (SCU) are non-random across species, resulting in species-specific SCU [4, 5]. Further, usage of synonymous codons varies within genes of the same genome [6-8]. Although selection and mutation remain the two major explanations for the origin of SCU bias (SCUB) [5, 9], several factors of varying intensities contribute to the origin of distinct patterns of SCU within and between genomes [10], for instance, GC content [11, 12], rate of gene expression [13, 14], mRNA decoding tempo of ribosomes [13], mRNA secondary structure [15, 16], mRNA turnover [17, 18], co-translational protein folding and translation elongation [19], gene function [20], rate of recombination [21, 22], gene length [23, 24], codon position [21], habitat stress [25, 26] and population size [21]. Intraspecies SCUB is often viewed as the result of selection because the higher the number of preferred codons, the higher the level of gene expression would be [23, 27]. In contrast, mutational pressure is assumed to be the primary player in determining interspecies SCUB [1, 28, 29]. However, such generalizations of the driving forces behind SCUB in intraspecies and interspecies scenarios are not yet fully justified [30, 31], as compositional constraints (differential nucleotide contents) of genomes are also crucial. For instance, GC-rich genomes tend to favor G and C ending codons whereas AT-rich genomes preferentially use A and T ending codons [6, 32, 33].
Research on SCUB in various species unveiled the role of weak selection acting at the molecular level in molecular evolution [34-36], and such studies produced substantial evidence for developing molecular evolutionary models based on selection rather than on the neutral model of molecular evolution [37, 38]. An understanding of the differential influences of these forces in shaping SCUB in a species is of paramount importance to research, as it paves the way for studying the evolutionary potential of the genomic machinery of that species.

Viruses are parasites which depend on host cells to undertake key biomolecular measures of survival, such as transcription, translation and replication [39]. Viral genes are capable of altering various steps in the pathogen identification pathways of the host cell [40]. Certain viruses are proposed to remain in the host cell for long durations without being identified by host immune mechanisms and may follow a relaxed, inexorable way of reproduction using the cell's replication machinery [39]. Essentially, such long-term association with host cells can cause transformation of the whole viral genome (DNA/RNA) into an integral part of the host genome (colonization), which will decide the direction of the evolution of the host. Analyses of SCUB of various viral genomes reported that the efficiency of adaptation of viral genomes to the host is directly proportional to the degree of similarity of SCUB between virus and host; the greater the similarity, the higher the adaptation will be [41, 42]. A recent study revealed that the optimum SCUB pattern of a viral genome deviates slightly from the SCUB pattern of the natural host in order to avoid excessive expression and depletion of the tRNA pool, as host fitness is important for the virus to survive in natural host/virus systems [43]. Although debatable, the concept that viruses develop unique genes and then colonize bacterial and vertebrate lineages reveals the evolutionary significance of viruses [39].
Hence, studying SCUB patterns of viral genomes will help to gain significant insights into overall viral sustenance, codon adaptability and viral pathogenesis with respect to natural and symptomatic hosts [44].

Molluscum contagiosum virus (MCV) is a double-stranded DNA virus belonging to the genus Molluscipoxvirus of the Poxviridae family [45]. Molluscum contagiosum (MC) is a self-limited skin disease caused by MCV in humans which is characterized by small but raised mollusca (lesions) on the top layer of skin [46]. High incidence of MC is limited to the pediatric population, but immunodeficient individuals and sexually active adults are also susceptible to this infectious dermatosis [47]. The disease characteristics were initially described in 1814 [48], but the viral background of the disease was discovered in 1905 [49]. Although the raised mollusca associated with this infection are observed to be self-limiting, lesion clearance may take from 6 months to as long as 5 years [50]. As no significant difference was observed between treated and untreated cases [51], no FDA-approved therapy exists for treatment [52]. In general, 'active non-intervention' is adopted as a recommended strategy in dealing with MCV infections [52]. Currently, MCV cannot be cultured in vitro, limiting the ability to investigate replication and pathogenesis [53]. Four subtypes of MCV are identified, viz., MCV-1, MCV-2, MCV-3 and MCV-4 [53]. Among these subtypes, MCV-1 causes nearly 98% of cases, particularly in children, whereas MCV-2 causes skin lesions in immunocompromised adults [53]. The double-stranded DNA genome of MCV contains 182 non-overlapping coding frames, but only half of them share homologies with other poxvirus proteins [54]. The variable region of the MCV genome hosts a number of unique genes [55]; hence, the genomic machinery of MCV is highly divergent from other mammalian chordopoxviruses [56].
Considering the unique features of MCV such as (i) restriction to humans as a significant host, (ii) a lack of a system for culture and (iii) high divergence from other poxviruses [56], continued studies of MCV are required to gain insights into viral evolution [44], pathogenesis and cellular mechanisms which control the host's response to infection [57]. The present study focused on the genomes of MCV-1 and MCV-2 due to their higher rates of infection-causing capabilities among the four sub-types. As MCV uses humans as its natural host, long-term association with human cells may provide MCV a platform for its own evolution [58]. In light of the fact that MCV has unique strategies to coexist with its natural host [45], the present study is focused on testing the following three hypotheses to obtain insight into the co-evolving trend of the MCV genome with the host genome: (1) codon usage patterns of MCV-1 and MCV-2 are identical, (2) SCUB patterns of MCV-1 and MCV-2 slightly deviate from that of the human host to avoid affecting the fitness of the host, and (3) translational selection predominantly shapes the SCUB of MCV-1 and MCV-2.

Overall and site-specific base contents of coding sequences were estimated for MCV-1 and MCV-2 genomes to assess the effect of base composition in shaping SCUB. In all selected genomes of MCV-1 and MCV-2, G and C contents were higher overall than A and T contents (Figure 1), indicating that MCV is GC-rich. In the first codon position, G content was high, whereas in the second position, T content was high although overall T content was relatively low. In synonymous sites (third position), C content was high in both subtypes. Complex correlations were observed between overall and site-specific base contents in MCV-1 and MCV-2 genomes (Table 1). In both subtypes, A content was in significant negative correlation with G3, whereas A content was in positive correlation with A3 in five genomes of MCV-1.
In MCV-2 genomes, A and A3 were not correlated. In all genomes, T content was in significant positive correlation with A3 and T3 and was in negative correlation with C3, G3 and GC3. Except for two genomes of MCV-1 and five genomes of MCV-2, the genomes exhibited significant negative correlation between G and T3, whereas positive correlation existed between G and G3 in all selected MCV-1 and MCV-2 genomes. In both subtypes, C content was positively correlated with C3, G3 and GC3, whereas it was negatively correlated with A3 and T3.

ENC and GC3 values were calculated for coding sequences of MCV-1 and MCV-2 genomes. The mean ENC value was 45.03 ± 0.57 and the mean GC3 value was 53.308 ± 0.78. ENC values of the majority of coding sequences lay between 33 and 54 in MCV-1 and MCV-2 genomes, indicating a clear but weak bias [59]. In the ENC vs. GC3 plot, the majority of coding sequences lay considerably below the expected curve, indicating a high possibility of selection influencing SCUB (Figure 2). The Mann-Whitney two-sample test did not reveal any significant differences between intergenomic ENC. Moreover, a strong positive correlation between ENC and GC3 values was observed in all genomes (p < 0.0001), indicating the possible role of mutation as one of the major determining factors in shaping SCUB. Among the coding sequences analyzed, a few were observed to have low SCUB (ENC ≥ 55) (Table 2).

In the neutrality plot, strong positive correlations were observed between GC12 and GC3 in seven MCV-1 genomes (Figure 3a-g), and relatively weaker negative correlations were observed between GC12 and GC3 in two MCV-1 genomes and all selected MCV-2 genomes (Figure 3h-o). These significant correlations (p ≤ 0.001) indicated the critical role of mutation in shaping SCUB in the genomes of MCV-1 and MCV-2 but with varying intensities.
Among the selected MCVs, in the seven genomes of MCV-1, slopes of regression lines were close to 1, revealing that mutational pressure is highly influential in determining SCUB (Figure 3a-g) [60, 61], but the narrow distribution of GC3 could be due to the effect of some amount of selection. In the remaining genomes (two MCV-1 and all selected MCV-2; Figure 3h-o), the scatter plots were widespread with relatively weaker correlations, and the slopes of regression lines were ≤0.50. This indicated that mutational pressure is relatively lower and selection pressure is relatively higher in these genomes (Figure 3h-o) when compared with that of the seven MCV-1 genomes mentioned above [44].

The PR2 bias plot revealed non-proportional usage of AT and GC at the 3rd codon position in four-fold degenerate codons in MCV-1 and MCV-2 genomes. Frequencies of nucleotides A and T at degenerate positions (A3 and T3) were not equal, and neither were those of G3 and C3 (Figure 4). AT bias at degenerate positions in the coding sequences of MCV-1 and MCV-2 deviated considerably from the center of the plot (bias = 0.5, where A = T) relative to GC bias at degenerate positions in the four-fold degenerate codons.

RSCU values of 59 synonymous codons of coding sequences of MCV-1 and MCV-2 were tabulated (Table 3). No strand-specific bias was observed in synonymous codon usage (Table 4). MCV-1 and MCV-2 genomes exhibited preference towards G/C ending rather than A/T ending codons in coding amino acids other than methionine (Met) and tryptophan (Trp), as Met and Trp are each coded by a single codon. Among the thirty codons that were under-represented (RSCU < 0.6), 29 were A/T ending and one was G ending (CGG for Arg). Of the twenty-one G/C ending codons that were over-represented (RSCU > 1.6), TTC and CAG were found to be over-represented only in MCV-2 genomes. The codon CCC was over-represented only in a single MCV-2 genome, and CCG was over-represented in all genomes except two MCV-1 genomes and one MCV-2 genome (Table 3).
RSCU values of only 8 codons (~13.5%) were in the range of 0.6-1.6.

Analyses of dinucleotide frequencies revealed that dinucleotide contents were not randomly distributed (χ2 test; p ≤ 0.05). The CC, GG and TA dinucleotides were the most under-represented in both MCV sub-types. The dinucleotides CG and GC were over-represented in all chosen MCV-1 and MCV-2 genomes.

Among the 18 amino acids that are coded by synonymous codons, the most preferred codons for six amino acids were recognized by the suboptimal isoacceptor tRNAs (GCG for Ala, CCG for Pro, ACG for Thr, TCG for Ser, CGC for Arg and ATC for Ile) in the isoacceptor tRNA pool (Table 5). The most preferred codons for the remaining 12 amino acids were recognized by the abundant isoacceptor tRNAs (Table 5).

No single axis could explain the majority of variations in RSCU values of coding sequences of MCV-1 and MCV-2 (Supplementary Figures S1-S8). Cumulatively, axes 1-7 accounted for more than half of the codon usage variations in both sub-types of MCV. Among the seven principal axes chosen, axis 1 in MCV-1 and MCV-2 accounted for ~24% of total variations. Axis 1 was positively correlated with G3, C3, GC3 and gene length in all chosen sub-types of MCV, whereas axis 1 was negatively correlated with A3, T3, ENC and CAI (Table 6). Most of the genes were spread across axis 1 (Supplementary Figure S1). Grouping of A/T ending codons to the left and G/C ending codons to the right of axis 1 was noticed in both MCV-1 and MCV-2 genomes. Cluster analyses revealed distinct grouping of MCV-1 and MCV-2 based on RSCU values (Figure 5).

Deciphering genomic nucleotide composition is a prerequisite for characterization of viral genomes [62]. Nucleotide composition at third codon sites is found to be unequal and non-random between species [63, 64], and identification of the major determining factors of SCUB is essential for understanding viral genome evolution [65].
In this study, patterns of SCUB and various factors which influence the formation of SCUB patterns in selected individuals of MCV-1 and MCV-2 were examined in detail. Positively correlated homogeneous base contents and negatively correlated heterogeneous base contents in MCV-1 and MCV-2 indicate the major influence of mutational pressure [66]. However, correlation analyses revealed the existence of positive heterogeneous correlations (T and A3; C and G3) in all selected MC viruses. These positive heterogeneous correlations revealed that natural selection by the host must have influenced the SCUB patterns, since in viral genomes positive correlation between heterogeneous contents and negative correlation between homogeneous contents indicate host-induced natural selection [67]. The highest occurrence of nucleotide C at silent sites is consistent with the view that overall base contents of genomes determine patterns of SCUB [33, 63], as MCV genomes are GC-rich [45].

ENC values of the majority of genes were within the range of 33-54, which indicates the prevalence of a distinct but weak SCUB [59]. The mean ENC value of 45.03 ± 0.57 revealed a relatively stable codon usage in genomes of MCV sub-types, as ENC > 35 indicates a conserved genomic architecture [68, 69]. Significant differences in intragenomic ENC (SD ≥ 5.7) and GC3 (SD ≥ 7.2) and the strong positive correlation between ENC and GC3 point to the role of base compositional constraints in shaping SCUB, as reported in large double-stranded DNA viruses [6, 70]. Highly biased genes possess low ENC values (<35) [6], indicating high levels of gene expression [71]. Variola virus, a genetically close relative of MCV in the poxvirus group, causes a severe systemic disease with a high immune response in humans, whereas MCV does not cause fulminant systemic disease and elicits a low immune response [45].
The low immune response elicited by MCV infection can be attributed to the absence from the MCV genome of the highly expressed Variola virus genes that produce proteins enabling virus-host interactions [45]. The weak SCUB (low expression) of MCV genomes can be attributed to the ability of the MC viral machinery to remain in the host for longer periods of time without eliciting a fulminant immune response. As the majority of genes lie far below the bell-shaped portion of the expected ENC curve, the assumption that G + C biased mutation pressure is the sole factor behind the SCUB patterns in MCV does not hold true [71]. Rejection of this null hypothesis, that is, that SCUB is dictated solely by GC-biased mutational pressure due to GC richness in MCV genomes, reveals the possibility of selection influencing SCUB patterns [42] in MCV-1 and MCV-2. The possible role of selection was further supported by the narrow distribution of GC3 in seven MCV-1 genomes and the low regression slopes of the remaining MCV-1 and all selected MCV-2 genomes [44]. Mean values of AT bias [A3/(A3 + T3)] and GC bias [G3/(G3 + C3)] were greater than 0.5, indicating preference of purines over pyrimidines, that is, A over T and G over C [42, 72], in synonymous codons of four-fold degenerate amino acids.

The strong preference towards G/C ending codons was due to over-representation of CG/GC dinucleotides in MCV genomes. The low frequency of the GG dinucleotide resulted in the under-representation of the CGG codon in coding the amino acid Arg. This is consistent with the view that bias in dinucleotide frequencies shapes SCUB [6, 73]. The under-representation of the TA dinucleotide in MCV genomes may possibly be due to low thermal stability [74], resulting in destabilization of mRNA, coupled with the sensitivity of uracil in UpA (uracil-phosphate-adenine) to cytoplasmic RNase [75], which regulates mRNA turnover in a cell [42].
Among the GC containing codons, GCG, CCG, CGC, TCG and ACG were used preferentially (RSCU > 1.5) whereas CGA, CGG and CGT were under-represented (RSCU < 0.6). The low frequencies of GG and GT dinucleotides can justify the under-representation of CGG and CGT. The low preference for CGA may be attributed to the low overall A content. These results suggest that SCUB in MCV genomes is largely influenced by dinucleotide bias, as reported previously [42, 76]. Although codon usage patterns shared some common features as mentioned above, the cluster analysis (Figure 5) revealed a clear difference in RSCU patterns of MCV-1 and MCV-2, as both sub-types formed distinct clusters.

The role of translational selection in shaping SCUB in MCV can be confirmed by checking whether the most preferred codons are recognized by the most abundant isoacceptor tRNAs in the isoacceptor tRNA pool [9, 42]. In the selected MCV sub-types, the most preferred codons of 12 amino acids correspond to the most abundant isoacceptor tRNAs, indicating the role of translational selection [77, 78]. Most of the non-optimal codon-anticodon base pairing occurred with CG dinucleotide containing codons (GCG for Ala, CCG for Pro, ACG for Thr, TCG for Ser, CGC for Arg) in MCV genomes, that is, the most preferred CG dinucleotide containing codons in MCV were translated by rare tRNAs. This can be considered as a selective force to keep a low rate of translation [79, 80] in the beginning to allow proper folding of viral proteins [81] for evading host immunity [82] by reducing the anti-viral response from the host [73]. Moreover, strong positive correlations between CAI and ENC (p < 0.0001) also indicate selection pressure, as observed in Nipah viruses [42], since the correlation between ENC and CAI reflects the relative magnitude of selection versus mutation [83].
The strong correlations between axis 1 and silent base contents (A3, T3, G3 and C3) pointed to the relative influence of mutational pressure due to compositional constraints in shaping SCUB. CAI values are associated with selection, and ENC values reveal SCUB which can be due to either mutation or selection [42]. The strong correlation between axis 1 and these two indices (ENC and CAI) indicated the relatively high magnitude of selection over mutation in MCV genomes.

Similar to the pattern observed in MCV genomes, host cells also used G/C ending codons most preferentially [81, 84]. Although both MCV and host cells preferred G/C ending codons, the non-optimal codon-anticodon base pairing of the most preferred codons containing CG dinucleotides indicated that MCV genomes may follow a deliberate slight deviation from host codon usage to remain in the host for a certain period and become adapted to the host, acquiring an ambient 'climate' for genome evolution [39]. Viral adaptation to the host in terms of codon usage is essential for the infection to be successful in the human host [41], either due to coevolution of the human genome along with the infecting viral genome or due to human genome evolution from the viral genome [85].

This study was performed to test the veracity of the following three hypotheses.

First hypothesis - Codon usage patterns of MCV-1 and MCV-2 are identical: Although SCUB patterns of MCV-1 and MCV-2 shared common features, apparent intrinsic differences existed in codon usage patterns, as revealed by the grouping of MCV-1 and MCV-2 in cluster analysis. Thus, the first hypothesis was not accepted.

Second hypothesis - SCUB patterns of MCV-1 and MCV-2 slightly deviate from that of the human host to avoid affecting the fitness of the host: Although both human and MCV genomes used G/C ending codons, the most preferred codons containing CG dinucleotides were not recognized by the most abundant isoacceptor isotypes.
This indicated that MCV genomes followed a slight deviation from the codon usage pattern of host cells. Thus, the second hypothesis was accepted.

Third hypothesis - Translational selection predominantly shapes the SCUB of MCV-1 and MCV-2: Findings such as the strong correlations between ENC and CAI, the strong correlations between axis 1 and ENC and between axis 1 and CAI, and the recognition of the majority of the most preferred codons in MCV genomes by the most abundant isoacceptor isotypes in host cells indicate a dominant role of selection along with mutational pressure. Thus, the third hypothesis was also accepted.

The coding sequences (CDS) with exact initiation and termination codons of nine MCV-1 and six MCV-2 genomes were retrieved in FASTA format from the GenBank database of the National Center for Biotechnology Information (NCBI). Details such as subtypes, accession numbers, country of isolation, total number of CDS, selected CDS and size of genomes are provided in Table 7. Only coding sequences of length ≥ 300 nucleotides were selected for analyses to avoid sampling errors and stochastic variations [6]. Sequences were aligned using the MUSCLE algorithm [86] embedded in MEGA X [87]. For each genome, coding sequences on the plus and minus strands were grouped separately to assess strand-specific codon usage bias.

Relative synonymous codon usage (RSCU) is an important measure to analyze the biased usage of synonymous codons in coding a given amino acid [88]. The RSCU value of a codon which codes for a given amino acid is calculated as the ratio of the observed occurrences of that codon to the expected occurrences of the same codon if all synonymous codons of that particular amino acid were used equally [27]. If the RSCU value of a codon is greater than 1, it indicates preferred usage over its synonymous counterparts [27, 89]. If the RSCU value is less than 1, it indicates non-preferred usage, and for rare codons, RSCU values fall below 0.66 [32]. No bias is indicated if the RSCU value is 1 [27].
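As a rough illustration of the RSCU definition described above (observed count of a codon divided by the count expected under equal usage of its synonyms), the following minimal Python sketch computes RSCU for a single coding sequence. The study itself used DAMBE for this step; the function name `rscu`, the `CODON_TABLE` lookup and the choice to skip amino acids that do not occur in the sequence are assumptions made here for illustration only.

```python
from collections import Counter

# Standard genetic code laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for the degenerate sense codons seen in one CDS.

    Met (ATG), Trp (TGG) and the stop codons are excluded, leaving the
    59 synonymous codons. Families absent from the sequence are skipped
    (an assumption of this sketch), since their RSCU is undefined.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            # observed count / (family total / degeneracy)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    example = rscu("GCGGCGGCGGCC")  # four Ala codons: GCG x3, GCC x1
    print(example["GCG"], example["GCC"], example["GCA"])
```

In this toy sequence, GCG is used three times as often as expected under equal usage, so its RSCU is well above 1, while the unused Ala codons score 0.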
The RSCU value was calculated according to the equation given below [27]:

$$\mathrm{RSCU}_{mn} = \frac{F_{mn}}{\frac{1}{c_n}\sum_{k=1}^{c_n} F_{kn}}$$

where RSCU_{mn} is the relative synonymous codon usage value of the mth codon of the nth amino acid, F_{mn} is the observed frequency of the mth codon of the nth amino acid and c_n is the number of standard synonymous codons of the nth amino acid, i.e., the level of codon degeneracy.

Dinucleotide frequencies were estimated to check whether any dinucleotides from the 16 possible combinations are preferably used, as dinucleotide bias is linked with SCUB [33]. The dinucleotide odds ratio was calculated as follows [42]:

$$P_{xy} = \frac{F_{xy}}{F_x F_y}$$

where F_x is the frequency of nucleotide x, F_y is the frequency of nucleotide y and F_{xy} is the frequency of dinucleotide xy. The odds ratio is defined as the ratio of the observed frequency of a dinucleotide to the expected frequency of that particular dinucleotide. If the odds ratio of a given dinucleotide falls above 1.25, it is a sign of over-representation, and if the value falls below 0.78, it is a sign of under-representation [42, 76].

The effective number of codons (ENC) was calculated to assess the extent of SCUB. ENC values range from 20 (extreme bias of synonymous codon usage, i.e., one codon for one amino acid) to 61 (near uniform synonymous codon usage). The expected ENC value of a given sequence is calculated as follows [71]:

$$\mathrm{ENC}_{\mathrm{expected}} = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}}$$

where s is the GC content at the synonymous position of codons (GC3). In the ENC vs. GC3 plot, the expected curve is a bell-shaped curve indicating the expected values of ENC (ordinate) determined solely by base composition (GC3; abscissa) as per the equation above [71]. In a biological system, for a given sequence, observed ENC values may not always follow the path of the expected curve. If observed ENC values fall on or just near the expected curve, it can be assumed that compositional constraints influence the SCUB to a great extent [89].
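Two of the quantities defined above, the dinucleotide odds ratio and the expected ENC for a given GC3, are simple enough to sketch in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, dinucleotides are counted with an overlapping sliding window (an assumption, since the counting scheme is not stated in the text), and GC3 is passed as a fraction between 0 and 1 rather than as a percentage.

```python
from collections import Counter


def dinucleotide_odds(seq):
    """Observed/expected ratio F_xy / (F_x * F_y) for each dinucleotide.

    Assumes len(seq) >= 2 and an overlapping sliding window; dinucleotides
    whose component bases are absent from the sequence are omitted.
    """
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1

    odds = {}
    for x in "ACGT":
        for y in "ACGT":
            fx, fy = mono[x] / n_mono, mono[y] / n_mono
            if fx and fy:
                odds[x + y] = (di[x + y] / n_di) / (fx * fy)
    return odds


def expected_enc(gc3):
    """Expected ENC if codon bias were explained by GC3 alone.

    gc3 is the GC fraction at third codon positions, in [0, 1].
    """
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)


def representation(odds_ratio):
    """Classify an odds ratio with the thresholds quoted in the text."""
    if odds_ratio > 1.25:
        return "over-represented"
    if odds_ratio < 0.78:
        return "under-represented"
    return "within expected range"


if __name__ == "__main__":
    print(expected_enc(0.5))  # peak of the bell-shaped curve
    print(representation(dinucleotide_odds("ACGT")["CG"]))
```

Evaluating `expected_enc` over a grid of GC3 values between 0 and 1 traces the expected curve against which observed ENC values are compared.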
On the other hand, if observed ENC values fall considerably below the expected curve, it can be assumed that certain other factors (e.g., selection) must be influencing the shaping of SCUB [89]. Coding sequences having ENC values ≤ 30 are considered to be highly biased and those with ENC values ≥ 55 are considered to be less biased [59].

Average GC compositions at the 1st, 2nd and 3rd codon positions were calculated. Using GC values at the 1st and 2nd positions (GC1 + GC2 = GC12; ordinate) and GC3 (abscissa), a neutrality plot was developed to assess the mutation-selection balance in framing SCUB [44]. In the scatter plot, each CDS is indicated by a dot, and the existence of a high correlation between GC12 and GC3 with a slope coefficient close to 1 indicates the role of mutation in shaping SCUB [90]. If dots are widespread with no correlation between GC3 and GC12 and the slope coefficient tends towards 0, selection is presumed to be influencing the SCUB [6, 44].

A parity rule 2 (PR2) plot was developed to determine the relative magnitude of mutation and selection in framing the base composition of coding sequences [44]. In this plot, AT bias [A/(A + T)] and GC bias [G/(G + C)] are plotted on the ordinate and abscissa, respectively [91]. If an equal proportion of nucleotides (A = T = G = C = 0.25) is assumed, 0.5 would be the value at the center of the plot, indicating that the effects of mutation and selection are equal [92]. In this study, AT and GC bias at the third codon positions [A3/(A3 + T3), G3/(G3 + C3)] of four-fold degenerate amino acids of each coding sequence were plotted, as PR2 biases at the synonymous positions are relatively more significant [93, 94].

Correspondence analysis (CA) was performed on 59 synonymous codons (excluding ATG for Met, TGG for Trp and the termination codons TAA, TAG and TGA) by treating each coding sequence as a 59-dimensional vector with each dimension corresponding to the RSCU value of a codon [61, 95], for delineating SCU variations across the genes of MCV genomes.
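The neutrality-plot and PR2 quantities described above can be sketched per coding sequence as follows. This is a hedged illustration, not the authors' pipeline: the function names are hypothetical, GC12 is taken here as the mean of GC1 and GC2 (an assumption about how the two positions are combined), the regression slope is an ordinary least-squares fit of GC12 on GC3, and the set of four-fold degenerate codon blocks (including the four-codon blocks of Leu, Ser and Arg) is also an assumption.

```python
from collections import Counter

# First two bases of the eight four-fold degenerate codon blocks:
# Ala, Gly, Pro, Thr, Val plus the four-codon blocks of Leu, Ser and Arg.
FOURFOLD = {"GC", "GG", "CC", "AC", "GT", "CT", "TC", "CG"}


def codons_of(cds):
    """Split a CDS into codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc12_gc3(cds):
    """Return (GC12, GC3) as fractions for one non-empty coding sequence."""
    codons = codons_of(cds)

    def gc_at(pos):
        return sum(c[pos] in "GC" for c in codons) / len(codons)

    return (gc_at(0) + gc_at(1)) / 2, gc_at(2)


def neutrality_slope(gc3_values, gc12_values):
    """Least-squares slope of GC12 (ordinate) on GC3 (abscissa)."""
    n = len(gc3_values)
    mean_x = sum(gc3_values) / n
    mean_y = sum(gc12_values) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(gc3_values, gc12_values))
    return sxy / sxx


def pr2_bias(cds):
    """Return (A3/(A3+T3), G3/(G3+C3)) over four-fold degenerate codons.

    A component is None when its denominator is zero.
    """
    third = Counter(c[2] for c in codons_of(cds) if c[:2] in FOURFOLD)
    at = third["A"] + third["T"]
    gc = third["G"] + third["C"]
    return (third["A"] / at if at else None,
            third["G"] / gc if gc else None)


if __name__ == "__main__":
    cds = "GCAGCAGCTGCGGCGGCGGCC"  # toy sequence of seven Ala codons
    print(gc12_gc3(cds))
    print(pr2_bias(cds))
```

Computing `gc12_gc3` for every CDS of a genome and passing the two resulting lists to `neutrality_slope` gives the slope that the text interprets as being close to 1 under mutational dominance and close to 0 under selection.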
The relative importance of each codon over each orthogonal axis is represented by its eigenvalue [96]. The total variation of codon usage was partitioned across 59 orthogonal axes in terms of the percentage variation accounted for by each CA axis [97]. The first axis of CA explains the largest share of variation, followed by subsequent axes accounting for progressively smaller shares [97]. The number of axes used in Spearman's rank correlation analyses to study the relative influence of various factors on SCUB was determined based on the condition that the selected axes account for the majority (>50%) of codon usage variations.

Cluster analysis was performed on the pooled RSCU values of coding sequences of MCV-1 and MCV-2 genomes to study the pattern of codon usage in subtypes of selected MCV based on grouping of subtypes in terms of codon usage [6, 70]. A 15 × 59 matrix was constructed in which rows corresponded to the 15 MCV strains (nine MCV-1 and six MCV-2) and columns corresponded to pooled RSCU values of the 59 codons. The method employed for clustering MCV-1 and MCV-2 subtypes based on RSCU values was unweighted pair-group average clustering based on Euclidean distances [6].

DAMBE ver 7.3.2 [98] was employed to compute overall base contents, site-specific nucleotide compositions, RSCU, ENC and codon adaptation index (CAI) values. The isoacceptor tRNA pool was identified using an online tool (GtRNAdb: Genomic tRNA database) [42]. All correlation analyses were carried out using the non-parametric Spearman rank correlation method [6, 97]. The Spearman rank correlation, the Mann-Whitney two-sample test and cluster analysis were performed using PAST 4.03 [99]. For all statistical analyses, the level of significance was taken as p < 0.05.

The following are available online at https://www.mdpi.com/article/10.3390/pathogens10121649/s1.
Figure S1: Coding sequences of MH320547; Figure S2: Coding sequences of MH320552 and MH320553; Figure S3: Coding sequences of MH320554 and MH320555; Figure S4: Coding sequences of KY040275 and KY040276; Figure S5: Coding sequences of KY040277 and U60315; Figure S6: Coding sequences of MH320548 and MH320549; Figure S7: Coding sequences of MH320551 and MH320550; Figure S8: Coding sequences of MH320556 and KY040274.

Selection on codon bias
Synonymous but not the same: The causes and consequences of codon bias
The Code of Silence: Widespread Associations between Synonymous Codon Biases and Gene Function
Codon catalog usage and the genome hypothesis
Variation and selection on codon usage bias across an entire subphylum
Evolution of Synonymous Codon Usage Bias in West African and Central African Strains of Monkeypox Virus
Speeding with control: Codon usage, tRNAs, and ribosomes
Sounds of silence: Synonymous nucleotides as a key to biological regulation and complexity
Revelation of Influencing Factors in Overall Codon Usage Bias of Equine Influenza Viruses
Elucidation of Codon Usage Signatures across the Domains of Life
Compositional dynamics of guanine and cytosine content in prokaryotic genomes
A general model of codon bias due to GC mutational bias
New insights into the factors affecting synonymous codon usage in human infecting Plasmodium species
Genome-wide codon usage pattern analysis reveals the correlation between codon usage bias and gene expression in Cuscuta australis
Synonymous Codon Usage Controls Various Molecular Aspects
The Yin and Yang of codon usage
Codon influence on protein expression in E. coli correlates with mRNA levels
Codon optimality, bias and usage in translation and mRNA decay
Synonymous Codon Usage-a Guide for Co-Translational Protein Folding in the
Silent SNPs: Impact on gene function and phenotype
Comparative analysis of codon usage bias and codon context patterns between dipteran and hymenopteran sequenced genomes
Does recombination improve selection on codon usage? Lessons from nematode and fly complete genomes
Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis
Synonymous codon bias is related to gene length in Escherichia coli: Selection for translational accuracy?
Tissue-specific differences in human transfer RNA expression
Amino acid and codon usage profiles: Adaptive changes in the frequency of amino acids and codons
Codon usage in yeast: Cluster analysis clearly differentiates highly and lowly expressed genes
Codon usage between genomes is constrained by genome-wide mutational processes
Forces that influence the evolution of codon bias
GC-Content evolution in bacterial genomes: The biased gene conversion hypothesis expands
Evolutionary determinants of genome-wide nucleotide composition
Compositional properties and codon usage of TP73 gene family
The extent of codon usage bias in human RNA viruses and its evolutionary origin
Reduced selection for codon usage bias in Drosophila miranda
Accounting for background nucleotide composition when measuring codon usage bias
Weak selection on synonymous codons substantially inflates dN/dS estimates in bacteria
The Population and Evolutionary Genetics of Codon Bias
Are viruses alive?
Immunity and immunopathology to viruses: What decides the outcome?
Viral adaptation to host: A proteome-based analysis of codon usage and amino acid preferences
Analysis of Nipah Virus Codon Usage and Adaptation to Hosts
Genomic and evolutionary comparison between SARS-CoV-2 and other human coronaviruses
Analysis of codon usage patterns and influencing factors in Nipah virus
Genome sequence of a human tumorigenic poxvirus: Prediction of specific host response-evasion genes
European guideline on the management of genital molluscum contagiosum
Molluscum contagiosum: An update and review of new perspectives in etiology, diagnosis, and treatment
A Practical Synopsis of Cutaneous Diseases
Zur Kenntnis des Virus des Molluscum contagiosum
Molluscum contagiosum: The importance of early diagnosis and treatment
Molluscum contagiosum: To treat or not to treat? Experience with 170 children in an outpatient clinic setting in the northeastern United States
Molluscum contagiosum: Review and update on clinical presentation, diagnosis, risk, prevention and treatment
Molluscum Contagiosum Virus
The genome of molluscum contagiosum virus: Analysis and comparison with other poxviruses
Origin and Evolution of Poxviruses
Immune evasion strategies of molluscum contagiosum virus
Molluscum Contagiosum Virus
Comparative studies on codon usage pattern of chloroplasts and their host nuclear genes in four plant species
Evolution of codon usage in Zika virus genomes is host and vector specific
Analysis of codon usage bias of chloroplast genes in Oryza species: Codon usage of chloroplast genes in Oryza species
Nucleotide Composition and Codon Usage across Viruses and Their Respective Hosts
Edging on Mutational Bias, Induced Natural Selection From Host and Natural Reservoirs Predominates Codon Usage Evolution in Hantaan Virus
Roles for Synonymous Codon Usage in Protein Biogenesis
Analysis of codon usage of severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) and its adaptability in dog
Evolution of Synonymous Codon Usage in the Mitogenomes of Certain Species of Bilaterian Lineage with Special Reference to Chaetognatha
The characteristics of the synonymous codon usage in enterovirus 71 virus and the effects of host on the virus in codon usage pattern
An evaluation of measures of synonymous codon usage bias
Comparative analysis of codon usage patterns in Rift Valley fever virus
A detailed comparative analysis on the overall codon usage pattern in herpesviruses
The 'effective number of codons' used in a gene
Gene characteristics of the complete mitochondrial genomes of Paratoxodera polyacantha and Toxodera hauseri (Mantodea: Toxoderidae)
Evolutionary basis of codon usage and nucleotide composition bias in vertebrate DNA viruses
Predicting DNA duplex stability from the base sequence
Evolution of the genome and the genetic code: Selection at the dinucleotide level by methylation and polyribonucleotide cleavage
Codon Pair Bias Is a Direct Consequence of Dinucleotide Bias
Synonymous codon usage in Drosophila melanogaster: Natural selection and translational accuracy
Codon usage and tRNA content in unicellular and multicellular organisms
In vivo introduction of unpreferred synonymous codons into the Drosophila Adh gene results in reduced levels of ADH protein
Codon usage can affect efficiency of translation of genes in Escherichia coli
Evidence for quasispecies distributions in the human hepatitis A virus genome
Genome rhetoric and the emergence of compositional bias
Codon usage in twelve species of Drosophila
Codon usage and replicative strategies of hepatitis A virus
Mobile elements: Drivers of genome evolution
Multiple sequence alignment with high accuracy and high throughput
Molecular Evolutionary Genetics Analysis across Computing Platforms
Synonymous codon usage in Lactococcus lactis: Mutational bias versus translational selection
Mutational pressure dictates synonymous codon usage in freshwater unicellular alpha-cyanobacterial descendant Paulinella chromatophora and beta-cyanobacterium Synechococcus elongatus PCC6301
Directional mutation pressure and neutral molecular evolution
Intrastrand parity rules of DNA base composition and usage biases of synonymous codons
Comprehensive analysis of synonymous codon usage patterns and influencing factors of porcine epidemic diarrhea virus
Characterization of the porcine epidemic diarrhea virus codon usage bias
Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position
Theory and Applications of Correspondence Analysis
Use and misuse of correspondence analysis in codon usage studies
Synonymous codon usage, GC(3), and evolutionary patterns across plastomes of three pooid model species: Emerging grass genome models for monocots
New and Improved Tools for Data Analysis in Molecular Biology and Evolution
PAST: Palaeontological statistics software package for education and data analysis

The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
The authors declare no conflict of interest.