key: cord-0023824-drv39exc authors: Chu, Pei-Yu; Huang, Hui-Wen; Boonchan, Michittra; Tyan, Yu-Chang; Louis, Kevin Leroy; Lee, Kun-Mu; Motomura, Kazushi; Ke, Liang-Yin title: Mass Spectrometry-Based System for Identifying and Typing Norovirus Major Capsid Protein VP1 date: 2021-11-22 journal: Viruses DOI: 10.3390/v13112332 sha: 25ebbe7ced68124ea735d90d681703c872f1cefd doc_id: 23824 cord_uid: drv39exc Norovirus-associated diseases are the most common foodborne illnesses worldwide. Polymerase chain reaction-based methods are the primary diagnostics for clinical samples; however, the high mutation rate of norovirus makes viral amplification and genotyping challenging. Technological advances in mass spectrometry (MS) make it a promising tool for identifying disease markers. Besides, the superior sensitivity of MS and proteomic approaches may enable the detection of all variants. Thus, this study aimed to establish an MS-based system for identifying and typing norovirus. We constructed three plasmids containing the major capsid protein VP1 of the norovirus GII.4 2006b, 2006a, and 2009a strains to produce virus-like particles for use as standards. Digested peptide signals were collected using a nano-flow ultra-performance liquid chromatography mass spectrometry (nano-UPLC/MS(E)) system, and analyzed by ProteinLynx Global SERVER and TREE-PUZZLE software. Results revealed that the LC/MS(E) system had an excellent coverage rate: the system detected more than 94% of amino acids of 3.61 femtomole norovirus VP1 structural protein. In the likelihood-mapping analysis, the proportions of unresolved quartets were 2.9% and 4.9% in the VP1 and S domains, respectively, which is superior to the 15.1% unresolved quartets in current PCR-based methodology. In summary, the use of LC/MS(E) may efficiently monitor genotypes, and sensitively detect structural and functional mutations of noroviruses. Human norovirus (HuNoV) is the leading cause of acute gastroenteritis (AGE) worldwide. 
Symptoms of infection include intense vomiting, nausea, watery diarrhea, abdominal cramping, and fever [1,2]. Young age, old age, and immunocompromised status are associated with increased morbidity and mortality [3]. HuNoV spreads quickly in crowded environments, owing to its low infectious dose (<10^2 viral particles), prolonged
The VLPs were heated at 95 °C for 10 min before sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Proteins were transferred onto a polyvinylidene difluoride membrane (PVDF; GE Healthcare, Piscataway, NJ, USA) by using a semidry system (Bio-Rad, Hercules, CA, USA). After transfer, the PVDF membrane was incubated in a blocking buffer (5% skim milk in phosphate-buffered saline containing 0.1% Tween-20, PBST) at room temperature (rt) for 1 h. The blots were probed with VP1 primary antibody (1:1000 in blocking buffer) at rt for 1 h, then washed thrice with PBST. The blots were then incubated with the secondary antibody (1:10,000 horseradish peroxidase-conjugated goat anti-mouse serum; Dako, Glostrup, Denmark) at rt for 1 h, and washed with PBST. Signals were visualized with Amersham ECL Prime (GE Healthcare, Piscataway, NJ, USA), and recorded with a ChemiDoc MP Imaging System (Bio-Rad, Hercules, CA, USA). To examine the size and morphology of VLPs, purified VLPs were layered onto 400-mesh Formvar/Carbon-coated copper grids (EM Sciences, Hatfield, PA, USA). Samples were adsorbed on the copper grids for 1 min, then stained with 4.0% uranyl acetate (Merck KGaA, Darmstadt, Germany). Negatively stained specimens were examined by transmission electron microscope (TEM-HT7700; Hitachi Global, Chiyoda City, Tokyo, Japan). The VP1 protein (1 µg/µL) was extracted using the solvent mixture chloroform/methanol (2:1 v/v, CHCl3/MeOH). The matrix solution was 50 mg sinapic acid (SA) matrix in 1 mL acetonitrile/0.1% trifluoroacetic acid (3:7, v/v). Samples were prepared by the reversed thin-layer method for MALDI-TOF MS.
A 1-µL aliquot of matrix solution was applied to the top of a vacuum-dried spot, and the sample was added to form a homogenized crystal. The MS target was subsequently introduced into a MALDI-TOF mass spectrometer (Autoflex III; Bruker Daltonics, Bruker Corp., Bremen, Germany). The sample spot was irradiated with a neodymium:yttrium-aluminum-garnet (Nd:YAG, Nd:Y3Al5O12) laser (355 nm; pulse duration, 3 ns; 200 Hz) for both desorption and ionization. For each sample, an average of 300 laser shots was used to obtain representative mass spectra, which were recorded over the range m/z 4000-70,000. All positive-ion mass spectra were acquired in the reflectron mode at an acceleration voltage of 19 kV in pulsed-ion extraction delay mode. The VP1 proteins were quantified by data-independent acquisition parallel-fragmentation MS (UPLC/MS(E)) performed with a Waters Xevo G2 Q-TOF mass spectrometer equipped with a nano-electrospray ionization interface (Waters Corporation, Milford, MA, USA) [27]. The VP1 protein (0.1 µg/µL) was dissolved in a solution containing 8 M urea (Sigma-Aldrich, Burlington, MA, USA)/50 mM Tris-HCl (pH 8)/5 mM dithiothreitol (Roche Diagnostics Deutschland GmbH, Mannheim, Germany), and incubated at 37 °C for 1 h. For alkylation and peptide digestion, 15 mM iodoacetamide (Sigma-Aldrich, Burlington, MA, USA) was added to the VP1 protein, which was then incubated in the dark for 30 min. Then, the VP1 protein was digested with Trypsin Gold (Promega, Madison, WI, USA) in 50 mM Tris-HCl (pH 8) buffer, and incubated overnight. Digested peptides were separated by M-class UPLC (Waters Corporation, Milford, MA, USA), through an M-Class Symmetry C18 Trap Column (130Å, 1.7 µm, Spherical Hybrid, 75 µm × 250 mm) and a BEH C18 column (130Å, 1.7 µm, Spherical Hybrid, 75 µm × 250 mm) under gradient conditions of 300 nL/min flow rate at 40 °C for 70 min.
The mobile phase was composed of buffer A (H2O + 0.1% formic acid) and buffer B (acetonitrile as the organic modifier and 0.1% formic acid) for molecule protonation. The gradient started at 1% buffer B, reached 60% at 25 min and 99% at 35 min, was held at 99% until 50 min, and returned to 1% at 55 min. Parallel ion fragmentation was programmed to switch between low (15 V) and high (38 V) energies in the collision cell. Peptide m/z signals were collected in the 300-3300 m/z range. Cone voltage was 40 kV. Glu-fibrinopeptide B (m/z 785.8426) was used as the real-time lock mass to calibrate data. Collision energies were 0-6 V, and cone voltage was 2.65 kV. Lock mass was continuously collected every 10 s in a parallel channel. The MS data were processed with ProteinLynx Global Server (PLGS, v3.0.3) software (Waters Corporation, Milford, MA, USA). Deisotoped peptide identification and modifications were performed using a database downloaded from UniProt at https://www.uniprot.org (accessed on 5 March 2018). Ion matching requirements were three fragments per peptide, and seven fragments per protein. The post-translational modifications, such as phosphorylation and glycosylation, were analyzed according to the instructional design process of PLGS software [28]. The absolute quantification was performed by comparison of ion intensities to an internal standard of yeast alcohol dehydrogenase (MassPREP ADH digestion standard, SwissProt P00330; Waters Corporation, Milford, MA, USA), which was added to give a final concentration of 10 fmol/mL of on-column sample injection. Quantification was performed with Progenesis QI for proteomics (QIP) software (Nonlinear Dynamics; Waters Corporation, Milford, MA, USA). For each NoV genogroup (GI-GVI) in GenBank, three to five complete VP1 sequences were acquired using Blastp (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins, accessed on 22 January 2018).
Multiple sequence alignment was done using the T-Coffee Multiple Sequence Alignment Program (Comparative Bioinformatics Group, Barcelona, Spain). We manually re-examined the viral sequences and alignment results. Some virus strains were excluded due to insufficient data on the isolation year or location, nonsense or frameshift mutations, or recombination. The final dataset comprised 139 sequences. For the full-length VP1 and each sub-domain, the best-fit substitution model was selected according to the lowest Bayesian information criterion (BIC) score using Molecular Evolutionary Genetics Analysis (MEGA) software (v11; Pennsylvania State University; University Park, PA, USA) [29]. The dataset was quantified by likelihood mapping analysis performed in the TREE-PUZZLE program (v5.3; Heiko A. Schmidt and Arndt von Haeseler, Vienna, Austria) [30]. Analyses of 10,000 quartets were performed using quartet sampling and a neighbor-joining tree with exact parameter estimates. Data were considered reliable for phylogenetic inference if less than 10% of the dots fell in the center of the triangle (unresolved quartets); data were considered unreliable if more than 30% of the dots fell in the unresolved quartets. Epistatic interactions constrain evolution; in some key protein sites, mutations are tolerated only after critical compensatory mutations. Therefore, the results of epistasis analysis may facilitate the identification of groups of residues involved in the same function. To understand the function or interaction among detected phosphorylation sites in norovirus, and the co-evolving sites in the VP1 region across the 139 aligned norovirus strains, the Bayesian graphical model (BGM) for co-evolving sites was implemented on the Datamonkey Adaptive Evolution Server (https://www.datamonkey.org/, accessed on 16 August 2021) [31]. A significant association between two sites was defined as a posterior probability exceeding a default cutoff of 0.5.
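The two decision rules stated in this section (the 10% and 30% thresholds for unresolved quartets, and the posterior probability cutoff of 0.5 for co-evolving sites) can be illustrated with a minimal Python sketch. This is not the TREE-PUZZLE or Datamonkey implementation; the function names, the "intermediate" label for the 10-30% band, and the demonstration pairs are hypothetical and introduced here only for illustration.

```python
from collections import Counter

def classify_unresolved(percent_unresolved: float) -> str:
    """Apply the likelihood-mapping thresholds from the text.

    Below 10% unresolved quartets: reliable; above 30%: unreliable.
    The text does not name the band in between; "intermediate" is an
    assumed label.
    """
    if percent_unresolved < 10.0:
        return "reliable"
    if percent_unresolved > 30.0:
        return "unreliable"
    return "intermediate"

def significant_pairs(pairs, cutoff=0.5):
    """Keep (site_a, site_b, posterior_probability) tuples above the cutoff."""
    return [(a, b, pp) for a, b, pp in pairs if pp > cutoff]

def pairs_per_site(pairs):
    """Count how many retained pairs each site participates in."""
    counts = Counter()
    for a, b, _ in pairs:
        counts[a] += 1
        counts[b] += 1
    return counts

if __name__ == "__main__":
    # Invented demonstration values, not data from this study.
    demo = [(1, 2, 0.9), (1, 3, 0.7), (2, 3, 0.4)]
    kept = significant_pairs(demo)
    print(classify_unresolved(2.9), dict(pairs_per_site(kept)))
```

Counting retained pairs per site in this way mirrors how sites with many interaction pairs could be singled out from a list of co-evolving pairs.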
Ten µL of supernatant from each subfraction from the CsCl ultracentrifugation was collected to measure the refractive index. Buoyant density was determined indirectly: a refractometer was used to obtain the refractive index, which was then converted to buoyant density using the following formula:
Density = 10.8601 × refractive index − 13.4974 (1)
Table 1 presents the refractive index and density of each subfraction. Notably, subfraction numbers 4, 5, and 6 had densities of 1.3245, 1.3299, and 1.3353 g/cm3, respectively. According to the SDS-PAGE results, the 55-60 kDa protein peaked at a density range of 1.3245-1.3353 g/cm3, corresponding to subfractions 4, 5, and 6 (Figure 1A). Western blot analysis confirmed that VP1 proteins were overexpressed in these subfractions; the range of molecular weights was approximated as 55-60 kDa (Figure 1B). The VP1 self-assembled particles were visualized by negative staining, and observed by TEM. The norovirus VP1-based VLP was found to be 38 nm (Figure 1C). Next, intact norovirus VP1 proteins of virus strain Hu/GII.4/Saga5/2006/JP were analyzed by MALDI-TOF. A distinct peak was detected at 58,043 m/z by 1 µg of VP1 protein (Figure 1D). According to these data, VP1 proteins with a molecular weight of approximately 58 kDa, as shown by western blotting, were successfully produced by the baculovirus expression system. Electron microscopy showed that these proteins could self-assemble into VLPs (size: 38 nm). Further, these VLPs were detectable by MALDI-TOF with a peak at 58,043 m/z at an amount of 17.24 pmol.
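As a worked illustration of Equation (1) and of the amount quoted for the MALDI-TOF measurement, the following Python sketch converts a refractive index to buoyant density and a protein mass to picomoles. The function names and the example refractive index of 1.3648 are assumptions for illustration (the values in Table 1 are not reproduced here), and the molar mass of 58,000 g/mol is a rounded assumption based on the approximately 58 kDa reported above.

```python
def density_from_refractive_index(refractive_index: float) -> float:
    """Equation (1): buoyant density (g/cm3) from the refractive index."""
    return 10.8601 * refractive_index - 13.4974

def picomoles(mass_ug: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass in micrograms to an amount in picomoles."""
    moles = mass_ug * 1e-6 / molar_mass_g_per_mol
    return moles * 1e12

if __name__ == "__main__":
    # Assumed example refractive index, chosen to land near subfraction 4.
    print(round(density_from_refractive_index(1.3648), 4))
    # 1 ug of VP1 at an assumed molar mass of 58,000 g/mol.
    print(round(picomoles(1.0, 58_000), 2))
```

Under these assumptions, a refractive index near 1.3648 corresponds to a density close to the 1.3245 g/cm3 reported for subfraction 4, and 1 µg of a 58 kDa protein corresponds to roughly 17.24 pmol.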
Although there were some missing residues, all the genotypes of VLPs were detected correctly. After comparison with reference sequences, these three VLPs were best matched with their original sequences, or with another strain with an identical VP1 amino acid sequence. Specifically, the best-characterized strain (i.e., the sequence with the highest score) of variant 2006b was B5BTN4 (the original strain), with 99.1% (535/540) coverage; that of variant 2006a was B5BTS0, with 96.7% (522/540) coverage. The best-characterized strain of variant 2009a was A0A0K2SS15, with a 94% (505/540) coverage rate; however, A0A0K2SS15 showed 100% similarity to A0A1B1CUK8 (original sequence of the VLP). The coverage rates were about 94% or higher at an amount of 3.6 fmol. When mapped onto the secondary structure and functional domains, the common missing residues were located at the N-terminus and C-terminus (two and three missing residues, respectively) and at positions 342-345, a region that covers the HBGA-binding targets (343, 344, and 345) (Figure 2). The three tested strains for VLPs were highly similar at the N-terminal arm (NTA), S domain, hinge region, P1-1 subdomain, and P1-2 subdomain, with 2/45, 3/170, 0/10, 2/47, and 3/35 substitutions, respectively. The differences were in the P2 subdomain, particularly at the N-terminal of the P2 subdomain (340-415 residues). These results demonstrated that the LC/MS(E) system has the potential to identify genotypes of norovirus.
Figure 2. The best-identified norovirus strains. The best-identified strains of variants 2006b and 2006a were B5BTN4 and B5BTS0, respectively. Although the best-identified strain of variant 2009a (A0A0K2SRW5) was A0A0K2SS15, they were identical in amino acid sequence. The virus strains used for VLPs are shown with black accession numbers; the best-matched strain for variant 2009a is indicated with a grey accession number.
Dots represent amino acids identical to the top sequence (B5BTN4); missing residues are shaded. The secondary structure of VP1 is noted under the aligned sequences.
Modifications, such as phosphorylation and glycosylation, may alter the function of viral proteins and interfere with the binding affinity with host cell receptors [31,32]. Thus, we investigated possible post-translational modifications on VLPs by using PLGS software according to the instructional design process. Results revealed that a total of twenty phosphorylation sites could be detected, including positions 5S, 14S, 115T, 130T, 134S, 171S, 197T, 224T, 240S, 251T, 300T, 359T, 364S, 368S, 369T, 377T, 393S, 394T, 425T, and 462Y (Figure 2, marked as ↓). The phosphorylation sites were clustered in the P2 domain and non-β-strand regions. All phosphorylated peptides were convincing, with detectable daughter fragments and high PLGS scores in the software (Figure 3).
The likelihood-mapping analysis is a graphical method to visualize the phylogenetic content of a sequence alignment. Each result is composed of two triangles. One of the two triangles presents dots only. Each dot within the triangle represents the likelihoods of the three possible unrooted trees for one of 10,000 random quartets. The other triangle divides those dots into seven partitioned regions. The percentage of dots falling in each region is indicated. The three corners of the triangle represent well-resolved phylogeny. The three side rectangles are net-like regions, in which it is hard to distinguish between two of the three topologies. The central triangle area is a star-like region, indicating "unresolved" quartets [32].
Results of the likelihood-mapping analysis for (1) full-length VP1, (2) NTR+S domain, (3) P1-1, (4) P2, and (5) P1-2 are shown in Figure 4A. Full-length aa sequences of the VP1 and S domains detected by MS were analyzed by the likelihood-mapping analysis. Results revealed that unresolved quartets were 2.9% and 4.9%, respectively (Figure 4A). Although the N- and C-termini had a few missing residues, they did not interfere with viral identification. In addition, the peptide sequences of P1-1, P2, and P1-2 detected by MS showed that 16.4%, 12.5%, and 11.5% of quartets were unresolved in the likelihood mapping. Compared with the 15.1% unresolved quartets of the current PCR-based methodology, the full-length VP1, S domain, P2, and P1-2 sequences were superior for identifying noroviruses.
To understand the function or interaction among detected phosphorylation sites in norovirus, epistatic analysis was also performed on 139 norovirus VP1 sequences, which identified 342 pairs of co-evolving sites (PP > 0.5). Intriguingly, these 342 pairs revealed numerous multiple epistatic interaction sites (MEISs). The residues with the most epistatic interaction pairs were residues 305, 329, 155, 317, and 540, which had 14, 13, 12, 11, and 8 epistatic interaction pairs, respectively.
Additionally, residues 130 and 475 each had seven epistatic interaction pairs (Figure 4, Supplementary Table S1). Three of the seven MEISs were located in the outermost P2 subdomain (positions 305, 317, and 329), whereas MEIS 540 was located in P1-2. Notably, MEIS 540 had the highest mutation impact, but was not influenced by others. In detail, MEIS 540 showed impacts on both MEIS 305 (in the P2 subdomain) and position 526 (in P1-2). Position 526 in turn influenced MEIS 475, MEIS 317, position 26 (in NTA), MEIS 155, residue 298, residue 445, and MEIS 329 (Figure 4B,C, red arrow). Additionally, the interaction network was indirectly or directly associated with HBGA binding site 1 (343 and 345) and site 2 (392 and 393). Two phosphorylation sites were involved in the epistatic network. The first phosphorylation site was on 228S, which may have an epistatic interaction with position 343 on the receptor-binding site. The second one was on 78S, showing the mutation impact on MEIS 130. Collectively, the co-evolution sites identified in VP1 were intimately networked. MEIS 540 (P1-2) was dominant in these interaction chains. Mutation of MEIS 540 showed great impacts on the multi-epistatic interaction network, especially the receptor-binding and antigenic-determinant P2 subdomain.
In the left triangle, each dot represents the likelihoods of three possible unrooted trees out of 10,000 random quartets. The right triangle shows the seven basins of attraction with their corresponding attractors. The numbers indicate the percentage of quartets falling in each region. Numbers in the center area of the triangle correspond to phylogenetic noise; when the number was less than 10%, the data were considered reliable for phylogenetic inference. Likelihood mapping graphics were produced by the TREE-PUZZLE program v5.2. (B) The secondary structure guide is located at the third panel (PDB ID code 6ouu). (C) Multiple epistatic interactions are shown at the bottom of the figure. Each square represents a residue position that participates in at least one interaction with a marginal posterior probability (PP) exceeding a default cutoff of 0.5. Only epistatic interactions with the HBGA binding site (highlighted with a yellow circle), phosphorylation sites (highlighted with a purple box), and epistatic interaction groups of no fewer than seven (highlighted with a red box) are shown. Arrows between squares indicate the epistatic direction between residues; the direction is also emphasized by the color level from light to dark, and the PP values are indicated by the line weight. An epistatic network is shown for the key dominant site, position 540.
The aim of this mass-based analysis was to establish an easily performed, sensitive, and high-resolution method for laboratory identification of norovirus infection. In this study, the most abundant protein, VP1, was produced by a baculovirus expression system as a standard for the MS-based assay. We demonstrated VLP formation by electron microscopy, which "froze" the protein modification pattern, and revealed the self-assembly ability of the VP1 protein.
A limitation of the study was the small number of reference sequences other than GII.4. Further investigations of other genotypes and genogroups are required. Previously, 57% coverage (amount range, 0.1 × 10^−12 to 50 × 10^−12 mol) was reported for authentic standards of recombinant VLPs using nanospray tandem MS to detect recombinant HuNoV VP1 protein digests [33]. In the current study, with the UPLC/MS(E) system, 98% coverage and 10 PTM sites achieved unambiguous identification at 3.6 × 10^−15 mol of VLP. The ORF1/ORF2 junction (C region) of the NoV genome serves widely as a target for rapid detection by nucleic acid-based amplification techniques, such as RT-PCR, qRT-PCR, and NASBA [6]. The high mutation rate of NoV results in inconsistent genotyping [9]. In the current study, no more than 5% of noise signals (dots in the unresolved quartet region) were found for the full VP1 and S domains, which indicated that these regions are sufficient for phylogenetic analysis. Since circulating recombinant strains have been found worldwide, dual typing of ORF1-RdRp (P-type) and ORF2-VP1 (genotype) is now routinely used for HuNoV typing worldwide [6]. Along with its superior coverage rate and sensitivity, UPLC-MS(E) is a promising tool for NoV typing. Typical viruses have small genomes that code for fewer than 20 proteins. Thus, most viral proteins have multiple functions that enable small numbers of viral proteins to hijack the host cell as the machinery for the viral lifecycle. For example, the capsid protein is involved in decapsidation-encapsidation, trafficking, and modulation of the host immune response [34,35]. Phosphorylation on the flexible regions of the capsid protein, e.g., the N- or C-terminus, alters their molecular surface charge and conformation, which then affects how capsids interact with other viral and cellular molecules [36].
This is a dynamic process. By analogy, the discovery of cluster of differentiation markers in immune cells relied on cancer cells, especially leukemia cells, which move out of the normal cell cycle and "freeze" the cell marker at some stage of carcinogenesis. This highlights the potential use of VLPs as tools for understanding the impact of capsid protein phosphorylation on the viral life cycle. The VP1 protein of norovirus starts with a flexible N-terminal arm (NTA, 1-45 aa), and it has been reported that this area is important for directing capsid assembly [8,37]. The S domain (46-215) exhibits an 8-stranded β-barrel motif, a canonical icosahedral coat building block of all capsid viruses. Two phosphorylation sites in the NTA (5S and 14S) and five in the S domain (115T, 130T, 134S, 171S, and 197T) were detected in this study. Of these, 171S has an epistatic interaction with MEIS 155, and 179T has an epistatic interaction with MEIS 130. In addition, 130T itself is a MEIS. This interaction chain suggests that phosphorylation of 130T might affect the interaction among the NTA, S domain, and P2 subdomain. The P domains dimerize to form a protrusion on the capsid surface. Essentially, the fold of the P1 subdomain (227-273) consists of nine anti-parallel β-strands. The fold of the P2 subdomain (274-416) is a β-barrel of six anti-parallel strands. These β-strands are connected by extensive loops that have varying lengths and are exposed to the surface [8,10]. Analysis of nonsynonymous mutations of HuNoV reveals hypervariability in common structural surface-exposed residues of the P2 domain as a result of immune-driven selection. In contrast, sites corresponding to HBGA-binding targets (343, 344, and 345) are restricted [13,14]. A total of 13 phosphorylation sites (S, T, or Y) were found in this region: one in the hinge region (224T); two in P1-1 (240S, 251T); eight in P2 (300T, 359T, 364S, 368S, 369T, 399T, 393T, and 394T); and two in P1-2 (425T and 462Y).
Of these, 393T and 394T were involved in HBGA binding, 171S has an epistatic interaction with MEIS 155, 179T has an epistatic interaction with MEIS 130, and notably, 130T itself is a MEIS. In addition, 425T has an epistatic interaction with MEIS 475. The co-evolution analysis in the current study depicted a MEIS axis composed of seven MEISs: two positions in the S domain (130 and 155), three positions in the P2 subdomain (305, 317, and 329), and two positions in the P1-2 subdomain (475 and 540). Position 540 plays the role of a crowbar: mutations at this site impact at least seven other sites. These mutations also have direct or indirect impacts on the HBGA-binding target. Thus, the MEIS interaction axis may have essential roles in triggering conformational changes, and maintaining the balance between receptor binding function and immune escape function. This study established a powerful MS-based system (LC/MS(E)) for identifying noroviruses. The system does not require viral genomic amplification or culture. Use of the system in a virological laboratory enables sensitive and efficient detection of mutations, monitoring of genotypes, and identification of protein structural/functional key sites according to VP1 peptide sequencing. Epistatic interaction and phosphorylation analysis revealed the impact of position 540 on the co-evolutionary network, the direct involvement of one phosphorylated site (MEIS 130) in this MEIS axis, and the possible involvement of phosphorylation of two residues (393T and 394T) in HBGA binding.
Norovirus gastroenteritis Human norovirus transmission and evolution in a changing world Global Trends in Norovirus Genotype Distribution among Children with Acute Gastroenteritis. Emerg. Infect. Dis.
2021 Noroviruses: A comprehensive review Global prevalence of norovirus in cases of gastroenteritis: A systematic review and meta-analysis Updated classification of norovirus genogroups and genotypes Structural Requirements for the Assembly of Norwalk Virus-Like Particles Norovirus classification and proposed strain nomenclature X-ray Crystallographic Structure of the Norwalk Virus Capsid Association of Histo-Blood Group Antigens and Susceptibility to Norovirus Infections Conservation of Carbohydrate Binding Interfaces-Evidence of Human HBGA Selection in Norovirus Evolution Linear B-Cell Epitopes in Human Norovirus GII.4 Capsid Protein Elicit Blockade Antibodies. Vaccines Atomic resolution structural characterization of recognition of his-to-blood group antigens by Norwalk virus Norovirus Immunity and the Great Escape Replication of human noroviruses in stem cell-derived human enteroids A guide to utilization of the microbiology laboratory for diagnosis of infectious diseases: 2013 recommendations by the Infectious Diseases Society of America (IDSA) and the Advances in Laboratory Methods for Detection and Typing of Norovirus Mass spectrometric based detection of protein nucleotidylation in the RNA polymerase of SARS-CoV-2 Elemental Mass Spectrometry and Fluorescence Dual-Mode Strategy for Ultrasensitive Label-Free Detection of HBV DNA Characterization of adeno-associated virus capsid proteins with two types of VP3 related components by capillary gel electrophoresis and mass spectrometry A post-translational modification of human Norovirus capsid protein attenuates glycan binding Mass Spectrometric Analysis of Urine from COVID-19 Patients for Detection of SARS-CoV-2 Viral Antigen and to Study Host Response Identification of Candidate Protein Biomarkers for CIN2+ Lesions from Self-Sampled, Dried Cervico-Vaginal Fluid Using LC-MS/MS. 
Cancers Divergent evolution of norovirus GII/4 by genome recombination from Drift time-specific collision energies enable deep-coverage data-independent acquisition proteomics Challenges and strategies for targeted phosphorylation site identification and quantification using mass spectrometry analysis Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods Maximum-Likelihood Analysis Using TREE-PUZZLE. Current protocols in bioinformatics Rapid detection of co-evolving sites using Bayesian graphical models Likelihood-mapping: A simple method to visualize phylogenetic content of a sequence alignment Detection of Norovirus Capsid Protein in Authentic Standards and in Stool Extracts by Matrix-Assisted Laser Desorption Ionization and Nanospray Mass Spectrometry Advances in Human Norovirus Vaccine Research. Vaccines 2021 Non-encapsidation activities of the capsid proteins of positive-strand RNA viruses Phosphorylation coexists with O-GlcNAcylation in a plant virus protein and influences viral infection Norwalk Virus Minor Capsid Protein VP2 Associates within the VP1 Shell Domain