key: cord-335891-j78pnwgk
authors: Piñana, Maria; Vila, Jorgina; Maldonado, Carolina; Galano-Frutos, Juan José; Valls, Maria; Sancho, Javier; Nuvials, Francesc Xavier; Andrés, Cristina; Martín-Gómez, María Teresa; Esperalba, Juliana; Codina, Maria Gema; Pumarola, Tomàs; Antón, Andrés
title: Insights into immune evasion of human metapneumovirus: novel 180- and 111-nucleotide duplications within viral G gene throughout 2014-2017 seasons in Barcelona, Spain
date: 2020-08-11
journal: J Clin Virol
DOI: 10.1016/j.jcv.2020.104590
sha:
doc_id: 335891
cord_uid: j78pnwgk

BACKGROUND: Human metapneumovirus (HMPV) is an important etiologic agent of respiratory tract infection (RTI). This study aimed to describe its genetic diversity and clinical impact in patients attended at a tertiary university hospital in Barcelona from the 2014-2015 to the 2016-2017 seasons, focusing on the emerging duplications in the G gene and their structural properties.

METHODS: Laboratory-confirmed HMPV were characterized based on partial coding F and G gene sequences with MEGA.v6.0. Computational analyses of disorder propensity, aggregation propensity and glycosylation sites in the predicted viral G protein sequence were carried out. Clinical data were retrospectively reviewed and related to virological features.

RESULTS: HMPV prevalence was 3%. The 180- and 111-nucleotide duplications found in the A2c lineage G protein increased in prevalence throughout the study, in addition to short genetic changes observed in other HMPV lineages. The A2c G protein without duplications was calculated to protrude over the F protein in 23% of simulated conformations, and this increased to 39% and 46% with the 111- and 180-nucleotide duplications, respectively. Children did not seem to be more affected by these mutant viruses, but there was a strong association of these variants with LRTI in adults.

DISCUSSION: HMPV presents a high genetic diversity in all lineages.
Novel variants carrying duplications might present an evolutionary advantage due to an improved steric shielding, which would have been responsible for the reported increasing prevalence and the association to LRTI in adults.

Keywords: Human metapneumovirus, duplication, steric shielding, epidemiology, genetic diversity, clinical impact.
The fusion (F) and the attachment (G) proteins are the major envelope glycoproteins. F protein is the major cross-protective antigenic determinant and is highly conserved between genotypes (88%) [4]. Hence, it is the main target for most vaccine strategies under development [6]. In contrast, G protein is weakly immunogenic [7], with 28% genetic divergence between genotypes and 74-82% intra-genotype [4]. In addition, 180- and 111-nucleotide duplications have recently been described in the G protein's ectodomain [8-11]. The aims of this study were to describe the circulation pattern, genetic diversity and clinical impact of HMPV in the paediatric and adult populations attended at a tertiary university hospital in Barcelona from the 2014-2015 to the 2016-2017 seasons, focusing on the emergence and spread of variants carrying these two nucleotide duplications.

From October 2014 to May 2017, respiratory specimens (nasopharyngeal aspirates, nasal and pharyngeal swabs, bronchoaspirates, bronchoalveolar, bronchoselective and tracheal washes, and sputum) were received for laboratory confirmation of respiratory viruses from children and adults attended at the Hospital Universitari Vall d'Hebron with suspected respiratory tract infection (RTI). Institutional Review Board

The propensity of three G sequences, with and without nucleotide duplications, to adopt disordered conformations was analysed using the MetaDisorder server [14], and their propensity to aggregate using the PASTA 2.0 server [15]. Potential N- and O-glycosylation sites were predicted using the NetNGlyc 1.0 [16] and NetOGlyc 4.0 [17] servers, respectively. Ensembles consisting of 2,000 unfolded conformations were generated for each of the three G sequences using the ProtSA server [18].
The PDB file of each conformation was analysed to compute the distance between the N atom of the first extracellular residue (Asn52) and the most distant atom, as well as the radius of gyration of that conformation.

Demographic (age and sex) and clinical features (URTI/LRTI, co-morbidities, coinfections, antibiotic use, need for, type and length of respiratory support, length of hospital stay, ICU admission or death) of laboratory-confirmed HMPV cases were retrospectively reviewed from medical records and related to viral features. Patients included in the demographic study were those with a clinical presentation of URTI or LRTI, whilst patients with non-respiratory symptoms were excluded. For the severity study, only patients hospitalised due to LRTI were included; cases with symptoms other than LRTI, and hospitalisations for other clinical reasons even though the patient manifested RTI, were excluded.

Data were analysed with R software v3.5.1. For categorical data, Chi-squared or Fisher's exact tests were performed. For numerical variables, Student's t, Mann-Whitney, ANOVA or Kruskal-Wallis tests were performed as appropriate. Statistical significance was set at p < 0.05.

A total of 20,132 samples from 14,769 patients were tested, of which 9,370 (47%) were laboratory-confirmed for at least one respiratory virus. Although they occurred in similar periods, HRSV and influenza epidemics varied between seasons (Figure 1A; Table 1). Regarding the methods for detection, from all these 14,769 patients, 3,316 samples (22%) were tested with the immunofluorescence assay, while 8,483 (57%) were tested with the real-time RT-PCR multiplex assays. Importantly, 656 (4%) samples that tested negative by the immunofluorescence assay were re-tested with the real-time RT-PCR multiplex assays. The remaining samples had been tested with a rapid test.
The use of immunofluorescence and PCR-based assays changed throughout the study period, decreasing for immunofluorescence assays and increasing for molecular (11, 13%), HRSV (6, 7%), human coronavirus (HCoV) 229E (5, 6%), HCoV OC43 (5, 6%), HCoV NL63 (2, 2%), human parainfluenza 3 (1, 1%), influenza A (1, 1%) and B (1, 1%).

Weekly distribution of HMPV showed higher circulation from February to April in the first two seasons, whereas circulation started in mid-December in the 2016-2017 season (Figure 1B). The incidence peaks of the first two seasons were in March, but the last season presented a pattern with two peaks.

Phylogenetic analyses of HMPV F and G sequences from 387 strains revealed that (Table 3). Phylogenetic analyses of F (382, 94%) and G (365, 90%) sequences (Figure 2) showed congruent results. Overall, 11 (3%) samples belonged to A2a, 37 (10%) to A2b, 153 (40%) to A2c, 106 (27%) to B1 and 79 (20%) to B2 (Table 3). Genetic characterisation of A2 G revealed that A2a and A2c sequences generally had a length of 220 aa, and A2b of 218 aa due to premature stop codons. Genetic characterisation of the 153 A2c strains revealed the presence of the novel 180-nucleotide (A2c180dup; 46; 30%) and 111-nucleotide (A2c111dup; 13; 9%) duplications in the G ectodomain, with increasing prevalence (Table 3). While all A2c180dup clustered together, two subgroups could be observed in the F phylogenetic tree (Figure 2). In contrast, A2c111dup G clustered into 2 groups but their F genes clustered together (except NSVH2017-09-82477). B1 G clustered into two phylogenetic groups (I and II), differing in the acquisition of a premature stop codon at amino acid position 232 (relative to KU375606) in all strains belonging to group II (Figure 2) but one (NSVH2015-12-87728). In addition, two sequences

The sequences of the present study were submitted to GenBank (MN617398-MN617753).
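The structural analysis described in the Methods (the distance from the first extracellular N atom to the most distant atom, the radius of gyration, and the share of conformations extending beyond the 13 nm height of the F trimer) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the assumption that the anchor atom is the first coordinate in each conformation, the unweighted (not mass-weighted) radius of gyration, and the toy coordinates are all hypothetical choices made for illustration. It also assumes that the maximum distance from the anchor atom can be read directly as protrusion from the membrane.

```python
import math

# PDB coordinates are expressed in angstroms, so 13 nm corresponds to 130 A.
F_TRIMER_HEIGHT_ANGSTROM = 130.0


def max_extension(coords, anchor_index=0):
    """Largest distance from the anchor atom (assumed here to be the N atom
    of the first extracellular residue) to any atom in the conformation."""
    anchor = coords[anchor_index]
    return max(math.dist(anchor, atom) for atom in coords)


def radius_of_gyration(coords):
    """Unweighted radius of gyration: root-mean-square distance of the atoms
    from their geometric centre (atomic masses are ignored in this sketch)."""
    n = len(coords)
    centre = tuple(sum(atom[i] for atom in coords) / n for i in range(3))
    return math.sqrt(sum(math.dist(atom, centre) ** 2 for atom in coords) / n)


def fraction_protruding(ensemble, threshold=F_TRIMER_HEIGHT_ANGSTROM):
    """Share of conformations whose maximum extension exceeds the threshold."""
    hits = sum(1 for coords in ensemble if max_extension(coords) > threshold)
    return hits / len(ensemble)


# Toy ensemble of four two-atom "conformations" (hypothetical coordinates).
toy_ensemble = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 150.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 100.0)],
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 131.0)],
]
toy_fraction = fraction_protruding(toy_ensemble)
```

Applied to a real ensemble, each conformation would be the list of atom coordinates read from one PDB file, and `fraction_protruding` would yield a figure comparable in kind to the percentages reported for A2cwt, A2c111dup and A2c180dup.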
The ectodomain ensemble of the non-glycosylated form of G protein of NSVH2015-06-62150 (A2cwt) was simulated [18-20]. The pre-fusion conformation of the F trimer is calculated to protrude 13 nm [21]. According to the distance distributions of the three ensembles, the fraction of G protein ectodomains protruding more than 13 nm from the membrane amounts to 23% in A2cwt, and increases to 39% in A2c111dup and 46% in A2c180dup.

Due to the absence of clinical information (2), non-amplification (20) or manifestation of syndromes other than URTI or LRTI (20), clinical features of 203 paediatric and 162 adult cases were finally studied. For the severity study, only patients hospitalised due to LRTI (176) were considered: 116 (66%) paediatric (Table 5) and 60 (34%) adult patients (Table 6). Children infected with A2 were more likely to be admitted to the ICU (OR 5.14, 95% CI 1.06-24.95, p = 0.031). No other variables were found to be significant.

This study reports recent data on prevalence, genetic diversity, structural biology of G protein and clinical features of HMPV in Barcelona, Spain. The positivity rate of HMPV was similar to recent reports [4, 22, 23]. Interestingly, the prevalence in adults was similar to or even higher than in children, which emphasizes the importance of HMPV in adults. HMPV prevalence increased throughout the three seasons, probably due to the wider implementation of molecular methods, though it may still be underestimated, as a large number of samples positive for HRSV or influenza by rapid assay were not tested for other respiratory viruses. Most coinfections were with rhinoviruses, adenoviruses and bocaviruses, as previously reported [23, 24]. HMPV presented a clear seasonality, as previously described [2, 6, 24]. Interestingly, the last season presented a different pattern, showing two different peaks in one epidemic season without changes among circulating genotypes.
Interestingly, the prevalence of HMPV was higher outside the HRSV/influenza epidemics in the first season but did not vary in the second and third seasons. This could be due to the wider implementation of PCR-based assays at the expense of immunofluorescence assays. Moreover, the great majority of HMPV laboratory-confirmed samples during these epidemics had previously tested negative by rapid assays, which suggests that many more samples might be positive for HMPV but go untested because of an HRSV- or influenza-positive result when HMPV circulation coincides with these epidemics. Thus, HMPV prevalence could be underestimated because this virus is not sought when samples are laboratory-confirmed for HRSV or influenza during the epidemics.

Genetic characterisation revealed that both genotypes co-circulated with a shift in predominance, as expected [2]. However, there was an unpredicted co-predominance

Congruent classification of both F and G genes was expected, as no genetic recombination has been described for HMPV. All subgenotypes were detected except A1, suggesting that it has become extinct and been replaced by A2, in agreement with previous studies [3]. According to the data of the present study, the A2c lineage appears to be replacing A2a and A2b. Moreover, A2c strains with duplications might replace A2cwt in the near future, as they might present an improved mechanism of immune evasion. In fact, a group in Japan observed that A2c111dup had totally replaced the rest of A2 strains [25], including A2c180dup. Interestingly, our group has observed that A2c111dup and A2c180dup have together replaced the rest of HMPV-A viruses, the latter being more prevalent [26]. Different lengths of G protein have been observed due to premature stop codons, as previously described [3].
A2b and A2c lineages included viruses with G proteins of 218 and 220 aa, respectively, and two different genetic groups (I and II), which differ by 10 aa in length and might evolve into novel lineages, could be distinguished within the B1 subgenotype. Nucleotide duplications can also lengthen the G amino acid sequence, such as the long duplications in A2c and the short duplications in B2. For B2 viruses, KE duplications or KER variants should be monitored in coming seasons to reveal whether they confer an evolutionary advantage. The deletions observed do not seem to have been fixed in the viral population.

Once A2c111dup and A2c180dup had been described, one of the aims was to study their structural properties. G is heavily glycosylated [21], and the emergence of duplications increases the number of potential glycosylation sites. Although it is a very disordered protein that seems to adopt numerous random conformations, a representative ensemble of these conformations could be generated. This prediction suggests that both A2c180dup and A2c111dup proteins protrude more than A2cwt. This finding supports the hypothesis of Leyrat [21], who suggested that G protein has a shielding function towards F protein, masking its antigenic epitopes. It also supports the hypothesis that these novel long duplications enhance this immune evasion mechanism by hiding F epitopes more efficiently [8].

Sequences of the newly described A2c lineage [3, 4] were compared to sequences of the previously described A2b1 and A2b2 sublineages [27, 28] and clustered together; that is to say, the A2b and A2c lineages are exactly the same as A2b1 and A2b2, respectively. This inconsistency between the genetic classifications used in several articles highlights the urgent need for an official classification, as well as universal criteria to define new genotypes or lineages.

Furthermore, clinical impact was also assessed.
As reported in the literature [2], LRTI is more common in children under 2 and adults over 65. Moreover, in adults the odds of suffering LRTI increase 1.03-fold with each additional year of age. The presence of chronic medical conditions such as cardiopathy, which are more frequent in the elderly, may be responsible for this, so HMPV should be closely monitored in these cases. Comorbidities are also associated with LRTI in children, especially respiratory comorbidities and immunodepression. In this study, prematurity and cardiopathies were not associated with a higher risk of developing LRTI in children, in contrast to previous studies [29-32].

Paediatric and adult patients received more antibiotic treatment when manifesting LRTI than URTI. However, only 8% of children and 30% of adults treated with antibiotics had a positive bacterial culture. Hence, antibiotic over-prescription is still observed.

Regarding infections by A2c, children seemed to be as affected by A2c with duplications as by A2cwt or other lineages, probably because it is a primary infection. In contrast, A2c with duplications was more strongly associated with LRTI in adults than A2cwt or other lineages. Although adults should have an efficient immune response [6], their odds of manifesting LRTI are 3.45 times higher when infected by A2c with duplication than by A2cwt. This suggests that it might be acting as a primary infection, which supports the hypothesis of G protein's steric shielding over F protein. HMPV is known for its many immune evasion strategies, so this could be a new mechanism developed in recent years [33, 34]. It could not be demonstrated, in either children or adults, that strains with duplications cause more severe disease.
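To make the odds-ratio figures above easier to interpret, the following minimal Python sketch shows how an odds ratio and its 95% confidence interval can be derived from a 2x2 table using the Woolf (log) method, and how a per-year odds ratio compounds over several years. It is an illustration only. The source states that R v3.5.1 was used but does not describe how its odds ratios or intervals were computed, the counts below are hypothetical rather than study data, and the compounding step assumes the 1.03 per-year figure behaves like a logistic regression coefficient.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-method) confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def compounded_odds_ratio(per_unit_or, units):
    """Odds ratio across several units (e.g. years), assuming a constant
    multiplicative effect per unit as in a logistic model."""
    return per_unit_or ** units


# Hypothetical counts (NOT taken from the study): LRTI vs no LRTI in two groups.
example_or, example_low, example_high = odds_ratio_ci(20, 10, 10, 20)

# A per-year odds ratio of 1.03, compounded over a 10-year age difference.
ten_year_or = compounded_odds_ratio(1.03, 10)
```

Under these assumptions, a per-year odds ratio of 1.03 corresponds to roughly 1.34-fold higher odds of LRTI across a 10-year age difference, which shows how a seemingly small yearly effect accumulates in the elderly.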
The increasing prevalence of viral variants carrying a duplication in the ectodomain of the G protein throughout the study period, the association of A2c111dup and A2c180dup with more severe disease in adults, and the prediction of an enhanced steric shielding by the G protein, masking antigenic epitopes of the F protein, suggest that these duplications might confer an evolutionary advantage by contributing to immune evasion during infection. This mechanism would be similar to that described for other viruses, which have been reported to evade the immune response thanks to the glycosylation present in their envelopes [35]. Given that F protein is the main target for most vaccine strategies currently under development, the fact that it could be masked by G should be taken into account.

Genetic diversity and molecular evolution of the major human metapneumovirus surface glycoproteins over a decade
Novel human metapneumovirus with a 180-nucleotide duplication in the G gene
A novel human metapneumovirus carrying a 111-nucleotide duplication within the G gene detected at a tertiary university hospital in Catalonia since the 2015-2016 season
180-nucleotide duplication in the G gene of human metapneumovirus A2b subgroup strains circulating in Yokohama city
A novel 111-nucleotide duplication in the G gene of human metapneumovirus
MEGA6: Molecular evolutionary genetics analysis version 6.0
ALTER: Program-oriented conversion of DNA and protein alignments
MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins
PASTA 2.0: An improved server for protein aggregation prediction
Prediction of N-glycosylation sites in human proteins
Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology
ProtSA: A web application for calculating sequence specific protein solvent accessibilities in the unfolded ensemble
A structural model for unfolded proteins from residual dipolar couplings and small-angle x-ray scattering
Sequence-specific solvent accessibilities of protein residues in unfolded protein ensembles
Structural insights into the human metapneumovirus glycoprotein ectodomain
Human metapneumovirus: Insights from a ten-year molecular and epidemiological analysis in Germany
Clinical and genetic features of human metapneumovirus infections in children
Epidemiological and clinical features of human metapneumovirus in hospitalised paediatric patients with acute respiratory illness: a cross-sectional study in Southern China
Predominant detection of the subgroup A2b human metapneumovirus strain with 111-nucleotide duplication in Yokohama City
Human metapneumovirus: are the new duplications within the G gene responsible for doubling its prevalence?, in: 21st Congr
Increase human metapneumovirus mediated morbidity following pandemic influenza infection
Human metapneumovirus prevalence and molecular epidemiology in respiratory outbreaks in Ontario, Canada
Human metapneumovirus infection is associated with severe respiratory disease in preschool children with history of prematurity
Comparison of risk factors for human metapneumovirus and respiratory syncytial virus disease severity in young children
Ampofo, Incidence, morbidity, and costs of human metapneumovirus infection in hospitalized children
Modulation of host immunity by the human metapneumovirus
Human metapneumovirus antagonism of innate immune responses
The Secret Life of Viral Entry Glycoproteins: Moonlighting in Immune Evasion

This study was supported by the Spanish Ministry of Economy and Competitiveness (grants BFU2016-78232-P), Instituto de Salud Carlos III and by the European Regional Development Fund, through the Interreg V-A programme: POCTEFA 2014-2020 (grant Pirepred EFA086/15). It was also co-financed by the European Development Regional Fund (ERDF) "A way to achieve Europe", Spanish Network for Research in Infectious

The authors declare no conflicts of interest.