key: cord-0284933-m2q81ps8
authors: Sartor, I. T. S.; Varela, F. H.; Meireles, M. R.; Kern, L. B.; Azevedo, T. R.; Giannini, G. L. T.; da Silva, M. S.; Demoliner, M.; Gularte, J. S.; de Almeida, P. R.; Fleck, J. D.; Zavaglia, G. O.; Fernandes, I. R.; de David, C. N.; Santos, A. P.; de Almeida, W. A. F.; Porto, V. B. G.; Scotta, M. C.; Spilki, F. R.; Vieira, G. F.; Stein, R. T.; Polese-Bonatto, M.
title: Y380Q novel mutation in receptor-binding domain of SARS-CoV-2 spike protein together with C379W interfere in the neutralizing antibodies interaction
date: 2021-09-14
journal: nan
DOI: 10.1101/2021.09.10.21262695
sha: 2d7114697d131902e81d4b7bd82df2d9d032a896
doc_id: 284933
cord_uid: m2q81ps8
Background: The emergence of SARS-CoV-2 variants is a current public health concern, possibly impacting COVID-19 diagnosis, transmission patterns and vaccine effectiveness.
Objectives: To describe the SARS-CoV-2 lineages circulating early in the pandemic among samples with S gene dropout and to characterize a novel mutation in the receptor-binding domain (RBD) of the viral spike protein.
Study design: Adults and children older than 2 months with signs and symptoms of COVID-19 were prospectively enrolled from May to October 2020 in Porto Alegre, Brazil. All participants underwent RT-PCR assays for SARS-CoV-2 diagnosis; samples with S gene dropout and cycle threshold (Ct) < 30 were submitted to whole genome sequencing (WGS), and homology modeling and physicochemical property analyses were performed.
Results: 484/1,557 participants tested positive for SARS-CoV-2. S gene dropout was detected in 7.4% (36/484) of positive samples, as early as May, with a peak in early August. WGS was performed in 8 samples. The B.1.1.28, B.1.91 and B.1.1.33 lineages were circulating early in the pandemic. The novel RBD mutation (Y380Q) was found in one sample, occurring simultaneously with C379W and V395A, and the B.1.91 lineage in the spike protein.
Conclusion: Mutations in the SARS-CoV-2 spike region were detected early in the COVID-19 pandemic in Southern Brazil, in the B.1.1.28, B.1.91 and B.1.1.33 lineages identified. The novel mutation (Y380Q), together with C379W, modifies important RBD properties, which may interfere with the binding of neutralizing antibodies (CR3022, EY6A, H014, S304).
SARS-CoV-2 is a single-stranded RNA virus with high mutation rates. Strategies to mitigate the pandemic include knowledge of the viral genome and its expected mutations. These features could impact disease severity, virus transmission, and vaccine strategies [1-3]. As the COVID-19 pandemic evolves, there has been concern about the emergence of new SARS-CoV-2 mutations in the receptor-binding domain (RBD) of the S region, due to probable effects on both virus transmissibility and the generation of mutants that escape antibodies previously raised against heterologous lineages and vaccines [4]. Genetic alterations in the RBD of SARS-CoV-2 may increase the binding affinity of the virus for host cells, and these changes may lead to higher transmission rates [5, 6]. The binding affinity between the S protein and ACE2 makes this region a key target for potential therapies and diagnostics [7]. COVID-19 molecular diagnostic tests directed at the S gene use it as one of several RT-PCR target regions. Our aim was to measure the prevalence of S dropout and to characterize the SARS-CoV-2 mutations in the RBD region in a cohort during the early pandemic.
A prospective cohort study enrolled adults and children seeking care at emergency rooms or outpatient clinics, or hospitalized in general wards or intensive care units (ICU), at Hospital Moinhos de Vento and Hospital Restinga e Extremo Sul, both in Porto Alegre, Brazil. Participants presenting signs or symptoms suggestive of COVID-19 (cough, fever, or sore throat) were included from May to early October 2020.
The key exclusion criteria were a negative SARS-CoV-2 RT-PCR result or failure of sample collection. The study was performed in accordance with Decree 466/12 of the National Health Council [8] and Clinical Practice Guidelines, after approval by the Hospital Moinhos de Vento IRB (nº 4.637.933). All participants included in this study provided written informed consent.
medRxiv preprint, this version posted September 14, 2021; https://doi.org/10.1101/2021.09.10.21262695. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
All participants underwent a qualitative RT-PCR assay for SARS-CoV-2 detection, as described elsewhere [9]. Additionally, S gene dropout samples with a cycle threshold below 30 (Ct < 30.0) were submitted to whole genome sequencing (WGS) on the Illumina MiSeq high-throughput platform. RNA was extracted from oropharyngeal swab samples and the RT reaction was performed using SuperScript IV reverse transcriptase (Thermo Fisher Scientific). Library preparation was conducted using the QIAseq SARS-CoV-2 Primer Panel, paired for library enrichment with the QIAseq FX DNA Library UDI Kit, according to the manufacturer's instructions (Qiagen, Hilden, Germany). The MiSeq Reagent Kit v3 (600-cycle) was used for sequencing. FASTQ reads were imported into Geneious Prime, trimmed (BBDuk 37.25), and mapped against the reference sequence hCoV-19/Wuhan/WIV04/2019 (EPI_ISL_402124), available in the EpiCoV database from GISAID [10]. Complete genome alignment was performed using Clustal Omega with the sequences generated, fifty-nine Brazilian SARS-CoV-2 complete genomes (>29 kb) and the reference sequence (EPI_ISL_402124), the latter two retrieved from the GISAID database.
Maximum Likelihood phylogenetic analysis was performed in MEGA X under the General Time Reversible model, allowing for a proportion of invariable sites and substitution rates, applying 200 replicates and 1000 bootstrap.
Wild-type (Y380) and mutated (Q380) spike protein sequences were submitted to Bepipred 1.0 and 2.0 to detect putative humoral epitopes through HMM and Random Forest algorithms [11, 12]. To increase sensitivity, we set thresholds of -0.2 (Bepipred 1.0) and 0.45 (Bepipred 2.0). The search in the Immune Epitope Database (IEDB) considered T-cell epitopes for the SARS-CoV-2 spike protein (region of 10 residues flanking Y380Q) with 70% similarity in BLAST. Potential binder sequences of representative supertypes MHC-I alleles [...]
For wild-type template selection, the SARS-CoV-2 surface glycoprotein sequence (NCBI accession number: YP_009724390) was submitted to the BLAST and SwissModel tools. Template crystal candidates were evaluated by GMQE, QMEAN, Z-score, and residue distribution in the Ramachandran plot, using ERRAT, PROCHECK, PDBsum, ModFold and SwissModel [15-18]. Protein Data Bank (PDB) entry 7CWL (3.8 Å) was chosen for the approach. Phyre2 was employed for homology modeling, using the expert mode (one-to-one threading job) to construct models based on the wild-type protein (P0DTC2) and mutated sequences [19]. Electrostatic potential (EP) was verified through Delphi web server calculations and PIPSA [20, 21]. Residue exposure characteristics, such as hydrophobicity and the Accessible Solvent Surface Area (aSAS), were estimated using the Chimera interface [22].
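The per-residue hydrophobicity comparison between wild-type and mutated residues can be illustrated with a simple lookup. The sketch below is not the authors' Chimera workflow. It assumes the Kyte-Doolittle scale: the source does not name the scale, although the per-residue values it reports in the Results (for example 2.5 for C and 4.2 for V) coincide with it. The helper names `parse_mutation` and `hydrophobicity_shift` are hypothetical.

```python
import re

# Kyte-Doolittle hydropathy index (positive = hydrophobic, negative = hydrophilic).
# Assumption: this is the scale behind the values quoted in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def parse_mutation(label):
    """Split a label such as 'C379W' into (wild residue, position, mutant residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild, position, mutant = match.groups()
    return wild, int(position), mutant


def hydrophobicity_shift(label):
    """Return (wild value, mutant value) on the Kyte-Doolittle scale."""
    wild, _, mutant = parse_mutation(label)
    return KYTE_DOOLITTLE[wild], KYTE_DOOLITTLE[mutant]


def describe(value):
    """Label a hydropathy value by its sign."""
    return "hydrophobic" if value > 0 else "hydrophilic"


if __name__ == "__main__":
    for label in ["D614G", "C379W", "Y380Q", "V395A"]:
        before, after = hydrophobicity_shift(label)
        print(f"{label}: {before:+.1f} ({describe(before)}) -> "
              f"{after:+.1f} ({describe(after)})")
```

Under this assumption, a substitution can change the sign of the property (a change of direction) or only its magnitude (a change of intensity); solvent exposure, which the authors also estimated, is not modeled here.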
Crystal complexes of the RBD region with antibodies were retrieved from the PDB and examined to obtain experimental information on structural binding regions. The LigPlot program was applied to infer protein-antibody interaction sites [23]. Hydrogen bonds (H-bonds) between the RBD and antibodies were calculated in the Chimera interface.
Data normality assumptions were verified for continuous variables, and median values and interquartile ranges (IQR) were calculated. Pearson's Chi-square test was used to compare the proportions of identified and undetermined results for the S gene target by epidemiological week; Fisher's exact test was used to compare the frequencies of S dropout between outpatient and inpatient populations. All analyses were performed in R 3.5.0 statistical software [24].
A total of 1,557 participants were screened and 484 tested positive for SARS-CoV-2 (Supplementary Figure 1). Of these, 98 (20.2%) subjects were hospitalized and 386 (79.8%) were seen as outpatients only. S dropout was characterized as undetermined RT-PCR values for the S gene target together with detected values for the ORF1ab and N target probes. We observed a total S dropout of 36/484 [...] epidemiological weeks. S gene dropout was detected in 12/98 hospitalized subjects (12.2%), whereas for outpatients the frequency was 24/386 (6.2%), with an OR (95% CI) of 2.10 (0.92-4.57, P = 0.052).
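The inpatient versus outpatient comparison can be approximately reproduced from the counts given above. The following Python sketch (the authors used R 3.5.0; the function names here are illustrative) computes the sample odds ratio and a two-sided Fisher's exact p-value from the 2x2 table. R's `fisher.test` reports a conditional maximum-likelihood odds ratio with an exact confidence interval, so its estimate can differ slightly from the simple cross-product ratio used here, and the confidence interval is not recomputed.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: the summed hypergeometric probability
    of every table with the same margins that is no more likely than the
    observed one."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Counts from the text: 12 of 98 inpatients and 24 of 386 outpatients had S dropout.
dropout_in, total_in = 12, 98
dropout_out, total_out = 24, 386
table = (dropout_in, total_in - dropout_in,
         dropout_out, total_out - dropout_out)

if __name__ == "__main__":
    print(f"odds ratio = {odds_ratio(*table):.2f}")
    print(f"Fisher two-sided p = {fisher_exact_two_sided(*table):.3f}")
```

The table is laid out with hospitalization status in rows and dropout status in columns, so the odds ratio reads as the odds of S dropout among inpatients relative to outpatients.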
Wild-type C379 and Y380 residues of the RBD region were predicted as part of B-cell epitopes. NetMHCpan 4.1 returned seven binding sequences involving these sites, and two of them were also described as epitopes in IEDB positive T-cell assays: SASFSTFKCY (for the HLA-A*01:01 allele) and KCYGVSPTK (for the HLA-A*03:01 allele). The wild-type sequence SASFSTFKCY, predicted as a weak binder, becomes a non-binder when mutated (SASFSTFKWQ). The wild-type strong binder sequence KCYGVSPTK becomes a weak binder (KWQGVSPTK).
The location of the identified mutations in the spike protein is depicted in Figure 3A. Structural analysis revealed EP modifications (orange rectangle, Figure 3B). These results are even more evident when examining the surface charge distribution: the wild-type D614 and D839 residues are negatively charged (red pattern, Figure 3B), and this pattern is disrupted in the mutated G614 and Y839 residues, while the surface EP distribution and model conformation show more subtle modifications for the RBD region variants. Amino acid substitutions revealed alterations in hydrophobicity, either by changing the direction of this property (from hydrophobic to hydrophilic and vice versa) or its intensity.
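The wild-type and mutated peptides compared in the MHC-I binding analysis above differ only at positions 379 and 380. As a minimal sketch (not the authors' pipeline), the mutated sequences can be derived by mapping spike positions onto each peptide. The start positions 371 and 378 are inferred from where C379 and Y380 fall in the quoted peptides, and `apply_substitutions` is a hypothetical helper.

```python
def apply_substitutions(peptide, start, substitutions):
    """Apply substitutions such as ('C', 379, 'W') to a peptide whose first
    residue sits at spike position `start`. Substitutions that fall outside
    the peptide are skipped."""
    residues = list(peptide)
    for wild, position, mutant in substitutions:
        index = position - start
        if not 0 <= index < len(residues):
            continue  # this substitution does not fall inside the peptide
        if residues[index] != wild:
            raise ValueError(
                f"expected {wild} at position {position}, found {residues[index]}"
            )
        residues[index] = mutant
    return "".join(residues)


# The two neighboring RBD substitutions discussed in the text.
RBD_SUBSTITUTIONS = [("C", 379, "W"), ("Y", 380, "Q")]

# Peptide -> assumed spike position of its first residue.
WILD_PEPTIDES = {"SASFSTFKCY": 371, "KCYGVSPTK": 378}

if __name__ == "__main__":
    for peptide, start in WILD_PEPTIDES.items():
        mutated = apply_substitutions(peptide, start, RBD_SUBSTITUTIONS)
        print(peptide, "->", mutated)
```

The resulting strings would then be submitted to a binding predictor such as NetMHCpan; the binding predictions themselves are not reproduced here.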
In general, the substitutions observed in the B.1.91 lineage make their regions more hydrophobic, while the RBD mutations make theirs more hydrophilic; negative and positive values (blue rectangle, Figure 3B) denote hydrophilic and hydrophobic patterns, respectively. Substitutions from wild-type residues D614 (-3.5) and D839 (-3.5) to G614 (-0.4) and [...] mutation changed the direction from highly hydrophobic (2.5) to hydrophilic (-0.9). Altered residues in the RBD region, especially the C379W and Y380Q mutations, are located close to each other and likely reinforce each other, thus producing an overall shift toward a hydrophilic profile. The changing potential of these two neighboring mutations gave the buried A395 a more hydrophilic profile compared with the ancestor (from 4.2 to 1.8).
[...] wild type (ancestor). Both models (B.1.91 lineage and RBD mutations) are similar, with ED = 0.18, and diverge from the ancestor. Below the epogram, the divergences in electrostatic surface distribution are shown for the wild-type and mutated residues. The color scale ranges from red (more electronegative residues, -5) to blue (more electropositive, +5), passing through neutral (white, 0). The hydrophobicity scale varies from more hydrophobic residues [...]
A structural investigation of more than 20 crystal structures of the viral spike protein from the PDB revealed that the C379W, Y380Q and V395A mutations lie in an RBD contact area complexed with antibodies.
In four crystals, Fab (fragment antigen-binding) antibodies are in contact with RBD residues 379 and 380. Figure 4B shows the CR3022 human antibody complexed with the RBD region, in contact with the mutation sites. We observed the same when evaluating the S2H97, EY6A and S304 antibodies. H-bonds are important non-covalent interactions that can assist in protein residue binding, especially in stabilizing antibody-antigen interactions [26]. The mutated Q380 residue disrupts the H-bond previously observed between the wild-type Y380 and the S99 residue of the CR3022 neutralizing antibody, while the W379 alteration abolishes the H-bond between the wild-type C379 and the T94 residue of the EY6A neutralizing antibody. The mutated W379 also affects the neighboring H-bond between G381 and the Y92 residue of the EY6A antibody (Supplementary Table 1).
Our results suggest that SARS-CoV-2 S gene dropouts were present in the community as early as the beginning of the pandemic in Southern Brazil. These results may indicate that mutations resulting in different genetic variants of the virus were already circulating much earlier than recognized in most settings. Further, a new mutation (Y380Q) was identified which, together with C379W, modifies important properties of the RBD region and may interfere with the binding of neutralizing antibodies (CR3022, EY6A, H014, S304).
S gene dropout has been found to be strongly associated with a six-nucleotide deletion resulting in the loss of two amino acids, H69 and V70. Despite performing WGS, we could not directly link the S dropout to specific viral mutations. This result may be due to the small number of samples submitted to sequencing. However, we found three different lineages: B.1.1.28, B.1.91 and B.1.1.33. Additionally, the Y380Q mutation was identified in the S gene, and thus could be associated with the natural history of virus evolution or possibly with the emergence of new and more virulent mutations.
It is well known that modifications in the EP and hydrophobicity distribution may interfere with protein-protein interactions. Interface regions are usually composed of residues presenting opposite charges and hydrophobic pairs, where small changes in these properties, at important functional sites, may impact canonical interactions [27]. Neutralizing antibodies (NAbs) are fundamental elements of the immune response against viral infections [28], and the H-bonds of the wild-type C379 and Y380 residues with the S304, CR3022, S2H97 and EY6A NAbs reinforce the importance of these regions. Therefore, the H-bond disruptions observed for the W379 and Q380 substitutions, together with the alteration in hydrophobicity, disfavor RBD-antibody interactions. As the C379 residue is part of one of the four disulfide bonds in the RBD region (C379-C432), its disruption could generate instability, since this bond contributes to maintaining the β-sheet conformation [29]. A previous study reported that the C379 and Y380 residues are part of an epitope for the H014 antibody, which could sterically compete with the ACE2 host molecule for the RBD interaction [30].
It also reported an overlap between the binding epitopes of the H014 and CR3022 antibodies. The CR3022 monoclonal antibody neutralizes the RBD region of SARS-CoV-2, disrupts the prefusion spike conformation, and also competes sterically with ACE2 [31]. [...] without physically blocking it [32]. The recently described S2H97 is a potential NAb [33] that exhibits notably tight binding even to divergent RBD regions from other Sarbecoviruses. Other SARS-CoV-2 NAbs that bind to the RBD were also described as non-overlapping with the ACE2 binding site [34].
This study has some limitations. The small number of samples in which WGS was feasible may limit any conclusion about the clinical severity related to the mutations found. Moreover, individuals were enrolled in a single city in Southern Brazil, which may limit the generalizability of our findings. Nonetheless, a novel mutation (Y380Q) in the RBD region of the SARS-CoV-2 spike protein was described. The analysis based on crystal structures reinforces the importance of the Y380 and C379 residues in NAb binding; mutations in these regions may therefore affect the effectiveness of the interaction between NAbs and the SARS-CoV-2 protein, as inferred by computational analysis.
Our findings indicate that SARS-CoV-2 variants were circulating quite early in the community. A possible link between the newly described mutations and clinical severity can be speculated, but further studies are needed to confirm this hypothesis. Studies assessing the mutations and their relation to prognosis are necessary, as are studies evaluating vaccine effectiveness in a challenging and continuously changing scenario.
Humoral and cellular immune responses against SARS-CoV-2 variants and human coronaviruses after single BNT162b2 vaccination
SARS-CoV-2 variants evolved during the early stage of the pandemic and effects of mutations on adaptation in Wuhan populations
One Year of SARS-CoV-2: How Much Has the Virus Changed?
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Children have similar RT-PCR cycle threshold for SARS-CoV-2 in comparison with adults
UCSF Chimera--a visualization system for exploratory research and analysis
LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions
The R Project for Statistical Computing
cov-lineages/pangolin, CoV-lineages
Intramolecular H-bonds govern the recognition of a flexible peptide by an antibody
Characterization of Protein-Protein Interfaces
A systematic review of SARS-CoV-2 vaccine candidates
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Spike Protein Binding and Allosteric Interactions with Antibodies
Antibodies to the SARS-CoV-2 receptor-binding domain that maximize breadth and resistance to viral escape
SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies
We thank the Scientific Committee of the Research Support Nucleus ([...]
Supplementary Table 1. The wild RBD-antibody interaction sites and the mutated models. Columns: PDB code; RBD residues with direct contact to antibody.
Funding: This work was supported by the Brazilian Ministry of Health, through the Institutional Development Program of the Brazilian National Health System (PROADI-SUS) in collaboration with Hospital Moinhos de Vento.
The authors declare no conflict of interest.