key: cord-0940975-tylwbnpl
authors: Issa, Elio; Merhi, Georgi; Panossian, Balig; Salloum, Tamara; Tokajian, Sima
title: SARS-CoV-2 and ORF3a: Non-Synonymous Mutations and Polyproline Regions
date: 2020-03-28
journal: bioRxiv
DOI: 10.1101/2020.03.27.012013
sha: 9f1caa569ec33c7ffc040e415ebd5fb1161032c8
doc_id: 940975
cord_uid: tylwbnpl

The effect of the rapid accumulation of non-synonymous mutations on the pathogenesis of SARS-CoV-2 is not yet known. To predict the impact of non-synonymous mutations and polyproline regions identified in ORF3a on the formation of B-cell epitopes and their role in evading the immune response, nucleotide and protein sequences of 537 available SARS-CoV-2 genomes were analyzed for the presence of non-synonymous mutations and polyproline regions. Mutations were correlated with changes in epitope formation. A total of 19 different non-synonymous amino acid substitutions were detected in ORF3a among 537 SARS-CoV-2 strains. G251V was the most common, identified in 9.9% (n=53) of the strains, and was predicted to lead to the loss of a B-cell-like epitope in ORF3a. Polyproline regions were detected in two strains (EPI_ISL_410486, France and EPI_ISL_407079, Finland) and affected epitope formation. The accumulation of non-synonymous mutations and the detected polyproline regions in ORF3a of SARS-CoV-2 could be driving the evasion of the host immune response, thus favoring viral spread. Rapid mutations accumulating in ORF3a should be closely monitored throughout the COVID-19 pandemic.

Importance

At the surge of the COVID-19 pandemic, three months after the identification of SARS-CoV-2 as the disease-causing pathogen, nucleic acid changes arising from host-pathogen interactions offer insight into the evolution of this virus. In this paper, we have identified a set of non-synonymous mutations in ORF3a and predicted their impact on B-cell-like epitope formation.
The accumulation of non-synonymous mutations in ORF3a could be driving protein changes that mediate immune evasion and favor viral spread.

pathway leads to apoptosis (11). A pro-apoptosis-inducing APA3_viroporin conserved domain detected in ORF3a of SARS-CoV-2 is also found in the SARS-CoV 3A protein (11).

The G251V substitution was detected in ORF3a in 9.9% of the strains (n=53). G251V led to the loss of a B cell-

Of paramount importance is the emergence of polyproline regions (PPRs) in ORF3a, detected in two of the sequenced SARS-CoV-2 genomes (EPI_ISL_410486, France and EPI_ISL_407079, Finland). PPRs are an open field for recombination that viruses use to adapt under selective pressure (13). PPRs were previously shown to be indispensable for the activity of the Coxsackievirus B 3A protein, which blocks ER-to-Golgi transport, affecting protein synthesis (14). Studies on Hepatitis

In conclusion, our study reveals, for the first time, a common non-synonymous G251V

A total of 537 complete, high-quality SARS-CoV-2 genomes downloaded from GISAID were used for genome and ORF3a alignments.

were used as input to the PanX (5) pipeline for pan-genome analysis. A core genome threshold of 0.99, an MCL inflation parameter of 1.5, and a modified core diversity cutoff for branch lengths above 0.001 were used alongside the default parameters.

Sequences were aligned using MUSCLE v3.8.31 (6). PROVEAN was used to predict the functional effects of amino acid substitutions (7). ExPASy and PROSPER were used for motif scanning and protease site prediction, respectively (8, 9). The Immune Epitope Database Analysis Resource (IEDB-AR) was used for epitope prediction using a 0.5 threshold and default settings (10).

We gratefully acknowledge the authors and the generating and submitting laboratories of the sequences from GISAID's EpiCoV™ database. We also acknowledge the authors of all Coronaviridae genome sequences deposited in GenBank.
This study does not claim ownership of these sequences, which were used within the analysis workflow to further our understanding of the on-

The authors declare that they have no conflict of interest.

Percentage values in this column do not add to 100% as mutations only cover a fraction of the total sample size; total number of sequences = 537.

Involved in Viral Adaptation. PLOS ONE 7:e35974.
Res 36:W513-518.
Coronavirus 3a protein causes endoplasmic reticulum stress and induces ligand-
Proline-Rich Region in the Coxsackievirus 3A Protein Is Required for the Protein To Inhibit Endoplasmic Reticulum-to-Golgi Transport
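
The substitution calls and frequencies reported in this paper (for example, G251V in 53 of 537 strains) come from comparing aligned ORF3a protein sequences. The following is a minimal Python sketch of that kind of tally, not the authors' pipeline. The function names, the toy sequences, and the `min_run=3` definition of a polyproline run are all assumptions made for illustration; the paper does not state how many consecutive prolines define a polyproline region.

```python
from __future__ import annotations

import re
from collections import Counter


def call_substitutions(reference: str, sample: str) -> list[str]:
    """List substitutions (e.g. 'G251V') between two aligned protein sequences.

    Positions are 1-based and counted along the ungapped reference.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    subs = []
    ref_pos = 0
    for ref_aa, smp_aa in zip(reference, sample):
        if ref_aa != "-":
            ref_pos += 1
        # Skip alignment gaps and ambiguous residues (X).
        if ref_aa in "-X" or smp_aa in "-X":
            continue
        if ref_aa != smp_aa:
            subs.append(f"{ref_aa}{ref_pos}{smp_aa}")
    return subs


def substitution_frequencies(
    reference: str, samples: dict[str, str]
) -> dict[str, tuple[int, float]]:
    """Map each substitution to (number of strains, percentage of all strains)."""
    counts: Counter[str] = Counter()
    for seq in samples.values():
        counts.update(set(call_substitutions(reference, seq)))
    total = len(samples)
    return {sub: (n, 100.0 * n / total) for sub, n in counts.items()}


def polyproline_runs(protein: str, min_run: int = 3) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of consecutive prolines.

    min_run is a hypothetical cutoff chosen for this sketch.
    """
    ungapped = protein.replace("-", "")
    pattern = re.compile(r"P{%d,}" % min_run)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(ungapped)]


if __name__ == "__main__":
    # Toy sequences only; these are not real ORF3a data.
    toy_reference = "MDLFG"
    toy_samples = {
        "strain_a": "MDLFV",
        "strain_b": "MDLFG",
        "strain_c": "MDLFG",
        "strain_d": "MDLFG",
    }
    print(substitution_frequencies(toy_reference, toy_samples))
    print(polyproline_runs("MDPPPG"))
```

Percentages computed this way are relative to the total number of strains, so 53 of 537 rounds to the 9.9% reported for G251V, and the per-substitution percentages need not sum to 100%.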