key: cord-0700655-sp3a2u93
authors: Duart, Gerard; García-Murria, Ma Jesús; Grau, Brayan; Acosta-Cáceres, José M.; Martínez-Gil, Luis; Mingarro, Ismael
title: SARS-CoV-2 envelope protein topology in eukaryotic membranes
date: 2020-05-27
journal: bioRxiv
DOI: 10.1101/2020.05.27.118752
sha: 6f3353ad5c1fcb3fb7e7c4556b646f14b29fe9d2
doc_id: 700655
cord_uid: sp3a2u93

Coronavirus E protein is a small membrane protein found in the virus envelope. Different coronavirus E proteins share striking biochemical and functional similarities, but sequence conservation is limited. In this report, we studied the topology of the E protein from the new SARS-CoV-2 virus both in microsomal membranes and in mammalian cells. Experimental data reveal that E protein is a single-spanning membrane protein with the N-terminus translocated across the membrane, while the C-terminus is exposed to the cytoplasmic side (Ntlum/Ctcyt). The defined membrane topology of SARS-CoV-2 E protein may provide a useful framework to understand its interaction with other viral and host components and establish the basis to tackle the pathogenesis of SARS-CoV-2.

Coronavirus disease 2019 (COVID-19), an extremely infectious human disease caused by the coronavirus SARS-CoV-2, has spread around the world at an unprecedented rate, causing a worldwide pandemic. While the number of confirmed cases continues to grow rapidly, the molecular mechanisms behind the biogenesis

Among the four major structural proteins, the E protein is the smallest and has the lowest copy number of the membrane proteins found in the lipid envelope of mature virus particles (reviewed in [3, 4]). However, it is critical for the pathogenesis of other human coronaviruses [5, 6]. Interestingly, the sgRNA encoding E protein is one of the most abundantly expressed transcripts, despite the low copy number of the protein in mature viruses [1].
It encodes a 75-residue polypeptide with a predicted molecular weight of ~8 kDa. Two aliphatic amino acids (Leu and Val) constitute a substantial portion (36%, 27/75) of the E protein, which accounts for the high grand average of hydropathicity (GRAVY) index of the protein (1.128), as calculated using the ExPASy ProtParam tool (https://web.expasy.org/protparam/). Comparative sequence analysis of the E protein of SARS-CoV-2 and the other six known human coronaviruses does not reveal any large homologous/identical regions (Figure 1), with

important to note that two possible N-linked glycosylation sites are located C-terminally of the predicted TM segment in the E protein wild-type sequence, at positions N48 and N66 (Figure 1). However, N48 is not expected to be modified even if situated lumenally, because this glycosylation acceptor site lies too close to the membrane if the hydrophobic region is recognized as a TM segment by the translocon [11, 12]. Thus, mono-glycosylation (at N66) would serve as a C-terminal translocation reporter. To test N-terminal translocation, a construct was engineered in which a predicted highly efficient glycosylation acceptor site (NST) was designed at the N-

CoV-2 E protein insertion into the microsomal membranes in two opposite orientations cannot be ruled out, although the Ntlum/Ctcyt orientation is dominant.

To analyse protein topology in mammalian cells, a series of E protein variants tagged with a c-myc epitope at the C-terminus were transfected into HEK-293T cells. As shown in Figure 3A, only an E protein construct harbouring the N-terminal

Similarly, SARS-CoV E protein was shown to mainly adopt an Ntlum/Ctcyt topology in infected and transiently expressing mammalian cells [23].
This topology is compatible with the ion channel capacity described previously [24], and with the recently published pentameric structural model of SARS-CoV E protein in micelles [25], in which the C-terminal tail of the protein is α-helical and extramembrane.

The membrane topology described here would allow the cytoplasmic C-terminal tail of the E protein to interact with the C-termini of the membrane-embedded SARS-CoV-2 M and/or S proteins [3], and/or with Golgi scaffold proteins as previously

Architecture of SARS-CoV-2 Transcriptome, Cell.
Membrane topology of coronavirus E protein
Biochemical evidence for the presence of mixed membrane topologies of the severe acute respiratory syndrome coronavirus envelope protein expressed in mammalian cells
Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues
Structure-based statistical analysis of transmembrane helices
Fine-tuning the topology of a polytopic membrane protein: role of positively and negatively charged amino acids
N-glycosylation efficiency is determined by the distance to the C-terminus and the amino acid preceding an Asn-Ser-Thr sequon
Viral membrane protein topology is dictated by multiple determinants in its sequence
Subcellular location and topology of severe acute respiratory syndrome coronavirus envelope protein
Coronavirus E protein forms ion channels with functionally and structurally-involved membrane lipids
Structural model of the SARS coronavirus E channel in LMPG micelles
The cytoplasmic tails of infectious bronchitis virus E and M proteins mediate their interaction
Molecular code for transmembrane-helix recognition by the Sec61 translocon
Recognition of transmembrane helices by the endoplasmic reticulum translocon
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
Transmembrane protein topology prediction using support vector machines
The HMMTOP transmembrane topology prediction server
Topology and Signal Peptide Prediction Method
The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides
Membrane insertion and biogenesis of the Turnip Crinkle Virus p9 movement protein
Membrane insertion and topology of the translocating chain-associating membrane protein (TRAM)
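
The sequence-derived figures quoted in the text (Leu plus Val content of 27/75, a GRAVY index of 1.128, and N-glycosylation acceptor sites at N48 and N66) can be checked with a few lines of Python. The following is a minimal sketch, not the authors' procedure: the text reports using the ExPASy ProtParam tool, whereas this sketch simply averages Kyte-Doolittle hydropathy values, which is the usual definition of GRAVY. The 75-residue sequence is not given in the text and is an assumption here; it should be verified against a reference database entry before use. The names `E_PROTEIN`, `gravy`, `residue_count` and `find_sequons` are hypothetical helpers introduced only for illustration.

```python
# Kyte-Doolittle hydropathy scale; GRAVY is the mean of these values
# over all residues of the sequence.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMPTION: SARS-CoV-2 E protein sequence (75 residues). It is not
# stated in the text; verify against a reference database entry.
E_PROTEIN = (
    "MYSFVSEETGTLIVNSVLLFLAFVV"
    "FLLVTLAILTALRLCAYCCNIVNVS"
    "LVKPSFYVYSRVKNLNSSRVPDLLV"
)


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def residue_count(seq: str, residues: str) -> int:
    """Number of positions in seq occupied by any of the given residues."""
    return sum(seq.count(r) for r in residues)


def find_sequons(seq: str) -> list[int]:
    """1-based positions of Asn in N-X-S/T sequons, where X is not Pro."""
    return [
        i + 1
        for i in range(len(seq) - 2)
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST"
    ]


if __name__ == "__main__":
    n_lv = residue_count(E_PROTEIN, "LV")
    print(f"Leu+Val: {n_lv}/{len(E_PROTEIN)} ({n_lv / len(E_PROTEIN):.0%})")
    print(f"GRAVY: {gravy(E_PROTEIN):.3f}")
    print(f"N-X-S/T sequons at Asn positions: {find_sequons(E_PROTEIN)}")
```

If the assumed sequence is correct, the sketch should reproduce the values given in the text. Note that the sequon scan only finds candidate acceptor sites by sequence pattern; whether a site is actually glycosylated depends on its location and distance from the membrane, as discussed for N48.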