key: cord-0813718-krowvk1v authors: Yuan, Hao-Wen; Wen, Hong-Ling title: Research progress on coronavirus S proteins and their receptors date: 2021-03-28 journal: Arch Virol DOI: 10.1007/s00705-021-05008-y sha: 455561fc360f03a630a2e62c4f2b6c0785088d8b doc_id: 813718 cord_uid: krowvk1v
Coronaviruses are a large family of important pathogens that cause human and animal diseases. At the end of 2019, a pneumonia epidemic caused by a novel coronavirus brought attention to coronaviruses. Exploring the interaction between the virus and its receptor will be helpful in developing preventive vaccines and therapeutic drugs. The coronavirus spike protein (S) plays an important role in both binding to receptors on host cells and fusion of the viral membrane with the host cell membrane. This review introduces the structure and function of the S protein and its receptor, focusing on the binding mode and binding region of both.
Since the beginning of the 21st century, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), which are highly pathogenic human coronaviruses, have become a great threat to human health [1, 2]. At the end of 2019, the outbreak of COVID-19 caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) rapidly spread around the world [3]. At present, SARS-CoV-2 has spread to over 210 countries and regions. The rising morbidity and mortality are a great threat and challenge to global public health. Coronaviruses are spherical enveloped viruses belonging to the family Coronaviridae, which is divided into four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus. The coronaviruses that infect humans all belong to the genera Alphacoronavirus and Betacoronavirus [4]. The coronavirus genome is about 30 kb in length and contains 6-10 open reading frames (ORFs), with a cap structure at the 5' end and a poly(A) tail at the 3' end.
Two thirds of the coronavirus genome consists of two ORFs: ORF1a and ORF1b. The rest of the coding portion of the genome encodes four structural proteins: the spike protein (S), the envelope protein (E), the membrane protein (M), and the nucleocapsid protein (N) [5, 6]. Some coronaviruses also have a hemagglutinin esterase (HE) protein [7]. There are also accessory genes in the coronavirus genome, and their composition and arrangement are different in different coronaviruses. Numerous studies have shown that the coronavirus S protein is the main structural protein involved in binding to the host cell receptor, and it also stimulates the host to produce a series of immune responses. Since the outbreak of severe acute respiratory syndrome (SARS) in 2003, significant progress has been made in determining the structure and function of the coronavirus S protein, especially regarding the mechanism of binding of the coronavirus S protein to its receptor. A characteristic feature of members of the family Coronaviridae is the presence of spike-shaped protrusions on the surface of the virion, which are composed of S protein trimers. The coronavirus S protein is composed of about 1300 aa and is a type I membrane fusion protein of about 180 kDa. It contains three main parts, the extracellular domain, the transmembrane domain, and the intracellular tail [8]. During the maturation of the coronavirus, the extracellular domain of the S protein is recognized and cleaved by a host protease into two subunits, S1 and S2. There are two cleavage sites on the S protein. The S1/S2 site is located between the S1 and S2 subunits, and the S2' site is located upstream of the fusion peptide (FP) [9]. S1 is a spherical structure near the N-terminus, and S2 is a rod-like structure near the C-terminus. The S1 and S2 subunits are connected by noncovalent interactions and embedded in the membrane via the S2 transmembrane region.
Handling Editor: Tim Skern.
There are two major domains in the S1 subunit: the N-terminal domain (NTD) and the C-terminal domain (CTD). Both the NTD and the CTD are receptor-binding domains (RBDs), and they are mainly responsible for the recognition of the coronavirus receptor and mediating the binding of the virion to the receptor [10]. Most studies so far have been concerned with the RBD. The RBD contains major neutralizing epitopes and has strong immunogenicity. It not only induces a host immune response but also provides targets for the design of vaccines and antiviral agents. There are three major domains in the S2 subunit: FP, heptad repeat N (HR-N), and heptad repeat C (HR-C). The HR-N and HR-C regions are highly conserved and mainly mediate the membrane fusion process after receptor binding. They interact to form a six-helix bundle structure that rearranges the viral envelope and the host cell membrane and drives their fusion [11]. Since the S protein not only binds to the host cell receptor but also mediates the fusion of the viral envelope and the cell membrane, it is the most important structural protein of coronaviruses. Virus receptors are sites on the host cell surface that specifically bind to the virus, facilitate invasion of the cell, and promote infection. The chemical nature of virus receptors is complex and diverse. They can be monomers or multimolecular complexes, and most of them are proteins. Among the receptors that have been discovered, some belong to the immunoglobulin superfamily, such as the HIV receptor CD4 and the measles virus receptor CD46 [12, 13]; some are glycoprotein receptors, such as the foot-and-mouth disease virus integrin receptor αVβ6 and the influenza virus receptor sialic acid (SA) [14, 15]; and some are physiologically active substances, such as the acetylcholine receptor recognized by the rabies virus G protein [16].
Virus receptors have a high degree of specificity and affinity, mediate entry of the virus into the host cell, and initiate virus infection. The same receptor can be recognized by many viruses; for example, heparan sulfate proteoglycan (HSPG) functions as a receptor for foot-and-mouth disease virus, herpes simplex virus, and dengue virus. Conversely, a single virus can recognize several receptors; HIV, for example, uses CCR5 and CXCR4 in addition to CD4 [17]. Generally, the host range of a virus is narrow, and this is partially determined by the receptor that is used. The study of virus receptors can aid in understanding the scope and pathway of virus infection in host cells as well as the potential threat of certain viruses to humans. For example, porcine epidemic diarrhea virus (PEDV) not only infects pigs but can also recognize human aminopeptidase N (APN) to infect human cells [18]. However, there is limited information about how PEDV causes infection or disease in humans. Virus receptors play a vital role in the prevention of virus infection, vaccine development, and antiviral drug screening. The interaction of the virus with its receptor is the first step in the invasion of the host cell by a coronavirus. Subsequently, the conformation of the S protein changes, resulting in fusion of the viral envelope and the cell membrane. The receptor recognition mechanism of coronaviruses is very complex. Different coronaviruses have different receptors and RBDs. Coronaviruses can recognize a variety of receptors, and the same receptor can also be recognized by a variety of coronaviruses. A variety of coronavirus receptors have been identified, such as the mouse carcinoembryonic antigen-related cell adhesion molecule 1 (mCEACAM1), angiotensin-converting enzyme 2 (ACE2), APN, dipeptidyl peptidase-4 (DPP4), SA, and CD147 [18-23] (Table 1). Most of these are glycoproteins.
mCEACAM1, a type I transmembrane protein, is widely expressed in mammalian epithelial and endothelial cells and mainly mediates cell adhesion and signal transduction. Two forms of mCEACAM1, mCEACAM1a and mCEACAM1b, are encoded by different alleles. Both can be used as a receptor for mouse hepatitis virus (MHV), but MHV binds to mCEACAM1a with higher efficiency than to mCEACAM1b [24]. ACE2, a type I membrane protein, is mainly expressed in the lungs, heart, and kidneys. Its primary function is to catalyze the transformation of angiotensin, and it plays an important role in pulmonary ventilation. Cells expressing ACE2 can be infected by SARS-CoV-2, and ACE2 binds with high affinity to the SARS-CoV-2 RBD [25, 26]. High expression of ACE2 is associated with severe lung damage in patients with COVID-19 [27]. APN (CD13), a type II metalloproteinase, exists in a dimeric form on the cell surface and is abundantly expressed in small intestinal enterocytes, liver, and kidney. APN not only promotes tumor invasion and metastasis but also acts as a functional receptor for most alphacoronaviruses, such as PEDV, transmissible gastroenteritis virus (TGEV), and human coronavirus 229E (HCoV-229E) [18, 22, 28]. DPP4 (CD26), a type II transmembrane protein, is mainly expressed in endothelial and epithelial cells of organs such as lung, kidney, small intestine, liver, and prostate. DPP4 plays a role in glucose homeostasis, malignant transformation, and tumor invasion. MERS-CoV uses DPP4 as its receptor [21]. SA, also known as N-acetylneuraminic acid, is a component of glycoproteins and is present on the surface of almost all cells. The human coronavirus OC43 (HCoV-OC43) and bovine coronavirus (BCoV) enter host cells after binding to SA. The S protein S1 subunits of PEDV and TGEV can also bind to SA [22, 29]. CD147 is expressed in both adult and embryonic tissues.
It participates in normal physiological metabolism as well as pathophysiological processes in disease. CD147 promotes invasion of host cells by SARS-CoV [23]. SARS-CoV-2 can also use CD147, in addition to ACE2, as a receptor to invade host cells [30]. In the process of coronavirus infection, the binding of the S protein to the receptor is a key step. The fusion process can occur at the surface of the cell membrane, or it can occur in an endosome formed by receptor-mediated endocytosis, with fusion occurring between the viral envelope and the endosomal membrane [31]. The spike proteins of SARS-CoV-2 and SARS-CoV share about 76% to 78% amino acid sequence identity [32], and the process by which these viruses infect host cells is very similar. First, the SARS-CoV-2 S protein RBD binds to the host cell receptor ACE2, and the S1/S2 and S2' sites are cleaved by the serine protease TMPRSS2 to destabilize the S protein trimer and activate the ability of the S protein to induce fusion between the virus envelope and the cell membrane, allowing the virus to enter the cell cytoplasm (Fig. 1) [8, 9, 33]. Cryo-electron microscopy has shown that the SARS-CoV-2 S protein requires a series of conformational changes in order to carry out the process of membrane fusion, and these changes are similar to those observed with SARS-CoV [34, 35]. The SARS-CoV-2 RBD undergoes hinge-like conformational movements during membrane fusion, causing the RBD to be hidden or exposed. It is called the "down" conformation when the RBD is hidden, and the "up" conformation when the RBD is exposed. When the RBD is in the "down" conformation, it cannot bind to ACE2 due to steric hindrance, but when the RBD is in the "up" conformation, it is free to bind the receptor [26]. The RBD is in a high-energy, unstable trimeric conformation before the S protein binds to the receptor.
When the S protein binds to the receptor, the S1 subunit dissociates, and the S2 subunit undergoes a dramatic conformational change in which it is converted to the fusion conformation. The interaction between HR-N and HR-C in the S2 subunit then results in the formation of a six-helix bundle structure, which is stable. These changes expose the FP of the S2 subunit, and it inserts into the host membrane to initiate the fusion process (Fig. 2). The conformational change in the membrane releases energy, which drives the fusion of the membranes [36]. A mutation resulting in the replacement of aspartic acid (D) with glycine (G) at codon 614 of the S protein of SARS-CoV-2 (D614G) became dominant in Europe during the COVID-19 pandemic. Mutations in the S protein can induce conformational modifications and increase the flexibility of the D614G RBD [37]. Recent studies have demonstrated that the D614G mutation in the SARS-CoV-2 S protein can increase the efficiency of viral entry [38]. Proteolytic cleavage of the S protein by furin or other cellular proteases at the S1/S2 site and S2′ site is essential for infection. It is believed that coronavirus S protein proteolysis is the trigger for membrane fusion. The SARS-CoV-2 spike cleavage site sequence may determine its host protease specificity [39]. In the process of binding of the S protein to the receptor, the serine protease TMPRSS2 is responsible for the cleavage and activation of the S protein and thus represents a potential drug target for antiviral therapy.
Fig. 1 The process of SARS-CoV-2 entry into host cells. The SARS-CoV-2 S protein binds to the receptor ACE2, and cleavage by the serine protease TMPRSS2 activates the fusion activity of the S protein, allowing the virus to enter the cytoplasm.
In the process of screening SARS-CoV and MERS-CoV antiviral drugs, it was found that TMPRSS2 inhibitors used clinically, such as camostat and nafamostat, can effectively prevent the virus from entering host cells [40, 41]. In a study of TMPRSS2 inhibitors, it was found that camostat mesylate can prevent SARS-CoV-2 from entering host cells. This compound should therefore be studied further as a potential drug in clinical trials [9]. Protease inhibitors provide a new intervention scheme to prevent the binding of SARS-CoV-2 to host cells. Furthermore, neutralizing antibodies could potentially be used as antivirals against coronavirus disease. Human monoclonal antibodies can be cloned using SARS-CoV-2 RBD-specific memory B cells isolated from COVID-19 convalescent patients. They have been shown to bind specifically to the SARS-CoV-2 RBD and block its interaction with human ACE2 to effectively neutralize a pseudovirus containing the SARS-CoV-2 S protein. These studies indicated that SARS-CoV RBD-specific antibodies may be used for treatment of SARS-CoV-2 infection [42-44]. The conformational change in the RBD of the coronavirus spike is required for successful fusion and entry. The S1 subunit of the coronavirus S protein contains two domains, NTD and CTD, which can function as receptor-binding domains. For most coronaviruses with known receptors, the corresponding receptor-binding domains and their crystal structures in complex with their receptors have been analyzed. S1-CTD can be used as an RBD for SARS-CoV, SARS-CoV-2, and MERS-CoV, but their structures are different. The SARS-CoV CTD consists of a core domain and an extended external domain. The core structure is a five-stranded antiparallel β-sheet. The extended external structure is located on both sides of the core structure; the two-stranded antiparallel β-sheet forms a depression, and the bottom of the depression binds to the N-terminal helix of ACE2 [45].
The structure of the SARS-CoV-2 RBD is very similar to that of the SARS-CoV RBD. The RBDs of these two coronaviruses share 72% amino acid sequence identity and molecular models show very similar three-dimensional structures. However, the less-flexible prolyl residues in SARS-CoV are replaced by a distinct loop with flexible glycyl residues in SARS-CoV-2, which may be one of the reasons for the stronger binding between the SARS-CoV-2 RBD and ACE2 [46, 47]. The core structure of MERS-CoV is also composed of a five-stranded antiparallel β-sheet. The external structure is clearly different, with the external extension structure of MERS-CoV composed of a four-stranded antiparallel β-sheet [48]. Unlike the coronaviruses described above, MHV uses S1-NTD to bind to the receptor [19]. The core structure of MHV-NTD is a 13-stranded β-sandwich structure with two antiparallel β-sheets stacked against each other through hydrophobic interactions. It has the same structural fold as human galactose-binding lectins. The β-sheet located in the upper layer of the core has three loops, with the N-terminal segments forming a substructure that binds to the receptor. The binding of S1-NTD to mCEACAM1a mainly depends on interactions between the peptide chains [24, 49]. In contrast, the NTDs of HCoV-OC43 and BCoV bind specifically to SA and other sugar molecules to promote adhesion of the virus to the host cell [31, 50].
Fig. 2 a Structure of the SARS-CoV-2 S trimer. The SARS-CoV-2 S protein in the down conformation is shown at the left (PDB ID: 6VXX), and the same protein in the up conformation is shown at the right (PDB ID: 6VYB). b Coronavirus membrane fusion model. First, the S protein binds to ACE2 and releases the S1 subunit trimer, and the FP, wrapped in S2, inserts into the host membrane. Finally, the interaction between HR-N and HR-C in the S2 subunit forms a six-helix bundle structure.
The COVID-19 pandemic has led to an urgent need to develop vaccines and drugs to prevent and treat coronavirus infections. Research institutions all over the world are working to develop SARS-CoV-2 vaccines, and several vaccines are in use or undergoing clinical trials, including adenovirus type-5-vectored vaccines, which show good immunogenicity and induce a protective immune response against SARS-CoV-2 [51]. Understanding the cell entry mechanism of coronaviruses can inform intervention strategies and the design of vaccines and drugs to target the RBD of the coronavirus and its receptor. At the same time, mutations such as D614G that increase the infectivity of SARS-CoV-2 are a cause of serious concern. These mutations have critical implications for the global pandemic. The mechanism of binding between the SARS-CoV-2 S protein and its receptor has become clearer, and this will provide a strong basis for the prevention and treatment of COVID-19 as well as laying a solid foundation for dealing with other possible novel coronavirus threats to humans in the future.
coronavirus HKU1 spike protein uses O-acetylated sialic acid as an attachment receptor determinant and employs hemagglutinin-esterase protein as a receptor-destroying enzyme.
Isolation and characterization of viruses related to the SARS coronavirus from animals in Southern China
Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
A novel coronavirus from patients with pneumonia in China
Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans
From SARS and MERS CoVs to SARS-CoV-2: moving toward more biased codon usage in viral structural and nonstructural genes
Structural, glycosylation and antigenic variation between 2019 novel coronavirus (2019-nCoV) and SARS coronavirus (SARS-CoV)
MERS-CoV spike protein: a key target for antivirals
T-lymphocyte T4 molecule behaves as the receptor for human retrovirus LAV
Molecular mechanism by which residues at position 481 and 546 of measles virus hemagglutinin protein define CD46 receptor binding using a molecular docking approach
Role of neuraminidase in influenza A(H7N9) virus receptor binding
Rules of engagement between αvβ6 integrin and foot-and-mouth disease virus
Nicotinic acetylcholine receptor alpha 1 (nAChRα1) subunit peptides as potential antiviral agents against rabies virus
A biophysical perspective on receptor-mediated virus entry with a focus on HIV
Receptor usage and cell entry of porcine epidemic diarrhea coronavirus
Structural and molecular evidence suggesting coronavirus-driven evolution of mouse receptor
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC
Role of porcine aminopeptidase N and sialic acids in porcine coronavirus infections in primary porcine enterocytes
Function of HAb18G/CD147 in invasion of host cells by severe acute respiratory syndrome coronavirus
Crystal structure of mouse coronavirus receptor-binding domain complexed with its murine receptor
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Single-cell RNA expression profiling of ACE2, the putative receptor of Wuhan 2019-nCov
Human aminopeptidase N is a receptor for human coronavirus 229E
Point mutations in the S protein connect the sialic acid binding activity with the enteropathogenicity of transmissible gastroenteritis coronavirus
SARS-CoV-2 invades host cells via a novel route: CD147-spike protein
Early events during human coronavirus OC43 entry to the cell
Receptor recognition by the novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS coronavirus
Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells
Cryo-electron microscopy structures of the SARS-CoV spike glycoprotein reveal a prerequisite conformational state for receptor binding
Role of changes in SARS-CoV-2 spike protein in the interaction with the human ACE2 receptor: an in silico analysis
Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion
Structural and functional analysis of the D614G SARS-CoV-2 spike protein variant
SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo. SCIENCE: e8499
The sequence at Spike S1/S2 site enables cleavage by furin and phospho-regulation in SARS-CoV2 but not in SARS-CoV1 or MERS-CoV
Protease inhibitors targeting coronavirus and filovirus entry
Identification of nafamostat as a potent inhibitor of middle east respiratory syndrome coronavirus s protein-mediated membrane fusion using the split-protein-based cell-cell fusion assay
Human monoclonal antibodies block the binding of SARS-CoV-2 spike protein to angiotensin converting enzyme 2 receptor
Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
A noncompeting pair of human neutralizing antibodies block COVID-19 virus binding to its receptor ACE2
Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
Structure analysis of the receptor binding of 2019-nCoV
Structural basis of receptor recognition by SARS-CoV-2
Crystal structure of the receptor-binding domain from newly emerged middle east respiratory syndrome coronavirus
Structure of mouse coronavirus spike protein complexed with receptor reveals mechanism for viral entry
Bovine coronavirus uses N-acetyl-9-O-acetylneuraminic acid as a receptor determinant to initiate the infection of cultured cells
Immunogenicity and safety of a recombinant adenovirus type-5-vectored COVID-19 vaccine in healthy adults aged 18 years or older: a randomised, double-blind, placebo-controlled, phase 2 trial
Acknowledgements Thanks to Dr. Edward C. Mignot, Shandong University, for linguistic advice.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.