key: cord-0689381-elpq4da7 authors: Arbeitman, Claudia R.; Auge, Gabriela; Blaustein, Matías; Bredeston, Luis; Corapi, Enrique S.; Craig, Patricio O.; Cossio, Leandro A.; Dain, Liliana; D’Alessio, Cecilia; Elias, Fernanda; Fernández, Natalia B.; Gasulla, Javier; Gorojovsky, Natalia; Gudesblat, Gustavo E.; Herrera, María G.; Ibañez, Lorena I.; Idrovo, Tommy; Randon, Matías Iglesias; Kamenetzky, Laura; Nadra, Alejandro D.; Noseda, Diego G.; Paván, Carlos H.; Pavan, María F.; Pignataro, María F.; Roman, Ernesto; Ruberto, Lucas A. M.; Rubinstein, Natalia; Santos, Javier; Duarte, Francisco Velazquez; Zelada, Alicia M. title: Structural and Functional Comparison of SARS-CoV-2-Spike Receptor Binding Domain Produced in Pichia pastoris and Mammalian Cells date: 2020-09-17 journal: bioRxiv DOI: 10.1101/2020.09.17.300335 sha: 2014e231bfca761d1e88a028194aaf9d0ef92e72 doc_id: 689381 cord_uid: elpq4da7 The yeast Pichia pastoris is a cost-effective and easily scalable system for recombinant protein production. In this work we compared the conformation of the receptor binding domain (RBD) from SARS-CoV-2 Spike protein expressed in P. pastoris and in the well established HEK-293T mammalian cell system. RBD obtained from both yeast and mammalian cells was properly folded, as indicated by UV-absorption, circular dichroism and tryptophan fluorescence. They also had similar stability, as indicated by temperature-induced unfolding (observed Tm were 50 °C and 52 °C for RBD produced in P. pastoris and HEK-293T cells, respectively). Moreover, the stability of both variants was similarly reduced when the ionic strength was increased, in agreement with a computational analysis predicting that a set of ionic interactions may stabilize RBD structure. Further characterization by HPLC, size-exclusion chromatography and mass spectrometry revealed a higher heterogeneity of RBD expressed in P. 
pastoris relative to that produced in HEK-293T cells, which disappeared after enzymatic removal of glycans. The production of RBD in P. pastoris was scaled up in a bioreactor, with yields above 45 mg/L of 90% pure protein, thus potentially allowing large scale immunizations to produce neutralizing antibodies, as well as the large scale production of serological tests for SARS-CoV-2.

matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry; NTA-Ni2+, nickel-charged nitrilotriacetic acid affinity resin; PCR, polymerase chain reaction; PDB, Protein Data Bank; PNGaseF, Peptide-N4-(N-acetyl-beta-glucosaminyl) asparagine amidase; RBD, receptor binding domain; RBM, receptor-binding motif; SARS-CoV-1, severe acute respiratory syndrome coronavirus 1; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; SDS-PAGE, SDS polyacrylamide gel electrophoresis; SEC, size exclusion chromatography; TFA, trifluoroacetic acid.

The COVID-19 outbreak was first recognized in December 2019 in Wuhan, China 1. Since then, the virus has spread to all parts of the world, resulting in a total of 29,415,168 infected individuals and 931,934 deaths by September 14th, 2020 ( https://www.coronatracker.com ). The causative agent is a coronavirus that causes a severe acute respiratory syndrome (SARS). This SARS-related coronavirus (SARSr-CoV) has been designated as SARS-CoV-2. Coronaviruses are enveloped, non-segmented, positive-sense RNA viruses 2 that have four open reading frames (ORFs) for structural proteins (Spike, Envelope, Membrane, and Nucleocapsid) 3, 4, of which Spike is the primary determinant of CoV tropism. Spike mediates the fusion of the viral and cellular membranes by binding mainly to the angiotensin-converting enzyme 2 (ACE2), a homologue of ACE 5, 6. The SARS-CoV-2 genome is 29,903 nucleotides long 7 and shares 79% and 50% sequence identity with the SARS-CoV-1 and MERS-CoV genomes, respectively 8.
Genetic studies suggest that both viruses originated from bat CoVs 8, 9, with civet cats as intermediate hosts in the case of SARS-CoV-1 10, and pangolins in the case of SARS-CoV-2 11, 12. Among the structural proteins, the Envelope protein has the highest sequence similarity between SARS-CoV-2 and SARS-CoV-1 (96% identity), while the Spike protein, responsible for the interaction with the host receptor, shows the largest sequence divergence (76% identity with SARS-CoV-1) 13. It has been suggested that the divergence of Spike could be related to an increased immune pressure 1.

Due to its important role in SARS-CoV-2 entry into the host cell, Spike is the most studied protein of this virus. This transmembrane glycosylated protein is composed of 1273 amino acids and assembles as a homotrimer that forms spikes protruding from the virus envelope. Spike has two domains, named S1 and S2. Residues 319-591 of S1 correspond to the RBD, responsible for the interaction with ACE2 15. RBD binds with high affinity to ACE2, located on the outer surface of the cell membrane, which acts as the SARS-CoV-2 receptor since it mediates the fusion of the virus to the cell membrane 16. Spike also includes a transmembrane domain and a fusion peptide 17. Molecular dynamics simulations suggest that internal motions of the Spike trimer are important to expose the RBD so that it can interact with the target receptor. However, part of the time the RBD is hidden within the rest of the Spike protein, and this process is mediated by protein motions of high amplitude. The structure of the RBD-ACE2 complex and the structure of Spike (as the full-length, trimeric form of the protein) were determined by X-ray crystallography and cryo-EM 16, 18-20. RBD is a protein domain of 220 residues; it has nine cysteine residues (eight of them forming disulfide bonds) ( Figure 1 ) and two N-glycosylation sites (N331 and N343).
The addition of glycan moieties might play a relevant role in the in vivo protein folding process, in the dynamics, stability and solvent accessibility of RBD, and also in its immunogenicity 21, 22. RBD is not a globular protein domain; it has a central twisted antiparallel beta-sheet formed by five strands decorated with secondary structure elements (short helices and strands) and loops 19. The secondary structure analysis of the protein shows 12.4% helix, 33.0% sheet, 19.1% turn, and 35.6% coil.

The secondary structure elements of RBD are differentially colored (alpha helices: purple, 3_10 helices: iceblue, beta strands: yellow, and turns/coil: cyan). Disulfide bridges (red) and tryptophan residues (blue) are shown as sticks, while N-glycosylation asparagine residues (green) are shown as VDW spheres. The region of ACE2 encompassing residues 1-115 (colored white), which interacts with RBD, is also shown. The structure was generated using PDB structures 6xm0 and 6m0j.

Despite its medium size of 25 kDa, RBD is an example of a challenging protein domain to express in heterologous systems due to its complex topology ( Figure 1 ). Nevertheless, it is highly important to produce and purify RBD efficiently and at low cost, since this domain is extensively used for the development of serological test kits as well as an immunogen, both for the production of animal immune sera and for vaccine development 23. While E. coli is a cost-efficient system for the expression of many proteins, this is unlikely to be the case for RBD, because its proper expression and folding require disulfide bond formation and glycosylation. For this reason, RBD is usually expressed in mammalian as well as insect cells 18, 24.
The methylotrophic yeast Pichia pastoris is an alternative cost-effective eukaryotic system that allows relatively easy scaling-up of recombinant protein production, and which has previously been used for the expression of SARS-CoV-1 RBD to produce a vaccine 25 . This yeast can use methanol as an exclusive carbon source. This molecule is also an inductor of the strong and tightly regulated AOX1 promoter 26 , which can therefore be used to drive recombinant protein expression. When cultured in bioreactors, P. pastoris can reach high cell densities, and more importantly, this organism allows the efficient secretion of recombinant proteins to the culture medium, which contains relatively low levels of endogenous proteins, thus allowing the straightforward purification of recombinant secretory proteins 26 . In this work we expressed and purified SARS-CoV-2 Spike RBD from two different systems -the yeast P. pastoris and mammalian cells-and compared their structure, stability, glycosylation status, and immunogenicity in mice. Our work provides useful insights on the production of a key protein used in diagnosis and therapeutics to fight COVID-19 pandemia. Prior to designing the constructs to express Spike RBD domain from SARS-CoV-2 we looked for possible variation in its coding sequence in genomes publicly available at the Global Initiative for Sharing All Influenza Data (GISAID) database ( https://www.gisaid.org ) 27 . From a total of 75355 SARS-CoV-2 genome sequences available at GISAID, 85.8% (64,707 genomes) have 100% coverage of RBD (non truncated Spike proteins) with 100% of amino acid identity to the first published RBD sequence (Uniprot: QHN73795.1 ) 28 . This data set includes 38 Argentinean SARS-CoV-2 genomes. 
RBD sequences from the remaining genomes (14.2%) were distributed as follows: 3.5% (6199/64707) have more than 99% amino acid sequence identity (up to 2 amino acid substitutions or InDels), 1.4% (925/64707) have more than 80% (up to 44 amino acid substitutions or InDels) and only 300 genomes have a lower amino acid identity relative to the first published sequence. Thus, we considered it appropriate to express the predominant RBD form, spanning residues 319 to 537 of the Spike protein, which consists of a relatively compact domain and includes a slightly disordered C-terminal stretch useful for protein engineering ( Figure 1 ). The expression of RBD in mammalian cells (HEK-293T cell line) and in P. pastoris yielded significant quantities of protein (∼5 and 10-13 mg L⁻¹ of cell culture, respectively, at a laboratory scale). In both cases, the recombinant protein was fused to an appropriate secretion signal peptide: the IL2 export signal peptide for HEK-293T expression and the Saccharomyces cerevisiae α-factor secretion signal for P. pastoris expression. Both secretion signals allowed the recovery of mature RBD from cell culture supernatants. Since RBD expressed in both eukaryotic systems included a C-terminal His tag, similar purification protocols were used in both cases. However, given that the physico-chemical conditions required for optimal growth of mammalian and yeast cells were completely different (HEK-293T cells were grown at 37 °C in a medium buffered to pH 7.4, while P. pastoris cells were grown at 28 °C in a medium buffered to pH 6.0), the covalent structure, integrity, conformation, post-translational modifications and stability of RBD might still differ depending on the expression system used. Additionally, both strategies involved the accumulation of soluble RBD in the supernatant, which can pose an extra challenge for unstable proteins.
For these reasons, it was crucial to evaluate parameters such as protein aggregation, oxidation, and possible alterations in disulfide bond patterns of proteins obtained from the different media. NTA-Ni2+-purified RBD from both HEK-293T and yeast exhibited high purity (>90%), as judged by SDS-PAGE analysis ( Figure 2A ). RBD from HEK-293T cells migrated as a single smeared band of 35 kDa in 12% SDS-PAGE, while RBD produced in P. pastoris migrated as one highly diffuse and more abundant band of ∼40-45 kDa, and a less abundant band of ∼35 kDa, the latter similar to that of RBD produced in HEK-293T cells. HPLC profile analysis revealed that RBD produced in HEK-293T cells is highly homogeneous, as shown by its elution as a sharp peak at 48-49% acetonitrile in a reverse phase C18 column, while RBD produced in P. pastoris showed a considerably broader peak, although it eluted at a very similar acetonitrile concentration. In addition, two very small peaks appeared in the chromatogram of RBD produced in P. pastoris. The area corresponding to the full-length protein was approximately 87%. The SDS-PAGE analysis of RBD purified from yeast and mammalian cell culture supernatants suggested that glycosylation is the main post-translational modification in RBD, as its theoretical mass (deduced from the amino acid sequence) is ∼26 kDa ( Figure 2 ), while both recombinant RBD forms migrated as products of more than 32-35 kDa. This was expected, since two N-glycosylation consensus sequences (NIT and NAT) are present in the N-terminal region of RBD. RBD from SARS-CoV-1 bears three glycosylation sites in its N-terminal region and was also found to be glycosylated 25.
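The two consensus sequences mentioned above follow the usual N-X-S/T sequon rule (X being any residue except proline). As a minimal sketch, the scan below applies that rule to the tryptic peptide FPNITNLCPFGEVFNATR discussed with the mass spectrometry results. The function name `find_sequons` and the `first_residue` offset are illustrative choices; the value 329 is an assumption about where this peptide starts in Spike numbering, chosen because it places the two hits at the N331 and N343 sites quoted in the text.

```python
import re

# N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons would also be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(sequence, first_residue=1):
    """Return (position of Asn, tripeptide) for each N-X-S/T sequon.

    first_residue is the number assigned to the first letter of `sequence`
    (hypothetical helper parameter, not part of any published pipeline).
    """
    return [(m.start() + first_residue, m.group(1))
            for m in SEQUON.finditer(sequence)]

peptide = "FPNITNLCPFGEVFNATR"  # peptide quoted in the text

# Positions relative to the peptide itself.
print(find_sequons(peptide))

# Assumption: the peptide starts at Spike residue 329.
print(find_sequons(peptide, first_residue=329))
```

Note that the asparagine in the NLC stretch is not reported, because cysteine is not accepted in the third position by this simple rule.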
Even though Coomassie Blue staining showed heterogeneity in protein size, incubation with PNGaseF, a peptide-endoglycanase that removes high mannose, complex and hybrid N-glycans from proteins, homogenized all isoforms to a sharper band of ∼25-26 kDa, compatible with the predicted MW of deglycosylated RBD (26.5 kDa, Figure 3 ). The decrease in the molecular mass of RBD by endoglycanase digestion confirmed the existence of N -glycosylations in both proteins. Moreover, glycans from P. pastoris-RBD -and not from HEK-293T-RBD -were also removed by EndoH, an endoglycanase that eliminates only high-mannose type glycans, which are the expected type in P. pastoris yeasts. These results strongly suggest that RBD from mammalian cells bears only complex or hybrid glycans, while RBD from P. pastoris only bears high mannose glycans. Moreover, the persistence of two bands in RBD from HEK-293T cells after exhaustive deglycosylation with PNGaseF suggests the existence of heterogeneous O -glycosylation, although heterogeneity in amino acid sequence cannot be discarded either. The identity of the RBD forms was corroborated by fragmentation, controlled proteolysis, peptide assignment and MS/MS sequencing (MALDI TOF TOF for tryptic peptides analysis). Figure 4 shows molecular masses and spectra from the intact mass analysis, which agree with values expected for the samples. The peptide spectrum matches -PSM -with significant score from the MS/MS analysis clearly showed that protein species present in the samples belonged to the RBD from SARS-CoV-2, ( Table 2 ). The masses of peptides identified by MS/MS or peptide mass fingerprint ( Table 2 ) were in good agreement with those expected from a proteotypic peptide prediction software -PeptideRank 29 , and from the information available from the Peptide Atlas database 30 for peptides identified from SARS-CoV-2. 
To further validate the PSM findings, the data were also analyzed with COMET at the Transproteomic Pipeline (a different MS/MS search engine), which produced similar results 31, 32. As expected due to the dispersion in sizes, the FPNITNLCPFGEVFNATR peptide, which harbors two glycosylation consensus sequences (the NIT and NAT motifs), was not observed in the RBD samples from P. pastoris or HEK-293T cells. On the other hand, deglycosylation of RBD produced in P. pastoris followed by MS/MS analysis revealed an m/z signal present only in that sample, in which two HexNAc moieties were identified at the two N residues of FPNITNLCPFGEVFNATR ( Figure S1 ).

Different analytical techniques were used to characterize proteins produced in yeast or mammalian cells. The UV absorption spectra of both recombinant proteins were very similar; they are dominated by a high content of tyrosine residues (16 Tyr, 2 Trp, 15 Phe, 4 disulfide bonds), as indicated by bands at approximately 276.0 and 281.0 nm. The absence of light scattering, suggested by the absence of a typical slope between 340 and 300 nm, strongly indicated that the proteins do not form soluble aggregates ( Figure 5 ). However, freezing and thawing resulted in protein precipitation when RBD concentration was higher than 80 μM (data not shown). The fourth derivative of absorption spectra can be used to evaluate RBD native conformation. Spectra corresponding to RBD produced in HEK-293T cells and P. pastoris were superimposable ( Figures 6B and C ), suggesting a similar packing of the aromatic residues. In particular, the positive band at 290.4 nm corresponding to Trp residues observed in the native state of RBD ( Figure 6C ) showed a significant red shift compared to the 288 nm band of N-acetyl-L-tryptophanamide (NATA), suggesting that Trp residues in RBD are not fully exposed to the solvent ( Figure 6B ).
Also, the negative band at 287.8 nm (a contribution of Tyr and Trp residues) showed a significant red shift compared to that observed for the fully exposed NATA and N-acetyl-L-tyrosinamide,

and a simulated RBD spectrum (green). (C) Comparison between the fourth derivative spectra from RBD obtained in HEK-293T (black), P. pastoris RBD Clone 7 (red) and the simulated spectrum presented in A (green).

SEC-HPLC experiments of RBD produced in P. pastoris confirmed the absence of aggregated forms and showed a peak compatible with two species between 25 and 45 kDa ( Figure 6 ). Deconvolution of the chromatogram into two components by fitting to two Gaussian curves suggested that ∼60% of the signal comes from a higher molecular weight component (>40 kDa), whereas the remaining ∼40% corresponds to a lower molecular weight component (<30 kDa). It is worth mentioning that the exclusion profile corresponding to RBD produced in HEK-293T cells superimposes with the latter, suggesting a more homogeneous glycosylation of the protein.

The experimental profile (red), the deconvolution of the peak into two different Gaussian curves (green and yellow) and the sum of the deconvoluted peaks (blue) are compared.

RBD produced in mammalian and P. pastoris cells showed superimposable far-UV circular dichroism (CD) spectra ( Figure 7A ), suggesting a similar secondary structure. Moreover, the CD spectra are identical to those observed for RBD from SARS-CoV-1 produced in yeast 25. However, given the particular shapes of the P. pastoris and HEK-293T SARS-CoV-2 RBD far-UV CD spectra (which show a single minimum at 206 nm and a maximum at 230 nm, the latter suggesting the contribution of aromatic residues to the spectra), it is difficult to estimate the secondary structure content by using standard sets of spectra. We further studied the conformation of RBD produced in HEK-293T cells or P. pastoris by tryptophan fluorescence spectroscopy.
Spectra corresponding to the native forms of RBD superimposed very well, suggesting that these aromatic residues are The conformational stability of different RBD forms was studied through temperature unfolding experiments. Unfolding was monitored by fluorescence of Sypro-orange, an extrinsic probe that preferentially binds to proteins when they are in unfolded conformations. In these experiments, the observed T m usually correlates with T m obtained from differential scanning calorimetry experiments 33 . RBD produced in P. pastoris consistently showed a slightly lower T m value relative to that of RBD from HEK-293T cells (in all assessed conditions), an observation compatible with a reduced resistance to temperature-induced denaturation, which likely reflects a marginally lower conformational stability of RBD produced in P. pastoris ( Figure 7C ). When the unfolding process was studied at different ionic strengths, a significant increase in T m was observed when NaCl concentration was reduced from 500 to 75 mM ( Figure 7C ), suggesting that the tertiary structure of RBD is stabilized by ionic pair interactions. The dependence of the observed T m on the NaCl concentrations, led us to hypothesize that increasing the ionic strength destabilizes RBD conformation by shielding key charged residues. Although the RBD structure suggests some energetic frustration, given that there are several clusters of positively charged residues on RBD surface ((1) R136, R139 and K140; (2) K126, R28, R191; (3) R37, K38, R39 and R148; (4) R85, R90 and K99) 19 , our results suggest that these repulsive interactions are most likely compensated even in the context of the isolated RBD domain ( i.e. without the rest of Spike or the ACE2 receptor). This would explain why increasing ionic strength has a major destabilizing effect. The RBD crystallographic structure analysis indicates that residues have a particular distribution according to their type. 
The core subdomain (residues 333-442 and 504-526 on the Spike protein) is enriched in non-polar residues, whereas the receptor-binding motif (RBM) subdomain (residues 443-503) is enriched in polar ones. On the other hand, the charged residues are preferentially located close to the interface between the Core and RBM subdomains and form an electrostatic network. Among the 14 positively and negatively charged residue pairs interacting at a distance lower than 6 Å, 8 are in the core subdomain, 5 in the RBM, and 1 pair is between the Core and RBM subdomains ( Table 3 ). The existence of 6 ionic pair interactions involving at least one occluded charged residue (D398, E406, D442, R454, D467, and R509) is also remarkable.

The N-O distance between basic and acidic groups is shown at the right of each pair. The outlined residues correspond to the RBM region, and the other ones to the core. The values in brackets correspond to the accessibility of the interacting residues, expressed as the percentage of surface exposed. Occluded charged residues (ASA ratio <20%) are marked with an asterisk. The index of each residue corresponds to the numbering in the Spike protein. For this analysis we used chain E of PDB entry 6m0j 19.

The importance of these interactions merits further analysis, as they may modulate the conformational dynamics of the RBM, the transitions of the RBD in the Spike trimer, and/or the interaction with the ACE2 receptor.

With the aim of evaluating the ability of the RBD protein produced in P. pastoris to stimulate an immune response, we assessed antibody production in mice by ELISA using plates coated with RBD produced either in HEK-293T cells or in P. pastoris. After a first dose of antigen plus adjuvants, mice presented higher antibody titers than controls, and after a second dose, the levels of antibodies increased significantly in a short period of time (20 days) relative to the first dose ( Figure 9A ).
No significant differences in antibody titers were observed between plates coated with RBD from P. pastoris or from HEK-293T cells. Thus, immunization of mice with RBD produced in P. Evaluation of the cross-reactivity of antibodies produced in mice immunized with P. pastoris RBD . (A) Titers of antibodies obtained by immunization with RBD from P. pastoris plus adjuvants. Each bar represents the group mean (n=5) for specific titers as determined by end-point-dilution ELISA. ELISA was performed with plates coated with RBD protein produced in HEK-293T cells (black sparse bars) or P. pastoris (red sparse bars). First dose corresponds to blood samples obtained 30 days post-first immunization, and second dose to samples obtained 20 days post-second immunization. Pre IS, Pre Immune Sera; RBD + Adjuvants, RBD produced in P. pastoris + Al(OH) 3 + CpG-ODN 1826; Control, Al(OH) 3 + CpG-ODN 1826. P values indicate significant differences between different groups. Bars indicate SD. P values (t-test) are shown for statistically significant differences (p < 0.05). (B) Purified RBD produced in HEK-293T (1.0 g) and in P. pastoris (3.0 g) were analyzed by Western blot using sera from mice immunized with RBD produced in HEK-293T (anti-RBD HEK-293T, left), or in P. pastoris (anti-RBD, P. pastoris center). As a control a primary antibody against the His tag present in both RBD recombinant proteins was used (right). The fermentation of P. pastoris in a 7 L stirred-tank bioreactor for the production of recombinant RBD was carried out using a four-phase procedure described in Methods. In the batch phase, cell concentration reached a maximum level of 15.7 g DCW/L after At the end of the fermentation, a final volume of 5.5 liters of culture was reached, so that the total amount of RBD obtained was 247.5 mg, the volumetric productivity was 0.48 mg/L h, and the total productivity 2.66 mg/h. 
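The bioreactor figures quoted above can be cross-checked with simple arithmetic. The sketch below takes the 45 mg/L yield and the 5.5 L final volume from the text and recovers the 247.5 mg total. It then assumes that volumetric productivity is defined as final titer divided by total process time (the fermentation duration is not stated in this section), which implies a run of roughly 94 h and a total productivity of about 2.64 mg/h, close to the reported 2.66 mg/h. The small difference is consistent with rounding of the 0.48 mg/L h value. Function and variable names are illustrative.

```python
# Figures quoted in the text.
titer_mg_per_l = 45.0            # RBD yield at harvest (mg/L)
final_volume_l = 5.5             # final culture volume (L)
volumetric_productivity = 0.48   # mg per litre per hour

def total_mass_mg(titer, volume):
    """Total recombinant protein in the vessel."""
    return titer * volume

def implied_process_time_h(titer, vol_productivity):
    # Assumption: volumetric productivity = final titer / elapsed process time.
    return titer / vol_productivity

def total_productivity_mg_per_h(mass, hours):
    """Total protein produced per hour of process time."""
    return mass / hours

mass = total_mass_mg(titer_mg_per_l, final_volume_l)
hours = implied_process_time_h(titer_mg_per_l, volumetric_productivity)
rate = total_productivity_mg_per_h(mass, hours)

print(f"total RBD: {mass:.1f} mg")
print(f"implied process time: {hours:.0f} h")
print(f"total productivity: {rate:.2f} mg/h")
```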
This work materialized the first two goals of our consortium assembled to fight COVID-19 pandemia: (a) to express and characterize RBD from SARS-CoV-2, and (b) to produce RBD at low cost with high yield. We were able to express this protein in two different systems: P. pastoris and mammalian cells (HEK-293T), which allowed us to gain useful insights concerning RBD conformation and stability. We attempted to express RBD in E. coli , even though an examination of its structure suggested that this system would not be suited for its expression due to the existence of 4 disulfide bonds and a non-globular shape. The E. coli SHuffle expression system only yielded insoluble RBD (in inclusion bodies) as expected, which was not further characterized as it was unsuitable for downstream applications (data not shown). In agreement with our results, in a previous attempt to express the similar RBD from SARS-CoV-1 Spike in E. coli , this protein was also found in the insoluble fraction, and neither its fusion to thioredoxin, nor to maltose-binding protein, increased its solubility. Moreover, while tagging the protein with glutathione S -transferase (GST) increased its solubility 34 , it remained strongly bound to the bacterial chaperone GroEL even after affinity purification 35 , thus making it unsuitable for downstream applications. By contrast, RBD expression in HEK-293T and P. pastoris eukaryotic cells produced soluble and properly folded polypeptides. Their UV-absorption, CD and Trp-fluorescence spectra showed high similarity with those previously described for SARS-CoV-1 RBD produced in P. pastoris 25 . RBD expressed in both eukaryotic systems was also characterized by controlled proteolysis and mass spectrometry analysis. Remarkably, the peptide of sequence FPNITNLCPFGEVFNATR was not easily detected by mass spectrometry. 
A plausible and straightforward explanation for this result is

approximately 57 °C, a value significantly higher than that observed for RBD from SARS-CoV-2 in this work (approximately 50 °C), which might be due to the fact that denaturation of RBD219-N1 was carried out at a considerably lower ionic strength 38. In addition, the pH of the protein sample was not constant throughout the experiment, given that the pKa of the Tris buffer used in this work is highly temperature dependent. Alternatively, the suppression of the N-terminal glycosylation in SARS-CoV-1 RBD might have an effect on its conformational stability. Nevertheless, the possibility that SARS-CoV-1 RBD has particular features that might increase its stability relative to that of SARS-CoV-2 RBD should not be excluded. In our hands, RBD in its native state was stable over a broad range of pH values and concentrations. Although it exhibited a low tendency to aggregate at high concentrations, no significant complications were observed during concentration by filtration or during dialysis, which was performed either to change the buffer or to remove imidazole after protein purification. Freezing (-80 °C) and subsequent thawing of RBD did not result in protein aggregation at protein concentrations of 30-40 μM or lower; therefore, this strategy was used for its storage, as it made the use of stabilizing molecules such as glycerol or trehalose unnecessary. However, it should be noted that RBD precipitation was occasionally observed after thawing at protein concentrations above 80 μM. P. pastoris-produced RBD was able to stimulate antibody production in mice, and the resulting immune sera were capable of detecting RBD produced not only in P. pastoris but also in HEK-293T cells. Finally, scaling up of RBD expression in P.
pastoris could be performed in a bioreactor with yields greater than 45 mg L -1 , which potentially allows the large scale immunization of animals in order to produce neutralizing antibodies, or the development of SARS-CoV-2 vaccines. Future biotechnological developments will be facilitated by the inclusion of a Sortase-A enzyme recognition site within the RBD coding sequence, which allows the native covalent coupling of RBD to fluorescent probes, peptides, proteins, or modified surfaces, through an efficient transpeptidation reaction 39 . Thus, the Sortase-A-mediated transpeptidation will allow future efficient in vitro covalent linking of RBD with protein carriers independently-produced in low cost systems such as E. coli . For RBD expression in mammalian cells, a DNA fragment optimized for expression in human cells encoding RBD (Spike residues from 319 to 537, preceded by the IL-2 export sequence (MYRMQLLSCIALSLALVTNS) and followed by a C-terminal Sortase-A recognition sequence for covalent coupling 39 The RBD coding sequence with codon optimization for P. pastoris (Spike amino acid residues 319-537) fused to the Saccharomyces cerevisiae alpha factor secretion signal (N-terminal) 40 followed by a C-terminal Sortase-A recognition sequence and a His6 tag (C-terminal) was synthesized and cloned into pPICZalpha by GenScript (NJ, USA) using EcoRI and SacII restriction sites to produce pPICZalphaA-RBD-Hisx6. The SacI linearized pPICZalphaA-RBD-Hisx6 vector (10 g) was used to transform electrocompetent X-33 P. pastoris strain at 2. The concentration of recombinant RBD was determined by UV spectrophotometry, using the following extinction coefficients derived from the protein sequence (considering all the disulfide bonds formed): for RBD produced in P. 
pastoris: ε280nm = 33850 M⁻¹ cm⁻¹ (Abs280nm = 1.304 for a 1 mg mL⁻¹ protein solution); for RBD produced in HEK-293T cells (without considering the IL2 export signal sequence): ε280nm = 33850 M⁻¹ cm⁻¹ (Abs280nm = 1.300 for a 1 mg mL⁻¹ protein solution).

Absorption spectra (240-340 nm range, using a 0.1-nm sampling interval) were acquired at 20 °C with a JASCO V730 BIO spectrophotometer (Japan). Ten spectra for each sample were averaged, and the averaged blank spectra were subtracted. The data were smoothed with a Savitzky-Golay filter and the 4th derivative spectra were subsequently calculated.

Purified RBD produced in P. pastoris or in HEK-293T cells was boiled in sample buffer

The optical density of P. pastoris culture samples was measured at 600 nm using a UV-Vis spectrophotometer and converted to dry cell weight (DCW, in g L⁻¹) with a previously calculated DCW versus OD600nm calibration curve, according to the formula DCW = 0.269 × OD600nm (R² = 0.99). The protein profile throughout the methanol-induction phase was analyzed by 12% SDS-PAGE, and gels were stained with Coomassie brilliant blue G-250. RBD expression was confirmed by Western blot analysis using anti-RBD and anti-His antibodies. Total protein content was estimated by measuring the absorbance at 280 nm using a Beckman spectrophotometer.

Sequence identity and coverage percentages were registered and counted.

For HPLC, a JASCO system equipped with an autoinjector, an oven (thermostatized at 25 °C) and a UV detector was used. A gradient from 0 to 100% acetonitrile was performed (0.05% TFA (v/v) was added to the solvents). An analytical C18 column was used (Higgins Analytical, Inc., U.S.A.), with a flow of 1.0 mL min⁻¹.

The protein samples were analyzed using a MALDI TOF TOF mass spectrometer (Applied Biosystems 4800 Plus) operating in linear mode.
Previously, the samples were desalted on ZipTip C 4 column (Millipore, Merck KGaA, Darmstadt, Germany), then eluted in a matrix solution of sinapinic acid 10 mg ml -1 in 70 % acetonitrile, 0.1 % TFA or 2,5 dihydroxy benzoic acid 5 mg ml -1 in 70 % acetonitrile, 0.1 % TFA and deposited on the MALDI plate. The spots were allowed to dry and finally the samples were ablated using a pulsed Nd:YAG laser (355 nm). Spectra were acquired in positive or negative mode, depending on the sample characteristics. The protein samples were digested with trypsin (Promega, mass spectrometry grade) in The spectra were first acquired in reflectron mode and the main signals studied in MS/MS mode. The resulting MS/MS spectra were analyzed using the MASCOT search engine 44 (Matrix Science) program and COMET 45 at Transproteomic Pipeline. Also, for the manual analysis of spectra in reflectron mode, the GPMAW (Lighthouse data) program was used. The molecular Modelling analysis of the RBD domain was done using the chain E of the pdb structure 6M0J 19 . Figures of this structure were done using VMD 46 . The identification of residues making moderate and strong electrostatic interactions within RBD was performed using the Salt Bridges plug in of VMD. For the analysis we used a 6.0 Å cut-off distance between side chain oxygen and nitrogen atoms of residues D, E, K and R. The accessible surface area calculations for the residues of RBD was done using the GetArea server http://curie.utmb.edu/getarea.html using a 1.4 Å probe radius. Immunization of mice was carried out by experts from the High Level Identification of serum antibody against protein RBD in mice using an ELISA assay. Standard ELISA procedures were followed to measure antibody response against RBD. Briefly, RBD protein produced in P. pastoris or HEK- Differences were considered significant if p < 0.05. 
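The electrostatic analysis described in the methods above pairs acidic and basic residues whose side chain oxygen and nitrogen atoms lie within 6.0 Å. The following is a minimal plain-Python sketch of that distance criterion, not the VMD Salt Bridges plug-in itself. The atom-name sets follow standard PDB naming, the `salt_bridges` function is a hypothetical helper, and the coordinates in the example are made-up toy values rather than atoms taken from 6M0J.

```python
import math
from itertools import product

# Side chain atoms considered for each residue type (standard PDB atom names).
ACID_O = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASE_N = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}}

def salt_bridges(atoms, cutoff=6.0):
    """Find acid/base residue pairs with any side chain N-O distance < cutoff.

    atoms: iterable of (resname, resid, atomname, (x, y, z)) tuples.
    Returns {((acid_name, acid_id), (base_name, base_id)): shortest N-O distance}.
    """
    atoms = list(atoms)
    oxygens = [a for a in atoms if a[2] in ACID_O.get(a[0], ())]
    nitrogens = [a for a in atoms if a[2] in BASE_N.get(a[0], ())]
    pairs = {}
    for o, n in product(oxygens, nitrogens):
        d = math.dist(o[3], n[3])
        if d < cutoff:
            key = ((o[0], o[1]), (n[0], n[1]))
            pairs[key] = min(d, pairs.get(key, d))
    return pairs

# Toy coordinates in angstroms, for illustration only.
toy_atoms = [
    ("ASP", 10, "OD1", (0.0, 0.0, 0.0)),
    ("ASP", 10, "OD2", (3.0, 0.0, 0.0)),
    ("ASP", 10, "CG", (1.5, 0.5, 0.0)),   # ignored: not a carboxylate oxygen
    ("ARG", 20, "NH1", (3.0, 4.0, 0.0)),  # 4.0 A from OD2, 5.0 A from OD1
    ("LYS", 30, "NZ", (0.0, 0.0, 9.0)),   # beyond the cutoff from both oxygens
]

for (acid, base), distance in salt_bridges(toy_atoms).items():
    print(acid, base, f"{distance:.1f}")
```

In practice the atom tuples would come from parsing the ATOM records of the chain of interest; keeping the shortest N-O distance per residue pair mirrors the single distance reported for each pair in the table.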
1. A novel coronavirus outbreak of global health concern
2. Coronavirus envelope protein: current knowledge
3. Characterization of a novel coronavirus associated with severe acute respiratory syndrome
4. Mechanisms and enzymes involved in SARS coronavirus genome expression
5. Coronavirus spike protein and tropism changes
6. Epidemic and emerging coronaviruses (severe acute respiratory syndrome and Middle East respiratory syndrome)
7. The architecture of SARS-CoV-2 transcriptome
8. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
9. A pneumonia outbreak associated with a new coronavirus of probable bat origin
10. SARS-CoV and emergent coronaviruses: viral determinants of interspecies transmission
11. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins
12. Isolation of SARS-CoV-2-related coronavirus from Malayan pangolins
13. Preliminary identification of potential vaccine targets for the COVID-19 coronavirus (SARS-CoV-2) based on SARS-CoV immunological studies
14. Probable pangolin origin of SARS-CoV-2 associated with the COVID-19 outbreak
15. An exploration of the SARS-CoV-2 spike receptor binding domain (RBD), a complex palette of evolutionary and structural features
16. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
17. The molecular biology of coronaviruses
18. Structural basis of receptor recognition by SARS-CoV-2
19. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
20. Site-specific glycan analysis of the SARS-CoV-2 spike
21. Analysis of the SARS-CoV-2 spike protein glycan shield: implications for immune recognition
22. Shielding and Beyond: The Roles of Glycans in SARS-CoV-2 Spike Protein
23. A vaccine targeting the RBD of the S protein of SARS-CoV-2 induces protective immunity
24. Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
25. Yeast-expressed recombinant protein of the receptor-binding domain in SARS-CoV spike protein with deglycosylated forms as a SARS vaccine candidate
26. Production of recombinant proteins by yeast cells
27. Data, disease and diplomacy: GISAID's innovative contribution to global health
28. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
29. Improved prediction of peptide detectability for targeted proteomics using a rank-based algorithm and organism-specific data
30. The PeptideAtlas project
31. A guided tour of the Trans-Proteomic Pipeline
32. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics
33. Formulation development of therapeutic monoclonal antibodies using high-throughput fluorescence and static light scattering techniques: role of conformational and colloidal stability
34. Receptor-binding domain of SARS-CoV spike protein: soluble expression in E. coli, purification and functional characterization
35. Expression of SARS-coronavirus spike glycoprotein in Pichia pastoris
36. UDP-GlC: glycoprotein glucosyltransferase-glucosidase II, the ying-yang of the ER quality control
37. Deducing the N- and O-glycosylation profile of the spike protein of novel coronavirus SARS-CoV-2. bioRxiv
38. Optimization of the production process and characterization of the yeast-expressed SARS-CoV recombinant receptor-binding domain (RBD219-N1), a SARS vaccine candidate
39. Sortase A: a model for transpeptidation and its biological applications
40. Alpha-factor-directed synthesis and secretion of mature foreign proteins in Saccharomyces cerevisiae
41. Fed-batch methanol feeding strategy for recombinant protein production by Pichia pastoris in the presence of co-substrate sorbitol
42. MAFFT multiple sequence alignment software version 7: improvements in performance and usability
43. EMBOSS: the European molecular biology open software suite
44. Probability-based protein identification by searching sequence databases using mass spectrometry data
45. Comet: an open-source MS/MS sequence database search tool
46. VMD: visual molecular dynamics

Godoy Cruz 2290 C1425FQB
GIBIO-Universidad Tecnológica Nacional-Facultad Regional Buenos Aires. Medrano 951 C1179AAQ
Biotecnología y Biología Traslacional (iB3)
Instituto de Química y Fisicoquímica Biológicas (IQUIFIB)
Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN)
César Milstein (Consejo Nacional de Investigaciones Científicas y Técnicas-Fundación Pablo Cassará)
Centro de Investigaciones del Medio Ambiente (CIM)
Instituto de Ciencia y Tecnología Dr. César Milstein
Instituto de Química Física de los Materiales
Instituto de Investigaciones en Microbiología y Parasitología (IMPaM)
Instituto de Investigaciones Biotecnológicas (IIBio)
Ministerio de Relaciones Exteriores y Culto

All authors (listed in alphabetical order) contributed equally to this work. Contact E-mail: anticovid.arg@gmail

We thank LANAIS-PRO-EM for the support with mass spectrometry analysis of proteins and peptides, and Fundación Ciencias Exactas y Naturales from Universidad de Buenos Aires for their help. We thank Dr.
Juan Ugalde from UNSAM for providing serum from mice immunized with RBD produced in HEK-293T cells. We would especially like to thank Dr. Diego U. Ferreiro for his initial suggestions concerning SARS-CoV-2 protein expression.

Author contributions: All authors contributed equally to this work.

The authors declare no competing interests.

This study was supported by the Agencia Nacional de Promoción de la Investigación, el

UniProtKB - P0DTC2 (SPIKE_SARS2); Spike glycoprotein from SARS-CoV-2.