key: cord-0908961-mp0o95ey
authors: Khan, Wajihul Hasan; Khan, Nida; Mishra, Avinash; Gupta, Surbhi; Bansode, Vikrant; Mehta, Deepa; Bhambure, Rahul; Rathore, Anurag S.
title: Dimerization of SARS-CoV-2 nucleocapsid protein affects sensitivity of ELISA based diagnostics of COVID-19
date: 2021-05-27
journal: bioRxiv
DOI: 10.1101/2021.05.23.445305
sha: b937664f880f08b0d3fca0df9ef621c281cd4e3c
doc_id: 908961
cord_uid: mp0o95ey

Diagnostics have played a significant role in the effective management of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The nucleocapsid protein (N protein) is the primary antigen of the virus for development of sensitive diagnostic assays. Thus far, limited knowledge exists about the antigenic properties of the N protein. In this paper, we demonstrate the significant impact of dimerization of the SARS-CoV-2 nucleocapsid protein on the sensitivity of enzyme-linked immunosorbent assay (ELISA) based diagnostics of COVID-19. The purified protein expressed in E. coli consists of two forms, dimeric and monomeric, which have been further characterized by biophysical and immunological means. Indirect ELISA indicated that the dimeric form of the nucleocapsid protein is more sensitive than the monomeric form for detection of the protein-specific monoclonal antibody. These findings have also been confirmed with the structures of the monomeric and dimeric nucleocapsid protein modelled via HHPred software and with their solvent accessible surface area, which indicate higher stability and antigenicity of the dimeric form as compared to the monomeric form. It is evident that use of the dimeric form will increase the sensitivity of the current nucleocapsid-dependent ELISA for rapid COVID-19 diagnostics. Further, the results indicate that monitoring and maintaining the monomer-dimer composition is critical for accurate and robust diagnostics.
COVID-19 is a widespread global pandemic that has significantly damaged the financial stability and access to treatment for many, especially our most marginalized societies [1-3]. Diagnostics have played a major role in managing the pandemic, with most tests serving as an indicator of transmission at the time when the virus is in the upper respiratory tract [4]. However, pathogen-specific antibodies, which develop within days of infection, are also a durable biomarker of prior exposure. Antibody-based assays have also been useful in identifying those who have been exposed to the virus [5, 6].

The SARS-CoV-2 genome is composed of approximately 30,000 nucleotides and encodes four structural proteins: spike (S) protein, envelope (E) protein, membrane (M) protein, and nucleocapsid (N) protein. The SARS-CoV-2 N protein is a ~45.6 kDa phosphoprotein comprising an N-terminal domain (NTD) and a C-terminal domain (CTD), connected by a loosely structured linker region containing a serine/arginine-rich (SR) domain [7, 8]. Residues 45 to 181 of the NTD are responsible for the binding of viral RNA to the N protein. The SR region linking the NTD and CTD is the site of phosphorylation, which is thought to regulate N protein function [9]. The hydrophobic CTD of the N protein contains the residues responsible for homodimerization of the N protein [10-13]. Homodimers of the N protein are reported to self-assemble into higher-order oligomeric complexes, possibly through cooperative interactions of homodimers [14]. Formation of higher-order oligomeric complexes requires both the dimerization domain and the expanded asymmetric moiety of the CTD [7, 15, 16]. Upon SARS-CoV-2 infection, viral genomic RNA associates with the N protein to form a ribonucleoprotein complex. This complex then packages itself into a helical conformation and associates with the M protein of the virion [8].
Although the N protein lies within the viral particle and is not much exposed at the surface, SARS-CoV-2-infected patients show a stronger and earlier humoral response to the N protein than to the spike protein [17]. This is the reason why the N protein is being widely used in vaccine development and serological assays [17-19]. It has been shown for SARS-CoV that the C-terminal region of the N protein is crucial for eliciting antibodies during the immune response [20]. Most diagnostic assays are based on the antigenic proteins of SARS-CoV-2, either the N or the S protein [21-27]. Several formats of ELISA have been developed to detect IgM/IgG antibodies against the SARS-CoV-2 N protein in a patient's serum [28, 29]. Structural study of the full-length coronavirus N protein expressed in Escherichia coli is complicated because the recombinant N protein is very susceptible to proteolysis [11]. As a result, minimal information exists on the structure of the SARS-CoV-2 N protein monomer and its assembly into higher-order complexes.

In this study, the full-length N protein of SARS-CoV-2 was successfully expressed in E. coli BL21 (DE3) as aggregated inclusion bodies. Two major peaks of the N protein were identified as monomeric and dimeric conformations via size exclusion chromatography coupled with multi-angle static light scattering (MALS), circular dichroism (CD), and fluorescence spectroscopy. Further, the antigenicity of these conformations was compared through a highly sensitive and precise ELISA-based antibody test. The epitopes and solvent accessibility of the monomer and dimer forms of the N protein were also predicted using bioinformatics tools to study the structural stability and antigenicity of these conformations. It is evident that use of the dimeric form will increase the sensitivity of the current nucleocapsid-dependent ELISA for rapid COVID-19 diagnostics.
Further, the results indicate that monitoring and maintaining the monomer-dimer composition is critical for accurate and robust diagnostics. To the best of our knowledge, this is the first in-depth investigation into the impact of dimerization of the SARS-CoV-2 nucleocapsid protein on the sensitivity of enzyme-linked immunosorbent assay (ELISA) based diagnostics of COVID-19.

[Figure caption fragment: protein fractions (c). Immunoblotting of purified N protein fraction using nucleocapsid-specific antibody (d).]

The protein band with a molecular weight of about 51.38 kDa represents the full-length N protein expressed as inclusion bodies (IBs). The protein was further confirmed by immunoblotting using a protein-specific antibody (Figure 1b). Protein expression was later scaled up in a bioreactor, and a batch fermentation of transformed E. coli BL21 (DE3) was performed with 10 g L⁻¹ (v/v) of glycerol as the carbon source. Upon completion of the batch phase, a dissolved oxygen (DO) spike was observed (Figure 2), and the bioreactor was fed with 200 g L⁻¹ (v/v) of glycerol along with 1% (w/v) yeast extract. The inclusion bodies were solubilised, and the protein was captured using SP Sepharose FF resin and purified using cation exchange (CEX) chromatography (Figure 3). SARS-CoV-2 N protein of more than 95% purity was thus obtained (Figure 1c) and confirmed with immunoblotting (Figure 1d). The protein was confirmed with in-gel trypsin digestion followed by Liquid

Further, preparative SEC was performed to obtain fractions containing 5%, 10%, 25%, 55%, and 75% dimer (Figure 6). Since the dimer could not be completely separated from the monomer, fractions with the greatest possible dimer content were used in this analysis. N protein monomer-rich and dimer-rich pools were used for structural characterization and determination of ELISA sensitivity.

characterized for secondary structure by CD spectroscopy (Figure 7a).
It was observed that the SARS-CoV-2 N protein mainly consists of random coils, as shown by the negative band at ~200 nm, which is consistent with reports in the literature [30]. As is evident from the data presented in Figure 7b, both the monomer and the dimer primarily consist of random coils. In dimer form, there is an

Modelling of the structure of the monomeric/dimeric forms. The SARS-CoV-2 N protein sequence was retrieved from the UniProt database. It comprises 419 amino acid residues. The current experimental structure contains 30-40% of these residues, rendering it the only structure known for the virus. Sequence alignment showed different potential templates covering the various segments of the protein.

Calculation of the solvent accessible surface area (SASA). SASA was calculated for each residue in the monomer and dimer forms. This indicates the amount of area of a residue that is exposed to the solvent. Hydrophobic residues do not favour a polar environment and therefore tend to be buried; the percentage change in their exposure in the dimeric form was calculated. Figure 9 shows the percentage change of these hydrophobic residues between the monomer and the dimer. Six such residues, namely A183, V182, I283, L271, I269, and I289, changed from a completely exposed to a buried state in the dimeric form. Moreover, A268 became buried by 60%, while A243 and I194 became buried by ≈35%, compared to their monomeric conformation. This indicates that dimerization of the protein helped to bury these hydrophobic residues at the interface, which can stabilize the protein in solution.

Prediction of epitopes. The 3D structure of a protein can be used to predict discontinuous epitopes. These epitopes are formed by the specific conformation of protein residues at the surface. To identify these discontinuous epitopes, several methods were used to evaluate the monomer and the dimer. ElliPro predicted 3 epitopic sites on the monomer surface and 6 epitopes for the dimer form.
The 6 epitopes predicted for the SARS-CoV-2 N dimer are duplicates of those of the corresponding monomer, each being present in duplicate in the dimeric form, and hence improve the chance of antibody binding. A complete list of the epitopes predicted by ElliPro is given in Table 1. Further, a similar analysis was performed with the DiscoTope server, which predicted the probability of each residue being part of an epitope. For the monomer, it predicted 110 residues at epitopic sites at a DiscoTope threshold score of 0, whereas the dimer has 238 B-cell epitope residues out of 576 residues. A list of the predicted epitope residues is shown in Supplementary Table S1. Both servers suggested that the dimer has a greater number of structural epitopes and may have more affinity for antibodies.

[Figure caption fragment: ELISA based on increasing amount of N protein with dimer fractions of 10% and 55%. (c) ELISA based on increasing amount of N protein with dimer fractions of 5%, 25% and 75%.]

The SARS-CoV-2 nucleocapsid protein antibody is more sensitive than the spike protein antibody for ELISA-based identification of early infections [31]. In the present study, we purified the monomeric and dimeric forms of the nucleocapsid protein. The dimer-to-monomer ratio did not change in the concentration range under consideration. Samples rich in the monomeric or dimeric form were used to investigate the antigenic sensitivity to the SARS-CoV-2 nucleocapsid. The fraction with the highest dimer content demonstrated high sensitivity and a wider dynamic range for antibody detection. Later, we investigated the high sensitivity of the dimer using a computational approach. Structures of the N protein were modelled in their monomeric and dimeric forms. Solvent accessibility of hydrophobic residues was found to be lower in the dimeric form, [35]. This resulted in the monomeric structure of the SARS-CoV-2 N protein sequence. The dimeric structure was built using the PDB structure 6WZO as template [36].
PyMOL was used to superimpose the 6WZO structure and the modelled monomeric structure to build the dimeric form.

Solvent accessible surface area (SASA) calculation. SASA was calculated for the monomeric and dimeric forms using the NACCESS tool (http://www.bioinf.manchester.ac.uk/naccess/nacdownload.html).

Epitopic prediction. Discontinuous epitopes were predicted using the 3D structure of the protein. The monomer and the dimer were compared using several tools to identify the discontinuous fragments of the protein that can act as epitopes for antibody binding. ElliPro [37] was first used for this prediction. Residues 1-48 of the monomer and dimer model structures did not appear globular and lay at the terminus in an extended conformation. They were not included in epitope prediction to avoid false positives. Later, a similar analysis was performed with the DiscoTope server. This method predicted the probability of each residue being part of an epitope.

A plausible approach to combat the pandemic caused by SARS-CoV-2 is to improve the sensitivity of the existing diagnostics. The N protein of SARS-CoV-2 is an important candidate for the development of various diagnostic assays for COVID-19. This work presents a reliable and sensitive ELISA-based diagnostic assay for the detection of antibodies against the SARS-CoV-2 N protein. Full-length N protein expressed in E. coli consists mainly of monomeric and dimeric conformations in solution. The monomer and dimer conformations of the N protein were characterized by CD, which suggested improved secondary structure of the dimer due to oligomerization. Similarly, fluorescence spectroscopy indicates the exposure of buried tryptophan in the dimer upon oligomerization. Indirect ELISA developed using purified monomer and dimer N protein showed enhanced sensitivity of the N protein dimer as compared to the monomer.
We further confirm this observation by SASA, which predicts stabilization of the N protein dimer by the burial of hydrophobic residues. Our findings advance the present understanding of the SARS-CoV-2 N protein and its application in diagnostic assays. We further believe that the employment of N protein dimer

The authors declare that they do not have any competing interests.

Supplementary information of Table S1 is attached with the manuscript.
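Illustrative note on the SASA comparison. The percentage change in solvent exposure reported for hydrophobic residues (for example, A268 buried by 60% in the dimer) can be reproduced from per-residue SASA values with a simple calculation. The following is a minimal sketch, not the authors' pipeline: the function names, the set of residues treated as hydrophobic, and all area values are assumptions made for illustration, and the areas are invented numbers rather than NACCESS output from this study.

```python
# Minimal sketch of a per-residue burial calculation (assumed, not the authors' code).
# Keys are residue labels such as "A268"; values are SASA in square angstroms.

# Assumption: one-letter codes treated as hydrophobic for this illustration.
HYDROPHOBIC = set("AVILMFW")


def percent_buried(sasa_monomer, sasa_dimer):
    """Return the percentage decrease in SASA of each residue on dimerization.

    100 means the residue went from exposed to fully buried,
    0 means no change. Residues with zero SASA in the monomer
    (already buried) or missing from the dimer are skipped.
    """
    change = {}
    for residue, mono in sasa_monomer.items():
        if mono <= 0.0 or residue not in sasa_dimer:
            continue
        change[residue] = 100.0 * (mono - sasa_dimer[residue]) / mono
    return change


def hydrophobic_only(change):
    """Keep only residues whose one-letter code is in HYDROPHOBIC."""
    return {res: pct for res, pct in change.items() if res[0] in HYDROPHOBIC}


# Hypothetical SASA values, invented for illustration only.
monomer = {"I283": 40.0, "A268": 50.0, "A243": 20.0, "K100": 80.0}
dimer = {"I283": 0.0, "A268": 20.0, "A243": 13.0, "K100": 80.0}

burial = hydrophobic_only(percent_buried(monomer, dimer))
for residue, pct in sorted(burial.items()):
    print(f"{residue}: {pct:.0f}% buried in dimer")
```

In practice the two dictionaries would be filled from the per-residue output of a SASA tool run separately on the monomer and dimer models. The lysine entry is included only to show that non-hydrophobic residues are filtered out.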