key: cord-0809657-n7ylgqfu
authors: Giri, Rajanish; Bhardwaj, Taniya; Shegane, Meenakshi; Gehi, Bhuvaneshwari R.; Kumar, Prateek; Gadhave, Kundlik
title: Dark Proteome of Newly Emerged SARS-CoV-2 in Comparison with Human and Bat Coronaviruses
date: 2020-03-14
journal: bioRxiv
DOI: 10.1101/2020.03.13.990598
sha: 041020541892fdc53e22681e8ee5894623b04142
doc_id: 809657
cord_uid: n7ylgqfu
The recently emerged novel coronavirus from Wuhan, designated SARS-CoV-2, is the causative agent of coronavirus disease 2019 (COVID-19) and is spreading rapidly throughout the world. More than 4000 deaths had occurred worldwide at the time of writing, and this number is increasing every hour. The World Health Organization (WHO) has declared it a global public health emergency. Multiple sequence alignment data, correlated with already published reports on SARS-CoV-2, indicate that it is closely related to Bat Severe Acute Respiratory Syndrome-like coronavirus (Bat CoV SARS-like) and the well-studied Human SARS. In this study, we have used a complementary approach to examine the intrinsically disordered regions in the proteome of SARS-CoV-2, using Bat SARS-like and Human SARS CoVs as comparative models. According to our findings, the SARS-CoV-2 proteome consists largely of ordered proteins, except for Nucleocapsid, Nsp8, and ORF6. Further, cleavage sites in replicase 1ab polyprotein are found to be highly disordered. We have extensively investigated the dark proteome of SARS-CoV-2, which will have implications for the structured and unstructured biology of SARS or SARS-like coronaviruses.
Significance
The infection caused by the novel coronavirus (SARS-CoV-2) is responsible for the current pandemic, which causes severe respiratory disease and pneumonia-like infection in humans.
Currently, no such in-depth information on protein structure and function is available in the public domain, and no effective antivirals or vaccines are available for the treatment of this infection. Our study provides comparative order- and disorder-based proteome information with Human SARS and Bat CoV that may be useful for structure-based drug discovery.
third of genome codes for replicase gene (~20kb) containing genes for all non-structural viral proteins, while the remaining ~10kb of the genome contains genes for accessory proteins interspersed between the genes coding for structural proteins (Masters, 2006). For the expression of structural and accessory proteins, additional transcriptional regulatory sequences (TRS) are present within the viral genome. The 5' and 3' UTRs contain stem-loop regions required for viral RNA synthesis (Hussain et al, 2005). The ~20kb replicase gene ssRNA is translated first into two long polyproteins, replicase polyprotein 1a and 1ab, inside host cells. After cleavage by two viral proteases, the newly formed polyproteins yield 16 non-structural proteins that perform a wide range of functions for the virus inside the host cell. They also induce ER-derived double-membrane vesicles (DMVs) for viral replication and transcription. Structural proteins shape the outer cover of the virion, while accessory proteins are mostly involved in host immune evasion (Sawicki et al, 2007).
In this manuscript, we analyze the dark proteome of SARS-CoV-2, so a perspective on the ordered and disordered proteome is essential. In the classical structure-function paradigm, a stable and defined three-dimensional structure is a prerequisite for a protein to accomplish its role. This notion dominated for years before the idea of intrinsic disorder in protein structures came into full discussion and acceptance among structural biologists.
The other class of proteins fails to fold into well-defined structures and instead remains disordered under physiological buffer conditions. These proteins are called intrinsically disordered proteins (IDPs). Unstructured segments within proteins are called intrinsically disordered protein regions (IDPRs). The property of being intrinsically disordered is determined by the amino acid sequence (Van Der Lee et al, 2014; Wright & Dyson, 1999). IDPs exhibit their biological functions in signaling, gene regulation, and control processes by interacting with their physiological partners (Dunker et al, 2002, 2008; Liu et al, 2006; Uversky et al, 2005).
The mean predicted percentage of intrinsic disorder (mean PPID) scores, obtained by averaging the predicted disorder scores from six disorder predictors (Supplementary Tables 1-3) for each protein of SARS-CoV-2, Human SARS, and Bat CoV, are presented in table 1. * These sequences are based on genome annotations done by Wu et al. (Wu et al, 2020a).
Figures 2A, 2B and 2C are 2D-disorder plots of SARS-CoV-2, Human SARS and Bat CoV respectively and represent the PPID PONDR-FIT vs PPID Mean plot. Based on the predicted levels of intrinsic disorder, proteins can be classified as highly ordered (PPID < 10%), moderately disordered (10% ≤ PPID < 30%) and highly disordered (PPID ≥ 30%) (Rajagopalan et al, 2011). From the data in table 1 and figures 2A, 2B and 2C, together with the PPID-based classification, we conclude that the Nucleocapsid protein from all three coronaviruses possesses the highest percentage of disorder. ORF3b protein in Bat CoV, ORF6 protein in SARS-CoV-2, Human SARS, and Bat CoV, and ORF9b protein in Human SARS and SARS-CoV-2 belong to the class of moderately disordered proteins.
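The three PPID classes quoted above (Rajagopalan et al, 2011) amount to a simple threshold rule. The following is a minimal sketch of that rule, not the authors' code; the function name `classify_ppid` is hypothetical, and the example values are mean PPIDs reported elsewhere in this paper for SARS-CoV-2 proteins.

```python
def classify_ppid(ppid: float) -> str:
    """Assign a PPID value (in percent) to one of the three classes in the text."""
    if not 0.0 <= ppid <= 100.0:
        raise ValueError("PPID must be a percentage between 0 and 100")
    if ppid < 10.0:
        return "highly ordered"          # PPID < 10%
    if ppid < 30.0:
        return "moderately disordered"   # 10% <= PPID < 30%
    return "highly disordered"           # PPID >= 30%


# Mean PPIDs reported in this paper for three SARS-CoV-2 proteins.
reported_mean_ppid = {"Spike": 1.41, "ORF6": 22.95, "Nucleocapsid": 60.38}

for protein, value in reported_mean_ppid.items():
    print(f"{protein}: {value:.2f}% -> {classify_ppid(value)}")
```

The boundaries are half-open, so a protein at exactly 10% counts as moderately disordered and one at exactly 30% as highly disordered, matching the inequalities given in the text.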
The structural proteins, namely Spike glycoprotein (S), Envelope protein (E) and Membrane protein (M), as well as the accessory proteins ORF3a, ORF7a and ORF8 (ORF8a and ORF8b in the case of Human SARS) of all three coronaviruses, are ordered. ORF14 and ORF10 proteins also belong to the class of ordered proteins.
In the CH-CDF plots of the proteins of (D) SARS-CoV-2, (E) Human SARS and (F) Bat CoV, the Y coordinate of each protein spot signifies the distance of the corresponding protein from the boundary in the CH plot, and the X coordinate corresponds to the average distance of the CDF curve for the respective protein from the CDF boundary.
To further investigate the nature of the disorder in proteins of SARS-CoV-2, Human SARS, and Bat CoV, the results obtained from two binary classifiers of disorder, the charge-hydropathy (CH) plot and the cumulative distribution function (CDF) plot, were combined. This helped evaluate the disorder predisposition of the proteins based on their charge, hydropathy and PONDR® VLXT disorder scores. The CH plot is a linear classifier that differentiates proteins predisposed to possess extended disordered regions (random coils and pre-molten globules) from proteins that have compact conformations (ordered proteins and molten globule-like proteins). The other binary predictor, CDF, is a nonlinear classifier that uses PONDR® VLXT scores to discriminate ordered globular proteins from all disordered conformations, which include native molten globules, pre-molten globules, and random coils. The CH-CDF plot can be divided into four quadrants: Q1, which is expected to include ordered proteins; Q2, which includes proteins predicted to be disordered by CDF and ordered by CH; Q3, which represents proteins predicted to be disordered by both CH and CDF analysis; and Q4, which includes proteins disordered according to CH but ordered according to CDF analysis.
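The quadrant definitions above reduce to a lookup on the two binary calls. The sketch below is illustrative only: the function name `ch_cdf_quadrant` is hypothetical, and it takes the CH and CDF verdicts as booleans because this section does not state the sign convention needed to derive those verdicts from the plotted distances.

```python
def ch_cdf_quadrant(disordered_by_ch: bool, disordered_by_cdf: bool) -> str:
    """Return the CH-CDF quadrant for a protein given the two binary predictions."""
    if not disordered_by_ch and not disordered_by_cdf:
        return "Q1"  # ordered by both classifiers
    if not disordered_by_ch and disordered_by_cdf:
        return "Q2"  # disordered by CDF, ordered by CH
    if disordered_by_ch and disordered_by_cdf:
        return "Q3"  # disordered by both classifiers
    return "Q4"      # disordered by CH, ordered by CDF


# A protein called disordered by CDF but ordered by CH falls in Q2.
print(ch_cdf_quadrant(disordered_by_ch=False, disordered_by_cdf=True))
```

Keeping the two predictions as separate inputs makes the four cases explicit and avoids committing to an axis orientation that the text does not specify.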
Figures 2D, 2E and 2F represent the CH-CDF analysis of proteins of SARS-CoV-2, Human SARS, and Bat CoV; all the proteins are located within the two quadrants Q1 and Q2. The CH-CDF analysis leads to the conclusion that all proteins of SARS-CoV-2, Human SARS, and Bat CoV are ordered except Nucleocapsid protein, which is predicted to be disordered by CDF but ordered by CH and hence lies in Q2.
The mean disorder propensity, calculated by averaging the disorder scores from all six predictors, is represented by the short-dot (sky-blue) line in the graph. The light sky-blue shaded region signifies the mean error distribution. Residues missing from the PDB structure, or for which no PDB structure is available, are represented by the grey-coloured area in the graph. (D) A 3.60Å resolution structure of the spike glycoprotein of Human SARS obtained by cryo-EM (PDB ID: 6ACC), covering residues 18-239, 244-660, 674-811 and 832-1119. It consists of 3 chains, A (violet-red), B (dark khaki) and C (turquoise), which represent the subunits S1, S2 and S2' respectively.
MSA analysis among all three coronaviruses demonstrates that the S protein of SARS-CoV-2 has 77.71% sequence identity with Bat CoV and 77.14% identity with Human SARS (Supplementary Figure S1A). All three S proteins are found to have a conserved C-terminal region. However, the N-terminal regions of the S proteins display noteworthy differences. The significant sequence variation at the S RBD in the N-terminus might be the reason behind the variation in virulence and in receptor-mediated binding and entry into the host cell. According to our analysis of intrinsic disorder propensity, the S proteins of all three studied coronaviruses are highly structured, as their predicted disorder lies below 10% (Table 1). The mean PPID scores of the S proteins of SARS-CoV-2, Human SARS, and Bat CoV are calculated to be 1.41%, 1.12%, and 1.85% respectively.
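To make the averaging step concrete, the sketch below shows one way per-residue scores from several predictors could be turned into a PPID. It is not the authors' pipeline. Two points are assumptions: the 0.5 cutoff for calling a residue disordered is a common convention that this section does not state, and the text does not say whether the mean PPID is the average of per-predictor PPIDs or the PPID of the averaged profile, so both are shown. The predictor names and scores are hypothetical toy data.

```python
from __future__ import annotations

from statistics import mean

# Assumed cutoff: residues scoring at or above this value count as disordered.
DISORDER_THRESHOLD = 0.5


def ppid(scores: list[float], threshold: float = DISORDER_THRESHOLD) -> float:
    """Percentage of residues whose disorder score reaches the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


def mean_profile(profiles: dict[str, list[float]]) -> list[float]:
    """Average the per-residue scores across predictors, position by position."""
    if len({len(p) for p in profiles.values()}) != 1:
        raise ValueError("all predictors must score the same number of residues")
    return [mean(column) for column in zip(*profiles.values())]


def mean_ppid(profiles: dict[str, list[float]]) -> float:
    """Average of the PPID values computed separately for each predictor."""
    return mean(ppid(p) for p in profiles.values())


# Hypothetical scores for a four-residue toy sequence from two predictors.
toy_profiles = {
    "predictor_a": [0.9, 0.7, 0.3, 0.2],
    "predictor_b": [0.7, 0.2, 0.6, 0.1],
}

print(mean_ppid(toy_profiles))           # average of per-predictor PPIDs
print(ppid(mean_profile(toy_profiles)))  # PPID of the averaged profile
```

The two summaries can differ when predictors disagree on individual residues, so it is worth fixing one definition before comparing proteins.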
Graphs in figures 3A, 3B and 3C represent the intrinsic disorder propensity of each residue in the S proteins of SARS-CoV-2, Human SARS and Bat CoV obtained from six disorder predictors.
Envelope (E) glycoprotein: E glycoprotein is a multifunctional inner membrane protein that plays an important role in the assembly and morphogenesis of virions in the cell (Ruch & Machamer, 2012; Ujike & Taguchi, 2015; DeDiego et al, 2007). E protein consists of two ectodomains associated with the N- and C-terminal regions, and a transmembrane domain. It homo-oligomerizes to form pentameric, membrane-destabilizing transmembrane (TM) hairpins that form a pore necessary for its ion channel activity. Figure 4D shows the NMR structure (PDB ID: 2MM4) of residues 8-65 of the Human SARS envelope glycoprotein (Li et al, 2014). MSA results (Figure 4E) illustrate that this protein is highly conserved, with only three amino acid substitutions in the E protein of SARS-CoV-2, giving it 96% sequence similarity with Human SARS and Bat CoV. Bat CoV shares 100% sequence identity with Human SARS. The mean PPIDs calculated for the SARS-CoV-2, Human SARS, and Bat CoV E proteins are 5.33%, 6.58%, and 6.58% respectively (Table 1). The E protein is predicted to be reasonably well structured. Our predictions suggest that residues at the N- and C-termini display a higher tendency for disorder. The last 18 hydrophilic residues (59-76) have been reported to adopt a random-coil conformation with and without the addition of lipid membranes (Surya et al, 2013). Literature suggests that the last four amino acids of the C-terminal region of E protein, which contain a PDZ-binding motif, are involved in protein-protein interactions with the tight junction protein PALS1. PALS1 is involved in maintaining the polarity of epithelial cells in mammals (Teoh et al, 2010). We speculate that the disordered regions may facilitate interactions with other proteins as well.
The respective graphs in figures 4A, 4B, and 4C indicate the predicted intrinsic disorder in E proteins of SARS-CoV-2, Human SARS, and Bat CoV.
M glycoprotein plays an important role in virion assembly by interacting with the N and E proteins (Tseng et al, 2010, 2013; Corse & Machamer, 2003). Protein M interacts specifically with a short viral packaging signal containing coronavirus RNA in the absence of N protein, which highlights an important nucleocapsid-independent viral RNA packaging mechanism inside host cells (Narayanan et al, 2003). It gains high-mannose N-glycans in the ER, which are modified into complex N-glycans in the Golgi complex. Glycosylation of M protein is observed to be non-essential for virion fusion in cell culture (Nal et al, 2005; Voss et al, 2009). Cryo-EM and tomography data indicate that M adopts two distinct conformations: a compact M protein with high flexibility and low spike density, and an elongated M protein with a rigid structure and a narrow range of membrane curvature (Neuman et al, 2011). Regions of M glycoprotein are important as a dominant immunogen. This is illustrated in figure 5D by the X-ray diffraction-based crystal structure (PDB ID: 3I6G) of a complex of HLA class I histocompatibility antigen, A-2 alpha chain, and beta-2-microglobulin with a small peptide of the membrane glycoprotein (residues 88-96). The M protein of SARS-CoV-2 has a sequence similarity of 90.05% with Bat CoV and 89.59% with Human SARS M proteins (Figure 5E). Our analysis estimated the mean PPID of the M proteins of SARS-CoV-2, Human SARS, and Bat CoV to be 2.70%, 1.36%, and 1.36% respectively. This is in line with the previous publication by Goh et al. on Human SARS HKU4, where they found a mean PPID of 4% using additional predictors such as TopIDP and FoldIndex along with the predictors used in our study (Goh et al, 2013).
Figures 5A, 5B and 5C show the graphs for the predisposition of intrinsic disorder in residues of M proteins of SARS-CoV-2, Human SARS, and Bat CoV.
N protein is one of the major proteins playing a significant role in transcription and virion assembly of coronaviruses (McBride et al, 2014). It binds to viral genomic RNA, forming a ribonucleoprotein core required for encapsidation of RNA during viral particle assembly (Saikatendu et al, 2007). Formation of SARS-CoV virus-like particles (VLPs) has been reported to depend upon either M and E proteins or M and N proteins. For effective production and release of VLPs, co-expression of E and N proteins with M protein has been found to be necessary (Siu et al, 2008). N protein of Human SARS consists of two structural domains, the N-terminal domain (NTD: residues 45-181) and the C-terminal dimerization domain (CTD: residues 248-365), with a disordered patch between the domains. N protein has been demonstrated to bind viral RNA using both NTD and CTD (Chang et al, 2009). Figure 6D1 displays the resolved crystal structure (PDB ID: 1SSK) of Human SARS nucleocapsid protein (residues 45-181). Figure 6D2 shows another X-ray diffraction-based crystal structure (PDB ID: 2GIB) of Human SARS nucleocapsid protein (residues 270-366) (Yu et al, 2006). (Figure S1B). Our analysis revealed respective mean PPIDs of 60.38%, 71.09%, and 65.80% for the SARS-CoV-2, Human SARS, and Bat CoV N proteins. In accordance with the previously calculated intrinsic disorder (Goh et al, 2013), N protein is highly disordered in all three SARS viruses (Table 1). Graphs in figures 6A, 6B, and 6C depict the disorder in SARS-CoV-2, Human SARS, and Bat CoV nucleocapsid residues. As expected, the residues of the N protein of the novel coronavirus that show a tendency for disorder are similar to those reported for Human SARS N protein prior to this study (Goh et al, 2013).
The N and C-terminals are completely disordered with the central unstructured segment. In the novel N protein, residues 1-57, 64-102, 145-162, 166-289 and 362-422 are found to be disordered. These residues lie within the NTD and CTD regions which, due to their structural plasticity, were not crystallized in Human SARS N protein. SARS-CoV-2 has a disordered segment from residues 168-289, while Human SARS is predicted to have an unstructured segment from residues 145-289. Overall, all three N proteins are found to be highly disordered.
Literature suggests that some proteins are translated from genes interspersed between the genes of structural proteins. These proteins are termed accessory proteins, and many of them are proposed to function in viral pathogenesis (Narayanan et al, 2008).
ORF3a and ORF3b: ORF3a is a multifunctional protein with a molecular weight of ~31 kDa that has been found localized in different organelles inside host cells. Also referred to as U274, X1, and ORF3, the gene for this protein is present between the S and E genes of the SARS-CoV genome (Yuan et al, 2005a; Tan et al, 2004). The homo-tetramer complex of ORF3a has been demonstrated to form a potassium-ion channel on the host cell plasma membrane (Lu et al, 2006). It performs a major function during virion assembly by colocalizing with the E, M, and S viral proteins (Tan, 2005; McBride & Fielding, 2012). ORF3b protein localizes in the cytoplasm, nucleolus, and outer membrane of mitochondria (Yuan et al, 2005c, 2006). In Huh 7 cells, its over-expression has been linked with the activation of AP-1 via the ERK and JNK pathways (Varshney & Lal, 2011). Transfection of ORF3b-EGFP leads to cell growth arrest in the G0/G1 phase of Vero, 293 and COS-7 cells (Yuan et al, 2005b). ORF3a induces apoptosis via caspase 8/9-directed mitochondrial-mediated pathways, while ORF3b is reported to affect only caspase 3 pathways (Khan et al, 2006; Law et al, 2005b).
On performing MSA, shown in figure 7D, we found that the ORF3a protein of SARS-CoV-2 is closer to ORF3a of Bat CoV (73.36%) than to ORF3a of Human SARS (72.99%). Graphs in figures 7A, 7B and 7C depict the propensity for disorder in ORF3a proteins of SARS-CoV-2, Human SARS, and Bat CoV (SARS-like). The mean PPIDs of the ORF3a proteins are 9.09% for SARS-CoV-2, 8.76% for Human SARS, and 6.20% for Bat CoV (SARS-like). According to the analysis of intrinsic disorder, the mean PPIDs of the ORF3b proteins of SARS-CoV-2, Human SARS, and Bat CoV are 0%, 7.14%, and 23.08% respectively, as represented in figures 8A, 8B, and 8C. MSA results (Figure 8D) demonstrate that ORF3b of SARS-CoV-2 is not close to the ORF3b proteins of Human SARS or Bat CoV, having a sequence similarity of only 54.55% and 59.09% respectively.
ORF6: ORF6 is a very short SARS coronavirus protein of ~63 residues. Also known as P6, it is a membrane-associated protein that serves as an interferon (IFN) antagonist. It downregulates the IFN pathway by blocking a nuclear import protein, karyopherin α2. Using its C-terminal residues, ORF6 disrupts the karyopherin import complex in the cytosol and therefore hampers the movement of transcription factors like STAT1 into the nucleus (Frieman et al, 2007). It contains a YSEL motif near its C-terminal region, which functions in protein internalization from the plasma membrane into the endosomal vesicles (Netland et al, 2007). Another study has also demonstrated the presence of ORF6 in endosomal/lysosomal compartments (Netland et al, 2007; Gunalan et al, 2011). MSA results (Figure 9D) demonstrate that SARS-CoV-2 ORF6 is closer to the ORF6 protein of Human SARS, with a sequence similarity of 68.85%, than to ORF6 of Bat CoV (SARS-like) (67.21%). SARS-CoV-2 ORF6 is calculated to be the second most disordered structural protein, with a PPID of 22.95%, with the disorder concentrated at the C-terminal region.
Our analysis of intrinsic disorder using six predictors revealed mean PPIDs of 22.95%, 20.63%, and 20.63% for the ORF6 proteins of SARS-CoV-2, Human SARS, and Bat CoV respectively (Table 1). Graphs in figures 9A, 9B and 9C illustrate the disorder for each residue in the ORF6 proteins of all three studied coronaviruses. The moderately disordered protein ORF6 is predicted to have intrinsic disorder near its C-terminal region (Figures 9A, B, C). These regions are significant for the biological activities of ORF6. As mentioned above, this hydrophilic region contains a lysosomal targeting motif (YSEL) and a diacidic motif (DDEE) responsible for binding and recognition during translocation (Netland et al, 2007). However, the N-terminal region shows no predicted disorder. The literature on Human SARS suggests that the N-terminal region (amino acids 1-38) is alpha-helical and embedded in the membrane, although ORF6 is not a transmembrane protein (Zhou et al, 2010).
ORF7a and ORF7b: Alternatively called U122, ORF7a is a type I transmembrane protein (Huang et al, 2006). It has been shown to localize in the ER, Golgi, and peri-nuclear space. A KRKTE motif near the C-terminus directs transport of the protein from the ER to the Golgi apparatus (Huang et al, 2006). It contributes to viral pathogenesis by activating the release of pro-inflammatory cytokines and chemokines like IL-8 and RANTES (Kanzawa et al, 2006; Law et al, 2005a). In another study, overexpression of BCL-XL in 293T cells blocked ORF7a-mediated apoptosis (Tan et al, 2007). Figure 10D represents the 1.8Å X-ray diffraction-based crystal structure (PDB ID: 1XAK) of ORF7a of Human SARS, which revealed a compact seven-stranded topology similar to that of Ig-superfamily members (Nelson et al, 2005). Its structure includes a signal peptide, a luminal domain, a transmembrane domain and a short cytoplasmic tail at the C-terminal end (Nelson et al, 2005; McBride & Fielding, 2012).
We found that the 121-residue ORF7a protein of SARS-CoV-2 shares 89.26% and 85.95% sequence identity with the ORF7a proteins of Bat CoV and Human SARS, respectively (Figure 10E). The ORF7b protein of SARS-CoV-2 is found to be closer to that of Human SARS (81.40%) than to the Bat CoV sequence, with which it shares 79.07% sequence identity (Figure 11D). As can be observed from table 1, our disorder prediction gave overall PPIDs for the ORF7a proteins of 1.65% for SARS-CoV-2, 0.82% for Bat CoV and 0.82% for Human SARS. The mean PPIDs estimated for the ORF7b proteins are 9.30% for SARS-CoV-2, 4.55% for Bat CoV and 4.55% for Human SARS. Graphs in figures 10A, 10B, and 10C represent the residues predisposed to disorder in ORF7a proteins of SARS-CoV-2, Human SARS, and Bat CoV respectively. Graphs in figures 11A, 11B, and 11C depict the residues predisposed to disorder in ORF7b proteins of SARS-CoV-2, Human SARS, and Bat CoV respectively. According to our analysis, both proteins in all three studied coronaviruses have a significantly ordered structure. ORF7b is an integral membrane protein that has been shown to localize in the Golgi complex (Schaecher et al, 2007; Kopecky-Bromberg et al, 2006). The same reports also confirm the role of ORF7b as an accessory as well as a structural protein of the SARS-CoV virion.
In animals and in isolates from early human infections, the ORF8 gene codes for a single ORF8 protein. However, in infections at the middle and late stages, a 29-nucleotide deletion was observed to lead to the formation of two distinct proteins, ORF8a and ORF8b, having 39 and 84 residues respectively (Oostra et al, 2007a; Chinese SARS Molecular Epidemiology Consortium, 2004). Both proteins have a conformation distinct from that of the longer ORF8 protein. It has been reported that overexpression of ORF8b resulted in the downregulation of E protein, while the proteins ORF8a and ORF8/ORF8ab have no effect on the expression of protein E. Also, ORF8/ORF8ab was found to interact very strongly with proteins S, ORF3a and ORF7a.
ORF8a interacts with S and E proteins, whereas ORF8b protein interacts with E, M, ORF3a and ORF7a proteins (Keng et al, 2006). Early SARS-CoV-2 isolates also have a single, longer ORF8 protein of 121 residues and, according to our analysis, it shares 90.05% sequence identity with the ORF8 protein of Bat CoV (Figure 12C). Our IDP analysis revealed no disorder in either ORF8 protein (of SARS-CoV-2 and Bat CoV), as can be observed from the graphs in figures 12A and 12B. Both proteins are predicted to be completely structured, with a mean PPID of 0.00%. In the ORF8a and ORF8b proteins of Human SARS, the predicted disorder is estimated to be 2.56% and 2.38% respectively (Table 1). Graphs in figures 13A and 13B illustrate a little disorder near the termini of the ORF8a and ORF8b proteins.
ORF9b: This protein is expressed from an alternative ORF within the N gene through a leaky ribosome binding process (Xu et al, 2009). Inside host cells, ORF9b enters the nucleus passively, in a cell cycle-independent process. It has been shown to interact with the nuclear export receptor Exportin 1 (Crm1), through which it translocates out of the nucleus (Sharma et al, 2011). A 2.8Å resolution crystal structure (PDB ID: 2CME) of the ORF9b protein of Human SARS revealed a dimeric tent-like structure along with central hydrophobic amino acids (Figure 14D). The published structure has a highly polarized distribution of charges, with positively charged residues on one side and negatively charged residues on the other (Meier et al, 2006). In the sequence record with Accession ID NC_045512.2, the translated protein sequence of ORF9b is not yet reported for SARS-CoV-2. However, Wu and colleagues have already annotated the sequences of SARS-CoV-2, so we have taken the sequences from their report and analyzed the intrinsic disorder in our manuscript.
According to the MSA represented in figure 14E, the ORF9b protein of SARS-CoV-2 shares 73.20% similarity with Human SARS and 74.23% similarity with Bat CoV. Our IDP analysis (Table 1) shows that ORF9b of Human SARS is a moderately unstructured protein with a mean PPID estimated at 26.53%. As depicted in the graphs of figures 14A, 14B, and 14C, disorder in the Human SARS ORF9b protein mainly lies in residues 1-10 at the N-terminal end and residues 28-40 near the central region, with a well-ordered inner core. The X-ray diffraction-based crystal structure of ORF9b has missing electron density for the first 8 residues and for residues 26-37 near the central region. This might be because disordered regions are difficult to crystallize due to their highly dynamic structural conformations. The SARS-CoV-2 ORF9b protein, with a mean PPID of 10.31%, has a predicted disordered segment at the N-terminus (residues 1-10). ORF9b of Bat CoV is predicted to have an intrinsic disorder of 9.28%, lower than that of the Human SARS ORF9b protein.
The newly emerged SARS-CoV-2 has an ORF10 protein of 38 amino acids. ORF10 of SARS-CoV-2 has 100% sequence similarity with that of Bat CoV strain Bat-SL-CoVZC45 (Wu et al, 2020b). However, we have not done the IDP analysis for ORF10 from the Bat-SL-CoVZC45 strain, since we have used a different strain of Bat CoV (reviewed strain HKU3-1) in our study. Thus, we have done the IDP analysis for the ORF10 protein of SARS-CoV-2 only, according to which the mean PPID is calculated to be 0.00%. Figure 15 shows the graph of the predisposition of intrinsic disorder in residues of the ORF10 protein.
ORF14: This 70-amino-acid uncharacterized protein of unknown function is present in Human SARS and Bat CoV. However, SARS-CoV-2 ORF14 is 73 amino acids long. According to the MSA, ORF14 of SARS-CoV-2 has 77.14% similarity with Human SARS and 72.86% similarity with Bat CoV, as represented in figure 16D.
We have performed the IDP analysis to understand the presence of disorder in this protein. The graphs in figures 16A, 16B and 16C illustrate the intrinsic disorder in residues of ORF14 of SARS-CoV-2, Human SARS, and Bat CoV, with calculated mean PPIDs of 0.00%, 2.86%, and 0.00% respectively.
In coronaviruses, due to ribosomal leakage during translation, two-thirds of the RNA genome is processed into two polyproteins: (i) replicase polyprotein 1a and (ii) replicase polyprotein 1ab. Both contain non-structural proteins (Nsp1-10) in addition to different proteins required for viral replication and pathogenesis. Replicase polyprotein 1a contains an additional Nsp11 protein of 13 amino acids, the function of which has not been investigated yet. The longer replicase polyprotein 1ab of 7073 amino acids accommodates five other non-structural proteins (Nsp12-16) (Thiel et al, 2003). These proteins assist in ER membrane-induced vesicle formation; the vesicles act as sites for replication and transcription. In addition to this, non-structural proteins work as proteases, helicases, and mRNA capping and methylation enzymes, crucial for virus survival and replication inside host cells (Thiel et al, 2003; Fan et al, 2004).
Based on the PPID classification (Rajagopalan et al, 2011), we conclude that none of the Nsps in SARS-CoV-2, Human SARS and Bat CoV are highly disordered. The highest disorder was observed in Nsp8 of all three coronaviruses. Both Nsp1 and Nsp8 are moderately disordered proteins (10% ≤ PPID < 30%). We also observed that Nsp2, Nsp3, Nsp5, Nsp6, Nsp7, Nsp9, Nsp10, Nsp15 and Nsp16 have disorder of less than 10% and hence belong to the category of highly ordered proteins. The other non-structural proteins, namely Nsp4, Nsp12, Nsp13 and Nsp14, have negligible levels of disorder (PPID < 1%), which indicates that these are highly structured proteins.
The CH-CDF analyses of the Nsps from SARS-CoV-2, Human SARS and Bat CoV are represented in figures 17D, 17E and 17F respectively. All the Nsps of the three coronaviruses occur in quadrant Q1, from which we conclude that all the Nsps are predicted to be ordered according to the binary predictors CH and CDF.
The longer replicase polyprotein 1ab is 7073 amino acids long and contains 15 non-structural proteins, which are listed in table 2. Nsp1, Nsp2, and Nsp3 are cleaved by a viral papain-like proteinase (Nsp3/PL-Pro), while the rest are cleaved by another viral 3C-like proteinase (Nsp5/3CL-Pro). Based on the cleavage of the replicase 1ab polyprotein of Human SARS by the two proteases, we have shown the disorder propensity at the cleavage sites, together with a few residues spanning the termini (Figure 18). Interestingly, we observed that all the cleavage sites are largely disordered, suggesting that intrinsic disorder may have a role in the maturation of individual non-structural proteins. As the Nsps of Human SARS share high percentage identity with the Nsps of SARS-CoV-2, we speculate that intrinsic disorder is also present at the cleavage sites in the polyproteins of SARS-CoV-2. Additionally, their structural and functional properties are described below along with the predicted intrinsically disordered regions.
Nsp1 works as a host translation inhibitor: it binds to the 40S subunit of the ribosome and blocks the translation of cap-dependent mRNAs as well as mRNAs that use an internal ribosome entry site (IRES) (Lokugamage et al, 2012). Figure 19D shows the NMR solution structure (PDB ID: 2GDT) of Human SARS Nsp1 protein (residues 13-128); residues 117-180 are missing from this structure (Almeida et al, 2007).
SARS-CoV-2 Nsp1 protein shares 84.44% sequence identity with Nsp1 of Human SARS and 83.80% with Nsp1 of Bat CoV. Its N-terminal region is found to be more conserved than the rest of the protein sequence (Figure 19E). The mean PPIDs of the Nsp1 proteins of SARS-CoV-2, Human SARS, and Bat CoV are 12.78%, 14.44%, and 12.85%, respectively. Figures 19A, 19B, and 19C illustrate the graphs of predicted intrinsic disorder propensity in residues of the Nsp1 proteins of SARS-CoV-2, Human SARS, and Bat CoV. According to the analysis, the following residues are predicted to be disordered: SARS-CoV-2 (1-7, 165-180), Human SARS (1-5, 165-180), and Bat CoV (165-179). The NMR-based structure of Nsp1 of Human SARS revealed the presence of two unstructured segments near the N-terminal (residues 1-12) and C-terminal (residues 129-179) regions (Almeida et al, 2007). The disordered region at the C-terminus (residues 128-180) is characterized as important for Nsp1 expression (Jauregui et al, 2013). Based on sequence homology with the Human SARS Nsp1 protein, the predicted disordered C-terminal region of SARS-CoV-2 Nsp1 may play a critical role in its expression.
Nsp2 protein is not found to be highly conserved in SARS-CoV-2. However, it shares nearly equal similarity with both analogs, with a percentage identity of 68.34% with Nsp2 of Human SARS and 68.97% with Nsp2 of Bat CoV (Figure 20D). We have estimated the mean PPIDs of the Nsp2 proteins of SARS-CoV-2, Human SARS, and Bat CoV to be 5.17%, 2.04%, and 2.03% respectively. The predisposition to intrinsic disorder in residues of the Nsp2 proteins of SARS-CoV-2, Human SARS, and Bat CoV is depicted in the graphs in figures 20A, 20B, and 20C.
According to the analysis, the following residues in Nsp2 proteins are predicted to be disordered: SARS-CoV-2 (570-595), Human SARS (110-115), and Bat CoV (112-116).

Non-structural protein 3 (Nsp3): Nsp3 is a viral papain-like protease that affects the phosphorylation and activation of IRF3 and therefore antagonizes the IFN pathway. In another biochemical study, it was demonstrated that Nsp3 works by stabilizing the NF-κB inhibitor, thereby blocking the NF-κB pathway (Frieman et al, 2009). Figure 21D depicts the 1.85Å X-ray diffraction-based crystal structure (PDB ID: 2FE8) of the catalytic core of Nsp3 protein of Human SARS, obtained by Andrew and colleagues. This structure comprises residues 723-1036. It revealed a fold similar to that of a deubiquitinating enzyme, and its in vitro deubiquitinating activity was found to be high (Ratia et al, 2006). Nsp3 protein of SARS-CoV-2 contains several substituted residues throughout the protein. It is almost equally close to the Nsp3 proteins of Human SARS and Bat CoV, sharing 76.69% and 76.31% identity, respectively (Supplementary Figure S2A). According to our results, the mean PPIDs of Nsp3 proteins of SARS-CoV-2, Human SARS, and Bat CoV are 7.40%, 7.91%, and 7.78%, respectively (Table 2). Figures 21A, 21B, and 21C portray the tendency for intrinsic disorder in residues of Nsp3 proteins of SARS-CoV-2, Human SARS, and Bat CoV. Nsp3 proteins of all three studied SARS viruses were found to be highly structured. According to the analysis, the following residues in Nsp3 proteins are predicted to be disordered: SARS-CoV-2 (1-5, 105-199, 1221-1238), Human SARS (102-189, 355-384, 1195-1223), and Bat CoV (107-182, 352-376, 1191-1217).

Non-structural protein 4 (Nsp4): Nsp4 has been reported to induce double-membrane vesicles (DMVs) when co-expressed with full-length Nsp3 and Nsp6 proteins for optimal replication inside host cells (Angelini et al, 2013; Hagemeijer et al, 2011; Sakai et al, 2017).
It localizes to the ER membrane when expressed alone but has been shown to be present in replication units in infected cells. Nsp4 protein forms a tetraspanning transmembrane region with its N- and C-termini in the cytosol (Oostra et al, 2007b). No crystal or NMR solution structure has been reported for this protein. Nsp4 protein of SARS-CoV-2 has multiple substitutions near the N-terminal region and a quite conserved C-terminus (Supplementary Figure S2B). It is closer to Nsp4 of Bat CoV (81.40% identity) than to Nsp4 of Human SARS (80%). Mean PPIDs of Nsp4 proteins of SARS-CoV-2, Human SARS, and Bat CoV are estimated to be 0.80%, 0.60%, and 0.60%, respectively. The predicted intrinsic disorder propensity in residues of Nsp4 proteins of SARS-CoV-2, Human SARS, and Bat CoV is depicted in Figures 22A, 22B, and 22C. With PPIDs close to zero, Nsp4 proteins are predicted to be highly structured, and no disordered region was found in them.

Non-structural protein 5 (Nsp5): Also referred to as 3CL-pro, Nsp5 works as a protease and cleaves the replicase polyproteins (1a and 1ab) at 11 major sites (Tomar et al, 2015; Sparks et al, 2008). The crystal structure (PDB ID: 5C5O) obtained by X-ray diffraction, covering residues 3241-3546, is shown in Figure 23D. In this complex, the 3CL-protease is bound to a phenyl-beta-alanyl (S,R)-N-decalin type inhibitor. Another crystal structure resolved to 1.96Å revealed a chymotrypsin-like fold and a conserved substrate-binding site connected to a novel α-helical fold (Anand et al, 2002). Recently, the crystal structure of Nsp5 of SARS-CoV-2 has been determined by X-ray diffraction (PDB ID: 6LU7) (Figure 23E). Nsp5 protein is conserved in all three studied SARS viruses. SARS-CoV-2 Nsp5 shares 96.08% sequence identity with Nsp5 of Human SARS and 95.42% with Nsp5 of Bat CoV (Supplementary Figure S2C).
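Percentage identities such as those quoted above are read off a pairwise or multiple sequence alignment. The sketch below shows one common way to compute such a value from two aligned sequences. It is an illustration under stated assumptions rather than the method used in this study: the function name is hypothetical, and the denominator (all alignment columns except those where both sequences have a gap) is one of several conventions used by alignment tools.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Assumed convention: identical non-gap columns divided by the number of
    columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue                 # column empty in both rows, skip it
        compared += 1
        if res_a == res_b:           # both are residues here and identical
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical 306-column alignment (the length of the 3241-3546 span)
    # with 12 substituted positions; the count of 12 is an assumption.
    seq_a = "A" * 306
    seq_b = "A" * 294 + "G" * 12
    print(round(percent_identity(seq_a, seq_b), 2))  # 294/306, about 96.08
```

The 306-residue length follows from the 3241-3546 residue span quoted for the Nsp5 structure. The figure of 12 substitutions is not stated in the text; it is used here only because 294/306 reproduces the reported 96.08%.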
Our results show that the mean PPID of Nsp5 proteins of SARS-CoV-2, Human SARS, and Bat CoV is equal, at 1.96% (Table 2). The predicted intrinsic disorder propensity in residues of the respective Nsp5 proteins (SARS-CoV-2, Human SARS, and Bat CoV) is illustrated in Figures 23A, 23B, and 23C. As the graphs depict, Nsp5 proteins have no significant intrinsic disorder; only residues 1-6 are predicted to be disordered in all three CoVs.

Non-structural protein 6 (Nsp6): Nsp6 protein is involved in blocking the ER-induced autophagosome/autolysosome vesicles that restrict viral production inside host cells. It induces autophagy by activating the omegasome pathway, which cells normally employ in response to starvation. SARS Nsp6 leads to the generation of small autophagosome vesicles, thereby limiting their expansion (Cottam et al, 2014). Nsp6 of SARS-CoV-2 is equally close to Nsp6 of both Human SARS and Bat CoV, with a sequence identity of 87.24% (Figure 24D). According to our results, mean PPIDs of Nsp6 proteins are calculated to be 1.03% for SARS-CoV-2, 1.03% for Human SARS, and 4.48% for Bat CoV. Figures 24A, 24B, and 24C show the respective graphs of intrinsic disorder tendency in Nsp6 proteins of SARS-CoV-2, Human SARS, and Bat CoV. From our IDP prediction results, we found no significant disorder in any of the three Nsp6 proteins.

Non-structural proteins 7 and 8 (Nsp7 and Nsp8): The ~10kDa Nsp7 protein helps in primase-independent de novo initiation of viral RNA replication by forming a hexadecameric ring-like structure with Nsp8 protein (te Velthuis et al, 2012; Zhai et al, 2005). Non-structural proteins 7 and 8 each contribute 8 molecules to the ring-structured multimeric viral RNA polymerase. Site-directed mutagenesis of Nsp8 protein revealed a D/ExD/E motif essential for in vitro catalysis (te Velthuis et al, 2012). Figure 25D depicts the 3.1Å resolution electron microscopy-based structure (PDB ID: 6NUR) of the Nsp12 (RdRp) protein bound to Nsp8 and Nsp7.
The structure identified conserved neutral Nsp7 and Nsp8 binding sites overlapping with the finger and thumb domains on Nsp12 of the virus (Kirchdoerfer & Ward, 2019). We found that Nsp7 of SARS-CoV-2 shares 100% sequence identity with Nsp7 of Bat CoV and 98.80% with Nsp7 of Human SARS (Figure 25E), while Nsp8 of SARS-CoV-2 is closer to Nsp8 of Human SARS (97.47%) than to Nsp8 of Bat CoV (96.46%) (Figure 26D). Due to conserved residues, mean PPIDs of all Nsp7 proteins were found to be equal, at 9.64%. Both SARS-CoV-2 and Human SARS Nsp8 proteins were calculated to have a mean PPID of 23.74%, and for Nsp8 of Bat CoV the mean disorder is predicted to be 22.22%. Figures 25A, 25B, and 25C display the graphs of predicted intrinsic disorder tendency in Nsp7 proteins of SARS-CoV-2, Human SARS, and Bat CoV. Figures 26A, 26B, and 26C represent the predicted intrinsic disorder propensity in Nsp8 proteins of SARS-CoV-2, Human SARS, and Bat CoV. As our analysis suggests, Nsp7 proteins are predicted to be well structured, while Nsp8 proteins have a moderate amount of disorder. Nsp8 proteins are predicted to have a long disordered segment spanning residues 44-84 in both SARS-CoV-2 and Human SARS and residues 48-84 in Bat CoV.

Non-structural protein 9 (Nsp9): Nsp9 protein is a single-stranded RNA-binding protein (Egloff et al, 2004). It might provide protection from nucleases by binding and stabilizing viral nucleic acids during replication or transcription (Egloff et al, 2004). Presumed to have evolved from a protease, Nsp9 forms a dimer using its GXXXG motif (Ponnusamy et al, 2008; Miknis et al, 2009). Figure 27D shows a 2.7Å crystal structure (PDB ID: 1QZ8) of Human SARS Nsp9 that identified an oligosaccharide/oligonucleotide fold-like fold in its structure (Egloff et al, 2004). Each monomer contains a cone-shaped β-barrel and a C-terminal α-helix arranged into a compact domain (Egloff et al, 2004).
Nsp9 of SARS-CoV-2 is equally identical to the Nsp9 proteins of both Human SARS and Bat CoV, with a percentage identity of 97.35%. Differences in three amino acids, at positions 34, 35, and 48, account for this similarity score (Figure 27E). As calculated, the mean PPIDs of Nsp9 proteins of SARS-CoV-2, Human SARS, and Bat CoV are 7.08%, 7.96%, and 7.08%, respectively. Figures 27A, 27B, and 27C depict the predicted intrinsic disorder propensity in the Nsp9 proteins of SARS-CoV-2, Human SARS, and Bat CoV. According to our analysis of intrinsic disorder, all three Nsp9 proteins are largely structured.

Non-structural protein 10 (Nsp10): Nsp10 performs several functions for SARS-CoV. It forms a complex with Nsp14 for hydrolysis of dsRNA in the 3′ to 5′ direction and activates its exonuclease activity (Bouvet et al, 2012). It also stimulates the MTase activity of Nsp14 protein required during RNA-cap formation after replication (Bouvet et al, 2010). Figure 28D represents the structure (PDB ID: 5C8T) of the Nsp10/Nsp14 complex. Consistent with the previous biochemical experimental results, the structure identified important interactions with the ExoN (exonuclease) domain of Nsp14 without affecting its N7-Mtase activity (Bouvet et al, 2012, 2010). SARS-CoV-2 Nsp10 protein is quite conserved, having 97.12% sequence identity with Nsp10 of Human SARS and 97.84% with Nsp10 of Bat CoV (Figure 28E). Mean PPIDs of all three studied Nsp10 proteins are found to be 5.04%. Figures 28A, 28B, and 28C show the graphs of predicted intrinsic disorder tendency in Nsp10 proteins of SARS-CoV-2, Human SARS, and Bat CoV, respectively. Our predictions suggest the absence of significant disorder among the studied Nsp10 proteins.

Non-structural protein 12 (Nsp12): Nsp12 is an RNA-dependent RNA Polymerase (RDRP) of coronaviruses.
It carries out both primer-independent and primer-dependent synthesis of viral RNA with Mn2+ as its metallic co-factor and viral Nsp7 and Nsp8 as protein co-factors (Ahn et al, 2012). As mentioned above, a 3.1Å resolution structure (PDB ID: 6NUR) of Human SARS Nsp12 in association with Nsp7 and Nsp8 proteins has been determined using electron microscopy (Figure 25D). Nsp12 has a polymerase domain resembling a "right hand", comprising a finger domain (residues 398-581 and 628-687), a palm domain (residues 582-627 and 688-815), and a thumb domain (residues 816-919) (Kirchdoerfer & Ward, 2019). SARS-CoV-2 Nsp12 protein has a conserved C-terminus (Supplementary Figure S2D). It shares 96.35% sequence identity with Human SARS Nsp12 protein and 95.60% with Bat CoV Nsp12. Mean PPID in all three Nsp12 proteins is estimated to be 0.43% (Table 2). Figures 29A, 29B, and 29C illustrate the respective predicted intrinsic disorder in Nsp12 proteins of SARS-CoV-2, Human SARS, and Bat CoV. As expected, no significant disorder was predicted in Nsp12 proteins.

Non-structural protein 13 (Nsp13): Nsp13 functions as a viral helicase and unwinds dsDNA/dsRNA with 5' to 3' polarity (Adedeji et al, 2012). Recombinant helicase expressed in the E. coli Rosetta 2 strain has been reported to unwind ~280 bp per second (Adedeji et al, 2012). Figure 30D represents a 2.8Å X-ray diffraction-based crystal structure (PDB ID: 6JYT) of Human SARS Nsp13 protein. The helicase contains a β19-β20 loop on its 1A domain, which is primarily responsible for its unwinding activity. Moreover, the study revealed an important interaction of Nsp12 protein with Nsp13 that further enhances its helicase activity (Jia et al, 2019). The 601 amino acid long Nsp13 protein of SARS-CoV-2 is almost fully conserved, as it shares 99.83% sequence identity with Nsp13 of Human SARS and 98.84% with Nsp13 of Bat CoV (Supplementary Figure S2E). According to our results, the mean PPIDs of all three Nsp13 proteins are estimated to be 0.67%.
Figures 30A, 30B, and 30C show the respective graphs for the predisposition for intrinsic disorder in Nsp13 proteins of SARS-CoV-2, Human SARS, and Bat CoV. Figure 30D shows the PDB structure of helicase/Nsp13 (PDB ID: 6JYT), covering residues 5302-5902 (sienna colour). No significant disorder is found in any of the three Nsp13 proteins.

Non-structural protein 14 (Nsp14): The multifunctional Nsp14 acts as an exoribonuclease (ExoN) and methyltransferase (N7-Mtase) for SARS coronaviruses. Its 3' to 5' exonuclease activity lies in conserved DEDD residues related to the exonuclease superfamily (Minskaia et al, 2006). Its guanine-N7 methyltransferase activity depends upon S-adenosyl-L-methionine (AdoMet) as a cofactor (Bouvet et al, 2010). As mentioned previously, Nsp14 requires Nsp10 protein for activating its ExoN and N7-Mtase activity inside host cells. Figure 28D depicts the 3.2Å crystal structure (PDB ID: 5C8T) of the Human SARS Nsp10/Nsp14 complex, determined by X-ray diffraction. Amino acids 1-287 form the ExoN domain and residues 288-527 form the N7-Mtase domain of Nsp14. A loop spanning residues 288-301 is essential for its N7-Mtase activity (Ma et al, 2015). SARS-CoV-2 Nsp14 protein shares 95.07% identity with Human SARS Nsp14 and 94.69% with Bat CoV Nsp14 (Supplementary Figure S2F). Mean PPID of Nsp14 proteins of SARS-CoV-2 and Human SARS is calculated to be 0.38%, while Nsp14 of Bat CoV has a mean PPID of 0.57%. The predicted intrinsic disorder propensity in residues of Nsp14 proteins of SARS-CoV-2, Human SARS, and Bat CoV is represented in Figures 31A, 31B, and 31C. As can be observed from the PPID values, all Nsp14 proteins are found to be highly structured.

Non-structural protein 15 (Nsp15): Nsp15 is a uridylate-specific RNA endonuclease (NendoU) that creates 2′-3′ cyclic phosphates after cleavage. Its endonuclease activity depends upon Mn2+ ions as co-factors. Conserved in nidoviruses, it acts as an important genetic marker due to its absence in other RNA viruses (Ivanov et al, 2004).
As illustrated in Figure 32D, Bruno and colleagues deduced a 2.6Å crystal structure (PDB ID: 2H85) of uridylate-specific Nsp15 using X-ray diffraction. The monomeric Nsp15 has three domains: an N-terminal domain (residues 1-62) formed by a three-stranded anti-parallel β-sheet and two α-helices packed together; a middle domain (residues 63-191) containing an α-helix connected by a 39 amino acid long coil to a region of two α-helices and five β-strands; and a C-terminal domain (residues 192-345) consisting of two anti-parallel three-stranded β-sheets on each side of a central α-helical core (Ricagno et al, 2006). Nsp15 protein is found to be quite conserved across human and bat CoVs. It shares 88.73% sequence identity with Nsp15 of Human SARS and 88.15% with Nsp15 of Bat CoV (Supplementary Figure S2G). The calculated mean PPIDs of Nsp15 proteins of SARS-CoV-2, Human SARS, and Bat CoV are 1.73%, 2.60%, and 2.60%, respectively. The predicted intrinsic disorder propensity in residues of Nsp15 proteins of SARS-CoV-2, Human SARS, and Bat CoV is represented in Figures 32A, 32B, and 32C. No significant disorder was found in the studied Nsp15 proteins.

Non-structural protein 16 (Nsp16): Nsp16 protein is another Mtase domain-containing protein. As methylation of coronavirus mRNAs occurs in steps, three proteins, Nsp10, Nsp14, and Nsp16, act one after another. The first event requires the initiation trigger from Nsp10 protein, after which Nsp14 methylates capped mRNAs, forming cap-0 (7Me)GpppA-RNAs. Nsp16 protein, along with its co-activator protein Nsp10, then acts on cap-0 (7Me)GpppA-RNAs to give rise to the final cap-1 (7Me)GpppA(2'OMe)-RNAs (Bouvet et al, 2010; Decroly et al, 2008). A 2Å X-ray diffraction-based structure (PDB ID: 3R24) of the Human SARS Nsp10-Nsp16 complex is depicted in Figure 33D. The structure consists of the characteristic fold of the class I MTase family, comprising α-helices and loops surrounding a seven-stranded β-sheet (Chen et al, 2011).
Nsp16 protein of SARS-CoV-2 is found to be equally similar to the Nsp16 proteins of Human SARS and Bat CoV (93.29%) (Supplementary Figure S2H). As observed using different predictors, the mean PPIDs of Nsp16 proteins of SARS-CoV-2, Human SARS, and Bat CoV are 5.37%, 3.02%, and 3.02%, respectively. Figures 33A, 33B, and 33C show the predicted predisposition for intrinsic disorder in Nsp16 proteins of SARS-CoV-2, Human SARS, and Bat CoV, respectively. We found no significant disorder in Nsp16 proteins, which are predicted to be structured.

As replicase polyprotein 1a contains non-structural proteins 1-10 identical to those of 1ab, we have not performed their IDP analysis separately. However, 1a has one additional non-structural protein, designated Nsp11. This is an uncharacterized protein cleaved from replicase polyprotein 1a, and experimental insights are required to characterize this small protein of unknown function. The software used in this study requires sequences longer than 30 amino acids; because the Nsp11 proteins of all three studied coronaviruses are shorter than this, they do not show any disordered residues. The MSA shows that Nsp11 protein of SARS-CoV-2 has a similarity of 84.62% with the Nsp11 proteins of Human SARS and Bat CoV (Figure 34).

The emergence of viruses and associated deaths around the globe is a major concern to mankind. There is still very little information available in the public domain regarding the protein structures and functions of SARS-CoV-2. Published reports have suggested the functions of its proteins based on their similarity with those of Human SARS and Bat CoV. Using available information on its genome and translated proteome from GenBank, we have carried out a comprehensive analysis of the disorder present in the proteins of SARS-CoV-2. Additionally, a comparison is made with its close relatives from the same group of betacoronaviruses: Human SARS and Bat CoV. As our results suggest, the N proteins are predicted to be highly disordered, with PPIDs of more than 60%.
Another moderately disordered protein is encoded by ORF6, which downregulates the interferon pathway. All other proteins have shown fewer disordered regions, indicating a three-dimensional structure in the native state. Generally, IDPs undergo a structural transition upon association with their physiological partners; therefore, this study will help to understand the interactions of other viral proteins as well as host proteins in different physiological conditions. It will also guide structural biologists in carrying out structure-based analyses of the viral proteome to explore paths for the development of new drugs and vaccines.

The periodic outbreaks of pathogens worldwide are a constant reminder of the lack of suitable drugs or vaccines for proper cure or treatment. In 2003, nearly 750 deaths were reported due to the SARS outbreak in more than 24 countries, but this time, the outbreak of Wuhan's novel coronavirus has quickly surpassed this number, indicating more casualties in the near future. The lack of accurate information and ignorance of primary symptoms are major reasons for many infection cases. The actual cause of the SARS-CoV-2 spread is still unknown, although researchers and Chinese authorities have made some assumptions and have confirmed its transmission from human to human. The outbreak has also had major impacts on education and the economy worldwide due to several restrictions, such as those on travel and transportation. Owing to advances in modern techniques, the full genome sequence was made available within a few days of the first infection report from Wuhan, China. However, further research needs to be done to identify its actual cause and a suitable treatment in the near future.

There are certain possibilities that can be explored with the available information. More in-depth experimental studies using molecular and cell biology to establish structure-function relationships are required to understand the proper functioning of the virus.
Additionally, based on homology and other information on protein-protein interactions, the associated viral and host proteins that help in carrying out replication, maturation, and ultimately pathogenesis should be explored. Using structural biology, various goals including drug development could be achieved, for example through high-throughput screening of compounds, both virtually and experimentally. The thorough disorder analysis of three coronaviruses in this study will also help structural biologists to design experiments with this information in mind.

Authors Contribution: RG: Conception and design, interpretation of data, writing, and review of the manuscript and study supervision. MS, TB, PK, BRG, KG: acquisition and interpretation of data, writing of the manuscript.

Mechanism of nucleic acid unwinding by SARS-CoV helicase Biochemical characterization of a recombinant SARS coronavirus nsp12 RNA-dependent RNA polymerase capable of copying viral RNA templates Novel β-Barrel Fold in the Nuclear Magnetic Resonance Structure of the Replicase Nonstructural Protein 1 from the Severe Acute Respiratory Syndrome Coronavirus Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra alpha-helical domain Mechanisms of coronavirus cell entry mediated by the viral spike protein In vitro reconstitution of SARS-coronavirus mRNA cap methylation RNA 3'-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/nsp14 exoribonuclease complex Important Role for the Transmembrane Domain of Severe Acute Respiratory Syndrome Coronavirus Spike Protein during Entry Coronavirus IBV: removal of spike glycopolypeptide S1 by urea abolishes infectivity and haemagglutination but not attachment to cells Multiple Nucleic Acid Binding Sites and Intrinsic Disorder of Severe Acute Respiratory Syndrome Coronavirus Nucleocapsid Protein: Implications for Ribonucleocapsid Protein Packaging Biochemical and
structural insights into the mechanisms of SARS coronavirus RNA ribose 2'-O-methylation by nsp16/nsp10 protein complex Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China Severe Acute Respiratory Syndrome Coronavirus Nonstructural Protein 2 Interacts with a Host Protein Complex Involved in Mitochondrial Biogenesis and Intracellular Signaling Coronavirus disease The cytoplasmic tails of infectious bronchitis virus E and M proteins mediate their interaction Coronavirus NSP6 restricts autophagosome expansion Full-genome deep sequencing and phylogenetic analysis of novel human betacoronavirus Coronavirus Nonstructural Protein 16 Is a Cap-0 Binding Enzyme Possessing (Nucleoside-2'O)-Methyltransferase Activity A Severe Acute Respiratory Syndrome Coronavirus That Lacks the E Gene Is Attenuated In Vitro and In Vivo Identification and functions of usefully disordered proteins Flexible nets. The roles of intrinsic disorder in protein interaction networks Function and structure of inherently disordered proteins The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world Biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3C-like proteinase Characterization of a Unique Group-Specific Protein (U122) of the Severe Acute Respiratory Syndrome Coronavirus Severe Acute Respiratory Syndrome Coronavirus Papain-Like Protease Ubiquitin-Like Domain and Catalytic Domain Regulate Antagonism of IRF3 and NF-κB Signaling Severe Acute Respiratory Syndrome Coronavirus ORF6 Antagonizes STAT1 Function by Sequestering Nuclear Import Factors on the Rough Endoplasmic Reticulum/Golgi Membrane The dark side of Alzheimer's disease: unstructured biology of proteins from the amyloid cascade signaling pathway The dark proteome of cancer: Intrinsic disorderedness and functionality of HIF-1α along with its interacting proteins
Intrinsically Disordered Side of the Zika Virus Proteome. Frontiers in Cellular and Infection Microbiology 2016, 6, 144. Prediction of Intrinsic Disorder in MERS-CoV/HCoV-EMC Supports a High Oral-Fecal Transmission Nidovirales: evolving the largest RNA virus genome Recombination, Reservoirs, and the Modular Spike: Mechanisms of Coronavirus Cross-Species Transmission The nsp2 Replicase Proteins of Murine Hepatitis Virus and Severe Acute Respiratory Syndrome Coronavirus Are Dispensable for Viral Replication A putative diacidic motif in the SARS-CoV ORF6 protein influences its subcellular localization and suppression of expression of cotransfected expression constructs Cooperative Involvement of the S1 and S2 Subunits of the Murine Coronavirus Spike Protein in Receptor Binding and Extended Host Range Mobility and Interactions of Coronavirus Nonstructural Protein 4 Severe Acute Respiratory Syndrome Coronavirus 7a Accessory Protein Is a Viral Structural Protein Subclassifying disordered proteins by the CH-CDF plot method Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein Identification of Novel Subgenomic RNAs and Noncanonical Transcription Initiation Signals of Severe Acute Respiratory Syndrome Coronavirus Major genetic marker of nidoviruses encodes a replicative endoribonuclease Identification of residues of SARS-CoV nsp1 that differentially affect inhibition of gene expression and antiviral signaling Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis Augmentation of chemokine production by severe acute respiratory syndrome coronavirus 3a/X1 and 7a/X4 proteins through NF-kappaB activation The human severe acute respiratory syndrome coronavirus (SARS-CoV) 8b protein is distinct from its counterpart in animal SARS-CoV and down-regulates the expression of the envelope protein in infected cells Over-expression of severe acute respiratory syndrome coronavirus 3b
protein induces both apoptosis and necrosis in Vero E6 cells Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors Severe Acute Respiratory Syndrome Coronavirus Open Reading Frame (ORF) 3b, ORF 6, and Nucleocapsid Proteins Function as Interferon Antagonists Protein of Severe Acute Respiratory Syndrome Coronavirus Inhibits Cellular Protein Synthesis and Activates p38 Mitogen-Activated Protein Kinase The 3a protein of severe acute respiratory syndrome-associated coronavirus induces apoptosis in Vero E6 cells Classification of intrinsically disordered regions and proteins Structure of SARS coronavirus spike receptorbinding domain complexed with receptor Structure of a conserved Golgi complextargeting signal in coronavirus envelope proteins Intrinsic disorder in transcription factors The membrane protein of severe acute respiratory syndrome coronavirus acts as a dominant immunogen revealed by a clustering region of novel functionally and structurally defined cytotoxic T-lymphocyte epitopes Severe Acute Respiratory Syndrome Coronavirus Protein nsp1 Is a Novel Eukaryotic Translation Inhibitor That Represses Multiple Steps of Translation Initiation Severe acute respiratory syndrome-associated coronavirus 3a protein forms an ion channel and modulates virus release Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex The molecular biology of coronaviruses The Cytoplasmic Tail of the Severe Acute Respiratory Syndrome Coronavirus Spike Protein Contains a Novel Endoplasmic Reticulum Retrieval Signal That Binds COPI and Promotes Interaction with Membrane Protein The role of severe acute respiratory syndrome (SARS)-coronavirus accessory proteins in virus pathogenesis The coronavirus nucleocapsid is a multifunctional protein The crystal structure of ORF-9b, a lipid binding protein from the SARS coronavirus IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding Severe Acute 
Respiratory Syndrome Coronavirus nsp9 Dimerization Is Essential for Efficient Viral Growth Discovery of an RNA virus 3'->5' exoribonuclease that is critically involved in coronavirus RNA synthesis Differential maturation and subcellular localization of severe acute respiratory syndrome coronavirus surface proteins S, M and E Nucleocapsid-independent specific viral RNA packaging via viral envelope protein and viral RNA signal SARS coronavirus accessory proteins Structure and intracellular targeting of the SARS-coronavirus Orf7a accessory protein Enhancement of Murine Coronavirus Replication by Severe Acute Respiratory Syndrome Coronavirus Protein 6 Requires the N-Terminal Hydrophobic Region but Not C-Terminal Sorting Motifs A structural analysis of M protein in coronavirus assembly and morphology Intrinsically Disordered Proteins and Intrinsically Disordered Protein Regions The 29-Nucleotide Deletion Present in Human but Not in Animal Severe Acute Respiratory Syndrome Coronaviruses Disrupts the Functional Expression of Open Reading Frame 8 Localization and Membrane Topology of Coronavirus Nonstructural Protein 4: Involvement of the Early Secretory Pathway in Replication Length-dependent prediction of protein intrinsic disorder Optimizing long intrinsic disorder predictors with protein evolutionary information Variable oligomerization modes in coronavirus non-structural protein 9 A majority of the cancer/testis antigens are intrinsically disordered proteins Severe acute respiratory syndrome coronavirus papain-like protease: structure of a viral deubiquitinating enzyme Crystal structure and mechanistic determinants of SARS coronavirus nonstructural protein 15 define an endoribonuclease family Deciphering key features in protein structures with the new ENDscript server Sequence complexity of disordered protein The coronavirus E protein: assembly and beyond Ribonucleocapsid formation of severe acute respiratory syndrome coronavirus through molecular action of the 
N-terminal domain of N protein Two-amino acids change in the nsp4 of SARS coronavirus abolishes viral replication A Contemporary View of Coronavirus Transcription The ORF7b Protein of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) Is Expressed in Virus-Infected Cells and Incorporated into SARS-CoV Particles SARS-CoV 9b protein diffuses into nucleus, undergoes active Crm1 mediated nucleocytoplasmic export and triggers apoptosis when retained in the nucleus Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega Deciphering the dark proteome of Chikungunya virus The M, E, and N Structural Proteins of the Severe Acute Respiratory Syndrome Coronavirus Are Required for Efficient Assembly, Trafficking, and Release of Virus-Like Particles Ultrastructure and Origin of Membrane Vesicles Associated with the Severe Acute Respiratory Syndrome Coronavirus Replication Complex Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2 A Novel Mutation in Murine Hepatitis Virus nsp5, the Viral 3C-Like Proteinase, Causes Temperature-Sensitive Defects in Viral Growth and Protein Processing The Severe Acute Respiratory Syndrome (SARS)-coronavirus 3a protein may function as a modulator of the trafficking properties of the spike protein A Novel Severe Acute Respiratory Syndrome Coronavirus Protein, U274, Is Transported to the Cell Surface and Undergoes Endocytosis Induction of Apoptosis by the Severe Acute Respiratory Syndrome Coronavirus 7a Protein Is Dependent on Its Interaction with the Bcl-XL Protein The SARS coronavirus E protein interacts with PALS1 and alters tight junction formation and epithelial morphogenesis Mechanisms and enzymes involved in SARS coronavirus genome expression Ligand-induced Dimerization of Middle East Respiratory Syndrome (MERS) Coronavirus nsp5 Protease (3CLpro): IMPLICATIONS FOR nsp5 REGULATION AND THE DEVELOPMENT OF ANTIVIRALS The transmembrane oligomers of 
coronavirus protein E Identifying SARS-CoV membrane protein amino acid residues linked to virus-like particle assembly Selfassembly of severe acute respiratory syndrome coronavirus membrane protein Incorporation of spike and membrane glycoproteins into coronavirus virions Why are 'natively unfolded' proteins unstructured under physiologic conditions? Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling SARS-CoV accessory protein 3b induces AP-1 transcriptional activity through activation of JNK and ERK pathways The SARS-coronavirus nsp7+nsp8 complex is a unique multimeric RNA polymerase capable of both de novo initiation and primer extension Studies on membrane topology, N-glycosylation and functionality of SARS-CoV membrane protein Prediction and Functional Analysis of Native Disorder in Proteins from the Three Kingdoms of Life Discovery of Seven Novel Mammalian and Avian Coronaviruses in the Genus Deltacoronavirus Supports Bat Coronaviruses as the Gene Source of Alphacoronavirus and Betacoronavirus and Avian Coronaviruses as the Gene Source of Gammacoronavirus and Deltacoronavi Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Intrinsically unstructured proteins: Re-assessing the protein structure-function paradigm Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China Severe acute respiratory syndrome coronavirus accessory protein 9b is a virion-associated protein PONDR-FIT: a meta-predictor of intrinsically disordered amino acids Viral Disorder or Disordered Viruses: Do Viral Proteins Possess Unique Features? 
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study Identification of a novel protein 3a from severe acute respiratory syndrome coronavirus Crystal structure of the severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein dimerization domain reveals evolutionary linkage between corona- and arteriviridae Subcellular localization and membrane association of SARS-CoV 3a protein Mitochondrial location of severe acute respiratory syndrome coronavirus 3b protein G0/G1 arrest and apoptosis induced by SARS-CoV 3b protein in transfected cells Nucleolar localization of non-structural protein 3b, a protein specifically encoded by the severe acute respiratory syndrome coronavirus Insights into SARS-CoV transcription and replication from the structure of the nsp7-nsp8 hexadecamer The N-Terminal Region of Severe Acute Respiratory Syndrome Coronavirus Protein 6 Induces Membrane Rearrangement and Enhances Virus Replication

Acknowledgments: All the authors would like to thank IIT Mandi for providing facilities. MS and BRG were supported by MHRD funding. KG was supported by the Department of Biotechnology (DBT), India (BT/PR16871/NER/95/329/2015). PK was supported by the IIT Mandi-IIT Ropar-PGI Chandigarh BioX consortium grant (IITM/INT/RG/18). TB is grateful to the Department of Science and Technology for funding through her INSPIRE fellowship. All authors declare that there is no financial competing interest.