key: cord-0800482-xlhi3dg7 authors: Goh, Gerard Kian-Meng; Dunker, A. Keith; Foster, James A.; Uversky, Vladimir N. title: A Novel Strategy for the Development of Vaccines for SARS-CoV-2 (COVID-19) and Other Viruses Using AI and Viral Shell Disorder date: 2020-10-02 journal: J Proteome Res DOI: 10.1021/acs.jproteome.0c00672 sha: 016dbbd7f884bc5ee9fc0f3190e90aaa2341dc1b doc_id: 800482 cord_uid: xlhi3dg7 [Image: see text] A model that predicts levels of coronavirus (CoV) respiratory and fecal–oral transmission potentials based on the shell disorder has been built using neural network (artificial intelligence, AI) analysis of the percentage of disorder (PID) in the nucleocapsid, N, and membrane, M, proteins of the inner and outer viral shells, respectively. Using primarily the PID of N, SARS-CoV-2 is grouped as having intermediate levels of both respiratory and fecal–oral transmission potentials. Related studies, using similar methodologies, have found strong positive correlations between virulence and inner shell disorder among numerous viruses, including Nipah, Ebola, and Dengue viruses. There is some evidence that this is also true for SARS-CoV-2 and SARS-CoV, which have N PIDs of 48% and 50%, and case-fatality rates of 0.5–5% and 10.9%, respectively. The underlying relationship between virulence and respiratory potentials has to do with the viral loads of vital organs and body fluids, respectively. Viruses can spread by respiratory means only if the viral loads in saliva and mucus exceed certain minima. Similarly, a patient is likelier to die when the viral load overwhelms vital organs. Greater disorder in inner shell proteins has been known to play important roles in the rapid replication of viruses by enhancing the efficiency pertaining to protein–protein/DNA/RNA/lipid bindings. 
This paper suggests a novel strategy for attenuating viruses that involves comparing the disorder patterns of the inner shells (N) of related viruses to identify residues and regions that could be ideal for mutation. The M protein of SARS-CoV-2 has one of the lowest M PID values (6%) in its family, and therefore, this virus has one of the hardest outer shells, which makes it resistant to antimicrobial enzymes in body fluid. While this is likely responsible for its greater contagiousness, the risks of creating an attenuated virus with a more disordered M are discussed.

The coronavirus infectious disease 2019 (COVID-19) is caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). 1 The first sign of the SARS-CoV-2 spread was reported in December 2019 at a Wuhan seafood market in China that also sold live animals. 2−5 Since then, COVID-19 has become an extremely serious pandemic with confirmed global infections and deaths moving quickly toward 30 million and 1 million, respectively. SARS-CoV-2 has been found to be closely related to the SARS-CoV that was responsible for the 2002−2003 outbreak. While the precise case-fatality rate (CFR) for COVID-19 remains controversial, SARS-CoV-2 (CFR: 0.5−5%) is generally believed to be less virulent than the 2003 SARS-CoV (CFR: 10.5%), even if it is more contagious. 1

A model was built before the MERS-CoV outbreak in 2012. This model measured the percentage of intrinsic disorder (PID) of proteins from the viral inner and outer shells, the nucleocapsid, N, and membrane protein, M, respectively. 6, 7 Upon tabulation of the PIDs, the CoVs were clustered into three groups (Table 1). The first group, group A, consists of coronaviruses that have lower fecal−oral but higher respiratory transmission potentials. Group B consists of CoVs that have intermediate levels of fecal−oral and respiratory potentials. Lastly, group C includes viruses that have higher fecal−oral but lower respiratory transmission potentials. 
The clustering is based mainly on the PID values of the corresponding N proteins, even though later statistical analysis detected that the PID values of M proteins also contribute slightly to the categorization. The model indicated that SARS-CoV (PID M = 8%, PID N = 50%) falls into category B, which is indicative of intermediate levels of both fecal−oral and respiratory transmission. Category B consists of CoVs that have PID N that is between 54% and 46%, whereas categories A and C are made up of viruses with PID N above 53% and below 46%, respectively. In addition, a statistically significant ANOVA result implies that the viruses are clustered into identifiable groups. Chances for further validation of the model came with the 2012 MERS-CoV and the 2019 COVID (SARS-CoV-2) outbreaks. Proteomic and genetic analyses of the viruses indicate that MERS-CoV (PID M = 9%, PID N = 44%) and SARS-CoV-2 (PID M = 6%, PID N = 48%) belong to categories C and B, respectively, according to the model already established. 2, 3, 7, 8 These results are consistent with clinical and other observations. MERS-CoV originated from a zoonotic transmission event of a camel coronavirus. Since fecal−oral transmission is generally the most efficient mode of transmission among farm animals, including camels, this could evolutionarily account for the high fecal−oral transmission potential of MERS-CoV as predicted by the model. 9 This is also supported by the fact that MERS-CoV is not easily transmitted unless close contact is involved. SARS-CoV-2, on the other hand, falls into the same category B as its closely related cousin, SARS-CoV, with intermediate levels of both respiratory and fecal−oral transmission potentials. 
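The grouping rule described above can be illustrated with a minimal Python sketch. It is only an illustration and not the authors' implementation: the function name `classify_by_pid_n` is hypothetical, the cutoffs are the approximate 46% and 54% bounds quoted in the text, and the treatment of values lying exactly on a boundary is an assumption of this sketch. The slight contribution of PID M to the categorization is ignored here.

```python
# Approximate category B bounds quoted in the text (percent disorder of N).
# Treating both bounds as inclusive for category B is an assumption.
LOWER_B = 46.0
UPPER_B = 54.0


def classify_by_pid_n(pid_n: float) -> str:
    """Return 'A', 'B', or 'C' from the PID of the N protein alone.

    A: higher respiratory, lower fecal-oral transmission potential.
    B: intermediate levels of both.
    C: lower respiratory, higher fecal-oral transmission potential.
    """
    if pid_n > UPPER_B:
        return "A"
    if pid_n < LOWER_B:
        return "C"
    return "B"


# PID N values reported in the text.
reported = {"SARS-CoV": 50.0, "SARS-CoV-2": 48.0, "MERS-CoV": 44.0}
for virus, pid_n in reported.items():
    print(f"{virus}: PID N = {pid_n:.0f}% -> category {classify_by_pid_n(pid_n)}")
```

With the PID N values reported in the text, the sketch places SARS-CoV and SARS-CoV-2 in category B and MERS-CoV in category C, matching the assignments stated above.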
Reason for Higher SARS-CoV-2 Infectivity: Harder Outer Shell, Greater Resilience, and, Thus, Greater Contagiousness. The model detects, however, something else odd about SARS-CoV-2 and shows that its outer shell is among the hardest (i.e., low PID M = 6%) within the CoV family. 2, 3, 11 As the outer shell plays the greatest role in protecting the virion, it is likely that SARS-CoV-2 is more resistant to the antimicrobial enzymes found on the tongue and skin or in saliva, mucus, and other body fluids. 12, 13 As a result, the body is capable of shedding more viral particles. This might account for the fact that SARS-CoV-2 is more contagious than SARS-CoV. Further evidence that the M protein protects the virion from damage lies in the fact that M is the most abundant protein found in CoVs and that it spans the membrane so as to provide greater rigidity to the virion. 14

The CoV shell model is a spinoff from a parent study that involved the investigation of viral shell disorder and the unavailability of vaccines for viruses such as HIV. 15, 16 The design of the model was first initiated as an application of disorder computational tools to existing veterinary knowledge of the transmission behaviors of CoVs of farm animals, particularly porcine CoVs. 6 It must be noted that much of the wealth of knowledge pertaining to CoVs arose from experiences with CoVs of farm animals, which have devastated the farming industries during epidemics, while CoVs related to humans were previously not of medical interest, as they were usually only associated with mild colds before the 2003 SARS-CoV outbreak. As pigs are placed closely together in farms, Enigmatically, in other viruses, such as Ebolavirus (EBOV), Nipah virus (NiV), Dengue (DENV), and flaviviruses, analyzed by the shell disorder model as part of yet another spinoff from the parent project, the inner shell disorder was also found to be correlated with the level of virulence. 
7,10,17−19 The inner shell proteins are therefore good vaccine targets. Our previous papers have briefly examined the relationship between the virulence and respiratory transmission potentials via analysis of the inner shell disorder and protein binding promiscuity. 3, 11 This paper will examine in greater detail the strategy of using nucleocapsid (N) as a vaccine target not just for SARS-CoV-2, but also for a variety of related and unrelated viruses.

Predictors of Protein Intrinsic Disorder and Other Sequence Analysis Tools. A major concept used in the study is the phenomenon of protein intrinsic disorder, which refers to proteins or protein regions that have no unique 3D structure. While protein structures are linked to protein functions via a classic protein structure−function paradigm (where a unique protein sequence encodes a unique 3D structure that is responsible for a unique biological function), protein disorder, likewise, has been linked to a myriad of protein functions. 20−24 Tools have been developed to predict disorder, since disordered and ordered proteins are characterized by specific and therefore predictable features of their amino acid sequences. The first of these disorder predictors is PONDR VLXT (www.pondr.com), which is a neural network (artificial intelligence: AI) trained to recognize disordered and ordered sequences. 25−27 PONDR VLXT has been highly successful when used to study viral proteins, especially viral shell proteins, since many of these proteins are structural proteins held together by protein−protein or protein−RNA/DNA interactions, and PONDR VLXT is known to be one of the best predictors that can take into account these factors. 
28−30 It is for these reasons that PONDR VLXT has been successfully used in the study of a large variety of viruses including human immunodeficiency virus (HIV), herpes simplex virus (HSV), hepatitis C virus (HCV), NiV, EBOV, 1918 H1N1 influenza A virus, CoVs, and flaviviruses including yellow fever (YFV), Zika (ZIKV), and DENV. 1−3,6−8, 10,11,15−19,31−34 An important number that is used as a yardstick to measure the level of disorder in a protein is the percentage of intrinsic disorder (PID), which is defined as the number of residues predicted to be disordered divided by the total number of residues in a protein. The sequences are available at UniProt (http://www.uniprot.org) and NCBI (https://www.ncbi.nlm.nih.gov/nuccore/MN908947). A relational database was used to store disorder and sequence information. 31 Statistical calculations using multivariate analysis were done using the R statistical package. 35 The Basic Local Alignment Search Tool for proteins (BLASTP) is available at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins).

Revisiting the CoV Shell Disorder Model. As mentioned previously, the CoV shell disorder model is based on the disorder levels of two major shell proteins: M (membrane, outer shell) and N (nucleocapsid, inner shell). 7, 14 Table 1 shows the grouping of the three categories of coronaviruses. Figure 1 tells us that SARS-CoV-2 is odd, as its outer shell (M) is among the hardest in the family given its low PID M value, which is likely indicative of greater resistance of this virus to antimicrobial enzymes found in the saliva, mucus, tongue, and skin. 2,3,11−13 This characteristic is believed to cause greater shedding of viral particles, which is responsible for its greater contagiousness. There is, however, at least one other "competing" theory that could also account for the greater contagiousness of SARS-CoV-2. 
This involves research showing that the glycoprotein spike (S) is highly adapted to the human ACE2 (angiotensin converting enzyme-2) receptor, binding to it 20−30 times more tightly than in the case of the 2003 SARS-CoV. 36 While the finding in this research does not in any way contradict the results as seen in Table 1 and Figure 1, the question of the true cause of its greater contagiousness remains. With a high probability, both of these factors contribute to the higher transmission potential of SARS-CoV-2.

Incoming COVID-19 Data Supporting the CoV Shell Model. Incoming clinical data are increasingly providing compelling evidence that SARS-CoV-2 sheds large quantities of infectious particles. Heavy viral shedding has been detected on the first day when the patients showed the slightest symptoms 37 or even no symptoms. 38 The heavy shedding lasted until the days of recovery. This shedding was observed to be much greater among COVID-19 patients than among those infected with SARS-CoV. 37 Large amounts of virions are not confined to the respiratory tract, where they spread in the form of respiratory droplets, but can also be found in fecal matter. The infectious particles found in feces were observed to be active for a prolonged period of time. 39 Moreover, CoVs with such hard shells are found to be associated with burrowing animals, such as rabbits and pangolins, that are often in contact with buried feces. 11 All these are consistent with the CoV shell model and its predictions, including the resilience of viral particles outside the physiological environment, as the virus is found to have among the hardest outer shells among CoVs, while it also has intermediate levels of both respiratory and fecal−oral potentials. Figure 1 tells us that the neighboring viruses with similar PID N values are bat coronaviruses. 
Not only does this corroborate phylogenetic studies showing that one of the bat CoVs (RaTG13) has as much as 96% similarity to SARS-CoV-2, 4,5 but it also highlights the evolutionary connection between this virus and bat CoV. One interpretation would be that a PID N in the range of 47−48% defines the optimal respiratory transmission potential for spreading among bats. If this is the case, then the N protein from SARS-CoV-2 did not evolve much, if at all, and is not too different from its ancestral counterpart in bat CoV, even if there was an intermediary host before the appearance of a CoV capable of infecting humans. (Since the writing of this paper, an added twist to this analysis has, however, been found upon inspection of pangolin-CoV data as seen in a recently published paper. 11 This is in sharp contrast to its close cousin, SARS-CoV, which has a PID N of 50%, which could imply that the N protein from SARS-CoV probably had time to evolve in an intermediary host, such as the civet cat, since the N PID has to evolve from 47% to 48% to 50%, and a greater number of mutations is required for this to happen.) A different story can be found while looking at the PID M values in different viruses. This is not contradictory, as different proteins affect organisms differently. For instance, a virus may need more immediate protection by evolving a harder outer shell (more rigid M protein) if it moves to a new host species that has stronger antimicrobial enzymes in its saliva. Figure 1 shows that very few other CoVs have PID M values as low as that of SARS-CoV-2, and unlike disorder levels in N proteins, none of the SARS-CoV-2 bat cousins are close to such a low PID M. The closest counterparts found in this sample include canine respiratory CoV and bovine CoV. This is likely a reflection of the possibility that the SARS-CoV-2 M protein had evolved in an intermediary host. 
Indeed, there has been much debate with regard to the possibility of a snake or pangolin serving as an intermediary host for SARS-CoV-2. 40, 41 Furthermore, SARS-CoV-2 is now known to infect domesticated animals, such as cats and dogs. 42 Our model seems to imply that there is probably a non-bat intermediary host in which the virus had just enough time for its M protein to evolve, but its N protein probably needed a longer time to evolve outside the original bat host, and our data show that the N protein did not have a chance to evolve in an animal host other than bats.

Links among Modes of Transmission and Virulence with Inner Shell (N) Disorder. Correlations between inner shell disorder and virulence have been previously discovered among a large variety of viruses, such as NiV, EBOV, and flaviviruses, such as Zika (ZIKV), Dengue (DENV), and yellow fever (YFV) viruses. Figure 2 represents the links between case-fatality rates (CFR) and inner shell disorder levels among a variety of viruses. While Figure 2A shows some evidence of a link between SARS-CoV/SARS-CoV-2 virulence and PID N, Figure 2B, 18 The nucleocapsid, nucleoprotein, and capsid of NiV, EBOV, and DENV are denoted by N, NPm, and C, respectively. 14 As also seen in Figure 2, the high correlations (R), as seen in the coefficients of determination (R²) of greater than 0.25 for the various viruses, reveal strong correlations between virulence and inner shell disorder. Further details pertaining to the correlations between virulence and inner shell disorder among the mentioned viruses can be found in previous papers. 10 Figure 2A shows that there is a link between SARS CFR and PID N. While the actual CFR for SARS-CoV-2 remains controversial, a reasonable and commonly measured CFR for SARS-CoV-2 is in the range of 0.5−5%. SARS-CoV-2 is generally believed to be less severe than SARS-CoV, which has a CFR of about 10%. 
49−52 However, it should be noted that establishing a link between virulence and PID N among CoVs is an intricate matter, since CoVs are genetically diverse and have 4 genera. Furthermore, CoVs infect a large variety of animal hosts with different virulence and often use different host receptors. For instance, both SARS-CoV and SARS-CoV-2 enter the host cell using the ACE2 receptor, while MERS-CoV uses dipeptidyl peptidase-4 (DPP4). 7,26 Even more confusingly, MERS-CoV has a 35% CFR for humans, but is generally harmless to its predominant camel host. 9 Incidentally, SARS-CoV and SARS-CoV-2 lie within the same lineage of β-CoVs and share close to 80% genomic sequence identity, whereas MERS-CoV has only 50% sequence similarity to SARS-CoV-2. 53 For these reasons, MERS-CoV cannot be included in the correlation. HCOVs that are distantly related to SARS-CoV and SARS-CoV-2 should also not be included. For instance, HCOV-NL63 should not be included even though it also enters the human host using ACE2, but in a different manner from SARS-CoV/SARS-CoV-2. 54,55 HCOV-NL63 and HCOV-229E are α-CoVs, 56 while SARS-CoV, SARS-CoV-2, and MERS-CoV are in the β-CoV genus. 14, 41 Because of the distant relationship between SARS-CoV/SARS-CoV-2 and HCOV-229E/NL63, their sequence similarity is much lower than 50%, similar to what we have seen in the case of MERS-CoV and SARS-CoV-2. In fact, a sequence similarity study of the S proteins of SARS-CoV-2 and HCOV-NL63 has shown the percentage of identity to be as low as 14%. 53 Given these, it is only reasonable that HCOV-229E/NL63 should also not be included in this analysis.

Links between Modes of Transmission and N Disorder: NiV and CoV. Upon examination of Table 1 and Figure 2E, one can see that there is also a link between the transmission modes and virulence and the inner shell disorder. This link has also been found in NiV. 
While Figure 2B shows a correlation between NiV PID N and virulence, the NiV 1998−1999 strain did not involve human-to-human transmission, unlike the other strains. Transmission during the Malaysian 1998−1999 outbreak involved mainly farmers, who were infected during handling of pigs and their fecal matter, whereas human-to-human spread involving respiratory transmission did commonly occur in other strains. 18 Because the 1998−1999 strain has the lowest CFR and PID N with the lowest observed respiratory transmission, a positive correlation between PID N and level of viral respiratory transmission potential has been detected. Likewise, a positive correlation between N disorder and respiratory transmission potential has already been seen in Table 1.

Links among Virulence, Respiratory Transmission Potential, and N Disorder: Viral Loads in the Body and in Body Fluids. The previously mentioned link among virulence, modes of transmission, and disorder of N protein may seem like a perplexing enigma, but it is not. When we built the CoV shell model before 2012, the viruses were clustered into levels of respiratory potentials in line with the ordering of PID N. The reason has to do with the fact that the viral loads found in the saliva and mucus of the infected person have to be above certain minimal levels before the virus can become infectious via the respiratory modes of transmission. 2, 3, 11 Similarly, in the case of virulence, death often occurs when the viral load in the human body crosses a certain threshold. 7,10,15−19 Inner shell disorder connects these two phenomena, as the inner shell is intimately involved in the viral replication process.

The "Trojan Horse" Immune Evasion: Replicating Quickly via N Disorder before the Immune System Detects Its Presence. Similar proteins often share similar functions even in unrelated viruses. 
14 This is especially the case for the nucleocapsid proteins, which are known to play important and similar roles in the replication of viral particles in the host cell. 7,14,57−59 During these processes, they are known to bind to proteins that are part of the host replicating machinery, and by being more disordered, they show greater binding efficiency, as the disorder is known to be associated with more promiscuous protein−protein interactions. 2, 24, 60 The ability to replicate quickly before the onset of immune response is part of a "Trojan horse" immune evasion strategy. 7, 16 Often, though, this strategy backfires on the virus, as the high viral load overwhelms the host organs and thus leads to death of the host. This is the reason that correlations of virulence and inner shell disorder are easily found in a large variety of related and unrelated viruses. The ability to rapidly replicate utilizing the advantages of the inner shell disorder is also manifested as greater respiratory potential, as this will also allow for greater viral load in body fluids, such as saliva and mucus, not just in the blood and internal organs.

Journal of Proteome Research pubs.acs.org/jpr Reviews

Functional Similarity of Inner Shells Across Viruses and the Advantages of More Promiscuous Binding Arising from Greater Levels of Disorder. It has long been known that greater disorder levels in proteins are associated with the greater ability of proteins to be promiscuous binders, i.e., to interact with a greater variety of partners, including lipids, RNA/DNA, and proteins, with higher efficiency. 20−24, 60 This trait is likely to be of advantage when a viral protein needs to bind to host proteins with greater effectiveness, as a part of the process of the hijacking of the host cell machinery. The inner shell proteins play crucial roles in the replication of infectious viral particles and their disorder helps with the rapid replication. 
There are plenty of opportunities for protein intrinsic disorder to play vital roles when binding between the viral and host proteins is considered. For instance, the CoV N protein transports viral proteins and RNA to areas near the endoplasmic reticulum (ER) and Golgi apparatus, where it helps in the packaging of viral particles. 3, 14, 59 Likewise, the C protein precursor from a flavivirus would egress and then bind to the membrane of the ER, where it interacts with other viral proteins and RNA for assembly and budding. 7,10,14,17 As for EBOV, its NP is responsible for the building of tube-like structures that facilitate the transportation and budding of viral particles. 19, 57 Lastly, the greater disorder in the NiV N protein provides important means for the more efficient binding of this protein to P and L proteins to form a complex, which becomes an RNA polymerase responsible for the viral RNA replication. 18, 59 We can now see that the inner shell proteins of the various viruses often play similar roles that provide ample opportunities for protein intrinsic disorder to contribute to more efficient interactions with the host and viral proteins, DNA/RNA, and lipids. Needless to say, the greater efficiency of these interactions leads to quicker replication of viral particles.

Attenuating the Virus Such That It Does Not Replicate Rapidly by Increasing Order in N. We have seen that inner shell disorder is likely to be responsible for fast replication of many viruses. Since this virulence arises from the increased viral load caused by greater disorder in the inner shell protein, a strategy for developing an attenuated vaccine would be to find ways of reducing disorder of N protein in the case of SARS-CoV-2. The vaccine strain would therefore be more optimally attenuated if the virus is still able to replicate but only very slowly so as to maintain the lowest possible viral load. 
Figure 3 shows the RNA binding domain of the N proteins of SARS-CoV-2 and murine hepatitis virus (MHV) with the disordered regions represented in red. It can be observed that the N protein from SARS-CoV-2 (Figure 3A) contains noticeably more disordered regions than the N protein from MHV (Figure 3B), even though the MHV PID N (46%) is just moderately lower than the SARS-CoV-2 PID N (48%). Since Figure 3 represents the structures of the RNA binding domain, it is likely that the greater disorder differences lie in this region. The reason that MHV is chosen in Figure 3B is that its data are easily accessible in PDB. There are, of course, CoVs with lower PID N values as seen in Table 1. One illustrative example is shown in Figure 4 representing data for the close cousin of MHV, HCOV-HKU1, which has the lowest PID N among the CoV samples shown in Table 1.

Choosing Regions and Residues to Mutate Using Comparative Disorder Pattern. We have seen that N disorder is likely a large factor in virulence, as it might contribute to greater viral load in the body. The next question is then: How do we mutate SARS-CoV-2 such that it becomes optimally attenuated without it dying out from being totally unable to replicate? A strategy would be to compare the N sequences and disorder patterns of a CoV with a lower PID N to those of SARS-CoV-2. Figure 4A represents the PONDR VLXT plots for SARS-CoV-2 (blue), SARS-CoV (red), and HCOV-HKU1 (dashed green). Disordered regions are denoted by PONDR VLXT scores of 0.5 or above. Keeping in mind that the PID N of SARS-CoV-2, SARS-CoV, and HCOV-HKU1 are 48%, 50%, and 37%, respectively, we can see plenty of regions that are disordered in SARS-CoV-2 but not in HCOV-HKU1. These regions make good potential targets in the development of vaccine candidates. 
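The comparison of disorder patterns described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' pipeline: it assumes that per-residue disorder scores have already been obtained from a predictor and placed on a common, equal-length alignment, and it uses the 0.5 cutoff stated in the text. The function names and the toy score lists are hypothetical and do not represent real PONDR VLXT output.

```python
from typing import List, Sequence, Tuple

DISORDER_CUTOFF = 0.5  # from the text: scores of 0.5 or above denote disorder


def percent_disorder(scores: Sequence[float]) -> float:
    """PID: disordered residues divided by total residues, as a percentage."""
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s >= DISORDER_CUTOFF)
    return 100.0 * disordered / len(scores)


def regions_disordered_only_in_first(
    first: Sequence[float], second: Sequence[float], min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where `first` is predicted
    disordered and `second` is predicted ordered.

    Assumes both score lists refer to the same aligned positions.
    """
    if len(first) != len(second):
        raise ValueError("score lists must be aligned to equal length")
    regions: List[Tuple[int, int]] = []
    start = None
    for pos, (a, b) in enumerate(zip(first, second), start=1):
        hit = a >= DISORDER_CUTOFF and b < DISORDER_CUTOFF
        if hit and start is None:
            start = pos
        elif not hit and start is not None:
            if pos - start >= min_len:
                regions.append((start, pos - 1))
            start = None
    if start is not None and len(first) - start + 1 >= min_len:
        regions.append((start, len(first)))
    return regions


# Toy scores for illustration only (not real predictor output).
more_disordered = [0.2, 0.6, 0.7, 0.8, 0.4, 0.9, 0.9, 0.1]
less_disordered = [0.1, 0.3, 0.6, 0.2, 0.3, 0.2, 0.4, 0.1]

print(percent_disorder(more_disordered))
print(regions_disordered_only_in_first(more_disordered, less_disordered, min_len=2))
```

The `min_len` parameter is a design choice of this sketch: raising it filters out isolated single-residue differences so that only longer stretches are reported as candidate regions.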
It should be restated that HCOV-HKU1 was chosen because it has the lowest PID N, but there could also be other good N proteins with low PID that can be used for comparison with the SARS-CoV-2 N disorder pattern. Any N protein from category C, as seen in Table 1, could be a potential comparative candidate to be used for the attenuation of SARS-CoV-2, but it is obviously wiser to choose the ones with a lower PID N so as to ensure the design of a potential vaccine strain that is sufficiently attenuated and to prevent, as much as we can, the vaccine strain from mutating back to its virulent self, as we have seen in the case of the Sabin polio vaccine. 7, 11, 14

Sequence Analysis. Upon study of Figure 4A, one region that sticks out is a disorder peak around locations 74−99, which is labeled as "X" in the graph. Apparently, this is a region where the largest differences in disorder propensity lie, not just between SARS-CoV-2 and HCOV-HKU1, but also between SARS-CoV-2 and SARS-CoV. This area falls within the RNA binding domain, approximately at locations 1−140. Figure 4B shows a BLASTP (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) alignment of part of the N proteins of SARS-CoV and SARS-CoV-2. A large gap in disorder differences can be found around locations 17−27 (Figure 4A,B) that is likely to be responsible for the slight difference (48% vs 50%) in N PIDs between SARS-CoV-2 and SARS-CoV. The source of the differences can be found in a specific mutation within the same region. This involves a single amino acid mutation of S (Serine) to T (Threonine) (Figure 4B) that can be seen when we compare SARS-CoV-2 to SARS-CoV. Serine (S) is more polar than threonine (T) and is therefore more disorder inducing. 25−27 We can also see, in Figure 4C, a large number of amino acid replacements from polar residues to nonpolar residues when we compare the sequences of the SARS-CoV-2 and HCOV-HKU1 N proteins, respectively. 
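The kind of polar-to-nonpolar comparison described above can be sketched in Python for a pair of aligned sequences. This is a minimal illustration under stated assumptions: the polar and nonpolar residue groupings below are a common textbook simplification (the placement of residues such as cysteine, tyrosine, and glycine varies between classification schemes), the function name is hypothetical, and the two short sequences are invented toy strings, not the actual N protein alignment of Figure 4C.

```python
from typing import List, Tuple

# Simplified residue classes (an assumption of this sketch).
POLAR = set("STNQYCHKRDE")
NONPOLAR = set("AVLIMFWPG")


def polar_to_nonpolar(reference: str, other: str) -> List[Tuple[int, str, str]]:
    """List alignment columns where `reference` has a polar residue and
    `other` has a nonpolar one.

    Both strings must come from the same pairwise alignment, with '-' for
    gaps. Positions are 1-based alignment columns, not residue numbers.
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    changes: List[Tuple[int, str, str]] = []
    for col, (r, o) in enumerate(zip(reference.upper(), other.upper()), start=1):
        if r in POLAR and o in NONPOLAR:
            changes.append((col, r, o))
    return changes


# Invented toy alignment for illustration only.
toy_reference = "SQTK-A"
toy_other = "AQVLGA"

for col, ref_res, other_res in polar_to_nonpolar(toy_reference, toy_other):
    print(f"column {col}: {ref_res} -> {other_res}")
```

Columns containing a gap are skipped automatically because '-' belongs to neither residue class.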
Disorder- and order-inducing residues are generally polar and nonpolar, respectively. 25−27 All these provide a wide range of potential mutations available for use in order to induce greater order in the SARS-CoV-2 N protein while attempting to attenuate the virus.

Risks of Genetically Manipulating M. We have seen how a more rigid M causes greater contagiousness by allowing the virus to be more resistant to antimicrobial enzymes found in body fluid and also to be more resilient outside the body. It is therefore tempting to contemplate manipulating M in the attenuated virus so as to decrease the chances of the vaccine virus, or a mutated form of it, spreading. Before we ponder this possibility, we need to have a better understanding of the risks of increasing M disorder based on what is known about the pathogenesis of viruses with disordered outer shells.

Increasing M PID May Place Greater Risks of Morbidity, Such as Fetal Morbidity and Reduced Effectiveness of Vaccine. Figure 5 summarizes the dangers and implications of viruses with higher outer shell disorder. While we have seen that greater inner shell disorder causes virulence, viruses with higher levels of outer shell disorder, on the other hand, can cause higher morbidity. For example, ZIKV can inflict higher morbidity on the fetus by causing microcephaly, i.e., small head and brain. Figure 5A shows that ZIKV has a higher outer shell disorder (PID M = 29%), 17 which is exceptionally high, especially among its flavivirus cousins. This is in sharp contrast to the DENV PID M of 10%, and although DENV2 causes microcephaly, it does so at a much lower rate. 17 As for the SARS-CoV-2 and SARS-CoV, while more data are needed, it is currently observed that the viruses generally harm the fetus of a pregnant woman. 61 The main reason for the link between outer shell disorder and morbidity has to do with the greater ability of viruses with higher outer shell disorder to penetrate organs such as the placenta and brain. 
7, 15−19, 34 This is likely the result of greater binding promiscuity resulting from higher disorder. Yet another risk of increasing levels of disorder in M protein in the attempt to attenuate a virus is that it could backfire and actually thwart the effectiveness of the vaccine by reducing its ability to elicit an immune response. It has been known that some viruses, including HIV, HSV, and HCV, evade the host immune system by having highly disordered outer shells, thereby decreasing the ability of antibodies to bind firmly to the viral surface glycoproteins. 7, 15, 16, 34 Further discussion on this immune-evasion mechanism is presented in the next paragraph.

Feasibility of SARS-CoV-2 Vaccine Development: Encouraging News. The search for effective vaccines for HIV, HCV, and HSV has taken approximately 40, 30, and 100 years, respectively, with failures. The possibility of such failures would definitely be a nightmarish scenario for COVID-19. As mentioned, the research in this paper is actually a spin-off from parent research that began about 15 years ago. 6, 7, 15, 16, 34 The parent research has found that the reason for the difficulties in finding effective vaccines for these viruses has to do with the way that they are transmitted, and their relationships with sexual transmission that leads to the highly disordered outer shells. As a result of the motions of the outer shell proteins, neutralizing antibodies are not able to bind tightly to the surface proteins. Figure 5B provides encouraging news, as the outer shell PIDs of HSV, HCV, and HIV look nothing like those of SARS-CoV, SARS-CoV-2 (Figure 5A), and all CoV samples that we have examined in Table 1. The HIV-1, HCV, and HSV maximal outer shell PIDs reach 70%, 53%, and 63%, 7,16 respectively, in contrast to SARS-CoV and SARS-CoV-2 PID M values of 8% and 6%. 
As seen, the CoVs have relatively hard outer shells (M), while highly disordered outer shells have been detected for HSV, HCV, and HIV. CoVs need the harder outer shell to survive longer in the environment, given that they are transmitted via respiratory and fecal−oral modes, whereas HIV, HCV, and HSV do not need to remain in the environment for long, as they are usually spread by sexual intercourse or needle sharing.

The authors declare the following competing financial interest(s): GKMG is an independent researcher and the owner of Gohs BioComputing, Singapore. GKMG has also written a book (Viral Shapeshifters: Strange Behaviors of HIV and Other Viruses) on a related subject. The authors have no other potential conflicts of interest.

■ REFERENCES
WHO, Novel coronavirus
Rigidity of the outer shell predicted by a protein disorder model sheds light on COVID-19 (Wuhan-2019-nCoV) infectivity
Shell disorder analysis predicts greater resilience of SARS-CoV outside the body and in body fluids. Microb. Pathog. 2020
Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Understanding Viral transmission behavior via protein intrinsic disorder prediction: Coronaviruses
Viral Shapeshifters: Strange Behaviors of HIV and Other Viruses
Prediction of Intrinsic Disorder in MERS CoV/HCoV-EMC Supports a High Oral-Fecal Transmission
WHO, Middle Eastern respiratory syndrome coronavirus
Correlating Flavivirus virulence and levels of intrinsic disorder in shell proteins: Protective roles vs. immune evasion
Shell disorder analysis suggests that pangolins offered a window for a silent spread of an attenuated SARS-CoV-2 precursor among humans
Antiviral activities in saliva
Innate antimicrobial activity of nasal secretions
Fundamentals of Molecular Virology
A comparative analysis of viral matrix proteins using disorder predictors
Vaccine Mystery and Viral Shell Disorder
Nipah shell disorder: Mode of transmission and virulence
Detection of links between Ebola nucleocapsid and virulence using disorder analysis
Intrinsically unstructured proteins
Why are "natively unfolded" proteins unstructured under physiologic conditions?
Intrinsically unstructured proteins: Re-assessing the protein structure-function paradigm
Intrinsically disordered proteins and multicellular organisms
DisProt: intrinsic protein disorder annotation in 2020
Predicting protein disorder for N-, C-, and internal regions
Predicting Binding Regions within Disordered Proteins
Sequence complexity of disordered protein
Mining alpha-helix-forming molecular recognition features with cross species sequence alignments
Coupled folding and binding with alpha-helix-forming molecular recognition elements
Addressing the intrinsic disorder bottleneck in structural proteomics
Protein intrinsic disorder toolbox for comparative analysis of viral proteins
Viral disorder or disordered viruses: Do viral proteins possess unique features?
Protein intrinsic disorder and influenza virulence: The 1918 H1N1 and H5N1 viruses
Shell disorder, immune evasion and transmission behaviors among human and animal retroviruses
R: A language and environment for statistical computing
Structural basis of receptor recognition by SARS-CoV-2
Virological assessment of hospitalized patients with COVID-19
Covid-19: Four fifths of cases are asymptomatic, China figures indicate
Prolonged presence of SARS-CoV-2 viral RNA in faecal samples
Repurposing of clinically approved drugs for treatment of coronavirus disease 2019 in a 2019-novel coronavirus (2019-nCoV) related coronavirus model
Cross-species transmission of the newly identified coronavirus 2019-nCoV
Susceptibility of ferrets, cats, dogs and other domesticated animals to SARS-Coronavirus 2
A yellow fever virus 17D infection and disease mouse model used to evaluate a chimeric Binjari-yellow fever virus vaccine. Vaccines (Basel) 2020, 8, E368.
(44) Lobanov, M. Y.; Likhachev, I. V.; Galzitskaya, O. V. Disordered Residues and Patterns in the Protein Data Bank
In vitro and in silico Models to Study Mosquito-Borne Flavivirus Neuropathogenesis, Prevention, and Treatment. Front
(48) WHO, Nipah virus
Immune responses in COVID-19 and potential vaccines: A lesson learned from SARS and MERS epidemics. Asian Pac
How deadly is the coronavirus? Scientists are close to an answer
Estimates of severity of coronavirus disease 2019: A model-based analysis
The many estimates of COVID-19 case-fatality rate
Genotype and phenotype of COVID-19: Their roles in pathogenesis
SARS Coronavirus but not Human coronavirus NL63 utilizes Cathepsin L to infect ACE2-expressing cells
Human coronavirus NL63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry
Host and sources of endemic coronaviruses
Functional mapping of the nucleoprotein of the ebola virus
Structural disorder within paramyxovirus nucleoproteins and phosphoproteins
The coronavirus nucleocapsid is a multifunctional protein
The balancing act of intrinsically disordered proteins enabling functions while minimizing promiscuity
Management of pregnant women infected with COVID-19