key: cord-0920100-qlglk6ly authors: Mishra, Pushpendra Mani; Verma, Navneet Chandra; Rao, Chethana; Uversky, Vladimir N.; Nandi, Chayan Kanti title: Intrinsically disordered proteins of viruses: Involvement in the mechanism of cell regulation and pathogenesis date: 2020-04-02 journal: Prog Mol Biol Transl Sci DOI: 10.1016/bs.pmbts.2020.03.001 sha: 2636c0ce0e415a04014e0223876d05a4dbeff83a doc_id: 920100 cord_uid: qlglk6ly
Intrinsically disordered proteins (IDPs) are inherently flexible and can be distinguished from other proteins by their lack of any fixed structure. This dynamic behavior has earned IDPs the name "Dancing Proteins." The exploration of these dancing proteins in viruses has just started, and crucial details, such as the correlation between rapid evolution, high mutation rate, and the accumulation of disordered content in viral proteomes, are at least partially understood. To gain a complete understanding of this correlation, there is a need to decipher the complexity of virus-mediated cell hijacking and pathogenesis in the host organism. Further, there is a need to identify specific patterns within viral and host IDPs, such as aggregation and molecular recognition features (MoRFs), and their association with the virulence, host range, and rate of evolution of viruses, in order to tackle virus-mediated diseases. The current book chapter summarizes these details and suggests novel opportunities for further research on IDPs in viruses.
Intrinsically disordered proteins of viruses: Involvement in the mechanism of cell regulation and pathogenesis
1. A general introduction to intrinsically disordered proteins (IDPs) and their major properties
1.1 Intrinsically disordered proteins (IDPs)
1.2 Properties of IDPs
1.3 Roles of IDPs in protein interaction and PPI networks
1.4 Predictors of intrinsic disorder
1.5 Structural assessment of IDPs through biophysical techniques
2. The dark proteomes of viruses
Rationale and importance of the book chapter
This book chapter, entitled "Intrinsically disordered proteins of viruses: involvement in the mechanism of cell regulation and pathogenesis," extensively discusses the intrinsically disordered protein (IDP)-mediated functional mechanisms, pathogenesis, structural regulation, and cellular regulation of the host cell by the complex viral proteome. For a complete understanding of IDPs and their role in viruses, the chapter starts with a brief introduction to IDPs, their associated atypical properties, and the different instrumental and computational techniques used to characterize them. Next, the chapter describes the IDP-related aspects of viruses. Different possible modes of viral IDP molecular mimicry and host IDP-mediated regulation of host cells are discussed, and a diagrammatic model is proposed.
Subsequently, the origin of viruses and their special properties are described. Further, the importance of viral structural, non-structural, and other proteins is emphasized. Furthermore, the prevalence of IDPs in viruses and its comparison with the three domains of life (Archaea, Bacteria, and Eukarya) are discussed in detail. The last portion of this book chapter explains various IDP-associated patterns in viruses and their relation to host range, pathogenicity, and protein aggregation. Next, the structural and functional importance of IDPs in different viruses (bacteriophages, plant viruses, and animal viruses) is discussed. The examples of these viruses and the descriptions of their IDP-associated mechanisms are taken from the referenced publications. Lastly, this chapter summarizes the content covered and discusses the future outlook for studying IDP prevalence, distribution, and disorder-related mechanisms in the proteomes of viruses. We hope that this chapter will help readers grasp the concept of IDPs and the IDP perspective on viruses, and will spawn novel ideas for deciphering the complexity of viral pathogenesis and for drug discovery. For instance, the prevalence of IDPs and the patterns of pathogenesis and host range have been explored and proven in a few viruses; however, other related patterns have not been explored completely. Additionally, the mechanisms of cell regulation via the disordered viral proteome are not completely understood. The proposed model will form the basis for further research and understanding. By authors.
The structure-function paradigm, which was widely accepted for more than a century, holds that the biological functions of proteins are linked to their rigid three-dimensional (3D) structures. 1 The normal functioning of most globular proteins (e.g.
enzymes) requires the orderly arrangement of various functional groups of amino acids in the protein's unique 3D structure to facilitate the catalysis of chemical reactions or other related functions. However, recent research has demonstrated that a large fraction of the genome-encoded proteins of many organisms lack well-defined 3D structures but still play various important roles in cellular functionality. Such proteins are generally known as intrinsically disordered proteins (IDPs). [2] [3] [4] However, they have multiple alternative names, such as natively denatured, 5 natively unfolded, 6 intrinsically unstructured, 7 natively disordered, 8 dancing proteins, 9 protein clouds, 10,11 4D, 12 malleable, [13] [14] [15] chameleon, 16 vulnerable, 17 intrinsically disordered, 18 intrinsically unfolded, intrinsically denatured, flexible, 19 mobile, 20 pliable, 21 rheomorphic, 22 and partially folded proteins. 23 These different names reflect properties observed in different experiments conducted at different times. Computational analysis reveals that more than one-third of eukaryotic proteins harbor intrinsically disordered regions (IDRs) of more than 30 residues in length. [24] [25] [26] [27] [28] [29] [30] [31] In solution, when IDPs are on their own, they lack a unique 3D structure either in parts or completely. 32 The high abundance of IDPs is associated with their functional importance for many crucial cellular processes, such as signaling, recognition, and regulation, which they carry out by means of high-specificity, low-affinity interactions and binding to multiple partners. Disorder-based signaling interactions can be of the many-to-one and one-to-many types. The function of IDPs is tuned by various post-translational modifications (PTMs), alternative splicing, and induced folding.
The high prevalence of IDPs in various diseases suggests that the root cause is not only protein misfolding but also mis-signaling and misidentification. The peculiar behavior of IDPs has drawn attention to them as drug targets for modulating protein-protein interactions. 8 Although IDPs are biologically active molecules, they tend to exist as extended, highly dynamic or collapsed conformational ensembles, with disorder at the level of tertiary or secondary structure. A comparative analysis of the amino acid sequences of IDPs with those of ordered proteins demonstrates a noticeable enrichment in disorder-promoting amino acids, such as Ala, Arg, Gly, Gln, Ser, Glu, Lys, and Pro, paralleled by a significant depletion in order-promoting amino acids: Ile, Leu, Val, Trp, Tyr, Phe, Cys, and Asn. In addition to these criteria, several other sequence attributes provide a reliable basis for differentiating disordered proteins from other proteins: 14 Å contact number, coordination number, hydropathy, Cys + Phe + Tyr + Trp content, volume, Arg + Glu + Ser + Pro content, bulkiness, net charge, and β-sheet propensity. 18,33-36 One of the important properties of IDPs is their promiscuity. They interact with multiple partners and can act as highly connected nodes, or hubs, within protein-protein interaction (PPI) networks. Hubs are vital for the normal functioning and stability of PPI networks in any organism. It has been shown that the deletion of a hub protein can be lethal for the organism. [37] [38] [39] [40] [41] [42] [43] [44] [45] Illustrative examples of disordered hub proteins that bind to around 10-100 binding partners are p21, p27, p53, BRCA1, XPA, α-synuclein, estrogen receptor, 46 etc.
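The compositional bias described above (enrichment in disorder-promoting residues and depletion in order-promoting ones) can be illustrated with a minimal Python sketch. The two residue sets are exactly those listed in the text; the function name `composition_bias` and the simple "difference of fractions" score are illustrative assumptions, not a published predictor.

```python
from collections import Counter

# Residue classes as listed in the text (one-letter codes).
DISORDER_PROMOTING = set("ARGQSEKP")  # Ala, Arg, Gly, Gln, Ser, Glu, Lys, Pro
ORDER_PROMOTING = set("ILVWYFCN")     # Ile, Leu, Val, Trp, Tyr, Phe, Cys, Asn


def composition_bias(sequence: str) -> dict:
    """Return the fractions of disorder- and order-promoting residues.

    The 'bias' entry (disorder fraction minus order fraction) is a
    hypothetical summary score used here only for illustration.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    disorder = sum(counts[aa] for aa in DISORDER_PROMOTING) / n
    order = sum(counts[aa] for aa in ORDER_PROMOTING) / n
    return {
        "disorder_promoting": disorder,
        "order_promoting": order,
        "bias": disorder - order,
    }


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(composition_bias("SGSEEKPRSGQEAPKS"))
    print(composition_bias("ILVFWYCNILVFWYCN"))
```

A positive bias only indicates a composition typical of disordered sequences; real predictors combine many more attributes than this single ratio.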
IDRs within disordered hub proteins are present in at least one of two functional forms: in one form, the disordered binding site interacts with a specific partner and, upon interaction, adopts an ordered conformation; in the other form, the IDR is a flexible linker that connects two ordered domains and allows their unrestricted movement. 47 The presence of charged amino acid residues helps in establishing the structure and function of proteins. The high content of charged residues in highly disordered proteins (native pre-molten globules (PMG) and native coils) is an important, conspicuous feature. 48 A high net charge is important for the extended conformation of IDPs, 49 because it has been observed repeatedly for proteins in an aqueous environment that sequences lacking certain hydrophobic residues and rich in polar uncharged amino acids form a heterogeneous ensemble of collapsed structures. [50] [51] [52] [53] [54] [55] [56] Analysis of a number of highly charged polypeptides revealed that the intrinsic preference of a polypeptide backbone for the formation of collapsed structures depends on charge content. 49 The analysis of various protein databases of human diseases, together with other observations, has established the contribution of IDPs to the pathogenesis of many human ailments and their role as a common player among diseases. 8, 57 A few examples of diseases in which IDPs/IDRs are involved are listed below.
Cancer: Intrinsic disorder has been observed in many cancer-associated proteins, such as p53, 58 p57kip2, 59 c-Fos, 60 Bcl-2 and Bcl-XL, 61 thyroid cancer-associated protein TC-1, 62 and protein components of cancer-causing viruses. 1
Down's syndrome: Non-filamentous deposits of intrinsically disordered amyloid-β (Aβ). 63
Alzheimer's disease: IDPs associated with this disease are depositions of Aβ, Tau, and α-synuclein NAC fragment.
[64] [65] [66] [67] Other diseases in which intrinsic disorder in protein components has been reported are the family of polyQ diseases 68 ; a variant of Alzheimer's disease, dementia with Lewy bodies, diffuse Lewy body disease, Parkinson's disease, Hallervorden-Spatz disease, and multiple system atrophy 69 ; prion disease 70 ; and argyrophilic grain disease, myotonic dystrophy, motor neuron disease with neurofibrillary tangles, subacute sclerosing panencephalitis, and Niemann-Pick disease type C. 66 Intrinsic disorder is also reported in the protein components of viruses causing various human diseases, such as AIDS and cutaneous diseases. 71
IDPs do not have specific fixed structures; hence they exist as dynamic ensembles, quite similar to clouds of proteins. In these protein clouds, atomic positions and backbone Ramachandran angles do not have fixed values and vary significantly over time. Despite being dynamic in nature, these protein clouds can be represented by a fairly limited number of low-energy conformations (but still significantly more than the single low-energy state typical of ordered proteins). 10, 72, 73 To understand the regulatory mechanisms and the involvement of IDPs in cellular functions, their structural details are necessary. Various methods have been developed for the ensemble modeling of IDPs. [74] [75] [76]
1.2.6 Hydration property
Due to the differences in structure and structure-associated properties, ordered and disordered proteins possess different degrees of hydration. The degree of hydration is significantly higher for IDPs than for globular proteins of similar size. Furthermore, the degree of hydration also differs between partially and fully intrinsically disordered proteins. [77] [78] [79] In addition to retaining a large amount of water, IDPs also have a high propensity for binding charged solute ions. Both properties play an important protective role in biological systems.
For example, under adverse water-stressed conditions, D. radiodurans is able to protect its enzyme nudix hydrolase from denaturation owing to these properties of the IDRs of this protein. 80 Several plants and free-living insect species also protect themselves by using the ability of IDPs and IDRs for excessive hydration and absorption of solute ions. 46
1.2.7 Property of induced folding
Many IDPs can undergo (at least partial) disorder-to-order transitions upon binding to specific partners. The free energy required for the transition comes from the interface contacts, which results in an association with a low net free energy despite a highly specific interaction. 18, 38, 39, 81, 82 In IDPs/IDRs, the coupled properties of high specificity and low affinity seem to ensure specific binding and reversibility to complete the signaling cascade. 46 IDPs/IDRs can change their shapes to readily bind multiple different partners. Also, it has been shown that in their unbound conformational ensembles, IDPs/IDRs have a preference for the structure they are most likely to adopt after binding. 81,83,84 Interactions of IDPs with their partners are characterized by a diverse range of binding modes, which gives rise to many unusually shaped complexes. Some of these complexes are relatively static, so their structures can be determined by X-ray crystallography. 11 The binding modes of IDPs that have been studied most extensively are molecular recognition features (MoRFs). MoRFs are short, interaction-prone, intrinsically disordered protein segments. These regions also have an intrinsic propensity for order, which is not strong enough to ensure their folding in the unbound state. However, upon binding to specific partners, MoRFs undergo a disorder-to-order transition. Such regions are chiefly involved in molecular recognition.
The classification of MoRFs is based on their structures in the bound state. Accordingly, they are classified into α-helix-forming α-MoRFs, β-strand-forming β-MoRFs, irregular ι-MoRFs that become ordered without any regular secondary structure, and complex MoRFs that contain two or more types of secondary structure. 85, 86 In addition to MoRFs, other known binding modes are Pullers, 87 Penetrators, 88 Flexible Wrapper, [89] [90] [91] [92] Connectors and Armature, [93] [94] [95] [96] [97] Huggers, [98] [99] [100] Stackers or β-Arcs, 101 Intertwined Strings, [102] [103] [104] Long Cylindrical Containers, 105 Tweezers and a Forceps, 106 Grabbers, 107 Tentacles, 108 and Chameleons. 16, [109] [110] [111] [112]
1.3 Roles of IDPs in protein interaction and PPI networks
An IDP/IDR can contribute to binding diversity in three different ways: first, it may serve as the structural basis for hub protein promiscuity; second, it may bind to structured hub proteins; and third, an IDR can act as a flexible linker between functional domains and facilitate binding diversity through a linker-enabling mechanism. 38 Researchers have found a vast range of functions for IDPs/IDRs. A few examples are given here to illustrate the types of biological activities carried out by IDPs/IDRs: (1) IDPs contain sites for various posttranslational modifications (PTMs), such as phosphorylation, methylation, glycosylation, ADP-ribosylation, or acetylation; (2) IDRs can provide an entropic spring (rubber-like) property; (3) IDPs contain autoinhibitory domains; (4) IDPs/IDRs possess binding sites for DNA, rRNA, mRNA, tRNA, metal ions, and other proteins; (5) IDRs include regulatory protease digestion sites; (6) signals for nuclear localization are located within IDRs; (7) IDRs provide flexible linkers between structured domains 112 ; and (8) IDPs, such as p21 and p27, mediate cell regulation. 113 Fig.
1 provides details of the involvement of IDPs in crucial cellular functions and processes. The compositional differences between ordered proteins and IDPs facilitated the development of various disorder predictors. These predictors were initially based on amino acid composition. Later predictors were built on basic physical principles and machine learning algorithms that use the characteristic features of IDPs/IDRs, such as net charge, hydrophobicity, and other sequence features. As of 2009, more than 50 predictors of intrinsic disorder had been developed and published, 116 and this number has likely doubled since. There is a good chance of developing improved predictors of intrinsic disorder if the proper sequence information is encoded into the prediction algorithm. Examples of a few common predictors are as follows: various members of the PONDR family, 34 DISOPRED, 117 FoldIndex, 118 IUPRED, 119 DisEMBL, 120 DISOPRED2, 117 and RONN 121, 122 to name a few. There are three functional conformational states in which IDPs can globally exist, depending on the environment and the content of residual structure. These are, in order of increasing disorder, the molten globule (MG), pre-molten globule (PMG), and random-coil-like (RC-like) states. Therefore, IDPs can adopt either extended conformations (RC and PMG) or remain globally collapsed (MG). 123 So far, conformational and spectroscopic studies of IDPs have confirmed the important notion that IDPs cannot be represented by a homogeneous structural class but instead span a range from fully extended (RC-like) to compact (MG-like) conformations.
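To make the idea of predictors built on net charge and hydrophobicity concrete, the following Python sketch computes a charge-hydropathy score for a sequence. It is a minimal illustration under stated assumptions, not the implementation of any predictor named above: the Kyte-Doolittle scale rescaled to 0-1, the boundary coefficients 2.785 and 1.151 (taken here from the published charge-hydropathy plot and FoldIndex formulation rather than from this chapter), and the function names are all assumptions of this sketch.

```python
# Kyte-Doolittle hydropathy scale (assumed here; not given in the chapter).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(sequence: str) -> str:
    """Upper-case the sequence and reject empty or non-standard input."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return seq


def mean_hydropathy(sequence: str) -> float:
    """Mean hydropathy with the scale rescaled from [-4.5, 4.5] to [0, 1]."""
    seq = _clean(sequence)
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(sequence: str) -> float:
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    seq = _clean(sequence)
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return abs(positive - negative) / len(seq)


def fold_index(sequence: str) -> float:
    """Charge-hydropathy score: negative values suggest disorder.

    The coefficients are an assumption taken from the literature on the
    charge-hydropathy plot, not from this chapter.
    """
    return 2.785 * mean_hydropathy(sequence) - mean_net_charge(sequence) - 1.151


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    for name, seq in [("charged", "EEEEKEEEE"), ("hydrophobic", "ILVILVILV")]:
        print(name, round(fold_index(seq), 3))
```

A highly charged, weakly hydrophobic sequence falls on the disordered side of the boundary, which is consistent with the role of net charge in extended conformations discussed earlier in the chapter.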
The protein trinity hypothesis was proposed by Keith Dunker to accommodate the three best-known conformations of a protein molecule in a functional framework. It postulates that a biologically active protein molecule can exist in three conformationally different native states: an ordered form, a state with collapsed disorder (molten globule, MG), and a state with extended disorder (RC). The functional form is represented by any of the three conformations or by transitions between them. Subsequently, this model was extended to accommodate an extra conformation, the PMG, which is intermediate between MG and RC. 18 Many biophysical techniques can be applied to the conformational analysis and structure determination of IDPs. Some of these techniques provide information indirectly, while others are useful in providing more quantitative structural data. Nuclear magnetic resonance (NMR) is one of the most powerful techniques for deriving quantitative structural information. 124 Wide-line NMR relaxation experiments characterize IDPs and provide details about the hydration layer in the vicinity of disordered regions in the extended, open state. Additionally, the diffusion coefficient of a protein can be measured by pulsed-field gradient NMR, from which hydrodynamic parameters can be derived. 4 Structural transitions in IDPs can be mapped and documented by electron paramagnetic resonance (EPR) spectroscopy. The introduction of new-generation EPR spin labels that target residues other than cysteine has expanded the scope of this technique. [125] [126] [127] [128] [129] Small-angle X-ray scattering (SAXS) and small-angle neutron scattering (SANS), which are experimental techniques for the extraction of quantitative information, allow the investigation of transient intermediates and provide detailed information about the nature of IDPs.
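As an illustration of how a hydrodynamic parameter can be derived from a diffusion coefficient measured by pulsed-field gradient NMR, the sketch below applies the Stokes-Einstein relation, R_h = k_B T / (6 π η D). The relation itself is standard physics rather than something stated in the chapter, and the default temperature, the viscosity of water, and the example diffusion coefficient are assumptions chosen only for demonstration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def hydrodynamic_radius(diffusion_m2_per_s: float,
                        temperature_k: float = 298.15,
                        viscosity_pa_s: float = 8.9e-4) -> float:
    """Return the Stokes-Einstein hydrodynamic radius in metres.

    Defaults assume dilute aqueous solution at 25 degrees Celsius with a
    viscosity of about 0.89 mPa*s; both are assumptions of this sketch.
    """
    if diffusion_m2_per_s <= 0 or temperature_k <= 0 or viscosity_pa_s <= 0:
        raise ValueError("all inputs must be positive")
    return (BOLTZMANN_J_PER_K * temperature_k) / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )


if __name__ == "__main__":
    # Hypothetical diffusion coefficient, not a measured value.
    d = 1.0e-10  # m^2/s
    print(f"R_h = {hydrodynamic_radius(d) * 1e9:.2f} nm")
```

For the same chain length, an extended IDP diffuses more slowly than a compact globular protein, so its hydrodynamic radius obtained this way is larger.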
Single-molecule techniques, such as fluorescence resonance energy transfer (FRET), [130] [131] [132] high-speed atomic force microscopy (HS-AFM), 133, 134 and AFM-based force spectroscopy (FS) 135 are tools for exploring the dynamics and structure of IDPs. Single-molecule fluorescence resonance energy transfer (SM-FRET) reports on changes in the distance between two residues and, through the intramolecular distance distribution, on conformational equilibria on time scales of less than a second. AFM-based SM-FS is particularly sensitive to the formation of secondary structures and probes time scales from milliseconds to seconds. HS-AFM is used for the direct observation of dynamic processes and structural dynamics of biological molecules, with a temporal resolution from subsecond to sub-100 ms. 134, 136 To date, various dynamic processes have been visualized successfully using this approach. HS-AFM is applicable to both IDPs and well-structured proteins. Other complementary methods that can be used to study protein disorder are sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), gel filtration or size exclusion chromatography, and analysis of the specific behavior of proteins in acidic and high-temperature environments. In SDS-PAGE analysis, the observed mobility of IDPs appears to be anomalous. This phenomenon is explained by the less efficient binding of SDS molecules to highly charged IDPs in comparison with globular proteins of similar molecular mass. The apparent molecular mass determined by this method is 1.2- to 1.8-fold higher than the molecular mass determined from the protein sequence or by mass spectrometry. 137 The unusually high apparent molecular mass of IDPs is also observed by gel filtration or size exclusion chromatography techniques.
123 The specific behavior of IDPs under different environmental conditions, such as their stability in an acidic environment and insensitivity to high temperature, has been described for several IDPs, such as caldesmon, 138 microtubule-associated protein-2 (MAP2), 139 involucrin, 140 and α-synuclein 6 to name a few. These environmental conditions usually cause the denaturation and/or precipitation of globular proteins out of solution. This difference in the behavior of IDPs and globular proteins under various environmental conditions forms the basis for the purification of IDPs. [141] [142] [143] [144] This uniqueness of IDPs provided the first clue to their unusual structural conformation. 145 IDPs confer high flexibility on viral proteins, 146 in either wholly or partially disordered form. This gives viral proteins the capability for quick adaptation to a changing environment, survival in host body environments, and evasion of the defense mechanisms of the host. To accomplish these tasks, viral genomes exhibit a high mutation rate. For example, the rate of nucleotide exchange per position per generation exhibited by ribonucleic acid (RNA) viruses falls in the range of 10⁻⁵ to 10⁻³; for deoxyribonucleic acid (DNA) viruses it is 10⁻⁸ to 10⁻⁵, while eukaryotes and bacteria demonstrate a mutation rate of 10⁻⁹. 147 Even a single mutation can affect more than one viral protein, owing to the high compactness of viral genomes and the existence of overlapping reading frames, which are often observed in viral systems. Throughout the life cycle of the virus, many interactions are made with various components of the host cell. The cycle begins with attachment and entry, proceeds to the hijacking of the cellular machinery, the synthesis of viral components, and the assembly of viral particles, and ends with exit from the host cell in the form of new infectious particles.
148 All of these stages rely heavily on the intrinsic disorder of viral proteins. 148
3. Involvement of IDPs in the pathogen-host mediated regulation of cell cycle
(1) Cyclin-dependent kinase inhibitor 1B (p27Kip1) protein is associated with positive and negative regulation of the cell cycle. 149 (2) A preformed helical structure in the disordered N-terminal transactivation domain (TAD) of p53 determines the interaction of this protein with Mouse double minute 2 homolog (Mdm2); any change in the amino acid residues of the molecular determinant region affects the binding of Mdm2 and, subsequently, cell regulation and apoptosis. (3) Conformational fluctuations in intrinsically disordered cell proteins transiently expose dynamic interaction motifs, leading to posttranslational modifications (PTMs) and interactions with various target proteins that affect cell cycle control. (4) Early mitotic inhibitor protein-1 (EMI-1), which contains a zinc-binding domain embedded in disordered regions, inhibits the anaphase-promoting complex/cyclosome (APC/C), which controls cell division by promoting the ubiquitin-mediated degradation of cyclins and other proteins involved in the regulation of the cell cycle. Kinase phosphorylation regulates the interaction between EMI-1 and APC/C. 114
Intrinsically unstructured viral protein components could, through molecular mimicry, take the place of host IDPs involved in various cell regulatory processes (a few of which are discussed above) and hijack the host cell machinery. 114, [149] [150] [151] [152] In addition to these pathways, viruses can, through histone mimicry, control gene expression and ultimately the cell cycle regulation network of host cells. 153 Many unconventional RNA-binding proteins containing IDRs can also play an important role in the control of cellular machinery by viruses at the battlefront between host and pathogen.
154 Besides mimicking the functions of host cell proteins, viral protein complexes directly attack cellular components and disrupt their normal functions. For instance, the disordered viral oncoproteins of many cancer-causing viruses attack the complex of retinoblastoma protein (pRb) and E2F and affect the normal cell regulation mechanism, as shown in Fig. 3. Phosphofurin acidic cluster sorting protein (PACS), which acts as a traffic modulator, first appeared in lower metazoans. The later evolution of this protein in vertebrates integrated cytoplasmic trafficking and interorganellar communication with nuclear gene expression. In the course of evolution, the functional diversity of PACS increased in vertebrates through the acquisition of phosphorylation sites and nuclear trafficking signals within its disordered regions. The protein trafficking pathways mediated by the PACS variants PACS-1 and PACS-2 are hijacked by viruses for immune evasion, multiplication, and pathogenesis. However, the complete mechanism is yet to be deciphered. 156 To accomplish all of the above functions, viral proteins interact with many cellular components, including nucleic acids, proteins, and membranes. The presence of intrinsic disorder in viral proteins is advantageous for their interactions with host cell components. The ease of these interactions can be explained by the lack of rigid 3D structure and the high structural flexibility of viral IDPs or proteins containing IDRs, which allow them to interact with many binding partners at a time. The linking of functional domains and their promiscuity is achieved by interaction with partner IDRs, where flexibility plays a major role in bringing two or more domains into proximity in order to perform a particular function.
The flexible linker function of IDRs in viral proteins confers the advantage of escaping recognition by the host immune system: the viral protein interacts with host proteins in such a way that recognition of the viral epitope by the components of the host immune system becomes difficult. The typically high mutation rate of viral systems can be tolerated because these flexible regions in viral proteins are free of structural constraints and hence less susceptible to the effects of mutation. The expected explanation behind all these observations points toward the involvement of IDPs.
Fig. 3 Host cell cycle regulation influenced by the attack of viral protein components on the pRb and E2F complex. The viral protein complex forcibly releases E2F from the pRb and E2F complex and abruptly increases cell cycle progression in an uncontrolled way. The blue color shows the normal pathway of G1 to S progression, while the red color shows virus-induced uncontrolled cell progression from G1 to S. 155
The first and pivotal observation of an abundance of intrinsic disorder in the replicative complex of paramyxoviruses has been confirmed. [157] [158] [159] The availability and continuous growth of bioinformatics tools in recent decades, together with the development of sensitive biophysical experimental techniques, have led to the identification of an abundance of IDPs in viruses. 31, [160] [161] [162] [163] Among all replicating entities, viruses are the most numerous and are therefore considered the most abundant biological entities on Earth. 164 For instance, the number of cells of all living creatures on Earth is at least an order of magnitude lower than the number of viral particles. 165, 166 The number of viruses can be estimated by counting the number of virus-like particles in the environment. For example, 1 mL of natural water contains as many as 2.5 × 10⁸ viral particles.
167 Viruses are parasitic in nature and can be found in high abundance in infected cells of Bacteria, Archaea, and Eukarya, or even in other viruses. 164, 166, 168 The discovery of a small icosahedral virophage named Sputnik established the concept of the infection of one virus by another. 169 Sputnik infects the Acanthamoeba polyphaga Mimivirus (APMV), which in turn infects amoebae. APMV is a member of the Megaviridae family. [170] [171] [172] Infection of APMV by Sputnik is damaging and produces many deleterious effects in APMV; e.g., capsid assembly becomes abnormal and abortive viral forms appear. 169 This breach in the normal morphogenesis of APMV is explained by the cytoplasm-independent replication center of APMV, where final morphogenesis normally takes place: infection with Sputnik and the multiplication of this virus at the center hinder its normal function. 173 From the structural perspective, viruses have a very simple organization. However, they display various shapes and do not possess a unique common morphology. The genome of every virus is made up of either double- or single-stranded DNA or RNA. It is encapsulated within a protective protein coat known as the capsid. Enveloped viruses have an additional lipid envelope that contains a number of membrane proteins. The envelope lies over the matrix protein layer, which is an additional proteinaceous coat. Some complex viruses contain, in addition to the non-structural proteins, numerous accessory and regulatory proteins, all of which help in the assembly of the viral capsid. Viruses display a wide array of diversity in the structure of their genomes and in their mechanisms of replication and transcription. The viral genome can be single- or double-stranded DNA or single- or double-stranded RNA and can be transcribed via a negative-sense, positive-sense, or ambisense mechanism.
The diversity of viruses, both in genome structure and in mechanism of function, has led to their classification into seven major classes. 174 Following this classification, all DNA-based viruses are placed in classes I, II, and VII, which contain dsDNA viruses, ssDNA viruses, and dsDNA viruses that replicate via a single-stranded RNA (ssRNA) intermediate, respectively. The remaining four classes, III, IV, V, and VI, contain the RNA viruses: double-stranded RNA (dsRNA) viruses, ssRNA viruses of positive (+) sense, ssRNA viruses of negative (−) sense, and ssRNA viruses of positive (+) sense that replicate via a DNA intermediate, respectively.

Certain features of viruses set them apart from living organisms: they lack a defined cell-like structure, and they cannot maintain homeostasis or reproduce outside the cellular environment, because they have no metabolism of their own and depend entirely on the host cell to make new products. Other features, such as the presence of a genome, the ability to replicate and to create copies of themselves by self-assembly, and continuous evolution by natural selection, make viruses similar to living organisms. 175 These unusual properties make it difficult to reach a common view on the nature of viruses. It remains unclear whether viruses are organisms at the edge of life, different and special with respect to cellular organisms, or non-living organic structures that have an intrinsic ability to interact with living organisms. 176 The recent discovery of genes encoding metabolic proteins in giant viruses challenged the earlier view that viruses lack such genes. 177 Moreover, certain bacteria, such as Mycoplasma, Rickettsia, and Chlamydia, are obligate intracellular parasites, exactly as viruses are. All of this argues for reconsidering the criteria used to define living organisms.
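The seven-class scheme described in this passage amounts to a small decision table over genome type, strandedness, sense, and use of a reverse-transcribed intermediate. The following is a minimal sketch of that table; the function name `baltimore_class` and its parameters are hypothetical and chosen only for illustration.

```python
# Minimal sketch of the seven-class scheme summarized in the text.
# The function name and argument names are illustrative assumptions.

CLASS_DESCRIPTIONS = {
    "I": "dsDNA viruses",
    "II": "ssDNA viruses",
    "III": "dsRNA viruses",
    "IV": "(+) sense ssRNA viruses",
    "V": "(-) sense ssRNA viruses",
    "VI": "(+) sense ssRNA viruses replicating via a DNA intermediate",
    "VII": "dsDNA viruses replicating via an ssRNA intermediate",
}


def baltimore_class(nucleic_acid, strands, sense=None, reverse_transcribing=False):
    """Return the class numeral for a genome description.

    nucleic_acid: "DNA" or "RNA"
    strands: "ds" or "ss"
    sense: "+" or "-" (only meaningful for ssRNA genomes)
    reverse_transcribing: True if replication passes through an
        RNA (for DNA genomes) or DNA (for RNA genomes) intermediate
    """
    if nucleic_acid == "DNA":
        if strands == "ds":
            return "VII" if reverse_transcribing else "I"
        if strands == "ss":
            return "II"
    elif nucleic_acid == "RNA":
        if strands == "ds":
            return "III"
        if strands == "ss":
            if sense == "+":
                return "VI" if reverse_transcribing else "IV"
            if sense == "-":
                return "V"
    raise ValueError("genome description does not match any of the seven classes")


if __name__ == "__main__":
    numeral = baltimore_class("RNA", "ss", sense="+", reverse_transcribing=True)
    print(numeral, "=", CLASS_DESCRIPTIONS[numeral])
```

Ambisense genomes, which the text mentions as a transcription mechanism, are not given a class of their own here, so the sketch rejects any sense value other than "+" or "-".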
The origin of viruses is incompletely understood, and three chief hypotheses have been put forth to explain it. 178 The first is the coevolution hypothesis, according to which viruses and cells appeared simultaneously in the early history of the Earth, and viruses have depended on cellular life ever since their emergence. The second is known as the cellular origin, or vagrancy, hypothesis. It assumes that viruses evolved from pieces of DNA or RNA that escaped from the genes of larger organisms. Potential candidates for this escaped genetic material are (1) plasmids, which are naked DNA molecules that are physically separate from chromosomal DNA and can replicate independently, and (2) transposons, which are pieces of DNA that can move from one place to another within the genome and replicate. The third is the regressive, or degeneracy, hypothesis, which proposes that viruses originated from a parasitic cell that shed all genes not required to support parasitism. According to other hypotheses, the root of viral origin can also be traced to the nucleoprotein world that transiently existed during the transition from the RNA world to the modern DNA-RNA-protein world. RNA viruses would then have appeared either by reduction of, or by escape from, primitive RNA-containing cells, and these RNA viruses are also considered the evolutionary starting point for some of the DNA viruses. 178 The origin of viruses is thus placed in the early phase of the evolution of life, 179 when living cells first evolved, and viruses are proposed to have existed ever since. This could explain why viruses are able to infect cells from all three domains of life: Eukarya, Archaea, and Bacteria.
The primitive nature of viruses and their rapid evolution offer a possible explanation for the lack of homology between the major viral proteins and the proteins of cellular organisms. 178 Viruses contribute to the evolution of life through their ability to promote horizontal gene transfer and through their proposed role in the emergence of DNA and its replication mechanism among different life forms. The incorporation of foreign genes, often from unrelated organisms, and the modification of replication machinery lead to continuous evolution and genetic diversity. 180 DNA fragments of viral origin make up between 3% and 8% of the human genome. The viral origin of a few DNA replication proteins, and their subsequent transfer into cellular organisms, supports a key role for viruses in the emergence of DNA and the subsequent development of replication mechanisms. These virus-mediated developments were essential for the evolution of the eukaryotic nucleus and potentially for the development of the three domains of life. 178 A new classification of the life forms present on Earth has been proposed. According to this classification, all ribosome-encoding organisms (Archaea, Bacteria, and Eukaryotes) are placed in one class, and all viruses are placed in a separate class of capsid-encoding organisms, which depend on a ribosome-encoding host to complete their life cycle, contain nucleic acids and proteins, and are able to self-assemble into nucleocapsids. 181

The viral capsid is the protective coat surrounding the viral genome. Monomeric protein subunits, termed capsomers or protomers, combine to build the shell of the capsid. A tight association of the RNA- or DNA-based genome with capsid protein results in the formation of a nucleoprotein complex. The nucleoprotein complex of a virus can interact with both nucleic acids and proteins and is therefore multifunctional.
Capsid structure is determined by the arrangement of capsomers. On this basis, a capsid can be helical, icosahedral, or complex in shape. A highly ordered helical structure is characteristic of the capsids of helical, rod-shaped, and filamentous viruses, which are generally formed around a central axis by packing a single type of capsomer. The viral genetic material, made of RNA or DNA, occupies the central cavity of the capsid, where the positive charge of the capsid protein and the negative charge of the viral genome maintain an electrostatic interaction between them. Helical viruses vary in size and can be very long and flexible or very short and rigid. The capsid length of a helical virus is defined by its genome size, whereas its diameter is defined by the size and arrangement of the capsomers. Well-known examples among filamentous viruses are Sulfolobus islandicus filamentous virus (SIFV), Tobacco mosaic virus (TMV), Acidianus filamentous virus 1 (AFV1), and bacteriophage fd. In icosahedral viruses, the capsids are either icosahedral or nearly spherical with icosahedral symmetry. Although the number of subunits theoretically required to form such an icosahedral structure is 60, in reality it exceeds 60 in the majority of icosahedral viruses. 148 Viral capsids are often made up of more than one capsid protein. For instance, the capsid of Human papillomavirus (HPV) is made of the major (L1) and minor (L2) capsid proteins. When an icosahedral capsid is built from more than 60 identical subunits, the same protein must occupy sites with different symmetry environments to generate the icosahedral shape. How identical subunits with identical unique 3D structures fit into different symmetries in different environments is an intriguing puzzle that has been the topic of long-lasting debate.
182 A few viruses have complex capsid structures that are neither completely helical nor icosahedral and that contain extra structures, such as protein tails or complex outer walls. One of the best-studied complex viruses is the T4 bacteriophage. Its characteristic feature is an icosahedral head on top of a helical tail. A hexagonal base plate with extended, protruding proteinaceous fibers sits at the end of the tail. This tail structure acts as a molecular syringe that allows T4 to bind the host bacterium and transfer its genome into it. 183

Certain viruses acquire a lipid membrane around the capsid from the host. This membrane coat is known as the viral envelope, and it may also contain viral glycoproteins and other membrane proteins, for example, gp160 of Human Immunodeficiency Virus (HIV), which comprises the transmembrane subunit gp41 and the structural subunit gp120; the proton-selective ion channel M2 of influenza virus; and hemagglutinin (HA) and neuraminidase (NA) in other enveloped viruses. The functional roles of these surface-incorporated viral proteins are rather diverse. Those glycoproteins that protrude from the lipid bilayer of the virus, for example, NA, HA, and gp120, play a number of important roles in early-stage infection, typically associated with attachment to and penetration of the host cells. 184 Other envelope proteins perform different functions in the life cycle of enveloped viruses. For instance, the M2 proton channel of influenza A virus has a crucial role in the early and late stages of the influenza replication cycle. Exposure of the viral contents to the host cytoplasm requires hydrogen ions to lower the pH inside the virion. At lower pH, M1 dissociates from the ribonucleoprotein and viral uncoating is initiated.
The supply of hydrogen ions from the endosome into the viral particle is mediated by the M2 proton channel, an integral homotetrameric membrane protein situated in the viral envelope. This ion channel is proton-selective and is gated by low pH. 185 In enveloped viruses, the viral envelope is attached to the core via matrix proteins. Matrix proteins play their role once the virus has entered the host cell. In addition to releasing the genetic material from the viral core, matrix proteins have various regulatory roles that they carry out by interacting with host components. For instance, the influenza virus matrix protein M1 controls inhibition of viral transcription, export of the viral ribonucleoprotein from the nucleus, and budding. 186,187

Non-structural proteins do not form the capsid structure. Instead, they participate in viral multiplication and have multiple regulatory functions. Below are some illustrative examples of the non-structural proteins of a few viruses and their involvement in crucial viral functions. HPV open reading frames (ORFs) are classified into early (E) and late (L) types on the basis of their location within the viral genome. The HPV early ORFs code for non-structural proteins. Both E1 and E2 participate in viral replication and in the regulation of transcription at an early stage. E1 binds to the origin of replication and displays helicase and ATPase activity, 188, 189 while E2 facilitates E1 binding to the origin of replication by forming a complex with it. [189] [190] [191] E2 also acts as a transcription factor, regulating early gene expression (both positively and negatively) by attaching to specific recognition sites within the upstream regulatory region (URR). 192, 193 The differentiation-dependent productive phase of the viral life cycle is promoted by the highly expressed protein E4, which is involved in a number of important functions. [194] [195] [196] In vitro studies found that E5 has weak transforming capabilities.
197, 198 It disrupts MHC class II maturation 199 and is involved in HPV late functions. 200, 201 The E6 and E7 proteins are primarily involved in the progression of HPV-mediated malignancy that ultimately leads to invasive carcinoma. In high-risk HPVs they act as oncoproteins, at least in part by targeting the cell cycle regulators and tumor suppressors p53 and Rb. Another example of the diversity of functional roles attributed to non-structural proteins is given by Hepatitis C Virus (HCV), where the interaction of non-structural proteins with hVAP-33 (VAMP-Associated Protein A, a human cellular vesicle membrane transport protein), with lipid raft membranes, and with each other leads to the formation of the HCV RNA replication complex, also called the HCV replicon. 202

Non-structural proteins also take part in immunomodulation. The non-structural protein NS1 of West Nile virus (WNV) has been shown to be involved in immunomodulation: experiments found that both cell-surface-associated and soluble NS1 can bind and recruit the complement regulatory protein factor H. This activity decreases complement activation and reduces complement recognition of infected cells, which minimizes targeting of WNV by the immune system. 203 An immunomodulatory role is also exhibited by the non-structural C protein of rinderpest virus, but via a different mechanism. In rinderpest virus, the action of type 1 and type 2 interferons, which are responsible for the induction of the innate immune response, is specifically blocked by the non-structural C protein. 204 Many non-structural V proteins of paramyxoviruses have also been shown to counter antiviral responses. 205 Finally, gene transactivation may require viral non-structural proteins.
For instance, the non-structural protein NS1 of the autonomous parvovirus minute virus of mice (MVM) is required for the activation of the p39 promoter, which controls the transcription of the gene that encodes the capsid protein. Because of an overlapping transcription unit in MVM, the gene that codes for NS-1 also codes for NS-2. This gene is transcribed from the P04 promoter. 206

Many crucial viral functions are performed by accessory and regulatory proteins, whose indirect functional roles range from regulating the transcription rate of viral genes encoding structural proteins to modifying host cell functions. For instance, the replication of HIV-1 is actively controlled by the production of several accessory proteins (Nef, Vpu, Vif, and Vpr) and two regulatory proteins (Rev and Tat). These regulatory and accessory proteins control various aspects of the viral life cycle, in addition to regulating host cell functions such as gene regulation and apoptosis. 207 A number of accessory proteins are, in fact, responsible for in vivo infection. For instance, Vif overcomes the host defense mechanism, while Nef increases viral pathogenesis by targeting bystander cells. 207

6. Role of bioinformatics in divulging the dark proteome of viruses

Viral proteins contain many unusual features that are lacking in the cellular proteins of other organisms. 160, 179 The presence of such specific features in a viral proteome helps viruses to adapt quickly to a hostile environment, while providing the means to control the cellular machinery easily. 208 The absence of corresponding features in the proteomes of other organisms might reflect the ancient origin of viruses and their genomes from a cellular lineage that is now extinct.
209 In addition to the various peculiar features listed in ref. 160, viral proteomes contain frequent short disordered regions that are generally depleted in hydrophobic residues and lysine and enriched in polar residues and in residues that are not involved in the formation of regular secondary structure. 148, 160 The polar residues are required for specific recognition and for stabilizing the interaction with a partner molecule through hydrogen bonding in the bound state, while maintaining disorder in the isolated state. 210 The loosely packed, disorder-enriched viral proteome resists the negative effects of mutations, which are quite common events in viruses. 148 To evaluate the correlation between structure, function, and extent of disorder in viral proteomes, an analysis of the Pfam database was carried out. 162 The disordered regions of viral proteins are mainly associated with protein-protein interaction, recognition, signal transduction, and regulation. 148 Viruses hijack the host cell machinery and use it for their own functions through their ability to mimic the short linear motifs (SLiMs) of host proteins. 152 SLiMs are embedded in disordered regions and play a great number of diverse roles, such as directing proteins to the correct subcellular localization, targeting host proteins for proteasomal degradation, cell signaling, deregulating cell cycle checkpoints, and altering the transcription of host proteins. 211 The proportion of SLiMs varies with functional requirements; hence the number of disordered regions can vary from one viral family to another. Recent studies also determined that there is no specific correlation between genome size and disorder content in viruses. 212 Bioinformatics thus plays an important role in revealing the intrinsic disorder of these small biological machines, which can replicate only within a host, and in establishing the structural, functional, and regulatory networks discussed and referenced above.
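Because SLiMs are short sequence patterns, the mimicry described above can be explored computationally with simple pattern matching. The sketch below scans a protein sequence for a motif written as a regular expression. As an illustration it uses the LxCxE pattern, a well-known pRb-binding motif that is not discussed in this chapter; the sequence shown is a made-up toy string, not a real viral protein, and the function name `find_motif` is hypothetical.

```python
import re

# Illustrative assumption: LxCxE, written as a regular expression in which
# "." stands for any single residue.
LXCXE = "L.C.E"


def find_motif(sequence, pattern):
    """Return (1-based start position, matched segment) for every occurrence
    of `pattern` in `sequence`, including overlapping occurrences."""
    # A lookahead with a capture group lets finditer report overlapping hits.
    scanner = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MSTDLYCYEQLNDSSEEE"  # hypothetical sequence for illustration
    for position, segment in find_motif(toy_sequence, LXCXE):
        print(f"motif {segment} starts at residue {position}")
```

A pattern match alone says nothing about function: in practice, candidate motifs would also be checked for location within a predicted disordered region, since the text notes that SLiMs are embedded in such regions.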
Many studies evaluating the fraction of IDPs in evolutionarily distant species were conducted in the last decade. [24] [25] [26] [27] [28] [29] 213 Based on their major outcomes, it was generally concluded that eukaryotic proteomes have a higher proportion of IDPs and IDRs than prokaryotic proteomes. These observations were explained by the specific functional repertoire of IDPs/IDRs, which are mainly involved in recognition, regulation, and signaling. The regulatory networks of eukaryotic organisms, especially multicellular ones, depend explicitly on the ability of IDPs/IDRs to perform multiple vital functions. 2, 38, 39 Although function is considered an important driving force for evolutionary change, changes in the proteome itself cannot be ignored. It is tempting to assume a relationship between the morphological complexity of an organism and the size of its proteome. Although this trend holds for the difference between eukaryotes and prokaryotes, it cannot be applied among eukaryotic species, where wide variations in nuclear genome size have been reported and termed the C-value paradox. The C-value, which is simply the amount of DNA in the haploid genome of an organism, was described as a significant quantity that could be used to estimate and examine the nature of the gene. [214] [215] [216] The genome of the plant Paris japonica is nearly 50 times larger than the human genome, and the genomes of some unicellular protists are much larger still. For instance, the genome of Polychaos dubium is 210 times the size of the human genome and is the largest known genome. 217 Cells of some salamanders contain 40 times more DNA than human cells.
218 The puzzle of the relationship between eukaryotic genome size and gene number was solved by the discovery of non-coding DNA, which revealed that most eukaryotic DNA is non-coding and therefore does not contribute to genes. This discovery also suggested that the description of organisms should not be based solely on the total number of protein-encoding genes, but that the number of encoded proteins should be taken into account. However, more recent findings showed a poor correlation between the complexity of an organism and its proteome size; for instance, the whole proteome of the nematode Caenorhabditis elegans contains ~20,000 proteins, 219 which is similar to the number of proteins encoded by the human genome. A study conducted in 2012 that analyzed predicted intrinsic disorder in the proteomes of 3484 organisms, including viruses, revealed a number of significant details about the proteomes of various organisms. 31 Table 1 lists the prevalence of intrinsic disorder in the proteins of different viruses deposited in the DisProt database. 220 Analysis of the 3484 proteomes showed a continuous spectrum of proteome sizes among eukaryotes, bacteria, archaea, and viruses, as depicted in Fig. 1A of ref. 31. Eukaryotes show wide variation in proteome size, with the number of proteins ranging from 4000 for unicellular species to ~20,000 for multicellular species. Bacterial proteomes contain between 500 and 8000 proteins, with only a small portion of bacterial species having fewer than 1500 proteins. Archaeal proteomes are confined to the much narrower range of 1500-3000 proteins. Viral proteomes are very compact, being limited to fewer than 1000 proteins. Log-based plot analysis (Fig.
1B of ref. 31) shows that more than 200 viruses encode only a single polyprotein, and that viruses whose genomes encode between 15 and 30 proteins are few in comparison with viruses of other proteome sizes. 31 So far, nine large mimiviruses are known, each containing more than 500 proteins. The proteomes of these mimiviruses are so large that they are nearly equal in size to the proteomes of some small bacteria. The continuous spectrum of proteome sizes therefore runs from viruses to archaea, to unicellular eukaryotes, and lastly to multicellular eukaryotes, while bacterial proteomes overlap with those of viruses, archaea, and unicellular eukaryotes. 31

The disorder content in the majority of bacterial species is estimated to be between 18% and 28%, which is quite low. Although a small number of bacteria show disorder content as high as 35%, this value represents the lower boundary of the fraction of disordered residues predicted for both unicellular and multicellular eukaryotic organisms (Figs. 1 and 2 of ref. 31). Based on estimated disorder content, Archaea can be split into three classes. Class one consists of organisms whose proteomic disorder content ranges from 12% to 21%; 61 such organisms have been analyzed. Class two consists of 4 organisms whose disorder content varies from 21% to 32%. Class three consists of 8 organisms whose disorder content is reported to range from 32% to 38%. The comparatively higher percentage of disorder in the class three species is attributed to the peculiarities of their habitats. As confirmed by these studies, the archaeal species with high disorder content are halophiles and methanophiles.
29 Generally, global disorder predictors are developed using training sets of non-halophilic proteins under normal physiological conditions of 100-150 mM NaCl. The accuracy with which such predictors identify IDRs in proteins of extremophilic microorganisms living under hypersaline conditions might therefore vary. Halophilic microorganisms are salt-loving extremophiles whose optimum growth occurs in salt-rich environments. One strategy these microorganisms use to maintain an appropriate osmotic environment in their cytoplasm is "salting-in", through which they accumulate molar concentrations of potassium and chloride. 221 This strategy requires extensive adaptation of the intracellular proteins, which must maintain proper conformation and activity at near-saturating salt concentrations. The proteomes of these "salting-in" organisms are highly acidic, and the corresponding proteins are remarkably unstable structurally under low-salt conditions, while adopting soluble and active conformations under hypersaline (salt-rich) conditions that are usually detrimental to the proteins of non-halophilic organisms. Furthermore, the salt-rich environment determines their structure-function capability. Under solvent conditions resembling their physiological environment, proteins of these organisms bind large amounts of salt and water, a property that depends on the acidic amino acid residues present on the protein surface. [222] [223] [224] [225] [226] [227] [228] [229] [230] For these reasons, it has been suggested that the prediction of high disorder in these organisms may simply represent prediction error. 31 The analysis of disorder levels among non-viral proteomes revealed that unicellular and multicellular eukaryotes generally have the highest amounts of IDPs/IDRs in their proteomes.
The fraction of disordered residues in eukaryotic proteomes generally ranges between 35% and 45%. However, a group of unicellular eukaryotes has levels of disordered residues in the range of 45-50%. The organisms in this group are Cryptococcus neoformans (CRYNE, 47.1%), Neurospora crassa (NEUCR, 48.2%), Plasmodium falciparum (PLAF7, 49.5%), Plasmodium yoelii (PLAYO, 46.0%), and Ustilago maydis (USTMA, 49.9%). The observed high variability and high levels of predicted disorder are in line with an earlier study that revealed enrichment of predicted disorder in proteins of early-branching eukaryotes in comparison with typical eukaryotic proteins from the Swiss-Prot database and with ordered proteins from the PDB. 231 Some protozoa contain as much as twice the fraction of IDRs with ≥30 disordered residues found in a representative set of proteins from Swiss-Prot, and sevenfold more than similar regions from the PDB Select 25 set of proteins. 231 It is noteworthy that more disordered proteins were found in parasitic protozoa than in non-parasitic protists. 231 For instance, 35% of the proteins encoded by genes on chromosomes 2 and 3 of P. falciparum were predicted to contain long IDRs (i.e., longer than 40 residues). 24 A more recent study showed that the amount of disorder in P. falciparum had been underestimated and proposed that 52-67% of the proteins of this organism contain long disordered regions. 232 The latest study examined the prevalence of disorder in the proteomes of many apicomplexan parasites and showed that the primate malaria parasite (P. knowlesi) and the human malaria parasites (P. falciparum and P. vivax) contain more disordered regions than the rodent malaria parasites. 25 Additionally, more disorder was reported in the proteins expressed at the sporozoite stage of P.
falciparum than in those expressed at other stages of its life cycle. 25 It has been proposed that the high abundance of disorder in the proteome of this unicellular organism is related to its adaptation to a changing environment during its whole life cycle, as it is able to infect many different hosts. 231 In simple words, the abundance of intrinsic disorder in apicomplexan parasites appears to have evolved as a way to adapt to a parasitic lifestyle. 231

An overall survey of the proteomes of different life forms and their disorder content revealed that, as proteome size increases, the lower bound of the fraction of disordered residues increases continuously, whereas the upper bound decreases in viruses and increases among bacteria, archaea, and eukaryotes. Therefore, species whose proteome size falls between 1000 and 2000 proteins have the least variance in the fraction of disordered residues. Nevertheless, if the variance in the fraction of disordered residues is measured separately for the different domains of life, the largest variance, 70%, is found for viruses, whereas the smallest, 12%, is found for multicellular eukaryotes. 31 The fraction of disordered residues varies among viral proteomes, as shown in Fig. 1 of ref. 31. For example, the avian carcinoma virus proteome has the highest fraction of disordered residues (77.3%), while human coronavirus NL63 has a very low fraction (7.3%). A few species of viruses are highly rich in disordered residues: there are 20 small viruses that encode 5 proteins in their proteomes and have a disorder content of 50% or greater. In viruses, it appears that with increasing proteome size the disorder content converges to the range of 20-40%.
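The quantities compared throughout this discussion, namely the fraction of disordered residues and the presence of long disordered regions, are simple summaries of per-residue predictor output. The sketch below shows one way to compute them. It assumes that a predictor returns a score between 0 and 1 for each residue and that residues scoring at or above 0.5 are called disordered; this cutoff is a common convention rather than something stated in the text, and the function names are hypothetical. The default minimum length of 30 residues follows the long-IDR definition quoted above.

```python
def disorder_fraction(scores, threshold=0.5):
    """Fraction of residues whose disorder score is at or above `threshold`."""
    if not scores:
        raise ValueError("scores must contain at least one residue")
    return sum(1 for s in scores if s >= threshold) / len(scores)


def long_disordered_regions(scores, threshold=0.5, min_length=30):
    """Return (start, end) pairs, 1-based and inclusive, for every run of
    consecutive disordered residues that is at least `min_length` long."""
    regions = []
    run_start = None  # 0-based index where the current disordered run began
    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                regions.append((run_start + 1, i))
            run_start = None
    # Close a run that extends to the end of the protein.
    if run_start is not None and len(scores) - run_start >= min_length:
        regions.append((run_start + 1, len(scores)))
    return regions


if __name__ == "__main__":
    # Hypothetical scores: a 35-residue disordered tail on a 100-residue protein.
    toy_scores = [0.1] * 65 + [0.9] * 35
    print(f"disordered fraction: {disorder_fraction(toy_scores):.2f}")
    print("long IDRs:", long_disordered_regions(toy_scores))
```

A proteome-level disorder percentage of the kind reported above would be obtained by pooling residues across all proteins of a proteome before taking the fraction; the caveat raised earlier about halophile proteins applies equally here, since the result is only as reliable as the underlying predictor.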
The prediction of a high content of intrinsically disordered residues in viruses is in good agreement with a study showing that the sequences of many proteins of bacteriophages, viruses, bacteria, and archaea are significantly depleted in hydrophobic residues and enriched in polar (hydrophilic) residues. 210 A portion of the IDRs in viruses is likely to have evolved to support their ability to deal with hostile habitats, in addition to being profoundly involved in the functioning of their proteins. Other IDRs have evolved to accommodate alternative splicing, antisense transcription, and gene overlap in a way that makes more efficient use of the genetic material. 162 Polar residues are capable of specific recognition and can establish strong hydrogen bonds with partner molecules, in contrast to non-specific hydrophobic interactions. An increased number of polar residues in viral proteins could be linked to an increased demand for disorder in the unbound state and for specific recognition and stabilization in bound states. 31,210

A model has been proposed that categorizes coronaviruses on the basis of the distribution patterns of intrinsic disorder within their nucleocapsid (N) and membrane (M) proteins. This categorization allows quick determination of the transmission behavior (route, mode, and mechanism) of various coronaviruses regardless of their genetic proximity. For instance, shell rigidity has been reported in viruses transmitted by the fecal-oral route, because a rigid shell protects the virions from damage, and this rigidity is linked to the level of intrinsic disorder in the N and M proteins. 233 The envelope protein gp120 of HIV-1 contains both ordered and disordered regions. The V3 loop is a disordered region that is important for controlling chemokine co-receptor-mediated entry into immune cells.
Viruses that use the chemokine co-receptor CCR5, CXCR4, or both are known as R5-tropic, X4-tropic, and dual-tropic (R5X4), respectively. HIV-1 variants use different chemokine receptors while infecting the host. The switch from R5 to X4 is related to disease progression and pathogenesis; however, the reason for the switch is largely unknown. Xiaowei Jiang et al. hypothesized that this change is associated with sequence variation and intrinsic disorder. A detailed analysis by the same group, using a nonparametric statistical approach, determined that disorder propensity in the V3 domain increases during the switch from the dual/R5-tropic to the X4-tropic virus. This increased structural disorder of the V3 domain is associated with HIV-1 cell tropism. 234 This study forms a basis for identifying other hidden patterns in IDPs and their association with distinguishing characteristics of viruses.

Hijacking of the host cellular machinery and modulation of its regulatory networks and components often results in the formation of insoluble inclusions or aggregates that usually contain viral structural components. Viruses use these aggregates to build large complexes containing both viral and host proteins that promote viral replication, transcription, translation, and intra- and intercellular transport. The aggregated structure housing the assembled viral-host complex mainly protects it from cellular degradation. Although the complete role and mechanism of function of these aggregates are not fully understood for specific viruses, 235 in most cases the pattern of aggregates and their associated characteristics help in unraveling the behavior of viruses and in their quantification and identification. 236 A deeper understanding of the association between aggregation behavior and intrinsic disorder might provide additional information pertaining to viral infection. Fig.
4 shows the analysis of intrinsic disorder predisposition and intrinsic propensity for aggregation (and intrinsic solubility) in the genome polyproteins of Japanese encephalitis virus (JEV), Enterovirus-71 (EV-71), and ZIKV. Viral proteins are atypical because of their poor homology to the proteins of modern cells, which has led to the proposal that viruses are very primitive. 179 To evade the defense mechanisms of the host, viruses must be able to survive both outside and inside the host and to adapt quickly to fast-changing surroundings. To keep pace with this fast-changing environment, viruses mutate at very high rates: 10⁻⁵ to 10⁻³ nucleotide exchanges per generation for RNA viruses and 10⁻⁸ to 10⁻⁵ for DNA viruses. 147 This much higher mutation rate in viruses is attributed to the lack of RNA repair mechanisms. By comparison, the average mutation rate in bacteria and eukaryotes is low, at 10⁻⁹ nucleotide exchanges per generation. 147 Because the viral genome is quite compact and many reading frames overlap, a single mutation might affect more than one viral protein. 240 During the various stages of their life cycle, from early entry to the formation and exit of new infectious particles, viral proteins usually interact with multiple components of the host cell. To perform the crucial functions associated with their life cycle, viruses interact with host nucleic acids and proteins, even though large gaps exist between viral and host proteins. 178, 240 These features invite a closer look at the unique characteristics of viral proteins from a biophysical perspective. The extent of intrinsic disorder in the viral proteome provides a corresponding plasticity that confers numerous functional advantages.
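To make the scale of the mutation rates quoted above concrete, the following minimal sketch multiplies a per-site rate by a genome length to estimate the expected number of nucleotide exchanges per genome per generation. It assumes that the quoted rates apply per nucleotide site; the 10 kb genome length and the function name are illustrative assumptions, not values taken from the cited study.

```python
def expected_exchanges(rate_per_site: float, genome_length_nt: int) -> float:
    """Expected nucleotide exchanges per genome per generation.

    Assumes the rate is expressed per nucleotide site per generation.
    """
    return rate_per_site * genome_length_nt


# Rate ranges as quoted in the text (lower bound, upper bound).
RNA_VIRUS_RATES = (1e-5, 1e-3)
DNA_VIRUS_RATES = (1e-8, 1e-5)
CELLULAR_RATE = 1e-9  # average quoted for bacteria and eukaryotes

# Illustrative assumption: a compact 10 kb genome.
GENOME_NT = 10_000

if __name__ == "__main__":
    for label, (low, high) in (("RNA virus", RNA_VIRUS_RATES),
                               ("DNA virus", DNA_VIRUS_RATES)):
        lo_n = expected_exchanges(low, GENOME_NT)
        hi_n = expected_exchanges(high, GENOME_NT)
        print(f"{label}: {lo_n:g} to {hi_n:g} exchanges per genome per generation")
    cell_n = expected_exchanges(CELLULAR_RATE, GENOME_NT)
    print(f"Cellular rate over the same length: {cell_n:g}")
```

Under these assumptions, the RNA-virus range corresponds to roughly 0.1 to 10 exchanges per 10 kb genome per generation, whereas the cellular rate applied to the same length gives about 10⁻⁵.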
The flexibility of an IDP/IDR and its lack of a compact, rigid structure enable it to engage in multiple interactions. The binding promiscuity of IDRs is facilitated by various mechanisms, whose operability depends on the extent of the flexible-linking property. Flexible linking gives viral proteins an additional advantage in eluding the host immune system by making it difficult for the immune system to properly recognize epitopes. High disorder in the viral proteome can also be a way to cope with a high frequency of mutations. The deleterious effects of mutations are buffered by the high adaptability of IDPs and the weak interactions between their amino acids (flexibility), because an unstructured IDP has less to lose upon substitution than a highly ordered structure. Although viral proteins clearly benefit from the flexibility conferred by disordered residues, not all viral proteins contain IDRs or are IDPs. There is a relation between disorder content and the location of a protein within the virion, as confirmed by a comparative analysis of disorder predictors applied to viral proteins. 241 That study began with the construction of a database of viral proteins from HIV and influenza-related viruses, followed by protein sequence comparison, structure prediction, and analysis of function and location within the virion. The outcomes (particularly for influenza virus) demonstrated a correlation between proximity to the RNA core of the virion and the level of protein disorder: the closer a protein is located to the core, the higher its percentage of disorder. This relation between disorder and proximity to the core could be explained by more extensive interactions with viral RNA.
It has been found that nucleic acid-binding proteins are commonly disordered or at least have disordered regions at the site of nucleic acid binding. 242 In the case of HIV, the correlation between proximity to the core and high disorder content was not observed, possibly because the enzymes around the core region are predominantly structured proteins. [243] [244] [245] The matrix proteins of HIV and influenza A virus have rather different disorder contents. The HIV matrix protein is predicted to be highly disordered, while the influenza A virus matrix protein is less disordered or somewhat ordered. 241 Concerning surface proteins, gp120 of HIV was found to have low disorder content across all analyzed strains, while gp41 was found to be highly disordered. 241 The surface proteins NA and HA of influenza A virus are predicted to be mostly disordered. 241 However, subsequent studies revealed that the predicted disorder content varies among subtypes and suggested that this variability could be linked to the level of virulence. 241,246 IDPs can interact with several distinct partners because of their conformational flexibility and adaptability. When a single IDR binds to many partners, it adopts many different structural forms. 110 IDPs exhibit different interaction modes: they can form very stable complexes or switch between bound and unbound states with their partners, acting as on-off switches in signaling pathways. 11 Depending on the surrounding environment, IDPs adopt different conformations and function accordingly. Binding promiscuity is an important and necessary feature of viral proteomes: despite encoding many proteins, viruses require the host cell machinery to complete their life cycle, and binding promiscuity helps them do so.
Binding promiscuity and interaction types have been explained in earlier paragraphs of this chapter. The compact genomes of viruses restrict them to encoding few proteins, but the presence of IDRs or global disorder allows these proteins to take part in different tasks by interacting with various partners. A few examples illustrate how the binding promiscuity of viral proteins is related to their intrinsic disorder. Replication of the RNA genome of hepatitis delta virus (HDV) requires the translation of a single basic protein known as the delta antigen (δAg). δAg is a small protein of 195 amino acid residues that has no known enzymatic activity but is essential for replication of the viral genome. 247 Experimental CD measurements and computational analysis with disordered-protein meta-predictors have shown this protein to be an IDP. 248 Completion of the HDV replication cycle depends on this protein and on various components of the host cell. The binding promiscuity of δAg is therefore important: it interacts with multiple host cell components for various reasons and in different ways, although the exact purpose of these interactions remains unclear and is widely studied. 249, 250 In vitro analysis showed that, in addition to HDV RNAs, δAg binds other RNAs and even dsDNA, indicating a lack of specificity. 248 The HCV NS5A protein, which is involved in viral replication and viral particle assembly, is another example. 110, 251 NS5A is a membrane-associated protein with both ordered and disordered regions: an anchor attaches its N-terminal region to the membrane, while its cytoplasmic region is mostly disordered and contains three domains. Among these, domain I (D1) is highly conserved and ordered, 252 while domains II (D2) and III (D3) are highly disordered and less conserved.
253, 254 The promiscuity of NS5A is well studied, and some of the interactions that involve its disordered domains have been identified. 255 D2-associated binding motifs that appear to affect host regulatory pathways, such as apoptosis and signaling, show distinct interaction patterns that have been described in detail. 256 A third example of binding promiscuity is the measles virus (MeV) nucleoprotein (N), which forms the nucleocapsid of the virus. Intrinsically disordered regions are located at the C-terminus of the N protein; 159 they interact with the phosphoprotein of the viral polymerase complex and perform functions required for replication and transcription. 257 Besides interacting with the phosphoprotein, the N protein interacts through its C-terminal tail with several host components, including a cellular receptor and the cellular cytoskeleton. 258 The phosphoprotein of MeV is an important cofactor of the polymerase complex and is required for recruitment of the transcriptional machinery through its long disordered regions. 258 When the IDRs of the phosphoprotein and the N protein bind to each other, most of the flexibility disappears, although some remains, reflecting residual disorder within the complex. 259 This finding suggests that the IDRs of the N protein act as a platform for interactions with various protein partners for the completion of cellular processes. 259 Structural disorder has subsequently been shown to be a common feature of the nucleoproteins of paramyxoviruses. 259 Disordered (intrinsically unstructured) components were found together with structured components in proteins such as the nucleoprotein and phosphoprotein of Hendra and Nipah viruses.
259 In the course of evolution, to maximize the use of their limited genomes for regulatory and structural proteins, viruses have adopted sophisticated genetic organization and mechanisms, such as alternative splicing of polycistronic RNA, which are necessary for the controlled expression of regulatory viral proteins. Viruses have also evolved their genetic constitution, genomic structure, and mechanisms of transcription and replication to use positive-sense, negative-sense, and even ambisense transcription efficiently. One example is human T-cell lymphotropic virus type 1 (HTLV-1), a delta-retrovirus associated with HTLV-1-associated myelopathy, adult T-cell leukemia (ATL), and Strongyloides stercoralis hyperinfection. The economical use of genetic material by HTLV-1 is attributed to the wide occurrence of intrinsically disordered proteins in its proteome. This parallels the occurrence of intrinsic disorder in HIV-1 proteins, where intrinsic disorder was observed at the post-translational cleavage sites that yield Gag, Pro, and Pol from the Gag-Pro and Gag-Pro-Pol polyproteins and at the polyprotein cleavage sites that yield the MA, NC, CA, RT, TM, IN, and SU proteins. 148 In a few viruses, a protein named viral genome-linked protein (VPg) is bound to the 5′ end of the RNA genome through a phosphodiester bond formed between the hydroxyl group of a Thr/Ser/Tyr residue and the 5′ phosphate group of the RNA. [260] [261] [262] VPgs are highly diverse in size and sequence. For example, VPg is 2-4 kDa in members of Comoviridae and Picornaviridae, 10-26 kDa in members of Caliciviridae, Sobemoviruses, and Potyviridae, and up to 90 kDa in members of Birnaviridae. 263 VPg plays a key role in major steps of the viral life cycle, such as cell-to-cell movement, replication, and translation.
Since VPg performs these crucial functions either in its mature or precursor form, processing of the VPg precursor represents one of the regulatory mechanisms of its multifunctionality. 262 The multitude of interactions with different viral and host proteins defines the multifunctional role of VPg. VPg interacts with itself, cylindrical inclusion helicase, cylindrical inclusion protein, nuclear inclusion protein b, helper component protease, coat protein, the eukaryotic translation initiation factors eIF4A, eIF4E, eIF3, and eIF4G, and the poly(A)-binding protein. 262, [264] [265] [266] [267] [268] [269] [270] [271] [272] The polyfunctionality and binding promiscuity of VPgs are due, at least to some extent, to their intrinsically disordered nature. The intrinsically disordered nature of VPg has been reported for many viruses through characterization of the individual proteins, including rice yellow mottle virus (RYMV), Sesbania mosaic virus (SeMV), potato virus Y (PVY), potato virus A (PVA), and lettuce mosaic virus (LMV). 262, [273] [274] [275] [276] Computational analysis showed that functionally important disorder is present in VPgs representative of viral diversity, including four members of the Caliciviridae family, six potyviruses, and six sobemoviruses. 276 The disordered VPg components are associated with the regulation of enzymatic activity in different viruses, 273, 277 in addition to the specific regulation and transport of viral RNA from one cell to another. 278 To determine the intrinsic disorder content of viral proteins, bioinformatic studies were carried out on the matrix proteins of several viruses. 241, 279 These studies revealed that the matrix proteins p17 of SIVmac and HIV-1 possess high disorder content, while low disorder was observed in the matrix protein of equine infectious anemia virus (EIAV).
279 The matrix protein p17 of HIV-1, also known as the MA protein, is a 132-amino-acid polypeptide that lines the inner surface of the virion membrane and holds the RNA-containing viral core in place. The N-terminal part of the p17 matrix protein is myristoylated. 280, 281 p17 associates with the inner leaflet of the viral membrane, forms a protective shell, and participates in virion assembly. 282 Co-translational myristoylation of the p17 N-terminus provides a targeting signal for transport of the Gag polyprotein to the plasma membrane. 280, 281 A specific feature of p17, the presence of a set of basic residues within its first 50 amino acid residues, enables its involvement in membrane targeting. 283 In addition to performing a number of functions in the viral replication cycle, p17 could be involved in nuclear import, possibly through its specific nuclear localization sequence. 284 The HIV-1 nucleocapsid protein is a 55-residue protein that contains two zinc finger domains flanked by a linker composed of basic amino acids, which is required for nucleic acid interaction. 285, 286 The nucleocapsid covers the genomic RNA inside the virion core. An important function of the nucleocapsid is in the assembly of viral genomic RNA: it binds to the signal sequence of full-length RNAs and transports them into the assembling virion. 283 Within the virion, the nucleocapsid binds ssRNA non-specifically through its highly charged basic regions, protecting it from nucleases and compacting it. The nucleocapsid also acts as a chaperone for viral RNA and facilitates several nucleic acid-associated steps of the viral life cycle, such as melting of secondary structure within RNA, annealing of the tRNA primer, stimulation of integration, 287 and promotion of DNA exchange reactions during reverse transcription.
[157] [158] [159] Computational prediction reveals that p7 is a highly disordered protein except for a few regions that correspond to the zinc finger domains and possess ordered structure, identified as α-MoRFs. 23 The flexible nature of p7 (NC) explains its multiple functional roles, such as participation in RNA chaperoning and viral replication. 288

Rhabdoviridae members: Intrinsic disorder and disorder-to-order transitions

Paramyxoviridae and Rhabdoviridae are families of the order Mononegavirales, which consists of viruses with a non-segmented ssRNA genome of negative polarity. 289 In Mononegavirales, the genome is tightly encapsidated by the nucleoprotein within a helical nucleocapsid. The viral nucleocapsid serves as a substrate for both replication and transcription, which are performed by the viral RNA-dependent RNA polymerase (RDRP), a complex formed between the viral large protein (L) and the phosphoprotein (P). The P protein acts as an essential polymerase cofactor and recruits the L protein onto the nucleocapsid template. Beyond its role as a polymerase cofactor, P also acts as a chaperone for the N protein: it prevents illegitimate self-assembly of N when genomic RNA synthesis does not occur and maintains it in a soluble form (N°) within a complex (N°-P) that is used for encapsidation of the nascent RNA chain during replication. 290 The functional importance of the N and P proteins stems from their involvement in numerous protein-protein interactions within the internal (viral) and external (host) PPI networks. This interactability underlies multiple biological functions, including modulation of both acquired and innate immunity. Experiments have demonstrated the abundance of disorder in the N and P proteins of these viruses.
The persistence of disorder in the C-terminal domain of the nucleoprotein (NTAIL), even after complex formation, indicates a potential role of this region in binding, 259, 291, 292 as described for MeV NTAIL, whose first 20 amino acids interact with the cellular nucleoprotein receptor 293, 294 and whose C-terminal region interacts with the major inducible heat shock protein Hsp70, an interaction that leads to both viral replication and transcription. 295 The disordered nature of NTAIL in measles and Hendra viruses has also been confirmed in the context of the full-length N protein, which forms nucleocapsid-like particles (NLPs) when expressed in heterologous systems. [296] [297] [298] Initially, it was thought that the C-terminal X domain (XD) of the phosphoprotein triggers a major conformational rearrangement within the nucleocapsid that gives the viral polymerase access to the RNA genome. 259, 292, 299 However, recent NMR studies ruled out this possibility and provided the first direct observation of the interaction between XD and the intact nucleocapsid in the Paramyxoviridae. The disordered NTAIL region is partially exposed at the surface of the nucleocapsid, which allows interaction with numerous protein partners. Indeed, MeV NTAIL interacts with various viral partners, such as P, the P-L complex, and the matrix protein. 300 It also interacts with host cellular components, such as interferon regulatory factor 3 (IRF3), 301 Hsp70, 295 peroxiredoxin 1, 302 casein kinase II, 303 the cell protein responsible for the nuclear export of N, 304 and possibly components of the cell cytoskeleton. 305, 306 Additionally, the NTAIL of MeV nucleocapsids released from infected cells binds to cell receptors involved in MeV-induced immunosuppression. 293 315, 316 SeV interacts with an unassembled form of N (N°) and L protein.
317, 318 While the C-terminal nucleocapsid-binding region of P adopts a compact, stably folded conformation in members of Rhabdoviridae and the majority of Paramyxovirinae, it remains disordered in respiratory syncytial virus, a member of the Pneumovirinae subfamily. 311, 319 The N-terminal region of the P protein from Rhabdoviridae and Paramyxoviridae that is involved in binding to N° has been reported to contain an α-MoRF. 158, 312, 313, 320 Binding-induced folding of this α-MoRF has so far been demonstrated only for vesicular stomatitis virus (VSV), a rhabdovirus. The structure of the VSV N°-P complex was solved and showed that, although the binding region adopts an α-helical conformation, the flanking regions remain flexible. The P protein α-MoRF binds at the same site that is responsible for binding RNA and other N protein molecules, thereby preventing polymerization of the N protein. These results link the different processes and possibly explain the mechanism of initiation of viral RNA synthesis. 321 In MeV, a limited proteolysis study carried out in the presence of the secondary structure stabilizer TFE provided evidence for a disorder-to-order transition of the disordered N-terminal region of P (PNT). 157 The presence of disordered domains in both the P and N proteins allows controlled, coordinated dynamic interactions between the surface of the template nucleocapsid and the polymerase complex that could extend over successive turns of the helix. The long disordered regions in viral proteins enable them to act as linkers between binding partners and to participate in large macromolecular assemblies as scaffolding engines. 322,323 IDRs provide flexibility and hence facilitate the quick conformational changes required for capsid assembly. For instance, the VP4 protein of foot-and-mouth disease virus (FMDV) has low structure content yet plays a crucial role in capsid assembly.
148 As most viral proteins are synthesized in the form of a polyprotein, the presence of IDRs at the proteolytic sites makes cleavage easier and faster and generates independent functional chains. 324, 325 Owing to this property, the presence of IDRs in viral proteins provides a self-driven mechanism of self-assembly.

10.6.1 Intrinsic disorder in Flaviviridae core proteins

Flaviviridae family members are non-segmented, single-stranded, positive-sense RNA viruses whose genome size varies between 9.6 and 12.3 kb. The family comprises the genera Flavivirus, Hepacivirus, and Pestivirus. 326 The N-terminal region of the viral core protein is highly basic and interacts with RNA in a sequence-specific manner to accomplish various functions. The core protein is released from the rest of the polyprotein to initiate the functions required for further maturation and multiplication of the virus. The RNA chaperoning activity of the core protein has been confirmed in in vitro assays; the protein is also responsible for packaging and condensation of viral genomic RNA during viral morphogenesis. The core protein mediates several interactions with host proteins that contribute to viral persistence and pathogenicity and is simultaneously involved in functions related to viral replication. 327 Biophysical and biochemical studies of the Flaviviridae family conducted so far have confirmed the widespread occurrence of IDRs in the core proteins of its members, despite their low sequence similarity and other pronounced differences in modular organization. 148

10.6.2 Disordered capsid protein of ZIKV and DENV

The capsid proteins of DENV and ZIKV are highly disordered relative to the other proteins encoded by their genomes. The disorder content is 33.3% and 36% in ZIKV 146 and DENV, respectively. 328 This high level of disorder suggests the involvement of these regions in virus-mediated functions at the battlefront between host and pathogen.
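Disorder content of the kind quoted above is typically derived from per-residue disorder scores. The minimal sketch below shows one common way to compute it: the percentage of residues whose score reaches a cutoff, together with the contiguous stretches that qualify as disordered regions. The 0.5 cutoff, the minimum region length, the function names, and the toy score list are all assumptions chosen for illustration; they are not the predictor settings or data of the cited studies.

```python
from typing import List, Sequence, Tuple


def percent_disorder(scores: Sequence[float], cutoff: float = 0.5) -> float:
    """Percentage of residues with a disorder score >= cutoff."""
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s >= cutoff)
    return 100.0 * disordered / len(scores)


def disordered_regions(scores: Sequence[float], cutoff: float = 0.5,
                       min_length: int = 3) -> List[Tuple[int, int]]:
    """Contiguous runs of residues scoring >= cutoff.

    Returns 1-based, inclusive (start, end) positions for runs that are
    at least min_length residues long.
    """
    regions = []
    start = None  # 0-based index where the current run began
    for i, s in enumerate(scores):
        if s >= cutoff:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start + 1, len(scores)))
    return regions


# Hypothetical per-residue scores for a 9-residue toy sequence.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.6, 0.4, 0.2]

if __name__ == "__main__":
    print(f"{percent_disorder(toy_scores):.1f}% disordered")
    print("regions:", disordered_regions(toy_scores))
```

Different predictors and cutoffs can give noticeably different percentages for the same sequence, so values from separate studies are comparable only when the method is the same.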
The major functions of the ZIKV capsid are nucleocapsid assembly and involvement in viral infection through interaction with cellular proteins and modulation of cellular metabolism, apoptosis, and the immune response. 329 The major functions of the DENV capsid protein are RNA binding and RNA chaperone activity, nucleocapsid assembly, lipid droplet accumulation, and interaction with host components. 330 Despite substantial knowledge of the functions and disorder status of the capsid proteins of DENV and ZIKV, the exact mechanism by which IDRs control the various functions of these proteins is yet to be discovered. Fig. 5 shows the MoRF positions in the (A) ZIKV and (B) DENV capsid proteins predicted by the MoRFchibi SYSTEM web server. 331 The pattern of MoRF positions and numbers in the capsid proteins of these viruses could be analyzed in detail to identify the factors associated with their specific functions.

10.6.3 The fd phage coat protein pVIII undergoes transitions from order to disorder form

The filamentous bacteriophage fd belongs to the Inovirus genus and infects enterobacteria, such as E. coli. 332, 333 The coat protein of fd phage undergoes disorder-to-order and order-to-disorder transitions that regulate the molecular mechanism of its penetration and assembly. 148 The structural transitions of the fd pVIII coat protein indicate the involvement of a molten globule (MG, partially disordered) intermediate in macromolecular assembly and disassembly. 334

10.6.4 Capsid protease: An illustrative example of an intrinsically disordered enzyme in Semliki forest virus

IDRs play a role in the activation and deactivation of the enzymatic activity of viral proteins, as in the case of Semliki forest virus (SFV). SFV belongs to the Alphavirus genus of enveloped positive-strand RNA viruses with an icosahedral nucleocapsid and spherical morphology.
335, 336 The N-terminal region of the SFV polyprotein (residues 1-267) is an intramolecular serine protease that cleaves itself off after Trp267 from the rest of the polyprotein, yielding the mature capsid protein. After this autocleavage, the free carboxyl group of Trp267 interacts with the catalytic triad (His145, Asp167, and Ser219), which inactivates the enzyme. 337 The nucleocapsid protein (N) of severe acute respiratory syndrome coronavirus (SARS-CoV) plays a crucial role in viral viability and in packaging of the genomic RNA. However, the exact mechanism by which the N protein binds genomic RNA is not completely understood. The two domains of the N protein, NTD and CTD, are flanked by long stretches of disordered regions that account for almost half of its entire length. Both domains bind RNA through their flanking disordered regions. Although bioinformatic studies report low sequence homology among the N proteins of different coronaviruses, the flexible linker region of the N protein in all coronaviruses starts with an SR-rich region and ends with a region enriched in basic residues. These features are hallmarks of protein disorder. The overall isoelectric point (pI) of these flexible linkers is high, which is consistent with their RNA-binding ability. These findings suggest that the physicochemical features are likely conserved across different groups of Coronaviridae and highlight the role of intrinsic disorder in the N protein in both multisite nucleic acid binding and RNP packaging. 338 Surface glycoproteins are required for fusion of the viral membrane with the host membrane and hence mediate entry into the target cell. [339] [340] [341] One of the best-studied membrane fusion proteins is the influenza virus HA.
HA is a homotrimeric type I transmembrane surface glycoprotein responsible for binding of the virus to the host receptor, its internalization, and the subsequent membrane fusion events within the endosome of the infected cell. The presence of HA at the viral surface in high numbers makes it the most abundant antigen, containing the primary neutralizing epitopes for antibodies. 342 A recent bioinformatics study revealed that although many viral membrane proteins are largely ordered, intrinsic disorder is still present in these proteins, indicating that IDRs might have crucial functions. For instance, the virulent influenza A virus strains 1918 H1N1 and H5N1 differ from the less virulent or nonvirulent strains H3N2 and 1930 H1N1 in the disorder content of their HA proteins. 148 During viral replication, non-structural protein 2 of influenza virus interacts with the nuclear export machinery, acting as an adaptor between the viral ribonucleoprotein complex and the nuclear export machinery. Techniques such as differential scanning calorimetry (DSC), hydrodynamic techniques, and limited proteolysis demonstrated high levels of disorder in this protein. 343

10.10 Intrinsic disorder in human adenovirus type 5 early transcription unit 1B

Early transcription unit 1B (E1B) of human adenovirus type 5 encodes a set of proteins that participate in several important viral functions, such as viral replication and adenovirus-mediated cell transformation. 344, 345 An interesting feature of these proteins is that they are expressed from overlapping reading frames of the 2.28 kb E1B mRNA through alternative splicing between a common splice donor and one of three possible splice acceptor sites. The resulting mRNAs encode proteins with a common N-terminus and different C-termini.
346, 347 This feature gives these proteins one of their names, E1BN proteins. Computational analysis together with NMR and CD showed that E1B-93R is a typical IDP and that the N-terminal region within E1B and other E1BN proteins is likely to be intrinsically disordered. 345

10.11 Intrinsic disorder in non-structural HCV proteins

HCV NS5A is a key protein involved in viral replication that also plays a role in viral particle assembly. 251 Numerous interactions of NS5A with viral and host proteins have been reported. 348 NS5A is a membrane-associated protein that possesses an anchor at its N-terminal region, with the C-terminal region divided into three domains, D1, D2, and D3. D1 is highly conserved and less disordered, while D2 and D3 are less conserved and highly disordered. [252] [253] [254] 349 Their high disorder content underlies the dynamic behavior of D2 and D3, which makes them a hub-like center for multiple interactions. NS5A-D2 is important for NS5A function and is involved in molecular interactions with RDRP (NS5B) and PKR. The interactions established by NS5A-D2 interfere with host signaling pathways and apoptosis. 256 Although NS5A-D3 is mostly disordered, it contains short ordered elements at its N-terminus. In a recent study, NS5A-D3 proteins from two HCV strains were found to exhibit a propensity to fold partially into an α-helix. 350 NMR analysis revealed two putative α-helices, for which a molecular model could be proposed. The conservation of the first α-helix in all genotypes and its amphipathic character suggest that it could correspond to a MoRE and hence promote interaction with suitable biological partner(s). One such partner is cyclophilin A (CypA). Cyclophilins are cellular factors crucial for HCV replication. Interestingly, cyclosporin completely abrogates the interaction between HCV NS5A-D3 and CypA. CypA, together with NS5A and NS5B, forms a crucial component of the multi-protein complex that supports RNA transcription and replication.
350

10.12 Intrinsic disorder in the HDV basic protein δAg

Among the animal viruses known so far, HDV has the smallest RNA genome, which codes for a single protein known as δ-antigen (δAg). 351 Structurally, this protein comprises a coiled-coil domain, a nuclear localization signal (NLS), and an RNA-binding domain. 352 δAg self-oligomerizes to yield a dodecameric structure associated with HDV genomic RNA. 248, 353 Computational and experimental analysis of eight clades of HDV shows that this protein is highly disordered. 248 The Tat protein of HIV-1 is an important factor in viral pathogenesis that serves as a transactivator of viral transcription. The activity of Tat depends on its interaction with the transactivation response region (TAR), a short nascent stem-bulge-loop leader RNA. TAR is present at the 5′ end of all viral transcripts. Tat displays typical characteristics of IDPs, including a combination of high net charge and low overall hydrophobicity. 148 The intrinsic disorder of Tat has also been demonstrated by CD and NMR studies. 354 The Rev protein also plays a regulatory role in HIV-1. It is a basic protein of 116 residues that belongs to the ARM family of RNA-binding proteins. Rev binds to the Rev response element (RRE) of viral mRNA in the cytoplasm of the host cell and is therefore essential for viral replication. 355 Monomeric Rev adopts an MG state, as confirmed by hydrodynamic and spectroscopic studies. 356 Recent biophysical studies of the Rev ARM associated with RNA binding suggest that it is intrinsically disordered not only in the isolated state but also when embedded in an oligomerization-deficient Rev mutant. 357

10.14 Intrinsic disorder in non-structural HPV E6 and E7 proteins

The large family of papillomaviruses (PVs) includes small DNA viruses infecting mammals, reptiles, and birds.
At least 100 different types of HPV have been reported to date; they act as cofactors in the development of carcinomas of the head, neck, genital tract, and epidermis, and also cause papillomas and benign warts. HPVs are classified into two classes on the basis of their association with cancer: low-risk viruses (HPV-6, HPV-11) and high-risk viruses (HPV-16, HPV-18, and HPV-45). Similar to all DNA tumor viruses, HPV hijacks the replication machinery and forces the infected cell to enter the S phase of the cell cycle. The transforming activity of high-risk HPVs is mainly exerted through E7, one of their two oncoproteins. E7 is responsible for the pathogenesis and maintenance of human cervical cancer and participates in numerous cellular processes, including DNA synthesis, transcription, transformation, cell growth, and apoptosis. 358 E7 interacts with the tumor suppressor protein Rb and interferes with its tumor suppression activity. Rb acts as a guardian of the cell cycle through its involvement in the control of the G1/S transition. 359 Therefore, Rb is critical in determining whether the cell progresses normally or undergoes transformation. Besides interacting with proteins of the Rb family, E7 also interacts with histone deacetylase, 360 the kinase p33CDK2 and cyclin A, 361 protein phosphatase 2A (PP2A), 362 and the cyclin-dependent kinase inhibitor p21cip1. 363 By forming a complex with E7, PP2A is sequestered and excluded from its interaction with protein kinase B (PKB, also known as Akt). 364 PKB is one of several second messenger kinases that are activated via cell attachment and growth factor signaling and that signal to the cell nucleus to prevent apoptosis, thereby promoting cell survival during proliferation.
The interaction between PP2A and E7 inhibits PKB/Akt dephosphorylation, which keeps PKB/Akt signaling activated. The broad range of molecular interactions of E7 depends on the flexible disordered region present within the protein. Previous studies on recombinant E7 revealed that its structure can be described as an elongated dimer that changes conformation upon small changes in pH and gains α-helicity upon exposure to solvents. 365 Biophysical characterization of E7 from HPV-45 by far-UV CD and NMR revealed that its N-terminal region (E7N, amino acids 1-40) is disordered, while its C-terminal domain is well structured with a unique zinc-binding fold. The intrinsically unstructured N-terminal region of E7 contains binding and casein kinase II phosphorylation sites. 292, 366, 367 The CD spectra recorded for the different conformations as a function of temperature and pH indicated a polyproline II-like structure. 366 This structure is stabilized by phosphorylation, which results in increased transformation activity in the cell. The transforming proteins E6 and E7 of high-risk HPVs contain high amounts of intrinsic disorder. 368

In λ bacteriophage, the N protein (λN) plays an important role in the transcription of phage genes. In the absence of this protein, transcription of the phage genome is reduced to 2%, with only early genes being transcribed. 369 λN positively regulates transcription in λ bacteriophage and promotes the expression of genes located downstream of termination signals. λN acts as an anti-terminator transcription factor: it binds to an RNA sequence (the box B segment) and to multiple proteins in the transcription complex, where it serves as an important regulator of the antitermination complex that allows transcription to proceed through termination sites during phage gene expression. Interactions of λN with the host bacterial RNA polymerase and with the factor NusA have also been observed.
161 λN demonstrates all the features of an unstructured, flexible protein that are typical of IDPs. These features include a high net charge and low hydrophobicity, 161 as well as structural asymmetry determined through various experiments. [370] [371] [372] [373] [374]

10.16 Intrinsic disorder in the Hordeivirus movement TGBp1 protein

Plant viral infection spreads from one infected site to another through special proteins known as movement proteins (MPs), which facilitate the movement of viruses within the plant body. These MPs possess a wide range of functions. They interact with viral proteins and RNA to form a ribonucleoprotein complex that facilitates cell-to-cell and long-distance movement of the viral genome in the plant, and they help in the interaction with cytoskeleton components and the endoplasmic reticulum. 161 Three movement proteins, TGBp1 (528 residues), TGBp2 (204 residues), and TGBp3 (155 residues), encoded by the "triple gene block" (TGB), have been reported in hordeiviruses. 375 The N-terminal region of TGBp1 of Barley stripe mosaic virus (residues 1-180) is predicted to be highly disordered, whereas the C-terminal region is not, as shown in Fig. 6.

This chapter summarizes the current knowledge on the phenomenon of protein intrinsic disorder, discusses various peculiar features of IDPs, including their involvement in PPI networks and other biological roles, and introduces different disorder predictors. It also discusses the intrinsic disorder perspective of viruses, the role of IDPs and IDRs in virus-facilitated host mechanisms, the prevalence of intrinsic disorder in viral proteomes, and the functional prominence of disordered viral proteins. The roles of IDRs in various structural and non-structural proteins of viruses, such as capsid proteins, nucleocapsid proteins, genome-linked proteins, surface glycoproteins, matrix proteins, and accessory and regulatory proteins, have been summarized.
The roles of IDPs/IDRs in specific function-oriented proteins of different viruses have been elaborated, such as the membrane-binding protein λN of bacteriophage, the hordeivirus movement protein TGBp1, the influenza virus nonstructural protein 2, the basic protein δAg of HDV, and the human adenovirus type 5 early transcription unit 1B. The importance of intrinsic disorder for alternative splicing and overlapping reading frames in viral proteomes is also discussed. Viruses mainly cause pathogenesis by hijacking the cell machinery and modulating its functions, e.g., by altering IDP components involved in the host cell cycle control mechanism. Viral IDPs mediate successful infection and regulate pathogenesis at multiple levels. Therefore, knowledge of intrinsic disorder and structural flexibility in the processes of virus-host interaction and associated functions is crucial for a better understanding of viral pathogenesis. The involvement of IDPs/IDRs in the mechanism of viral infection is not completely understood. This chapter should therefore give readers a better understanding of the importance of IDPs/IDRs in the various functional mechanisms and viral components that are essential for completing crucial phases of the viral life cycle. Finally, the IDPs/IDRs of viruses are considered potential drug targets because of their high prevalence in viral proteomes and their ubiquitous involvement in host-pathogen regulation. In conclusion, the involvement of IDPs in viral pathogenesis should be seriously considered in order to unlock the complex riddles of viral infection and its associated patterns, cellular control, and exploitation strategies, and to guide drug development approaches that target disordered regions in the near future.
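The sections above describe Tat and λN as combining a high net charge with low hydrophobicity. As an illustration only, the following minimal Python sketch shows how such a charge-hydropathy check is commonly computed from sequence. It makes several assumptions that are not stated in this chapter: the Kyte-Doolittle hydropathy scale rescaled to the range 0-1, a net charge counted from Lys/Arg and Asp/Glu only, and the widely used linear boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151. The function names and the toy sequences are hypothetical and do not correspond to any viral protein.

```python
# Kyte-Doolittle hydropathy values per residue (assumed scale, not given in the chapter).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq: str) -> float:
    """Mean hydropathy <H>, with each residue value rescaled from [-4.5, 4.5] to [0, 1]."""
    seq = seq.upper()
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq: str) -> float:
    """Absolute mean net charge <R>: |(K + R) - (D + E)| divided by sequence length."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def is_predicted_disordered(seq: str) -> bool:
    """True if the sequence lies on the disordered side of the assumed boundary
    <R> = 2.785 * <H> - 1.151 in charge-hydropathy space."""
    return mean_net_charge(seq) > 2.785 * mean_hydropathy(seq) - 1.151


if __name__ == "__main__":
    # Hypothetical toy sequences, chosen only to show the two sides of the boundary.
    for name, seq in [("charged_toy", "KRKRKRKRKR"), ("hydrophobic_toy", "LLLLIIIIVV")]:
        print(name, round(mean_hydropathy(seq), 3), round(mean_net_charge(seq), 3),
              is_predicted_disordered(seq))
```

A whole-sequence check of this kind classifies a protein as a single unit and cannot locate disordered regions within an otherwise ordered protein, which is why per-residue predictors are normally used alongside it.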
Understanding the role of intrinsic disorder of viral proteins in the oncogenicity of different types of HPV Understanding protein non-folding Intrinsically unstructured proteins and their functions Structure and Function of Intrinsically Disordered Proteins Structural studies of tau protein and Alzheimer paired helical filaments show no evidence for β-structure NACP, a protein implicated in Alzheimer's disease and learning, is natively unfolded Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm Intrinsically disordered proteins in human diseases: introducing the D 2 concept Protein dynamics: dancing on an ever-changing free energy stage Drugs for "protein clouds": targeting intrinsically disordered transcription factors Multitude of binding modes attainable by intrinsically disordered proteins: a portrait gallery of disorder-based complexes Operational definition of intrinsically unstructured protein sequences based on susceptibility to the 20S proteasome Flexible nets of malleable guardians: Intrinsically disordered chaperones in neurodegenerative diseases Malleable machines in transcription regulation: the mediator complex Malleable machines take shape in eukaryotic transcriptional regulation A protein-chameleon: conformational plasticity of α-synuclein, a disordered protein involved in neurodegenerative disorders Protein structure protection commits gene expression patterns The protein trinity-linking function and disorder The relation of polypeptide hormone structure and flexibility to receptor binding: the relevance of X-ray studies on insulins, glucagon and human placental lactogen High-resolution proton-magnetic-resonance studies of chromatin core particles Intrinsic disorder of Drosophila melanogaster hormone receptor 38 N-terminal domain Caseins as rheomorphic proteins: interpretation of primary and secondary structures of the αs1-, β-and κ-caseins Protein intrinsic disorder as a flexible armor and a weapon of HIV-1 
Intrinsic protein disorder in complete genomes Abundance of intrinsically unstructured proteins in P. falciparum and other apicomplexan parasite proteomes Large-scale analysis of thermostable, mammalian proteins provides insights into the intrinsically disordered proteome Prevalent structural disorder in E. coli and S. cerevisiae proteomes Prediction and functional analysis of native disorder in proteins from the three kingdoms of life Archaic chaos: intrinsically disordered proteins in archaea Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life Intrinsically disordered proteins: regulation and disease this review comes from a themed issue on sequences and topology edited The protein non-folding problem: amino acid determinants of intrinsic order and disorder Sequence complexity of disordered protein Intrinsic disorder and functional proteomics Composition profiler: a tool for discovery and visualization of amino acid composition differences Lethality and centrality in protein networks Flexible nets: the roles of intrinsic disorder in protein interaction networks Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? 
Intrinsic disorder is a common feature of hub proteins from four eukaryotic Interactomes Disordered domains and high surface charge confer hubs with the ability to interact with multiple proteins in interaction networks Disorder and sequence repeats in hub proteins and their implications for network evolution Role of intrinsic disorder in transient interactions of hub proteins Intrinsic disorder in yeast transcriptional regulatory network Intrinsically disordered proteins from A to Z Hub promiscuity in protein-protein interaction networks Why are natively unfolded proteins unstructured under physiologic conditions? Net charge per residue modulates conformational ensembles of intrinsically disordered proteins Fluorescence correlation spectroscopy shows that monomeric polyglutamine molecules form collapsed structures in aqueous solutions Characterizing the conformational ensemble of monomeric polyglutamine End-to-end distance distributions and intrachain diffusion constants in unfolded polypeptide chains indicate intramolecular hydrogen bond formation Quantitative characterization of intrinsic disorder in polyglutamine: insights from analysis based on polymer theories A natively unfolded yeast prion monomer adopts an ensemble of collapsed and rapidly fluctuating structures Single homopolypeptide chains collapse into mechanically rigid conformations Examining polyglutamine peptide length: a connection between collapsed conformations and increased aggregation Protein disorder in the human diseasome: unfoldomics of human genetic diseases Local structural elements in the mostly unstructured transcriptional activation domain of human p53 Intrinsic structural disorder and sequence features of the cell cycle inhibitor p57Kip2 Intrinsic structural disorder of the C-terminal activation domain from the bZIP transcription factor Fos Identification of a novel regulatory domain in Bcl-x(L) and Bcl-2 TC-1 is a novel tumorigenic and natively disordered protein associated with 
thyroid cancer Alzheimer's disease in down's syndrome: clinicopathologic studies Alzheimer's disease and Down's syndrome: sharing of a unique cerebrovascular amyloid fibril protein Neuronal origin of a cerebral amyloid: neurofibrillary tangles of Alzheimer's disease contain the same protein as the amyloid of plaque cores and blood vessels A68: a major subunit of paired helical filaments and derivatized forms of normal tau Molecular cloning of cDNA encoding an unrecognized component of amyloid in Alzheimer disease Polyglutamine diseases: protein cleavage and aggregation Part II: α-synuclein and its molecular pathophysiological role in neurodegenerative disease Shattuck lecture-neurodegenerative diseases and prions HIV vaccine mystery and viral shell disorder Calculation of ensembles of structures representing the unfolded state of an SH3 domain The effect of a ΔK280 mutation on the unfolded state of a microtubule-binding repeat in tau Constructing structure ensembles of intrinsically disordered proteins from chemical shift data Constructing ensembles for intrinsically disordered proteins Mollack: a web server for the automated creation of conformational ensembles for intrinsically disordered proteins NMR relaxation studies on the hydrate layer of intrinsically unstructured proteins Primary contact sites in intrinsically unstructured proteins: the case of calpastatin and microtubule-associated protein 2 Protein-water and protein-buffer interactions in the aqueous solution of an intrinsically unstructured plant dehydrin: NMR intensity and DSC aspects Intrinsically disordered regions may lower the hydration free energy in proteins: a case study of nudix hydrolase in the bacterium Deinococcus radiodurans Intrinsically disordered protein Intrinsic disorder in cell-signaling and cancer-associated proteins Heterogeneity of the binding sites of bovine serum albumin 1 Structural studies of p21Waf1/Cip1/Sdi1 in the free and Cdk2-bound state: conformational disorder mediates 
binding diversity Coupled folding and binding with α-helix-forming molecular recognition elements Molecular recognition features in Zika virus proteome The structure of ClpB: a molecular chaperone that rescues proteins from an aggregated state Crystal structure of the 30 S ribosomal subunit from Thermus thermophilus: structure of the proteins and their interactions with 16 S RNA An overview of the structures of protein-DNA complexes Crystal structure of a β-catenin/Tcf complex Role of intrinsic flexibility in signal transduction mediated by the cell cycle regulator, p27Kip1 Regulation of cell division by intrinsically unstructured proteins: intrinsic flexibility, modularity, and signaling conduits The crystal structure of ZapA and its modulation of FtsZ polymerisation Structure of the bcr-abl oncoprotein oligomerization domain Core structure of the envelope glycoprotein GP2 from Ebola virus at 1.9-Å resolution The trimer-of-hairpins motif in membrane fusion: visna virus Folding and assembly of oligomeric proteins in Escherichia coli Mechanism and evolution of protein dimerization Analysis of ordered and disordered protein complexes reveals structural features discriminating between stable and unstable monomers β arcades: recurring motifs in naturally occurring and disease-related amyloid fibrils MultiCoil: a program for predicting two-and threestranded coiled coils Coiled coil domains: stability, specificity, and biological implications A seven-helix coiled coil Storage function of cartilage oligomeric matrix protein: the crystal structure of the coiled-coil domain in complex with vitamin D3 Crystal structure of the heterodimeric bZIP transcription factor c-Fos-c-Jun bound to DNA Structural basis for asymmetric association of the βPIX coiled coil and shank PDZ Structure of the molecular chaperone prefoldin: unique interaction of multiple coiled coil tentacles with unfolded proteins Neuropathology, biochemistry, and biophysics of α-synuclein aggregation Flexible 
nets: disorder and induced fit in the associations of p53 and 14-3-3 with their partners Alpha-synuclein misfolding and neurodegenerative diseases Intrinsically disordered proteins and multicellular organisms Cell cycle regulation by the intrinsically disordered proteins p21 and p27 Intrinsically disordered proteins in cellular signalling and regulation Functional roles of transiently and intrinsically disordered regions within proteins Predicting intrinsic disorder in proteins: an overview The DISOPRED server for the prediction of protein disorder FoldIndex©: a simple tool to predict whether a given protein sequence is intrinsically unfolded IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content Protein disorder prediction: implications for structural proteomics RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins Protein intrinsic disorder in plants Natively unfolded proteins: a point where biology waits for physics Structural characterization of intrinsically disordered proteins by NMR spectroscopy Site-directed spin labeling EPR spectroscopy Monitoring structural transitions in IDPs by site-directed spin labeling EPR spectroscopy EPR in protein science: intrinsically disordered proteins Tyrosine-targeted spin labeling and EPR spectroscopy: an alternative strategy for studying structural transitions in proteins Enlarging the panoply of site-directed spin labeling electron paramagnetic resonance (SDSL-EPR): sensitive and selective spinlabeling of tyrosine using an Isoindoline-based Nitroxide Single-molecule fluorescence studies of intrinsically disordered proteins Intramolecular three-colour single pair FRET of intrinsically disordered proteins with increased dynamic range Modulation of allostery by protein intrinsic disorder Visualization of intrinsically disordered regions of proteins by high-speed atomic force microscopy 
Visualization of mobility by atomic force microscopy Conformational equilibria in monomeric α-synuclein at the single-molecule level High-speed AFM and applications to biomolecular systems Intrinsically unstructured proteins Smooth muscle caldesmon is an extended flexible monomeric protein in solution that can readily undergo reversible intra-and intermolecular sulfhydryl cross-linking. A mechanism for caldesmon's F-actin bundling activity Physicochemical characterization of the heatstable microtubule-associated protein MAP2 Involucrin acts as a transglutaminase substrate at multiple sites A novel strategy for the purification of recombinantly expressed unstructured protein domains Uncovering the unfoldome: enriching cell extracts for unstructured proteins by acid treatment Assessing protein disorder and induced folding Obtaining highly purified intrinsically disordered protein by boiling lysis and single step ion exchange Introducing protein intrinsic disorder Intrinsically disordered side of the Zika virus proteome Rates of spontaneous mutation Structural disorder in viral proteins Probing the role of nascent helicity in p27 function as a cell cycle regulator Use of host-like peptide motifs in viral proteins is a prevalent strategy in host-virus interactions Fuzziness endows viral motif-mimicry How viruses hijack cell regulation Drawing on disorder: how viruses use histone mimicry to their advantage Unconventional RNA-binding proteins step into the virus-host battlefront How do intrinsically disordered viral proteins hijack the cell? 
Caught in the actprotein adaptation and the expanding roles of the PACS proteins in tissue homeostasis and disease The n-terminal domain of the phosphoprotein of morbilliviruses belongs to the natively unfolded class of proteins Structural disorder and modular organization in Paramyxovirinae N and P The C-terminal domain of the measles virus nucleoprotein is intrinsically disordered and folds upon binding to the C-terminal moiety of the phosphoprotein Do viral proteins possess unique biophysical features? Flexible Viruses : Structural Disorder in Viral Proteins Viral disorder or disordered viruses: do viral proteins possess unique features? Order and disorder in viral proteins: new insights into an old paradigm Here a virus, there a virus, everywhere the same virus? Movement of viruses between biomes Opinion: viral metagenomics High abundance of viruses found in aquatic environments Structural and functional studies of archaeal viruses The virophage as a unique parasite of the giant mimivirus A giant virus in amoebae The 1.2-megabase genome sequence of Mimivirus Virology: gulliver among the lilliputians Ultrastructural characterization of the giant volcano-like virus factory of Acanthamoeba polyphaga Mimivirus Expression of animal virus genomes Viral evolution in the genomic age The classification of organisms at the edge of life or problems with virus systematics Giant DNA virus mimivirus encodes pathway for biosynthesis of unusual sugar 4-amino-4,6-dideoxy-D-glucose (viosamine) The origin of viruses and their possible roles in major evolutionary transitions The ancient virus world and evolution of cells Br€ ussow H. 
Phage as agents of lateral gene transfer Redefining viruses: lessons from Mimivirus Physical principles in the construction of regular viruses The bacteriophage T4 DNA injection machine Sialobiology of influenza molecular mechanism of host range variation of influenza viruses Structure and function of the influenza A M2 proton channel Assembly and budding of influenza virus Influenza virus morphogenesis and budding E1 protein of human papillomavirus is a DNA helicase/ ATPase Transient replication of BPV-1 requires two viral polypeptides encoded by the E1 and E2 open reading frames Binding of the human papillomavirus E1 origin-recognition protein is regulated through complex formation with the E2 enhancer-binding protein Targeting the E1 replication protein to the papillomavirus origin of replication by complex formation with the E2 transactivator Transcriptional regulation of the human papillomavirus-16 E6-E7 promoter by a keratinocyte-dependent enhancer, and by viral E2 trans-activator and repressor gene products: implications for cervical carcinogenesis The upstream regulatory region of the human papilloma virus-16 contains an E2 protein-independent enhancer which is specific for cervical carcinoma cells and regulated by glucocorticoid hormones Role of the E1 E4 protein in the differentiationdependent life cycle of human papillomavirus type 31 The human papillomavirus type 11 E1^E4 protein is a transglutaminase 3 substrate and induces abnormalities of the cornified cell envelope HPV16 E1-E4 protein is phosphorylated by Cdk2/cyclin A and relocalizes this complex to the cytoplasm The E5 gene from human papillomavirus type 16 is an oncogene which enhances growth factor-mediated signal transduction to the nucleus The E5 oncoprotein of human papillomavirus type 16 transforms fibroblasts and effects the downregulation of the epidermal growth factor receptor in keratinocytes The E5 protein of human papillomavirus type 16 perturbs MHC class II antigen maturation in human 
foreskin keratinocytes treated with interferon-γ Human papillomavirus type 31 E5 protein supports cell cycle progression and activates late viral functions upon epithelial differentiation Quantitative role of the human papillomavirus type 16 E5 gene during the productive stage of the viral life cycle Interactions between viral nonstructural proteins and host protein hVAP-33 mediate the formation of hepatitis C virus RNA replication complex on lipid raft West Nile virus nonstructural protein NS1 inhibits complement activation by binding the regulatory protein factor H The rinderpest virus non-structural C protein blocks the induction of type 1 interferon Inhibition of interferon induction and signaling by paramyxoviruses Minute virus of mice non-structural protein NS-1 is necessary and sufficient for trans-activation of the viral P39 promoter Role of viral regulatory and accessory proteins in HIV-1 replication Viruses in extreme environments The origin of viruses The diversity of physical forces and mechanisms in intermolecular interactions Linear motifs: evolutionary interaction switches Marked variability in the extent of protein disorder within and between viral families Reduction in structural disorder and functional complexity in the thermal adaptation of prokaryotes The desoxyribonucleic acid content of animal cells and its evolutionary significance The constancy of desoxyribose nucleic acid in plant nuclei The desoxyribose nucleic acid content of animal nuclei The dynamic nature of eukaryotic genomes Macroevolution, hierarchy theory, and the C-value enigma Distinguishing protein-coding and noncoding genes in the human genome DisProt: the database of disordered proteins Structural adaptation of extreme halophilic proteins through decrease of conserved hydrophobic contact surface Halophilic adaptation of protein-DNA interactions Electrostatic contributions to the stability of halophilic proteins Unique amino acid composition of proteins in halophilic bacteria 
Halophilic adaptation of enzymes Halophilic enzymes: proteins with a grain of salt Molecular signature of hypersaline adaptation: insights from genome and proteome composition of halophilic prokaryotes Structural basis for the aminoacid composition of proteins from halophilic archea The effect of salts on the activity and stability of Escherichia coli and Haloferax volcanii dihydrofolate reductases Halophilic proteins and the influence of solvent on protein stabilization Intrinsic disorder in pathogenic and non-pathogenic microbes: discovering and analyzing the unfoldomes of early-branching eukaryotes Flavors of protein disorder Understanding viral transmission behavior via protein intrinsic disorder prediction: coronaviruses Protein structural disorder of the envelope V3 loop contributes to the switch in human immunodeficiency virus type 1 cell tropism Virus-induced aggregates in infected cells Viral aggregation: impact on virus behavior in the environment IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding The CamSol method of rational design of protein mutants with enhanced solubility Rapid and accurate in silico solubility screening of a monoclonal antibody library The evolution of RNA viruses Protein intrinsic disorder toolbox for comparative analysis of viral proteins Roles of intrinsic disorder in protein-nucleic acid interactions Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions Functional anthology of intrinsic disorder. 2. Cellular components, domains, technical terms, developmental processes, and coding sequence diversities correlated with long disordered regions Functional anthology of intrinsic disorder. 3. 
Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins Protein intrinsic disorder and influenza virulence: the 1918 H1N1 and H5N1 viruses Hepatitis D: thirty years after Intrinsic disorder and oligomerization of the hepatitis delta virus antigen Interaction of host cellular proteins with components of the hepatitis delta virus The heterogeneous ribonuclear protein C interacts with the hepatitis delta virus small antigen All three domains of the hepatitis C virus nonstructural NS5A protein contribute to RNA binding Crystal structure of a novel dimeric form of NS5A domain I protein from hepatitis C virus Domain 3 of non-structural protein 5A from hepatitis C virus is natively unfolded The domain 2 of the HCV NS5A protein is intrinsically unstructured Hepatitis C virus NS5A: tales of a promiscuous protein Transient structure and SH3 interaction sites in an intrinsically disordered fragment of the hepatitis C virus protein NS5A Replication of paramyxoviruses Structural disorder within the replicative complex of measles virus: functional implications Structural disorder within paramyxovirus nucleoproteins and phosphoproteins Protein is linked to the 5 0 end of poliovirus RNA by a phosphodiester linkage to tyrosine Sobemovirus RNA linked to VPg over a threonine residue Protein-RNA linkage and posttranslational modifications of two sobemovirus VPgs Proteins attached to viral genomes are multifunctional The genome-linked protein VPg of the Norwalk virus binds eIF3, suggesting its role in translation initiation complex recruitment Calicivirus translation initiation requires an interaction between VPg and eIF4E Binding analyses for the interaction between plant virus genome-linked protein (VPg) and plant translational initiation factors The potyviral virus genome-linked protein VPg forms a ternary complex with the eukaryotic initiation factors eIF4E and eIF4G and reduces eIF4E affinity for a mRNA cap analogue The C terminus 
of lettuce mosaic potyvirus cylindrical inclusion helicase interacts with the viral VPg and with lettuce translation eukaryotic initiation factor 4E VPg of murine norovirus binds translation initiation factors in infected cells Potyvirus genome-linked protein, VPg, directly affects wheat germ in vitro translation: interactions with translation initiation factors eIF4F and eIFiso4F Protein-protein interactions in two potyviruses using the yeast two-hybrid system Direct interaction between the rice yellow mottle virus (RYMV) VPg and the central domain of the rice eIF(iso)4G1 factor correlates with rice susceptibility and RYMV virulence "Natively unfolded" VPg is essential for Sesbania mosaic virus serine protease activity Virulence factor of potato virus Y, genome-attached terminal protein VPg, is a highly disordered protein Potato virus A genome-linked protein VPg is an intrinsically disordered molten globule-like protein with a hydrophobic core Intrinsic disorder in viral proteins genome-linked: experimental and predictive analyses Stacking interactions of W271 and H275 of SeMV serine protease with W43 of natively unfolded VPg confer catalytic activity to protease Interaction of sesbania mosaic virus movement protein with VPg and P10: implication to specificity of genome recognition A comparative analysis of viral matrix proteins using disorder predictors Myristoylation-dependent replication and assembly of human immunodeficiency virus 1 Role of capsid precursor processing and myristoylation in morphogenesis and infectivity of human immunodeficiency virus type 1 Assembly and morphology of HIV: potential effect of structure on viral function HIV-1: fifteen proteins and an RNA Analysis of the viral elements required in the nuclear import of HIV-1 DNA Mutations of basic amino acids of NCp7 of human immunodeficiency virus type 1 affect RNA binding in vitro Charged amino acid residues of human immunodeficiency virus type 1 nucleocapsid p7 protein involved in RNA packaging
and infectivity Human immunodeficiency virus type 1 nucleocapsid protein specifically stimulates Mg2+-dependent DNA integration in vitro Flexible nature and specific functions of the HIV-1 nucleocapsid protein Transcription et réplication des Mononegavirales: Une machine moléculaire originale The measles virus N TAIL-XD complex: an illustrative example of fuzziness Structural Disorder in Viral Proteins-Vladimir Uversky. Sonia Longhi Measles virus nucleoprotein induces cell-proliferation arrest and apoptosis through N TAIL-NR and N CORE-FcγRIIB1 interactions, respectively Measles virus (MV) nucleoprotein binds to a novel cell surface receptor distinct from FcγRII via its C-terminal domain: role in MV-induced immunosuppression Hsp72 recognizes a P binding motif in the measles virus N protein C-terminus Characterization of the interactions between the nucleoprotein and the phosphoprotein of henipavirus Intrinsic disorder in measles virus nucleocapsids Atomic resolution description of the interaction between the nucleoprotein and phosphoprotein of Hendra virus Structural disorder within the measles virus nucleoprotein and phosphoprotein The matrix protein of measles virus regulates viral RNA synthesis and assembly by interacting with the nucleocapsid protein The interaction between the measles virus nucleoprotein and the interferon regulator factor 3 relies on a specific cellular environment Peroxiredoxin 1 is required for efficient transcription and replication of measles virus Phosphorylation of measles virus nucleoprotein upregulates the transcriptional activity of minigenomic RNA Morbillivirus nucleoprotein possesses a novel nuclear localization signal and a CRM1-independent nuclear export signal Involvement of actin microfilaments in the transcription/replication of human parainfluenza virus type 3: possible role of actin in other viruses Host cell proteins required for measles virus reproduction Interaction of the C-terminal domains of Sendai virus N and P
proteins: comparison of polymerase-nucleocapsid interactions within the paramyxovirus family Structural disorder within Henipavirus nucleoprotein and phosphoprotein: from predictions to experimental assessment Structure and dynamics of the nucleocapsid-binding domain of the Sendai virus phosphoprotein in solution A structural model for unfolded proteins from residual dipolar couplings and smallangle x-ray scattering Structural analysis of the human respiratory syncytial virus phosphoprotein: characterization of an α-helical domain involved in oligomerization Modular organization of rabies virus phosphoprotein The N0-binding region of the vesicular stomatitis virus phosphoprotein is globally disordered but contains transient α-helices Ensemble structure of the modular and flexible full-length vesicular stomatitis virus phosphoprotein Measles virus protein interactions in yeast: new findings and caveats Protein interactions entered into by the measles virus P, V, and C proteins An N-terminal domain of the Sendai paramyxovirus P protein acts as a chaperone for the NP protein during the nascent chain assembly step of genome replication An acidic activation-like domain of the Sendai virus P protein is required for RNA synthesis and encapsidation The nine C-terminal amino acids of the respiratory syncytial virus protein P are necessary and sufficient for binding to ribonucleoprotein complexes in which six ribonucleotides are contacted per N protein promoter Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins Structure of the vesicular stomatitis virus N0-P complex Intrinsic disorder in scaffold proteins: getting more from less High levels of structural disorder in scaffold proteins as exemplified by a novel neuronal protein, CASK-interactive protein1 Probing the partly folded states of proteins by limited proteolysis Probing protein structure by limited proteolysis Intrinsic disorder 
in the core proteins of Flaviviruses Unstructural biology of the dengue virus proteins Crystal structure of the capsid protein from Zika virus Properties and functions of the dengue virus capsid protein MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences Filamentous Bacterial viruses Proposed molten globule intermediates in fd phage penetration and assembly Liljestr€ om P. The molecular pathogenesis of Semliki forest virus: a model virus made useful? Role of ribosomes in Semliki forest virus nucleocapsid uncoating Novel enzymatic activity derived from the Semliki forest virus capsid protein Multiple nucleic acid binding sites and intrinsic disorder of severe acute respiratory syndrome coronavirus nucleocapsid protein: implications for ribonucleocapsid protein packaging Coiled coils in both intracellular vesicle and viral membrane fusion Receptor binding and membrane fusion in virus entry: the influenza hemagglutinin Mechanisms of viral membrane fusion and its inhibition The structure and function of the hemagglutinin membrane glycoprotein of influenza virus Structural plasticity in influenza virus protein NS2 (NEP) Adenovirus type 5 early region 1B 156R protein promotes cell transformation independently of repression of p53-stimulated transcription Intrinsic disorder in the common N-terminus of human adenovirus 5 E1B-55K and its related E1BN proteins indicated by studies on E1B-93R Early region 1B of adenovirus 2 encodes two coterminal proteins of 495 and 155 amino acid residues Identification of adenovirus type 2 early region 1B proteins that share the same amino terminus as do the 495R and 155R proteins Hepatitis C virus infection protein network Structure of the zinc-binding domain of an essential component of the hepatitis C virus replicase Domain 3 of NS5A protein from the hepatitis C virus has intrinsic α-helical propensity and is a substrate of cyclophilin A Chapter 3 replication of the hepatitis delta virus RNA genome 
Intracellular localization of hepatitis delta virus proteins in the presence and absence of viral RNA accumulation Ribonucleoprotein complexes of hepatitis delta virus Molecular recognition of the human coactivator CBP by the HIV-1 transcriptional activator tat Solid-state NMR data support a helix-loop-helix structural model for the N-terminal half of HIV-1 rev in fibrillar form HIV rev self-assembly is linked to a molten-globule to compact structural transition The arginine-rich RNA-binding motif of HIV-1 rev is intrinsically disordered and folds upon RRE binding Human papillomavirus E7 oncoproteins bind a single form of cyclin E in a complex with cdk2 and p107 The human papilloma virus-16 E7 oncoprotein is able to bind to the retinoblastoma gene product The E7 oncoprotein associates with Mi2 and histone deacetylase activity to promote cell growth HPV16 E7 protein associates with the protein kinase p33(CDK2) and cyclin A Activation of the protein kinase B pathway by the HPV-16 E7 oncoprotein occurs through a mechanism involving interaction with PP2A Posttranscriptional induction of p21cip1 protein by human papillomavirus E7 inhibits unscheduled DNA synthesis reactivated in differentiated keratinocytes Ten years of protein kinase B signalling: a hard Akt to follow High-risk (HPV16) human papillomavirus E7 oncoprotein is highly stable and extended, with conformational transitions that could explain its multiple cellular binding partners The N-terminal module of HPV16 E7 is an intrinsically disordered domain that confers conformational and recognition plasticity to the oncoprotein Targeting mechanism of the retinoblastoma tumor suppressor by a prototypical viral oncoprotein: structural modularity, intrinsic disorder and phosphorylation of human papillomavirus E7 Protein intrinsic disorder and human papillomaviruses: increased amount of disorder in E6 and E7 oncoproteins from high risk HPVs Termination factor for RNA synthesis Properties of the N gene transcription 
The authors acknowledge funding support from the Council of Scientific and Industrial Research (CSIR), India (file no. 09/1058(0013)/2019-EMR-I). The authors also acknowledge the host support and facilities provided by the Indian Institute of Technology Mandi, Himachal Pradesh, India. PMM acknowledges the cooperation of Mr. Kundlik in preparing the list of abbreviations and the citations for part of the manuscript. The authors declare no conflict of interest.