key: cord-0011338-gbjn6qqo authors: Halim, Sobia Ahsan; Aziz, Sobia; Ilyas, Mohammad; Wadood, Abdul; Khan, Ajmal; Al-Harrasi, Ahmed title: In Silico Modeling of Crimean Congo Hemorrhagic Fever Virus Glycoprotein-N and Screening of Anti Viral Hits by Virtual Screening date: 2020-03-11 journal: Int J Pept Res Ther DOI: 10.1007/s10989-020-10055-1 sha: 8619d9775d596dffec4bbc4ccf4f882180f6e598 doc_id: 11338 cord_uid: gbjn6qqo
Crimean-Congo hemorrhagic fever (CCHF) is a widespread zoonotic viral disease caused by a tick-borne virus, Crimean-Congo hemorrhagic fever virus (CCHFV). The disease is endemic in the Middle East, Asia, Africa and South-Eastern Europe, with a mortality rate of 5–30%. The CCHFV genome is composed of three segments: large, medium and small. The M segment encodes a polyprotein (glycoprotein) that includes glycoprotein N (Gn), which is considered a potential druggable target for the effective therapy of CCHF. The complete structure of Gn has not yet been characterized. The aim of the current study is to predict the complete three-dimensional (3D) structure of the CCHFV Gn protein via threading-based modeling and to investigate the residues crucial for binding with the CCHFV envelope. The developed model displayed excellent stereo-chemical and geometrical properties. Subsequently, structure-based virtual screening (SBVS) was applied to discover novel inhibitors of the Gn protein. A library of > 1300 anti-virals was selected from the PubChem database and directed to the predicted binding site of Gn. The SBVS results led to the identification of thirty-seven compounds that inhibit the protein in computational analysis. Those 37 hits were subjected to pharmacokinetic profiling, which demonstrated that 30/37 compounds possess safer pharmacokinetic properties. Thus, by specifically targeting Gn, less toxic and more potent inhibitors of CCHFV were identified in silico.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10989-020-10055-1) contains supplementary material, which is available to authorized users.
Crimean-Congo hemorrhagic fever (CCHF) is a prevalent zoonotic viral disease caused by Crimean-Congo hemorrhagic fever virus (CCHFV), a tick-borne virus that belongs to the Nairovirus genus of the Bunyaviridae family. CCHF is widespread over a large geographical area including the Middle East, Asia, Africa and South-Eastern Europe, with a mortality rate of up to 40% (Appannanavar and Mishra 2011; Zivcec et al. 2016). After Dengue, CCHFV is the second most prevalent arbovirus of significant medical importance (Ergonul 2006; Ergonul 2012; Bente et al. 2013). CCHFV causes severe viral haemorrhagic fever outbreaks, with a case fatality rate of 10-40% (WHO reports 2013). In Pakistan, 50-60 cases have been reported annually since 2000 (WHO reports 2013; Begum et al. 1970). CCHFV spreads through the Hyalomma tick, the vector responsible for viral transmission. CCHFV does not cause symptoms in animals, but causes mild to highly fatal disease in humans (Bente et al. 2013). The infection is usually initiated through the skin lesions produced by the infected tick. After a short incubation period (usually ≤ 7 days), non-specific symptoms of CCHF appear, including high fever, faintness, chills, irritability, and limb, head and spinal pains, which can last for 5-12 days. Diarrhoea, vomiting, abdominal pain, thrombocytopenia, bradycardia and elevation of circulating liver enzymes are other recurrent symptoms of CCHF. Hemorrhages such as ecchymosis, epistaxis, and cerebral, gingival and gastrointestinal haemorrhages usually begin on the fourth day of infection. The liver becomes swollen and painful. In current infections, aspiratory hemorrhages, neurological complications and blood loss lead to death.
Disseminated intravascular coagulopathy, multi-organ failure and shock typically result in the fatal outcome (Burt et al. 1997; Saijo et al. 2010; Bente et al. 2013). Moreover, endothelial cells, mononuclear phagocytes and hepatocytes are the main cellular targets of infection. Ribavirin is the only available antiviral drug used for the treatment of CCHF; it inhibits the CCHFV replication cycle but may cause neurological and hematological anomalies (Ergonul 2008). The viral load is prominent in blood during the initial stage of CCHF, so the transfer of antibodies against CCHFV is expected to be an effective therapy (Wölfel et al. 2007); however, the use of monoclonal antibodies against CCHFV is still in its infancy.
CCHFV is a pleomorphic virus (~ 90-100 nm in diameter) with a tripartite single-stranded negative-sense RNA genome (vRNA) composed of small (S), medium (M) and large (L) segments (Guo et al. 2012; Shtanko et al. 2014; Goedhals et al. 2015). The virion has a cell membrane-derived envelope (coated by mature glycoproteins) that encloses genomic ribonucleoprotein complexes (RNPs) formed by vRNA, nucleoprotein and RdRp (Zivcec et al. 2016). The viral capsid is enveloped by a 5 nm thick lipid bilayer, and small projections (~ 5-10 nm long) are formed by the envelope proteins (Sanchez et al. 2002; Bergeron et al. 2007). The L segment is 11-14.4 kilobases in length and encodes the RNA-dependent RNA polymerase (RdRp) and the nucleoprotein (NP). RdRp is required for mRNA and cRNA synthesis, which are necessary for translation and genome replication, respectively (Honig et al. 2004). The M segment is 4.4-6.3 kb long and encodes the envelope glycoprotein precursor (Zivcec et al. 2016), which is cleaved co-translationally by signal peptidase and then post-translationally processed into two structural transmembrane glycoproteins (Gc and Gn), a non-structural M protein (NSM), and secreted non-structural proteins (GP160, GP85 and GP38).
These structural proteins form complexes on the surface of the virion and assist in attachment to host cell surface receptors, leading to fusion of the viral envelope with the host membrane (Sanchez et al. 2002; Bergeron et al. 2007). The S segment is 1.7-2.1 kb in size and encodes the nucleocapsid protein, which serves in the encapsidation of viral RNA (vRNA) and complementary RNA (cRNA) during transcription and replication of the genome. The mutation rates for the three parts of the genome were estimated to be 1.09 × 10⁻⁴, 1.52 × 10⁻⁴ and 0.58 × 10⁻⁴ substitutions/site/year for the S, M and L segments, respectively (Carter et al. 2012). The M segment is a crucial part of CCHFV because its Gn and Gc proteins aid CCHFV entry and fusion, formation of the virion particle and immune evasion (Bertolotti-Ciarlet et al. 2005). Gn binds ribonucleoproteins in vitro through its cytoplasmic tail, which contains a zinc finger domain, implying that Gn may be involved in genome packaging and has an important role in viral assembly (Erickson et al. 2007). The Gn glycoprotein contains a 176-residue ectodomain followed by a 24-residue transmembrane region and terminates in a long cytoplasmic tail of ~ 100 residues (Estrada and Guzman 2011; Strandin et al. 2013). Because of the importance of Gn, designing a small-molecule inhibitor against it could be a better approach to inhibit CCHFV. Since the complete structure of Gn is not available, we carried out its structural elucidation by in silico modeling. The generated model was used to identify novel drug-like compounds against CCHFV by targeting the Gn protein. Subsequently, we performed ADMET profiling of the selected hits. This study has resulted in the discovery of novel scaffolds against the CCHFV Gn protein.
Threading-based modeling was conducted with the I-TASSER (Yang et al. 2015), RaptorX (Wang et al. 2016a) and ModBase (Pieper et al. 2010) servers.
Molecular Operating Environment (MOE v2014.09) was used for molecular docking. The PLIF (Protein-Ligand Interaction Fingerprints) (Labute 2001) utility was used for protein-ligand interaction calculation. The illustrations of the 3D model and protein-ligand interactions were prepared with UCSF Chimera (Pettersen et al. 2004). The complete strategy is depicted in Fig. 1.
The M segment of CCHFV codes for the glycoprotein polyprotein. The sequence of the M segment was retrieved from NCBI with the accession code ARB51455. This sequence consists of 1684 residues; the Gn sequence lies between residues 520 and 842 and was selected from UniProtKB (code: Q8JSZ3) for modeling. The Gn sequence is given below:
>sp|Q8JSZ3|GP_CCHFI Envelopment polyprotein OS=Crimean-Congo hemorrhagic fever virus (strain Nigeria/IbAr10200/1970) GN=GP PE=3 SV=1
SEEPSDDCISRTQLLRTETAEIHGDNYGGPGDKITICNGSTIVDQRLGSELGCYTINRVRSFKLCENSATGKNCEIDSVPVKCRQGYCLRITQEGRGHVKLSRGSEVVLDACDTSCEIMIPKGTGDILVDCSGGQQHFLKDNLIDLGCPKIPLLGKMAIYICRMSNHPKTTMAFLFWFSFGYVITCILCKAIFYLLIIVGTLGKRLKQYRELKPQTCTICETTPVNAIDAEMHDLNCSYNICPYCASRLTSDGLARHVIQCPKRKEKVEETELYLN-
For homology modeling, template searching was carried out with protein BLAST (with the PSI-BLAST option), which retrieved sequences of very low coverage, i.e., < 30%. Therefore, modeling of Gn was performed by a threading-based approach. For this purpose, the I-TASSER server was used, in which protein models are built from the structural alignment of the selected templates rather than from sequence alignment and homology (percent identity). The best model obtained from I-TASSER was validated with a PROCHECK Ramachandran plot to assess the stereochemical properties of the protein. The results were not satisfactory, thus the RaptorX server was used to predict a reliable structure.
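The region selection described above (residues 520 to 842 of a 1684-residue polyprotein) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the helper names `read_fasta` and `extract_region` are hypothetical, and the toy polyprotein is a placeholder string of the reported length rather than the real sequence.

```python
from typing import Tuple


def read_fasta(text: str) -> Tuple[str, str]:
    """Parse a single-record FASTA string into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    # Header without the leading '>', sequence lines joined together
    return lines[0][1:], "".join(lines[1:])


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of a sequence (1-based, inclusive)."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    return sequence[start - 1:end]


# Placeholder polyprotein with the reported length of 1684 residues.
# The letters are dummies; only the coordinates mirror the text.
toy_polyprotein = "A" * 519 + "G" * 323 + "A" * 842
gn_region = extract_region(toy_polyprotein, 520, 842)
print(len(toy_polyprotein), len(gn_region))
```

The inclusive range 520 to 842 spans 842 - 520 + 1 = 323 residues, which matches the 323-residue Gn sequence submitted for modeling.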
RaptorX uses a multiple-template threading protocol to build a 3D model from a single target sequence; the quality of the final model is improved by its ability to partially correct errors in pairwise alignments, and alignment coverage is also increased. The RaptorX-generated model was again evaluated with PROCHECK, which showed good results. However, the RaptorX-generated model possessed several disordered loops; therefore, two loops spanning residues 38-96 and 274-323 were modeled separately. After several attempts on these three servers, we obtained the best models of regions 38-96 and 274-323 from ModBase and RaptorX, respectively. The structural properties of the predicted loops were evaluated, and the loops were replaced in the 3D model of Gn using Chimera. The binding site of Gn was predicted with the COACH server (Yang et al. 2013) and by aligning the binding site with that of 2L7X.
A set of 1392 anti-viral compounds was retrieved from the PubChem database for virtual screening. The chemical structures of the compounds were checked in MOE and converted into 3D with the MOE wash module; hydrogen atoms were added and partial charges were assigned to each structure. The compounds were minimized to their lowest-energy conformation using the MMFF94x force field until an RMS gradient of 0.01 kcal/mol was reached.
The selected library of compounds was docked into the predicted Gn binding site with MOE. Using the Protonate3D module, the ionization state of the model was assigned, hydrogens were added and the electrostatic potential was calculated. Subsequently, the model was minimized using the MMFF94x force field with the default parameters. For docking, the virtual screening protocol of MOE was used with the Triangle Matcher placement method, the London dG scoring function and the force field (GBVI/WSA dG) refinement method. Finally, protein-ligand interactions were calculated with PLIF. The interactions were rendered with Chimera.
The 3D structure of the Gn protein was initially predicted with I-TASSER; however, the stereochemical properties of the obtained model were not good, so we used the RaptorX server for modeling. RaptorX builds the secondary and tertiary structures of a query sequence by aligning it with the sequences of multiple templates, and the quality of the predicted model is evaluated by p-value, score, uGDT and GDT (Peng and Xu 2011). The 323-amino-acid sequence of the Gn glycoprotein of CCHFV was submitted to RaptorX to model the structure of Gn. The Gn model retrieved from RaptorX contains three domains; each domain was deduced by aligning one or more top-matched template sequences present in the template library. The model is composed of an ectodomain (residues Ser1-Pro168), a transmembrane domain (residues Lys169-Leu206), a zinc finger cytoplasmic domain (residues Lys207-Glu278), and a cytoplasmic tail (residues Arg279-Ile323). The templates used to construct the model are 2L7X (Chain A) (Estrada and Guzman 2011), 5M87 (Chain A) (Ehrnstorfer et al. 2017), 2A65 (Chain A) (Yamashita et al. 2005) and 5G47 (Chain A) (Halldorsson et al. 2016). The alignment of the templates and the query sequence is depicted in Fig. 2. Properties of the templates are shown in Table 1.
GDT and uGDT are the global distance test and the unnormalized global distance test, respectively. The absolute model quality is measured by uGDT (GDT). A uGDT > 50 is a good indicator for proteins with more than 100 residues, while a GDT > 50 is a good indicator for proteins with fewer than 100 residues. Thus, if a model shows a good uGDT (> 50) but a poor GDT (< 50), only a small portion of the protein model is good. The score is the alignment score, which ranges from 0 to the domain sequence length; a score of 0 is the worst result. The relative quality of the model is given by the p-value, which should be < 10⁻³ and < 10⁻⁴ for mainly α and mainly β proteins, respectively.
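The quality rules described above can be written as a small decision function. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and it assumes that GDT is uGDT divided by the sequence length and multiplied by 100, an assumption that reproduces the GDT of 39 reported for this model (uGDT 126 over 323 residues).

```python
def gdt_from_ugdt(ugdt: float, n_residues: int) -> float:
    """Normalize uGDT by sequence length (assumed definition of GDT)."""
    return 100.0 * ugdt / n_residues


def assess_model(ugdt: float, n_residues: int, cutoff: float = 50.0) -> str:
    """Apply the uGDT/GDT rules of thumb described in the text."""
    gdt = gdt_from_ugdt(ugdt, n_residues)
    if ugdt > cutoff and gdt > cutoff:
        return "good"
    if ugdt > cutoff:
        # High uGDT but low GDT: only part of the model is reliable
        return "good in part"
    if n_residues < 100 and gdt > cutoff:
        # Small proteins are judged on GDT alone
        return "good"
    return "poor"


def p_value_ok(p_value: float, mainly_beta: bool = False) -> bool:
    """Check the p-value against 1e-3 (mainly alpha) or 1e-4 (mainly beta)."""
    limit = 1e-4 if mainly_beta else 1e-3
    return p_value < limit


print(round(gdt_from_ugdt(126, 323)), assess_model(126, 323))
```

For the values reported for the Gn model, the function returns "good in part", which is the same reading the text gives for a model with uGDT above 50 and GDT below 50.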
The majority of the domains consist of α-helix and none of the domains has a p-value > 10⁻³, which shows that the quality of the model is good. The predicted structure of the model is shown in Fig. 3. The model is composed of three domains, and the best template was 5G47 (Chain A). The overall uGDT (GDT) value of the predicted structure is 126 (39): the GDT of 39 is < 50, while the uGDT of 126 is > 50. This indicates that only a small portion of the whole model is good. The complete sequence (100% of residues) was modeled by RaptorX, of which only 17% of residues were predicted as disordered, suggesting that the remaining 83% of residues are ordered. The ordered and disordered positions are depicted as blue and red bars in Fig. 4.
The secondary structure of the Gn model was deduced by RaptorX (Wang et al. 2016b; Schaarschmidt et al. 2017), which displays results in two modes: (i) 3-state secondary structure and (ii) 8-state secondary structure. The 3-state secondary structure comprises α-helix, β-sheet and coil regions, represented by H, E and C, respectively. The secondary structure of Gn contains 37% H, 24% E and 37% C (Fig. 5). The solvent-exposed areas of the model were predicted by the 3-state solvent accessibility method of RaptorX, which labels solvent-exposed, medium and buried residues as E, M and B, respectively. The solvent accessibility of Gn shows that 31% of the protein is exposed, 48% is in the medium region and 19% is buried (Fig. 6).
The stereo-chemical properties of the Gn model were validated with a PROCHECK Ramachandran plot, which shows that 86.6% of residues are in the most favored regions, 11.3% in additional allowed regions, 1.4% in generously allowed regions and only 0.7% in disallowed regions. The Ramachandran plot is shown in supporting information Fig. S1.
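The percentages reported above lend themselves to a quick arithmetic check. The following sketch uses hypothetical helper names and a rounding tolerance chosen for illustration; it computes the ordered fraction as the complement of the disordered fraction and tests whether each reported breakdown adds up to 100%.

```python
def ordered_percent(disordered_percent: float) -> float:
    """Ordered residues are the complement of disordered residues."""
    return 100.0 - disordered_percent


def sums_to_100(parts, tolerance: float = 0.5) -> bool:
    """True if the parts add up to 100 within a rounding tolerance."""
    return abs(sum(parts) - 100.0) <= tolerance


# Values as reported in the text for the RaptorX model
ramachandran = {
    "most favored": 86.6,
    "additional allowed": 11.3,
    "generously allowed": 1.4,
    "disallowed": 0.7,
}
secondary_structure = {"H": 37, "E": 24, "C": 37}
solvent_accessibility = {"E": 31, "M": 48, "B": 19}

print(ordered_percent(17))
print(sums_to_100(ramachandran.values()))
print(sum(secondary_structure.values()), sum(solvent_accessibility.values()))
```

The Ramachandran categories add up to 100%, while the secondary-structure and solvent-accessibility percentages each add up to 98%, presumably because of rounding in the reported values.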
Fig. 2 The sequence alignment of CCHFV Gn protein and templates
The 3D model predicted by RaptorX contained several loops (Fig. 3); the loop regions that contain > 50 residues were therefore modeled again separately with different programs. The Gn (38-96) region was modeled with the ModBase server using 5B0U (Chain A) as template, which has 53% identity with region 38-96. The model showed that this region is not composed of coils but of three anti-parallel β-sheets. The model quality was assessed by a Ramachandran plot (Fig. S2), which shows that 92% of residues are in the most favored region while no residue is in the disallowed region. The region Gn (274-323) was modeled with RaptorX using 5UAK (Chain A), 5UAR (Chain A) and 4WAT (Chain A) as templates, and the resulting model shows that this region (274-323) is 68% helical. The Ramachandran plot (Fig. S3) shows that 97.8% of residues lie in the most favored region while only 1 residue (2.2%) is in the disallowed region. These modeled segments were then substituted and joined into the initial model of Gn (predicted by RaptorX) using Chimera. The complete structure after loop modeling is shown in Fig. 3. The final model of Gn was then validated by a Ramachandran plot, which showed that 88.4% of residues lie in the most favored region, while 9.2%, 1.4% and 1.1% of residues are in the additional allowed, generously allowed and disallowed regions, respectively. The stereo-chemical properties indicate that the model is of good quality (Fig. S4).
The residues involved in protein-protein or protein-ligand binding were identified with the COACH algorithm (Wu et al. 2018), which uses a structural alignment method to predict the binding site of a protein by comparing the binding sites of different templates. These sites are predicted by two methods, namely TM-SITE and S-SITE.
According to the COACH results, the binding site is located at residues 217, 220, 233 and 237 of the Gn glycoprotein (Table 2).
A set of 1392 compounds with anti-viral activities was retrieved from PubChem and screened against the Gn protein by molecular docking. Based on docking rank and score, the top 500 compounds were selected for interaction analysis. The PLIF results showed that thirty-seven compounds interact precisely with the zinc finger domain of the Gn protein. The docking and PLIF results of the selected hits are tabulated in Table 3.
The pharmacokinetic properties of the selected hits were evaluated with ADMETsar (http://lmmd.ecust.edu.cn/admetsar1/predict/) and SwissADME (http://www.swissadme.ch/). The results are tabulated in Table 4. Only seven compounds (23, 28-32 and 36) displayed AMES toxicity, while the rest are non-carcinogenic. Moreover, five compounds (29-32 and 36) are predicted to cross the blood-brain barrier (BBB), while the rest do not show BBB positivity. Among the thirty-seven hits, twelve compounds (22-25, 28-34 and 36) displayed high intestinal absorption in humans. The predicted acute toxicity in rat models showed that the compounds are not lethal up to a concentration of 2 mol/kg; hence these compounds are not lethal at lower doses and fall in the good range of median lethal dose (LD50). The ADMETsar server predicted the acute oral toxicity of all the compounds. According to the results, compound 31 falls in category II (50 mg/kg < LD50 ≤ 500 mg/kg) and compounds 13 and 19 fall in category IV (LD50 > 5000 mg/kg), while the rest of the compounds fall in category III (500 mg/kg < LD50 ≤ 5000 mg/kg). The results indicate that most compounds do not show oral toxicity at doses up to 500 mg/kg, and are thus not considered orally toxic.
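The toxicity categories and the hit filtering discussed above can be expressed compactly. This is a hedged sketch rather than the servers' actual logic: the function name is hypothetical, the category I boundary (LD50 ≤ 50 mg/kg) is not stated in the text and is assumed here, and the compound numbers are those listed in the text as AMES-positive or BBB-positive.

```python
def oral_toxicity_category(ld50_mg_per_kg: float) -> str:
    """Map an oral LD50 (mg/kg) to an acute toxicity category."""
    if ld50_mg_per_kg <= 50:
        return "I"    # assumption: this boundary is not given in the text
    if ld50_mg_per_kg <= 500:
        return "II"
    if ld50_mg_per_kg <= 5000:
        return "III"
    return "IV"


# Compounds are numbered 1..37 as in the text
hits = set(range(1, 38))
ames_positive = {23, 28, 29, 30, 31, 32, 36}
bbb_positive = {29, 30, 31, 32, 36}

# Exclude any hit flagged for AMES toxicity or BBB permeability
excluded = ames_positive | bbb_positive
retained = sorted(hits - excluded)

print(oral_toxicity_category(300), len(excluded), len(retained))
```

Removing the union of the two flagged sets leaves 30 of the 37 hits, which agrees with the 30/37 compounds described as having safer pharmacokinetic properties.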
The predicted metabolic profile indicates, for each compound, the cytochrome P450 isoforms for which it is a substrate or non-substrate and those it is predicted to inhibit. The molecules with high AMES toxicity and high BBB permeability were excluded from selection. The final selected compounds with their respective ADMET properties are tabulated in Table 4.
Following the interaction analysis, the compounds were segregated into two categories. Compounds that bind within the binding site were classified as category I, while compounds that specifically interact with the residues of the zinc finger domain were classified as category II. The compounds 1, 2, 4-6, 9-11, 13, 18- Compounds 3, 7-8, 12, 14-17, 24, 33-34 and 37 are included in category II, which interact with the zinc finger domain of the Gn protein (Fig. 7). The docked view of compound 3 shows that the compound forms H-bonds with Tyr207 (2.5 Å) and Ala209 (2.7 Å), and an ionic bond with Cys208 (2.1 Å). Moreover, the benzoyl-OH moiety of the compound forms bi-dentate interactions with Arg211 (1.5 Å and 2.5 Å). The docked view of compound 7 shows that a weak H-bond is formed between the carbonyl oxygen and Cys208 (3.3 Å), and a Π-Π interaction is formed between the phenyl moiety of the ligand and His220 (2.1 Å). The phenyl-OH of the compound is H-bonded with Arg211 (0.7 Å). The collective H-bonds and hydrophobic interactions are responsible for the inhibition of the zinc finger domain. The binding mode of compound 8 shows a weak H-bond with Glu229 (3.1 Å) and a strong H-bond between the carbonyl oxygen and the side chain of His220 (1.6 Å). Additionally, several residues of the zinc finger domain provide hydrophobic interactions that stabilize the compound. Compound 12 forms several H-bonds within the zinc finger domain. The sulfate moiety of the compound forms bi-dentate interactions with Ser210 (1.8 Å and 2.5 Å), and two H-bonds with Cys208 (2.8 Å) and His220 (2.4 Å).
Moreover ligand formed bi-dentate interactions with Glu223 at a distance of 1.6 Å and 1.9 Å, respectively. The carbonyl oxygen of the ligand formed a strong H-bond with Arg219 (1.7 Å). The predicted binding mode suggests that this compound could be a potent inhibitor of CCHFV-Gn protein because of these multiple interactions. The carbonyl oxygen at 3-methyl-butanoyl moiety of compound 14 mediates H-bond with His220 (2.3 Å) and Cys208 (2.4 Å). Additionally compound also forms Π-Π interactions with the side chain of Tyr207. The 4-fluoro-3-hydroxyne-4-methyloxolan and 2, 4-dioxopyrimidin moieties of compound 15 accepts H-bond from the side chain of His220 (2.3 Å) and Cys208 (2.4 Å), respectively. The trihydroxybenzoyl moiety of the compound 16 mediates ionic interaction with Lys226 (2.3 Å), and the carbonyl oxygen of the compound formed H-bond with the -OH of Ser210 (1.9 Å). The benzoyl moiety of compound forms H-bond with Gln223 and His220 at a distance of 2.2 Å and 2.6 Å, respectively. The compound 17 mediates multiple H-bonding with the surrounding residues. The tri-hydroxyphenyl moiety of the compound formed H-bonds with Cys208 (2.4 Å), Ser210 (2.6 Å) and His220 (1.7 Å). The tri-hydroxy phenyl substituted carbonyl moiety interacts with Gln223 (2.9 Å), thus compound 17 also form multiple strong interactions with the zinc finger domain. The compound 24 mediates strong H-bond with the side chains of Tyr207 (1.9 Å), His220 (1.9 Å) and Arg219 (2.5 Å). The compound 33 also mediates several hydrophilic interactions with the zinc finger domain. The carbonyl moiety of the compound formed H-bond with Cys208 (3.1 Å), and bi-dentate interactions with the side chain of Arg211 at a distance of 1.9 Å and 2.3 Å, respectively. Moreover the compound also formed bi-dentate interaction with the side chain of Ser210 (2.6 Å and 1.8 Å). Additionally several hydrophobic interactions stabilize the compound within the binding site. 
Compound 34 formed three H-bonds with Tyr207, His220 and Arg219. The compound is composed of four rings. One tri-hydroxyphenyl moiety of the compound interacts with Tyr207 via an H-bond (1.9 Å), while the carboxyl group formed an H-bond with the side chain of His220 (1.9 Å). Another phenyl moiety accepts an H-bond from the side chain of Arg219 (2.5 Å). Compound 37 formed H-bonds with Cys208 and His220 at distances of 2.4 Å and 1.9 Å, respectively. A weak H-bond was observed between the phenyl group of the compound and Tyr207 (3.6 Å). The binding interactions of these compounds are shown in Fig. 7. The results suggest that these compounds bind to the zinc finger domain with strong interactions and are thus capable of hindering the function of the Gn protein.
Fig. 7 Docked views of compounds 3, 7, 8, 12, 14, 15, 16, 17, 24, 33, 34 and 37 (a-l) that interact with the zinc finger domain
CCHF is a life-threatening viral disease with high mortality and morbidity rates (Rahpeyma et al. 2015). Although CCHFV belongs to the Bunyaviridae family, it shows some uncommon properties compared to other genera of this family; for instance, the M segment of CCHFV encodes a large precursor of 1684 amino acids, from which a remarkably large glycoprotein is derived. Another feature that distinguishes CCHFV from other genera is that its M-segment-encoded glycoprotein precursor undergoes a complex series of proteolytic steps before maturation, while other viral glycoproteins undergo proteolysis in a single step. The cysteine residues present in CCHFV glycoproteins indicate the complexity of its secondary structure due to the presence of disulfide bonds (Bertolotti-Ciarlet et al. 2005; Altamura et al. 2007; Carter et al. 2012). Because of the important role of the Gn protein in viral assembly and localization, several studies have targeted this glycoprotein as a potent immunogen for vaccine development using various expression systems (Saijo et al. 2010; Strandin et al. 2013; Buttigieg et al. 2014; Dowall et al.
2017; Wu et al. 2017). The three-dimensional (3D) structure of any protein facilitates its functional characterization (Ul-Haq et al. 2015; Purohit et al. 2018). To construct the model of the Gn glycoprotein, the I-TASSER server was used initially; however, the retrieved model showed lower quality: out of 323 residues, 157 lay in the most favored region, 104 in the additionally allowed region, 16 in the generously allowed region and 7 in the disallowed region. I-TASSER uses structural fragments of multiple templates to build the model; as a result, the generated model can have lower quality due to the presence of more unfavorable torsion angles in its backbone. Complex and challenging proteins with few close homologous templates show this type of result (Ul-Haq et al. 2015). Since our target protein is complex, we tested RaptorX for its modeling. The RaptorX server is usually used to construct models of targets that have few close homologs by employing multiple template structures. The resultant model showed an acceptable stereo-chemical profile; however, two loops were geometrically unacceptable in the model, so those two loops were modeled separately and joined into the model.
The Gn of all nairoviruses has conserved cysteine and histidine residues in the cytoplasmic tail (CT), which are responsible for the formation of the zinc finger domain. Nairoviruses possess dual CCHC-type zinc fingers that form a globular domain by tightly associating with each other. The role of the ZF domain is the regulation of DNA and viral RNA (Strandin et al. 2013). Andes virus (ANDV) and CCHFV contain dual zinc fingers with similar structures. The only difference is that ANDV does not have the ability to bind viral RNA, while CCHFV does (Altamura et al. 2007).
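To compare the I-TASSER model with the percentages reported for the RaptorX model, the residue counts quoted above can be converted to percentages. The sketch below is illustrative only; the function name is hypothetical. Note that the four counts add up to 284 rather than 323, which would be expected if, as is usual for PROCHECK statistics, glycine, proline and terminal residues are excluded (an assumption, since the text does not say so).

```python
def to_percentages(counts: dict) -> dict:
    """Convert per-region residue counts to percentages of their total."""
    total = sum(counts.values())
    return {region: round(100.0 * n / total, 1) for region, n in counts.items()}


# Residue counts reported in the text for the I-TASSER model
itasser_counts = {
    "most favored": 157,
    "additionally allowed": 104,
    "generously allowed": 16,
    "disallowed": 7,
}

itasser_percent = to_percentages(itasser_counts)
print(sum(itasser_counts.values()))
print(itasser_percent)
```

Under this reading, roughly 55% of the counted residues of the I-TASSER model lie in the most favored region, compared with 86.6% reported for the initial RaptorX model and 88.4% for the final model.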
The glycoproteins Gn and Gc belongs to type I membrane integral proteins and extend viral membrane, its N-terminal domain contact with outer environment and act as ectodomain while C-terminal points toward intraviral space. Bunyaviruses are different from other single stranded anti-sense RNA viruses because they lack protein which acts as scaffold between viral envelope and RNP components (matrix protein). However, viral Gn-CTs are large enough that they can accommodate domains and performs function like matrix protein. So, they can be assumed as substitutes of viral matrix protein (Strandin et al. 2013) . The thrombocytopenia syndrome virus and Rift Valley fever virus belongs to the Phleboviruses genus of the bunyaviridae family, their Gn is also type-I integral membrane protein and its N-terminal act as ectodomain while C-terminal domain is transmembrane helical portion that is inserted in viral membrane (Wu et al. 2017) . The CCHFV Gn model also depicts that its C-terminal is helical. The studies showed that Gn glycoprotein contains a 176 residue ectodomain followed by a 24 residues transmembrane region and a long cytoplasmic tail composed of > 100 residues (Estrada and Guzman 2011) . The developed model is composed of an ectodomain (residue Ser1-Pro168), a transmembrane domain (residues Lys169-Leu206), a zinc finger cytoplasmic domain (residues Lys207-Glu278), and a cytoplasmic tail (residues Arg279-Ile323.) Several studies revealed that Gn of protein of virus from bunyaviridae family is involved in viral assembly. For example, alanine mutagenesis of the cytoplasmic tails of Uukuniemi virus and Bunyamwera virus affect the ability of virus-like particles to effectively incorporate ribonucleoproteins, thus intimating a role for Gn tails in genome packaging. More recently, the Gn tail of Puumala virus was shown to co-immunoprecipitate with the Puumala nucleocapsid protein. 
These results suggest that the CCHFV Gn tail plays an equally important role in viral assembly in the genus Nairovirus (Sanchez et al. 2002; Estrada and Guzman 2011; Wu et al. 2017).
Drug design is a complex process in which computational tools help to speed up the work through automated procedures. Due to the high mutation rate of viral proteins, there is increasing demand to expedite the delivery of drugs against viral diseases. Computational medicinal chemistry is not only applied against human disease but has also delivered several novel fungicides against plant diseases (Iftikhar et al. 2017). It is important to predict the toxicity, ADME properties and potential activity of drug-like molecules prior to their biological testing in order to avoid drug failure. In the present study, these properties were predicted with SwissADME and ADMETsar. Virtual screening is the computational searching of a huge chemical space against targets. Previously, structure-based virtual screening was applied to identify novel immunomodulators against human immune disorders (Halim et al. 2013), and several drug-like molecules were suggested against dengue virus to establish effective therapeutics. We believe that the predicted hits will be a valuable starting point to deliver drugs against Congo virus.
CCHF is a contagious disease; currently there is no drug or vaccine available to treat this fatal disease. This study was conducted to explore computational resources to gain insights into the inhibitory mechanism of CCHFV. Glycoprotein Gn of CCHFV was exploited as an important drug target in this study because of its role in viral envelope binding. The structure of the zinc finger domain of this protein is available; however, the complete three-dimensional structure of the protein is not. Thus, threading-based in silico modeling was employed to elucidate its complete structure, which was used for the development of new drugs by structure-based virtual screening of antiviral compounds.
The computational analysis revealed that out of > 1300 compounds, thirty-seven compounds were compatible with the binding site and are anticipated to block the activity of Gn in silico. The in silico predicted ADMET profile suggests that thirty compounds have safer pharmacokinetic properties and could be exploited as potential hits. These results need in vitro and in vivo experimental validation.
The Ramachandran plot of the Gn model, the Ramachandran plot of the Gn (38-96) region predicted by ModBase, the Ramachandran plot of the Gn (274-323) region predicted by RaptorX, the Ramachandran plot of the 3D structure of the CCHFV Gn protein and the docked views of compounds 1-2, 4-6, 9-11, 13, 18-22, 25-27, 35 are included in the supporting information.
Author Contributions SAH outlined the research strategy and idea, and drafted and revised the manuscript. SA carried out the literature search and performed computational experiments. AW provided MOE software for virtual screening. MI, AJ and AH suggested valuable comments on manuscript writing. All the authors read and approved the final manuscript.
Conflict of interest All authors declare that they have no conflict of interest.
Ethical Approval This article does not contain any studies with human participants or animals performed by any of the authors.
Identification of a novel C-terminal cleavage of Crimean-Congo hemorrhagic fever virus PreGN that leads to generation of an NSM protein An update on Crimean Congo hemorrhagic fever Tick-Borne viruses of west Pakistan: IV. Viruses similar to, or identical with, Crimean Hemorrhagic fever (Congo-Semunya), WAD MEDANI and PAK ARGAS 461 Isolated from Ticks of the Changa Manga Forest, Lahore District, and of Hunza, Gilgit Agency, W.
Pakistan Crimean-Congo hemorrhagic fever: history, epidemiology, pathogenesis, clinical syndrome and genetic diversity Crimean-Congo hemorrhagic fever virus glycoprotein processing by the endoprotease SKI-1/S1P is critical for virus infectivity Cellular localization and antigenic characterization of Crimean-Congo hemorrhagic fever virus glycoproteins Immunohistochemical and in situ localization of Crimean-Congo hemorrhagic fever (CCHF) virus in human tissues and implications for CCHF pathogenesis A novel vaccine against Crimean-Congo Haemorrhagic Fever protects 100% of animals against lethal challenge in a mouse model Structure, function and evolution of the Crimean-Congo hemorrhagic fever virus nucleocapsid protein Development of vaccines against Crimean-Congo haemorrhagic fever virus Structural and mechanistic basis of proton-coupled metal ion transport in the SLC11/NRAMP family Crimean-Congo haemorrhagic fever Analysis of risk-factors among patients with Crimean-Congo haemorrhagic fever virus infection: severity criteria revisited Treatment of Crimean-Congo hemorrhagic fever Crimean-Congo hemorrhagic fever virus: new outbreaks, new discoveries N-linked glycosylation of Gn (but not Gc) is important for Crimean Congo hemorrhagic fever virus glycoprotein localization and transport Structural characterization of the Crimean-Congo hemorrhagic fever virus Gn tail provides insight into virus assembly Comparative analysis of the L, M, and S RNA segments of Crimean-Congo haemorrhagic fever virus isolates from southern Africa Crimean-Congo hemorrhagic fever virus nucleoprotein reveals endonuclease activity in bunyaviruses Identification of novel Interleukin-2 inhibitors through computational approaches Targeting dengue virus NS-3 helicase by ligand based pharmacophore modeling and structure based virtual screening Structure of a phleboviral envelope glycoprotein reveals a consolidated model of membrane fusion Crimean-Congo hemorrhagic fever virus genome L RNA segment and 
encoded protein Discovering novel alternaria solani succinate dehydrogenase inhibitors by in Silico modeling and virtual screening strategies to combat early blight Prediction of a highly deleterious mutation E17K in AKT-1 gene: an in silico approach Prediction of functionally significant single nucleotide polymorphisms in PTEN tumor suppressor gene: an in silico approach Protein threading using contextspecific alignment potential RaptorX: exploiting structure information for protein alignment by statistical inference UCSF Chimera-a visualization system for exploratory research and analysis ModBase, a database of annotated comparative protein structure models, and associated resources Screening of potential inhibitor against coat protein of apple chlorotic leaf spot virus Crimean-Congo hemorrhagic fever virus Gn bioinformatic analysis and construction of a recombinant bacmid in order to express Gn by baculovirus expression system Recent progress in the treatment of Crimean-Congo hemorrhagic fever and future perspectives Characterization of the glycoproteins of Crimean-Congo hemorrhagic fever virus Assessment of contact predictions in CASP12: co-evolution and deep learning coming of age Crimean-Congo hemorrhagic fever virus entry into host cells occurs through the multivesicular body and requires ESCRT regulators Cytoplasmic tails of bunyavirus Gn glycoproteins-could they act as matrix protein surrogates? 
3D structure prediction of human β1-adrenergic receptor via threading-based homology modeling for implications in structure-based drug designing RaptorX-Property: a web server for protein structure property prediction Protein secondary structure prediction using deep convolutional neural fields Developed with joint collaboration of National Institute of Health (NIH), World Health Organization (WHO) (b) Seasonal awareness and alert letter for epidemic-prone infectious diseases in Pakistan, Ministry of National Health Services, Regulation and Coordination Government of Pakistan Virus detection and monitoring of viral load in Crimean-Congo hemorrhagic fever virus patients Structures of phlebovirus glycoprotein Gn and identification of a neutralizing antibody epitope COACH-D: improved protein-ligand binding sites prediction with refined ligand-binding poses through molecular docking Crystal structure of a bacterial homologue of Na+/Cl-dependent neurotransmitter transporters Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment The I-TASSER suite: protein structure and function prediction Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Molecular insights into Crimean-Congo hemorrhagic fever virus