key: cord-0929809-8dyiw9ru authors: Hossain, Mohammad Uzzal; Bhattacharjee, Arittra; Emon, Md. Tabassum Hossain; Chowdhury, Zeshan Mahmud; Mosaib, Md. Golam; Mourin, Muntahi; Das, Keshob Chandra; Keya, Chaman Ara; Salimullah, Md. title: Recognition of plausible therapeutic agents to combat COVID-19: An omics data based combined approach date: 2021-03-01 journal: Gene DOI: 10.1016/j.gene.2020.145368 sha: ed823e7e9e93a994fe101db0c8ed7d2653f9f7b1 doc_id: 929809 cord_uid: 8dyiw9ru Coronavirus disease-2019 (COVID-19), caused by Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), has become an immense threat to global public health. In this study, we performed complete genome sequencing of a SARS-CoV-2 isolate. More than 67,000 genome sequences were further inspected from Global Initiative on Sharing All Influenza Data (GISAID). Using several in silico techniques, we proposed prospective therapeutics against this virus. Through meticulous analysis, several conserved and therapeutically suitable regions of SARS-CoV-2 such as RNA-dependent RNA polymerase (RdRp), Spike (S) and Membrane glycoprotein (M) coding genes were selected. Both S and M were chosen for the development of a chimeric vaccine that can generate memory B and T cells. siRNAs were also designed for S and M gene silencing. Moreover, six new drug candidates were suggested that might inhibit the activity of RdRp. Since SARS-CoV-2 and SARS-CoV-1 have 82.30% sequence identity, a Gene Expression Omnibus (GEO) dataset of Severe Acute Respiratory Syndrome (SARS) patients were analyzed. In this analysis, 13 immunoregulatory genes were found that can be used to develop type 1 interferon (IFN) based therapy. The proposed vaccine, siRNAs, drugs and IFN based analysis of this study will accelerate the development of new treatments. Coronavirus Disease-2019 (COVID-19) has drastically spread throughout the world and become a massive threat to humanity (WHO, 2019) . Most healthcare systems and governing bodies of developed countries are struggling to control the spread of this pandemic (Lai et al., 2020) . Quick preventative actions are challenging in curbing its spread since the causative agent of this disease is a novel SARS-related Coronavirus (CoV) or SARS-CoV-2. This particular strain of the virus was unknown before the outbreak that took place in Wuhan, China; therefore, its ambiguous pattern of transmission and pathogenicity is an ongoing and unrelenting threat to public health and the global economy (Lai et al., 2020; . CoVs, members of the Coronaviridae family and the Coronavirinae subfamily were considered insignificant pathogens for humans before the outbreak of SARS occurred in 2002 and 2003, which originated in the Guangdong province, China (Fung and Liu, 2019) . Coronavirinae has been divided into four categories: Alpha; Beta, Gamma and Deltacoronavirus (Fung and Liu, 2019) . These pathogens were described as the "virology backwater" since very little was known about their virulence (Cavanagh et al., 2005) . In the past couple of decades, they have caught the attention of the scientific community several times due to their zoonotic characteristics. The Betacoronavirus group especially consists of the riskiest Human CoVs (HCoVs), such as HCoV-OC43, HCoV-HKU1, SARS-CoV-1 and Middle East respiratory syndrome coronavirus (MERS-CoV) (Cui et al., 2019) . Within itself, coronavirinae has vast genetic diversity, but these have mostly been found in bats (Lau et al., 2019) . This mammalian host poses a serious threat to CoV outbreaks (Fan et al., 2019) . A possible reason behind this occurrence is that members of Coronavirinae frequently undergo RNA recombination inside bats and often generate novel CoVs (Cui et al., 2019; Lau et al., 2019) . SARS-CoV-2, a member of Betacoronavirus, most probably originated in bats and transmitted to humans via intermediary animals in Wuhan, China. This transmission started the COVID-19 pandemic around December 2019 (Wu and McGoogan, 2020) . SARS-CoV-2, a positive-stranded (+) RNA virus, spreads from an infected person to a healthy individual via respiratory droplets as a result of sneezing and/or coughing . The presence of SARS-CoV-2 is also observed in the stool; thus contaminated water supply might promote transmission . The virions pass through the upper respiratory tract of the host, then enter the lung cells by engaging Angiotensin-Converting Enzyme 2 (ACE2) and Transmembrane Serine Protease 2 (TMPRSS2) (Hoffmann et al., 2020) . This entry triggers innate immunity; as a result, the following inflammatory cytokines become elevated: Interleukin (IL) -2, IL-10, IL-7, Granulocyte-colony stimulating factor (GCSF), monocyte chemoattractant protein 1 (MCP1), Macrophage Inflammatory Proteins (MIP) 1A, Interferon gamma-induced protein-10 (IP-10) and tumor necrosis factor-alpha (TNFα) (Wu and McGoogan, 2020) . Consequently, patients develop indistinguishable flu-like symptoms such as cough, sore throat, headache, fatigue, myalgia, breathlessness, conjunctivitis and/or fever. In the worst cases, patients develop pneumonia leading to respiratory failure and death . To alleviate these symptoms, physicians are non-specifically applying antiparasitics such as Ivermectin and the antiviral drug Remdisivir, along with other antiviral therapeutics. Plasma therapy is also helping many patients to survive (Wu and McGoogan, 2020; Gao et al., 2020; Cascella et al., 2020; Caly et al., 2020; Eastman et al., 2020) . A recent in vitro study revealed a potential clinically proven protease inhibitor that can block TMPRSS2, and consequently the entry of SARS-CoV-2 (Hoffmann et al., 2020) . Conducting similar studies will help to reveal a variety of possible therapeutics and druggable targets, which increases the probability of creating successful treatments against COVID-19. In silico approaches are the most suitable way to execute this type of research in a short time with nearly zero cost. Powerful computational methods such as subtractive proteomics, comparative genomics or virtual screening can determine therapeutic targets and help develop drugs against bacteria, fungi or non-communicable diseases (Hossain et al., 2018; Abadio et al., 2011; Bhattacharjee et al., 2020) . Unfortunately, for viral infections, this type of combined rational approach is hardly seen. As a result, unveiling the potential targets, novel drugs, interferons (IFNs) and immunodominant vaccines against COVID-19 will be extremely challenging if the virus becomes resistant against different therapeutics. In this study, we propose a combined in silico approach to recognize plausible therapeutics. Here, we have sequenced one SARS-CoV-2 isolate from Bangladesh and identified the regions that are relatively conserved in more than 67,000 isolates. Several computational techniques were then merged to develop treatments against those conserved regions. Using these techniques, a chimeric vaccine (which is not dependent only on spike protein), small druggable molecules, and Small Interfering RNAs (siRNAs) were developed. Additionally, a comparative genome analysis was carried out between SARS-CoV-2, SARS-CoV-1 and MERS-CoV to detect the genetic similarity. SARS-CoV-1 showed close genetic relation and virulence pattern with SARS-CoV-2. Thus, a transcriptomic analysis of this related virus was performed to retrieve the IFN inducing genes to implement IFN based therapy. The viral isolate (SARS-CoV-2/human/BGD/NIB-BCSIR_02/2020) was collected from a 45-year-old female patient who was tested positive for COVID-19 by Reverse-Transcriptase Polymeric Chain Reaction (RT-PCR) along with the symptoms such as cough, mild fever, and throat congestion. The oropharyngeal swab of the patient was taken as a specimen on the 11th of May 2020. From there, the viral RNA was extracted using PureLink Viral RNA/DNA Mini Kit (Invitrogen). The RNA was then converted into cDNA using SuperScript™ VILO™ cDNA Synthesis Kit (Invitrogen). The viral cDNA was sequenced with Illumina Nextseq 550 nextgeneration sequencing technology. Here, the library preparation kit of Nextera DNA Flex was used for the synthesis of the nucleotides. The reagent cartridge was NextSeq High Output Kit that covers 300 cycles. The run mode was a local run manager that generated the FASTQ data workflow from the NextSeq 4-channel chemistry. Quality estimation and analysis of the sequences were performed using a customized version of the DRAGEN RNA pipeline, which is also available on local DRAGEN server hardware. The Illumina® DRAGEN RNA Pathogen Detection App uses a combined human and virus reference to explore pathogen data. Here, the low-quality raw reads were trimmed with Trimmomatic 0.36 (-phred33, LEADING: 20, TRAILING: 20, SLID) (Bolger et al., 2014) . The assembly was performed by SPAdes V. 03 using default parameters as well as used to cross-validate with the reference-based method as an internal control (Bankevich et al., 2012) . A total of 1, 78, 22,898 reads were produced in the reference-based alignment. After trimming, 99% of them were mapped to the SARS-CoV-2 reference genome. The assembled genome was submitted in National Center for Biotechnology Information (NCBI) GenBank (Accession number: MT568643.1) and Global Initiative on Sharing All Influenza Data (GISAID) (Accession ID: EPI_ISL_458133) (https://www.gisaid.org/). The phylogenetic tree was constructed as stated by Moniruzzaman et al., 2020 (Moniruzzaman et al., 2020 . In short, the closely related genomes were aligned with MAFFT version 7 and the Phylogenetic tree was constructed using FastTree version 2.1.10 via Galaxy platform (Katoh and Standley, 2013; Price et al., 2010; Afgan et al., 2018) . To identify conserved regions, SARS-CoV-2 genomes from different continents were aligned. From there relatively conserved regions that are suitable for therapeutics development were collected. The conservation (segments that are less diversified) of these regions were further observed in the GISAID database ( Fig. 2) (Elbe and Buckland-Merrett, 2017) . Afterward, these selected regions/ genes were obtained from our submitted genome to develop a chimeric vaccine, small molecules Interferon gamma-induced protein-10 TNFα Tumor necrosis factor alpha and siRNAs. The antigenicity of the selected Membrane (M) (NCBI GenBank accession: QKI36904.1) and Spike Glycoprotein (S) (NCBI GenBank accession: QKI36901.1) were determined by Vaxijen 2.0 (Doytchinova and Flower, 2007) and AntigenPro (Magnan et al., 2010) . These servers predicted antigenicity through alignment independent methods. The outside amino acid residues of the vaccine candidates were identified with THMM (Krogh et al., 2001) . The identified sequences were uploaded in the BepiPred-2.0 server to select the most potential B cell epitopes (Jespersen et al., 2017) . During this selection, surfaceexposed amino acids were prioritized mostly. The Cytotoxic T cell (CTL), Helper T cell (HTL) epitopes were disclosed by NetCTL 1.2 and NetMHCII respectively (Larsen et al., 2007; Jensen et al., 2018) . NetCTL 1.2 was implemented to identify the CTL epitopes for 12 classes of Major Histocompatibility Complex 1 (MHC I) supertypes. The best CTL epitopes were selected based on the combined score. NetMHCII was used to detect 15-mer HTL epitopes for Human Leukocyte Antigen (HLA)-DP, HLA-DQ, and HLA-DR alleles. Most potential epitopes were chosen by evaluating affinity, percentage ranking and binding level. To construct a chimeric or multi-epitope vaccine, all the epitopes were joined by EAAK, GPGPG and AAY linkers (Shey et al., 2019) . Human Beta Defensin-2 (HBD-2) (PDB ID: 1FD3) and a recombinant viral protein were added in the N terminal and the C terminal of the vaccine respectively (Yu et al., 2010; Li et al., 2013) . HBD-2 was conjugated because this protein can activate Toll-Like Receptor 4 (TLR 4) and has chemotactic activity (Yu et al., 2010; Niyonsaba et al., 2004) . The recombinant viral protein was added to stimulate antiviral responses. The fasta sequence of the vaccine was uploaded to an agent-based immune system simulator C-ImmSim (http://150.146.2.1/C-IMMSIM/ index.php) for measuring the immune responses (Rapin et al., 2010) . The parameters were kept in default during this simulation. C-ImmSim showed an adequate secretion of IFNγ which was further evaluated by IFNepitope (Dhanda et al., 2013) . Recombinant vaccines can initiate an allergic response or induce various types of toxicity. Therefore, evaluation of toxicity and allergenicity is an essential step for vaccine designing. Here, the allergenicity of the vaccine was calculated by AlgPred and AllerTop v.2 (Saha and Raghava, 2006; Dimitrov et al., 2014) . Toxicity was measured by Tox-inPred (Gupta et al., 2013) . The physicochemical analysis of the protein was executed via Prot-Param (Gasteiger et al., 2005) . The tertiary structure was generated through Contact-guided Iterative Threading ASSEmbly Refinement (C-I-TASSER) (Zheng et al., 2019) . The structure was refined by GalaxyRefine and validated with Procheck and ProSAWeb (Heo et al., 2016; Laskowski et al., 1993; Wiederstein and Sippl, 2007) . The crystal structure of TLR 4 was collected from Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) (www.rcsb.org). The PDB ID of the structure was 3FXI. For molecular docking analysis, only Chain A of TLR 4 was preserved. All heterogeneous atoms were removed. This preparation was executed by BIOVIA Discovery Studio (https://www.3dsbiovia.com/). To determine the active site pocket of Chain A, the PDB file was uploaded in Computer Atlas of Surface Topology of Proteins (CASTp) (Tian et al., 2018) . Finally, the vaccine and Chain A of TLR-4 was uploaded in ClusPro 2.0 for protein-protein docking (keeping the parameters default) (Kozakov et al., 2017) . The vaccine sequence was reversely translated and the codons were optimized for Escherichia coli (strain K12) through JCat (Grote et al., 2005) . The optimized DNA sequence was kept free from rhoindependent transcription terminators and prokaryotic ribosome binding sites. This sequence was flanked by Nde I and Xho I restriction sites. Stop codons were added at the end of 3 ′ OH or C terminal end. After these preparations, the DNA sequence was inserted in pET28a (+) Plasmid Vector via SnapGene tools (www.snapgene.com). The conserved RNA directed RNA polymerase (RdRp) sequence was retrieved from our submitted genome to develop small molecules. C-I-TASSER was employed to build the 3D model of RdRP (Zheng et al., 2019) . Thereafter, CASTp was implemented to identify the drug binding sites that were allowed to recognize the critical amino acids for drugprotein interactions (Tian et al., 2018) . DrugBank Database (www.drugbank.ca) was utilized to search for potent drugs against RdRP. Molecular docking of the receptor and the selected compounds were performed in AutodockVina to observe the binding affinity at the drug binding site of RdRp (Trott and Olson, 2010; Agrawal et al., 2001) . Here, AutoDock tools 1.5.6 was used to prepare the input pdbqt file for the receptor. A grid box parameter was set in size with X = 70, Y = 70, Z = 36 points (center grid box: x = 23.448, y = 53.587, z = -2.012, spacing = 0.5 A). The molecular visualization of protein-ligands was executed by BIOVIA Discovery Studio and PyMol (Seeliger and de Groot, 2010) . Osiris property explorer, Molinspiration, ACToR (Aggregated Computational Toxicology Resource), admetSAR (absorption, distribution, metabolism, excretion, and toxicity Structure-Activity Relationship database) and ACD/I-lab were exploited for the calculation of Absorption, distribution, metabolism, excretion (ADME) properties and toxicity (T) profile (Sander, 2001; Kumar Mishra, et al., 2011; Judson et al., 2008; Cheng et al.; Masunov, 2001) . ADMET properties are necessary to establish an effective drug. To design the effective siRNAs molecule, I-Score Designer was employed (Ichihara et al., 2007) . I-Score Designer computes nine different siRNA designing scores such as Ui-Tei (Ui-Tei et al., 2008), Amarzguioui (Amarzguioui and Prydz, 2004) , Hsieh (Hsieh et al., 2004) Takasaki (Takasaki et al., 2004) s-Biopredsi (Huesken et al., 2005) i-Score (Ichihara et al., 2007) Reynolds (Reynolds et al., 2004) Katoh (Katoh and Suzuki, 2007) and Design of SIRna (DSIR) (Vert et al., 2006) along with other essential parameters. The server also ranks the best siRNA molecules. From there, the best siRNA sequence was taken for further analysis. The secondary structure of the siRNA was predicted via RNA structure web server (Mathews et al., 2004) . RNAfold web server was applied to calculate the free energy of the thermodynamic ensemble for the secondary structures (Mathews and Turner, 2006) . Transcription and Translation Tool (http://biomodel.uah.es/en/lab/cybertory/analy sis/trans.htm) was implemented to transcribe the viral DNA sequences. Finally, the designated Antisense siRNAs were hybridized against viral RNA sequences using the HNADOCK server (He et al., 2019) . HNADOCK executed RNA-RNA docking and performed molecular dynamics simulations to refine the best 10 predicted siRNA-mRNA complexes. The model with the best docking score for Membrane Glycoprotein mRNA and Spike Glycoprotein mRNA was visualized with PyMol. The genome of SARS-Cov-2 was compared with SARS-CoV-1 and MERS-CoV. We Blasted SARS-CoV-2 against SARS-CoV-1 and MERS-CoV, using the Megablast and Discontiguous Megablast algorithm. The graphical representation of the side-by-side genome comparison is demonstrated in the Artemis Comparison Tool (Carver et al., 2005) . The Gene Expression profile (Accession Number: GSE5972) of SARS-CoV-1 was collected from NCBI. Afterward, the normalization study was performed in between 10 Healthy samples and 54 SARS patient samples, where recovered cases were excluded from this study. Limma (R package) was used to find the differentially expressed genes (Ritchie et al., 2015) . Integrated Differential Expression and Pathway Analysis (IDEP) tools were utilized to create Hierarchical Clustering Heatmap and Boxplot visualization of the data for both up-and down-regulated genes (Ge et al., 2018) . Kyoto Encyclopedia of Genes and Genomes (KEGG) and Enrichr4 were employed to identify the genes which were involved in pathways and Gene Ontology (GO) processes such as biological process (BP), molecular function (MF) and cellular component (CC) (Sherman et al., 2007; Kanehisa and Goto, 2000; Kuleshov et al., 2016; . The genes with GO accession ID &KEGG pathways were enlisted for further study. ShinyGO was utilized to construct a clustered tree of the top 30 GO terms for both up and down-regulated genes in BP, MF and CC (Ge et al.) . The down and up-regulated genes were analyzed using BP, MF and CC in gene ontology dataset to find out genes that are connected in cytokine regulation and viral production. INTERFEROME database was analyzed to recognize the Interferon Stimulating Genes (ISGs) and potential IFNs (Samarajiwa et al., 2009 ). Further, we tried to explore the pathway of ISGs which modulate the immune system and IFN regulation by Reactome (Jassal et al., 2020) . The workflow of the current study was shown in Fig. 1 The isolated and assembled genome was 29737 bp long. According to GISAID (www.gisaid.org/) this virus is a member of L clade. Through alignment and diversity observation, 3 proteins such as S, M and RdRP were collected. These protein-coding regions of the genome are relatively conserved and therapeutically potential (Fig. 2, Supplementary Fig. 1, Supplementary Data 1, 2, 3) . These proteins were further analyzed to develop vaccine, siRNAs and drugs. The selected M (Vaxijen score 0.55 and AntigenPro score 0.232547) and S (Vaxijen score 0.4695 and AntigenPro score 0.717053) proteins were predicted as "antigenic" by the assigned programs. The M and S have a total of 24 and 1214 outside amino acids respectively. Among those residues, one potential B cell epitope NGTITVEELKKLLEQ was identified in M (Table 1) . Four possible B-cell epitopes (HAPATVCGPKKSTN, NNSYECDIP, FYEPQIITTD, and VEGFNCYFPLQ) were found in S (Table 1) . The selected four Cytotoxic T-cell (CTL) epitopes from M and S cover HLA-A12, A24, A26, B58, B8 alleles. Moreover, Helper T-cell (HTL) epitopes can interact with HLA DRB1-0101, DPA10103-DPB10301, DQA10101-DQB10501 and DRB1-1501 (Supplementary Table 1 ). The chimeric vaccine exhibited a suitable binding with Toll-Like Receptor-4 (TLR4) (Fig. 3A) . A single dose injection of the vaccine has demonstrated an adequate release of IFNγ in the C-ImmSim immune system simulator (Fig. 3B) . The IFNepitope also showed 127 IFNγ inducing epitopes by different methods. Additionally, Interleukin (IL)-2 was also released significantly for lymphocyte differentiation. These differentiations are mentioned as danger signal D (Sympshon Index) (Fig. 3B) . After the release of inflammatory cytokines, the immune system released anti-inflammatory IL-10 at the 2.5th day and consequently generated Immunoglobulin M (IgM) and later IgG (Fig. 3C ). The generation of memory B cells was also observed (Fig. 3D) . The vaccine specifically stimulated the differentiation of T helper cell-1 (Th1) with a certain level of regulatory T cell (Fig. 3E ). Memory Helper T Cell was also produced within 2.5 days ( Fig. 3F and 3G ). The vaccine does not contain any major allergenic properties according to AlgPred and AllerTop. ToxinPred identified 343 non-toxic peptides in different combinations and only 11 peptides with toxic properties (Supplementary Table 2 ). The vaccine is stable according to the instability index of ProtParam (Supplementary Table 3 Table 3 ). The refined three dimensional (3D) structure of the vaccine has a 1.905 MolProbity score (provided by GalaxyWEB), − 2.62 Z-Score and 87.9% residues in the most favored regions of the Ramachandran Plot ( Supplementary Figs. 2 and 3 ). When this protein was docked against TLR 4 (Chain A) via ClusPro 2.0, it interacted with the receptor via active site pocket and HBD-2 with the lowest energy score − 1401.1. The DNA sequence of the vaccine has 1089 nucleotides that demonstrated 0.96 Codon Adaptation Index (CAI) with 54.26% GC content. Here, the GC content of the host organism E. coli strain K12 is 50.73 (Fig. 4) . This sequence was taken under modifications for cloning. Six experimental compounds were collected from the DrugBank. The chemical properties of the compounds were suitable for the drug candidates to work against RdRp (Supplementary Table 4 ). RdRp and those six compounds went under molecular docking analysis to identify the binding affinities. Strong binding affinity was observed for benzyl (2oxopropyl)carbamate (-4.2 Kcal/mol) (Fig. 5A ), 2-[(2,4-dichloro-5methylphenyl)sulfonyl]-1,3-dinitro-5-(trifluoromethyl)benzene (-7.6 Kcal/mol) (Fig. 5B ), s-[5-(trifluoromethyl)-4h-1,2,4-triazol-3-yl]5-(phenylethynyl)furan-2-carbothioate (-5.9 Kcal/mol) (Fig. 5C) , 5amino-2-methyl-N-[(1R)-1-naphthalen-1-ylethyl]benzamide (-6.3 Kcal/mol) (Fig. 5D ), Nalpha-[(benzyloxy)carbonyl]-n-[(1r)-4-hydroxy-1-methyl-2-oxobutyl]-l-phenylalaninamide (-7.4 Kcal/mol) (Fig. 5E ), 4-(Dimethylamino)benzoic acid (-4.0 Kcal/mol) (Fig. 5F ). Fifteen common amino acid residues including Val473, Glu474, Asp477, Lys478, Leu636, Val637, Arg735, Asp736, Val737, Asp304, Arg305, Arg640, His642, Arg651 and Leu648 of RdRp interacted with the druggable Table 5 ). The Sense and Antisense RNA molecules of both M and S were 19mer and 21mer respectively ( Supplementary Tables 6 and 7) . The siRNA generated a hairpin-like secondary structure at 310.15 K temperature. The free energy of the thermodynamic ensembles was − 18.30 kcal/mol and − 17.90 kcal/mol respectively for M and S siRNA secondary structures. Results of HNADOCK server demonstrate that the antisense siR-NAs can interact with the three dimensional (3D) viral mRNA molecules with around − 260 docking score (Fig. 6) . The whole genome of SARS-CoV-1 and MERS-CoV were compared with SARS-CoV-2. SARS-CoV-1 and SARS-CoV-2 share 88% query sequence coverage with an 82.30% sequence identity. Besides, MERS-CoV shares only 29% query coverage with a 67.06% sequence identity ( Supplementary Fig. 4) . Genome alignment showed higher homology between SARS-CoV-1 and SARS-CoV-2 with a very few synteny break region. Whereas, MERS-CoV shares low scoring homology with SARS-CoV-2 with a larger synteny breakage area. The entry of SARS-CoV-2 and SARS-CoV-1 are similar so differential gene expression investigation was run on SARS-CoV-1 (Fig. 7) . 64 different samples were normalized and the median was centralized (Fig. 8A) . 19,200 differentially expressed genes were found ( Fig. 8B and Supplementary Fig. 5 ). Genes with official human gene symbols (HGNC) and significant p-value (<0.05) were selected (Supplementary Data 4 and 5). 1265 genes were considered for further analysis, among them, 616 genes were up-regulated (Supplementary Data 6) and 649 genes were down-regulated (Supplementary Data 7) . In the hierarchical clustering heatmap, genes based on LogFC, t-stat and adjusted p-value were clustered together (Fig. 8C, D, E and F) . The genes that we have considered as up-regulated seem to have a major frequency around LogFC≈ 2.1 and t value≈ 2.8, which dictates significant up-regulation (Supplementary Data 8) . On the other side, the down-regulated genes seem to have a major frequency around LogFC≈ The biological process of GO reported that 91 up-regulated genes and Table 1 The potential B cell, Cytotoxic T cell and Helper T cell epitopes. COVID-19 pandemic has destabilized the social, political and economic spheres globally (Ren et al., 2020; Berera and Zambon, 2013; Andersen et al., 2020) . SARS-CoV-2, the causative agent of this pandemic, can undergo frequent mutations, adaptations and transmission, which will, unfortunately, pose a continuous risk of outbreaks (Gurwitz et al., 2020) . To confront these future risks, the sustainable development of therapeutics against this deadly virus is necessary. In this study, we have exploited a large number of SARS-CoV-2 genome sequences including one from our group and merged several computational approaches to propose therapeutics that could be effective against all the strains of SARS-CoV-2 (Fig. 1) . To achieve this goal of therapeutic efficacy, we have identified some therapeutic regions that have a decent level of conservation in the genome. Alongside this, whole genomes of SARS-CoV-2 from different countries were explored to find out their conserved region (Supplementary Data 1) . Among all the strains, several consensus regions were obtained (Supplementary Data 2, 3 and Supplementary Fig. 1) . The proposed chimeric vaccine, drug compounds and siRNA therapeutics were developed against those conserved regions (Supplementary Data 3) . Most other studies focused only on the S protein for vaccine development. This study targets both S and M proteins for vaccine and siRNA development. The vaccine candidates contain several well-conserved B and T cell epitopes (Table 1, Supplementary Table 1 ). The epitopes were joined to design a chimeric vaccine which is expected to elicit TLR4 mediated Th1 antiviral response. Immune simulation analysis showing that this vaccine can produce memory B and T cells (Fig. 3D & 3F) . The vaccine is expected to induce inflammatory and anti-inflammatory cytokines in sufficient amounts with non-allergen activities. Since the vaccine contains very few toxic peptides (Supplementary Table 2 ), in vitro and in vivo validation are required before the actual application. Again, the constructed expression vector should successfully produce adequate vaccine molecules in E. coli strain K12, considering that the half-life of the protein is >10 h inside the bacterial host with an acceptable stability index (Fig. 4) . A universal drug is necessary to reduce the post-infection consequences of COVID-19. For coronaviruses, RdRp is an attractive therapeutic target that catalyzes the replication of RNA from the RNA template. Thus, we targeted the conserved RNA directed RdRP protein from SARS-CoV-2 (Supplementary Data 3). The virtual screening directed by the RdRP protein identified six experimental compounds from the DrugBank server (Supplementary Table 4 ). Later, the molecular docking analysis revealed a higher binding affinity between the compounds and the RdRp receptor. All the interacting residues between the drug compounds and the RdRp receptor were located in the predicted drug binding site (Fig. 5) . Among the ADMET properties, all the experimental compounds exhibited sufficient human intestinal absorption, a higher metabolic rate, no blood-brain barrier permeability (Supplementary Table 5 ). Moreover, the toxicity levels of these compounds appeared to be safe for humans (Supplementary Table 5 ). These characteristics are significant for drug development since poor ADMET properties could exclude a drug in phase III clinical trial. RNA interference (RNAi) pathway is very efficient and specific in viral gene silencing, but only in post-transcription. Here, siRNAs (small interfering RNAs) can play a therapeutic role (Elbashir et al., 2001) . The mRNA sequences have potential sense siRNA strands CAGUAUAAUUAAUAACUAA and GGUUUUUGUAUAUAAUUAA from S and M, respectively (Supplementary Tables 6 and 7) . Therefore, the 19mer and 21mer nucleotides of the sense and antisense strands were designed based on siRNA universal rules. The thermodynamic value and strong interaction between antisense and viral mRNA strongly support its prospective application ( Fig. 6A and 6B ). These siRNAs might be Fig. 7 . The whole-genome comparison among the SARS-CoV-1, SARS-CoV-2 and MERS CoV. MERS seems to be distantly related as it showed only 29% similarity with the SARS-CoV-2. Besides, SARS-CoV shared 88% similarity with the newly evolved SARS-CoV-2 virus. The variation found only in ORF1ab, Spike and ORF8 region of SARS-CoV-2 with SARS-CoV. Alternatively, the orf1ab region from MERS is similar to SARS-CoV-2. conjugated with aptamer or delivered through various delivery systems to inhibit the intracellular growth of SARS-CoV-2. IFNs instigate an array of antiviral effectors. Though every virus contains a distinctive property, partly overlapping antiviral IFN Stimulating Genes (also known as "ISG profile") could be a potential target (Jassal et al., 2020) . IFN effectors target various phases of the virus life cycle to interfere with viral infection and proliferation. To identify the ISG and IFN of COVID-19 cases, the gene expression profile of COVID-19 patients was needed, but it was not available. Therefore, three genomes of Betacoronavirus that have common virulence patterns, signs and symptoms were taken, including SARS-CoV-1, MERS-CoV and SARS-CoV-2. A comparison of their genome was performed. The whole- genome comparison between SARS-CoV-1 and SARS-CoV-2 showed 88% similarity with each other (Fig. 7 and Supplementary Fig. 4) . Therefore, the SARS-CoV-1 expression profile was investigated to identify ISG and IFNs. Based on this connection, we analyzed the genome-wide expression profile of SARS patients which unveiled suitable genes/receptors for the IFN-based therapeutics (Supplementary Data 4-9). A total of 167 immunoregulatory genes (91 up-regulated genes and 76 down-regulated genes) were found from the biological process of GO (Fig. 8, Supplementary Fig. 5, Supplementary Data 9-17) . We then identified the IFN stimulating genes KPNA4, IRF7, OAS3, CAMK2G, IFITM1, IFIT1, EGR1, MX1 and IFNAR2 from the up-regulated immuno-regulatory genes and TRIM8, VCAM1, IFNGR2, JAK1 from the down-regulated immuno-regulatory genes (Fig. 9) . The functions of these immune-regulatory genes can be modulated by the interferons IFN alpha, IFN beta and IFN gamma which can directly induce the antiviral pathways and immunological responses (Fig. 9, Supplementary Fig. 6 , Supplementary Data 17-25) . This analysis provides deep insight into the type 1 IFN based therapy for COVID-19 patients. The present study has proposed several therapeutics including chimeric vaccine, potent drugs, siRNAs and ISG targets against COVID-19. This study can be used as a convenient source by the researchers and pharmaceutical companies for developing novel therapeutics against SARS-CoV-2. We hope this in silico pipeline will also help many researchers to combat against other viral outbreaks in the future. MS: Conceived, designed, and guided the study, analyzed the data, drafting the manuscript and performed critical revision. CAK and KCD: Guided the study, acquisition and analyzed the data, helped in drafting the manuscript. MM and MGM: Helped in Bioinformatics analysis and helped in drafting the manuscript. MUH, AB, MTHE, ZMC: Helped to design the study, performed bioinformatics analysis, drafted, and developed the manuscript and performed critical revision. All authors have approved the manuscript. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. All applicable international, national, and/or institutional guidelines for the care and use of human patients were followed. The Ethical approval number is NIBREC2020-02. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Fig. 9 . Interferon stimulating genes (ISGs) in the pathway. Overrepresented pathway of interferon regulation by overexpressed and under expressed ISGs. The gene list including KPNA4, IRF7, OAS3, CAMK2G, IFITM1, IFIT1, EGR1, MX1 and IFNAR2 from the upregulated genes and TRIM8, VCAM1, IFNGR2 and JAK1 from the downregulated genes might stimulate the Interferon against the SARS-CoV-2. Here, the possible pathway mechanism has been explored to induce the immune response by the Interferon stimulating genes (ISGs). Comparative genomics allowed the identification of drug targets against human fungal pathogens The Galaxy platform for accessible, reproducible and collaborative biomedical analyses The 3 ′ end of hepatitis E virus (HEV) genome binds specifically to the viral RNA-dependent RNA polymerase (RdRp) An algorithm for selection of functional siRNA sequences The proximal origin of SARS-CoV-2 SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing Antivirals in the 2009 pandemic-lessons and implications for future strategies Insight of druggable cannabinoids against estrogen receptor β in breast cancer Trimmomatic: a flexible trimmer for Illumina sequence data The FDA-approved drug ivermectin inhibits the replication of SARS-CoV-2 in vitro ACT: the Artemis comparison tool Features, evaluation and treatment coronavirus (COVID-19) a review of coronaviruses and toroviruses. InCoronaviruses with Special Emphasis on First Insights Concerning SARS 2005 Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties Origin and evolution of pathogenic coronaviruses Designing of interferon-gamma inducing MHC class-II binders AllerTOP v. 2-a server for in silico prediction of allergens Identifying candidate subunit vaccines using an alignment-independent method based on principal amino acid properties A review of its discovery and development leading to emergency use authorization for treatment of COVID-19 Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells Data, disease and diplomacy: GISAID's innovative contribution to global health Bat Coronaviruses in China Human coronavirus: host-pathogen interaction Chloroquine phosphate has shown apparent efficacy in treatment of COVID-19 associated pneumonia in clinical studies Protein identification and analysis tools on the ExPASy server ShinyGO: a graphical enrichment tool for animals and plants. bioRxiv iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data JCat: a novel tool to adapt codon usage of a target gene to its potential expression host Open Source Drug Discovery Consortium. In silico approach for predicting toxicity of peptides and proteins Angiotensin receptor blockers as tentative SARS-CoV-2 therapeutics. Drug development research HNADOCK: a nucleic acid docking server for modeling RNA/DNA-RNA/DNA 3D complex structures GalaxyRefineComplex: Refinement of proteinprotein complex model structures driven by interface repacking SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor Pathway based therapeutic targets identification and development of an interactive database CampyNIBase of Campylobacter jejuni RM1221 through non-redundant protein dataset A library of sirna duplexes targeting the phosphoinositide 3-kinase pathway: determinants of gene silencing for use in cell-based screens Design of a genomewide siRNA library using an artificial neural network Thermodynamic instability of siRNA duplex is a prerequisite for dependable prediction of siRNA activities The reactome pathway knowledgebase Improved methods for predicting peptide binding affinity to MHC class II molecules BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes ACToR-aggregated computational toxicology resource KEGG: kyoto encyclopedia of genes and genomes MAFFT multiple sequence alignment software version 7: improvements in performance and usability Specific residues at every third position of siRNA shape its efficient RNAi activity The ClusPro web server for protein-protein docking Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes Enrichr: a comprehensive gene set enrichment analysis web server 2016 update Prediction of specificity and crossreactivity of kinase inhibitors Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): the epidemic and the challenges Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction PROCHECK: a program to check the stereochemical quality of protein structures Novel bat alphacoronaviruses in Southern China support chinese horseshoe bats as an important reservoir for potential novel coronaviruses Immunogenicity and protection efficacy of monomeric and trimeric recombinant SARS coronavirus spike protein subunit vaccine candidates High-throughput prediction of protein antigenicity using protein microarray data ACD/I-Lab 4.5: an internet service review Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure Prediction of RNA secondary structure by free energy minimization Coding-Complete Genome Sequence of SARS-CoV-2 Isolate from Bangladesh by Sanger Sequencing Human β-defensin-2 functions as a chemotactic agent for tumour necrosis factor-α-treated human neutrophils FastTree 2-approximately maximumlikelihood trees for large alignments Computational immunology meets bioinformatics: the use of prediction tools for molecular binding in the simulation of the immune system Fear can be more harmful than the severe acute respiratory syndrome coronavirus 2 in controlling the corona virus disease 2019 epidemic Rational siRNA design for RNA interference limma powers differential expression analyses for RNA-sequencing and microarray studies AlgPred: prediction of allergenic proteins and mapping of IgE epitopes INTERFEROME: the database of interferon regulated genes Ligand docking and binding site analysis with PyMOL and Autodock/Vina The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists In-silico design of a multi-epitope vaccine candidate against onchocerciasis and related filarial diseases An effective method for selecting siRNA target sequences in mammalian cells Detection of novel Coronavirus by RT-PCR in stool specimen from asymptomatic child CASTp 3.0: computed atlas of surface topography of proteins AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading Functional dissection of siRNA sequence by systematic DNA substitution: modified siRNA with a DNA seed arm is a powerful tool for mammalian gene silencing with significantly reduced off-target effect An accurate and interpretable model for siRNA efficacy prediction ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins The outbreak of COVID-19: an overview Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention Endogenous toll-like receptor ligands and their biological significance Deep-learning contact-map guided protein structure prediction in CASP13 The authors are grateful to the Ministry of Science and Technology, Bangladesh for extensive support during this work. Supplementary data to this article can be found online at https://doi. org/10.1016/j.gene.2020.145368.