key: cord-1039954-tgods0wq
authors: Saini, Sandeep; Saini, Avneet; Thakur, Chander Jyoti; Kumar, Varinder; Gupta, Rishabh Dilip; Sharma, Jogesh Kumar
title: Genome-wide computational prediction of miRNAs in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) revealed target genes involved in pulmonary vasculature and antiviral innate immunity
date: 2020-06-03
journal: Mol Biol Res Commun
DOI: 10.22099/mbrc.2020.36507.1487
sha: bc0499634ef0834e3307f0c4ed730cc2fd939da1
doc_id: 1039954
cord_uid: tgods0wq

The current outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which began in China, has threatened humankind worldwide. Coronaviruses contain the largest genomes among known RNA viruses, so analyzing the genome sequence of SARS-CoV-2 can help in understanding the disease etiology. In this study, we used the ab initio computational tool VMir to scan the complete genome of SARS-CoV-2 and predict pre-miRNAs. Potential pre-miRNAs were identified by ViralmiR and mature miRNAs were recognized by Mature Bayes. The predicted mature miRNAs were then analysed against the human genome with the miRDB server to retrieve target genes. We also retrieved GO (Gene Ontology) terms for pathways, functions and cellular components. We predicted 26 mature miRNAs from the genome of SARS-CoV-2 that target human genes involved in pathways such as EGF receptor signaling, apoptosis signaling, VEGF signaling and FGF receptor signaling. Gene enrichment analysis and substantial literature evidence suggest roles for genes such as BMPR2 and p53 in pulmonary vasculature and antiviral innate immunity, respectively. Our findings may help the research community to understand virus pathogenesis.

The current outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been confirmed in 1,210,956 people, and 67,594 deaths have been reported worldwide to date [1].
Genome sequence analysis of SARS-CoV-2 relates it to the clade of previously identified SARS-CoV (severe acute respiratory syndrome-related coronavirus; bat-SL-CoVZC45 and bat-SL-CoVZXC21), and it was therefore classified under the family Coronaviridae and genus Betacoronavirus. Furthermore, phylogenetic analysis pointed to bats as the original host of the virus, although the possibility of an intermediate host animal has also been proposed [2-4]. A brief survey of previously identified coronaviruses indicated six human coronavirus (HCoV) types: 229E, OC43, NL63, HKU1, SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV). SARS-CoV and MERS-CoV are of zoonotic origin and caused earlier outbreaks in 2003 (China) and 2012 (Saudi Arabia) [5, 6]. The clinical symptoms of the current disease include fever, cough and dyspnoea, which after chest radiography is diagnosed as viral pneumonia; the disease has been named COVID-19 (coronavirus disease 2019) by WHO [7, 8]. Virus infection can be diagnosed with a newly available real-time RT-PCR assay [9].

MicroRNAs (miRNAs) are small (~22 nt) non-coding RNAs that play a role in post-transcriptional gene regulation by binding to complementary sites on mRNA. Depending on the degree of complementarity, binding may result either in inhibition of translation or in cleavage of the mRNA [10, 11]. The occurrence of miRNAs in plants, animals and fungi has been documented previously [12, 13]. Furthermore, virus-encoded miRNAs acting on host defense mechanisms, cell differentiation, apoptosis and cell proliferation have been reported in the literature for different virus families and genera [14]. The role of miRNAs in inducing lung pathology, a characteristic symptom of SARS-CoV infection, was identified previously by analyzing deep sequencing data from the lungs of SARS-CoV-MA15-infected BALB/c mice.
Small viral RNAs 18-22 nucleotides long were identified from genomic regions of SARS-CoV. Interestingly, these small RNAs were found to target specific sequences in the 3' UTRs of host cellular mRNAs, and in vivo inhibition of these small viral RNAs with antisense inhibitors produced a significant decrease in pulmonary inflammation [15]. Furthermore, the existence of a nuclear phase in the life cycle of SARS viruses was proposed after SARS-CoV was isolated from the nucleus of Vero E6 cells [16]. However, experimental miRNA identification relies on expression in specific cell types and on time-consuming cloning techniques. Given the urgent need to understand the disease etiology, computational miRNA prediction can provide early evidence from genome analysis in a timely manner, so that the impact of miRNAs on disease etiology can be traced during outbreaks or emergence [17, 18]. Two main approaches have been used for computational miRNA prediction: ab initio and homology based. The homology-based approach depends on evolutionary conservation and is therefore limited in locating novel miRNAs in a genome. The ab initio approach instead scans the genome for hairpin-loop folds and can detect novel pre-miRNAs, which makes it better suited to novel miRNA discovery [19, 20]. To date, RNA virus-encoded miRNAs have been predicted in Hepatitis A virus (HAV), Hepatitis E virus (HEV), Dengue virus (DENV), Zika virus (ZIKV), Ebola virus, Japanese encephalitis virus (JEV), Kyasanur forest disease virus (KFDV) and Nipah virus [21-28]. Therefore, in this study the genome of SARS-CoV-2 was analysed to predict mature viral miRNAs. The predicted mature viral miRNAs were also scanned for target genes in the human genome, and these targets were further analysed for gene ontology.
The complete genome sequence of SARS-CoV-2 was retrieved from the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome/) using accession number MN908947. The genome is a positive-sense, single-stranded RNA molecule with linear topology and a length of 29,903 nucleotides (nt).

An ab initio pre-miRNA prediction software package, VMir (v2.3), was used to identify SARS-CoV-2 pre-miRNAs. The VMir package contains two modules, VMir analyzer and VMir viewer, for predicting and viewing pre-miRNAs, respectively [29]. The analysis was run with default parameters (window count: 500, conformation: linear, orientation: both) in VMir analyzer. Filtering parameters (minimum hairpin size: 70 nt, minimum score: 150, minimum window count: 35) were then applied in VMir viewer to retain the top-scoring pre-miRNAs, as described previously in the literature [21, 25].

Potential pre-miRNAs were identified using ViralmiR (http://csb.cse.yzu.edu.tw/viralmir/), an SVM (support vector machine) based web server. ViralmiR was designed specifically for viruses; its SVM model was trained on sequence and structural features of an experimentally validated pre-miRNA data set [30]. The Mfold web server (http://unafold.rna.albany.edu/?q=mfold) was used with default parameters to predict the secondary structure (Supplementary Fig. S1) and minimum free energy (MFE) of the pre-miRNAs [31].

The Mature Bayes web server (http://mirna.imbb.forth.gr/MatureBayes.html) was used to identify mature miRNAs from the filtered pre-miRNAs. Mature Bayes uses a Naive Bayes Classifier (NBC) and takes into account sequence as well as structural information from experimentally identified miRNA precursors to deduce mature miRNAs from precursors [32].

Target prediction of mature miRNAs against the human genome was performed with the online web server miRDB (http://mirdb.org/). The custom prediction module of the server was used to predict target genes in human.
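To make seed-based target scanning concrete, the following Python sketch searches a 3' UTR for canonical 7-mer seed sites, that is, stretches complementary to miRNA nucleotides 2-8. It is a simplified illustration only and not the MirTarget algorithm behind miRDB; the function names and the toy sequences are hypothetical and are not taken from this study.

```python
from typing import List

# Watson-Crick pairs for RNA; G:U wobble pairs are deliberately ignored here.
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq: str) -> str:
    """Upper-case a sequence and convert DNA 'T' to RNA 'U'."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA (or DNA) sequence as RNA."""
    return "".join(_COMPLEMENT[base] for base in reversed(to_rna(seq)))


def seed_sites(mirna: str, utr: str) -> List[int]:
    """Return 0-based start positions in `utr` that pair with miRNA nt 2-8."""
    seed = to_rna(mirna)[1:8]          # nucleotides 2..8 of the miRNA
    site = reverse_complement(seed)    # sequence the seed would pair with
    target = to_rna(utr)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping hits
    return positions


# Toy example (hypothetical sequences, for illustration only).
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
example_utr = "AAAACUACCUCAAAA"
print(seed_sites(example_mirna, example_utr))  # [4]
```

A real predictor would combine such seed hits with further sequence and context features before assigning a score, so the positions returned here are candidates rather than predictions.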
The server uses a seed-matching approach and scans the viral mature miRNAs against the 3' untranslated regions (3' UTRs) of human genes for possible hybridization [33]. GO analysis of the target genes was performed using PANTHER (Protein Analysis through Evolutionary Relationships) (http://www.pantherdb.org) and Enrichr (https://amp.pharm.mssm.edu/Enrichr/) to explore the roles of the target gene products in biological processes, molecular functions, cellular components and pathways [34, 35]. NCBI Gene IDs were used in this analysis to find GO terms related to the gene products. Associations between the screened target genes and related pathways were established from literature evidence to deduce disease etiology.

VMir analysis predicted a total of 1114 hairpin-like pre-miRNA folds in the SARS-CoV-2 genome, which were filtered using the parameters described in the methodology above. After filtering in VMir viewer, only the top 13 pre-miRNAs were selected for further study. Nine pre-miRNAs were found on the direct strand and four on the reverse strand. All 13 pre-miRNAs were 78-148 nt in length. Their sequence, rank, score, length and orientation are listed in Table 1. Ab initio tools are prone to false-positive pre-miRNA predictions because they can select pseudo hairpin-loop structures [36, 37]. To validate the predictions and find reliable candidates, all 13 predicted pre-miRNAs were therefore analysed with ViralmiR to classify them as real or pseudo viral pre-miRNAs. All 13 were classified as real or potential pre-miRNA folds, which was further supported by assessment of the minimum free energy (MFE) with Mfold (Table 2). Because the folding of a pre-miRNA sequence is one of the features that confer stability on the structure, calculating the MFE increases confidence in the authenticity of the predicted pre-miRNAs [38]. After authentication of the pre-miRNAs, the Mature Bayes server was used to retrieve the mature miRNAs.
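The VMir viewer filtering step reported above (hairpin size of at least 70 nt, score of at least 150, window count of at least 35) amounts to a simple threshold filter followed by ranking on score. The Python sketch below shows that logic under stated assumptions: the `Hairpin` record, its field names and the example values are hypothetical, and VMir's actual output format is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds as stated in the text for VMir viewer.
MIN_HAIRPIN_SIZE = 70   # nt
MIN_SCORE = 150.0
MIN_WINDOW_COUNT = 35


@dataclass(frozen=True)
class Hairpin:
    """Hypothetical record for one predicted hairpin (not VMir's own format)."""
    name: str
    size: int            # hairpin length in nt
    score: float         # prediction score
    window_count: int    # number of analysis windows containing the hairpin
    orientation: str     # "direct" or "reverse"


def filter_hairpins(candidates: Iterable[Hairpin]) -> List[Hairpin]:
    """Keep hairpins that pass all three thresholds, highest score first."""
    kept = [
        h for h in candidates
        if h.size >= MIN_HAIRPIN_SIZE
        and h.score >= MIN_SCORE
        and h.window_count >= MIN_WINDOW_COUNT
    ]
    return sorted(kept, key=lambda h: h.score, reverse=True)


# Made-up candidates, for illustration only.
candidates = [
    Hairpin("hp_a", 80, 200.0, 40, "direct"),
    Hairpin("hp_b", 65, 300.0, 50, "direct"),    # too short
    Hairpin("hp_c", 120, 149.9, 60, "reverse"),  # score below threshold
    Hairpin("hp_d", 100, 250.0, 35, "reverse"),
    Hairpin("hp_e", 90, 180.0, 34, "direct"),    # too few windows
]
print([h.name for h in filter_hairpins(candidates)])  # ['hp_d', 'hp_a']
```

Keeping the thresholds as named constants makes it easy to see how sensitive the number of retained candidates is to each cut-off.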
A total of 26 mature miRNAs were obtained from the 5' and 3' stems of the 13 precursors, as shown in Table 3. As reported in the literature, one or both strands can serve as the mature miRNA depending on assembly of the RISC complex; we therefore retained both for further analysis [39].

Table 3: Mature miRNA length, location and sequence as predicted by MatureBayes

Computational prediction of miRNA-mRNA binding depends on Watson-Crick base pairing, which is mostly implemented using a seed-pairing approach [40]. miRDB also adopts a 7-mer seed approach through the MirTarget algorithm. Using the mature miRNA sequences in custom prediction, we predicted 1059 human target genes with binding sites in their 3' UTRs (Supplementary Table S1). We selected the top-scoring target genes with a prediction score >80, because predictions above this threshold are most likely to be real and do not require any other supporting evidence [33].

Gene Ontology terms for the target genes were identified with the PANTHER database, which clusters and groups them into biological process (Fig. 1a), molecular function (Fig. 1b) and cellular component (Fig. 1c). Biological processes important for antiviral responses include immune system process (GO:0002376), metabolic process (GO:0008152), biological adhesion (GO:0022610), biological regulation (GO:0065007) and response to stimulus (GO:0050896). Molecular functions were classified into eight categories, chiefly comprising transporter activity (GO:0005215), catalytic activity (GO:0003824), translation regulator activity (GO:0045182), transcription regulator activity (GO:0140110) and binding (GO:0005488). Cellular components encompassed eight subcellular components, including extracellular region (GO:0005576), cell (GO:0005623) and organelle (GO:0043226).

Target genes were further evaluated using Enrichr, a gene list enrichment analysis tool, which retrieved a total of 82 pathways. A few important pathways associated with the target genes at p < 0.1 are listed in Fig. 1d. These pathways are important in the human immune response to virus infection and include angiogenesis (P000050), EGF receptor signaling pathway (P000180), apoptosis signaling pathway (P000060), VEGF signaling pathway (P000560), FGF receptor signaling pathway (P00021) and CCKR signaling map ST pathway (P06959). Screened target genes and their associated roles that may be involved in virus pathogenesis are listed in Table 4, along with literature evidence.

In the genomic age, there are more ways to find and study miRNAs; the most prominent is genome-wide identification of these small non-coding RNAs [41]. The ab initio prediction approach can even detect miRNAs that were not identified in cloning experiments because of low expression [42]. There is now ample evidence that several viruses encode miRNAs that directly downregulate the expression of genes involved in immunological, apoptosis, axon guidance and cell differentiation pathways [43, 44]. The significance of miRNAs in virus-induced respiratory infection and immune regulation has been established previously [15]. In this study, the genome of the recent outbreak virus SARS-CoV-2 was analysed to explore the role of miRNAs in acute respiratory syndrome. We found significant pathways that may contribute to disease etiology. For example, apoptosis plays an important role in physiological processes and in the pathogenesis of infectious diseases caused by viruses. nCoV-MD3-3P targets p53, which acts as a main inducer of the apoptosis pathway during viral infection [45]. p53-dependent apoptosis has been reported to control infection by herpes simplex virus (HSV), vesicular stomatitis virus (VSV) and poliovirus [46]. The tumor suppressor p53 (TP53) diminishes viral replication and spread and upregulates many genes that are transcriptional targets of type I IFN, which suggests a role for p53 in innate immunity [47].
nCoV-MD241-3P targets BMPR2 (bone morphogenetic protein receptor type 2), which is involved in the transforming growth factor (TGF)-β signaling pathway. Upon viral infection, BMPR2 is suppressed, which impairs pulmonary vascular homeostasis [48]. Previous studies on different viral miRNAs and the silencing of their target genes have revealed informative details, particularly about disease etiology [49-51]. Above all, the detection in in vitro studies of the same miRNAs found by computational means builds confidence in in silico approaches [52].

To our knowledge, this is the first paper on computational prediction of mature miRNAs from the SARS-CoV-2 genome, and we found that the predicted mature miRNAs target a large number of significant human genes. The main findings of the work are two target genes, BMPR2 and TP53, which are involved in pulmonary vasculature and antiviral innate immunity, respectively. Inhibition of these two target genes by the predicted viral miRNAs may induce respiratory lung disease pathology and decrease the antiviral response of the body. This study may help in exploring the disease manifestation.

WHO.
Coronavirus disease 2019 (COVID-19) Situation Report-77
Genomic characterization and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
The first disease X is caused by a highly transmissible acute respiratory syndrome Coronavirus
Bat origin of a new human coronavirus: there and back again
A mini review of the zoonotic threat potential of Influenza viruses, Coronaviruses, Adenoviruses, and Enteroviruses
Human Coronaviruses: A review of virus-host interactions
A novel coronavirus (2019-nCoV) causing pneumonia-associated respiratory syndrome
A distinct name is needed for the new coronavirus
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
The mechanics of miRNA-mediated gene silencing: a look under the hood of miRISC
MicroRNAs: Small RNAs with a big role in gene regulation
Plant microRNAs-novel players in natural medicine?
The evolutionary origin of plant and animal microRNAs
Virus-encoded microRNAs: an overview and a look to the future
SARS-CoV-encoded small RNAs contribute to infection-associated lung pathology
The life cycle of SARS coronavirus in Vero E6 cells
MicroRNAs: genomics, biogenesis, mechanism, and function
Cloning and identification of a microRNA cluster within the latency-associated region of Kaposi's sarcoma-associated herpesvirus
Computational methods for ab initio detection of microRNAs
MicroRNA: biological and computational perspective
Identification and validation of a novel microRNA-like molecule derived from a cytoplasmic RNA virus antigenome by bioinformatics and experimental approaches
Computational identification of hepatitis E virus-encoded microRNAs and their targets in human
Computational identification of Dengue virus microRNA-like structures and their cellular targets
Genome-wide prediction of microRNAs in Zika virus genomes reveals possible interactions with human genes involved in the nervous system development
Systematic Genome-wide Screening and Prediction of microRNAs in EBOV During the
In silico identification of miRNAs and their target prediction from Japanese encephalitis
Genome wide computational prediction of miRNAs in Kyasanur forest disease virus and their targeted genes in human
Computational prediction of miRNAs in Nipah virus genome reveals possible interaction with human genes involved in encephalitis
A combined computational and microarray-based approach identifies novel microRNAs encoded by human gamma-herpesviruses
ViralmiR: A support-vector-machine-based method for predicting viral microRNA precursors
Mfold web server for nucleic acid folding and hybridization prediction
MatureBayes: a probabilistic algorithm for identifying the mature miRNA within novel precursors
miRDB: An online resource for microRNA target prediction and functional annotations
The gene ontology (GO) database and informatics resource
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update
A review of computational tools in microRNA discovery
On the performance of pre-microRNA detection algorithms
The discriminant power of RNA features for pre-miRNA recognition
The fate of miRNA* strand through evolutionary analysis: implication for degradation as merely carrier strand or potential regulatory molecule?
The hunting of targets: Challenge in miRNA research
Two decades of miRNA biology: Lessons and challenges
Virus-encoded microRNAs
Widespread evidence of viral miRNAs targeting host pathways
MicroRNAs as mediators of viral immune evasion
Potential roles of apoptosis in viral pathogenesis
Viral Infection and Apoptosis
Dual role of p53 in innate antiviral immunity
Consequences of BMPR2 deficiency in the pulmonary vasculature and beyond: Contributions to pulmonary arterial hypertension
Herpesviral microRNAs in cellular metabolism and immune responses
Herpesviruses and microRNAs: New pathogenesis factors in oral infection and disease?
In silico analysis revealed Zika virus miRNAs associated with viral pathogenesis through alteration of host genes involved in immune response and neurological functions
Ebola virus produces discrete small non-coding RNAs independent of the host microRNA pathway and which lack RNA interference activity in bat and human cells

The authors declare that they have no conflict of interest.