key: cord-0033974-x48huy5f authors: Ji, Fengmin; Luo, Liaofu title: Prediction for Target Sites of Small Interfering RNA Duplexes in SARS Coronavirus date: 2004-01-16 journal: Genome Biol DOI: 10.1186/gb-2004-5-2-p6 sha: e53f732a04d27d739f51e44246422f4e767ce245 doc_id: 33974 cord_uid: x48huy5f
RNA interference is used for SARS-related pharmaceutical research and development. Following a bioinformatic method, twenty-seven 21~25 base-long sequence segments in the SARS-CoV genome are predicted as the optimal target sites of small interfering RNA duplexes.
[...] to thousands of human beings. However, an effective drug for treating SARS has not yet been found. The genome sequences determined by several groups [1] [2] [3] show that the virus is a variant of the coronaviruses, which belong to the single-stranded plus-sense RNA viruses. The genome is about 30 kb in length, and several of its encoded proteins have been isolated and purified. This provides a sound basis for SARS-related pharmaceutical research and development.
The use of double-stranded RNA (dsRNA) to manipulate gene expression (RNA interference, or RNAi) has proved highly effective, at least 10 times more effective than using either sense or antisense RNA alone [4]. The RNAi triggered by dsRNA is a phenomenon of homology-dependent gene silencing [5] [6] [7]. It was found that small interfering RNA (siRNA, 21-25 nt long) plays an important role in RNAi-related gene silencing pathways [8]. Progress has also been made in anti-HIV and anti-HCV drug design by applying RNA interference [9] [10]. To design an anti-SARS-CoV drug, one strategy is to search for siRNAs that specifically interfere with the gene expression and block the genome replication of the SARS-associated coronavirus. In this note we make a theoretical prediction of the possible target sites of siRNAs in the virus genome.
[...] sites of RNA interference when these segments frequently occurred in non-base-paired regions based on the above calculation.
A given RNA sequence segment may take different secondary-structure configurations of low free energy, some containing short stems (quasi-free) and some not (free). The total frequency with which a segment occurs in a non-base-paired region across the different folds is called the appearance rate. If each quasi-free case is multiplied by a reduction factor in the count, namely 0.9 for 1 base pair, 0.8 for 2 base pairs, and 0.7 for 3 base pairs (the base pairs may be contiguous in the structure or disconnected), then the resulting weighted number of folds is called the reduced appearance rate.
The antisense oligonucleotide (AO) complementary to a specific sub-sequence of an RNA target has been extensively investigated. AO efficacy is affected by many factors. Apart from the binding energy between AO and RNA, which describes the accessibility of the RNA to the AO, the sequence motif is another important factor. The correlation of 9 sequence motifs with AO efficacy was deduced empirically in [13] [14]. If the target sequence contains CCAC, TCCC, ACTC, GCCA or CTCT, it receives a positive score. If the target sequence contains GGGG, ACTG, TAA or AAA, it receives a negative score. On the other hand, experiment shows that the 2 nt 3' overhangs in an siRNA duplex play an important role in its stabilization [8]. This means that AA at the 5' end of the sequence segment is favorable for a target.
In the SARS-CoV genome we have found several tens of long segments (length >20 nt) that match those of human beings. To guarantee the safety of the designed drug, we align the free and quasi-free segments of high appearance rate with the human genome and delete the matching ones (more than 18 exactly matching bases) from the siRNA target candidates.
Using the RNA sequence data of SARS-CoV, Isolate Tor2, twenty-seven optimal 20~25 base-long siRNA targets are selected from 60000 candidates in both strands. They are listed in Tables 1 and 2 for the minus-strand and plus-strand respectively. Each segment is scored.
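As a rough illustration of the counting and screening described above, the following Python sketch computes the appearance rate and the reduced appearance rate of one segment and checks a segment for a long exact match against a host sequence. The function names, the input representation (one integer per low-free-energy fold, giving the number of base pairs inside the segment), and the choice to skip folds with more than 3 pairs are assumptions made for illustration; they are not taken from the paper. The exact-match check is a toy k-mer lookup, and screening against a full human genome would require an alignment tool.

```python
from typing import Iterable, Tuple

# Reduction factors stated in the text, keyed by the number of base pairs
# that fall inside the segment in a given fold (0 means a free occurrence).
REDUCTION = {0: 1.0, 1: 0.9, 2: 0.8, 3: 0.7}


def appearance_rates(paired_counts: Iterable[int]) -> Tuple[int, float]:
    """Return (appearance rate, reduced appearance rate) for one segment.

    paired_counts holds, for each low-free-energy fold, the number of base
    pairs inside the segment. Folds with more than 3 pairs are skipped;
    that cutoff is an assumption, the text does not state it.
    """
    counted = [n for n in paired_counts if n in REDUCTION]
    rate = len(counted)                            # every free or quasi-free fold counts 1
    reduced = sum(REDUCTION[n] for n in counted)   # quasi-free folds are down-weighted
    return rate, reduced


def shares_long_match(segment: str, host: str, min_len: int = 19) -> bool:
    """True if segment and host share an exact match of at least min_len bases.

    min_len = 19 corresponds to "more than 18 exactly matching bases".
    """
    host_kmers = {host[i:i + min_len] for i in range(len(host) - min_len + 1)}
    return any(segment[i:i + min_len] in host_kmers
               for i in range(len(segment) - min_len + 1))
```

For example, a segment that is free in two folds and quasi-free with 1, 2 and 3 base pairs in three further folds has an appearance rate of 5 and a reduced appearance rate of 1 + 1 + 0.9 + 0.8 + 0.7 = 4.4.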
The main term of the score is the value of the reduced appearance rate (column 5 of Tables 1 and 2). The sum of AO efficacies (multiplied by 10) in a segment is also listed for reference (column 6). The enhancing factor of AA occurring at the 5' end is indicated in column 7 by the notation +. The results of a multiple sequence alignment of 19 complete SARS coronavirus genomes give the mutational sites between different strains [15]. The last term of the score is related to mutational sites. Each point mutation in the siRNA target sequence contributes -1 to the score (column 8). Though the relative importance of these terms cannot be quantitatively estimated at present, we expect that the main contribution to the score comes from the reduced appearance rate (column 5).
Generally, in the proliferation of plus-sense RNA viruses the concentration of the plus-strand is much higher than that of the minus-strand. For example, they may differ by a factor of 100 in TMV (tobacco mosaic virus) [16]. If the concentration of the minus-strand in SARS-CoV is lower, then RNA interference targeted at the virus minus-strand will be more effective. We suggest that this point be checked by experiment immediately, since it is important for designing an effective siRNA duplex.
The above approach is of broad interest for the design of other antiviral drugs.
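The score terms listed above can be kept side by side in a small record, as in the Python sketch below. Everything named here (`TargetScore`, `motif_balance`, `rank_by_main_term`) is hypothetical. The motif term uses unit weights (+1 per positive motif occurrence, -1 per negative one, overlapping occurrences counted) only as a placeholder, because the AO efficacy values of [13] [14] are not given in the text. Since the relative importance of the terms is not quantified, the sketch does not add them into a single number and ranks by the reduced appearance rate alone.

```python
from typing import List, NamedTuple

POSITIVE_MOTIFS = ("CCAC", "TCCC", "ACTC", "GCCA", "CTCT")
NEGATIVE_MOTIFS = ("GGGG", "ACTG", "TAA", "AAA")


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_balance(seq: str) -> int:
    """Positive-motif count minus negative-motif count (unit weights assumed)."""
    plus = sum(count_overlapping(seq, m) for m in POSITIVE_MOTIFS)
    minus = sum(count_overlapping(seq, m) for m in NEGATIVE_MOTIFS)
    return plus - minus


class TargetScore(NamedTuple):
    sequence: str                   # target segment, written 5' to 3'
    reduced_appearance_rate: float  # main term of the score
    n_point_mutations: int          # known mutational sites inside the segment

    @property
    def motif_term(self) -> int:
        return motif_balance(self.sequence)

    @property
    def aa_at_5prime(self) -> bool:
        return self.sequence.startswith("AA")

    @property
    def mutation_term(self) -> int:
        return -self.n_point_mutations  # each point mutation contributes -1


def rank_by_main_term(candidates: List[TargetScore]) -> List[TargetScore]:
    """Sort candidates by reduced appearance rate, highest first."""
    return sorted(candidates, key=lambda t: t.reduced_appearance_rate, reverse=True)
```

Keeping the terms separate mirrors the layout of the tables, where each term has its own column and the reader weighs them.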
Characterization of a Novel Coronavirus Associated with Severe Acute Respiratory Syndrome
The Genome Sequence of the SARS-Associated Coronavirus
A complete sequence and comparative analysis of a SARS-associated virus (Isolate BJ01)
Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans
The rest is silence
RNA silencing: The genome's immune system
Ancient pathways programmed by small RNAs
Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells
Modulation of HIV-1 replication by RNA interference
RNA interference blocks gene expression and RNA synthesis from hepatitis C replicons propagated in human liver cells
Expanded Sequence Dependence of Thermodynamic Parameters Improves Prediction of RNA Secondary Structure
Identification of sequence motifs in oligonucleotides whose presence is correlated with antisense activity
Computational antisense oligo prediction with a neural network model
Results of multiple sequence alignment of 19 complete SARS coronavirus genome
Specific Cessation of Minus Strand RNA Accumulation at an Early Stage of TMV Infection
Acknowledgement
The authors are grateful to Dr. J.C. Luo for sending SARS-related material. The work was supported by the National Science Foundation of China, Project 90103030.