key: cord-0690821-nxutmxp6
authors: Pyrc, Krzysztof; Jebbink, Maarten F.; Berkhout, Ben; van der Hoek, Lia
title: Detection of New Viruses by VIDISCA: Virus Discovery Based on cDNA-Amplified Fragment Length Polymorphism
date: 2007-11-28
journal: SARS- and Other Coronaviruses
DOI: 10.1007/978-1-59745-181-9_7
sha: 4023e727f2e03b241546a3cc98972ce3eb578d95
doc_id: 690821
cord_uid: nxutmxp6

Virus discovery based on cDNA-AFLP (amplified fragment length polymorphism) (VIDISCA) is a novel approach that provides a fast and effective tool for amplification of unknown genomes, e.g., of human pathogenic viruses. The VIDISCA method is based on double restriction enzyme processing of a target sequence and ligation of oligonucleotide adaptors that subsequently serve as priming sites for amplification. As the method is based on the common presence of restriction sites, it results in the generation of reproducible, species-specific amplification patterns. The method allows amplification and identification of viral RNA/DNA, with a lower cutoff value of 10^5 copies/ml for DNA viruses and 10^6 copies/ml for RNA viruses. Previously, we described the identification of a novel human coronavirus, HCoV-NL63, with the use of the VIDISCA method.

To date, there is still a variety of human diseases of unknown etiology, including several chronic diseases such as amyotrophic lateral sclerosis (ALS) and multiple sclerosis (MS), but also acute infections such as Kawasaki disease and multiple respiratory diseases (1,2). A viral origin has been suggested for many of these diseases, emphasizing the importance of a continuous search for new viruses. Identification of previously unrecognized viral agents in patient samples is of great medical interest, but remains a major technical challenge. Identification of novel viral pathogens is difficult with the virus discovery tools known to date. Several problems are encountered when searching for new viruses.
First, most of the unidentified viruses do not replicate in vitro, at least not in the cells that are commonly used in viral diagnostics. Second, the molecular biology techniques previously employed to identify unknown viruses have their specific drawbacks. Several techniques are in use for virus discovery, e.g., universal primer PCR, random priming based PCR, and representational difference analysis (RDA). Although every technique has proven to be useful for virus discovery in certain circumstances, they all have serious limitations and restrictions.

Universal PCR primers should amplify new members of an already known virus family, but this method has two major drawbacks. First, a choice for a specific virus family has to be made. This limits the possibility of identifying a member of an unsuspected family or the founding member of a totally new one. Second, the universal primers may simply not match the genome sequence of novel members of a virus family. This is illustrated by the lack of success of universal coronavirus primers that were designed before the new members (SARS-CoV, HCoV-NL63, and HCoV-HKU1) were identified. None of the studies that used such primers was able to detect a novel human coronavirus (3,4). Obviously, such primers gradually improve once more family members are known.

Another technique uses nonspecific amplification of viral sequences in a random priming PCR at low annealing temperatures. However, most ingredients of this assay contain contaminating DNA. For instance, the enzymes used may contain trace amounts of DNA from the bacteria in which they are produced. This contaminating DNA is also amplified, and it is therefore not possible to determine at an early stage whether amplification products represent a new virus or contaminating nucleic acids. This can be resolved only after extensive cloning and sequencing. Therefore, high-throughput screening of many clinical samples is impractical.
Moreover, this technique has only been successful with viruses that replicate in vitro, in which case cell culture supernatant was used as input for the assay (5).

Representational difference analysis (RDA) is a subtractive hybridization technique that enriches for nucleic acid sequences that are present in one tissue but absent or present at lower concentration in an otherwise identical tissue sample. RDA utilizes PCR to generate sets of nucleic acids in a target and a (negative control) tester sample. After subtractive hybridization, there is selective amplification of target-enriched sequences. The method was developed for tissue material and not for nontissue samples such as serum/plasma or virus culture supernatants (6). The fact that these liquid samples have low concentrations of DNA and RNA in the tester sample may restrain the selective amplification of an unknown viral target. A disadvantage of this technique is that it requires a negative control tissue from the same person from whom the diseased tissue was obtained.

We recently developed a general, simple, and easy to use new virus discovery method that allows large-scale screening for any RNA or DNA virus in samples such as serum/plasma or virus culture supernatant (7). The method is based on the cDNA-AFLP technique (8) (Virus discovery cDNA-AFLP: VIDISCA). The main feature of VIDISCA is that prior knowledge of the genome sequence is not required as the presence of restriction enzyme sites is sufficient to guarantee PCR amplification. VIDISCA begins with a treatment to selectively enrich for viral nucleic acid, which includes a centrifugation step to remove residual cells and mitochondria (Fig. 1). In addition, a DNase treatment is used to remove interfering chromosomal DNA and mitochondrial DNA from degraded cells, whereas RNases in the sample will degrade RNA. During this step, the viral nucleic acid is specifically protected within the viral particle.
Next, the DNase and RNases are inactivated and the viral nucleic acids are extracted from the particles, RNA is reverse transcribed into cDNA, and second-strand synthesis is performed to make dsDNA (from a viral RNA or DNA genome). The dsDNA is digested with frequently cutting restriction enzymes whose recognition sites are likely to be present in every viral target (HinP1-I and Mse-I), and HinP1-I and Mse-I anchors are ligated to the digested DNA. Essential to the method is that the restriction enzymes remain active during the ligase reaction, thus preventing concatamerization of digested fragments. Ligated anchors are not cleaved off again, because they are designed in such a way that the restriction site is lost upon ligation. The target is subsequently PCR-amplified with primers that anneal to the anchor sequences, followed by a round of selective amplification with primers that are extended with one nucleotide (G, A, T, or C). Thus, 16 primer combinations are used, and each sample is compared to a representative negative control (negative serum, plasma, or supernatant from an uninfected culture). The PCR fragments that are specific to the "infected" clinical sample can then be cloned and sequenced. Because amplification is based on the presence of restriction sites, the PCR is reproducible (in duplicate samples the same fragments are amplified) and these PCR products can be distinguished from background amplification. The assay is relatively high-throughput, as multiple samples (about ten) can be tested per cycle of VIDISCA.

We were able to amplify viral nucleic acids from EDTA-plasma of a person with hepatitis B virus infection and a person with an acute parvovirus B19 infection. Using urine, we could detect adenoviral DNA and influenza B RNA in two patients. The technique can also detect HIV-1 and picornaviruses in cell culture. These results illustrate that the VIDISCA technique has the capacity to identify both RNA and DNA viruses directly from patient material or from cell cultures.
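The double-digestion step described above can be illustrated in silico. The following is a minimal sketch, not part of the published protocol: it assumes the standard recognition sites of the two enzymes (HinP1-I cuts G^CGC, Mse-I cuts T^TAA), models only the top strand (overhangs and adaptors are ignored), and uses a made-up toy sequence. The function names `cut_positions` and `amplifiable_fragments` are hypothetical.

```python
import re

# Recognition site and cut offset on the top strand (assumed standard values):
# HinP1-I cuts G^CGC, Mse-I cuts T^TAA.
SITES = {"HinP1-I": ("GCGC", 1), "Mse-I": ("TTAA", 1)}


def cut_positions(seq):
    """Return sorted (position, enzyme) pairs for every cut in seq."""
    cuts = []
    for enzyme, (site, offset) in SITES.items():
        # A zero-width lookahead also finds overlapping sites.
        for match in re.finditer(f"(?={site})", seq):
            cuts.append((match.start() + offset, enzyme))
    return sorted(cuts)


def amplifiable_fragments(seq):
    """Return internal fragments whose two ends were cut by different enzymes.

    Fragments flanked by the same enzyme on both sides are skipped, since
    the text notes that these amplify poorly.
    """
    cuts = cut_positions(seq)
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right:
            fragments.append((left, right, seq[start:end]))
    return fragments


# Toy sequence (hypothetical) with two HinP1-I sites and one Mse-I site.
toy = "AAAGCGCATGCATTTAACCGGGCGCTT"
print(amplifiable_fragments(toy))
# [('HinP1-I', 'Mse-I', 'CGCATGCATT'), ('Mse-I', 'HinP1-I', 'TAACCGGG')]
```

Run against a known viral genome, a sketch like this gives a rough idea of how many fragments of a resolvable size a given enzyme pair would produce.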
In fact, it was the first experiment with a suspected virus culture that led to identification of a novel human coronavirus [HCoV-NL63 (9)]. Only three human coronaviruses were known at that time: HCoV-229E, HCoV-OC43, and SARS-CoV (10,11), and HCoV-NL63 represents the fourth species. The rapid identification of this novel coronavirus demonstrates the power of our virus discovery tool, which can now be used to test large sample sets suspected of containing viral pathogens.

10 min, in order to remove the cells, cell debris, and insoluble particles such as mucus. Every analyzed sample is tested in duplicate, and every experiment includes an appropriate negative control. The negative control can be a sample of the same type derived from a healthy person or, if the pathogen was cultured, a virus-negative culture of the same cell type.
2. Immediately after centrifugation, 100 µl of sample is transferred into a fresh tube. Care should be taken that pelleted material is not transferred. If the sample is exceptionally full of cells or insoluble material, the initial volume may be increased as needed.
3. DNase treatment. DNase I solution is prepared in a nucleic-acid-free environment by mixing 15 µl of DNase I enzyme, 15 µl of DNase I buffer, and 20 µl of sterile water per 100 µl of the original sample. The DNase I solution is then added to the sample material and incubated at 37 °C for 45 min.

1. Immediately after the DNase I treatment, 900 µl of L6 solution is added to the sample to lyse the material (see Notes 2-5). The lysis is done at room temperature for 10 min. The sample should be thoroughly mixed by inverting and vortexing.
2. 40 µl of silica is added and the sample is incubated at room temperature with gentle shaking for 10 min.
3. The sample is centrifuged (13,200 rpm) for 10 sec to pellet the silica particles, and the L6 supernatant is discarded.
4. The pelleted silica is washed twice with 900 µl of L2.
After addition of L2 solution, the sample is vortexed thoroughly until no pellets or large particles are visible and centrifuged for 10 sec at 13,200 rpm. Washing with L2 is necessary to remove all traces of Triton X-100 and EDTA that may inhibit the following enzymatic reactions.
5. The sample is washed twice with room temperature 70% ethanol and once with 100% acetone, in the same manner as described above for L2. Ethanol is added to wash out the guanidine thiocyanate and residual traces of detergent and EDTA, whereas the acetone wash serves primarily to speed up the drying process.
6. After the removal of the acetone, the silica is dried for 5 min at 56 °C with the lid open.
7. To elute bound nucleic acids, 50 µl of sterile water is added, and the sample is vortexed until all silica particles are in suspension and incubated at 56 °C for 10 min with shaking (500 rpm). After the elution, the sample is centrifuged for 2 min at 13,200 rpm. About 30 µl of the liquid fraction is transferred into a fresh tube. Samples should be stored at -80 °C until needed.

The reverse transcription (RT) reaction is assembled under nucleic-acid-free and nuclease-free conditions and is performed as a two-step reaction.

1. After the RT reaction, the resulting single-stranded cDNA cannot be used as a template for restriction enzyme cleavage or adaptor ligation. Therefore, second-strand synthesis is performed using RNase H to digest any residual RNA and Sequenase enzyme to synthesize the second DNA strand.

1. Digestion of the ds-cDNA is performed with restriction enzymes. In the protocol described here we included only digestion with the HinP1-I and Mse-I enzymes, but this combination may be altered (13). The use of two different restriction enzymes is essential, as we observed that fragments that are cleaved on both sides by the same enzyme are less likely to be amplified by PCR.
If one is planning to change the enzyme combination, care should be taken that the combination of restriction enzymes generates restriction fragments from virtually all templates. There is no inactivation step between the digestion and ligation, as restriction activity prevents generation of concatameric forms of the target templates. Fragments that are properly ligated with adaptors will not be cleaved, because of the point mutation introduced (Fig. 2). The ligation should be performed for 2 h at 37 °C.

The main part of the VIDISCA method is amplification of the genetic material without prior knowledge of the sequence. The preprocessed ds-cDNA with adaptors can now be amplified using primers specific for the adaptors. During development of the method it was determined that a single PCR round does not provide sufficient specificity and sensitivity, so a second "nested" PCR is included in the protocol. This PCR uses primers that are similar to the primers used in the first PCR, but with one nucleotide added to the 3′-end of the primers.

1. The first PCR reaction is done with the standard PCR thermocycling program (Table 1) and is optimized for a 50 µl reaction.
Table 1 (see Note 6). After successful thermocycling the sample can be stored at -20 °C until needed.
4. Second PCR, selective amplifications: The second, nested PCR reaction is necessary to provide high specificity and sensitivity. This selective PCR is performed with primers whose sequence is identical to that of the standard primers, but with an additional nucleotide at the 3′ end. This additional nucleotide is outside the adaptor sequence and thus belongs to the unknown material (Fig. 2). Use of an additional nucleotide allows separation of the reactions into 16 different primer combinations and enables better analysis of the sample.
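The 16 selective primer combinations mentioned above follow from four possible extension nucleotides on each of the two primers. The sketch below only enumerates the combinations by name; the labels `HinP1-I-X` and `Mse-I-X` follow the text, while the function names and the convention for describing a fragment (the first unknown base next to each adaptor, read 5′ to 3′ in the direction in which the corresponding primer is extended) are assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"


def selective_primer_pairs():
    """Enumerate all HinP1-I-X / Mse-I-X selective primer combinations."""
    return [(f"HinP1-I-{h}", f"Mse-I-{m}") for h, m in product(BASES, repeat=2)]


def matching_reaction(hinp_side_base, mse_side_base):
    """Return the single primer pair whose selective bases match a fragment.

    Each argument is the first unknown base adjacent to the adaptor, read
    5'->3' in the direction the corresponding primer is extended
    (an assumed convention for this sketch).
    """
    for base in (hinp_side_base, mse_side_base):
        if base not in BASES:
            raise ValueError(f"not a DNA base: {base!r}")
    return (f"HinP1-I-{hinp_side_base}", f"Mse-I-{mse_side_base}")


pairs = selective_primer_pairs()
print(len(pairs))                   # 16
print(matching_reaction("G", "T"))  # ('HinP1-I-G', 'Mse-I-T')
```

Under perfectly selective conditions each fragment would be amplified in exactly one of the 16 reactions, which is why splitting the sample this way thins out the band pattern per lane.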
To achieve the selectivity needed to amplify only those fragments with a 100% match, the thermocycling profile starts with an annealing temperature of 65 °C, which gradually decreases to 56 °C during the first ten cycles.
5. The PCR mix is prepared as described below (47.5 µl per sample). The HinP1-I-X and Mse-I-X primers denote primers with an additional 3′-nucleotide.
Table 2 (see Note 6). The PCR product may be analyzed immediately or stored at -20 °C until needed.

1. The second PCR product is analyzed on an agarose gel. Most of the generated fragments are less than 300 bp in size. Because high-quality separation is needed and the differences in fragment size are small, MetaPhor agarose is used (it allows differentiation among fragments varying 1 bp in size). Additionally, MetaPhor agarose provides an easy setup and high-throughput processing for gel analysis and purification, compared to polyacrylamide gels. The MetaPhor agarose gel is prepared as described below.
2. 150 ml of 0.5X TBE buffer is poured into an Erlenmeyer flask and stirred with a magnetic stirrer. 4 g of MetaPhor agarose is weighed and gently poured into the Erlenmeyer flask while mixing. Addition of all agarose powder at once will result in clumping of the agarose. The solution is stirred for another 10 min to soak the agarose grains and heated in the microwave for 60 sec at low power. After this first heating, the agarose is stirred and heated in 30-sec cycles (low power) with extensive stirring in between. All the agarose is solubilized during a final heating step of 60 sec at medium power. The agarose is cooled down to ∼65 °C and 10 µl of ethidium bromide (10 mg/ml) is added. The agarose is poured into the electrophoresis tray and combs are inserted. It is crucial to remove all air bubbles from the gel (e.g., with a pipette tip).
The agarose is solidified at room temperature and further incubated at 4 °C for at least 20 min (the incubation at 4 °C improves the gel resolution). The gel is positioned in the electrophoresis box filled with 0.5X TBE buffer.
3. 15 µl of the second PCR product is mixed with 5 µl of the loading buffer and the samples are loaded on the prepared agarose gel. 5 µl of the 25-bp ladder is used as a DNA size marker. Electrophoretic separation is performed at 150 V for about 1 h.
4. Immediately after the electrophoresis is completed, the gel is analyzed on the UV transilluminator. A picture is taken for analysis and the gel is stored at 4 °C, wrapped in plastic (Saran Wrap). The picture of the gel is used to search for fragments that are present in the sample of interest and not in the control sample. All the fragments that are present exclusively in the sample of interest are marked on the picture (Fig. 3). If the bands appear very faint on the gel, the PCR products can be concentrated in a vacuum centrifuge and reanalyzed on a MetaPhor gel. After fragment selection, the gel is again positioned on the UV transilluminator and the selected bands are excised with sterile razors (about 100 mg per slice) and stored in coded 1.5-ml Eppendorf tubes at 4 °C (see Note 7). After excision of all bands, a second picture of the gel should be taken to document the proper excision.
5. The DNA fragments from the gel are extracted with the QIAquick gel extraction kit following the manufacturer's protocol. The gel slices are solubilized in 600 µl of QG buffer and 100 µl of isopropanol is added. After extraction, the resulting DNA is dissolved in 30 µl of EB buffer. Alternatively, any other gel extraction method may be used.

3. 5 µl of suspended E. coli bacteria in BHI medium is added to the PCR mixture.
4. The thermocycling is performed as described in Table 3.
After the PCR is completed, 10 µl of the PCR product is mixed with gel loading dye and analyzed on a 0.8% agarose MP gel with a Smart ladder DNA size marker. A representative picture of such a gel is shown in Fig. 4.
5. The lanes that seem to contain the plasmid with the proper insert are selected, and the corresponding PCR products are subjected to sequencing reactions.
6. Sequencing reactions are performed on the colony-PCR product with the BigDye chemistry, using the M13 reverse and T7 primers, according to the manufacturer's instructions (Applied Biosystems).

The sequence data obtained in the survey are analyzed with the BLAST server (http://www.ncbi.nlm.nih.gov/BLAST/). The raw sequence is edited to remove the sequence derived from the vector and the adaptors. This procedure can be done manually or with a dedicated program, e.g., CodonCode (http://www.codoncode.com/). After the cleanup, the sequences are analyzed for their quality, and only those that show a clear, single signal are exported in FASTA format for further analysis. Once imported into the BioEdit program (http://www.mbio.ncsu.edu/BioEdit/bioedit.html), the sequences are subjected to batch BLAST analysis with default settings. This batch analysis allows preselection of the sequences of interest, as mRNA and rRNA fragments are frequently found as background. All results that indicate the presence of a virus or an unknown sequence should be selected and reanalyzed with the BLAST server (blastn) with an expectation value of 1000 against all databases. If the results are still not clear, the following steps might be taken: The sequences in tblastx and rpsblast that display similarity to viral sequences should be considered as possibly unknown pathogens. If the sequence is analyzed against a viral database, care should be taken with each hit, because virtually all fragments show some similarity to viral sequences.
In that case, the pathogen might be considered identified only if the results from different fragments from one sample show similarity to the same virus family. In all cases, it is essential to design a diagnostic primer set and retest the original material for the presence of the pathogen. Only when the pathogen can be detected by the diagnostic (RT)-PCR in the original sample should efforts to sequence the entire genome be undertaken.

1. For the nucleic acid isolation, any highly efficient method may be used. It is not advisable to use TRIzol isolation, as that is intended for isolation of nucleic acids from cells and tissues.
2. The L6 buffer lysis is sufficient to inactivate the virus. After a 10-min incubation it is safe to process the sample in a normal biochemistry laboratory.
3. The L6 and L2 buffers contain concentrated guanidine thiocyanate (GTC) and thus should be considered highly toxic. Remember to store the GTC waste separately, with addition of one-tenth volume of 1 N sodium hydroxide to prevent GTC degradation.
4. All RNA and cDNA handling before the first PCR should be performed in a nucleic-acid-free environment. The sequence-independent amplification will result in overamplification of contaminating DNA.
5. The use of chlorine as a decontaminant should be limited, as it may decrease the activity of the reverse transcriptase enzyme.
6. If the thermocycling is performed in a PCR machine that does not include heating of the cover, two drops of paraffin oil should be layered on top of the PCR solution to prevent evaporation during the PCR reaction.
7. It is advised to use a fresh razor for each band during excision. The exposure of the gel to UV light should be limited, as such exposure results in DNA degradation.
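The identification criterion from the sequence analysis (several different fragments from one sample showing similarity to the same virus family) can be sketched as a simple grouping step. This is a minimal illustration only: the input format, the function name `candidate_families`, the threshold of two fragments, and the example hits are all assumptions, not part of the published protocol, and a positive result would still need confirmation by diagnostic (RT)-PCR on the original material.

```python
from collections import defaultdict


def candidate_families(hits, min_fragments=2):
    """Group BLAST hits and keep families supported by several fragments.

    hits: iterable of (sample_id, fragment_id, virus_family) tuples.
    min_fragments: assumed threshold; the text only says "different
    fragments", so two is the smallest value consistent with it.
    """
    support = defaultdict(set)
    for sample, fragment, family in hits:
        # A set ignores repeated hits of the same fragment.
        support[(sample, family)].add(fragment)
    return {
        key: sorted(fragments)
        for key, fragments in support.items()
        if len(fragments) >= min_fragments
    }


# Hypothetical hits, for illustration only.
example_hits = [
    ("S1", "frag1", "Coronaviridae"),
    ("S1", "frag2", "Coronaviridae"),
    ("S1", "frag3", "Herpesviridae"),
    ("S2", "frag1", "Coronaviridae"),
]
print(candidate_families(example_hits))
# {('S1', 'Coronaviridae'): ['frag1', 'frag2']}
```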
Sequenase 2.0, T7 DNA polymerase at concentration 13 U/µl (Amersham Biosciences)
Random primers (hexamers
Working solution 1 µg/µl
Moloney murine leukemia virus reverse transcriptase enzyme; 200 U/µl; Invitrogen)
CMB buffer (10X): 100 mM Tris-HCl pH 8.3, 500 mM KCl, 1% Triton X-100. Prepare by mixing 1 ml of 2 M Tris-HCl pH 8.3, 5 ml of 2 M KCl, 2 ml of 10% Triton X-100, and 12 ml of sterile water (Baker)
SEQII buffer (10X): 350 mM Tris-HCl pH 7.5, 250 mM NaCl, 175 mM MgCl2. Prepare by mixing 2.25 ml of sterile water (Baker), 3.5 ml of 1 M Tris-HCl pH 7.5, 2.5 ml of 1 M NaCl, and 1.75 ml of 1 M magnesium chloride
(100 mM)
dNTPs (25 mM of each
Prepare by mixing 10 ml of 2 M Tris-HCl pH 8.3, 25 ml of 2 M KCl, 60 ml of sterile water (Baker), and 5 ml of BSA (bovine serum albumin, 20 mg/ml; Roche)
First PCR primer set: HinP1-I standard primer 5′-GAC GAT GAG TCC TGA CCG C-3′ and Mse-I standard primer 5′
Nested PCR primer set: HinP1-I-X selective primers 5′
Nested PCR primer set: Mse-I-X selective primers
Top strand: 5′
Top strand: 5′
Mse-I restriction enzyme, 10 U/µl (New England Biolabs). BSA and NEB-2 buffer are included
HinP1-I restriction enzyme, 10 U/µl
Ligase, 5 U/µl (Invitrogen)
Ligase buffer (Invitrogen)
Sterile HPLC pure water (Baker)

Gel Electrophoresis and Gel Extraction
1. MetaPhor agarose (Cambrex)
Ethidium bromide (BioRad)
A 25-bp DNA ladder (Invitrogen)
Smart ladder DNA size marker (Eurogentec)
Sterile razor blades
QIAquick gel extraction kit
Sterile HPLC pure water (Baker)
Agarose gel loading buffer: 0.1% orange G, 30% glycerol in 0.5X TBE

1. Kawasaki disease: what is the epidemiology telling us about the etiology?
2. Molecular mimicry, bystander activation, or viral persistence: infections and autoimmune disease
3. Detection of coronaviruses by the polymerase chain reaction
4. Phylogenetic analysis of a highly conserved region of the polymerase gene from 11 coronaviruses and development of a consensus polymerase chain reaction assay
5. Identification of a novel coronavirus in patients with severe acute respiratory syndrome
6. Identification of herpesvirus-like DNA sequences in AIDS-associated Kaposi's sarcoma
7. Identification of a new human coronavirus
8. Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development
9. Identification of a new human coronavirus
10. Coronaviridae: The viruses and their replication
11. Identification of a novel coronavirus in patients with severe acute respiratory syndrome
12. Rapid and simple method for purification of nucleic acids
13. Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development