key: cord-1035621-2s14knw6
authors: Lai, Michael M.C.
title: Recombination in large RNA viruses: Coronaviruses
date: 1996-12-31
journal: Seminars in Virology
DOI: 10.1006/smvy.1996.0046
sha: c5f160f9f79cb242c7368ea3d641db5a2cc3c5db
doc_id: 1035621
cord_uid: 2s14knw6

Abstract

Coronaviruses contain a very large RNA genome, which undergoes recombination at a very high frequency of nearly 25% for the entire genome. Recombination has been demonstrated to occur between viral genomes and between defective-interfering (DI) RNAs and viral RNA. It provides an evolutionary tool for both viral RNAs and DI RNA and may account for the diversity in the genomic structure of coronaviruses. The capacity of coronaviruses to undergo recombination may be related to their mRNA transcription mechanism, which involves discontinuous RNA synthesis, suggesting the nonprocessive nature of the viral polymerase. Recombination is used as a tool for the mutagenesis of viral genomic RNA.

Coronaviruses contain an extraordinarily large RNA genome (27-31 kb). This large RNA size imposes a severe burden on the virus because such an RNA can be expected to accumulate a large number of errors during RNA replication, assuming that the error frequency of coronaviral RNA polymerase is comparable to that of other RNA viruses. Thus, coronaviruses must develop genetic mechanisms to counter the potentially deleterious effects of the errors. RNA recombination is one such mechanism. The discovery of RNA recombination in coronaviruses 1 was made at a time when only picornaviruses, but no other RNA viruses, had been demonstrated to be capable of RNA recombination. And it came with a vengeance, as murine coronaviruses were quickly shown to recombine at very high frequency under a variety of natural and experimental conditions. The capacity to recombine has now been demonstrated in several different coronaviruses.
Recombination is an important mechanism contributing to both the genetic stability and diversity of coronaviruses in nature.

The coronavirus RNA genome is a single species of single-stranded, positive-sense RNA 27-31 kb in length (see Lai 1990). 2 It consists of seven to 10 genes, one of which (gene 1) encodes a precursor of RNA polymerase of approximately 750-800 kDa. Gene composition and arrangement vary among the different coronaviruses (Figure 1). The enormous size (22 kb) of the polymerase gene suggests that the polymerase has multiple functions. Each gene is expressed through one of the mRNAs, which are 3'-coterminal and have a nested-set structure. Only the 5'-most gene of each mRNA is functional for protein translation. Each mRNA has a leader sequence of 70-90 nucleotides derived from the 5'-end of the genome RNA. mRNA transcription is carried out by a discontinuous transcription mechanism which fuses the leader RNA to the transcription start signal (intergenic sequence). The mRNA leader sequence is usually derived in trans from a different RNA molecule. 3, 4 Therefore, the coronaviral polymerase must jump between the leader sequence and intergenic sequences in different RNA molecules during positive- or negative-strand RNA synthesis.

The first coronavirus recombinant was isolated by coinfecting temperature-sensitive (ts) mutants of two mouse hepatitis virus (MHV) strains, A59 and JHM, and selecting progeny viruses which grew at the nonpermissive temperature. 1 The identity of this recombinant was established by genomic sequence analysis, which showed that it indeed had one crossover site and contained sequences from both parents. Subsequently, additional recombinants were obtained using different pairs of ts mutants and other selection markers, including monoclonal antibody neutralization epitopes and cell-cell fusion ability.
[5] [6] [7] [8] [9] Although recombination frequency was not determined in these early studies, the ease with which these recombinants were isolated suggested that the recombination frequency of MHV was very high. This was also suggested by the finding that many of these recombinants had multiple cross-overs, some of which were surprisingly located outside of the two selection markers used for the isolation of the recombinants. Therefore, MHV recombination likely occurs at such a high frequency that recombinants are selected without specific selection pressure. The high frequency of recombination was also demonstrated in an experiment in which an A59 ts mutant and wild-type JHM were used for a mixed infection. 8 The recombinant viruses that grew at the nonpermissive temperature became the predominant virus population after only two tissue culture passages. This result was striking because one of the parental viruses (JHM) was not selected against, suggesting that the recombinants had evolutionary advantages over the parental viruses under the experimental conditions. The finding indicated that recombination could serve as a tool for virus evolution.

Recombination can occur almost everywhere in the MHV genome. However, some cross-over sites appear to be restricted in recombination between certain pairs of viruses. For example, cross-overs in the 3'-end of the viral genome were rarely detected in the recombination between A59 and JHM, and yet they occurred frequently between MHV-2 and A59. 7 As discussed below, this finding probably reflects the possibility that recombinants with chimeric viral proteins derived from certain pairs of parental viruses might be unstable or have an inferior replication ability, and, therefore, were selected against during virus growth. So far, only homologous recombination has been detected between coronaviruses.
This is in contrast to the frequent occurrence of nonhomologous or aberrant homologous recombination seen in other RNA viruses, such as turnip crinkle virus, brome mosaic virus or Sindbis virus, [10] [11] [12] in which cross-overs occur at nonhomologous sites on the two parental RNAs despite the presence of homologous sequences on them. 13 The absence of nonhomologous recombination in coronaviruses may reflect their rigid viral RNA or protein structure requirements for optimal virus growth.

Recombination occurred not only in tissue culture, but also in animal infections, as demonstrated by the intracerebral inoculation of MHV into mouse brain. 6 Again, recombinants were isolated at a very high frequency, comparable to that in tissue culture.

By performing a series of recombination studies between different pairs of ts mutants, Baric et al. were able to establish a linear recombination map for MHV. 14 The two most distant ts markers used in that study had a recombination frequency of 8.7%. By estimating the genetic locations of the ts defects and assuming that recombination occurred reciprocally, a recombination frequency of approximately 25% was extrapolated for the entire MHV genome (31.2 kb). This recombination frequency translates to approximately 1% recombination for every 1300 nucleotides, which is in the same range as the estimated frequency for picornaviruses (1% for every 1700 nucleotides); 15 however, because of the extremely large size of the coronavirus RNA, the overall recombination frequency of MHV appears very high. Subsequent recombination mapping studies showed that there is an increasing gradient of recombination frequency (in the 5'→3' direction) across the genome. 16, 17 This result is best interpreted as the possible participation of the subgenomic mRNAs in recombination, since the subgenomic mRNAs of coronaviruses have a 3'-coterminal, nested-set structure, and thus are preferentially enriched in the 3'-end sequence (Figure 1).
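As a rough check on the figures above, the following Python sketch works through two points: the per-nucleotide rate implied by a 25% frequency over 31.2 kb if recombination were uniform, and how a 3'-coterminal nested set of RNAs yields more template copies toward the 3'-end. The start positions in `STARTS` are hypothetical placeholders rather than MHV coordinates, the function names are illustrative, and counting one copy per RNA species is a simplifying assumption.

```python
# Toy illustration only; STARTS holds hypothetical positions, not MHV coordinates.

GENOME_NT = 31_200        # MHV genome length quoted in the text (31.2 kb)
GENOME_FREQ_PCT = 25.0    # extrapolated whole-genome recombination frequency


def nt_per_percent(length_nt, freq_pct):
    """Nucleotides spanned per 1% recombination if the rate were uniform."""
    return length_nt / freq_pct


def template_copies(position, rna_starts, genome_len):
    """Count nested-set RNA species (one copy each) that contain `position`.

    Every species runs from its own start to the shared 3'-end, so a
    position is covered by each species whose start lies at or before it.
    """
    if not 0 <= position < genome_len:
        raise ValueError("position outside genome")
    return sum(1 for start in rna_starts if start <= position)


# Hypothetical start sites: genomic RNA (0) plus six subgenomic mRNAs.
STARTS = [0, 22_000, 24_000, 25_500, 27_500, 28_500, 29_500]

print(round(nt_per_percent(GENOME_NT, GENOME_FREQ_PCT)))  # 1248
for pos in (10_000, 23_000, 26_000, 30_000):
    print(pos, template_copies(pos, STARTS, GENOME_NT))
```

Under these assumptions the uniform figure comes to about 1,250 nucleotides per 1%, in line with the approximate value quoted above, while the number of available templates rises from 1 within the polymerase gene to 7 near the 3'-end.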
Despite the high frequency of recombination in MHV in tissue culture and experimental inoculations in animals, there has been no clear-cut evidence for the occurrence of recombination among natural MHV strains, probably because they have not been extensively studied. In contrast, clear-cut evidence of recombination has been obtained for natural isolates of avian infectious bronchitis virus (IBV), many of which show recombination between different strains in the spike protein gene or the 3'-end of viral RNA. [18] [19] [20] [21] [22] Recombination has now been demonstrated experimentally for IBV in embryonated eggs 23 and TGEV in tissue culture (L. Enjuanes, personal communication). However, the recombination frequency in these two viruses may not be very high, as implied by the difficulty of isolating these recombinants.

DI RNA has traditionally been considered a product of nonhomologous recombination during viral RNA replication. 13 The generation and structure of coronavirus DI RNAs will be discussed in the next chapter, and thus will be discussed here only in the context of RNA recombination. It has been shown that once a DI RNA is generated, its size and structure continue to change as it is passaged in tissue culture. Thus, the predominant DI RNA species is different at different passage levels. 24 This phenomenon is at least partially caused by recombination, as shown by the evolution of an MHV-JHM DI RNA passaged in MHV-A59-infected cells. 25 In this instance, a novel DI RNA species, which was determined to be a recombinant of A59 and JHM, appeared after a few passages. 25 30, 31 This occurs only when a stretch of the repetitive sequence, which resembles the transcription start signal, is present immediately downstream of the leader sequence in the DI RNA. 4, 30 The recombinant DI RNA can become the predominant species within just one replication cycle. Thus, the leader junction site may be considered a hot spot of recombination.
This type of recombination is very similar to coronavirus mRNA transcription, which uses a discontinuous transcription mechanism involving a separate leader RNA. The free leader RNA used for transcription may also be involved in this type of recombination. Therefore, this type of recombination probably uses a mechanism similar to mRNA transcription.

The reciprocal outcome of recombination between the DI RNA and the viral RNA as described above is the incorporation of DI RNA sequences into the viral RNA. This has the desirable consequence of changing the viral RNA sequence, inasmuch as DI RNAs can be manipulated by recombinant DNA methodology. The feasibility of this approach was first demonstrated by transfecting an mRNA 7 construct (representing the 3'-end sequence of the viral RNA) (Figure 1) into cells infected with a ts mutant with a defective N gene. 32 As a result of this transfection, wild-type viruses with a functional N gene were obtained. Sequence analysis showed that they were bona fide recombinants, in which sequences from the transfected RNA replaced the defective gene in the original virus. Similar recombination events have also been observed when RNA fragments representing either the 5'- or 3'-ends of the viral RNAs were transfected into virus-infected cells. 33 In this case, the viral RNA containing the sequence of the transfected RNA fragments was detected by reverse transcription-polymerase chain reaction (RT-PCR), although the actual recombinants could not be isolated because of lack of selection markers. It is noteworthy that these transfected RNA fragments could not replicate; 32, 33 thus, they probably directly served as templates for RNA recombination. In addition, both the transfected positive- and negative-strand RNAs could lead to recombination, 33 suggesting that recombination may occur during both positive- and negative-strand RNA synthesis.
So far, this type of recombination has not been demonstrated in the internal region of the RNA, where it is likely to occur at a lower efficiency because at least two cross-over events are required. More efficient recombination of this type occurred when DI RNAs that can replicate were used as the donor sequence. [34] [35] [36] These DI RNAs typically contain sequences of both the 5'- and 3'-ends of the viral RNA genome, which include the RNA replication signals. 37, 38 Probably as a result of DI RNA replication, more RNA substrates for recombination were generated and more RNA replication events occurred, creating more opportunities for recombination. Using this approach, recombinant viruses which had incorporated DI sequences into the viral RNA could be obtained at a higher efficiency than when nonreplicating RNAs were used. 34, 35, 39 Theoretically, either the 5'- or 3'-end sequences of the DI RNAs could be incorporated into the viral RNA via this mechanism; however, so far, only recombination involving the 3'-end sequence has been achieved. The lack of 5'-end recombination may simply be due to the lack of appropriate selection markers. Although the recombination frequency has not been determined for these studies, this approach has proven to be very useful for introducing desirable sequences into the viral RNA. Utilized in this manner, recombination is a valuable tool for coronavirus studies because an infectious coronavirus cDNA or recombinant RNA is still not available (no doubt due to the large size of the RNA). This recombination strategy provides an alternative method for introducing site-specific mutations into the viral RNA genome. It has generated the first interspecies recombinant virus between MHV and bovine coronavirus (BCV). 39

Experimental evidence in tissue culture indicated that recombination can generate new viruses, even when no specific selection pressures were applied, as long as these recombinants have evolutionary advantages.
8 The effects of recombination on the evolution of coronavirus DI RNAs have also been demonstrated. 25, 28, 29 Furthermore, in natural coronavirus infections, recombination also serves as an evolutionary tool. This is particularly evident for IBV, many field isolates of which are recombinants between various IBV strains. [19] [20] [21] [22] 40 Two classes of natural IBV recombinants have been identified so far. In the first, recombination occurs in the spike protein gene, conceivably allowing the virus to alter surface antigenicity, and thus escape immune surveillance in the animals. In the second class, recombination occurs in the 3'-end of the viral RNA, which may alter the replication ability of the RNA because this region contains regulatory sequences for RNA replication.

Recombination may also explain the many gene insertion and rearrangement events in the various coronavirus genomes. When the genome structures of various coronaviruses are compared, it is apparent that IBV contains two additional ORFs between the N and M genes, which are not present in other coronaviruses (Figure 1). MHV also contains two novel genes (gene 2 and HE protein gene) between the polymerase and spike protein genes. These genes must have been inserted into the coronavirus genomes by recombination between coronavirus and cellular or viral RNAs. Since the HE protein of MHV shares sequence similarity with the influenza C virus HEF (hemagglutinin-esterase-fusion) protein, 41 the HE gene was likely the result of recombination between an ancestral coronavirus and influenza C virus. Furthermore, since the HE gene is present only in some coronaviruses, this recombination event was probably a fairly recent occurrence. When the genome of coronaviruses (e.g.
MHV) is compared to that of torovirus, which belongs to a different genus of the Coronaviridae family, it appears that gene 2 of coronavirus is present in torovirus as part of its gene 1, and part of the coronavirus HE protein gene is present elsewhere (in gene 4) in the torovirus RNA (Figure 1). 42 Since each coronavirus gene is flanked by a stretch of similar intergenic sequences, which serves as a transcription start signal (Figure 1), each viral gene may be regarded as a gene cassette, which can be easily moved to the various sites on the RNA genome by recombination between the intergenic sequences. In summary, recombination has played an important role in the past evolution of coronaviruses, and continues to play significant roles in the ongoing evolution of viruses in nature.

It is supposed that recombination in coronaviruses, as in other RNA viruses, occurs by a copy-choice mechanism, 13 although there is still no direct evidence for this. In this model, recombination takes place during RNA replication, when RNA polymerase pauses at certain sites on the RNA template. The nascent RNA transcripts separate from the original template, and then join themselves to a different RNA template to continue RNA synthesis. Depending on the rejoining sites, the resultant RNA recombination will be either homologous or nonhomologous. Several pieces of evidence support this model: RNA transcripts of discrete sizes have been detected in the MHV-infected cells; 43 these RNAs appear to represent transcripts which have paused at sites of strong secondary structures and may participate in recombination. Since coronaviruses utilize a discontinuous transcription mechanism to synthesize mRNAs, the viral polymerase and nascent RNA transcripts must dissociate from the RNA template regularly during RNA transcription to fuse the leader RNA to a distant mRNA start site.
Therefore, the coronavirus polymerase is probably not a processive enzyme and is able to dissociate from and rejoin itself to RNA templates with regularity. Indeed, one of the most frequently utilized MHV recombination sites is at the junction between the leader RNA and the remainder of the RNA genome, 5 which is reminiscent of the joining of the leader and the body sequence during mRNA transcription. This result suggests that the intermediates of mRNA transcription may participate in recombination. This interpretation is also consistent with the finding that recombination frequency increases toward the 3'-end of the MHV genome, suggesting that subgenomic mRNAs also participate in RNA recombination. 16, 17 Thus, the mechanism of coronavirus RNA recombination may be similar to that of mRNA transcription.

The copy-choice mechanism of RNA recombination predicts that recombination will occur more frequently at RNA sites of strong secondary structure, since these structures promote transcriptional pausing. 44 Indeed, MHV recombination has been shown to occur readily in the hypervariable region of the spike protein gene, where frequent deletions occur, 45 suggesting that the same secondary structure causes both deletion and recombination. However, this interpretation may not be correct, because when recombination was examined under nonselective conditions (e.g. when intracellular RNA from cells infected with two different viruses was examined by RT-PCR for the presence of recombinant viral RNA molecules without virus isolation), the cross-over sites in the recombinant RNAs were found to be distributed almost randomly. 46 Only after a few cycles of virus passage in tissue culture did the pattern of 'hotspots' of RNA recombination become apparent. This finding suggests that the so-called 'hotspots' detected in most coronavirus recombination studies may be the result of virus selection, but they do not represent the actual recombination hotspots.
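The argument that selection alone can turn randomly distributed cross-overs into apparent hotspots can be illustrated with a small deterministic model. This is a toy sketch, not a description of the experiments cited: the genome is divided into five arbitrary intervals, cross-overs start out equally frequent in each, and the relative fitness values in `FITNESS` are invented for illustration.

```python
# Toy model: uniform cross-over sites plus selection during passage.
# FITNESS values are hypothetical; they are not measured quantities.


def passage(freqs, fitness):
    """One round of growth: weight each recombinant class by its fitness
    and renormalize so the frequencies sum to 1."""
    weighted = [f * w for f, w in zip(freqs, fitness)]
    total = sum(weighted)
    return [x / total for x in weighted]


def crossover_profile(fitness, n_passages):
    """Frequency of recombinants per genome interval after n passages,
    starting from a uniform (random) distribution of cross-over sites."""
    n = len(fitness)
    freqs = [1.0 / n] * n
    for _ in range(n_passages):
        freqs = passage(freqs, fitness)
    return freqs


# Recombinants with a cross-over in interval 2 are assumed to grow 3x better.
FITNESS = [1.0, 1.0, 3.0, 1.0, 1.0]

for n in (0, 1, 3):
    print(n, [round(f, 2) for f in crossover_profile(FITNESS, n)])
```

With these invented values, the favoured interval holds 20% of cross-overs before passage, about 43% after one passage and about 87% after three, even though the cross-over events themselves were generated uniformly.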
It is possible that recombinants with certain chimeric proteins derived from two different parental viruses have evolutionary advantages and thus will predominate during the course of virus growth, and other recombinant viruses will be selected against. The emergence of select recombinants was also seen in DI recombination, where recombinant DI RNAs containing a longer ORF were selected. 28, 29 This interpretation may explain why coronavirus DI RNAs can undergo nonhomologous recombination, 25 but coronaviral genomic RNAs cannot, i.e. because recombinant viruses generated by nonhomologous recombination may not grow competitively. How are the acceptor RNA sites selected? Conceivably, nascent RNA transcripts bind to homologous sequences on a different RNA template because of sequence complementarity, resulting in homologous recombination. The difficult aspect of this scenario is that the template RNAs and the nascent transcripts are likely complexed with other RNAs or proteins; therefore, they are not exposed. Furthermore, RNA polymerases are not known to initiate RNA synthesis from the internal regions of any RNA, except from certain transcription or replication signals. Thus, how the acceptor sites are selected and RNA synthesis resumes from those sites are theoretically difficult issues. One possibility is that the polymerase-nascent RNA complex recognizes certain RNA secondary structures or RNA-protein complexes on the acceptor molecule by RNA-protein or protein-protein interactions rather than base-pairing. According to this scenario, the nascent RNA-polymerase complex may not bind to the homologous sites on the acceptor RNA. This explains the nonhomologous or aberrant homologous recombination seen in many RNA viruses. 13 It is clear that even in homologous recombination, strict sequence complementarity at the crossover sites is not necessary. 
46, 47 Conceivably, once the nascent RNA has joined the acceptor RNA, there is additional processing of the transcript, such as cleavage of the 3'-ends. The extent of cleavage may determine the final cross-over sites. Such a 3'-cleavage activity has been demonstrated in several types of DNA-dependent RNA polymerases. 48, 49 In other RNA viruses, such as brome mosaic virus, the parental RNA templates may be held together by secondary structures (complementary sequences) to facilitate recombination. 11 Such a case has not been demonstrated for coronaviruses.

Does recombination occur during (+)- or (-)-strand RNA synthesis? Since both (+)- and (-)-strand RNA fragments that cannot replicate could recombine with the viral RNA, 33 it stands to reason that recombination can take place during the synthesis of both strands. The efficiency of either strand in recombination has not been determined and may depend on the amount of the available template RNA. Another unresolved issue is whether any particular sequence would favor recombination, as shown for other RNA viruses. 50, 51

Recombination is an important genetic mechanism for coronaviruses. It probably provides a mechanism for maintaining viral genomic stability, inasmuch as the coronavirus RNA has an extremely large size which renders it vulnerable to the accumulation of a large number of errors during RNA replication. It also provides a mechanism for the natural evolution of the virus and DI RNAs. Several issues regarding coronavirus recombination remain unresolved, particularly concerning the mechanism of recombination, e.g. what is the sequence requirement for recombination? What are the protein factors involved in recombination? Recombination may also provide a useful genetic tool for creating coronaviral mutants, which is not yet feasible by conventional reverse genetics methodology.
Thus, coronavirus RNA recombination is an important biological phenomenon for coronavirus and serves as an excellent model for viral RNA recombination in general.

1. Recombination between nonsegmented RNA genomes of murine coronaviruses
2. Coronavirus: organization, replication and expression of genome
3. Evidence for coronavirus discontinuous transcription
4. Coronavirus leader RNA regulates and initiates subgenomic mRNA transcription, both in trans and in cis
5. Multiple recombination sites at the 5'-end of murine coronavirus RNA
6. In vivo RNA-RNA recombination of coronavirus in mouse brain
7. RNA recombination of murine coronaviruses: Recombination between fusion-positive MHV-A59 and fusion-negative MHV-2
8. High-frequency RNA recombination of murine coronaviruses
9. RNA recombination of coronaviruses: Localization of neutralizing epitopes and neuropathogenic determinants on the carboxyl terminus of peplomers
10. Recombination between Sindbis virus RNA
11. Generation and analysis of nonhomologous RNA-RNA recombinants in brome mosaic virus: Sequence complementarities at crossover sites
12. Recombination between satellite RNAs of turnip crinkle virus
13. RNA recombination in animal and plant viruses
14. Establishing a genetic recombination map for murine coronavirus strain A59 complementation groups
15. Genetics of picornaviruses
16. Evidence for variable rates of recombination in the MHV genome
17. Map locations of mouse hepatitis virus temperature-sensitive mutants: Confirmation of variable rates of recombination
18. Phylogeny of antigenic variants of avian coronavirus IBV
19. Sequence analysis of strains of avian infectious bronchitis coronavirus isolated during the 1960s in the UK
20. Evidence of natural recombination within the S1 gene of the infectious bronchitis virus
21. Evolutionary implications of genetic variations in the S1 gene of infectious bronchitis virus
22. A novel variant of avian infectious bronchitis virus resulting from recombination among three different strains
23. Experimental evidence of recombination in coronavirus infectious bronchitis virus
24. Structure of the intracellular defective viral RNAs of defective interfering particles of mouse hepatitis virus
25. Natural evolution of coronavirus defective-interfering RNA involves RNA recombination
26. Translation but not the encoded sequence is essential for the efficient propagation of the defective interfering RNAs of the coronavirus mouse hepatitis virus
27. A cis-acting viral protein is not required for the replication of a coronavirus defective-interfering RNA
28. The fitness of defective interfering murine coronavirus DI-a and its derivatives is decreased by nonsense and frameshift mutations
29. Generation and selection of coronavirus defective interfering RNA with large open reading frame by RNA recombination and possible editing
30. High-frequency leader sequence switching during coronavirus defective interfering RNA replication
31. The UCUAAAC promoter motif is not required for high-frequency leader recombination in bovine coronavirus defective interfering RNA
32. Repair and mutagenesis of the genome of a deletion mutant of the coronavirus mouse hepatitis virus by targeted RNA recombination
33. RNA recombination in a coronavirus: Recombination between viral genomic RNA and transfected RNA fragments
34. Homologous RNA recombination allows efficient introduction of site-specific mutations into the genome of coronavirus MHV-A59 via synthetic co-replicating RNAs
35. Optimization of targeted RNA recombination and mapping of a novel nucleocapsid gene mutation in the coronavirus mouse hepatitis virus
36. Analysis of second-site revertants of a murine coronavirus nucleocapsid protein deletion mutant and construction of nucleocapsid protein mutants by targeted RNA recombination
37. Deletion mapping of a mouse hepatitis virus defective-interfering RNA reveals the requirement of an internal and discontiguous sequence for replication
38. Analysis of cis-acting sequences essential for coronavirus defective interfering RNA replication
39. Construction of murine coronavirus mutants containing interspecies chimeric nucleocapsid proteins
40. Sequence evidence for RNA recombination in field isolates of avian coronavirus infectious bronchitis virus
41. Sequence of mouse hepatitis virus A59 mRNA 2: Indications for RNA-recombination between coronavirus and influenza C virus
42. Comparison of the genome organization of toro- and coronaviruses: evidence for two nonhomologous RNA recombination events during Berne virus evolution
43. Analysis of intracellular small RNAs of mouse hepatitis virus: Evidence for discontinuous transcription
44. Template-determined, variable rate of RNA chain elongation
45. A clustering of RNA recombination sites adjacent to a hypervariable region of the peplomer gene of murine coronavirus
46. Random nature of coronavirus RNA recombination in the absence of selection pressure
47. The mechanism of RNA recombination in poliovirus
48. Identification of a 3'→5' exonuclease activity associated with human RNA polymerase II
49. RNA cleavage and chain elongation by Escherichia coli DNA-dependent RNA polymerase in a binary enzyme-RNA complex
50. Sequences and structures required for recombination between virus-associated RNAs
51. RNA determinants of junction site selection in RNA virus recombinants and defective interfering RNAs

I would like to thank Daphne Shimoda for editorial assistance. The author is an investigator of the Howard Hughes Medical Institute.