key: cord-0431163-xzt9h05v authors: Cook, Georgia M.; Brown, Katherine; Shang, Pengcheng; Li, Yanhua; Soday, Lior; Dinan, Adam M.; Tumescheit, Charlotte; Adrian Mockett, A. P.; Fang, Ying; Firth, Andrew E.; Brierley, Ian title: Ribosome profiling of porcine reproductive and respiratory syndrome virus reveals novel features of viral gene expression date: 2021-11-19 journal: bioRxiv DOI: 10.1101/2021.11.17.468997 sha: cee7a0a9e2d439a6581af2c1902083474c8bb47c doc_id: 431163 cord_uid: xzt9h05v

Porcine reproductive and respiratory syndrome virus (PRRSV) is an arterivirus which causes significant economic losses to the swine industry worldwide. Here, we use ribosome profiling (RiboSeq) and parallel RNA sequencing (RNASeq) to characterise the transcriptome and translatome of both species of PRRSV and analyse the host response to infection. We quantified viral gene expression over a timecourse of infection, and calculated the efficiency of programmed ribosomal frameshifting (PRF) at both sites on the viral genome. At the nsp2 frameshift site (a rare example of protein-stimulated frameshifting), −2 PRF efficiency increases over time, likely facilitated by accumulation of the PRF-stimulatory viral protein (nsp1β) during infection. This marks arteriviruses as the second example of temporally regulated PRF. Surprisingly, we also found PRF efficiency at the canonical ORF1ab frameshift site increases over time, in apparent contradiction of the common assumption that RNA structure-directed frameshift sites operate at a fixed efficiency. This has potential implications for the numerous other viruses with canonical PRF sites. Furthermore, we discovered several highly translated additional viral ORFs, the translation of which may be facilitated by multiple novel viral transcripts.
For example, we found a 125-codon ORF overlapping nsp12, which is expressed as highly as nsp12 itself at late stages of replication, and is likely translated from novel subgenomic (sg) RNA transcripts that overlap the 3′ end of ORF1b. Similar transcripts were discovered for both PRRSV-1 and PRRSV-2, suggesting a potential conserved mechanism for temporal regulation of expression of the 3′-proximal region of ORF1b. In addition, we identified a highly translated, short upstream ORF (uORF) in the 5′ UTR, the presence of which is highly conserved amongst PRRSV-2 isolates. This is the first application of RiboSeq to arterivirus-infected cells, and reveals new features which add to the complexity of gene expression programmes in this important family of nidoviruses.

[...] formerly known as "European" (Type 1) and "North American" (Type 2) PRRSV, share just ~60% pairwise nucleotide similarity and were recently re-classified as two separate species, Betaarterivirus suid 1 and 2 (viruses named PRRSV-1 and PRRSV-2) [5-7]. For ease of reference, PRRSV-1 is herein referred to as EU (European) and PRRSV-2 as NA (North American) PRRSV, although both lineages are observed worldwide [8].

The PRRSV genome (14.9-15.5 kb; Figure 1A) is 5′-capped, 3′-polyadenylated and directly translated following release into the cytoplasm [9]. Like most members of the order Nidovirales, PRRSV replication includes the production of a nested set of subgenomic (sg) RNAs by discontinuous transcription, where the viral RNA-dependent RNA polymerase (RdRp) jumps between similar sequences in the 3′-proximal region of the genome and the 5′ UTR, known as body and leader transcription regulatory sequences (TRSs), respectively [5,10]. These sgRNAs are 5′- and 3′-co-terminal and are translated to express the structural proteins encoded towards the 3′ end of the genome [5,10]. The 5′-proximal two thirds of the genome contains two long ORFs, ORF1a and ORF1b, with a −1 programmed ribosomal frameshift (PRF) site present at the overlap of the two ORFs [11,12].
Ribosomes that frameshift at this site synthesise polyprotein (pp)1ab, while the remainder synthesise pp1a, both of which are cleaved by viral proteases into several non-structural proteins (nsps) [5,13]. The proteins encoded by ORF1b include the RdRp and the helicase, and frameshifting at this site is thought to set the stoichiometry of these proteins relative to those encoded by ORF1a, a prevalent expression strategy in the Nidovirales order [14].

[...] ORFs are coloured and offset on the y axis according to their frame relative to ORF1a (0: purple, no offset; +1/-2: blue, above axis; +2/-1: yellow, below axis). Subgenomic RNAs are shown beneath the full-length genomic RNA, with the region of 5′ UTR that is identical to the genomic 5′ UTR shown in grey (known as the "leader" [...] and ~45 nt for RNASeq. Note that the RiboSeq library 9 hpi mock replicate three was discarded due to poor quality.

Quality control analyses were performed as previously described [49] (Figure 1C, Supplementary Figures 1-6). The length distribution of CDS-mapping RPFs is observed to peak at ~21 nt (where fragments of this length were purified) and at ~29 nt, with RPFs of these lengths thought to originate from, respectively, ribosomes with an empty A site or an A site occupied by aminoacyl-tRNA (Figure 1B [...] mapping reads compared to host-mapping reads in some NA PRRSV RiboSeq libraries at late timepoints (Figure 1C, Supplementary Figure 2) suggests that a small proportion of viral reads originate from non-RPF sources, such as protection from RNase I digestion by viral ribonucleoprotein (RNP) complex formation. This non-RPF fraction of the library (henceforth referred to as RNP contamination although it could originate from several sources) is predominantly noticeable among reads mapping to the ORF1b region of the viral genome (Supplementary Figures 3 and 4), where the read depth from genuine translation is lowest.
RiboSeq read lengths for which a high proportion of reads map to phase 0 were inferred to be least likely to have a high proportion of RNP contamination (Supplementary Figure 4), and were selected for all NA PRRSV RiboSeq analyses henceforth, unless specified. RNP contamination is not a relevant concern for RNASeq libraries (as proteins are enzymatically digested before RNA purification) and it does not noticeably affect the EU PRRSV RiboSeq libraries, nor RPFs mapping to the host transcriptome (Supplementary Figures 5 and 6). Overall, we inferred that these datasets have a high proportion of RiboSeq reads representing genuine RPFs, and where RNP contamination is evident in lowly translated regions of the viral genome its effects will likely be ameliorated by stratification of read lengths.

Having confirmed data quality, we moved on to analyse virus replication over the timecourse by plotting [...] below. With the exception of this highly expressed uORF, the transcriptional and translational profile of EU PRRSV at 8 hpi is similar to that of NA PRRSV at 9 hpi, although the production and translation of sgRNAs relative to ORF1a is slightly lower (Figure 2, Figure 3). In all RiboSeq libraries, we noted a variable proportion of negative-sense reads that mapped to the viral genome; however, they do not display the characteristic length distribution or phasing of genuine RPFs (Supplementary Figure 9), suggesting they originate from other sources (discussed above). They are therefore excluded from plots and analyses hereafter.

[...] per million mapped reads (RPM) on the WT viral genome, after application of a 45-nt running mean filter, from cells harvested over a timecourse of 3-12 hpi. Positive-sense reads are plotted in green (above the horizontal axis), negative-sense in orange (below the horizontal axis).
The WT libraries with the best RiboSeq quality control results were selected for this plot (3 hpi replicate one, 6 hpi replicate two, 9 hpi replicate four, 12 hpi replicate one), with further replicates and KO2 libraries shown in Supplementary Figure 7. C) RiboSeq read densities on the WT viral genome from the counterpart libraries to B. Reads were separated according to phase (0: purple, -2/+1: blue, -1/+2: yellow), and densities plotted after application of a 15-codon running mean filter. Only read lengths identified as having minimal RNP contamination (indicated in Supplementary Figure 4) [...]

[...] Genome map constructed as in Figure 1A, with subgenomic RNAs omitted for space considerations. B) RNASeq read densities on the EU PRRSV genome. Plot constructed as in Figure 2B. C) RiboSeq read densities on the EU PRRSV genome. Plot constructed as in Figure 2C, except for the selection of read lengths to include: in this case, read lengths showing good phasing were selected for inclusion (indicated in Supplementary Figure 6D [...]

[...] only a handful of non-canonical transcripts have been discovered for PRRSV, through relatively low-throughput methods. Here, we characterise the PRRSV transcriptome in more detail by examining novel junctions in RNASeq reads aligned to the genome using STAR [60]. Borrowing terminology from the process of splicing (which is unrelated to discontinuous transcription), we refer to the 5′-most and 3′-most positions of the omitted region (where "5′-most" and "3′-most" refer to orientation with respect to the positive-sense genome) as, respectively, the "donor" and "acceptor" sites, and their joining site as the "junction".
Neighbouring junctions were merged to account for the potential ambiguity in assigning the exact junction site of discontinuous transcription events, and merged junctions were filtered to keep only those present in more than one replicate within each timepoint (where WT and KO2 were treated as equivalent) to generate one set of junctions per timepoint (Supplementary Tables 3-7; [...]

[...] NA PRRSV dataset, although usage of the secondary body TRS found in VR-2332 (beginning at position 14,875 on the SD95-21 genome) was much more frequent than that of tw91, consistent with the fact that SD95-21 is more closely related to VR-2332. This more abundant secondary transcript, herein termed N-short, has a 5′ UTR 114 nt shorter than that of the NA PRRSV primary transcript (herein termed N-long, with body TRS beginning at position 14,761), presenting a potential opportunity for differential translation regulation. If such regulation exists, it is unlikely to be temporal, as the ratio of N-long to N-short remains constant, at approximately 6:1, between 9 and 12 hpi. Any such regulation would also likely be isolate-dependent, as the N-short body TRS is not completely conserved amongst NA PRRSV isolates, and species-dependent, as the N-long body TRS is neither highly conserved nor highly utilised in EU PRRSV, for which N-short is ~60-fold more abundant than any other N transcript and its body TRS is absolutely conserved.

In addition to the numerous novel sgRNAs predicted to express full-length structural proteins, we found that most canonical sgRNAs have transcript variants with body TRSs downstream of the start codon, which are expected to express truncated forms of the structural proteins (Figure 4B, Figure 5).
One of these was also observed for VR-2332 PRRSV: the "5-1" transcript variant [12], which is thought to express a truncated form of GP5, and is present in our NA PRRSV dataset at ~1.7% of the abundance of the primary GP5 transcript (based on the number of junction-spanning reads at the donor site). Similar GP5 transcript variants were observed in SHFV, and mutagenesis studies suggest that the truncated GP5 may be beneficial for viral fitness [45], raising the possibility that the putative truncated forms of this and other PRRSV structural proteins could be functional.

[...] A) Sashimi plots of junctions for NA PRRSV at 9 hpi. Plots constructed as in Figure 4B, but with the threshold for inclusion of junctions adjusted to ≥ 100 junction-spanning reads in total from all 9 hpi libraries (note that eight libraries were analysed at this timepoint compared to four at other timepoints). B) Sashimi plots of junctions for EU PRRSV at 8 hpi. Plots constructed as in Figure 4B, but with the threshold for inclusion of junctions adjusted to ≥ 10 supporting reads (note that only two libraries were analysed and shorter read lengths are expected to lead to fewer identifiably junction-spanning reads).

In addition to the transcript variants for the structural proteins, a small number of non-canonical sgRNAs were discovered in both NA and EU PRRSV which have acceptor sites within ORF1b, herein termed ORF1b sgRNAs (Figure 4B, Figure 5). This was unexpected as ORF1b is thought to be expressed only from gRNA, which is much less abundant than sgRNAs at late timepoints and is relatively inefficiently translated (see below). ORF1b sgRNAs, even of low abundance relative to canonical sgRNAs, could therefore have a significant effect on the expression of polyprotein products. This is explored further below.
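The junction processing described earlier in this section (merging neighbouring junctions, then keeping only merged junctions seen in more than one replicate of a timepoint) can be sketched as follows. This is a minimal illustration under stated assumptions and not the pipeline used in this study: the function names, the 2-nt `window` tolerance and the toy coordinates are hypothetical, and the input is assumed to be one dictionary of junction-spanning read counts per library.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# A junction is (donor, acceptor): the 5'-most and 3'-most positions of the
# omitted region, in positive-sense genome coordinates.
Junction = Tuple[int, int]


def find_representative(junction: Junction,
                        representatives: Iterable[Junction],
                        window: int) -> Optional[Junction]:
    """Return the first representative whose donor and acceptor both lie
    within `window` nt of `junction`, or None if there is no such one."""
    donor, acceptor = junction
    for rep in representatives:
        if abs(rep[0] - donor) <= window and abs(rep[1] - acceptor) <= window:
            return rep
    return None


def merge_junctions(counts: Dict[Junction, int],
                    window: int = 2) -> Dict[Junction, int]:
    """Greedily merge neighbouring junctions, best-supported first.
    `window` is an illustrative tolerance, not a value from the study."""
    merged: Dict[Junction, int] = {}
    for junction, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        rep = find_representative(junction, merged, window)
        if rep is None:
            merged[junction] = n   # start a new merged junction
        else:
            merged[rep] += n       # fold reads into an existing one
    return merged


def reproducible_junctions(replicates: List[Dict[Junction, int]],
                           window: int = 2,
                           min_replicates: int = 2) -> Dict[Junction, int]:
    """Pool the libraries of one timepoint, merge neighbouring junctions and
    keep merged junctions supported by at least `min_replicates` libraries."""
    pooled: Dict[Junction, int] = defaultdict(int)
    for library in replicates:
        for junction, n in library.items():
            pooled[junction] += n
    merged = merge_junctions(pooled, window)

    support = defaultdict(set)
    for index, library in enumerate(replicates):
        for junction in library:
            rep = find_representative(junction, merged, window)
            if rep is not None:
                support[rep].add(index)
    return {rep: n for rep, n in merged.items()
            if len(support[rep]) >= min_replicates}


# Toy counts (hypothetical coordinates); WT and KO2 are treated as equivalent.
wt = {(100, 5000): 50, (101, 5001): 5, (100, 3000): 3}
ko2 = {(100, 5000): 40, (900, 4500): 2}
print(reproducible_junctions([wt, ko2]))  # {(100, 5000): 95}
```

Visiting the best-supported junctions first means that a low-count junction one or two nucleotides away is absorbed into its dominant neighbour rather than the reverse; other clustering choices are equally defensible.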
Deletions, in which the leader TRS is not the donor site, tend to have fewer junction-spanning reads than sgRNAs, but nonetheless may influence gene expression. Many of these likely represent defective interfering (DI) RNAs; however, several of the long-range deletions in the NA PRRSV 12 hpi dataset bear similarity to "heteroclite" sgRNAs, a family of non-canonical transcripts found in several NA PRRSV isolates [44,62]. Heteroclite sgRNA formation is thought to be directed by short (2-12 nt) regions of similarity between the donor site, located within ORF1a, and the acceptor site, usually located within the ORFs encoding structural proteins [44,62,63]. These transcripts can be packaged into virions but, unlike classical DI RNAs, they do not appear to interfere with canonical gRNA or sgRNA production and are present in a wide range of conditions, including low MOI passage and samples directly isolated from the field [44,62]. In our datasets, the most abundant deletion at 12 hpi is identical to the junction that forms the "S-2" heteroclite sgRNA for VR-2332 PRRSV, from which a fusion of the first 520 amino acids of ORF1a (nsp1α, nsp1β and part of nsp2) and the last 11 amino acids of 5a is thought to be expressed [44,62]. At 12 hpi, 2.3% of reads at the donor site span this junction, which indicates that this is a minor transcript relative to gRNA. However, this could be enough to affect gene expression: for example, it is only ~5-fold lower than the corresponding percentage for the major GP4 junction, the canonical sgRNA with the fewest junction-spanning reads. Although this junction is not present above the limit of detection at 6 hpi, it is observed at 9 hpi (Supplementary Table 5; total read counts below the threshold for inclusion in Figure 5A) and at 3 hpi (Figure 4A, upper), consistent with this transcript being packaged into virions [44,62].
No transcripts resembling heteroclite sgRNAs were detected for EU PRRSV, although it is possible such transcripts might be observed if a later timepoint was sampled and/or longer RNASeq inserts were generated, as the shorter read lengths purified for these libraries (and NA PRRSV 9 hpi replicate one) are less amenable to detection of junctions.

The numerous novel transcripts described in this section not only present opportunities for regulation of the known PRRSV proteins, but also highlight considerable flexibility in the transcriptome, which provides a platform for expression of truncated protein variants and novel ORFs. Nonetheless, it is likely that many of the lowly abundant novel transcripts are simply an unavoidable consequence of a viral replication complex that has evolved to facilitate discontinuous transcription as an essential [...]

[...] ORFs are highly translated: for example, the 125-codon NA PRRSV nsp12-iORF is translated at a level similar to nsp12 at 12 hpi (Figure 6C). To test whether these novel ORFs in either virus are subject to purifying selection (an indicator of functionality), we analysed synonymous site conservation within the known functional viral ORFs (Figure 6A and Supplementary Figure 11A). Overlapping functional elements are expected to place additional constraints on evolution at synonymous sites, leading to local peaks in synonymous site conservation. While such peaks were observed in the regions where the known viral ORFs overlap (and also within the M ORF and at the 5′ end of ORF1a), no large conservation peaks were observed in the vicinity of the novel, translated overlapping ORFs, indicating their functional relevance is debatable.

As mentioned earlier, we also identified a uORF in the NA PRRSV 5′ UTR (Figure 6D), which is highly expressed at all timepoints (Figure 2C, blue peak).
At only ten amino acids, the peptide expressed from this uORF is unlikely to be functional, and the ORF is truncated or extended in a small proportion of isolates (Figure 6E). However, the presence of a uORF in this position is highly conserved in NA PRRSV (Figure [...]

[...] which was designated as the start site by PRICE for the "Candidate location" before application of start site selection algorithms (coordinates in Supplementary [...] predicted to have terminated due to a stop codon. One sequence out of these 98 (KY348852) has a 28-codon extension to the ORF which is not depicted. F and G) Conservation of F) the initiation context and G) the stop codon for the NA PRRSV uORF, based on 661 NA PRRSV sequences. Sequences were filtered to take only those spanning the entire feature of interest with no gaps, leaving F) 564 and G) 598 sequences in the alignment used for the logo plots. The initiator AUG is indicated by a black box. The initiation context of this ORF is weak, as defined by the absence of a G at position +4 or an A/G at position -3 relative to the A of the AUG, but the sequence is highly conserved.

[...] Canonical transcripts and ORFs are labelled in black, novel ones in red. The genome map from nsp5 onwards is reproduced above for comparison. The leader (grey) is treated as a separate transcript for the purposes of these analyses, and the NA PRRSV uORF putatively expressed from it was omitted from these plots for clarity. Where more than one 5′ UTR is depicted for some mRNAs this indicates that multiple merged junctions were detected that likely give rise to transcripts from which the same ORF(s) are translated. In these cases, the alternative transcripts were considered as one species in the gene expression analysis, and junction-spanning read counts for the junctions were combined.
To the right of each transcript, the consensus sequence of the body TRS used to generate the major transcript variant (indicated by the thicker UTR) is plotted, based on A) 661 NA PRRSV or B) 120 EU PRRSV genome sequences. For ease of identification, both N-long and N-short are depicted as major transcripts for N. In addition to these sgRNAs and ORFs, ORF1a and all novel ORFs not depicted here were included in the analysis and designated as expressed from the gRNA transcript.

[...] PRRSV are plotted as filled circles, with individual data points omitted for clarity and some circles offset on the x axis to aid visualisation. ORF1b is omitted from this and several other plots in this section, and investigated separately in Figure 9A-C. Investigation of variations in transcription and translation within ORF1a is given in Figure 9D; for all other analyses in this section, "ORF1a" refers to the region designated by PRICE, which begins at genomic coordinates 2249 (NA) and 1212 (EU) and extends to the end of ORF1a. Similarly, "gRNA" transcript abundance is calculated ignoring the existence of putative heteroclite sgRNAs, except for where this is investigated in Figure 9D.

[...] Except for the absence of a uORF, the relative translation levels of EU PRRSV ORFs are similar to those in NA PRRSV, although with less translation of 5a (Figure 8C, Supplementary Figure 14). This may reflect the different relative arrangements of GP5 and 5a for these two isolates, with 5a beginning 5 nt downstream of the beginning of GP5 for EU PRRSV and 10 nt upstream for NA PRRSV. TE values for EU PRRSV are slightly higher than those for NA PRRSV (Supplementary Figure 15); however, this may be influenced by reduced accuracy of transcript abundance quantification due to the shorter read lengths of the EU libraries.

Novel ORFs make up a relatively small proportion of total viral translation (Figure 8C).
Nonetheless, they may represent a significant contribution to the viral proteome: for example, the novel ORFs overlapping the end of ORF1b have a density of ribosomes similar to that of ORF1a at 12 hpi (Figure 8C, Supplementary Figure 14). These overlapping ORFs are not subject to noticeable purifying selection (Figure 6A, Supplementary Figure 11A), indicating they are unlikely to produce functional proteins. This raises the possibility that their translation is tolerated as a side effect of ORF1b sgRNA production, which may primarily function to regulate expression of ORF1b. This is supported by the observed step increases in ORF1b-phase RiboSeq density after some of the ORF1b sgRNA body TRSs at late timepoints (Figure 9A and B), a trend confirmed by quantification of this read density in the regions between these body TRSs (Figure 9C). At 6 hpi, when no ORF1b sgRNAs are detected, read density remains reasonably constant throughout ORF1b, while at later timepoints, as ORF1b sgRNA expression increases, a pattern of increasing density towards the 3′ end of ORF1b emerges, with the 3′-most regions more highly translated than ORF1a (Figure 9C). For NA PRRSV, the greatest step increases are observed after the ORF1b sgRNA 2, 5 and 7 body TRSs (Figure 9A and C), which belong to the only non-canonical sgRNAs in Figure 7 that have just a single mismatch in the body TRS compared to the leader TRS. These body TRSs are also well-conserved, particularly the final two Cs, identified as the most highly conserved part of the canonical sgRNA body TRS consensus in this and other studies [12,61] (Figure 7). This raises the likelihood that such body TRSs may also produce ORF1b sgRNAs in other isolates of NA PRRSV.
Furthermore, although the body TRSs for the EU PRRSV ORF1b sgRNAs are less well-conserved within the species (Figure 7), they are located at very similar positions on the genome compared to the NA PRRSV ORF1b sgRNA 2 and 5 body TRSs, which correlate with two of the greatest increases in ORF1b-phase RiboSeq read density for NA PRRSV. Indeed, the EU PRRSV ORF1b sgRNA 2 body TRS is in a genomic location exactly equivalent to that of NA PRRSV ORF1b sgRNA 5, and both body TRSs have only a single mismatch compared to the leader TRS, despite this not being a requirement for maintaining the amino acid identities at this position. The conservation of these features of ORF1b sgRNAs between these two highly divergent arterivirus species suggests there may be a selective advantage in their production, which could result from temporal modulation of the stoichiometry of nsps 10-12.

Similarly, the heteroclite sgRNAs have the potential to modulate the stoichiometry of ORF1a. To examine this, the RNASeq read density in ORF1ab was partitioned between gRNA and heteroclite sgRNAs (a distinction not made in the junction-spanning read analysis) using a "decumulation" procedure introduced in Irigoyen et al. [49], and RiboSeq read density in three regions of ORF1a was determined (Figure 9D). NA PRRSV RiboSeq read density upstream of the major (S-2) heteroclite junction is considerably higher than in the downstream regions, with the highest ratio of heteroclite:ORF1a (before FS) translation being reached at 12 hpi, consistent with the increased ratio of heteroclite:gRNA RNASeq density at this timepoint (Figure 9D). This supports the hypothesis that the N-terminal region of ORF1a can be independently translated from heteroclite sgRNAs (as well as from gRNA as part of pp1a/ab) during infection, which could function to increase the ratio of nsp1α and nsp1β compared to the other nsps (Figure 9D).
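The arithmetic behind "decumulation", as summarised in the Figure 9D legend, can be illustrated with a short calculation. The sketch below assumes that RNASeq densities (for example, RPKM) have already been computed for the region upstream of the major heteroclite junction and for the region between that junction and the nsp2 PRF site. The function names, the clamping of negative values to zero and the example numbers are assumptions made for illustration, not details or values from this study.

```python
from typing import Dict


def decumulate(upstream_density: float, downstream_density: float) -> Dict[str, float]:
    """Partition RNASeq density between gRNA and heteroclite sgRNAs.

    downstream_density: density between the major heteroclite junction and
        the nsp2 PRF site, a region covered by gRNA only.
    upstream_density: density between the start of ORF1a and the junction,
        a region covered by both gRNA and heteroclite sgRNAs.
    """
    grna = downstream_density
    # Clamping at zero is an assumption of this sketch, to guard against
    # noise making the upstream density lower than the downstream density.
    heteroclite = max(upstream_density - downstream_density, 0.0)
    return {"gRNA": grna, "heteroclite": heteroclite}


def translation_efficiency(riboseq_density: float, transcript_abundance: float) -> float:
    """TE as RiboSeq density divided by RNASeq-derived transcript abundance."""
    if transcript_abundance <= 0:
        raise ValueError("transcript abundance must be positive")
    return riboseq_density / transcript_abundance


# Hypothetical densities, for illustration only.
parts = decumulate(upstream_density=130.0, downstream_density=100.0)
print(parts)  # {'gRNA': 100.0, 'heteroclite': 30.0}

# For the "heteroclite" region the denominator combines gRNA and the
# decumulated heteroclite abundance, as stated in the Figure 9D legend.
te_heteroclite = translation_efficiency(260.0, parts["gRNA"] + parts["heteroclite"])
print(te_heteroclite)  # 2.0
```

Because every heteroclite sgRNA contributes reads to the upstream region, the decumulated value is a combined estimate for all heteroclite transcripts rather than for S-2 alone.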
Consistent with the previous analysis (Supplementary Figure 15), the TE of ORF1a decreases over time for both regions upstream of the nsp2 PRF site (Figure 9D). TE in the region after the PRF site does not follow the same trend, likely due to the unexpected increase in RiboSeq read density after the nsp2 frameshift site at 9 and 12 hpi. This is contrary to expectation, as ribosomal frameshifting into nsp2TF should decrease the ribosome density downstream of the nsp2TF stop codon. The reason for this is unclear; perhaps it is a consequence of expressed non-canonical transcripts below the threshold of detection, or biological and/or technical biases. Despite the absence of detectable EU PRRSV heteroclite sgRNAs in the junction-spanning read analysis (Figure 5B), analogous calculations were performed to investigate heteroclite sgRNA and ORF1a expression in EU PRRSV (Figure 9D), revealing RNASeq and RiboSeq outcomes consistent with the presence of translated heteroclite sgRNAs (Figure 9D). These transcripts could potentially be present below the threshold of detection for the junction-spanning read analysis pipeline. Taken together, these results demonstrate that the non-canonical transcripts discovered in this study provide a potential mechanism to temporally regulate the stoichiometry of the polyprotein components, which may reflect changing requirements for the different non-structural proteins throughout infection.

[...] the NA PRRSV genome. Plots constructed as in Figure 2B and Figure 6B, respectively, with the application of a 213-nt running mean filter. Dotted lines indicate body TRS positions, with junction-spanning read abundances supporting body TRSs reproduced from Figure 6B, and the designated ORF1b sgRNA number indicated above in red. For RNASeq, all read lengths were used, and for RiboSeq, read lengths identified as having minimal RNP contamination were used.
The libraries displayed are those in Figure 2B and C, with remaining replicates and KO2 libraries in Supplementary Figure 16. B) Distribution of RNASeq (upper) and RiboSeq (lower) reads mapping to the ORF1b region of the EU PRRSV genome. For RNASeq, all read lengths were used, and for RiboSeq, read lengths with good phasing were used. Plot constructed as in panel A, with junction-spanning read abundances supporting body TRSs reproduced from Supplementary Figure 11B. The body TRS annotated at the end of nsp12 does not represent an ORF1b sgRNA, but is expected to produce an alternative transcript for GP2. C) RiboSeq read density attributed to different regions of ORF1b. ORF1b was divided into regions based on the positions of the ORF1b sgRNAs, and RiboSeq density of reads in-phase with ORF1b was determined. All sgRNA numbers in the x axis labels refer to ORF1b sgRNAs. RiboSeq density in ORF1a (the "before FS" region from panel D) is included for comparison and its mean value is indicated by a solid grey line. Plot constructed as in Supplementary Figure 13, using a linear scale. D) Gene expression in different regions of ORF1a. Transcript abundance for gRNA was calculated by determining RNASeq read density (RPKM) in the region between the major heteroclite (S-2) junction and the nsp2 PRF site. This was subtracted from the density between the beginning of ORF1a and the major heteroclite junction to "decumulate" (decum.) the density in this region and estimate the abundance of the heteroclite transcripts (where all heteroclite sgRNAs contribute read density but the normalisation for length is based only on S-2).
RiboSeq read density was calculated in the region between the beginning of ORF1a and the major heteroclite sgRNA junction ("heteroclite"), the region between this junction and the nsp2 PRF site ("before FS"), and the region between the nsp2 and ORF1ab frameshift sites ("after FS"). Although no junctions were detected for putative heteroclite sgRNAs in the EU dataset, regions were designated analogously to NA PRRSV, for comparison. For TE of the "heteroclite" region, the denominator was both gRNA and (decumulated) heteroclite sgRNA combined. Plot constructed as in Supplementary Figure 13, using a linear scale, and with WT and KO2 values indicated by crosses and triangles, respectively.

Another key mechanism by which the stoichiometry of the polyprotein components is controlled is PRF. The ORF1ab frameshift site facilitates a reduction in the ratio of nsp9-12 compared to the upstream proteins [11,12], whereas frameshifting at the nsp2 site produces three variants of nsp2 and causes a proportion of ribosomes to terminate before reaching nsp3 [23-25,32]. The occurrence of both frameshift events is evident on the WT NA and EU PRRSV genomes from the changes in phasing after the PRF sites (Figure 10A and B, Supplementary Figures 17-20).

We began by quantifying the efficiency of frameshifting at the nsp2 site. Commonly, from profiling data, frameshift efficiency is calculated using the ratio of the read density upstream of the PRF site compared to downstream, where density is expected to be lower due to termination of either the 0-frame or the transframe ORF [43,49-51]. However, at the NA PRRSV nsp2 PRF site, ribosome drop-off at the end of nsp2N and nsp2TF is not evident (Figure 10A, Supplementary Figure 17), with an increase in RiboSeq read density after the frameshift site, as discussed above (Figure 9D).
This increase is not seen 660 in the counterpart RNASeq libraries (Supplementary Figure 21) and is not related to frameshifting, as 661 it also occurs in the KO2 mutant, in which nsp2 frameshifting is prevented. Plot constructed as in Figure 2C . Regions defined as "upstream" and "downstream" in the 666 frameshift efficiency calculations for the nsp2 and ORF1ab sites are annotated above the genome 667 map. Only read lengths identified as having minimal RNP contamination (indicated in 668 Supplementary Figure 4) were used to generate this plot. Replicates shown are noCHX-Ribo-669 6hpi-WT-2, noCHX-Ribo-6hpi-KO2-2, noCHX-Ribo-9hpi-WT-4, noCHX-Ribo-9hpi-KO2-3, 670 noCHX-Ribo-12hpi-WT-1 and noCHX-Ribo-12hpi-KO2-1, with remaining replicates in 671 Supplementary Figure 17 . The heightened peak shortly after the beginning of nsp2 corresponds 672 to ribosomes with proline codons, which are known to be associated with ribosomal pausing 69-71 , 673 in both the P and A sites (P site genomic coordinates 1583-1585). The S-2 heteroclite junction is 674 shortly downstream (genomic coordinate 1747), and is excluded from the nsp2 upstream region. 675 Similarly, the ORF1ab downstream region ends upstream of the body TRS for ORF1b sgRNA 1. to factor out differences in translation speed and/or biases introduced during library preparation (see 697 Methods for details). Using the resulting quotient profile, we then compared densities upstream of the 698 nsp2 frameshift site and downstream of the nsp2TF stop codon in order to calculate the combined −2/−1 699 frameshift efficiency at different timepoints. However, the nsp2-site frameshift efficiencies calculated 700 using this method were quite variable (Supplementary Figure 22A and B) . 
This may be due to the 701 modest level of frameshifting at this site (see below) meaning ribosomal drop-off is low relative to the 702 level of non-frameshift translation, besides the extra complications of (temporally dependent) 703 heteroclite and noncanonical sgRNA production in PRRSV. This is in contrast to cardioviruses, where 704 the frameshift efficiencies reach ~80% 40,43 and there is only a single transcript species (full-length 705 gRNA), or frameshifting at the ORF1ab site, where it is only frameshifted ribosomes rather than non-706 frameshifted ribosomes that contribute to downstream RiboSeq density. 707 Therefore, we instead quantified −2 PRF efficiency at the nsp2 site by comparing the proportion of 708 reads in each phase in the upstream and transframe regions (see Methods for details). This led to much 709 greater reproducibility between replicates, and revealed that −2 PRF efficiency significantly increases, 710 from 23% at 6 hpi to 39% at 9 hpi, at which point it reaches a plateau ( Figure 10C , Supplementary 711 Figure 22 ; p < 0.0005 based on bootstrap resampling). Although these calculations could be 712 systematically biased if translation were slower in one frame than the other (for example due to sub-713 optimal codon usage resulting from maintaining two overlapping ORFs), such bias would not be 714 expected to change systematically over the timecourse of infection, and therefore the observed trend 715 should be robust. This is only the second known example of temporally regulated PRF (after 716 cardioviruses 40 ), and supports a model of increasing −2 PRF efficiency as nsp1β, the viral protein 717 responsible for stimulating PRF at this site, accumulates and then similarly starts to plateau at 9 hpi 718 ( Figure 3D and E). 
The −2 PRF efficiency on the EU PRRSV genome at 8 hpi was estimated to be 26%, which is similar to the 20% value determined by ³⁵S-Met radiolabelling of MARC-145 cells infected with the EU PRRSV isolate SD01-08 and harvested at 24 hpi (MOI 0.1) [23]. The efficiency of EU PRRSV −2 PRF at 8 hpi (26%) is significantly lower than the NA PRRSV efficiency at 9 hpi (39%; p < 0.0005 based on bootstrap resampling). This likely reflects differences between the two viruses as opposed to the difference in timepoints, as EU nsp1β has already accumulated by 8 hpi (Figure 3F […] Figure 17). We quantified the ratio of RiboSeq read density in the region downstream of the ORF1ab PRF site compared to that upstream to calculate frameshift efficiency (Figure 10D). PRF efficiency at RNA structure-directed sites is commonly assumed to be fixed; however, surprisingly, −1 PRF efficiency at this site also increased over the course of infection, from 11% at 6 hpi to 19% at 9 hpi (p value from two-tailed Mann-Whitney U test = 8.5 × 10⁻³), and further increased to 32% at 12 hpi (p = 8.5 × 10⁻³). The same trend was not observed in the RNASeq libraries (Figure 10D), which were processed as a negative control, indicating that it does not result from shared technical biases or from an increase in non-canonical transcripts facilitating translation of ORF1b (note that all detected ORF1b sgRNAs are excluded from the regions used). The ORF1ab −1 PRF efficiency on the EU PRRSV genome at 8 hpi was 23%, which is similar to the calculated efficiency for NA PRRSV at 9 hpi (Figure 10D). This is consistent with the replicase components being required at similar stoichiometries at this stage of infection for these two viruses.
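The density-ratio calculation used for the ORF1ab site can be illustrated with a minimal sketch. It assumes that only frameshifted ribosomes contribute to the downstream region, so that efficiency is simply downstream density divided by upstream density within one library. The function names and the read counts below are hypothetical and are not taken from the study; any additional normalisation described in the Methods is not reproduced here.

```python
def read_density(read_count: int, region_length_nt: int) -> float:
    """Reads per nucleotide in a region (library size cancels within one library)."""
    if region_length_nt <= 0:
        raise ValueError("region length must be positive")
    return read_count / region_length_nt


def frameshift_efficiency(upstream_reads: int, upstream_len: int,
                          downstream_reads: int, downstream_len: int) -> float:
    """Estimate PRF efficiency as downstream density / upstream density.

    Assumption: ribosomes reach the downstream region only by frameshifting,
    as at the ORF1ab site, so the ratio approximates the frameshifted fraction.
    """
    upstream = read_density(upstream_reads, upstream_len)
    downstream = read_density(downstream_reads, downstream_len)
    if upstream == 0:
        raise ValueError("upstream density is zero; efficiency is undefined")
    return downstream / upstream


# Hypothetical counts, chosen only to illustrate the arithmetic.
efficiency = frameshift_efficiency(
    upstream_reads=6000, upstream_len=3000,      # density 2.0 reads/nt
    downstream_reads=960, downstream_len=1500,   # density 0.64 reads/nt
)
print(f"-1 PRF efficiency: {efficiency:.0%}")
```

Applying the same function to RNASeq counts from the same regions gives the negative-control ratio, which is expected to stay flat if the trend is specific to translation.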
747 Ribosomal pausing over the slippery sequence is considered to be an important mechanistic feature of 748 PRF 18,19,72,73 , although it has been difficult to detect robustly on WT slippery sequences using ribosome 749 profiling 49-51 . To determine whether ribosomal pausing occurs over the nsp2 slippery sequence, we 750 plotted the RPF distribution on the WT genome in this region, and compared this to the KO2 genome 751 to control for shared biases (Supplementary Figure 23A ). This revealed a peak on the WT genome, 752 derived predominantly from 21-nt reads and corresponding to ribosomes paused with P site over the 753 slippery sequence (G-GUU-UUU, P site pause location underlined, hyphens delineate 0-frame codons) 754 (Supplementary Figure 23A Figure 11A and B, top-left 782 quadrants of right-column panels). Such an effect has previously been described as "translational 783 buffering" and is expected to result in little to no change in protein abundance 88 , suggesting that many 784 of the observed transcriptional changes make only a minor contribution to the host response to infection. 785 This is consistent with the observation that many of the GO terms enriched amongst the transcriptionally 786 up-regulated genes are also enriched in the translationally down-regulated genes (Supplementary Table 787 9, 'GO_TE_down' sheet). Comparisons between RNA and TE fold changes further reveal many genes 788 with large fold changes of TE and little to no change at the transcriptional level ( Figure 11A changes were determinable, these were compared (right), irrespective of p value. The full results 807 of these differential expression analyses, including lists of GO terms enriched in each set of 808 differentially expressed genes, are in Supplementary Table 9, Supplementary Table 10 and 809 Supplementary Table 11 . 
810 We moved on to compare the host response to infection with WT PRRSV to that of KO2 PRRSV, to 812 investigate the effects of the nsp2 frameshift products, nsp2TF and nsp2N ( Figure 11C , Supplementary 813 Table 11 ). As we expected relatively small differences in the gene expression programmes activated by 814 the two viruses, we lowered the magnitude of the log2(fold change) required to qualify as a 815 Nonetheless, TXNIP clearly stands out in the comparison of RNA and TE fold changes as it is both 830 more highly transcribed and more efficiently translated in KO2 than WT ( Figure 11C , right column), 831 further supporting the conclusion that increased TXNIP expression is a notable feature of KO2 832 infection. 833 The mechanism by which the presence of nsp2TF/nsp2N could lead to reduced TXNIP expression in 834 WT infection is unclear. The frameshift products share a PLP2 protease domain with the 0-frame 835 product, nsp2, although the DUB/deISGylase activity of this domain is most potent in nsp2N 32 . The 836 frameshift proteins also have different sub-cellular distributions to nsp2 [23, 36] , which may grant them 837 access to proteins involved in the transcriptional activation of TXNIP, allowing them to interfere with 838 this signalling pathway (for example by de-ubiquitinating its components). While the mechanism 839 remains elusive, there are several reasons why this down-regulation may be beneficial to PRRSV. 840 TXNIP is a key protein in metabolism and redox homeostasis 89,90 , regulates cell survival/apoptosis via 841 apoptosis signal regulating kinase 1 (ASK1) [91] , and triggers NLRP3 inflammasome activation in polarisation. Indeed, it has already been suggested that PRRSV infection induces a skew towards M2 855 polarisation, although the mechanism was uncharacterised 108 . 856 Here, we describe a high resolution analysis of PRRSV replication through ribosome profiling and 858 RNASeq. 
In addition to confirming and extending the findings of previous transcriptomic analyses, we define the PRRSV translatome and identify strategies of gene expression that may permit the virus to exert translational control during the replication cycle (Figure 12). […] host transcriptional responses being counteracted by opposing changes in TE has also been observed in response to SARS-CoV-2 infection, where it was attributed to inhibition of mRNA export from the nucleus, preventing translation [109, 110]. This activity is known to be associated with coronavirus nsp1, which inhibits nuclear export by interacting with the nuclear pore complex component Nup93 in SARS-CoV or the nuclear export factor NXF1 in SARS-CoV-2 [111, 112]. Similarly, PRRSV nsp1β causes imprisonment of host mRNAs in the nucleus, by binding Nup62 to cause the disintegration of the nuclear pore complex [113, 114]. This inhibition of host mRNA nuclear export has been found to reduce synthesis of interferon-stimulated genes [114], and mutations in nsp1β which ablate this activity lead to reduced viral load and increased neutralising antibody titres in pigs [115]. These findings suggest that nsp1β-mediated nuclear export inhibition may be responsible for the translational repression seen in our host differential gene expression analyses, which, analogously to conclusions drawn for SARS-CoV-2 [109, 110], may be a key mechanism by which PRRSV evades the host response to infection.
On the viral genome, numerous translated novel ORFs were discovered, including a short but highly expressed uORF in the NA PRRSV 5′ UTR. The presence of this uORF is very well-conserved amongst NA PRRSV isolates, with an AUG in this position in 558/564 available NA PRRSV sequences, and a non-canonical initiation codon (GUG/AUA/ACG) present in the remainder.
In many contexts, uORFs 884 have been shown to regulate translation of the main ORF downstream 116 were detrimental to viral fitness and led to rapid reversion to the WT sequence 119 , although the uORF 896 was not essential for virus viability 119,120 . Upstream ORFs in other arteriviruses have not been 897 characterised, although it was noted that the SHFV 5′ leader contains a putative 13-codon uORF at 898 genomic coordinates 35-73 [121] , which is conserved in all 37 available full genome sequences. This 899 SHFV uORF is of a similar length and position to the NA PRRSV uORF (ten codons; genomic 900 coordinates 24-36), with the uORFs in these two viruses ending, respectively, 126 nt and 128 nt 901 upstream of the leader TRS. The similarity between these two putative uORFs suggests a possible 902 conserved function, for example conferring resistance to eIF2α-phosphorylation-mediated translation 903 inhibition (as observed for some cellular uORFs 122 ), or affecting re-initiation efficiency on the 904 downstream ORF. The latter could help to modulate the ratio of overlapping ORFs (GP2:E and/or 905 GP5:5a) in which the downstream AUG is thought to be accessed by leaky scanning. However, only 906 three out of the 100 EU PRRSV genome sequences with any 5′ UTR have AUGs other than at the 907 extreme 5′ end of the leader, and we do not detect robust uORF translation in the EU PRRSV isolate 908 used in this study, indicating that any putative function of the uORF is not conserved across the entire 909 family. 910 Our analysis of canonical PRRSV ORFs over a 12 hour timecourse revealed that expression of many 911 of these ORFs is controlled by additional mechanisms, at both the transcriptional and translational level, 912 beyond what was previously appreciated. A key observation is that −2 PRF at the NA PRRSV nsp2 913 PRF site is both highly efficient and temporally regulated. 
At 6 hpi its efficiency is 23%, and this 914 increases to ~40% at 9 and 12 hpi, likely due to accumulation of the frameshift-stimulatory viral protein, 915 nsp1β. Such regulation may be a selective advantage for PRRSV by directing ribosomes to translate 916 proteins which are most beneficial at each stage of infection, optimising the use of cellular resources. 917 At early timepoints, lower nsp2 frameshift efficiency means more ribosomes continue to translate the 918 remainder of pp1a or pp1ab, which encode components of the replication and transcription complex 919 (RTC), which may be more important for establishing infection than translation of the accessory protein 920 nsp2TF. Later in the replication cycle, higher −2 PRF efficiency likely corresponds to an increased 921 requirement for nsp2TF to prevent degradation of GP5 and M, which are expressed from ~8 hpi and are 922 essential for virion assembly 36,123 . Further, nsp2TF is a more potent innate immune suppressor than 923 nsp2, and down-regulates expression of swine leukocyte antigen class I (swine MHC class I) 32,124 , which 924 may become critical later in infection as viral proteins and double-stranded RNA accumulate due to 925 viral translation and replication. The only other known example of temporally regulated frameshifting 926 is provided by cardioviruses 40 , which encode the only other known protein-stimulated PRF site 38-43 . 927 This suggests that temporal regulation may emerge as a common feature of trans-activated PRF sites, 928 as more non-canonical PRF sites are discovered in future. 929 RNA structure-directed frameshift sites are commonly assumed to operate at a fixed ratio, due to the 930 lack of trans-acting factors involved; however, we found that −1 PRF efficiency at the ORF1ab site 931 increased over time, from 11% to 32%. As opposed to representing specific "regulation" of PRF, we 932 suggest that this is due to changes in gRNA translation conditions as infection progresses. 
Such changes 933 could result from activation of pathways that globally regulate translation, such as the unfolded protein 934 response (which is known to be activated by PRRSV infection 125 ) or potential phosphorylation of eEF2 935 (discussed above). Additionally, changes in the localisation or availability of gRNA for translation 936 could result in changing ribosome density as infection progresses, and decreases in ribosome load have 937 been shown to increase −1 PRF efficiency in some studies 126,127 . The mechanism responsible for this 938 effect is not well characterised, although it has been suggested that frameshift-stimulatory RNA 939 structures are more likely to have time to re-fold in between ribosomes if the transcript is more sparsely 940 occupied 126,127 . Consistent with this hypothesis, we find a trend of decreasing gRNA TE over time, 941 although this analysis may be confounded by, for example, the inability to discern translatable gRNA 942 from that undergoing packaging. Whether the observed changes in PRF efficiency represent a selective 943 advantage for PRRSV, or whether they are simply incidental, is unclear. The expected result is an 944 increase in the ratio of ORF1b products to ORF1a products over time. This could be advantageous for 945 PRRSV, for example if there is a greater requirement for nsp2 and nsp3, which promote DMV 946 formation 34 , early in infection to establish a protective environment for viral replication, followed by a 947 later preference for producing more of the RdRp (nsp9) and helicase (nsp10) to promote replication 948 itself. Alternatively, this could simply reflect that PRRSV can tolerate a reasonably wide range of 949 In addition to changes in the ratio of ORF1b to ORF1a translation, we observed temporal changes in 956 the relative translation of different regions within ORF1b, with increasing translation of the 3′-proximal 957 region as infection progresses. 
This may result from translation of non-canonical sgRNAs, which we 958 term ORF1b sgRNAs, in which the body TRS is within ORF1b. If the putative translated proteins are 959 processed to produce functional nsps, this would be expected to increase the stoichiometry of nsps 10-960 12 compared to nsp9, and alter the relative stoichiometries of nsp10, nsp11 and nsp12. There are several 961 possible reasons this could be beneficial to viral fitness. Although the stoichiometry of the arterivirus 962 RTC is unknown, there is some evidence that the stoichiometry of the coronavirus replication complex 963 varies, containing either one or two copies of the helicase for each copy of the holo-RdRp 129 . This 964 highlights the possibility that the composition of the PRRSV RTC could change over time, for example 965 if extra copies of nsp10, 11 or 12 are supplied from ORF1b sgRNA translation (as well as the potential 966 contribution of increased ORF1ab frameshift efficiency). This provides a potential mechanism of 967 regulating viral replication, for example by altering the ratio of gRNA to sgRNA production (as 968 observed in Figure 2 and Figure 8 ), as both nsp10 (the helicase) and nsp12 are thought to be involved 969 in promoting sgRNA transcription 129-134 . Nsp11 (NendoU) is an endoribonuclease found in many 970 nidoviruses, which has broad substrate specificity in vitro 135 , and is also an innate immune antagonist 136-971 140 . Its expression outside the context of infection is highly toxic 139,141 , leading to the suggestion that its 972 restricted perinuclear localisation during infection is important to prevent its expression becoming 973 "suicidal" for the virus 9,137 . Therefore, it may be beneficial to maintain relatively low levels of nsp11 974 early in infection, and increase production after the optimal microenvironment for its localisation has 975 formed. However, such possibilities are clearly speculative at present. 
976 Interestingly, ORF1b sgRNAs have been found in a number of other nidoviruses. Our results are highly 977 consistent with a previous study on SHFV, in which several ORF1b sgRNAs were detected, which were 978 predicted to produce in-frame portions of the ORF1b polyprotein, or in one case a novel overlapping 979 ORF 45 . Quantitative mass spectrometry provided support for translation of both categories of ORF1b 980 sgRNA, and showed that peptides from nsp11 and nsp12 were 1.2-and 3.1-fold more abundant 981 (respectively) than those from ORF1a-encoded nsp8 [45] . ORF1b sgRNAs were also found in lactate 982 dehydrogenase elevating virus (predicted to express the C-terminal 200 amino acids of ORF1b) 142 , 983 SARS-CoV-2, HCoV-229E and equine torovirus 46, 48, 51, 59 . Whether this has evolved by virtue of 984 conferring a selective advantage, or whether it is a neutral consequence of the promiscuous 985 discontinuous transcription mechanism, this suggests that ORF1b sgRNAs are a conserved feature of 986 the nidovirus transcriptome. Further characterisation of these non-canonical transcripts would be highly 987 informative, to determine potential initiation sites and ascertain whether any in-frame products are 988 functional. 989 Another group of non-canonical transcripts with the potential to modulate polyprotein stoichiometry 990 comprises those termed "heteroclite sgRNAs" by Yuan et al. 44, 62 of the polyprotein proteins. At the level of translational control, the nsp2 −2 PRF site was found to be 1006 a rare example of temporally regulated frameshifting, while the finding that −1 PRF efficiency at the 1007 ORF1a/1b overlap also increased over time challenges the paradigm that RNA structure-directed 1008 frameshift sites operate at a fixed efficiency. At the transcriptional level, numerous non-canonical 1009 Supplementary Figures 3 and 4) , overlapping regions of ORFs were permitted for the length distribution 1109 but not phasing analyses. 
For these phasing plots, phase 0 was designated independently for each region, 1110 relative to the first nucleotide of the ORF in that region. For the negative-sense read analysis, reads 1111 mapping to anywhere on the viral genome were used, and phase was determined using the 5′ end of the 1112 read (the 5′ end of the reverse complement reported by bowtie plus the read length). The coordinates of 1113 the regions of the viral genome used for all analyses are given in Supplementary Table 1. 1114 For plots of read distributions on viral genomes, read densities were plotted at the inferred ribosomal P 1115 Significance testing for proportion of host RPFs which are short 1127 RiboSeq libraries were grouped into early (3 and 6 hpi) and late (9 and 12 hpi) timepoints, to provide 1128 enough replicates in each group to perform a two-tailed t test (WT and KO2 were treated as equivalent). 1129 Positive-sense RPFs mapping to host mRNA were used, and short reads were defined as 19-24 nt long, 1130 with the denominator formed by 19-34 nt long reads. The early timepoint group was used as a control, 1131 for which there was no significant difference in the percentage of short RPFs in infected cells compared 1132 to mock-infected cells (p = 0.52). 1133 Junction-spanning read analysis for novel transcript discovery 1134 Reads which did not map to any of the host or viral databases (rRNA, vRNA, mRNA, ncRNA or gDNA) 1135 in the core pipeline (described above) were used as input for mapping using STAR 60 , version 2.7.3a. 1136 Mapping parameters were selected based on those suggested in Kim et al. 46 Table 1) . Junctions were clustered so that all junctions within a cluster had acceptor 1150 coordinates within seven (for TRS-spanning junctions) or two (for non-TRS-spanning junctions) 1151 nucleotides of at least one other junction in the cluster, with the same requirement applied to donor 1152 coordinates. 
This was to group highly similar junctions together and account for the fact that the precise location of a junction is ambiguous in cases where there is similarity between the donor and acceptor sites (such as between the 6-nt leader and body TRSs). The junctions within each cluster were merged, with donor and acceptor sites defined as the midpoints of the ranges of coordinates observed in the cluster, and the number of reads supporting the merged junction defined as the sum of the supporting read counts for all the input junctions in the cluster.
Then, junctions were filtered to keep only those present in multiple libraries and merged to generate one dataset per timepoint. Merged junctions from individual libraries were filtered so that only junctions which were present in more than one replicate (considering WT and KO2 as one group) passed the filter. Junctions were defined as matching if the ranges of the donor and acceptor coordinates for the junction in one library overlap with those of a junction in a second library. Matching junctions from all replicates were merged as described above to make the final merged junction. For the NA PRRSV M junction there is a stretch of six bases that is identical upstream of the leader TRS and the body TRS, leading to separation of the two alternative junction position assignments by a distance greater than the seven bases required to combine TRS-spanning junctions into clusters. The two junction clusters that are assigned either side of this stretch of identical bases were specifically selected and merged at this stage. To ensure this merging strategy did not lead to clusters spanning overly wide regions, widths of merged junction donor and acceptor sites were assessed, and the mean and median junction width for all analyses was found to be < 3 nt (maximum width 17 nt, for the NA PRRSV M junction).
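The clustering and merging steps described above can be sketched as a single-linkage grouping. This is an illustrative reconstruction, not the pipeline used in the study. It assumes that two junctions are linked when both their donor and their acceptor coordinates lie within the tolerance (seven nt for TRS-spanning junctions, two nt otherwise), and it rounds midpoints down to an integer. The `Junction` class, the function names and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    donor: int      # genomic coordinate of the donor site
    acceptor: int   # genomic coordinate of the acceptor site
    reads: int      # number of supporting junction-spanning reads


def cluster_junctions(junctions, tolerance):
    """Single-linkage clustering: link junctions whose donor AND acceptor
    coordinates are both within `tolerance` nt (assumed interpretation)."""
    parent = list(range(len(junctions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(junctions)):
        for j in range(i + 1, len(junctions)):
            a, b = junctions[i], junctions[j]
            if (abs(a.donor - b.donor) <= tolerance
                    and abs(a.acceptor - b.acceptor) <= tolerance):
                parent[find(i)] = find(j)

    clusters = {}
    for i, junction in enumerate(junctions):
        clusters.setdefault(find(i), []).append(junction)
    return list(clusters.values())


def merge_cluster(cluster):
    """Merge a cluster: midpoint of each coordinate range, summed read support."""
    donors = [j.donor for j in cluster]
    acceptors = [j.acceptor for j in cluster]
    return Junction(
        donor=(min(donors) + max(donors)) // 2,        # rounding down is an assumption
        acceptor=(min(acceptors) + max(acceptors)) // 2,
        reads=sum(j.reads for j in cluster),
    )


# Hypothetical junctions; the first three chain together, the fourth is distinct.
example = [
    Junction(188, 14600, 40),
    Junction(190, 14605, 10),
    Junction(188, 14611, 5),
    Junction(188, 13000, 7),
]
merged = [merge_cluster(c) for c in cluster_junctions(example, tolerance=7)]
```

Because linkage is transitive, the first and third example junctions end up in one cluster even though their acceptors differ by 11 nt, which is why the widths of merged junctions were checked afterwards.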
TRS-spanning junctions were designated as "known" junctions if they were the major junction responsible for one of the known canonical sgRNAs of PRRSV. Non-TRS-spanning junctions were filtered according to whether they represent local (≤ 2000 nt) or distant (> 2000 nt) deletions.
The proportion of junction-spanning reads at donor and acceptor sites was calculated as junction-spanning / (junction-spanning + continuously aligned to reference genome). The number of non-junction-spanning reads at the junction site was defined as the number of bowtie-aligned reads (from the core pipeline) spanning at least the region 12 nt either side of the midpoint position of the donor or acceptor site (note that for sgRNA acceptor sites the denominator will include not only gRNA but also sgRNAs with body TRSs upstream). For all TRS-spanning junctions the donor midpoint was set according to the known leader TRS sequence (genomic coordinate 188 for NA PRRSV and 219 for EU PRRSV).
[…] sequence alignment of 137 sequences. Logo plots and mini-alignment plots were generated using CIAlign [68] and, for the uORF analyses, genome sequences which began partway through the ORF were excluded, as was KT257963, which has a likely sequencing artefact in the 5′ UTR. Synonymous site conservation was analysed, for the representative NA PRRSV sequences or for all EU sequences, using SYNPLOT2 [67] and p values plotted after application of a 25-codon running mean filter.
Transcript abundance, total translation, and translation efficiency analyses
For the main analysis, RiboSeq RPKM values were calculated using the read counts and ORF "Location"s from the PRICE output (Supplementary Table 8), using the same library size normalisation factors as the core pipeline (where positive-sense virus- and host-mRNA-mapping reads from the bowtie output are the denominator).
Each ORF was paired with the transcript(s) most likely to facilitate its expression (see schematic in Figure 7 and junction coordinates in Supplementary Tables 1 and 5-7). For some ORFs (NA PRRSV nsp10-iORF2, nsp11-iORF3, GP3-iORF and GP4-iORF), this included transcripts which are expected to produce slightly N-terminally truncated ORFs compared to the PRICE designation. ORFs overlapping ORF1ab for which there were no novel transcripts expected to facilitate expression were paired with gRNA. All ORF1b sgRNAs, defined as sgRNAs with body TRSs within ORF1b and ≥ 50 or ≥ 10 junction-spanning reads at 12 hpi (NA PRRSV) or 8 hpi (EU PRRSV), were included in the transcript abundance analysis regardless of whether they are expected to result in expression of a novel ORF.
To estimate transcript abundance, reads aligned to the viral genome by STAR (see junction-spanning read analysis pipeline) were normalised by library size using the same library size normalisation factors as the core pipeline. In cases where multiple body TRSs are expected to give rise to two different forms of a transcript that express the same ORF(s), these were treated as a single transcript for the purposes of this analysis, and read counts for all junctions were combined. Abundance of the gRNA transcript was defined as the number of reads which span 12 nt either side of the midpoint of the leader TRS (genomic coordinate 188 for NA PRRSV and 219 for EU PRRSV). This is analogous to the 12-nt overhang required either side of a junction to qualify for mapping by STAR; however, these reads are not junction-spanning, and map specifically to gRNA (and a small proportion of non-canonical transcripts such as heteroclite sgRNAs). Leader abundance was defined as the total number of reads for all other transcripts in the analysis combined, as the leader is present on all sgRNA species and the gRNA.
TE was calculated by dividing RiboSeq RPKM values by RNASeq junction-spanning read RPM values, excluding conditions where the denominator was zero.
For plots with logarithmic axes, data points with a value of zero were excluded from the plot, but not from mean calculations. WT and KO2 were treated as equivalent unless specified. For libraries with shorter read lengths (EU libraries and NA 9 hpi replicate 1 libraries), junction-spanning read counts are lower (and also subject to greater inaccuracies as a result of less dilution of possible read start- and end-point-specific ligation biases) due to the requirement for a 12-nt overhang either side of the junction effectively representing a much larger proportion of the total read length. As such, these libraries are not directly comparable to the remaining NA PRRSV libraries and they were plotted separately and not included in NA PRRSV mean calculations.
For the estimation of translation of different regions of ORF1b, sections were designated as the regions between the downstream-most body TRS of one ORF1b sgRNA and the upstream-most body TRS of the next (all region coordinates given in Supplementary Table 1). Bowtie-aligned RiboSeq reads (from the core pipeline) which mapped in-phase with ORF1b in the designated regions were counted, using only read lengths with minimal RNP contamination (NA PRRSV) or good phasing (EU PRRSV). Total read counts were normalised by library size and region length to give RPKM. The same process was applied to the region of ORF1a between the major heteroclite junction and the nsp2 PRF site for comparison, counting reads mapping in-phase with ORF1a.
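The normalisations described in this section can be written out as a short sketch. It uses the standard definitions of RPKM (reads per kilobase per million mapped reads) and RPM (reads per million mapped reads), and divides RiboSeq RPKM by RNASeq junction-spanning RPM to obtain TE, returning no value when the denominator is zero. The function names and the counts in the example are hypothetical, and the library size is assumed to be the number of positive-sense virus- and host-mRNA-mapping reads.

```python
from typing import Optional


def rpkm(read_count: int, region_length_nt: int, library_size: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_nt * library_size)


def rpm(read_count: int, library_size: int) -> float:
    """Reads per million mapped reads (used for junction-spanning RNASeq reads)."""
    return read_count * 1e6 / library_size


def translation_efficiency(ribo_rpkm: float, rna_rpm: float) -> Optional[float]:
    """TE = RiboSeq RPKM / RNASeq junction-spanning RPM.

    Returns None when the denominator is zero, mirroring the exclusion of
    such conditions from the analysis.
    """
    if rna_rpm == 0:
        return None
    return ribo_rpkm / rna_rpm


# Hypothetical counts for one ORF and its paired transcript.
ribo = rpkm(read_count=500, region_length_nt=1000, library_size=2_000_000)
rna = rpm(read_count=100, library_size=4_000_000)
te = translation_efficiency(ribo, rna)
print(f"RiboSeq RPKM = {ribo:.1f}, RNASeq RPM = {rna:.1f}, TE = {te:.1f}")
```

Because the numerator is length-normalised and the denominator is not, TE values calculated this way are best compared between conditions for the same ORF rather than interpreted as absolute quantities.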
1253 For investigation of gene expression in different regions of ORF1a, transcript abundance for the 1254 heteroclite sgRNAs was calculated by subtracting the gRNA RNASeq RPKM (measured in the region 1255 between the major [S-2] heteroclite sgRNA junction and the first ORF1b sgRNA body TRS) from the 1256 RPKM in the region between the leader TRS and the major heteroclite sgRNA junction ("heteroclite"). 1257 This provides an averaged result for all heteroclite sgRNAs, although it does not take into account the 1258 reduced transcript length for the minor heteroclite sgRNAs compared to S-2. RiboSeq read density for 1259 the different regions of ORF1a (Supplementary Table 1) This phasing-based method of calculating frameshift efficiency should theoretically be unaffected by 1279 RNP contamination, provided the RNP footprints are equally distributed between the three phases. Let 1280 R be the proportion of total reads that are RNPs, and let P0 and P−2 be the proportion of total RNPs that 1281 are attributed to the 0 and −2 phases, respectively. The phasing of reads originating from RNPs is not 1282 expected to change due to frameshifting. Therefore, the equation for calculating the fraction of reads 1283 that change from the 0 to −2 phase becomes: 1284 multiple replicates) the WT library with the higher ratio of virus:host RiboSeq reads was paired with 1304 not be applied to the xtail results as the test statistic is not included in the xtail output. Genes were 1354 considered significantly differentially expressed if they had FDR-corrected p value ≤ 0.05 and log2(fold 1355 change) of magnitude > 1 (for comparisons to mock) or > 0.7 (for KO2 vs WT comparison). 
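The significance criteria stated above can be expressed as a simple filter. This is a minimal sketch that takes the thresholds from the text (FDR-corrected p value ≤ 0.05; log2(fold change) magnitude > 1 for comparisons to mock, > 0.7 for KO2 vs WT). The function name, the comparison labels and the example values are hypothetical.

```python
# Minimum |log2(fold change)| per comparison type, as stated in the text.
LOG2FC_THRESHOLDS = {
    "vs_mock": 1.0,     # infected vs mock-infected
    "ko2_vs_wt": 0.7,   # KO2 vs WT
}


def is_significant(log2_fold_change: float, fdr_p: float, comparison: str) -> bool:
    """Return True if a gene passes both the FDR and the fold-change threshold."""
    threshold = LOG2FC_THRESHOLDS[comparison]
    return fdr_p <= 0.05 and abs(log2_fold_change) > threshold


# Hypothetical gene: a modest decrease that qualifies only under the
# relaxed KO2 vs WT threshold.
print(is_significant(-0.8, 0.01, "vs_mock"))    # False
print(is_significant(-0.8, 0.01, "ko2_vs_wt"))  # True
```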
GO terms 1356 associated with lists of significantly differentially expressed genes were retrieved and tested, using 1357 DAVID 160 (version 6.8, functional annotation chart report, default parameters), for enrichment against 1358 a background of GO terms associated with all genes that passed the threshold for inclusion in that 1359 differential expression analysis. 1360 20/Z) and an Investigator Award European Research Council grant (646891; to A.E.F.) and an Agriculture and Food Research Initiative 1365 Competitive Grant (2015-67015-22969; to Y.F.) from the USDA National Institute of Food and 1366 We would like to thank Dr David Brown for supporting discussions Lelystad virus belongs to a new virus family dehydrogenase-elevating virus, equine arteritis virus, and simian hemorrhagic fever virus Nidovirales: a new order comprising Coronaviridae and Arteriviridae Improved vaccine against PRRSV: Current Progress and future perspective Assessment of the economic impact of porcine reproductive and 1383 respiratory syndrome virus on United States pork producers PRRSV structure, replication and recombination: Origin of 1386 phenotype and genotype diversity Isolation of swine infertility and respiratory syndrome virus VR-2332) in North America and experimental reproduction of the disease in gnotobiotic pigs Mystery swine disease in The Netherlands: the isolation of Lelystad virus The prevalent status and genetic diversity of 1393 porcine reproductive and respiratory syndrome virus in China: A molecular epidemiological 1394 perspective Arterivirus molecular biology and pathogenesis Nidovirus RNA polymerases: Complex 1398 enzymes handling exceptional RNA genomes Lelystad virus, the causative agent of porcine epidemic abortion and 1400 respiratory syndrome (PEARS), is related to LDV and EAV Porcine Reproductive and Respiratory Syndrome Virus Comparison: Divergent Evolution on Two Continents Proteolytic processing of the replicase ORF1a 1405 protein of equine arteritis virus 
Evolving the largest RNA virus genome
Survey and summary: Translational recoding: Canonical translation mechanisms reinterpreted
Ribosomal frameshifting and transcriptional slippage: From genetic steganography and cryptography to adventitious use
Non-canonical translation in RNA viruses
Torsional restraint: A new twist on frameshifting pseudoknots
A mechanical explanation of RNA pseudoknot function in programmed ribosomal frameshifting
Frameshifting by Kinetic Partitioning during Impeded Translocation
Equine arteritis virus is not a togavirus but belongs to the coronaviruslike superfamily
An extended signal involved in eukaryotic -1 Frameshifting operates through modification of the E site tRNA
Efficient -2 frameshifting by mammalian ribosomes to synthesize an additional arterivirus protein
Transactivation of programmed ribosomal frameshifting by a viral protein
A novel role for poly(C) binding proteins in programmed ribosomal frameshifting
Programmed −2/−1 Ribosomal Frameshifting in Simarteriviruses: an Evolutionarily
The Porcine Reproductive and Respiratory Syndrome Virus nsp2 Cysteine Protease Domain Possesses both trans- and cis-Cleavage
Arterivirus and Nairovirus Ovarian Tumor Domain-Containing Deubiquitinases Target Activated RIG-I To Control Innate Immune Signaling
Ovarian Tumor Domain-Containing Viral Proteases Evade Ubiquitin- and ISG15-Dependent Innate Immune Responses
The Cysteine Protease Domain of Porcine Reproductive and Respiratory Syndrome Virus Nonstructural Protein 2 Possesses Deubiquitinating and Interferon Antagonism Functions
Nonstructural Protein 2 of Porcine Reproductive and Respiratory Syndrome Virus Inhibits the Antiviral Function of Interferon-Stimulated Gene 15
Nonstructural proteins nsp2TF and nsp2N of porcine reproductive and respiratory syndrome virus (PRRSV) play important roles in suppressing host innate immune responses
Porcine reproductive and
respiratory syndrome virus nonstructural protein 2 (nsp2) topology and selective isoform integration in artificial membranes
Non-structural proteins 2 and 3 interact to modify host cell membranes during the formation of the arterivirus replication complex
Ultrastructural Characterization of Arterivirus Replication Structures: Reshaping the Endoplasmic Reticulum To Accommodate Viral RNA Synthesis
A swine arterivirus deubiquitinase stabilizes two major envelope proteins and promotes production of viral progeny
Molecular characterization of the RNA-protein complex directing -2/-1 programmed ribosomal frameshifting during arterivirus replicase expression
Ribosomal frameshifting into an overlapping gene in the 2B-encoding region of the cardiovirus genome
Characterization of Ribosomal Frameshifting in Theiler's Murine Encephalomyelitis Virus
Protein-directed ribosomal frameshifting temporally regulates gene expression
Characterization of the stimulators of protein-directed ribosomal frameshifting in Theiler's murine encephalomyelitis virus
Structural and molecular basis for Cardiovirus 2A protein as a viral gene expression switch
Investigating molecular mechanisms of 2A-stimulated ribosomal pausing and frameshifting in Theilovirus
Heteroclite subgenomic RNAs are produced in porcine reproductive and respiratory syndrome virus infection
Expanded subgenomic mRNA transcriptome and coding capacity of a nidovirus
The Architecture of SARS-CoV-2 Transcriptome
The SARS-CoV-2 subgenome landscape and its novel regulatory features
Transcriptional and Translational Landscape of Equine Torovirus
High-Resolution Analysis of Coronavirus Gene Expression by RNA
Comparative Analysis of Gene Expression in Virulent and Attenuated Strains of Infectious Bronchitis Virus at Subcodon Resolution
The coding capacity of SARS-CoV-2
in Vivo of Translation with Nucleotide Resolution Using Ribosome Profiling
Polypeptide Chain
Initiation: Nucleotide Sequences of the Three Ribosomal Binding Sites in Bacteriophage R17 RNA
Ribosome pausing and stacking during translation of a eukaryotic mRNA
Defines Discrete Ribosome Elongation States and Translational Regulation during Cellular
Distinct stages of the translation elongation cycle revealed by sequencing ribosome-protected mRNA fragments
Identification of porcine reproductive and respiratory syndrome virus ORF1a-encoded non-structural proteins in virus-infected cells
Porcine reproductive and respiratory syndrome virus enters cells through a low pH-dependent endocytic pathway
Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis
STAR: ultrafast universal RNA-seq aligner
Leader-body junction sequence of the viral subgenomic mRNAs of porcine reproductive and respiratory syndrome virus isolated in Taiwan
Characterization of heteroclite subgenomic RNAs associated with PRRSV infection
Identification of new defective interfering RNA species associated with porcine reproductive and respiratory syndrome virus infection
Improved Ribo-seq enables identification of cryptic translation events
The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments
Non-AUG translation: A new start for protein synthesis in eukaryotes
Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses
CIAlign-A highly customisable command line tool to clean, interpret and visualise multiple sequence alignments
Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes
Accounting for biases in riboprofiling data indicates a major role for proline in stalling translation
Slow peptide bond formation by proline and other N-alkylamino acids in
translation
Mutational analysis of the "slippery-sequence" component of a coronavirus ribosomal frameshifting signal
The energy landscape of −1 ribosomal frameshifting
Conditional Switch between Frameshifting Regimes upon Translation of dnaX mRNA
Structured mRNA induces the ribosome into a hyper-rotated state
Dynamic pathways of -1 translational frameshifting
Gene expression in tonsils in swine following infection with porcine reproductive and respiratory syndrome virus
Integrated time-serial transcriptome networks reveal common innate and tissue-specific adaptive immune responses to PRRSV infection
Transcriptome profile of lung dendritic cells after in vitro porcine reproductive and respiratory syndrome virus (PRRSV) infection
Genome-wide analysis of the transcriptional response to porcine reproductive and respiratory syndrome virus infection at the maternal/fetal interface and in the fetus
Transcriptome Analysis Reveals Dynamic Gene Expression Profiles in Porcine Alveolar Macrophages in Response to the Chinese Highly Pathogenic Porcine Reproductive and Respiratory Syndrome Virus
Genome-wide analysis of long noncoding RNA profiling in PRRSV-infected PAM cells by RNA sequencing
Global miRNA, lncRNA, and mRNA transcriptome profiling of endometrial epithelial cells reveals genes related to porcine reproductive failure caused by porcine reproductive and respiratory syndrome virus
Genome-wide assessment of differential translations with ribosome profiling data
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
Pig immune response to general stimulus and to porcine reproductive and respiratory syndrome virus infection: a meta-analysis approach
Distinctive Cellular and Metabolic Reprogramming in Porcine Lung
Regulation of gene expression via translational buffering
Diverse Functions of VDUP1 in Cell Proliferation, Differentiation, and Diseases
Metabolic Regulation: Physiological Role and Therapeutic Outlook
Catalytic degradation of vitamin D up-regulated protein 1 mRNA enhances cardiomyocyte survival and prevents left ventricular remodeling after myocardial ischemia
Thioredoxin-interacting protein links oxidative stress to inflammasome activation
Thioredoxin-interacting protein: Regulation and function in the pancreatic β-cell
Identification of Thioredoxin-binding Protein-2/Vitamin D3 Up-regulated Protein 1 as a Negative Regulator of Thioredoxin Function and Expression
The thioredoxin antioxidant system. Free Radic
Thioredoxin and redox signaling: Roles of the thioredoxin system in control of cell fate
Vitamin D3 up-regulating protein 1 (VDUP1) antisense DNA regulates tumorigenicity and melanogenesis of murine melanoma cells via regulating the expression of fas ligand and reactive oxygen species
VDUP1 is required for the development of natural killer cells
Intercellular transfer of mitochondria rescues virus-induced cell death but facilitates cell-to-cell spreading of porcine reproductive and respiratory syndrome virus
Porcine reproductive and respiratory syndrome virus induces apoptosis through a mitochondria-mediated pathway
Activation of apoptosis signalling pathways by reactive oxygen species M.
Functions of ROS in macrophages and antimicrobial immunity
Redox signaling in macrophages
Pathology and Virus Distribution in the Lung and Lymphoid Tissues of Pigs Experimentally Inoculated with Three Distinct Type 1 PRRS Virus Isolates of Varying
Reactive oxygen species (ROS) in macrophage activation and function in diabetes
The Reactive Oxygen Species in Macrophage Polarization
Thioredoxin-1 Promotes Anti-Inflammatory Macrophages of the M2 Phenotype and Antagonizes Atherosclerosis
Porcine alveolar macrophage polarization is involved in inhibition of porcine reproductive and respiratory syndrome virus (PRRSV) replication
SARS-CoV-2 uses a multipronged strategy to impede host protein synthesis
Ribosome-profiling reveals restricted post transcriptional expression of antiviral cytokines and transcription factors during SARS-CoV-2 infection
SARS coronavirus protein nsp1 disrupts localization of nup93 from the nuclear pore complex
Nsp1 protein of SARS-CoV-2 disrupts the mRNA export machinery to inhibit host gene expression
Nuclear imprisonment of host cellular mRNA by nsp1β protein of porcine reproductive and respiratory syndrome virus
Syndrome Virus Nonstructural Protein 1 Beta Interacts with Nucleoporin 62 To Promote Viral Replication and Immune Evasion
Type I interferon suppression-negative and host mRNA nuclear retention-negative mutation in nsp1β confers attenuation of porcine reproductive and respiratory syndrome virus in pigs
Function and Evolution of Upstream ORFs in Eukaryotes
A high-resolution temporal atlas of the SARS-CoV-2 translatome and transcriptome
Sequence determination of the extreme 5' end of equine arteritis virus leader region
The intraleader AUG nucleotide sequence context is important for equine arteritis virus replication
The arterivirus replicase is the only viral protein required for genome replication and subgenomic mRNA transcription
The molecular biology of
arteriviruses
Translation of 5' leaders is pervasive in genes resistant to eIF2 repression
Envelope Protein Requirements for the Assembly of Infectious Virions of Porcine Reproductive and Respiratory Syndrome Virus
Nsp2TF of porcine reproductive and respiratory syndrome virus down-regulates the expression of Swine Leukocyte Antigen class I
Involvement of unfolded protein response, p53 and Akt in modulation of porcine reproductive and respiratory syndrome virus-mediated JNK activation
Kinetics of Ribosomal Pausing during
Ribosome collisions alter frameshifting at translational reprogramming motifs in bacterial mRNAs
The translational landscape of SARS-CoV-2 and infected cells
Structural Basis for Helicase-Polymerase Coupling in the SARS-CoV-2
An infectious arterivirus cDNA clone: Identification of a replicase point mutation that abolishes discontinuous mRNA transcription
Helicase of type 2 porcine reproductive and respiratory syndrome virus strain HV reveals a unique structure
Mapping the Nonstructural Protein Interaction Network of Porcine Reproductive and Respiratory Syndrome Virus
The Network of Interactions Among Porcine Reproductive and Respiratory Syndrome Virus Non-structural
The Nsp12-coding region of type 2 PRRSV is required for viral subgenomic mRNA synthesis
Reveals the Nidovirus-Wide Conservation of a Replicative Endoribonuclease
The nonstructural protein 11 of porcine reproductive and respiratory syndrome virus inhibits NF-κB signaling by means of its deubiquitinating activity
Nonstructural protein 11 of porcine reproductive and respiratory syndrome virus suppresses both MAVS and RIG-I expression as one of the mechanisms to antagonize Type I interferon production
The Superimposed Deubiquitination Effect of OTULIN and Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Nsp11 Promotes Multiplication of PRRSV
Porcine Reproductive and Respiratory Syndrome Virus nsp11
Antagonizes Type I Interferon Signaling by Targeting IRF9
Nonstructural Protein 11 of Porcine Reproductive and Respiratory Syndrome Virus Induces STAT2 Degradation To Inhibit Interferon Signaling
Nonstructural protein 11 (nsp11) of porcine reproductive and respiratory syndrome virus (PRRSV) promotes PRRSV infection in MARC-145 cells
Sequences of 3' end of genome and of 5' end of open reading frame 1a of lactate dehydrogenase-elevating virus and common junction motifs between 5' leader and bodies of seven subgenomic mRNAs
Direct RNA sequencing and early evolution of SARS-CoV-2
Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein
A SARS-CoV-2 BioID-based virus-host membrane protein interactome and virus peptide compendium: new proteomics resources for COVID-19
Shotgun proteomics analysis of SARS-CoV-2-infected cells and how it can optimize whole viral particle antigen production for vaccines
Degradation of CREB-binding protein and modulation of type I interferon induction by the zinc finger motif of the porcine reproductive and respiratory syndrome virus nsp1α subunit
Biogenesis of non-structural protein 1 (nsp1) and nsp1-mediated type I interferon modulation in arteriviruses
Modulation of type I interferon induction by porcine reproductive and respiratory syndrome virus and degradation of CREB-binding protein by non-structural protein 1 in MARC-145 and HeLa cells
Nuclear export signal of PRRSV NSP1α is necessary for type I IFN inhibition
Nonstructural protein 1α subunit-based inhibition of NF-κB activation and suppression of interferon-β production by porcine reproductive and respiratory syndrome virus
Mammalian microRNAs predominantly act to decrease target mRNA levels
The use of duplex-specific nuclease in ribosome profiling and a user-friendly software
package for Ribo-seq data analysis
Adaptor trimming and merging for Illumina sequencing reads
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
STADIUM: Species-Specific tRNA Adaptive Index
Translation inhibitors cause abnormalities in ribosome profiling experiments
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
HTSeq-a Python framework to work with high-throughput sequencing data
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources

This work was supported by the Wellcome Trust (U.K.) through PhD studentships to G.M.C.

(KO2) based on this background 23,24,32, or mock-infected. At the time of harvesting for the timecourse samples, cells were washed with warm PBS and snap-frozen in liquid nitrogen. For the CHX pre-treated library (CHX-9hpi-WT), an additional CHX pre-treatment step (100 µg/ml, 2 min) was included directly prior to harvesting, and cells were washed with ice-cold PBS containing 100 µg/ml CHX before snap-freezing. Snap-frozen dishes were transferred to dry ice and 400 µl lysis buffer added. The dish was transferred to ice to defrost, and cells were scraped and processed as described above.

was used to trim the universal adapter sequence from reads and to discard adapter-only reads, non-clipped reads, and "too-short" reads (inferred original fragment lengths shorter than the minimum intended length experimentally purified; see library preparation description for lengths). Adapter dimers were counted using the grep command line utility and added to the adapter-only read count. For paired-end libraries, adapter trimming, read pair merging and removal of adapter-only reads were carried out using LeeHom 154 (v.1.1.5) with the --ancient-dna option specified (as the expected fragment lengths of such DNA are in the same range as ours).
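As a rough illustration of the read categories used for single-end libraries above, the following Python sketch sorts reads into "adapter-only", "non-clipped", "too-short" and retained classes. It is a simplification under stated assumptions, not the trimming tool used in this pipeline: it recognises only exact, full-length adapter matches, and the function name `classify_read`, the adapter string and the 25 nt minimum length are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical placeholder values: neither the adapter sequence nor the
# minimum fragment length is taken from the text.
ADAPTER = "AGATCGGAAGAGC"
MIN_LEN = 25


def classify_read(read, adapter=ADAPTER, min_len=MIN_LEN):
    """Return (category, insert) for one single-end read.

    Simplification: only an exact, full-length adapter match is recognised.
    """
    pos = read.find(adapter)
    if pos == -1:
        return "non-clipped", read      # no adapter found in the read
    insert = read[:pos]                 # sequence upstream of the adapter
    if not insert:
        return "adapter-only", insert   # read begins with the adapter
    if len(insert) < min_len:
        return "too-short", insert      # shorter than the purified size range
    return "kept", insert


# Toy reads, one per category.
reads = [
    "ACGT" * 7 + ADAPTER,   # 28 nt insert followed by adapter
    ADAPTER + "AAAA",       # adapter at the very start
    "ACGT" + ADAPTER,       # 4 nt insert
    "ACGTACGT",             # no adapter
]
composition = Counter(classify_read(r)[0] for r in reads)
print(dict(composition))
```

Tallying the categories with a `Counter` mirrors the library composition analysis, where each discarded class is counted rather than silently dropped.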
Pairs of reads which LeeHom was unable to merge were put in the "non-clipped" category for the purposes of library composition analysis, and "too-short" reads were removed using awk. For libraries prepared using adapters with randomised bases, PCR duplicates were removed using awk, and seqtk (version 1.3) was used to trim the randomised bases from the reads. Bowtie1 (version 1.2.3) [155] was used to map reads to host and viral genomes using parameters "-v n_mismatches --best", where n_mismatches was one for RiboSeq and two for RNASeq libraries. Reads were mapped to each of the following databases in order and only reads that failed to align were mapped to the next database: ribosomal RNA (rRNA), virus RNA (vRNA), mRNA, non-coding RNA (ncRNA), genomic DNA (gDNA). Viral genome sequences were verified by de novo genome assembly using

confirm that significant numbers of viral reads were not erroneously mapping to host databases or vice versa. All analyses were carried out using reads mapped by bowtie as described above, except for running PRICE, analyses using junction-spanning reads, or host differential gene expression.

Quality control plots and analyses were performed as described in Irigoyen et al. 49, modified for the timecourse libraries to account for the longer RNASeq reads, so that a 3′ UTR of at least 90 nt (as opposed to 60 nt) was required for inclusion of transcripts in the metagene analysis of 5′ end positions relative to start and stop codons (Supplementary Figure 1). For quality control analyses of read length and phasing, reads mapping to ORF1ab (excluding nsp2TF, all phases) were used for the virus versus host analyses (e.g. Figure 1C, Supplementary Figure 2), while for analyses of specific regions (e.g.
Confidence intervals for each bootstrap distribution were calculated using the bias-corrected accelerated (BCa) method, implemented through the R package coxed (version 0.3.3). This was performed for 95%, 99.5% and 99.95% confidence intervals, and pairs of groups were considered as significantly different with p < 0.05, 0.005 or 0.0005, respectively, if the mean of the "group one" bootstrap distribution was not within the confidence intervals of "group two" and vice versa.

Host differential gene expression

After basic processing and removal of rRNA- and vRNA-mapping reads using bowtie as described in the core analysis pipeline, remaining reads were aligned to the host genome (fasta and gtf from genome assembly ChlSab1.1) using STAR 60 (version 2.7.3a) with the following parameters: --runMode alignReads --outSAMtype BAM SortedByCoordinate --outFilterMismatchNmax n_mismatches --outFilterIntronMotifs RemoveNoncanonicalUnannotated --outMultimapperOrder Random (where n_mismatches was one for RiboSeq libraries and two for RNASeq libraries). Reads were tabulated using htseq-count 159 (version 0.13.5), with parameters -a 0 -m union -s yes -t gene (covering the whole gene) for the differential transcription and -a 0 -m intersection-strict -s yes -t CDS (covering only the CDS) for the differential TE. Genes with fewer than ten reads between all libraries in the analysis combined were excluded, and quality control was performed according to the recommendations in the DESeq2 [85] user guide, with all replicates deemed to be of sufficient quality. Read counts were normalised for differences in library size using DESeq2 (version 1.30.1), providing the input for differential transcription using DESeq2 (default parameters) or differential TE using xtail 84 (version 1.1.5; parameters: normalize = FALSE).
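The low-count filter described above (genes with fewer than ten reads across all libraries combined are excluded) can be sketched in a few lines of Python. This is an illustrative sketch only: the text does not state how the step was implemented, and the function name `filter_low_count_genes`, the dictionary layout and the gene names are hypothetical.

```python
def filter_low_count_genes(counts, min_total=10):
    """Keep genes whose read count summed over all libraries is >= min_total.

    counts: dict mapping gene ID -> list of per-library read counts.
    """
    return {gene: per_lib for gene, per_lib in counts.items()
            if sum(per_lib) >= min_total}


# Toy counts for three hypothetical genes across four libraries.
counts = {
    "geneA": [0, 3, 2, 4],       # total 9   -> excluded
    "geneB": [1, 2, 3, 4],       # total 10  -> kept
    "geneC": [50, 61, 48, 55],   # total 214 -> kept
}
kept = filter_low_count_genes(counts)
print(sorted(kept))
```

The threshold is applied to the sum over libraries rather than per library, so a gene with low counts in every sample can still be retained if its combined total reaches ten.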
Shrinkage was applied to the DESeq2 output using lfcShrink (parameters: type = "normal"). Where necessary (i.e. for the KO2 vs WT comparison), fdrtool was used to correct conservative p values for differential transcription (version 1.2.16; parameters: statistic = "normal"), in addition to the Benjamini-Hochberg correction for multiple testing. This correction could

Competing interests

The authors declare no competing interests.