key: cord-0838150-lco03t4p
authors: Jansen, Sander; Smlatic, Enisa; Copmans, Daniëlle; Debaveye, Sarah; Tangy, Frédéric; Vidalain, Pierre-Olivier; Neyts, Johan; Dallmeier, Kai
title: Identification of host factors binding to dengue and Zika virus subgenomic RNA by efficient yeast three-hybrid screens of the human ORFeome
date: 2021-01-18
journal: RNA biology
DOI: 10.1080/15476286.2020.1868754
sha: 397075bc5798c6c486d2efc4177e84a37e4422e6
doc_id: 838150
cord_uid: lco03t4p

Flaviviruses such as the dengue (DENV) and the Zika virus (ZIKV) are important human pathogens causing around 100 million symptomatic infections each year. During infection, small subgenomic flavivirus RNAs (sfRNAs) are formed inside the infected host cell as a result of incomplete degradation of the viral RNA genome by the cellular exoribonuclease XRN1. Although the full extent of sfRNA functions remains to be revealed, these non-coding RNAs are key virulence factors, and their detrimental effects on multiple cellular processes seem to consistently involve molecular interactions with RNA-binding proteins (RBPs). Discovery of such sfRNA-binding host factors has so far followed established biochemical pull-down approaches, which are skewed towards highly abundant proteins and thus hamper proteome-wide coverage. Yeast three-hybrid (Y3H) systems represent an attractive alternative approach. To facilitate proteome-wide screens for RBPs, we revisited and improved existing RNA-Y3H methodology by (1) implementing full-length ORF libraries in combination with (2) efficient yeast mating to increase screening depth and sensitivity, and (3) stringent negative controls to eliminate over-representation of non-specific RNA-binders. These improvements were validated employing the well-characterized interaction between DDX6 (DEAD-box helicase 6) and sfRNA of DENV as a paradigm.
Our advanced Y3H system was used to screen for human proteins binding to DENV and ZIKV sfRNA, resulting in a list of 69 putative sfRNA-binders, including several previously reported as well as numerous novel RBP host factors. Our methodology, which requires no sophisticated infrastructure or analytic pipeline, may be employed for the discovery of meaningful RNA-protein interactions at large scale in other fields.

RNA-protein interactions play a vital role in numerous cellular processes affecting the fate of both the RNA and the protein involved. The recent discovery of several new classes of non-coding RNAs such as long non-coding RNAs (lncRNAs) and their role in pathologies such as cancer, neurodegenerative disorders and cardiovascular disease [1, 2], as well as the fact that an estimated 5-10% of the human proteome comprises RNA-binding proteins (RBPs) [3, 4], further highlights the need to elucidate the RNA-binding proteome. Aside from their importance in noncommunicable diseases, RBPs also play a crucial role at the molecular interface between human cells and microbial pathogens. This is especially true for viruses, which have limited coding capacity and rely greatly on RNA-protein interactions for hijacking host cell machinery [5, 6]. A paradigm of such pathogen-derived non-coding RNAs is the subgenomic flavivirus RNA (sfRNA) expressed by flaviviruses, which are small positive-strand RNA viruses and include many medically important pathogens such as the dengue (DENV) and Zika viruses (ZIKV) (see infra). Similar to the 'guilt-by-association' principle for protein-protein interactions [7], the identification of specific RNA-protein interactions may hint at possible functions of a particular RNA or protein. Many techniques have been developed to study RNA-protein interactions [8, 9].
Recent go-to approaches focus on the isolation of RNA-protein complexes from living cells or cellular lysates by pulldown experiments, with recent advancements in RNA sequencing and mass spectrometry for RBP identification significantly increasing scale and throughput. However, as cells of a given origin express only a fraction of the proteome, assay output inherently depends on the particular cell line used, skewing the depth of such analyses towards RBPs present at a sufficiently high level [10]. False-positive interactions also remain an issue. Novel techniques such as proximity biotinylation or UV-crosslinking aim to reduce background, as they allow for higher stringency, but again introduce a certain bias, as both biotinylation [11] and UV-crosslinking tend to capture only a small subset of all RNA-protein interactions; for the latter, this is estimated at 1-5% [12]. To lower false-positive and false-negative rates, and thus generate a more exhaustive picture of RNA-protein interactomes, complementary assays based on orthogonal methods remain of much value. Three-hybrid approaches such as the yeast [13, 14], bacterial [15] and mammalian three-hybrid [16] serve as an interesting alternative, as they circumvent abundance biases by exogenously expressing both the RNA and the protein of interest. We have, for instance, recently developed a mammalian RNA three-hybrid system providing the advantage of assessing RNA-protein binding in a more native cellular environment. However, such an advanced mammalian cell assay does not allow growth selection and requires testing RNA-protein interactions individually in single wells. As a consequence, this approach is technically demanding when testing hundreds to thousands of RNA-protein pairs, and hence limited to laboratories equipped with automation solutions. By contrast, Y3H systems are easy to implement and scale up, with a minimal requirement for expensive reagents or complex infrastructure.
Y3H screens of large cDNA libraries to identify RBPs have produced valuable data sets [17]. Previous attempts have, however, been limited by high rates of false-positive and false-negative hits, intrinsic to the use of classical cDNA libraries, which contain large numbers of incorrect out-of-frame and randomly inserted cDNA constructs that interfere with a reliable readout when using the original Y3H approach. The aim of the current study was therefore to improve the established Y3H methodology [13, 18] and overcome some of the major shortcomings of previous strategies by implementing (1) the use of high-quality ORF libraries for RBP 'prey' expression, (2) yeast mating to guarantee a high sampling depth, and (3) matched antisense 'baits' as stringent negative controls to allow for robust screening for relevant RBPs and hit validation at a large scale.

As a proof-of-concept, we assessed RBP binding to the highly structured subgenomic flavivirus RNA (sfRNA). These unique ncRNAs are a product of incomplete degradation of the flaviviral RNA genome by the ubiquitous cellular 5ʹ-3ʹ exoribonuclease 1 (XRN1), which stalls on XRN1-resistant RNA structures (xrRNAs) in the 3ʹ untranslated region (3ʹUTR) [19]. Essentially comprising a large part (~300-500 nt) of the 3ʹUTR with its conserved secondary structures, this non-coding viral RNA species serves as a virulence factor that perturbs cellular RNA homeostasis [20, 21] and, via its interaction with cellular host proteins [22, 23], inhibits innate antiviral immunity, amongst other processes. For a more comprehensive overview see Bavia et al. [24], MacFadden et al. [25] and Goërtz et al. [26], and references therein. In vivo experiments showed that sfRNAs aid flaviviruses in disseminating within their vector mosquitoes [27, 28], and increase viral pathogenicity in experimentally infected mice [20].
Despite their role in flavivirus pathogenesis, the list of host factors targeted by sfRNAs remains limited and requires further investigation [16, 22, 29, 30]. Here we show that the previously discovered interaction between a DENV sfRNA and DEAD-box helicase 6 (DDX6) [22] can readily be detected by Y3H techniques, and confirm its specificity by comparison to matched antisense and protein-binding domain-deleted bait RNA controls, providing a blueprint for thorough hit validation. Next, we show that our improved Y3H methodology can pick up the interaction between sfRNA and DDX6 under actual screening conditions, when yeast diploids expressing DDX6 are outnumbered by those expressing an irrelevant, non-interacting control protein as prey. Finally, a human ORF library was screened for cellular targets of the sfRNAs of DENV and ZIKV, resulting in a list of 69 putative sfRNA-binding proteins. A handful of previously confirmed RBP-sfRNA interactions were detected (e.g. DDX6), further validating our approach, alongside several novel and biologically interesting putative RBP host factors.

All synthetic-defined (SD) media were prepared with double-distilled water, using BD Difco Yeast Nitrogen Base without Amino Acids and Ammonium sulfate (Thermo Fisher Scientific) supplemented with 50 g/l ammonium sulfate, 20 g/l D-glucose (Sigma) and 1.92 g/l drop-out supplement (Sigma-Aldrich). 1× YPDA medium was obtained by adding 50 g/l of YPD broth (Carl Roth) and 250 mg (20x) adenosine hemisulfate (Apollo Scientific) to 1 L of double-distilled water. Agar plates were prepared by autoclaving 20 g/l of agar (Carl Roth) in double-distilled water before adding all other sterile components. Additional components such as 3-aminotriazole or 3-AT (Sigma) and the antibiotics kanamycin (Carl Roth) and geneticin G418 (Thermo Fisher Scientific) were added to agar plates after solidification to prevent degradation.
Bait RNA constructs were expressed from the p3HR2 vector (a generous gift of Prof. Marvin Wickens), with the URA3 auxotrophic marker replaced by LEU2 or TRP1 (p3HR2L and p3HR2T). sfRNA constructs were generated by cloning DENV serotype 2 New Guinea strain C (Genbank AF038403) nt 10,296-10,723, ZIKV strain MR766 (Genbank LC002520.1) nt 10,478-10,807 or the yellow fever virus (YFV) 17D-204 vaccine strain (Genbank NC_002031) nt 10,532-10,862 into a unique XhoI restriction site of the yeast three-hybrid vector p3HR2 [14], which expresses the bait RNA in a stable scaffold (GC-clamp) and as a 5ʹ-terminal fusion to a tandem repeat of two MS2 RNA stem-loops, yielding two types of clones with the sense (DENV sfRNA) and antisense RNA (asDENV sfRNA), respectively. The antisense construct was used as a matched negative control for hit confirmation. Prey proteins were expressed as N-terminal GAL4 activation domain fusions from the Gateway-compatible vectors pIR190 and pACT2.2gtwy, a low- and a high-copy yeast episomal plasmid, respectively, to exclude vector-specific effects. pIR190 was a gift from Detlef Weigel (Addgene plasmid #64812; http://n2t.net/addgene:64812; RRID:Addgene_64812) [32] and pACT2.2gtwy was a gift from Guy Caldwell (Addgene plasmid #11346; http://n2t.net/addgene:11346; RRID:Addgene_11346). ORF cDNAs were cloned into these vectors using standard Gateway cloning (Gateway™ LR Clonase™ II Enzyme mix, Invitrogen).

Chemically competent yeast cells were generated using the frozen competent yeast cell protocol of Gietz and Schiestl [33] and transformed with our Y3H expression constructs using the lithium acetate/single-stranded carrier DNA/PEG method [34]. Yeast suspensions were stored at −80°C after adding an equal volume of a sterile 50% glycerol solution.

The prey library expressing 12,000 full-length human ORFs as N-terminal GAL4-AD fusions was previously described [35].
Briefly, a pool of 12,000 human ORFs cloned in pDONR223 (version 3.1 of the Centre for Cancer Systems Biology, CCSB; purchased from Open Biosystems; a gift of Dr Vincent Lotteau) was transferred by in vitro recombination into the pPC86 vector (Invitrogen) made Gateway compatible (a gift of Dr Marc Vidal). The resulting library was transformed into yeast strain Y187 and stored frozen at −80°C until use.

Library screens were conducted by mating Y187 cells bearing the prey library with YBZ2 cells expressing particular RNA baits. To that end, YBZ2 cells transformed with the p3HR2L vector expressing the respective RNA under study were freshly grown in SD/-L medium in a shaker for 48 h at 28°C. 125 µL (6 OD) of Y187 yeast suspension transformed with the human ORFeome library was retrieved from −80°C, added to fresh YPD medium and incubated for 10 min in a shaker at 28°C. Then, 7.2 OD of YBZ2 suspension was mixed with the Y187 suspension and spun down at 750 × g for 5 min. The supernatant was removed, and the yeast pellet was resuspended in 500 µL of sterile Milli-Q water and plated on YPD agar (90 mm plate) for mating. Right before the cell suspension was plated, 1.25 mL of 200 mg/mL adenosine hemisulfate and 25 µL of 50 µg/mL kanamycin were added to this plate. Cells were spread using glass beads and kept for 4 h at 28°C. After mating, the resulting diploids were scraped off the agar, resuspended in 3 mL of Milli-Q water and distributed evenly over 6 SD/-LWUH + 200 µg/mL G418 agar plates (90 mm; containing 50 µg/mL kanamycin to avoid bacterial contamination). In parallel, a small aliquot was plated on SD/-LW medium to determine mating efficacy.
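As a rough illustration of how the SD/-LW control plate translates into screening depth, the following sketch estimates mating efficacy and the resulting library coverage. The function names and the diploid count are hypothetical, and defining mating efficacy as diploids per viable library cell is an assumption; the remaining inputs (an estimated 10^8 library cells in 6 OD and 12,000 ORFs) are the figures quoted in the Results.

```python
def mating_efficacy(diploid_cfu: float, library_cfu: float) -> float:
    """Fraction of viable library (prey) cells recovered as diploids.

    Assumed definition; the paper does not spell out its formula.
    """
    if library_cfu <= 0:
        raise ValueError("library_cfu must be positive")
    return diploid_cfu / library_cfu


def library_coverage(n_diploids: float, library_complexity: int) -> float:
    """Average number of diploids obtained per individual library clone."""
    if library_complexity <= 0:
        raise ValueError("library_complexity must be positive")
    return n_diploids / library_complexity


LIBRARY_CELLS = 1e8          # estimated haploid Y187 cells in 6 OD (from the Results)
LIBRARY_COMPLEXITY = 12_000  # ORFs in the hORFeome v3.1 prey library

# Hypothetical diploid count, extrapolated from the SD/-LW control plate
# to the whole mating reaction.
diploids_total = 5e6

efficacy = mating_efficacy(diploids_total, LIBRARY_CELLS)
coverage = library_coverage(diploids_total, LIBRARY_COMPLEXITY)

print(f"mating efficacy: {efficacy:.1%}")
print(f"average coverage per ORF: ~{coverage:.0f}x")
```

With these inputs the estimate comes out at a little over 400-fold, the same order of magnitude as the sampling depth reported in the Results and well above the 40-fold threshold cited there for saturation.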
After 7 days of incubation at 28°C, colony PCR was performed on all colonies (0.5 to 5 mm in diameter) to amplify each ORF in pIR190, using the GoTaq Hot Start Polymerase and Green Master Mix (Promega) with the following primers: 5ʹ-GACGGACCAAACTGCGTATA-3ʹ (forward) and 5ʹ-ACCAAACCTCTGGCGAAGAA-3ʹ (reverse), and the elongation time set to 240 s (based on maximal ORF size). PCR products were run on a 1% agarose gel, purified using the Wizard SV 96 PCR Clean-Up System (Promega) and sequenced. Resulting sequences were blasted against the human ORFeome database v3.0 of CCSB (http://horfdb.dfci.harvard.edu/) and the general NCBI database (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

LacZ expression was measured using the Promega Beta-Glo® Assay System [36]. In brief, YBZ2/Y187 diploids were grown in SD/-LW medium while shaking for 24 h at 28°C. To exclude density-dependent effects, 50 µL of each of two different dilutions (OD 0.1 and 0.05) of these yeast cultures were mixed with 50 µL of Beta-Glo® reagent in a white 96-well microtiter plate with a clear transparent bottom (White View Plates, PerkinElmer). After 1 h of incubation at 28°C, luminescence was measured in a Safire II microplate reader (Tecan) and signals were corrected for cell density (A600). Lower yeast dilutions did not yield consistent results.

HIS3 expression was indirectly determined by measuring the effect of 3-aminotriazole (3-AT) on diploid growth [37]. YBZ2/Y187 diploids were inoculated in SD/-LW medium for 24 h, spun down, resuspended in an appropriate volume of Milli-Q water and added to multiple wells of a deep-well plate with 500 µL of SD/-LWH (0.01 OD final yeast concentration) and a 1:2 dilution series of 3-AT. After 48 h of shaking incubation at 28°C, 100 µL of each solution was transferred to a microtiter plate with a clear transparent bottom (White View Plates, PerkinElmer).
Cell density was measured by absorbance at 600 nm in a Safire II microplate reader (Tecan) and plotted against 3-AT concentration. Results were normalized, with yeast growth in SD/-LWH + 10 mM 3-AT set as 0% and growth in (1) SD/-LW set as 100% for data visualization and (2) SD/-LWH + 0 mM 3-AT set as 100% for calculating a 3-AT median inhibitory concentration (IC50 value) as a proxy for HIS3 expression.

HAP1 (haploid human epithelial) cells (catalogue number C631) and DDX6-KO cells derived thereof (clone 116-5), in which expression of DDX6 (gene ID: 1656) was ablated by CRISPR/Cas9-mediated gene disruption, were purchased from Haplogen, Vienna. Cells were cultured in Iscove's Modified Dulbecco's Medium (IMDM) supplemented with 10% (v/v) foetal bovine serum (FBS) and incubated at 37°C with 5% CO2. To determine virus infectivity, cells were seeded in a 6-well plate at 1 million cells/well in IMDM medium with 2% FBS. Cells were infected 24 h after seeding at a multiplicity of infection (MOI) of 1. Infectious virus yields were measured in the supernatant 2 days post-infection using an end-point dilution assay scoring for virus-induced cytopathic effect (CPE) by MTS/PMS (3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium/phenazine methosulfate; Promega), and the respective median tissue culture infectious doses (TCID50) were calculated using the Spearman-Kärber method. All cell culture media and reagents were from Gibco, unless otherwise specified.

We designed an advanced Y3H protocol to identify proteins interacting with RNA molecules of interest (Fig. 1). As previously described, bait RNAs were expressed in fusion with MS2 RNA hairpins allowing their capture by the hybrid LexA/MS2 coat protein (N55K), whereas prey proteins were expressed in fusion with the Gal4-AD domain for transactivation.
Interactions between bait RNA molecules and prey proteins allow the assembly of a functional transcription factor driving the expression of lacZ and HIS3 reporter genes. Compared to previous versions of the Y3H system, several improvements were achieved. First, we made the assay compatible with mating, a procedure that is much more efficient than yeast transformation when trying to cover complex prey libraries. To reach this goal, the novel YBZ2 strain was generated by introducing the kanMX4 gene in the GAL4 locus of the parental YBZ1 strain already expressing N55K and the (LexAop)-HIS3 and (LexAop)-lacZ reporter genes. This made it possible to select diploids on medium lacking uracil but supplemented with geneticin (G418), when screening prey libraries established in the Y187 strain (Fig. 1). Introducing the kanMX4 gene also abolished the expression of Gal4 in the YBZ2 strain, something essential to prevent the transactivation of Gal4-dependent lacZ and HIS3 reporter genes brought by the Y187 strain in the diploids. Secondly, a normalized library of 12,000 full-length human ORFs previously established in Y187 was used to perform the screens [38]. As aforementioned, full-length proteins will generate less background than isolated fragments or domains, and thus reduce detection of spurious interactors. In addition, full coverage of normalized libraries is much easier to achieve. To further improve data quality, we conducted a counterscreen using MS2 RNA hairpins alone without any other RNA molecule of interest fused as bait, to identify non-specific RNA binders in the assay. Finally, the specificity of RNA-protein interactions identified in this manner can be assessed using a matching antisense variant of our bait RNA molecule of interest as a stringent negative control.

Figure 1. [...] (genotype: MATα, URA3, his3, trp1, leu2) with prey plasmids (prototrophic marker: TRP1) expressing the human ORF library, both strains are mated and resulting diploids plated on selective medium (SD/-LWUH supplemented with G418). The thus chosen growth conditions select for functional complementation of multiple markers, essentially resulting from the productive interaction between bait RNA and prey protein (HIS prototrophy). This interaction needs to be a consequence of fusion of cells from either yeast strain (G418res and URA3, respectively) by mating, by this means combining both bait (LEU2) and prey (TRP1) plasmids in one diploid cell. Each candidate RBP is identified by colony PCR using prey expression vector-specific primers (amplification of target ORF cDNA), Sanger sequencing and hit calling by BLAST analysis. (B) Elimination of false positives. In parallel, a negative control screen is performed the identical way, yet using an empty bait construct to eliminate non-specific RNA binders to be deselected as false-positives from the primary interaction screen. (C) Hit verification. Remaining hits may further be characterized quantitatively by Y3H, assessing their HIS3 and lacZ reporter gene expression levels. Hits can thus be verified by showing a marked increase in signal over several stringent negative controls, such as a matched antisense bait construct.

We first challenged this Y3H protocol using subgenomic flavivirus RNA (sfRNA) molecules as bait because of their key role in viral physiopathology and the known interactions with host proteins. At first, we tested our assay on a well-characterized interaction between DENV2 sfRNA and human DEAD-box helicase 6 (DDX6). DDX6 is a host RBP interacting with the DENV2 3ʹUTR, and likewise in sfRNA derived thereof, via the dumbbell secondary RNA structures [22]. DDX6 has been shown to support DENV replication.
In line with previous reports, a tenfold decrease in infectious virus yield (TCID50) and RNA copies in cell supernatant was observed in DDX6 knockout cells compared to HAP1 parental cells (Fig. 2A and Supplementary Fig. S1). The sfRNA-DDX6 pair was thus implemented in our Y3H system. Yeast reporter cells expressing DENV2 sfRNA and DDX6 as respective bait and prey gained histidine prototrophy, as they grew on selective media lacking this amino acid (see Supplementary Fig. S2). This demonstrates HIS3 reporter gene induction downstream of productive RNA-protein binding. In contrast, yeast expressing an irrelevant protein as non-interacting prey failed to grow on selective media. HIS3 expression was indirectly titrated by measuring growth of diploid yeasts in the presence of increasing concentrations of 3-aminotriazole (3-AT), a competitive HIS3 inhibitor. In parallel, LacZ signals were quantified in a chemiluminescent beta-galactosidase assay. Both reporter genes, lacZ and HIS3, showed a robust increase in expression over matched negative controls (expressing DENV2 sfRNA in antisense orientation; DENV2as), demonstrating the specificity of the Y3H assay. Differences were also observed, as expected, when prey protein expression was tuned using either low or high expression vectors (Fig. 2B, C). Indeed, a ~30- to 40-fold increase was observed when DDX6 was expressed from the high-copy plasmid compared to a ~20-fold increase using the low-copy plasmid, in line with lower expected prey protein levels in the latter. Altogether, this demonstrates that HIS3 and LacZ induction in our Y3H system depends on bait and prey expression as expected.

As additional proof of specificity, DDX6 binding was assessed with the protein-binding domain deleted in the bait RNA. As shown before [22], DDX6 binds to the tandemly repeated dumbbell (DB-I and DB-II) structures of DENV2, whereas YFV has only a single dumbbell structure in its 3ʹ-UTR. As shown in Fig. 2C, this single dumbbell structure was sufficient and required for DDX6 binding, although the signal was slightly lower compared to DENV2 sfRNA. Finally, swapping of the YFV dumbbell region for the DENV2 dumbbell region fully rescued DDX6 binding in a respective deletion variant. Altogether, these results further demonstrate the specificity of our Y3H assay.

To implement a robust library screening protocol and to increase sampling efficacy, a solid media mating protocol was optimized (as described in the "Material and methods" section), effectively reaching mating efficacies of ~5%. For comparison, high-efficacy yeast transformation ideally yields up to 10^6 transformants per µg plasmid DNA per 10^8 cells when easy-to-transform strains are used [34], equalling an efficiency of 1%, which possibly limits library coverage and sampling depth (redundancy of individual clones). This protocol aimed at screening a prey library containing a pool of 12,000 in-frame, full-length human ORFs (hORFeome library V3.1; CCSB) that were N-terminally fused to Gal4-AD and transformed in the Y187 yeast strain. We decided to mate 7.2 OD of the 'bait' YBZ2 suspension with 6 OD of 'prey' Y187 suspension (1.2:1) [39]. Using this relatively small amount of library cells for mating (yeast cell count of 6 OD; estimated 10^8 haploid cells), each one of the 12,000 different ORFs present in the library described above is effectively sampled ~300-400 times in a single screen. This should be sufficient, since it was previously determined empirically (in a Y2H cDNA setting) that a yeast hybrid screen reaches saturation when the number of diploids equals more than 40 times the original complexity of the library.

To validate our Y3H setup for library screening, a pilot screen was conducted mimicking an actual screen, in which diploid cells expressing a bait RNA-binding protein were strongly outnumbered by cells expressing a non-interacting protein.
To this end, Y187 clones expressing DDX6 (interacting protein, or IP) as prey were mixed with Y187 clones expressing a negative control prey (non-interacting protein, or NIP) in five different ratios (DDX6:NIP 1:0 to 1:10,000) prior to mating with YBZ2 cells expressing the DENV2 sfRNA as bait (Fig. 3A). In addition, this pilot screen enabled us to determine optimal conditions for an actual screen such as media composition, time for growth, and colony picking/PCR conditions. A common practice in yeast two- and three-hybrid screens to counter leaky HIS3 expression is to add 3-AT, a competitive HIS3 inhibitor, to increase stringency [17, 37, 40, 41]. However, in our Y3H setting, our bait did not cause any auto-activation (see Supplementary 1) and the selective pressure posed by as little as 1 mM of 3-AT proved to be too strong, lowering the amount of outgrowing colonies and thus sensitivity drastically. We hence tried to balance the false-negative rate (resulting from addition of 3-AT) and the increased incidence of false positives (obtained in absence of 3-AT) using stringent hit validation. An optimal balance between sensitivity and selectivity was reached on media selecting for both yeast strains as well as expression plasmids, SD/-LWUH + G418. The plate with an IP:NIP ratio of 1:100 was chosen, for practical reasons, for assessing selectivity. Fifty colonies were picked and out of these 50 diploid yeast clones, 90% expressed DDX6 as prey (Fig. 3B), equalling an enrichment factor of 900 (of the IP as prey over NIP). By increasing the lower limit for colony size (i.e. by leaving out colonies less than 0.5 mm in diameter), this percentage could further be increased.

Figure 2. [...] is shown relative to HAP1-parental. n = 4 from two independent experiments. Unpaired t-test calculated in Graphpad Prism. ****p < 0.0001. (B + C) Elevated reporter gene expression in yeast resulting from functional Y3H interaction. Diploids expressing DDX6 as 'prey' and DENV2 sense sfRNA as 'bait' have markedly increased HIS3 and LacZ expression levels over diploids expressing antisense sfRNA employed as matched negative control 'bait'. DDX6 was expressed from two different expression vectors to exclude vector-dependent effects. (B) Fold increase in LacZ expression over the antisense 'bait' control was determined in a coupled bioluminescent LacZ assay in two biological replicates. (C) HIS3 expression was titrated indirectly by measuring growth resistance of diploid Y3H cells to the competitive HIS3 inhibitor 3-AT. Growth was measured by absorbance (OD at 600 nm) in liquid cultures and normalized for SD/-LW (set as 100%) and SD/-LWH supplemented with 10 mM 3-AT (set as 0%). A 3-AT IC50 value was calculated with growth in SD/-LWH set as 100% (not shown). (D) Conserved dumbbell structures in the DENV2 sfRNA are required for DDX6-interaction in the Y3H. Different sfRNA variants of both YFV17D (construct 3) and DENV2 (construct 4) were assessed for DDX6 binding in Y3H. Diploids were grown overnight in SD/-LW medium and streaked for phenotypic analysis on SD/-LWH medium. In construct 6 (YFV ∆DB), the central dumbbell (DB) structure (orange) proposed to bind to DDX6 [27] has been deleted. In construct 8, the tandem repeat DB-I and DB-II structures (blue) from DENV2 sfRNA have been swapped for the homologous RNA elements in the YFV sfRNA. Constructs 5 (DENV2as) and 7 (YFVas ∆DB) represent antisense controls for respectively construct 4 (DENV2 sfRNA) and construct 6.

After validating and optimizing the Y3H mating and selection protocol, demonstrating that it is able to detect RBPs with high specificity and selectivity under screening conditions, an actual library screen was performed using DENV2 and ZIKV sfRNA as bait and the hORFeome library V3.1 encoding 12,000 human prey proteins. In total, 314 and 167 positive yeast colonies were retrieved from DENV2 and ZIKV screens, respectively.
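The enrichment factor quoted for the pilot screen can be reproduced with a short calculation. The sketch below assumes that enrichment is defined as the ratio of the IP:NIP odds among picked colonies to the IP:NIP odds in the input mixture; this definition is not stated explicitly in the text but yields the reported value. The function name is hypothetical.

```python
def enrichment_factor(ip_out: int, nip_out: int, ip_in: float, nip_in: float) -> float:
    """Ratio of IP:NIP odds after selection to IP:NIP odds before selection.

    Assumed definition of 'enrichment factor'; written as a single
    division to keep the arithmetic exact for small integer inputs.
    """
    if min(ip_out, nip_out, ip_in, nip_in) <= 0:
        raise ValueError("all counts must be positive")
    return (ip_out * nip_in) / (nip_out * ip_in)


picked_colonies = 50
ddx6_colonies = 45                       # 90% of the 50 picked colonies
nip_colonies = picked_colonies - ddx6_colonies

# Input mixture used for the selectivity assessment: DDX6:NIP = 1:100.
factor = enrichment_factor(ddx6_colonies, nip_colonies, ip_in=1, nip_in=100)
print(f"enrichment of DDX6 over NIP: {factor:.0f}-fold")
```

Note that this formulation is undefined when no NIP colonies are recovered, which is why the function rejects zero counts rather than returning infinity.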
Along with the screen using DENV2 and ZIKV sfRNA as bait, an additional negative control screen was performed using an empty bait construct expressing only the MS2 RNA (i.e. without sfRNA) to eliminate false positives such as non-specific RBPs. Ninety colonies were picked in this negative control screen, expressing 18 unique proteins. Fifteen of these proteins were also identified in our DENV2 and ZIKV sfRNA screens and were thus considered false positives and removed from the list of putative sfRNA-binding proteins. By this means, 69 out of the initial 84 putative sfRNA-binding proteins (82%) remained (false-positive hit rate 18%). Among these hits, EDC3 was picked up 125 times in our dengue screen (≈1/3 of all colonies), while RBPMS was the most frequent hit in our Zika screen, as it was picked up 18 times (Fig. 4A). Seven proteins (DDX6, ZNF274, PRDM14, DAZAP2, MOCS2, FCGR2A and TUBA1B) were detected in both screens. Of note, DDX6 was identified 5 times in the dengue screen, providing additional validation for our screening approach.

To confirm lacZ reporter gene induction in positive yeast colonies, a luminescent beta-galactosidase assay was performed on a panel of yeast clones that were picked up at least twice in the DENV2 screen. All showed a marked increase in LacZ signal over background, with EDC3 showing the strongest signal increase (~212-fold) and PRKRA the lowest signal increase (~3-fold). Overall, a positive correlation between LacZ signal and screen hit frequency could be observed, with DPYSL3 as an outlier showing a ~240-fold increase in LacZ signal while it was only picked up four times in the DENV2 screen. Excluding DPYSL3, the Spearman correlation coefficient equalled 0.8986 (p = 0.0278). Additionally, the same clones were replica-plated to confirm their HIS3 expression and growth on histidine-deficient selective medium. Results are shown for diploids expressing protein EDC3 as prey in Fig. 4D (results of other preys are displayed in Supplementary Fig. S3). Gene ontology profiling [42] showed that our list of hits was significantly enriched for P-body proteins (p = 1.909 × 10^−3). General RNA-binding capability was predicted (Fig. 4F) based on one or more of the following criteria: (1) the presence of classical RNA-binding domains, (2) RNA-binding gene ontology and/or (3) detection in a comprehensive RNA-interactome screen [3]. Based on this analysis, 19 out of 69 identified genes (27.5%) have already been associated with RNA-binding, and the four most frequent hits (ZNF274, EDC3, PRDM14 and RBPMS) belong to this list. Three of our hits, lacking classical RBD or known RNA-binding gene ontology, were detected in the latter interactome screen, which only picked up mRNA-binders in HeLa cells.

Figure 4. [...] Overview of the cellular component gene ontologies for proteins that were picked up in the DENV2 and ZIKV sfRNA Y3H screens. The resulting hit list was significantly enriched for proteins that are part of P-bodies (GO:0000932). Hits run as an ordered query, according to their screen hit frequency, in the g:GOst tool in g:profiler [42]. **p ≤ 0.01 (C) Correlation between hit calling frequencies and reporter gene expression levels. A panel of yeast diploids selected in the DENV2 screen were assessed for LacZ expression at two different yeast cell densities, all hits showed a marked increase over the NIP control. A positive correlation between screen hit frequency (ordered from left to right) and respective LacZ signal intensities could be observed with the exception of DPYSL3 (Excluding DPYSL3, Spearman r: 0.8986, two-tailed p = 0.0278). (D) Replica plating confirms HIS3 reporter gene expression. A panel of yeast diploids picked up in the DENV2 screen were replica plated on selective medium to confirm HIS3 expression. Results are shown for diploids expressing protein EDC3, the most frequent hit, as prey. (E) Comparison with other genome-wide approaches to identify DENV and ZIKV host factors. Venn diagram showing the overlap of hits identified by Y3H with DENV2 and ZIKV 3ʹUTR/sfRNA-binding proteins previously identified in orthogonal screens, i.e. by (1) a mammalian three-hybrid assay [16] and (2) RNA pulldown screens [22, 30] and (3) host dependency or restriction factors of DENV and ZIKV replication identified by two independent RNAi screens [43, 44] with no bias for RBP. (F) RNA-binding domains involved. Venn diagram showing hits with known RNA-binding activity based on (1) the presence of a classical RNA-binding domains (n = 17) and/or (2) detection in a comprehensive RNA-interactome screen (n = 13) [3] and/or (3) RNA-binding gene ontology (n = 19).

Our Y3H screen detected six proteins that were previously found in orthogonal screens for DENV2 or ZIKV 3ʹUTR/sfRNA-binding proteins (Fig. 4E). DDX6, STX11, RBPMS, SPATA5 and PACT were also hits in our DENV2 mammalian three-hybrid (M3H) screen [16]. PACT was additionally identified in a DENV2 pulldown screen [30], and DDX6 showed binding to both DENV2 and ZIKV sfRNA in two independent pulldown screens, which also picked up EDC3 [22, 30]. Furthermore, four proteins overlapped with host dependency and restriction factors for dengue and/or Zika virus. ZNF165, ISG15, DHDDS and again PACT were found to have a proviral effect on DENV or ZIKV replication in an RNA interference screen [43, 44]. RBPMS, which was picked up in both of our three-hybrid screens, was likewise found in a functional screen for the related flavivirus West Nile virus [45]. Altogether, our Y3H screen detected at least nine proteins that were already previously found in orthogonal screens for either (1) 3ʹUTR-binding proteins or (2) host dependency and restriction factors for dengue and/or Zika virus (Fig. 4E).
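The overlap counts above follow from simple set arithmetic over the protein names given in the text. The sketch below is only a bookkeeping aid; the variable names are hypothetical, and the sets contain only the hits named explicitly in this section.

```python
# Hits shared with orthogonal 3'UTR/sfRNA-binding screens (names as given in the text).
m3h_screen = {"DDX6", "STX11", "RBPMS", "SPATA5", "PACT"}
pulldown_screens = {"PACT", "DDX6", "EDC3"}

# Hits shared with RNAi screens for host dependency or restriction factors.
rnai_screens = {"ZNF165", "ISG15", "DHDDS", "PACT"}

TOTAL_HITS = 69  # putative sfRNA binders remaining after the counterscreen

binding_overlap = m3h_screen | pulldown_screens
all_overlap = binding_overlap | rnai_screens
in_both_categories = binding_overlap & rnai_screens
novel_hits = TOTAL_HITS - len(all_overlap)

print(f"found in orthogonal binding screens: {len(binding_overlap)}")
print(f"found in RNAi screens: {len(rnai_screens)}")
print(f"found in both categories: {sorted(in_both_categories)}")
print(f"previously reported in total: {len(all_overlap)}")
print(f"not previously reported: {novel_hits}")
```

The union is smaller than the sum of the two categories because PACT appears in both, which is why six binding-screen hits and four RNAi-screen hits add up to nine distinct proteins.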
Sixty proteins were newly identified as potential sfRNA binders and will require further investigation.

In this study, we demonstrate how the classical RNA-Y3H method [13, 46] can be advanced to deliver a powerful screening tool for the identification of RNA-binding proteins, essentially by combining (1) yeast mating, (2) high-quality ORF libraries and (3) stringent hit validation. Using this methodology, we identify 69 human proteins specifically binding to DENV and ZIKV sfRNA, including several previously reported as well as numerous novel RBP host factors. Although in principle as simple and appealing as the original protein-protein yeast two-hybrid (PP-Y2H) approach and, hence, well suited to large-scale RBP interaction screens [17, 47], the established Y3H technique has not been widely used (for a critical review see [17]). In fact, published evidence for the use of RNA-Y3H, in particular for (close to) genome-wide screens, is sparse, and the reported results are somewhat disappointing regarding the relatively small number of hits gained and the depth of the (claimed) coverage [48-50]. Instead, RNA-Y3H has mostly been employed (1) in an inverse setting, i.e. looking for RNA interactors of a known RBP, or (2) for the mapping of interaction sites in known interacting pairs in a one-by-one setting (as we also do for confirmation of specificity, as shown in Fig. 2B-D). This is in stark contrast to the power and widespread (and even commercialized) use of the classical PP-Y2H. The most commonly encountered problem when using the original Y3H system is the selection of a large number of false-positive clones. Several improvements have been implemented, focusing on yeast strain improvements, choice of selection markers, or amendments to the protocol in order to eliminate autoactivation (i.e. erroneous reporter induction in the absence of either interaction partner, bait or prey) (for a discussion see Martin [17]).
However, another fundamental shortcoming that we encountered when using a standard cDNA library, and which does not relate to the genuine Y3H principle, has hardly been addressed experimentally so far. In brief, although vast technical improvements have been made towards the generation of high-quality cDNA libraries (see, e.g. [51]), random-primed or size-selected cDNA libraries, or cDNA libraries generated using universal adapters for first-strand synthesis (such as polyT primers), contain only a fraction of full-length clones. Importantly for the Y3H method, many out-of-frame fusions to the AD will be generated, possibly in all six reading frames. Protein-RNA interactions, in turn, are largely driven by strong electrostatic interactions. Owing to the degeneracy of the genetic code, in particular the over-representation of codons encoding the positively charged amino acids arginine (CGN, AGA/G) and lysine (AAA/G), a high abundance of frameshift proteins inevitably leads to the expression of an excess of non-natural prey proteins consisting of faulty, yet highly charged AD variants with quasi histone-like (intrinsically nucleic acid-binding) properties. Erroneously introduced premature stop codons leading to the expression of C-terminally truncated clones are less problematic in this regard, although they may result in the deletion of an RNA-binding motif, increasing the number of false negatives. For comparison, we previously conducted a small set of Y3H screens essentially as described in the "Material and methods" section, using YBZ2 cells transformed with DENV2 sfRNA as bait, and a commercially available high-quality pre-transformed normalized universal human cDNA library (Mate & Plate Library; Matchmaker Gold Yeast Two-Hybrid System, Clontech/Takara) [52, 53] as prey collection, with a consistently poor outcome. Firstly, only a very small number of outgrowing clones (primary hits growing without histidine) were gained after mating (<50 CFU).
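The codon-degeneracy argument made above can be illustrated with a short count over the standard genetic code. The sketch below is not part of the original analysis; it assumes, as a deliberate simplification, that every sense codon is equally likely in an out-of-frame read, and the names `CODON_TABLE` and `codon_share` are illustrative. Under that assumption, arginine and lysine together are encoded by 8 of the 61 sense codons, twice as many as the 4 codons encoding the negatively charged residues aspartate and glutamate.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_share(residues):
    """Fraction of the 61 sense codons that encode any of the given residues.

    Simplifying assumption: all sense codons are equally likely, which
    ignores the real base composition of cDNA fragments.
    """
    sense = [aa for aa in CODON_TABLE.values() if aa != "*"]
    return sum(aa in residues for aa in sense) / len(sense)


positive = codon_share("RK")  # arginine + lysine
negative = codon_share("DE")  # aspartate + glutamate
print(f"Arg/Lys share: {positive:.3f}; Asp/Glu share: {negative:.3f}")
```

Real frameshift products will deviate from these shares depending on sequence composition, so the numbers should be read as an indication of the bias, not as a prediction for any particular clone.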
More importantly, only a minute fraction (10 CFU) could be rescued by replica plating. Of the remaining clones, four consisted of diploid cells harbouring empty prey plasmids that escaped prototrophic growth selection, three thereof with a frameshift in the linker region downstream of the encoded AD domain showing autoactivation for unexplained reasons. One clone expressed a small poly-lysine peptide, putatively arising from the cloned polyA tail of a fragmented mRNA, or from polyT oligonucleotides employed during first-strand cDNA synthesis. The rest consisted of out-of-frame fusions of cDNA fragments originating from known human proteins. As a special example, two identical clones expressed the same roughly 145 aa long peptide (calculated MW 16.5 kDa) with a high calculated isoelectric point of 9.81, originating from the Homo sapiens secreted frizzled-related protein 4 (SFRP4) mRNA (NM_003014.3). None of the hits thus obtained was considered biologically relevant. We expect that others have made similar anecdotal findings, although these are hardly ever reported, except by Tan et al. [50], who mention in their methods narrative that hits were selected after 'non-sense sequences of the candidates were eliminated', yet disclose no further details.

Prior to establishing the refined Y3H methodology, we first confirmed that the Y3H can pick up relevant binding partners, using the well-characterized interaction between dengue non-coding sfRNA and the flavivirus host factor DDX6 as paradigm. We observed a marked, consistent and specific increase in expression of both yeast reporter genes employed (lacZ and HIS3) over multiple stringent negative controls, such as matched antisense and protein-binding domain-deleted bait RNA, and when the respective 'prey' constructs were expressed from either a low- or high-copy vector, in line with the previously described correlation between reporter activities, prey protein expression levels and RNA-RBP affinities [18].
Besides serving as a proof-of-concept, this set of experiments may serve as a blueprint for thorough hit validation. Interestingly, DDX6 also showed binding to the YFV sfRNA, hinting at a more broad-spectrum flavivirus host factor role for DDX6. Next, we established to what extent the thus refined Y3H methodology can pick up the interaction between sfRNA and DDX6 under simulated screening conditions, i.e. when reporter cells expressing DDX6 are outnumbered by cells expressing a non-interacting protein as prey. Additionally, this pilot screen allowed us to determine optimal conditions for an actual screen. Defined medium selecting for both parental yeast strains (bait donor strain and prey library strain) and both expression plasmids (bait and prey), SD/-LWUH + G418, effectively suppressed background growth and selected for more than 90% true interactors when these were outnumbered by a 100-fold excess of yeast cells transformed with a non-interacting protein as irrelevant prey. Although 3-AT, a competitive HIS3 inhibitor, is commonly added to the selective medium for yeast two- and three-hybrid screens [17, 40, 41, 54], this additional selective pressure proved to be too strong a selection barrier in our Y3H setup, suggesting generally low-affinity RNA-RBP interactions [17]. Nonetheless, omitting 3-AT bears the risk of an increase in false positives under actual screening conditions. As a compromise, we chose a staged approach, namely (1) using screening conditions with a minimal false-negative hit rate (i.e. without 3-AT), and (2) tackling the resulting increase in false positives subsequently by performing a negative control screen and stringent hit validation. Finally, we conducted two independent screens of a 12,212-ORF library for proteins binding to DENV2 and ZIKV sfRNAs, which resulted in a list of 84 proteins in total.
After conducting an identical screen with a negative control bait construct (DENV2 sfRNA in antisense orientation), and removing these non-specific interactors, 69 putative sfRNA-binders remained. Our list of 69 proteins should be further studied to (1) verify sfRNA-binding, and (2) determine their true relevance during flavivirus infection. Here it might be valuable to assign higher priority to hits identified with a higher frequency, and/or in both the DENV2 and ZIKV screen, and/or in orthogonal screens. In two-hybrid screens, filtering out hits identified only once or twice can enrich for high-quality interactions [55]. Several of our hits have been identified in previous orthogonal interaction and functional screens, such as the PACT (PRKRA) protein, which was previously detected and confirmed to bind the DENV2 sfRNA in a mammalian three-hybrid RNA-KISS screen [16], and shown to affect DENV infection in tissue culture in an RNAi screen [43]. Scanning gene ontologies and mining data for RNA-binding evidence can provide additional information about the relevance of each hit. In our case, 27.5% (roughly one quarter) of all hits have already been associated with RNA-binding. Of note, the four most frequent hits belong to this list, and known RNA-binders are distributed evenly over the rest of the ranked list. However, labelling the remaining hits a priori as false positives because they fail to match established classifications should be avoided, since fundamental gaps in the RNA-interactome prevail, as discussed by Hentze et al. [4]. Coincidentally, the interaction between sfRNA and TRIM25 [23] may serve as an example of such a functionally relevant, yet non-conventional RNA-protein binding [4]. Furthermore, hits from our Y3H screen were examined for the presence of classical RNA-binding domains, RNA-binding gene ontology, as well as detection in a recent RNA-interactome screen (RBDmap) recording general mRNA-binders in HeLa cells [3, 56].
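The filtering and prioritisation steps described above (removal of preys recovered with the negative control bait, then ranking by screen overlap, orthogonal evidence and hit frequency) could be sketched as follows. This is only an illustration: the `Hit` record, the `triage` function, the strict ordering of the three criteria and the gene names are all assumptions made for the example, and the records are placeholders, not data from the screens. The last lines merely check that the reported totals (84 initial hits, 69 retained) are consistent with the roughly 18% of hits removed by the control screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str
    frequency: int      # number of independent colonies carrying this prey
    screens: frozenset  # baits that recovered the prey, e.g. {"DENV2", "ZIKV"}
    orthogonal: bool    # also reported in an independent (non-Y3H) screen


def triage(hits, negative_control_genes):
    """Drop preys seen with the negative control bait, then rank the rest.

    Ranking order (an assumption of this sketch): number of screens,
    then orthogonal evidence, then hit frequency, all descending.
    """
    specific = [h for h in hits if h.gene not in negative_control_genes]
    return sorted(
        specific,
        key=lambda h: (len(h.screens), h.orthogonal, h.frequency),
        reverse=True,
    )


# Placeholder records for illustration only.
hits = [
    Hit("GENE_A", 12, frozenset({"DENV2", "ZIKV"}), True),
    Hit("GENE_B", 3, frozenset({"DENV2"}), False),
    Hit("GENE_C", 7, frozenset({"ZIKV"}), False),
    Hit("GENE_D", 9, frozenset({"DENV2"}), True),
]
ranked = triage(hits, negative_control_genes={"GENE_C"})
print([h.gene for h in ranked])

# Consistency check on the totals reported in the text.
removed_fraction = (84 - 69) / 84
print(f"Removed by negative control screen: {removed_fraction:.1%}")
```

A weighted score could replace the strict ordering if, for example, a very frequent single-screen hit should outrank a rare hit found in both screens.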
Of interest, three of our hits, each lacking a classical RBD or known RNA-binding gene ontology, had been classified as RBPs only in the RBDmap interactome screen. Likewise, a growing number of our hits may still be detected in future interactome screens, especially when these are applied to a broader repertoire of RNAs. Moreover, due to inherent constraints of both pulldown and Y3H systems, either method will detect only a fraction of true interactors, which may not be confirmed by the respective orthogonal method, as also observed in two-hybrid screens [57, 58]. High rates of false positives and negatives are among the main drawbacks of yeast hybrid screens in general, especially when not properly controlled, although actual rates have not been documented for Y3H. Using the original Y2H variant as a comparator, false-positive rates (FPR) are estimated to be at least ~20% and false-negative rates (FNR) ~75% [58, 59]. The improvements of the Y3H method presented here should markedly lower both FPR and FNR. Since it has been estimated that FPRs for Y2H are ~34% lower when screening ORF instead of cDNA libraries [59], a marked drop in FPR can also be expected in a Y3H setting, given the high degree of similarity between both methods. Moreover, as exemplified by our experience and anecdotal reports by others [50] employing prey libraries originating from fragmented cDNAs (see above), the use of a high-quality ORF library will eliminate all those false-positive interactors arising from frameshift proteins. In our view, the problem of false-positive interactors can further be tackled effectively by setting up proper controls, such as our negative control screen, and performing extensive hit confirmation using stringent negative controls such as matched antisense bait RNAs, as we have shown for DDX6.
Although further hit verification may be required to determine the actual range of FPR, our negative control screen identified and rapidly eliminated 18% of hits from the initial ORF screens as false positives. In conclusion, compared to cDNA screens, where a substantial number of hits, if not all, were caused by positively charged frameshift proteins (see above), the use of ORF libraries can lead to a marked reduction, and hence improvement, in FPR in RNA-Y3H screens. Actual RNA-binding proteins can stay undetected in the Y3H due to mislocalization, misfolding, too low expression levels, or the lack of appropriate post-translational modifications in yeast. Although our approach does not solve these fundamental constraints, it should reduce the FNR relative to comparable Y3H cDNA screens, mainly by avoiding prey undersampling. While reaching a similar coverage, ORF libraries offer the particular advantage of consisting of equimolar concentrations of each plasmid and entailing a much lower complexity than classical cDNA libraries. High-efficiency yeast mating, besides requiring less work and experience than library transformation, increases the efficiency of sampling and thus prey redundancy at least fivefold. Thus, with the same amount of library material, each protein is sampled at least five times more often. Aside from the use of more extensive ORF libraries comprising more ORFs or both the C- and N-terminally fused GAL4-AD variants, further decreasing the FNR in the Y3H will probably be less straightforward. Arrayed libraries can fully eliminate ORF undersampling, but this requires more elaborate screening infrastructure. The recent implementation of normalized, in-frame protein domain libraries in two-hybrid screens seems to provide a major improvement in this regard [60] and is easily transferable to a Y3H setting.
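The effect of sampling depth on prey undersampling, as discussed above, can be made concrete with a simple idealised model. The sketch below is an assumption of this note and not a calculation from the study: it treats the library as perfectly equimolar and each screened diploid as an independent random draw, so that the expected fraction of ORFs sampled at least once among N clones from a library of L ORFs is 1 - (1 - 1/L)^N. The function names and the baseline number of clones are hypothetical; only the library size of 12,212 ORFs is taken from the text.

```python
import math


def expected_coverage(n_clones, library_size):
    """Expected fraction of ORFs sampled at least once.

    Idealised model: equimolar library, independent random draws.
    """
    return 1.0 - (1.0 - 1.0 / library_size) ** n_clones


def clones_needed(target, library_size):
    """Smallest number of clones giving at least `target` expected coverage."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / library_size))


LIBRARY_SIZE = 12_212    # ORFs in the screened library (from the text)
BASELINE_CLONES = 12_212 # hypothetical: one clone screened per ORF on average

for fold in (1, 5):
    coverage = expected_coverage(fold * BASELINE_CLONES, LIBRARY_SIZE)
    print(f"{fold}x sampling depth: expected coverage {coverage:.3f}")

print("Clones for 95% expected coverage:", clones_needed(0.95, LIBRARY_SIZE))
```

In this model, sampling at onefold depth covers about 63% of the library, whereas a fivefold deeper sampling covers more than 99%. Real screens will fall short of these idealised values because mating efficiency, plasmid representation and prey toxicity all vary between clones.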
Besides the fact that isolated protein domains often show better interaction than the full protein, such domain libraries also allow the detection of membrane proteins by testing the RBD of a protein independently of its membrane domain. To improve the readout of our method, deep sequencing (as described in [61]) could be applied, allowing for a faster, more comprehensive and potentially more sensitive readout. Despite all recent advancements in RNA pulldown screens, such as RNA-protein crosslinking or proximity biotinylation [11, 62], and the development of other, more complex three-hybrid systems such as our RNA-KISS method [16], the Y3H remains a valuable, robust and easy-to-use tool in the field of interactomics, which can quickly provide high-quality datasets complementary to these novel methods. The work we present here is an attempt to improve on existing Y3H methodology and should further facilitate its use as a convenient screening tool for RBPs of a specific RNA of interest. Finally, the long list of novel DENV and ZIKV sfRNA-binding human proteins thus identified warrants further investigation to elucidate the role that specific RBPs play in flavivirus infection and pathogenesis. A blueprint for validating relevant sfRNA-binding host factors has been laid out using the DDX6-sfRNA interaction as paradigm, rationalizing pathways towards the discovery of new cellular targets, with the ultimate goal of therapeutic intervention and treatment of emerging flavivirus infections.
Enisa Smlatic, Daniëlle Copmans, Frédéric Tangy

[1] Non-coding RNA networks in cancer
[2] The long non-coding RNAs in neurodegenerative diseases: novel mechanisms of pathogenesis
[3] Comprehensive identification of RNA-binding domains in human cells
[4] A brave new world of RNA-binding proteins
[5] RNA regulatory processes in RNA virus biology
[6] Comparative analysis of viral RNA signatures on different RIG-I-like receptors
[7] Estimating protein function using protein-protein relationships
[8] Methods to study RNA-protein interactions
[9] Methods for Identification of Protein-RNA Interaction
[10] Variation and genetic control of protein abundance in humans
[11] RNA-protein interaction detection in living cells
[12] Profiling the dress codes of RNA-binding proteins
[13] A three-hybrid system to detect RNA-protein interactions in vivo
[14] Analyzing mRNA-protein complexes using a yeast three-hybrid system
[15] A bacterial three-hybrid assay detects Escherichia coli Hfq-sRNA interactions in vivo
[16] The development of RNA-KISS, a mammalian three-hybrid method to detect RNA-protein interactions in living mammalian cells
[17] Fifteen years of the yeast three-hybrid system: RNA-protein interactions under investigation
[18] RNA-protein interactions in the yeast three-hybrid system: affinity, sensitivity, and enhanced library screening
[19] The structural basis of pathogenic subgenomic flavivirus RNA (sfRNA) production
[20] A highly structured, nuclease-resistant, noncoding RNA produced by flaviviruses is required for pathogenicity
[21] A noncoding RNA produced by arthropod-borne flaviviruses inhibits the cellular exoribonuclease XRN1 and alters host mRNA stability
[22] Quantitative mass spectrometry of DENV-2 RNA-interacting proteins reveals that the DEAD-box RNA helicase DDX6 binds the DB1 and DB2 3ʹ UTR structures
[23] Dengue subgenomic RNA binds TRIM25 to inhibit interferon expression for epidemiological fitness
[24] A glance at subgenomic flavivirus RNAs and microRNAs in flavivirus infections
[25] Mechanism and structural diversity of exoribonuclease-resistant RNA structures in flaviviral RNAs
[26] Dengue Non-coding RNA: TRIMmed for Transmission
[27] Noncoding subgenomic flavivirus RNA is processed by the mosquito RNA interference machinery and determines West Nile virus transmission by Culex pipiens mosquitoes
[28] Subgenomic flavivirus RNA binds the mosquito DEAD/H-box helicase ME31B and determines Zika virus transmission by Aedes aegypti
[29] Biochemistry and molecular biology of flaviviruses
[30] Zika virus noncoding sfRNAs sequester multiple host-derived RNA-binding proteins and modulate mRNA decay and splicing during infection
[31] Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis
[32] Coordination of flower maturation by a regulatory circuit of three microRNAs
[33] Frozen competent yeast cells that can be transformed with high efficiency using the LiAc/SS carrier DNA/PEG method
[34] High efficiency transformation of intact yeast cells using single stranded nucleic acids as a carrier
[35] hORFeome v3.1: A resource of human open reading frames representing over 10,000 human genes
[36] Using the beta-glo assay system to determine beta-galactosidase activity in yeast
[37] Methods in enzymology: guide to yeast genetics and molecular biology
[38] Mapping of Chikungunya virus interactions with host proteins identified nsP2 as a highly connected viral component
[39] A field-proven yeast two-hybrid protocol used to identify coronavirus-host protein-protein interactions
[40] Regulation of eukaryotic protein synthesis: selective influenza viral mRNA translation is mediated by the cellular RNA-binding protein GRSF-1
[41] The 3′ untranslated region of human vimentin mRNA interacts with protein complexes containing eEF-1γ and HAX-1
[42] g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update)
[43] Dengue virus hijacks a noncanonical oxidoreductase function of a cellular oligosaccharyltransferase complex
[44] Identification of Zika virus and dengue virus dependency factors using functional genomics
[45] RNA interference screen for human genes associated with West Nile virus infection
[46] A tri-hybrid system for the analysis and detection of RNA-protein interactions
[47] The yeast three-hybrid system for screening RNA-binding proteins in plants
[48] Using the yeast three-hybrid system to identify proteins that interact with a phloem-mobile mRNA
[49] Enod40, a short open reading frame-containing mRNA, induces cytoplasmic localization of a nuclear RNA binding protein in Medicago truncatula
[50] Binding of the 5ʹ-untranslated region of coronavirus RNA to zinc finger CCHC-type and RNA-binding motif 1 enhances viral replication and transcription
[51] cDNA library preparation
[52] A novel method for SNP detection using a new duplex-specific nuclease from crab hepatopancreas
[53] Simple cDNA normalization using kamchatka crab duplex-specific nuclease
[54] rec-YnH enables simultaneous many-by-many detection of direct protein-protein and protein-RNA interactions
[55] A map of the interactome network of the metazoan C. elegans
[56] Identification of RNA-binding domains of RNA-binding proteins in cultured cells on a system-wide scale with RBDmap
[57] Maximizing binary interactome mapping with a minimal number of assays
[58] An experimentally derived confidence score for binary protein-protein interactions
[59] Where have all the interactions gone? Estimating the coverage of two-hybrid protein interaction maps
[60] A protein domain-based interactome network for C. elegans early embryogenesis
[61] Next-generation sequencing for binary protein-protein interactions
[62] PAR-CLIP: a method for transcriptome-wide identification of RNA binding protein interaction sites

We thank Katrien Geerts for her excellent technical support with virus yield assays. Yeast strain YBZ1 and the p3HR2 vector were generous gifts of Prof. Marvin Wickens, University of Wisconsin, Madison. Yeast strain Y01044 was a generous gift from the European Saccharomyces cerevisiae Archive for Functional Analysis (EUROSCARF), Frankfurt, Germany.
No potential conflicts of interest were disclosed. The functional enrichment analysis was performed using g:Profiler (version e100_eg47_p14_7733820), which can be accessed from https://biit.cs.ut.ee/gprofiler/gost