key: cord-0957464-8h3scpus authors: Pavlovich, Stephanie S.; Lovett, Sean P.; Koroleva, Galina; Guito, Jonathan C.; Arnold, Catherine E.; Nagle, Elyse R.; Kulcsar, Kirsten; Lee, Albert; Thibaud-Nissen, Françoise; Hume, Adam J.; Mühlberger, Elke; Uebelhoer, Luke S.; Towner, Jonathan S.; Rabadan, Raul; Sanchez-Lockhart, Mariano; Kepler, Thomas B.; Palacios, Gustavo title: The Egyptian Rousette Genome Reveals Unexpected Features of Bat Antiviral Immunity date: 2018-05-17 journal: Cell DOI: 10.1016/j.cell.2018.03.070 sha: 4800badd9a1b5e69dfcd7bdbdd6be7d34727edd7 doc_id: 957464 cord_uid: 8h3scpus Bats harbor many viruses asymptomatically, including several notorious for causing extreme virulence in humans. To identify differences between antiviral mechanisms in humans and bats, we sequenced, assembled, and analyzed the genome of Rousettus aegyptiacus, a natural reservoir of Marburg virus and the only known reservoir for any filovirus. We found an expanded and diversified KLRC/KLRD family of natural killer cell receptors, MHC class I genes, and type I interferons, which dramatically differ from their functional counterparts in other mammals. Such concerted evolution of key components of bat immunity is strongly suggestive of novel modes of antiviral defense. An evaluation of the theoretical function of these genes suggests that an inhibitory immune state may exist in bats. Based on our findings, we hypothesize that tolerance of viral infection, rather than enhanced potency of antiviral defenses, may be a key mechanism by which bats asymptomatically host viruses that are pathogenic in humans. The genome of the Egyptian fruit bat reveals how its immune defenses allow tolerance of pathogenic viruses. Bats, members of the large, diverse order Chiroptera, appear to harbor significantly more zoonotic viruses than other mammals and do so without overt pathology (Calisher et al., 2006; Olival et al., 2017) . Such asymptomatic infection is especially noteworthy in the case of human pathogens such as henipaviruses (Nipah and Hendra viruses), coronaviruses (severe acute respiratory syndrome [SARS] and Middle East respiratory syndrome [MERS] coronaviruses), and filoviruses (Marburg virus [MARV] ), which cause severe, and often lethal, systemic disease in humans and non-human primates (Calisher et al., 2006; Smith and Wang, 2013) . This stark difference between bats and primates has motivated efforts to deeply characterize the genes involved in the immune system of bats and understand the antiviral immune mechanisms used to control viral infection. Genomic analyses of immune genes in bats have produced conflicting and surprising observations. The most thoroughly studied bat genome is that of Pteropus alecto (Zhang et al., 2013) , a reservoir host of Hendra virus. Additional bat genomes have been studied to a more limited extent (Seim et al., 2013; Zhang et al., 2013; Parker et al., 2013) . The most notable findings from these studies involve two large classes of immune genes: natural killer (NK) cell receptors and type I interferons (IFNs). Multiple studies have reported the absence of canonical NK cell receptors in bat genomes (Shaw et al., 2012; Zhang et al., 2013; Lee et al., 2015) , although a few receptors were identified in the P. alecto transcriptome (Papenfuss et al., 2012) . A few studies suggest that significant differences exist in type I IFNs between bats and humans (Zhang et al., 2017; Zhou et al., 2016; Kepler et al., 2010; De La Cruz-Rivera et al., 2018) , although the precise nature of the differences remains unclear. For example, the type I IFN locus has contracted in Pteropus alecto (Zhou et al., 2016) , but expanded in Pteropus vampyrus and Myotis lucifugus (Kepler et al., 2010) . While these genome projects provide important insights into the unique biology of bats, the bat genomes currently available were generated with low-coverage sequencing or with only short-read next-generation sequencing (NGS) technologies (Zhang et al., 2013; Seim et al., 2013) . These sequencing strategies impact the overall contiguity of genome assemblies and limit the ability to resolve repetitive genome loci where important immune gene loci reside. To overcome this limitation, we used a hybrid strategy combining both short-and long-read NGS technologies to generate a high-quality annotated genome for the Egyptian rousette bat (Rousettus aegyptiacus), an asymptomatic host of MARV (Towner et al., 2009 ). Here, we use this genome, the most contiguous bat genome available, to understand the evolution of immune genes and gene families in bats, and describe several observations relevant to defense against viruses. We report an unusual expansion of the KLRC (NKG2) and KLRD (CD94) gene families in R. aegyptiacus relative to other species and show genomic evidence of unique features and expression of these receptors that may result in a net inhibitory balance within bat NK cells. The expansion of NK cell receptors is matched by an expansion of potential MHC class I ligands, which are distributed both within and, surprisingly, outside the canonical MHC loci. We also observe that the type I IFN locus is considerably expanded and diversified in R. aegyptiacus. The IFN-u subfamily contributes most to this expansion, and members of this subfamily are induced after viral infection and show antiviral activity. All these features strengthen the notion of the unique biology of bats and suggest the existence of a distinct immunomodulatory mechanism used to control viral infection. We used a hybrid paired-end short-read plus long-read strategy to generate 720 Gb of short-read data and 34.9 Gb of long-read data, for an approximate total coverage of 1693. We generated a draft genome (Raegyp2.0) comprising 1.91 Gb of sequence represented in 2,490 scaffolds (N50 = 2.007 Mb; NG50 = 1.811 Mb), which is $90% of the estimated genome size. Of 248 core eukaryotic genes, 88.31% were complete in the genome, while 92.34% were at least partially represented, suggesting that gene information is covered well. The genome size and N50 are comparable to the assembled genomes of other bats. However, Raegyp2.0 has a larger contig N50 and a lower total gap length than any other available bat genome, making it the most contiguous bat genome to date (Table 1 ; Figures S1C and S1D). Whole-genome annotation was performed via the NCBI eukaryotic annotation pipeline using all R. aegyptiacusspecific transcript and RNA sequencing (RNA-seq) data in the NCBI databases, as well as proteins of other well-characterized species (Thibaud-Nissen et al., 2013) . Similar to the number estimated by transcriptomic analysis (Lee et al., 2015; Hö lzer et al., 2016) and the numbers in other bats (Table S2) , 19,668 of the annotated genes in Raegyp2.0 are protein-coding genes with high support from transcriptomic or protein data. To gain insight into the evolutionary relationship of R. aegyptiacus to other bats and to incidental hosts of MARV, we inferred homologous protein groups among R. aegyptiacus and 14 other mammals and constructed a maximum-likelihood phylogenetic tree of all 15 species using single-copy orthologous genes ( Figure 1 ). As previously established, the closest taxon to bats among the taxa included in the analysis is the horse (Equus caballus) (Seim et al., 2013; Zhang et al., 2013) . We performed gene family expansion and contraction analysis on inferred gene families and observed many more expanded families in microbats than megabats, similar to what has previously been shown (Zhang et al., 2013; Tsagkogeorga et al., 2017) (Figure 1 ). 55 gene families are significantly expanded in R. aegyptiacus compared to the megabat ancestor (Figure 1 ; Tables S3 and S4). The contiguity statistics (scaffold and contig N50, total length, and total gap length) are reported for each bat genome available in the NCBI GenBank database (accessed on 8/8/17), along with the GenBank Assembly Name and the given coverage. Kb, kilobases; Mb, megabases; Gb, gigabases. See also Figure S1 , Tables S1 and S2, and STAR Methods. To investigate whether evolution at the gene level could contribute to the unique phenotype of R. aegyptiacus compared to other mammals, we studied disease-relevant immune genes for selection pressures ( Figure S2 ; Table S5 ). Multiple innate immune response genes experience positive selection pressures along the R. aegyptiacus branch, including ISG15, an interferon-stimulated gene, IFNAR1, a subunit of the type I IFN receptor, and SIKE1, a negative regulator of the interferon response (Table 2) . Several more innate immune genes experience relaxed purifying selection, including JAK2 and STAT3, components of the JAK/STAT signaling cascade, DDX58 (RIG-I), a sensor of viral double-stranded RNA, and TLR8, a pathogen-sensing molecule (Table 2) . These results are consistent with those from similar analyses in P. alecto and M. davidii (Zhang et al., 2013) . In mammals, NK cell receptors are encoded in two distinct gene complexes: the natural killer complex (NKC) contains killer lectinlike receptors (KLRs) such as CD94 (KLRD), NKG2 (KLRC), and Ly49, while the leukocyte receptor complex (LRC) contains immunoglobulin superfamily proteins such as the ILT/LIR family and the killer cell immunoglobulin-like receptors (KIRs). Both complexes vary in gene content among species, and encode both inhibitory and activating receptors. We did not find functional KIRs in the LRC, but consistent with what was found in P. alecto (Papenfuss et al., 2012) , we identified 14 NKG2A/ B-like genes (referred to as NKG2-1 through NKG2-14, including four pseudogenes), one NKG2D-like gene, one NKG2C-like gene, and 5 CD94-like genes (Table S3) . Remarkably, six of the ten putatively functional NKG2A/B-like genes simultaneously encode activating and inhibitory interaction motifs. Of the remaining genes, three encode only inhibitory motifs, and only one gene encodes an activating motif alone (Figures 2A and 2B) . No other potential NK cell receptor genes were found. While inhibitory NKG2 receptors signal via ITIMs in their cytoplasmic domains, activating CD94/NKG2 receptors transmit signals via a positively charged residue in their transmembrane domains. This residue recruits adaptor molecules that contain activating motifs, with a lysine residue associated with DAP12 recruitment, and an arginine residue associated with DAP10 recruitment. R. aegyptiacus NKG2 proteins with putative activating function show a strong preference for arginine at this location, suggesting that these receptors favor an association with DAP10 ( Figure 2B ). Multiple diverse NKG2 and CD94 genes presumably allow substantial combinatorial diversity among heterodimeric NKG2/CD94 receptors (Figures 2A, 3 , and S3C). Four of the five CD94 genes in R. aegyptiacus lack two cysteines at position 58 and 59 that are highly conserved in multiple mammal species. These cysteines participate in the disulfide-mediated heterodimeric interaction between CD94 and NKG2 proteins in humans (Kaiser et al., 2008; Petrie et al., 2008) , suggesting that these CD94 molecules might interact with their NKG2 co-receptors in A maximum likelihood tree based on 2,400 orthologous proteins was generated and used to infer expansion and contraction of 7,698 gene families. The number of expanded and contracted gene families is in blue and red, respectively. Numbers in black are the bootstrap evidence for partitions based on 1,000 bootstrap replicates. Images used under a creative commons license. MRCA, most recent common ancestor. See also Tables S3 and S4, Data S1, and STAR Methods. an alternate way. Mouse and rat CD94 are capable of direct association with DAP12 or DAP10 via a lysine residue in their transmembrane domains (Koch et al., 2013) , but the CD94 sequences from all bats we examined have no lysine residues in their transmembrane regions, so are unlikely to associate directly with DAP proteins ( Figure S4C ). To determine whether CD94 and NKG2 genes are expressed, we queried our published transcriptomic data (Lee et al., 2015) and found that the majority of these genes are expressed at low levels in peripheral blood mononuclear cells and secondary lymphoid organs ( Figures 2C and 3B ), similar to human baseline expression. However, two receptors with inhibitory signaling motifs-NKG2-13 and NKG2-14-along with CD94-1, are expressed at higher levels in the same tissues, suggesting that inhibitory signaling dominates in uninfected bats. To determine whether NKG2/CD94 receptors are diversified across bats, we examined the NKC in additional bats, and found that all bats we studied have multiple NKG2-like genes, although some were not originally classified as C-type lectins because of missing exons; P. vampyrus and P. alecto also have multiple CD94-like genes (Figures 2, 3, and S3) . Each bat has one CD94 gene with canonical cysteine residues at position 58 and 59, (Figures 3A and S4A) and like R. aegyptiacus, multiple bats have CD94 genes without these residues, suggesting that they may have alternative functions. Phylogenetic analysis shows that NKG2 genes have undergone considerable diversification before and after the speciation of megabats (Figures 2D and S5) . As with R. aegyptiacus, the putative functional and truncated NKG2 genes in both megabats (P. alecto and P. vampyrus) and microbats (M. lucifugus and M. davidii) contain ITIMs or both ITIMs and a positively charged transmembrane residue, suggesting that both inhibitory and activating signaling in these receptors is common among all bats in our analysis ( Figure 2B ). The MHC Class I Genes Are Located Both within and outside of the Canonical MHC Locus While KIRs interact with classical MHC class I molecules (cMHCs), NKG2A/B/C and F receptors interact with HLA-E, a non-classical MHC class I molecule (ncMHC) that displays nonamers derived from the signal peptides of cMHCs. This mechanism is thought to allow NK cells to monitor the expression of cMHCs and deliver cytotoxic hits to cells lacking such expression (Yokoyama and Plougastel, 2003). We hypothesized that Evolution rates were estimated under several models, and the best fitting model was chosen. u 0 refers to the background rate of evolution and always includes non-bat species, but also includes some bat species under certain models. u1 refers to the rate of evolution of the R. aegyptiacus gene but also includes other bat species under certain models. FDR, false discovery rate. See Figure S2 and STAR Methods for full description of the models. See also the expansion of the NKG2 receptor family would be matched by an expansion of cMHCs or ncMHCs. In humans, MHC class I genes are located in three areas referred to as the a, k, and b duplication blocks, which are separated by framework regions containing non-HLA genes (Kulski et al., 2002) . HLA-E and its functional equivalent in mice, H2-Qa1, are located in the k block. Raegyp2.0 appears to lack the a and k blocks ( Figure 4A ), as do P. alecto and E. fuscus (Ng et al., 2016) . Consistent with an expansion, we find twelve MHC class Ilike genes and seven pseudogenes in Raegyp2.0 ( Figures 4A and 4B ), none of which is discernible as the functional equivalent of HLA-E. Nonamers inferred bioinformatically from R. aegyptiacus MHC class I signal peptides are much less diverse than those observed in human or mouse, a pattern observed also in the gray mouse lemur ( Figure 4E ). Only two MHC class I genes and two pseudogenes are located in the b block ( Figure 4A ). Two MHC class I genes in Raegyp2.0 were found in genomic contexts apparently outside of the MHC class I region ( Figure 4B ). Eight genes and five pseudogenes were identified on scaffolds that could not be further localized in the genome. These could potentially be allelic variants or additional genomic contexts for MHC-I genes. Examination of additional bat genomes also shows MHC class I genes outside the canonical MHC class I region, suggesting that dispersion of class I genes is not an assembly artifact but is common to many bats (data not shown). Many of the MHC class I genes in Raegyp2.0 are expressed across a wide range of tissues, including genes located outside the canonical MHC locus ( Figure 4C ) like MHC-11 and -12, suggesting that they may function in the canonical self-detection role of cMHCs. Unlike NKG2A/CD94 receptors, NKG2D forms homodimers that bind a number of MHC class Ib ligands, including MICA and MICB, and members of the ULBP family, which are upregulated in cells during infection and stress (Molfetta et al., 2016) . MICA and MICB were also not found in the MHC-I locus, although a candidate MICB ortholog was found outside of the MHC loci ( Figure 4B ). R. aegyptiacus and other bats appear to have two additional groups of NKG2D ligands, which are closest to ULBPs in humans ( Figure S6 ). Further investigation is needed to determine whether these genes functionally resemble ULBP or MIC family members. We estimate 20 type I IFN genes in the megachiropteran ancestor and find 46 putative functional genes in Raegyp2.0, including 12 IFN-a genes, one of each of the IFN-b, -ε, and -k genes, nine IFN-d genes, and 22 IFN-u genes ( Figure 5 ; Table S3 ). The greatest expansion occurred in the IFN-u subfamily, which has only one copy in humans. Given previous reports of constitutive IFN expression in bats, we sought to determine whether the expanded IFN genes may be constitutively expressed and found limited baseline expression of these genes (Table S6) . To determine whether these IFN-us may be induced by viral infection, we infected immortalized R. aegyptiacus cells (RoNi) with the Cantell strain of Sendai virus, a known inducer of IFNs in other species. We observed Figure S7C ), although at relatively low levels compared to IFN-b transcripts. To examine whether IFN-u proteins retain the canonical antiviral function of type I IFNs, we synthesized recombinant Egyptian rousette IFN-b and IFN-u4, and an unrelated protein of similar size, and used these proteins in an antiviral assay in RoNi cells. Consistent with a bona-fide antiviral function, treatment of RoNi cells with IFN-u4 blocks infection with vesicular stomatitis virus encoding eGFP (VSV-eGFP) ( Figure S7 ). Few viruses are known to cause acute disease in bats, including those that cause profound, often lethal, disease in humans. The precise mechanism(s) of this viral resistance is not known. One hypothesis that has been gaining acceptance is that bats present especially potent innate antiviral defenses compared to primates, controlling viral replication early in infection, and as a result, developing effective adaptive immune responses (Baker et al., 2013) . Our results suggest a different hypothesis that is consistent with the idea that bat antiviral mechanisms are different in essential ways from those of other mammals, but is additionally associated with enhanced infection tolerance rather than enhanced defense. This hypothesis is supported by infection studies of MARV transmission among R. aegyptiacus bats, in which bats that are ''naturally'' infected by other experimentally inoculated bats appear to have a protracted incubation period (Amman et al., 2015; Schuh et al., 2017) . Further, these studies showed that infected bats can remain viremic and shed infectious virus for extended periods of time (up to 3 weeks after infection) before eventually clearing the virus (Data S1). Despite prolonged infection, limited inflammation is observed in even the most highly infected tissues, similar to what was previously observed in infected wild-caught bats (Jones et al., 2015; Towner et al., 2009) . Bats may tolerate viral infections to a greater extent by minimizing the proinflammatory effectors that promote damage to the host in many viral infections. Type I IFNs are induced very early in viral infection and act by inducing effectors encoded by interferon stimulated genes (ISGs). Different IFN subtypes specifically interact with the common IFN receptor inducing a distinct spectra of ISGs with different antiviral potencies (Hoffmann et al., 2015) . The magnitude and nature of the IFN response determine whether the resulting effects on the host are harmful or beneficial (Malireddi and Kanneganti, 2013) . For example, dysregulation of the type I IFN response has been implicated in the pathogenesis of multiple emerging viruses, including MARV (Liu et al., 2017; Connor et al., 2015) . Although IFN gene families differ substantially across mammals (Secombes and Zou, 2017), the extensive expansion of IFN-u genes in R. aegyptiacus to almost two dozen genes is striking. This expansion is dramatically different from what was observed in P. alecto (Zhou et al., 2016) , but is consistent with the expansion in P. vampyrus (Kepler et al., 2010) . In our hands, at least one of the IFN-u members, IFN-u4, has a marked antiviral effect against VSV-eGFP infection in RoNi cells ( Figure S7 ). However, the effect observed in this in vitro system, was less potent than that of IFN-b. Thus, IFN-u4 most likely induces a unique ISG pattern and may be more effective against different viruses, or may induce lower levels of effectors in a more regulated response. Consistent with this latter scenario, Banerjee et al. (2017) have shown that poly I:C treatment induces type I IFNs in both human and Eptesicus fuscus bat cells, but bat cells express much lower levels of inflammatory mediators. A wider variety of signaling mediators may provide R. aegyptiacus greater flexibility to develop a more nuanced antiviral response. It is also possible that different IFN-us may complement and/or synergize with each other. We observed IFN-u gene induction in RoNi cells after Sendai virus infection, albeit at lower levels than IFNb or IFN-a genes ( Figure S7 ). Given that Sendai virus is known to be a potent inducer of IFN-a and -b in particular, it is possible that different viruses or other stimuli may preferentially induce IFN-u genes. Unlike P. alecto, R. aegyptiacus shows no evidence of constitutive IFN expression (Table S6 ). Both bats are reservoirs for different emerging viruses of high virulence to humans, so it will be of great interest to determine whether these apparent differences in IFN expression represent distinct mechanisms of viral control by both bats species. NK cells are an important component of innate antiviral responses, and have been associated with survival of Ebola virus infection (Liu et al., 2017) . The unique organization, structure, and increased signaling complexity of the NK cell receptors in multiple bats point to adaptations that are also consistent with the infection tolerance hypothesis. Our findings suggest that NKG2/CD94 receptors, which are more associated with an inhibitory response in other species, serve as the primary NK cell receptors in bats. Consistent with this, all but one of the NKG2A/B-like genes in R. aegyptiacus have inhibitory motifs at the cytosolic tail. The expansion of CD94 genes makes the the presence of additional genes on the same scaffold. Black, MHC class I genes; gray, non-MHC genes; dark blue, MICB; unfilled boxes, MHC class I pseudogenes. The a, k, and b class I duplication blocks are shown in red, purple, and green, respectively. Figure S6 , Table S6 , and STAR Methods. possible combinatorial diversity of heterodimeric receptors very large, as previously demonstrated for prosimians (Averdam et al., 2009) . Because signal transmission occurs via a conserved set of residues, the additional receptor diversity is likely to be associated with the capacity to bind additional ligands or differentially interact with the same ligands. Greater diversity in ligand binding would provide better recognition of MHC class I alleles to recognize self for NK cell tuning and licensing, and also to distinguish a variety of pathogen mimics of MHC class I molecules (Parham and Moffett, 2013). The KLRD variants missing conserved cysteines add an additional level of complexity to this interaction that needs further characterization once tools for isolating NK cells in bats are available. In other mammals, KLRC genes are also expressed on T cells, and NKG2A has been shown to control the level of T cell activation during viral infection, preventing excessive activation and immunopathology in mice (Rapaport et al., 2015) . The high baseline expression of inhibitory KLRC genes may indicate an immune-inhibitory state associated with both NK cells and T cells in R. aegyptiacus bats in the absence of infection, although this remains to be confirmed with functional studies. A further striking feature of the NK cell receptor genes in R. aegyptiacus and the other bats we studied is the signaling mode they are predicted to use. In genes with potential for activating signaling, an arginine residue is preferred ( Figure 2B ), suggesting DAP10, rather than DAP12, recruitment (Koch et al., 2013) . DAP12 has been shown to be more efficient in activating cytokine production than DAP10 (Lanier 2009), therefore a preference for DAP10 could mean that activating NK receptors in bats have adapted to be less potent inducers of cytokines, and thus less inflammatory. NK cell receptors that possess both inhibitory and activating domains are unusual across mammals. The only known genes with both functions are the single gene KIR2DL4 in humans and the NKG2 genes in lemurs (Parham and Moffett, 2013). KIR2DL4 activation promotes robust cytokine secretion but not cytotoxicity (Kikuchi-Maki et al., 2005) . It is possible that the R. aegyptiacus KLRC genes mimic the signaling of human KIR2DL4, although parallels between both receptors are difficult to make without functional assays. However, our genomic evidence suggests that NK cell receptors in bats are uniquely regulated, especially given the multiple changes in the signaling potential of these receptors, which are highly conserved in eukaryotes ranging from Drosophila to humans. The potential extended capacity of the KLRC/KLRD system to bind diverse ligands is accompanied by expansion of MHC class I genes outside of the canonical MHC locus, although there is no obvious bat ortholog of HLA-E. A similar expansion and absence of a clear HLA-E ortholog has been observed in prosimians (Averdam et al., 2009) . If one of the MHC class I genes in Raegyp2.0 is functionally similar to HLA-E, the high similarity of predicted class I nonamers ( Figure 4E ) suggests that presenting a peptide from one MHC class I molecule would likely be interchangeable with presenting a peptide from another. Thus, expression of MHC class I molecules would have to be dramatically decreased in order for ''missing-self'' detection to occur. While functional orthologs cannot be inferred without functional assays, the large number of MHC genes encoded outside the canonical locus in the bats we studied ( Figure 4B ) may be suitable ligands for the expanded KLR genes. Distribution of the MHC genes across the genome might serve as a mechanism to generate redundancy as different KLRC receptors might interact with distinct MHC class I genes. This could potentially result in a higher activation threshold for NK cells. Taken together, these results show that multiple bats, including R. aegyptiacus, have expanded and diversified numerous antiviral loci, and potentially developed unique adaptations in NK cell receptor signaling, and type I IFN responses. Our findings are consistent with the hypothesis that certain key components of the immune system in bats have coevolved with viruses toward a state of respective tolerance and avirulence, although tolerance is likely not the only mechanism at play. For example, in addition to enhanced flexibility, the expansion of type I IFNs may also point to enhanced potency of antiviral defenses. Recent studies have shown that IFN stimulation in bats induces some ISGs that are not induced by IFN in humans, and that the response kinetics may vary as well (De La Cruz-Rivera et al., 2018) . Adaptations in potency are also indicated by observations of lower viral loads in R. aegyptiacus bats compared to humans (Amman et al., 2015; Jones et al., 2015) . Finally, even with mutual disarmament, the host must be alert to viruses that may spontaneously revert to an antagonistic phenotype. In either case, definitive tests of these hypotheses await the development of further experimental reagents for cytometry and biochemical intervention-reagents that are being developed now with information made available by the completed genome project. Detailed methods are provided in the online version of this paper and include the following: The authors declare no competing interests. Shaw, T.I., Srivastava, A., Chou, W.C., Liu, L., Hawkinson, A., Glenn, T.C., Adams, R., and Schountz, T. (2012) . Transcriptome sequencing and annotation for the Jamaican fruit bat (Artibeus jamaicensis). PLoS ONE 7, e48472. Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Gustavo Palacios (gustavo.f.palacios.ctr@mail.mil). Genomic DNA was isolated from the liver and spleen tissue of a wild-caught healthy older juvenile male Egyptian rousette bat captured at Python Cave in Uganda. Research was conducted under an IACUC approved protocol in compliance with the Animal Welfare Act, PHS Policy, and other Federal statutes and regulations relating to animals and experiments involving animals. The facility where this research was conducted is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care, International and adheres to principles stated in the Guide for the Care and Use of Laboratory Animals, National Research Council, 2011. Rousettus aegyptiacus immortalized fibroblasts (RoNi) were originally generated in the work described in Biesold et al. (2011) . Briefly, kidney tissue from a wild-caught sub-adult Egyptian rousette bat was cultured and immortalized via lentiviral transduction of the SV40 large T antigen. The sex of the bat was not recorded at the time of capture. Cell cultures were genotyped and tested for mycoplasma, SV5, filoviurses and lyssaviruses by RT-PCR. Cells were cultured at 37 C at 5% CO 2 in DMEM (Dulbecco's Modified Eagles Medium) with 4.5g L glucose supplemented with 10% fetal bovine serum, 1% penicillin/streptomycin 100x conentrate, 1% L-Glutamine 200mM, 1% Sodium Pyruvate 100mM, and 1% MEM nonessential amino acids 100x concentrate. We used the UltraClean Blood DNA Isolation kit (MO BIO Laboratories, Carlsbad, CA) with a customized protocol for tissue (available upon request) to isolate genomic DNA from liver and spleen tissue of a healthy older juvenile male Rousettus aegyptiacus bat captured at Python Cave in Uganda. For sequencing on the Illumina platform, 1 ug of genomic DNA was sheared to 400 bp using Covaris LE220 Focused-ultrasonicator (Covaris, Woburn, MA). End repair, A-tailing and ligation of adapters were performed on the Apollo 324 automated system, using Prep X Complete ILMN DNA Library Kit (WaferGen Biosystems, Fremont, CA). KAPA Library Amplification Kit with 10 cycles of PCR was used for library enrichment. Libraries were quantified by qPCR using KAPA Library Quantification Kit (KAPA Biosystems, Wilmington, MA). Each library was loaded on 8 lanes of the high output flow cell. Cluster amplification was performed on the cBot with the TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA). Clustered flow cell was sequenced on the HiSeq 2500 instrument with the TruSeq SBS Kit -HS (Illumina). 720 Gb of 2x101bp paired-end data were produced (approximate coverage of 145x). For sequencing on the PacBio platform, genomic DNA was sheared to 20kb average size using g-TUBE (Covaris). After DNA damage repair and ends repair, hairpin adapters were ligated to form a SMRTbell template. ExoIII and ExoVII treatment was used to remove failed ligation products. Size selection was performed on Blue Pippin system (Sage Sciences, Beverly, MA) using 0.75% dye-free agarose gel cassette, marker S1 and Hi-Pass protocol; low cut was set on 4000 bp. Final library assessment was obtained by Qubit dsDNA BR assay and Agilent 2100 Bioanalyzer DNA 12000 chip. To obtain longer reads, libraries were sequenced with P5-C3 chemistry. Annealing of sequencing primer and binding polymerase P5 to the SMRTbell template was performed according to PacBio calculator. The polymerase-template complexes were bound to MagBeads, loaded onto SMRTcells (SMRT Cell 8 pack V3) at final concentration 180 pM, and sequenced with 180 min movies on PacBio RS II instrument (Pacific Biosciences, Menlo Park, CA). Genome size was estimated by k-mer analysis. K-mers from 4 lanes of the Illumina dataset were counted using Jellyfish (Març ais and Kingsford, 2011). Excluding rare 25-mers that occur at low depth, the most frequently occurring depth was 57x ( Figure S1 ). The presence of multiple peaks suggests that this genome has a high degree of heterozygosity. Peak k-mer frequency (M), real sequencing depth (N), read length (L), k-mer length (K), total bases (T), and genome size (G) are related by the following formulas (Li et al., 2010) : Using this method, we estimate a complete genome size of 2.08 Gb, which is very close to previous estimates of 2.11 Gb based on nuclear densitometry (Kwiecinski and Griffiths, 1999) ( Figure S1A ). Illumina reads were assembled separately from the PacBio reads with SparseAssembler, a short read assembler that exploits high coverage to construct a modified de Bruijn graph (DBG) (Ye et al., 2012) . Because raw Pacific Biosciences (PacBio) long sequencing reads can be error-prone, we corrected potential sequencing errors in these reads with $34x of high-accuracy Illumina reads using LoRDEC v0.5 (long read de Bruijn graph error correction) (Salmela and Rivals, 2014) . Contiguity of the short-read assembly was improved by incorporating long-read data using DBG2OLC (Ye et al., 2016) , which anchors the short-read contigs generated with SparseAssembler to the long PacBio reads. Multiple approaches were used to scaffold the assembly. First, we used PacBio reads with the long-read scaffolding program LINKS v1.5.1, which uses a paired k-mer approach to scaffold assemblies with long reads even if they have already been used in the assembly (Warren et al., 2015) . LINKS was used iteratively until no improvement was observed in contiguity, which was after seven iterations. Second, we made use of transcriptome data recently published (Lee et al., 2015) to scaffold our assembly with L_RNA_Scaffolder (downloaded 9/28/2015) (Xue et al., 2013) . Last, we reincorporated the paired-end information from our Illumina data for scaffolding with SSPACE v3.0 (Boetzer et al., 2011) . As a post-processing step, we aligned the Illumina reads to the assembly using Bowtie2 (Langmead and Salzberg, 2012) and ran the variant calling program Pilon (Walker et al., 2014) on the resulting alignment to correct potential mis-assemblies, consensus calling errors, or homopolymer indel errors. We also tested the use of the Quiver algorithm from Pacific Biosciences as a post-processing step prior to pilon, but found that it reintroduced insertion and deletion errors that led to poor gene models. The genome-wide average GC content is about 40%, and intra-scaffold GC content ranges from 31.1% to 71.4%. Raegyp2.0 has a higher degree of heterozygosity (0.53%, Figure S1B ) than has been reported for other bat genomes (Zhang et al., 2013) . To check for mis-assemblies, we used the gene coverage program CEGMA v2.5 with mammalian settings to identify the presence and coverage of highly conserved eukaryotic genes in the genome (Parra et al., 2007) . We also remapped $54x of short paired reads onto the assembly with bwa-mem v0.7.10 and assessed the number of reads that map appropriately (in the correct orientation and with the expected insert size) with SAMtools flagstat (SAMtools v0.1.18) (Li 2013; Li et al., 2009) . 98.41% of reads align, and 91.19% of reads map appropriately, indicating no major misassemblies. To assess the quality of the Raegyp2.0 assembly, we used a statistical analysis program, QUAST v3.0 (Gurevich et al., 2013) , to look at baseline statistics, including the total assembly length, the percentage of estimated genome size covered, and contiguity statistics. The Nx and NGx plots (generated using QUAST) in Figure S1 show the shortest sequence for which the total length of all sequences of its length or longer make up ''x'' percentage of the total assembly size (Nx) or the total estimated reference genome size (NGx). For example, In a length-sorted list of sequences, the N50 refers to the length of the smallest sequence for which the sum of the length of all sequences of the same or longer length constitute 50% of the total assembly length. The scaffold N50 refers to the N50 of all the scaffolds in the genome, while the contig N50 refers to the N50 of all the contigs in the genome. To compare genome contiguity statistics across all available bat genome projects (Table 1) , we accessed the appropriate GenBank Genome page for each species and reported contiguity statistics as listed (accessed on 8/8/17). The assembly was submitted to GenBank (accession GCA_001466805.2) and annotated using the NCBI eukaryotic genome annotation pipeline (Thibaud-Nissen et al., 2013) . A total of 26.25% of the genomic sequence was identified as repetitive by the de novo repeat finder WindowMasker (Morgulis et al., 2006) and was masked for the purpose of aligning evidence and predicting genes. Transcripts and proteins available in GenBank and known RefSeq transcripts and proteins for bats and human were aligned to the genome with Splign (Kapustin et al., 2008) and ProSplign, along with model RefSeq proteins previously annotated on the Pteropus alecto and Myotis brandtii genomes. In addition, over 2 billion RNA-Seq reads derived from 12 different Rousettus aegyptiacus tissues and available in SRA were aligned. Model precursors were created by Gnomon (National Center for Biotechnology Information n.d.), a gene calling algorithm trained on Pteropus alecto, by collapsing overlapping alignments with compatible splice patterns. In a second step also run by Gnomon, model precursors with high coding propensity were extended or joined if missing start or stop codon using an HMM model. The resulting models were evaluated and filtered based on multiple criteria, including supporting evidence, conflicts with models on the other genomic strand, Blast hits to UniProtKB/SwissProt if over 50% ab initio sequence, and number of exons (single-exon non-coding RNA were eliminated). Models in the final set were assigned a function by orthology calculation to human, or if no ortholog could be calculated, by homology to proteins in UniProtKB/SwissProt. Finally, models were assigned GeneIDs and RefSeq model transcript and protein accession (with XM_, XP_, XR_ prefixes) and loaded to the Nucleotide, Protein and Gene databases as part of NCBI Rousettus aegyptiacus Annotation Release 100. The genome contains 36.4% repetitive content, similar to that found in other bats (Table S1 ). 19,668 of the annotated genes were protein-coding genes and 2,380 were non-coding genes. 2,198 coding sequences were corrected for premature stop codons, small internal gaps, or frameshifts based on aligning evidence (NCBI Eukaryotic Annotation Group, 2016) . When present, the corrected models were used for downstream analysis. 5,958 long non-coding RNAs were predicted with full support from transcript data and 340 tRNA models were predicted with tRNAscan-SE (Lowe and Eddy, 1997). 96.03% of predicted mRNAs were fully supported by transcriptomic or protein data. A total of 16,254 of predicted protein-coding genes (83%) had at least one protein aligning to a UniProtKB/ SwissProt protein for over 95% of its length, which was higher than for any other bat annotated by the NCBI pipeline at the time. In addition, the number of the UniProtKB/SwissProt hits covered over 95% of their length by a predicted protein was similarly high (82%), indicating that the predicted proteins on Raegyp2.0 represent full-length models. Of the 19,668 protein-coding genes annotated in the genome, 317 genes have no associated gene labels or homolog in open reading frames from genomes of other species in RefSeq, and may represent novel R. aegyptiacus-specific genes. However, the median length of the protein product of these genes (257 amino acids) is significantly lower than the median length of all annotated proteins (498 amino acids). Thus, it remains a possibility that these genes have been previously identified in other species but are partial or poor models in Raegyp2.0 because of misannotation or local misassembly. We inferred homologous protein groups among 15 species of mammals using a similarity-based approach within the OrthoMCL pipeline (Fischer et al., 2011) . The following species were included in the analysis: Rousettus aegyptiacus (Egyptian rousette bat), Pteropus vampyrus (large flying fox), Pteropus alecto (black flying fox), Myotis davidii, Myotis lucifugus (little brown bat), Homo sapiens, Macaca fascicularis (crab-eating macaque), Macaca mulatta (rhesus macaque), Mus musculus (mouse), Cavia porcellus (guinea pig), Cricetulus griseus (Chinese hamster), Sus scrofa (pig), Bos taurus (cow), Equus caballus (horse), Canis familiaris (dog). Protein data were downloaded from RefSeq and filtered to include only the longest protein product of a gene for use in the pipeline. Briefly, OrthoMCL gathers all-against-all blastp hits into reciprocal best hits (between species) and reciprocal better hits (within species). These hits are converted into a graph network describing likely orthologous or paralogous relationships among proteins, and MCL clustering is performed to group proteins into families. Using OrthoMCL v2.0.9, we obtained 19,310 groups of proteins and 7,041 single-copy orthologous groups. We annotated all OrthoMCL groups with functional family designations from the PANTHER database (Mi et al., 2016) . We first labeled each group with a representative RefSeq protein accession from that group, using a protein from a well-curated genome (human, mouse, macaque) whenever possible. We mapped all RefSeq accessions directly to PANTHER family IDs or to UniProtKB accessions for subsequent mapping to PANTHER IDs using bioDB, PantherDB, and/or the UniProtKB accession mapping feature (Mi et al., 2016; Mudunuri et al., 2009; UniProt Consortium, 2015) . All OrthoMCL groups with the same PANTHER family ID were collapsed into new gene families, which produced a total of 9,555 gene families (2,400 single-copy orthologous gene families). Some families remain unlabeled after this process because proteins from relatively new genomes may be missing in PANTHER. Of note, 3,450 OrthoMCL groups did not map to PANTHER families and were designated unlabeled gene families for downstream analysis. Using this process, upon examining the type I IFN gene family, we observed that many genes were not accounted for despite being present in the NCBI annotation. We discovered that these genes were labeled as singletons or in families with only R. aegyptiacus proteins, and therefore would not be classified as a PANTHER family since no R. aegyptiacus proteins were in the PANTHER database at the time of analysis. We manually corrected the numbers of genes in this family based on NCBI annotations and yielding 9,550 families, 3,445 of which were unlabeled. Repetitive content was identified based on homology to the RepBase database using RepeatMasker (Smit et al. n.d.) . ''One code to find them all'' (Bailly-Bechet et al., 2014) was used to resolve nested repeats and provide family-level quantitative information. All short reads were mapped to the genome using Bowtie2 (Langmead and Salzberg, 2012) . Duplicate reads were marked and removed with PicardTools v1.131 (Broad Institute n.d.). Variants were called using Samtools v1. 3 (Li et al., 2009 ). BCFTools v1.3.1 was used to filter variant calls in regions with parameters -g3 -G10 -e '%QUAL < 20 jj (RPB < 0.1 && %QUAL < 30 jj (DP < 30) jj (DP > 250) jj (MQ < 20)'. RoNi cells in 10% FBS media were seeded at 3x10 4 in 96-well plates, and treated with serial dilutions of recombinant 6xHis-tagged IFN-b1, recombinant 6xHis-tagged IFN-u4, or an unrelated recombinant 6xHis-tagged protein (rPA-D1) for 4 hr at 37 C. The interferon-containing media was removed, and cells were infected with vesicular stomatitis virus encoding eGFP (VSV-eGFP; kindly provided by John Connor, Boston University School of Medicine; Whitlow et al., 2006 Whitlow et al., , 2008 at an MOI of 0.05 in 2% FBS RoNi media, and imaged for GFP expression 1 day post infection (multiple viral replication cycles). Among many others, we included genes involved in dendritic cell maturation, induction, and signaling of type I IFNs, and cytokine responses, since dysfunctions of these pathways have all been reported as contributors to the pathogenesis of filoviruses in primates (Connor et al., 2015; Liu et al., 2017) . We also included genes in DNA damage response pathways, since adaptations in DNA damage responses during the selection for flight in bats have been hypothesized to influence immune responses in bats (for a full list of genes studied, see Table S5 ). All genes of interest (immune genes, echolocation genes, flight genes) were downloaded from RefSeq in GenBank format and coding sequences for each gene were extracted using a python script. For each member of an ortholog group, the longest isoform was selected. Amino acid sequences of ortholog groups were aligned with Muscle v3.8.31 (Edgar, 2004) . The resulting alignments were then trimmed with trimAl v1.4 (Capella-Gutié rrez et al., 2009) to retain only high-quality aligned regions and alignment columns containing gaps. We used codeml within the PAML v4.9b suite of phylogenetic analysis tools to estimate u (omega), the nonsynonymous/synonymous nucleotide substitution rate ratio, and k (kappa), the transition/transversion rate ratio with the F3 3 4 codon frequency matrix (Bielawski and Yang, 2005; Yang, 1998) . Seven separate models were used ( Figure S2 ) and maximum likelihood scores estimated for each. The models shown in Figure S2 describe the hierarchy of models tested for each gene included in the analysis. Likelihood ratio tests were performed, progressively allowing more degrees of freedom. Model 0 assumes one rate of evolution in all branches. The fit of three models (1a, 1b, and 1c) are separately compared to Model 0 via a likelihood ratio test, and the best fitting model (highest likelihood of the three) is then compared to the next model in the hierarchy (e.g., Model 1a is compared to 2a, Model 1c is compared to both 2a and 2b), and the best fitting model is then compared to Model 3. If the next model in the hierarchy does not fit the data better than the previous model, the previous model is taken to be the most likely. Two-ratio unconstrained models were first tested against the one-ratio null model to test for differential selective pressure. Any gene for which the null model was rejected proceeded to testing of the most likely two-ratio model against all three-ratio models in which it was nested using the same procedure, and, finally, any gene for which the two-ratio model was rejected against any three-ratio model proceeded to testing the most likely three-ratio model against the four-ratio model. FDR was controlled at each level using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) . Models showing u 1 > 1 along any branch were then compared against two-ratio constrained models with u 1 fixed at 1 to test for neutral evolution. Branches showing significantly higher than average, but not greater than 1, u 1 may have positively selected sites. However, relaxation of purifying selection along these branches cannot be ruled out using these models. Analyses were run with multiple starting values of both k (2,2.5,4) and u (0.1,1,2) to provide a check against local optimas. These analyses were automated using LMAP v1.0.0 (Maldonado et al., 2016) . Significance of expanded and contracted gene families was determined as follows: CAFE v3.1 provides a family-wide p value for gene families that evolved differently than expected, and additionally provides a Viterbi p value that assigns a p value to the contribution of each species to the family-wide evolution across the tree (De Bie et al., 2006) . The Benjamini-Hochberg procedure was applied to family-wide p values for the gene families that evolved differently than expected. The p values were ranked from lowest to highest with identical p values assigned the same rank, and the rank for the next non-identical p value incremented by the total size of the identical group. The rank of each p value was divided by the total number of hypotheses (7,698, for each gene family) and multiplied by 0.05 to obtain the adjusted p value with a false-discovery rate of 0.05. For families with adjusted p values < 0.05, the Viterbi p values were used to determine whether the expansion or contraction in a specific lineage was significant. Families with expansions and contractions in Figure 1 have family-wide adjusted p values less than 0.05 and Viterbi p values less than 0.05. Significance of differential expression between Sendai virus-infected samples and mock-infected samples in Figure S6C Table 2 and STAR Methods After concatenation, the super-proteins were re-trimmed with trimAL (-automated1 parameter) and used to generate a maximum likelihood species tree with RAxML v8.2.9 under a JTT + G substitution model with empirical base frequencies (Stamatakis, 2014). 1,000 bootstrap the CAFE v3.1 (Computational Analysis of gene Family Evolution) algorithm (De Bie et al., 2006) to estimate a value for lambda (the birth and death rate for a given gene per million years) by maximum likelihood. Since CAFE assumes that all input gene families have at least one member in the most recent common ancestor of all species, gene families were filtered to exclude single lineage families. Only families with at least one gene in both the Laurasiatheria superorder (bats, ungulates, carnivores) and the Euarchontoglires superorder (primates, rodents) were included. We manually corrected type I IFN gene family sizes after observing that several genes that were annotated as type I IFNs by the NCBI did not appear in our inferred gene families. Further exploration revealed that these additional IFNs get classified as single gene families or as unlabeled gene families because they consist of only bat IFNs. The resulting families were further filtered to exclude eight families with large ranges in size (> 100) across the tree, leaving 7,698 families for expansion and contraction analysis. CAFE was used to obtain a maximum likelihood estimate of a global birth and death rate parameter l (lambda; rate of gain/loss per gene per million years) across the species tree (created by above methods) of 0.0169284. The size of each family at each ancestral node was estimated and used to obtain a family-wise p value to indicate non-random expansion or contraction for each family, as well as significant expansion or contraction across all branches of the species tree. Tables S3 and S4 contain all the gene families with expansions and contractions in R. aegyptiacus prior to and post IFN-correction (described above), respectively. Annotation of genes We used annotations derived from the NCBI pipeline to extract and characterize NK cell receptor genes and MHC genes. These annotations were supplemented with homology-based predictions from the gene family analysis described above, and manual correction after examination in BioEdit v7.0.0 (Hall, 1999). For example, out of the 12 genes classified as C-type lectins in our gene family analysis, one gene is missing exons and is considered a pseudogene. The NCBI gene database contains another gene annotated as an NKG2A-like gene, but was classified as a singleton in our gene family analysis because of missing exons. Additionally, one protein sequence identified as an NK cell receptor-like C-type lectin by our gene family analysis remained uncharacterized by the NCBI pipeline. Upon manual reannotation, we discovered that the uncharacterized protein sequence was actually a misannotation of two NKG2A-like genes-one complete pseudogene and one partial gene. Thus, this analysis is able to detect homologs and is robust to potential misannotation issues. To annotate the type I IFN genes, we used IFN sequences inferred from P. vampyrus sequencing traces from Kepler 1, and NW_015493352.1. The scaffolds containing the type I IFN locus in Figure 5A are M. lucifugus -NW_005871058.1, NW_005873184.1, NW_005874133.1. Cell To extract MHC class I putative nonamers, proteins derived from MHC class I genes (or the closest mouse homologs to HLA-A) were aligned, and nonamers determined based on location of known nonamer sequences To this, we added a curated set of bat orthologs retrieved from GenBank. The protein sequences were then aligned in Mega 6.0 (Tamura et al., 2013) with MUSCLE (Edgar, 2004) using a maximum of 20 iterations. Mega 6.0 was then used to estimate the most likely model, which in all cases was JTT+G. This model was used to generate phylogenies for each gene. Sites were not considered for phylogenetic analysis if a deletion was present in > 5% of sequences (10% for NKG2A) For Table S6, TPM values of IFNs were normalized by dividing by GAPDH TPM in the same tissue Purified total RNA samples were verified by NanoDrop spectrophotometer (Thermo Fisher Scientific) and stored at À80 C prior to use. RNA libraries were generated using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat High-Throughput kit (Illumina). The completed libraries were screened for quality with the High Sensitivity D1000 Screentape and Reagents on the Tapestation 2200 (Agilent) and the Library Qauntification kit (KAPA Biosystems). RNA-sequencing was performed using a dual-index paired end (2x125 bp) format on Illumina HiSeq 2500 with the Hiseq SBS Kit v4 (250 cycles) and the HiSeq PE Cluster Kit v4 (Illumina). Raw reads were cleaned with Trimmomatic v0 Thermo Fisher Scientific; see Key Resources Table) were maintained in FreeStyle 293 Expression Medium (Thermo Fisher Scientific) at 37 C at 8% CO 2 with continuous shaking at 135 rpm. 293F cells at a density of 1x10 6 /mL were transfected with plasmids encoding pCAGGS/6xHis-IFN-b1, and -IFN-u4 days post-transfection, clarified media (centrifuged at 4 C for 1 hr at 4000rpm) was collected and frozen at À20 C till purification. BL21(DE) cells were transformed with pET22b/6xHis-PA-D1 and streaked on LB agar plates containing 50ug/mL ampicillin. Resulting colonies were grown in 4mL of 2xYt media (ampicillin 50ug/mL) at 37 C till a 600nm OD of 0.6 was reached. 3 mL of starter culture was added to 97 mL of 2xYt media (ampicillin 50ug/mL) and grown till a 600nm OD > 1 Proteins were purified via a Capturem His-tagged purification maxiprep kit (Clontech, Takara Bio, Mountain View, CA) and dialyzed into sterile 1x PBS with a Vivaspin 2 protein concentration column (MWCO 10kDa; GE Life Sciences). Proteins were quantified by western blot for the 6x-His tag, and by 280nm absorbance measured on a NanoDrop spectrophotometer Arrows indicate nested models (e.g., an arrow pointing from Model 1a to 2a means that Model 1a is nested in Model 2a). For each applicable model, color indicates which branches were used to estimate which evolution rate Model 0 was the best-fitting for 330 genes, Model 1a for 5 genes, Model 1b for 65 genes, Model 1c for 23 genes, Model 2a for 0 genes, model 2b for 32 genes Expanded maximum likelihood phylogenies of (A) NKG2, (B) CD94 proteins, and (C) NKG2D proteins. R. aegyptiacus proteins are marked by red dots