key: cord-316258-7hucqcaj
authors: Henriques, Elsa S; Brito, Rui M M; Soares, Hugo; Ventura, Sónia; de Oliveira, Vivian L; Parkhouse, R Michael E
title: Modeling of the Toll-like receptor 3 and a putative Toll-like receptor 3 antagonist encoded by the African swine fever virus
date: 2011-01-28
journal: Protein Science
DOI: 10.1002/pro.554
sha:
doc_id: 316258
cord_uid: 7hucqcaj

African swine fever virus (ASFV) is a large double-stranded DNA virus responsible for a lethal pig disease, for which no vaccine has yet been obtained. Its genome encodes a number of proteins involved in virus survival and transmission in its hosts, in particular proteins that inhibit signaling pathways in infected macrophages and, thus, interfere with the host's innate immune response. A recently identified novel ASFV viral protein (pI329L) was found to inhibit the Toll-like receptor 3 (TLR3) signaling pathway, TLR3 being a crucial "danger detector." pI329L has been predicted to be a transmembrane protein containing extracellular putative leucine-rich repeats similar to TLR3, suggesting that pI329L might act as a TLR3 decoy. To explore this idea, we used comparative modeling and other structure prediction protocols to propose (a) a model for the TLR3–Toll-interleukin-1 receptor homodimer and (b) a structural fold for pI329L, detailed at atomistic level for its cytoplasmic domain. As this latter domain shares only remote sequence relationships with the available TLR3 templates, a more complex modeling strategy was employed that combines the iterative implementation of (multi)threading/assembly/refinement (I-TASSER) structural prediction with expertise-guided posterior refinement. The final pI329L model presents a plausible fold and good structural quality, is consistent with the available experimental data, and corroborates our hypothesis of pI329L being a TLR3 antagonist.
African swine fever virus (ASFV) is the etiological agent of an acute hemorrhagic fever of the domestic pig, with a mortality rate approaching 100%. Although not a human pathogen, this virus is a serious concern to swine farming and the livestock economy, as it is endemic in Sardinia and sub-Saharan countries. The increasing infections in Africa and the worldwide commercial trade pose serious risks to the global pig industry; moreover, there is no vaccine, so control is still based on diagnosis and the subsequent adoption of strict sanitary measures. 1 ASFV is a large, cytoplasmic, double-stranded DNA (dsDNA) virus and the single member of the Asfarviridae family, encoding many novel genes not encoded by other virus families. As it primarily infects porcine macrophages (a key cellular component of the innate immune system), this virus may have evolved immune evasion genes to manipulate innate immunity. The prediction is that half to two-thirds of the approximately 150 genes encoded by ASFV are not essential for replication in cells but have an important role in virus survival and transmission in its hosts. So far, the major strategy of the known ASFV proteins with roles in evading host defenses seems to be to interfere with intracellular signaling pathways and to inhibit transcriptional activation of important immunomodulatory genes. 2 This could in part explain the absence of an adequate host response upon infection by ASFV. Understanding the viral proteins involved in this strategy can point the way to key drug targets, and the proteins themselves (or derivatives of them) may even be of therapeutic use to curtail inflammation. There are still a number of ASFV-encoded proteins of unknown function that could be worth exploring for that purpose.

The innate immune system is mediated by germline-encoded pattern recognition receptors, each receptor having a broad specificity for conserved components of microorganisms, such as nucleic acids, polysaccharides, and lipids.
The molecular signature of most viruses is double-stranded RNA (dsRNA), produced either as an intermediate of the viral replication cycle (e.g., for dsDNA viruses) or as part of the viral RNA genome. Viral dsRNA is recognized by the Toll-like receptor 3 (TLR3), a member of the well-characterized TLR family that comprises 10-20 pattern recognition receptor paralogs, all being type I integral membrane proteins. On dimerization subsequent to recognition of dsRNA, TLR3 recruits the adaptor protein Toll-interleukin-1 receptor (TIR)-domain-containing adapter-inducing interferon-β (TRIF) to its cytoplasmic domain, thereby initiating a signaling cascade that results in the secretion of type I interferons and other inflammatory cytokines. 3 TRIF is actually the sole TLR adaptor that is able to engage mammalian cell death signaling pathways, and TLR3 is the only receptor in the TLR family that interacts directly with it. Interestingly, ASFV infects not only pigs but also ticks, both of which share TLR-mediated host defense systems. This, and the fact that ASFV specifically infects macrophages, makes it conceivable that some of the unassigned ASFV-encoded proteins might well interfere with TLR3 or other TLR signaling mechanisms. In a search for possible topological similarities and sequence homologies with already known proteins, a preliminary computational screening of the ASFV open reading frames (ORFs) predicted a protein of unknown function, named pI329L after ORF I329L, to be a transmembrane protein containing extracellular putative leucine-rich repeats (LRRs). As TLRs are also transmembrane proteins featuring an extracellular domain with LRR motifs, 4 this apparent similarity prompted experimental testing to check for interference with TLR signaling.
The results showed that pI329L is a highly glycosylated protein expressed in the endoplasmic reticulum, the Golgi, and the cell surface of recombinant lentivirus-transduced cells, and that it inhibits TLR3-dependent activation of two transcription factors (NF-κB and IRF3) responsible for the expression of interferon and chemokines (de Oliveira et al. 24 ). And as presented here, the experiments also indicate that TRIF is the putative target for this viral host modulation gene. In more detail, TLR3 features a large glycosylated ectodomain (ECD) with multiple LRRs responsible for ligand recognition, a transmembrane α-helix, and a cytoplasmic TIR domain responsible for initiation of intracellular signaling. The recently proposed structures of two TLR3-ECDs bound to a 46-bp dsRNA (Protein Data Bank [PDB] entry 3ciy 5 ) and of the TLR4-TIR homodimer 6 provided a credible structural picture of how viral dsRNA is recognized and how the signaling adaptors might be recruited. Each of the TLR3-ECDs binds dsRNA at two sites located at opposite ends of the TLR3 horseshoe (refer to Ref. 5 and Fig. 5), and an intermolecular contact between the two TLR3-ECD C-terminal domains coordinates and stabilizes the dimer. This juxtaposition should then mediate downstream signaling by the dimerization of the cytoplasmic TIR domains. 5 Despite the much shorter sequence of the viral protein, all the above information put together gave rise to the hypothesis that pI329L might form a heterodimer with TLR3, thus acting like a decoy receptor. Here, we report a structural assessment of this hypothesis. Using homology modeling and other structure prediction simulation protocols, we propose (a) a model for the so far experimentally unsolved TLR3-TIR structure, assembled within the context of the overall TLR3-dsRNA recognition complex, and (b) a structural fold for the pI329L intracellular extension that reinforces the idea of the viral protein being a TLR3 antagonist.
The computational models are discussed and validated, are consistent with the available experimental data, and preliminary conclusions concerning the role of the ASFV viral protein pI329L are put forward.

Structural assessment of the pI329L ECD

The TMHMM posterior probabilities of inside/outside/transmembrane regions for the pI329L sequence are depicted in Figure 1, together with the equivalent analysis on TLR3 for comparison. For pI329L, the expected number of amino acids in transmembrane helices is 22.90714, within region 238-260 of the sequence; as this number is larger than 18 and the predicted helix does not fall within the first 60 N-terminal residues (ruling out the possibility of a signal peptide), it is very likely that pI329L is a transmembrane protein. The PHYRE server ranked the Nogo receptor (NgR) ECD (PDB entry 1ozn) as the most apposite homologous fold of the pI329L putative ECD. NgR is a 470-residue, glycosyl phosphatidyl inositol-anchored membrane protein with a majority of the globular structure comprised of an LRR domain capped by N-terminal and C-terminal cysteine-rich modules. 7 As can be seen in Figure 2, the NgR structure easily superposes onto the TLR3-ECD membrane-adjacent region, despite the relatively low sequence homology (20% identity and 40% similarity, for the span of the NgR-ECD sequence). Combining this structural superposition with the PHYRE-proposed alignment of pI329L to NgR, we devised the alignment presented in Figure 3. Calculated (ClustalW) sequence identity and similarity of pI329L are 15 and 34% to NgR (ECD only) and 11 and 25% to TLR3 (ECD + transmembrane region). Noticeable from the alignment is the LRR pattern of pI329L (a series of short segments rich in hydrophobic residues such as leucine, isoleucine, and valine 8 ). Also worth mentioning is that the region aligned with the C-terminal cysteine-rich capping motif of NgR and TLR3 (underlined in Fig. 3) delineates at least one of the conserved structural disulfide bonds of this motif.
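The transmembrane call described above reduces to two checks on the TMHMM output. The following is a minimal sketch of that decision rule only, not of TMHMM itself; the function name `likely_transmembrane` and the way helices are passed in as (start, end) pairs are assumptions made for illustration.

```python
def likely_transmembrane(exp_aa_in_tmh, helices, min_exp_aa=18.0, n_term_window=60):
    """Apply the two criteria quoted in the text to TMHMM-style output.

    exp_aa_in_tmh : expected number of amino acids in transmembrane helices
    helices       : list of (start, end) residue ranges, 1-based (hypothetical input format)
    min_exp_aa    : threshold on the expected number (18 in the text)
    n_term_window : N-terminal window in which a helix may be a signal peptide (60 in the text)
    """
    # A helix starting inside the N-terminal window could be a signal peptide.
    helix_in_n_term = any(start <= n_term_window for start, _end in helices)
    return exp_aa_in_tmh > min_exp_aa and not helix_in_n_term


# Values reported for pI329L: 22.90714 expected residues, helix at 238-260.
print(likely_transmembrane(22.90714, [(238, 260)]))
```

With the values reported for pI329L both criteria are met, which is the basis for treating the protein as transmembrane in the rest of the analysis.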
In addition, PHYRE predicted a high α-helix content for pI329L in this region (data not shown), in close agreement with the one observed in NgR 7 (highlighted in Fig. 2). These results are consistent with our hypothesis of pI329L being a decoy of TLR3 through the formation of a heterodimer. The fact that the putative ECD of pI329L is considerably shorter than the TLR3 counterpart (note that the alignment in Fig. 3 presents only the last ~340 out of the 680 residues of TLR3-ECD) is not an issue, for it is the two C-termini of TLR3-ECD (i.e., the region with the aforementioned disulfide bonds, aligned with pI329L) that are brought into contact on binding to dsRNA, followed by the association of the transmembrane helices and the dimerization of the cytoplasmic TIR domains. 5 Bearing this in mind, the subsequent modeling efforts were directed to the pI329L cytoplasmic domain to consider a possible downstream inhibition of TRIF-induced signaling.

To understand the structural implications of a possible heterodimer between the pI329L and TLR3 cytoplasmic domains, a necessary first step is to define the TLR3-TIR dimer itself. As not even the structure of the monomer has been experimentally solved to date, our next step was to build a reasonable model of the TLR3-TIR structure. A number of experimental structures for other TLR-TIR domains are already available, including those of TLR1 (PDB entry 1fyv 9 ), TLR2 (PDB entry 1o77 10 ), and TLR10 (PDB entry 2j67 11 ), the latter in the form of a putative signaling dimer. These three receptors were found to be suitable homologous templates for TLR3-TIR, with values of identity and similarity of 26 and 57% (TLR1), 22 and 56% (TLR2), and 27 and 54% (TLR10), respectively, following the alignment in Figure 4. Compared with the sequence homology among the three templates themselves (identity varies from 46 to 69%), these values emphasize the atypical character of TLR3 within the TLR family.
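Identity and similarity percentages such as those quoted above can be derived from any pairwise alignment. The sketch below is illustrative only and rests on two assumptions that may differ from the settings actually used: percentages are taken over columns where both sequences have a residue, and "similar" means belonging to the same ClustalW "strong" conservation group. The function name and the toy sequences are hypothetical.

```python
# ClustalW "strong" conservation groups (assumed here as the similarity definition).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def identity_similarity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, % similarity) for two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != gap and y != gap]
    if not columns:
        return 0.0, 0.0
    identical = sum(x == y for x, y in columns)
    # Similarity counts identical pairs plus pairs sharing a conservation group.
    similar = sum(x == y or any(x in g and y in g for g in STRONG_GROUPS)
                  for x, y in columns)
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n


# Toy alignment (hypothetical sequences, not taken from Figure 4).
print(identity_similarity("LLVA-", "LIVSK"))  # (50.0, 100.0)
```

Because the denominator convention (aligned columns, shorter sequence, or full alignment length) changes the result, values computed this way are only comparable when the same convention is used throughout.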
Unfortunately, none of the available crystallographic structures that could serve as a template has the cytoplasmic linker region (the one immediately following the transmembrane domain) resolved (for the chosen templates, the N-terminal region is displayed with the unresolved residues in italic in the alignment of Fig. 4). This region is consistently predicted (PHYRE server) to have an α-helix fold and, at least in the case of TLR3, it bears a few key functional residues that are required for TLR3-induced activation of two transcription factors that promote the antiviral response. 4 These are residues Phe732, Tyr733, Leu742, and Gly743 (residue numbering according to Ref. 4), marked with an asterisk in the alignment (Fig. 4), which are conserved across human, pig, mouse, and other species. The initial model structures of the TLR3-TIR domain were built with Modeller using the alignment presented in Figure 4. The most promising models were subjected to further restrained modeling for the cytoplasmic linker, by invoking the special restraints routine to impose an α-helix structure on it. The best quality refined structure presents all bonding parameters within the allowed limits and no bad contacts, and the corresponding Ramachandran plot analysis (PROCHECK) shows 94.0% of the residues in the core regions and 6.0% in the additional allowed regions. The model is of equivalent quality to the best quality template structure, that of TLR10, which has 93.1% of its residues in the most favored regions. Finally, the model was relaxed with 100 ps of molecular dynamics (MD) under physiological conditions to ensure that it sustains the folding characteristics.
As suggested by Liu et al., 5 the dimerization of the cytoplasmic TIR domains results from the interactions of the LRR C-terminal ECDs on RNA binding, and similarly to what has been depicted by those authors, a construct of the entire TLR3-dsRNA recognition complex dimer was next assembled to further assess the validity of our TLR3-TIR model. Two TLR3-TIR monomers were first placed together using the structure of the TLR10 dimer as scaffold, and then connected to the experimentally determined structure of the ectodomains 5 via two α-helices (the transmembrane domains), which were built and positioned using distance-restrained modeling within Modeller. The resulting model is depicted in Figure 5. The entire structure was positioned in a membrane model to further ensure that each domain would fall in the appropriate cellular compartment. For the TLR3-TIR dimer, the C-alpha root mean square deviation to the template protein is 0.90 Å. As in the case of TLR10, the so-called BB-loop and central distorted α-helix C (nomenclature used in previous works on TIR domains 4,5,11 ) constitute the major part of the dimer's interface. The BB-loop (signaled in BOX 2 in the alignment of Fig. 4) shares a conserved proline in all TLRs except TLR3, where this residue is replaced with an alanine (Ala795, marked with an asterisk in the alignment). It should be pointed out that this is the adaptor-binding loop, which on dimerization shapes a twofold symmetrical exposed patch, 11 where the appropriate TLR adaptor is suggested to dock. In fact, experimental evidence demonstrates the importance of that particular alanine in the binding of TLR3 directly to the adaptor TRIF, while the other TLRs require mediator adaptors. 4 As for α-helix C, it has an extra arginine residue in TLR3 when compared with all other TLRs (the insertion is also signaled in the alignment), which adds to the uniqueness of TLR3 within its family.
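The C-alpha root mean square deviation quoted above for the TLR3-TIR dimer is a standard measure of how closely two sets of equivalent atoms coincide. The following is a minimal sketch, assuming the two structures have already been optimally superposed and that the coordinate lists are in matching residue order; the function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the units of the input, e.g. angstroms) between two equally long
    lists of (x, y, z) C-alpha coordinates that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by 1 unit along z gives an RMSD of 1.0.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(ca_rmsd(model, template))  # 1.0
```

This sketch deliberately omits the superposition step (e.g., a least-squares fit), so it only reproduces a reported value if the coordinates are first brought into the same frame.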
In fact, this is to say that there might be some complementarity nuances in the dimer's contact region that may not be revealed by homology modeling alone. For the purpose of this work, however, it suffices to map the critical regions that the virus protein under study might interfere with.

Not surprisingly, attempts to sequence-align the pI329L putative cytoplasmic domain with the TLR3-TIR domain resulted in very different alignments depending on the chosen ClustalW scoring matrix and gap penalty. Homology values were always too low for straightforward homology modeling to be considered (8% identity and 24% similarity at best). The PHYRE server predicts a protein fold featuring an initial α-helix and three to four small β-strands, with no match to its proposed set of weakly/distantly homologous templates. A more sophisticated modeling approach was thus required, one possibility being the iterative implementation of the (multi)threading/assembly/refinement approach (I-TASSER), the "Zhang-server" that ranked as the No. 1 server in the recent CASP7 and CASP8 experiments (http://predictioncenter.org/casp8/groups_analysis.cgi). The I-TASSER predicted secondary structure (nearly equivalent to PHYRE's) is presented in Figure 6. The server also proposed five crude models for the pI329L domain in question, the highest-ranking one displaying a confidence score of −3.81 (the range is typically within −5 to 2; the higher the value, the higher the confidence 12 ). Interestingly, a PDB search by secondary structure content revealed a comparable motif for the recently solved NMR structure of domain C of the nonstructural nsp3e protein from the severe acute respiratory syndrome (SARS) coronavirus (CoV) (PDB entry 2kaf 13 ). The resulting alignment of these two viral protein domains within the secondary structure prediction context is also presented in Figure 6.
The I-TASSER best model was then used for further refinement within Modeller, using the SARS-CoV domain as an additional template. The available TLR structures were used only to guide the modeling of a few localized segments. Two apposite refined models for the pI329L cytoplasmic domain were obtained in this way, both presenting a plausible fold and good structural quality. The Ramachandran plot analysis showed 87.8% of the residues in the core regions (it is 88.1% for the best representative conformer in the 2kaf ensemble) and the remainder in the additional allowed regions. The alignment of the viral proteins with TLR3-TIR, also depicted in Figure 6, was attained by superposition of the structural features assigned to each sequence. Based on proximity criteria, both models of pI329L allow for two disulfide bonds (signaled in Fig. 6), which would be expected if such a small domain (~80 residues) is to sustain an ordered compact fold. The two models are practically equivalent, differing only in the N-terminal α-helix, which is "straight" in one case and "distorted" in the other (refer to Fig. 6): both motifs are eligible considering that (1) this is the frontier fragment between the transmembrane and cytoplasmic domains and (2) one such kink would not be uncommon in the region of a potential disulfide bond. Figure 6. The view in Figure 7(b) highlights the TLR3-TIR residues that must come into play on homodimer formation, and how pI329L could "ill-replace" one of the monomers at the interface region. Whether the outcome is mimic-binding or a simpler steric hindrance, either would suffice to impede the correct formation of the twofold symmetric region where the adaptor TRIF is expected to bind, and disrupt its corresponding signaling. Moreover, at least one of the phosphorylation sites required for downstream regulation 4 (the tyrosine-759 in BOX 1, refer to the alignments in Figs. 4 and 6) would be ousted in this scheme.
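The proximity criterion mentioned above for flagging possible disulfide bonds can be illustrated with a simple distance screen over cysteine sulfur atoms. This is a sketch under stated assumptions, not the procedure actually used: the 3.0 Å sulfur-sulfur cutoff, the function name `candidate_disulfides`, and the residue numbers and coordinates in the example are all hypothetical.

```python
import math
from itertools import combinations


def candidate_disulfides(sg_coords, cutoff=3.0):
    """List cysteine pairs whose SG atoms lie within `cutoff` angstroms.

    sg_coords : dict mapping residue number -> (x, y, z) of the cysteine SG atom
    cutoff    : assumed distance threshold; a real S-S bond is about 2.05 angstroms,
                so a looser value tolerates unrefined model geometry
    """
    pairs = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(sg_coords.items()), 2):
        distance = math.dist(xyz_i, xyz_j)
        if distance <= cutoff:
            pairs.append((res_i, res_j, round(distance, 2)))
    return pairs


# Hypothetical SG coordinates for four cysteines in a small domain.
sg = {10: (0.0, 0.0, 0.0), 40: (2.0, 0.0, 0.0),
      55: (10.0, 0.0, 0.0), 70: (10.0, 2.1, 0.0)}
print(candidate_disulfides(sg))  # [(10, 40, 2.0), (55, 70, 2.1)]
```

A screen of this kind only proposes candidates; whether a bond is plausible also depends on side-chain geometry and on the redox environment of the compartment in which the domain resides.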
Arguably, the actual TLR3 region covered by the pI329L cytoplasmic domain would also depend on the positioning of the α-helix cytoplasmic linker in TLR3 (the first 25 residues in the alignment of Fig. 4, for which there is no template counterpart). The way it has been modeled [Figs. 5 and 7(b)], this linker sits on top of the TIR domain, but even if it were somewhat detached from it, that would alternatively cause pI329L to mainly superpose over the linker. Since (as mentioned in the previous subsection) the linker bears several residues required for TLR3-induced activation of two key transcription factors (NF-κB and IRF3), the very same that are inhibited by I329L expression, our proposed model still holds. This signal inhibition hypothesis is also corroborated by the experimental observation that overexpression of TRIF does indeed reverse the inhibition caused by pI329L, as inferred from the results of the luciferase assay presented in Figure 8. The luciferase gene is controlled by the interferon-β (IFNβ) promoter, and the graph shows that the inhibition of the luciferase reporter activation induced by I329L is reverted in a dose-dependent manner by cotransfection of TRIF. Note that the possible inhibitory role of pI329L at the intracellular level does not exclude the potential decoy purpose of its extracellular domain, which is implied in the structural assessment of the I329L ECD presented in the corresponding subsection above.

Following the preliminary screening of the ASFV ORFs (data not shown), the sequence of the pI329L precursor, as taken from the NCBI GenBank database (www.ncbi.nlm.nih.gov; accession number AAA65369), was further analyzed using the TMHMM program for predicting transmembrane regions. 14 It gives the most probable location and orientation of transmembrane helices in the sequence as found by the N-best algorithm, which sums over all paths through the model with the same location and direction of the helices.
14 Other possible similarities/homologies were investigated using PHYRE, the Protein Fold Recognizer Server, 15 which predicts protein structures by detecting remote homology to known structures. Multiple sequence alignments were performed with ClustalW2 16 and edited for manual refinement within Jalview. 17 TLR3 structures were homology modeled with Modeller 9v4, 18 based on the alignment presented in Figure 4. For the putative cytoplasmic domain of pI329L, a preliminary set of crude models was built with the I-TASSER server for protein prediction. In short, I-TASSER generates full-length models of proteins by excising continuous fragments from Local Meta-Threading-Server multiple-threading alignments and then reassembling them using replica-exchange Monte Carlo simulations (for more details see Ref. 12 and references within). This was followed by more accurate modeling with Modeller, using the I-TASSER results as starting input and an additional template (vide supra the modeling of pI329L) that resulted from a targeted PDB search. 19 The quality of the models was assessed using PROCHECK v.3.4.3. 20 The best quality modeled structures were regularized and MD-relaxed using the NAMD program 21 with the CHARMM22 force field. 22

The possible impact of I329L on TRIF signaling was further investigated through a luciferase reporter gene assay. Vero cells were cotransfected with 300 ng of either the empty plasmid vector (pcDNA3-HA) or the I329L expression vector (pcDNA3-I329L-HA), and with increasing amounts of the TRIF plasmid vector (25-100 ng) along with 100 ng of the IFNβ reporter construct [pIFD(−125/+72)lucter]. The cells were finally stimulated with 25 µg/mL of poly I:C for 5 h. Luciferase activity was normalized to the β-galactosidase activity given by the cotransfected β-galactosidase internal control plasmid (pCMVβ).
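The normalization step described above amounts to dividing each luciferase reading by the β-galactosidase reading from the same well, and then, if desired, expressing the result relative to a reference condition. The sketch below illustrates that arithmetic only; the function names, condition labels, and readings are hypothetical and are not data from Figure 8.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase signal corrected for transfection efficiency via the internal control."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def relative_to_reference(readings, reference):
    """Express normalized activities as a fraction of a reference condition.

    readings  : dict mapping condition name -> (luciferase, beta_gal) raw readings
    reference : name of the condition used as 1.0
    """
    normalized = {name: normalized_activity(luc, gal)
                  for name, (luc, gal) in readings.items()}
    ref_value = normalized[reference]
    return {name: value / ref_value for name, value in normalized.items()}


# Hypothetical raw readings (arbitrary units) for three conditions.
raw = {
    "empty vector": (1000.0, 10.0),
    "I329L": (300.0, 10.0),
    "I329L + TRIF": (800.0, 10.0),
}
print(relative_to_reference(raw, "empty vector"))
```

Dividing by the internal control before comparing conditions keeps differences in transfection efficiency or cell number from being mistaken for changes in promoter activity.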
The proposed fold and structural model of the pI329L viral protein were devised using both automatic tools and human expertise in template-based modeling, along with experimental guidance. Such a strategy has proven adequate when only remote relationships between the target and the structural templates exist. 23 Special attention has been paid to choosing the appropriate alignment(s) and template(s). The working hypothesis being that the viral protein is an antagonist of TLR3, for which no experimental structure of the TIR domain exists, a model of this domain in the context of the TLR3-dsRNA recognition complex was first assembled. It provided a background framework for the subsequent comparative modeling of the pI329L cytoplasmic domain. The modeling results substantiate the idea that pI329L may function as a TLR3 decoy, showing that the viral protein could hinder TLR3 dimerization and, in doing so, inhibit the downstream signaling pathway. In conjunction with the experimental evidence, the present modeling exercise allowed us to gain further insight into the strategies used by the ASF virus to evade the host immune response and into the role of the viral protein encoded by the unassigned I329L gene in this process. Modeled structures are available on request.
1. Systematic analysis of longitudinal serological responses of pigs infected experimentally with African swine fever virus
2. African swine fever virus proteins involved in evading host defence systems
3. Ticam-1, an adaptor molecule that participates in Toll-like receptor 3-mediated interferon-beta induction
4. Sensing of viral infection and activation of innate immunity by Toll-like receptor 3
5. Structural basis of Toll-like receptor 3 signaling with double-stranded RNA
6. A dimer of the Toll-like receptor 4 cytoplasmic domain provides a specific scaffold for the recruitment of signaling adaptor proteins
7. Structure of the Nogo receptor ectodomain: a recognition module implicated in myelin inhibition
8. The leucine-rich repeat structure
9. Structural basis for signal transduction by the Toll/interleukin-1 receptor domains
10. An extensively associated dimer in the structure of the C713S mutant of the TIR domain of human TLR2
11. The crystal structure of the human Toll-like receptor-10 cytoplasmic domain reveals a putative signaling dimer
12. (2008) I-TASSER server for protein 3D structure prediction
13. NMR assignment of the nonstructural protein nsp3(1066-1181) from SARS-CoV
14. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
15. Protein structure prediction on the web: a case study using the Phyre server
16. ClustalW and ClustalX version 2.0
17. Jalview version 2: a multiple sequence alignment editor and analysis workbench
18. Comparative protein structure modeling using MODELLER
19. The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data
20. PROCHECK: a program to check the stereochemical quality of protein structures
21. NAMD2: greater scalability for parallel molecular dynamics
22. All-atom empirical potential for molecular modeling and dynamics studies of proteins
23. The use of automatic tools and human expertise in template-based modeling of CASP8 target proteins
24. (November 2010) Novel TLR3 inhibitor encoded by the African Swine Fever Virus (ASFV). Arch Virol