key: cord-0835691-g55iinzj authors: Baltoumas, Fotis A; Zafeiropoulou, Sofia; Karatzas, Evangelos; Paragkamian, Savvas; Thanati, Foteini; Iliopoulos, Ioannis; Eliopoulos, Aristides G; Schneider, Reinhard; Jensen, Lars Juhl; Pafilis, Evangelos; Pavlopoulos, Georgios A title: OnTheFly(2.0): a text-mining web application for automated biomedical entity recognition, document annotation, network and functional enrichment analysis date: 2021-10-06 journal: NAR Genom Bioinform DOI: 10.1093/nargab/lqab090 sha: 7b8b55a5bec38fcfc0a0fb4eabba0a9f0d90da02 doc_id: 835691 cord_uid: g55iinzj Extracting and processing information from documents is of great importance as lots of experimental results and findings are stored in local files. Therefore, extracting and analyzing biomedical terms from such files in an automated way is absolutely necessary. In this article, we present OnTheFly(2.0), a web application for extracting biomedical entities from individual files such as plain texts, office documents, PDF files or images. OnTheFly(2.0) can generate informative summaries in popup windows containing knowledge related to the identified terms along with links to various databases. It uses the EXTRACT tagging service to perform named entity recognition (NER) for genes/proteins, chemical compounds, organisms, tissues, environments, diseases, phenotypes and gene ontology terms. Multiple files can be analyzed, whereas identified terms such as proteins or genes can be explored through functional enrichment analysis or be associated with diseases and PubMed entries. Finally, protein–protein and protein–chemical networks can be generated with the use of STRING and STITCH services. To demonstrate its capacity for knowledge discovery, we interrogated published meta-analyses of clinical biomarkers of severe COVID-19 and uncovered inflammatory and senescence pathways that impact disease pathogenesis. 
OnTheFly(2.0) currently supports 197 species and is available at http://bib.fleming.gr:3838/OnTheFly/ and http://onthefly.pavlopouloslab.info.

The extraction and processing of information from literature is of paramount importance in biomedical sciences. More often than not, researchers must manually sift through large amounts of text in various forms (e.g. scientific articles in various file formats, data in spreadsheets and images) to obtain pertinent biological information about genes, proteins, chemical compounds, organisms and biological processes and functions. The manual approach can be summarized as follows: (i) read through the article texts, (ii) detect biomedical entities of interest and (iii) query one or more databases for the relevant information. As the volume of literature and experiment-derived datasets continues to increase, this iterative procedure becomes slow and laborious. For this reason, text-mining methods are often employed to help researchers automatically extract and look up meaningful biological terms from texts. One application of text mining that is widely used in scientific text processing is named entity recognition (NER), i.e. identifying words or phrases of interest (the so-called 'named entities') mentioned in plain text, and normalizing them to appropriate database/ontology identifiers (1). In biological and biomedical sciences, these entities include gene and protein names, organisms (scientific or common names), chemical compounds and ontology terms such as biological processes, cellular components, molecular functions, diseases, phenotypes and environmental descriptors. Numerous computational tools and web services for NER have been proposed [reviewed in (2-4)]. Characteristic examples include EXTRACT (5), PubTator (6), HunFlair (7), BioTextQuest+ (8), Saber (9), OGER++ (10) and others.
Altogether, these tools detect genes/proteins, genetic variants, diseases, chemical compounds, organisms and cell lines mentioned in documents. However, while the ability to identify entities successfully is critical, NER is only one of the components needed for meaningful parsing and analysis of the scientific literature. The result of running NER on a set of documents is a long list of genes and other entities, which the user then needs to navigate and make sense of. Network visualization is one popular way to get an overview of a large number of entities. Molecular networks can be obtained from a wide variety of sources, including manually curated pathway databases, e.g. KEGG (11) and Reactome (12), and databases of interaction experiments, e.g. IntAct (13) and BioGRID (14). Resources like STRING (15) and STITCH (16) combine these with additional associations that are predicted or extracted from the biomedical literature through automatic text mining (17). All of these resources can be queried manually by the user, or automatically through application programming interfaces (APIs), packages or plugins. Their results can be used to generate, visualize and analyze interaction networks with network viewers such as Cytoscape (18), Gephi (19), NORMA (20) and others [reviewed in (21, 22)].

Functional enrichment analysis is another commonly used approach, which summarizes a long list of genes by comparing their associated functional annotations against a collection of gene set annotation terms, each representing a gene ontology term, molecular pathway, protein domain, disease etc. Statistically enriched gene annotation sets are then identified by comparing their frequency against a reference background list.
Widely used tools for enrichment analysis include DAVID (23), PANTHER (24), WebGestalt (25), aGOtool (26) and g:Profiler (27, 28), each adopting different statistical tests and supporting different enrichment options [reviewed in (29, 30)].

In this article, we present OnTheFly 2.0, a full, user-friendly pipeline that goes far beyond just applying NER and allows users to start from a collection of documents and, via a set of entities, perform network and enrichment analyses. Through an interactive web interface, OnTheFly 2.0 supports a plethora of different file formats for text mining and biomedical entity extraction, including text documents (in both editable and read-only formats), spreadsheets and image files. Through NER, OnTheFly 2.0 can recognize and retrieve a large variety of both biological and biomedical terms. Extracted protein and chemical entities can be combined to create datasets for various types of analyses, including functional enrichment, related literature finding, associations with diseases and protein domain reporting from protein family databases. In addition, OnTheFly 2.0 allows the generation and visualization of protein-protein and protein-chemical interaction networks. The capabilities of OnTheFly 2.0 are shown using a case study in which several inflammatory and senescence pathways that impact COVID-19 pathogenesis were uncovered by analyzing six clinical articles that mention clinical biomarkers of severe COVID-19.

The OnTheFly 2.0 pipeline consists of four steps (Figure 1): (i) uploading of input files and conversion from their original format to HTML, (ii) identification of bioentities with EXTRACT, (iii) functional annotation on a set of selected identifiers and (iv) network analysis. A detailed description of these steps is provided in the following subsections.
In its current version, OnTheFly 2.0 supports annotation for PDF files, Office-formatted documents, various flat text file formats (including XML) and images. In the online version, each file can be at most 10 MB in size. Users can upload multiple files simultaneously and process them separately or in combination. OnTheFly 2.0 uses various tools and pipelines in its backend to convert uploaded files to HTML format prior to annotation. PDF files are converted with pdf2htmlEX, an open-source package (31), whereas the LibreOffice universal converter (unoconv) is used to convert Office files, including formatted/enriched text files (MS Office .doc/.docx, OpenOffice .odt, Rich Text Format .rtf) and spreadsheets (MS Excel .xls/.xlsx, OpenOffice Spreadsheet .ods), tab- and comma-delimited table files (.tsv and .csv, respectively), XML files, as well as flat text (.txt) files. Notably, in the case of spreadsheet documents, the converter is capable of handling each of the sheets. Almost all of the aforementioned file types are converted to HTML with their overall layout, text formatting, formulas and images maintained to the largest extent possible. The only exception is XML files, which are rendered as plain text, without any syntax highlighting for the XML tags. Future versions of OnTheFly will address this issue by implementing more advanced visualization options, including better support for the XML format.

Once the files have been uploaded, users can annotate them with the help of the EXTRACT tagging service (5). EXTRACT performs dictionary-based NER using the highly efficient tagger software (33) to detect words and phrases that correspond to biomedical entities. In this approach, biological and biomedical terms, both canonical names and synonyms, are mapped to database and ontology identifiers, thus producing concept-normalized results.
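To make the idea of dictionary-based, concept-normalized tagging more concrete, the following is a minimal sketch in Python. It is not the EXTRACT or tagger implementation: the toy dictionary, the `DEMO:` identifiers and the function names are all hypothetical, and the matching strategy (case-insensitive, longest match first) is an assumption chosen for illustration.

```python
import re
from typing import Dict, List, Tuple

# Toy dictionary (hypothetical): each lower-cased surface form, whether a
# canonical name or a synonym, maps to (entity type, identifier).
# The "DEMO:" identifiers are placeholders, not real database accessions.
TOY_DICTIONARY: Dict[str, Tuple[str, str]] = {
    "il-6": ("gene/protein", "DEMO:IL6"),
    "interleukin 6": ("gene/protein", "DEMO:IL6"),
    "interleukin 6 receptor": ("gene/protein", "DEMO:IL6R"),
    "covid-19": ("disease", "DEMO:COVID19"),
}

Hit = Tuple[int, int, str, str, str]  # start, end, matched text, type, identifier


def build_matcher(dictionary: Dict[str, Tuple[str, str]]) -> "re.Pattern[str]":
    """Compile one regex that prefers the longest dictionary form at each position."""
    forms = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(form) for form in forms)
    # Lookarounds keep matches from starting or ending inside a longer word.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def tag(text: str, dictionary: Dict[str, Tuple[str, str]] = TOY_DICTIONARY) -> List[Hit]:
    """Return every dictionary hit in `text`, normalized to its identifier."""
    matcher = build_matcher(dictionary)
    hits: List[Hit] = []
    for match in matcher.finditer(text):
        entity_type, identifier = dictionary[match.group(0).lower()]
        hits.append((match.start(), match.end(), match.group(0), entity_type, identifier))
    return hits


if __name__ == "__main__":
    sentence = "IL-6 and interleukin 6 receptor levels rise in COVID-19 patients."
    for hit in tag(sentence):
        print(hit)
```

In this sketch the synonym "IL-6" and the canonical name "interleukin 6" resolve to the same placeholder identifier, which is what concept normalization means here, while "interleukin 6 receptor" is matched as a whole rather than as "interleukin 6" followed by an unrelated word.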
In detail, EXTRACT is capable of identifying environment-descriptive terms (e.g. desert and forest) from the Environment Ontology (34), organism mentions from NCBI Taxonomy (35), tissue terms from BRENDA Tissue Ontology (36), disease mentions from Disease Ontology (37), phenotypes from Mammalian Phenotype Ontology (38), biological processes, cellular components and molecular functions from Gene Ontology (39, 40), small chemical molecules from PubChem (41), non-coding RNAs from RAIN (42) and protein-coding genes from STRING (15). In the implementation of OnTheFly 2.0, NER can be performed for a list of 197 organisms. Once the annotation parameters (entity types and organisms) have been set and a NER process has been completed, OnTheFly 2.0 returns the annotated document with all of the recognized terms linked and highlighted using different colors (Figure 2). Clicking on a term generates a popup window with details about the biomedical entity and links to external databases. When a term is ambiguous (e.g. when it occurs in several organisms or corresponds to more than one entity type), OnTheFly 2.0 reports all of the possible options. For a more comprehensive summary, all of the identified terms along with their database identifiers and links are collected in an interactive table and can be exported as a CSV file. The table results can be filtered by entity type (e.g. genes/proteins and diseases) at any stage. The annotation process is presented in Figure 2.

OnTheFly 2.0 uses two tools, g:Profiler (27, 28) and aGOtool (26), to provide rich functional enrichment analysis for a selected set of genes/proteins collected from one or multiple files. The user can customize parameters for the enrichment analysis and choose from a list of 197 organisms.
OnTheFly 2.0 uses g:Profiler to identify enriched functional terms from Gene Ontology (39, 40), pathways from KEGG (11), Reactome (12) and WikiPathways (43), protein complexes from CORUM (44), expression data from Human Protein Atlas (45), regulatory motifs from TRANSFAC (46, 47) and miRTarBase (48), and phenotypes from the Human Phenotype Ontology (49). The analysis results from g:Profiler are complemented by further enrichment analyses from aGOtool to also identify enriched terms from the UniProt keyword classification system, protein families and domains from Pfam (50) and InterPro (51), as well as human diseases from the DISEASES database (52). g:Profiler and aGOtool test for statistically significant enrichment by using Fisher's exact test to compare the user-defined input dataset (foreground) to a background set of organism-specific genes annotated in the Ensembl database (53) and UniProt Reference Proteomes (54), respectively. The resulting P-values are corrected for multiple testing using either g:SCS (only in the case of g:Profiler), Bonferroni correction or Benjamini-Hochberg false discovery rate (FDR), all of which can be used as thresholds for the results. Enrichment analysis is performed using ENSEMBL IDs as input, while results can be reported as Entrez, UniProt, EMBL, ENSEMBL and RefSeq gene/protein names/identifiers, based on the user's choice.

Functional enrichment results are reported in interactive searchable tables displaying details about each functional term. One can expand each row of the table to see which of the identified genes/proteins were found to be associated with the functional term. For example, in the case of a KEGG pathway, one can see how many proteins or genes were found to be related to it and get redirected to the KEGG repository to see the actual schema of the pathway in a static form with all of the detected genes/proteins highlighted.
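The statistical test described earlier in this section can be illustrated with a short, self-contained Python sketch. It assumes the one-sided (over-representation) form of Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution, followed by Bonferroni and Benjamini-Hochberg corrections. The function and parameter names are hypothetical, the g:SCS method specific to g:Profiler is not covered, and this is not the code used by g:Profiler or aGOtool.

```python
from math import comb
from typing import List


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact test for over-representation of one term.

    k: foreground genes annotated with the term
    n: size of the foreground (user-selected genes)
    K: background genes annotated with the term
    N: size of the background
    Returns P(X >= k) for X following a hypergeometric distribution.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical numbers: 8 of 40 selected genes carry a term that
    # annotates 200 of 20000 background genes.
    p = enrichment_p_value(k=8, n=40, K=200, N=20000)
    print(f"raw P-value: {p:.3e}")
    raw = [p, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(raw))
    print("Benjamini-Hochberg:", benjamini_hochberg(raw))
```

Either set of corrected values can then serve as the significance threshold when filtering the result tables.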
In the case of g:Profiler, an interactive Manhattan plot is offered for a clearer overview. In this plot, functional terms are grouped along the x-axis and colored by their data source, whereas the y-axis shows the significance (P-value) of each term. Hovering over a data point reveals a tooltip with key information about the functional term. Finally, the most significant functional terms are shown as a bar chart, which the user can customize to show the desired number of terms. All of the aforementioned reports can be exported and saved in various file formats (CSV, XLS, PDF). An overview is shown in Figure 3.

OnTheFly 2.0 uses aGOtool to allow users to find scientific articles that mention an unexpectedly large number of the genes/proteins identified in the uploaded input files. While conceptually similar to the functional enrichment analyses just described, publication enrichment analysis serves a very different purpose, namely to help the user identify scientific publications of relevance to the gene/protein list. The publication enrichment analysis in aGOtool is based on a text corpus of all PubMed abstracts and full-text articles from the PubMed Central Open Access subset. These have been run through the same NER tagger used in EXTRACT and the results are updated with new documents on a weekly basis. Consequently, all documents have been automatically annotated with the genes mentioned within them, thus turning every document into a gene set. These millions of gene sets are then used by aGOtool in the same manner as all other gene sets. We make use of this functionality to provide publication enrichment in OnTheFly 2.0 for the list of 197 organisms. The user can select up to 1000 of the genes/proteins identified in the uploaded files for analysis, which are then submitted to aGOtool to test each document from the precomputed corpus for statistically significant enrichment, again using Fisher's exact test.
The resulting P-values as well as Bonferroni-corrected P-values and Benjamini-Hochberg FDR values can be used for filtering the results. Results are reported in interactive searchable tables displaying details about each literature term (scientific publication). Links to PubMed are provided for the publications. In addition, users are able to rank the most significant publications using bar chart plots and manually adjust the number of reported results with a slide bar. All of the aforementioned reports can be exported and saved in various file formats (CSV, XLS, PDF).

In addition to the aforementioned enrichment options, OnTheFly 2.0 offers the capability to construct and visualize biomolecular interaction networks for a set of 197 organisms. This task is performed using the APIs of the STRING (15) and STITCH (16) databases for protein-protein and protein-chemical interactions, respectively. Users may submit the dataset obtained from the uploaded documents to retrieve interactions and visualize the results as networks, with the interacting entities presented as nodes and their interactions as edges. For computational efficiency reasons, in its current version, OnTheFly 2.0 allows a maximum of 500 proteins per request for STRING and 100 proteins or small molecules per request for STITCH. STRING and STITCH classify interactions between two entities (proteins or small molecules) as either physical (i.e. part of the same biomolecular complex) or functional (i.e. involved in the same pathway/process). To this end, OnTheFly 2.0 requires users to select whether to include the Full set of interactions (both physical and functional) or the Physical subnetwork exclusively. Users can also specify the cutoff on the Interaction Score. Finally, users can choose whether each edge should show the type(s) of evidence (e.g. experiments or text mining) supporting it (Evidence mode) or whether the thickness of the edge should instead show the interaction score (Confidence mode). In addition to the above, in protein-chemical networks the edges can be formatted based on Molecular Action or Binding Affinity. By choosing Molecular Action, the edges in the network will represent the type (activation, inhibition, catalysis etc.) as well as the effect (positive, negative or unspecified) of each protein-chemical interaction. By choosing Binding Affinity, the edge thickness will indicate the binding affinity between the proteins and bound chemicals. The resulting network is shown in a separate Network Viewer panel, preserving the characteristic STRING network layout and style. An example of such networks is shown in Supplementary Figure S1. In addition, options are given to view the generated network in STRING (protein-protein) or STITCH (protein-chemical) for further analysis. Finally, one can export a network as an image or as a tab-delimited file compatible with external network visualization applications.

OnTheFly 2.0 is a web application implemented in R, using the R/Shiny package as well as HTML, CSS and JavaScript. The Shiny and ShinyJS packages are used as mediators to establish the connection between the R and JavaScript functions. The API of the EXTRACT web service, which utilizes the tagger text mining utility, is used to perform NER. Functional enrichment analysis is performed using the gprofiler2 (28) package (R implementation of g:Profiler) and aGOtool. Biological networks are constructed and visualized using the STRING API, as implemented in the STRING and STITCH databases. OnTheFly 2.0 is available as a web tool and as a standalone package through a GitHub repository. The standalone version is fully functional in native Linux and other Unix-based operating systems.
It can also run on Windows, by using the Windows Subsystem for Linux (WSL) or other similar compatibility layers (e.g. Cygwin). The web tool is fully functional in all major web browsers (Google Chrome, Mozilla Firefox, Microsoft Edge, Tor, Apple Safari, Opera) (Table 1).

To demonstrate the capacity of OnTheFly 2.0 for rapid extraction of biological information and knowledge discovery, we analyzed six published meta-analysis reports on clinical biomarkers of severe COVID-19 (Supplementary Table S1). Texts in PDF format were annotated by NER, the results were manually filtered to remove false positives, and the documents were jointly processed for functional enrichment analysis. Reassuringly, we found 'Respiratory failure', 'Pneumonia' and 'COVID-19' to be among the most significantly enriched diseases (Supplementary Table S2). The GO enrichment for biological processes (Supplementary Table S3) identified several GO terms related to inflammation, cell activation and response to stress, in line with COVID-19 being associated with exaggerated lung inflammation and systemic immune dysfunction. Similarly, the annotated text terms were found to be enriched for molecular functions that are associated with cytokine activity and cytokine receptor signaling (Supplementary Table S4). These results were supported by the UniProt keyword analysis, which revealed 'Cytokine', 'Inflammatory response', 'Host-virus interaction' and 'Host cell receptor for virus entry' to all be enriched (Supplementary Table S5). Analysis of putative protein-protein interactions (physical and functional associations) through the STRING option of OnTheFly 2.0 uncovered a cluster of interacting cytokines and other immune components that is pertinent to the 'cytokine storm' of severe COVID-19 (Figure 4). Cytokines are also a recurring theme in the publication enrichment results, which, as one would hope, further included several COVID-19 studies (Supplementary Table S6).
Cellular/extracellular components predicted to be associated with biomarkers of severe COVID-19 included extracellular space (GO:0005615, GO:0005576), plasma membrane (GO:0009897, GO:0009986, GO:0098552) and, interestingly, membrane microdomains (also called 'membrane rafts'; GO:0098857, GO:0045121) (Supplementary Table S7). The latter emerge as important cellular components implicated in (i) the initial binding of SARS-CoV-2 to ACE2 receptor, (ii) virus internalization and (iii) cell-to-cell transmission [reviewed in (61)]. Pertinent to knowledge discovery, this biological information was extracted in the absence of specific reference to membrane microdomains in any of the six meta-analysis reports that were interrogated. Several relevant KEGG pathways were also extracted (Supplementary Table S8), including 'coronavirus disease - COVID-19' (KEGG: 05171; Supplementary Figure S2), 'viral protein interaction with cytokine and cytokine receptor' (KEGG: 04061) and 'cytokine-cytokine receptor interaction' (KEGG: 04060). Interestingly, 'Yersinia infection' (KEGG: 05135) was also identified as a relevant KEGG pathway with high probability (P-value < 10^-8). Yersinia pestis is the causative pathogen for pneumonic plague, one of the world's deadliest infectious diseases. Yersinia pestis infects pneumocytes and alveolar macrophages, triggering inflammasome-mediated IL-1β/IL-18 cytokine release (62) that is followed by neutrophil influx, exaggerated inflammation and lung tissue damage (63). These immune and lung tissue reactions to Yersinia pestis are reminiscent of those to severe SARS-CoV-2 infection (61) and warrant further insights into the immunological mechanisms of response to these unrelated pathogens.
Of additional interest is the predicted involvement of the 'IL-17 signaling pathway' (KEGG: 04657) in severe COVID-19 (Supplementary Table S8), which is supported by a recent study reporting T cell skewing towards Th17, a specialized CD4+ effector T cell lineage characterized by secretion of IL-17 and IL-17F cytokines, in patients with COVID-19 pneumonia (64). We also explored the REACTOME option of OnTheFly 2.0 to map and analyze biological pathways that are over-represented in the validation example. As shown in Supplementary Table S9, several cytokine pathways were predicted to be significantly associated with biomarkers of severe COVID-19. We note that predicted REACTOME pathways included 'cellular senescence' despite the absence of specific references to this biological term in any of the six annotated meta-analysis reports under study. In line with this prediction, COVID-19 pneumonia has recently been associated with immunosenescence (64) and accelerated aging of pneumocytes (65). Overall, the aforementioned analyses underscore the practical utility of OnTheFly 2.0 for rapidly extracting biological information from texts and hence assisting knowledge discovery (Figure 4).

OnTheFly 2.0 has been redeveloped to use current technologies and overcome many of the problems of its predecessor (66). The GUI has been completely rewritten so that it no longer relies on a Java applet and instead uses R, Shiny, CSS, HTML and JavaScript technologies. The backend document format conversion has also been considerably improved, replacing commercial Windows-based converters with open-source, Unix-based ones, which furthermore do a much better job of preserving the original document layout. Moreover, compared to its predecessor, OnTheFly 2.0 can identify a broader spectrum of term types and supports OCR technology for processing images.
Uploaded files are stored only temporarily on the OnTheFly 2.0 server for parsing, and no file backups, copies or personal data are kept. A more detailed comparison between OnTheFly 1.0 and OnTheFly 2.0 is presented in Table 1.

OnTheFly 2.0 is a powerful tool for identifying terms in locally stored documents ranging from texts and PDFs to Office and image files. Users can identify terms such as proteins, genes, chemical compounds, organisms, tissues, environments, diseases, phenotypes and gene ontology terms and perform functional enrichment and network analysis upon selecting a set of biomedical entities. Furthermore, popup windows with informative summaries about a term and its links to external repositories are also generated. OnTheFly 2.0 can aid researchers in annotating locally stored documents and further exploring and analyzing their identified biomedical entities in a fully automated way. We believe that, owing to its capabilities and ease of use, OnTheFly 2.0 will reach a broad spectrum of users ranging from experimentalists to bioinformaticians.

OnTheFly 2.0 is available at: http://bib.fleming.gr:3838/OnTheFly/ and http://onthefly.pavlopouloslab.info. The source code and instructions about the necessary dependencies can be found at https://github.com/PavlopoulosLab/OnTheFly. Supplementary Data are available at NARGAB Online.
A survey of named entity recognition and classification
Text-mining solutions for biomedical research: enabling integrative biology
Text mining resources for the life sciences
Named entity recognition and relation detection for biomedical information extraction
EXTRACT: interactive extraction of environment metadata and term suggestion for metagenomic sample annotation
PubTator: a web-based text mining tool for assisting biocuration
HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition
BioTextQuest(+): a knowledge integration platform for literature mining and concept discovery
Towards reliable named entity recognition in the biomedical domain
OGER++: hybrid multi-type entity recognition
KEGG: integrating viruses and cellular organisms
The reactome pathway knowledgebase
The MIntAct project - IntAct as a common curation platform for 11 molecular interaction databases
The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions
The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets
STITCH 5: augmenting protein-chemical interaction networks with tissue and affinity data
Exploring networks in the STRING and reactome database
Cytoscape: a software environment for integrated models of biomolecular interaction networks
Gephi: an open source software for exploring and manipulating networks
NORMA: the network makeup artist - a web tool for network annotation visualization
A survey of visualization tools for biological network analysis
A guide to conquer the biological network era using graph theory
DAVID-WS: a stateful web service to facilitate gene/protein list analysis
PANTHER version 16: a revised family classification, tree-based classification tool, enhancer regions and extensive API
WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs
Avoiding abundance bias in the functional annotation of post-translationally modified proteins
g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update)
gprofiler2 - an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res
Gene set analysis: challenges, opportunities, and future research
Gene set analysis methods: a systematic comparison
Online publishing via pdf2htmlEX
An overview of the tesseract OCR engine
Real-time tagging of biomedical entities
The environment ontology: contextualising biological and biomedical entities
NCBI taxonomy: a comprehensive update on curation, resources and tools
The BRENDA tissue ontology (BTO): the first all-integrating ontology of all organisms for enzyme sources
Human disease ontology 2018 update: classification, content and workflow expansion
The mammalian phenotype ontology: enabling robust annotation and comparative analysis
Gene ontology: tool for the unification of biology. The gene ontology consortium
The gene ontology resource: enriching a GOld mine
PubChem in 2021: new data content and improved web interfaces
RAIN: RNA-protein association and interaction networks
WikiPathways: connecting communities
CORUM: the comprehensive resource of mammalian protein complexes-2019
Proteomics. Tissue-based map of the human proteome. Science
TRANSFAC: transcriptional regulation, from patterns to profiles
The TRANSFAC project as an example of framework technology that supports the analysis of genomic regulation
miRTarBase 2020: updates to the experimentally validated microRNA-target interaction database
The human phenotype ontology in 2021
Pfam: the protein families database in 2021
The InterPro protein families and domains database: 20 years on
DISEASES: text mining and data integration of disease-gene associations
UniProt: the universal protein knowledgebase in 2021
Hematologic, biochemical and immune biomarker abnormalities associated with severe illness and mortality in 10
A meta-analysis of potential biomarkers associated with severity of coronavirus disease 2019 (COVID-19)
Cytokine elevation in severe and critical COVID-19: a rapid systematic review, meta-analysis, and comparison with other inflammatory syndromes
Diagnostic and prognostic value of hematological and immunological markers in COVID-19 infection: a meta-analysis of 6320 patients
Predictors of adverse prognosis in COVID-19: a systematic review and meta-analysis
Predictors of mortality in hospitalized COVID-19 patients: a systematic review and meta-analysis
COVID-19 enters the expanding network of apolipoprotein E4-related pathologies
Yersinia pestis activates both IL-1β and IL-1 receptor antagonist to modulate lung inflammation during pneumonic plague
Early host cell targets of Yersinia pestis during primary pneumonic plague
Marked T cell activation, senescence, exhaustion and skewing towards TH17 in patients with COVID-19 pneumonia
Senolytics reduce coronavirus-related mortality in old mice
OnTheFly: a tool for automated document-based text annotation, data linking and network generation