key: cord-102530-wetqqt2i
authors: Brandell, Ellen E.; Becker, Daniel J.; Sampson, Laura; Forbes, Kristian M.
title: The rise of disease ecology
date: 2020-07-17
journal: bioRxiv
DOI: 10.1101/2020.07.16.207100
sha:
doc_id: 102530
cord_uid: wetqqt2i

Disease ecology is an interdisciplinary field that has recently grown rapidly in size and influence. We described the composition and educational experiences of disease ecology practitioners and identified changes in research foci. We combined a global survey with a literature synthesis involving machine-learning topic detection. Disease ecology practitioners have diversified in the last decade in terms of gender identity and institution, with weaker diversification in terms of race and ethnicity. Topic detection analysis of over 18,500 research articles revealed research foci that have declined (e.g., HIV), increased (e.g., infectious disease in bats), and remained common (e.g., malaria ecology, influenza). The steady increase in topics such as climate change, emerging infectious diseases, and superspreaders indicates that disease ecology as a field of research will continue advancing our understanding of complex host-pathogen interactions and forms a critical and adaptable component of the global response to emergent health and environmental threats.

among these is the urgency to understand and address novel disease threats, which are rooted in natural systems but are often exacerbated by societal inequalities (Carlson & Mendenhall 2019). For example, the impact of habitat degradation on pathogen spillover is an expanding area of research that can be used to guide risk assessments and environmental policy (Patz et al. 2004).
At the same time, infrastructure has developed around disease ecology, such as a specialized National Science Foundation and National Institutes of Health funding program and conference series (Scheiner & Rosenthal 2004), which have helped to direct research effort and create networks amongst researchers.

Still, many questions remain as to the composition of disease ecology practitioners, core research foci, and how research trends have changed to meet societal needs. Answering these questions will help to improve training pathways and prioritize emphases for future research. However, understanding these complex and interrelated factors as they apply to an interdisciplinary research topic requires diverse and innovative approaches.

Here we characterize both the practitioners and field of disease ecology by addressing the following questions:
(1) Who comprises the field in terms of education, demographics, and research foci?
(2) Which are the most influential scientific articles and journals?
(3) And significantly, how have research trends emerged and changed over time? For example, do they follow global health priorities such as disease outbreaks?

To answer these questions, we surveyed self-declared disease ecologists globally and conducted a literature synthesis with machine-learning topic detection (

had to meet specific criteria using Boolean filters, including a focus on studying a pathogen or parasite, host infections (to distinguish from solely environmental persistence of microorganisms), and individual-level or higher-order dynamics (e.g., not cellular processes, with the exception of those analyzed as a population-level process). The full list of search terms is provided in the Supporting Information, alongside a set of exclusionary terms to remove similar but non-disease ecology articles.
Web of Science categories were used to narrow our search and to reduce false-positive inclusions. Finally, articles with fewer than four citations were removed as a form of quality control.

To evaluate false-positives, two authors (DJB and KMF) independently evaluated the same 100 randomly selected articles and classified them as 'disease ecology' or 'outside the field'. Papers that fell outside the field predominantly described bacterial communities, within-host behavior or adaptations, or genetics/genomics. Over 75% of the articles in the final corpus were classified as disease ecology, and consensus was strong among evaluators (94% agreement, Cohen's κ=0.84). Within-host studies were accepted if they focused on population-level processes (e.g.,

To evaluate false-negatives, we cross-validated our corpus using our survey data. Specifically, we assessed whether articles that were identified by at least two respondents as influential were present in our corpus. We calculated the proportion of papers that were included in our corpus out of the list of such articles, with the requirement that at least 70% of papers had to be included. Of the influential articles identified by survey respondents (written in ≥2 times) and restricted to journals used in building the corpus, approximately 71% (50/70) were present in the corpus. Yet the 'most influential' articles had a higher probability of being included: the corpus included 85% of articles written in four or more times, 75% of articles written in three times, and 63% of articles written in twice. We adjusted the search and exclusion terms twice using the workflow described in Figure 1

small or too large, we were unable to detect temporal variation in that topic. If j was too small or too large, the topics were not clearly defined.
For example, a topic with only five words may not be interpretable; similarly, a topic with 30 words may be too broad to assign meaning. We used i = 15 and j = 15, so our corpus was analyzed for 15 topics with 15 words each.

We used K-means clustering from the nltk Python library to construct topics, where each topic comprised 15 commonly co-occurring words. We assigned a name to each topic to describe its theme. For example, we named a topic containing immunodeficiency, HIV, patient, therapy, drug, AIDS, background, treatment, and risk an HIV topic. We gave each topic name a 'confidence' score from 1 (high) to 3 (low) reflecting our confidence in identifying the topic. In addition to topics that emerged from the literature, we also generated and assessed our own topic lists based on key research areas, such as climate change, dilution effect, superspreaders, network analysis, EIDs, bovine tuberculosis, infectious diseases in bats and rodents, and chytrid fungus (Fig. 4). To ensure topic trends were not confounded by an increase in the total number of published articles through time, we constructed a baseline topic using neutral words that should be in all disease ecology articles: analysis, study, and paper. We evaluated temporal trends in publications for each theme using generalized additive models (GAMs) fit using the mgcv package in R (Wood 2006). The proportion of words in each topic relative to all words was modeled as a binomial response using thin plate splines with shrinkage for publication year. Lastly, to assess covariation among topics, we estimated Spearman's rank correlation coefficients (ρ) at the zero-year lag.

Survey

A total of 413 self-declared disease ecologists participated in the survey. The average respondent was 36.1 years old (range 21-76, median: 34; n=348). 76.7% of participants (n=408) considered at least half of their research to fall within disease ecology.
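
The topic-trend steps described in the methods above (the share of topic words among all words per publication year, followed by Spearman's rank correlation between topic series at zero lag) can be sketched roughly as follows. This is a minimal illustration in plain Python under stated assumptions, not the authors' pipeline: the function names, the whitespace tokenizer, and the toy documents are hypothetical, and the GAM smoothing step (fit in R with mgcv in the paper) is omitted.

```python
from collections import Counter
from math import sqrt


def topic_share_by_year(docs, topic_words):
    """Share of tokens per year that belong to a topic word list.

    docs: iterable of (year, text) pairs. Tokenization here is a naive
    lowercase whitespace split (an assumption made for illustration).
    """
    topic = {w.lower() for w in topic_words}
    hits, totals = Counter(), Counter()
    for year, text in docs:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += sum(1 for t in tokens if t in topic)
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # a constant series has no defined rank correlation
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy abstracts, invented for illustration only.
    toy_docs = [
        (2000, "hiv therapy study"),
        (2000, "bat virus"),
        (2001, "bat bat ecology study"),
    ]
    print(topic_share_by_year(toy_docs, ["bat"]))
    print(topic_share_by_year(toy_docs, ["study", "analysis", "paper"]))
```

Comparing each topic's yearly share against the share of a neutral word list (here `study`, `analysis`, `paper`, following the baseline described above) is one way to check that a trend is not simply tracking the growth in total publications.
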
Participants that considered ≥75% of their research to be disease ecology were concentrated between ages 25 and 40, and most self-identified as women (60%; n=344). More broadly, 56.1% of participants identified as women (n=231), 42.5% as men (n=175), 0.7% as other (n=3), and 0.7% preferred not to say (n=3). We report on participants that chose to disclose a gender identity for results regarding gender.

Most respondents identifying as women were younger (age ≤25-35) than most respondents identifying as men (age 26-50). The youngest age category (≤25 years) was 68.9% women (n=45), and the oldest age category (60+ years) was 85.0% men (n=20). Current positions held by survey participants were: undergraduate student (1.2%; n=5), Master's student (2.9%; n=12), PhD student (24.5%; n=100), post-doctoral researcher (21.1%; n=86), faculty (39.5%; n=161), researcher (9.1%; n=37), and other (1.7%; n=7). Respondents identifying as women comprised most of each academic position except Master's student and faculty (Table S1). In general, most PhD students and post-doctoral researchers were young and identified as women. Most Master's students were young and identified as men, and most faculty were middle-aged and identified as men (Tables S1-S3, Fig. S2C). Participants that did not identify with a strict gender binary were distributed across age (≤25-50) and position categories.

important' areas were parasitology, immunology, field/laboratory techniques, microbiology/virology, and genetics/genomics/bioinformatics. Ecology was also listed as least important, suggesting that most participants considered ecology as either the most or least important area of research; however, the number of responses for the former was nearly three times greater than for the latter.

Survey respondents were asked to write in scientific journals and articles that they believed were the most influential in disease ecology (Table 1)

We compiled a list of 42 journals that at least four survey participants said were the most important in disease ecology, plus Science and Nature. We searched these 44 journals for relevant articles in the field using the algorithm described below, and our final corpus

Many of the topics that emerged from the disease ecology literature, such as malaria, influenza, and vaccination, have remained constant over time (Fig. 4B). Others, such as HIV and serology, have declined over time, whereas host-pathogen coevolution has steadily increased. These emergent topics comprised a notable portion of the disease ecology literature and were more prominent than author-selected topics. We constructed a neutral topic for comparison, which was constant through time (Fig. 4B, gray line in panels), thus validating the observed temporal changes in these topics.

Using key term searches, we next explored select topic trends: climate change, emerging infectious diseases (EIDs), the dilution effect, superspreaders, network analysis, pathogens in rodents and bats, bovine tuberculosis, and chytrid fungus in amphibians (Fig. 4B). As with emergent topics, our topic detection was sensitive to detecting changes in frequency over time,

have remained prominent foci of disease ecology, whereas an increase in a priori selected topics such as emerging infectious diseases, climate change, and effects of biodiversity loss emphasizes how this expanding field has grown to meet global health crises and priorities. Further, addressing diversity and allocating resources toward these growing topics could promote equity within disease ecology, improve training programs, designate funding opportunities, and provide the infrastructure for concentrated advancements.

Self-declared disease ecology practitioners are becoming more diverse in terms of country of education, gender identity, and institution (Fig. 1). This echoes similar trends in conservation

In general, research on epidemics tended to be responsive rather than anticipatory, such that we observed an immediate spike in publications on high-profile pathogens followed by a decline or plateau (e.g., bovine tuberculosis and chytrid fungus). Emergent topics were remarkably stable through time, with the exception of HIV and host-pathogen coevolution, which have decreased and increased, respectively. Research focusing on concepts (e.g., dilution effect, superspreaders, coevolution) or approaches (e.g., network analyses) rather than specific hosts or pathogens tended to rise more gradually and remain a notable proportion of the literature. On the other hand, mosquito-borne pathogens and influenza have been defining topics over the entire time series, which we expect to persist for the foreseeable future. Although our analysis of cross-correlation between the topic time series is associative, we observed several especially interesting relationships. In particular, publications on bat disease, chytrid fungus, climate change, the dilution effect, superspreaders, and emerging infectious diseases were all positively

Economic burden of livestock disease and drought in Northern Tanzania
Climate change and infectious diseases: From evidence to a predictive framework
The proximal origin of SARS-CoV-2
Infectious diseases of humans: dynamics and control
Population biology of infectious diseases: Part I
Quantifying the burden of vampire bat rabies in Peruvian livestock
Natural language processing with Python: analyzing text with the natural language toolkit
Probabilistic topic models
Women and science careers: leaky pipeline or gender filter?
Gender
Global trends in antimicrobial resistance in animals in low- and middle-income countries
Wildlife mortality investigation and disease research: Contributions of the USGS National Wildlife Health Center to endangered species management and recovery
Women and minorities in science, technology, engineering, and mathematics: Upping the numbers
Bats: important reservoir hosts of emerging viruses
Preparing for emerging infections means expecting new syndemics
Biodiversity inhibits parasites: broad evidence for the dilution effect
Effects of landscape heterogeneity on the emerging forest disease sudden oak death
Disentangling the interaction among host resources, the immune system and pathogens
Anthropogenic environmental change and the emergence of infectious diseases in wildlife
Seeking congruity between goals and roles: A new look at why women opt out of science, technology, engineering, and mathematics careers
When do female role models benefit women? The importance of differentiating recruitment from retention in STEM
Gender diversity of editorial boards and gender differences in the peer review process at six journals of ecology and evolution
Patterns of authorship in ecology and evolution: First, last, and corresponding authorship vary with gender and geography
An emerging disease causes regional population collapse of a common North American bat species
Ecology of infectious diseases in natural populations
Group of Eight. 2013. The Changing PhD
On the benefits of systematic reviews for wildlife parasitology
Topic modeling of major research themes in disease ecology of mammals
Statistical methods for meta-analysis
A timeline of HIV
Prevention of Population Cycles by
Ecology of wildlife diseases
Diversity in the geosciences and successful strategies for increasing diversity
Why infectious disease research needs community ecology
Global trends in emerging infectious diseases
Challenges and supports for women conservation leaders
Taming wildlife disease: bridging the gap between science and management
Effects of species diversity on disease risk
The Rise of Disease Ecology and Its Implications for Parasitology-A Review
Achieving synthesis with meta-analysis by combining and comparing all available studies
Facilitating systematic reviews, data extraction and meta-analysis with the metagear package for R
Plasmodium knowlesi: reservoir hosts and tracking the emergence in humans and macaques
The changing face of pathogen discovery and surveillance
Superspreading and the effect of individual variation on disease emergence
NLTK: the natural language toolkit
Biological invasions: a field synopsis, systematic review, and database of the literature
Population biology of infectious diseases: Part II
The educational benefits of diversity: Evidence from multiple sectors
National Science Foundation. 2020. Women, Minorities, and Persons with Disabilities in
Automated content analysis: addressing the big literature challenge in ecology and evolution
A guide to conducting a standalone systematic literature review
Host and viral traits predict zoonotic spillover from mammals
Unhealthy landscapes: Policy recommendations on land use change and infectious disease emergence
Ecological responses to altered flow regimes: a literature review to inform the science and management of environmental flows
Recognizing the benefits of diversity: When and how does diversity increase group performance?
Global expansion and redistribution of Aedes-borne virus transmission risk with climate change
Amphibian fungal panzootic causes catastrophic and ongoing loss of biodiversity
Ecology of infectious disease: Forging an alliance
Leaks in the pipeline: Separating demographic inertia from ongoing gender differences in academia
Ecological interventions to prevent and manage zoonotic pathogen spillover
The influence of feeding behaviour and temperature on the capacity of mosquitoes to transmit malaria
Interactions between HIV/AIDS and the environment: Toward a syndemic framework
Press release: Infectious diseases kill over 17 million people a year: WHO warns of global crisis
The top 10 causes of death
UNESCO Institute for Statistics (UIS) Higher Education. 2020
Heterogeneity in pathogen transmission: mechanisms and methodology
Quantifying the impact of human mobility on malaria
Wildlife Disease Ecology: Linking Theory to Data and Application
Generalized Additive Models: An Introduction with R
Evaluating ecological restoration success: a review of the literature