key: cord-0500395-kxleikdh authors: Park, Michael; Leahey, Erin; Funk, Russell title: The decline of disruptive science and technology date: 2021-06-21 journal: nan DOI: nan sha: 8afbd4336f685a1b28a22c408f7212d0bc2cde2c doc_id: 500395 cord_uid: kxleikdh

Although the number of new scientific discoveries and technological inventions has increased dramatically over the past century, there are growing concerns that progress is slowing. We analyze 25 million papers and 4 million patents across 6 decades and find that science and technology are becoming less disruptive of existing knowledge, a pattern that holds nearly universally across fields. We link this decline in disruptiveness to a narrowing in the utilization of existing knowledge. Diminishing quality of published science and changes in citation practices are unlikely to be responsible for this trend, suggesting that this pattern represents a fundamental shift in science and technology.

While the past century witnessed an unprecedented expansion of scientific and technological knowledge, there are concerns that innovative activity is slowing (Jones, 2009; Gordon, 2016; Chu and Evans, 2021). Studies document declining research productivity in semiconductors, pharmaceuticals, and other fields (Pammolli et al., 2011; Bloom et al., 2020). Papers, patents, and even grant applications have become less novel and less likely to connect disparate areas of knowledge, both of which are precursors of innovation (Packalen and Bhattacharya, 2020; Jaffe and Lerner, 2011). The gap between the year of discovery and the awarding of a Nobel Prize has also increased (e.g., Horgan, 2015; Collison and Nielsen, 2018), suggesting that today's contributions may not measure up to the past. Numerous explanations for this slowdown have been proposed. Some point to a dearth of "low hanging fruit," as the easier innovations have already been produced (Cowen, 2011; Gordon, 2016).
Others suggest the decline is due to an increasing burden of knowledge; scientists and inventors require more training to reach the frontier of their field, leaving less time for making breakthroughs (e.g., Jones, 2009). Yet much remains unknown, not merely about the causes of slowing innovative activity, but also about the depth and breadth of the phenomenon. To date, the evidence pointing to a slowdown is based on studies of particular fields, using disparate and domain-specific metrics (Pammolli et al., 2011; Bloom et al., 2020), making it difficult to know whether the changes are happening at similar rates across areas of science and technology. Little is also known about whether the patterns seen in aggregate indicators mask differences in the degree to which individual works push the frontier. We address these gaps in knowledge by analyzing 25 million papers in the Web of Science ("WoS data") and 4 million patents in the United States Patent and Trademark Office's Patents View database ("USPTO data"). The WoS data include 159 million citations and 28 million paper titles and abstracts. The USPTO data include 18 million citations and 6 million patent titles and abstracts. Using these data, we join a novel citation-based measure (Funk and Owen-Smith, 2017) with textual analyses of titles and abstracts to understand whether papers and patents forge new directions over time and across fields. To characterize the nature of innovation, we draw on foundational theories of scientific and technological change (Schumpeter, 1942), which distinguish between two types of breakthroughs. First, some contributions improve existing streams of knowledge, and therefore consolidate the status quo. For example, Kohn and Sham (1965) ("KS"), a Nobel-winning paper, utilized established theorems to develop a method for calculating the structure of electrons, which cemented the value of prior research.
Second, some contributions disrupt existing knowledge, rendering it obsolete, and propelling science and technology in new directions. For example, Watson and Crick (1953) ("WC"), also a Nobel winner, introduced a model of the structure of DNA that superseded previous approaches (e.g., Pauling's triple helix). KS and WC were both important, but their implications for scientific and technological change were different. To quantify this distinction, we utilize a measure, the CD index, which characterizes the consolidating/disruptive nature of science and technology based on citation networks (fig. 1). The intuition is that if a paper or patent is disruptive, the subsequent work that cites it is less likely to also cite its predecessors; for future researchers, the ideas that went into its production are less relevant (e.g., Pauling's triple helix). If a paper or patent is consolidating, subsequent work that cites it is also more likely to cite its predecessors; for future researchers, the knowledge upon which the work builds is still (and perhaps more) relevant (e.g., the theorems KS used). The CD index ranges from -1 (most consolidating) to 1 (most disruptive). We measure the CD index five years after the year of publication (indicated by CD 5). For example, WC and KS both received over a hundred citations within five years of publication. However, the KS paper has a CD 5 of -0.22 (indicating consolidation), whereas the WC paper has a CD 5 of 0.62 (indicating disruption). The CD index has been validated in prior research, including with expert assessments (Funk and Owen-Smith, 2017; Wu et al., 2019). Across fields, we find the rate of disruptive work is declining. Fig. 2 plots the average CD 5 over time for papers (fig. 2A) and patents (fig. 2B).
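The intuition behind the CD index can be made concrete with a small sketch. It assumes the form of the index commonly attributed to Funk and Owen-Smith (2017): each later work that cites the focal work, its predecessors, or both contributes +1 (cites the focal work only), -1 (cites both), or 0 (cites predecessors only), and the contributions are averaged. The function name `cd_index`, the dictionary layout, and the toy identifiers are illustrative assumptions, not the authors' implementation. The caller is assumed to have already restricted `citing` to works published within five years of the focal work.

```python
def cd_index(focal, references, citing):
    """Sketch of a CD-style index for one focal work.

    focal      -- identifier of the focal paper or patent
    references -- identifiers of the works the focal work cites (its predecessors)
    citing     -- dict mapping each later work to the set of works it cites
                  (assumed to be pre-filtered to the measurement window)
    """
    predecessors = set(references)
    total = 0   # running sum of +1 / -1 contributions
    count = 0   # later works citing the focal work and/or a predecessor
    for cited in citing.values():
        cites_focal = focal in cited
        cites_predecessor = bool(predecessors & cited)
        if not (cites_focal or cites_predecessor):
            continue  # unrelated work, not part of the index
        count += 1
        if cites_focal:
            # Citing the focal work alone signals disruption (+1);
            # citing it together with a predecessor signals consolidation (-1).
            total += -1 if cites_predecessor else 1
        # Works citing only predecessors add 0 but still enlarge the denominator.
    return total / count if count else None


# Hypothetical toy network: F is the focal work, P1 and P2 are its references.
later_works = {
    "a": {"F"},
    "b": {"F"},
    "c": {"F", "P1"},
    "d": {"P2"},
}
score = cd_index("F", ["P1", "P2"], later_works)
```

In this toy network two works cite only the focal work, one cites both the focal work and a predecessor, and one cites only a predecessor, so the contributions sum to 1 over 4 works.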
For papers, the decrease (1945-2010) ranges from 91.9% (Social Science) to 100% (Physical Science); for patents, the decrease (1980-2010) ranges from 93.5% (Computers and Communications) to 96.4% (Drugs and Medical). We verify the decline in disruptiveness over time through regression models, which show that year has a statistically significant and negative relationship with CD 5 for papers (p < 0.0001) and patents (p < 0.0001). These declines demonstrate that, relative to earlier eras, recent papers and patents do less to push fields in new directions in a way that surpasses prior work. The similarity in trends we observe across fields is noteworthy in light of "low hanging fruit" arguments (Cowen, 2011; Gordon, 2016), which would likely predict greater heterogeneity in the decline, as it seems unlikely that fields would "consume" their low hanging fruit at similar rates or times. The trends also hold when using alternative indicators. Because they create departures from the status quo, disruptive papers and patents are likely to introduce new words. Therefore, if disruptiveness is declining, we should expect a decline in the diversity of words used in science and technology. To evaluate this, fig. 3A and C document the lexical diversity, based on the type-token ratio (i.e., unique/total words), of paper and patent titles, respectively, over time (S1). We observe substantial declines. For paper titles (fig. 3A), the decrease (1945-2010) ranges from 76.5% (Social Science) to 88% (Technology); for patent titles (fig. 3C), the decrease (1980-2010) ranges from 32.5% (Chemical) to 81% (Computers and Communications). For paper abstracts (fig. S1A), the decrease (1992-2010) ranges from 23.1% (Life Science and Biomedicine) to 38.9% (Social Science); for patent abstracts (fig. S1B), the decrease (1980-2010) ranges from 21.5% (Mechanical) to 73.2% (Computers and Communications). A decline in disruptive activity is also apparent in the particular words used.
If disruptiveness is declining, then verbs alluding to the creation, discovery, or perception of new things ("disruptive" words) should be used less over time, whereas verbs alluding to the improvement, application, or assessment of existing things ("consolidating" words) may be used more frequently. Fig. 3 shows changes in the most common verbs in paper (fig. 3B) and patent titles (fig. 3D) by decade. We find a relative decrease in disruptive words (blue) and an increase in consolidating words (red), as classified by a panel of reviewers (S2). Consider the verb "produce," suggestive of disruptive work, which appeared in the title of a Nobel-winning paper (Ingle and Kendall, 1937) that used cortin to produce atrophy of the adrenal cortex in rats. This paper has a CD 5 of 0.56, reflecting its disruptive tendency. In patent titles, use of this verb dropped in three of four decades (fig. 3D); a similarly steep decline exists in paper titles (table S1). Conversely, "use" is indicative of consolidating work (e.g., using existing knowledge). The incidence of this verb increased in patent titles (fig. 3D) and underwent one of the greatest jumps in utilization in both paper and patent titles (table S1). For example, "use" appears in the title of the Nobel-winning paper "Understanding, improving, and using green fluorescent proteins" (Cubitt et al., 1995) to indicate the improvement and application of a previously studied compound. Accordingly, the paper has a CD 5 of -0.09. Overall, these results further suggest that science and technology have become less disruptive. The aggregate trends we document mask considerable heterogeneity in disruptiveness and remarkable stability in the absolute number of highly disruptive works (S3). This result suggests that the persistence of major breakthroughs (e.g., measurement of gravity waves, mRNA COVID-19 vaccines) is not inconsistent with concerns about slowing innovative activity.
In short, declining aggregate disruptiveness does not preclude the possibility of individually highly disruptive works. What is driving the decline in disruptiveness? Earlier, we suggested our results are not consistent with explanations that tie slowing innovative activity to diminishing "low-hanging fruit." S4 further shows that the decline in disruptiveness is unlikely to be due to other field-specific factors by decomposing variation in CD 5 attributable to field, author, and year effects. Declining rates of disruptive activity are also unlikely to be due to a reduction in the quality of science and technology (e.g., Jaffe and Lerner, 2011; Ioannidis, 2005). If they were, then the patterns we see in fig. 2A and B should be less visible in high quality work. However, when we restrict our sample to articles published in premier publication venues like Nature, PNAS, and Science or to Nobel-winning discoveries (S5), the downward trend holds (Nobel data from Li et al., 2019). Furthermore, the trend is not driven by characteristics of the WoS and USPTO data; we observe similar trends across four additional corpora: JSTOR, American Physical Society, Microsoft Academic Graph, and PubMed (S6). Declines in disruptiveness are also not attributable to citation practices (S7). Given that the CD 5 is based on citations, we use Monte Carlo simulations to randomly rewire the observed citation networks for papers and patents. The algorithm preserves several properties of the underlying networks, including the number of citations to and from each work and the age of citing/cited works. We find that observed CD 5 values are generally lower than those from the simulated networks (fig. S6): the observed and simulated CD 5 measures are statistically different for papers (Kolmogorov-Smirnov statistic=0.3903, p < 0.001) and patents (Kolmogorov-Smirnov statistic=0.1870, p < 0.001). This suggests that the decline in the CD 5 is unlikely to be driven by changing citation practices.
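One way such a rewiring could be carried out is sketched below. The text specifies only which properties are preserved, not the algorithm itself, so this is an assumption: a standard edge-swap procedure in which two citations are exchanged only when both citing works share a publication year and both cited works share a publication year. Each swap then leaves every work's number of outgoing and incoming citations, and the age of each citing/cited pair, unchanged. The function name `rewire`, its parameters, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def rewire(edges, year, n_swaps, seed=0):
    """Randomly rewire a citation network by swapping cited works.

    edges   -- list of unique (citing, cited) pairs
    year    -- dict mapping each work to its publication year
    n_swaps -- number of swap attempts (rejected attempts are not retried)
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)

    # Group edge positions by (citing year, cited year); swapping within a
    # group keeps the age of every citing/cited pair fixed.
    strata = defaultdict(list)
    for position, (citing, cited) in enumerate(edges):
        strata[(year[citing], year[cited])].append(position)
    groups = [positions for positions in strata.values() if len(positions) >= 2]
    if not groups:
        return edges

    for _ in range(n_swaps):
        i, j = rng.sample(rng.choice(groups), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_first, new_second = (a, d), (c, b)
        # Reject swaps that would create a self-citation or a duplicate citation.
        if a == d or c == b or new_first in present or new_second in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {new_first, new_second}
        edges[i], edges[j] = new_first, new_second
    return edges


# Hypothetical toy network: two 2000 papers citing three 1990 papers.
toy_year = {"p1": 2000, "p2": 2000, "q1": 1990, "q2": 1990, "q3": 1990}
toy_edges = [("p1", "q1"), ("p1", "q2"), ("p2", "q2"), ("p2", "q3")]
rewired = rewire(toy_edges, toy_year, n_swaps=50, seed=1)
```

Under these assumptions, the CD 5 would be recomputed on each rewired network and its distribution compared with the observed one, for example with a two-sample Kolmogorov-Smirnov test.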
We further show that the decline is not an artifact of the CD index by reporting similar patterns using alternative bibliometric measures (S8). We also considered how declining disruptiveness relates to the growth of knowledge. On the one hand, scientists and inventors face an increasing knowledge burden, which may inhibit disruptive work. On the other hand, research has also observed that "knowledge begets knowledge," an idea captured in Newton's observation, "If I have seen further it is by standing on the shoulders of Giants" (Koyré, 1952) . Therefore, using regression models, we evaluated the relationship between the stock of papers and patents (a proxy for knowledge) within fields and CD 5 (S9). Interestingly, we find a positive effect of the growth of knowledge on disruptiveness for papers, consistent with prior work (Chu and Evans, 2021) ; however, we find a negative effect for patents (table S2) . Given these conflicting results, we further considered the possibility that the availability of knowledge may differ from its utilization. The dramatic growth in publishing and patenting may lead scientists and inventors to focus on narrower slices of prior work, thereby limiting the "effective" stock of knowledge. Using three proxies, we document a decline in utilized knowledge ( fig. 4) . First, we see a decline in the diversity of work cited ( fig. 4A and D) , indicating a narrower selection of existing work is being utilized. Second, we see an increase in self-citation ( fig. 4B and E) , which suggests growing reliance on knowledge that is highly familiar to author teams. Third, the mean age of work cited is increasing ( fig. 4C and F), suggesting that scientists and inventors may be struggling to keep up with the pace of knowledge expansion and instead rely on older, familiar work. All three indicators point to a consistent story: a narrower scope of existing knowledge is being utilized. 
Results from a series of regression models suggest that utilizing less diverse work, more of one's own work, and older work is negatively associated with disruption (S10). When the range of work utilized by scientists narrows, disruptive activity declines. This paper has documented a dramatic decline in disruptiveness across science and technology. Analyses show that changes in citation practices and in the quality of publications are probably not responsible for the decline. Rather, the decline represents a substantive change, which aligns with concerns about slowing innovative activity. We document that the dramatic change in the nature of science and technology is likely attributable at least partially to scientists' reliance on a narrower set of extant research. These findings have implications for the production and evaluation of science and technology. First, to foster more disruptive work, a broader array of extant knowledge should be incorporated. Given the vast amount of work produced and researchers' limited capacity, this is a challenge, but one that collaborative and diverse teams may help meet. Second, when evaluating progress in science and technology, it is critical to consider the nature of work being produced. The sheer number of papers and patents fails to capture whether new work is pushing existing knowledge in new directions.

Figs. S1 to S8; Tables S1 to S5; References (20-27)

As noted in the main text, we complemented our bibliometric analyses with assessments of paper and patent titles and abstracts, which yield independent evidence of declining disruptiveness over time. In this supplement, we describe the methodological details of our textual analyses.

Lexical diversity

We examined changes in the diversity of words used in papers and patents over time. Our rationale for these analyses is that increases in disruption should be associated with increases in the diversity of words used in science and technology.
Disruptive discoveries and inventions create departures from the status quo, rendering their predecessors less useful. While this pattern alone may have the effect of reducing the diversity of words used, disruptive discoveries and inventions are also likely to introduce new words; part of the way that disruptive discoveries and inventions render their predecessors less useful is by introducing ideas that are more useful than those that came before. Taken together with the long memory of science and technology (i.e., even obsolete words are still occasionally used), we therefore anticipate a positive association between disruption and the diversity of words used by scientists and inventors. Thus, to the extent that our observations on decreasing disruption hold, we should see a decline in the diversity of words over time. To evaluate for such changes, we pulled all titles and abstracts for papers and patents in our sample from Web of Science and Patents View. For titles, there was very little missing data in either Web of Science or Patents View, with titles absent in fewer than 0.01% of cases in both the former and the latter. For abstracts, Patents View also provides highly complete coverage, with only 0.32% of cases missing. Web of Science has less robust coverage of abstracts before the early 1990s; from 1945-1991, only 4.45% of papers in our sample include abstracts. Coverage is much better in later years; from 1992-2010, abstracts are included for 90.85% of papers. We therefore limit our analyses of abstract data from WoS to the 1992-2010 period. After extracting paper and patent titles and abstracts, we completed a series of processing steps using spaCy, an open-source, state-of-the-art Python package for natural language processing. To begin, we tokenized each title and abstract. 
From the resulting lists of tokens, we then excluded those that were tagged by spaCy as stop words, tokens consisting only of digits or punctuation, and tokens that were shorter than three characters or longer than 250 characters in length. Next, we converted all remaining tokens to their lemmatized form and converted all letters to lowercase. Finally, we aggregated the resulting lists of tokens to the subfield × year level, separately for papers and patents and for titles and abstracts. We evaluate changes in the diversity of words used over time by computing, for each subfield × year observation, the type-token ratio, a common measure of lexical diversity. The type-token ratio is defined as the ratio of unique words to total words. We compute this measure separately for papers and patents and for titles and abstracts, at the level of the Web of Science research area (for papers) and NBER technology category (for patents). More specifically, for each field (i.e., research area or technology category) and each year, we divide the number of unique words appearing in titles by the total number of words appearing in titles (fig. 3). We repeat this step for the number of unique and total words appearing in abstracts (fig. S1). The measure attains its theoretical maximum when every word is used exactly once. Thus, higher values indicate greater diversity.

Linguistic change

We also examined changes in the specific words used in papers and patents over time. Our rationale for these analyses is that the changes we observe in the CD index are likely to coincide with changes in approaches to discovery and invention, particularly the orientation of scientists and inventors towards prior knowledge. For example, to the extent that disruption is decreasing over time, it seems plausible that we will also observe decreases in words indicating the creation, discovery, or perception of new things.
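As a brief aside on the lexical diversity measure defined in the preceding subsection, the pooling and ratio steps can be sketched as follows. This is a simplified illustration under stated assumptions: it uses only the Python standard library, so the stop-word removal and lemmatization that the text performs with spaCy are omitted, and the "digits or punctuation" rule is approximated by dropping any token that contains no letter. The names `keep_token` and `type_token_ratio` and the sample records are hypothetical.

```python
from collections import defaultdict


def keep_token(token):
    """Approximate the token filters described in the text (assumption:
    stop-word removal and lemmatization are handled elsewhere)."""
    has_letter = any(ch.isalpha() for ch in token)
    return has_letter and 3 <= len(token) <= 250


def type_token_ratio(records):
    """Compute unique/total words for each field x year cell.

    records -- iterable of (field, year, tokens) tuples, one per title
               or abstract
    """
    pooled = defaultdict(list)
    for field, year, tokens in records:
        # Pool lowercased tokens from every document in the same cell.
        pooled[(field, year)].extend(t.lower() for t in tokens if keep_token(t))
    return {
        cell: len(set(words)) / len(words)
        for cell, words in pooled.items()
        if words  # skip cells left empty after filtering
    }


# Hypothetical titles, already tokenized.
sample = [
    ("Physics", 2000, ["quantum", "field", "theory"]),
    ("Physics", 2000, ["Quantum", "gravity", "of", "42"]),
]
ratios = type_token_ratio(sample)
```

In the sample, five tokens survive filtering and four of them are distinct, so the ratio for the single cell is 4/5.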
Similarly, it is also plausible that we will observe concomitant increases in the use of words that are more indicative of improvement, application, or assessment of existing things, which, consistent with the notion of consolidation, may reinforce existing streams of knowledge. To evaluate changes in the utilization of specific words over time, we followed an approach similar to that described above in our analyses of lexical diversity, using similar samples of papers and patents and preprocessing steps.

Figure S1: Lexical diversity of paper and patent abstracts over time. This figure shows changes in lexical diversity (as measured by the type-token ratio) over time for the abstracts of papers (A) and patents (B). For papers, lines correspond to Web of Science research areas; for patents, lines correspond to NBER technology categories. For paper abstracts, lines begin in 1992 because Web of Science does not reliably record abstracts for papers published prior to the early 1990s. (Horizontal axis: Year, 1950 to 2010.)

To simplify the presentation, we limit our attention to words appearing in paper and patent titles, for which, as noted previously, we have more complete data. However, the patterns we report below are also observable in analyses using paper and patent abstracts. Prior work has studied word frequencies in paper and patent titles extensively, and they are generally thought to provide a good window into the nature of science and technology (e.g., Milojević, 2015). For the present analyses, during preprocessing, we also assigned a part of speech tag to each lemma, after which we extracted all nouns, verbs, adjectives, and adverbs, which we anticipated would provide the most meaningful insights. At this stage, our data consisted of counts of lemmas by part of speech appearing in the titles of sample papers and patents.
To facilitate analysis, we subsequently reshaped the data in a long-panel format, separately for papers and patents, where each row was uniquely identified by a document id × part of speech × token. We then conducted two complementary assessments of changes in the specific words used in papers and patents over time. First, we examined changes in the top 30 most frequently used words in paper and patent titles by decade. For patents, we present these word frequencies for the years 1980, 1990, 2000, and 2010 ; for papers, our time series is longer, and therefore we present frequencies for every other decade (i.e., 1950, 1970, 1990, 2010) . Second, to complement this assessment, we also examined the top 30 words that underwent the greatest change (either positive or negative) in utilization over the period of our study, again separately by part of speech (verbs, nouns, adverbs, and adjectives) and for papers and patents. To identify these words, we created, for each token × part of speech observation, a panel tracking annual utilization by papers or patents (i.e., the proportion of papers or patents in which the focal token × part of speech appeared for each year). Subsequently, we computed the Spearman rank correlation between this measure of utilization and the year of publication (for papers, grant year for patents). Next, we dropped all token × part of speech observations for which the p-value for the Spearman correlation was >0.05. In addition, to eliminate idiosyncratic terms, we excluded token × part of speech observations that appeared in fewer than 1,000 papers or patents. Finally, because we are interested in changing approaches to science and technology (rather than changing substance), topical terms were excluded from the table. 
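The utilization panel and rank correlation described above can be sketched in a few lines. The sketch uses only the Python standard library and computes the Spearman coefficient as the Pearson correlation of average ranks; it does not compute the p-value used for the 0.05 filter, which in practice would typically come from a statistics library. The function names, the data layout, and the toy documents are illustrative assumptions.

```python
from collections import Counter


def utilization(docs, token):
    """Share of documents per year whose title contains the token.

    docs -- iterable of (year, set_of_tokens) pairs
    """
    totals, hits = Counter(), Counter()
    for year, tokens in docs:
        totals[year] += 1
        hits[year] += token in tokens
    years = sorted(totals)
    return years, [hits[y] / totals[y] for y in years]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; returns None if either series is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return None
    return cov / (sx * sy)


# Hypothetical titles: "use" rises and "produce" falls across three years.
toy_docs = []
for yr, n_use in [(1980, 1), (1990, 2), (2000, 3)]:
    for k in range(4):
        toy_docs.append((yr, {"use"} if k < n_use else {"produce"}))

years, use_share = utilization(toy_docs, "use")
rho_use = spearman(years, use_share)
rho_produce = spearman(*utilization(toy_docs, "produce"))
```

A positive coefficient marks a word whose utilization rises with time and a negative one marks a decline; the minimum-document threshold and the exclusion of topical terms would be applied as separate filters.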
Topical terms were defined as those relating to specific chemicals (e.g., "ammonium") and drugs (e.g., "penicillin"), medical conditions (e.g., "jaundice") and procedures (e.g., "ultrasound"), organisms (e.g., "fowl") and organism parts (e.g., "ureter"). To simplify the presentation and conserve space, we focus our reporting on the results for verbs, which also generally yielded more substantively interesting patterns (the most frequent nouns were often topical in nature; the most frequent adverbs and adjectives tended to be general/stop words). However, the substantive conclusions we observe for verbs are also visible with the other parts of speech, several examples of which we highlight below. Furthermore, to understand whether certain verbs are more likely to be used in the titles of disruptive or consolidating work, we conducted a survey among researchers familiar with the literature in Science of Science (see S2). Results from our assessments of changes in the most common words by decade are shown in fig. 3B and D, separately for both papers and patents. Words colored in blue indicate those considered to be disruptive according to our survey results. Disruptive words are verbs more closely related to processes such as the creation, discovery, and perception of new things. Words colored in red indicate those considered to be consolidating according to our survey results. Consolidating words are verbs more closely related to processes such as the assessment, improvement, and application of existing things. Beginning with paper titles, and consistent with our expectations, we observe a decrease in the relative number of disruptive words (blue) and an increase in the relative number of consolidating words (red). The number of disruptive words stayed constant across two of the three decade transitions (1950 to 1970 and 1970 to 1990) and decreased from three to one once (1990 to 2010). 
By contrast, the number of consolidating words decreased once (1950 to 1970) , but increased twice thereafter (1970 to 1990 and 1990 to 2010) , where the latter increase was a dramatic one, from five to nine consolidating verbs. Furthermore, consolidating words seem to be used with greater frequency over time. For example, the consolidating word "base" appears in three of the decades, and increases in usage each time (1970 to 1990 and 1990 to 2010) . In addition, even though the consolidating word "associate" only appears in the final decade, it is used more frequently than any other word from prior decades (i.e., 1.15 is higher than the highest frequency of any word in the past decades). We see a similar trend of decreasing usage of disruptive words and increasing usage of consolidating words in patent titles. The relative number of disruptive to consolidating words appearing on the list is generally consistent over time. However, the frequency of disruptive words is decreasing while the frequency of consolidating words is increasing. For example, the disruptive word "make," which appears in all four decades represented in our data, decreases in use in each subsequent decade. Similarly, "produce" appears across all four decades and registers a decrease across two of three decadal transitions (1990 to 2000 and 2000 to 2010) . Conversely, the consolidating words "use" and "control" both appear in all four decades, and increase in use in each subsequent decade. Similarly, the consolidating word "have" appears in all four decades and shows an increase across two of the three transitions (1980 to 1990 and 1990 to 2000) . Thus, we observe a decrease in the frequency of processes related to disruptive work and a simultaneous increase in the frequency of processes related to consolidating work. These results are consistent with our argument that there is a decrease in disruptive science and technology. 
Next, we turn to the results of our analysis of words undergoing the greatest changes in utilization, presented in table S1. Overall, our findings using this approach align with those just reported. The results here indicate an overall decrease in the frequency of processes related to disruptive work (i.e., a decrease in the frequency of verbs associated with the creation, discovery, and perception of new things). Moreover, there is a general increase in the frequency of processes related to consolidating work (i.e., an increase in the frequency of verbs associated with the assessment, improvement, and application of existing things). In particular, in the list of words that are increasing in utilization, the number of disruptive words is less than the number of consolidating words for both papers (11 disruptive words to 13 consolidating words) and patents (7 disruptive words to 13 consolidating words). For example, verbs related to processes of consolidation, such as "use," "base," and "update," are increasingly utilized in both papers and patents. Similarly, among words that are decreasing in utilization, the appearance of disruptive words is higher than consolidating ones for both papers (13 disruptive words to 9 consolidating words) and patents (10 disruptive words to 5 consolidating words). For example, verbs related to processes of disruption (e.g., "substitute," "attack," and "separate") seem to be decreasing in utilization across papers and patents. These results indicate that relative to the usage of consolidating words, the usage of disruptive words is increasing less and decreasing more. Overall, based on the patterns in the frequencies of particular verbs appearing in titles, we conclude that the processes related to disruptive work seem to be decreasing while the processes related to consolidating work seem to be increasing. 
Results presented in both tables are especially noteworthy when recalling that they are based on raw data, with no adjustment or transformation other than basic text preprocessing. Overall, then, these results offer compelling support for the findings we observe on the changing nature of science and technology using the CD index.

Notes: Because we are primarily interested in changing approaches to science and technology (rather than changing objects of study), topical lemmas are excluded from the table. Topical lemmas were defined as those relating to specific chemicals (e.g., "ammonium", "phosphorus", "hydrocarbon") and drugs (e.g., "penicillin", "vitamin", "barbiturate"), medical conditions (e.g., "jaundice", "encephalitis", "paralysis") and procedures (e.g., "ultrasound", "psychotherapy", "autopsy"), and organisms (e.g., "fowl", "chick", "tobacco") and organism parts (e.g., "diaphragm", "gland", "ureter").

We took the following steps to characterize verbs as consolidating or disruptive. First, two of the authors independently went through the list of all verbs used in paper and patent titles and abstracts. Using prior theory as a guide (Funk and Owen-Smith, 2017), we looked for verbs that seemed broadly related to processes of disruption or consolidation. Our objective in this phase was to be as inclusive as possible; when in doubt about the potential relevance of a particular verb, we erred on the side of including it in our list. We identified many verbs describing processes of creating, discovering, or perceiving new things, which we considered to be more indicative of disruption. We also found many verbs describing processes of assessing, improving, and applying existing things, which we considered to be more indicative of consolidation. Following this phase, we grouped the identified verbs into six inductively defined categories, each of which was associated with disruptive or consolidating work.
The disruptive categories were "creation," "discovery," and "perception" of new things. The consolidating categories were "assessment," "improvement," and "application" of existing things. Second, using All Our Ideas (Salganik and Levy, 2015), we generated a survey to determine which of the verbs identified in the previous phase were most strongly indicative of disruption or consolidation. All Our Ideas is an open platform that allows researchers to submit a single question and a list of potential answers (or "ideas"). The platform automatically creates a survey, associated with a designated hyperlink. We created six surveys, one for each of the six categories identified in the prior phase, asking respondents about their perceptions of the associated verbs. For example, for the category "creation," the question was "Which verb is more indicative of efforts to create knowledge/technology?" Respondents were then given two random verbs from the associated list and asked to pick one to answer the question. They were also given the option to choose neither of the two if they could not make a decision. After responding, the respondent was prompted with the same question and another random pair of verbs. The platform repeats this process indefinitely and concurrently generates a score for each verb, indicating the probability that the verb is chosen over any other verb. We distributed the survey to three researchers who are familiar with the literature on technology strategy and Science of Science. Finally, for each of the six categories, we took the 25 words that had the highest scores from the All Our Ideas surveys. The top 25 words from each of creation, discovery, and perception were all deemed more likely to appear in disruptive titles. The top 25 words from each of assessment, improvement, and application were all deemed more likely to appear in consolidating titles.
A small number of words appeared in both disruptive and consolidating categories (e.g., "derive," "automate"). For these words, we compared how often the verb appeared on each side across the six categories. For example, if a word appeared in the creation, discovery, and assessment categories (i.e., two disruptive categories and one consolidating category), we considered the word to be more closely associated with disruption. If a word appeared on both sides in an equal number of categories, we associated it with the side on which it had the higher rank. This process allowed us to designate most words on the list as characterizing either consolidation or disruption. We also considered classifying verbs using automated, machine learning techniques (e.g., word embeddings). However, we found that our manual approach, informed by human reviewers and judgement, gave more reliable and intuitive results.

Observations (and claims) of slowing progress in science and technology are increasingly common, supported not just by the evidence we report above, but also by prior research from diverse methodological and disciplinary perspectives. Yet as noted previously, there is a tension between observations of slowing progress from aggregate data on the one hand, and continuing observations of seemingly major breakthroughs in many fields of science and technology (spanning everything from the measurement of gravitational waves to the sequencing of the human genome) on the other. In an effort to reconcile this tension, we considered the possibility that while discovery and invention may be less disruptive overall over time, the high-level view taken thus far (and also in prior work) may mask considerable heterogeneity. Put differently, aggregate evidence of slowing progress does not preclude the possibility that some (smaller) subset of discoveries and inventions is highly disruptive.
The first two intervals therefore correspond to papers and patents that are relatively weakly disruptive, while the latter two correspond to those that are more strongly so (e.g., where we may expect to see major breakthroughs like some of those mentioned above). Strikingly, despite huge increases in the numbers of papers and patents published each year, we see little change in the number of highly disruptive papers and patents, as evidenced by the relatively flat red, green, and orange lines. This pattern helps to account for simultaneous observations of both aggregate evidence of slowing progress and seemingly major breakthroughs in many fields of science and technology.

S4 Is the decline driven by field, year, or author-level effects?

Our results show a steady decline in the disruptiveness of science and technology over time. Moreover, the patterns we observe are generally similar across broad fields of study, which suggests that the factors driving the decline are not unique to specific domains of science and technology. The decline could then be driven by other factors, such as the conditions of science and technology at a point in time or the particular individuals who produce science and technology. For example, exogenous factors like economic conditions may encourage research or invention practices that are less disruptive. Similarly, people of different generations may have research styles that produce less disruptive work. Therefore, we seek to understand the relative contributions of domain, time, and people to the decline of disruptiveness in science and technology. To evaluate the contribution of the three factors systematically, we conducted an analysis in which we decomposed the relative contribution of field, year, and author fixed effects to the predictive power of regression models of the CD index. The unit of analysis in these regressions is the author (or inventor) × year.
We enter field fixed effects using more granular subfield indicators (i.e., 150 WoS subjects for papers, 38 NBER subcategories for patents). For simplicity, we did not include additional covariates beyond the fixed effects in our models. Field fixed effects capture all field-specific factors that do not vary by author or year (e.g., the nature of the subject matter); year fixed effects capture all year-specific factors that do not vary by field or author (e.g., the state of communication technology); author (or inventor) fixed effects capture all author-specific factors that do not vary by field or year (e.g., the research style of individual researchers). After specifying our model, we determine the relative contribution of field, year, and author fixed effects to the overall model adjusted R² using Shapley-Owen decomposition. Specifically, given our n = 3 groups of fixed effects (field, year, and author), we evaluate the relative contribution of each set of fixed effects by estimating the adjusted R² separately for the 2^n = 8 models that use each possible subset of the predictors. The relative contribution of each set of fixed effects is then computed using the Shapley value from game theory (for more details, see Grömping, 2007). Results of this analysis are shown in fig. S3, for both papers (top bar) and patents (bottom bar). Total bar size corresponds to the value of the adjusted R² for the fully specified model (i.e., with all three groups of fixed effects). Consistent with our observations from plots of the CD index over time, we observe that for both papers and patents, field-specific factors make the lowest relative contribution to the adjusted R² (0.02 and 0.01 for papers and patents, respectively). Author fixed effects, by contrast, appear to contribute much more to the predictive power of the model, for both papers (0.20) and patents (0.17).
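The decomposition described above can be illustrated with a minimal sketch. It assumes synthetic data and plain NumPy least squares; the helper names (`one_hot`, `adjusted_r2`, `shapley_shares`) and the simulated author effects are hypothetical and are not taken from the analysis reported here. Each group of dummies is treated as a single "player," the adjusted R² is computed for every subset of groups, and the Shapley value averages each group's marginal contribution over all orderings.

```python
from itertools import combinations
from math import factorial

import numpy as np


def one_hot(codes):
    """Dummy-code an integer label vector, dropping the first level."""
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, 1:]).astype(float)


def adjusted_r2(y, blocks):
    """Adjusted R^2 of an OLS fit of y on an intercept plus the given dummy blocks."""
    n = len(y)
    X = np.hstack([np.ones((n, 1))] + list(blocks))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    rank = np.linalg.matrix_rank(X)  # number of estimable parameters
    return 1.0 - (1.0 - r2) * (n - 1) / (n - rank)


def shapley_shares(y, groups):
    """Shapley value of each predictor group, using adjusted R^2 as the payoff."""
    names = list(groups)
    k = len(names)
    # Payoff for every one of the 2^k subsets of groups.
    value = {
        frozenset(s): adjusted_r2(y, [groups[g] for g in s])
        for r in range(k + 1)
        for s in combinations(names, r)
    }
    shares = {}
    for g in names:
        others = [h for h in names if h != g]
        total = 0.0
        for r in range(k):
            weight = factorial(r) * factorial(k - r - 1) / factorial(k)
            for s in combinations(others, r):
                s = frozenset(s)
                total += weight * (value[s | {g}] - value[s])
        shares[g] = total
    return shares, value


# Synthetic author-year panel (illustrative only): the outcome is driven
# mostly by a stable author effect plus noise.
rng = np.random.default_rng(0)
n = 600
field = rng.integers(0, 5, n)
year = rng.integers(0, 10, n)
author = rng.integers(0, 40, n)
y = rng.normal(size=40)[author] + 0.3 * rng.normal(size=n)

groups = {"field": one_hot(field), "year": one_hot(year), "author": one_hot(author)}
shares, value = shapley_shares(y, groups)
print(shares)
```

By construction the shares sum to the adjusted R² of the full model (minus that of the intercept-only model, which is zero), which is why they can be drawn as segments of a single bar.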
Researchers and inventors who enter the field in more recent years may face a higher burden of knowledge and thus resort to building on narrower slices of existing work. This would generally lead to less disruptive work being produced in later years, which is consistent with our findings. The pattern is more complex for year fixed effects; year-specific factors that do not vary by field or author hold more explanatory power than field for both papers (0.01) and patents (0.16); they appear to be substantially more important for the latter than the former. Thus, if we needed to predict the disruptiveness of an individual paper or patent, we would likely do better knowing who the author or inventor was rather than the general topic (as indicated by field) or year of publication. Taken together, these findings suggest that relatively stable factors that vary across individual scientists and inventors may be particularly important for understanding changes in disruptiveness over time. The results also confirm that domain-specific factors across fields of science and technology play a very small role in explaining the decline in disruptiveness of papers and patents.

Figure S3: Contribution of field, year, and authors. This figure shows the relative contribution of field, year, and author fixed effects to the adjusted R² in regression models predicting the CD5 index. The top bar shows the results for papers; the bottom bar shows the results for patents. The results suggest that for both papers and patents, stable characteristics of authors contribute significantly to patterns of disruptiveness.

S6 Is the decline driven by our choice of data sources?

We also considered whether the patterns we document may be artifacts of our choice of data sources.
While we observe consistent trends in both the WoS and USPTO data, and both databases are widely used by the Science of Science community, it remains conceivable that our results are driven by factors like changes in coverage (e.g., journals added or excluded from WoS over time) or even data errors rather than fundamental changes in science and technology. To evaluate this possibility, we therefore calculated the CD5 index for papers in four additional databases (JSTOR, the American Physical Society corpus, Microsoft Academic Graph, and PubMed), the results of which are shown in fig. S5. Across all four additional data sources, we continue to observe a decline in disruptiveness. Because six data sources are unlikely all to be biased in the same direction, this replication suggests our findings do not stem from our reliance on the WoS and USPTO data.

We also considered whether our results may be explainable by changes in citation practices. Perhaps most critically, prior work has shown that over time, there has been a dramatic increase in the average number of citations made by papers and patents (i.e., papers and patents are citing more prior work than in previous eras) (e.g., Bornmann and Mutz, 2015). Recall that the CD index measures the degree to which future work cites a focal work together with its predecessors (i.e., references in its bibliography). Greater citation of a focal work independently of its predecessors is taken to be evidence of disruption. As papers and patents cite more prior work, however, there may be a mechanical decline in the probability of a focal work being cited independently of its predecessors; the more citations a focal work makes, the more likely future work is to cite it together with one of its predecessors, even by chance. Consequently, it seems possible that increases in the average number of citations made by papers and patents may account for declining disruption.
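To make the mechanics concrete, the following is a minimal sketch of an unweighted CD index in the spirit of Funk and Owen-Smith (2017). It assumes a toy citation dictionary; the function name `cd_index`, the data layout, and the choice to count citing works published from the focal year through five years later are illustrative assumptions rather than the implementation used for the analyses. Each later work that cites the focal work or any of its references contributes +1 if it cites only the focal work, -1 if it cites the focal work together with a predecessor, and 0 if it cites only predecessors; the index is the mean of these contributions.

```python
def cd_index(focal, citations, year, window=5):
    """Unweighted CD index of `focal` over a `window`-year horizon.

    citations: dict mapping each work to the set of works it cites.
    year:      dict mapping each work to its publication year.
    Returns None when no later work cites the focal work or its references.
    """
    predecessors = citations.get(focal, set())
    start = year[focal]
    total, n = 0, 0
    for work, cited in citations.items():
        # Assumed horizon: works published within `window` years of the focal work.
        if work == focal or not (start <= year[work] <= start + window):
            continue
        f = focal in cited                 # cites the focal work
        b = bool(predecessors & cited)     # cites at least one predecessor
        if not (f or b):
            continue
        n += 1
        total += -2 * f * b + f            # +1, -1, or 0
    return total / n if n else None


# Toy example (hypothetical works): F cites P1 and P2.
year = {"P1": 1990, "P2": 1991, "F": 2000, "A": 2001, "B": 2002, "C": 2003, "D": 2004}
citations = {
    "F": {"P1", "P2"},
    "A": {"F"},          # focal only           -> +1
    "B": {"F", "P1"},    # focal + predecessor  -> -1
    "C": {"P2"},         # predecessor only     ->  0
    "D": {"F"},          # focal only           -> +1
}
print(cd_index("F", citations, year))  # 0.25
```

In this toy setting, giving F a longer reference list can only move citing works out of the "focal only" case and into the "focal + predecessor" case, which is the mechanical effect the rewiring analysis is designed to rule out.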
To evaluate this possibility, we conducted an analysis in which we recomputed the CD5 index on randomly rewired citation graphs. We began by creating copies of the underlying citation graph on which the values of the CD5 index used in all analyses reported above were based, separately for papers and patents. For each citation graph (one for papers, one for patents), we then rewired citations using a degree-preserving randomization algorithm. In each iteration of the algorithm, two edges (e.g., A-B, C-D) are selected from the underlying citation graph, after which the algorithm attempts to swap the two endpoints of the edges (e.g., A-B becomes A-D, and C-D becomes C-B). If the degree centrality of A, B, C, and D remains the same after the swap, the swap is retained; otherwise, the algorithm discards the swap and moves on to the next iteration. When evaluating degree centrality, we consider "in-degree" (i.e., citations from other papers/patents to the focal paper/patent) and "out-degree" (i.e., citations from the focal paper/patent to other papers/patents) separately. Furthermore, we also required that the age distribution of citing and cited papers/patents be identical in the original and rewired networks. Specifically, swaps were only retained when the publication years of the original and candidate citations were the same. In light of these design choices, our rewiring algorithm should be seen as fairly conservative, as it preserves substantial structure from the original graph. There is no scholarly consensus on the number of swaps necessary to ensure the original and rewired networks are sufficiently different from one another; the rule we adopt here is 100 × m, where m is the number of edges in the network being rewired. After creating these rewired citation graphs, we then recomputed the CD5 index.
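The rewiring procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used for the analyses: the function names are hypothetical, the year constraint is interpreted as requiring the two cited works to share a publication year, the loop counts attempted rather than retained swaps, and swaps that would create a duplicate citation or a self-citation are rejected. In a directed endpoint swap the citing works keep their out-degree and the cited works keep their in-degree, so the checks that remain are the ones listed in the comments.

```python
import random
from collections import Counter


def rewire(edges, year, swaps_per_edge=100, seed=0):
    """Degree- and year-preserving rewiring of a directed citation graph.

    edges: iterable of (citing, cited) pairs with no duplicates.
    year:  dict mapping each work to its publication year.
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    m = len(edges)
    for _ in range(swaps_per_edge * m):
        i, j = rng.randrange(m), rng.randrange(m)
        (a, b), (c, d) = edges[i], edges[j]
        # Only swap cited works published in the same year (assumed reading
        # of the age constraint), so every citation keeps its citing/cited ages.
        if i == j or year[b] != year[d]:
            continue
        # Reject self-citations and duplicate edges; either would change degrees.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present.difference_update({(a, b), (c, d)})
        present.update({(a, d), (c, b)})
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def profile(edges, year):
    """Out-degrees, in-degrees, and (citing year, cited year) counts of a graph."""
    return (
        Counter(a for a, _ in edges),
        Counter(b for _, b in edges),
        Counter((year[a], year[b]) for a, b in edges),
    )


# Toy graph (hypothetical): 60 works, six per year, citing earlier years at random.
rng = random.Random(1)
year = {i: 2000 + i // 6 for i in range(60)}
original = [(a, b) for a in range(60) for b in range(60)
            if year[b] < year[a] and rng.random() < 0.1]
rewired = rewire(original, year, swaps_per_edge=20, seed=2)
print(len(original), len(set(rewired) - set(original)))
```

Comparing `profile(original, year)` with `profile(rewired, year)` confirms that in-degree, out-degree, and the joint distribution of citing and cited years are unchanged, while the identity of many individual citations differs.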
Due to the large scale of the WoS data, we base our analyses on a 10% random subsample of papers; the CD5 index was computed on the rewired graph for all patents. The results of these analyses are shown in fig. S6, separately for papers (fig. S6A) and patents (fig. S6B). Lines correspond to the ratio of the average CD5 index in the observed network to the average CD5 index in the rewired one, computed by WoS research area (for papers) and NBER technology category (for patents). Values closer to 1.0 indicate that the average CD5 index is similar to that observed in a (comparable) random network. The plots reveal a pattern of change in the CD index over and beyond that "baked in" to the changing structure of the network (cf. Funk, 2014, for similar approaches).

S8 Is the decline driven by our operationalization of the CD index?

Several recent papers have introduced alternative specifications of the CD index (Funk and Owen-Smith, 2017). In supplementary analyses, we evaluated whether the declines in disruptiveness we observe are visible using these alternative measures. First, we randomly drew 100,000 papers and 100,000 patents from our analytic sample. Then we calculated the measures of disruption presented in Bornmann et al. (see footnote 3) and Leydesdorff et al. (2020) (see footnote 4). Results are presented in fig. S7 (papers in fig. S7A and patents in fig. S7B). The blue lines indicate disruption based on Bornmann et al., and the orange lines indicate disruption based on Leydesdorff et al. (2020). Across science and technology, the two alternative measures both show declines in disruption over time, similar to the patterns observed with the CD index. Taken together, these results suggest that the declines in disruption we document are not an artifact of our particular operationalization.

Figure S7: [...] the orange lines calculate disruption using a measure proposed in Leydesdorff et al. (2020).

Footnote 3: We calculate DI^{no k}_l, where l = 1 (Bornmann et al., p. 1245).
Footnote 4: We calculate DI* (Leydesdorff et al., 2020, p. 4).

S9 Disruptiveness and the growth of knowledge

Fig. S8 shows that there has been a sharp and consistent increase in the number of new papers and patents. The rate at which new papers and patents are added each year seems to be accelerating, rapidly expanding the existing stock of knowledge and thereby potentially placing a "burden of knowledge" on scientists and inventors. Table S2 evaluates the relationship between the growth of knowledge and disruptiveness. Models 1 and 4 proxy for new knowledge based on the number of new works (papers or patents) produced in the focal year. Models 2 and 5 proxy for new knowledge based on the number of new works (papers or patents) produced in the five most recent years. Models 3 and 6 proxy for new knowledge based on the number of new works (papers or patents) produced in the ten most recent years. We find divergent results for papers and patents; for the former, there is a negative association between new knowledge and CD5; for the latter, the association is positive. This divergent pattern motivates our subsequent analyses, discussed in S10, on the utilization of knowledge and disruptiveness.

We evaluate the relationship between disruptiveness and utilization more systematically. Using regression models, we predict CD5 for individual papers and patents based on three indicators of prior knowledge utilization measured at the field level: the diversity of work cited, the mean number of self-citations, and the mean age of work cited. To account for potential confounding factors, our models included year and field fixed effects. Year fixed effects account for time-varying factors that affect all observations (papers or patents) equally (e.g., global economic trends). Field fixed effects account for field-specific factors that do not change over time (e.g., some fields may intrinsically value disruptive work over consolidating work).
In contrast to our descriptive plots, for our regression models, we define our field-level measures and adjust for field effects using the more granular 150 WoS "extended subjects" and 38 NBER technology subcategories. Tables S3 (papers) and S4 (patents) show summary statistics for variables used in the ordinary-least-squares regression models. The diversity of work cited is measured by normalized entropy, which ranges from 0 to 1. Greater values on this measure indicate a more even distribution of citations to a wider range of existing work; lower values indicate a more uneven distribution of citations to a smaller range of existing work. The tables show that normalized entropy in a given field and year averages 0.98, close to its maximum, for both science and technology, implying that overall, citations tend to be evenly distributed across a wide variety of existing work. About 16% of the papers cited by a paper are by an author of the focal paper, whereas about 7% of the patents cited by a patent are by an inventor of the focal patent. Papers tend to utilize older work and work that varies more greatly in age (measured by standard deviation) than patents. Additionally, the average CD5 of a paper is 0.04 while the average CD5 of a patent is 0.12, meaning that the average paper tends to be less disruptive than the average patent. Table S5 presents the regression results for fields across both science and technology. Models 5 and 10 present the full regression models. The models indicate a consistent pattern for both science and technology, wherein utilizing more diverse work, less of one's own work, and older work tends to be associated with the production of more disruptive work in the field. The coefficients for diversity of work cited are positive and significant for papers (0.158, p < 0.001) and patents (0.069, p < 0.001), indicating that in fields where there is more utilization of diverse work, there is greater disruption.
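As an illustration of the diversity measure, the sketch below computes a normalized Shannon entropy over the works cited within a field and year. It assumes normalization by the logarithm of the number of distinct cited works; the exact normalization used in the regressions is not spelled out here, so the function name and this choice should be read as assumptions.

```python
from collections import Counter
from math import log


def normalized_entropy(cited_works):
    """Shannon entropy of citation shares, scaled to the interval [0, 1].

    cited_works: iterable with one entry per citation (the cited work's id).
    Returns 0.0 when all citations go to a single work.
    """
    counts = Counter(cited_works)
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(k)  # divide by the maximum entropy for k works


# Citations spread evenly over four works versus concentrated on one of two.
print(normalized_entropy(["w1", "w2", "w3", "w4"]))   # even spread, close to 1
print(normalized_entropy(["w1", "w1", "w1", "w2"]))   # concentrated, about 0.81
```

Under this definition, a field-year average of 0.98 would mean that citations are spread almost evenly across the works that are cited at all.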
The CD5 of papers and patents increases by 0.16% and 0.07% relative to their mean values, respectively, when the diversity of work cited increases by one standard deviation. The coefficient of the mean number of self-citations is negative and significant for papers (-0.010, p < 0.001) and patents (-0.059, p < 0.001), showing that fields where researchers or inventors rely more on their own work tend to be less disruptive. The CD5 of papers and patents decreases by 0.07% and 0.83% relative to their mean values, respectively, when one more self-citation is made. The coefficient of the interaction between mean age of work cited and dispersion in age of work cited is positive and significant for papers (0.000, p < 0.001) and patents (0.001, p < 0.001), suggesting that fields that utilize older work, holding the dispersion of the age of work cited constant, are more likely to be disruptive. The CD5 of papers and patents increases by 0.21% and 0.24% relative to their mean values, respectively, when the mean age of work cited increases by one standard deviation. In summary, the regression results suggest that changes in utilization practices may lead to the production of less disruptive science and technology.

Notes: This table evaluates the relationship between different measures of utilization of prior scientific and technological knowledge and the CD5 index. Estimates are from ordinary-least-squares regressions. Robust standard errors are shown in parentheses; p-values correspond to two-tailed tests. *p<0.1; **p<0.05; ***p<0.01

Are ideas getting harder to find?
Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references
Are disruption index indicators convergently valid? The comparison of several indicator variants with assessments by peers
Slowed canonical progress in large fields of science
Science is getting less bang for its buck
The great stagnation: How America ate all the low-hanging fruit of modern history, got sick, and will (eventually) feel better. Penguin
Understanding, improving and using green fluorescent proteins
Making the most of where you are: Geography, networks, and innovation in organizations
A dynamic network measure of technological change
The rise and fall of American growth
Estimators of relative importance in linear regression based on variance decomposition
The end of science: Facing the limits of knowledge in the twilight of the scientific age
Atrophy of the adrenal cortex of the rat produced by the administration of large amounts of cortin
Why most published research findings are false
Innovation and its discontents: How our broken patent system is endangering innovation and progress, and what to do about it
The burden of knowledge and the "death of the renaissance man": Is innovation getting harder?
Self-consistent equations including exchange and correlation effects
An unpublished letter of Robert Hooke to Isaac Newton
A proposal to revise and simplify the disruption indicator
A dataset of publication records for Nobel laureates

We thank the National Science Foundation for financial support of work related to this project (grants 1829168, 1932596, and 1829302). We would also like to thank Jonathan O. Allen, Thomas Gebhart, Daniel McFarland, Staša Milojević, Raviv Murciano-Goroff, participants of the CADRE workshop at the Indiana University Network Science Institute, and reviewers of the Academy of Management Technology and Innovation Management division for their feedback.