key: cord-0501466-8soq2mbr authors: Daizadeh, Iraj title: The dynamics of United States drug approvals are persistent and polycyclic: Insights into economic cycles date: 2020-12-16 journal: nan DOI: nan sha: 8dae453786f391690c41f295f4120876326ad79e doc_id: 501466 cord_uid: 8soq2mbr

It is challenging to elucidate the extrinsic effects of social, economic, policy or other substantive changes on the dynamics of US drug approvals. Here, a novel approach, termed the Chronological Hurst Exponent (CHE), is proposed, which hypothesizes that changes in intrinsic long-range memory latent within the dynamics of time series data may be temporally associated with extrinsic variables. Using the monthly number of FDA Center for Drug Evaluation and Research (CDER) approvals from 1939 to 2019 as the data source, it is demonstrated that US approvals have a distinct S-shaped (trichotomized) long-range CHE structure: an 8-year (1939-1947) Stagnated (random; H of 0.5), a 27-year (1947-1974) Emergent (time-varying persistent; H between 0.5 and 0.9), and a 45-year (1974-2019) Saturated (persistent; H of 1) US Approvals Hurst Cycle. Further, dominant periodicities (resolved via wavelet analyses) are identified in the Saturated Period at 17, 8 and 4 years; thus, US approvals have been following a Juglar/Kuznets mid-term cycle with Kitchin-like bursts. This work suggests that (1) macro-factors in the Emergent Period have led to the persistent growth in US approvals enjoyed since 1974, (2) the CHE may be a valuable method to explore the constancy of extrinsic pressures on time series data, and (3) innovation-related economic cycles exist (as supported via the proxy metric of US drug approvals).

Drug discovery and development (DDD) requires economic investment to maneuver a single medicine from discovery science to market approval for a given condition or disease.
The investments cover the costs associated with acquiring both the hardware (e.g., laboratory materials and space) and software [explicit (e.g., patents) and tacit (e.g., staff dexterity) know-how] as well as executing the various DDD activities [1]. Ultimately, should an investigational candidate survive the attrition process and obtain marketing authorization (also known as marketing approval) by a health authority, a sponsor (or manufacturer) then enjoys economic rents secured from supplying the approved medicine. On the demand side, the patient receives a trusted medicine associated with a market innovation based on a new chemical or biologic entity, a cost advantage (generic), or a more efficient delivery of drug product [2].

From the early 20th century to the present, the social, economic, and political environments of drug development have evolved dramatically. For example, the amount of governmental investment in research and development (R&D) [3], the number of R&D firms [4, 5], the volume of intellectual property (e.g., patents, trademarks, as well as peer-reviewed publications) [5, 6], the number of R&D policy initiatives (see Table 1 and discussion below), and the prevalence of the R&D cluster [7] have seemingly grown synchronously and exponentially. Indeed, in the US and across industries, Daizadeh [8, 9] showed a statistically significant intercorrelation between R&D investment, the number of patent and trademark applications, peer-reviewed and media publications, and the stock price of major US indices. However, from the author's perspective, much more work is needed to better understand the historical dynamics of these variables (among others, and across jurisdictions) that may sustain and/or further expedite successful drug development (as well as development across R&D industries).
Importantly, the DDD industry is a regulated industry, requiring an objective, independent, and external agency (collectively known as a health authority (HA)) to attest to a medicine's quality, safety, and efficacy profile and to formally authorize a drug for marketing purposes in a given jurisdiction. Focusing on US activities, from a policy perspective, there has been a concomitant evolution in the number and variety of initiatives focused on providing oversight of the DDD process. As briefly presented in Table 1, mirroring the modernization in science and technology, the FDA policy environment has evolved considerably from placing specific drugs (e.g., insulin and penicillin) under regulation and describing the basic tenets of the safety sciences in the early 20th century to building a robust infrastructure pushing the frontiers of regulatory science into the 21st century.

Economic cycles, characterized by the wavelength between crests of development maxima separated by stagnation minima, are an active area of inquiry, not without controversy [10]. Juglar defined this periodicity over three phases: prosperity, crisis, and subsequent liquidation, and suggested an "approximate length of the cycle with crisis/liquidation taking 1-2 years, followed by a 6-7 year phase of prosperity [11; pp. 7]," with the transition from prosperity to crisis driven by exuberance and thus over-speculation (ibid). Kitchin derived 'minor' and 'major' inventory cycles with wavelengths of 3.5 years (40 months) and "aggregates usually of two, and less seldom of three, minor cycles," respectively [12; pp. 10]. Subsequent to the introduction of these short and intermediate cycles, Kondratieff introduced the concept of long-wave 50-60 year cycles [13]. Concomitantly, Kuznets extrapolated 15-25 year cycles derived from data from "fluctuations in rates of population growth and immigrating but, also with investment delays in building, construction, transport infrastructure, etc… [14; pp. 2]."
These perceived economic cycles were extrapolated by the original authors from a broad assortment of macro-economic data from the US and Europe, including climate, monetary, fiscal, and consumption data, among others.

Memory characteristics (also termed persistency) in the dynamics of typical econometric variables captured over time are intimately connected with cycles and thus also with the underlying processes [15]. Technically, however, characteristics such as long-range memory are challenging to analyze and interpret due (in part) to self-similarity and typical non-stationary properties, which make spurious signals hard to distinguish from true ones [16]. The Hurst constant and wavelet analyses are statistical time series tools that may be calculated in such a way as to avoid these challenges [17]. While there are other ways to define a Hurst constant, a measurement of memory, it is classically defined as $H \sim \ln[(R/S)_t] / \ln(t)$, where $R/S$ is the rescaled range (R being the range of the cumulative deviations from the mean and S the standard deviation) and t is a time window. An H=0.5, an H<0.5, and an H>0.5 indicate a random walk, an anti-persistent, and a persistent (trend-reinforcing) time series, respectively [18]. Wavelet analysis is a well-established group of time-series methods that leverage the expansion and contraction of wave functions to resolve time series properties [19].

To the author's knowledge, this work is the first investigation of the existence and evolution of persistency, and of the existence of approval cycles (akin to economic cycles), within US drug approvals, which are treated as a macro-economic variable and a proxy metric for FDA policy. This work is exploratory and empirical in nature. As presented in the Materials and Methods section below, the data source is a time series of monthly values of US drug approvals from Jan. 1939 through Dec.
2019 from the Center for Drug Evaluation and Research (CDER) of the Food and Drug Administration (FDA), which "regulates over-the-counter and prescription drugs, including biological therapeutics and generic drugs." While this is not the only institution that regulates the DDD process within the FDA, it is one that provides a publicly available, reliable and valuable source of longitudinal metrics regarding the DDD process from the dawn of the review process (1939) to the present time. The methods are standard with the exception of the Chronological Hurst Exponent, which is used to explore the persistency latent in the time series. All datasets and R Project code are provided in the Electronic Supplementary Materials section for the sake of transparency and replicability, as well as to encourage future researchers to investigate a potentially very interesting and informative aspect of drug development. This work then discusses the key results of both the descriptive and inferential statistics, followed by a discussion of how the statistical work supports the hypotheses mentioned above (viz., persistency and economic cycles are latent within US drug approvals), and the ramifications of this work, including potential linkages to sociological, economic, and policy features experienced over the 81 years of data.

The following summarizes the data sources and the statistical approaches used. This work is applied by nature and thus defers the mathematical formulae and technical discussion to the original sources, as cited. All data and the R Project code for the statistical analysis are provided in the Electronic Supplementary Materials section supporting this article for transparency and reproducibility, as well as for purposes of future work.

The data were obtained from the FDA repository accessed at https://www.accessdata.fda.gov/scripts/cder/daf/ on July 16 and July 17, 2020.
The data were culled from a monthly report described as follows: "All Approvals and Tentative Approvals by Month. Reports include only BLAs/NDAs/ANDAs or supplements to those applications approved by the Center for Drug Evaluation and Research (CDER) and tentative NDA/ANDA approvals in CDER." The total dataset comprised 181,157 approvals from Jan. 1939 until Dec. 2019 (for a total of 972 monthly observations).

As mentioned above, as this is an applied paper, reference is made to the various theoretical formulae in the respective supportive citations. Many of the distribution-inquiring statistical tests selected are considered 'standard' in the sense that they are typically used in the context described and are readily available and interpretable. All methods presented below followed standard implementation; default parameters were used (as appropriate) throughout the analyses. While the R code [20] is presented in the Electronic Supplemental Materials section of this article, the steps to perform the analysis were as follows:

I. Load US Approvals as a time series and perform descriptive statistics (including autocorrelation functions) [21; R package: 'moments']. In this step, the data are read as a time series into the R program, and descriptive statistics, including moments and serial and partial correlation functions, are calculated.

II. Assess attributes of the time series, including stationarity, seasonality, linearity, and long-range memory (persistency).

The time series of US drug approvals shows a dramatic rise from the 1970s to 2000, followed by a drastic fall and a subsequent rise (Figure 1).

< Insert Figure 1 here. Figure 1: Time evolution of total US CDER Approvals >

The distribution of the US drug approvals time series is non-normal, platykurtic and positively skewed, with a mean of 186 approvals per month (standard deviation of 191) (Tables 2 and 3).
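As an illustration of the moment-based description above, the following is a minimal base-R sketch. The function name `describe_series` and the toy counts are hypothetical and are not taken from the supplementary code, which relies on the 'moments' package; the sketch uses population (biased) moment formulas for skewness and excess kurtosis.

```r
# Moment-based descriptive statistics in base R (no 'moments' dependency).
# 'describe_series' is a hypothetical helper name used for illustration only.
describe_series <- function(x) {
  d  <- x - mean(x)          # deviations from the mean
  m2 <- mean(d^2)            # second central moment (population form)
  c(mean            = mean(x),
    sd              = sd(x),
    skewness        = mean(d^3) / m2^1.5,    # > 0 means a longer right tail
    excess_kurtosis = mean(d^4) / m2^2 - 3)  # < 0 means platykurtic
}

# Toy monthly counts (not FDA data): mostly small values with a few large ones.
toy_counts <- c(rep(0, 40), rep(5, 30), rep(50, 20), rep(400, 10))
describe_series(toy_counts)
```

A positive `skewness` together with a negative `excess_kurtosis` would correspond to the "positively skewed, platykurtic" description used in the text.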
Importantly, the time series is nonstationary, non-seasonal, and non-linear, with intrinsic persistent memory (Table 2 and Figure 2), which is removed with single differencing (that is, the time series has an order of integration (number of differences to attain stationarity) of 1, I(1)). I(1) processes are rather well-represented across a spectrum of different disciplines and a broad assortment of economic variables, including US drug approvals [34].

< Insert Table 2 here. >

Using time series analysis, this work finds two conceptually novel aspects of US drug approvals: the existence and evolution of persistency, and the existence of approval cycles (akin to economic cycles). Formally, persistency may be defined as the "rate at which its autocorrelation function decays to zero," or "the extent to which events today have an effect on the whole future history of a stochastic process." Translated to the present context, it generally means that the value of US drug approvals in a given month is closely related to its value in the prior month. The Chronological Hurst Exponent proposed herein is a simple algorithm that iteratively recalculates the Hurst exponent (a measure of persistency) over an incrementally increasing time period. With each iteration, an additional data point (here the next monthly observation of US approvals) is taken into account until the exponent of the full data set is calculated.
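The expanding-window procedure just described can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the author's supplementary code: the helper names `hurst_rs` and `che`, the `min_window` parameter, and the toy series are all hypothetical, and the Hurst estimate is the simplest single-window rescaled-range form, H ~ ln(R/S)_t / ln(t), without bias correction. The estimator actually used in the paper may differ.

```r
# Single-window rescaled-range (R/S) estimate of the Hurst exponent,
# following H ~ ln(R/S)_t / ln(t). Simplified: one scale, no bias correction.
hurst_rs <- function(x) {
  n <- length(x)
  s <- sd(x)
  if (n < 2 || !is.finite(s) || s == 0) return(NA_real_)  # undefined for constant windows
  z <- cumsum(x - mean(x))   # cumulative deviations from the window mean
  r <- max(z) - min(z)       # range of the cumulative deviations
  log(r / s) / log(n)
}

# Chronological Hurst Exponent (sketch): re-estimate H on an expanding window,
# adding one observation per iteration until the full series is used.
# 'min_window' is an assumed warm-up length, not a value from the paper.
che <- function(x, min_window = 24) {
  n <- length(x)
  h <- rep(NA_real_, n)
  if (n < min_window) return(h)
  for (k in seq(min_window, n)) {
    h[k] <- hurst_rs(x[1:k])
  }
  h
}

set.seed(1)
toy <- cumsum(rnorm(240))    # hypothetical toy series, not FDA data
h_path <- che(toy)
```

Plotting `h_path` against the observation index (for example with `plot(h_path, type = "l")`) gives the kind of chronological H curve whose shape the text goes on to interpret.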
The Chronological Hurst Exponent proposed in this work elucidated an S-shaped structure reflecting a trichotomized picture of the time evolution of persistency latent within US drug approvals: Period 1 (1939-1947; Stagnated, H of 0.5), Period 2 (1947-1974; Emergent, H between 0.5 and 0.9), and Period 3 (1974-2019; Saturated, H of 1).

Interpreting US drug approvals as an economic variable, that is, a singular outcome of several complex macro- (national), meso- (cluster), and micro- (firm) inputs such as national policy and R&D spend (government, firm), potential of future rents (individual buyer, payor), science and technology innovation (tacit (staff dexterity) and explicit (e.g., patents) know-how), and resource availability (e.g., chemicals, vials), the existence of business cycles was investigated. Several tiered periodicities (17 years, 4-8 years, and intermittent monthly/yearly) were identified within Periods 2 and 3 of the CHE.

In Period 2 (27 years; 1947-1974), 1947 was the first year with one or more approvals in most months, and it had the largest annual number of approvals since data collection began in Jan. 1939. After 1947, a general rise in the number of approvals per month and per annum is observed. It is also a period of commensurate changes to the policy and social landscape pertaining to DDD, as well as continued investment in R&D. These changes were seemingly due to the end of World War II (1939-1945), the beginning of the so-called 'Golden Age of Capitalism,' and the associated economic progress [35], with a relatively small number of economic disasters thereafter (see Figure 3 in [36]). After the 1938 Food, Drug and Cosmetic Act, no significant advances in policy occurred until the 1962 Drug Amendments (see Table 1), although there were significant milestone activities in terms of congressional review (the Kefauver Hearings dealt with pricing and market control [37]).
One could therefore speculate that it may not have been FDA activities that drove the changes in the persistency measurement, but overall increased economic activity.

The appearance of Period 3 (45 years; 1974-2019) suggests a uniform pressure on the time series. Two general explanations present themselves for such a sustained persistent alteration in the fabric of US drug approvals: some sort of substantive and lasting change (1) to accounting practices regarding US drug approvals (that is, how the source data were initially compiled and/or collected); or (2) in the scientific, social, economic, and/or legislative landscape. The former is unlikely to cause a persistent shift. To illustrate, FDA data sources state a change in department ownership in and around that time, as well as issues regarding changes from fiscal to calendar year practices. It is unlikely that either of these would have changed the time series in such a permanent manner. The latter explanation, while likely, is ill-defined, but does allow for hypothesis generation. One hypothesis that could be tested is that a significant change in the FDA regulatory landscape caused the formation of a cycle (see Table 1). From an FDA perspective, the 1960s and 1970s were a transformative vicennial [38].

Interestingly, if one considers US drug approvals strictly as an economic variable, and assumes the theory of Schumpeter's economic cycles, the identified periodicities seem to coincide with certain macro-economic periodicities, with the exception that no canonical long-term (> 40 years) periodicities were identified in this analysis (see Table 3). The periodicities began at different times and had different durations (Figure 4). The dominant periodicities of 17, 8 and 4 years recurred over the longest (45 years), medium (20 years), and short-term (intermittent) durations, respectively (Figure 5).
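To illustrate how a dominant periodicity in years can be read off a monthly series, the following base-R sketch uses a raw periodogram (`stats::spec.pgram`) as a simpler stand-in. It is not the Morlet wavelet transform used in this work and, unlike a wavelet analysis, it cannot show when a periodicity begins or ends. The function name `dominant_period_years` and the synthetic 8-year cycle are assumptions for illustration only.

```r
# Dominant period (in years) of a monthly series from a raw periodogram.
# Stand-in for illustration; the paper's analysis uses wavelets instead.
dominant_period_years <- function(x, frequency = 12) {
  y  <- ts(x, frequency = frequency)   # frequencies are then in cycles per year
  sp <- spec.pgram(y, taper = 0, detrend = TRUE, plot = FALSE)
  1 / sp$freq[which.max(sp$spec)]      # period = 1 / peak frequency
}

# Synthetic 40-year monthly series with an 8-year (96-month) cycle plus noise.
set.seed(42)
months <- 1:480
toy <- sin(2 * pi * months / 96) + rnorm(480, sd = 0.2)
dominant_period_years(toy)
```

With real approval counts, the series would typically be differenced or detrended first, since strong trends concentrate spectral power at the lowest frequencies.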
Thus, it seems that US drug approvals follow a Juglar/Kuznets mid-term cycle with Kitchin bursts. Only time will tell if a longer-term cycle (Kondratieff) emerges, irrespective of any downside pressures (such as multidecade bear cycles). A key difference between the identified approval cycles and economic cycles may be the degree of importance of the regulatory context. While potentially a coarse interpretation, without the legal requirement for market approval there would not have been a US drug approvals time series, whereas this is not the case for variables such as gross domestic product that are typically used to study economic cycles (legal regimes support, rather than define, the existence of these more traditional economic variables).

< Insert Table 3 here. >

There are extensions and limitations to any statistical analysis, especially when dealing with socio-economic variables. Examples of future investigation may include:

Hypothesis:
• One could argue that the number of US drug applications may have been a more insightful variable, as applications may be either withdrawn (by the Sponsor) or rejected (by the FDA). Unfortunately, the author could not find this dataset.
• The number of initial US drug applications or approvals for new molecular and/or biologic entities may provide additional insight into the economics of the innovative process. In this article, the total number of US drug approvals, including generics and line extensions (e.g., new indications or dosage forms), was considered, as these reflect "market innovation." That is, a sponsor would not have considered seeking an approval without a market driver of some sort.
• Data integrity and completeness: This study relies on a single source dataset from the FDA. While the author feels comfortable with the data source, there is uncertainty in how the data are collected, maintained, and presented, given the duration of data collection and the limited-to-no ability to cross-reference.
• Data transformation: The data were transformed from an irregular to a regular time structure. That is, FDA drug approvals occurred as a function of day; these data were then aggregated into monthly values to facilitate the statistical analyses. Thus, some information may have been lost in terms of structure, as there are limited statistical routines able to manage such data.
• R Project: While the R Project has been invaluable to the author, and the author checked all calculations (including utilizing more than one method to ensure veracity of results), there could still be a 'bug' in the routines utilized.
• Methods: Statistical methods are ever evolving, becoming more generalizable (model agnostic). Nonetheless, the author evaluated many approaches to ensure appropriateness of the analyses used based on the (distribution) characteristics of the data.

In the author's opinion, these data are an important artifact of R&D expenditures related to the DDD industry and therefore have interesting utility. Future investigations may consider these data and analyses to support research questions such as those related to forecasting and long-memory effects of non-stationary and non-linear data. It will be interesting to revisit these analyses on a yearly basis given the recent COVID-19 crisis and resultant economic challenges, with a hope that US drug approvals remain persistent with respect to these significant triggers.

The author extends gratitude to N.D., S.L.D., and N.L.D. for their support of the manuscript.

The author is an employee of Takeda Pharmaceuticals; however, this work was completed independently of his employment. The views expressed in this article may not represent those of Takeda Pharmaceuticals. As an Associate Editor for Therapeutic Innovation and Regulatory Science, the author was not involved in the review or decision process for this article.
See Electronic Supplementary Materials for all data and methods to replicate (or extend) the results presented herein.

* Some tests require stationary data. As the number of differences required to obtain a stationary series from the original time series was 1, the first difference was used in the specific tests so marked.

The FDA website https://www.accessdata.fda.gov/scripts/cder/daf/ was accessed on July 16 and July 17, 2020. The data were culled from a monthly report described as follows (see Figure 1): "All Approvals and Tentative Approvals by Month. Reports include only BLAs/NDAs/ANDAs or supplements to those applications approved by the Center for Drug Evaluation and Research (CDER) and tentative NDA/ANDA approvals in CDER. The reports do not include applications or supplements approved by the Center for Biologics Evaluation and Research (CBER). Approvals of New Drug Applications (NDAs), Biologics License Applications (BLAs), and Abbreviated New Drug Applications (ANDAs), and supplements to those applications; and tentative approvals of ANDAs and NDAs." Upon entry into the data repository via the website, the number of approvals from Jan. 1939 to Dec. 2019 was then determined by month (see Figure 2). The values were placed in Excel and then exported as a comma-delimited CSV file for input into the data analysis routine.
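The export-and-load step described above can be sketched in base R as follows. The file layout, the column names `month` and `approvals`, and the simulated counts are assumptions for illustration; the actual supplementary file and its columns are not reproduced here. A temporary file stands in for the exported CSV so that the sketch is self-contained.

```r
# Build a stand-in CSV (hypothetical layout: one row per month) with a trailing
# empty row, mimicking the stray NAs that a spreadsheet export can leave behind.
set.seed(7)
month_starts <- seq(as.Date("1939-01-01"), as.Date("2019-12-01"), by = "month")
toy <- data.frame(month     = format(month_starts, "%Y-%m"),
                  approvals = rpois(length(month_starts), lambda = 5))
csv_path <- tempfile(fileext = ".csv")
write.csv(rbind(toy, data.frame(month = NA, approvals = NA)),
          csv_path, row.names = FALSE)

# Read the file, drop incomplete rows, and convert to a monthly time series.
input     <- na.omit(read.csv(csv_path, header = TRUE))
approvals <- ts(input$approvals, start = c(1939, 1), frequency = 12)
```

Setting `frequency = 12` with `start = c(1939, 1)` lets downstream routines treat the series as monthly data beginning in January 1939.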
#Step 1: Load data, convert to time series, perform descriptive statistics, and autocorrelation
Input <-read.csv(file="c:\Users/pzn6811/OneDrive -Takeda/Desktop/GLOC/read.csv", header=T, sep= [...]
[...] omit(Input) #excel seems to have some NAs at the end of column
time<-ts [...]

** Teraesvirta's neural network test ** Null hypothesis: Linearity in "mean". X-squared = 227.9227, df = 2, p-value = 0
** Keenan's one-degree test for nonlinearity ** Null hypothesis: The time series follows some AR process
** McLeod-Li test ** Null hypothesis: The time series follows some ARIMA process
** Tsay's Test for nonlinearity ** Null hypothesis: The time series follows some AR process. F-stat = 2.733688, p-value = 6 [...]
** Likelihood ratio test for threshold nonlinearity ** Null hypothesis: The time series follows some AR process. Alternative hypothesis: The time series follows some TAR process

[...] 2019-12-31"), by = "month")
monthyear <-strftime(monthyear, format = "%b %Y")
c<-analyze [...]

#Continuous Morlet Wavelet Transform
library(dplR);citation("dplR")
wave.out$period <-wave.out$period/12
wavelet.plot(wave.out)
wave.avg <-data.frame(power = apply(wave.out$Power, 2, mean), period = (wave.out$period))
plot(wave.avg$period, wave.avg$power, type = "l")

#Confirm time series frequency
library(forecast);citation("forecast")
findfrequency(time) # dominant frequency is determined from a spectral analysis of the time series

A general approach for determining when to patent, publish, or protect information as a trade secret
Estimated Research and Development Investment Needed to Bring a New Medicine to Market
Chapter 2: R&D costs and returns to new drug development: a review of the evidence
Lessons from 60 years of pharmaceutical innovation
How many patents does a biopharmaceutical company need?
Can literature analysis identify innovation drivers in drug discovery
Innovating through Clusters
An intellectual property-based corporate strategy: An R&D spend, patent, trademark, media communication, and market price innovation agenda
Issued US patents, patent-related global academic and media publications, and the US market indices are inter-correlated, with varying growth patterns
Old series, new signals: The economic cycle in light of wavelet analysis
Clément Juglar and the transition from crises to business cycle theories. Paper prepared for a conference on the occasion of the centenary of the death of Clément Juglar
Cycles and Trends in Economic Factors
The long waves of economic life
Kitchin, Juglar and Kuznetz business cycles revisited (date unknown)
Medium-term cycles in the dynamics of the Dow Jones Index for the period 1985-2019 (2020)
Pitfalls in long memory research
Wavelet and rescaled range approach for the Hurst coefficient for short and long time series
A dendrochronology program library in R (dplR)
Statistical and visual crossdating in R using the dplR library
dplR: Dendrochronology Program Library in R. R package version 1
forecast: Forecasting functions for time series and linear models. R package version 8
Automatic time series forecasting: the forecast package for R
tsfeatures: Time Series Feature Extraction
Investigating Rates of Food and Drug Administration Approvals and Guidances in Drug Development: A Structural Breakpoint/Cointegration Timeseries Analysis
Stock earnings and bond yields in the US 1871-2017: The story of a changing relationship
Economic disasters: A new data set
Reform, regulation, and pharmaceuticals: the Kefauver-Harris Amendments at 50
FDA and Clinical Drug Trials: A Short History, in A Quick Guide to Clinical Trials
Since the Mid-2010s FDA Drug and Biologic Guidelines have been Growing at a Faster Clip than Prior Years: Is it Time to Analyze Their Effectiveness?