key: cord-0959707-1rikcuns authors: Hauben, Manfred; Hung, Eric title: Global and Country Level Time Series Analyses of the Effect of the COVID-19 Pandemic on Spontaneous Reporting date: 2020-12-18 journal: Clin Ther DOI: 10.1016/j.clinthera.2020.12.008 sha: bc8ae26c52f4db8991700f6bba87f9b860e49a81 doc_id: 959707 cord_uid: 1rikcuns

Purpose: The COVID-19 pandemic was widely reported to stress medical systems globally and disrupted the lives of patients and health care practitioners. Since spontaneous reporting relies heavily on both health care professionals and patients, an understandable question is whether the stress of the pandemic diminished spontaneous reporting. Herein we assess the hypothesis that the COVID-19 pandemic negatively impacted spontaneous reporting of adverse drug events.

Methods: We analyzed 119 weeks of spontaneous report counts from the Pfizer Safety database (January 1, 2018 to April 12, 2020). We fitted autoregressive integrated moving average models to aggregated and disaggregated time series and charted the model residuals on individual value-moving range and exponentially weighted moving average charts to identify statistically unexpected changes associated with the pandemic.

Results: Overall reporting of serious adverse events showed no unexpected decline. Total global reporting declined, driven by health care professional reporting (both serious and non-serious), starting after week 8, 2020, and exceeding model expectations by week 15, 2020, suggesting the pandemic as an assignable cause. However, reporting remained within longer-term historic ranges. Japan was the only individual country whose time series showed a significant decline; it also showed unusual periodicity related to national holidays. A few countries, notably Taiwan, showed statistically unexpected reporting increases associated with the pandemic, commencing as early as week 3, 2020. Literature reporting of adverse drug events was stable.
Ancillary findings included prevalent year-end/beginning reporting minima, more so with health professional than consumer reports.

Implications: In a large and diverse pharmaceutical company database we detected a significant global decline in total reporting, driven by health care professionals rather than consumers and by non-serious reports, consistent with the pandemic as an assignable cause. However, reporting remained within long-term ranges, suggesting relative durability. Importantly, our analyses found no unexpected decline in overall serious reporting. Future avenues for research include assessing whether these effects impaired signal detection, performing the same analysis on large public spontaneous reporting system data to assess the generalizability of our findings, and follow-up analysis to assess whether impacts on spontaneous reporting abate.
Time series modelling was used to assess potential impacts of the pandemic on spontaneous reporting.

The unprecedented COVID-19 pandemic reportedly placed a huge stress on medical systems across the globe [1, 2]. Health care practitioners (HCPs) and institutions were faced with large caseloads and serious and unexpected clinical phenotypes. Patients' lives have also been disrupted in significant ways. Spontaneous reporting, a foundation of drug safety monitoring, relies on the time and efforts of HCPs and patients to provide critical information on the real-world safety of drugs. Spontaneous reporting may be influenced by numerous factors, including attitudes of health care professionals, constraints on reporters' time, reporters' recognition/suspicion, fear of litigation, publicity/notoriety bias, and secular trends related to the product life cycle (e.g. the Weber curve). An important question is whether the disruptive strain of the pandemic on HCPs, patients, or both diminished spontaneous reporting.
Serial data over long time periods, such as time series (TS) of spontaneous reporting frequencies, cannot be fully understood if the time sequence is ignored. The observations at any point in a TS may reflect memory/carry-over effects from previous time points (autocorrelation), trends, seasonality, other periodic effects, random variability and systematic disturbances. Ignoring such structure, for example by comparing the number of reports for a given week or month in one year with the number in the same week or month in another year, can lead to erroneous conclusions [3]. Herein we performed weekly TS analysis of global and country-specific spontaneous reporting frequencies to assess the potential impact of the COVID-19 pandemic on spontaneous reporting.

Our basic strategy for investigating the potential impact of the COVID-19 pandemic on spontaneous reporting frequencies was to apply time series regression to model the long-term temporal evolution of spontaneous report counts, and then apply statistical process control charts to assess whether significant deviations from well-fitting models were temporally related to the pandemic. Because pandemic effects on spontaneous reporting might be expected to vary as a function of multiple variables, we analyzed total global spontaneous reports, as well as reports by source (HCP versus consumer versus literature), by seriousness (serious versus non-serious), and in a selected set of countries with the highest numbers of confirmed COVID-19 cases. Our methodology therefore has three elements:
1) A safety database for obtaining adverse event report counts.
2) A global database of the confirmed number of COVID-19 cases.
3) Statistical time series analysis.

We performed our analyses using Pfizer's in-house safety database.
The Pfizer safety database consists of approximately 3,000,000 initial reports (in the last 10 years) originating from 208 countries, covering a portfolio of over 900 drugs; it is a large database that is pharmacologically, clinically and geographically diverse. [...] and preferably 100, time points, but not overly so, which could degrade the ability to approximate the true process with a model [6, 7]. In addition to the TS for global spontaneous adverse event report counts, we analyzed the corresponding component TS for health care professional (HCP) versus consumer reports, serious versus non-serious reports, and literature reports, plus weekly spontaneous report counts for the top 12 countries ranked by confirmed COVID-19 cases as of May 8, 2020 on the Johns Hopkins COVID-19 Resource Center World Map (see section 2.2 below) [8]. To increase the robustness of our analysis, we included two additional countries, Taiwan and Japan, whose pharmacovigilance systems may be regarded as having unique features of interest to our exercise. Taiwan implemented an early, intensive, and effective response to the COVID-19 pandemic [9], while Japan contributes a substantial number of spontaneous reports and has specialized features for intensive monitoring of new drugs [10]. We also examined the TS of literature reports as a comparator that we hypothesized would be resilient to the pandemic, though possibly showing delayed effects.

The Johns Hopkins Resource Center is a leading centralized source for data and information on COVID-19 infection. It collects, collates, tracks and plots data on global and country-specific confirmed COVID-19 infections and deaths on an open-source platform. We interrogated this website on May 8, 2020 to select the 12 countries with the highest numbers of confirmed COVID-19 cases for our time series analysis [8].
Our analysis involved an autoregressive integrated moving average (ARIMA) model / Box-Jenkins protocol supplemented with special causes statistical process control charts (SPCs) of the model residuals [11], carried out in Minitab 18. This combined ARIMA and SPC approach has been used in epidemic detection [12, 13], and ARIMA modeling with outlier detection has been implemented for high-throughput, hypothesis-free signal detection in pharmacovigilance [14]. Recently it has also been used to forecast COVID-19 cases [15]. ARIMA models incorporate terms for the aforementioned effects of trends, autocorrelation, and seasonality, thus maximally exploiting the historical information embedded in the time sequence.

ARIMA is a form of regression, specifically regression of time series data, i.e. a model that estimates or predicts the value of a variable of interest (e.g. spontaneous report counts) over time. In general, a regression model estimates or predicts the value of an outcome variable as a weighted combination of predictor variables plus random error. In an ARIMA model the predictor variables can include previous values of the outcome (the Autoregressive term, representing correlations with one or more previous time points), an underlying trend in the time series (the Integrated term) and carryover effects of previous estimation errors (the Moving Average term). In summary, in an ARIMA model the value observed at any time point is modeled as a weighted sum of previous values, underlying trends, and/or previous estimation errors. Sometimes the best model is purely autoregressive, other times purely moving average, and sometimes mixed (i.e. with both autoregressive and moving average terms, and possibly an integration term). A specific ARIMA model is represented by three numbers (p, d, q), in that order, corresponding to the order of the AR, I, and MA terms respectively. For the AR and MA terms, the order refers to how far back in time the effects reach.
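As a compact reference for this notation, a general ARIMA(p, d, q) model can be written as below. This is the standard textbook form, not a statement of any specific model fitted in this study. The symbols are assumptions introduced here for illustration: y_t is the weekly report count, B is the backshift operator, w_t is the series after differencing d times, c is a constant, the phi_i and theta_j are the AR and MA weights, and epsilon_t is the random error.

```latex
\begin{align}
  w_t &= (1 - B)^{d}\, y_t, \qquad B\, y_t = y_{t-1} \\
  w_t &= c + \sum_{i=1}^{p} \phi_i\, w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}
\end{align}
```

With d = 1, for example, w_t = y_t - y_{t-1}, so the AR and MA terms describe week-to-week changes rather than the counts themselves.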
So, an ARIMA model that is autoregressive of order 2 means that the estimated value at a given time point is a weighted sum of the values of the two preceding time points, without underlying trend or moving averages, and would be labeled ARIMA(2,0,0). The order of the I term denotes the number of times the difference between consecutive values is taken to eliminate trend.

Statistical process control charts, of which there are many varieties for different types of data, were originally developed for monitoring manufacturing quality. Common features of these charts are a variable plotted over time, a line of central tendency, and so-called control limits, which are statistically derived lines above and below the line of central tendency. The control limits define threshold values of the monitored variable that deviate substantially from the central, expected value, indicating a significant non-random change in the process that generated the data.

The protocol consists of four steps to assess whether observed changes in counts of spontaneous reports were consistent with the inherent nature of the entire TS (i.e. common causes) or were of sufficient magnitude to be considered unlikely without an external shock or disruption (i.e. an assignable external cause contributing to the change):
1) Visual inspection of TS plots/run charts, with testing for trends and for clustering indicative of autocorrelation.
2) Initial model selection, starting with examination of the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the original TS to determine an initial ARIMA model.
3) Iterative model selection and evaluation based on the following criteria supporting model fit: approximate normality of residuals (Q-Q plot); no obvious structure on residuals-versus-fits plots; no significant residual correlations in the ACF, PACF, or Box-Ljung statistics; statistically significant parameter estimates; parameter parsimony; and minimum mean square error (MSE). An additional model checking option (used only in the appendix example, for didactic purposes) is overfitting the model with an additional higher-order term to assess its statistical significance. Because of theoretical issues with the Box-Ljung statistic, small p-values of the statistic at large lags were considered less important as an isolated finding. If more than one equally satisfactory model was identified, the minimum MSE was decisive.
4) Generation of statistical process control charts of the model residuals, so-called special causes charts (SCCs): individual value-moving range (I-MR) and exponentially weighted moving average (EWMA) charts [11]. Two SCCs were used because they complement each other: I-MR charts are better suited to detecting sudden shifts, while EWMA charts are better suited to detecting more gradual drifts [16]. In addition, because the EWMA calculates a moving average of current and past observations, it is robust to deviation from normality and thus useful for plotting individual values [16]. The smoothing parameter (λ) for the EWMA was set at 0.2, given its use to supplement the I-MR chart by detecting small shifts [16].

If a finding from the aggregated global TS was inconsistent across component TS (e.g.
HCP versus consumer, serious versus non-serious), the role of increased TS noisiness/volatility in decreasing the signal-to-noise ratio was evaluated by comparing the consecutive disparity index (CDI), a measure of TS volatility used in quantitative ecology, of the respective TS [17]. Unlike conventional measures (e.g. the coefficient of variation), the CDI takes into account the time-ordering of the observations. To assess whether differences in CDI between two TS were statistically significant, we performed paired t-tests on the consecutive differences that are summed in the CDI formula for each TS. Unexpected changes were considered potentially pandemic-related if observed from the second week of 2020 onward (the "time period of interest", TPI), to avoid the prevalent periodic year-end/beginning reporting effects. Appendix 2 provides a detailed, step-by-step worked example of our TS analysis methodology.

The top 12 countries, ranked by the number of confirmed COVID-19 cases as of May 8, 2020, were: United States, Spain, Italy, United Kingdom, Russia, France, Germany, Brazil, Turkey, Iran, China, and Canada [8]. These countries collectively accounted for 71.67% (501,960/700,362) of spontaneous reports in the time period studied. With Taiwan and Japan included, the percentage rises to 85.1%. Reporting from Iran was extremely sparse, and Iran was therefore excluded from further country-level analysis.

There were a total of 700,362 spontaneous reports in the time period studied. The median number of spontaneous reports per week was 5,857, with a range of 3,248-8,798. The corresponding number submitted by HCPs was 419,361, with a median weekly count of 3,481 and a range of 1,320-5,945. The minimum weekly total global report count during the TPI was 73% of the median weekly count for the entire TS (4,269/5,857).
The corresponding figures for the HCP, consumer, serious and non-serious report TS were 60% (2,097/3,481), 93% (2,154/2,323), 65% (966/1,494) and 71% (3,134/4,389). An adequate model was fitted for 25/38 TS. Most models were pure low-order AR or MA models, with a few countries requiring mixed AR and MA models. Five TS were fitted with seasonal models. Detailed results for every fitted model are provided in Appendix 1 Table 1.

According to our combined ARIMA/SPC analysis, total spontaneous reporting of serious cases did not show unexpected declines. Total spontaneous reporting, total non-serious spontaneous reporting, and total reporting by HCPs showed unexpected declines, either by ARIMA/SPC (when a well-fitting model was obtained) or by visual inspection (Appendix 1 Table 1). The stability of serious reporting was partly related to opposing changes in HCP versus consumer reports, i.e. an increase in consumer reports coincident with a decline in HCP reports. Figure 1 shows the charts for HCP reporting. The downward drift of overall and HCP spontaneous reporting exceeded SCC control limits by week 13, 2020 for both TS, but commenced earlier for HCP reporting (after week 6 versus after week 8, respectively). The decline in overall reporting, commencing after the WHO declared a public health emergency of international concern, was detected with the EWMA special causes chart, not the I-MR chart (Figure 1 and Appendix 1 Table 1).

The differential TS behavior between HCPs and consumers was not explainable by differences in noise/volatility of the corresponding TS: the CDI was lower for the consumer TS than for the HCP TS (0.0936 versus 0.1556), but not significantly so (P=0.764). The possibility that the selectivity for non-serious reports reflected increased noisiness of the serious TS is discounted because, although the CDI was higher for serious than for non-serious reports (0.133 versus 0.123), the difference was not statistically significant (P=0.989).
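The two quantities behind these results, EWMA control limits on model residuals and the CDI, can be sketched in a few lines. This is an illustrative reconstruction, not the Minitab 18 procedure used in the study. The function names are hypothetical. The 3-sigma limit width and the estimation of sigma from the average moving range are assumptions (only λ = 0.2 is stated in the text), and the CDI is written as the mean absolute log ratio of consecutive values, which is our reading of the definition in [17].

```python
import numpy as np


def consecutive_disparity(counts):
    """CDI sketch: mean of |ln(x[i+1] / x[i])| over consecutive pairs."""
    x = np.asarray(counts, dtype=float)
    if x.size < 2 or np.any(x <= 0):
        raise ValueError("need at least two strictly positive counts")
    return float(np.mean(np.abs(np.log(x[1:] / x[:-1]))))


def ewma_chart(residuals, lam=0.2, width=3.0):
    """Return (ewma, lower, upper) for an EWMA chart centred on zero."""
    r = np.asarray(residuals, dtype=float)
    # Assumption: sigma estimated from the average moving range
    # (d2 = 1.128 for moving ranges of two consecutive points).
    sigma = np.mean(np.abs(np.diff(r))) / 1.128
    z = np.empty_like(r)
    prev = 0.0  # residuals of a well-fitting model should centre on zero
    for i, value in enumerate(r):
        prev = lam * value + (1.0 - lam) * prev
        z[i] = prev
    t = np.arange(1, r.size + 1)
    # Time-varying limits that widen toward their asymptotic value.
    half = width * sigma * np.sqrt(
        lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
    )
    return z, -half, half


def ewma_signals(residuals, lam=0.2, width=3.0):
    """Indices of time points whose EWMA falls outside the control limits."""
    z, lower, upper = ewma_chart(residuals, lam, width)
    return np.flatnonzero((z < lower) | (z > upper))
```

On a synthetic residual series that alternates around zero and then settles at a moderately negative level, `ewma_signals` flags the sustained drift within a few points even though no single residual is extreme, which is the kind of gradual change the EWMA chart is meant to pick up.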
Literature reports were very stable, but with discrete year-end/beginning spikes due to entry of cases listed in the American Association of Poison Control Centers annual report. Except for Japan (Figure 3), individual country TS did not show unexpected declines (Appendix 1 Table 1). The Japan TS displayed a visually evident reporting decline during the TPI. It also displayed an intricate structure, with the usual week 52 local minima superimposed on recurring local minima at weeks 18 and 33 (early May and mid-August, respectively). These periodic minima correspond temporally to national holidays in Japan: Golden Week for the week 18 minima, and Mountain Day and Obon for the week 33 minima. [insert Figure 3]

[...] 4) that confounded ARIMA modeling but was nevertheless obvious on visual inspection. Notably, on January 20th, Taiwan started implementing a set of 124 action items, including active periodic patient health checks [9]; these intensified interactions with patients possibly increased ascertainment of all manner of health information, including suspected ADRs. [insert Figure 4]

The United Kingdom also showed a spike in reporting that exceeded special causes control limits by week 11, 2020. This reflected a change commencing by week 7 on the I-MR chart and week 8 on the EWMA chart. However, unlike in Taiwan, the increase was rather discrete, making it more difficult to rationalize as an effect of the pandemic.

Our TS analysis, using a large, pharmacologically and geographically diverse pharmaceutical company safety database, showed no unexpected (i.e., inconsistent with the TS model) declines in overall spontaneous reporting of serious suspected ADRs, and no unexpected decline in consumer reports, in the TPI. This does not disprove a pandemic effect, but deviations from the models were not sufficient to reject the null hypothesis of no effect.
We did find statistically unexpected declines in overall spontaneous reporting, reporting by HCPs (total, serious, non-serious), and non-serious reports. The decline in HCP spontaneous reporting manifested primarily as a downward drift rather than a large acute level shift, commencing after week 6 and reaching significance by week 13, 2020, a somewhat earlier onset than overall reporting (week 11, 2020). We found no signal of a decline in literature reporting. The selectivity of model deviations for HCPs over consumers is intuitively plausible, given the "front-line status" of HCPs in fighting the pandemic.

When unexpected significant declines in spontaneous reporting were detected, they were more often detected by EWMA than by I-MR charts, consistent with the superiority of EWMA charts in detecting gradual downward drifts, whereas I-MR charts are designed to detect large acute declines. A gradual decline seems a more intuitively plausible pandemic scenario than a large sudden change point. However, when both SPC charts demonstrated a significant decline, the I-MR chart tended to show the decline earlier.

The lack of unexplainable country-specific declines has several possible, not mutually exclusive, explanations:
1) Although "sample size" for TS typically refers to the number of included time periods, TS "noisiness" also affects power. The CDI for global reporting was in fact lower than that of individual countries and of the top 12 countries pooled.
2) Countries other than those examined, with more fragile or underdeveloped health care systems (including pharmacovigilance) and decreased availability of COVID-19 testing, may contribute disproportionately to global declines.
3) Numbers of confirmed COVID-19 cases may correlate imperfectly with health care system strain; for example, per capita COVID-19 cases or COVID-19 case counts per physician or per hospital bed may be better indicators.
4) The Johns Hopkins University Resource Center's data may be a lagging indicator of relative disease activity. We rechecked the ranking on May 20th, 2020, observing relative ordinal stability, the only changes being displacement of China and Canada by India and Peru in the top-ranked set.
5) Some countries may experience delayed peaks in COVID-19 cases, e.g. Eastern Europe/Russia [18].
6) Overestimation of impacts.
7) The stability of the literature reporting TS may reflect the backlog of articles in press and in late-stage production, with delayed declines possible.

Japan was the only country to demonstrate declines in overall and HCP reporting in the TPI. Of the few countries showing unexpected reporting increases, Taiwan had the more sustained increases. Taiwan's intensive response to COVID-19, with 124 action items published on January 20th, including periodic patient health checks [9], may have increased ascertainment of various health outcomes, including suspected ADRs. It would be interesting to see whether the increased reporting was generalized or selective for particular drugs, events, and/or drug-event combinations. Other interesting observations include highly prevalent year-end/beginning local minima, and more complex periodic patterns, as in Japan.

Our analysis has limitations. ARIMA models are one methodological choice for TS analysis, with advantages and disadvantages relative to others. Advantages include relative robustness to data fluctuations and no required parameter selection. Disadvantages include the need for substantial amounts of data, especially for estimating seasonal effects, the inability to include effect modification, and the sensitivity of results to model specification. While we followed structured criteria for model fitting, there is an element of trial and error, and some models fall between exemplars of good fit (e.g. the worked example in Appendix 2) and poor fit, with subjective judgement sometimes needed.
Similarly, the field of statistical process control charts is marked by vigorous controversy and debate on the robustness of these methods to theoretical assumptions and on their optimum application [19, 20]. We studied levels of reporting, not signal detection performance, but the two are related, as signals may be detected from individual cases or by aggregate quantitative analysis. Decreased spontaneous reporting, if significant, could plausibly hamper signal detection at the case level, although the decrease was within the range of the pre-pandemic TS, and the lack of an unexpected decline in serious reporting is reassuring. However, altered ratios of different types of events (e.g. serious versus non-serious) could positively or negatively impact disproportionality analysis in unpredictable and situation-dependent ways, owing to temporary changes in the distribution of reported drugs, events, and drug-event pairs, COVID-19 disease-drug interactions/novel ADR phenotypes, and/or misattribution of clinical COVID-19 signs/symptoms to suspected ADRs. A new dedicated online site for reporting suspected adverse reactions to medicines, future vaccines and medical equipment relating to COVID-19 treatment was launched by the UK Medicines and Healthcare products Regulatory Agency (MHRA) on May 4, 2020 [21]. Finally, while we analyzed a large and diverse pharmaceutical company database, we intend to assess whether our results generalize to larger public databases containing data from numerous pharmaceutical companies when the data are available.

We illustrate the analysis flow using data from Italy, a country whose medical system was heavily impacted by COVID-19, having the second highest number of confirmed COVID-19 cases while being the 23rd most populous nation as of May 8th, 2020. Figure 1 is a TS plot/run chart of the number of reports per week originating from Italy. Visual inspection suggests serial correlation, recurring week 52 minima, and a week 10, 2020 local minimum.
The latter has a plausible temporal relationship with COVID-19 activity, but our modeling will tell us whether this local minimum is expected based on the long-term behavior of the TS or instead reflects an external cause, such as the pandemic. The run chart is notable for statistically significant clustering (p=0.04), supporting autocorrelation, but a nonsignificant test for trend. [...] Figure 5 shows the special causes charts (I-MR, EWMA) of the residuals from the AR(2) model. The local minimum in early 2020 is seen to be consistent with the model and not unexpected/due to assignable causes, such as the COVID-19 pandemic.

1. Virus Outbreak Pushes Italy's Health-Care System to the Brink. The Wall Street Journal
2. Effects of COVID-19 on Global Healthcare Systems
3. Statistical methodology: V. Time series analysis using autoregressive integrated moving average (ARIMA) models
4. WHO Declares COVID-19 a Pandemic
5. WHO. Rolling updates on coronavirus disease
6. Understanding and using time series analyses in addiction research
7. Very long and very short time series. Forecasting: Principles and Practice
8. COVID-19 Dashboard by the Center for Systems Science and Engineering
9. Response to COVID-19 in Taiwan: Big Data Analytics, New Technology, and Proactive Testing
10. Bias in spontaneous reporting of adverse drug reactions in Japan
11. Time-Series Modeling for Statistical Process Control
12. Modelling of Infectious Diseases for Providing Signal of Epidemics: A Measles Case Study in Bangladesh
13. Detecting the start of an influenza outbreak using exponentially weighted moving average charts. BMC Medical Informatics and Decision Making
14. Time Series Disturbance Detection for Hypothesis-Free Signal Detection in Longitudinal Observational Databases
15. Prediction of the COVID-19 Pandemic for the Top 15 Affected Countries: Advanced Autoregressive Integrated Moving Average (ARIMA) Model
16. Exponentially Weighted Moving Average (EWMA) Control Charts for Monitoring an Analytical Process. Industrial & Engineering Chemistry Research
17. The consecutive disparity index, D: a measure of temporal variability in ecological studies
18. WHO says 'delayed epidemic' takes hold in Eastern Europe as coronavirus cases in Russia rise
19. Bridging the gap between theory and practice in basic statistical process monitoring. Quality Engineering
20. Controversies and Contradictions in Statistical Process Control
21. MHRA. Coronavirus: new website for reporting medicines side effects and equipment incidents