key: cord-0687453-ct72a1og authors: ADACHI, Yasumoto; MAKITA, Kohei title: Real time detection of farm-level swine mycobacteriosis outbreak using time series modeling of the number of condemned intestines in abattoirs date: 2015-04-24 journal: J Vet Med Sci DOI: 10.1292/jvms.14-0675 sha: ca2fa034524ce26d06b3674b3c494635d7d5f13b doc_id: 687453 cord_uid: ct72a1og Mycobacteriosis in swine is a common zoonosis found in abattoirs during meat inspections, and the veterinary authority is expected to inform the producer for corrective actions when an outbreak is detected. The expected value of the number of condemned carcasses due to mycobacteriosis therefore would be a useful threshold to detect an outbreak, and the present study aims to develop such an expected value through time series modeling. The model was developed using eight years of inspection data (2003 to 2010) obtained at 2 abattoirs of the Higashi-Mokoto Meat Inspection Center, Japan. The resulting model was validated by comparing the predicted time-dependent values for the subsequent 2 years with the actual data for 2 years between 2011 and 2012. For the modeling, at first, periodicities were checked using Fast Fourier Transformation, and the ensemble average profiles for weekly periodicities were calculated. An Auto-Regressive Integrated Moving Average (ARIMA) model was fitted to the residual of the ensemble average on the basis of minimum Akaike’s information criterion (AIC). The sum of the ARIMA model and the weekly ensemble average was regarded as the time-dependent expected value. During 2011 and 2012, the number of whole or partial condemned carcasses exceeded the 95% confidence interval of the predicted values 20 times. All of these events were associated with the slaughtering of pigs from three producers with the highest rate of condemnation due to mycobacteriosis. Mycobacterium avium-intracellulare Complex (MAIC) is a common pathogen that, like M. tuberculosis, causes tuberculosis in humans and opportunistic infections in those suffering from acquired immune deficiency syndrome (AIDS) [7, 15, 30] . Mycobacteriosis in swine caused by MAIC is a chronic infectious disease, which presents as caseous necrosis in submandibular lymph nodes and mesenteric lymph nodes. In swine, the most common lesion is the one found in mesenteric lymph node [32] . Mycobacteriosis is the only known disease that forms caseous necrosis in animals and humans. Japanese meat inspection service follows the statements of a manual [20] : 'a caseous necrosis suggests mycobacteriosis', and it regards such case as mycobacteriosis without further microbiological examinations. Intestines are condemned when the caseous necrosis is a single lesion, and whole carcasses are condemned when the systemic infection is found, causing a significant economic burden for the pig producer. The abattoir condemnation data at the Higashi-Mokoto Inspection Center (hereafter "abattoir data") are stored as electronic data, which is fed back to each producer monthly upon request. In addition to this feedback, the pig producer also receives notification when mycobacteriosis is observed. As swine mycobacteriosis occurs sporadically throughout the year, it is not easy to distinguish an outbreak from sporadic cases on the basis of either daily raw abattoir data or monthly aggregated data. Therefore, a rational framework is needed to distinguish between sporadic cases and outbreaks. However, such a framework has not been successfully developed or applied in any country. In this study, we used time series analysis in an attempt to design a mycobacteriosis prediction model. Time series analysis is a method that is used to analyze sequential data with equally spaced measurements in time (hereafter "time series data") and is performed by fitting of the data to various models. The Auto-Regressive Integrated Moving Average (ARIMA) model [5] , one method that is frequently used for time series analysis, can be used to describe a wide variety of behaviors for time series data using relatively simple expressions and thus has been used in a range of studies. In order to fit an ARIMA model, the time series data presented for anlaysis must be stationary, without a trend or a cyclical fluctuation. To eliminate both trend and periodicity, Brockwell et al. proposed three methods, as follows: (i) esti-mating a trend along with cyclical components and subtracting both of them from the original data; (ii) introducing the lag-d differencing (where d is the interval of the periodicity) and (iii) fitting a sum of harmonics and a polynomial trend [6] . ARIMA modeling is widely used in a range of natural sciences, including biomedicine and public health [24, 33] . In the field of public health, ARIMA models are used to describe and forecast the incidence of infectious diseases, such as campylobacteriosis [3] , severe acute respiratory syndrome (SARS) [11] , hemorrhagic fever with renal syndrome (HFRS) [18] , cholera [2] , onchocerciasis [17] , influenza [25] , malaria [31] and dengue fever [4] , and even the number of patients visiting emergency departments [23, 26] . Moreover, such models are applied in detecting bioterrorism [23] or food poisoning of unknown source [3] by observing the excess of the case numbers over the confidence interval of expected numbers (i.e., as predicted by the model). Although time series analysis has been widely used in public health, such analysis has not been applied frequently to abattoir data. To our knowledge, such modeling is represented by 2 published studies. Neumann et al. performed a retrospective analysis of abattoir data in the New Zealand pig industry using the ARIMA model [22] . Vial et al. performed a retrospective analysis of Swiss slaughterhouse data using a GAM model [29] . The present study attempted to use the ARIMA model to distinguish mycobacteriosis outbreaks at the farm level from sporadic cases. Temporal cyclical components in abattoir data were estimated to calculate weekly ensemble average, and the residuals after subtracting the temporal cyclical components were fitted to an ARIMA model. Subsequently, we estimated the time-dependent expected values by summing up the cyclic components and the ARIMA model. Finally, we compared the actual data observed after the period used for model fitting and the 95% confidence interval of the expected values predicted by the model in an attempt to detect outbreaks at the farm level. The methodology used in the present study can be applied not only to swine mycobacteriosis but also to other diseases detected in veterinary meat inspection. Meat Inspection Center is located in Abashiri-Gun, Hokkaido, Japan, and has 2 abattoirs, which receive 150,000 to 200,000 pigs annually for slaughtering, mainly from the eastern Hokkaido. At this facility, pig carcasses are inspected by gross pathological examination, and the carcasses harboring intestines with nodular caseous necrosis are diagnosed with swine mycobacteriosis. In this study, we used the data from both abattoirs within jurisdiction of the Higashi-Mokoto Meat Inspection Center, as collected from April 1, 2003, to May 27, 2013. The number of condemned intestines due to mycobacteriosis in swine (hereafter "observed data") and the names and townships of the corresponding producers were used to identify the farms with high incidences of swine mycobacteriosis; these pro-ducers were informed of potential outbreaks as part of the Meat Inspection Center's mandate. The numbers of the observed data (dates of operation) used for the analysis consisted of 1,922 days over roughly 8 years as a training period for the model and 788 days (roughly 2 years) as a validation period for the model. Analysis of periodicity: In order to understand the periodicity of the number of condemned carcasses, we performed Fast Fourier Transform (FFT) [28] to calculate a periodogram from the observed data obtained during the training period. As the observed data do not include non-operating days and are not completely serial, FFT could not be performed on the raw primary data. Therefore, we pretreated data before FFT as follows: (i) connecting observed data of weekdays without weekends and (ii) replacing missing data on non-operating weekdays with the medians of each day of the week as dummy values, such that-5-day intervals constituted a nominal week. The variation in the number of condemned intestines among days of the week was assessed using a generalized linear model (GLM) with quasi-Poisson errors. All the statistics in the present study were performed using statistic software R (version 3.0.1). Estimation of ARIMA model: For the estimation of the ARIMA model, week ensemble averages were subtracted from observed data for each date of the weeks, and the residuals including missing values were used. Three main parameters need to be selected when fitting an ARIMA model: the orders of autoregressive (AR), differencing or integration (I), and moving average (MA). Autoregressive terms relate the the observation made at time t in the series to the observation made at time t-1, or to the observation made at t-2 and so on. Moving average terms relate the error (difference between observation and estimated value) at time t to the error at times t-1, t-2, etc. If the time series data contain a stochastic trend (that is, if the data contain a unit root), differencing is performed to eliminate the trend and to obtain a stationary process. The introduction of the principle of ARIMA model is described briefly by Trottier et al. [27] , and the detailed introduction to this method is available in the textbook by Brockwell et al. [6] . We examined unit roots in residual data by using the Augmented Dickey Fuller-test (ADF test) [8, 9] . These coefficients of the ARIMA model were estimated using the Arima function in R [13] by monitoring Akaike's information criterion (AIC). The estimation used data obtained between April 2003 and March 2011 and was performed under the conditions of AR (p) with 0≤p≤ 5 and MA (q) with 0≤q≤5. Since the ensemble average with overall mean was subtracted according to Brockwell's method, the estimation was performed without a constant term. At the same time, we compared both conditions with and without drift on the basis of the AIC. The ARIMA model selected based on the AIC was checked using the Ljung-Box test [19] to ensure that the residuals were independent and temporally equally distributed over time. Furthermore, the significance of the coefficients and the stationarity of AR (p) process were tested. The t values, calculated by dividing the coefficients by the standard errors, were used to perform t-tests for the significance of the coefficients. The stationarity was checked by calculating the roots for the characteristic polynomial using the polyroot function in R [14] , confirming that the nontrivial roots (expressed as absolute values) were greater than 1. Finally, for the examinations of possible temporal effects other than week periodicity (see Results), the remaining residuals were tested using the autocorrelation function (ACF) and partial autocorrelation function (PACF) [6] , using missing-dataomitted residuals. Prediction of the expected condemned number of swine intestines: The model generated as described above was used to calculate the expected values and the confidence intervals for the interval between April 2011 and May 2013; validation was performed by comparing the resulting values with the observed data for the same interval. The expected value of the condemned number for the specific day was calculated by summing the ensemble average of the week and the ARI-MA-predicted value of the day. The 95% confidence interval of each future day was calculated in the same manner. When the actual number of whole or partially condemned carcasses in the observed data exceeded the expected value, the producers who sold the condemned carcasses were identified and the annual rate of condemnation by Japanese fiscal year (April to March) was calculated. Test of the robustness of the model: The ARIMA model predicts the next day's value (for each day) using the records of the past several days, with the duration determined by the larger order of AR or MA. Therefore, there arises a critical question: how sensitive is the model to the very recent condemned numbers of carcasses used for the prediction? In order to prove the robustness of the model, we replaced the last actual condemned numbers of the training phase (de-termined by either AR or MA) with an equal number of the numbers generated randomly as follows. The numbers were drawn from a Poisson distribution with the shape parameter lambda (λ) of the 8-year observed data at the training phase and were subtracted by week ensemble average. We then refitted the ARIMA model for the updated AR and MA based on the least AIC. Then, the differences in deviance between the 2 predicted models (model 1: predicted by using ensemble average-subtracted actual observed data and model 2: the last few days of the training phase were replaced with randomly generated numbers as stated above) at the valida- tion phase were tested for significance at a level of P<0.05 using chi-squared statistics. The statistical processing was performed for 100 iterations. The periodograms calculated by FFT show the local maximum value at the 1-week cycle (Fig. 1 ). The number of data points replaced with the medians of the week of a day as dummy values was 145. Table 1 shows the numbers of pig slaughtered, average number of pigs slaughtered in each operating day with 2.5 and 97.5 percentiles, rate of condemnation due to mycobacteriosis and mean daily number of condemnation estimated in GLMs with quasi-Poisson errors (dispersion parameter 7.1) with P-values for each day of a week. The average numbers of intestines condemned on each day of a week were used in the modeling as week ensemble averages. Figures 2 and 3 show box plots of the daily numbers of condemned intestines and the rates of condemnation per day, respectively, for the presentation of the variations. Although the daily number of pigs slaughtered was consistent (Table 1) , the daily number of condemned intestines differed according to the day of a week (Table 1, Fig. 2 ). The number of condemned intestines was highest on Mondays, followed by Tuesdays. When sub-set of the data consisting of the days which exceeded the average condemnation rate (0.485%) was examined, a similar temporal pattern in the number of days exceeded, and the daily number of condemned intes-tines among the days of a week was observed ( Table 2) . Table 3 shows total number of producers, total days and mean daily number of producers brought pigs between April 1, 2003 and March 31, 2011. The 95% confidence limit overlapped between Monday and Friday, and no obvious clustering in the number of producers in particular day of a week. Figure 4 shows the frequency of the producers brought pigs according to the annual number of pigs sold. The median number of pigs brought by a producer in a year was 365.6, and the mean was 1,717.0. The majority of producers brought less than 500 heads per year. Estimation of ARIMA model: The ADF test rejected the null hypothesis of a unit root. Thus, the order of differencing or integration (I) was identified as zero. As the result of the estimation on the basis of least AIC, we selected ARIMA (3, 0, 4) with the mean zero. The result of the Ljung-Box test (P<0.05) showed that the residuals were independent and evenly distributed over time. The t-tests for significance of coefficients showed that the coefficients of all the options of AR: 1-3 and MR: 1-4 were significantly different from zero (Table 4) . Moreover, the roots for the characteristic polynomial were greater than 1, thus confirming the stationarity of the ARIMA model. The ACF and PACF plots of the residuals of the week-ensemble-average-deducted ARIMA model did not show any other peaks and were regarded as white noise (Fig. 5a and 5b) . Prediction of the expected condemned number of swine intestines: The number of days when the actual number of condemned intestines exceeded the upper limit of the 95% confidence interval was three in fiscal year 2011 and 17 in fiscal year 2012 (Figs. 6 and 7) . Because the lower limit of the 95% confidence intervals falls below zero, they are not indicated in these figures. The minimum, median and maximum numbers of daily condemned intestines of these 20 days were 13, 18.5 and 49, respectively; these values exceeded the overall mean number (which was 2.05) for fiscal years 2011 and 2012. Total number of days, minimum, median and maximum, and the annual rates of condemnation of the three producers indicated in using the upper limit (UL) of the 95% confidence interval are indicated in Table 5 . On the days on which the confidence interval was exceeded, the suppliers of pigs slaughtered at the abattoirs included at least one of the three producers with the highest annual rates of condemnation due to mycobacteriosis for the respec- tive fiscal year. The annual rates of condemnation of these producers were 8.1%, 10.0% and 13.6% (fiscal year 2011), and 23.1%, 30.5% and 32.9% (fiscal year 2012). Therefore, farm-level outbreaks of mycobacteriosis in these producers were highly suspected, given the fact that annual overall rates were less than 1% at the Higashimokoto Meat Inspection Center. One producer with high annual condemnation rates was represented in the top 3 in both fiscal years. Test of the robustness of the model: The fitted ARIMA model was ARIMA (3, 0, 4) , meaning that the number of days used to feed the model was selected as the larger order of AR (3) or MA (4). Thus, the last 4 values of the observed data during the training period were used to calculate the first expected value in the validation period. Therefore, these values were replaced with the randomly generated values from a Poisson distribution with a λ of 3.37. Out of the result of the estimations of ARIMA model using replaced data for 100 iterations, ARIMA (3, 0, 4) was selected 99 times, and ARIMA (4, 0, 3) was selected once on the basis of least AIC. Consequently, it became clear that the ARIMA (3, 0, 4) model was appropriate for the next step. As the results of the repetition of χ 2 test (df=4) to examine the difference between deviances for 100 iterations, the statistical significance (P<0.05) was not shown for 99 times and was shown once. We concluded that the ARIMA (3, 0, 4) model was statistically robust for data analysis. This study attempted to perform time series analysis of abattoir data to distinguish outbreaks of mycobacteriosis from sporadic cases. Although the prevalence at the animal level is quite low as it was shown in the results, sporadic cases of swine mycobacteriosis occur commonly at the farm level. Morita et al. reported that sporadic infections of MAIC were detected in 870 of 1,200 piggeries (72.5%) between 1988 and 1990 at a slaughterhouse in Gunma Prefecture, Japan [21] . Distinguishing the outbreaks from sporadic cases based on daily raw abattoir data is therefore not easy; however, a time series analysis in this study demonstrated the feasibility of its practical application in the real-time detection. The detection of the outbreaks is important, because of the public health impact to consumers and economic impact to producers of the disease, as stated in the introduction. Komijn et al. suggested that pigs may be an important vehicle for M. avium infections in humans [16] . Real-time detection of the a) The total number of days when these producers brought pigs. b) The rates of condemnation. c) Producer "Q" in 2011 and 2012 is the same producer. outbreak at the farm level can minimize public health risks by careful detection of the rest of the carcasses of the day and can help the responsible producer to improve farm hygiene immediately. Such collaborative effort between meat inspection and livestock hygiene service would be effective in controlling zoonotic diseases [1] . Although this study dealt with only one disease, swine mycobacteriosis, such an analytical method potentially can be applied to many other diseases of concern in food animals, permitting detection of outbreaks at the production level. In the present study, we were able to identify the producers with elevated condemnation rates of diseased pigs brought into the abattoirs, specifically noting when the number of daily condemned intestines exceeded the upper limit of a 95% confidence interval. These outlying events presumably correspond to outbreaks in the respective source farms immediately on the day of inspection. Similar temporal patterns in the daily number of condemned intestines among the days of a week seen in both the whole dataset between 2003 and 2011, and the days which exceeded the average condemnation rate also supported the validity of this time series modeling. Traditionally, the statistical processing for abattoir data in Japan has been used only to calculate monthly and annual total incidences. Hence, by the traditional statistical analysis, outbreaks of diseases might not be recognized until total slaughter numbers were obtained, causing a lag in the potential detection of non-sporadic events. Therefore, the application of time series analysis, as used in this study, may accelerate the rapid detection of outbreaks. While this study demonstrated the above-stated strengths, our analysis has three limitations. First, sensitivity and specificity in diagnosis of mycobacteriosis are not taken into account. The meat inspection for mycobacteriosis in swine intestines is based on visual examination, and misdiagnosis may happen. However, restrictions in expense and time precluded our investigation of the sensitivity and the specificity of the histopathology examinations. In fact, unless molecular technology is used, classical tests would miss some infections. A study using 239 lymph nodes from pig carcasses with caseous lesions showed sensitivity of 79.1% by Ziel-Neelsen stain (189 positive) and 69.5% by culture [16] , and the kappa value (agreement) of these tests, calculated from the published data, was 0.016, which is interpreted to be slight agreement [10] . Second, a separate (though related) limitation is that the present study selected the number of condemned intestines in abattoirs due to mycobacteriosis as an objective variable. The prevalence rate among pigs slaughtered on a given day might be better variable for modeling to identify suspected outbreaks accurately, as disease is a binary response [12] . However, mycobacteriosis is a rare disease with low annual condemned rate (0.36% in fiscal year 2012), which makes the data heavily zero-inflated and over-dispersed. Third, the present model does not take background farm size into account in detecting outbreak of mycobacteriosis. There is a concern that the model may miss an outbreak of a small-scale farm, and on the other hand, it may be so sensitive against prevalence of the disease in large-scale farm that even sporadic cases may be detected. However, as Fig. 6 showed, the level of detection is low enough to detect outbreak even in the small-scale farms. Moreover, large-scale farms have multiple production units, and outbreaks can occur within a unit. Thus, the present model can detect such outbreaks at a sale to slaughterhouse, and we think this would not be a notable limitation. The periodogram showed the weekly periodicity of the condemned number in this study. Specifically, the periodicity was indicated by the result of FFT without data for non-operating days, and high condemnation numbers were observed on Mondays (Table 1 , Fig. 2 ), This weekly periodicity was not caused by the periodicity in the number of swine submitted for slaughter, as no particular day-specific pattern was detected in the number of swine brought to the abattoirs (Table 1) . We hypothesize that the producers with high mycobacteriosis prevalence rates tended to bring swine to the abattoirs for slaughter on one specific day of the week (Monday). However, specific (individual) producers bringing infected swine for slaughter on Mondays in multiple weeks responsible for the increase were not able to be confirmed (data not shown), and the actual cause of the pattern remains uncertain. The ARIMA model uses the last several values in time series data to predict the first subsequent expected value and assigning the expected values sequentially to predict the second subsequent expected value and so on. Therefore, we evaluated the effect of statistical randomness in these last several values by assessing the results for robustness. As a result, replacement of last several actual numbers with random numbers did not affect the model significantly, which in turn proved that the model takes preceded long term information into account well enough. In conclusion, although there are limitations, time series analysis appears to be useful in rapid detection of potential outbreaks of swine mycobacteriosis. In Higashi-Mokoto Meat Inspection Center, this time-series model has been developed further into user-friendly system on EXCEL fed by R-programing model and just has been applied in daily meat inspection service. This method can be applied to the other zoonotic food-borne diseases, thereby promoting increased food safety. Effective utilization of the data obtained from inspections at the slaughterhouses-Good combination with Livestock Hygiene Service Center and Meat Inspection Center for farm Time series analysis of cholera in Matlab Use of time-series analysis in infectious disease surveillance Forecasting incidence of dengue in Rajasthan, using time series analyses Time Series Analysis: Forecasting and Control, rev Introduction of Time Series and Forecasting Mycobacterium avium complex infection in HIV/AIDS patients Distributions of the estimators for autoregressive time series with a unit root Likelihood ratio statistics for autoregressive time series with a unit root Veterinary Epidemiologic Research Using autoregressive integrated moving average (ARIMA) models to predict and monitor the number of beds occupied during a SARS outbreak in a tertiary hospital in Singapore Principles of Medical Statistics Automatic time series forecasting: The forecast package for R Algorithm 419 − Zeros of a complex polynomial Mycobacterium avium complex in patients with HIV infection in the era of highly active antiretroviral therapy Prevalence of Mycobacterium avium in slaughter pigs in The Netherlands and comparison of IS1245 restriction fragment length polymorphism patterns of porcine and human isolates Time series analysis of onchocerciasis data from Mexico: a trend towards elimination Forecasting incidence of hemorrhagic fever with renal syndrome in China using ARIMA model On a measure of lack of fit in time series models Meat Inspection Manual, New Edition Prevalence of atypical mycobacteriosis in slaughtered swine in Gunma Prefecture and the serovars of the isolates Descriptive and temporal analysis of post-mortem lesions recorded in slaughtered pigs in New Zealand from Time series modeling for syndromic surveillance Disease management with ARIMA model in time series Modeling and predicting seasonal influenza transmission in warm regions using climatological parameters Forecasting daily attendances at an emergency department to aid resource planning Stochastic modeling of empirical time series of childhood infectious diseases data before and after mass vaccination Modern Applied Statistics with S-PLUS Evaluation of Swiss slaughterhouse data for integration in a syndromic surveillance system Mycobacterium avium-intracellulare otomastoiditis in a young AIDS patient: case report and review of the literature Development of temporal modelling for forecasting and prediction of malaria infections using time-series and ARIMAX analyses: a case study in endemic districts of Bhutan Studies on atypical mycobacteria isolated from tuberculous lesions of the mesenteric lymph nodes of slaughtered pigs On time series analysis of public health and biomedical data ACKNOWLEDGMENTS. The authors would like to thank the Food Sanitation Division, Welfare Bureau of Health and Safety, Department of Health and Welfare, Hokkaido Prefectural Government, and the Higashi-Mokoto Meat Inspection Center, Okhotsk Sub-Prefectural Bureau, for permission to publish the results from this investigation.