key: cord-0987462-r6v4ghcs authors: Mörk, M.; Lindberg, A.; Alenius, S.; Vågsholm, I.; Egenvall, A. title: Comparison between dairy cow disease incidence in data registered by farmers and in data from a disease-recording system based on veterinary reporting date: 2009-04-01 journal: Prev Vet Med DOI: 10.1016/j.prevetmed.2008.12.005 sha: 59e6dfead3d50788eb626a32f19305afce88bc63 doc_id: 987462 cord_uid: r6v4ghcs Sweden has a national disease-recording system based on veterinary reporting. From this system, all cattle-disease records are transferred to the dairy industry cattle database (DDD) where they are used for several purposes including research and dairy-health statistics. Our objective was to evaluate the completeness of this data source by comparing it with disease data registered by dairy farmers. The proportion of veterinary-treated disease events was estimated, by diagnosis. Disease incidence in the DDD was compared, by diagnosis and age, with disease data registered by the farmers. Comparison was made, by diagnosis, for (i) all disease events and (ii) those reported as veterinary-treated. Disease events, defined as “observed deviations in health, from the normal” were recorded by the farmers during January, April, July and October 2004. For the diagnoses calving problems, peripartum disorders, puerperal paresis and retained placenta, incidence proportions (IP) with 95% confidence intervals (CIs) were estimated. For all other disease problems, incidence rates (IR) were used. In total, 177 farmers reported at least 1 month and 148 reported all 4 months. Fifty-four percent of all disease events in the farmers’ data were reported as veterinary-treated. For several of the most common diagnoses, the IRs and IPs for all events were significantly higher in farmers’ data than in the DDD. Examples are, in cows: clinical mastitis, cough, gastro-intestinal disorders and lameness in hoof and limb; and in young stock: cough and gastro-intestinal disorders. 
For veterinary-treated events only, significant differences with higher IR in the farmers’ data were found in young stock for sporadic cough and sporadic gastro-intestinal disorders. The diagnosis “other disorders” had significantly more events in the DDD than in farmers’ data, i.e. veterinarians tended to choose more unspecific diagnoses than the farmers. This result indicates that the true completeness is likely to be higher than our estimate. We conclude that for the time period studied there was differential under-reporting associated with the diagnosis, the age of the animal and whether the herd was served by a state-employed or private veterinarian.

However, the potential pitfalls of using such databases for a secondary purpose, such as research, have been discussed and a general need for validation of such data has been identified (Bartlett et al., 1986; Lawrenson et al., 1999; Olsson et al., 2001). Jordan et al. (2004) defined the completeness (epidemiologic sensitivity) of a secondary database as the proportion of cases that were actually recorded and the correctness (positive predictive value) as the proportion of cases reported that actually had the disease. In veterinary medicine, there are only a few examples where the correctness and/or completeness of a disease database have been evaluated. Examples are evaluations of the agreement between information in the computerized record and the paper files in a Canadian veterinary teaching hospital (Pollari et al., 1996a) and in Swedish insurance data (Egenvall et al., 1998; Nodtvedt et al., 2006; Penell et al., 2007). The national animal disease-recording system in Sweden started in 1984 with the aims to monitor the incidence of disease in animal populations, provide data on national and herd disease status, include disease data in breeding goals and provide data for research (Emanuelson, 1988).
It is based on veterinary reporting and all species of animals are included, although the emphasis is on production animals. In Sweden, and for dairy cattle, veterinarians are obliged to report disease events for which they have been consulted to the Swedish Board of Agriculture (SBA, 2000) . Further, drugs used in veterinary medicine for food animals need a prescription (NPA, 1997) and veterinarians are only allowed to prescribe after medical examination of the animal (SBA, 2006) . Consequently, the Swedish disease-recording system should cover all cases of disease in cattle where a veterinarian is consulted, including all cases where there is a need for prescribed drug treatment. All disease records involving cattle are transferred from the Swedish Board of Agriculture to the Swedish Dairy Association (SDA). The link is the animal's unique identity, and therefore records where the individual identity is not recorded (such as group treatments), or is incorrect, cannot be used. At the SDA, the data are used for sire evaluation, extension services, annual statistics and research. Disease events can also be reported by farmers through the Swedish Official Milk Recording Scheme, but this route is not extensively used (for a more detailed description of the Milk Recording Scheme, see Andersson, 1988) . Consequently, the disease events in the database at the SDA are mainly those associated with veterinary treatment of individual animals. Hereafter we refer to the disease database at the SDA, including disease events that are either transferred from Swedish Board of Agriculture or reported by farmers to the SDA, as the dairy-disease database (DDD). 
Our objective was to evaluate the completeness of the DDD by (i) estimating the proportion of disease events, for each diagnosis, that were veterinary-treated (according to the dairy farmers), and (ii) comparing disease incidence estimates from the DDD and from disease data registered by dairy farmers, by diagnosis and age (cows/young stock). An additional aim was to investigate whether the proportion of veterinary-treated disease events reported in the farmers' data that was also registered in the DDD differed between state-employed veterinarians and private practitioners. In Sweden, there are two main dairy breeds: Swedish Red and White and Swedish Holstein. The population is free from, or has a very low prevalence of, specific infections such as salmonellosis, paratuberculosis, infectious bovine rhinotracheitis, enzootic bovine leucosis and bovine viral diarrhoea. During 2004, the mean herd size was 44 cows and the average milk yield per cow was 9177 kg ECM (energy-corrected milk). There were 7072 herds enrolled in the Swedish Official Milk Recording Scheme, including 86% of the 400,000 Swedish dairy cows. Our sampling frame was herds in the Swedish Official Milk Recording Scheme with a herd size ≥25 dairy cows at the time of sampling. For example, to detect a loss of 20% relative to the official incidence of clinical mastitis (17 events per 100 lactations) with a power of 80% and a 95% confidence level, a sample of 2060 cows was needed, without considering the farm-level variation (Win Episcope 2.0). Based on such sample-size calculations, practicality and expected participation (50%), a sample of 400 herds was randomly selected, i.e. we aimed at having approximately 8000 cows in the study. The random selection was done by giving all herds that fulfilled the criteria a random number and sampling the herds with the 400 lowest numbers. The dairy farmers were contacted by mail and the aim of the study and the work associated with participation were explained.
The farmers were asked to reply by prepaid mail whether they were interested in participating or not. Respondents were then contacted by phone for further information. Farmers who had not responded to the letter were also contacted to avoid misunderstandings. As a gesture of appreciation, the farmers who agreed to participate were offered a subscription to a Swedish dairy magazine or a gift voucher of similar value. Disease events were recorded by the farmers during January, April, July and October 2004. Forms and instructions were sent to the farmers a week before the first study month. Prior to each study month, the farmers received a reminder. The farmers reported by mail, e-mail or fax. Reporting was weekly during the first month and monthly thereafter. Farmers who had not reported 2 weeks after the end of a study month were contacted by phone every second week until the forms were submitted. Because knowledge about the study could affect the veterinarians' reporting routines, participating farmers were explicitly asked not to discuss the study with their veterinarians. Farmers were instructed to report “observed deviations in health, from the normal,” regardless of whether they chose to wait, treat the animal themselves, contact a veterinarian or slaughter the animal. For each disease event, the farmer reported the animal's identity and gender, the date when the health deviation was observed, the diagnosis (Table 1) and whether the event led to veterinary consultation or not. Further, the farmer described, in text, the health deviation and the treatment given. The diagnoses used corresponded largely to those available for the farmers to report in the Milk Recording Scheme (with the addition of cough, diarrhoea, puerperal paresis and displaced abomasum) and no further definitions were provided. If a veterinarian was consulted, the veterinarian's codes for diagnosis and treatment were also recorded.
When groups of animals were affected at the same time, the farmers did not have to report all animal identities. Such events are hereafter referred to as “group reports”. The data-collection form is available from the first author upon request. Information about the study herds was obtained from the SDA in November 2005. This included herd-level data such as annual disease incidences as well as individual-animal data (identity, gender, date of birth, calving data, milk yield, time in the herd and disease events) from 2001 to 2004. Besides using the data for incidence estimation in the DDD, they were further used to evaluate whether the participating herds were representative of the population in the sampling frame. The data collected by the farmers were entered into a database (MS Access, Microsoft Corporation, Redmond, WA, USA). Whenever disease records lacked information such as identity, gender or date, farmers were contacted by phone for further information. If the farmer was not able to recall the exact disease date, this was set to the 15th in the study month. If a cow fell ill a few days before veterinary consultation (but within a study month) the disease date was set to the day of veterinary consultation. This was done to facilitate the matching of events between farmers' data and the DDD. In conjunction with data input, we defined criteria for some diagnoses (Table 1). From the code denoting “other disease”, seven diagnoses were extracted (also listed in Table 1). The remaining records in this category were termed “other disorders”. The code for diarrhoea from the farmers' form was shifted to the code for gastrointestinal disorders. For group-reported events where the farmer reported “all animals” to be affected, SDA data were used to identify all animals in the herd at that time. The animals were categorised into cows/young stock based on whether they had calved or not.
Group reports involving animals that could not be individually identified using SDA data (which was the case when only a part of the herd was affected) were age-categorised, if possible, based on the information given in the written description of the disease event. Events that were not possible to categorise into cows/young stock were not used in the incidence estimations. These were: two outbreaks involving 95 and 30 animals affected with diarrhoea and one involving 88 animals affected with cough. Also, two animals with diarrhoea were not possible to categorise due to errors in animal identification. In addition, one outbreak of diarrhoea and one of cough, for which the numbers of affected animals remained unknown, were dropped from all incidence estimations.

Table 1 footnotes: a The diagnoses available for use by farmers were analogous to those that farmers can report through the milk-recording scheme, with the addition of cough, diarrhoea, puerperal paresis and displaced abomasum. The diagnoses “fertility problems” and “death from natural causes” were also available for use by farmers but are not discussed further in this paper. Also, sub-clinical mastitis, dry-cow treatments, sub-clinical puerperal paresis and abortions were reported by farmers but are not discussed further. b Criteria were defined during data editing for these diagnoses.

We defined a disease event as a new case of a certain diagnosis based on the definition in Section 2.2. It was therefore possible for an animal to have more than one event at the same time, e.g., a cow with clinical mastitis and teat tramp had both an event of mastitis and an event of teat tramp. Time-intervals for considering disease events as new cases were set to the same as those used at the SDA: 7 days for acetonemia/inappetence and paresis (not puerperal) and 21 days for all other diagnoses, except for peripartum disorders, puerperal paresis and retained placenta.
For the latter, an animal was at risk in a defined interval in relation to calving (Table 1) and re-visits were not counted as new events. Descriptive statistics were produced and statistical analysis was done using Stata version 8 (Stata Corporation, College Station, TX, USA). Non-overlapping 95% confidence intervals (CI) and, for tests (two-sided), p-values < 0.05 were considered as indicating significant differences. We investigated the possible selection bias of the studied population by comparing participating herds with the negative-responders/drop-out herds. Based on SDA data from 2003 (when farmers were recruited), differences in herd size, annual milk yield per cow and disease incidences between participating and non-participating herds were tested with the non-parametric Wilcoxon rank-sum test. Geographical differences in participation were tested for using Pearson's chi-square test, using affiliation to regional livestock association as a proxy for geographical location. The comparison of veterinary-treated events in our two data sources was studied by calculating (1) the proportion of diagnostic events in the farmers' data that were identified in the DDD (where the farmer had reported veterinary contact as well as the animal's unique identity), which is a measure of completeness in the veterinary-reporting process, and (2) the proportion of diagnostic events reported in the DDD that were identified in the farmers' data (as an internal validation, where loss would indicate that the farmer failed to report accurately) (see Fig. 1). An event was identified in the other database if an event with (a) the same diagnosis occurred during the study month, or (b) another diagnosis occurred within 5 days. The proportion identified was calculated both for all events and within herd. We used the Wilcoxon rank-sum test to test whether herd-level proportions differed between types of veterinary districts.
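The matching rule described above (same diagnosis within the study month, or another diagnosis within 5 days) can be illustrated with a minimal Python sketch. It is not the procedure actually used in the study: the `Event` class, the function names and the example records are hypothetical, matching on the same animal identity is assumed, and "within 5 days" is read as an absolute difference of at most 5 days between disease dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    animal_id: str   # unique animal identity (assumed to be the matching key)
    diagnosis: str
    day: date        # disease date


def is_identified(event, other_events, study_month, window_days=5):
    """Return True if `event` has a counterpart in the other data source.

    Assumed reading of the rule: (a) same animal and same diagnosis anywhere
    in the study month, or (b) same animal and a different diagnosis dated
    within `window_days` of the event.
    """
    for other in other_events:
        if other.animal_id != event.animal_id:
            continue
        if other.diagnosis == event.diagnosis:
            if (other.day.year, other.day.month) == study_month:
                return True
        elif abs((other.day - event.day).days) <= window_days:
            return True
    return False


def proportion_identified(events, other_events, study_month):
    """Share of `events` that are identified among `other_events`."""
    if not events:
        raise ValueError("no events to match")
    hits = sum(is_identified(e, other_events, study_month) for e in events)
    return hits / len(events)


# Hypothetical records for one study month (January 2004).
farmer = [
    Event("A", "clinical mastitis", date(2004, 1, 10)),
    Event("B", "lameness", date(2004, 1, 10)),
    Event("C", "cough", date(2004, 1, 10)),
]
ddd = [
    Event("A", "clinical mastitis", date(2004, 1, 25)),  # same diagnosis, same month
    Event("B", "other disorders", date(2004, 1, 13)),    # other diagnosis, 3 days apart
    Event("C", "other disorders", date(2004, 1, 20)),    # other diagnosis, 10 days apart
]
share = proportion_identified(farmer, ddd, (2004, 1))
```

In this made-up example two of the three farmer-reported events are identified in the DDD. Swapping the two lists gives the reverse comparison, i.e. the proportion of DDD events identified in the farmers' data.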
In addition, logistic regression adjusting for clustering within herd was used to test, overall, whether the proportions identified differed between state-employed and private veterinary districts. Incidence proportions (IP) with 95% CIs were estimated for all events and for veterinary-treated/reported events in the farmers' data and in the DDD, respectively, for the diagnoses calving problems, peripartum disorders, puerperal paresis and retained placenta (Eq. (1)). For each study month, animals were at risk for these diagnoses if the time at risk defined in Table 1 overlapped the study month. For calving problems, cows were at risk if they had calved within the study month. Similarly, for all other disease problems, incidence rates (IR) with 95% CIs were estimated (Eq. (2)). For herds with reports from all four study months, the corresponding herd-level IRs and IPs were also estimated for each database. The proportion of herds with any event, and the IR/IP at the 75th and 90th percentiles, were calculated. Differences in incidence, by diagnosis, were tested for by using (i) 95% CIs, for all events and for veterinary-treated/reported events only at the individual-event level and (ii) the Wilcoxon rank-sum test for differences in herd-level incidence distributions. There were no significant differences between study herds and herds that did not participate with respect to herd size, annual milk yield and herd-level incidence (Table 2), nor could we detect any geographical differences in degree of participation (10 df, p-value 0.75). In total, 177 farmers reported at least 1 month (January (n = 177), April (n = 157), July (n = 153) and October (n = 152)) and 148 reported all 4 months. The main reason given for not reporting was lack of time. Of the 177 herds, 125 were located in a state-employed veterinary district and 52 in a private veterinary district.
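The incidence measures referred to in the Methods as Eq. (1) and Eq. (2) are not reproduced in this text. A reconstruction consistent with the units given in the caption of Table 5 (events per 100 cows at risk and events per 100 cattle-years) is sketched below; the notation is an assumption and may differ from the authors' original equations.

```latex
% Assumed form of Eq. (1): incidence proportion, per 100 animals at risk
\mathrm{IP} = \frac{\text{number of disease events}}{\text{number of animals at risk}} \times 100

% Assumed form of Eq. (2): incidence rate, per 100 cattle-years at risk
\mathrm{IR} = \frac{\text{number of disease events}}{\text{cattle-years at risk}} \times 100
```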
During the study, 33,650 animals were registered in the herds at some point in time, giving 7807 total cattle-years of observation. In all, 2984 animals had at least one disease event and 490 of those had more than one event during the study. The maximum number of events in one animal was six. The number of events that led to veterinary consultation (according to the farmer) is presented by diagnosis in Table 1. The relationship between the information in the DDD and the data provided by the farmers is illustrated in Fig. 1. Of the 1503 events where the farmers reported that a veterinarian had been contacted, 71% were identified in the DDD. In all, 162 of 177 farmers had reported events with veterinary treatment. In 46 herds (28%), all events were identified in the DDD. In 62 (38%), 31 (19%), 16 (10%) and 7 herds (4%), 1-2, 3-5, 6-8 and 10-16 events were missing, respectively. In one herd, 47 (out of 66) events in the farmer's data were missing in the DDD data. Another question was whether there were veterinary-treated events reported to the DDD that the farmer had failed to report. We found that 88% of all veterinary-reported events in the DDD (n = 1161) were identified in the farmers' data. The data in the DDD came from 155 herds. In 91 of them (59%), all events were identified in the farmers' data. In 34 (22%), 9 (6%) and 21 herds (14%), 1, 2 and 3-7 events were missing, respectively. At the individual-event level, the odds of a disease event being identified in the DDD were lower for events in herds located in a district served by private practitioners than for those in a district served by state-employed veterinarians (Table 3). The difference in the proportion identified was also significant at the herd level, and in the same direction. Looking at veterinary-reported events in the DDD that were identified in the farmers' data, the event-level comparison was not significant (Table 3).
However, the herd-level comparison was significant with more farmers with a low proportion identified in districts served by state-employed veterinarians (Table 4) . Diagnosis-specific IRs and IPs are presented in Table 5 , for all disease events in the farmers' data and in the DDD. Table 5 also shows the corresponding IRs and IPs for events with veterinary contact (according to the farmer) and for veterinary reported events in the DDD. For several of the most common diagnoses, the rates and proportions for all events were significantly higher in farmers' data than in the DDD. When only including events with veterinary treatment, the incidences did not differ significantly between the databases for most of the diagnoses. In contrast, the diagnosis ''other disorders'' had significantly more events in the DDD than in the farmers' data. The herd-level IRs and IPs showed differences in distribution between farmers' data and the DDD (Table 6 ). Puerperal paresis and clinical mastitis were the only diagnoses where more than 50% of the herds had any events and also percentiles below 75% provided useful information. The 50th percentile for puerperal paresis was 2.0 in the farmers' data and 0 in the DDD data. For clinical mastitis, the 10th, 25th and 50th percentiles were 6.0, 14.1 and 23.8 in the farmers' data and 0, 6.9 and 14.5 in the DDD data. Eight herds had outbreaks of cough: seven in January and one in April. For four of the outbreaks, a veterinarian was contacted. In seven of the herds, a total of 688 animals were affected. In the eighth herd, the exact number was unknown (excluded from all incidence estimations). There were nine herds with outbreaks of diarrhoea: four in January and April and one in July. A veterinarian was contacted for three of these outbreaks. In eight of the herds, a total of 476 animals were affected. 
The ninth herd had an outbreak of diarrhoea affecting most cows but the exact number was not given (the outbreak was excluded from all incidence estimations). Four herds had outbreaks of ringworm/lice in January; 293 animals were affected and none of those outbreaks led to a reported veterinary contact. Our results showed that there is a substantial fraction of the total morbidity (as reported by farmers) in the dairy cow population that is not captured in the industry database (DDD), which in turn depends on a disease-recording system based on veterinary reporting. As expected with such a system, the fraction lost varies between diagnoses. Whereas a veterinarian was consulted for all observed events of traumatic reticuloperitonitis and laminitis and 96% of abomasal displacement, only 78% of the mastitis events led to veterinary consultation (according to the farmers' data). The severity of events where a veterinarian was not contacted was not assessed but about 50% of these mastitis events were treated with hand milking and/or massage, 5% were treated with left-over antibiotics and 11% were not treated at all (data not shown). This indicates that, for a system based on veterinary reporting, a certain loss of events is due to milder cases of disease, suggesting a differential misclassification associated with severity of the disease. Inherent are the differences in farmers' ability to detect disease. Furthermore, for a system based on compulsory veterinary reporting, the individual differences in threshold for calling a veterinarian will lead to differential reporting, as could treatments carried out by the farmers themselves (Olsson et al., 2001). Similarly, Nyman et al. (2007) reported that a high incidence rate of veterinary-treated clinical mastitis was associated with farmers' willingness to treat. Willingness to contact a veterinarian for a case of clinical mastitis depends on factors such as severity of the case, single-cow characteristics (e.g., parity), herd situation and availability of alternative treatment (Vaarst et al., 2002; Vaarst et al., 2003).

Table 3 Veterinary-treated events (VTE) in the farmers' data (with unique identity) and the dairy-disease data (DDD) that were identified (recorded) also in the alternative database. Data were reported from farmers or obtained from the DDD in a study on baseline recording of disease events in Swedish dairy herds during January (n = 177), April (n = 157), July (n = 153) and October (n = 152) 2004. a Herds were divided based on whether they were located in a state-employed or private practitioner veterinary district. b Due to the definition of identified events in farmers' data and the DDD the numbers are not identical. c The CI are adjusted for clustering within herd.

Table 4 Veterinary-treated events (VTE) in the farmers' data (with unique identity) and the dairy-disease data (DDD) that were identified (recorded) also in the alternative database. Proportion of identified events per herd. Data were reported from farmers or obtained from the DDD in a study on baseline recording of disease events in Swedish dairy herds during January (n = 177), April (n = 157), July (n = 153) and October (n = 152) 2004.

The incidence of clinical mastitis is included in the sire evaluation in Sweden. Given that false negatives are randomly distributed among sires, a certain number of cows with mastitis that are not treated should not create systematic differences between progeny groups (Heringstad et al., 2000). It is, however, important that the daughter groups are sufficiently large, because such non-differential bias tends to diminish the size of any associations or differences as the effective sample size decreases. We expected to find an age-dependent (differential) under-coverage because lactating cows are more valuable than calves.
This was confirmed for cough, where there was a significant difference between the databases only for young stock. However, this was not seen for gastrointestinal disorders. To further analyse the nature of this differential under-coverage, more specific categories are needed to distinguish calves from heifers and first- or second-parity cows from older cows. Cough in young animals was the only diagnosis where we found higher IR in the farmers' data. It is however possible that our sample size was too small to detect differences for only veterinary-treated events due to the clustering effect of herd.

Table 5 Incidence rates (IR) (events/100 cattle-years) or incidence proportions (IP) (events/100 cows at risk) reported by farmers and in the dairy-disease database (DDD). The study was a baseline recording of disease events in Swedish dairy herds during January (n = 177), April (n = 157), July (n = 153) and October (n = 152) 2004. Rates are given for all disease events and for veterinary-treated events only (herd outbreaks excluded). By diagnosis, cattle-years at risk ranged between 2929 and 2963 for cows and between 3857 and 3873 for young stock (for diagnoses where IRs were estimated). By diagnosis, the number of animals at risk ranged between 3518 and 9181 (for diagnoses where IPs were estimated). Only veterinary-treated disease events

The IR for the diagnosis “other disorders” was higher in the DDD than in the farmers' data. A closer look at the events in the DDD with code “other disorders” shows that 57 of the animals were indeed found in the farmers' data but with other diagnoses. The general tendency was that farmers had used more specific diagnoses than the veterinarians, which is the reason why the number of events with “other diagnosis” was larger in the DDD than in farmers' data. It is of course possible that the farmer over-diagnosed a specific disease where the veterinarian decided that a certain diagnosis was not totally appropriate. However, the transition from a rather crude to a highly detailed code list in 1999 might have resulted in some veterinarians selecting unspecific diagnoses, simply because of difficulties in finding appropriate diagnoses or maybe unwillingness to do so. A questionnaire study involving large-animal practitioners in Sweden showed that the strategy for choosing diagnostic codes varies greatly among veterinarians, as does the opinion on the suitability of the codes available (Mörk et al., 2005). Consequently, there is a risk that some diseases appear under-reported due to the veterinarians' choice of diagnosis. There is also a risk that specific disease complexes or co-morbidities are under-reported (Pollari et al., 1996b). Failure of the veterinarian to report according to instructions is another reason for loss of data (Olsson et al., 2001). Between 2000 of the veterinary-treated disease events reported to the Swedish Board of Agriculture had invalid identities, and could therefore not be used within the Milk Recording Scheme. In March 2007, the situation had improved to being only 2-4% (personal communication, Katarina Roth, SDA). Further, there can be differences between veterinarians in the number of diagnoses/problems they choose to record for a given case, as discussed by Penell et al. (2007). In our comparison, there would be a loss of events if the veterinarian included relatively fewer events than the farmer. In our sample, only 71% of the veterinary-treated events (according to the farmers) were identified in the DDD. This result, (C/(B + C)) in Fig. 1, can however not be seen as an estimate of the system's epidemiological sensitivity because the farmers also failed to identify some of the events that were reported to the DDD. A better estimate of the completeness would be to also include the events that were reported in the DDD but not reported by the farmers ((C + D)/(B + C + D) in Fig.
1), which slightly improves the estimate to 73% (for numbers, see Table 3). To estimate the epidemiological sensitivity of the DDD we would need to use a method of analysis that could handle the lack of a gold standard. However, at present such methods require that the tests used (in our case, data sources) are independent, an assumption that does not hold for our data.

Table 6 Proportion of herds with events, herd incidence rates (IR) or herd incidence proportions (IP) in disease data reported by farmers in a study on baseline recording of disease events in Swedish dairy herds during January, April, July and October 2004, and in data registered in the disease-recording system during the same time period. The data are a subset of the study-data including only herds where the farmer reported all months (n = 148). a Herds with outbreaks of cough (n = 5), diarrhoea (n = 7) and ringworm/lice (n = 4), where a major part of the herd was affected, are not included in the respective estimates. b Difference in herd incidence distribution between farmers' data and the disease-recording system was tested with the Wilcoxon rank-sum test.

In Sweden, there has been concern that data were withheld by private practitioners as a consequence of a dispute between this group of veterinary professionals and the Swedish Board of Agriculture (SOU, 2005). Indeed, the total proportion of farmer-reported events (Table 3) identified in the DDD was significantly higher for farms in state-employed veterinary districts, which supports the concern that there was differential under-reporting depending on status of the veterinary district. The herd-level analysis showed the same direction. When testing the opposite (the total proportion of all veterinary-treated events in the DDD vs. farmer-identified), there were lower proportions of identified events in herds located in districts served by state-employed veterinarians (Table 4).
One possibility is that farmers served by private veterinarians were aware of the under-reporting issue and therefore more motivated to report accurately in our study. In the Swedish system, when a veterinarian is consulted for larger outbreaks of disease, these are commonly reported as group events, leaving out the animals' identities. As we mentioned earlier, such events are not included in the DDD. In the present study it was, as expected, found that the under-reporting was substantial for group-related events. Larger outbreaks are, in Sweden, often associated with viral pathogens such as bovine coronavirus or bovine respiratory syncytial virus. Consequently, such infections are likely to be under-reported in the DDD just because they occur as large outbreaks. In addition, we also expected them to be under-reported because they are typically mild (not requiring veterinary treatment) and occur in younger animals. We think that these factors together are the reason for the large difference in IR between the databases for the diagnoses cough and gastro-intestinal disorders. The significant difference between the databases remained for cough and gastro-intestinal disorders in young stock and for gastro-intestinal disorders in cows when the outbreaks were excluded, indicating additional under-reporting due to absence of veterinary contact. However, the difference also remained (for young stock) when only veterinary-treated events were included in the estimation (indicating that there was also under-reporting by veterinarians). The proportion of events in the DDD that could be identified in the farmers' data was 88%, even though we used the farmer's narrative text and other appropriate information to find credible matches for cases initially not identified. We are therefore confident that we were able to match most of the cases even when the farmers' data were partly wrong.
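The two completeness estimates discussed above, C/(B + C) and (C + D)/(B + C + D) in Fig. 1, can be retraced with a short Python sketch. The exact counts are given in Table 3, which is not reproduced here, so the counts below are back-calculated approximations and should be read as assumptions: C ≈ 0.71 × 1503, B = 1503 − C and D ≈ 0.12 × 1161. The function name and its parameters are hypothetical.

```python
def completeness(found_in_both, farmer_only, ddd_only=0):
    """Proportion of known veterinary-treated events that are in the DDD.

    found_in_both -- events in both data sources (C in Fig. 1)
    farmer_only   -- events only in the farmers' data (B in Fig. 1)
    ddd_only      -- events only in the DDD (D in Fig. 1); 0 gives C/(B + C)
    """
    denominator = found_in_both + farmer_only + ddd_only
    if denominator == 0:
        raise ValueError("no events")
    return (found_in_both + ddd_only) / denominator


# Approximate counts derived from the reported percentages (assumptions).
c = round(0.71 * 1503)   # farmer-reported events identified in the DDD
b = 1503 - c             # farmer-reported events missing in the DDD
d = round(0.12 * 1161)   # DDD events not identified in the farmers' data

naive = completeness(c, b)        # C / (B + C)
adjusted = completeness(c, b, d)  # (C + D) / (B + C + D)
print(f"{naive:.0%} {adjusted:.0%}")
```

With these approximate counts the two proportions round to 71% and 73%, in line with the figures quoted in the text. Because the definition of an identified event differs slightly between the two directions of comparison (Table 3, footnote b), the exact table counts need not agree with these approximations.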
To investigate whether any single study month had an unexpectedly large influence on the overall incidence estimation we calculated month-specific incidences (data not shown), as a part of the internal validation. It was considered possible that farmers were more ambitious during the first study month (January), which also coincided with the season when there is least to do on a farm. However, the data showed no such general pattern.
According to Bartlett et al. (1986), only farmers with adequate record-keeping ability can participate in a prospective data collection and therefore, some selection bias is inevitable. Further, it has been stated that inconsistency in recording patterns might cause bias and that it is difficult to separate herds with a low level of reporting from those with a truly low incidence (Kadarmideen, 2002). Some studies where farmer participation has been a crucial part have used inclusion criteria aimed at eliminating poor reporters (Olsson et al., 1993; Ortman and Svensson, 2004). We did not screen eligible farmers to exclude those that might report poorly. Reducing the risk of bias arising from poor reporting would increase the risk of bias arising from including only farmers that kept good records (and possibly also contacted a veterinarian to a greater or lesser degree). Instead, we put a lot of effort into contacting individual farmers for a proper follow-up.
The consequences of any differential under-reporting will depend on the purpose for which the data are used and on the magnitude of the bias, and should be kept in mind when designing epidemiological studies using such databases. In our opinion similar differences, except maybe for veterinary district, are likely to be found in any system based on veterinary reporting of clinical disease. Estimates of morbidity from a database based on veterinary recordings will be conservative and the degree of under-reporting will vary depending on the disease and the age of the animals of interest.
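The month-specific check used for internal validation earlier in this discussion amounts to computing an incidence rate for each study month and comparing it with the overall rate. The sketch below assumes a rate expressed as events per 100 animal-years at risk; the function name, the unit and all counts are hypothetical illustrations and not study data.

```python
def incidence_rate(events, animal_days_at_risk, per_animal_years=100):
    """Events per `per_animal_years` animal-years at risk."""
    if animal_days_at_risk <= 0:
        raise ValueError("animal_days_at_risk must be positive")
    animal_years = animal_days_at_risk / 365.25
    return events / animal_years * per_animal_years


# Hypothetical counts: (events, animal-days at risk) per study month.
months = {
    "January": (12, 1550),
    "April": (10, 1500),
    "July": (11, 1519),
    "October": (13, 1581),
}

monthly_ir = {m: incidence_rate(e, d) for m, (e, d) in months.items()}
total_events = sum(e for e, _ in months.values())
total_days = sum(d for _, d in months.values())
overall_ir = incidence_rate(total_events, total_days)

for month, ir in monthly_ir.items():
    # A month far from the overall rate would suggest undue influence.
    print(f"{month}: {ir:.1f} (ratio to overall {ir / overall_ir:.2f})")
```

A ratio close to 1 for every month would be consistent with no single month dominating the overall estimate; what counts as "close" is a judgment that this sketch does not try to formalize.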
From our results we conclude that there might be differential under-reporting associated with the diagnosis (e.g. IR for clinical mastitis: 28.8 in farmers' data and 19.3 in the DDD), the age of the animal and specific Swedish circumstances (whether the herd was located in a state-employed or private veterinary district).
Swedish dairy herd health programmes based on routine recording of milk production, fertility data, somatic cell counts and clinical diseases
Development of a computerized dairy-herd health database for epidemiologic research
Validation of computerized Swedish dog and cat insurance data against veterinary practice records
The national Swedish animal disease recording system
Selection for mastitis resistance in dairy cattle: a review with focus on the situation in the Nordic countries
Foot/leg and udder health in relation to housing changes in Swedish dairy herds
Quality of morbidity coding in general practice computerized medical records: a systematic review
Recording disease and reproductive events in dairy cattle for developing national health and fertility selection indices. Performance recording of animals: state of the art
Clinical information for research; the use of general practice databases
Effects of diseases on reproductive performance in Swedish Red and White dairy cattle
Validation of dairy disease data, final report part one (Validering av registerdata avseende sjukdomar på Svenska Mjölkkor). Swedish Farmers' Foundation for Agricultural Research
The Medical Products Agency's provisions on the prescription and dispensing of medicinal products etc
Canine atopic dermatitis: validation of recorded diagnosis against practice records in 335 insured Swedish dogs
Risk factors associated with the incidence of veterinary-treated clinical mastitis in Swedish dairy herds with a high milk yield and a low prevalence of subclinical mastitis
Disease recording systems and herd health schemes for production diseases
Calf diseases and mortality in Swedish dairy herds
Associations between use of electric cow-trainers and clinical diseases, reproductive performance and culling in Swedish dairy cattle
Use of antimicrobial drugs in Swedish dairy calves and replacement heifers
Validation of computerized Swedish horse insurance data against veterinary clinical records
Somatic-cell count as a selection criterion for mastitis resistance in dairy-cattle
Quality of computerized medical record abstract data at a veterinary teaching hospital
Postoperative complications of elective surgeries in dogs and cats determined by examining electronic and paper medical records
Regulation on record keeping, reporting etc. (Föreskrifter om ändring i Statens Jordbruksverks föreskrifter (SJVFS 1998:38) om journalföring, uppgiftslämnande m.m.). SJVFS 2000:114
Regulation on veterinary prescription and dispensing of medicinal products for animal use
Report on Animal Disease Data. Government Offices of Sweden (Utredningen om Översyn av Djursjukdata
Farmers' choice of medical treatment of mastitis in Danish dairy herds based on qualitative research interviews
Organic dairy farmers' decision making in the first 2 years after conversion in relation to mastitis treatments
Cumulative risk of bovine mastitis treatments in Denmark, Finland, Norway and Sweden
We acknowledge the financial support from the Swedish Farmers' Foundation for Agricultural Research (Stockholm, Sweden).
Also, the authors are grateful to all the cooperating farmers for their interest and support.