key: cord-0061196-g3wfchxd authors: Adamczyk, Krzysztof; Grzesiak, Wilhelm; Zaborski, Daniel title: The Use of Artificial Neural Networks and a General Discriminant Analysis for Predicting Culling Reasons in Holstein-Friesian Cows Based on First-Lactation Performance Records date: 2021-03-06 journal: Animals (Basel) DOI: 10.3390/ani11030721 sha: 3af703b3b57875395d5b51c0433d41964effbd1e doc_id: 61196 cord_uid: g3wfchxd SIMPLE SUMMARY: Routinely collected data on the performance of dairy cows are a valuable source of information on the beginning, course, and completion of their productive life. As a result, when using sufficiently accurate methods, one can analyze and optimize the milk production process at a herd level from the breeding and economic point-of-view. In this context, it is important to have a possibility to early predict culling reasons for cows, since, in the case of finding an effective method, it would be possible to modify breeding actions and farm management practices without anticipating the end of the animals’ productive lives. Therefore, the aim of the present study was to verify whether artificial neural networks and a general discriminant analysis may be an effective tool for predicting the culling reasons in cows based on routinely collected first-lactation data. It turned out that they were most effective in predicting culling due to old age and reproductive problems. It is significant because infertility is one of the conditions that are the most difficult to eliminate in dairy herds. ABSTRACT: The aim of the present study was to verify whether artificial neural networks (ANN) may be an effective tool for predicting the culling reasons in cows based on routinely collected first-lactation records. Data on Holstein-Friesian cows culled in Poland between 2017 and 2018 were used in the present study. A general discriminant analysis (GDA) was applied as a reference method for ANN. Considering all predictive performance measures, ANN were the most effective in predicting the culling of cows due to old age (99.76–99.88% of correctly classified cases). In addition, a very high correct classification rate (99.24–99.98%) was obtained for culling the animals due to reproductive problems. It is significant because infertility is one of the conditions that are the most difficult to eliminate in dairy herds. The correct classification rate for individual culling reasons obtained with GDA (0.00–97.63%) was, in general, lower than that for multilayer perceptrons (MLP). The obtained results indicated that, in order to effectively predict the previously mentioned culling reasons, the following first-lactation parameters should be used: calving age, calving difficulty, and the characteristics of the lactation curve based on Wood’s model parameters. Routinely collected data on longevity and survivability of cows and their culling reasons may be used for the analysis of dairy cattle management and milk production profitability in individual animals, herds, or populations [1] . Therefore, these data are more frequently monitored in real time and relationships among them are analyzed in the context of animal performance prediction [2] . dairy cows, or the estimation of carcass weight in beef cattle [31] , among others. Another, more traditional statistical approach (also belonging to data mining methods) is a general discriminant analysis (GDA). In this method, discriminant function analysis problems are solved using a general multivariate linear model, in which the dependent variables are binary vectors that reflect the class membership of each case (animal) [32, 33] . GDA offers more possibilities than traditional discriminant function analysis, based on a classification rule, which allows for the correct classification of cows and the evaluation of classification accuracy depending on the adopted division criteria. In animal science, GDA has been used for dystocia detection in dairy cows [34] or the examination of factors affecting beef tenderness [35] , among others. The following research hypothesis was adopted in the present study. ANN may be an effective tool for predicting culling reasons in cows, based on routinely collected first-lactation data. Moreover, the effects of the prediction made by ANN were compared with those obtained using the GDA. The analysis included data on Holstein-Friesian cows (from 466 herds) culled in Poland between 2017 and 2018. The animals were performance recorded. The data were obtained from the SYMLEK Polish National Milk Recording System. SYMLEK is a system of databases (including the results of data analysis for breeding purposes) on the population of dairy cattle under milk recording in Poland. At the breeding level, the system is managed by the Polish Federation of Cattle Breeders and Dairy Farmers, while ZETO Software is responsible for its technical (IT) side. Based on the whole dataset of test-day records for the first lactation, the cows were grouped according to the culling reason and the age at first calving. The following culling reasons (R) were analyzed: infectious diseases (R1), respiratory system diseases (R2), low milk yield (R3), nutritive and metabolic diseases (R4), leg diseases (R5), udder diseases (R6), infertility and reproduction problems (R7), old age (R8), accidents (R9), and others (R10). The grouping was carried out in order to plot lactation curves through the calculation of Wood's model parameters within the categories of culling reasons and age at first calving. Using test-day records, 17 cow groups were distinguished according to the age at first calving (at one-month intervals, from 17 to 34 months of calving, whereas the cows calving at the age of 17 and 18 months were treated as one group). A total of 164 classes (age group × culling reason) were formed in this way. Theoretically, the number of classes should be 170 (17 age groups × 10 culling reasons), but six age groups had missing data for certain culling reasons (Table 1) . 169 227 200 2371 6 23 10 37 204 313 475 674 2046 421 573 480 5233 7 24 12 49 267 443 697 907 2956 639 742 647 7359 8 25 13 49 256 416 698 892 2945 710 772 665 7416 9 26 7 56 226 369 619 758 2529 549 600 614 6327 10 27 10 30 171 282 537 637 2128 382 495 436 5108 11 28 2 29 147 214 405 538 1658 305 401 371 4070 12 29 4 15 98 190 348 462 1345 231 314 297 3304 13 30 2 14 78 129 274 369 1025 182 249 226 2548 14 31 4 17 63 106 201 270 787 152 193 177 1970 15 32 1 11 43 84 160 205 634 103 155 139 1535 16 33 0 5 56 66 131 138 507 82 126 118 1229 17 34 5 8 26 53 104 131 432 73 88 73 993 Total -77 339 1797 2889 4953 6475 20,601 4120 5062 4566 50,879 AFC-age at first calving, R-culling reason (R1-infectious diseases, R2-respiratory system diseases, R3-low milk yield, R4-nutritive and metabolic diseases, R5-legs diseases, R6-udder diseases, R7-infertility and reproduction problems, R8-old age, R9-accidents, R10-other). The number of test-day records within each culling reason is given in brackets. Based on milk yield from test-day records, the first-lactation curve parameters were estimated separately for each group (age at first calving × culling reason). For this purpose, the mean values of milk yield from test-day records were determined for each lactation stage. Ten lactation stages were distinguished at 30-day intervals (the first lactation stage from 5 to 30 days of lactation, the second lactation stage from 31 to 60 days, the third lactation stage from 61 to 90 days, the fourth lactation stage from 91 to 120 days, the fifth lactation stage from 121 to 150 days, the sixth lactation stage from 151 to 180 days, the seventh lactation stage from 181 to 210 days, the eighth lactation stage from 211 to 240 days, the ninth lactation stage from 241 to 270 days, and the 10 th lactation stage from 271 to 305 days). For the description of the lactation curve, the gamma function proposed by Wood [36] was used: where y is the milk production (kg) at time t (days), e is Napier's constant, a is the initial milk yield, b is the rate of increase until the peak is reached, and c is the rate of decline after peak production. The regression model parameters were estimated with the quasi-Newton method [37] . A total of 948,010 test-day records (for the first 305-day lactation) from 163,369 cows were used for estimating Wood's model parameters. The estimated Wood's model parameters (a, b, c) were used as explanatory (input) variables for further analysis. When preparing the training set for classification using neural networks, only cows with a complete set of information were used. Records with less than 1 kg of milk, incomplete or erroneous ones (e.g., improbable minimal and maximal values of variables) were removed. In addition, only cows with at least nine test-day records were included in the analysis. The final dataset contained 50,879 cows. In this dataset, the following explanatory (input) variables were included: X 1 -herdsize (from 3 to 1644 cows), X 2 -age at first calving (from 17 to 34 months), X 3 -lactation length (in days), X 4 -the number of first-lactation test-day records, X 5 -X 7 -Wood's model parameters (a, b, c, respectively) for individual categories (age at first calving × culling reason). Additionally, the following production traits for the first lactation were used as predictors (minimum, maximum, mean values, and standard deviations, respectively): X 8 -X 11 -daily milk yield (kg), X 12 -X 15 -fat content (%), X 16 -X 19 -protein content (%), X 20 -X 23 -lactose content (%), X 24 -X 27 -dry matter content (%), X 28 -X 31 -urea content (mg/L), and X 32 -X 35 -somatic cell count (thousand/mL). Moreover, first-lactation nominal variables such as X 36 -calving difficulty (according to the scale used for performance recording: easy, spontaneous, difficult, very difficult, abortion, and cesarean section) and X 37 -calving season (spring from 21 March to 20 June, summer from 21 June to 21 September, autumn from 22 September to 22 December, and winter from 23 December to 20 March) were included in the model. Ultimately, the dataset was randomly divided into a training set (33,071 culling records, 65% of all observations), a validation set (used for controlling the network training process, 7632 records, 15% of all observations) and a test set (used for verifying the predictive performance of the models, 10,176 records, and 20% of all observations). The distribution of continuous and nominal predictors in individual sets is presented in Tables 2 and 3. R1-infectious diseases, R2-respiratory system diseases, R3-low milk yield, R4-nutritive and metabolic diseases, R5-leg diseases, R6-udder diseases, R7-infertility and reproduction problems, R8-old age, R9-accidents, and R10-others. Different multilayer perceptrons (MLP) with one hidden layer were analyzed. The hidden layer consisted of 5 to 30 neurons (the number of neurons was selected empirically). The number of neurons in the input layer was 45 and the calving season and calving difficulty variables were coded by four and six neurons, respectively (one-of-n encoding) (Figure 1 ). In the input layer, the min-max transformation was used for continuous variables. In the hidden and output neurons, different types of activation functions were verified (linear, logistic, hyperbolic tangent, and exponential). The networks were trained with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, which is a powerful second order training algorithm with very fast convergence but high memory requirements due to storing the Hessian matrix [38] . For each analyzed network, a given number of iterations was carried out until reaching the minimum misclassification rate on the validation set. For the evaluation of the network during its training, two error functions were considered, i.e., the sum of squares and cross-entropy. The latter is calculated as the sum of the products of real values and error logarithms for each output neuron. Based on the obtained results, the classification matrix was created on the test set and predictive performance measures were calculated (the percentage of correctly classified cases from each category and the overall accuracy). In addition, the positive predictive values (PPV) were calculated, which showed the reliability of predictions made by the neural models. Finally, a sensitivity analysis was carried out for ANN, which allowed for the ordering of predictors (input variables) according to their relative importance. This analysis was based on two criteria: an error ratio, i.e., error when input was set to mean divided by the error when input was used (for continuous predictors) or an average error when input was set to all other categorical levels divided by the error when input was used (for categorical predictors), and a rank, which ordered predictors according to their decreasing importance from one (the most important predictor) to 37 (the least important predictor). Based on the results of sensitivity analysis (Table 4 ), the set of predictors was limited to the five most discriminative ones, adopting the value of an error ratio above 1.5. The entire procedure was the same as for the networks with the full set of predictors (the selection of the best network out of 10 initial networks, classification matrix, sensitivity analysis, and gains charts). Clearly, there were differences in the number of input neurons (ten, Figure 2 ). Based on the obtained results, the classification matrix was created on the test set and predictive performance measures were calculated (the percentage of correctly classified cases from each category and the overall accuracy). In addition, the positive predictive values (PPV) were calculated, which showed the reliability of predictions made by the neural models. Finally, a sensitivity analysis was carried out for ANN, which allowed for the ordering of predictors (input variables) according to their relative importance. This analysis was based on two criteria: an error ratio, i.e., error when input was set to mean divided by the error when input was used (for continuous predictors) or an average error when input was set to all other categorical levels divided by the error when input was used (for categorical predictors), and a rank, which ordered predictors according to their decreasing importance from one (the most important predictor) to 37 (the least important predictor). Based on the results of sensitivity analysis (Table 4 ), the set of predictors was limited to the five most discriminative ones, adopting the value of an error ratio above 1.5. The entire procedure was the same as for the networks with the full set of predictors (the selection of the best network out of 10 initial networks, classification matrix, sensitivity analysis, and gains charts). Clearly, there were differences in the number of input neurons (ten, Figure 2 ). Based on the same dataset as for ANN, the GDA was carried out and the classification matrix was created on the test set for the models including all 37 initial or the five most discriminative predictors. The method of the GDA model building was described in more Based on the same dataset as for ANN, the GDA was carried out and the classification matrix was created on the test set for the models including all 37 initial or the five most discriminative predictors. The method of the GDA model building was described in more detail by Zaborski et al. [34] . In addition, the predictive performance measures were calculated (the percentage of correctly classified cases from each culling category, the overall accuracy, and PPV). In order to better illustrate the predictive abilities of the neural and GDA models, cumulative gains charts were also plotted. These charts show the relationship between the cumulative gains (the proportion of cases from a given culling category among all the cases belonging to this category) and the percentage of cases predicted by the model as belonging to this category in the whole data set [39] . A diagonal crossing the (0,0) and (1,1) points (the baseline) indicates a random model (without any predictive capabilities). Therefore, the curves located above the diagonal are preferred [the closer the line to the (0,1) point, the better the model] [40] . Statistica software (v. 13.3, Tibco Inc., Tulsa, OK, USA) was used for statistical analysis. The most effective ANN with 37 predictors had a relatively, highly correct classification rate on the training and validation set (86.73%-96.17% and 87.33%-95.96%, respectively) ( Table 5) . From among the analyzed ANN, the MLP with one hidden layer and a 45-29-10 structure (the number of neurons in the input, hidden and output layer, respectively) was selected ( Figure 1 ). This perceptron (denoted as MLP37) had the highest correct classification rate on the validation set. The applied training algorithm included 320 iterations. The cross-entropy error function was applied together with the SoftMax activation function in the network output layer. A hyperbolic tangent activation function was used in the hidden layer. The sensitivity analysis of MLP37 showed that the greatest influence on the output variable was exerted by lactation curve parameters (a, b, c), age at first calving, and calving difficulty. Their error ratio ranged from 8.870 (calving difficulty) to 188.901 (the a parameter). Therefore, these variables were used as the only input variables for the network with a reduced set of predictors. The remaining input variables for MLP37 had a much lower error ratio, i.e., below 2 (Table 4 ). In comparison with the best networks containing 37 predictors, the networks with a lower number of predictors were characterized by the lower values of a correct classification rate both on the training (71.46%-83.01%) and validation (71.40%-83.52%) set. Among these networks, the MLP with one hidden layer and a 10-19-10 structure (denoted as MLP5) was the most effective ( Figure 2 ). The applied training algorithm included 227 iterations. To evaluate the network performance during its training (like for MLP37), an entropy error function was used, which was applied together with the SoftMax activation function in the network output layer. Similarly, a hyperbolic tangent activation function was used in the hidden layer ( Table 5) . As can be seen from Table 3 , the most frequent culling reasons in the test set were: reproductive problems (4055 records), udder diseases (1314 records), and accidents and leg diseases (1047 and 1025 records, respectively) . Nevertheless, both MLP37 and MLP5 almost always correctly classified culling records from an old age category (R8) ( Table 6) . A very high correct classification rate (at least 99%) was also found for cows culled due to reproductive problems (R7). In other cases, the percentage of correct classification was 91%-97% for MLP37 (except for low milk yield-R3, for which it was 77%) and 51%-88% for MLP5. On the other hand, the lowest correct classification rate (77% and 55% for MLP37 and MLP5, respectively) was observed for low milk yield (R3). The percentage of correct classification for individual culling reasons obtained with GDA37 and GDA5 was, in general, lower than that for MLP37 and MLP5 (Table 7) . R1-infectious diseases, R2-respiratory system diseases, R3-low milk yield, R4nutritive and metabolic diseases, R5-leg diseases, R6-udder diseases, R7-infertility and reproduction problems, R8-old age, R9-accidents, and R10-other. The numbers of correctly classified cases are shown on the diagonal. A significant indicator of the predictive abilities of ANN and GDA was also the reliability of prediction. In the present study, PPV were used for this purpose (Table 7) . In general, these values were quite high for both neural models (88.31-100% for MLP37 and 68.15-100% for MLP5). For GDA, they ranged from 0% to 83.33% (GDA37) and from 0% to 100% (GDA5). In order to get an even better insight into the prediction reliability of ANN and GDA, the cumulative gains charts were plotted and analyzed, which illustrated the relationship between the percentage of correctly classified cases from a given category and the percentage of records from the dataset ordered, according to the predicted probability of the class assignment. It should be emphasized that the gain curves for most culling reasons predicted by ANN were located as much higher than the baseline [near the (0, 100%) point], which indicates a high prediction reliability of the neural models (Figures 3 and 4) . However, this time, MLP37 was (like for the previously reported results) more effective than MLP5. When interpreting gains charts for individual categories, one should also consider the percentage of cases from a given category in the whole dataset. Consequently, the course of the curve for udder diseases was not optimal (from the first 13% of observations classified with the highest probability to this category by the model, which about 80% belonged to this class). On the other hand, the curve for reproductive problems passed much closer to the baseline, even though the prediction reliability was very high. This resulted from the fact that the percentage of cases from this category in the whole dataset was about 40%. The gains obtained with both MLP37 and MLP5 were the highest (besides reproductive problems) for such culling reasons as: old age, accidents, or leg diseases. The gains for the GDA models with 37 and five predictors were much lower (Figures 5 and 6 ). In principle, the gains curves for individual categories (except for old age) were located very close to the baseline, and some curves were even below this line, which shows the uselessness of such models, since better results can be obtained in a purely random manner (without any model). n-number of records, MLP37-multilayer perceptron with one hidden layer and 37 input variables, MLP5-multilayer perceptron with one hidden layer and five input variables, GDA37-general discriminant analysis with 37 predictors, GDA5-general discriminant analysis with five predictors, Cor.-correct, Incor.-incorrect, PPV-positive predictive value, R1-infectious diseases, R2-respiratory system diseases, R3-low milk yield, R4-nutritive and metabolic diseases, R5-leg diseases, R6-udder diseases, R7-infertility and reproduction problems, R8-old age, R9-accidents, and R10-others. the whole dataset was about 40%. The gains obtained with both MLP37 and MLP5 were the highest (besides reproductive problems) for such culling reasons as: old age, accidents, or leg diseases. The gains for the GDA models with 37 and five predictors were much lower ( Figures 5 and 6 ). In principle, the gains curves for individual categories (except for old age) were located very close to the baseline, and some curves were even below this line, which shows the uselessness of such models, since better results can be obtained in a purely random manner (without any model). In the present study, old age was the most accurately predicted culling reason by both MLP37 and MLP5. In addition, prediction reliability for ANN was high for this category, which indicates that almost all animals predicted by ANN to be culled due to In the present study, old age was the most accurately predicted culling reason by both MLP37 and MLP5. In addition, prediction reliability for ANN was high for this category, which indicates that almost all animals predicted by ANN to be culled due to old age, really belonged to this category. The gains charts were also nearly optimal. Similar values of individual performance indicators were obtained for GDA. Therefore, in this case, it is really possible to include only lactation curve parameters (from Wood's model), age at first calving, and calving difficulty as the only predictors in primiparous cows. It may be of great importance from the production practice point-of-view, since, among all culling reasons, old age directly indicates the productive lifespan of dairy cows, considering the longest-living animals. The possibility of an early prediction of the maximum lifespan of cows (based on first-lactation parameters) may provide information required for the assessment of milk production profitability [41] and the modification of breeding programs for dairy cattle in terms of their longevity [42] . It should also be noted that many authors [1, [42] [43] [44] have indicated cow longevity as one of the most important measures of animal welfare. It is highly significant from both a breeding and production point-of-view and due to the increasingly higher sensitivity of dairy product consumers to the human-animal relationship [45] . Therefore, it seems that the prediction of the maximum length of productive life in high-yielding cows should be interesting for both breeders/milk producers and the food industry. On the other hand, reproductive problems, as the second culling reason (after old age), most accurately predicted by MLP37 and MLP5, belong to the most frequent difficulties encountered in production practice. It is estimated that they account for approximately 20%-40% of culled dairy cows [1, 46] . A highly correct classification rate (at least 99% in the case of ANN) for this category was also accompanied by high prediction reliability and high gains (considering the fact that this category was the most frequent one). Cows with reproductive problems were sometimes incorrectly classified to such categories as: leg diseases, udder diseases, accidents, other reasons, and (to a lesser extent) low yield, respiratory system diseases, metabolic diseases, and old age. It may have resulted from the relationship between reproductive problems in cows and other culling reasons. An association between cow fertility and respiratory system diseases [47, 48] , milk yield level [49, 50] , metabolic diseases [51] [52] [53] , leg diseases [54] [55] [56] , and udder diseases [57, 58] has been shown. In the present study, ANN incorrectly classified reproductive problems in cows in these cases. At the same time, this result supports the suggestion made by Adamczyk et al. [59] , who recommended to consider not only the ultimate culling reasons but also the mutual relationships among individual reasons and the life-history of cows when analyzing longevity and indicating culling reasons for these animals. Cows culled due to reproductive problems were also accurately predicted by GDA. However, PPV and gains were lower than those for ANN considering the proportion of this class in the whole dataset. Similar results were obtained for GDA with a reduced set of predictors. For the potential application of ANN in dairy production, prediction of culling reasons in cows should be considered in a broader context, i.e., concerning the lifetime performance of animals. In this regard, Kumar and Hooda [60] stated that artificial intelligence may be successfully applied to the prediction of lifetime milk yield of cows based on age at first calving, calving interval, and some parameters of the first and second lactation (service period, lactation milk yield, lactation length, and dry period), whereas Bhosale and Singh [61] reported that, for the effective prediction of lifetime milk yield in crossbred cows with a proportion of Holstein-Friesian genes, it is sufficient to include only the first-lactation parameters in the ANN input layer (lactation length, peak yield, and lactation total milk yield). Moreover, ANN was very effective in this case for both smaller (fewer than 10 cows) and larger herds. Considering the results reported by Bhosale and Singh [61] , it should be noted that, in the present study, the predictive abilities of ANN were confirmed based on the first-lactation data, including Wood's model parameters. Consequently, the effectiveness of ANN in predicting phenotypic milk performance traits is even more important due to the fact that it corresponds to the significant abilities of ANN to predict breeding value of dairy cattle [62] . In this context, a more traditional method such as a discriminant function analysis was much less effective when compared with ANN including 37 or five predictors. In addition, the application of GDA is associated with certain assumptions about predictors, especially multicollinearity, which limits its applicability [33] . Predictors should not be correlated with each other since this causes computational problems. These assumptions, however, are not so important for ANN. In the present study, it was shown that artificial neural networks may be an effective method of classifying cows culled due to old age based on routinely collected first-lactation data. Among the remaining culling reasons, a highly correct classification rate was observed for reproductive problems. An association between this culling reason and low milk yield, udder diseases, metabolic diseases, leg diseases, and respiratory system diseases was also confirmed in our study. It should be emphasized that, for the effective prediction of culling reasons, it was sufficient to include such first-lactation traits as calving age, calving difficulty, and the characteristics of the lactation curve (Wood's model parameters). The confirmed abilities of ANN may constitute a valuable source of information that can be used for breeding programs' modification in Holstein-Friesian cattle and economic model optimization for dairy herds. Cow longevity economics: The cost benefit of keeping the cow in the herd The validity of a monitoring system based on routinely collected dairy cattle health data relative to a standardized herd check Dairy cow culling decision. 3. Risk of culling on predicted income (an application of Bayes criterion) 1 Use of linear type and production data to supplement early predicted transmitting abilities for productive life Comparison between a Weibull proportional hazards model and a linear model for predicting the genetic merit of US Jersey sires for daughter longevity Prediction of longevity breeding values for US Holstein sires using survival analysis methodology Metabolic parameters in transition cows as indicators for early-lactation culling risk Metabolic predictors of post-partum disease and culling risk in dairy cattle Evaluating markers in selected genes for association with functional longevity of dairy cattle The Transition cow needs space and comfort Predicting first test day milk yield of dairy heifers Effects of data preprocessing on the performance of artificial neural networks for dairy yield prediction and cow culling classification Recognition of culling reasons in Polish dairy cows using data mining methods Creating a model to detect dairy cattle farms with poor welfare using a national database Associations between several aspects of heifer development and dairy cow survivability to second lactation A 100-Year Review: A century of dairy heifer research Heifer fertility and carry over consequences for life time production in dairy and beef cattle Nutrition-induced changes of growth from birth to first calving and its impact on mammary development and first-lactation milk yield in dairy heifers: A review The relationship between body energy traits and production and fitness traits in firstlactation dairy cows Relationships between age at first calving and first lactation milk yield, and lifetime productivity and longevity in dairy cows Correlations between first lactation and lifetime performance traits of Canadian Holsteins Correlations between milk yield in primiparous PHF cows and selected lifetime performance and fertility indicators as well as reasons for culling Effect of age at first calving on first lactation milk yield, lifetime milk production and longevity of cows Modeling lactation curve standards for test-day milk yield in Holstein, Brown Swiss and Simmental cows Lactation curves for milk, fat and protein in dairy cows: A full approach Examples of the use of data mining methods in animal breeding Classification of cow milk using artificial neural network developed from the spectral data of single-and three-detector spectrophotometers Using discriminant analysis and artificial neural network models for classification and prediction of fertility status of Friesian cattle A comparison of 4 predictive models of calving assistance and difficulty in dairy heifers and cows Estimation of carcass weight of Hanwoo (Korean Native Cattle) as a function of body measurements using statistical models and a neural network Introduction to Econometrics The Use of data mining methods for dystocia detection in Polish Holstein-Friesian Black-and-White Cattle Influence of the spatial organization of the perimysium on beef tenderness Algebraic model of the lactation curve in cattle Fit of standard models of the lactation curve to weekly records of milk production of cows in a single herd An improved learning algorithm based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method for back propagation neural networks Van den Poel, D. Handling class imbalance in customer churn prediction Response models based on bagging neural networks Associations between age at first calving, rearing average daily weight gain, herd milk yield and dairy herd production, reproduction, and profitability A 100-year review: Identification and genetic selection of economically important traits in dairy cattle The impact of genetic selection for increased milk yield on the welfare of dairy cows Animal Behavior and Well-Being Symposium: Optimizing outcome measures of welfare in dairy cattle assessment Public and farmer perceptions of dairy cattle welfare in the United States Associations between strain, herd size, age at first calving, culling reason and lifetime performance characteristics in Holstein-Friesian cows Associations between bovine coronavirus and bovine respiratory syncytial virus infections and animal performance in Swedish dairy herds Effects on productivity and risk factors of bovine respiratory disease in dairy heifers; a review for the Netherlands Effect of milk yield characteristics, breed, and parity on success of the first insemination in Dutch dairy cows Effect of milk production on reproductive performance in dairy herds The association of plasma glucose, BHBA, and NEFA with postpartum uterine diseases, fertility, and milk production of Holstein dairy cows Influences of nutrition and metabolism on fertility of dairy cows Nutrition, rumen health and inflammation in the transition period and their role on overall health and fertility in dairy cows Control and prevention of lameness associated with claw lesions in dairy cows Effect of claw disorders on milk production, fertility, and longevity, and their economic impact in Spanish Holstein cows Animal board invited review: Precision livestock farming for dairy cows with a focus on oestrus detection High somatic cell counts and changes in milk fat and protein contents around insemination are negatively associated with conception in dairy cows Associations between udder health and reproductive performance in United Kingdom dairy cows Analysis of lifetime performance and culling reasons in Black-and-White Holstein-Friesian cows compared with crossbreds Prediction of milk production using artificial neural network Development of lifetime milk yield equation using artificial neural network in Holstein Friesian crossbred dairy cattle and comparison with multiple linear regression model Application of neural networks with back-propagation to genome-enabled prediction of complex traits in Holstein-Friesian and German Fleckvieh cattle Data Availability Statement: The data are publicly unavailable due to data confidentiality. The authors declare no conflict of interest.