Machine Learning to Understand Relationships Between Farm Practices and Milk Fatty Acids across Diverse European Dairy Farms
M. Berton;
2025
Abstract
We studied the relationship between farm practices and the milk fatty acid (FA) profile while accounting for complex, nonlinear interactions between practices, using a dual-model machine learning workflow. Bulk tank milk was collected from 75 dairy farms in 5 European countries (CH, DE, FR, IE, IT) at 3 summer and 2 winter samplings and pooled per season, resulting in 144 pooled samples for FA analysis by gas chromatography. Farming systems ranged from very low to very high intensity, based on inputs to land and animals. Farm data were collected at each visit via questionnaires (29 features on feeding, herd, management and land use). Random Forest (RF) models were trained and validated on a train–test split, with hyperparameters tuned on the training set via cross-validation. Feature selection combined permutation importance and the Boruta algorithm. Because RF models are difficult to interpret, the top 5 features were then used to fit a Conditional Inference Tree (CIT) that ranks feature significance, following the same workflow as for RF. The top-performing models were obtained for alpha-linolenic acid (ALA), eicosapentaenoic acid (EPA), n-6 polyunsaturated FAs (PUFA), the n-6:n-3 PUFA ratio, and linoleic acid (LA) (RF: R² ≥ 0.80; CIT: R² ≥ 0.70 and ΔR² < 0.03). LA and n-6 PUFA increased with higher concentrate intake and more livestock units per hectare (LSU/ha), but decreased with longer grazing time. The n-6:n-3 ratio increased with higher concentrate intake, annual milk yield, and LSU/ha. ALA and n-3 PUFA increased with lower total dry matter (DM) and maize silage intakes and under organic production. EPA decreased with higher concentrate and DM intakes, and with higher annual milk yield. We propose this approach as a flexible, evolving framework for identifying key farm practices influencing other intrinsic milk quality traits.
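As an illustration of one step in this workflow, permutation importance can be sketched in a few lines of plain Python: each feature is shuffled in turn, and the resulting drop in R² measures how much the model relies on it. This is a minimal sketch under stated assumptions, not the authors' implementation. The feature names, the synthetic data, and the fixed linear `predict` function are all hypothetical stand-ins; in the study the predictor would be a trained Random Forest evaluated on held-out data, and Boruta and the CIT step are not reproduced here.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in R² when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other features untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical feature names and synthetic data (NOT the study's data).
FEATURES = ["concentrate_intake", "grazing_time", "unrelated_feature"]
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10) for _ in FEATURES] for _ in range(100)]
y = [2.0 * c - 1.0 * g + data_rng.gauss(0, 0.5) for c, g, _ in X]


def predict(row):
    # Assumption: stand-in for a fitted model; it ignores the third feature.
    return 2.0 * row[0] - 1.0 * row[1]


baseline, importances = permutation_importance(predict, X, y)
ranking = sorted(zip(FEATURES, importances), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: mean R² drop = {score:.3f}")
```

A feature the model never uses gets an importance of exactly zero, because shuffling it leaves every prediction unchanged. This is why permutation importance is a natural first filter before fitting a smaller, more interpretable tree on the top-ranked features.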