In a paper published in Scientific Reports, researchers compared the prediction accuracy of various methods for feed efficiency (FE) traits in Nellore cattle. They found that machine learning (ML) methods like multi-layer neural networks (MLNN) and support vector regression (SVR), along with multi-trait genomic best linear unbiased prediction (MTGBLUP), outperformed single-trait methods and Bayesian regression approaches.
MLNN and SVR increased accuracy significantly, with SVR and MTGBLUP performing similarly. These findings suggest that MLNN and SVR are promising for genetically selecting complex traits like FE.
Related Work
Past studies have highlighted the importance of improving FE in beef cattle production for increased profitability. Compared to traditional pedigree-based methods, genomic selection (GS) offers a promising approach to enhance predictive accuracy for complex traits like FE. However, GS accuracy depends on various factors, including genetic architecture and statistical methods.
While traditional parametric models like genomic best linear unbiased prediction (GBLUP) are commonly employed, there is growing interest in methods that accommodate nonadditive genetic effects, such as Bayesian regression and ML techniques. Multi-trait methods are also gaining attention for their ability to account for trait correlations. Yet, comprehensive comparisons of these methods for FE-related traits in beef cattle are lacking.
Data Utilization and Analysis Approaches
The study utilized data from 1,156 animals in an experimental breeding program at the Beef Cattle Research Center in São Paulo, Brazil, focusing on FE traits in Nellore cattle. Both phenotypic and genotypic information were collected, with animals coming from distinct selection herds within the program. Researchers assessed FE-related characteristics in a feeding trial, with animals housed individually or in group pens equipped with the GrowSafe feeding system.
Genotypic data underwent quality control, leaving a dataset of 1,024 animals genotyped for 305,128 SNP markers. Researchers conducted principal component analysis (PCA) to evaluate the population structure. The FE-related traits evaluated were average daily gain (ADG), dry matter intake (DMI), FE, and residual feed intake (RFI). The analyses accounted for factors such as contemporary group, and the trait measurements were standardized.
Researchers employed various statistical methods for genomic prediction, including single-trait genomic best linear unbiased prediction (STGBLUP), multi-trait GBLUP (MTGBLUP), Bayesian regression models (BayesA, BayesB, BayesC, Bayesian Lasso (BL), and Bayesian Ridge Regression (BRR)), and ML techniques such as MLNN and SVR.
Researchers assessed the predictive ability of these methods using validation sets and measures like Pearson's correlation, root mean squared error (RMSE), and slope of linear regression. The study also compared the performance of different models, employing Ward's hierarchical clustering method and evaluating relative differences in predictive ability. Additionally, the relevant committee granted ethical approval for the study's procedures.
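The three validation measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own code: the function names are hypothetical, and the slope is computed here by regressing observed values on predicted values, which is a common convention in genomic prediction but an assumption about this study. Under that convention, a slope near 1 suggests little inflation or deflation of the predictions.

```python
import math


def pearson(observed, predicted):
    """Pearson's correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def regression_slope(observed, predicted):
    """Slope of observed regressed on predicted (assumed convention)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((p - mean_p) * (o - mean_o) for o, p in zip(observed, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / var_p


def relative_difference(value, reference):
    """Percentage difference of one model's predictive ability over a reference."""
    return 100.0 * (value - reference) / reference


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [0.5, 1.0, 1.5, 2.0]
    print(pearson(observed, predicted))
    print(rmse(observed, predicted))
    print(regression_slope(observed, predicted))
```

In the toy example the predictions rank the animals perfectly but are compressed toward zero, so the correlation is high while the slope departs from 1. This is why accuracy and bias are reported as separate measures.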
GS Methodologies Evaluation
PCA revealed genetic stratification among selection herds, indicating diverse genetic backgrounds within the population. The first two principal components explained 3.27% of the genetic variation, with specific herds showing more genetic similarity. Heritability estimates for the FE traits were moderate, ranging from 0.21 to 0.40, indicating that genetics contributes substantially to these traits. Genetic correlations between FE-related traits varied, with FE displaying a negative correlation with dry matter intake (DMI) and RFI.
The study used forward validation to compare the predictive accuracy of the GS methodologies. The ML techniques (SVR and MLNN), alongside MTGBLUP, outperformed traditional approaches such as STGBLUP and the Bayesian methods (BayesA, BayesB, BayesC, BRR, and BL). Predictive accuracies were directly related to trait heritability, with SVR exhibiting the highest accuracy across traits.
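Forward validation generally means training on older animals and predicting younger ones, which mimics how genomic selection is applied in practice. The sketch below assumes the split is made on birth year. The field names, the cutoff, and the records are hypothetical, since the article does not state how the study partitioned its animals.

```python
def forward_split(animals, cutoff_year):
    """Split records into training (older) and validation (younger) sets.

    Animals born before `cutoff_year` go to training; the rest go to
    validation. Splitting on birth year is an assumption for illustration.
    """
    training = [a for a in animals if a["birth_year"] < cutoff_year]
    validation = [a for a in animals if a["birth_year"] >= cutoff_year]
    return training, validation


if __name__ == "__main__":
    # Hypothetical records, not data from the study.
    animals = [
        {"id": "A1", "birth_year": 2010},
        {"id": "A2", "birth_year": 2012},
        {"id": "A3", "birth_year": 2015},
        {"id": "A4", "birth_year": 2016},
    ]
    training, validation = forward_split(animals, cutoff_year=2015)
    print([a["id"] for a in training])
    print([a["id"] for a in validation])
```

Unlike random cross-validation, this kind of split never lets a model see animals younger than those it is asked to predict.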
Statistical analysis revealed significantly higher prediction accuracy with SVR and MTGBLUP than with STGBLUP, while Bayesian methods exhibited lower accuracy for specific traits. Notably, BL displayed the lowest predictive ability among the Bayesian methods. The study emphasized the importance of considering alternative approaches, including ML techniques and MTGBLUP, for improving prediction accuracy in GS studies.
Evaluation of prediction bias through regression slope coefficients highlighted empirical bias for Bayesian methods and STGBLUP, while SVR and MTGBLUP exhibited minimal bias. Hierarchical clustering based on accuracy and regression slopes placed the alternative approaches in clusters distinct from the Bayesian regression methods and STGBLUP. Furthermore, ML techniques and MTGBLUP showed considerable reductions in predictive error compared to traditional approaches, underscoring their potential for enhancing prediction quality in GS studies.
Conclusion
To sum up, the study compared ML methods (MLNN and SVR) and MTGBLUP against traditional approaches (STGBLUP and Bayesian regression) for predicting FE-related traits in Nellore cattle. SVR, MLNN, and MTGBLUP outperformed STGBLUP and Bayesian regression, with SVR and MTGBLUP showing similar prediction accuracies and less bias. These findings, together with their ease of implementation, suggest that SVR and MTGBLUP are appropriate for the genomic prediction of FE traits in Nellore cattle.