Article|07 Oct 2022|OPEN
Application of machine learning to explore the genomic prediction accuracy of fall dormancy in autotetraploid alfalfa
Fan Zhang1,2, Junmei Kang1, Ruicai Long1, Mingna Li1, Yan Sun3, Fei He1, Xueqian Jiang1, Changfu Yang1, Xijiang Yang1, Jie Kong1, Yiwen Wang4, Zhen Wang1, Zhiwu Zhang2,* and Qingchuan Yang1,*
1Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
2Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99163, USA
3Department of Turf Science and Engineering, College of Grassland Science and Technology, China Agricultural University, Beijing 100193, China
4Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne 3052, Australia
*Corresponding authors. E-mail: Zhiwu.Zhang@wsu.edu, qchyang66@163.com

Horticulture Research 10, Article number: uhac225 (2023)
doi: https://doi.org/10.1093/hr/uhac225

Received: 15 Jun 2022
Accepted: 25 Sep 2022
Published online: 07 Oct 2022

Abstract

Fall dormancy (FD) is an essential trait both for overcoming winter damage and for alfalfa (Medicago sativa) cultivar selection. Plant regrowth height after autumn clipping provides an indirect measure of FD. Transcriptomics, proteomics, and quantitative trait locus mapping have revealed crucial genes correlated with FD; however, these genes do not predict alfalfa FD well. Here, we conducted genomic prediction of FD with whole-genome SNP markers, using machine learning methods, including support vector machine (SVM) regression, and regularization methods, such as Lasso and ridge regression. SVM regression with a linear kernel and the top 3000 genome-wide association study (GWAS)-associated markers achieved the highest prediction accuracy for FD, 64.1%. For plant regrowth height, the prediction accuracy was 59.0% using the 3000 GWAS-associated markers and the linear-kernel SVM model, which was better than the result obtained with whole-genome markers (25.0%). The method we explored for alfalfa FD prediction therefore outperformed the other models, such as Lasso and ElasticNet. The study suggests that predicting FD by machine learning with GWAS-associated markers is feasible, and that combining GWAS-associated markers with machine learning would benefit FD-related traits as well. Application of the methodology may provide potential targets for FD selection, which would accelerate genetic research and molecular breeding of alfalfa with optimized FD.
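
The general idea of preselecting trait-associated markers before fitting a prediction model can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are simulated, markers are ranked by absolute correlation with the trait as a crude stand-in for GWAS, and ridge regression (one of the method families named above) replaces SVM regression so that the example needs only the Python standard library. Accuracy is reported here as the Pearson correlation between observed and cross-validated predicted phenotypes, which is an assumption about the metric. All function names and parameter values are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5 if saa > 0 and sbb > 0 else 0.0


def simulate(n_plants=200, n_markers=300, n_causal=5, seed=1):
    """Hypothetical data: allele dosages 0-4 (autotetraploid) and a noisy trait."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 4) for _ in range(n_markers)] for _ in range(n_plants)]
    # Assumption: the first n_causal markers each add their dosage to the trait.
    y = [sum(row[:n_causal]) + rng.gauss(0.0, 1.0) for row in X]
    return X, y


def top_markers(X, y, k):
    """Rank markers by |Pearson r| with the trait (a crude stand-in for GWAS)."""
    scores = [(abs(pearson([row[j] for row in X], y)), j) for j in range(len(X[0]))]
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def ridge_fit(X, y, lam):
    """Solve (X'X + lam*I) b = X'y on centered data by Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    x_bar = [sum(row[j] for row in X) / n for j in range(p)]
    y_bar = sum(y) / n
    Xc = [[row[j] - x_bar[j] for j in range(p)] for row in X]
    yc = [v - y_bar for v in y]
    A = []  # augmented matrix [X'X + lam*I | X'y]
    for i in range(p):
        row = [sum(r[i] * r[j] for r in Xc) for j in range(p)]
        row[i] += lam
        row.append(sum(r[i] * v for r, v in zip(Xc, yc)))
        A.append(row)
    for i in range(p):
        pivot = A[i][i]  # positive because X'X + lam*I is positive definite
        A[i] = [a / pivot for a in A[i]]
        for m in range(p):
            if m != i:
                f = A[m][i]
                A[m] = [a - f * b for a, b in zip(A[m], A[i])]
    beta = [A[i][p] for i in range(p)]
    intercept = y_bar - sum(b * m for b, m in zip(beta, x_bar))
    return beta, intercept


def cv_accuracy(X, y, k_markers, lam=1.0, n_folds=5, seed=0):
    """Pearson r between observed and out-of-fold predicted phenotypes."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    pred = [0.0] * len(y)
    for f in range(n_folds):
        test = idx[f::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        y_tr = [y[i] for i in train]
        # Rank markers on the training fold only, so test plants never inform selection.
        keep = top_markers([X[i] for i in train], y_tr, k_markers)
        beta, b0 = ridge_fit([[X[i][j] for j in keep] for i in train], y_tr, lam)
        for i in test:
            pred[i] = b0 + sum(b * X[i][j] for b, j in zip(beta, keep))
    return pearson(y, pred)


X, y = simulate()
accuracy = cv_accuracy(X, y, k_markers=20)
print(f"cross-validated prediction accuracy (Pearson r): {accuracy:.3f}")
```

In this sketch the marker ranking is repeated inside each training fold. Ranking on the full data set before cross-validation would let the held-out plants influence marker choice and could inflate the estimated accuracy. With scikit-learn installed, the ridge step could plausibly be swapped for `sklearn.svm.SVR(kernel="linear")` to mirror the linear-kernel SVM regression described above.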