Article|27 Jun 2022|OPEN
Nontargeted metabolomics-based multiple machine learning modeling boosts early accurate detection for citrus Huanglongbing
Zhixin Wang1, Yue Niu2, Tripti Vashisth1, Jingwen Li1, Robert Madden1, Taylor Shea Livingston1 and Yu Wang1,*
1Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
2Department of Mathematics, University of Arizona, Tucson, Arizona 85721-0089, U.S.A
*Corresponding author. E-mail: yu.wang@ufl.edu

Horticulture Research 9, Article number: uhac145 (2022)
doi: https://doi.org/10.1093/hr/uhac145

Received: 06 Mar 2022
Accepted: 20 Jun 2022
Published online: 27 Jun 2022

Abstract

Early, accurate detection of crop disease is essential for timely disease management. Huanglongbing (HLB), one of the most destructive citrus diseases, has caused severe economic losses for the global citrus industry. Direct strategies for HLB identification, such as quantitative real-time polymerase chain reaction (qPCR) and chemical staining, are robust for symptomatic plants but ineffective for asymptomatic ones at the early stage of infection. A practical method for the early detection of HLB is therefore needed. In this study, a novel method combining ultra-high performance liquid chromatography/mass spectrometry (UHPLC/MS)-based nontargeted metabolomics and machine learning (ML) was developed, for the first time, for the early detection of HLB. Six ML algorithms were selected to build the classifiers. Regularized logistic regression (LR-L2) and gradient-boosted decision tree (GBDT) outperformed the others, reaching the highest average accuracy of 95.83%; they not only classified healthy and infected plants but also identified significant features. The proposed method proved practical for early detection of HLB: it addressed the low sensitivity of the conventional methods and avoided problems such as lighting-condition interference in spectrum/image recognition-based ML methods. Additionally, the discovered biomarkers were verified by metabolic pathway analysis and content change analysis, and the results were consistent with previous reports.
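To make the classification step concrete, the following is a minimal sketch of L2-regularized logistic regression, the LR-L2 idea named in the abstract, written in plain Python. It is an illustration under stated assumptions, not the authors' pipeline. The data are synthetic Gaussian "metabolite intensities" rather than UHPLC/MS measurements. The function names (`fit_lr_l2`, `make_synthetic`), the hyperparameters, and the train/test split are all hypothetical. Ranking features by absolute coefficient is one common way to read feature importance from a linear model and may differ from the procedure used in the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_lr_l2(X, y, lam=0.01, lr=0.1, epochs=500):
    """Fit L2-regularized logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Mean log-loss gradient plus the L2 penalty term (bias is not penalized).
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    # Label 1 ("infected") when the predicted probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_synthetic(n_per_class=60, n_features=6, informative=(0, 1),
                   shift=3.0, seed=0):
    """Synthetic stand-in data (an assumption): only the `informative`
    features differ between healthy (0) and infected (1) samples."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


# Shuffle, then hold out 30% of the samples for testing (hypothetical split).
X, y = make_synthetic()
order = list(range(len(X)))
random.Random(1).shuffle(order)
cut = int(0.7 * len(order))
train_idx, test_idx = order[:cut], order[cut:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

w, b = fit_lr_l2(X_train, y_train)
test_acc = accuracy(y_test, predict(X_test, w, b))

# Rank features by absolute coefficient, largest first.
ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
print("test accuracy:", test_acc)
print("features ranked by |coefficient|:", ranked)
```

On this synthetic data the two shifted features should rank first, which mirrors how a regularized linear classifier can both separate the classes and point to candidate biomarkers. With real metabolomics data, feature scaling and cross-validation would normally come before any such ranking.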