Intercontinental prediction of soybean phenology via hybrid ensemble of knowledge-based and data-driven models

The timing of crop development has significant impacts on management decisions and subsequent yield formation. A large intercontinental dataset recording the timing of soybean developmental stages was used to establish ensembling approaches that leverage both knowledge-based, human-defined models of soybean phenology and data-driven, machine-learned models to achieve accurate and interpretable predictions. We demonstrate that the knowledge-based models can improve machine learning by generating expert-engineered features. The collection of knowledge-based and data-driven models was combined via super learning to both improve prediction and identify the most performant models. In cross-validation, stacking the predictions of the component models resulted in mean absolute errors of 4.41 days for flowering (R1) and 5.27 days for physiological maturity (R7), an improvement over the benchmark knowledge-based model errors of 6.94 and 15.53 days, respectively. The hybrid intercontinental model applies to a much wider range of management and temperature conditions than previous mechanistic models, enabling improved decision support as alternative cropping systems arise, farm sizes increase and changes in the global climate continue to accelerate.
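
The stacking idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the thermal-time rule, its parameters (`base_temp`, `thermal_req`), the synthetic data, and the grid search over a single convex weight are all assumptions chosen to keep the example small. A fixed knowledge-based rule and a cross-validated linear regression each predict days to a stage, and a meta-step picks the blend weight that minimises mean absolute error.

```python
import random


def knowledge_model(temp_c, base_temp=7.0, thermal_req=600.0):
    """Hypothetical thermal-time rule: required degree-days / daily degree-days."""
    return thermal_req / max(temp_c - base_temp, 1.0)


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a data-driven learner)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def mae(pred, obs):
    """Mean absolute error in days."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def out_of_fold(xs, ys, k=5):
    """Return per-sample predictions; the fitted learner only predicts held-out folds."""
    n = len(xs)
    kb = [knowledge_model(x) for x in xs]  # fixed rule, nothing to fit
    dd = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit_linear([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, n, k):
            dd[i] = model(xs[i])
    return kb, dd


def blend(w, kb, dd):
    """Convex combination of the two component predictions."""
    return [w * a + (1 - w) * b for a, b in zip(kb, dd)]


def stack_weight(kb, dd, ys, steps=100):
    """Grid-search the weight w in [0, 1] that minimises MAE of the blend."""
    best = min(range(steps + 1), key=lambda s: mae(blend(s / steps, kb, dd), ys))
    return best / steps


# Synthetic data (assumption): days to a stage shrink as mean temperature rises.
random.seed(0)
temps = [random.uniform(15.0, 30.0) for _ in range(200)]
days = [650.0 / (t - 8.0) + random.gauss(0.0, 3.0) for t in temps]

kb_pred, dd_pred = out_of_fold(temps, days)
w = stack_weight(kb_pred, dd_pred, days)
stacked = blend(w, kb_pred, dd_pred)

print("knowledge-based MAE:", mae(kb_pred, days))
print("data-driven MAE:", mae(dd_pred, days))
print("stacked MAE:", mae(stacked, days), "weight on knowledge model:", w)
```

Because the grid includes the weights 0 and 1, the blend can never score worse than the better single model on the data used to choose the weight. An honest error estimate would choose the weight on one set of folds and score it on another. The learned weight also hints at which component contributes most, which is the sense in which an ensemble can help identify performant models.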

Bibliographic Details
Main Authors: McCormick, Ryan F., Truong, Sandra K., Rotundo, Jose, Gaspar, Adam P., Kyle, Don, Van Eeuwijk, Fred, Messina, Carlos D.
Format: Article/Letter to editor
Language: English
Subjects: crop model, ensemble, machine learning, phenology, soybean, super learner
Online Access: https://research.wur.nl/en/publications/intercontinental-prediction-of-soybean-phenology-via-hybrid-ensem