Automated Machine Learning: A Case Study of Genomic “Image-Based” Prediction in Maize Hybrids

Machine learning methods such as multilayer perceptrons (MLP) and convolutional neural networks (CNN) have emerged as promising methods for genomic prediction (GP). In this context, we assess the performance of MLP and CNN on regression and classification tasks in a case study with maize hybrids. The genomic information was provided to the MLP as a relationship matrix and to the CNN as “genomic images.” In the regression task, the machine learning models were compared with GBLUP. In the classification task, MLP and CNN were compared with each other. For classification, the traits (plant height and grain yield) were discretized so as to create balanced (moderate selection intensity) and unbalanced (extreme selection intensity) datasets for further evaluation. An automatic hyperparameter search was performed for MLP and CNN, and the best models were reported. For both task types, several metrics were calculated under a validation scheme to assess the effect of the prediction method and other variables. Overall, MLP and CNN produced results competitive with GBLUP. We also offer new insights into automated machine learning for genomic prediction and its implications for plant breeding.
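
The discretization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `discretize_by_selection`, the selected fractions (50% for the balanced case, 10% for the unbalanced case), and the simulated trait values are all assumptions chosen for illustration, since the record does not state the thresholds used.

```python
import numpy as np


def discretize_by_selection(phenotypes, selected_fraction):
    """Label the top `selected_fraction` of hybrids as 1 (selected), the rest as 0.

    Hypothetical helper: the thresholding rule is an assumption, not the
    procedure reported by the authors.
    """
    phenotypes = np.asarray(phenotypes, dtype=float)
    # Trait value above which a hybrid counts as "selected".
    threshold = np.quantile(phenotypes, 1.0 - selected_fraction)
    return (phenotypes > threshold).astype(int)


# Simulated grain-yield values (hypothetical; not data from the study).
rng = np.random.default_rng(0)
grain_yield = rng.normal(loc=8.0, scale=1.5, size=1000)

# Moderate selection intensity: roughly half the hybrids in each class.
balanced = discretize_by_selection(grain_yield, 0.50)
# Extreme selection intensity: a small selected class (assumed 10% here).
unbalanced = discretize_by_selection(grain_yield, 0.10)

print("balanced class share:", balanced.mean())
print("unbalanced class share:", unbalanced.mean())
```

With a small selected class, plain accuracy can look high even for a model that never predicts the selected class, which is one reason to evaluate several metrics rather than a single one.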

Bibliographic Details
Main Authors: Galli, G., Sabadin, F., Yassue, R.M., Galves, C., Fanelli Carvalho, H., Crossa, J., Montesinos-Lopez, O.A., Fritsche-Neto, R.
Format: Article biblioteca
Language: English
Published: Frontiers 2022
Subjects: AGRICULTURAL SCIENCES AND BIOTECHNOLOGY, Non-Image to Image, Multilayer Perceptrons, Convolutional Neural Networks, AutoML, ACCURACY, MAIZE, HYBRIDS, PLANT BREEDING, NEURAL NETWORKS, MACHINE LEARNING
Online Access: https://hdl.handle.net/10883/22042