Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials

Costa-Neto, G.; Fritsche-Neto, R.; Crossa, J.

Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing time, and the added complexity does not always result in accuracy gains. We investigated the use of new kernel methods and modeling structures involving genomic and nongenomic sources of variation in two MET maize data sets. Five WGP models were considered, advancing in complexity from a main-effect additive model (A) to more complex structures, including dominance deviations (D), genotype × environment interaction (AE and DE), and the reaction-norm model using environmental covariables (W) and their interaction with A and D (AW + DW). A combination of those models built with three different kernel methods, Gaussian kernel (GK), Deep kernel (DK), and the benchmark genomic best linear-unbiased predictor (GBLUP/GB), was tested under three prediction scenarios: newly developed hybrids (CV1), sparse MET conditions (CV2), and new environments (CV0). GK and DK outperformed GB in prediction accuracy and in reduction of computation time (up to ~20%) under all model-kernel scenarios. GK was more efficient in capturing the variation due to A + AE and D + DE effects and translated it into accuracy gains (up to ~85% compared with GB). DK provided more consistent predictions, even for more complex structures such as W + AW + DW. Our results suggest that DK and GK are more efficient in translating model complexity into accuracy, and more suitable for including dominance and reaction-norm effects in a biologically accurate and faster way.
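
The contrast between the linear GB kernel and the nonlinear GK named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy marker matrix, the simple `Z Z' / p` scaling for GB, and the median-distance scaling with bandwidth `h = 1` for GK are all assumptions chosen for illustration. DK is not sketched here because the abstract does not define it.

```python
import math

def center_columns(X):
    """Subtract each marker's mean so every column has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def linear_kernel(X):
    """GBLUP-style linear kernel (assumed scaling): K = Z Z' / p."""
    Z = center_columns(X)
    p = len(Z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / p for zj in Z] for zi in Z]

def gaussian_kernel(X, h=1.0):
    """Gaussian kernel (assumed form): K_ij = exp(-h * d_ij^2 / median(d^2))."""
    n = len(X)
    d2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    # Median of the off-diagonal squared distances sets the scale.
    off = sorted(d2[i][j] for i in range(n) for j in range(i + 1, n))
    m = len(off)
    median = off[m // 2] if m % 2 else 0.5 * (off[m // 2 - 1] + off[m // 2])
    return [[math.exp(-h * d2[i][j] / median) for j in range(n)]
            for i in range(n)]

# Hypothetical toy data: 4 hybrids x 5 markers coded 0/1/2 (made-up values).
markers = [
    [0, 1, 2, 0, 1],
    [0, 1, 2, 1, 1],
    [2, 0, 0, 1, 2],
    [1, 2, 1, 0, 0],
]
K_gb = linear_kernel(markers)    # linear relationship between hybrids
K_gk = gaussian_kernel(markers)  # similarity decays nonlinearly with distance
```

In this sketch the Gaussian kernel depends on the markers only through pairwise distances, so hybrids that differ at few markers receive similarities close to 1 and distant hybrids approach 0, whereas the linear kernel grows with the centered cross-product.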

Bibliographic Details
Main Authors:	Costa-Neto, G., Fritsche-Neto, R., Crossa, J.
Format:	Article biblioteca
Language:	English
Published:	Springer Nature 2021
Subjects:	AGRICULTURAL SCIENCES AND BIOTECHNOLOGY, EVOLUTION, GENOMICS, MODELS
Online Access:	https://hdl.handle.net/10883/20953

id	dig-cimmyt-10883-20953
record_format	koha
timestamp	2021-02-25T19:12:30Z
repository_date	2020-09-29T00:10:14Z
journal	Heredity
volume	126
pages	92-106
version	Published Version
doi	10.1038/s41437-020-00353-1
handle	https://hdl.handle.net/10883/20953
related_links	https://hdl.handle.net/11529/10887; https://data.mendeley.com/datasets/tpcw383fkm/3
place	Harlow (United Kingdom)
file_format	PDF
rights	Open Access. CIMMYT manages Intellectual Assets as International Public Goods. The user is free to download, print, store and share this work. To translate or create any other derivative work and share or distribute it, please contact CIMMYT-Knowledge-Center@cgiar.org indicating the work you want to use and the kind of use you intend; CIMMYT will contact you with the suitable license for that purpose.
institution	CIMMYT
collection	DSpace
country	México
countrycode	MX
component	Bibliographic
access	Online
databasecode	dig-cimmyt
tag	biblioteca
region	North America
libraryname	CIMMYT Library
language	English
topic	AGRICULTURAL SCIENCES AND BIOTECHNOLOGY, EVOLUTION, GENOMICS, MODELS
format	Article
author	Costa-Neto, G.; Fritsche-Neto, R.; Crossa, J.
title	Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials
publisher	Springer Nature
publishDate	2021
url	https://hdl.handle.net/10883/20953