Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials

Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing times, which do not always result in accuracy gains. We investigated the use of new kernel methods and modeling structures involving genomics and nongenomic sources of variation in two MET maize data sets. Five WGP models were considered, advancing in complexity from a main-effect additive model (A) to more complex structures, including dominance deviations (D), genotype × environment interaction (AE and DE), and the reaction-norm model using environmental covariables (W) and their interaction with A and D (AW + DW). A combination of those models built with three different kernel methods, Gaussian kernel (GK), Deep kernel (DK), and the benchmark genomic best linear-unbiased predictor (GBLUP/GB), was tested under three prediction scenarios: newly developed hybrids (CV1), sparse MET conditions (CV2), and new environments (CV0). GK and DK outperformed GB in prediction accuracy and reduction of computation time (up to ~20%) under all model-kernel scenarios. GK was more efficient in capturing the variation due to A + AE and D + DE effects and translated it into accuracy gains (up to ~85% compared with GB). DK provided more consistent predictions, even for more complex structures such as W + AW + DW. Our results suggest that DK and GK are more efficient in translating model complexity into accuracy, and more suitable for including dominance and reaction-norm effects in a biologically accurate and faster way.
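
To make the contrast between the linear benchmark (GB) and a nonlinear kernel (GK) concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names, the scaling of the linear kernel to a unit mean diagonal, the median-distance scaling of the Gaussian kernel, the `bandwidth` parameter, and the simulated marker matrix are all assumptions made for illustration. The Deep kernel is not sketched because the abstract does not define it.

```python
import numpy as np


def linear_kernel(markers):
    """GBLUP-style linear kernel (assumed form): centered markers, mean diagonal scaled to 1."""
    centered = markers - markers.mean(axis=0)
    cross_products = centered @ centered.T
    return cross_products / np.mean(np.diag(cross_products))


def gaussian_kernel(markers, bandwidth=1.0):
    """Gaussian kernel on squared Euclidean distances between genotypes.

    Distances are divided by their median (an assumed, commonly used scaling)
    so that `bandwidth` is comparable across marker panels.
    """
    centered = markers - markers.mean(axis=0)
    squared_norms = np.sum(centered ** 2, axis=1)
    squared_dist = squared_norms[:, None] + squared_norms[None, :] - 2.0 * centered @ centered.T
    squared_dist = np.maximum(squared_dist, 0.0)  # guard against tiny negative round-off
    np.fill_diagonal(squared_dist, 0.0)
    median_dist = np.median(squared_dist[np.triu_indices_from(squared_dist, k=1)])
    return np.exp(-bandwidth * squared_dist / median_dist)


# Hypothetical data: 6 genotypes x 50 markers coded 0/1/2 (simulated, not from the study).
rng = np.random.default_rng(0)
additive_markers = rng.integers(0, 3, size=(6, 50)).astype(float)
# One simple dominance coding (an assumption): 1 for heterozygotes, 0 otherwise.
dominance_markers = (additive_markers == 1).astype(float)

K_gb_additive = linear_kernel(additive_markers)
K_gk_additive = gaussian_kernel(additive_markers)
K_gk_dominance = gaussian_kernel(dominance_markers)
```

Either kernel can then serve as the covariance structure of a main effect such as A or D in a mixed model. The same functions applied to a matrix of environmental covariables would give an environmental relatedness kernel for W, again as an illustration rather than the procedure used in the article.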

Bibliographic Details
Main Authors: Costa-Neto, G., Fritsche-Neto, R., Crossa, J.
Format: Article biblioteca
Language: English
Published: Springer Nature 2021
Subjects: AGRICULTURAL SCIENCES AND BIOTECHNOLOGY, EVOLUTION, GENOMICS, MODELS
Online Access: https://hdl.handle.net/10883/20953
id dig-cimmyt-10883-20953
record_format koha
spelling dig-cimmyt-10883-20953 2021-02-25T19:12:30Z
92-106
2020-09-29T00:10:14Z
2020-09-29T00:10:14Z
2021
Article
Published Version
https://hdl.handle.net/10883/20953
10.1038/s41437-020-00353-1
English
https://hdl.handle.net/11529/10887
https://data.mendeley.com/datasets/tpcw383fkm/3
CIMMYT manages Intellectual Assets as International Public Goods. The user is free to download, print, store and share this work. In case you want to translate or create any other derivative work and share or distribute such translation/derivative work, please contact CIMMYT-Knowledge-Center@cgiar.org indicating the work you want to use and the kind of use you intend; CIMMYT will contact you with the suitable license for that purpose.
Open Access
PDF
Harlow (United Kingdom)
Springer Nature
126
Heredity
institution CIMMYT
collection DSpace
country Mexico
countrycode MX
component Bibliographic
access Online
databasecode dig-cimmyt
tag biblioteca
region North America
libraryname CIMMYT Library
language English
topic AGRICULTURAL SCIENCES AND BIOTECHNOLOGY
EVOLUTION
GENOMICS
MODELS
format Article
author Costa-Neto, G.
Fritsche-Neto, R.
Crossa, J.
title Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials
publisher Springer Nature
publishDate 2021
url https://hdl.handle.net/10883/20953