A comparison of methods for training population optimization in genomic selection

20 pages.

Bibliographic Details
Main Authors: Fernández-González, Javier, Akdemir, Deniz, Isidro-Sánchez, Julio
Other Authors: Conferencia de Rectores de las Universidades Españolas
Format: article, library
Language: English
Published: Springer Nature 2023-03-09
Online Access: http://hdl.handle.net/10261/310782
http://dx.doi.org/10.13039/501100011033
http://dx.doi.org/10.13039/501100003339
https://api.elsevier.com/content/abstract/scopus_id/85149972898
id dig-inia-es-10261-310782
record_format koha
spelling dig-inia-es-10261-310782 2024-10-26T20:44:09Z
A comparison of methods for training population optimization in genomic selection
Fernández-González, Javier; Akdemir, Deniz; Isidro-Sánchez, Julio
Conferencia de Rectores de las Universidades Españolas; Consejo Superior de Investigaciones Científicas (España); Ministerio de Educación y Formación Profesional (España); Agencia Estatal de Investigación (España)
Fernández-González, Javier [0000-0002-2109-7783]; Isidro-Sánchez, Julio [0000-0002-9044-3221]; Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72]
20 pages.

Maximizing CDmean and Avg_GRM_self were the best criteria for training set optimization. A training set size of 50-55% (targeted) or 65-85% (untargeted) is needed to obtain 95% of the accuracy.

With the advent of genomic selection (GS) as a widespread breeding tool, mechanisms to efficiently design an optimal training set for GS models have become more relevant, since they allow maximizing accuracy while minimizing phenotyping costs. The literature describes many training set optimization methods, but a comprehensive comparison among them is lacking. This work aimed to provide an extensive benchmark of optimization methods and optimal training set sizes, and to derive guidelines for their application in breeding programs. To that end, a wide range of methods was tested in seven datasets covering six species, different genetic architectures, population structures, and heritabilities, and with several GS models.

Our results showed that targeted optimization (which uses information from the test set) performed better than untargeted optimization (which does not use test set data), especially when heritability was low. The mean coefficient of determination was the best targeted method, although it was computationally intensive. Minimizing the average relationship within the training set was the best strategy for untargeted optimization.

Regarding the optimal training set size, maximum accuracy was obtained when the training set was the entire candidate set. Nevertheless, 50-55% of the candidate set was enough to reach 95-100% of the maximum accuracy in the targeted scenario, while 65-85% was needed for untargeted optimization. Our results also suggested that a diverse training set makes GS robust against population structure, while including clustering information was less effective. The choice of the GS model did not have a significant influence on the prediction accuracies.

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. JIS and JFG were supported by the Beatriz Galindo Program (BEAGAL18/00115) from the Ministerio de Educación y Formación Profesional of Spain and the Severo Ochoa Program for Centres of Excellence in R&D from the Agencia Estatal de Investigación of Spain, Grant SEV-2016-0672 (2017-2021) to the CBGP.

Peer reviewed
2023-06-06T11:55:43Z 2023-06-06T11:55:43Z 2023-03-09
article http://purl.org/coar/resource_type/c_6501
Theoretical and Applied Genetics 136(3): e30 (2023)
0040-5752
http://hdl.handle.net/10261/310782
10.1007/s00122-023-04265-6
1432-2242
http://dx.doi.org/10.13039/501100011033
http://dx.doi.org/10.13039/501100003339
36892603
2-s2.0-85149972898
https://api.elsevier.com/content/abstract/scopus_id/85149972898
en
info:eu-repo/grantAgreement/MEFP//BEAGAL//18/00115
Centro de Biotecnología y Genómica de Plantas (CBGP)
Publisher's version https://doi.org/10.1007/s00122-023-04265-6
Yes open application/pdf
Springer Nature
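
The untargeted strategy named in the abstract, minimizing the average relationship within the training set, can be illustrated with a small sketch. This is not the authors' implementation: the criterion is assumed here to be the mean of the off-diagonal entries of the genomic relationship matrix (GRM) restricted to the training set, the search is a plain random-swap heuristic swapped in for whatever optimizer the paper used, and the names `avg_relationship`, `select_training_set`, `family`, and `toy_grm` are hypothetical.

```python
import random
from itertools import combinations


def avg_relationship(grm, subset):
    """Mean off-diagonal relationship among the individuals in `subset`.

    Assumption: this stands in for the 'average relationship within the
    training set'; the paper's exact definition is not given in this record.
    """
    pairs = list(combinations(subset, 2))
    return sum(grm[i][j] for i, j in pairs) / len(pairs)


def select_training_set(grm, size, n_iter=500, seed=0):
    """Random-swap search: keep a swap only if it lowers the criterion.

    Requires 2 <= size < len(grm). Untargeted: no test-set data is used.
    """
    rng = random.Random(seed)
    n = len(grm)
    train = rng.sample(range(n), size)
    best = avg_relationship(grm, train)
    for _ in range(n_iter):
        outside = [i for i in range(n) if i not in train]
        trial = list(train)
        # Replace one randomly chosen member with a random non-member.
        trial[rng.randrange(size)] = rng.choice(outside)
        score = avg_relationship(grm, trial)
        if score < best:
            train, best = trial, score
    return sorted(train), best


# Toy GRM (made-up values): two families of three individuals,
# relationship 0.5 within a family and 0.0 between families.
family = [0, 0, 0, 1, 1, 1]
toy_grm = [
    [1.0 if i == j else (0.5 if family[i] == family[j] else 0.0)
     for j in range(6)]
    for i in range(6)
]

train, score = select_training_set(toy_grm, size=4)
print(train, round(score, 4))
```

With this block-structured toy matrix, the lowest average is reached by taking two individuals from each family, which matches the abstract's observation that a diverse training set is preferable. A real analysis would build the GRM from marker data and would likely need a stronger optimizer for realistic candidate set sizes.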
institution INIA ES
collection DSpace
country Spain
countrycode ES
component Bibliographic
access Online
databasecode dig-inia-es
tag biblioteca
region Southern Europe
libraryname Biblioteca del INIA España
language English