A Bayesian genomic regression model with skew normal random errors

Genomic selection (GS) has become a tool for selecting candidates in plant and animal breeding programs. In the case of quantitative traits, it is common to assume that the distribution of the response variable can be approximated by a normal distribution. However, it is known that the selection process leads to skewed distributions. There is vast statistical literature on skewed distributions, but the skew normal distribution is of particular interest in this research. This distribution includes a third parameter that drives the skewness, so that it generalizes the normal distribution. We propose an extension of the Bayesian whole-genome regression to skew normal distribution data in the context of GS applications, where usually the number of predictors vastly exceeds the sample size. However, it can also be applied when the number of predictors is smaller than the sample size. We used a stochastic representation of a skew normal random variable, which allows the implementation of standard Markov Chain Monte Carlo (MCMC) techniques to efficiently fit the proposed model. The predictive ability and goodness of fit of the proposed model were evaluated using simulated and real data, and the results were compared to those obtained by the Bayesian Ridge Regression model. Results indicate that the proposed model has a better fit and is as good as the conventional Bayesian Ridge Regression model for prediction, based on the DIC criterion and cross-validation, respectively. A computing program coded in the R statistical package and C programming language to fit the proposed model is available as supplementary material.
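The stochastic representation mentioned in the abstract can be illustrated with a short sketch. The snippet below is not the authors' R/C program; it only shows the standard hidden-truncation construction of a skew normal variate, Z = delta * |U0| + sqrt(1 - delta^2) * U1 with U0 and U1 independent standard normals and delta = lambda / sqrt(1 + lambda^2). The function names and the location/scale/shape parameter names (xi, omega, lam) are assumptions chosen for this illustration, and the paper may use a different parametrization.

```python
import math
import random


def skew_normal_draw(rng, xi=0.0, omega=1.0, lam=0.0):
    """Draw one skew normal variate via the hidden-truncation representation.

    xi, omega and lam (location, scale, shape) are illustrative names,
    not taken from the paper.
    """
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0 = abs(rng.gauss(0.0, 1.0))  # latent half-normal variable
    u1 = rng.gauss(0.0, 1.0)       # independent standard normal
    z = delta * u0 + math.sqrt(1.0 - delta * delta) * u1
    return xi + omega * z


def sample_skewness(values):
    """Moment-based sample skewness (third central moment over sd cubed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(2018)  # fixed seed so the sketch is reproducible
symmetric = [skew_normal_draw(rng, lam=0.0) for _ in range(50_000)]
right_skewed = [skew_normal_draw(rng, lam=5.0) for _ in range(50_000)]

print("skewness with lam=0:", round(sample_skewness(symmetric), 3))
print("skewness with lam=5:", round(sample_skewness(right_skewed), 3))
```

Setting lam to zero recovers the normal distribution, which is the sense in which the skew normal generalizes it. Conditional on the latent half-normal term, the variate is normal, and this is what makes data-augmentation MCMC schemes of the kind the abstract refers to convenient.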

Bibliographic Details
Main Authors: Perez-Rodriguez, P.; Acosta-Pech, R.; Pérez-Elizalde, S.; Velasco Cruz, C.; Suarez Espinosa, J.; Crossa, J.
Format: Article
Language: English
Published: Genetics Society of America 2018
Subjects: AGRICULTURAL SCIENCES AND BIOTECHNOLOGY, Genomic Selection, Data Augmentation, Asymmetric Distributions, GBLUP, Ridge Regression, GenPred, Shared Data Resources, BAYESIAN THEORY, REGRESSION ANALYSIS, STATISTICAL METHODS
Online Access: https://hdl.handle.net/10883/19493
Record ID: dig-cimmyt-10883-19493
Journal: G3: Genes, Genomes, Genetics, Volume 8, Issue 5, Pages 1771-1785
ISSN: 2160-1836 (Online)
DOI: 10.1534/g3.117.300406
Place of Publication: Bethesda, Maryland, U.S.
Date Available: 2018-05-29T21:04:04Z
Supplementary Material: https://www.g3journal.org/highwire/filestream/489330/field_highwire_adjunct_files/0/FileS1.zip
Access: Open Access (PDF)
Rights: CIMMYT manages Intellectual Assets as International Public Goods. The user is free to download, print, store and share this work. In case you want to translate or create any other derivative work and share or distribute such translation/derivative work, please contact CIMMYT-Knowledge-Center@cgiar.org indicating the work you want to use and the kind of use you intend; CIMMYT will contact you with the suitable license for that purpose.
institution CIMMYT
collection DSpace
country México
countrycode MX
component Bibliographic
access Online
databasecode dig-cimmyt
tag library
region North America
libraryname CIMMYT Library
language English