Monte Carlo methods for estimating Mallows's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data

Mallows's Cp and the Akaike information criterion (AIC) are common criteria for selecting the dimensionality of regression models, as alternatives to cross-validation (CV) and the nonparametric bootstrap. A key parameter in the calculation of Cp and AIC is the effective number of degrees of freedom of the model, or model complexity (d). Parameter d is generally easy to calculate for linear smoothers, that is, models for which the prediction of the training response y is given by ŷ = Sy, where S is a projector matrix that does not involve y. Nevertheless, d is more difficult to estimate for nonlinear smoothers, such as partial least squares regression (PLSR). In this article, we present two algorithms for estimating d for PLSR based on Monte Carlo simulation methods (parametric bootstrap and perturbation analysis), including the particular case of high-dimensional data. We compare these Monte Carlo methods with three previously published algorithms. We use the d estimates to compute Cp and AIC and to select PLSR model dimensionalities, which we then compare with those selected by CV. Two real, heterogeneous agronomic near-infrared (NIR) datasets are considered as examples.
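To make the notion of model complexity d concrete, here is a minimal sketch for a linear smoother, where d can be checked exactly. It is not the article's algorithm for PLSR. It assumes a simple least-squares line fit, for which d equals the trace of S (here 2), and compares that value with a generic Monte Carlo perturbation estimate based on the covariance between small random perturbations of y and the resulting changes in the fitted values. The function names, the perturbation scale `tau` and the number of replicates `n_rep` are illustrative choices, not taken from the article.

```python
import random


def fit_line(x, y):
    """Fitted values of the least-squares line y ~ a + b*x (a linear smoother)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    return [y_mean + slope * (xi - x_mean) for xi in x]


def exact_df(x):
    """trace(S) for the line fit, using S_ii = 1/n + (x_i - mean)^2 / Sxx."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sum(1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x)


def perturbation_df(fit, x, y, n_rep=2000, tau=0.1, seed=0):
    """Monte Carlo estimate of d: mean of sum_i delta_i * (change in fit_i) / tau^2.

    `fit` is any function (x, y) -> fitted values; tau and n_rep are
    illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    base = fit(x, y)
    total = 0.0
    for _ in range(n_rep):
        # Perturb the response with small Gaussian noise and refit.
        delta = [rng.gauss(0.0, tau) for _ in y]
        pert = fit(x, [yi + di for yi, di in zip(y, delta)])
        total += sum(di * (pi - bi) for di, pi, bi in zip(delta, pert, base))
    return total / (n_rep * tau ** 2)


# Synthetic data, generated only for this illustration.
data_rng = random.Random(1)
x = [i / 10 for i in range(30)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0.0, 0.5) for xi in x]

d_exact = exact_df(x)
d_mc = perturbation_df(fit_line, x, y)
print(round(d_exact, 3), round(d_mc, 3))
```

Because `perturbation_df` only needs a function that returns fitted values, the same call could in principle be applied to a nonlinear smoother for which no matrix S is available. How well such an estimate behaves for PLSR is the question the article itself studies.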

Bibliographic Details
Main Authors: Lesnoff, Matthieu, Roger, Jean-Michel, Rutledge, Douglas N.
Format: article biblioteca
Language: eng
Subjects: simulation models, mathematical models, forecasting techniques, selection criteria, statistical methods, http://aims.fao.org/aos/agrovoc/c_24242, http://aims.fao.org/aos/agrovoc/c_24199, http://aims.fao.org/aos/agrovoc/c_3041, http://aims.fao.org/aos/agrovoc/c_1078, http://aims.fao.org/aos/agrovoc/c_7377
Online Access: http://agritrop.cirad.fr/600014/
http://agritrop.cirad.fr/600014/1/lesnoff2021.pdf
id dig-cirad-fr-600014
record_format koha
spelling dig-cirad-fr-600014 2024-01-29T05:44:56Z http://agritrop.cirad.fr/600014/
Monte Carlo methods for estimating Mallows's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data. Lesnoff Matthieu, Roger Jean-Michel, Rutledge Douglas N. 2021. Journal of Chemometrics, 35 (10): e3369, 21 p. https://doi.org/10.1002/cem.3369
Journal Article, published version (info:eu-repo/semantics/article, info:eu-repo/semantics/publishedVersion)
Cirad license, restricted access (info:eu-repo/semantics/restrictedAccess), https://agritrop.cirad.fr/mention_legale.html
info:eu-repo/semantics/altIdentifier/doi/10.1002/cem.3369
institution CIRAD FR
collection DSpace
country France
countrycode FR
component Bibliographic
access Online
databasecode dig-cirad-fr
tag biblioteca
region Western Europe
libraryname Biblioteca del CIRAD Francia
language eng
topic simulation models
mathematical models
forecasting techniques
selection criteria
statistical methods
http://aims.fao.org/aos/agrovoc/c_24242
http://aims.fao.org/aos/agrovoc/c_24199
http://aims.fao.org/aos/agrovoc/c_3041
http://aims.fao.org/aos/agrovoc/c_1078
http://aims.fao.org/aos/agrovoc/c_7377
format article
author Lesnoff, Matthieu
Roger, Jean-Michel
Rutledge, Douglas N.
title Monte Carlo methods for estimating Mallows's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data