Ranking of a wide multidomain set of predictor variables of children obesity by machine learning variable importance techniques

The increasing prevalence of childhood obesity is expected to translate, in the near future, into a concomitant surge in multiple cardio-metabolic diseases. Obesity has a complex, multifactorial etiology that includes multiple potential risk factors from several domains: genetics, dietary and physical activity habits, socio-economic environment, lifestyle, etc. In addition, all these factors are expected to exert their influence in a specific and especially convoluted way during childhood, given the rapid growth during this period. Machine Learning methods are appropriate tools to model this complexity, given their ability to cope with high-dimensional, non-linear data. Here, we have used Machine Learning to analyze a sample of 221 children (6–9 years) from Madrid, Spain. Both Random Forest and Gradient Boosting Machine models have been derived to predict the body mass index from a wide set of 190 multidomain variables (including age, sex, genetic polymorphisms, lifestyle, socio-economic, diet, exercise, and gestation variables). A consensus relative importance of the predictors has been estimated through variable importance measures, implemented robustly through an iterative process that included permutation and multiple imputation. We expect that this analysis will help to shed light on the most important variables associated with childhood obesity, in order to choose better treatments for its prevention.
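
As a rough illustration of the permutation-based variable importance mentioned in the abstract, the following Python sketch shuffles one predictor at a time and records how much the prediction error grows, then averages the rescaled scores of two models into a consensus ranking. It is only a sketch under stated assumptions: it does not reproduce the study's Random Forest or Gradient Boosting Machine models, its multiple-imputation step, or its data. The toy data, the stand-in predictors `model_a` and `model_b`, the rescaling to a maximum of 100, and all function names are hypothetical.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Increase in MSE when each column of X is shuffled, averaged over repeats.

    `predict` is any callable mapping a list of rows to a list of predictions;
    here it stands in for a fitted model (an assumption of this sketch).
    """
    rng = random.Random(seed)
    baseline = mse(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, predict(X_perm)) - baseline)
        importances.append(mean(increases))
    return importances


def relative_importance(importances):
    """Rescale so the largest importance is 100; negatives are clipped to 0."""
    clipped = [max(v, 0.0) for v in importances]
    top = max(clipped)
    return [100.0 * v / top if top > 0 else 0.0 for v in clipped]


def consensus(per_model_importances):
    """Average the relative importances obtained from several models."""
    scaled = [relative_importance(imp) for imp in per_model_importances]
    return [mean(values) for values in zip(*scaled)]


# Toy data: the outcome depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * r[0] + 0.5 * r[1] + data_rng.gauss(0, 0.1) for r in X]


def model_a(rows):
    # Hypothetical stand-in for one fitted model.
    return [3.0 * r[0] + 0.5 * r[1] for r in rows]


def model_b(rows):
    # Hypothetical stand-in for a second model with slightly different weights.
    return [2.8 * r[0] + 0.6 * r[1] for r in rows]


scores = consensus([
    permutation_importance(model_a, X, y, seed=1),
    permutation_importance(model_b, X, y, seed=2),
])
ranking = sorted(range(3), key=lambda j: scores[j], reverse=True)
print(ranking, [round(s, 1) for s in scores])
```

Because the importance is computed from predictions alone, the same function can be applied to any fitted regressor, which is what makes averaging scores across different model families straightforward.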

Bibliographic Details
Main Authors: Marcos-Pasero, Helena, Colmenarejo, Gonzalo, Aguilar-Aguilar, Elena, Ramírez de Molina, Ana, Reglero, Guillermo, Loria-Kohen, Viviana
Format: article, library
Language: English
Published: Springer Nature 2021
Online Access: http://hdl.handle.net/10261/250683
id dig-cial-es-10261-250683
record_format koha
Peer reviewed 2021-09-21T11:32:00Z 2021-09-21T11:32:00Z 2021 article http://purl.org/coar/resource_type/c_6501 Scientific Reports 11: 1910 (2021) http://hdl.handle.net/10261/250683 10.1038/s41598-021-81205-8 2045-2322 33479310 en Publisher's version https://doi.org/10.1038/s41598-021-81205-8 Yes open Springer Nature
institution CIAL ES
collection DSpace
country Spain
countrycode ES
component Bibliographic
access Online
databasecode dig-cial-es
tag library
region Southern Europe
libraryname Biblioteca del CIAL España
language English
format article
author Marcos-Pasero, Helena
Colmenarejo, Gonzalo
Aguilar-Aguilar, Elena
Ramírez de Molina, Ana
Reglero, Guillermo
Loria-Kohen, Viviana
author_sort Marcos-Pasero, Helena
title Ranking of a wide multidomain set of predictor variables of children obesity by machine learning variable importance techniques
publisher Springer Nature
publishDate 2021
url http://hdl.handle.net/10261/250683