Identification of population-informative markers from high-density genotyping data through combined feature selection and machine learning algorithms: Application to European autochthonous and cosmopolitan pig breeds

13 pp.

Bibliographic Details
Main Authors: Schiavo, Giuseppina, Bertolini, F., Bovo, Samuele, Galimberti, Giuliano, Muñoz, María, Bozzi, Riccardo, Čandek-Potokar, M., Óvilo Martín, Cristina, Fontanesi, Luca
Other Authors: Università di Bologna
Format: article, library
Language: English
Published: John Wiley & Sons 2024-01-08
Subjects: Sus scrofa, SNP, Genome, Population genomics, Random forest, Signatures of selection
Online Access: http://hdl.handle.net/10261/356715
http://dx.doi.org/10.13039/501100005969
http://dx.doi.org/10.13039/501100000780
https://api.elsevier.com/content/abstract/scopus_id/85181719006
id dig-inia-es-10261-356715
record_format koha
institution INIA ES
collection DSpace
country Spain
countrycode ES
component Bibliographic
access Online
databasecode dig-inia-es
tag library
region Southern Europe
libraryname Biblioteca del INIA España
language English
topic Sus scrofa
SNP
Genome
Population genomics
Random forest
Signatures of selection
description 13 pp.
author2 Università di Bologna
format article
author Schiavo, Giuseppina
Bertolini, F.
Bovo, Samuele
Galimberti, Giuliano
Muñoz, María
Bozzi, Riccardo
Čandek-Potokar, M.
Óvilo Martín, Cristina
Fontanesi, Luca
author_sort Schiavo, Giuseppina
title Identification of population-informative markers from high-density genotyping data through combined feature selection and machine learning algorithms: Application to European autochthonous and cosmopolitan pig breeds
publisher John Wiley & Sons
publishDate 2024-01-08
url http://hdl.handle.net/10261/356715
http://dx.doi.org/10.13039/501100005969
http://dx.doi.org/10.13039/501100000780
https://api.elsevier.com/content/abstract/scopus_id/85181719006
_version_ 1816136646455721984
spelling dig-inia-es-10261-356715 2024-10-28T21:36:40Z
Institutions: Università di Bologna; European Commission; CSIC - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA); Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72]
ORCID: Schiavo, Giuseppina [0000-0002-3497-1337]; Bertolini, F. [0000-0003-4181-3895]; Bovo, Samuele [0000-0002-5712-8211]; Galimberti, Giuliano [0000-0002-9161-9671]; Bozzi, Riccardo [0000-0001-8854-0834]; Čandek-Potokar, M. [0000-0003-0231-126X]; Óvilo Martín, Cristina [0000-0002-5738-8435]; Fontanesi, Luca [0000-0001-7050-3760]
Subjects: Sus scrofa, SNP, Genome, Population genomics, Random forest, Signatures of selection
Extent: 13 pp.
Abstract: Large genotyping datasets, obtained from high-density single nucleotide polymorphism (SNP) arrays developed for different livestock species, can be used to describe and differentiate breeds or populations. A few statistical approaches have been proposed to identify the most discriminating genetic markers among thousands of genotyped SNPs. In this study, we applied the Boruta algorithm, a wrapper of the machine learning random forest algorithm, to a database of 23 European pig breeds (20 autochthonous and three cosmopolitan breeds) genotyped with a 70k SNP chip, to pre-select informative SNPs. To identify different sets of SNPs, these pre-selected markers were then ranked with random forest based on their mean decrease accuracy and mean decrease Gini indexes. We evaluated the efficiency of these subsets for breed classification and the usefulness of this approach to detect candidate genes affecting breed-specific phenotypes and relevant production traits that might differ among breeds. The lowest overall classification error (2.3%) was reached with a subpanel including only 398 SNPs (ranked based on their mean decrease accuracy), with no classification error in seven breeds using up to 49 SNPs. Several SNPs of these selected subpanels were in genomic regions in which previous studies had identified signatures of selection or genes associated with morphological or production traits that distinguish the analysed breeds. Therefore, even though these approaches were not originally designed to identify signatures of selection, the results showed that they could potentially be useful for this purpose.
Funding: This work received funding from the University of Bologna RFO 2016–2019 programme and from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 634476 for the project with the acronym TREASURE. The content of this article reflects only the authors' view, and the European Union Agency is not responsible for any use that may be made of the information it contains.
Acknowledgements: The authors thank the members of the TREASURE consortium for providing samples: Estefania Alves, Yolanda Núñez, Ana I. Fernandez, Fabián García, Juan M. García-Casco (Departamento Mejora Genética Animal, INIA-CSIC, Spain), José P. Araújo (Centro de Investigação de Montanha (CIMO), Portugal), Rui Charneca, José Manuel Martins (MED – Instituto Mediterrâneo para Agricultura, Ambiente e Desenvolvimento, Portugal), Maurizio Gallo (Associazione Nazionale Allevatori Suini, ANAS, Italy), Danijel Karolyi (Department of Animal Science, Faculty of Agriculture, University of Zagreb, Croatia), Goran Kušec (Faculty of Agrobiotechnical Sciences, University of Osijek, Croatia), Marie-José Mercat (IFIP Institut du Porc, France), Raquel Quintanilla (Programa de Genética y Mejora Animal, IRTA, Spain), Čedomir Radović (Department of Pig Breeding and Genetics, Institute for Animal Husbandry, Serbia), Violeta Razmaite (Animal Science Institute, Lithuanian University of Health Sciences, Lithuania), Juliette Riquet (GenPhySE, Université de Toulouse, INRA, France), Radomir Savić (Faculty of Agriculture, University of Belgrade, Serbia), Graziano Usai (AGRIS SARDEGNA, Italy) and Christoph Zimmer (Bäuerliche Erzeugergemeinschaft Schwäbisch Hall, Germany). The support of the Slovenian Research Agency for MČP is acknowledged (grants P4-0133 and J4-3094).
Peer reviewed
Dates: 2024-05-13T08:27:22Z; 2024-05-13T08:27:22Z; 2024-01-08
Type: article, http://purl.org/coar/resource_type/c_6501
Citation: Animal Genetics 55(2): 193-205 (2024)
Identifiers: 0268-9146; http://hdl.handle.net/10261/356715; 10.1111/age.13396; 1365-2052; http://dx.doi.org/10.13039/501100005969; http://dx.doi.org/10.13039/501100000780; 38191264; 2-s2.0-85181719006; https://api.elsevier.com/content/abstract/scopus_id/85181719006
Language: en
Grant: info:eu-repo/grantAgreement/EC/H2020/634476
Department: Departamento de Mejora Genética Animal
Version: Publisher's version, https://doi.org/10.1111/age.13396
Yes; open; application/pdf
Publisher: John Wiley & Sons
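Illustrative note: the abstract describes ranking pre-selected SNPs by their mean decrease accuracy. The sketch below shows only the general idea behind that index, namely permuting one marker at a time and measuring how much classification accuracy drops. It is not the authors' pipeline: the Boruta pre-selection step is omitted, a simple nearest-centroid classifier stands in for the random forest, and the genotypes, breed labels, and function names are all hypothetical.

```python
import random

def fit_centroids(X, y):
    """Return {breed: list of mean genotype per SNP} (nearest-centroid stand-in)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, g in enumerate(row):
            acc[j] += g
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign the breed whose centroid is closest in squared Euclidean distance."""
    def dist(lab):
        return sum((g - m) ** 2 for g, m in zip(row, centroids[lab]))
    return min(sorted(centroids), key=dist)

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)

def mean_decrease_accuracy(centroids, X, y, n_repeats=20, seed=0):
    """Average accuracy drop after shuffling each SNP column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    scores = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        scores.append(sum(drops) / n_repeats)
    return scores

# Hypothetical toy genotypes (0/1/2 allele counts): SNP 0 separates the two
# breeds, while SNPs 1 and 2 have the same mean in both breeds.
X = [
    [0, 1, 2], [0, 2, 0], [0, 0, 1], [0, 1, 1],   # breed "A"
    [2, 1, 0], [2, 0, 2], [2, 2, 1], [2, 1, 1],   # breed "B"
]
y = ["A"] * 4 + ["B"] * 4

centroids = fit_centroids(X, y)
scores = mean_decrease_accuracy(centroids, X, y)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print("SNPs ranked by mean decrease accuracy:", ranked)
```

In this toy setting only SNP 0 carries breed information, so shuffling it is the only permutation that reduces accuracy. With a real random forest the importance would normally be computed on out-of-bag samples rather than on the training data.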