Improved biomarker discovery through a plot twist in transcriptomic data analysis

26 pages, 9 figures, 3 tables, supplementary information https://doi.org/10.1186/s12915-022-01398-w.-- Availability of data and materials: All data generated or analyzed during this study are included in this published article, its supplementary information files and publicly available repositories. The datasets analyzed are available at the GEO database with the accession numbers GSE115841 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115841), GSE117590 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi), and GSE116278 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE116278)

Bibliographic Details
Main Authors: Sánchez Baizán, Núria; Piferrer, Francesc
Other Authors: Ministerio de Ciencia, Innovación y Universidades (España)
Format: article (library)
Language: English
Published: BioMed Central 2022-09
Subjects: Gene expression analysis, Gene networks, Weighted gene co-expression network analysis (WGCNA), Sex determination and differentiation, Gonadal development, Biomarker discovery
Online Access: http://hdl.handle.net/10261/280639
http://dx.doi.org/10.13039/501100011033
id dig-icm-es-10261-280639
record_format koha
spelling dig-icm-es-10261-280639 2022-11-04T17:48:31Z Ministerio de Ciencia, Innovación y Universidades (España); Agencia Estatal de Investigación (España)

Background: Transcriptomic analysis is crucial for understanding the functional elements of the genome, and the classic method consists of screening transcriptomic datasets for differentially expressed genes (DEGs). Since 2005, weighted gene co-expression network analysis (WGCNA) has also emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria and then applying WGCNA (DEGs + WGCNA), has become common. This is of concern because such an approach can alter the underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist in transcriptome data analysis: applying WGCNA first, to exploit entire datasets without affecting the topology of the network, followed by the strength and relative simplicity of DEG analysis (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA on publicly available transcriptomic data from one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validated the general applicability of our approach through analysis of datasets from three distinct model systems: European sea bass, mouse, and human.

Results: In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, the network model fit, node connectivity measures, and other network statistics improved. The gene lists filtered by each method were different, the number of modules associated with the trait of interest and the number of key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery.

Conclusions: We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of the biological system, species, or question being considered.

NS was supported by a Spanish Ministry of Science (SMS) predoctoral scholarship (BES-2017-079744). Research was supported by SMS grant no. PID2019-108888RB-I00 to FP, with additional funding from the Spanish government through the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S).

Peer reviewed 2022-10-06T12:04:22Z 2022-09 article BMC Biology 20: 208 (2022) CEX2019-000928-S http://hdl.handle.net/10261/280639 10.1186/s12915-022-01398-w 1741-7007 http://dx.doi.org/10.13039/501100011033 en info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-108888RB-I00/ES/CARACTERIZACION DE EPIALELOS PUROS Y SU APLICACION COMO INDICADORES CLAVE DEL RENDIMIENTO EN PISCICULTURA/ Publisher's version https://doi.org/10.1186/s12915-022-01398-w Yes open BioMed Central
institution ICM ES
collection DSpace
country Spain
countrycode ES
component Bibliographic
access Online
databasecode dig-icm-es
tag library
region Southern Europe
libraryname Biblioteca del ICM España
language English
topic Gene expression analysis
Gene networks
Weighted gene co-expression network analysis (WGCNA)
Sex determination and differentiation
Gonadal development
Biomarker discovery
spellingShingle Gene expression analysis
Gene networks
Weighted gene co-expression network analysis (WGCNA)
Sex determination and differentiation
Gonadal development
Biomarker discovery
Sánchez Baizán, Núria
Piferrer, Francesc
Improved biomarker discovery through a plot twist in transcriptomic data analysis
author2 Ministerio de Ciencia, Innovación y Universidades (España)
author_facet Ministerio de Ciencia, Innovación y Universidades (España)
Sánchez Baizán, Núria
Piferrer, Francesc
format article
topic_facet Gene expression analysis
Gene networks
Weighted gene co-expression network analysis (WGCNA)
Sex determination and differentiation
Gonadal development
Biomarker discovery
author Sánchez Baizán, Núria
Piferrer, Francesc
author_sort Sánchez Baizán, Núria
title Improved biomarker discovery through a plot twist in transcriptomic data analysis
title_sort improved biomarker discovery through a plot twist in transcriptomic data analysis
publisher BioMed Central
publishDate 2022-09
url http://hdl.handle.net/10261/280639
http://dx.doi.org/10.13039/501100011033