Algorithms for metabolic pathway discovery and analysis in the human microbiome

The human microbiome is a rich source of natural products and other small molecules that can profoundly affect human homeostasis. In the last decade, several studies have identified and characterized metabolites present at relevant concentrations which, given their role in microbe-microbe and microbe-host interactions, can be used as biological markers. The molecules known to date can be classified into two major groups based on their function: those involved in energy metabolism or primary metabolism, and those that derive from bacterial secondary metabolism and are used for protection against biotic and abiotic stresses. Examples of specialized primary metabolites are indole and trimethylamine (TMA). Indole, which is derived from tryptophan and is the precursor of indoxyl sulfate, has been associated with a decrease in bacterial pathogenicity when found at high concentrations. TMA, in contrast, is synthesized from carnitine or choline and is a marker for cardiovascular diseases. Often, the genes responsible for the synthesis of these molecules are clustered together in the genome, in loci known as metabolic gene clusters (MGCs) or, more specifically, biosynthetic gene clusters (BGCs) when they are responsible for secondary metabolite biosynthesis.
Given the evidence that the production of specialized primary metabolites is encoded in metabolic gene clusters and that these metabolites play an important role in microbe-microbe and microbe-host interactions, the aim of this thesis is to provide new tools to functionally profile the human microbiome. This involves designing methods not only to predict such MGCs, but also to assess their taxonomic distribution and architectural diversity, and to correlate their co-abundance and co-expression patterns in samples with specific phenotypes, in order to better understand the roles they play in these complex ecosystems.
Ultimately, the objective is to apply all these tools to real datasets from public repositories, in order to make significant leaps in our understanding of the molecular mechanisms behind microbially derived phenotypic traits.
The thesis has five chapters that attempt to fill knowledge gaps in different research areas, with the analysis and prediction of gene clusters as the central point.
In Chapter 2, the antiSMASH database version 2 is introduced. The database was given an updated infrastructure that stores compressed information on BGCs from a diverse collection of bacterial genomes, selected by applying average nucleotide identity to a large set of publicly available genomes to remove redundant ones. This database is a comprehensive resource that enables cross-genome searches. In Chapter 3, we highlight the usefulness of phylogenomics to uncover putative gene clusters, reinforcing the accumulating evidence that large numbers of MGCs exist that encode yet-unknown metabolic pathways. Hence, this chapter shows the potential of this approach to elucidate novel pathways and thereby help functionally annotate novel genes. Based on this principle, and addressing the need within the field to metabolically profile the gut microbiome, we built gutSMASH (Chapter 4). This new tool systematically profiles not only specialized primary MGCs and bioenergetics-related gene clusters from anaerobes but also putative MGCs, which are good candidates of unknown function for further study. Moreover, we designed the gutSMASH web server, a user-friendly platform that allows any researcher without a bioinformatics background to run their analysis (Chapter 5). Finally, since analysing single "omic" layers (e.g. genomic functional potential) provides limited ability to fully establish causation between the microbiome and host phenotypes, we designed BiG-MAP (Chapter 6). This tool represents a step forward in combining different omics data to obtain more biological insights at different molecular levels. More specifically, BiG-MAP profiles the abundance and expression patterns of gene clusters, which can ultimately help identify the gene clusters most likely involved in conferring phenotypes of interest and prioritize them for further experimental characterization.

Bibliographic Details
Main Author: Pascal Andreu, Victòria
Other Authors: de Ridder, D.
Format: Doctoral thesis
Language: English
Published: Wageningen University
Subjects: Life Science
Online Access: https://research.wur.nl/en/publications/algorithms-for-metabolic-pathway-discovery-and-analysis-in-the-hu