An integrative algorithmic approach towards knowledge discovery by bioinformatics

In this thesis we describe approaches that help exploit the exponentially growing amount of information available in the life sciences. We address two issues in molecular biology: one in sequence analysis and one in text mining. The first concerns how to determine remote sequence homology, especially when sequence similarity is very low. For this purpose we introduce a visualisation tool that combines sequence alignment, domain prediction and phylogeny. The second, on text mining, centres on how to formulate unambiguous queries for efficient information retrieval. It tackles the problem of gene nomenclature, in which one in two gene symbols is ambiguous, by introducing a new disambiguation methodology based on text clustering and taxonomy.

Bibliographic Details
Main Author: Alako Tadontsop, F.B.
Other Authors: Leunissen, Jack
Format: Doctoral thesis
Language: English
Subjects: algorithms, bioinformatics, classification, computer analysis, data mining, gene expression analysis, genomics, microarrays, molecular biology, nomenclature, nucleotide sequences, ontologies, phylogenetics, phylogeny
Online Access: https://research.wur.nl/en/publications/an-integrative-algorithmic-approach-towards-knowledge-discovery-b