An integrative algorithmic approach towards knowledge discovery by bioinformatics
In this thesis we describe different approaches that aid in using the exponentially growing amount of information available in the life sciences. Briefly, we address two issues in molecular biology: one in sequence analysis and one in text mining. The former concerns how to determine remote sequence homology, especially when sequence similarity is very low. For this, a visualisation tool is introduced that combines sequence alignment, domain prediction and phylogeny. The second topic, text mining, centres on the question of how to formulate unambiguous queries for efficient information retrieval. It tackles the problem of gene nomenclature, in which one in two gene symbols is ambiguous, by introducing a new disambiguation methodology based on text clustering and taxonomy.
Main Author: | |
---|---|
Other Authors: | |
Format: | Doctoral thesis |
Language: | English |
Subjects: | algorithms, bioinformatics, classification, computer analysis, data mining, gene expression analysis, genomics, microarrays, molecular biology, nomenclature, nucleotide sequences, ontologies, phylogenetics, phylogeny |
Online Access: | https://research.wur.nl/en/publications/an-integrative-algorithmic-approach-towards-knowledge-discovery-b |