Towards the French biomedical ontology enrichment

Big Data in the biomedical domain raises a major issue: the analysis of large volumes of heterogeneous data (e.g. video, audio, text, image). Ontologies, i.e. conceptual models of reality, can play a crucial role in biomedical fields for automating data processing, querying, and matching heterogeneous data. Various English resources exist, but considerably fewer are available in French, and there is a substantial lack of related tools and services to exploit them. Ontologies were initially built manually, and a few semi-automatic methodologies have been proposed in recent years. Semi-automatic construction/enrichment of ontologies is mostly achieved using natural language processing (NLP) techniques to analyze texts. NLP methods have to take the lexical and semantic complexity of biomedical data into account: (1) lexical complexity refers to the complex phrases that must be handled, and (2) semantic complexity refers to sense and context induction of the terminology. In this thesis, we address the above-mentioned challenges by proposing methodologies for the construction/enrichment of biomedical ontologies, based on two main contributions. The first contribution concerns the automatic extraction of specialized biomedical terms (lexical complexity) from corpora. New ranking measures for single- and multi-word term extraction methods are proposed and evaluated. In addition, we present BioTex, a web and desktop application that implements the proposed measures. The second contribution concerns concept extraction and semantic linkage of the extracted terminology (semantic complexity). This work seeks to induce semantic concepts of new candidate terms, and to find semantic links, i.e. relevant locations of new candidate terms, in an existing biomedical ontology. We propose a methodology that extracts new terms in the MeSH ontology. Quantitative and qualitative assessments conducted by experts and non-experts on real data highlight the relevance of the contributions.

Bibliographic Details
Main Author: Lossio-Ventura, Juan Antonio
Format: thesis biblioteca
Language: eng
Published: Université de Montpellier
Subjects: C30 - Documentation and information, 000 - Other themes, U30 - Research methods, vocabulary, terminology, software, automation, domain ontology, http://aims.fao.org/aos/agrovoc/c_49855, http://aims.fao.org/aos/agrovoc/c_24907, http://aims.fao.org/aos/agrovoc/c_24008, http://aims.fao.org/aos/agrovoc/c_15855, http://aims.fao.org/aos/agrovoc/c_49849
Online Access: http://agritrop.cirad.fr/582828/
http://agritrop.cirad.fr/582828/7/Thesis-Juan-Antonio-Lossio-Ventura.pdf