Towards the French biomedical ontology enrichment

Big Data in the biomedical domain raises a major issue: the analysis of large volumes of heterogeneous data (e.g. video, audio, text, image). Ontologies, i.e. conceptual models of reality, can play a crucial role in biomedical fields for automating data processing, querying, and matching heterogeneous data. Various English resources exist, but considerably fewer are available in French, and there is a substantial lack of related tools and services to exploit them. Ontologies were initially built manually. A few semi-automatic methodologies have been proposed in recent years. Semi-automatic construction/enrichment of ontologies is mostly achieved using natural language processing (NLP) techniques to analyze texts. NLP methods have to take the lexical and semantic complexity of biomedical data into account: (1) lexical complexity refers to the complex phrases that must be handled, and (2) semantic complexity refers to sense and context induction of the terminology. In this thesis, we address the above-mentioned challenges by proposing methodologies for the construction/enrichment of biomedical ontologies based on two main contributions. The first contribution concerns the automatic extraction of specialized biomedical terms (lexical complexity) from corpora. New ranking measures for single- and multi-word term extraction methods are proposed and evaluated. In addition, we present BioTex, a web and desktop application that implements the proposed measures. The second contribution concerns concept extraction and semantic linkage of the extracted terminology (semantic complexity). This work seeks to induce semantic concepts of new candidate terms, and to find semantic links, i.e. relevant locations of new candidate terms, in an existing biomedical ontology. We propose a methodology that extracts new terms in the MeSH ontology. Quantitative and qualitative assessments conducted by experts and non-experts on real data highlight the relevance of the contributions.

Bibliographic Details
Main Author: Lossio-Ventura, Juan Antonio
Format: Thesis
Language: English
Published: Université de Montpellier
Subjects: C30 - Documentation and information, 000 - Other themes, U30 - Research methods, vocabulary, terminology, software, automation, domain ontology, http://aims.fao.org/aos/agrovoc/c_49855, http://aims.fao.org/aos/agrovoc/c_24907, http://aims.fao.org/aos/agrovoc/c_24008, http://aims.fao.org/aos/agrovoc/c_15855, http://aims.fao.org/aos/agrovoc/c_49849
Online Access: http://agritrop.cirad.fr/582828/
http://agritrop.cirad.fr/582828/7/Thesis-Juan-Antonio-Lossio-Ventura.pdf
id dig-cirad-fr-582828
record_format koha
spelling dig-cirad-fr-582828 2024-01-28T23:56:16Z http://agritrop.cirad.fr/582828/
Towards the French biomedical ontology enrichment. Lossio-Ventura Juan Antonio. 2015. Montpellier : Université de Montpellier, 222 p. Thesis Ph. D. : Computer science : Université de Montpellier
info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/publishedVersion
Cirad license info:eu-repo/semantics/openAccess https://agritrop.cirad.fr/mention_legale.html
institution CIRAD FR
collection DSpace
country France
countrycode FR
component Bibliographic
access Online
databasecode dig-cirad-fr
tag biblioteca
region Western Europe
libraryname CIRAD Library, France
language eng