Towards the French biomedical ontology enrichment
In the biomedical domain, Big Data raises a major issue: the analysis of large volumes of heterogeneous data (e.g. video, audio, text, image). Ontologies, i.e. conceptual models of reality, can play a crucial role in biomedical fields by automating data processing, querying, and the matching of heterogeneous data. Various English resources exist, but considerably fewer are available in French, and there is a substantial lack of related tools and services to exploit them. Ontologies were initially built manually, and a few semi-automatic methodologies have been proposed in recent years. Semi-automatic construction and enrichment of ontologies is mostly achieved using natural language processing (NLP) techniques to analyze texts. NLP methods have to take the lexical and semantic complexity of biomedical data into account: (1) lexical complexity refers to the complex phrases that must be handled, and (2) semantic complexity refers to sense and context induction of the terminology. In this thesis, we address these challenges by proposing methodologies for the construction and enrichment of biomedical ontologies, based on two main contributions. The first contribution concerns the automatic extraction of specialized biomedical terms (lexical complexity) from corpora. New ranking measures for single- and multi-word term extraction methods are proposed and evaluated. In addition, we present BioTex, a web and desktop application that implements the proposed measures. The second contribution concerns concept extraction and semantic linkage of the extracted terminology (semantic complexity). This work seeks to induce the semantic concepts of new candidate terms and to find semantic links, i.e. relevant locations for new candidate terms, in an existing biomedical ontology. We propose a methodology that extracts new terms for the MeSH ontology. Quantitative and qualitative assessments conducted by experts and non-experts on real data highlight the relevance of the contributions.
Main Author: | Lossio-Ventura, Juan Antonio |
---|---|
Format: | thesis biblioteca |
Language: | eng |
Published: | Université de Montpellier, 2015 |
Subjects: | C30 - Documentation and information, 000 - Other subjects, U30 - Research methods, vocabulary, terminology, software, automation, domain ontology, http://aims.fao.org/aos/agrovoc/c_49855, http://aims.fao.org/aos/agrovoc/c_24907, http://aims.fao.org/aos/agrovoc/c_24008, http://aims.fao.org/aos/agrovoc/c_15855, http://aims.fao.org/aos/agrovoc/c_49849 |
Online Access: | http://agritrop.cirad.fr/582828/ http://agritrop.cirad.fr/582828/7/Thesis-Juan-Antonio-Lossio-Ventura.pdf |
id |
dig-cirad-fr-582828 |
---|---|
record_format |
koha |
spelling |
dig-cirad-fr-582828 2024-01-28T23:56:16Z http://agritrop.cirad.fr/582828/ Towards the French biomedical ontology enrichment. Lossio-Ventura Juan Antonio. 2015. Montpellier : Université de Montpellier, 222 p. Thesis Ph. D. : Computer science : Université de Montpellier eng thesis info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/publishedVersion http://agritrop.cirad.fr/582828/7/Thesis-Juan-Antonio-Lossio-Ventura.pdf text Cirad license info:eu-repo/semantics/openAccess https://agritrop.cirad.fr/mention_legale.html |
institution |
CIRAD FR |
collection |
DSpace |
country |
France |
countrycode |
FR |
component |
Bibliographic |
access |
Online |
databasecode |
dig-cirad-fr |
tag |
biblioteca |
region |
Western Europe |
libraryname |
Biblioteca del CIRAD Francia |
language |
eng |
format |
thesis |
author |
Lossio-Ventura, Juan Antonio |
title |
Towards the French biomedical ontology enrichment |
publisher |
Université de Montpellier |