Automatic biomedical term polysemy detection

Polysemy is the capacity of a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which identifies the different meanings of a term. Polysemy detection is also important for information extraction (IE) systems and for building and enriching terminologies and ontologies. In this paper, we present a novel approach to detect whether a biomedical term is polysemic, with the long-term goal of enriching biomedical ontologies. The approach is based on the extraction of new features, obtained in two ways: (i) directly from the text dataset, and (ii) from an induced graph. Our method obtains an accuracy and F-measure of 0.978.

Bibliographic Details
Main Authors: Lossio-Ventura, Juan Antonio; Jonquet, Clément; Roche, Mathieu; Teisseire, Maguelonne
Format: Conference item
Language: English
Published: ELRA
Subjects: C30 - Documentation and information; U10 - Computer science, mathematics and statistics; U30 - Research methods
Online Access: http://agritrop.cirad.fr/580890/
http://agritrop.cirad.fr/580890/1/LREC16_2_free.pdf