Combining C-value and keyword extraction methods for biomedical terms extraction

The objective of this work is to extract and rank biomedical terms from free text. We present new extraction methods that use linguistic patterns specialized for the biomedical field, together with term extraction measures such as C-value and keyword extraction measures such as Okapi BM25 and TFIDF. We propose several combinations of these measures to improve the extraction and ranking process. Our experiments show that an appropriate harmonic mean of C-value and a keyword extraction measure offers better precision than these measures used alone, for the extraction of both single-word and multi-word terms. We illustrate our results on the extraction of English and French biomedical terms from a corpus of laboratory tests. The results are validated using UMLS (for English) and MeSH only (for French) as reference dictionaries.
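
As a rough illustration of the kind of combination the abstract describes, the sketch below ranks candidate terms by the harmonic mean of two per-term scores. It is not the authors' implementation: the abstract does not give the exact formula, so the min-max normalization, the function names (`min_max_normalize`, `harmonic_mean`, `combine_and_rank`), and the toy scores are all assumptions made for this example.

```python
def min_max_normalize(scores):
    """Rescale a {term: score} mapping to [0, 1] (assumed preprocessing step)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: no ranking information, map everything to 0.
        return {term: 0.0 for term in scores}
    return {term: (s - lo) / (hi - lo) for term, s in scores.items()}


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative numbers; 0 if both are 0."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


def combine_and_rank(c_value, keyword_score):
    """Rank terms by the harmonic mean of normalized C-value and keyword score.

    Both arguments are {term: score} dicts. Only terms present in both
    are ranked. `keyword_score` stands in for a measure such as TFIDF or
    Okapi BM25 computed elsewhere.
    """
    c_norm = min_max_normalize(c_value)
    k_norm = min_max_normalize(keyword_score)
    shared = c_norm.keys() & k_norm.keys()
    combined = {t: harmonic_mean(c_norm[t], k_norm[t]) for t in shared}
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Toy scores invented for illustration only; they are not results from the paper.
toy_c_value = {"blood glucose": 6.0, "glucose": 3.0, "test result": 4.0, "sample": 2.0}
toy_keyword = {"blood glucose": 0.8, "glucose": 0.9, "test result": 0.2, "sample": 0.1}

for term, score in combine_and_rank(toy_c_value, toy_keyword):
    print(f"{term}: {score:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a term ranks highly only when both measures agree that it is important, which is one plausible reason such a combination could improve precision over either measure alone.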

Bibliographic Details
Main Authors: Lossio Ventura, Juan Antonio, Jonquet, Clément, Roche, Mathieu, Teisseire, Maguelonne
Format: conference_item biblioteca
Language: eng
Published: s.n.
Subjects: C30 - Documentation and information, 000 - Other themes
Online Access: http://agritrop.cirad.fr/572345/
http://agritrop.cirad.fr/572345/1/document_572345.pdf
Citation: Lossio Ventura Juan Antonio, Jonquet Clément, Roche Mathieu, Teisseire Maguelonne. 2013. Combining C-value and keyword extraction methods for biomedical terms extraction. In: 5th International Symposium on Languages in Biology and Medicine (LBM 2013), 12th and 13th December, 2013, Tokyo, Japan. s.l.: s.n., 5 p.
Record updated: 2022-03-30T15:05:06Z
Version: published version (info:eu-repo/semantics/publishedVersion), application/pdf
Access: restricted (info:eu-repo/semantics/restrictedAccess), Cirad license, https://agritrop.cirad.fr/mention_legale.html
institution CIRAD FR
collection DSpace
country France
countrycode FR
component Bibliographic
access Online
databasecode dig-cirad-fr
tag biblioteca
region Western Europe
libraryname CIRAD Library, France