Low Cost Construction of a Multilingual Lexicon from Bilingual Lists

Manually constructing multilingual translation lexicons can be very costly, both in terms of time and human effort. Although there have been many efforts at (semi-)automatically merging bilingual machine readable dictionaries to produce a multilingual lexicon, most of these approaches place quite specific requirements on the input bilingual resources. Unfortunately, not all bilingual dictionaries fulfil these criteria, especially in the case of under-resourced language pairs. We describe a low cost method for constructing a multilingual lexicon using only simple lists of bilingual translation mappings. The method is especially suitable for under-resourced language pairs, as such bilingual resources are often freely available and easily obtainable from the Internet, or digitised from simple, conventional paper-based dictionaries. The precision of random samples of the resultant multilingual lexicon is around 0.70-0.82, while coverage for each language, precision and recall can be controlled by varying threshold values. Given the very simple input resources, our results are encouraging, especially in incorporating under-resourced languages into multilingual lexical resources.

Bibliographic Details
Main Authors: Lim, Lian Tze; Ranaivo-Malançon, Bali; Tang, Enya Kong
Format: Digital journal
Language: English
Published: Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, 2011
Online Access: http://www.scielo.org.mx/scielo.php?script=sci_arttext&pid=S1870-90442011000100006