Low Cost Construction of a Multilingual Lexicon from Bilingual Lists
Manually constructing multilingual translation lexicons can be very costly, both in terms of time and human effort. Although there have been many efforts at (semi-)automatically merging bilingual machine readable dictionaries to produce a multilingual lexicon, most of these approaches place quite specific requirements on the input bilingual resources. Unfortunately, not all bilingual dictionaries fulfil these criteria, especially in the case of under-resourced language pairs. We describe a low cost method for constructing a multilingual lexicon using only simple lists of bilingual translation mappings. The method is especially suitable for under-resourced language pairs, as such bilingual resources are often freely available and easily obtainable from the Internet, or digitised from simple, conventional paper-based dictionaries. The precision of random samples of the resultant multilingual lexicon is around 0.70-0.82, while coverage for each language, precision and recall can be controlled by varying threshold values. Given the very simple input resources, our results are encouraging, especially in incorporating under-resourced languages into multilingual lexical resources.
Main Authors: | , , |
---|---|
Format: | Digital journal |
Language: | English |
Published: | Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, 2011 |
Online Access: | http://www.scielo.org.mx/scielo.php?script=sci_arttext&pid=S1870-90442011000100006 |