Reentrenamiento: aprendizaje semisupervisado de los sentidos de las palabras

Palomar Sanz, Manuel; Rigau Claramunt, Germán; Suárez Cueto, Armando

Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2005

Issue: 34

Pages: 49-66

Type: Article

Abstract

This paper presents re-training, a bootstrapping algorithm that automatically acquires semantically annotated data while maintaining high precision. The algorithm uses a corpus-based word sense disambiguation system that relies on maximum entropy probability models. Re-training iteratively feeds training-classification cycles with new, high-confidence examples. The process relies on several filters that safeguard disambiguation accuracy by discarding uncertain classifications. The method is inspired by co-training algorithms, but it makes stronger assumptions about when to assign a label to a linguistic context.
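
The training-classification cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`train_maxent`, `predict_proba`, `retrain`), the bag-of-words features, and the single probability threshold standing in for the paper's "several filters" are all assumptions made for illustration. The classifier is a tiny multinomial logistic model, which is one common formulation of a maximum entropy model.

```python
import math
from collections import defaultdict


def predict_proba(weights, feats):
    """Return a sense -> probability dict for one context (a set of features)."""
    scores = {c: sum(w[f] for f in feats if f in w) for c, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(labeled, epochs=200, lr=0.5):
    """Fit a small multinomial logistic (maximum entropy) model by gradient ascent."""
    classes = sorted({sense for _, sense in labeled})
    weights = {c: defaultdict(float) for c in classes}
    for _ in range(epochs):
        for feats, gold in labeled:
            probs = predict_proba(weights, feats)
            for c in classes:
                # Gradient of the log-likelihood: observed minus expected count.
                step = lr * ((1.0 if c == gold else 0.0) - probs[c])
                for f in feats:
                    weights[c][f] += step
    return weights


def retrain(seed, unlabeled, threshold=0.9, max_rounds=5):
    """Iterate train/classify cycles, keeping only high-confidence labels.

    `threshold` is a hypothetical filter: a classification is accepted only
    when the winning sense has at least this probability.
    """
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(max_rounds):
        model = train_maxent(labeled)
        accepted, rejected = [], []
        for feats in pool:
            probs = predict_proba(model, feats)
            sense = max(probs, key=probs.get)
            if probs[sense] >= threshold:
                accepted.append((feats, sense))
            else:
                rejected.append(feats)  # uncertain: leave unlabeled
        if not accepted:
            break  # nothing new to learn from, so stop early
        labeled.extend(accepted)
        pool = rejected
    return labeled, pool


# Toy seed data for the ambiguous word "bank" (invented for this sketch).
seed = [
    ({"money", "deposit"}, "finance"),
    ({"loan", "money"}, "finance"),
    ({"river", "water"}, "river"),
    ({"water", "shore"}, "river"),
]
unlabeled = [{"money", "loan", "interest"}, {"river", "shore"}, {"unknown"}]
labeled, still_unlabeled = retrain(seed, unlabeled)
```

In this toy run, contexts that share features with the seed examples are absorbed into the training set, while a context with no known features receives a uniform distribution over senses and is left unlabeled. A stricter threshold trades coverage for precision, which mirrors the emphasis the abstract places on discarding uncertain classifications.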

Data source: Dialnet