Hedapen semantikoa informazioaren berreskurapenean [Semantic expansion in information retrieval]

  1. Arantxa Otegi
  2. Eneko Agirre
  3. Xabier Arregi
Journal:
Ekaia: Euskal Herriko Unibertsitateko zientzi eta teknologi aldizkaria

ISSN: 0214-9001

Publication date: 2014

Issue: 27

Pages: 263-279

Type: Article

Abstract

Information retrieval (IR) aims at finding documents that satisfy the information need of a user. To that end, an IR system points the user to relevant documents, that is, those documents that contain the information they need as formulated in the query. One of the main problems is the so-called vocabulary mismatch problem between query and documents: some documents might be relevant to the query even if the specific terms used differ substantially, and some documents might not be relevant to the query even if they have some terms in common. The former arises because several words or phrases can be used to express the same idea or item (synonymy). The latter is caused by ambiguity, where one word can have more than one interpretation depending on the context. In this work, we expand queries and documents using two NLP techniques, word sense disambiguation and semantic relatedness. Our extensive experiments on three datasets show that the expansion methods explored in this dissertation help overcome the mismatch problem, consequently improving the effectiveness of an IR system.