Neural machine translation of Basque

  1. Etchegoyhen, Thierry
  2. Martínez García, Eva
  3. Azpeitia, Andoni
  4. Labaka, Gorka
  5. Alegría, Iñaki
  6. Cortés Etxabe, Itziar
  7. Jauregi Carrera, Amaia
  8. Ellakuria Santos, Igor
  9. Martín Roldán, Maite
  10. Calonge, Eusebi
Proceedings of the 21st Annual Conference of the European Association for Machine Translation: 28-30 May 2018, Universitat d'Alacant, Alacant, Spain
  1. Pérez-Ortiz, Juan Antonio (coord.)
  2. Sánchez-Martínez, Felipe (coord.)
  3. Esplà-Gomis, Miquel (coord.)
  4. Popović, Maja (coord.)
  5. Rico, Celia (coord.)
  6. Martins, André (coord.)
  7. Van den Bogaert, Joachim (coord.)
  8. Forcada, Mikel L. (coord.)

Publisher: European Association for Machine Translation

ISBN: 978-84-09-01901-4

Year of publication: 2018

Pages: 139-148

Type: Book chapter


We describe the first experimental results in neural machine translation for Basque. As a synthetic language featuring agglutinative morphology, an extended case system, complex verbal morphology and relatively free word order, Basque presents a large number of challenging characteristics for machine translation in general, and for data-driven approaches such as attention-based encoder-decoder models in particular. We present our results on a large range of experiments in Basque-Spanish translation, comparing several neural machine translation system variants with both rule-based and statistical machine translation systems. We demonstrate that significant gains can be obtained with a neural network approach for this challenging language pair, and describe optimal configurations in terms of word segmentation and decoding parameters, measured against test sets that feature multiple references to account for word order variability.
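
The abstract's point about multiple references can be illustrated with a small sketch. It computes a clipped n-gram precision, the building block of BLEU-style metrics, against one or several reference translations. The abstract does not say which metric the authors used, so this is only an assumed stand-in, and the function names and the two Basque sentences (both roughly "Jon has read the book", in different word orders) are illustrative rather than taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hypothesis, references, n=1):
    """Clipped n-gram precision of a hypothesis against several references.

    Each hypothesis n-gram is credited at most as many times as it occurs
    in the single reference where it is most frequent.
    """
    hyp = ngrams(hypothesis.split(), n)
    if not hyp:
        return 0.0
    # For every n-gram, keep its maximum count over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp.items())
    return matched / sum(hyp.values())


# Illustrative sentences (not from the paper): same meaning, two word orders.
ref_a = "Jonek liburua irakurri du"
ref_b = "liburua Jonek irakurri du"
hyp = "liburua Jonek irakurri du"

single = clipped_precision(hyp, [ref_a], n=2)        # only one bigram matches
multi = clipped_precision(hyp, [ref_a, ref_b], n=2)  # every bigram matches
print(f"single reference: {single:.2f}, two references: {multi:.2f}")
```

With a single reference, a valid reordering is penalised as if it were an error; adding a second reference in the alternative order removes that penalty, which is the motivation for multi-reference test sets in a language with relatively free word order.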