Traducción Automática Neuronal no Supervisada, un nuevo paradigma basado solo en textos monolingües [Unsupervised Neural Machine Translation: a new paradigm based solely on monolingual texts]

  1. Labaka Intxauspe, Gorka
  2. Agirre Bengoa, Eneko
  3. Artetxe, Mikel
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2019

Issue: 63

Pages: 151-154

Type: Article


Abstract

This article presents UnsupNMT, a 3-year project whose first year has already been completed. UnsupNMT proposes a radically different approach to machine translation: unsupervised translation, that is, translation based on monolingual data alone, with no need for bilingual resources. The method builds on deep learning over temporal sequences and uses state-of-the-art interlingual word representations in the form of cross-lingual word embeddings. Beyond being a highly innovative proposal, the project also opens a new paradigm in machine translation that branches out to other disciplines, such as transfer learning. Despite the current limitations of unsupervised machine translation, the techniques developed are expected to have great repercussions in areas where machine translation achieves worse results, such as translation between languages which have little contact, e.g. German and Russian.
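The cross-lingual word embeddings mentioned above can be illustrated with a toy orthogonal Procrustes alignment: given word vectors for two languages and a (possibly tiny or self-induced) seed dictionary, a single orthogonal matrix maps one embedding space onto the other. This is a minimal, noise-free sketch in the spirit of the mapping methods cited below (Artetxe et al., 2017; 2018b); all data and variable names here are illustrative, not the project's actual implementation.

```python
import numpy as np

# Toy setup (hypothetical data): 5 "words" with 4-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                        # source-language vectors
W_true = np.linalg.qr(rng.normal(size=(4, 4)))[0]  # hidden orthogonal rotation
Z = X @ W_true                                     # target-language vectors

# Orthogonal Procrustes: find orthogonal W minimizing ||XW - Z||_F,
# assuming rows of X and Z are aligned by a seed dictionary.
U, _, Vt = np.linalg.svd(X.T @ Z)
W = U @ Vt

# The recovered mapping places the source vectors onto the target space.
print(np.allclose(X @ W, Z))  # → True
```

In the fully unsupervised setting, no seed dictionary is given: the cited self-learning methods instead alternate between inducing a dictionary from the current mapping and re-solving this Procrustes step until convergence.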

Funding information

UnsupNMT is a project funded by the Spanish Ministry of Economy, Industry and Competitiveness (TIN2017-91692-EXP).

Grants

    • TIN2017-91692-EXP

Bibliographic References

  • Artetxe, M., G. Labaka, and E. Agirre. 2017. Learning bilingual word embeddings with (almost) no bilingual data. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 451-462, Vancouver, Canada, July. Association for Computational Linguistics.
  • Artetxe, M., G. Labaka, and E. Agirre. 2018a. Generalizing and improving bilingual word embedding mappings with a multi-step framework of linear transformations. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pages 5012-5019.
  • Artetxe, M., G. Labaka, and E. Agirre. 2018b. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 789-798. Association for Computational Linguistics.
  • Artetxe, M., G. Labaka, and E. Agirre. 2018c. Unsupervised statistical machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3632-3642, Brussels, Belgium, October- November. Association for Computational Linguistics.
  • Artetxe, M., G. Labaka, and E. Agirre. 2019. An effective approach to unsupervised machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics.
  • Artetxe, M., G. Labaka, E. Agirre, and K. Cho. 2018. Unsupervised neural machine translation. In Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), April.
  • Bahdanau, D., K. Cho, and Y. Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, September.
  • Chen, Y., Y. Liu, Y. Cheng, and V. O. Li. 2017. A teacher-student framework for zero-resource neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1925-1935. Association for Computational Linguistics.
  • Chu, C., R. Dabre, and S. Kurohashi. 2017. An empirical comparison of domain adaptation methods for neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 385-391. Association for Computational Linguistics.
  • He, D., Y. Xia, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. 2016. Dual learning for machine translation. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29. Curran Associates, Inc., pages 820-828.
  • Koehn, P. and R. Knowles. 2017. Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, pages 28-39. Association for Computational Linguistics.
  • Lample, G., A. Conneau, L. Denoyer, and M. Ranzato. 2018a. Unsupervised machine translation using monolingual corpora only. In Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), April.
  • Lample, G., M. Ott, A. Conneau, L. Denoyer, and M. Ranzato. 2018b. Phrase-based & neural unsupervised machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 5039-5049, Brussels, Belgium, October-November. Association for Computational Linguistics.
  • Sennrich, R., B. Haddow, and A. Birch. 2016. Improving neural machine translation models with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 86-96, Berlin, Germany, August. Association for Computational Linguistics.