Reading notes on the August 2018 blog post:
Unsupervised machine translation: A novel approach to provide fast, accurate translations for more languages
FB AI research
Abstract
The proposed method consists of two steps: word-by-word initialization and sentence translation (language modeling + back translation). The word-by-word translation comes from the authors' earlier paper; the sentence-translation step works as follows:
- MT system – Version 1:
- word-by-word initialization: Urdu => English
- LM for English to reorder the words
- => better than the word-by-word translation
- MT system – Version 2:
- back translation: English => Urdu
- Treat translations obtained from MT V1 as ground truth data to train the MT V2
- => can be used to translate many sentences in English to Urdu, forming another data set.
- => helps improve the Urdu-to-English MT system (MT Version 1).
Repeat for many iterations, alternating between the two directions (a schematic sketch of this loop follows the list).
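To make the loop above concrete, here is a minimal schematic in Python. It is only a sketch of the iteration described in the post: `word_by_word_init`, `train_mt`, and the toy memorizing translator are hypothetical placeholders, not the actual FAIR implementation.

```python
"""Schematic of the iterative MT V1 / MT V2 back-translation loop.
All components here are hypothetical stand-ins for the real models."""
from typing import Callable, Dict, List, Tuple

Parallel = List[Tuple[str, str]]                 # (source sentence, target sentence)
Translator = Callable[[List[str]], List[str]]

def word_by_word_init(urdu_sentences: List[str],
                      ur2en_dict: Dict[str, str]) -> List[str]:
    """Seed translation: replace each Urdu word with its dictionary entry."""
    return [" ".join(ur2en_dict.get(w, w) for w in s.split())
            for s in urdu_sentences]

def train_mt(parallel: Parallel) -> Translator:
    """Placeholder for training an MT model (NMT plus LM in the real system).
    Here it just memorizes the synthetic pairs so the loop runs end to end."""
    table = dict(parallel)
    return lambda sents: [table.get(s, s) for s in sents]

def unsupervised_mt(urdu_mono: List[str], english_mono: List[str],
                    ur2en_dict: Dict[str, str], iterations: int = 3):
    # MT Version 1 (Urdu -> English), seeded by word-by-word initialization.
    synthetic_en = word_by_word_init(urdu_mono, ur2en_dict)
    mt_u2e = train_mt(list(zip(urdu_mono, synthetic_en)))

    for _ in range(iterations):
        # MT Version 2 (English -> Urdu): V1's outputs serve as pseudo ground truth.
        en_from_v1 = mt_u2e(urdu_mono)
        mt_e2u = train_mt(list(zip(en_from_v1, urdu_mono)))

        # V2 translates English monolingual text into Urdu, forming new
        # synthetic pairs that improve V1 in the next round.
        ur_from_v2 = mt_e2u(english_mono)
        mt_u2e = train_mt(list(zip(ur_from_v2, english_mono)))

    return mt_u2e, mt_e2u
```

The only part this sketch is meant to show is the shape of the loop: each direction is retrained on synthetic parallel data produced by the other direction.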
Motivation
- Automatic language translation is important to FB as a way for users to communicate
- Current MT systems require a considerable volume of training data
- => MT works well only for a small subset of languages
- => unsupervised translation is necessary (hence a lot of prior work on this topic)
Result of This Paper
- Equivalent to supervised approaches trained with nearly 100,000 reference translations.
- Improves by more than 10 BLEU points compared with previous accomplishments (previous state-of-the-art unsupervised methods)
- For a low-resource language like Urdu, we can translate between it and English using text in English and completely unrelated text in Urdu (fully unsupervised)
- faster & more accurate
Method
- word-by-word initialization
- language modeling
- back translation
1. Word-by-word Translation
- Target
- learn a bilingual dictionary (a word vs. its plausible translations) (implementation: the authors' earlier paper)
- STEP 1
- learn word embeddings (WE)
- trained to predict the words around a given word using context
- Result
- WE capture interesting semantic structure (the original post gives an intuitive example)
- WE in different languages share a similar neighborhood structure (the original post gives an intuitive example)
- STEP 2
- learn a rotation of the WE in one language to match the WE in the other language (using adversarial training and self-learning / the Procrustes problem; see the sketch below)
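As an illustration of STEP 2, here is a minimal sketch of the closed-form Procrustes step, assuming a small seed dictionary is already available. The real system combines adversarial training with Procrustes refinement; the toy 2-d embeddings, variable names, and the word pairs below are made up for illustration.

```python
"""Toy Procrustes alignment of two word-embedding spaces (numpy only)."""
import numpy as np

def procrustes_rotation(src_vecs: np.ndarray, tgt_vecs: np.ndarray) -> np.ndarray:
    """Orthogonal W minimizing ||src_vecs @ W - tgt_vecs||_F (rows are word vectors)."""
    u, _, vt = np.linalg.svd(src_vecs.T @ tgt_vecs)
    return u @ vt

def translate_word(word, src_emb, tgt_emb, W, k=1):
    """Nearest-neighbour lookup of `word` after rotating it into the target space."""
    query = src_emb[word] @ W
    tgt_words = list(tgt_emb)
    mat = np.stack([tgt_emb[w] for w in tgt_words])
    sims = (mat @ query) / (np.linalg.norm(mat, axis=1) * np.linalg.norm(query) + 1e-9)
    return [tgt_words[i] for i in np.argsort(-sims)[:k]]

# Toy example: the second space is a 90-degree rotation of the first.
en = {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])}
ur = {"billi": np.array([0.0, 1.0]), "kutta": np.array([-1.0, 0.0])}
seed = [("cat", "billi"), ("dog", "kutta")]          # small seed dictionary

X = np.stack([en[s] for s, _ in seed])
Y = np.stack([ur[t] for _, t in seed])
W = procrustes_rotation(X, Y)
print(translate_word("cat", en, ur, W))              # -> ['billi']
```

The SVD-based rotation keeps the source embedding space rigid, which is why the shared neighborhood structure mentioned in STEP 1 is what makes the alignment possible.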
2. Translating Sentences
- Motivation
- words obtained from word-by-word translation may be missing, out of order, or plain wrong
- however, most of the meaning is preserved
- can be improved by making local edits using a language model, trained on lots of monolingual data to score sequences of words (a toy scoring sketch follows this section)
- train a language model in Urdu alongside the English language model
- MT system (Version 1): the language model + word-by-word initialization
- better than word-by-word translation
- Urdu -> English
- MT system (Version 2): back translation
- using data obtained from Version 1 as ground-truth data
- English -> Urdu
- first time back translation is used in a fully unsupervised system (it was initially applied with supervised data); first introduced at ACL 2015
- more English-Urdu data helps improve MT system Version 1.
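To illustrate how a language model can rank the output of word-by-word translation, here is a toy sketch. The real system trains neural language models on large monolingual corpora; this uses a tiny add-one-smoothed bigram model, and the corpus and candidate sentences are invented, purely to show the scoring idea.

```python
"""Toy bigram LM that prefers a fluent reordering over raw word-by-word output."""
import math
from collections import Counter

def train_bigram_lm(sentences):
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    vocab = len(unigrams)

    def logprob(sentence):
        toks = ["<s>"] + sentence.split() + ["</s>"]
        # add-one smoothing so unseen bigrams do not give -inf
        return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
                   for a, b in zip(toks, toks[1:]))
    return logprob

english_mono = [
    "the cat sat on the mat",
    "the dog sat on the floor",
    "a cat is on the mat",
]
score = train_bigram_lm(english_mono)

# Two candidate "translations": raw word-by-word output (wrong order)
# versus a locally edited reordering. The LM prefers the fluent one.
candidates = ["mat the on sat cat the", "the cat sat on the mat"]
print(max(candidates, key=score))   # -> "the cat sat on the mat"
```

In the full system this scoring signal is what lets the model make the "local edits" mentioned above, and the same idea is applied symmetrically with an Urdu language model.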