Back-Translation


title:《Understanding Back-Translation at Scale》

Q:

How to improve neural machine translation with monolingual data?

S:

Augment the parallel training corpus with back-translations of target-language sentences.

C:

Synthetic data generated via sampling or noised beam outputs works better than data from plain beam or greedy search.

A new state of the art of 35 BLEU on the WMT'14 English-German test set.

Abstract

We find that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective. Our analysis shows that sampling or noisy synthetic data gives a much stronger training signal than data generated by beam or greedy search.


Introduction

MT has typically relied on statistics estimated from large parallel corpora (paired sentences in both the source and target languages). Bitext is limited, however, while monolingual data is abundant, so researchers have turned to exploiting monolingual data, e.g., via language model fusion, back-translation, and dual learning.

Our focus is back-translation (BT), in a semi-supervised setup where both bilingual data and monolingual data in the target language are available.

Back-translation first trains an intermediate target-to-source system on the parallel data and uses it to translate monolingual target-language sentences. The result is a synthetic parallel corpus whose source side is machine-translated output and whose target side is genuine human-written text. This synthetic data is then added to the real bitext to train the final source-to-target system.

It has proven useful for phrase-based translation, NMT, and unsupervised MT.

