Regularized Context Gates on Transformer for Machine Translation
Xintong Li, Lemao Liu, Rui Wang, Guoping Huang, Max Meng
Machine Translation Short Paper
Session 14B: Jul 8
(18:00-19:00 GMT)
Session 15B: Jul 8
(21:00-22:00 GMT)
Abstract:
Context gates are effective in controlling the contributions from the source and target contexts in recurrent neural network (RNN) based neural machine translation (NMT). However, it is challenging to extend them to the advanced Transformer architecture, which is more complicated than an RNN. This paper first provides a method to identify the source and target contexts and then introduces a gate mechanism to control the source and target contributions in the Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method that guides the learning of the gates with supervision automatically generated using pointwise mutual information. Extensive experiments on 4 translation datasets demonstrate that the proposed model obtains an average gain of 1.0 BLEU score over a strong Transformer baseline.
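A minimal sketch of the gating idea described in the abstract, written in PyTorch. It assumes the source context `c_src` comes from encoder-decoder attention and the target context `c_tgt` from decoder self-attention; the class name `ContextGate`, the helper `gate_regularizer`, and the supervision tensor `g_star` are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Element-wise gate mixing source and target contexts (sketch)."""

    def __init__(self, d_model: int):
        super().__init__()
        # Gate is computed from the concatenated source and target contexts.
        self.gate_proj = nn.Linear(2 * d_model, d_model)

    def forward(self, c_src: torch.Tensor, c_tgt: torch.Tensor):
        # g in (0, 1) weights the source contribution element-wise;
        # (1 - g) weights the target contribution.
        g = torch.sigmoid(self.gate_proj(torch.cat([c_src, c_tgt], dim=-1)))
        return g * c_src + (1.0 - g) * c_tgt, g

def gate_regularizer(g: torch.Tensor, g_star: torch.Tensor) -> torch.Tensor:
    # Penalize gates that deviate from the supervision g_star, which the
    # paper derives automatically from pointwise mutual information between
    # source and target words (that construction is not reproduced here).
    return ((g - g_star) ** 2).mean()
```

In training, the regularizer would be added to the translation loss with a weighting coefficient, nudging the gates toward the PMI-derived targets while the gate itself remains differentiable end to end.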
Similar Papers
Target-Guided Structured Attention Network for Target-dependent Sentiment Analysis
Ji Zhang, Chengyao Chen, Pengfei Liu, Chao He, Cane Wing-Ki Leung

The Cascade Transformer: an Application for Efficient Answer Sentence Selection
Luca Soldaini, Alessandro Moschitti

Self-Attention Guided Copy Mechanism for Abstractive Summarization
Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He, Bowen Zhou

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation
Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong
