End-to-End Neural Word Alignment Outperforms GIZA++

Thomas Zenkel; Joern Wuebker; John DeNero


Machine Translation Long Paper

Session 2B: Jul 6 (09:00-10:00 GMT)

Session 3A: Jul 6 (12:00-13:00 GMT)

Abstract: Word alignment was once a core unsupervised learning task in natural language processing because of its essential role in training statistical machine translation (MT) models. Although unnecessary for training neural MT models, word alignment still plays an important role in interactive applications of neural machine translation, such as annotation transfer and lexicon injection. While statistical MT methods have been replaced by neural approaches with superior performance, the twenty-year-old GIZA++ toolkit remains a key component of state-of-the-art word alignment systems. Prior work on neural word alignment has only been able to outperform GIZA++ by using its output during training. We present the first end-to-end neural word alignment method that consistently outperforms GIZA++ on three data sets. Our approach repurposes a Transformer model trained for supervised translation to also serve as an unsupervised word alignment model in a manner that is tightly integrated and does not affect translation quality.

Similar Papers

Neighborhood Matching Network for Entity Alignment

Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao

Unsupervised Cross-lingual Representation Learning at Scale

Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

Checkpoint Reranking: An Approach to Select Better Hypothesis for Neural Machine Translation Systems

Vinay Pandramish, Dipti Misra Sharma

Multiscale Collaborative Deep Models for Neural Machine Translation

Xiangpeng Wei, Heng Yu, Yue Hu, Yue Zhang, Rongxiang Weng, Weihua Luo