Rigid Formats Controlled Text Generation

Piji Li, Haisong Zhang, Xiaojiang Liu, Shuming Shi


Generation Long Paper

Session 1B: Jul 6 (06:00-07:00 GMT)
Session 3A: Jul 6 (12:00-13:00 GMT)
Abstract: Neural text generation has made tremendous progress on a variety of tasks. In most of these tasks, the generated text is not restricted to any rigid format. However, some special text paradigms do impose one, such as lyrics (assuming the music score is given), sonnets, and SongCi (classical Chinese poetry of the Song dynasty). The typical characteristics of these texts are threefold: (1) they must comply fully with rigid predefined formats; (2) they must obey rhyming schemes; (3) although they are restricted to a format, sentence integrity must be guaranteed. To the best of our knowledge, text generation based on predefined rigid formats has not been well investigated. Therefore, we propose a simple and elegant framework named SongNet to tackle this problem. The backbone of the framework is a Transformer-based auto-regressive language model. Sets of symbols are tailor-designed to improve modeling performance, especially on format, rhyme, and sentence integrity. We improve the attention mechanism so that the model captures some future information about the format. A pre-training and fine-tuning framework is designed to further improve generation quality. Extensive experiments on two collected corpora demonstrate that the proposed framework generates significantly better results in terms of both automatic metrics and human evaluation.
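To make the first requirement concrete, the following is a minimal sketch of a format-compliance check. It is not taken from the paper or from SongNet: the template representation (a list of required clause lengths, counted in characters), the punctuation set, and the function names `clause_lengths` and `complies_with_format` are all assumptions made for illustration.

```python
import re
from typing import List

# Assumption: clauses are delimited by these punctuation marks or by newlines.
_CLAUSE_DELIMITERS = r"[,.;!?，。；！？、\n]+"


def clause_lengths(text: str) -> List[int]:
    """Return the number of non-space characters in each clause of `text`."""
    clauses = re.split(_CLAUSE_DELIMITERS, text)
    # Drop empty pieces, e.g. the one left after a trailing punctuation mark.
    return [len(c.replace(" ", "")) for c in clauses if c.strip()]


def complies_with_format(text: str, template: List[int]) -> bool:
    """Check that `text` has exactly the clause lengths listed in `template`.

    `template` is a hypothetical format description: one required length
    per clause, in order.
    """
    return clause_lengths(text) == template


if __name__ == "__main__":
    # A well-known five-character couplet, used only as an illustration.
    couplet = "床前明月光，疑是地上霜。"
    print(complies_with_format(couplet, [5, 5]))
    print(complies_with_format(couplet, [5, 7]))
```

Counting characters suits character-based forms such as SongCi; for English lyrics or sonnets one would more likely count words or syllables, which this sketch does not attempt. Rhyme and sentence integrity need separate checks.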

Similar Papers

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
Logical Natural Language Generation from Open-Domain Tables
Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
SEEK: Segmented Embedding of Knowledge Graphs
Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu