Handling Rare Entities for Neural Sequence Labeling

Yangming Li; Han Li; Kaisheng Yao; Xiaolong Li

Information Extraction Long Paper

Session 11B: Jul 8 (06:00-07:00 GMT)

Session 12B: Jul 8 (09:00-10:00 GMT)

Abstract: One great challenge in neural sequence labeling is the data sparsity problem for rare entity words and phrases. Most test-set entities appear only a few times in the training corpus, or not at all, yielding a large number of out-of-vocabulary (OOV) and low-frequency (LF) entities during evaluation. In this work, we propose approaches to address this problem. For OOV entities, we introduce local context reconstruction to implicitly incorporate contextual information into their representations. For LF entities, we present delexicalized entity identification to explicitly extract their frequency-agnostic and entity-type-specific representations. Extensive experiments on multiple benchmark datasets show that our model significantly outperforms all previous methods and achieves new state-of-the-art results. Notably, our methods surpass the model fine-tuned on pre-trained language models without external resources.
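To make the OOV/LF distinction in the abstract concrete, the sketch below buckets test-set entities by how often they occur in the training corpus. It is only an illustration, not the authors' procedure: the function name `bucket_test_entities`, the `lf_threshold` cutoff, and the toy entity lists are all assumptions introduced here, and the abstract does not say what frequency counts as "low".

```python
from collections import Counter


def bucket_test_entities(train_entities, test_entities, lf_threshold=10):
    """Split test entities into OOV, low-frequency (LF), and frequent groups.

    lf_threshold is a hypothetical cutoff chosen for illustration; the
    abstract does not specify one.
    """
    train_counts = Counter(train_entities)
    buckets = {"oov": [], "lf": [], "frequent": []}
    for entity in test_entities:
        count = train_counts[entity]  # Counter returns 0 for unseen keys
        if count == 0:
            buckets["oov"].append(entity)       # never seen in training
        elif count < lf_threshold:
            buckets["lf"].append(entity)        # seen, but rarely
        else:
            buckets["frequent"].append(entity)  # seen often enough
    return buckets


# Toy data (made up for this example).
train = ["Paris", "Paris", "Paris", "Berlin", "Acme Corp"]
test = ["Paris", "Berlin", "Zurich"]
buckets = bucket_test_entities(train, test, lf_threshold=3)
print(buckets)
```

With a split like this, a tagger's accuracy could be reported separately per bucket, which is one way to see how much of the error comes from rare entities.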

Similar Papers

Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge

Keqing He, Yuanmeng Yan, Weiran Xu

Soft Gazetteers for Low-Resource Named Entity Recognition

Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell

Improving Entity Linking through Semantic Reinforced Entity Embeddings

Feng Hou, Ruili Wang, Jun He, Yi Zhou

Empower Entity Set Expansion via Language Model Probing

Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han