ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages

Colin Lockard, Prashant Shiralkar, Xin Luna Dong, Hannaneh Hajishirzi


Information Extraction Long Paper

Session 14A: Jul 8 (17:00-18:00 GMT)
Session 15A: Jul 8 (20:00-21:00 GMT)
Abstract: In many documents, such as semi-structured webpages, textual semantics are augmented with additional information conveyed using visual elements including layout, font size, and color. Prior work on information extraction from semi-structured websites has required learning an extraction model specific to a given template via either manually labeled or distantly supervised data from that template. In this work, we propose a solution for "zero-shot" open-domain relation extraction from webpages with a previously unseen template, including from websites with little overlap with existing sources of knowledge for distant supervision and websites in entirely new subject verticals. Our model uses a graph neural network-based approach to build a rich representation of text fields on a webpage and the relationships between them, enabling generalization to new templates. Experiments show this approach provides a 31% F1 gain over a baseline for zero-shot extraction in a new subject vertical.
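As a rough illustration of the idea in the abstract (text fields as graph nodes, layout relationships as edges, and a graph-based step that mixes information between neighbouring fields), here is a minimal sketch. It is not the authors' model: the `TextField` class, the alignment tolerance `tol`, the nearest-right/nearest-below edge rule, the toy features, and the mean-aggregation step are all assumptions chosen only to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class TextField:
    # Hypothetical layout features for one text field on a rendered page.
    text: str
    x: float          # left edge, in pixels
    y: float          # top edge, in pixels
    font_size: float


def build_layout_graph(fields, tol=5.0):
    """Link each field to its nearest aligned neighbour to the right and below.

    This edge rule is an assumption for illustration, not the paper's definition.
    """
    edges = set()
    for i, a in enumerate(fields):
        right = [(b.x - a.x, j) for j, b in enumerate(fields)
                 if j != i and abs(b.y - a.y) <= tol and b.x > a.x]
        below = [(b.y - a.y, j) for j, b in enumerate(fields)
                 if j != i and abs(b.x - a.x) <= tol and b.y > a.y]
        for candidates in (right, below):
            if candidates:
                _, j = min(candidates)          # closest candidate wins
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def message_pass(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {i: [i] for i in range(len(features))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dim = len(features[0])
    return [
        [sum(features[n][d] for n in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(len(features))
    ]


# Toy page with two key/value rows (made-up data).
fields = [
    TextField("Director", 10, 100, 12),
    TextField("Jane Doe", 120, 100, 12),
    TextField("Genre", 10, 130, 12),
    TextField("Drama", 120, 130, 12),
]
edges = build_layout_graph(fields)
# Toy node features: font size and text length.
features = [[f.font_size, float(len(f.text))] for f in fields]
updated = message_pass(features, edges)
```

Because the edges depend only on layout rather than on any site-specific markup, the same construction can be applied to a page from an unseen template, which is the intuition behind generalizing to new templates. A real graph neural network would replace the mean aggregation with learned transformations.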

Similar Papers

Improving Neural Machine Translation with Soft Template Prediction
Jian Yang, Shuming Ma, Dongdong Zhang, Zhoujun Li, Ming Zhou
Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
Yutai Hou, Wanxiang Che, Yongkui Lai, Zhihan Zhou, Yijia Liu, Han Liu, Ting Liu
Zero-shot Text Classification via Reinforced Self-training
Zhiquan Ye, Yuxia Geng, Jiaoyan Chen, Jingmin Chen, Xiaoxiao Xu, Suhang Zheng, Feng Wang, Jun Zhang, Huajun Chen