TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories
Giannis Karamanolakis, Jun Ma, Xin Luna Dong
Information Extraction, Long Paper
Session 14B: Jul 8 (18:00-19:00 GMT)
Session 15B: Jul 8 (21:00-22:00 GMT)
Abstract:
Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. Through category conditional self-attention and multi-task learning, our approach is both scalable, as it trains a single model for thousands of categories, and effective, as it extracts category-specific attribute values. Experiments on products from a taxonomy with 4,000 categories show that TXtract outperforms state-of-the-art approaches by up to 10% in F1 and 15% in coverage across all categories.
Similar Papers
Taxonomy Construction of Unseen Domains via Graph-based Cross-Domain Knowledge Transfer
Chao Shang, Sarthak Dash, Md. Faisal Mahbub Chowdhury, Nandana Mihindukulasooriya, Alfio Gliozzo

Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness
Sixing Wu, Ying Li, Dawei Zhang, Yang Zhou, Zhonghai Wu

Review-based Question Generation with Adaptive Instance Transfer and Augmentation
Qian Yu, Lidong Bing, Qiong Zhang, Wai Lam, Luo Si

Text Classification with Negative Supervision
Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Chenhui Chu, Yuki Arase
