Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning

Zhuoren Jiang; Zhe Gao; Yu Duan; Yangyang Kang; Changlong Sun; Qiong Zhang; Xiaozhong Liu

NLP Applications Short Paper

Session 6A: Jul 7 (05:00-06:00 GMT)

Session 7B: Jul 7 (09:00-10:00 GMT)

Abstract: We propose a Semi-supervIsed GeNerative Active Learning (SIGNAL) model to address the imbalance, efficiency, and text camouflage problems of the Chinese text spam detection task. A “self-diversity” criterion is proposed for measuring the “worthiness” of a candidate for annotation. A semi-supervised variational autoencoder with a masked attention learning approach and a character variation graph-enhanced augmentation procedure are proposed for data augmentation. The preliminary experiment demonstrates that the proposed SIGNAL model is not only sensitive to spam sample selection but can also improve the performance of a series of conventional active learning models for the Chinese spam detection task. To the best of our knowledge, this is the first work to integrate active learning and semi-supervised generative learning for text spam detection.

Similar Papers

Interpretable Operational Risk Classification with Semi-Supervised Variational Autoencoder

Fan Zhou, Shengming Zhang, Yi Yang

Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation

Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang

Empowering Active Learning to Jointly Optimize System and User Demands

Ji-Ung Lee, Christian M. Meyer, Iryna Gurevych

Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification

Guangfeng Yan, Lu Fan, Qimai Li, Han Liu, Xiaotong Zhang, Xiao-Ming Wu, Albert Y.S. Lam