An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering

Jay Kumar, Junming Shao, Salah Uddin, Wazir Ali


Information Retrieval and Text Mining Long Paper

Session 1B: Jul 6 (06:00-07:00 GMT)
Session 3A: Jul 6 (12:00-13:00 GMT)
Abstract: Clustering short text streams is a challenging task due to their unique properties: infinite length, sparse data representation, and cluster evolution. Existing approaches often process short text streams in batches. However, determining the optimal batch size is usually difficult, since we have no a priori knowledge of when the topics evolve. In addition, the traditional independent word representation in graphical models tends to cause a "term ambiguity" problem in short text clustering. Therefore, in this paper, we propose an Online Semantic-enhanced Dirichlet Model for short text stream clustering, called OSDM, which integrates word-occurrence semantic information (i.e., context) into a new graphical model and clusters each arriving short text automatically in an online way. Extensive results demonstrate that OSDM performs better than many state-of-the-art algorithms on both synthetic and real-world data sets.

Similar Papers

Autoencoding Keyword Correlation Graph for Document Clustering
Billy Chiu, Sunil Kumar Sahu, Derek Thomas, Neha Sengupta, Mohammady Mahdy
Neural Mixed Counting Models for Dispersed Topic Discovery
Jiemin Wu, Yanghui Rao, Zusheng Zhang, Haoran Xie, Qing Li, Fu Lee Wang, Ziye Chen
Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge
Bowen Zhang, Min Yang, Xutao Li, Yunming Ye, Xiaofei Xu, Kuai Dai
Improving Adversarial Text Generation by Modeling the Distant Future
Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin