T5: Achieving Common Ground in Multi-modal Dialogue

Malihe Alikhani and Matthew Stone

Live Session 1: Jul 5 (13:00-16:30 GMT)
Live Session 2: Jul 5 (22:00-01:30 GMT)
Abstract: All communication aims at achieving common ground (grounding): interlocutors can work together effectively only with mutual beliefs about what the state of the world is, about what their goals are, and about how they plan to make their goals a reality. Computational dialogue research offers some classic results on grounding, which unfortunately offer scant guidance for the design of grounding modules and behaviors in cutting-edge systems. In this tutorial, we focus on three main topic areas: 1) grounding in human-human communication; 2) grounding in dialogue systems; and 3) grounding in multi-modal interactive systems, including image-oriented conversations and human-robot interactions. We highlight a number of achievements of recent computational research in coordinating complex content, show how these results lead to rich and challenging opportunities for doing grounding in more flexible and powerful ways, and canvass relevant insights from the literature on human-human conversation. We expect that the tutorial will be of interest to researchers in dialogue systems, computational semantics and cognitive modeling, and hope that it will catalyze research and system building that more directly explores the creative, strategic ways conversational agents might be able to seek and offer evidence about their understanding of their interlocutors.

Information about the virtual format of this tutorial: This tutorial has a prerecorded talk on this page (see below) that you can watch anytime during the conference. It also has two live sessions that will be conducted on Zoom and livestreamed on this page. Additionally, it has a chat window that you can use to have discussions with the tutorial instructors and other attendees anytime during the conference.
