Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition

Paloma Jeretic, Alex Warstadt, Suvrat Bhooshan, Adina Williams

Semantics: Textual Inference and Other Areas of Semantics (Long Paper)

Session 14B: Jul 8 (18:00-19:00 GMT)

Session 15A: Jul 8 (20:00-21:00 GMT)

Abstract: Natural language inference (NLI) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. However, the ability of NLI models to make pragmatic inferences remains understudied. We create an IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of 32K semi-automatically generated sentence pairs illustrating well-studied pragmatic inference types. We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. Although MultiNLI appears to contain very few pairs illustrating these inference types, we find that BERT learns to draw pragmatic inferences. It reliably treats scalar implicatures triggered by "some" as entailments. For some presupposition triggers like "only", BERT reliably recognizes the presupposition as an entailment, even when the trigger is embedded under an entailment-canceling operator like negation. BOW and InferSent show weaker evidence of pragmatic reasoning. We conclude that NLI training encourages models to learn some, but not all, pragmatic inferences.

Similar Papers

Harnessing the linguistic signal to predict scalar inferences

Sebastian Schuster, Yuxing Chen, Judith Degen

Inherent Disagreements in Human Textual Inferences

Ellie Pavlick, Tom Kwiatkowski

Logical Inferences with Comparatives and Generalized Quantifiers

Izumi Haruta, Koji Mineshima, Daisuke Bekki

Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?

Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui