The Paradigm Discovery Problem

Alexander Erdmann, Micha Elsner, Shijie Wu, Ryan Cotterell, Nizar Habash


Phonology, Morphology and Word Segmentation Long Paper

Session 13B: Jul 8 (13:00-14:00 GMT)
Session 14A: Jul 8 (17:00-18:00 GMT)
Abstract: This work treats the paradigm discovery problem (PDP), the task of learning an inflectional morphological system from unannotated sentences. We formalize the PDP and develop evaluation metrics for judging systems. Using currently available resources, we construct datasets for the task. We also devise a heuristic benchmark for the PDP and report empirical results on five diverse languages. Our benchmark system first makes use of word embeddings and string similarity to cluster forms by cell and by paradigm. Then, we bootstrap a neural transducer on top of the clustered data to predict words to realize the empty paradigm slots. An error analysis of our system suggests clustering by cell across different inflection classes is the most pressing challenge for future work.
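To make the clustering step in the abstract concrete, here is a minimal sketch of grouping unannotated word forms into candidate paradigms. It is not the authors' benchmark system: it uses string similarity only (no word embeddings, no clustering by cell, no neural transducer), and the function names, the single-link merging strategy, the 0.6 threshold, and the English toy data are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_forms(forms, threshold=0.6):
    """Greedy single-link clustering of word forms (hypothetical helper).

    A form joins every existing cluster that contains at least one member
    whose similarity to it reaches `threshold`; all such clusters are merged.
    The threshold value is an illustrative assumption, not a tuned setting.
    """
    clusters = []
    for form in forms:
        merged = [form]
        remaining = []
        for cluster in clusters:
            if any(similarity(form, member) >= threshold for member in cluster):
                merged.extend(cluster)     # linked: fold into the new cluster
            else:
                remaining.append(cluster)  # not linked: keep unchanged
        remaining.append(merged)
        clusters = remaining
    return clusters


if __name__ == "__main__":
    toy_forms = ["walk", "walks", "walked", "sing", "sings"]
    for paradigm in cluster_forms(toy_forms):
        print(sorted(paradigm))
```

On this toy list the sketch separates the forms of "walk" from the forms of "sing". Similarity based on surface strings alone cannot link suppletive pairs such as "go" and "went", which share no characters, so a realistic system needs additional signals such as the word embeddings the abstract mentions.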

Similar Papers

Implicit Discourse Relation Classification: We Need to Talk about Evaluation
Najoung Kim, Song Feng, Chulaka Gunasekara, Luis Lastras,
Unsupervised Morphological Paradigm Completion
Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya McCarthy, Katharina Kann,
The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain
Annemarie Friedrich, Heike Adel, Federico Tomazic, Johannes Hingerl, Renou Benteau, Anika Marusczyk, Lukas Lange,
Frugal Paradigm Completion
Alexander Erdmann, Tom Kenter, Markus Becker, Christian Schallhart,