Abstract
In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precedes syntactic and semantic downstream tasks. However, for languages with complex and ambiguous word-internal structure, known as morphologically rich languages (MRLs), it has been hypothesized that syntactic context may be crucial for accurate MA&D, and vice versa. In this work we empirically confirm this hypothesis for Modern Hebrew, an MRL with complex morphology and severe word-level ambiguity, in a novel transition-based framework. Specifically, we propose a joint morphosyntactic transition-based framework which formally unifies two distinct transition systems, morphological and syntactic, into a single transition-based system with joint training and joint inference. We empirically show that MA&D results obtained in the joint settings outperform MA&D results obtained by the respective standalone components, and that end-to-end parsing results obtained by our joint system present a new state of the art for Hebrew dependency parsing.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 33-48 |
Number of pages | 16 |
Journal | Transactions of the Association for Computational Linguistics |
Volume | 7 |
State | Published - 2019 |
Bibliographical note
Funding Information: We thank Joakim Nivre, Yue Zhang, and Yoav Goldberg for comments and suggestions, and four anonymous reviewers for their comments on earlier drafts. We further thank Shoval Sadde, Yochay Gurman, and Dan Bareket from the ONLP Lab at the Open University of Israel for critical discussion of the data and the empirical results. This research was supported by a Starting Research Grant from the European Research Council (ERC-StG-677352), and a grant from the Israel Science Foundation (ISF-1739/26), for which we are grateful.
Publisher Copyright:
© 2019 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.