Morphological tasks use large multi-lingual datasets that organize words into inflection tables, which then serve as training and evaluation data for various tasks. However, a closer inspection of these data reveals profound cross-linguistic inconsistencies, which arise from the lack of a clear linguistic and operational definition of what is a word, and which severely impair the universality of the derived tasks. To overcome this deficiency, we propose to view morphology as a clause-level phenomenon, rather than word-level. It is anchored in a fixed yet inclusive set of features that encapsulates all functions realized in a saturated clause. We deliver MIGHTYMORPH, a novel dataset for clause-level morphology covering 4 typologically different languages: English, German, Turkish, and Hebrew. We use this dataset to derive 3 clause-level morphological tasks: inflection, reinflection and analysis. Our experiments show that the clause-level tasks are substantially harder than the respective word-level tasks, while having comparable complexity across languages. Furthermore, redefining morphology to the clause-level provides a neat interface with contextualized language models (LMs) and allows assessing the morphological knowledge encoded in these models and their usability for morphological tasks. Taken together, this work opens up new horizons in the study of computational morphology, leaving ample space for studying neural morphology cross-linguistically.
|Number of pages||18|
|Journal||Transactions of the Association for Computational Linguistics|
|State||Published - 2022|
Bibliographical note
Funding Information:
We would like to thank the TACL anonymous reviewers and the action editor for their insightful suggestions and remarks. This work was funded by an ERC-StG grant from the European Research Council, grant number 677352 (NLPRO), and by an innovation grant by the Ministry of Science and Technology (MOST) 0002214, for which we are grateful.
© 2022 Association for Computational Linguistics. All rights reserved.