Abstract
Parsing texts into Universal Dependencies (UD) in realistic scenarios requires infrastructure for morphological analysis and disambiguation (MA&D) of typologically different languages as a first tier. MA&D is particularly challenging in morphologically rich languages (MRLs), where the ambiguous space-delimited tokens ought to be disambiguated with respect to their constituent morphemes. Here we present a novel, language-agnostic framework for MA&D, based on a transition system with two variants, word-based and morpheme-based, and a dedicated transition to mitigate the biases of variable-length morpheme sequences. In experiments on a Modern Hebrew case study, our system outperforms the state of the art, and we show that the morpheme-based MD consistently outperforms our word-based variant. We further illustrate the utility and multilingual coverage of our framework by morphologically analyzing and disambiguating the large set of languages in the UD treebanks.
Original language | English |
---|---|
Title of host publication | COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016 |
Subtitle of host publication | Technical Papers |
Publisher | Association for Computational Linguistics, ACL Anthology |
Pages | 337-348 |
Number of pages | 12 |
ISBN (Print) | 9784879747020 |
Publication status | Published - 2016 |
Event | 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan. Duration: 11 Dec 2016 → 16 Dec 2016 |
Publication series
Name | COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers |
---|
Conference
Conference | 26th International Conference on Computational Linguistics, COLING 2016 |
---|---|
Country/Territory | Japan |
City | Osaka |
Period | 11/12/16 → 16/12/16 |
Bibliographical note
Publisher Copyright: © 1963-2018 ACL.