Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well to sub-word prediction tasks, such as Named Entity Recognition, given the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for large LMs in morphologically rich languages (MRLs) such as Hebrew. We demonstrate this by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, for which we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a separate, specialized, morpheme-based decoder. Using this approach, our experiments show substantial improvements over previously published results on all existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
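To make the idea of casting a pipeline task as text-to-text concrete, the following is a minimal sketch of how a labeled NER example might be turned into a source/target string pair and parsed back. The task prefix, the `text [LABEL]` target format, and the helper names `to_text_to_text` and `parse_target` are assumptions made for illustration; the abstract does not specify the formats used in the paper.

```python
from typing import List, Tuple


def to_text_to_text(task: str, sentence: str,
                    spans: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Cast a labeled example as a (source, target) string pair.

    The prompt prefix and the 'text [LABEL]' target layout are
    hypothetical, chosen only to illustrate the text-to-text framing.
    """
    source = f"{task}: {sentence}"
    target = " ; ".join(f"{text} [{label}]" for text, label in spans)
    return source, target


def parse_target(target: str) -> List[Tuple[str, str]]:
    """Invert the hypothetical target layout into (text, label) pairs."""
    spans = []
    for chunk in target.split(" ; "):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the last " [" so entity text may itself contain spaces.
        text, _, label = chunk.rpartition(" [")
        spans.append((text, label.rstrip("]")))
    return spans


# Illustrative sentence: "he flew to London". The last token fuses the
# preposition prefix with the location name, so the entity is a sub-word
# span, and a generated target can name it directly as a string.
source, target = to_text_to_text("ner", "הוא טס ללונדון", [("לונדון", "LOC")])
```

In this framing, a sequence-to-sequence model would be trained to map `source` to `target`, and `parse_target` recovers structured predictions from the generated string, so no separate morpheme-level decoder is needed in the sketch.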
| Original language | English |
|---|---|
| Title of host publication | Findings of the Association for Computational Linguistics, ACL 2023 |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 7700-7708 |
| Number of pages | 9 |
| ISBN (Electronic) | 9781959429623 |
| Digital Object Identifiers (DOIs) | |
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | Findings of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada. Duration: 9 Jul 2023 → 14 Jul 2023 |
Publication series
| Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
|---|---|
| ISSN (Print) | 0736-587X |
Conference
| Conference | Findings of the Association for Computational Linguistics, ACL 2023 |
|---|---|
| Country/Territory | Canada |
| City | Toronto |
| Period | 9/07/23 → 14/07/23 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Fingerprint
The research topics of the publication 'Multilingual Sequence-to-Sequence Models for Hebrew NLP' together form a unique fingerprint.