Universal joint morph-syntactic processing: The Open University of Israel’s submission to the CoNLL 2017 shared task

Amir More, Reut Tsarfaty

Research output: Chapter in book / report / conference proceeding › Conference contribution › Peer-reviewed

Abstract

We present the Open University’s submission (ID OpenU-NLP-Lab) to the CoNLL 2017 UD Shared Task on multilingual parsing from raw text to Universal Dependencies. The core of our system is a joint morphological disambiguator and syntactic parser that accepts morphologically analyzed surface tokens as input and returns morphologically disambiguated dependency trees as output. Because our parser requires a lattice as input, we generate morphological analyses of surface tokens using a data-driven morphological analyzer that derives its lexicon from the UD training corpora, and we rely on UDPipe for sentence segmentation and surface-level tokenization. Our official macro-average LAS is 56.56. Although our model does not perform as well as many others, it does not use neural networks, and therefore does not rely on word embeddings or on any data source other than the corpora themselves. In addition, we show the utility of a lexicon-backed morphological analyzer for Modern Hebrew, a morphologically rich language (MRL). We use our results on Modern Hebrew to argue that the UD community should define a UD-compatible standard for access to lexical resources, which we argue is crucial for MRLs and for low-resource languages in particular.
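The idea of a data-driven analyzer whose lexicon comes from the UD training corpora can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_lexicon` and `analyze` and the tiny sample treebank are hypothetical, and the sketch assumes one analysis per token, so it ignores multiword tokens and the full lattice of segmentation paths that the parser described above consumes.

```python
from collections import defaultdict

def build_lexicon(conllu_text):
    """Map each surface form to the (lemma, UPOS, FEATS) analyses seen in training.

    Hypothetical helper: assumes standard 10-column CoNLL-U input.
    """
    lexicon = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos, feats = cols[1], cols[2], cols[3], cols[5]
        lexicon[form].add((lemma, upos, feats))
    return lexicon

def analyze(tokens, lexicon):
    """Return the sorted candidate analyses per token; unseen tokens get an empty list."""
    return [sorted(lexicon.get(tok, ())) for tok in tokens]

# Invented two-sentence sample, for illustration only.
SAMPLE = "\n".join([
    "# text = They run",
    "1\tThey\tthey\tPRON\t_\tNumber=Plur\t2\tnsubj\t_\t_",
    "2\trun\trun\tVERB\t_\tTense=Pres\t0\troot\t_\t_",
    "",
    "# text = A run",
    "1\tA\ta\tDET\t_\t_\t2\tdet\t_\t_",
    "2\trun\trun\tNOUN\t_\tNumber=Sing\t0\troot\t_\t_",
])

lexicon = build_lexicon(SAMPLE)
candidates = analyze(["A", "run", "today"], lexicon)
```

Here the ambiguous form `run` keeps both its noun and verb analyses for a downstream disambiguator to choose between, while the unseen form `today` receives no candidates, which is the kind of coverage gap an external lexicon could fill.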

Original language: English
Title of host publication: CoNLL 2017 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2017 Shared Task
Subtitle of host publication: Multilingual Parsing from Raw Text to Universal Dependencies
Publisher: Association for Computational Linguistics (ACL)
Pages: 253-264
Number of pages: 12
ISBN (electronic): 9781945626708
Publication status: Published - 2017
Event: 2017 SIGNLL Conference on Computational Natural Language Learning - CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2017 - Vancouver, Canada
Duration: 3 Aug 2017 → 4 Aug 2017

Publication series

Name: CoNLL 2017 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Conference

Conference: 2017 SIGNLL Conference on Computational Natural Language Learning - CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2017
Country/Territory: Canada
City: Vancouver
Period: 3/08/17 → 4/08/17

Bibliographical note

Publisher Copyright:
© 2017 Association for Computational Linguistics.
