TY - JOUR
T1 - Text-based NP Enrichment
AU - Elazar, Yanai
AU - Basmov, Victoria
AU - Goldberg, Yoav
AU - Tsarfaty, Reut
N1 - Publisher Copyright:
© MIT Press Journals. All rights reserved.
PY - 2022/7/27
Y1 - 2022/7/27
N2 - Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding. However, only a fraction of such relations is covered by standard NLP tasks and benchmarks nowadays. In this work, we propose a novel task termed text-based NP enrichment (TNE), in which we aim to enrich each NP in a text with all the preposition-mediated relations—either explicit or implicit—that hold between it and other NPs in the text. The relations are represented as triplets, each denoted by two NPs related via a preposition. Humans recover such relations seamlessly, while current state-of-the-art models struggle with them due to the implicit nature of the problem. We build the first large-scale dataset for the problem, provide the formal framing and scope of annotation, analyze the data, and report the results of fine-tuned language models on the task, demonstrating the challenge it poses to current technology. A webpage with a data-exploration UI, a demo, and links to the code, models, and leaderboard, to foster further research into this challenging problem can be found at: yanaiela.github.io/TNE/.
AB - Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding. However, only a fraction of such relations is covered by standard NLP tasks and benchmarks nowadays. In this work, we propose a novel task termed text-based NP enrichment (TNE), in which we aim to enrich each NP in a text with all the preposition-mediated relations—either explicit or implicit—that hold between it and other NPs in the text. The relations are represented as triplets, each denoted by two NPs related via a preposition. Humans recover such relations seamlessly, while current state-of-the-art models struggle with them due to the implicit nature of the problem. We build the first large-scale dataset for the problem, provide the formal framing and scope of annotation, analyze the data, and report the results of fine-tuned language models on the task, demonstrating the challenge it poses to current technology. A webpage with a data-exploration UI, a demo, and links to the code, models, and leaderboard, to foster further research into this challenging problem can be found at: yanaiela.github.io/TNE/.
UR - http://www.scopus.com/inward/record.url?scp=85135438167&partnerID=8YFLogxK
U2 - 10.1162/tacl_a_00488
DO - 10.1162/tacl_a_00488
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:85135438167
SN - 2307-387X
VL - 10
SP - 764
EP - 784
JO - Transactions of the Association for Computational Linguistics
JF - Transactions of the Association for Computational Linguistics
ER -