TY - JOUR
T1 - Style Transfer of Modern Hebrew Literature Using Text Simplification and Generative Language Modeling
AU - Kaganovich, Pavel
AU - Münz-Manor, Ophir
AU - Ezra-Tsur, Elishai
N1 - Publisher Copyright:
© 2023 Copyright for this paper by its authors.
PY - 2023
Y1 - 2023
N2 - The task of Style Transfer (ST) in Natural Language Processing (NLP) involves altering the style of a given sentence to match another target style while preserving its semantics. Currently, the availability of Hebrew models for NLP, specifically generative models, is scarce. The development of such models is a non-trivial task due to the complex nature of Hebrew. The Hebrew language presents notable challenges to NLP as a result of its rich morphology, intricate inflectional structure, and orthography, which have undergone significant transformations throughout its history. In this work, we propose a generative ST model of the modern Hebrew language that rewrites sentences to a target style in the absence of parallel style corpora. Our focus is on the domain of Modern Hebrew literature, which presents unique challenges for the ST task. To overcome the lack of parallel data, we initially create a pseudo-parallel corpus using back translation (BT) techniques for the purpose of achieving text simplification. Subsequently, we fine-tune a pre-trained Hebrew language model (LM) and leverage a zero-shot learning (ZSL) approach for ST. Our study demonstrates significant achievements in terms of transfer accuracy, semantic similarity, and fluency in the ST of a source sentence to a target style using our model. Notably, to the best of our knowledge, no prior research has focused on the development of ST models specifically for Modern Hebrew literature. As such, our proposed model constitutes a novel and valuable contribution to the fields of Hebrew NLP, Modern Hebrew literature, and, more generally, computational literary studies.
KW - Computational Literary Studies
KW - Hebrew Language
KW - Language Model
KW - Modern Hebrew Literature
KW - Natural Language Processing
KW - Style Transfer
UR - http://www.scopus.com/inward/record.url?scp=85178657646&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85178657646
SN - 1613-0073
VL - 3558
SP - 391
EP - 412
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2023 Computational Humanities Research Conference, CHR 2023
Y2 - 6 December 2023 through 8 December 2023
ER -