Scalable attentive sentence-pair modeling via distilled sentence embedding

Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-reviewed

Abstract

Recent state-of-the-art natural language understanding models, such as BERT and XLNet, score a pair of sentences (A and B) using multiple cross-attention operations - a process in which each word in sentence A attends to all words in sentence B and vice versa. As a result, computing the similarity between a query sentence and a set of candidate sentences requires the propagation of all query-candidate sentence-pairs throughout a stack of cross-attention layers. This exhaustive process becomes computationally prohibitive when the number of candidate sentences is large. In contrast, sentence embedding techniques learn a sentence-to-vector mapping and compute the similarity between the sentence vectors via simple elementary operations. In this paper, we introduce Distilled Sentence Embedding (DSE) - a model that is based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks. The outline of DSE is as follows: Given a cross-attentive teacher model (e.g., a fine-tuned BERT), we train a sentence-embedding-based student model to reconstruct the sentence-pair scores obtained by the teacher model. We empirically demonstrate the effectiveness of DSE on five GLUE sentence-pair tasks. DSE significantly outperforms several ELMo variants and other sentence embedding methods, while accelerating computation of the query-candidate sentence-pair similarities by several orders of magnitude, with an average relative degradation of 4.6% compared to BERT. Furthermore, we show that DSE produces sentence embeddings that reach state-of-the-art performance on universal sentence representation benchmarks. Our code is made publicly available at https://github.com/microsoft/Distilled-Sentence-Embedding.
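
The training outline in the abstract can be illustrated with a toy Python sketch. It is not the authors' implementation (their code is in the repository linked above). The teacher scores are made-up numbers, each sentence gets a free lookup vector instead of an encoder output, and the dot-product scoring function, the squared-error loss, and all names (`TEACHER_SCORES`, `distill`, `rank_candidates`) are assumptions chosen for brevity.

```python
import random

# Made-up teacher scores for sentence pairs (i, j). In DSE these would come
# from a cross-attentive teacher such as a fine-tuned BERT (assumption: the
# values here are invented purely for illustration).
TEACHER_SCORES = {
    (0, 1): 0.9, (0, 2): 0.1, (0, 3): 0.2,
    (1, 2): 0.1, (1, 3): 0.2, (2, 3): 0.8,
}


def dot(u, v):
    """Elementary similarity operation between two sentence vectors."""
    return sum(a * b for a, b in zip(u, v))


def distill(teacher_scores, n_sentences, dim=4, lr=0.1, epochs=3000, seed=0):
    """Fit one vector per sentence so that dot(vecs[i], vecs[j]) approximates
    the teacher's score for the pair (i, j), using plain SGD on squared error.

    Hypothetical simplification: a real student would map text to a vector
    with a trained encoder instead of storing a free vector per sentence.
    """
    rng = random.Random(seed)
    vecs = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(n_sentences)]
    pairs = sorted(teacher_scores.items())
    for _ in range(epochs):
        rng.shuffle(pairs)
        for (i, j), target in pairs:
            err = dot(vecs[i], vecs[j]) - target
            # Gradient of 0.5 * err**2 w.r.t. vecs[i] is err * vecs[j],
            # and symmetrically for vecs[j]; copy first so both updates
            # use the pre-update values.
            old_i, old_j = list(vecs[i]), list(vecs[j])
            for k in range(dim):
                vecs[i][k] -= lr * err * old_j[k]
                vecs[j][k] -= lr * err * old_i[k]
    return vecs


def rank_candidates(query_vec, candidate_vecs):
    """Rank cached candidate vectors against a query with dot products only."""
    scores = {idx: dot(query_vec, vec) for idx, vec in candidate_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


vectors = distill(TEACHER_SCORES, n_sentences=4)
candidates = {idx: vectors[idx] for idx in (1, 2, 3)}  # cached once, reused
ranking = rank_candidates(vectors[0], candidates)
max_error = max(abs(dot(vectors[i], vectors[j]) - t)
                for (i, j), t in TEACHER_SCORES.items())
print(ranking, round(max_error, 4))
```

The sketch is meant to show the cost profile the abstract describes: once the student is trained, candidate vectors can be computed and cached ahead of time, so scoring a query against N candidates takes N elementary vector operations rather than N passes through a stack of cross-attention layers.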

Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Pages: 3235-3242
Number of pages: 8
ISBN (Electronic): 9781577358350
Publication status: Published - 2020
Externally published: Yes
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: 7 February 2020 to 12 February 2020

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York
Period: 7 February 2020 to 12 February 2020

Bibliographical note

Publisher Copyright:
© 2020, Association for the Advancement of Artificial Intelligence.
