
Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting

  • Minghui Liao
  • Guan Pang
  • Jing Huang
  • Tal Hassner
  • Xiang Bai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Recent end-to-end trainable methods for scene text spotting, integrating detection and recognition, showed much progress. However, most of the current arbitrary-shape scene text spotters use region proposal networks (RPN) to produce proposals. RPN relies heavily on manually designed anchors and its proposals are represented with axis-aligned rectangles. The former presents difficulties in handling text instances of extreme aspect ratios or irregular shapes, and the latter often includes multiple neighboring instances into a single proposal, in cases of densely oriented text. To tackle these problems, we propose Mask TextSpotter v3, an end-to-end trainable scene text spotter that adopts a Segmentation Proposal Network (SPN) instead of an RPN. Our SPN is anchor-free and gives accurate representations of arbitrary-shape proposals. It is therefore superior to RPN in detecting text instances of extreme aspect ratios or irregular shapes. Furthermore, the accurate proposals produced by SPN allow masked RoI features to be used for decoupling neighboring text instances. As a result, our Mask TextSpotter v3 can handle text instances of extreme aspect ratios or irregular shapes, and its recognition accuracy won’t be affected by nearby text or background noise. Specifically, we outperform state-of-the-art methods by 21.9% on the Rotated ICDAR 2013 dataset (rotation robustness), 5.9% on the Total-Text dataset (shape robustness), and achieve state-of-the-art performance on the MSRA-TD500 dataset (aspect ratio robustness). Code is available at: https://github.com/MhLiao/MaskTextSpotterV3.
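The abstract states that accurate polygon proposals let masked RoI features decouple neighboring text instances, but it does not say how the masking is done. The sketch below shows one plausible reading, assuming the mask is a binary rasterization of the proposal polygon that is multiplied element-wise into each feature channel. The function names (`point_in_polygon`, `polygon_mask`, `mask_roi_features`), the nested-list feature layout, and the pixel-center sampling are all hypothetical choices made for illustration; they are not taken from the authors' code.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon.

    polygon is a list of (x, y) vertices in order (assumed simple polygon).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_mask(polygon, height, width):
    """Rasterize the polygon into a height x width binary mask.

    Each cell is sampled at its center (col + 0.5, row + 0.5); this sampling
    rule is an assumption of the sketch.
    """
    return [
        [1.0 if point_in_polygon(c + 0.5, r + 0.5, polygon) else 0.0
         for c in range(width)]
        for r in range(height)
    ]


def mask_roi_features(features, mask):
    """Zero out feature values outside the mask.

    features: nested lists shaped [channels][height][width]
    mask:     nested lists shaped [height][width] with values 0.0 or 1.0
    """
    return [
        [[value * m for value, m in zip(feature_row, mask_row)]
         for feature_row, mask_row in zip(channel, mask)]
        for channel in features
    ]


# Usage: a slanted quadrilateral proposal inside a 4x4 RoI.
# Cells outside the slanted shape (where a neighboring instance or
# background could sit in an axis-aligned box) are suppressed.
slanted_proposal = [(0, 0), (2, 0), (4, 4), (2, 4)]
roi_features = [[[1.0] * 4 for _ in range(4)]]  # one channel of ones
roi_mask = polygon_mask(slanted_proposal, 4, 4)
masked = mask_roi_features(roi_features, roi_mask)
for row in masked[0]:
    print(row)
```

With an axis-aligned rectangular proposal every cell of the RoI would pass through unchanged, which is how a neighboring instance can leak into the recognizer; the polygon mask removes those cells before recognition.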

Original language: English
Title of host publication: Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 706-722
Number of pages: 17
ISBN (Print): 9783030586201
Digital Object Identifiers (DOIs):
Publication status: Published - 2020
Externally published: Yes
Event: 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
Duration: 23 Aug 2020 → 28 Aug 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12356 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th European Conference on Computer Vision, ECCV 2020
Country/Territory: United Kingdom
City: Glasgow
Period: 23/08/20 → 28/08/20

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
