When standard RANSAC is not enough: Cross-media visual matching with hypothesis relevancy

Tal Hassner, Liav Assif, Lior Wolf

Research output: Contribution to journal › Article › peer-review


The same scene can be depicted by multiple visual media. For example, the same event can be captured by a comic image or a movie frame; the same object can be represented by a photograph or by a 3D computer graphics model. In order to extract the visual analogies that are at the heart of cross-media analysis, spatial matching is required. This matching is commonly achieved by extracting key points and scoring multiple, randomly generated mapping hypotheses. The more consensus a hypothesis can draw, the higher its score. In this paper, we go beyond the conventional set-size measure for the quality of a match and present a more general hypothesis score that attempts to reflect how likely each hypothesized transformation is to be the correct one for the matching task at hand. This is achieved by considering additional, contextual cues for the relevance of a hypothesized transformation. This context changes from one matching task to another and reflects different properties of the match, beyond the size of a consensus set. We demonstrate that by learning how to correctly score each hypothesis based on these features, we are able to deal much more robustly with the challenges of cross-media analysis, leading to correct matches where conventional methods fail.
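The hypothesize-and-score loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the affine model, the function names, the scale-based cue, and the fixed weights are all assumptions chosen for illustration, and the paper learns its score from task-specific features that are not reproduced here. The sketch only shows where the conventional consensus-size score sits in a RANSAC loop and how a more general score could be swapped in.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine map A (2x3) such that dst ~ A @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])      # (n, 3) homogeneous points
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)     # (3, 2) solution
    return sol.T                                      # (2, 3)

def residuals(A, src, dst):
    """Euclidean distance between mapped source points and their targets."""
    X = np.hstack([src, np.ones((len(src), 1))])
    return np.linalg.norm(X @ A.T - dst, axis=1)

def consensus_score(A, src, dst, tol):
    """Conventional score: the size of the consensus (inlier) set."""
    return float(np.sum(residuals(A, src, dst) < tol))

def relevance_score(A, src, dst, tol, weights=(1.0, -5.0)):
    """Hypothetical generalized score: consensus size plus one contextual cue.

    The cue (a penalty on area change far from 1) and the weights are
    placeholders; in the paper the score is learned, not hand-set.
    """
    inliers = consensus_score(A, src, dst, tol)
    det = abs(np.linalg.det(A[:, :2]))
    scale_penalty = abs(np.log(max(det, 1e-12)))
    return weights[0] * inliers + weights[1] * scale_penalty

def ransac(src, dst, score_fn, n_iter=500, tol=2.0, seed=0):
    """Sample minimal sets, fit a hypothesis, keep the best-scoring one."""
    rng = np.random.default_rng(seed)
    best_A, best_score = None, -np.inf
    for _ in range(n_iter):
        idx = rng.choice(len(src), size=3, replace=False)  # minimal affine sample
        A = fit_affine(src[idx], dst[idx])
        score = score_fn(A, src, dst, tol)
        if score > best_score:
            best_A, best_score = A, score
    return best_A, best_score
```

Calling `ransac(src, dst, consensus_score)` gives the standard behavior, while passing `relevance_score` (or any function with the same signature) changes only how hypotheses are ranked, which is the part of the pipeline the abstract proposes to generalize.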

Original language: English
Pages (from-to): 971-983
Number of pages: 13
Journal: Machine Vision and Applications
Volume: 25
Issue number: 4
Publication status: Published - May 2014

Bibliographical note

Funding Information:
TH was partially funded by General Motors (GM).

