When standard RANSAC is not enough: Cross-media visual matching with hypothesis relevancy

Tal Hassner, Liav Assif, Lior Wolf

Research output: Contribution to journal › Article › peer-review


The same scene can be depicted by multiple visual media. For example, the same event can be captured by a comic image or a movie frame; the same object can be represented by a photograph or by a 3D computer graphics model. To extract the visual analogies that are at the heart of cross-media analysis, spatial matching is required. This matching is commonly achieved by extracting key points and scoring multiple, randomly generated mapping hypotheses. The more consensus a hypothesis can draw, the higher its score. In this paper, we go beyond the conventional set-size measure for the quality of a match and present a more general hypothesis score that attempts to reflect how likely each hypothesized transformation is to be the correct one for the matching task at hand. This is achieved by considering additional, contextual cues for the relevance of a hypothesized transformation. This context changes from one matching task to another and reflects different properties of the match, beyond the size of a consensus set. We demonstrate that by learning how to correctly score each hypothesis based on these features, we are able to deal much more robustly with the challenges of cross-media analysis, leading to correct matches where conventional methods fail.
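The contrast between set-size scoring and a more general hypothesis score can be illustrated with a minimal sketch. This is not the authors' implementation: the translation-only model, the function names (`ransac_translation`, `consensus_size`, `relevance_score`), the single contextual feature (translation magnitude), and the hand-set weights are all assumptions made for illustration. In the paper the score is learned from task-specific features, which the abstract does not enumerate.

```python
import random


def ransac_translation(matches, score_fn, n_iters=200, tol=2.0, seed=0):
    """Pick the 2D translation hypothesis with the highest score.

    matches: list of ((x, y), (u, v)) putative key-point correspondences.
    score_fn: callable (translation, inliers) -> float; pluggable on purpose.
    """
    rng = random.Random(seed)
    best_score, best_t = float("-inf"), None
    for _ in range(n_iters):
        # Minimal sample: one correspondence fixes a translation hypothesis.
        (x, y), (u, v) = rng.choice(matches)
        t = (u - x, v - y)
        # Consensus set: correspondences that agree with t within tol.
        inliers = [
            m for m in matches
            if abs(m[0][0] + t[0] - m[1][0]) <= tol
            and abs(m[0][1] + t[1] - m[1][1]) <= tol
        ]
        score = score_fn(t, inliers)
        if score > best_score:
            best_score, best_t = score, t
    return best_t, best_score


def consensus_size(t, inliers):
    """Conventional RANSAC score: the size of the consensus set."""
    return len(inliers)


def relevance_score(t, inliers, weights=(1.0, -0.05)):
    """Hypothetical relevance score: a weighted sum of consensus size and a
    contextual feature (translation magnitude). The weights are hand-set
    stand-ins for weights that would be learned per matching task."""
    features = (len(inliers), abs(t[0]) + abs(t[1]))
    return sum(w * f for w, f in zip(weights, features))
```

Passing `consensus_size` reproduces the standard behaviour, where the largest consensus set wins. Passing `relevance_score` lets a contextual cue overrule a larger but implausible consensus set, which is the kind of case the abstract describes as one where conventional methods fail.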

Original language: English
Pages (from-to): 971-983
Number of pages: 13
Journal: Machine Vision and Applications
Issue number: 4
Digital Object Identifiers (DOIs)
Publication status: Published - May 2014

Bibliographical note

Funding Information:
TH was partially funded by General Motors (GM).
