Abstract
Scale-invariant feature detectors often find stable scales in only a few image pixels. Consequently, methods for feature matching typically choose one of two extreme options: matching a sparse set of scale-invariant features, or dense matching using arbitrary scales. In this paper, we turn our attention to the overwhelming majority of pixels, those where stable scales are not found by standard techniques. We ask: is scale selection necessary for these pixels when dense, scale-invariant matching is required, and if so, how can it be achieved? We make the following contributions: (i) We show that features computed over different scales, even in low-contrast areas, can differ, and that selecting a single scale, arbitrarily or otherwise, may lead to poor matches when the images have different scales. (ii) We show that representing each pixel as a set of SIFTs, extracted at multiple scales, allows for far better matches than single-scale descriptors, but at a computational price. Finally, (iii) we demonstrate that each such set may be accurately represented by a low-dimensional, linear subspace. A subspace-to-point mapping may further be used to produce a novel descriptor representation, the Scale-Less SIFT (SLS), as an alternative to single-scale descriptors. These claims are verified by quantitative and qualitative tests, demonstrating significant improvements over existing methods. A preliminary version of this work appeared in [1].
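The pipeline described in the abstract can be made concrete with a minimal sketch: extract SIFT descriptors at several scales for one pixel, fit a low-dimensional linear subspace to the resulting set, and flatten that subspace into a single point descriptor. This is an illustrative approximation, not the authors' released implementation: OpenCV's SIFT is assumed available, the function name `sls_descriptor`, the scale list, and the subspace dimension `k` are invented for the example, and the subspace-to-point step shown (upper triangle of the projection matrix) is a simplified stand-in for the mapping used in the paper.

```python
# Illustrative sketch of the SLS idea (hypothetical names and parameters).
import cv2
import numpy as np

def sls_descriptor(gray, x, y, scales=(1.6, 3.2, 6.4, 12.8, 25.6), k=3):
    sift = cv2.SIFT_create()
    # One SIFT descriptor at the same pixel for each scale. OpenCV may drop
    # keypoints too close to the image border, so fewer rows can come back.
    kps = [cv2.KeyPoint(float(x), float(y), s) for s in scales]
    _, descs = sift.compute(gray, kps)            # shape: (n_scales, 128)
    descs = descs / (np.linalg.norm(descs, axis=1, keepdims=True) + 1e-8)
    # Fit a k-dimensional linear subspace to the multi-scale set via SVD.
    _, _, vt = np.linalg.svd(descs, full_matrices=False)
    basis = vt[:k].T                              # 128 x k orthonormal basis
    # Simplified subspace-to-point mapping: represent the subspace by its
    # projection matrix; since it is symmetric, keep only the upper triangle.
    proj = basis @ basis.T                        # 128 x 128
    iu = np.triu_indices_from(proj)
    return proj[iu]                               # one fixed-length vector

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
d = sls_descriptor(img, 100, 120)
```

Because each pixel is now a single fixed-length vector, standard nearest-neighbor descriptor matching can be applied densely; in the paper, the subspace-to-point mapping is chosen so that distances between the mapped points reflect distances between the underlying subspaces.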
| Original language | English |
|---|---|
| Article number | 7516703 |
| Pages (from-to) | 1431-1443 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 39 |
| Issue number | 7 |
| DOIs | |
| State | Published - 1 Jul 2017 |
Bibliographical note
Publisher Copyright: © 1979-2012 IEEE.
Keywords
- Vision and scene understanding
- Representations, data structures, and transforms