A piggyback representation for action recognition

Lior Wolf, Yair Hanani, Tal Hassner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In video understanding, the spatial patterns formed by local space-time interest points hold discriminative information. We encode these spatial regularities using a word2vec neural network, a tool recently proposed in the field of text processing. Then, building on recent accumulator-based image representation methods, input videos are represented in a hybrid manner: the appearance of local space-time interest points is used to collect and associate the learned descriptors, which capture the spatial patterns. Promising results are shown on recent action recognition benchmarks, using well-established methods as the underlying appearance descriptors.
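
The abstract outlines a pipeline: quantize interest-point appearance into visual words, treat each video as a time-ordered "sentence" of those words, learn word2vec embeddings over the sentences, and represent a video by accumulating the embeddings of its words. The sketch below is one plausible reading of that pipeline, not the authors' implementation; the synthetic data, vocabulary size, embedding size, and helper names are all illustrative assumptions.

    # Illustrative sketch only (assumptions marked); not the paper's code.
    import numpy as np
    from sklearn.cluster import KMeans
    from gensim.models import Word2Vec

    rng = np.random.default_rng(0)

    # Stand-in for detected space-time interest points: per video, a list of
    # (time, appearance descriptor) pairs. A real pipeline would use detected
    # STIPs with appearance descriptors such as HOG/HOF.
    videos = [[(rng.random(), rng.random(64)) for _ in range(50)]
              for _ in range(20)]

    # 1. Quantize appearance descriptors into a visual vocabulary (standard
    #    bag-of-words step; k=32 is an arbitrary illustrative choice).
    all_desc = np.vstack([d for vid in videos for _, d in vid])
    kmeans = KMeans(n_clusters=32, n_init=10, random_state=0).fit(all_desc)

    # 2. Each video becomes a "sentence": its interest points ordered by
    #    time, each replaced by its visual-word id. word2vec then learns
    #    embeddings that reflect which words co-occur, i.e. local patterns.
    def sentence(vid):
        vid = sorted(vid, key=lambda p: p[0])
        return [str(w) for w in kmeans.predict(np.array([d for _, d in vid]))]

    model = Word2Vec([sentence(v) for v in videos],
                     vector_size=32, window=5, min_count=1, seed=0)

    # 3. Hybrid pooling: a video is represented by accumulating the learned
    #    embeddings of its words, indexed by appearance.
    def video_descriptor(vid):
        return np.mean([model.wv[w] for w in sentence(vid)], axis=0)

    print(video_descriptor(videos[0]).shape)  # (32,)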

Original language: English
Title of host publication: Proceedings - 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014
Publisher: IEEE Computer Society
Pages: 520-525
Number of pages: 6
ISBN (Electronic): 9781479943098
DOIs
State: Published - 24 Sep 2014
Event: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014 - Columbus, United States
Duration: 23 Jun 2014 - 28 Jun 2014

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014
Country/Territory: United States
City: Columbus
Period: 23/06/14 - 28/06/14

Bibliographical note

Publisher Copyright:
© 2014 IEEE.
