Enriching audio databases with temporal information hidden in the acoustic signal

Nehory Carmi, Anat Lerner, Vered Silber-Varod

Research output: Contribution to conference, Abstract, peer-review


Audio information retrieval is an emerging field, as users' needs for accurate search results are growing (Pradhan & Sharma, 2018) and as Media Assets Management (MAM) systems contend with massive amounts of audio and video excerpts that need to be indexed. Previous research investigated the use of automatic speech recognition engines for search engine optimization (Silber-Varod, Winer, & Geri, 2016). However, conventional search engines cannot harness the potentially varied temporal information in videos, such as that found in documentaries. We believe existing techniques can be used to automatically tag massive amounts of cultural heritage videos that have no (accurate) metadata concerning their temporal information. In this feasibility study, we investigate the possibility of "carbon-dating" a video (i.e., accurately estimating its period) based on its acoustic signal, and thereby offering automatic enrichment of video databases beyond traditional manual metadata. For this purpose, we use a collection of web media that combines present-day broadcasting with archive excerpts. For the current proof-of-concept study, we chose broadcasts from 2010 that combine speech from 2010 with archive sections (containing varied sounds, not necessarily speech) dating from 1967 to 1973. The procedure included extraction of acoustic features and clustering by k-means methods, which are commonly used in speaker recognition technology (Giannakopoulos & Pikrakis, 2014). We then compared the automatic results to a manual segmentation of the video into "new" and "old" sections. Using this automatic technique to parse the videos according to the two periods, we obtained a classification for 96% of the video sections.
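
The clustering and comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which acoustic features were extracted, so the two-dimensional feature vectors below are hypothetical placeholder values, and the function names (`kmeans`, `agreement`) and the farthest-point initialization are assumptions made for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster index per input point.

    Initialization is deterministic (an assumption, chosen for repeatability):
    the first point, then repeatedly the point farthest from the centroids
    chosen so far.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each segment goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def agreement(predicted, manual):
    """Fraction of segments that match the manual 0/1 labels.

    Cluster indices are arbitrary, so for two clusters the better of the
    two possible cluster-to-period mappings is used.
    """
    matches = sum(p == m for p, m in zip(predicted, manual))
    return max(matches, len(manual) - matches) / len(manual)


# Hypothetical per-segment feature vectors (placeholder values, not real data).
features = [
    [0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [0.1, 0.1],  # segments tagged "new"
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0], [5.1, 5.2],  # segments tagged "old"
]
manual = [0, 0, 0, 0, 1, 1, 1, 1]  # manual segmentation: 0 = "new", 1 = "old"

labels = kmeans(features, k=2)
score = agreement(labels, manual)
print(f"agreement with manual segmentation: {score:.0%}")
```

On real recordings the feature vectors would come from an acoustic feature extractor applied to each segment, and the agreement score would depend on how well the chosen features separate the two periods.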
Original language: American English
Number of pages: 1
State: Published - 2019
Event: Knowledge Management Conference 2019 - Faculty of Economic Sciences, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
Duration: 26 Jun 2019 to 29 Jun 2019


Conference: Knowledge Management Conference 2019
Abbreviated title: KM Conference 2019
City: Warsaw

