Enriching audio databases with temporal information hidden in the acoustic signal

Nehory Carmi, Anat Lerner, Vered Silber-Varod

Research output: Contribution to conference, Abstract, peer-reviewed

Abstract

Audio information retrieval is an emerging field, as users' needs for accurate search results are growing (Pradhan & Sharma, 2018) and as Media Assets Management (MAM) systems grapple with massive amounts of audio and video excerpts that need to be indexed. Previous research investigated the use of automatic speech recognition engines for search engine optimization (Silber-Varod, Winer, & Geri, 2016). However, conventional search engines cannot harness the potentially varied temporal information in videos, such as that found in documentaries. We believe existing techniques can be used to automatically tag massive amounts of cultural heritage videos that have no (accurate) metadata concerning their temporal information. In this feasibility study, we investigate the possibility of "carbon-dating" a video (i.e., accurately estimating its period) based on its acoustic signal, and thereby offering an automatic enrichment of video databases beyond the traditional manual metadata. For this purpose, we use a collection of web media that combines present-day broadcasting with archive excerpts. For the current proof-of-concept study, we chose broadcasts from 2010 that combine speech from 2010 with archive sections (with varied sounds, not necessarily speech) dating from 1967 to 1973. The procedure included extraction of acoustic features and clustering by k-means methods, which are commonly used in speaker recognition technology (Giannakopoulos & Pikrakis, 2014). We then compared the automatic results to a manual segmentation of the video into "new" or "old" sections. Using this automatic technique to parse the videos according to the two periods, we obtained a classification for 96% of the video sections.
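The procedure described in the abstract (frame-level acoustic features, two-cluster k-means, comparison against a manual "new"/"old" segmentation) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the abstract does not name the features, so short-time energy and zero-crossing rate stand in for them; the signal is synthetic (a clean tone for "new", a noisy tone for "old"); and the k-means uses a simple farthest-point initialization. It does not reproduce the reported 96% figure.

```python
import math
import random

def frame_features(samples, frame_len=400):
    """Return [energy, zero-crossing rate] for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([energy, crossings / (frame_len - 1)])
    return feats

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_two(rows, iters=50):
    """Two-cluster k-means; the farthest-point initialization is an assumption."""
    centers = [rows[0], max(rows, key=lambda r: sq_dist(r, rows[0]))]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each frame goes to its nearest center.
        labels = [0 if sq_dist(r, centers[0]) <= sq_dist(r, centers[1]) else 1
                  for r in rows]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            new_centers.append([sum(c) / len(members) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def agreement(pred, truth):
    """Share of frames matching the manual labels; cluster ids are arbitrary."""
    same = sum(p == t for p, t in zip(pred, truth)) / len(truth)
    return max(same, 1 - same)

# Synthetic stand-in data (hypothetical): 1 s "new" audio, then 1 s "old" audio.
rng = random.Random(0)
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
new_part = [s + rng.gauss(0, 0.01) for s in tone]       # clean recording
old_part = [0.5 * s + rng.gauss(0, 0.5) for s in tone]  # noisy archive excerpt
truth = [0] * 20 + [1] * 20                              # manual segmentation

pred = kmeans_two(standardize(frame_features(new_part + old_part)))
score = agreement(pred, truth)
print(f"agreement with manual segmentation: {score:.2f}")
```

Because k-means assigns arbitrary cluster ids, `agreement` scores both possible mappings of clusters to "new"/"old" and keeps the better one. With real recordings, the feature set and frame length would need to be chosen and validated against the manual segmentation.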
Original language: American English
Pages: 29
Number of pages: 1
Publication status: Published - 2019
Event: Knowledge Management Conference 2019 - Faculty of Economic Sciences, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
Duration: 26 June 2019 to 29 June 2019

Conference

Conference: Knowledge Management Conference 2019
Abbreviated title: KM Conference 2019
Country/Territory: Poland
City: Warsaw
Period: 26/06/19 to 29/06/19
