Enriching audio databases with temporal information hidden in the acoustic signal

Nehory Carmi, Anat Lerner, Vered Silber-Varod

Research output: Contribution to conference, Abstract, peer-reviewed


Audio information retrieval is an emerging field, as users' needs for accurate search results are growing (Pradhan & Sharma, 2018) and as Media Asset Management (MAM) systems grapple with a massive amount of audio and video excerpts that need to be indexed. Previous research investigated the use of an automatic speech recognition engine for search engine optimization (Silber-Varod, Winer, & Geri, 2016). However, conventional search engines cannot harness the potentially varied temporal information in videos, such as that found in documentaries. We believe existing techniques can be used to automatically tag massive amounts of cultural heritage videos that have no (accurate) metadata concerning their temporal information. In this feasibility study, we investigate the possibility of "carbon-dating" a video (i.e., accurately estimating its period) based on its acoustic signal, and thereby of offering automatic enrichment of video databases beyond traditional manual metadata. For this purpose, we use a collection of web media that combines present-day broadcasting with archive excerpts. For the current proof-of-concept study, we chose broadcasts from 2010 that combine speech from 2010 with archive sections (with varied sounds, not necessarily speech) dating from 1967 to 1973. The procedure included extraction of acoustic features and clustering by k-means methods, which are commonly used in speaker recognition technology (Giannakopoulos and Pikrakis, 2014). We then compared the automatic results to a manual segmentation of the video into "new" and "old" sections. We succeeded in classifying 96% of the video sections using an automatic technique that parses the videos according to the two periods.
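The procedure described in the abstract (frame-level acoustic features followed by k-means clustering into two periods) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the two features (log-energy and zero-crossing rate), the synthetic signals standing in for "new" and "old" audio, the frame length, and all function names (`frame_features`, `standardize`, `kmeans`, `synth_segment`, `run_demo`). The abstract does not state which features or parameters the authors used.

```python
import math
import random


def frame_features(samples, frame_len):
    """Return one (log-energy, zero-crossing rate) pair per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((math.log(energy + 1e-12), crossings / (frame_len - 1)))
    return feats


def standardize(feats):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*feats))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in feats]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(pt, centroids[j]))
                  for pt in points]
        updated = []
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels


def synth_segment(kind, n, rng):
    """Hypothetical stand-in audio: wideband noise for "new", a quiet tone for "old"."""
    if kind == "new":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [0.3 * math.sin(2 * math.pi * 0.01 * i) + rng.gauss(0.0, 0.01)
            for i in range(n)]


def run_demo(seed=1):
    """Cluster a synthetic new/old/new/old recording and score it against the plan."""
    rng = random.Random(seed)
    seg_len, frame_len = 2000, 200
    samples, truth = [], []
    for kind in ["new", "old", "new", "old"]:
        samples.extend(synth_segment(kind, seg_len, rng))
        truth.extend([kind] * (seg_len // frame_len))
    labels = kmeans(standardize(frame_features(samples, frame_len)), k=2)
    # Cluster indices are arbitrary, so score the better of the two mappings.
    hits = sum(1 for lab, t in zip(labels, truth) if (lab == 0) == (t == "new"))
    accuracy = max(hits, len(truth) - hits) / len(truth)
    return accuracy, labels


if __name__ == "__main__":
    acc, _ = run_demo()
    print(f"agreement with the synthetic segmentation: {acc:.0%}")
```

The comparison step mirrors the evaluation in the abstract only loosely: the clusters are unlabeled, so the sketch scores whichever assignment of clusters to "new" and "old" agrees better with the reference segmentation. Real recordings would call for richer features and a real audio loader, neither of which is shown here.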
Original language: American English
Number of pages: 1
Publication status: Published - 2019
Event: Knowledge Management Conference 2019 - Faculty of Economic Sciences, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
Duration: 26 June 2019 to 29 June 2019


Conference: Knowledge Management Conference 2019
Abbreviated title: KM Conference 2019
City: Warsaw

