TY - GEN
T1 - Diffusion maps for PLDA-based speaker verification
AU - Barkan, Oren
AU - Aronowitz, Hagai
PY - 2013/10/18
Y1 - 2013/10/18
N2 - During the last few years, i-vectors have become an important component in most state-of-the-art speaker recognition systems. I-vector extraction is based on the assumption that GMM supervectors reside in a low-dimensional space, which is modeled using Factor Analysis. In this paper we replace that assumption with the assumption that the GMM supervectors reside on a low-dimensional manifold, and propose to use Diffusion Maps to learn that manifold. The learnt manifold implies a mapping of spoken sessions into a modified i-vector space, which we call d-vector space. D-vectors can be further processed using standard techniques such as LDA, WCCN, cosine distance scoring, or Probabilistic Linear Discriminant Analysis (PLDA). We demonstrate the usefulness of our approach on the telephone core conditions of NIST 2010, and obtain significant error reduction.
AB - During the last few years, i-vectors have become an important component in most state-of-the-art speaker recognition systems. I-vector extraction is based on the assumption that GMM supervectors reside in a low-dimensional space, which is modeled using Factor Analysis. In this paper we replace that assumption with the assumption that the GMM supervectors reside on a low-dimensional manifold, and propose to use Diffusion Maps to learn that manifold. The learnt manifold implies a mapping of spoken sessions into a modified i-vector space, which we call d-vector space. D-vectors can be further processed using standard techniques such as LDA, WCCN, cosine distance scoring, or Probabilistic Linear Discriminant Analysis (PLDA). We demonstrate the usefulness of our approach on the telephone core conditions of NIST 2010, and obtain significant error reduction.
KW - Diffusion Maps
KW - Non-linear dimensionality reduction
KW - Pattern recognition
KW - Speaker verification
KW - i-vectors
UR - http://www.scopus.com/inward/record.url?scp=84890521051&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6639149
DO - 10.1109/ICASSP.2013.6639149
M3 - Conference contribution
AN - SCOPUS:84890521051
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7639
EP - 7643
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -