Abstract
Joint factor analysis (JFA) is widely used by state-of-the-art speech processing systems for tasks such as speaker verification, language identification and emotion detection. In this paper we introduce new developments for the JFA framework, which we validate empirically on the speaker verification task but which may in principle benefit other tasks too. We first propose a method for obtaining improved recognition accuracy by better modeling supervector estimation uncertainty. We then propose a novel approach, which we name JFAlight, for extremely efficient approximate estimation of speaker, common and channel factors. Using JFAlight, we are able to score a given test session efficiently with only a very small degradation in accuracy.
Original language | English |
---|---|
Pages (from-to) | 129-132 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2011 |
Externally published | Yes |
Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy |
Duration | 27 Aug 2011 → 31 Aug 2011 |
Keywords
- Efficient joint factor analysis
- Efficient speaker recognition
- Joint factor analysis
- Speaker recognition
- Speaker verification