Gaussian mixture models with equivalence constraints

Noam Shental, Aharon Bar-Hillel, Tomer Hertz, Daphna Weinshall

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Gaussian Mixture Models (GMMs) have been widely used to cluster data in an unsupervised manner via the Expectation Maximization (EM) algorithm. In this chapter we suggest a semi-supervised EM algorithm that incorporates equivalence constraints into a GMM. Equivalence constraints provide information about pairs of data points, indicating whether the points arise from the same source (a must-link constraint) or from different sources (a cannot-link constraint). These constraints allow the EM algorithm to converge to solutions that better reflect the class structure of the data. Moreover, in some learning scenarios equivalence constraints can be gathered automatically, while in others they are a natural form of supervision. We present a closed-form EM algorithm for handling must-link constraints, and a generalized EM algorithm using a Markov network for incorporating cannot-link constraints. Using publicly available data sets, we demonstrate that incorporating equivalence constraints leads to a considerable improvement in clustering performance. Our GMM-based clustering algorithm significantly outperforms two other available clustering methods that use equivalence constraints.

Mixture models are a powerful tool for probabilistic modelling of data and have been widely used in various research areas such as pattern recognition, machine learning, computer vision, and signal processing [13, 14, 18]. Such models provide a principled probabilistic approach to clustering data in an unsupervised manner [24, 25, 30, 31]. In addition, their ability to represent complex density functions has made them an excellent choice in density estimation problems [20, 23].
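To make the must-link idea in the abstract concrete, the following is a minimal sketch of an E-step in which must-linked points are grouped and share a single posterior over mixture components. It is an illustration under stated assumptions, not the chapter's implementation: the names `gaussian_logpdf` and `chunklet_e_step`, the grouping of constrained points into index lists, and the choice to give each group the prior weight of its component (that is, treating each group as one draw from the mixture) are all assumptions made here.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log density of each row of X under N(mean, cov)."""
    d = X.shape[1]
    diff = X - mean
    chol = np.linalg.cholesky(cov)               # cov = chol @ chol.T
    sol = np.linalg.solve(chol, diff.T)          # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)              # squared Mahalanobis distances
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def chunklet_e_step(X, groups, weights, means, covs):
    """Posterior over components for each group of must-linked points.

    Assumption (hypothetical modelling choice): all points in a group come
    from the same component, and each group is one draw from the mixture,
    so posterior_k is proportional to weights[k] * prod_i p(x_i | k).
    """
    K = len(weights)
    log_lik = np.column_stack(
        [gaussian_logpdf(X, means[k], covs[k]) for k in range(K)]
    )
    resp = np.empty((len(groups), K))
    for j, idx in enumerate(groups):
        log_post = np.log(weights) + log_lik[idx].sum(axis=0)
        log_post -= log_post.max()               # stabilise before exponentiating
        post = np.exp(log_post)
        resp[j] = post / post.sum()
    return resp

# Toy data: points 0 and 1 are must-linked; point 2 is unconstrained.
X = np.array([[0.0, 0.0], [0.2, -0.1], [5.0, 5.0]])
groups = [[0, 1], [2]]
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.array([np.eye(2), np.eye(2)])
resp = chunklet_e_step(X, groups, weights, means, covs)
```

Each row of `resp` is the shared responsibility of one group; an unconstrained point is simply a group of size one, so the sketch reduces to the standard GMM E-step when no constraints are given. Cannot-link constraints do not factorise this way, which is consistent with the abstract's use of a Markov network and a generalized EM procedure for them.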

Original language: English
Title of host publication: Constrained Clustering
Subtitle of host publication: Advances in Algorithms, Theory, and Applications
Publisher: CRC Press
Pages: 33-58
Number of pages: 26
ISBN (Electronic): 9781584889977
ISBN (Print): 9781584889960
Publication status: Published - 1 Jan 2008
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2008, CRC Press. All rights reserved.
