Evaluating a positive attribute clustering model for data mining

Zippy Erlich, Roy Gelbard, Israel Spiegler

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

We outline and evaluate a binary-positive clustering model. It is based on a binary representation of data records in rows, where column entries, either '1' or '0', correspond to all possible data values that attributes may take. A new group similarity index (GSI) is devised which takes into account only the positive attributes as the basis for the grouping and clustering algorithm. The model is compared with standard clustering models. For the comparison we define an objective measure based on two similarity factors: within-class similarity (WCS) and between-class similarity (BCS), seeking maximum intra-group and minimum inter-group proximity, respectively. A coefficient of variation (CV) statistic is then employed to combine the two factors into a measure of relative diversity between records and groups. When applied to a common data set, our binary clustering shows significant advantages over standard clustering models.
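The binary representation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the GSI formula, so the code below substitutes the Jaccard coefficient, a standard similarity that likewise counts only shared '1' entries and ignores shared '0' entries; it is a stand-in, not the authors' index. The function names, the toy records, the group labels, and the definition of WCS and BCS as mean pairwise similarities are all assumptions made for illustration.

```python
from itertools import combinations


def binary_encode(records, attributes):
    """Build one column per (attribute, value) pair; an entry is 1 when the record holds that value."""
    columns = [(a, v) for a in attributes
               for v in sorted({r[a] for r in records})]
    rows = [[1 if r[a] == v else 0 for (a, v) in columns] for r in records]
    return columns, rows


def positive_similarity(x, y):
    """Jaccard coefficient: shared 1s over positions where either row has a 1.

    Stand-in for the paper's GSI, which the abstract does not define.
    """
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 0.0


def mean_pair_similarity(rows, pairs):
    """Average similarity over the given index pairs (0.0 if there are none)."""
    sims = [positive_similarity(rows[i], rows[j]) for i, j in pairs]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical toy data: three records, two categorical attributes.
records = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "M"},
    {"color": "blue", "size": "L"},
]
columns, rows = binary_encode(records, ["color", "size"])

# Hypothetical grouping: records 0 and 1 in one group, record 2 alone.
labels = [0, 0, 1]
pairs = list(combinations(range(len(rows)), 2))
within = [(i, j) for i, j in pairs if labels[i] == labels[j]]
between = [(i, j) for i, j in pairs if labels[i] != labels[j]]

# Assumed definitions: WCS and BCS as mean pairwise similarity inside and across groups.
wcs = mean_pair_similarity(rows, within)
bcs = mean_pair_similarity(rows, between)
print(columns)
print(rows)
print(wcs, bcs)
```

A good grouping in this sense has high within-group and low between-group similarity. How the paper combines the two factors through the coefficient of variation is not specified in the abstract and is not reproduced here.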

Original language: English
Pages (from-to): 100-108
Number of pages: 9
Journal: Journal of Computer Information Systems
Volume: 43
Issue number: 3
Publication status: Published - March 2003
