Keywords: SD


Evaluation of Clustering Validity

Rudhwan Yousif Sideek; Ghaydaa A.A. Al-Talib

AL-Rafidain Journal of Computer Sciences and Mathematics, 2008, Volume 5, Issue 2, Pages 79-97
DOI: 10.33899/csmj.2008.163987

Clustering is a mostly unsupervised procedure and the majority of the clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation as regards its validity.
In this paper, we present a clustering validity procedure that evaluates the results of clustering algorithms on data sets. We define two validity indexes, S_Dbw and SD, based on well-defined clustering criteria. They enable the selection of the optimal input parameter values for a clustering algorithm, that is, the values that result in the best partitioning of a data set.
We evaluate the reliability of our indexes experimentally, applying the K-Means clustering algorithm to real data sets.
Our approach performs favorably in finding the correct number of clusters for a data set.
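
As an illustration of how a validity index can drive the choice of the number of clusters, the sketch below runs K-Means for several candidate values of k and keeps the one with the lowest SD value. The abstract does not give the formula, so the code assumes the form in which the SD index is commonly stated in the literature: SD(c) = alpha * Scat(c) + Dis(c), where Scat is the average cluster scattering, Dis is the total separation between cluster centers, and alpha = Dis(c_max). The synthetic data, the farthest-first initialisation, and all function names are hypothetical choices made for this example and are not taken from the paper.

```python
import math
import random

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def variance_norm(points):
    # Euclidean norm of the per-dimension (population) variance vector.
    m, n = mean(points), len(points)
    var = [sum((p[d] - m[d]) ** 2 for p in points) / n for d in range(len(m))]
    return math.sqrt(sum(v * v for v in var))

def k_means(points, k, iters=50):
    # Farthest-first initialisation: an assumption made here for determinism.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def scat(points, clusters):
    # Average scattering: mean cluster variance norm relative to the data set.
    total = sum(variance_norm(c) for c in clusters if c)
    return total / (len(clusters) * variance_norm(points))

def dis(centers):
    # Total separation: (D_max / D_min) * sum_k 1 / sum_z ||v_k - v_z||.
    pairs = [dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    d_min, d_max = min(pairs), max(pairs)
    if d_min == 0:
        return math.inf
    return (d_max / d_min) * sum(
        1.0 / sum(dist(a, b) for b in centers) for a in centers)

def choose_k(points, k_values):
    # Cluster once per candidate k, then score each partition with SD.
    runs = {k: k_means(points, k) for k in k_values}
    alpha = dis(runs[max(k_values)][0])  # weighting factor, Dis(c_max)
    sd = {k: alpha * scat(points, cl) + dis(ce) for k, (ce, cl) in runs.items()}
    return min(sd, key=sd.get), sd

# Hypothetical test data: three tight, well-separated 2-D blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.66)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in blob_centers for _ in range(30)]

best_k, sd_values = choose_k(data, range(2, 7))
print(best_k, sd_values)
```

A lower SD value indicates clusters that are compact relative to the whole data set and whose centers are evenly and widely separated, so the candidate k with the minimum value is selected. S_Dbw could be plugged into the same loop in place of SD, although it is not sketched here.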