Experimental Estimation of Number of Clusters Based on Cluster Quality


G. Hannah Grace - Department of Mathematics, School of Advanced Sciences, VIT University, Chennai 600127, India. Kalyani Desikan - Department of Mathematics, School of Advanced Sciences, VIT University, Chennai 600127, India.


Text clustering is a text mining technique that divides a given set of text documents into meaningful clusters. It is used to organize large document collections into a structured form. Most clustering algorithms require the number of clusters to be specified a priori, which is a drawback when that number is not known in advance. The aim of this paper is to show experimentally how the number of clusters can be determined from cluster quality. Since partitional clustering algorithms are well suited to clustering large document datasets, we confine our analysis to a partitional clustering algorithm.
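The general idea of choosing the number of clusters from cluster quality can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: k-means stands in for the partitional algorithm, the mean silhouette coefficient stands in for the quality measure, small 2-D points stand in for document vectors, and the helper names (farthest_first, kmeans, mean_silhouette, estimate_k, toy_data) are hypothetical.

```python
import math
import random


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means (illustrative stand-in for a partitional
    algorithm); returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette coefficient (assumed quality measure, higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated as the worst score
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def estimate_k(points, k_values):
    """Cluster once per candidate k and keep the k with the best quality."""
    best_k, best_score = None, float("-inf")
    for k in k_values:
        score = mean_silhouette(points, kmeans(points, k))
        if score > best_score:  # strict comparison: ties keep the smaller k
            best_k, best_score = k, score
    return best_k, best_score


def toy_data(seed=1):
    """Three well-separated 2-D groups standing in for document vectors."""
    rng = random.Random(seed)
    group_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
            for cx, cy in group_centers for _ in range(20)]


if __name__ == "__main__":
    k, score = estimate_k(toy_data(), range(2, 7))
    print(f"estimated number of clusters: {k} (mean silhouette {score:.3f})")
```

On this toy data, which contains three well-separated groups, the sketch is expected to select three clusters. Real document collections are rarely this clean, so a quality curve over candidate values of k would normally be inspected rather than trusting a single maximum.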

