Fast Clustering on CUDA Platform

K-Medoids clustering is computationally expensive. The basic algorithm, PAM (Partitioning Around Medoids), does not scale well to large datasets. To cope with this problem, many modifications of PAM have been developed, e.g., CLARANS (Clustering Large Applications based on RANdomized Search). Unfortunately, larger datasets still have to be clustered on machines whose computing power exceeds that of ordinary desktop PCs. In this paper we present modifications of the K-Medoids clustering algorithms PAM and CLARANS that use the graphics processing units (GPUs) of modern graphics cards, through the CUDA (Compute Unified Device Architecture) platform, to accelerate the most costly stages of these algorithms. We also present the results of extensive performance experiments, which show substantial improvements over the original versions of these algorithms.
