Search results - BazTech

1

Towards explainability of hashtags in the light of Graph Spectral Clustering methods

Starosta Bartłomiej, Kłopotek Mieczysław A., Wierzchoń Sławomir T.

Studia Informatica : systems and information technology | 2023 | Vol. 2(29) | 57--68 | EN

Hashtags are an indispensable part of the modern social media world. As more and more hashtags are invented, it becomes necessary to group them into clusters. Clustering alone, however, no longer helps users: they ask for a justification or, in the language of modern AI, the clustering has to be explainable. We discuss a novel approach to hashtag explanation based on a measure of similarity between hashtags derived from Graph Spectral Analysis. The application of this similarity measure may go far beyond the classical clustering task: it can be used to provide explanations for the hashtags. In this paper we propose such a novel view of the hashtag similarity measure.

2

On the Consistency of k-means++ algorithm

Kłopotek Mieczysław A.

Fundamenta Informaticae | 2020 | Vol. 172, no. 4 | 361--377 | EN

We prove in this paper that the expected value of the objective function of the k-means++ algorithm for samples converges to the population expected value. As k-means++, for samples, provides a constant-factor approximation for k-means objectives, such an approximation can be achieved for the population as the sample size increases. This result is of potential practical relevance when one considers using subsampling to cluster large data sets (large databases).
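The subsampling scenario mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `kmeanspp_seed` and `kmeans_cost`, the synthetic three-cluster data, and the sample size are all assumptions made for illustration. The sketch runs only the standard k-means++ seeding step (D² sampling) on a subsample and then evaluates the resulting centers on the full data set.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng):
    """Pick k initial centers by D^2 sampling (the k-means++ seeding step)."""
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a chosen center: fall back to uniform choice.
            centers.append(rng.choice(points))
            continue
        r = rng.random() * total
        acc = 0.0
        chosen = len(points) - 1  # guard against floating-point round-off
        for i, w in enumerate(d2):
            acc += w
            if acc > r:
                chosen = i
                break
        centers.append(points[chosen])
        d2 = [min(old, sq_dist(p, points[chosen])) for old, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """Average squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


# Hypothetical data: three Gaussian blobs standing in for a "large" data set.
rng = random.Random(0)
population = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in [(0, 0), (8, 8), (0, 8)]
    for _ in range(2000)
]
sample = rng.sample(population, 300)
centers = kmeanspp_seed(sample, 3, rng)
print("cost on sample:    ", kmeans_cost(sample, centers))
print("cost on population:", kmeans_cost(population, centers))
```

Comparing the two printed costs for growing sample sizes gives an informal feel for the kind of convergence the abstract refers to; it is a numerical illustration only and does not replace the proof.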

3

On the Existence of Kernel Function for Kernel-Trick of k-Means in the Light of Gower Theorem

Kłopotek Mieczysław A.

Fundamenta Informaticae | 2019 | Vol. 168, no. 1 | 25--43 | EN

This paper, an extension of the conference paper [1], corrects the proof of Theorem 2 from Gower's paper [2, page 5]. The correction is needed to establish, on the basis of the distance matrix, the existence of the kernel function commonly used in the kernel trick, e.g. for the k-means clustering algorithm. The correction supplies the missing if-part of the proof and drops unnecessary conditions.
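To make the link between a distance matrix and a kernel matrix concrete, here is a minimal sketch of the classical double-centering construction, K = -1/2 · J · D² · J with J = I - (1/n)·11ᵀ. It is an assumption that this construction is the relevant one here; the abstract does not state the theorem, and the function names `squared_distance_matrix` and `double_center` are hypothetical. The sketch only checks numerically, on three points, that the resulting matrix reproduces the original squared distances.

```python
def squared_distance_matrix(points):
    """Matrix of squared Euclidean distances between all pairs of points."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def double_center(d2):
    """Return K = -1/2 * J * D2 * J, where J = I - (1/n) * ones * ones^T.

    d2 must be symmetric, so its column means equal its row means.
    """
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    return [
        [
            -0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Illustrative points; any small Euclidean configuration would do.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
d2 = squared_distance_matrix(points)
K = double_center(d2)

# The kernel-induced squared distance K_ii + K_jj - 2*K_ij recovers d2[i][j].
for i in range(len(points)):
    for j in range(len(points)):
        induced = K[i][i] + K[j][j] - 2.0 * K[i][j]
        assert abs(induced - d2[i][j]) < 1e-9
```

For points that really lie in a Euclidean space, K is the Gram matrix of the mean-centered points. When only a distance matrix is given, whether such a K is a valid (positive semidefinite) kernel matrix is exactly the kind of existence question the abstract addresses.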