Hashtags are an indispensable part of the modern social media world. As more and more hashtags are invented, it becomes necessary to cluster them. Clustering alone, however, no longer helps users: they ask for justification or, in the language of modern AI, the clustering has to be explainable. We discuss a novel approach to hashtag explanation via a similarity measure between hashtags based on Graph Spectral Analysis. The application of this similarity measure may go far beyond the classical clustering task: it can be used to provide explanations for the hashtags. In this paper we propose such a novel view of the hashtag similarity measure.
We prove in this paper that the expected value of the objective function of the k-means++ algorithm for samples converges to the population expected value. As k-means++ provides a constant-factor approximation of the k-means objective for samples, such an approximation can be achieved for the population as the sample size increases. This result is of potential practical relevance when one considers subsampling for clustering large data sets (large databases).
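The subsampling idea can be illustrated with a minimal sketch, which is not the paper's procedure. It assumes that only the D²-sampling seeding step of k-means++ is used (no subsequent Lloyd iterations) and that the objective is reported per point; the helper names `sq_dist`, `kmeanspp_seed` and `mean_cost`, the synthetic data and the sample sizes are all hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng):
    """Pick k centers by D^2 sampling (the k-means++ seeding step only)."""
    centers = [rng.choice(points)]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        total = sum(d2)
        if total == 0:  # every point already coincides with a center
            centers.append(rng.choice(points))
            continue
        r = rng.random() * total
        acc = 0.0
        chosen = points[-1]  # fallback, normally overwritten in the loop
        for p, w in zip(points, d2):
            acc += w
            if acc > r:  # point drawn with probability proportional to w
                chosen = p
                break
        centers.append(chosen)
        d2 = [min(w, sq_dist(p, chosen)) for p, w in zip(points, d2)]
    return centers


def mean_cost(points, centers):
    """k-means objective divided by the number of points."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


# Hypothetical "population": three Gaussian blobs in the plane.
rng = random.Random(0)
population = [(rng.gauss(mx, 1.0), rng.gauss(my, 1.0))
              for mx, my in [(0, 0), (8, 0), (4, 7)] for _ in range(500)]

# Average per-point cost over repeated seedings, for growing sample sizes.
for n in (30, 300, len(population)):
    costs = []
    for _ in range(20):
        sample = population if n == len(population) else rng.sample(population, n)
        costs.append(mean_cost(sample, kmeanspp_seed(sample, 3, rng)))
    print(n, sum(costs) / len(costs))
```

The loop prints an empirical average of the per-point cost for each sample size, so one can observe how the sample averages behave relative to the value for the full data set. It illustrates the statement above and does not substitute for the proof.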
This paper, an extension of the conference paper [1], corrects the proof of Theorem 2 from Gower's paper [2, page 5]. The correction is needed to establish, on the grounds of a distance matrix, the existence of the kernel function commonly used in the kernel trick, e.g. for the k-means clustering algorithm. The correction supplies the missing if-part of the proof and drops unnecessary conditions.
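The abstract does not state the construction, so the following sketch assumes the standard double-centering formula K = -1/2 · J D² J with J = I - (1/n)·11ᵀ, where D² holds the squared distances. The function name `gram_from_distances` and the example points are hypothetical. The sketch only checks that the resulting matrix reproduces the distances through K_ii + K_jj - 2·K_ij = d_ij²; it does not test positive semidefiniteness, which is the condition that makes K a valid kernel matrix.

```python
def gram_from_distances(D):
    """Double-center squared distances: K = -1/2 * J * D2 * J, J = I - (1/n) 11^T."""
    n = len(D)
    D2 = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(D2[i]) / n for i in range(n)]                        # row means
    col = [sum(D2[i][j] for i in range(n)) / n for j in range(n)]  # column means
    tot = sum(row) / n                                              # grand mean
    return [[-0.5 * (D2[i][j] - row[i] - col[j] + tot) for j in range(n)]
            for i in range(n)]


# Distances between the hypothetical points (0,0), (3,0), (0,4).
D = [[0.0, 3.0, 4.0],
     [3.0, 0.0, 5.0],
     [4.0, 5.0, 0.0]]
K = gram_from_distances(D)

# Each squared distance is recovered from the candidate kernel matrix.
for i in range(3):
    for j in range(3):
        recovered = K[i][i] + K[j][j] - 2 * K[i][j]
        assert abs(recovered - D[i][j] ** 2) < 1e-9
```

A kernel k-means implementation could then work with K in place of explicit coordinates, provided the distance matrix is Euclidean in the sense of the theorem.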