Article title

Hashtag Discernability - Competitiveness Study of Graph Spectral and Other Clustering Methods

Publication languages
EN
Abstracts
EN
Spectral clustering methods are claimed to be able to represent clusters of diverse shapes, densities, etc. They approximate graph cuts of various types (plain cuts, normalized cuts, ratio cuts) and are applicable to both unweighted and weighted similarity graphs. We evaluate these capabilities on clustering tasks of increasing complexity.
Pages
759–767
Physical description
Bibliography: 24 items, tables, charts.
Authors
  • Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warsaw, Poland
  • Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warsaw, Poland
  • Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warsaw, Poland
  • Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warsaw, Poland
Bibliography
  • 1. S. T. Wierzchoń and M. A. Kłopotek, Modern Clustering Algorithms, ser. Studies in Big Data. Springer Verlag, 2018, vol. 34. ISBN 978-3-319-69307-1. https://doi.org/10.1007/978-3-319-69308-8
  • 2. P. Łoziński, D. Czerski, and M. A. Kłopotek, “Grammatical case based IS-A relation extraction with boosting for Polish,” in Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, FedCSIS 2016, Gdańsk, Poland, September 11-14, 2016, ser. Annals of Computer Science and Information Systems, M. Ganzha, L. A. Maciaszek, and M. Paprzycki, Eds., vol. 8. IEEE, 2016, pp. 533–540. https://doi.org/10.15439/2016F391
  • 3. J. Dörpinghaus, S. Schaaf, J. Fluck, and M. Jacobs, “Document clustering using a graph covering with pseudostable sets,” in Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, FedCSIS 2017, Prague, Czech Republic, September 3-6, 2017, ser. Annals of Computer Science and Information Systems, M. Ganzha, L. A. Maciaszek, and M. Paprzycki, Eds., vol. 11, 2017, pp. 329–338. https://doi.org/10.15439/2017F84
  • 4. P. Borkowski, M. A. Kłopotek, B. Starosta, S. T. Wierzchoń, and M. Sydow, “Eigenvalue based spectral classification,” PLoS ONE, vol. 18, no. 4, p. e0283413, 2023. https://doi.org/10.1371/journal.pone.0283413
  • 5. I. S. Dhillon and D. S. Modha, “Concept decompositions for large sparse text data using clustering,” Machine Learning, vol. 42, no. 1, pp. 143–175, Jan. 2001. https://doi.org/10.1023/A:1007612920971
  • 6. S. T. Wierzchoń and M. A. Kłopotek, “Spectral cluster maps versus spectral clustering,” in Computer Information Systems and Industrial Management, ser. LNCS, vol. 12133. Springer, 2020, pp. 472–484. https://doi.org/10.1007/978-3-030-47679-3_40
  • 7. C. R. Harris, K. J. Millman, S. J. van der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, J. Taylor, S. Berg, N. J. Smith, R. Kern, M. Picus, S. Hoyer, M. H. van Kerkwijk, M. Brett, A. Haldane, J. Fernández del Río, M. Wiebe, P. Peterson, P. Gérard-Marchant, K. Sheppard, T. Reddy, W. Weckesser, H. Abbasi, C. Gohlke, and T. E. Oliphant, “Array programming with NumPy,” Nature, vol. 585, pp. 357–362, 2020. https://doi.org/10.1038/s41586-020-2649-2. [Online]. Available: https://numpy.org
  • 8. P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt, M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, İ. Polat, Y. Feng, E. W. Moore, J. VanderPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henriksen, E. A. Quintero, C. R. Harris, A. M. Archibald, A. H. Ribeiro, F. Pedregosa, P. van Mulbregt, and SciPy 1.0 Contributors, “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python,” Nature Methods, vol. 17, pp. 261–272, 2020. https://doi.org/10.1038/s41592-019-0686-2. [Online]. Available: https://scipy.org
  • 9. L. Buitinck, G. Louppe, M. Blondel, F. Pedregosa, A. Mueller, O. Grisel, V. Niculae, P. Prettenhofer, A. Gramfort, J. Grobler, R. Layton, J. VanderPlas, A. Joly, B. Holt, and G. Varoquaux, “API design for machine learning software: experiences from the scikit-learn project,” in ECML PKDD Workshop: Languages for Data Mining and Machine Learning, 2013, pp. 108–122. http://dx.doi.org/10.48550/arXiv.1309.023. [Online]. Available: https://scikit-learn.org
  • 10. H. Kim and H. K. Kim, “clustering4docs GitHub repository,” 2020, https://pypi.org/project/soyclustering/. [Online]. Available: https://github.com/lovit/clustering4docs
  • 11. H. Kim, H. K. Kim, and S. Cho, “Improving spherical k-means for document clustering: Fast initialization, sparse centroid projection, and efficient cluster labeling,” Expert Systems with Applications, vol. 150, p. 113288, 2020. https://doi.org/10.1016/j.eswa.2020.113288. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417420301135
  • 12. U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007. https://doi.org/10.48550/arXiv.0711.0189
  • 13. P. Macgregor and H. Sun, “A tighter analysis of spectral clustering, and beyond,” in Proceedings of the 39th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvari, G. Niu, and S. Sabato, Eds., vol. 162. PMLR, 17–23 Jul 2022, pp. 14 717–14 742. https://doi.org/10.48550/arXiv.2208.01724. [Online]. Available: https://proceedings.mlr.press/v162/macgregor22a.html
  • 14. Y. Xu, A. Srinivasan, and L. Xue, A Selective Overview of Recent Advances in Spectral Clustering and Their Applications. Cham: Springer International Publishing, 2021, pp. 247–277. ISBN 978-3-030-72437-5. https://doi.org/10.1007/978-3-030-72437-5_12
  • 15. C. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. New York, NY, USA: Cambridge University Press, 2008. https://doi.org/10.1017/CBO9780511809071
  • 16. F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang, “Spectral redemption in clustering sparse networks,” in Proceedings of the National Academy of Sciences, vol. 110, 2013, pp. 20 935–20 940. https://doi.org/10.48550/arXiv.1306.5550
  • 17. H. T. Ali and R. Couillet, “Improved spectral community detection in large heterogeneous networks,” Journal of Machine Learning Research, vol. 18, no. 225, pp. 1–49, 2018. [Online]. Available: http://jmlr.org/papers/v18/17-247.html
  • 18. A. Saade, F. Krzakala, and L. Zdeborová, “Spectral clustering of graphs with the Bethe Hessian,” 2014. https://doi.org/10.48550/ARXIV.1406.1880. [Online]. Available: https://arxiv.org/abs/1406.1880
  • 19. Y. Endo and S. Miyamoto, “Spherical k-means++ clustering,” in Modeling Decisions for Artificial Intelligence, V. Torra and T. Narukawa, Eds. Cham: Springer International Publishing, 2015, pp. 103–114. ISBN 978-3-319-23240-9. https://doi.org/10.1007/978-3-319-23240-9_9
  • 20. S. Ji, D. Xu, L. Guo, M. Li, and D. Zhang, “The seeding algorithm for spherical k-means clustering with penalties,” J. Comb. Optim., vol. 44, no. 3, pp. 1977–1994, Oct. 2022. https://doi.org/10.1007/s10878-020-00569-1
  • 21. J. Knittel, S. Koch, and T. Ertl, “Efficient sparse spherical k-means for document clustering,” in Proceedings of the 21st ACM Symposium on Document Engineering, DocEng ’21. ACM, New York, NY, United States, 2021, pp. 1–4. https://doi.org/10.1145/3469096.3474937
  • 22. R. Pratap, A. Deshmukh, P. Nair, and T. Dutt, “A faster sampling algorithm for spherical k-means,” in Proceedings of The 10th Asian Conference on Machine Learning, ser. Proceedings of Machine Learning Research, J. Zhu and I. Takeuchi, Eds., vol. 95. PMLR, 14–16 Nov 2018, pp. 343–358. [Online]. Available: https://proceedings.mlr.press/v95/pratap18a.html
  • 23. R. A. Kłopotek, M. A. Kłopotek, and S. T. Wierzchoń, “A feasible k-means kernel trick under non-Euclidean feature space,” International Journal of Applied Mathematics and Computer Science, vol. 30, no. 4, pp. 703–715, 2020. https://doi.org/10.34768/amcs-2020-0052
  • 24. J. Gower, “Some distance properties of latent root and vector methods used in multivariate analysis,” Biometrika, vol. 53, no. 3-4, pp. 325–338, 1966. https://doi.org/10.1093/biomet/53.3-4.325
Notes
1. Thematic Tracks Regular Papers
2. Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-49a857f4-749a-491a-ba85-de83d91435d0