Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników

Znaleziono wyników: 1

Liczba wyników na stronie
first rewind previous Strona / 1 next fast forward last
Wyniki wyszukiwania
Wyszukiwano:
w słowach kluczowych:  SrpELTeC
help Sortuj według:

help Ogranicz wyniki do:
first rewind previous Strona / 1 next fast forward last
1
Content available remote Topic Modeling of the SrpELTeC Corpus: A Comparison of NMF, LDA, and BERTopic
EN
Topic modeling is an effective way to gain insight into large amounts of data. Some of the most widely used topic models are Latent Dirichlet allocation (LDA) and Nonnegative Matrix Factorization (NMF). However, with the rise of self- attention models and pre-trained language models, new ways to mine topics have emerged. BERTopic represents the current state-of-the-art when it comes to modeling topics. In this paper, we comapred LDA, NMF, and BERTopic performance on literary texts in Serbian, by measuring Topic Coherency and Topic Diveristy, as well as qualitatively evaluating the topics. For BERTopic, we compared multilingual sentence transformer embeddings, to the Jerteh-355 monolingual embeddings for Serbian. For TC, NMF yielded the best results, while BERTopic with Jerteh-355 embeddings gave the best TD. Jerteh-355 also outperformed sentence transformers embeddings in both TC and TD.
first rewind previous Strona / 1 next fast forward last
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.