Results found: 2

Search results
Searched for:
in keywords: topic model
EN
Extracting review sentences that match an assigned viewpoint is useful for tasks such as summarization. Previous studies have approached review extraction using semi-supervised learning or association mining; we instead approach the task as a clustering problem, using a topic model as the clustering method. In a conventional topic model, the word and topic distributions are initialized randomly and then estimated so as to minimize perplexity using Gibbs sampling or variational Bayes. We introduce a new method, the PageRank topic model (PRTM), which estimates multinomial distributions over topics and words using network structure analysis methods. PRTM extracts topics by focusing on the co-occurrence relationships of words and does not need randomly initialized values, so it can calculate unique word and topic distributions. In experiments using synthetic data, we showed that PRTM can infer an appropriate number of topics by clustering short sentences, and that it was particularly effective when the sentences were covered by a small number of topics. Furthermore, in an experiment on real-world review data, PRTM performed better, and with a shorter runtime, than other models that infer the number of topics.
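The abstract does not spell out how PRTM works, so the following is only a minimal sketch of the general idea it names: running PageRank over a word co-occurrence network, starting from a uniform vector instead of random values. The function names `cooccurrence_graph` and `pagerank`, the sentence-level co-occurrence window, and the damping factor of 0.85 are all assumptions made for illustration; this is not the authors' algorithm and it does not derive topics.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected weighted graph (hypothetical helper).

    Words are nodes; an edge weight counts how many sentences
    contain both words.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Plain power iteration from a uniform start vector.

    Because nothing is initialized randomly, repeated runs on the
    same graph return the same scores.
    """
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # Every node in the graph has at least one neighbor,
            # so the total edge weight is never zero.
            total = sum(graph[node].values())
            for neighbor, weight in graph[node].items():
                new_rank[neighbor] += damping * rank[node] * weight / total
        rank = new_rank
    return rank


sentences = [
    "battery life is great",
    "battery drains fast",
    "screen is great",
]
scores = pagerank(cooccurrence_graph(sentences))
```

In this toy input, words that co-occur with many other words (such as "battery") receive higher scores than words with few co-occurrences (such as "screen"). The uniform starting vector is what makes the output reproducible, which mirrors the abstract's point about avoiding random initialization.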
EN
The huge number of documents in digital libraries calls for efficient methods of exploring the information they contain. "Topic modeling" is considered one of the most effective of these methods. In contrast to commonly used approaches based on finding occurrences of single words, this paper considers building topic models based on phrases. We propose a methodology that makes it possible to create a set of significant word sequences and thereby limit the search area to phrases that contain them. The methodology is evaluated in experiments performed on real text datasets. The results obtained are compared with those produced by the LDA algorithm.
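The abstract does not say how "significant" word sequences are chosen. As a rough illustration only, the sketch below treats a word sequence as significant when it occurs at least a minimum number of times across the collection, then checks which documents contain it. The function names `frequent_phrases` and `contains_phrase`, the whitespace tokenization, and the frequency threshold are assumptions, not the criterion used in the paper.

```python
from collections import Counter


def frequent_phrases(documents, n=2, min_count=2):
    """Return word n-grams that occur at least `min_count` times overall.

    Raw frequency is used here as a stand-in for significance.
    """
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


def contains_phrase(doc, phrase):
    """Check whether `doc` contains `phrase` as a contiguous word sequence."""
    tokens = doc.lower().split()
    n = len(phrase)
    return any(
        tuple(tokens[i:i + n]) == phrase
        for i in range(len(tokens) - n + 1)
    )


docs = [
    "topic model for reviews",
    "a topic model of text",
    "phrase based search",
]
phrases = frequent_phrases(docs)
matching = [d for d in docs if any(contains_phrase(d, p) for p in phrases)]
```

Restricting later processing to `matching` is one simple way to narrow the search area to text that contains the selected sequences.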