It is useful to extract review sentences based on an assigned viewpoint for purposes such as summarization tasks. Previous studies have considered review extraction using semi-supervised learning or association mining. However, we approach this task using a clustering method. In particular, we focus on a topic model as a clustering method. In the conventional topic model, after randomly initializing the word distribution and the topic distribution, these distributions are estimated in order to minimize the perplexity using Gibbs sampling or variational Bayes. We introduce a new method called the PageRank topic model (PRTM) for estimating multinomial distributions over topics and words using network structure analysis methods. PRTM extracts topics by focusing on the co-occurrence relationships of words and it does not need randomly initialized values. Therefore, it can calculate unique word and topic distributions. In experiments using synthetic data, we showed that PRTM can infer an appropriate number of topics by clustering short sentences, and it was particularly effective when the sentences were covered by a small number of topics. Furthermore, in a real-world review data experiment, we showed that PRTM performed better with a shorter runtime compared with other models that infer the number of topics.
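The abstract does not spell out how PRTM is computed, so the following is only a minimal sketch of the general idea it mentions: ranking words through network analysis of their co-occurrence relationships, starting from a deterministic (uniform) initialization rather than random values. The function names `cooccurrence_graph` and `pagerank`, the damping factor, and the sentence-level co-occurrence window are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected weighted graph of words.

    Edge weight = number of sentences in which the two words co-occur
    (sentence-level window is an assumption of this sketch).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Plain power-iteration PageRank with a uniform starting vector."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # deterministic start, no randomness
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            total = sum(graph[node].values())
            for neighbour, weight in graph[node].items():
                # Distribute this node's rank in proportion to edge weight.
                new_rank[neighbour] += damping * rank[node] * weight / total
        rank = new_rank
    return rank


sentences = [
    "battery life is great",
    "battery life is short",
    "screen is bright",
]
ranks = pagerank(cooccurrence_graph(sentences))
top_word = max(ranks, key=ranks.get)
```

Because the starting vector is uniform and the update is deterministic, repeated runs on the same sentences give the same ranking, which is the property the abstract contrasts with randomly initialized topic models.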
2
Previous seriation algorithms face a balance problem. Some approaches produce permutations with perfect wholeness, in which matrix rows/columns follow an increasing or decreasing gradient. However, such a smooth permutation may blur the representation of the data structure, such as clustering structures and detailed structures inside clusters. Other approaches reveal these structures well by aggregating similar rows/columns more tightly, but this aggregation always comes at the cost of losing the necessary coherence of the matrix rows/columns. In this paper, we introduce a seriation algorithm that aims to balance the smoothness of the permutation and the clarity of the matrix structure. The permutation algorithm greedily and recursively replaces highly dissimilar object pairs with less dissimilar ones, and the optimization algorithm searches for the globally optimal solution by applying simulated annealing. A comparison study provides both empirical and statistical evidence that Recut can produce more accurate and visually appropriate permutations by considering the balance problem.
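The abstract gives no details of Recut itself, so the sketch below only illustrates the generic ingredient it names: using simulated annealing to search for a row ordering. It assumes a simple objective (the sum of dissimilarities between adjacent rows) and a segment-reversal move; the names `path_cost` and `anneal_order`, the cooling schedule, and the objective are all hypothetical choices for this example, not the paper's algorithm.

```python
import math
import random


def path_cost(order, dissim):
    """Sum of dissimilarities between rows that are adjacent in the ordering."""
    return sum(dissim[order[i]][order[i + 1]] for i in range(len(order) - 1))


def anneal_order(dissim, steps=20000, start_temp=1.0, cooling=0.9995, seed=0):
    """Search for a low-cost ordering with simulated annealing (needs n >= 2)."""
    rng = random.Random(seed)
    n = len(dissim)
    order = list(range(n))
    cost = path_cost(order, dissim)
    best_order, best_cost = order[:], cost
    temp = start_temp
    for _ in range(steps):
        # Propose a move: reverse the segment between two random positions.
        i, j = sorted(rng.sample(range(n), 2))
        candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
        delta = path_cost(candidate, dissim) - cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            order, cost = candidate, cost + delta
            if cost < best_cost:
                best_order, best_cost = order[:], cost
        temp *= cooling
    return best_order, best_cost


# Toy data: five objects lying on a line, shuffled.
values = [3, 0, 2, 1, 4]
dissim = [[abs(a - b) for b in values] for a in values]
order, cost = anneal_order(dissim)
```

A purely adjacent-pair objective like this one favors smooth gradients, so it sits on the "wholeness" side of the trade-off the abstract describes; balancing it against cluster clarity would need a different objective.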
3
Many medical accidents and incidents occur due to communication errors. To avoid such incidents, this paper proposes a system for determining communication errors. In particular, we propose a model that can be applied to multi-layered or chained situations. First, we provide an overview of communication errors in nursing activities. Then we describe the warp and woof model for nursing tasks, which was proposed by Harada and considers multi-layered or chained situations. Next, we describe a system for determining communication errors based on this model. The system can generate nursing activity diagrams semiautomatically and compile the necessary nursing activities. We also propose a prototype tagging scheme for the nursing corpus to support effective generation of the diagrams. Finally, we combine the diagram generation with the Kamishibai KeyGraph to identify possible points where hidden or potential factors of communication errors may lie.