Performance of an e-learning system depends on an extent to which it is adjusted to student needs. Priorities of the last ones may differ in accordance with the context of use of an e-learning environment. For personalized e-learning system based on student groups, different distribution of the groups should be taken into account. In the paper, using of data mining techniques for building student groups depending on the context of the system use is considered. As the main technique unsupervised classification is examined. Context parameters depending on courses and student models are tested. Experiment results for real student data are discussed.
Increasing number of repositories of online documents resulted in growing demand for automatic categorization algorithms. However, in many cases the texts should be assigned to more than one class. In the paper, new multi-label classification algorithm for short documents is considered. The presented problem transformation Labels Chain (LC) algorithm is based on relationship between labels, and consecutively uses result labels as new attributes in the following classification process. The method is validated by experiments conducted on several real text datasets of restaurant reviews, with different number of instances, taking into account such classifiers as kNN, Naive Bayes, SVM and C4.5. The obtained results showed the good performance of the LC method, comparing to the problem transformation methods like Binary Relevance and Label Powerset.
Effective analysis of structured documents may decide on management information systems performance. In the paper, an adaptive method of information extraction from structured text documents is considered. We assume that documents belong to thematic groups and that required set of information may be determined ”apriori”. The knowledge of document structure allows to indicate blocks, where certain information is more probable to appear. As the result structured data, which can be further analysed are obtained. The proposed solution uses dictionaries and flexion analysis, and may be applied to Polish texts. The presented approach can be used for information extraction from official letters, information sheets and product specifications.
A huge amount of documents in the digitalized libraries requires efficient methods for exploring contained there information. “Topic modeling” is considered as one of the most effective among them. In spite of commonly used approaches for finding occurrences of single words, in the paper building topic models based on phrases is pondered. We propose a methodology, which enables to create a set of significant word sequences and thus limiting the search area to phrases which contain them. The methodology is evaluated on experiments performed on real text datasets. Obtained results are compared with those received by using LDA algorithm.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.