Wyniki wyszukiwania - Biblioteka Nauki

1

New test for inclusion minimality in reduct generation

100%

Susmaga R.

Foundations of Computing and Decision Sciences

|

2000

|

tom Vol. 25, No. 2

121-146

EN

The paper addresses the problem of generating reducts, i.e. minimal subsets of attributes that satisfy pre-defmed consistency and minimality conditions. The main problem with the reduct generating process is its high computational complexity. This paper describes a breadth-first search algorithm to reduct generation that is based on the notion of discernibility matrix. The most time consuming part of the algorithm is a test for inclusion minimality that has to be applied to every potential reduct. As it has been shown, implementation of this minimality test determines strongly the behaviour of the whole algorithm. A number of different minimality tests has been presented and computationally evaluated in [33]. This paper is in a sense its continuation in that it introduces further improvements to the minimality tests. It also presents results of a set of experiments with non-trivial real-life data sets in which the new tests have been compared with their earlier implementations.

2

Computation of shortest reducts

100%

Susmaga R.

Foundations of Computing and Decision Sciences

|

1998

|

tom Vol. 23, No. 2

119-137

EN

The paper addresses the problem of computing short reducts in information/decision tables. Reducts in general, and short reducts in particular, may be usefully applied for consitency preserving data size reduction but are hard to find because of high theoretical complexity of the problem. Practical experiments demonstrate, however, that reducts may be successfully computed for many real life data sets using some advanced algorithms. This paper reports on a series of experiments designed to verify not the theoretical complexity but the practical behaviour of algorithms for reduct computation in the average case. In particular, the problem of computing short reducts is sotved by presenting a new algorithm, which is based on the notion of discernibility matrix. All the results of the experiments reported in this paper have been obtained for real-life data sets:

3

Reducts and Constructs in Attribute Reduction

100%

Susmaga R.

Fundamenta Informaticae

|

2004

|

tom Vol. 61, nr 2

159--181

EN

One of the main notions in the Rough Sets Theory (RST) is that of a reduct. According to its classic definition, the reduct is a minimal subset of the attributes that retains some important properties of the whole set of attributes. The idea of the reduct proved to be interesting enough to inspire a great deal of research and resulted in introducing various reduct-related ideas and notions. First of all, depending on the character of the attributes involved in the analysis, so called absolute and relative reducts can be defined. The more interesting of these, relative reducts, are minimal subsets of attributes that retain discernibility between objects belonging to different classes. This paper focuses on the topological aspects of such reducts, identifying some of their limitations and introducing alternative definitions that do not suffer from these limitations. The modified subsets of attributes, referred to as constructs, are intended to assist the subsequent inductive process of data generalisation and knowledge acquisition, which, in the context of RST, usually takes the form of decision rule generation. Usefulness of both reducts and constructs in this role is examined and evaluated in a massive computational experiment, which was carried out for a collection of real-life data sets.

4

Effective tests for minimality in reduct generation

100%

Susmaga R.

Foundations of Computing and Decision Sciences

|

1998

|

tom Vol. 23, No. 4

219-240

EN

The paper addresses the problem of checking for inclusion minimality in attribute reduction. Reduction of attributes information/decision tables is an important aspect of table analysis where so called reducts of attributes may successfully be applied. The reducts, however, are hard to generate because of high theoretical complexity of the problem. Especially difficult is the generation of all exact reducts for a given data set. This paper reports on a series of experiments with some advariced algorithms that allow to generate all reducts. Particular attention is paid to a family of algorithms based on the notion of discernibility matrix. The heaviest computing load of these algorithms lies in testing for minimality with regard to inclusion. The paper introduces a new minimality test that makes the algorithms even more effective. All the presented tests are evaluated in experiments with real-life data sets.

5

Can Confirmation Measures Reflect Statistically Sound Dependencies in Data? The Concordance-based Assessment

63%

Susmaga R. , Szczęch I.

Foundations of Computing and Decision Sciences

|

2018

|

tom Vol. 43, No. 1

41--66

EN

The paper considers particular interestingness measures, called confirmation measures (also known as Bayesian confirmation measures), used for the evaluation of “if evidence, then hypothesis” rules. The agreement of such measures with a statistically sound (significant) dependency between the evidence and the hypothesis in data is thoroughly investigated. The popular confirmation measures were not defined to possess such form of agreement. However, in error-prone environments, potential lack of agreement may lead to undesired effects, e.g. when a measure indicates either strong confirmation or strong disconfirmation, while in fact there is only weak dependency between the evidence and the hypothesis. In order to detect and prevent such situations, the paper employs a coefficient allowing to assess the level of dependency between the evidence and the hypothesis in data, and introduces a method of quantifying the level of agreement (referred to as a concordance) between this coefficient and the measure being analysed. The concordance is characterized and visualised using specialized histograms, scatter-plots, etc. Moreover, risk-related interpretations of the concordance are introduced. Using a set of 12 confirmation measures, the paper presents experiments designed to establish the actual concordance as well as other useful characteristics of the measures.

6

Visualization support for the analysis of properties of interestingness measures

63%

Susmaga R. , Szczęch I.

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2015

|

tom Vol. 63, nr 1

315--327

EN

The paper considers a particular group of rule interestingness measures, called Bayesian confirmation measures, which have become the subject of numerous, but often exclusively theoretical studies. To assist and enhance their analysis in real-life situations, where time constraints may impede conducting such time consuming procedures, a visual technique has been introduced and described in this paper. It starts with an exhaustive and non-redundant set of contingency tables, which consists of all possible tables having the same number of observations. These data, originally 4-dimensional, may, owing to an inherent constraint, be effectively represented as a 3-dimensional tetrahedron, while an additional, scalar function of the data (e.g. a confirmation measure) may be rendered using colour. Dedicated analyses of particular colour patterns on this tetrahedron allow to promptly perceive particular properties of the visualized measures. To illustrate the introduced technique, a set of 12 popular confirmation measures has been selected and visualized. Additionally, a set of 9 popular properties has been chosen and the visual interpretations of the measures in terms of the properties have been presented.

7

The Concept Of Topological Information In Text Representation

51%

Susmaga R. , Masłowska I. , Budzyńska L.

Foundations of Computing and Decision Sciences

|

2011

|

tom Vol. 36, No. 1

57-78

EN

This paper studies the possibility of processing text documents using topological information on keywords, by which we mean internal positions of the keywords in the text. While the word counts are pieces of information that is independent of the sequence of words in the text, the topological, i.e. position-related, information manifests obvious dependency on the sequence of words. In result, the presented method stops treating the texts as amorphous collections of words and starts treating them as linearly-ordered sequences of words. Thus, the introduced, topological approach is of higher level than the popular bag-of-words approaches, and its advantage should unveil in applications to texts of similar themes; due to their similar counts of keywords the topological information may prove to be indispensable. It should also require significantly smaller sets of keywords as compared to the bag-of-words approaches.

8

Generation of reducts and rules in multi-attribute and multi-criteria classification

51%

Susmaga R. , Słowiński R. , Greco S. , Matarazzo B.

Control and Cybernetics

|

2000

|

tom Vol. 29, no 4

969-988

EN

The paper addresses the problem of analysing information tables which contain objects described by both attributes and criteria, i.e. attributes with preference-ordered scales. The objects contained in those tables, representing exemplary decisions made by a decision maker or a domain expert, are usually classified into one of several classes that are also often preference-ordered. Analysis of such data using the classic rough set methodology may produce improper results, as the original rough set approach is not able to discover inconsistencies originating from consideration of typical criteria, like e.g. product quality, market share or debt ratio. The paper presents the framework for the analysis of both attributes and criteria and a very promising algorithm for generating reducts. The algorithm presented is evaluated in an experiment with real-life data sets and its results are compared to those by two other reduct generating algorithms.

9

Hyperplane Aggregation of Dominance Decision Rules

51%

Pindur R. , Susmaga R. , Stefanowski J.

Fundamenta Informaticae

|

2004

|

tom Vol. 61, nr 2

117--137

EN

In this paper we consider multiple criteria decision aid systems based on decision rules generated from examples. A common problem in such systems is the over-abundance of decision rules, as in many situations the rule generation algorithms produce very large sets of rules. This prolific representation of knowledge provides a great deal of detailed information about the described objects, but is appropriately difficult to interpret and use. One way of solving this problem is to aggregate the created rules into more general ones, e.g. by forming rules of enriched syntax. The paper presents a generalization of elementary rule conditions into linear combinations. This corresponds to partitioning the preference-ordered condition space of criteria with non-orthogonal hyperplanes. The objective of this paper is to introduce the generalized rules into the multiple criteria classification problems and to demonstrate that these problems can be successfully solved using the introduced rules. The usefulness of the introduced solution is finally demonstrated in computational experiments with real-life data sets.

10

Community Traffic: a technology for the next generation car navigation

32%

Dembczyński K. , Gaweł P. , Jaszkiewicz A. , Kotłowski W. , Kubiak M. , Susmaga R. , Wesołek P. , Wojciechowski A. , Zielniewicz P.

Control and Cybernetics

|

2012

|

tom Vol. 41, no. 4

867--883

EN

The paper presents the NaviExpert’s Community Traffic technology, an interactive, community–based car navigation system. Using data collected from its users, Community Traffic offers services unattainable to earlier systems. On the one hand, the current traffic data are used to recommend the best routes in the navigation phase, during which many potentially unpredictable traffic-delaying and traffic-jamming events, like unexpected roadworks, road accidents, or diversions, can be taken into account and thereby successfully avoided. On the other hand, a number of istinctive features, like immediate location of various traffic dangers, are offered. Using exclusively real-life data, provided by NaviExpert, the paper presents two illustrative case studies concerned with experimental evaluation of solutions to computational problems related to the community-based services offered by the system.

11

Visualization support for the analysis of properties of interestingness measures

26%

Susmaga R. , Szczęch I.

|