Results found: 46

Search results
Searched for:
in keywords: knowledge discovery

1
100%
EN
Requirements analysis is a highly critical step in the software life-cycle. Our solution to the problem of managing requirements is an embedded domain-specific language, with Clojure playing the role of the host language. The requirements are placed directly in the source code, but there is an impedance mismatch between the compilation units and the electronic documents that are the major carriers of requirements information. This paper presents an approach to resolving this mismatch.
2
100%
EN
Methods for detecting patterns in data sets are useful and in-demand tools in the knowledge discovery process. The problem of searching for patterns in a set of sequences is called Sequential Pattern Mining. It can be defined as finding frequent subsequences in a sequence database. The pattern selection procedure is simple to state: to become a pattern, a subsequence must be contained in at least the required number of sequences from the database. The number of sequences containing a pattern is called the pattern's support. The process of finding patterns may look trivial, but solving it efficiently is not. Efficiency plays a crucial role when the required support is lowered, since the number of mined patterns may grow exponentially. Moreover, the situation changes when the problem of Sequential Pattern Mining is extended further. In the classic definition, a sequence is an ordered list of elements, each a non-empty set of items. Context Based Sequential Pattern Mining adds uniform, multi-attribute contexts (vectors) to the elements of the sequence and to the sequence itself. Introducing contexts significantly enlarges the problem's search space; however, it also brings additional opportunities to constrain the mining process. This enhancement requires new algorithms: traditional ones cannot cope with non-nominal data directly, and algorithms derived straightforwardly from traditional ones have been shown to be inefficient. This study evaluates the efficiency of the novel ContextMapping and ContextMappingHeuristic algorithms, designed to solve the Context Based Sequential Pattern Mining problem. It answers to what extent the parameterization of the algorithms affects mining costs and accuracy. It also relates the modified problem to the traditional one, pointing out their common and distinct properties and drawing perspectives for further research.
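As an illustration of the support notion used above, here is a minimal sketch of subsequence containment and support counting over a toy sequence database. This is generic background, not the ContextMapping algorithm itself (whose details the abstract does not give); all names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if each itemset of `pattern` is contained, in order,
    in some element (itemset) of `sequence`."""
    it = iter(sequence)
    return all(any(p <= s for s in it) for p in pattern)

def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)

db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"c"}],
    [{"b"}, {"d"}],
]
print(support([{"a"}, {"c"}], db))  # "a" followed by "c" occurs in 2 sequences
```

With a minimum support of 2, the pattern [{"a"}, {"c"}] would qualify; lowering the threshold admits exponentially more candidate subsequences, which is the efficiency concern the abstract raises.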
EN
A context pattern is a frequent subsequence mined from a context database containing a set of sequences. Such sequential patterns, and all elements inside them, are described by additional sets of context attributes, e.g. continuous ones. The contexts describe the circumstances of transactions and the sources of sequential data. These patterns can be mined by an algorithm for context based sequential pattern mining. However, this can create large sets of patterns, because all contexts related to patterns are taken from the database. The goal of the generalization method is to reduce the context pattern set by introducing a more compact and descriptive kind of pattern. This is achieved by finding clusters of similar context patterns in the mined set and transforming them into a smaller set of generalized context patterns. The process has to retain as much information from the mined context patterns as possible. This paper introduces a definition of the generalized context pattern and a related algorithm. Generalization results may differ depending on the algorithm's design and settings; hence, generalized patterns may reflect frequent information from the context database differently. Thus, an accuracy measure is also proposed to evaluate the generalized patterns, and it is used in the experiments presented. The generalized context patterns are compared to patterns mined by basic sequential pattern mining with pre-discretization of context values.
EN
Information becomes more and more sophisticated with its ever-increasing use. Information sophistication relates closely to human intelligence. In order to ensure a common form of information, symbolic language was developed. It gradually progressed so that the representation of information of higher-level sophistication became possible. However, there is still a lot of information that cannot be captured by a language and has to be represented at a very low level. A great effort is necessary to represent such information in a symbolic language, because there is a large gap between non-symbolic and symbolic representations. This paper discusses two problems concerning the bridging of this gap, one from the symbolic processing side and the other from the non-symbolic processing side. The former is the language aspect of the activity called discovery. The latter concerns an evolutionary process of language creation. Both are very important topics for explaining the process of sophisticating information.
5
Knowledge Mining from Data: Methodological Problems and Directions for Development
100%
EN
The development of knowledge engineering and, within its framework, of data mining or knowledge mining from data should result in the characteristics or descriptions of objects, events, processes and/or rules governing them, which should satisfy certain quality criteria: credibility, accuracy, verifiability, topicality, mutual logical consistency, usefulness, etc. Choosing suitable mathematical models of knowledge mining from data ensures satisfying only some of the above criteria. This paper presents, also in the context of the aims of The Committee on Data for Science and Technology (CODATA), more general aspects of knowledge mining and popularization, which require applying the rules that enable or facilitate controlling the quality of data.
6
Musical Sound Classification based on Wavelet Analysis
100%
EN
Content-based searching through audio data is basically restricted to metadata attached manually to the file; otherwise, users have to look for the specific musical information on their own. Nevertheless, when classifiers based on descriptors extracted analytically from sounds are used, automatic classification is in some cases possible. For instance, wavelet analysis can be used as a basis for automatic classification of audio data. In this paper, classification of musical instrument sounds based on wavelet parameterization is described. Decision trees and rough set based algorithms are used as classification tools. The parameterization is very simple, but the classification performance proves that automatic classification of these sounds is possible.
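The abstract does not specify which wavelet parameterization is used; as a generic illustration only, one level of the Haar wavelet transform, a common simple source of sound descriptors, can be sketched as follows (the function name is hypothetical):

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise averages
    (approximation coefficients) and pairwise differences (detail
    coefficients). Both halves can serve as classification descriptors."""
    avg = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    det = [(a - b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    return avg, det

print(haar_step([4, 2, 5, 5]))  # ([3.0, 5.0], [1.0, 0.0])
```

Applying the step recursively to the approximation part yields a multi-resolution description of the sound, from which scalar features can be fed to decision trees or rough set classifiers.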
7
Rough modeling - a bottom-up approach to model construction
100%
EN
Traditional data mining methods based on rough set theory focus on extracting models which are good at classifying unseen objects. If one wants to uncover new knowledge from the data, the model must have a high descriptive quality: it must describe the data set in a clear and concise manner, without sacrificing classification performance. Rough modeling, introduced by Kowalczyk (1998), is an approach which aims at providing models with good predictive and descriptive qualities, in addition to being computationally simple enough to handle large data sets. As rough models are flexible in nature and simple to generate, it is possible to generate a large number of models and search through them for the best one. Initial experiments confirm that the drop in performance of rough models compared to models induced using traditional rough set methods is slight at worst, while the gain in descriptive quality is very large.
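The abstract does not spell out the model construction, but rough modeling builds on the standard rough set primitives: the lower and upper approximations of a target set under an indiscernibility relation. A minimal sketch (all names hypothetical):

```python
from collections import defaultdict

def approximations(universe, key, target):
    """Lower and upper approximations of `target` under the
    indiscernibility relation induced by `key` (object -> attribute value).
    Objects with equal key values are indiscernible."""
    classes = defaultdict(set)
    for x in universe:
        classes[key(x)].add(x)
    lower, upper = set(), set()
    for cls in classes.values():
        if cls <= target:    # wholly inside the target: certain members
            lower |= cls
        if cls & target:     # overlapping the target: possible members
            upper |= cls
    return lower, upper

colour = {1: "a", 2: "a", 3: "b", 4: "b", 5: "c"}
lo, up = approximations({1, 2, 3, 4, 5}, colour.get, {1, 2, 3})
print(lo, up)  # {1, 2} and {1, 2, 3, 4}
```

The gap between the two approximations (here object 3 and 4's class) is exactly where a rough model must trade descriptive clarity against classification performance.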
8
100%
EN
The paper discusses the results of experiments with a new context extension of the sequential pattern mining problem. In this extension, two kinds of context attributes are introduced: one describing the source of a sequence and one for each element inside the sequence. Such context based sequential patterns may be discovered by a new algorithm, called Context Mapping Improved, designed to handle attributes with similarity functions. For numerical attributes, an alternative approach could include their pre-discretization, transforming discrete values into artificial items and then using an adaptation of an algorithm for mining sequential patterns from nominal items. The aim of this paper is to experimentally compare these two approaches on artificially generated sequence databases with numerical context attributes in which several reference patterns are hidden. The results of the experiments show that the Context Mapping Improved algorithm leads to better re-discovery of the reference patterns. Moreover, a new measure for comparing two sets of context based patterns is introduced.
EN
The Set of Experience Knowledge Structure (SOEKS) is a structure able to collect and manage explicit knowledge of formal decision events in different forms. It was built as part of a platform for transforming information into knowledge, named the Knowledge Supply Chain System (KSCS). In brief, the KSCS takes information from different technologies that make formal decision events, integrates it and transforms it into knowledge represented by Sets of Experience. SOEKS is a structure that can be the source and the target of multiple technologies. Moreover, it comprises variables, functions, constraints and rules associated in a DNA shape, allowing the construction of Decisional DNA. However, when various dissimilar Sets of Experience arise as outputs of the same formal decision event, a renegotiation and unification of the decision has to be performed. The purpose of this paper is to show the process of renegotiating various dissimilar Sets of Experience collected from the same formal decision event.
10
Global Action Rules in Distributed Knowledge Systems
100%
EN
In papers [4,5], a query answering system based on distributed knowledge mining was introduced and investigated. In the paper by Ras and Wieczorkowska [3], the notion of an action rule was introduced, with e-business taken as its application domain. In this paper, we generalize the notion of action rules in a way similar to the handling of global queries in [4,5]. Mainly, when values of attributes used in action rules cannot be easily changed for a given customer by a business user, definitions of these attributes are extracted from other sites of a distributed knowledge system. To be more precise, attributes at every site of a distributed knowledge system are divided into two sets: stable and flexible. Values of flexible attributes for a given consumer can sometimes be changed, and this change can be influenced and controlled by a business user. However, some of these changes (for instance, to the attribute "profit") cannot be made directly to the chosen attribute. In this case, definitions of such an attribute in terms of other attributes have to be learned. These new definitions are used to construct action rules showing what changes in the values of flexible attributes for a given consumer are needed in order to re-classify this consumer the way the business user wants. However, the business user may be either unable or unwilling to proceed with actions leading to such changes. In all such cases, we may search for definitions of these flexible attributes looking at either local or remote sites for help.
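A minimal sketch of how an action rule over stable and flexible attributes might be applied to a customer record. The encoding below is an illustrative assumption, not the authors' formalism; all names are hypothetical.

```python
def apply_action_rule(stable, flexible_attr, change, customer):
    """If the stable attributes match and the flexible attribute currently
    holds the 'before' value, return a copy of the customer with the
    proposed change applied; otherwise return None (no action suggested)."""
    before, after = change
    if (all(customer.get(a) == v for a, v in stable.items())
            and customer.get(flexible_attr) == before):
        return {**customer, flexible_attr: after}
    return None

customer = {"segment": "retail", "plan": "basic"}
# Stable: segment == retail; flexible: plan basic -> premium
print(apply_action_rule({"segment": "retail"}, "plan",
                        ("basic", "premium"), customer))
```

When the flexible attribute (e.g. "profit") cannot be changed directly, its definition in terms of other attributes would first have to be learned, as the abstract describes, before a rule like this can be constructed.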
EN
The paper contains a review of methodologies for the process of knowledge discovery from data and of the Data Mining (DM) methods most frequently used in mechanical engineering. The methodologies comprise various scenarios of data exploration, within which DM methods are used. The paper shows the premises for using DM methods in industry, as well as their advantages and disadvantages. The development of methodologies of knowledge discovery from data is also presented, along with a classification of the most widespread Data Mining methods, divided by the type of tasks they realize. The paper is summarized by a presentation of selected Data Mining applications in mechanical engineering.
EN
Due to the vast and rapid increase in the size of data, data mining has become an increasingly important tool for knowledge discovery, helping to avoid the situation of rich data but poor knowledge. In this context, machine learning can be seen as a powerful approach to achieve intelligent data mining. In practice, machine learning is also an intelligent approach for predictive modelling. Rule learning methods, a special type of machine learning methods, can be used to build a rule based system, a special type of expert system, for both knowledge discovery and predictive modelling. A rule based system may be represented through different structures. The techniques for representing rules are known as rule representation, which is significant for knowledge discovery in relation to the interpretability of the model, as well as for predictive modelling with regard to efficiency in predicting unseen instances. This paper justifies the significance of rule representation and presents several existing representation techniques. Two types of novel networked topologies for rule representation are developed and compared against existing techniques. The paper also includes a complexity analysis of the networked topologies in order to show their advantages over the existing techniques in terms of model interpretability and computational efficiency.
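As a minimal illustration of one common existing rule representation, here is an ordered rule list with first-match inference. The networked topologies the paper proposes are not specified in the abstract, so this sketch only shows the baseline the paper argues against; all names are hypothetical.

```python
def predict(rules, default, instance):
    """First-match inference over an ordered rule list.
    Each rule is (conditions, label); conditions is an attribute->value map.
    Falls back to `default` when no rule fires."""
    for conditions, label in rules:
        if all(instance.get(a) == v for a, v in conditions.items()):
            return label
    return default

rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "overcast"}, "yes"),
]
print(predict(rules, "yes", {"outlook": "sunny", "humidity": "high"}))  # no
```

Prediction here is linear in the number of rules and conditions; a networked representation can share common conditions between rules, which is precisely the interpretability/efficiency trade-off the paper analyses.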
EN
The main purpose of a topological index is to encode a chemical structure by a number. A topological index is a graph invariant, which describes the topology of the graph and remains constant under graph automorphisms. Topological indices play a wide role in the study of QSAR (quantitative structure-activity relationships) and QSPR (quantitative structure-property relationships), and are used to judge the bioactivity of chemical compounds. In this article, we compute the ABC (atom-bond connectivity), ABC4 (fourth version of ABC), GA (geometric-arithmetic) and GA5 (fifth version of GA) indices of some sheet networks. These networks include: octonano window sheet, equilateral triangular tetra sheet, rectangular sheet, and rectangular tetra sheet networks.
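The ABC index has a standard edge-wise formula, ABC(G) = Σ over edges uv of √((dᵤ + dᵥ − 2)/(dᵤ·dᵥ)), where dᵤ is the degree of vertex u. A direct computation over an edge list can be sketched as follows (the sheet networks of the paper are not reproduced here; the example graph is a toy path):

```python
import math

def abc_index(edges):
    """Atom-bond connectivity index of a simple graph given as an edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sum(math.sqrt((deg[u] + deg[v] - 2) / (deg[u] * deg[v]))
               for u, v in edges)

# Path graph 1-2-3: degrees 1, 2, 1; each edge contributes sqrt(1/2)
print(abc_index([(1, 2), (2, 3)]))  # ~1.4142
```

The GA index follows the same pattern with the edge term 2√(dᵤdᵥ)/(dᵤ+dᵥ); for regular lattices like the paper's sheet networks, such sums reduce to closed-form expressions by counting edges per degree pair.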
14
Mining the Largest Dense Vertexlet in a Weighted Scale-free Graph
88%
EN
An important problem of knowledge discovery that has recently evolved in various real-life networks is identifying the largest set of vertices that are functionally associated. The topology of many real-life networks shows scale-freeness, where the vertices of the underlying graph follow a power-law degree distribution. Moreover, the graphs corresponding to most real-life networks are weighted in nature. In this article, the problem of finding the largest group or association of vertices that are dense (denoted as a dense vertexlet) in a weighted scale-free graph is addressed. Density quantifies the degree of similarity within a group of vertices in a graph. The density of a vertexlet is defined in a novel way that ensures significant participation of all the vertices within the vertexlet. It is established that the problem is NP-complete. An upper bound on the order of the largest dense vertexlet of a weighted graph, with respect to a certain density threshold value, is also derived. Finally, an O(n² log n) heuristic graph mining algorithm (where n denotes the number of vertices in the graph) that produces an approximate solution to the problem is presented.
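The paper's density definition is its own novel variant (it additionally enforces significant participation of every vertex); purely as a generic illustration, the average pairwise edge weight inside a candidate vertexlet can be computed as follows (names and the density formula are assumptions):

```python
from itertools import combinations

def density(weights, vertexlet):
    """Average edge weight over all vertex pairs inside `vertexlet`.
    `weights` maps sorted vertex pairs to edge weights; missing pairs
    count as weight 0. A generic density notion, not the paper's."""
    pairs = list(combinations(sorted(vertexlet), 2))
    return sum(weights.get(p, 0.0) for p in pairs) / len(pairs)

w = {(1, 2): 0.9, (1, 3): 0.8, (2, 3): 0.7, (3, 4): 0.1}
print(density(w, {1, 2, 3}))  # ~0.8: a dense vertexlet
print(density(w, {1, 2, 3, 4}))  # much lower once vertex 4 is included
```

Note that this plain average can stay high even if one vertex barely participates, which is exactly the weakness the paper's stricter definition is designed to avoid.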
EN
The diagnostics of machinery is nowadays aided by expert systems, which require knowledge of the machine to be diagnosed. This knowledge may be acquired either from human experts or from databases containing examples. The paper deals with several methods of knowledge acquisition from examples. It addresses the whole range of problems, from the preparation of examples up to the verification and validation of the knowledge base that concludes the procedure. To acquire diagnostic knowledge from a data set of examples, we apply machine learning or knowledge discovery methods. We also describe a new method for the induction of rules which is especially useful in the technical diagnostics of machinery, where a complex structure of the set of technical states often occurs. An example of the application of the described methods to the acquisition of diagnostic knowledge on rotating machinery is also given.
16
About New Version of RSDS System
88%
EN
The aim of this paper is to present a new version of a bibliographic database system, the Rough Set Database System (RSDS). The RSDS system includes, among other things, bibliographic descriptions of publications on rough set theory and its applications. The system is also an experimental environment for research related to the processing of bibliographic data using domain knowledge and the related information retrieval.
17
88%
EN
The main goal of this paper is to outline an approach to intelligent searching of the Rough Set Database System (RSDS). RSDS is a bibliographical system containing bibliographical descriptions of publications connected with the methodology of rough sets and its applications. The presented approach is based on created ontologies which model the considered domain (rough set theory, its applications and related fields) and the information about publications coming, for example, from abstracts.
18
Light Region-based Techniques for Process Discovery
88%
EN
A central problem in the area of Process Mining is to obtain a formal model that represents selected behavior of a system. The theory of regions has been applied to address this problem, enabling the derivation of a Petri net whose language includes a set of traces. However, when dealing with real-life systems, the available tool support for performing such a task is unsatisfactory, due to the complex algorithms that are required. In this paper, the theory of regions is revisited to devise a novel technique that explores the space of regions by combining the elements of a region basis. Due to its light space requirements, the approach can represent an important step for bridging the gap between the theory of regions and its industrial application. Experimental results show that there is improvement in orders of magnitude in comparison with state-of-the-art tools for the same task.
19
The Outline of an Ontology for the Rough Set Theory and its Applications
88%
EN
The paper gives the outline of an ontology for the rough set theory and its applications. This ontology will be applied in intelligent searching of the Rough Set Database System. A specialized editor from the Protege system is used to define the ontology.
EN
Data mining offers tools for data analysis, knowledge discovery, and autonomous decision-making. In the paper, a data mining approach is used to extract meaningful features (attributes) from a data set and make accurate predictions for a semiconductor process application. An important property of the approach discussed in the paper is that a decision is made only when it is accurately predicted, otherwise no autonomous decision is recommended. The high accuracy of predictions made by the proposed approach is based on a weak assumption that objects with equivalent values of a subset of attributes produce equivalent outcomes.
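A minimal sketch of the decide-only-when-confident behaviour described above: a prediction is returned only when its estimated confidence clears a threshold, otherwise the system abstains. The threshold value and all names are hypothetical, not the paper's method.

```python
def decide(candidates, threshold=0.9):
    """Return the top predicted outcome only when its estimated confidence
    meets `threshold`; otherwise return None (no autonomous decision).
    `candidates` maps outcome labels to confidence estimates."""
    label, confidence = max(candidates.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else None

print(decide({"pass": 0.95, "fail": 0.05}))  # 'pass' (confident enough)
print(decide({"pass": 0.60, "fail": 0.40}))  # None (abstain)
```

Abstention trades coverage for accuracy: the decisions that are made can be highly accurate, while ambiguous cases are deferred, matching the semiconductor-process setting the abstract describes.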