Results found: 22

Search results
Searched for:
in keywords: data clustering
EN
This paper presents a player profiling methodology applied to a turn-based computer game in an audience-driven system. The scope covers mobile games in which players compete against each other and can tackle challenges presented by the game engine. Because the game producer aims to make the gameplay as attractive as possible, players should be paired so that their duel is as exciting as possible. This requires proper player profiling based on their previous games. The paper presents the general structure of the system, a method for extracting information about each duel and storing it in data vector form, and a method for classifying players through clustering or predefined category assignment. The results show that the applied method is suitable for simulated data from the gameplay model and that clustering can be used to group players effectively and pair them for duels.
2
Impact of time series clustering on fuel sales prediction results
EN
The purpose of the paper is to check the impact of data clustering on the process of predicting demand. We tested different ways of adding information about similar datasets to the forecasting process and grouped the measurements in multiple ways. The experiments were executed on 50 time series describing fuel sales (gasoline and diesel) at 25 petrol stations belonging to an international company. We describe the data preparation and feature extraction processes. In the 9 experiments presented, we used the XGBoost algorithm and some typical time series forecasting methods (ARIMA, moving average). We present a case study for two datasets and discuss the practical usage of the tested solutions. The results show that the solution using an XGBoost model trained on data gathered from all available petrol stations generally worked best, outperforming more advanced approaches as well as typical time series methods.
3
A quaternion clustering framework
EN
Data clustering is one of the most popular methods of data mining and cluster analysis. The goal of clustering algorithms is to partition a data set into a specific number of clusters for compressing or summarizing original values. There are a variety of clustering algorithms available in the related literature. However, the research on the clustering of data parametrized by unit quaternions, which are commonly used to represent 3D rotations, is limited. In this paper we present a quaternion clustering methodology including an algorithm proposal for quaternion based k-means along with quaternion clustering quality measures provided by an enhancement of known indices and an automated procedure of optimal cluster number selection. The validity of the proposed framework has been tested in experiments performed on generated and real data, including human gait sequences recorded using a motion capture technique.
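Any k-means variant for unit quaternions needs a distance that treats q and -q as the same rotation. The sketch below shows one common choice, the rotation angle between two unit quaternions. It is an illustrative assumption and not necessarily the measure used in the paper; the name `quat_angle` is hypothetical.

```python
import math

def quat_angle(q1, q2):
    """Rotation angle (radians) between two unit quaternions (w, x, y, z).

    The absolute value of the dot product makes q and -q equivalent,
    since both encode the same 3D rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return 2.0 * math.acos(dot)

identity = (1.0, 0.0, 0.0, 0.0)
# Rotation by 90 degrees about the z axis.
rot_z_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
negated = tuple(-c for c in rot_z_90)

angle = quat_angle(identity, rot_z_90)   # close to pi / 2
same = quat_angle(rot_z_90, negated)     # close to 0
```

A plain Euclidean distance on the four components would instead report q and -q as far apart, which is why a sign-invariant measure is usually preferred for rotation data.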
4
Method for Clustering of Brain Activity Data Derived from EEG Signals
EN
A method for assessing the separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects using a headset consisting of 14 electrodes. Data are processed by applying the Discrete Wavelet Transform (DWT) for signal analysis and an autoencoder neural network for brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets, Daubechies and Symlets. The Euclidean distance between clusters, normalized with respect to the standard deviation of the whole data set, is used to separate each task performed by the participants. The results of this stage allow the separability of the data subsets associated with each activity to be assessed. The speed of convergence of the training process employing deep learning-based clustering is also measured.
5
Linguistically defined clustering of data
EN
This paper introduces a method of data clustering that is based on linguistically specified rules, similar to those applied by a human visually fulfilling a task. The method endeavors to follow these remarkable capabilities of intelligent beings. Even for most complicated data patterns a human is capable of accomplishing the clustering process using relatively simple rules. His/her way of clustering is a sequential search for new structures in the data and new prototypes with the use of the following linguistic rule: search for prototypes in regions of extremely high data densities and immensely far from the previously found ones. Then, after this search has been completed, the respective data have to be assigned to any of the clusters whose nuclei (prototypes) have been found. A human again uses a simple linguistic rule: data from regions with similar densities, which are located exceedingly close to each other, should belong to the same cluster. The goal of this work is to prove experimentally that such simple linguistic rules can result in a clustering method that is competitive with the most effective methods known from the literature on the subject. A linguistic formulation of a validity index for determination of the number of clusters is also presented. Finally, an extensive experimental analysis of benchmark datasets is performed to demonstrate the validity of the clustering approach introduced. Its competitiveness with the state-of-the-art solutions is also shown.
EN
Most geolocation applications for mobile devices assume a constant connection with the network and nodes with high computational power. However, as devices continue to develop, it is becoming possible to establish peer-to-peer networks in cases where the network may be unreachable due to special circumstances (such as conflicts or natural disasters). In this paper, a method for clustering spatial data in a mobile environment is discussed. A simple solution based on the OPTICS algorithm with lexical distance is proposed for grouping the observations.
7
Mathematical aspects of ranking theory
EN
The paper covers the theoretical grounds for defining rankings, based on terms taken from the theory of sets and relations. It presents an array of new definitions that allow rankings to be established without using typical ranking functions. Moreover, it introduces the notion of a ranking precedence relation (not necessarily an order relation) and demonstrates general algorithms for establishing rankings on the basis of definitions of extreme elements.
PL
The paper presents the theoretical foundations for defining rankings, based on concepts from the theory of sets and relations. A number of new definitions are presented that allow rankings to be built without the need to use typical ranking functions. The concept of a ranking precedence relation (not necessarily an order) is introduced, and general algorithms are presented for building rankings based on definitions of extreme elements.
EN
The paper presents the possibility of using the Recurrent Pareto Filter (RPF) in procedures for categorizing objects (data). It presents a new implementation of the RPF algorithm that uses lexicographic sorting of objects and binary search for Pareto-optimal elements. The operation of the algorithm is illustrated with an example of categorizing the scientific journals contained in the Scimago Scientific Journals database.
PL
The paper presents the possibility of using the Recurrent Pareto Filter (RPF) in procedures for categorizing objects (data). A new implementation of the RPF algorithm is presented, which uses lexicographic sorting of objects and binary search for Pareto-optimal elements (LBS). The operation of the algorithm is illustrated with an example from the area of categorizing the scientific journals contained in the Scimago Scientific Journals database.
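As a rough illustration of categorization by repeated Pareto filtering, the sketch below peels off successive Pareto-optimal fronts and treats each front as one category. It assumes all criteria are maximized and uses a naive quadratic scan; it does not reproduce the lexicographic sorting and binary search of the paper's implementation, and the names `dominates` and `pareto_layers` are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_layers(points):
    """Split points into successive Pareto fronts (category 1, 2, ...)."""
    remaining = list(points)
    layers = []
    while remaining:
        # Keep every point that no other remaining point dominates.
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(front)
        remaining = [p for p in remaining if p not in front]
    return layers

# Hypothetical objects scored on two criteria.
objects = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)]
layers = pareto_layers(objects)
```

Here `(3, 3)` forms the first category, the three mutually non-dominated points form the second, and `(1, 1)` forms the third.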
PL
The aim of this article is to present measures for examining the quality of data clustering and to apply these measures to the assessment of market segmentation. The study analyzed data on the sales markets of a company producing household goods. Market segmentation was carried out using Kohonen neural networks. The paper presents the results of the data clustering and their evaluation. The conclusions on the quality of the resulting clusters are an attempt at an overall assessment of the market segmentation performed.
EN
The purpose of this paper is to present measures used to evaluate the quality of data clustering and to apply them to the assessment of market segmentation. The analysis used data on the sales markets of a company manufacturing household products. The market segmentation was carried out using a Kohonen neural network. The paper describes the results of the clustering and the evaluation of the clusters. The conclusions on cluster quality are an attempt at an overall assessment of the market segmentation.
EN
The paper presents a method of choosing an information technology system whose task is to support the management of military aircraft operation. The proposed method is based on surveys conducted among direct users of the IT systems used in the aviation of the Polish Armed Forces. The survey results were analyzed using statistical methods. The paper concludes with practical remarks on the further usefulness of the individual information technology systems. In the future, these may prove useful in selecting the best solutions and integrating the information technology systems.
PL
The aim of this article is to present and evaluate the possibilities of using data mining methods for the segmentation of sales markets. Descriptive and predictive segmentation are presented, and the results of solving classification and data clustering tasks with Kohonen neural networks and the CART and CHAID classification trees are analyzed. The paper uses data on the sales markets of a company producing household goods.
EN
The purpose of this paper is to present and evaluate the possibility of using data mining methods in the market segmentation process. The paper presents descriptive and predictive segmentation and analyzes the results of data classification and clustering. The following methods were used in the analysis: Kohonen neural networks, CART and CHAID. The analysis concerns a manufacturing company producing household products.
12
Analysis of medical data using dimensionality reduction techniques
EN
The paper presents the application of dimensionality reduction methods to the representation of multidimensional medical data describing images of blood cells in leukemia. Different linear and nonlinear reduction techniques are applied and their efficiency compared. Their application to the visualization of different classes as well as to the clustering and classification of data is studied and discussed.
PL
The paper presents the application of various dimensionality reduction methods to the numerical representation of descriptors characterizing classes of hematopoietic cells in leukemia. Different approaches to reduction, based on linear and nonlinear transformation methods, are compared. In particular, the paper analyzes the applicability of these methods to data visualization as well as to clustering and classification. Results of experiments covering 11 classes of cells are shown.
EN
Clustering is a very important technique in knowledge discovery. It has been widely used in data mining, image processing, machine learning, bioinformatics, marketing and other fields. Clustering divides objects into groups called clusters, based on certain criteria. The similarity of objects is high within clusters but low between clusters. In this work, we investigate a hybridization of the gravitational search algorithm (GSA) and the big bang-big crunch algorithm (BB-BC) for data clustering. In the proposed approach, named GSA-BB, GSA is used to explore the search space to find the optimal locations of the cluster centroids. Whenever GSA loses its exploration ability, the BB-BC algorithm is used to diversify the population. The performance of the proposed method is compared with the GSA, BB-BC and K-means algorithms using six standard and real datasets taken from the UCI machine learning repository. Experimental results indicate a significant improvement in the quality of the clusters obtained by the proposed hybrid method over the non-hybrid methods.
14
Granular Computing Based on Gaussian Cloud Transformation
EN
Granular computing is one of the important methods for extracting knowledge from data and has achieved great success. However, it remains a puzzle for granular computing researchers how to imitate the human cognitive process of automatically choosing reasonable granularities when dealing with difficult problems. In this paper, a Gaussian cloud transformation method based on the Gaussian Mixture Model and the Gaussian Cloud Model is proposed to solve this problem. The Gaussian Mixture Model (GMM) is used to transform an original data set into a sum of Gaussian distributions, and the Gaussian Cloud Model (GCM) is used to represent the extension of a concept and measure its confusion degree. Extensive experiments on data clustering and image segmentation have been carried out to evaluate this method, and the results show its performance and validity.
PL
The detection of fire hazard on belt conveyors in coal mines is significantly influenced by parameters such as the concentrations of carbon monoxide (CO) and hydrogen cyanide (HCN) and the values of signals from smoke sensors. These quantities are taken into account when determining the value of the fire hazard index. A fuzzy model of the fire hazard index was built on the basis of laboratory measurement data for these quantities. The fuzzy model was generated from numerical data using four fuzzy clustering algorithms implemented in MATLAB code. The results are shown in tables and graphs. The functions and interfaces of the Fuzzy Logic Toolbox were used to build and visualize the designed fuzzy model.
EN
The detection of fire hazard on belt conveyors in coal mines is significantly influenced by the values of parameters such as the concentration of carbon monoxide (CO), the concentration of hydrogen cyanide (HCN) and signals from smoke detectors. These values are used to determine the fire risk index. A fuzzy model of the fire risk index was built based on laboratory measurement data. The fuzzy model was generated from this numerical data using four fuzzy clustering algorithms implemented in MATLAB code. The results are shown in tables and graphs. MATLAB and the Fuzzy Logic Toolbox library (functions and interfaces) were used to design and visualize the proposed fuzzy model.
PL
The article proposes an approach to determining limit values using fuzzy data clustering algorithms. The FCM and PCM algorithms and the Gustafson-Kessel algorithm were used. The experiment was carried out on simulated data. For this purpose, a numerical model of a rotor machine was built, simulating specified states and magnitudes of unbalance. The limit values obtained were compared with values obtained using a statistical method. All calculations were performed in the Matlab-Simulink environment.
EN
The paper describes a methodology for estimating the limit values of characteristics of diagnostic signals using fuzzy data clustering methods (the FCM, PCM and Gustafson-Kessel algorithms). The experiment was conducted on simulated data, using a numerical model of a rotor machine that simulates given unbalance states. The limits were compared with values estimated using a statistical method.
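To make the fuzzy clustering step more concrete, the sketch below computes the standard fuzzy c-means (FCM) membership degrees of a single one-dimensional sample with respect to fixed cluster centres. It is a minimal illustration under assumed inputs, not the paper's Matlab-Simulink code; the function name `fcm_memberships`, the sample value and the centres are hypothetical.

```python
def fcm_memberships(x, centers, m=2.0):
    """FCM membership of scalar sample x in each cluster.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    distance from x to centre i and m > 1 is the fuzzifier.
    """
    dists = [abs(x - c) for c in centers]
    # A sample lying exactly on a centre belongs fully to that centre
    # (shared equally if several centres coincide with it).
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]

# Hypothetical signal value and two assumed cluster centres.
u = fcm_memberships(0.5, [0.0, 2.0])
```

With these inputs the sample belongs mostly to the nearer centre, and the memberships sum to 1, which is what allows a boundary between clusters to be read off as a limit value.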
EN
A description that summarizes an entire, usually large, set of data is called its model. The problem investigated in the paper is the verification of models of data coming from a simulation experiment on selecting candidates for mobile robot operators (more strictly, building a reliable predictive model of the data). The models are validated using the train-and-test method and verified with the help of the EM (expectation-maximization) algorithm, which was originally designed for solving clustering problems with missing data. The selection is in fact a clustering problem because the candidates are assigned to ‘chosen’, ‘accepted’ or ‘rejected’ subgroups. In this case the missing data is the category (the subgroup) to which a candidate should be assigned on the basis of his or her activity measured during the simulation experiment. The paper explains the procedure of model verification. It also shows experimental results and draws conclusions.
PL
The paper shows the possibility of analyzing a set of numerical data with a view to discovering hidden relationships within the data. Principal component analysis and selected data clustering methods were used. The first example analyzes the similarity of selected EU countries in terms of obtaining energy from renewable sources, using publicly available statistical data from the databases of the Central Statistical Office of Poland (Główny Urząd Statystyczny). The second example shows the possibility of grouping periods of stock price variability, using historical (1998) data on the quotations of selected US stock exchange indexes.
EN
In this paper we analyze some numerical data sets in order to uncover unknown or hidden relationships between them. We use the principal component analysis approach as well as the hierarchical clustering method. In the first example we analyze similarities of EU countries in the field of energy production from renewable sources. We use commonly available data from the Polish Central Statistical Office. In the second example we try to find groups of similar periods of time based on the US stock exchange. We use some historical (1998) stock exchange quotations of selected indexes.
EN
The paper presents an application of a machine learning method to the creation of an equipment diagnostic model. A dewatering pump operating in a deep-mine pumping station has been chosen as the illustrative example. The second section presents the dewatering pump monitoring system and justifies the need to create a pump diagnostic model. The next sections present the application of a data clustering algorithm and a decision tree induction algorithm. Methods for reducing the obtained diagnostic model are also developed. The reduction leads to more legible data models. Results of the analysis carried out for two different types of pumps are presented in the last part of the paper.
EN
This paper presents a novel approach to data clustering and multiple-class classification problems. The proposed method is based on a metaphor derived from immune systems, the clonal selection paradigm. A novel clonal selection algorithm, Immune K-Means, is proposed. The proposed system is able to cluster real-valued data efficiently and correctly, dynamically estimating the number of clusters. In classification problems, discrimination among classes is based on the k-nearest neighbor method. Two different types of suppression are proposed. They enable the evolution of different populations of lymphocytes well suited to a given problem: clustering or classification. The first type of suppression enables the lymphocytes to discover the data distribution, while the second type focuses the lymphocytes on the class boundaries. Preliminary results on artificial data and a real-world benchmark dataset (Fisher's Iris Database), as well as a discussion of the parameters of the algorithm, are given.