Search results - BazTech

1

Method for the Player Profiling in the Turn-based Computer Games

Bilski Piotr, Antoniuk Izabella, Łabędzki Rafał

International Journal of Electronics and Telecommunications | 2023 | Vol. 69, No. 3, pp. 461--468

EN

This paper presents a player profiling methodology applied to a turn-based computer game in an audience-driven system. The general scope is mobile games in which players compete against each other and can tackle challenges presented by the game engine. Because the game producer aims to make the gameplay as attractive as possible, players should be paired so that their duel is as exciting as possible. This requires proper player profiling based on their previous games. The paper presents the general structure of the system, a method for extracting information about each duel and storing it in data vector form, and a method for classifying players through clustering or assignment to predefined categories. The results show that the applied method is suitable for simulated data from the gameplay model and that clustering may be used to group players effectively and pair them for duels.

2

Impact of time series clustering on fuel sales prediction results

Henzel Joanna, Sikora Marek, Bularz Jakub

Annals of Computer Science and Information Systems | 2021 | Vol. 26, pp. 13--21

EN

The purpose of the paper is to check the impact of data clustering on the process of predicting demand. We checked different ways of adding information about similar datasets to the forecasting process and we grouped the measurements in multiple ways. The experiments were executed on 50 time series describing fuel sales (gasoline and diesel) at 25 petrol stations belonging to an international company. We described the data preparation process and the feature extraction process. In the 9 presented experiments, we used the XGBoost algorithm and some typical time series forecasting methods (ARIMA, moving average). We showed a case study for two datasets and discussed the practical usage of the tested solutions. The results showed that the solution using an XGBoost model trained on data gathered from all available petrol stations generally worked best, outperforming more advanced approaches as well as typical time series methods.
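As an illustration of the simplest baseline named in the abstract above, the following is a minimal sketch of a moving-average forecast. The function name, the window length and the sales figures are assumptions made for this example; the abstract does not state how the authors configured their baseline.

```python
from statistics import mean


def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return mean(series[-window:])


# Hypothetical daily fuel sales (litres) for one station; the values are made up.
daily_sales = [1200, 1150, 1300, 1250, 1400, 1350]
next_day = moving_average_forecast(daily_sales, window=3)
```

A baseline like this uses only one station's own history, which is the kind of per-series method the abstract contrasts with a model trained on data from all stations.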

3

A quaternion clustering framework

Piórek Michał, Jabłoński Bartosz

International Journal of Applied Mathematics and Computer Science | 2020 | Vol. 30, no. 1, pp. 133--147

EN

Data clustering is one of the most popular methods of data mining and cluster analysis. The goal of clustering algorithms is to partition a data set into a specific number of clusters for compressing or summarizing original values. There are a variety of clustering algorithms available in the related literature. However, the research on the clustering of data parametrized by unit quaternions, which are commonly used to represent 3D rotations, is limited. In this paper we present a quaternion clustering methodology including an algorithm proposal for quaternion based k-means along with quaternion clustering quality measures provided by an enhancement of known indices and an automated procedure of optimal cluster number selection. The validity of the proposed framework has been tested in experiments performed on generated and real data, including human gait sequences recorded using a motion capture technique.
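To make the idea of k-means on unit quaternions more concrete, here is a minimal sketch of the two steps such an algorithm needs: a distance that treats q and -q as the same rotation, and a centroid update. The distance (rotation angle) and the sign-aligned averaging are common choices assumed for this example; the abstract does not say which metric or centroid rule the authors use, and the averaging is only a reasonable approximation for tightly grouped rotations.

```python
import math


def normalize(q):
    """Scale a 4-tuple to unit length (fails on the zero vector)."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)


def rotation_distance(q1, q2):
    """Angle in radians between the rotations of two unit quaternions.

    The absolute value of the dot product makes q and -q equivalent.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))


def assign(quaternions, centroids):
    """Assignment step: index of the nearest centroid for each quaternion."""
    return [
        min(range(len(centroids)), key=lambda k: rotation_distance(q, centroids[k]))
        for q in quaternions
    ]


def update_centroid(members, reference):
    """Update step (assumed rule): flip each member into the hemisphere of
    `reference`, average component-wise and renormalize."""
    acc = [0.0, 0.0, 0.0, 0.0]
    for q in members:
        sign = 1.0 if sum(a * b for a, b in zip(q, reference)) >= 0 else -1.0
        for i in range(4):
            acc[i] += sign * q[i]
    return normalize(acc)


identity = (1.0, 0.0, 0.0, 0.0)
neg_identity = (-1.0, 0.0, 0.0, 0.0)          # same rotation as identity
rot_z_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

labels = assign([identity, neg_identity, rot_z_90], [identity, rot_z_90])
```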

4

Method for Clustering of Brain Activity Data Derived from EEG Signals

Kurowski Adam, Mrozik Katarzyna, Kostek Bozena, Czyzewski Andrzej

Fundamenta Informaticae | 2019 | Vol. 168, nr 2-4, pp. 249--268

EN

A method for assessing the separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects using a headset with 14 electrodes. Data are processed by applying the Discrete Wavelet Transform (DWT) for signal analysis and an autoencoder neural network for brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets, Daubechies and Symlets. The Euclidean distance between clusters, normalized with respect to the standard deviation of the whole data set, is used to separate the tasks performed by participants. The results of this stage allow for an assessment of separability between subsets of data associated with each activity performed by experiment participants. The speed of convergence of the training process employing deep learning-based clustering is also measured.
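The separability measure mentioned above can be sketched as follows. This is one possible reading of the abstract, assumed for illustration: each feature is scaled by its standard deviation over the whole data set, and the Euclidean distance is then taken between cluster centroids. The function names and the handling of zero-variance features are choices made for this example, not details from the paper.

```python
import math
from statistics import mean, pstdev


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    return [mean(col) for col in zip(*points)]


def normalized_centroid_distance(cluster_a, cluster_b, all_points):
    """Euclidean distance between two centroids, with every feature divided
    by its standard deviation over the whole data set.

    Features with zero variance carry no information and are skipped
    (an assumption of this sketch).
    """
    stds = [pstdev(col) for col in zip(*all_points)]
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    return math.sqrt(
        sum(((a - b) / s) ** 2 for a, b, s in zip(ca, cb, stds) if s > 0)
    )


# Made-up two-dimensional feature vectors for two activities.
task_a = [(0.0, 0.0), (2.0, 0.0)]
task_b = [(10.0, 0.0), (12.0, 0.0)]
distance = normalized_centroid_distance(task_a, task_b, task_a + task_b)
```

A larger value indicates that the two groups are further apart relative to the overall spread of the data.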

5

Linguistically defined clustering of data

Leski J. M., Kotas M. P.

International Journal of Applied Mathematics and Computer Science | 2018 | Vol. 28, no. 3, pp. 545--557

EN

This paper introduces a method of data clustering that is based on linguistically specified rules, similar to those applied by a human performing the task visually. The method endeavors to follow these remarkable capabilities of intelligent beings. Even for the most complicated data patterns, a human is capable of accomplishing the clustering process using relatively simple rules. His/her way of clustering is a sequential search for new structures in the data and new prototypes with the use of the following linguistic rule: search for prototypes in regions of extremely high data densities and immensely far from the previously found ones. Then, after this search has been completed, the respective data have to be assigned to one of the clusters whose nuclei (prototypes) have been found. A human again uses a simple linguistic rule: data from regions with similar densities, which are located exceedingly close to each other, should belong to the same cluster. The goal of this work is to prove experimentally that such simple linguistic rules can result in a clustering method that is competitive with the most effective methods known from the literature on the subject. A linguistic formulation of a validity index for determining the number of clusters is also presented. Finally, an extensive experimental analysis of benchmark datasets is performed to demonstrate the validity of the clustering approach introduced. Its competitiveness with state-of-the-art solutions is also shown.

6

Spatial data clustering in independent mobile environment

Gajewski B., Martyn T.

Measurement Automation Monitoring | 2016 | Vol. 62, No. 5, pp. 163--165

EN

Most geolocation applications for mobile devices assume a constant connection with the network and with nodes of high computational power. However, as devices continue to develop, it is becoming possible to establish peer-to-peer networks when the network is unreachable due to special circumstances (such as conflicts or natural disasters). In this paper, a method for clustering spatial data in a mobile environment is discussed. A simple solution based on the OPTICS algorithm with lexical distance is proposed for grouping the observations.

7

Mathematical aspects of ranking theory

Ameljańczyk A.

Computer Science and Mathematical Modelling | 2015 | No. 2, pp. 5-10

EN

The paper presents the theoretical foundations for defining rankings, based on concepts from the theory of sets and relations. A number of new definitions are presented that allow rankings to be established without the need for typical ranking functions. Moreover, the notion of a ranking precedence relation (not necessarily an order relation) is introduced, and general algorithms for establishing rankings on the basis of definitions of extreme elements are demonstrated.

PL

W pracy przedstawiono podstawy teoretyczne definiowania rankingów, bazujące na pojęciach teorii zbiorów i relacji. Zaprezentowano szereg nowych definicji pozwalających budować rankingi bez konieczności korzystania z typowych funkcji rankingowych. Wprowadzono pojęcie relacji rankingowego poprzedzania (niekoniecznie porządku) oraz przedstawiono ogólne algorytmy pozwalające budować rankingi w oparciu o definicje elementów ekstremalnych.

8

Lexicographical binary implementation of the Recurrent Pareto Filter in categorization procedures

Ameljańczyk A., Tran Quang C.

Computer Science and Mathematical Modelling | 2015 | No. 2, pp. 11-15

EN

The paper presents the possibility of using the Recurrent Pareto Filter (RPF) in procedures for categorizing objects (data). It presents a new implementation of the RPF algorithm that uses lexicographical sorting of objects and binary search for Pareto-optimal elements (LBS). The functioning of the algorithm is illustrated by an example of categorizing the scientific journals contained in the Scimago Scientific Journals database.

PL

W pracy przedstawiono możliwość wykorzystania Rekurencyjnego Filtra Pareto (RPF) w procedurach kategoryzacji obiektów (danych). Przedstawiono nową implementację algorytmu RPF, wykorzystującą leksykograficzne sortowanie obiektów i binarne poszukiwanie elementów optymalnych w sensie Pareto (LBS). Funkcjonowanie algorytmu zilustrowano przykładem z obszaru kategoryzacji czasopism naukowych zawartych w Bazie Scimago Scientific Journals.
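The categorization idea in this record can be sketched as repeated extraction of the Pareto front: the first front forms the top category, it is removed, and the procedure repeats on what remains. The sketch below presorts the objects lexicographically, as the abstract mentions, but replaces the binary search step with a plain scan, since the abstract does not describe it. All names and the sample scores are hypothetical, and larger values are assumed to be better on every criterion.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better once."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(sorted_items):
    """Non-dominated items of a list sorted lexicographically, descending.

    After such a sort an item can only be dominated by an earlier item,
    so it is enough to compare it with the front collected so far.
    """
    front = []
    for item in sorted_items:
        if not any(dominates(f, item) for f in front):
            front.append(item)
    return front


def categorize(items):
    """Peel off successive Pareto fronts; each front becomes one category."""
    remaining = sorted(items, reverse=True)   # lexicographic, best first
    categories = []
    while remaining:
        front = pareto_front(remaining)
        categories.append(front)
        in_front = set(front)
        remaining = [it for it in remaining if it not in in_front]
    return categories


# Hypothetical (criterion 1, criterion 2) scores for six journals.
scores = [(3, 1), (1, 3), (2, 2), (1, 1), (0, 0), (2, 1)]
result = categorize(scores)
```

The scan makes each front extraction quadratic in the worst case; the binary search named in the abstract is presumably aimed at reducing that cost.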

9

Ocena segmentacji rynku za pomocą miar jakości grupowania danych

Paśko Ł., Setlak G.

Studia Informatica | 2014 | Vol. 35, nr 2, pp. 157--173

PL

Celem niniejszego artykułu jest przedstawienie miar służących do badania jakości grupowania danych i zastosowanie tych miar do oceny segmentacji rynku. W wykonanych badaniach analizowano dane dotyczące rynków zbytu przedsiębiorstwa produkującego wyroby gospodarstwa domowego. Segmentację rynku przeprowadzono z wykorzystaniem sieci neuronowych Kohonena. W pracy przedstawiono wyniki grupowania danych oraz ich ocenę. Wnioski na temat jakości utworzonych klastrów są próbą ogólnej oceny przeprowadzonej segmentacji rynku.

EN

The purpose of this paper is to present measures used to evaluate the quality of data clustering and to apply them to the assessment of market segmentation. The analysis used data on the sales markets of a company manufacturing household products. The market segmentation was carried out using Kohonen neural networks. The paper describes the results of the clustering and their evaluation. The conclusions on the quality of the resulting clusters are an attempt at an overall assessment of the market segmentation.

10

Method of choosing the information technology system supporting management of the military aircraft operation

Barszcz P., Zieja M.

Prace Naukowe Instytutu Technicznego Wojsk Lotniczych | 2014 | nr 35, pp. 141--154

EN

The paper presents a method of choosing an information technology system whose task is to support the management of military aircraft operation. The proposed method is based on surveys conducted among direct users of the IT systems used in the aviation of the Polish Armed Forces. The survey results were analysed using statistical methods. The paper closes with practical conclusions on the further usefulness of the individual information technology systems. In the future, these conclusions may be useful when selecting the best solutions and integrating the information technology systems.

11

Zastosowanie metod eksploracji danych do segmentacji rynków

Setlak G., Paśko Ł.

Studia Informatica | 2013 | Vol. 34, nr 2A, pp. 311--323

PL

Celem niniejszego artykułu są przedstawienie i ocena możliwości wykorzystania metod eksploracji danych do segmentacji rynków zbytu. Przedstawiono segmentacje opisową i predykcyjną oraz przeanalizowano wyniki rozwiązywania zadań klasyfikacji i grupowania danych za pomocą sieci neuronowych Kohonena oraz drzew klasyfikacyjnych CART i CHAID. W pracy wykorzystano dane dotyczące rynków zbytu przedsiębiorstwa produkującego wyroby gospodarstwa domowego.

EN

The purpose of this paper is to present and evaluate the possibility of using data mining methods in the market segmentation process. Descriptive and predictive segmentation are presented, and the results of data classification and clustering are analyzed. The following methods were used in the analysis: Kohonen neural networks and the CART and CHAID classification trees. The analysis concerns the sales markets of a company manufacturing household products.

12

Analysis of medical data using dimensionality reduction techniques

Siwek K., Osowski S., Markiewicz T., Korytkowski J.

Przegląd Elektrotechniczny | 2013 | R. 89, nr 2a, pp. 279--281

EN

The paper presents the application of dimensionality reduction methods to multidimensional medical data describing images of blood cells in leukemia. Different reduction techniques, both linear and nonlinear, will be applied and their efficiency compared. Their application to the visualization of different classes, as well as to the clustering and classification of data, will be studied and discussed in the paper.

PL

Praca przedstawia zastosowanie różnych metod redukcji wymiaru danych w reprezentacji numerycznej deskryptorów charakteryzujących klasy komórek krwiotwórczych w białaczce. Porównane zostaną różne podejścia do redukcji oparte na metodach liniowych i nieliniowych transformacji. W szczególności analizie poddane zostaną możliwości zastosowania tych metod w wizualizacji danych jak również klasteryzacji i klasyfikacji. W pracy pokazane zostaną wyniki przeprowadzonych eksperymentów dotyczących 11 klas komórek.

13

Hybridization of the Gravitational Search Algorithm and Big Bang-Big Crunch Algorithm for Data Clustering

Hatamlou A., Hatamlou M.

Fundamenta Informaticae | 2013 | Vol. 126, nr 4, pp. 319--333

EN

Clustering is a very important technique in knowledge discovery. It has been widely used in data mining, image processing, machine learning, bioinformatics, marketing and other fields. Clustering divides objects into groups called clusters, based on certain criteria. The similarity of objects is high within the clusters, but low between the clusters. In this work, we investigate a hybridization of the gravitational search algorithm (GSA) and the big bang-big crunch algorithm (BB-BC) for data clustering. In the proposed approach, named GSA-BB, GSA is used to explore the search space to find the optimal locations of the cluster centroids. Whenever GSA loses its exploration ability, the BB-BC algorithm is used to diversify the population. The performance of the proposed method is compared with the GSA, BB-BC and K-means algorithms using six standard and real datasets taken from the UCI machine learning repository. Experimental results indicate a significant improvement in the quality of the clusters obtained by the proposed hybrid method over the non-hybrid methods.

14

Granular Computing Based on Gaussian Cloud Transformation

Liu Y., Li D., He W., Wang G.

Fundamenta Informaticae | 2013 | Vol. 127, nr 1-4, pp. 385--398

EN

Granular computing is one of the important methods for extracting knowledge from data and has achieved considerable success. However, it remains a puzzle for granular computing researchers how to imitate the human cognitive process of automatically choosing reasonable granularities when dealing with difficult problems. In this paper, a Gaussian cloud transformation method, based on the Gaussian Mixture Model and the Gaussian Cloud Model, is proposed to solve this problem. The Gaussian Mixture Model (GMM) is used to transform an original data set into a sum of Gaussian distributions, and the Gaussian Cloud Model (GCM) is used to represent the extension of a concept and measure its confusion degree. Extensive experiments on data clustering and image segmentation have been carried out to evaluate this method, and the results show its performance and validity.

15

Inteligentny model wskaźnika zagrożenia pożarowego w kopalni węgla

Mrozek B., Felka D.

Pomiary Automatyka Robotyka | 2012 | R. 16, nr 2, pp. 540-545

PL

Istotny wpływ na wykrywanie zagrożenia pożarowego przenośników taśmowych w kopalniach węgla mają wartości takich parametrów, jak: stężenie tlenku węgla (CO) i cyjanowodoru (HCN) oraz wartości sygnałów z czujników dymu. Wielkości te są uwzględniane podczas wyznaczania wartości wskaźnika zagrożenia pożarowego. Zbudowano rozmyty model wskaźnika zagrożenia pożarowego w oparciu o laboratoryjne dane pomiarowe wymienionych wielkości. Model rozmyty wygenerowano z danych numerycznych przy zastosowaniu czterech algorytmów rozmytej klasteryzacji, które zaimplementowano w kodzie środowiska MATLAB. Uzyskane wyniki pokazano w tabelach i na wykresach. Do budowy i wizualizacji projektowanego modelu rozmytego wykorzystano funkcje oraz interfejsy Fuzzy Logic Toolbox.

EN

The detection of fire hazard on belt conveyors in coal mines is significantly influenced by the values of parameters such as the concentration of carbon monoxide (CO), the concentration of hydrogen cyanide (HCN) and the signals from smoke detectors. These values are used to determine the fire risk index. A fuzzy model of the fire risk index was built based on laboratory measurement data. The fuzzy model was generated from these numerical data using four fuzzy clustering algorithms implemented in MATLAB code. The results are shown in tables and graphs. MATLAB and the Fuzzy Logic Toolbox library (functions and interfaces) were used to design and visualize the proposed fuzzy model.

16

Wyznaczanie wartości granicznych z wykorzystaniem metod grupowania danych

Targosz M., Timofiejczuk A.

Problemy Eksploatacji | 2011 | nr 2, pp. 213-221

PL

W artykule zaproponowano podejście do wyznaczenia wartości granicznych za pomocą algorytmów rozmytego grupowania danych. Wykorzystano algorytmy FCM, PCM oraz algorytm Gustafsona-Kessela. Eksperyment przeprowadzano na danych symulacyjnych. W tym celu zbudowano model numeryczny maszyny wirnikowej, symulującej określone stany i wielkości niewyważenia. Wyznaczone wartości graniczne porównano z wartościami otrzymanymi przy pomocy metody statystycznej. Wszystkie obliczenia wykonywano w środowisku Matlab-Simulink.

EN

The paper describes a methodology for estimating the limit values of characteristics of diagnostic signals using fuzzy data clustering methods (the FCM, PCM and Gustafson-Kessel algorithms). The experiment was conducted on simulated data, using a numerical model of a rotor machine that simulates given unbalance states. The estimated limit values were compared with values obtained using a statistical method.
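For readers unfamiliar with FCM (fuzzy c-means), the sketch below shows the standard membership formula for a single scalar observation, given cluster centres that are assumed to be already known. The centre values, the fuzzifier m = 2 and the interpretation of the clusters are assumptions for illustration only; how the authors derive limit values from the memberships is not stated in the abstract and is not reproduced here.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership degrees of scalar x in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i = |x - c_i|.
    The fuzzifier m must be greater than 1.
    """
    dists = [abs(x - c) for c in centers]
    # If x coincides with one or more centres, share full membership among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical centres of two clusters of a vibration feature (made-up values).
centers = [0.0, 10.0]
u_mid = fcm_memberships(5.0, centers)     # equally far from both centres
u_low = fcm_memberships(2.5, centers)     # closer to the first centre
```

Unlike a hard assignment, the memberships change smoothly with the observation, which is what makes fuzzy clustering a natural fit for locating thresholds between states.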

17

An application of expectation-maximization for model verification

Łukawska B., Łukawski G., Sapiecha K.

Annales Universitatis Mariae Curie-Skłodowska. Sectio AI, Informatica | 2010 | Vol. 10, no. 1, pp. 15--27

EN

A description that summarizes an entire, usually large, set of data is called its model. The problem investigated in the paper is the verification of models of data from a simulation experiment in selecting candidates for mobile robot operators (more precisely, building a reliable predictive model of the data). The models are validated using the train-and-test method and verified with the help of the EM (expectation-maximization) algorithm, which was originally designed for solving clustering problems with missing data. The selection is in fact a clustering problem, because the candidates are assigned to ‘chosen’, ‘accepted’ or ‘rejected’ subgroups. In this case the missing data is the category (the subgroup) to which a candidate should be assigned on the basis of his or her activity measured during the simulation experiment. The paper explains the procedure of model verification. It also shows experimental results and draws conclusions.
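The view of EM as clustering with a missing category label can be illustrated with a minimal sketch for a one-dimensional mixture of two Gaussians. Everything specific here is an assumption made for the example: two components instead of the three subgroups named in the abstract, the initialization from the data extremes, the fixed number of iterations and the made-up activity scores.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=50, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    n = len(data)
    overall_mean = sum(data) / n
    var0 = max(sum((x - overall_mean) ** 2 for x in data) / n, min_var)
    mus = [min(data), max(data)]          # deterministic initialization
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each hidden category per point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, mu, v)
                 for w, mu, v in zip(weights, mus, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, data)) / nk,
                min_var,
            )
    return weights, mus, variances, resp


# Made-up activity scores for ten candidates forming two clear groups.
scores = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, mus, variances, resp = em_two_gaussians(scores)
labels = [0 if r[0] >= r[1] else 1 for r in resp]
```

The responsibilities computed in the E-step play the role of the missing category: each candidate receives a probability of belonging to every subgroup, and a hard label can be read off as the most probable one.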

18

Statystyczne odkrywanie zależności w danych

Gramacki J., Gramacki A.

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne | 2008 | R. 81, nr 6, pp. 711-713

PL

Pokazano możliwość analizy zbioru danych numerycznych w aspekcie odkrywania niewidocznych związków pomiędzy tymi danymi. Posłużono się metodą analizy składowych głównych oraz wybranymi metodami grupowania danych. W pierwszym przykładzie przeanalizowano podobieństwo wybranych krajów UE w dziedzinie pozyskiwania przez nie energii ze źródeł odnawialnych. Posłużono się powszechnie dostępnymi danymi statystycznymi z baz Głównego Urzędu Statystycznego. W drugim przykładzie pokazano możliwość grupowania okresów zmienności notowań giełdowych. Posłużono się historycznymi (rok 1998) danymi dotyczącymi notowań wybranych indeksów giełdy amerykańskiej.

EN

In this paper we analyze some numerical data sets in order to uncover unknown or hidden relationships between them. We use principal component analysis as well as a hierarchical clustering method. In the first example we analyze the similarity of EU countries in the field of energy production from renewable sources. We use commonly available data from the Polish Central Statistical Office. In the second example we try to find groups of similar time periods based on US stock exchange data. We use historical (1998) quotations of selected stock exchange indexes.

19

Application of machine learning and soft computing techniques in monitoring systems' data analysis by example of dewater pumps monitoring system

Sikora M.

Archives of Control Sciences | 2007 | Vol. 17, no. 4, pp. 369-391

EN

The paper presents the application of machine learning methods to the creation of an equipment diagnostic model. A dewatering pump working in a deep mine pumping station has been chosen as the illustrative example. In the second section, the dewatering pump monitoring system is presented, and the need to create a pump diagnostic model is justified. The next sections present the application of a data clustering algorithm and a decision tree induction algorithm. Methods for reducing the resulting diagnostic model are also developed. The reduction leads to more legible data models. Results of the analysis carried out for two different types of pumps are presented in the last part of the paper.

20

Immune K-Means : a novel immune algorithm for data clustering and multiple-class discrimination

Bereta M., Burczyński T.

Prace Naukowe Politechniki Warszawskiej. Elektronika | 2006 | z. 156, pp. 49-60

EN

This paper presents a novel approach to data clustering and multiple-class classification problems. The proposed method is based on a metaphor derived from immune systems, the clonal selection paradigm. A novel clonal selection algorithm, Immune K-Means, is proposed. The proposed system is able to cluster real-valued data efficiently and correctly, dynamically estimating the number of clusters. In classification problems, discrimination among classes is based on the k-nearest neighbor method. Two different types of suppression are proposed. They enable the evolution of different populations of lymphocytes well suited to a given problem: clustering or classification. The first type of suppression enables the lymphocytes to discover the data distribution, while the second type focuses the lymphocytes on the class boundaries. Preliminary results on artificial data and a real-world benchmark dataset (Fisher's Iris Database), as well as a discussion of the parameters of the algorithm, are given.