Search results - Biblioteka Nauki

1

A differential evolution approach to dimensionality reduction for classification needs

100%

Martinović G., Bajer D., Zorić B.

International Journal of Applied Mathematics and Computer Science | 2014 | Vol. 24, no. 1 | 111--122

EN

The feature selection problem often occurs in pattern recognition and, more specifically, in classification. Although patterns may contain a large number of features, some of them can prove irrelevant, redundant, or even detrimental to classification accuracy. Removing such features reduces the dimensionality of the problem and may ultimately improve classification accuracy. This paper presents a wrapper approach to dimensionality reduction in which differential evolution explores the solution space. Candidate solutions, which are subsets of the full feature set, are evaluated using the k-nearest neighbour algorithm. High-quality solutions found during the run of differential evolution are stored in an archive. The final solution is obtained by conducting k-fold cross-validation on the archived solutions and selecting the best one. Experimental analysis is conducted on several standard test sets. The classification accuracy of the k-nearest neighbour algorithm with the full feature set is compared with its accuracy using only the subset provided by the proposed approach, and with subsets provided by other optimization algorithms used as wrappers. The analysis shows that the proposed approach successfully determines good feature subsets which may increase the classification accuracy.
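The wrapper idea described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the DE/rand/1/bin variant, the 0.5 decoding threshold, leave-one-out evaluation, the archive ranking rule, and all function and parameter names (`knn_loo_accuracy`, `to_mask`, `de_feature_selection`, `archive_size`) are hypothetical choices. The final k-fold cross-validation over the archive is omitted for brevity.

```python
import random
from collections import Counter


def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN restricted to the features where mask is True."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = Counter(label for _, label in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def to_mask(vector):
    """Decode a real-valued DE vector into a feature subset (assumed rule: gene > 0.5 keeps it)."""
    return tuple(g > 0.5 for g in vector)


def de_feature_selection(X, y, pop_size=10, generations=20, F=0.5, CR=0.9,
                         archive_size=5, seed=0):
    """DE/rand/1/bin wrapper; returns the archive as (mask, accuracy) pairs, best first."""
    rng = random.Random(seed)
    d = len(X[0])
    pop = [[rng.random() for _ in range(d)] for _ in range(pop_size)]
    fit = [knn_loo_accuracy(X, y, to_mask(v)) for v in pop]
    archive = {to_mask(v): f for v, f in zip(pop, fit)}

    def rank(item):
        # Prefer higher accuracy, then fewer selected features.
        return (-item[1], sum(item[0]))

    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for p in range(pop_size) if p != i], 3)
            j_rand = rng.randrange(d)  # guarantees at least one mutated gene
            trial = [
                min(1.0, max(0.0, pop[a][j] + F * (pop[b][j] - pop[c][j])))
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(d)
            ]
            f_trial = knn_loo_accuracy(X, y, to_mask(trial))
            if f_trial >= fit[i]:  # greedy one-to-one replacement
                pop[i], fit[i] = trial, f_trial
                archive[to_mask(trial)] = f_trial
        # Keep only the best few subsets seen so far.
        archive = dict(sorted(archive.items(), key=rank)[:archive_size])
    return sorted(archive.items(), key=rank)


# Toy data: feature 0 separates the classes, features 1 and 2 are noise.
noise = random.Random(1)
X = [[c * 10 + i / 10, noise.uniform(0, 10), noise.uniform(0, 10)]
     for c in (0, 1) for i in range(6)]
y = [c for c in (0, 1) for _ in range(6)]
archive = de_feature_selection(X, y)
best_mask, best_acc = archive[0]
```

On real data, the archived subsets would then be re-scored with k-fold cross-validation, as the abstract describes, before choosing the final one.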

2

Instance based kNN modification for classification of medical data

100%

Orczyk T., Porwik P., Lewandowski M., Cholewa M.

Journal of Medical Informatics & Technologies | 2015 | Vol. 24 | 99--106

EN

The paper describes a novel modification of the well-known kNN algorithm that enables its use on medical data, which are often class-imbalanced and contain randomly missing values. It presents the details of the modified algorithm, the experimental setup, and the results obtained in cross-validated classification of a benchmark database from which values (missing data) and records (class imbalance) were randomly removed, and compares them with the results of state-of-the-art classification algorithms.

3

A practical application of kernel-based fuzzy discriminant analysis

100%

Gao J. Q., Fan L. Y., Li L., Xu L. Z.

International Journal of Applied Mathematics and Computer Science | 2013 | Vol. 23, no. 4 | 887--903

EN

A novel method for feature extraction and recognition called Kernel Fuzzy Discriminant Analysis (KFDA) is proposed in this paper to deal with recognition problems, e.g., for images. The KFDA method combines the advantages of fuzzy methods and the kernel trick. Based on the orthogonal-triangular (QR) decomposition of a matrix and on Singular Value Decomposition (SVD), two variants of KFDA are obtained: KFDA/QR and KFDA/SVD. In the proposed method, the membership degree is incorporated into the definitions of the between-class and within-class scatter matrices, yielding fuzzy between-class and within-class scatter matrices. The membership degree is obtained by combining measures of the features of the sample data. In addition, the effect of employing different measures is investigated from a purely mathematical point of view, and the t-test is used to compare the robustness of the learning algorithm. Experimental results on the ORL and FERET face databases show that KFDA/QR and KFDA/SVD are more effective and feasible than Fuzzy Discriminant Analysis (FDA) and Kernel Discriminant Analysis (KDA) in terms of the mean correct recognition rate.

4

Application of the hypercube-based mini-model method to modelling multidimensional data

100%

Pietrzykowski M.

No. 38 | 91--103

PL

The article presents a self-learning mini-model method (the MM method) based on hyper-solids in multidimensional space. The method is new and still developing, and is the subject of intensive research. It relies on samples taken only from the local neighbourhood of the query point, not from regions distant from that point. The group of points used to train a mini-model is bounded by the region of a hyper-solid. On the local neighbourhood of the query point defined in this way, the MM method can use any approximation method for training and for computing the answer. The article presents the algorithm for training and running the method in multidimensional space, based on the hyperspherical coordinate system. The method was tested on multidimensional data sets, and the results were compared with other sample-based methods.

EN

The article presents a self-learning method of mini-models (the MM method) based on polytopes in multidimensional space. The method is new and is the subject of intensive research. The MM method is an instance-based learning method and uses data samples only from the local neighborhood of the query point. The group of points used in the model-learning process is bounded by a polytope. Within this local area, the MM method can use any approximation algorithm to compute the mini-model's answer for the query point. The article describes a learning technique based on the hyperspherical coordinate system. The method was applied to modelling tasks with multidimensional data sets, and the results of numerical experiments were compared with those of other instance-based methods.
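The core idea of a local model with a pluggable approximator can be illustrated as follows. This is only a sketch under simplifying assumptions: the neighbourhood here is a plain ball of fixed radius rather than the polytope built in hyperspherical coordinates that the article describes, no learning of the neighbourhood shape takes place, and all names (`local_neighbourhood`, `mean_model`, `linear_model_1d`, `mini_model_predict`) are hypothetical.

```python
import math


def local_neighbourhood(X, y, query, radius):
    """Samples whose inputs lie within `radius` of the query point (assumed ball-shaped region)."""
    return [(x, t) for x, t in zip(X, y) if math.dist(x, query) <= radius]


def mean_model(samples, query):
    """Simplest local approximator: the average of the local targets."""
    return sum(t for _, t in samples) / len(samples)


def linear_model_1d(samples, query):
    """Least-squares line fitted to the local samples (1-D inputs only)."""
    xs = [x[0] for x, _ in samples]
    ts = [t for _, t in samples]
    n = len(xs)
    mx, mt = sum(xs) / n, sum(ts) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return mt  # all inputs coincide, fall back to the mean
    slope = sum((x - mx) * (t - mt) for x, t in zip(xs, ts)) / sxx
    return mt + slope * (query[0] - mx)


def mini_model_predict(X, y, query, radius, approximator=mean_model):
    """Answer a query using only nearby samples and any approximator passed in."""
    samples = local_neighbourhood(X, y, query, radius)
    if not samples:
        raise ValueError("no training samples inside the neighbourhood")
    return approximator(samples, query)


# Toy usage: samples of t = 2x + 1 on [0, 2], queried at x = 1.
X = [(i / 10,) for i in range(21)]
y = [2 * x[0] + 1 for x in X]
prediction = mini_model_predict(X, y, (1.0,), radius=0.25,
                                approximator=linear_model_1d)
```

Passing the approximator as a parameter mirrors the abstract's point that any approximation algorithm can be used inside the local area.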

5

Automatic plane detection in a point cloud based on the RANSAC algorithm and elements of graph theory

80%

Poręba M., Goulette F.

Archiwum Fotogrametrii, Kartografii i Teledetekcji | 2012 | Vol. 24 | 301--310

PL

The article presents a method for automatically extracting points that model planes in point clouds acquired by mobile or static laser scanning. The proposed algorithm is based on the robust RANSAC estimator, which enables iterative plane detection in a data set with a considerable level of measurement noise and a considerable number of outliers. To optimize its operation, the neighbourhood relations between the points belonging to each detected plane are taken into account. For this purpose an approach based on graph theory is applied, in which the point cloud is treated as an undirected graph whose connected components are sought. The modification introduces two additional stages: determining the nearest neighbours of each point of the detected plane together with constructing the adjacency list, and labelling the connected components. The results show that the algorithm correctly detects the modelling planes, provided that the initial parameters are chosen appropriately. Processing time depends primarily on the number of points in the cloud. However, the sensitivity of the RANSAC algorithm to low cloud density and to uneven point distribution remains an open problem.

EN

Laser scanning techniques play a very important role in acquiring spatial data. Once the point cloud is available, the data must be processed to obtain the final products. Segmentation is an indispensable step in point cloud analysis, separating fragments that share the same semantic meaning. Existing methods of 3D segmentation fall into two categories. The first contains algorithms that work on the principle of fusion, such as the surface growing approach or the split-merge algorithm. The second consists of techniques that extract features defined by geometric primitives, e.g. a sphere, cone or cylinder. The Hough transform and the RANSAC (RANdom SAmple Consensus) algorithm belong to the second group. This paper studies point cloud segmentation techniques for fully automatic plane detection. The proposed method is based on the RANSAC algorithm, which provides iterative plane modelling in a point cloud affected by considerable noise. The algorithm is applied sequentially, so that at each step the plane supported by the largest number of points is separated. Despite its advantages, RANSAC sometimes gives erroneous results. It looks for the best plane without taking the particular characteristics of the object into account and may consequently combine points belonging to different objects into a single plane. Hence, the RANSAC algorithm is optimized by analysing the adjacency relationships of neighbouring points for each plane. An approach based on graph theory is proposed, in which the point cloud is treated as an undirected graph from which connected components are extracted. The introduced method consists of three main steps: identification of the k-nearest neighbours of each point of the detected plane, construction of the adjacency list, and connected component labelling. The algorithm was tested on raw, unfiltered point clouds. All numerical tests were performed on real data of different resolutions, derived from both mobile and static laser scanning. The results show that the proposed algorithm properly separates the points of particular planes, while processing time depends strictly on the number of points in the point cloud. Nevertheless, the susceptibility of the RANSAC algorithm to low point cloud density and irregular point distribution remains an important problem. The paper also contains a review of the literature on existing methods for plane detection in data sets, and presents the proposed RANSAC-based algorithm, its principle, and the results.
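The pipeline outlined in this abstract (RANSAC plane detection, then a k-nearest-neighbour adjacency list, then connected component labelling) can be sketched as below. It is an illustrative toy under assumptions, not the authors' code: the distance threshold, iteration count, value of k, brute-force neighbour search, and the names `plane_through`, `ransac_plane` and `knn_components` are all hypothetical, and only a single plane is extracted.

```python
import random
from collections import deque


def plane_through(p, q, r):
    """Unit normal n and offset d of the plane n.x + d = 0 through three points (None if collinear)."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(c * c for c in n) ** 0.5
    if length < 1e-12:
        return None
    n = [c / length for c in n]
    return n, -sum(n[i] * p[i] for i in range(3))


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Indices of the points supporting the best plane found by random sampling."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        model = plane_through(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = [i for i, p in enumerate(points)
                   if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


def knn_components(points, indices, k=3):
    """Connected components of the undirected k-NN graph built over points[indices]."""
    adjacency = {i: set() for i in indices}
    for i in indices:
        nearest = sorted(
            (sum((points[i][c] - points[j][c]) ** 2 for c in range(3)), j)
            for j in indices if j != i
        )[:k]
        for _, j in nearest:
            adjacency[i].add(j)
            adjacency[j].add(i)  # undirected edge
    components, seen = [], set()
    for start in indices:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first labelling of one component
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components


# Toy cloud: two separate coplanar patches (z = 0) plus a few off-plane outliers.
patch_a = [(0.1 * i, 0.1 * j, 0.0) for i in range(4) for j in range(4)]
patch_b = [(10 + 0.1 * i, 0.1 * j, 0.0) for i in range(4) for j in range(4)]
outliers = [(5.0, 0.1, 5.0), (5.0, 0.2, 6.0), (4.0, 0.0, 7.0), (6.0, 0.3, 8.0)]
cloud = patch_a + patch_b + outliers
plane = ransac_plane(cloud)
parts = knn_components(cloud, plane)
```

In the toy cloud, RANSAC alone merges both patches into one plane because they are coplanar; the graph step then splits that plane into its spatially separate pieces, which is the failure mode the abstract says the modification addresses.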

6

Influence of accelerometer signal preprocessing and classification method on human activity recognition

80%

Kupryjanow A., Kaszuba K., Czyżewski A.

Elektronika : konstrukcje, technologie, zastosowania | 2010 | Vol. 51, no. 3 | 18-23

EN

A study of the influence of data preprocessing on accelerometer-based human activity recognition algorithms is presented. The frequency band used to filter the accelerometer signals and the number of accelerometers involved were considered in terms of their influence on recognition accuracy. Four classification methods were used in the tests: support vector machine, decision trees, neural network, and k-nearest neighbor.

PL

The article presents the influence of acceleration signal preprocessing on the effectiveness of recognizing physical activities. The dependence of classification effectiveness on signal filtering and on the number of sensors used was analysed. Four different classifiers were used in the study: a support vector machine, decision trees, artificial neural networks, and the nearest neighbour classifier.

7

Data classification: reduction and editing algorithms for data sets using the representativeness measure

70%

Raniszewski M.

Zeszyty Naukowe. Elektryka / Politechnika Łódzka | 2010 | z. 121 | 463-486

PL

Data classification is decision making based on the information the data carry (the so-called data features). Correct and fast classification depends on proper preparation of the data set and on the choice of a suitable classification algorithm. One such algorithm is the popular nearest neighbour (NN) algorithm. Its advantages are simplicity, intuitiveness and a wide range of applications. Its drawbacks are large memory requirements and reduced speed on very large data sets. Reduction algorithms remove a considerable part of the elements from the data set, which significantly speeds up the NN algorithm, while retaining those on the basis of which the data can still be classified with satisfactory quality. Editing algorithms clean the data set of redundant and erroneous elements. The article presents a reduction algorithm and an editing algorithm for data sets, both using the representativeness measure. Tests were carried out on several data sets of various sizes that are well known in the literature. The results obtained are promising. They are compared with the results of other popular reduction and editing algorithms.

EN

In data classification, decisions are made on the basis of data features. Proper and fast classification depends on the preparation of the data set and the selection of a suitable classification algorithm. One such algorithm is the popular Nearest Neighbor (NN) rule. Its advantages are simplicity, intuitiveness and a wide range of applications. Its disadvantages are large memory requirements and decreased speed on large data sets. Reduction algorithms remove much of the data, which significantly speeds up NN, while retaining the data on which decisions can still be made with acceptable classification quality. Editing algorithms remove redundant and atypical data from a data set. This paper presents new reduction and editing algorithms, both using the representative measure. Tests were performed on several data sets of different sizes that are well known in the literature. The results are promising. They were compared with the results of other popular reduction and editing procedures.
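To make the distinction between editing and reduction concrete, here is a small sketch of two classical procedures: Wilson-style editing and Hart-style condensing. These are stand-ins chosen for illustration; they are not the representative-measure algorithms proposed in the paper, whose details the abstract does not give. The function names and the toy data are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_label(x, prototypes):
    """1-NN rule: label of the prototype closest to x."""
    return min(prototypes, key=lambda p: sq_dist(x, p[0]))[1]


def edited_nn(data, k=3):
    """Wilson-style editing: drop samples that disagree with the k-NN vote of the rest."""
    kept = []
    for i, (x, label) in enumerate(data):
        others = sorted((sq_dist(x, xo), lo)
                        for j, (xo, lo) in enumerate(data) if j != i)
        votes = Counter(lo for _, lo in others[:k])
        if votes.most_common(1)[0][0] == label:
            kept.append((x, label))
    return kept


def condensed_nn(data):
    """Hart-style reduction: grow a store until no unstored sample is misclassified by 1-NN."""
    store = [data[0]]
    changed = True
    while changed:
        changed = False
        for sample in data:
            x, label = sample
            if sample not in store and nearest_label(x, store) != label:
                store.append(sample)
                changed = True
    return store


# Toy 1-D data: two well-separated classes and one mislabelled sample at 0.25.
data = ([((i / 10,), 0) for i in range(5)]
        + [((5 + i / 10,), 1) for i in range(5)]
        + [((0.25,), 1)])
edited = edited_nn(data)        # editing removes the atypical sample
reduced = condensed_nn(edited)  # reduction keeps a small consistent subset
```

Editing first and reducing second is one common ordering: the atypical sample is removed before it can force extra prototypes into the reduced set.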