Search results (3 found)
1
Feature Extraction for Dynamic Integration of Classifiers
EN
Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique of dynamic integration, in which local accuracy estimates are calculated for each base classifier of an ensemble in the neighborhood of a new instance to be processed. Generally, the whole space of original features is used to find the neighborhood of a new instance for the local accuracy estimates in dynamic integration. However, when dynamic integration takes place in high dimensions, the search for the neighborhood of a new instance is problematic, since most of the space is empty and neighbors can in fact be located far from each other. Furthermore, when noisy or irrelevant features are present, irrelevant neighbors are also likely to be associated with a test instance. In this paper, we propose to use feature extraction to cope with the curse of dimensionality in the dynamic integration of classifiers. We consider classical principal component analysis and two eigenvector-based class-conditional feature extraction methods that take class information into account. Experimental results show that, on some data sets, the use of FEDIC leads to significantly higher ensemble accuracies than plain dynamic integration in the space of original features.
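Below is a minimal sketch of the kind of procedure the abstract describes: base classifiers are trained on the original features, their per-instance correctness is estimated by cross-validation, and the neighborhood used for the local accuracy estimates is searched in a PCA-extracted feature space. The function names, parameter values, and choice of base learners are illustrative assumptions, not the authors' implementation.

# Sketch of dynamic selection of classifiers with local accuracy estimates
# computed in a PCA-extracted feature space (illustrative, not the FEDIC code).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors, KNeighborsClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

def fit_fedic_like(X_train, y_train, n_components=5, n_neighbors=7):
    """Train base classifiers and record their cross-validated correctness
    for every training instance; index training data in the extracted space."""
    pca = PCA(n_components=n_components).fit(X_train)
    Z_train = pca.transform(X_train)
    base = [DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier(3)]
    # 0/1 correctness of each base classifier on each training instance
    correct = np.array([
        (cross_val_predict(clf, X_train, y_train, cv=5) == y_train).astype(float)
        for clf in base
    ])                                    # shape: (n_classifiers, n_train)
    for clf in base:
        clf.fit(X_train, y_train)
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(Z_train)
    return pca, base, correct, nn

def predict_fedic_like(x_new, pca, base, correct, nn):
    """Pick, per instance, the base classifier with the highest local accuracy
    in the neighborhood found in the extracted-feature space."""
    z_new = pca.transform(x_new.reshape(1, -1))
    _, idx = nn.kneighbors(z_new)
    local_acc = correct[:, idx[0]].mean(axis=1)   # one estimate per classifier
    best = int(np.argmax(local_acc))
    return base[best].predict(x_new.reshape(1, -1))[0]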
2
EN
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic increase of computational complexity and classification error in high dimensions. In this paper, principal component analysis (PCA). parametric feature extraction (FE) based on Fisher's linear discriminant analysis (LDA), and their combination as means of dimensionality reduction are analysed with respect to the performance of different classifiers. Three commonly used classifiers are taken for analysis: ŁNN, Naive Bayes and C4.5 decision tree. Recently, it has been argued that it is extremely important to use class information in FE for supervised learning (SL). However, LDA-based FE, although using class information, has a serious shortcoming due to its parametric nature. Namely, the number of extracted components cannot be more that the number of classes minus one. Besides, as it can be concluded from its name, LDA works mostly for linearly separable classes only. In this paper we study if it is possible to overcome these shortcomings adding the most significant principal components to the set of features extracted with LDA. In experiments on 21 benchmark datasets from UCI repository these two approaches (PCA and LDA) are compared with each other, and with their combination, for each classifier. Our results demonstrate that such a combination approach has certain potential, especially when applied for C4.5 decision tree learning. However, from the practical point of view the combination approach cannot be recommended for Naive Bayes since its behavior is very unstable on different datasets.
3
Local Feature Selection with Dynamic Integration of Classifiers
EN
Multidimensional data is often heterogeneous across the feature space, so that individual features have unequal importance in different subareas of the feature space. This motivates the search for a technique that provides a strategic splitting of the instance space and is able to identify the best subset of features for each instance to be classified. Our technique applies the wrapper approach, in which a classification algorithm is used as an evaluation function to differentiate between different feature subsets. In order to make the feature selection local, we apply the recent technique for dynamic integration of classifiers. This makes it possible to determine which classifier and which feature subset should be used for each new instance. Decision trees are used to help restrict the number of feature combinations analyzed. For each new instance, we consider only those feature combinations that include the features present in the path taken by the new instance in the decision tree built on the whole feature set. We evaluate our technique on data sets from the UCI machine learning repository. In our experiments, we use the C4.5 algorithm as the learning algorithm for base classifiers and for the decision trees that guide the local feature selection. The experiments show some advantages of local feature selection with dynamic integration of classifiers in comparison with the selection of one feature subset for the whole space.
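The following is a simplified sketch of the path-guided feature restriction described above, assuming NumPy arrays: the features tested along a new instance's path in a decision tree built on the whole feature set are taken as its local feature subset, and a single stand-in base learner (Gaussian Naive Bayes here, in place of C4.5) is fit on that subset. The full wrapper evaluation over feature combinations and the dynamic integration step are omitted.

# Sketch: per-instance feature subset from the decision path of a guiding tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

def path_feature_subset(tree, x_new):
    """Indices of the features tested along the path x_new takes in the tree."""
    node_ids = tree.decision_path(x_new.reshape(1, -1)).indices
    feats = tree.tree_.feature[node_ids]
    return np.unique(feats[feats >= 0])          # drop leaf markers (-2)

def predict_locally(X_train, y_train, x_new):
    """Fit a local classifier on the features lying on the new instance's path."""
    guide = DecisionTreeClassifier().fit(X_train, y_train)
    subset = path_feature_subset(guide, x_new)
    local = GaussianNB().fit(X_train[:, subset], y_train)   # stand-in base learner
    return local.predict(x_new[subset].reshape(1, -1))[0]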