Results found: 4

Search results
Searched for:
in keywords: klasyfikacja wieloklasowa (multi-class classification)
1
Drought classification using gradient boosting decision tree
This paper compares the classification and prediction capabilities of decision tree (DT), genetic programming (GP), and gradient boosting decision tree (GBT) techniques for one-month-ahead prediction of the standardized precipitation index in Ankara province and the standardized precipitation evaporation index in the central Antalya region. The evolved models were developed based on multi-station prediction scenarios in which observed (reanalyzed) data from nearby stations (grid points) were used to predict drought conditions at a target location. To tackle the rare occurrence of extreme dry/wet conditions, the drought series at the target location was categorized into three classes of wet, normal, and dry events. The new models were trained and validated using the first 70% and last 30% of the datasets, respectively. The results demonstrated the promising performance of GBT for meteorological drought classification: GBT performs better than DT and GP in Ankara, although GP predictions for Antalya were more accurate in the testing period. The results also showed that the proposed GP model with a scaled sigmoid function at the root can effortlessly classify and predict the number of dry, normal, and wet events in both case studies.
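A minimal sketch of a pipeline in the spirit of this abstract, assuming scikit-learn and purely synthetic placeholder data; the feature matrix, class thresholds, and hyperparameters below are illustrative stand-ins, not the study's SPI/SPEI series or settings:

```python
# Hedged sketch: three-class (dry / normal / wet) classification with a
# gradient-boosted tree ensemble and a chronological 70/30 split, loosely
# following the setup described in the abstract. The data below is synthetic;
# in the paper it would be lagged drought-index values from nearby stations.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n_months, n_stations = 600, 5
X = rng.normal(size=(n_months, n_stations))       # predictors from nearby stations (toy)
y = np.digitize(X.mean(axis=1), [-0.5, 0.5])      # 0 = dry, 1 = normal, 2 = wet (toy labels)

split = int(0.7 * n_months)                       # first 70% for training, last 30% for testing
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

gbt = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, max_depth=3)
gbt.fit(X_train, y_train)
print(classification_report(y_test, gbt.predict(X_test),
                            target_names=["dry", "normal", "wet"]))
```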
2
We propose new methods for support vector machines using a tree architecture for multi-class classification. In each node of the tree, we select an appropriate binary classifier, using entropy and generalization error estimation, then group the examples into positive and negative classes based on the selected classifier, and train a new classifier for use in the classification phase. The proposed methods can work in time complexity between O(log₂ N) and O(N), where N is the number of classes. We compare the performance of our methods with traditional techniques on the UCI machine learning repository using 10-fold cross-validation. The experimental results show that the methods are very useful for problems that need fast classification time or those with a large number of classes, since the proposed methods run much faster than the traditional techniques but still provide comparable accuracy.
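A rough sketch of the general tree-of-binary-classifiers idea, assuming scikit-learn's SVC; the naive halving of the class set below is only a placeholder for the paper's entropy- and generalization-error-based node selection:

```python
# Hedged sketch: a binary tree over class labels, where each internal node
# holds an SVM trained to route examples toward the half of the classes they
# belong to. Not the authors' exact node-selection criterion.
import numpy as np
from sklearn.svm import SVC

class ClassTreeNode:
    def __init__(self, classes):
        self.classes = list(classes)
        self.clf = self.left = self.right = None

    def fit(self, X, y):
        if len(self.classes) == 1:                 # leaf: a single class remains
            return self
        mid = len(self.classes) // 2               # naive split; the paper selects it more carefully
        left_cls, right_cls = self.classes[:mid], self.classes[mid:]
        mask = np.isin(y, left_cls)
        target = np.where(mask, 0, 1)              # 0 -> left subtree, 1 -> right subtree
        self.clf = SVC(kernel="rbf").fit(X, target)
        self.left = ClassTreeNode(left_cls).fit(X[mask], y[mask])
        self.right = ClassTreeNode(right_cls).fit(X[~mask], y[~mask])
        return self

    def predict_one(self, x):
        if self.clf is None:                       # reached a leaf
            return self.classes[0]
        branch = self.left if self.clf.predict(x.reshape(1, -1))[0] == 0 else self.right
        return branch.predict_one(x)
```

Building the tree with `ClassTreeNode(sorted(set(y))).fit(X, y)` and classifying with `predict_one` walks a single root-to-leaf path, i.e. roughly log₂ N binary evaluations when the splits stay balanced, which is where the speed-up the abstract reports comes from.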
3
The simplest classification task is to divide a set of objects into two classes, but most problems found in real-life applications are multi-class. There are many methods for decomposing such a task into a set of smaller classification problems involving only two classes. Among them, pairwise coupling, proposed by Hastie and Tibshirani (1998), is one of the best known. Its principle is to separate each pair of classes while ignoring the remaining ones. All objects are then tested against these classifiers, and a voting scheme combines the pairwise class probability estimates into a joint probability estimate for all classes. A closer look at the pairwise strategy reveals a problem that affects the final result: each binary classifier votes for every object, even when the object belongs to neither of the two classes the classifier was trained on. Our strategy addresses this problem. We propose using additional classifiers to select the objects that will be considered by the pairwise classifiers. A similar solution was proposed by Moreira and Mayoraz (1998), but their classifiers are biased by the imbalance in the number of samples representing the classes.
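A minimal sketch of the plain pairwise (one-versus-one) voting baseline the abstract builds on, using scikit-learn's LogisticRegression as a stand-in binary learner; the `selector` hook only marks where additional selection classifiers could decide which objects a given pair votes on, and is not the authors' published method:

```python
# Hedged sketch: one-vs-one decomposition with majority voting. Every pair of
# classes gets its own binary classifier; at prediction time each classifier
# casts one vote per object it is allowed to see.
from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_pairwise(X, y):
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = np.isin(y, [a, b])                          # train only on the two classes
        models[(a, b)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    return models

def predict_pairwise(models, X, selector=None):
    classes = sorted({c for pair in models for c in pair})
    votes = np.zeros((len(X), len(classes)))
    for (a, b), clf in models.items():
        # selector (hypothetical hook) can restrict which objects this pair votes on
        keep = np.ones(len(X), dtype=bool) if selector is None else selector(X, (a, b))
        pred = clf.predict(X[keep])
        for cls in (a, b):
            votes[np.flatnonzero(keep)[pred == cls], classes.index(cls)] += 1
    return np.array(classes)[votes.argmax(axis=1)]
```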
4
A major problem in applying DNA microarrays to classification is the high dimensionality of the dataset. Recently we proposed a gene selection method based on Partial Least Squares (PLS) for finding the genes best suited for classification. The new idea is to use PLS not only as a multi-class approach, but also to construct additional binary selections using one-versus-rest and one-versus-one schemes. Ranked gene lists are highly unstable in the sense that a small change in the dataset often leads to a large change in the resulting ordered list. In this article, we assess the stability of our approaches. We compare the variability of the ordered lists obtained by the proposed methods with the well-known Recursive Feature Elimination (RFE) method and the classical t-test. This paper focuses on the effective identification of informative genes. As a result, a new strategy for finding a small subset of significant genes is designed. Our results on real cancer data show that our approach achieves very high accuracy for different combinations of classification methods while at the same time yielding very stable feature rankings.
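A hedged sketch of one way to probe the stability of a ranked gene list, assuming scikit-learn's PLSRegression and a one-versus-rest binary target; the subsampling scheme and top-k overlap measure are illustrative choices, not the authors' exact assessment protocol:

```python
# Hedged sketch: rank features by the absolute loading weights of a
# one-component PLS fit, repeat the ranking on perturbed subsamples, and
# measure how much the top-k lists overlap across repetitions.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def pls_ranking(X, y_binary):
    pls = PLSRegression(n_components=1).fit(X, y_binary.astype(float))
    weights = np.abs(pls.x_weights_[:, 0])        # importance of each gene
    return np.argsort(weights)[::-1]              # genes ordered by importance

def top_k_overlap(X, y_binary, k=50, n_repeats=20, frac=0.9, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    lists = []
    for _ in range(n_repeats):
        idx = rng.choice(n, size=int(frac * n), replace=False)   # perturbed subsample
        lists.append(set(pls_ranking(X[idx], y_binary[idx])[:k]))
    pairs = [(a, b) for i, a in enumerate(lists) for b in lists[i + 1:]]
    return np.mean([len(a & b) / k for a, b in pairs])           # 1.0 = perfectly stable
```

A more stable selection method should yield a mean top-k overlap closer to 1 under such resampling, which is the kind of comparison the abstract makes against RFE and the t-test.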