Results found: 7
Search results
Searched for:
in keywords: class imbalance
EN
The imbalance and complexity of network traffic data are pressing issues in intrusion detection. To improve the detection rate of minority class attacks in network traffic, this paper presents an intrusion detection method based on the recombination generative adversarial network (RGAN). Dual-stage game learning is used to optimize the discriminator for efficient identification of attack samples. In the first stage, the proposed model trains a deep convolutional generative adversarial network (DCGAN) integrated with the self-attention (SA) mechanism, and simultaneously trains an independent convolutional neural network (CNN) classifier integrated with the gated recurrent unit (GRU). After this stage, the generator can produce minority class attack samples that closely resemble real samples, and the independent classifier has basic classification ability. In the second stage, the generator of the DCGAN and the independent classifier together constitute the second layer of the model, a generative adversarial network. Through dual-stage game learning, the classifier's ability to discriminate minority samples is optimized, and its output serves as the final output of the discriminator. In addition, introducing a reconstruction loss helps keep the false positive rate down. Experimental results on the CSE-IDS-2018 dataset show that the model performs well compared with various other intrusion detection techniques in terms of detection accuracy, recall, and F1-score for minority class attacks.
EN
In practical applications of machine learning, the class distribution of the collected training set is usually imbalanced, i.e., the classes differ greatly in size. The class imbalance problem often severely limits the generalization performance achievable by most classifier learning algorithms. Several effective approaches have been proposed in the literature to improve learning performance, among which the recently presented GAN-based oversampling methods are very representative. However, the minority class examples they generate risk being highly similar to one another or duplicated. To further improve the quality of the generated minority class examples, i.e., to make them effectively expand the minority class region, a novel oversampling approach named GWGAN-GP is proposed. It is based on a Gaussian distribution label within the framework of a Wasserstein generative adversarial network with gradient penalty (WGAN-GP). The GWGAN-GP approach takes the Gaussian distribution as an input label, which makes the generated examples more diverse and more widely dispersed. The generated examples are then combined with the original dataset to form a balanced dataset, which is used to evaluate the classification performance of three selected classification algorithms. Experimental results on 16 imbalanced datasets show that GWGAN-GP not only generates examples that better conform to the distribution of the original dataset, but also achieves superior classification performance. In particular, when combined with the KNN classifier, GWGAN-GP significantly outperforms the other oversampling approaches considered in the study.
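For orientation, the standard WGAN-GP critic objective that this framework builds on can be written as below. This is the generic formulation, not the GWGAN-GP loss itself; how the Gaussian distribution label enters the objective is not stated in the abstract. The symbols $P_r$ (real data distribution), $P_g$ (generator distribution), $D$ (critic), and $\lambda$ (penalty weight) are notation introduced here for illustration.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes critic gradients whose norm deviates from 1 at points interpolated between real and generated examples.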
EN
Covid-19 has spread across the world, and many different vaccines have been developed to counter its surge. To identify the sentiments associated with the vaccines in social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets about Covid-19 vaccines. Specifically, we use the RoBERTa, XLNet, and BERT pre-trained transformer models, and the domain-specific CT-BERT and BERTweet transformer models that have been pre-trained on Covid-19 tweets. We further explore text augmentation by oversampling with the language model-based oversampling technique (LMOTE) to improve the accuracy of these models, specifically for small-sample data sets with an imbalanced class distribution among the positive, negative, and neutral sentiment classes. Our results summarize our findings on the suitability of text oversampling for imbalanced, small-sample data sets used to fine-tune state-of-the-art pre-trained transformer models, as well as the utility of domain-specific transformer models for the classification task.
EN
The article concerns the problem of imbalanced data classification. A new algorithm is presented and tested. The HImbA technique is a hybrid method that uses the well-known SMOTE algorithm and a modified k-nearest neighbours method. 28 datasets were preprocessed using HImbA and 10 variants of existing techniques, then classified using two algorithms (C4.5 and SMO), and the results were compared. The new algorithm turned out to give the best results for some datasets.
PL (translated)
The paper concerns class-size imbalance in the classification problem. A new algorithm is presented and tested. The HImbA technique is a hybrid method that combines the well-known SMOTE algorithm with a modified version of the k-nearest neighbours method. It was applied, together with ten variants of existing techniques, to preprocess 28 datasets, which were then classified (using two algorithms, C4.5 and SMO), and the results were compared. For selected datasets, the new algorithm gave the best results.
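As a rough illustration of the SMOTE step that HImbA builds on, the sketch below generates synthetic minority examples by interpolating between a minority point and one of its nearest minority neighbours. It covers plain SMOTE only; the modified k-nearest neighbours part of HImbA is not described in the abstract and is not reproduced here. The function name `smote_sketch` and its parameters are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points from minority examples X_min (SMOTE-style).

    Hypothetical helper for illustration; requires at least two minority examples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other points

    # Pairwise Euclidean distances inside the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base point and one of its k neighbours for every new example.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

X_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(X_minority, n_new=10, k=2)
print(synthetic.shape)
```

Because each synthetic point lies on a segment between two existing minority examples, the new points stay inside the region already spanned by the minority class.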
Post-processing of BRACID Rules Induced from Imbalanced Data
EN
Rule-based classifiers constructed from imbalanced data often fail to correctly classify instances from the minority class. Solutions to this problem should deal with both data and algorithmic difficulty factors. The new algorithm BRACID addresses these factors more comprehensively than other proposals. The experimental evaluation of BRACID's classification abilities shows that it significantly outperforms other rule approaches specialized for imbalanced data. However, it may generate too many rules, which hinders human interpretation of the discovered rules. A method for post-processing BRACID rules is therefore presented. It aims at selecting rules characterized by high support, in particular for the minority class, and covering diversified subsets of examples. Experimental studies confirm its usefulness.
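The selection goal described above (high support, emphasis on the minority class, diversified coverage) can be sketched as a greedy cover. This is only an illustration under assumptions, not the post-processing method of the paper: the rule representation, the function `select_rules`, and the parameters `min_support` and `minority_bonus` are all hypothetical.

```python
def select_rules(rules, min_support=2, minority_bonus=2.0, minority_label="min"):
    """Greedily pick rules that cover many not-yet-covered examples.

    rules: list of (name, class_label, covered_ids), where covered_ids is a set
    of example ids the rule covers. All names here are illustrative assumptions.
    """
    # Keep only rules with enough support.
    candidates = [r for r in rules if len(r[2]) >= min_support]
    covered, selected = set(), []

    while candidates:
        def gain(rule):
            _, label, ids = rule
            weight = minority_bonus if label == minority_label else 1.0
            # Only newly covered examples count, which favours diversity.
            return weight * len(ids - covered)

        best = max(candidates, key=gain)
        if gain(best) == 0:
            break  # remaining rules add no new coverage
        selected.append(best[0])
        covered |= best[2]
        candidates.remove(best)
    return selected

rules = [
    ("r1", "min", {1, 2, 3}),
    ("r2", "min", {2, 3}),
    ("r3", "maj", {4, 5, 6, 7}),
    ("r4", "maj", {4}),
    ("r5", "min", {8, 9}),
]
print(select_rules(rules))
```

In this toy input, `r4` is dropped for low support and `r2` is skipped because everything it covers is already covered by `r1`.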
EN
The article concerns the problem of imbalanced data classification. Two algorithms improving the standard SMOTE method have been created and tested. To measure the distance between objects, either the Euclidean metric or the HVDM (Heterogeneous Value Difference Metric) was applied, depending on the number of nominal attributes in a dataset.
PL (translated)
The article concerns classification when the classes are imbalanced. To this end, two algorithms were created that improve on the results obtained with the standard SMOTE algorithm. To measure the distance between objects, the Euclidean metric or the HVDM metric was used, depending on the number of nominal features in the dataset.
EN
The paper addresses the problem of improving the performance of rule-based classifiers constructed from imbalanced data sets, i.e., data sets where the minority class of primary importance is under-represented in comparison to the majority classes. We introduce two techniques to detect and process inconsistent examples from the majority classes in the boundary between the minority and majority classes. The two techniques differ in how they process inconsistent boundary examples from the majority classes: the first removes them, while the second relabels them as belonging to the minority class. The experiments showed that the best results were obtained for the filtering technique in which inconsistent majority class examples were reassigned to the minority class, combined with a classifier composed of decision rules generated by the MODLEM algorithm.
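The two processing options (remove versus relabel) can be sketched as follows. The abstract does not say how inconsistent boundary examples are detected, so this sketch substitutes a simple k-nearest-neighbour vote as a stand-in assumption: a majority example is flagged when most of its neighbours belong to the minority class. The function `handle_boundary_majority` and its parameters are hypothetical.

```python
import numpy as np

def handle_boundary_majority(X, y, minority=1, k=3, mode="remove"):
    """Remove or relabel majority examples surrounded by minority neighbours.

    Detection by k-NN vote is an assumption made for this sketch only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # k nearest neighbours of every example (Euclidean, excluding itself).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]

    # Flag majority examples whose neighbourhood is mostly minority.
    minority_votes = (y[nn] == minority).sum(axis=1)
    flagged = (y != minority) & (minority_votes > k / 2)

    if mode == "remove":
        return X[~flagged], y[~flagged]
    if mode == "relabel":
        y_new = y.copy()
        y_new[flagged] = minority
        return X, y_new
    raise ValueError("mode must be 'remove' or 'relabel'")

X = [[0, 0], [0, 1], [1, 0], [0.4, 0.4], [5, 5], [5, 6], [6, 5], [6, 6]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
X_removed, y_removed = handle_boundary_majority(X, y, mode="remove")
X_same, y_relabelled = handle_boundary_majority(X, y, mode="relabel")
print(len(y_removed), int((y_relabelled == 1).sum()))
```

Relabelling keeps the data set size unchanged and enlarges the minority class, whereas removal only cleans the boundary.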