Search results - BazTech

1

Boosting and Comparing Performance of Machine Learning Classifiers with Meta-heuristic Techniques to Detect Code Smell

Jain Shivani, Saha Anju

e-Informatica Software Engineering Journal

|

2024

|

Vol. 18, no. 1

24107

EN

Background: Continuous modifications, suboptimal software design practices, and stringent project deadlines contribute to the proliferation of code smells. Detecting and refactoring these code smells is pivotal to maintaining complex and essential software systems. Neglecting them may lead to future software defects, rendering systems challenging to maintain and eventually obsolete. Supervised machine learning techniques have emerged as valuable tools for classifying code smells without needing expert knowledge or fixed threshold values. Classifier performance can be enhanced further through effective feature selection techniques and the optimization of hyperparameter values. Aim: The performance measures of multiple machine learning classifiers are improved by fine-tuning their hyperparameters using various types of meta-heuristic algorithms, including swarm-intelligence, physics-, math-, and bio-based algorithms. Their performance measures are compared to find the best meta-heuristic algorithm in the context of code smell detection, and its impact is evaluated using statistical tests. Method: This study employs sixteen contemporary and robust meta-heuristic algorithms to optimize the hyperparameters of two machine learning algorithms: Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN). The No Free Lunch theorem underscores that the success of an optimization algorithm in one application may not necessarily extend to others. Consequently, a rigorous comparative analysis of these algorithms is undertaken to identify the best-fit solutions for code smell detection. A diverse range of optimization algorithms has been implemented: Arithmetic, Jellyfish Search, Flow Direction, Student Psychology Based, Pathfinder, Sine Cosine, Jaya, Crow Search, Dragonfly, Krill Herd, Multi-Verse, Symbiotic Organisms Search, Flower Pollination, Teaching Learning Based, Gravitational Search, and Biogeography-Based Optimization.
Results: In the case of optimized SVM, the highest attained accuracy, AUC, and F-measure values are 98.75%, 100%, and 98.57%, respectively. Remarkably, significant increases in accuracy and AUC, reaching 32.22% and 45.11% respectively, are observed. For k-NN, the best accuracy, AUC, and F-measure values are all perfect at 100%, with noteworthy increases in accuracy and ROC-AUC values, amounting to 43.89% and 40.83%, respectively. Conclusion: Optimized SVM exhibits exceptional performance with the Sine Cosine Optimization algorithm, while k-NN attains its peak performance with the Flower Optimization algorithm. Statistical analysis underscores the substantial impact of employing meta-heuristic algorithms for optimizing machine learning classifiers, enhancing their performance significantly. Optimized SVM excels in detecting the God Class, while optimized k-NN is particularly effective in identifying the Data Class. This innovative fusion automates the tuning process and elevates classifier performance, simultaneously addressing multiple longstanding challenges.
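As a rough illustration of how a meta-heuristic can tune hyperparameters, the sketch below implements the basic Jaya update rule (one of the sixteen algorithms listed in the abstract) in plain Python. It is not the authors' implementation: `jaya_minimize` and `toy_validation_error` are hypothetical names, and the toy objective only stands in for a cross-validated error measured over two hyperparameters, for instance log10(C) and log10(gamma) of an SVM.

```python
import random


def jaya_minimize(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimize `objective` inside the box `bounds` using the Jaya update rule.

    Each candidate moves toward the current best solution and away from the
    current worst one; a move is kept only if it improves the candidate.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    history = []  # best score after each iteration
    for _ in range(iterations):
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i in range(pop_size):
            x = pop[i]
            cand = []
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                lo, hi = bounds[j]
                cand.append(min(max(v, lo), hi))  # clip to the search box
            s = objective(cand)
            if s < scores[i]:  # greedy acceptance
                pop[i], scores[i] = cand, s
        history.append(min(scores))
    k = scores.index(min(scores))
    return pop[k], scores[k], history


def toy_validation_error(params):
    """Hypothetical stand-in for a cross-validated error surface."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


if __name__ == "__main__":
    best, score, _ = jaya_minimize(toy_validation_error, [(-3.0, 3.0), (-4.0, 1.0)])
    print("best parameters:", best, "error:", score)
```

In a real experiment the objective would train and validate the classifier for each candidate, which makes every evaluation far more expensive than in this toy setting.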

2

Detection of distributed denial of service attacks for IoT-based healthcare systems

Kaur Gaganjot, Gupta Prinima

Computer Assisted Methods in Engineering and Science

|

2023

|

Vol. 30, no. 2

167--186

EN

Distributed denial of service (DDoS) is one of the most common attacks on current Internet of Things (IoT) network-based healthcare infrastructures. The most challenging task in the current environment is managing the vast multimedia data generated by IoT devices, which is difficult to handle solely through the cloud. As software-defined networking (SDN) is still in its early stages, the sampling-oriented measurement techniques used today in IoT networks suffer from low accuracy, increased memory usage, low attack detection, and higher processing and network overheads. The aim of this research is to improve attack detection accuracy by using the DPTCM-KNN approach. The DPTCM-KNN technique outperforms the support vector machine (SVM), yet it still has to be improved. For healthcare systems, this work develops a unique approach for detecting DDoS attacks on SDN using DPTCM-KNN.

3

Space-Time-Frequency Machine Learning for Improved 4G/5G Energy Detection

Wasilewska Małgorzata, Bogucka Hanna

International Journal of Electronics and Telecommunications

|

2020

|

Vol. 66, No. 1

217--223

EN

In this paper, the future Fifth Generation (5G New Radio) radio communication system is considered, coexisting and sharing the spectrum with the incumbent Fourth Generation (4G) Long-Term Evolution (LTE) system. The 4G signal presence is detected in order to allow for opportunistic and dynamic spectrum access of 5G users. This detection is based on known sensing methods, such as energy detection; however, it uses machine learning in the domains of space, time and frequency to improve sensing quality. Simulation results for the considered methods, k-Nearest Neighbors and Random Forest, show that these methods significantly improve the detection probability.
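The idea of refining energy detection with a k-nearest-neighbors vote can be sketched as follows. This is only an illustration under assumed inputs, not the paper's simulation setup: the feature vector (energy of the sensed resource block, of the previous time slot, and of the adjacent frequency band) and the function names are hypothetical.

```python
import math
from collections import Counter


def mean_energy(samples):
    """Average energy of a block of real or complex baseband samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical features: (energy now, energy in previous slot, energy in adjacent band)
    train_x = [(0.1, 0.1, 0.2), (0.2, 0.1, 0.1), (1.0, 0.9, 1.1), (1.2, 1.0, 0.9)]
    train_y = [0, 0, 1, 1]  # 0 = band free, 1 = 4G signal present
    print(knn_predict(train_x, train_y, (0.9, 1.0, 1.0)))
```

Using neighboring time slots and frequency bands as extra features is what lets the classifier exploit correlation that a single-threshold energy detector ignores.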

4

A machine learning approach to epileptic seizure prediction using Electroencephalogram (EEG) Signal

Savadkoohi Marzieh, Oladunni Timothy, Thompson Lara

Biocybernetics and Biomedical Engineering

|

2020

|

Vol. 40, no. 3

1328--1341

EN

This study investigates the properties of the brain electrical activity from different recording regions and physiological states for seizure detection. Neurophysiologists will find the work useful in the timely and accurate detection of epileptic seizures in their patients. We explored the best way to detect meaningful patterns from an epileptic Electroencephalogram (EEG). Signals used in this work are 23.6 s segments of 100 single-channel surface EEG recordings collected with a sampling rate of 173.61 Hz. The recorded signals are from five healthy volunteers with eyes closed and eyes open, and intracranial EEG recordings from five epilepsy patients during the seizure-free interval as well as epileptic seizures. Feature engineering was done using: i) feature extraction of each EEG wave in the time, frequency and time-frequency domains via Butterworth filter, Fourier Transform and Wavelet Transform, respectively, and ii) feature selection with the T-test and Sequential Forward Floating Selection (SFFS). SVM and KNN learning algorithms were applied to classify the preprocessed EEG signals. Performance comparison was based on Accuracy, Sensitivity and Specificity. Our experiments showed that SVM has a slight edge over KNN.

5

Detection of valvular heart diseases using impedance cardiography ICG

Chabchoub S., Mansouri S., Ben Salah R.

Biocybernetics and Biomedical Engineering

|

2018

|

Vol. 38, no. 2

251--261

EN

Impedance cardiography (ICG) is a simple, non-invasive and cost-effective tool for monitoring hemodynamic parameters. It has been successfully used to diagnose several cardiovascular diseases, such as heart failure and myocardial infarction. In particular, valvular heart disease (VHD) affects one or more heart valves (mitral, aortic, tricuspid or pulmonary) and is usually diagnosed using Doppler echocardiography. However, this technique is rather expensive, requires qualified expertise, is discontinuous, and is often not necessary for making just a simple diagnosis. In this paper, a new computer-aided diagnosis system is proposed to detect VHD using ICG signals. Six types of ICG heartbeats are analyzed and classified: normal heartbeats (N), mitral insufficiency heartbeats (MI), aortic insufficiency heartbeats (AI), mitral stenosis heartbeats (MS), aortic stenosis heartbeats (AS), and pulmonary stenosis heartbeats (PS). The proposed methodology is validated on 120 ICG recordings. Firstly, the ICG signal is denoised using the Daubechies wavelet family with order eight (db8). Then, these signals are segmented into several heartbeats and, later, subjected to the linear prediction (LP) and discrete wavelet transform (DWT) approaches to extract temporal and time-frequency features, respectively. In order to reduce the number of features and select the most relevant ones among them, the Student's t-test is applied. As a result, a total of 16 features are selected (3 temporal features and 13 time-frequency features). For the classification step, the support vector machine (SVM) and k-nearest neighbors (KNN) classifiers are used. Different combinations of extracted features and classifiers are proposed. Experimental results showed that the combination of temporal features, time-frequency features and the SVM classifier achieved the highest classification performance in classifying the N, MI, MS, AI, AS and PS heartbeats, with an overall accuracy of 98.94%.
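The feature-selection step can be illustrated with a minimal two-class sketch that ranks features by the absolute value of the pooled-variance Student's t statistic. The abstract does not say how the test was applied across the six heartbeat classes, so the two-class setting and the function names here are assumptions for illustration only.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Assumes each sample has at least two values and that the two samples
    are not both constant (otherwise the denominator is zero).
    """
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def rank_features(class_a, class_b):
    """Return feature indices ordered from most to least discriminative.

    `class_a` and `class_b` are lists of equally long feature vectors.
    """
    n_features = len(class_a[0])
    scored = []
    for j in range(n_features):
        t = pooled_t_statistic([row[j] for row in class_a],
                               [row[j] for row in class_b])
        scored.append((abs(t), j))
    return [j for _, j in sorted(scored, reverse=True)]
```

Keeping only the top-ranked indices (or those whose p-value falls below a chosen significance level) reduces the feature vector before it is passed to the classifier.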

6

A bionic hand controlled by hand gesture recognition based on surface EMG signals: A preliminary study

Shi W. T., Lyu Z. J., Tang S. T., Chia T. L., Yang C. Y.

Biocybernetics and Biomedical Engineering

|

2018

|

Vol. 38, no. 1

126--135

EN

A bionic hand with fine motor ability could be a favorable option for replacing the human hand when performing various operations. Myoelectric control has been widely used to recognize hand movements in recent years. However, most of the previous studies have focused on whole-hand movements, with only a few investigating subtler motions. The aim of this study was to construct a prototype system that recognizes hand postures in order to control a bionic hand by analyzing sEMG signals measured at the flexor digitorum superficialis and extensor digitorum muscles. We adopted multiple features commonly used in previous studies (mean absolute value, zero crossing, slope sign change, and waveform length) in the algorithm for extracting hand-posture features, and the k-nearest-neighbors (KNN) algorithm as the classifier to perform hand-posture recognition. The bionic hand was controlled by an Arduino microprocessor, which converted the output of the classification process into signals fed to the servo motors controlling the bionic fingers. We constructed a two-channel sEMG pattern-recognition system that can identify human hand postures and control a homemade bionic hand to perform corresponding hand postures. The KNN approach was able to recognize four different hand postures with a classification accuracy of 94% in the online experiment by using the channel combination. Moreover, the experimental tests show that the bionic hand could faithfully imitate the postures of the human hand. This study has bridged the gap between the features of sEMG signals of fingers and the postures of a bionic hand.
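The four time-domain features named in the abstract have standard definitions, sketched below for a single analysis window. The threshold parameters and the two-channel `feature_vector` helper are assumptions added for illustration; the abstract does not give the window length or thresholds used in the study.

```python
def mean_absolute_value(x):
    """MAV: average of the rectified signal over the window."""
    return sum(abs(v) for v in x) / len(x)


def waveform_length(x):
    """WL: cumulative length of the waveform over the window."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def zero_crossings(x, threshold=0.0):
    """ZC: sign changes between consecutive samples exceeding `threshold`."""
    return sum(1 for a, b in zip(x, x[1:])
               if a * b < 0 and abs(a - b) >= threshold)


def slope_sign_changes(x, threshold=0.0):
    """SSC: number of local extrema whose slope change exceeds `threshold`."""
    count = 0
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        is_extremum = (cur - prev) * (cur - nxt) > 0
        big_enough = abs(cur - prev) >= threshold or abs(cur - nxt) >= threshold
        if is_extremum and big_enough:
            count += 1
    return count


def feature_vector(channel_1, channel_2):
    """Concatenate the four features of two sEMG channels (8 values)."""
    features = []
    for window in (channel_1, channel_2):
        features += [mean_absolute_value(window), zero_crossings(window),
                     slope_sign_changes(window), waveform_length(window)]
    return features
```

A vector built this way from each window is the kind of input a KNN classifier would compare against labeled training windows; a small non-zero threshold is commonly used to keep noise from inflating the ZC and SSC counts.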

7

Improving prediction models applied in systems monitoring natural hazards and machinery

Sikora M., Sikora B.

International Journal of Applied Mathematics and Computer Science

|

2012

|

Vol. 22, no. 2

477-491

EN

A method of combining three analytic techniques, namely regression rule induction, the k-nearest neighbors method and time series forecasting by means of the ARIMA methodology, is presented. The main objective of combining these techniques was to decrease the forecasting error in problems that concern natural hazards and machinery monitoring in coal mines. The M5 algorithm was applied as a basic method of developing prediction models. In spite of the intensive development of regression rule induction algorithms and fuzzy-neural systems, the M5 algorithm still offers a generalization ability competitive with other systems and an unbeatable model-building time. In the paper, two solutions designed to decrease the mean square error of the obtained rules are presented. One consists in introducing into the set of conditional variables a so-called meta-variable (an analogy to constructive induction) whose values are determined by an autoregressive or ARIMA model. The other shows that restricting the data set on which the M5 algorithm operates by means of the k-nearest neighbor method can also reduce the error. Moreover, three application examples of the presented solutions for data collected by systems monitoring natural hazards and machinery in coal mines are described. In the Appendix, results of analyses of several benchmark data sets are given as a supplement to the presented results.
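The second solution, restricting the training data to the neighborhood of the query before building a model, can be sketched in a few lines. An ordinary least-squares line is used here purely as a stand-in for the M5 algorithm, the data are one-dimensional for brevity, and all function names are hypothetical.

```python
def knn_subset(xs, ys, query, k):
    """Return the k training pairs whose inputs are closest to `query`."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]
    return [xs[i] for i in order], [ys[i] for i in order]


def fit_line(xs, ys):
    """Least-squares slope and intercept (stand-in for the M5 model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all inputs identical: fall back to the mean
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def local_predict(xs, ys, query, k=5):
    """Fit the model only on the k nearest neighbors of `query`."""
    near_x, near_y = knn_subset(xs, ys, query, k)
    slope, intercept = fit_line(near_x, near_y)
    return slope * query + intercept
```

When the data follow different regimes in different regions of the input space, a model fitted on the local subset is not distorted by distant observations, which is the intuition behind the reported error decrease.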

8

Center-Based Indexing in Vector and Metric Spaces

Wojna A.

Fundamenta Informaticae

|

2003

|

Vol. 56, no. 3

285-310

EN

The paper addresses the problem of indexing data for k nearest neighbors (k-nn) search. Given a collection of data objects and a similarity measure, the goal of the search is to quickly find the k objects most similar to a given query object. We present a top-down indexing method that employs a widely used scheme of indexing algorithms. It starts with the whole set of objects at the root of an indexing tree and iteratively splits the data at each level of the indexing hierarchy. In the paper, two different data models are considered. In the first, objects are represented by vectors from a multi-dimensional vector space. The second, more general, is based on the assumption that objects satisfy only the axioms of a metric space. We propose an iterative k-means algorithm for tree node splitting in the case of a vector space and an iterative k-approximate-centers algorithm in the case when only a metric space is provided. The experiments show that the iterative k-means splitting procedure significantly accelerates k-nn searching over the one-step procedure used in other indexing structures such as GNAT, SS-tree and M-tree, and that the relevant representation of a tree node is an important issue for the performance of the search process. We also combine different search pruning criteria used in BST, GHT and GNAT structures into one and show that such a combination significantly outperforms each single pruning criterion. The experiments are performed for benchmark data sets of sizes up to several hundred thousand objects. The indexing tree with the k-means splitting procedure and the combined search criteria is particularly effective for the largest tested data sets, for which this tree accelerates searching up to several thousand times.
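The role of centers in pruning a k-nn search can be shown with a deliberately simplified, one-level sketch. It is not the paper's indexing tree: centers are drawn at random instead of by iterative k-means, only a single covering-radius criterion is used, and the names `build_index` and `knn_search` are hypothetical. The pruning rule relies only on the triangle inequality, so any metric can be passed as `dist`.

```python
import heapq
import math
import random


def build_index(points, n_centers, dist=math.dist, seed=0):
    """Group points (tuples) around randomly chosen centers.

    Each cluster stores its covering radius, i.e. the largest distance
    from the center to any of its members.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, n_centers)
    clusters = [{"center": c, "radius": 0.0, "members": []} for c in centers]
    for p in points:
        d, idx = min((dist(p, c), i) for i, c in enumerate(centers))
        clusters[idx]["members"].append(p)
        clusters[idx]["radius"] = max(clusters[idx]["radius"], d)
    return clusters


def knn_search(clusters, query, k, dist=math.dist):
    """Return the k nearest (distance, point) pairs, skipping clusters
    that cannot contain a closer point than the current k-th best."""
    heap = []  # max-heap emulated with negated distances
    for c in sorted(clusters, key=lambda c: dist(query, c["center"])):
        d_center = dist(query, c["center"])
        # Triangle inequality: every member is at least d_center - radius away.
        if len(heap) == k and d_center - c["radius"] > -heap[0][0]:
            continue
        for p in c["members"]:
            d = dist(query, p)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in heap)
```

A hierarchical version would apply the same test recursively at every tree node, and better-placed centers (for example from k-means) give smaller radii and therefore more pruning.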