Results found: 59

Search results
Searched for:
in keywords: CNN
1
Visual emotion sensing using convolutional neural network
100%
EN
The objective of this article is to present a CNN architecture suited to the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database. Since the database showed some issues during the training phase, we use frames as inputs instead of video recordings to minimize the error and increase the accuracy. We apply transfer learning by adjusting the number of layers and the weight of the database. The results for the female and male genders are 91% and 89%, respectively.
PL (in English translation)
The aim of this article is to present a CNN architecture suited to the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database. Because the database showed some problems during the training phase, we use frames as input data instead of video recordings in order to minimise the error and increase accuracy. We apply a transfer learning methodology, adjusting the number of layers and the weight of the database. The results for the female and male genders are 91% and 89%, respectively.
EN
Chaos theory is a field of study in mathematics, computer science, electronics, physics, and engineering. Over the last two decades, the theoretical design and circuit implementation of various chaos generators have been a subject of increasing interest due to their promising applications in real-world chaos-based technologies and information systems. In this article, unidirectional and diffusive coupling of identical n-double scroll cells in a one-dimensional cellular neural network is studied.
Vol. 21 (4), pp. 397-417
EN
Deep neural networks (DNNs) currently play a vital role in automatic speech recognition (ASR). The convolutional neural network (CNN) and the recurrent neural network (RNN) are advanced versions of the DNN. They are well suited to the spatial and temporal properties of a speech signal, and both properties have a strong impact on accuracy. Working on the raw speech signal, the CNN shows its superiority over precomputed acoustic features. Recently, a novel first convolution layer named SincNet was proposed to increase interpretability and system performance. In this work, we propose combining SincNet-CNN with a light gated recurrent unit (LiGRU) to reduce the computational load and increase interpretability while retaining high accuracy. Different configurations of the hybrid model are extensively examined to achieve this goal. All experiments were conducted using the Kaldi and PyTorch-Kaldi toolkits with the Hindi speech dataset. The proposed model reports an 8.0% word error rate (WER).
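The abstract above reports its result as a word error rate. As a minimal sketch of how that metric is commonly computed (word-level edit distance divided by the number of reference words), the following may help; the function name `word_error_rate` is hypothetical and this is not the scoring code of Kaldi or PyTorch-Kaldi.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For instance, a four-word reference recognised with one substitution and one deletion gives a WER of 0.5.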
EN
The digital revolution has encouraged many companies to set up new strategic and operational mechanisms to monitor the flow of information published about them on the Web. Press coverage analysis is a part of sentiment analysis that allows companies to discover the opinion of the media concerning their activities, products, and services. It is an important research area, since it involves the opinion of an informed public, such as journalists, who may influence the opinion of their readers. From an implementation perspective, however, analysing opinion in media coverage poses many challenges. Unlike social network content, media coverage is a set of large textual documents written in natural language. Because the training base is huge, large-scale processing techniques such as Deep Learning are needed to analyse its content. To help researchers choose between two of the most commonly used models, CNN and LSTM, we compare both models and apply them to opinion mining from long text documents using real datasets.
5
71%
EN
A compact and precise rice disease classification application can help farmers treat their plants and thus measure and eliminate the effects of diseases quickly, accurately, and more profitably. In the past, this work was done by naked-eye observation and relied mainly on experience, so the results were subjective and heuristic. This paper presents a mobile application that automatically classifies several kinds of rice diseases from rice plant images and then recommends the use of pesticides or chemicals. To this end, a convolutional neural network (CNN) model is proposed. The results show that the proposed CNN model achieves the best trade-off between accuracy and time efficiency in comparison with state-of-the-art models on our dataset. The model could easily be embedded into a mobile application for near real-time processing.
EN
Predicting epileptic seizures in advance greatly improves the lives of epileptic patients. In this paper, we present a new approach based on patient-specific channel optimization using four different features, namely entropy, variance, kurtosis, and skewness. After selecting the three best channels for each method, we use a convolutional neural network (CNN) to classify the raw EEG signal in order to discriminate between the interictal and preictal states. With entropy, our method achieves good prediction performance: an accuracy of 97.09%, a sensitivity of 97.67%, and a specificity of 96.51% for patient 01 using channels 4, 8, and 20.
PL (in English translation)
Predicting epileptic seizures in advance significantly improves the lives of people with epilepsy. In this article, we present a new approach based on patient-specific channel optimisation using four different methods, namely entropy, variance, kurtosis, and skewness. After selecting the three best channels for each method, we use a convolutional neural network (CNN) to classify the raw EEG signal in order to distinguish between the interictal and preictal states. With entropy, our method achieves good prediction performance: an accuracy of 97.09%, a sensitivity of 97.67%, and a specificity of 96.51% for patient 01 using channels 4, 8, and 20.
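The channel-selection step in the abstract above can be illustrated with a rough sketch that scores every EEG channel by the four named statistics and keeps the three highest-scoring channels. Several things here are assumptions, not details from the paper: the histogram-based Shannon entropy, the "largest score wins" ranking rule, and the hypothetical names `channel_features` and `top_channels`. Channels are assumed to be non-constant so that the standard deviation is non-zero.

```python
import numpy as np


def channel_features(eeg: np.ndarray, bins: int = 32) -> dict:
    """Score each channel of an array shaped (channels, samples)."""
    centered = eeg - eeg.mean(axis=1, keepdims=True)
    variance = (centered ** 2).mean(axis=1)
    std = np.sqrt(variance)
    skewness = (centered ** 3).mean(axis=1) / std ** 3
    kurtosis = (centered ** 4).mean(axis=1) / std ** 4
    entropy = np.empty(eeg.shape[0])
    for c, signal in enumerate(eeg):
        # Assumption: Shannon entropy of the amplitude histogram, in bits.
        counts, _ = np.histogram(signal, bins=bins)
        p = counts[counts > 0] / counts.sum()
        entropy[c] = -(p * np.log2(p)).sum()
    return {
        "entropy": entropy,
        "variance": variance,
        "kurtosis": kurtosis,
        "skewness": skewness,
    }


def top_channels(scores: np.ndarray, k: int = 3) -> list:
    """Return the indices of the k largest scores (assumed ranking rule)."""
    return np.argsort(scores)[::-1][:k].tolist()
```

The selected channel indices would then determine which raw EEG traces are passed to the classifier.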
EN
Combining tomographic imaging with deep learning techniques enables image analysis. Many open questions remain about image reconstruction from projections using a deep neural network. This publication focuses on biomedical imaging, with an emphasis on developing a new generation of image reconstruction techniques using deep neural networks. Such targeted research may lead to the intelligent use of knowledge in big data, including innovative approaches to the reconstruction of tomographic images and further development in the area of diagnostic imaging. Fully utilizing the possibilities of machine learning in biomedical imaging will be the first step in the development of new translational techniques.
PL (in English translation)
Combining tomographic imaging with deep learning techniques enables image analysis. In the field of image reconstruction from projections using a deep neural network, many doubts still remain. This publication focuses on biomedical imaging, with an emphasis on developing a new generation of image reconstruction techniques that use deep neural networks. Research targeted in this way may lead to the intelligent use of big data knowledge, including innovative approaches to the reconstruction of tomographic images and further development in the area of diagnostic imaging. Fully exploiting the possibilities of machine learning in biomedical imaging will be the first step towards the development of new translational techniques.
8
A few-shot fine-grained image recognition method
63%
EN
Deep learning methods benefit from data sets with comprehensive coverage (e.g., ImageNet and COCO), which can be regarded as a description of the distribution of real-world data. Models trained on these datasets are considered able to extract general features and transfer to downstream domains not seen during training. However, in open scenes, the labeled data of the target data set are often insufficient. Deep models trained on a small amount of sample data generalize poorly. Identifying new categories, or categories with very few samples, is still a challenging task. This paper proposes a few-shot fine-grained image recognition method. Feature maps are extracted by a CNN module with an embedded attention network to emphasize the discriminative features. A channel-based feature expression is applied to the base class and the novel class, followed by an improved cosine similarity-based measurement method that produces the similarity score used for classification. Experiments are performed on the main few-shot benchmark datasets, such as Stanford Dogs and CUB-200, to verify the efficiency and generality of our model. The experimental results show that our method achieves better performance on fine-grained datasets.
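To make the similarity-based classification step above more concrete, here is a minimal sketch that assigns a query embedding to the class whose mean support embedding has the highest cosine similarity. It uses plain cosine similarity and mean prototypes, both of which are assumptions; the "improved" measure and the channel-based feature expression of the paper are not reproduced. The names `cosine_scores` and `classify` are hypothetical.

```python
import numpy as np


def cosine_scores(query: np.ndarray, prototypes: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each prototype row."""
    q = query / np.linalg.norm(query)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return p @ q


def classify(query: np.ndarray, support: np.ndarray, labels: np.ndarray):
    """Pick the class whose mean support embedding is most similar to the query."""
    classes = np.unique(labels)
    # Assumption: each class is summarised by the mean of its support embeddings.
    prototypes = np.stack([support[labels == c].mean(axis=0) for c in classes])
    scores = cosine_scores(query, prototypes)
    return classes[int(np.argmax(scores))], scores
```

In a few-shot setting, `support` would hold the handful of labelled embeddings per novel class and `query` the embedding of the image to recognise.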
PL (in English translation)
This paper presents an algorithm for predicting free resources in 5G radio networks. The 5G signal transmitted by the primary user (PU) is subject to fading in the channel, which prevents correct detection and thus proper protection of the PU's transmission. The proposed algorithm uses deep machine learning to recognise the time-frequency dependencies present in the received signal and to recognise the degree of fading. With this information, the algorithm detects free resources more accurately while protecting the PU's transmission.
EN
In this paper, we present a 5G spectrum resources prediction algorithm. The 5G signal transmitted by the primary user (PU) passes through a fading channel, which negatively affects prediction performance and the proper protection of the PU's transmission. The proposed algorithm applies deep learning to estimate the fading level and recognize time-frequency patterns in the received signal. With this information, the algorithm can perform better signal prediction and PU transmission protection.
10
63%
EN
Spoken language identification has been tackled over the years with statistical models on audio samples. A drawback of these approaches is the unavailability of phonetically transcribed data for all languages. This work proposes an approach based on image classification that uses image representations of audio samples. Our model uses neural networks and deep learning algorithms to analyse and classify three languages. The input to our network is a spectrogram, which is processed by the network to extract local visual and temporal features for language prediction. The model achieved 95.56% accuracy on the test samples from the three languages.
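Since the abstract above feeds a spectrogram image to the network, a short sketch of how such an image can be derived from an audio signal may be useful. It is a plain short-time Fourier transform with a Hann window written in NumPy; the frame length, hop size, and the hypothetical function name `spectrogram` are illustrative assumptions, not the settings used by the authors.

```python
import numpy as np


def spectrogram(signal: np.ndarray, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Return a log-power spectrogram shaped (frequency_bins, frames)."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each with the window.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # The small constant avoids taking the logarithm of zero.
    return (10.0 * np.log10(power + 1e-10)).T
```

The resulting two-dimensional array can be treated as a single-channel image, with frequency on one axis and time on the other.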
Vol. 69, no. 4, art. no. e137728
EN
Workflow scheduling is a major problem in cloud computing. A workflow consists of a set of interdependent tasks and is used to solve various scientific and healthcare problems. In this research work, cloud-based workflow scheduling between different tasks on medical imaging datasets using Machine Learning (ML) and Deep Learning (DL) methods (a hybrid classification approach) is proposed for healthcare applications. The main objective is to develop a system for both workflow computing and scheduling that minimizes the makespan and execution cost and segments the cancer region in images classified as abnormal. Workflow computing is performed using different Machine Learning classifiers, and workflow scheduling is carried out using a Deep Learning algorithm. The conventional AlexNet Convolutional Neural Network (CNN) architecture is modified and used for workflow scheduling between different tasks in order to improve accuracy. The AlexNet architecture is analyzed and tested on two cloud services, Amazon Elastic Compute Cloud (EC2) and Amazon Lightsail, with respect to Makespan (MS) and Execution Cost (EC).
12
63%
EN
Accurate nuclei segmentation is a critical step for physicians to obtain essential information about a patient’s disease from digital pathology images, enabling an effective diagnosis and evaluation of subsequent treatments. Since pathology images contain many nuclei, manual segmentation is time-consuming and error-prone. Developing a precise and automatic method for nuclei segmentation is therefore urgent. This paper proposes a novel multi-task segmentation network that incorporates background and contour segmentation into the nuclei segmentation method and produces more accurate segmentation results. Convolution and attention modules are merged into the model to increase its global focus and indirectly improve the segmentation results. We propose a reverse feature enhancement module for contour extraction that facilitates feature integration between auxiliary tasks. A multi-feature fusion module is embedded in the final decoding branch to use different levels of features from auxiliary segmentation branches with varying concerns. We evaluate the proposed method on four challenging nuclei segmentation datasets, and it achieves excellent performance on all four. The Dice coefficient reached 0.8563±0.0323, 0.8183±0.0383, 0.9222±0.0216, and 0.9220±0.0602 on the TNBC, MoNuSeg, KMC, and Glas datasets, respectively. Our method produces better boundary accuracy and less sticking than other end-to-end segmentation methods. The results show that our method can perform better than other state-of-the-art methods.
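The Dice coefficient quoted in the abstract above measures the overlap between a predicted mask and the ground-truth mask as twice the intersection divided by the sum of both mask sizes. A minimal sketch for binary masks follows; the function name `dice_coefficient` and the smoothing constant `eps` are illustrative assumptions.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined (and equal to 1) when both masks are empty.
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

A value of 1 means the two masks coincide exactly, and a value of 0 means they do not overlap at all.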
13
Using deep learning to recognize the sign alphabet
63%
EN
This article describes a vision system that uses deep learning to recognize 24 static signs of the American Sign Alphabet in real time. As part of the project, images of signs from four publicly available databases were used as a training set. A DenseNet was implemented for image recognition. For testing, images were acquired with the use of a web camera. The accuracy of sign recognition in images is more than 80%. The real-time version of the system was implemented.
PL (in English translation)
This article describes a vision system that uses deep learning to recognise, in real time, 24 static signs of the American Sign Alphabet. Images of signs from four publicly available databases were used as the training set. A DenseNet network was applied for image recognition. For testing, the authors created their own images using a web camera. The accuracy of sign recognition from images exceeded 80%. A real-time version of the system was implemented.
EN
Constructing textile defect detection systems is significant for quality control in industrial production, but labelling a sufficient number of detailed samples is costly and laborious. This paper proposes a model called ‘spatial adversarial convolutional neural network’, which tries to solve this problem by using only image-level labels. It consists of two parts: a feature extractor and feature competition. First, a string of convolutional blocks is used as a feature extractor. After feature extraction, a maximum greedy feature competition takes place among the features in the feature layer. The feature competition mechanism can lead the network to converge to the defect location. To evaluate this mechanism, experiments were carried out on two datasets. As the training time increases, the model can spontaneously focus on the actual defect location, and it is robust to unbalanced samples. The classification accuracy on the two datasets exceeds 98%, which is comparable with methods that label samples in detail. Detection results show that the defect location obtained from the model is more compact and accurate than that of the Grad-CAM method. Experiments show that our model has potential for defect detection in an industrial environment.
PL (in English translation)
Constructing textile defect detection systems is of great importance for quality control in industrial production, but labelling sufficiently detailed samples is costly and laborious. The article proposes a model called a "spatial adversarial convolutional neural network", which tries to solve this problem using only image-level labels. It consists of two parts: a feature extractor and feature competition. First, a string of convolutional blocks is used as a feature extractor. After feature extraction, a maximum greedy competition takes place among the features in the feature layer. The feature competition mechanism can lead the network to converge to the defect location. To evaluate this mechanism, experiments were carried out on two datasets. As the training time increases, the model can spontaneously focus on the actual defect location and is robust to unbalanced samples. The classification accuracy on both datasets can exceed 98% and is comparable with the detailed sample labelling method. Detection results show that the defect location obtained from the model is more compact and accurate than in the Grad-CAM method. Experiments show that the presented model has potential for defect detection in an industrial environment.
EN
Brain tumors can be difficult to diagnose, as they may have similar radiographic characteristics, and a thorough examination may take a considerable amount of time. To address these challenges, we propose an intelligent system for the automatic extraction and identification of brain tumors from 2D CE MRI images. Our approach comprises two stages. In the first stage, we use an encoder-decoder-based U-Net with a residual network as the backbone to detect different types of brain tumors, including glioma, meningioma, and pituitary tumors. Our method achieved an accuracy of 99.60%, a sensitivity of 90.20%, a specificity of 99.80%, a Dice similarity coefficient of 90.11%, and a precision of 90.50% for tumor extraction. In the second stage, we employ a YOLO2 (you only look once) based transfer learning approach to classify the extracted tumors, achieving a classification accuracy of 97%. Our proposed approach outperforms state-of-the-art methods found in the literature. The results demonstrate the potential of our method to aid in the diagnosis and treatment of brain tumors.
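The extraction metrics listed in the abstract above all derive from the four confusion-matrix counts. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts used to exercise it are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics from true/false positive and negative counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on the positive (tumor) class
        "specificity": tn / (tn + fp),   # recall on the negative (background) class
        "precision": tp / (tp + fp),
    }
```

When negatives greatly outnumber positives, as background pixels usually do in segmentation, accuracy and specificity can sit well above sensitivity and precision, which is one plausible reading of the gap between the 99.60% accuracy and the 90.20% sensitivity reported above.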
EN
The paper describes and compares two forms of wavelet transformation, discrete (DWT) and continuous (CWT), in the analysis of electrocardiograms (ECG) to detect anomalies. The anomalies have been limited to two types: cardiac arrhythmia and congestive heart failure. Two independent approaches to the problem have been considered. The first is based on the discrete wavelet transformation and on feature generation from statistical parameters of the transformed ECG signals. These descriptors, after selection, are delivered as input attributes to different classifiers. The second approach applies the continuous wavelet transformation to ECG signals, and the resulting two-dimensional image in the time-frequency domain forms the input to a convolutional neural network, which is responsible for generating the diagnostic features and for the final classification. The experiments have been performed on the publicly available Complex Physiologic Signals PhysioNet database. The calculations have been done in Python. The results of both approaches, DWT and CWT, are discussed and compared.
PL (in English translation)
The article presents two approaches to detecting anomalies in ECG signals. The anomalies considered are arrhythmia and congestive heart failure. The analysis is based on the ECG signal subjected to a wavelet transform in two forms: discrete and continuous. In the discrete case, the ECG signal undergoes wavelet decomposition at several levels, and the results of this decomposition (the detail signals and the approximation signal of the last level) are described statistically, forming a set of numerical descriptors that serve as potential diagnostic features. After selection, they form the input attributes for an ensemble of 9 classifiers. In the second approach, the ECG signal undergoes a continuous wavelet transform, generating a two-dimensional matrix in the form of an image. The set of such images is fed to the input of a deep CNN, which performs both diagnostic feature generation and classification within a single structure. Numerical experiments were carried out on the publicly available Complex Physiologic Signals PhysioNet database. The results of the experiments showed the advantage of the approach using the discrete wavelet transform.
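The discrete-wavelet branch described above (multi-level decomposition, then statistical descriptors of the detail signals and the last approximation signal) can be sketched as follows. The Haar wavelet, the choice of four descriptors per band, and the hypothetical names `haar_dwt`, `describe`, and `dwt_features` are assumptions made for illustration; the abstract does not name the wavelet or the statistics used.

```python
import numpy as np


def haar_dwt(signal, levels: int):
    """Multi-level orthonormal Haar decomposition: (approximation, [details])."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            approx = approx[:-1]  # drop the last sample so samples pair up
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details


def describe(band: np.ndarray) -> list:
    """Assumed descriptor set: mean, standard deviation, minimum, maximum."""
    return [band.mean(), band.std(), band.min(), band.max()]


def dwt_features(signal, levels: int = 3) -> np.ndarray:
    """Concatenate descriptors of every detail band and the final approximation."""
    approx, details = haar_dwt(signal, levels)
    features = []
    for band in details + [approx]:
        features.extend(describe(band))
    return np.array(features)
```

The resulting feature vector is the kind of input that, after feature selection, would be passed to conventional classifiers.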
EN
Bearings are important components of rotating machinery and transmission systems, and are often damaged by wear, overload and shocks. Due to the low resolution of traditional time-frequency analysis for the diagnosis of bearing faults, a synchrosqueezed wavelet transform (SSWT) is proposed to improve the resolution. An improved convolutional neural network fault diagnosis model is proposed in this paper, and a Bayesian optimisation method is applied to automatically adjust the structure and hyperparameters of the model to improve the accuracy of bearing fault diagnosis. Experimental results from the accelerated life testing of bearings show that the proposed method is able to accurately identify various types of bearing fault and the different status of these faults under complex running conditions, while achieving very good generalisation ability.
2023, Vol. 71, no. 6, pp. 2699-2714
EN
Exploration of potash resources under complex geological conditions is particularly important. However, it is difficult to establish characteristic equations for direct prediction, since there is no direct relation between potash content (PC) and seismic response. To solve this problem, this paper proposes a potash reservoir prediction method that uses a specially designed convolutional neural network (CNN) structure to learn the special waveform and petrophysical characteristics of potash reservoirs. Considering that potash reservoirs and petrophysical characteristics are not in a one-to-one mapping, the prediction procedure is divided into two parts. First, a CNN is constructed for potash reservoir prediction according to the spatial waveform characteristics of potash reservoirs. The mapping between potash reservoirs and waveform characteristics is used to obtain the potash reservoir probability data through the softmax function. Then, another CNN for PC prediction is built based on the petrophysical characteristics of potash reservoirs. Meanwhile, according to the Hadamard criterion, the petrophysical characteristics of the potash reservoir are constrained by the waveform characteristics. The two CNN models are used synergistically to predict the PC directly. Consequently, the bidirectional mapping problem can be alleviated, and a loss function for the PC prediction CNN constrained by the waveform is obtained. Finally, PC prediction is performed after tuning the PC prediction CNN through this loss function. The correlation between the predicted and true PC values can reach more than 80%.
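The abstract above obtains reservoir probabilities through the softmax function. A minimal, numerically stable sketch of that function is given below; the two-class layout in the usage note (reservoir versus non-reservoir) is an assumption made for illustration only.

```python
import numpy as np


def softmax(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    """Convert raw network outputs into probabilities that sum to 1 along axis."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)
```

Applied to an array with one row per seismic sample and one column per class, each output row can be read as class probabilities for that sample.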
EN
Automatic creation of image descriptions, i.e. captioning of images, is an important topic in artificial intelligence (AI) that bridges the gap between computer vision (CV) and natural language processing (NLP). Currently, neural networks are becoming increasingly popular in captioning images, and researchers are looking for more efficient models for CV and sequence-to-sequence systems. This study focuses on a new image caption generation model that is divided into two stages. Initially, low-level features, such as contrast, sharpness, and color, and their high-level counterparts, such as motion and facial impact score, are extracted. Then, an optimized convolutional neural network (CNN) is harnessed to generate the captions from images. To enhance the accuracy of the process, the weights of the CNN are optimally tuned via spider monkey optimization with sine chaotic map evaluation (SMO-SCME). The proposed method is evaluated with a variety of metrics.
EN
Soil consists of solid particles that cover the surface of the earth. Soil can be classified by its color, because the color indicates the nature and condition of the soil. A CNN works well for image classification, but it requires large amounts of data. Augmentation is a technique that increases the amount of training data by applying various transformations to the existing data. Rotation and gamma correction are simple augmentation techniques that can produce as many variations of the original image as desired. A CNN architecture has convolution layers, and a Dense block has dense layers. Adding Dense blocks to a CNN aims to overcome underfitting and overfitting problems. This study proposes a combination of augmentation and classification. For augmentation, a combination of rotation and gamma correction is used to generate additional image data. A CNN-Dense block is applied for classification. The soil images are classified into five labels: black soil, cinder soil, laterite soil, peat soil, and yellow soil. The proposed method provides excellent results, with accuracy, precision, recall, and F1-score all above 90%. It can be concluded that the combination of rotation and gamma correction as augmentation techniques with CNN-Dense blocks is powerful for soil image classification.
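The augmentation described above combines rotation with gamma correction. The sketch below produces one variant per (gamma, rotation) pair for an 8-bit image. The power-law form `255 * (v / 255) ** gamma`, the restriction to quarter-turn rotations, the specific gamma values, and the hypothetical names `gamma_correct` and `augment` are all assumptions; the abstract does not specify the angles or gamma range used.

```python
import numpy as np


def gamma_correct(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply a power-law intensity change to an 8-bit image."""
    normalized = image.astype(np.float64) / 255.0
    return np.round(255.0 * normalized ** gamma).astype(np.uint8)


def augment(image: np.ndarray, gammas=(0.5, 1.0, 2.0), quarter_turns=(0, 1, 2, 3)) -> list:
    """Return one variant per (gamma, rotation) pair, gamma varying slowest."""
    # np.rot90 rotates in the plane of the first two axes (height, width).
    return [np.rot90(gamma_correct(image, g), k) for g in gammas for k in quarter_turns]
```

With the default arguments, each source image yields twelve variants, one of which (gamma 1.0, no rotation) is identical to the original.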