Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników

Znaleziono wyników: 4

Liczba wyników na stronie
first rewind previous Strona / 1 next fast forward last
Wyniki wyszukiwania
help Sortuj według:

help Ogranicz wyniki do:
first rewind previous Strona / 1 next fast forward last
EN
This tutorial material on Convolutional Neural Networks (CNN) and its applications in digital media research is based on the concept of Symbolic Tensor Neural Networks. The set of STNN expressions is specified in Backus-Naur Form (BNF) which is annotated by constraints typical for labeled acyclic directed graphs (DAG). The BNF induction begins from a collection of neural unit symbols with extra (up to five) decoration fields (including tensor depth and sharing fields). The inductive rules provide not only the general graph structure but also the specific shortcuts for residual blocks of units. A syntactic mechanism for network fragments modularization is introduced via user defined units and their instances. Moreover, the dual BNF rules are specified in order to generate the Dual Symbolic Tensor Neural Network (DSTNN). The joined interpretation of STNN and DSTNN provides the correct flow of gradient tensors, back propagated at the training stage. The proposed symbolic representation of CNNs is illustrated for six generic digital media applications (CREAMS): Compression, Recognition, Embedding, Annotation, 3D Modeling for human-computer interfacing, and data Security based on digital media objects. In order to make the CNN description and its gradient flow complete, for all presented applications, the symbolic representations of mathematically defined loss/gain functions and gradient flow equations for all used core units, are given. The tutorial is to convince the reader that STNN is not only a convenient symbolic notation for public presentations of CNN based solutions for CREAMS problems but also that it is a design blueprint with a potential for automatic generation of application source code.
2
Content available remote Human Face Expressions from Images
EN
Several computer algorithms for recognition of visible human emotions are compared at the web camera scenario using CNN/MMOD face detector. The recognition refers to four face expressions: smile, surprise, anger, and neutral. At the feature extraction stage, the following three concepts of face description are confronted: (a) static 2D face geometry represented by its 68 characteristic landmarks (FP68); (b) dynamic 3D geometry defined by motion parameters for eight distinguished face parts (denoted as AU8) of personalized Candide-3 model; (c) static 2D visual description as 2D array of gray scale pixels (known as facial raw image). At the classification stage, the performance of two major models are analyzed: (a) support vector machine (SVM) with kernel options; (b) convolutional neural network (CNN) with variety of relevant tensor processing layers and blocks of them. The models are trained for frontal views of human faces while they are tested for arbitrary head poses. For geometric features, the success rate (accuracy) indicate nearly triple increase of performance of CNN with respect to SVM classifiers. For raw images, CNN outperforms in accuracy its best geometric counterpart (AU/CNN) by about 30 percent while the best SVM solutions are inferior. For F-score the high advantage of raw/CNN over geometric/CNN and geometric/SVM is observed, as well. We conclude that contrary to CNN based emotion classifiers, the generalization capability wrt human head pose for SVM based emotion classifiers, is worse too.
3
Content available remote On Intra-Class Variance for Deep Learning of Classifiers
EN
A novel technique for deep learning of image classifiers is presented. The learned CNN models higher offer better separation of deep features (also known as embedded vectors) measured by Euclidean proximity and also no deterioration of the classification results by class membership probability. The latter feature can be used for enhancing image classifiers having the classes at the model’s exploiting stage different from from classes during the training stage. While the Shannon information of SoftMax probability for target class is extended for mini-batch by the intra-class variance, the trained network itself is extended by the Hadamard layer with the parameters representing the class centers. Contrary to the existing solutions, this extra neural layer enables interfacing of the training algorithm to the standard stochastic gradient optimizers, e.g. AdaM algorithm. Moreover, this approach makes the computed centroids immediately adapting to the updating embedded vectors and finally getting the comparable accuracy in less epochs.
4
Content available remote Convolutional and Recurrent Neural Networks for Face Image Analysis
EN
In the presented research two Deep Neural Network (DNN) models for face image analysis were developed. The first one detects eyes, nose and mouth and it is based on a moderate size Convolutional Neural Network (CNN) while the second one identifies 68 landmarks resulting in a novel Face Alignment Network composed of a CNN and a recurrent neural network. The Face Parts Detector inputs face image and outputs the pixel coordinates of bounding boxes for detected facial parts. The Face Alignment Network extracts deep features in CNN module while in the recurrent module it generates 68 facial landmarks using not only this deep features, but also the geometry of facial parts. Both methods are robust to varying head poses and changing light conditions.
first rewind previous Strona / 1 next fast forward last
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.