Search results - BazTech

1

Symbolic Tensor Neural Networks for Digital Media : from Tensor Processing via BNF Graph Rules to CREAMS Applications

Skarbek Władysław

Fundamenta Informaticae

|

2019

|

Vol. 168, no. 2-4

89--184

EN

This tutorial material on Convolutional Neural Networks (CNN) and their applications in digital media research is based on the concept of Symbolic Tensor Neural Networks (STNN). The set of STNN expressions is specified in Backus-Naur Form (BNF), annotated with constraints typical for labeled directed acyclic graphs (DAG). The BNF induction begins from a collection of neural unit symbols with up to five extra decoration fields (including tensor depth and sharing fields). The inductive rules provide not only the general graph structure but also specific shortcuts for residual blocks of units. A syntactic mechanism for modularizing network fragments is introduced via user-defined units and their instances. Moreover, dual BNF rules are specified in order to generate the Dual Symbolic Tensor Neural Network (DSTNN). The joint interpretation of STNN and DSTNN provides the correct flow of gradient tensors, backpropagated at the training stage. The proposed symbolic representation of CNNs is illustrated for six generic digital media applications (CREAMS): Compression, Recognition, Embedding, Annotation, 3D Modeling for human-computer interfacing, and data Security based on digital media objects. To make the CNN description and its gradient flow complete, symbolic representations of the mathematically defined loss/gain functions and the gradient flow equations for all core units used are given for all presented applications. The tutorial aims to convince the reader that STNN is not only a convenient symbolic notation for public presentations of CNN-based solutions to CREAMS problems but also a design blueprint with potential for automatic generation of application source code.

2

Human Face Expressions from Images

Pilarczyk Rafał, Chang Xin, Skarbek Władysław

Fundamenta Informaticae

|

2019

|

Vol. 168, no. 2-4

287--310

EN

Several computer algorithms for recognition of visible human emotions are compared in a web camera scenario using the CNN/MMOD face detector. The recognition covers four face expressions: smile, surprise, anger, and neutral. At the feature extraction stage, three concepts of face description are compared: (a) static 2D face geometry represented by its 68 characteristic landmarks (FP68); (b) dynamic 3D geometry defined by motion parameters for eight distinguished face parts (denoted as AU8) of a personalized Candide-3 model; (c) static 2D visual description as a 2D array of grayscale pixels (known as the facial raw image). At the classification stage, the performance of two major models is analyzed: (a) support vector machine (SVM) with kernel options; (b) convolutional neural network (CNN) with a variety of relevant tensor processing layers and blocks of them. The models are trained on frontal views of human faces and tested on arbitrary head poses. For geometric features, the success rate (accuracy) indicates a nearly threefold increase in performance of CNN over SVM classifiers. For raw images, CNN outperforms its best geometric counterpart (AU/CNN) in accuracy by about 30 percent, while the best SVM solutions are inferior. For F-score, a large advantage of raw/CNN over geometric/CNN and geometric/SVM is observed as well. We conclude that, compared with CNN-based emotion classifiers, SVM-based emotion classifiers also generalize worse with respect to human head pose.
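
The abstract above reports accuracy and F-score without saying how the F-score is aggregated over the four expressions. As a minimal sketch, assuming a macro-averaged F1 (an assumption, not stated in the abstract) and a hypothetical helper name, both metrics could be computed as follows:

```python
def accuracy_and_macro_f_score(true_labels, predicted_labels, classes):
    """Hypothetical helper: accuracy and macro-averaged F1 over the given classes."""
    pairs = list(zip(true_labels, predicted_labels))
    # Accuracy: fraction of samples whose predicted label matches the true label.
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f_scores = []
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in pairs)  # true positives
        fp = sum(t != cls and p == cls for t, p in pairs)  # false positives
        fn = sum(t == cls and p != cls for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a class that never occurs scores 0 here.
        f_scores.append(2 * tp / denom if denom else 0.0)
    # Macro averaging: unweighted mean of the per-class F1 values.
    return accuracy, sum(f_scores) / len(f_scores)
```

Macro averaging weights every expression equally, so a class that is rarely predicted correctly lowers the score even when overall accuracy is high.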

3

On Intra-Class Variance for Deep Learning of Classifiers

Pilarczyk Rafał, Skarbek Władysław

Foundations of Computing and Decision Sciences

|

2019

|

Vol. 44, No. 3

285--301

EN

A novel technique for deep learning of image classifiers is presented. The learned CNN models offer better separation of deep features (also known as embedded vectors), measured by Euclidean proximity, with no deterioration of the classification results by class membership probability. The latter feature can be used to enhance image classifiers whose classes at the model's exploitation stage differ from the classes at the training stage. While the Shannon information of the SoftMax probability for the target class is extended for a mini-batch by the intra-class variance, the trained network itself is extended by a Hadamard layer whose parameters represent the class centers. Contrary to existing solutions, this extra neural layer enables interfacing the training algorithm with standard stochastic gradient optimizers, e.g. the Adam algorithm. Moreover, with this approach the computed centroids adapt immediately to the updated embedded vectors, and comparable accuracy is reached in fewer epochs.
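
As a rough illustration of the loss described in this abstract, the sketch below adds a mini-batch intra-class variance term to the Shannon information of the SoftMax target-class probability. The function name, the weight `lam`, and the plain-list data layout are assumptions made for illustration; the paper's Hadamard layer and its exact formulation are not reproduced here.

```python
import math

def intra_class_variance_loss(logits, embeddings, labels, centers, lam=0.1):
    """Hypothetical sketch: mean SoftMax information plus weighted intra-class variance.

    logits:     per-sample lists of raw class scores
    embeddings: per-sample deep feature vectors (embedded vectors)
    labels:     per-sample integer target classes
    centers:    per-class center vectors (trainable parameters in a real model)
    lam:        assumed weight of the variance term (not taken from the paper)
    """
    total_info = 0.0
    total_sq_dist = 0.0
    for scores, vector, label in zip(logits, embeddings, labels):
        # Numerically stable log-sum-exp for the SoftMax normalizer.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        # Shannon information of the target class: -log p(label).
        total_info += log_norm - scores[label]
        # Squared Euclidean distance to the center of the target class.
        total_sq_dist += sum((v - c) ** 2 for v, c in zip(vector, centers[label]))
    n = len(labels)
    return total_info / n + lam * total_sq_dist / n
```

Because the centers enter the loss as ordinary parameters, a gradient-based optimizer could update them together with the network weights, which is the property the abstract attributes to the extra layer.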

4

Convolutional and Recurrent Neural Networks for Face Image Analysis

Yüksel Kıvanç, Skarbek Władysław

Foundations of Computing and Decision Sciences

|

2019

|

Vol. 44, No. 3

331--347

EN

In the presented research, two Deep Neural Network (DNN) models for face image analysis were developed. The first one detects eyes, nose, and mouth and is based on a moderate-size Convolutional Neural Network (CNN), while the second one identifies 68 landmarks, resulting in a novel Face Alignment Network composed of a CNN and a recurrent neural network. The Face Parts Detector takes a face image as input and outputs the pixel coordinates of bounding boxes for the detected facial parts. The Face Alignment Network extracts deep features in the CNN module, while in the recurrent module it generates 68 facial landmarks using not only these deep features but also the geometry of facial parts. Both methods are robust to varying head poses and changing light conditions.