

Article title

Symbolic Tensor Neural Networks for Digital Media: From Tensor Processing via BNF Graph Rules to CREAMS Applications

Publication languages
EN
Abstracts
EN
This tutorial material on Convolutional Neural Networks (CNN) and their applications in digital media research is based on the concept of Symbolic Tensor Neural Networks. The set of STNN expressions is specified in Backus-Naur Form (BNF), annotated with constraints typical of labeled acyclic directed graphs (DAG). The BNF induction begins from a collection of neural unit symbols with up to five extra decoration fields (including tensor depth and sharing fields). The inductive rules provide not only the general graph structure but also specific shortcuts for residual blocks of units. A syntactic mechanism for modularizing network fragments is introduced via user-defined units and their instances. Moreover, dual BNF rules are specified in order to generate the Dual Symbolic Tensor Neural Network (DSTNN). The joint interpretation of STNN and DSTNN provides the correct flow of gradient tensors, back-propagated at the training stage. The proposed symbolic representation of CNNs is illustrated for six generic digital media applications (CREAMS): Compression, Recognition, Embedding, Annotation, 3D Modeling for human-computer interfacing, and data Security based on digital media objects. To make the CNN description and its gradient flow complete, for all presented applications, the symbolic representations of mathematically defined loss/gain functions and the gradient flow equations for all core units used are given. The tutorial aims to convince the reader that STNN is not only a convenient symbolic notation for public presentations of CNN-based solutions to CREAMS problems, but also a design blueprint with the potential for automatic generation of application source code.
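The record reproduces only the abstract, not the STNN grammar itself. As a rough illustration of the DAG constraint the abstract attaches to the BNF-generated network graphs, the following minimal sketch (all names hypothetical, not taken from the paper) represents a toy network expression as a labeled directed graph and checks acyclicity with Kahn's algorithm:

```python
# Hypothetical sketch: a toy "STNN-like" network as a directed graph of unit
# labels, with the acyclicity check that the abstract's DAG constraint implies.
from collections import defaultdict

def is_dag(edges):
    """Kahn's algorithm: True iff the directed graph given by (u, v) edges
    is acyclic, i.e. a topological order covers every node."""
    indeg = defaultdict(int)
    adj = defaultdict(list)
    nodes = set()
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    queue = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while queue:
        n = queue.pop()
        seen += 1
        for m in adj[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return seen == len(nodes)

# A chain input -> conv -> relu -> output with a residual shortcut
# (the kind of block the abstract says the BNF rules abbreviate).
edges = [("in", "conv"), ("conv", "relu"), ("relu", "out"), ("in", "out")]
print(is_dag(edges))  # True: a residual shortcut keeps the graph acyclic
```

A residual connection adds an edge but no cycle, so such expressions remain valid under the DAG constraint; a self-loop or feedback edge would make `is_dag` return `False`.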
Pages
89--184
Physical description
Bibliography: 28 items, figures, tables.
Authors
  • Faculty of Electronics and Information Technology, Warsaw University of Technology, Warsaw, Poland
Bibliography
  • [1] Rosenblatt F. The Perceptron-a perceiving and recognizing automaton. Report 85-460-1. Technical report, Cornell Aeronautical Laboratory, 1957.
  • [2] Werbos PJ. Applications of advances in nonlinear sensitivity analysis. In: Drenick RF, Kozin F (eds.), System Modeling and Optimization. Springer Berlin Heidelberg, Berlin, Heidelberg. ISBN 978-3-540-39459-4, 1982 pp. 762-770.
  • [3] Rumelhart DE, Hinton GE, Williams RJ. Learning Representations by Back-propagating Errors. In: Anderson JA, Rosenfeld E (eds.), Neurocomputing: Foundations of Research, pp. 696-699. MIT Press, Cambridge, MA, USA. ISBN 0-262-01097-6, 1988. URL http://dl.acm.org/citation.cfm?id=65669.104451.
  • [4] Schmidhuber J. Deep Learning in Neural Networks: An Overview. CoRR, 2014. abs/1404.7828. 1404.7828, URL http://arxiv.org/abs/1404.7828.
  • [5] Knuth DE. Backus Normal Form vs. Backus Naur Form. Commun. ACM, 1964. 7(12):735-736. doi:10.1145/355588.365140. URL http://doi.acm.org/10.1145/355588.365140.
  • [6] Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, Lin Z, Desmaison A, Antiga L, Lerer A. Automatic differentiation in PyTorch. In: 31st Conference on Neural Information Processing Systems (NIPS 2017). 2017.
  • [7] Ruder S. An overview of gradient descent optimization algorithms. CoRR, 2016. abs/1609.04747. 1609.04747, URL http://arxiv.org/abs/1609.04747.
  • [8] Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. CoRR, 2014. abs/1412.6980. 1412.6980, URL http://arxiv.org/abs/1412.6980.
  • [9] Nesterov YE. A method for solving the convex programming problem with convergence rate O(1/k^2). Dokl. Akad. Nauk SSSR, 1983. 269:543-547. URL https://ci.nii.ac.jp/naid/10029946121/en/.
  • [10] Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative Adversarial Nets. In: Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14. MIT Press, Cambridge, MA, USA, 2014 pp. 2672-2680. URL http://dl.acm.org/citation.cfm?id=2969033.2969125.
  • [11] Ulyanov D, Vedaldi A, Lempitsky VS. Instance Normalization: The Missing Ingredient for Fast Stylization. CoRR, 2016. abs/1607.08022. 1607.08022, URL http://arxiv.org/abs/1607.08022.
  • [12] Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR, 2014. abs/1409.1556. 1409.1556, URL http://arxiv.org/abs/1409.1556.
  • [13] Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going Deeper with Convolutions. CoRR, 2014. abs/1409.4842. 1409.4842, URL http://arxiv.org/abs/1409.4842.
  • [14] He K, Zhang X, Ren S, Sun J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. CoRR, 2014. abs/1406.4729. 1406.4729, URL http://arxiv.org/abs/1406.4729.
  • [15] Agustsson E, Tschannen M, Mentzer F, Timofte R, Gool LV. Generative Adversarial Networks for Extreme Learned Image Compression. CoRR, 2018. abs/1804.02958. 1804.02958, URL http://arxiv.org/abs/1804.02958.
  • [16] Mentzer F, Agustsson E, Tschannen M, Timofte R, Van Gool L. Conditional Probability Models for Deep Image Compression. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
  • [17] Wang T, Liu M, Zhu J, Tao A, Kautz J, Catanzaro B. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. CoRR, 2017. abs/1711.11585. 1711.11585, URL http://arxiv.org/abs/1711.11585.
  • [18] Isola P, Zhu J, Zhou T, Efros AA. Image-to-Image Translation with Conditional Adversarial Networks. CoRR, 2016. abs/1611.07004. 1611.07004, URL http://arxiv.org/abs/1611.07004.
  • [19] Johnson J, Alahi A, Li F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. CoRR, 2016. abs/1603.08155. 1603.08155, URL http://arxiv.org/abs/1603.08155.
  • [20] Radford A, Metz L, Chintala S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. CoRR, 2015. abs/1511.06434. 1511.06434, URL http://arxiv.org/abs/1511.06434.
  • [21] Zhu J, Park T, Isola P, Efros AA. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. CoRR, 2017. abs/1703.10593. 1703.10593, URL http://arxiv.org/abs/1703.10593.
  • [22] Chung JS, Nagrani A, Zisserman A. VoxCeleb2: Deep Speaker Recognition. In: INTERSPEECH. 2018.
  • [23] He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. CoRR, 2015. abs/1512.03385. 1512.03385, URL http://arxiv.org/abs/1512.03385.
  • [24] Dong S, Zhang R, Liu J. Invisible Steganography via Generative Adversarial Network. ArXiv e-prints, 2018. 1807.08571.
  • [25] Wang Z, Simoncelli EP, Bovik AC. Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2003, volume 2. 2003 pp. 1398-1402 Vol.2. doi:10.1109/ACSSC.2003.1292216.
  • [26] Kowalski M, Naruniec J. (personal communication).
  • [27] Wen Y, Zhang K, Li Z, Qiao Y. A Discriminative Feature Learning Approach for Deep Face Recognition. In: Leibe B, Matas J, Sebe N, Welling M (eds.), Computer Vision - ECCV 2016. Springer International Publishing, Cham. ISBN 978-3-319-46478-7, 2016 pp. 499-515.
  • [28] Pilarczyk R, Skarbek W. Tuning deep learning algorithms for face alignment and pose estimation. In: Proc.SPIE, volume 10808. 2018 pp. 10808-10808-8. doi:10.1117/12.2501682. URL https://doi.org/10.1117/12.2501682.
Notes
Record compiled under agreement 509/P-DUN/2018 from funds of the Polish Ministry of Science and Higher Education (MNiSW) allocated to science-dissemination activities (2019).
Document type
YADDA identifier
bwmeta1.element.baztech-e37354a1-8541-4d2b-a857-c26506390cc0