From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition

Rościszewski, P.

doi:10.7494/csci.2017.18.4.2119

Article - details

Article title

From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition

Authors

Rościszewski P.

Identifiers

DOI

10.7494/csci.2017.18.4.2119

Abstracts

Recently gathered image datasets and new capabilities of high-performance computing systems have allowed the development of new artificial neural network models and training algorithms. Using these new machine learning models, computer vision tasks can be accomplished based on the raw values of image pixels instead of specific features. The principle of operation of deep artificial neural networks increasingly resembles what we believe happens in the human visual cortex. In this paper, we build up an understanding of convolutional neural networks by investigating supervised machine learning methods such as K-Nearest Neighbors, linear classifiers, and fully connected neural networks. We provide examples and accuracy results based on our implementation aimed at the problem of hand pose recognition.

Keywords

machine learning; artificial neural networks; computer vision

Publisher

Wydawnictwa AGH

Journal

Computer Science

Year

2017

Volume

Vol. 18 (4)

Pages

341-356

Physical description

Bibliography: 21 items, figures, charts, tables.

Contributors

author

Rościszewski P.

pawel.rosciszewski@pg.edu.pl

Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland

Bibliography

[1] Abadi M., Agarwal A., Barham P., Brevdo E., Chen Z., Citro C., Corrado G.S., Davis A., Dean J., Devin M., others: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. In: arXiv preprint arXiv:1603.04467, 2016. http://arxiv.org/abs/1603.04467.
[2] Bhuyan M.K., Neog D.R., Kar M.K.: Hand pose recognition using geometric features. In: Communications (NCC), 2011 National Conference on, pp. 1-5. IEEE, 2011. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5734786.
[3] Camgoz N.C., Hadfield S., Koller O., Bowden R.: Using convolutional 3d neural networks for user-independent continuous gesture recognition. In: Pattern Recognition (ICPR), 2016 23rd International Conference on, pp. 49-54. IEEE, 2016. http://ieeexplore.ieee.org/abstract/document/7899606/.
[4] Dardas N.H., Georganas N.D.: Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques, IEEE Transactions on Instrumentation and Measurement, vol. 60(11), pp. 3592-3607, 2011. http://dx.doi.org/10.1109/TIM.2011.2161140.
[5] Deng X., Yang S., Zhang Y., Tan P., Chang L., Wang H.: Hand3D: Hand Pose Estimation using 3D Neural Network. In: arXiv preprint arXiv:1704.02224, 2017. https://arxiv.org/abs/1704.02224.
[6] Erol A., Bebis G., Nicolescu M., Boyle R.D., Twombly X.: Vision-based hand pose estimation: A review, Computer Vision and Image Understanding, vol. 108(1-2), pp. 52-73, 2007. http://dx.doi.org/10.1016/j.cviu.2006.10.012.
[7] Hubel D.H., Wiesel T.N.: Receptive fields of single neurones in the cat's striate cortex, The Journal of Physiology, vol. 148(3), pp. 574-591, 1959. http://dx.doi.org/10.1113/jphysiol.1959.sp006308.
[8] Karpathy A., Toderici G., Shetty S., Leung T., Sukthankar R., Fei-Fei L.: Large-scale video classification with convolutional neural networks. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1725-1732. 2014. http://www.cv-foundation.org/openaccess/content_cvpr_2014/html/Karpathy_Large-scale_Video_Classification_2014_CVPR_paper.html.
[9] Kim T.K., Wong S.F., Cipolla R.: Tensor canonical correlation analysis for action classification. In: Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, pp. 1-8. IEEE, 2007. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4270162.
[10] Kingma D., Ba J.: Adam: A method for stochastic optimization. In: arXiv preprint arXiv:1412.6980, 2014. http://arxiv.org/abs/1412.6980.
[11] Molchanov P., Gupta S., Kim K., Kautz J.: Hand Gesture Recognition with 3D Convolutional Neural Networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1-7. 2015. http://www.cv-foundation.org/openaccess/content_cvpr_workshops_2015/W15/html/Molchanov_Hand_Gesture_Recognition_2015_CVPR_paper.html.
[12] Neverova N., Wolf C., Taylor G.W., Nebout F.: Multi-scale deep learning for gesture detection and localization. In: Workshop at the European Conference on Computer Vision, pp. 474-490. Springer, 2014. http://link.springer.com/chapter/10.1007/978-3-319-16178-5_33.
[13] Powers D.M.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, Journal of Machine Learning Technologies, vol. 2(1), pp. 37-63, 2011. https://bioinfopublication.org/viewhtml.php?artid=BIA0001114.
[14] Prasad V.S.N., Domke J.: Gabor filter visualization, Technical report, University of Maryland, 2005.
[15] Rumelhart D.E., Hinton G.E., Williams R.J.: Learning representations by back-propagating errors, Nature, vol. 323(6088), pp. 533-536, 1986. http://dx.doi.org/10.1038/323533a0.
[16] Russakovsky O., Deng J., Su H., Krause J., Satheesh S., Ma S., Huang Z., Karpathy A., Khosla A., Bernstein M., Berg A.C., Fei-Fei L.: ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol. 115(3), pp. 211-252, 2015. http://dx.doi.org/10.1007/s11263-015-0816-y.
[17] Sankowski D., Nowakowski J. (eds.): Computer vision in robotics and industrial applications. No. 3 in Series in computer vision. World Scientific, Singapore, 2014.
[18] Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R.: Dropout: A simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, vol. 15(1), pp. 1929-1958, 2014. http://dl.acm.org/citation.cfm?id=2670313.
[19] Sturman D., Zeltzer D.: A survey of glove-based input, IEEE Computer Graphics and Applications, vol. 14(1), pp. 30-39, 1994. http://dx.doi.org/10.1109/38.250916.
[20] Suarez J., Murphy R.R.: Hand gesture recognition with depth images: A review. In: 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication, pp. 411-417. 2012. http://dx.doi.org/10.1109/ROMAN.2012.6343787.
[21] Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818-2826. 2016. http://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.html.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-93700b39-8e0c-4783-9d0c-f36c843fffe4