Identifiers
Title variants
Publication languages
Abstracts
The aim of the article is to analyze and compare the performance and accuracy of architectures with different numbers of parameters, using handwritten Latin characters from the Polish Handwritten Characters Database (PHCD) as an example. PHCD is a database of handwriting scans containing letters of the Latin alphabet as well as the diacritics characteristic of the Polish language. Each class in the PHCD dataset contains 6,000 scans of its character. Six proposed architectures were examined and compared with an architecture from the literature. Each model was trained for 50 epochs, and its prediction accuracy was then measured on a separate test set. This experiment was repeated 20 times for each model. Accuracy, the number of parameters, and the number of floating-point operations performed by the network were compared. The research was conducted on subsets including uppercase letters, lowercase letters, lowercase letters with diacritics, and the set of all available characters. A relationship between the number of parameters and the accuracy of the model was identified. Among the examined architectures, those that significantly improved prediction accuracy at the expense of a larger network size were selected, as was a network with prediction accuracy similar to that of the base network but with twice as many model parameters.
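The three quantities compared in the abstract (accuracy over repeated runs, number of parameters, number of floating-point operations) can be illustrated with a minimal sketch. It is not the authors' code: the toy network, the input size, the class count and the accuracy values are hypothetical, and the FLOP count assumes the common convention of two operations per multiply-accumulate with bias additions ignored.

```python
from statistics import mean, stdev


def conv2d_params(kernel, c_in, c_out):
    """Trainable weights of a square convolution plus one bias per filter."""
    return kernel * kernel * c_in * c_out + c_out


def conv2d_flops(kernel, c_in, c_out, out_h, out_w):
    """FLOPs of a convolution, counting 2 per multiply-accumulate (assumed convention)."""
    return 2 * kernel * kernel * c_in * c_out * out_h * out_w


def dense_params(n_in, n_out):
    """Trainable weights of a fully connected layer plus one bias per output."""
    return n_in * n_out + n_out


def dense_flops(n_in, n_out):
    """FLOPs of a fully connected layer, counting 2 per multiply-accumulate."""
    return 2 * n_in * n_out


def summarize(accuracies):
    """Mean and sample standard deviation of accuracy over repeated runs."""
    return mean(accuracies), stdev(accuracies)


# Hypothetical toy network (not taken from the article):
# 32x32 grayscale input -> 3x3 'same' convolution with 16 filters
# -> 2x2 pooling (no parameters) -> dense layer over NUM_CLASSES outputs.
NUM_CLASSES = 10               # placeholder, not the PHCD class count
features = 16 * 16 * 16        # 16 channels of 16x16 after pooling

total_params = conv2d_params(3, 1, 16) + dense_params(features, NUM_CLASSES)
total_flops = conv2d_flops(3, 1, 16, 32, 32) + dense_flops(features, NUM_CLASSES)

# Made-up test accuracies standing in for repeated training runs.
run_accuracies = [0.90, 0.92, 0.94]
avg, spread = summarize(run_accuracies)

print(f"parameters: {total_params}, FLOPs: {total_flops}")
print(f"accuracy: {avg:.3f} +/- {spread:.3f}")
```

Reporting the mean together with the spread over repeated runs is what makes accuracy differences between architectures of different sizes comparable.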
Journal
Year
Volume
Pages
88–102
Physical description
Bibliography: 19 items, figures, tables
Authors
author
- Lublin University of Technology, Faculty of Electrical Engineering and Computer Science, Department of Computer Science, Poland
author
- Lublin University of Technology, Faculty of Electrical Engineering and Computer Science, Poland
Bibliography
- [1] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., … Zheng, X. (2016). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. ArXiv, abs/1603.04467. https://doi.org/10.48550/ARXIV.1603.04467
- [2] Belkin, M., Hsu, D., Ma, S., & Mandal, S. (2019). Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proceedings of the National Academy of Sciences, 116(32), 15849-15854. https://doi.org/10.1073/pnas.1903070116
- [3] Blalock, D., Ortiz, J. J. G., Frankle, J., & Guttag, J. (2020). What is the state of neural network pruning? ArXiv, abs/2003.03033. https://doi.org/10.48550/arXiv.2003.03033
- [4] Bouthillier, X., Delaunay, P., Bronzi, M., Trofimov, A., Nichyporuk, B., Szeto, J., Sepah, N., Raff, E., Madan, K., Voleti, V., Kahou, S. E., Michalski, V., Serdyuk, D., Arbel, T., Pal, C., Varoquaux, G., & Vincent, P. (2021). Accounting for variance in machine learning benchmarks. ArXiv, abs/2103.03098. https://doi.org/10.48550/ARXIV.2103.03098
- [5] Choi, Y., El-Khamy, M., & Lee, J. (2016). Towards the limit of network quantization. ArXiv, abs/1612.01543. https://doi.org/10.48550/arXiv.1612.01543
- [6] Cohen, G., Afshar, S., Tapson, J., & Van Schaik, A. (2017). EMNIST: Extending MNIST to handwritten letters. 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 2921-2926). IEEE. https://doi.org/10.1109/IJCNN.2017.7966217
- [7] Gajoui, K. E., Allah, F. A., & Oumsis, M. (2015). Diacritical language OCR based on neural network: Case of Amazigh language. Procedia Computer Science, 73, 298–305. https://doi.org/10.1016/j.procs.2015.12.035
- [8] Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G., Cai, J., & Chen, T. (2018). Recent advances in convolutional neural networks. ArXiv, abs/1512.07108. https://doi.org/10.48550/arXiv.1512.07108
- [9] Hadidi, R., Cao, J., Xie, Y., Asgari, B., Krishna, T., & Kim, H. (2019). Characterizing the deployment of deep neural networks on commercial edge devices. 2019 IEEE International Symposium on Workload Characterization (IISWC) (pp. 35-48). IEEE. https://doi.org/10.1109/IISWC47752.2019.9041955
- [10] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 770-778). IEEE. https://doi.org/10.1109/CVPR.2016.90
- [11] Idziak, J., Šeļa, A., Woźniak, M., Leśniak, A., Byszuk, J., & Eder, M. (2021). Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets. In International Conference on Computational Science 2021 (pp. 137–150). Springer. https://doi.org/10.1007/978-3-030-77961-0_13
- [12] Islam, N., Islam, Z., & Noor, N. (2017). A survey on optical character recognition system. ArXiv, abs/1710.05703. https://doi.org/10.48550/arXiv.1710.05703
- [13] Lukasik, E., Charytanowicz, M., Milosz, M., Tokovarov, M., Kaczorowska, M., Czerwinski, D., & Zientarski, T. (2021). Recognition of handwritten Latin characters with diacritics using CNN. Bulletin of the Polish Academy of Sciences: Technical Sciences, 69(1), e136210. https://doi.org/10.24425/bpasts.2020.136210
- [14] Lutf, M., You, X., Cheung, Y., & Chen, C. (2014). Arabic font recognition based on diacritics features. Pattern Recognition, 47(2), 672–684. https://doi.org/10.1016/j.patcog.2013.07.015
- [15] Łukasik, E., & Zientarski, T. (2018). Comparative analysis of selected programs for optical text recognition. Journal of Computer Sciences Institute, 7, 191-194. https://doi.org/10.35784/jcsi.676
- [16] Sharma, R., Kaushik, B., & Gondhi, N. (2020). Character recognition using machine learning and deep learning - a survey. 2020 International Conference on Emerging Smart Computing and Informatics (ESCI) (pp. 341-345). IEEE. http://doi.org/10.1109/ESCI48226.2020.9167649
- [17] Tokovarov, M., Kaczorowska, M., & Milosz, M. (2020). Development of extensive Polish handwritten characters database for text recognition research. Advances in Science and Technology Research Journal, 14(3), 30-38. https://doi.org/10.12913/22998624/122567
- [18] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. ArXiv, abs/1706.03762. https://doi.org/10.48550/arXiv.1706.03762
- [19] Wang, H., Qin, C., Bai, Y., Zhang, Y., & Fu, Y. (2022). Recent advances on neural network pruning at initialization. ArXiv, abs/2103.06460. https://doi.org/10.48550/arXiv.2103.06460
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b4bb5168-64de-443a-aee8-13fccae5f813