Article title

Lexicon and attention based handwritten text recognition system

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Handwritten text recognition, a sub-domain of pattern recognition, is widely studied in the computer vision community because of its room for improvement and its applicability to daily life. With the growth in computational power over the last few decades, neural-network-based systems have contributed heavily to state-of-the-art handwritten text recognizers. In the same direction, we take two state-of-the-art neural network systems and merge the attention mechanism with them. The attention technique has been widely used in neural machine translation and automatic speech recognition and is now being applied to text recognition. In this study, we achieve a 4.15% character error rate and a 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and a 16.14% word error rate on the GW dataset, after merging attention and the word beam search decoder with the existing Flor et al. architecture. For further analysis, we also use a system similar to the Shi et al. neural network with a greedy decoder and observe a 23.27% improvement in character error rate over the base model.
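The character and word error rates quoted above are conventionally computed as an edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The following is a minimal sketch of that convention, not the authors' evaluation code; the function names `edit_distance`, `cer`, and `wer` are hypothetical, and whitespace tokenization for words is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, with the ground truth "hello world" and the recognized text "hallo world", one of eleven characters and one of two words differ, so `cer` gives 1/11 and `wer` gives 0.5.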
Year
Pages
75--92
Physical description
Bibliography: 58 items, figures, tables, charts.
Authors
  • Department of Computer Science and Applications, Panjab University, Chandigarh, India
  • D.M. College (Affil. to Panjab University), Moga, Punjab, India
  • Computer Networking and Information Technology, PRL, Ahmedabad, Gujarat, India
author
  • Department of Computer Science and Applications, Panjab University, Chandigarh, India
Bibliography
  • [1] J. Almazan, A. Gordo, A. Fornes, and E. Valveny. Word spotting and recognition with embedded attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12): 2552-2566, 2014. doi:10.1109/TPAMI.2014.2339814.
  • [2] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. In Proc. 3rd Int. Conf. Learning Representations, ICLR 2015, San Diego, CA, 7-9 May 2015. Accessible in arXiv. doi:10.48550/arXiv.1409.0473.
  • [3] R. E. Bellman and S. E. Dreyfus. Applied Dynamic Programming, volume 2050 of Princeton Legacy Library. Princeton University Press, 2015. doi:10.1515/9781400874651.
  • [4] A.-L. Bianne-Bernard, F. Menasri, Al-Hajj M. R., et al. Dynamic and contextual information in HMM modeling for handwritten word recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10): 2066-2080, 2011. doi:10.1109/TPAMI.2011.22.
  • [5] T. Bluche. Joint line segmentation and transcription for end-to-end handwritten paragraph recognition. arXiv, 2016. arXiv:1604.08352. doi:10.48550/arXiv.1604.08352.
  • [6] T. Bluche. Joint line segmentation and transcription for end-to-end handwritten paragraph recognition. In Advances in Neural Information Processing Systems 29 - Proc. 30th Conf. NIPS 2016, volume 29, pages 838-846, Barcelona, Spain, 5-10 Dec 2016. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2016/file/2bb232c0b13c774965ef8558f0fbd615-Paper.pdf.
  • [7] T. Bluche, J. Louradour, and R. Messina. Scan, Attend and Read: End-to-end handwritten paragraph recognition with MDLSTM attention. arXiv, 2016. arXiv:1604.03286. doi:10.48550/arXiv.1604.03286.
  • [8] T. Bluche, J. Louradour, and R. Messina. Scan, Attend and Read: End-to-end handwritten paragraph recognition with MDLSTM attention. In Proc. 2017 14th IAPR Int. Conf. Document Analysis and Recognition (ICDAR), pages 1050-1055, Kyoto, Japan, 9-15 Nov 2017. IEEE. doi:10.1109/ICDAR.2017.174.
  • [9] T. Bluche, H. Ney, and C. Kermorvant. Tandem HMM with convolutional neural network for handwritten word recognition. In Proc. 2013 IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), pages 2390-2394, Vancouver, Canada, 26-31 May 2013. IEEE. doi:10.1109/ICASSP.2013.6638083.
  • [10] K.-N. Chen, C.-H. Chen, and C.-C. Chang. Efficient illumination compensation techniques for text images. Digital Signal Processing, 22(5): 726-733, 2012. doi:10.1016/j.dsp.2012.04.010.
  • [11] W.-T. Chen, P. Gader, and H. Shi. Lexicon-driven handwritten word recognition using optimal linear combinations of order statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(1): 77-82, 1999. doi:10.1109/34.745738.
  • [12] A. Chowdhury and L. Vig. An efficient end-to-end neural model for handwritten text recognition, 2018. arXiv:1807.07965v2. doi:10.48550/arXiv.1807.07965.
  • [13] D. Coquenet, Y. Soullard, C. Chatelain, and T. Paquet. Have convolutions already made recurrence obsolete for unconstrained handwritten text recognition? In Proc. 2019 Int. Conf. Document Analysis and Recognition Workshops (ICDARW), volume 5, pages 65-70, Sydney, NSW, Australia, 20-25 Sep 2019. doi:10.1109/ICDARW.2019.40083.
  • [14] A. Das, J. Li, G. Ye, et al. Advancing acoustic-to-word CTC model with attention and mixed-units. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(12): 1880-1892, 2019. doi:10.1109/TASLP.2019.2933325.
  • [15] A. F. de Sousa Neto, B. L. D. Bezerra, A. H. Toselli, and E. B. Lima. HTR-Flor: A deep learning system for offline handwritten text recognition. In Proc. 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 54-61, Porto de Galinhas, Brazil, 07-10 Nov 2020. doi:10.1109/SIBGRAPI51738.2020.00016.
  • [16] A. Flor de Sousa Neto. handwritten-text-recognition. GitHub repository, 2020. https://github.com/arthurflor23/handwritten-text-recognition.
  • [17] P. Doetsch, M. Kozielski, and H. Ney. Fast and robust training of recurrent neural networks for offline handwriting recognition. In Proc. 2014 14th Int. Conf. Frontiers in Handwriting Recognition (ICFHR), pages 279-284, Hersonissos, Greece, 01-04 Sep 2014. IEEE. doi:10.1109/ICFHR.2014.54.
  • [18] P. Dreuw, P. Doetsch, C. Plahl, and H. Ney. Hierarchical hybrid MLP/HMM or rather MLP features for a discriminatively trained gaussian HMM: A comparison for offline handwriting recognition. In 2011 18th IEEE Int. Conf. Image Processing (ICIP), pages 3541-3544, Brussels, Belgium, 11-14 Sep 2011. IEEE. doi:10.1109/ICIP.2011.6116480.
  • [19] S. España-Boquera, M. J. Castro-Bleda, J. Gorbe-Moya, and F. Zamora-Martinez. Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(4): 767-779, 2010. doi:10.1109/TPAMI.2010.141.
  • [20] A. Fischer. Handwriting Recognition in Historical Documents. PhD thesis, Universität Bern, Switzerland, 13 Mar 2012. https://www.researchgate.net/publication/259346163.
  • [21] A. Fischer, A. Keller, V. Frinken, and H. Bunke. Lexicon-free handwritten word spotting using character HMMs. Pattern Recognition Letters, 33(7): 934-942, 2012. Special Issue on Awards from ICPR 2010. doi:10.1016/j.patrec.2011.09.009.
  • [22] A. Fischer, K. Riesen, and H. Bunke. Graph similarity features for HMM-based handwriting recognition in historical documents. In Proc. 2010 12th Int. Conf. Frontiers in Handwriting Recognition (ICFHR), pages 253-258, Kolkata, India, 16-18 Nov 2010. IEEE. doi:10.1109/ICFHR.2010.47.
  • [23] V. Frinken and S. Uchida. Deep BLSTM neural networks for unconstrained continuous handwritten text recognition. In Proc. 2015 13th Int. Conf. Document Analysis and Recognition (ICDAR), pages 911-915, Tunis, Tunisia, 23-26 Aug 2015. IEEE. doi:10.1109/ICDAR.2015.7333894.
  • [24] A. Giménez, I. Khoury, J. Andrés-Ferrer, and A. Juan. Handwriting word recognition using windowed Bernoulli HMMs. Pattern Recognition Letters, 35:149-156, 01 2014. doi:10.1016/j.patrec.2012.09.002.
  • [25] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In ICML ’06: Proc. 23rd Int. Conf. Machine Learning, pages 369-376, Pittsburgh, PA, USA, 25-29 Jun 2006. doi:10.1145/1143844.1143891.
  • [26] A. Graves and N. Jaitly. Towards end-to-end speech recognition with recurrent neural networks. In Proc. 31st Int. Conf. Machine Learning (ICML’14), volume 32 of ACM Proceedings, pages II-1764-II-1772, Beijing, China, 21-26 Jun 2014. JMLR.org. https://dl.acm.org/doi/abs/10.5555/3044805.3045089.
  • [27] A. Graves, M. Liwicki, S. Fernández, et al. A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5): 855-868, 2009. doi:10.1109/TPAMI.2008.137.
  • [28] A. Graves and J. Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems 21 - Proc. 22nd Conf. NeurIPS 2008, volume 21, pages 545-552. Curran Associates, Inc., 2008. https://proceedings.neurips.cc/paper/2008/file/66368270ffd51418ec58bd793f2d9b1b-Paper.pdf.
  • [29] Keras Special Interest Group. Keras. simple. flexible. powerful. https://keras.io.
  • [30] S. Johansson, G. N. Leech, and H. Goodluck. Manual of Information to accompany the Lancaster-Oslo/Bergen Corpus of British English, for use with digital Computers. Department of English, University of Oslo, Oslo, Norway, 1978.
  • [31] L. Kang, P. Riba, M. Rusiñol, et al. Pay attention to what you read: Non-recurrent handwritten text-line recognition. arXiv, 2020. arXiv:2005.13044. doi:10.48550/arXiv.2005.13044.
  • [32] L. Kang, P. Riba, M. Rusiñol, et al. Pay attention to what you read: Non-recurrent handwritten text-line recognition. Pattern Recognition, 129: 108766, 2022. doi:10.1016/j.patcog.2022.108766.
  • [33] G. Kim, V. Govindaraju, and S. N. Srihari. An architecture for handwritten text recognition systems. International Journal on Document Analysis and Recognition, 2(1): 37-44, 1999. doi:10.1007/s100320050035.
  • [34] M. Kozielski, P. Doetsch, and H. Ney. Improvements in RWTH’s system for off-line handwriting recognition. In Proc. 2013 IAPR 12th Int. Conf. Document Analysis and Recognition (ICDAR), pages 935-939, Washington, DC, USA, 25-28 Aug 2013. IEEE. doi:10.1109/ICDAR.2013.190.
  • [35] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90, 2017. doi:10.1145/3065386.
  • [36] L. Kumari and A. Sharma. A review of deep learning techniques in document image word spotting. Archives of Computational Methods in Engineering, 29(2): 1085-1106, 2022. doi:10.1007/s11831-021-09605-7.
  • [37] L. Kumari, S. Singh, and A. Sharma. Page level input for handwritten text recognition in document images. In J. H. Kim et al., editors, Proc. 7th Int. Conf. Harmony Search, Soft Computing and Applications (ICHSA), volume 140 of Lecture Notes on Data Engineering and Communications Technologies, pages 171-183, Seoul, South Korea, 23-24 Feb 2022. Springer Nature Singapore. doi:10.1007/978-981-19-2948-9_17.
  • [38] Y. Le Cun, B. Boser, J. S. Denker, et al. Handwritten digit recognition with a back-propagation network. In Advances in Neural Information Processing Systems 2 - Proc. Conf. NIPS 1989, volume 2, pages 396-404, San Francisco, CA, USA, 1990. Morgan Kaufmann Publishers Inc. https://proceedings.neurips.cc/paper/1989/file/53c3bce66e43be4f209556518c2fcb54-Paper.pdf.
  • [39] M.-T. Luong, H. Pham, and C. D. Manning. Effective approaches to attention-based neural machine translation. In Proc. EMNLP 2015, Lisbon, Portugal, 17-21 Sep 2015. Accessible in arXiv. doi:10.48550/ARXIV.1508.04025.
  • [40] U.-V. Marti and H. Bunke. The IAM-database: an English sentence database for offline handwriting recognition. International Journal on Document Analysis and Recognition, 5(1): 39-46, 2002. doi:10.1007/s100320200071.
  • [41] J. Michael, R. Labahn, T. Grüning, and J. Zöllner. Evaluating sequence-to-sequence models for handwritten text recognition. In Proc. 2019 IAPR Int. Conf. Document Analysis and Recognition (ICDAR), pages 1286-1293, Sydney, NSW, Australia, 20-25 Sep 2019. IEEE. doi:10.1109/ICDAR.2019.00208.
  • [42] J. Poulos and R. Valle. Character-based handwritten text transcription with attention networks. Neural Computing and Applications, 33(16): 10563-10573, 2021. doi:10.1007/s00521-021-05813-1.
  • [43] A. Poznanski and L. Wolf. CNN-N-Gram for handwriting word recognition. In Proc. 2016 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 2305-2314, Las Vegas, NV, USA, 27-30 Jun 2016. doi:10.1109/CVPR.2016.253.
  • [44] R. Ptucha, F. Petroski Such, S. Pillai, et al. Intelligent character recognition using fully convolutional neural networks. Pattern Recognition, 88: 604-613, 2019. doi:10.1016/j.patcog.2018.12.017.
  • [45] J. Puigcerver. Are multidimensional recurrent layers really necessary for handwritten text recognition? In Proc. 2017 14th IAPR Int. Conf. Document Analysis and Recognition (ICDAR), pages 67-72, Kyoto, Japan, 9-15 Nov 2017. IEEE. doi:10.1109/ICDAR.2017.20.
  • [46] H. Sak, A. Senior, and F. Beaufays. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Proc. Annual Conf. of the International Speech Communication Association (Interspeech), pages 338-342, Singapore, 14-18 Sep 2014. doi:10.21437/Interspeech.2014-80.
  • [47] J. Sauvola and M. Pietikäinen. Adaptive document image binarization. Pattern Recognition, 33(2): 225-236, 2000. doi:10.1016/S0031-3203(99)00055-2.
  • [48] H. Scheidl. CTCWordBeamSearch. GitHub repository, 2019. https://github.com/githubharald/CTCWordBeamSearch.
  • [49] H. Scheidl, S. Fiel, and R. Sablatnig. Word Beam Search: A connectionist temporal classification decoding algorithm. In Proc. 2018 16th Int. Conf. Frontiers in Handwriting Recognition (ICFHR), pages 253-258, Niagara Falls, NY, USA, 5-8 Aug 2018. IEEE. doi:10.1109/ICFHR-2018.2018.00052.
  • [50] B. Shi, X. Bai, and C. Yao. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. arXiv, 2015. arXiv:1507.05717. doi:10.48550/arXiv.1507.05717.
  • [51] B. Shi, X. Bai, and C. Yao. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(11): 2298-2304, 2017. doi:10.1109/TPAMI.2016.2646371.
  • [52] F. Petroski Such, D. Peri, F. Brockler, et al. Fully convolutional networks for handwriting recognition. In Proc. 2018 16th Int. Conf. Frontiers in Handwriting Recognition (ICFHR), pages 86-91, Niagara Falls, NY, USA, 5-8 Aug 2018. IEEE. doi:10.1109/ICFHR-2018.2018.00024.
  • [53] D. Suryani, P. Doetsch, and H. Ney. On the benefits of convolutional neural network combinations in offline handwriting recognition. In Proc. 2016 15th Int. Conf. Frontiers in Handwriting Recognition (ICFHR), pages 193-198, Shenzhen, China, 23-26 Oct 2016. IEEE. doi:10.1109/ICFHR.2016.0046.
  • [54] J. I. Toledo, S. Dey, A. Fornes, and J. Llados. Handwriting recognition by attribute embedding and recurrent neural networks. In Proc. 2017 14th IAPR Int. Conf. Document Analysis and Recognition (ICDAR), volume 01, pages 1038-1043, Kyoto, Japan, 9-15 Nov 2017. IEEE. doi:10.1109/ICDAR.2017.172.
  • [55] A. Vinciarelli. A survey on off-line cursive word recognition. Pattern Recognition, 35(7): 1433-1446, 2002. doi:10.1016/S0031-3203(01)00129-7.
  • [56] A. Vinciarelli and J. Luettin. A new normalization technique for cursive handwritten words. Pattern Recognition Letters, 22(9): 1043-1050, 2001. doi:10.1016/S0167-8655(01)00042-3.
  • [57] M. Yousef and T. Bishop. OrigamiNet: Weakly-supervised, segmentation-free, one-step, full page text recognition by learning to unfold. In Proc. 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 14698-14707, Seattle, WA, USA, 13-19 Jun 2020. IEEE. doi:10.1109/CVPR42600.2020.01472.
  • [58] M. Yousef, K. F. Hussain, and U. S. Mohammed. Accurate, data-efficient, unconstrained text recognition with convolutional neural networks. Pattern Recognition, 108: 107482, 2020. doi:10.1016/j.patcog.2020.107482.
Notes
Record prepared with funds from MEiN (Ministry of Education and Science), agreement no. SONP/SP/546092/2022, under the programme "Social Responsibility of Science" - module: Popularisation of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-2f920e79-de75-4803-9a99-17cacf1a11be