Article title

Character/word modelling: a two-step framework for text recognition in natural scene images

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Text recognition from images is a complex task in computer vision. Traditional text recognition methods typically rely on Optical Character Recognition (OCR), whose limitations in image processing can lead to unreliable results. Recent advances in deep-learning models provide an effective alternative for recognizing and classifying text in images. This study proposes a deep-learning-based text recognition system for natural scene images that incorporates character/word modeling, a two-step procedure that recognizes characters first and words second. In the first step, Convolutional Neural Networks (CNNs) are used to differentiate individual characters from image frames. In the second step, the Viterbi search algorithm performs lexicon-based word recognition to determine the optimal sequence of recognized characters, thereby enabling accurate word identification in natural scene images. The system was tested on the ICDAR 2003 and ICDAR 2013 datasets from the Kaggle repository and achieved accuracies of 78.5% and 80.5%, respectively.
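The second step described in the abstract can be illustrated with a minimal sketch of lexicon-constrained Viterbi decoding. This is not the authors' implementation: the alphabet, the function names `word_score`, `recognize` and `frames_for`, and the assumption that the CNN emits one log-probability vector per image frame are all hypothetical choices made for illustration.

```python
import math

# Assumed character set; the record does not state which alphabet the system uses.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def word_score(log_probs, word):
    """Best log-probability of aligning all frames to the characters of `word`.

    Frames are consumed left to right and every character must cover at
    least one frame (a monotonic Viterbi alignment).
    """
    n_frames, n_chars = len(log_probs), len(word)
    if n_chars == 0 or n_chars > n_frames:
        return -math.inf
    idx = [ALPHABET.index(c) for c in word]
    # best[j]: best score of a path that is at character j in the current frame
    best = [-math.inf] * n_chars
    best[0] = log_probs[0][idx[0]]
    for t in range(1, n_frames):
        new = [-math.inf] * n_chars
        for j in range(n_chars):
            stay = best[j]                                  # same character again
            advance = best[j - 1] if j > 0 else -math.inf   # move to next character
            prev = max(stay, advance)
            if prev > -math.inf:
                new[j] = prev + log_probs[t][idx[j]]
        best = new
    return best[-1]  # the path must end on the last character


def recognize(log_probs, lexicon):
    """Return the lexicon word with the highest alignment score."""
    return max(lexicon, key=lambda w: word_score(log_probs, w))


def frames_for(text, p=0.9):
    """Toy stand-in for CNN output: one confident frame per character in `text`."""
    rest = math.log((1.0 - p) / (len(ALPHABET) - 1))
    return [[math.log(p) if c == ch else rest for c in ALPHABET] for ch in text]


if __name__ == "__main__":
    frames = frames_for("caat")  # the letter "a" spans two frames
    print(recognize(frames, ["car", "cat", "cot"]))  # cat
```

Scoring each lexicon word separately keeps the sketch short; a real system with a large lexicon would more likely share computation across words, for example with a trie.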
Publisher
Journal
Year
Volume
Pages
637–652
Physical description
Bibliography: 16 items, figures, tables, charts.
Authors
  • Anna University, Chennai, Tamil Nadu, India
author
  • Anna University, Chennai, Tamil Nadu, India
author
  • Chitkara University Institute of Engineering & Technology, Chitkara University, Punjab, India
Bibliography
  • [1] Almazán J., Gordo A., Fornés A., Valveny E.: Word Spotting and Recognition with Embedded Attributes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36(12), pp. 2552–2566, 2014. doi: 10.1109/tpami.2014.2339814.
  • [2] Arafat S.Y., Iqbal M.J.: Urdu-Text Detection and Recognition in Natural Scene Images Using Deep Learning, IEEE Access, vol. 8, pp. 96787–96803, 2020. doi: 10.1109/access.2020.2994214.
  • [3] Bahi H.E., Zatni A.: Text recognition in document images obtained by a smartphone based on deep convolutional and recurrent neural network, Multimedia Tools and Applications, vol. 78(18), pp. 26453–26481, 2019. doi: 10.1007/s11042-019-07855-z.
  • [4] Bhunia A.K., Kumar G., Roy P.P., Balasubramanian R., Pal U.: Text recognition in scene image and video frame using color channel selection, Multimedia Tools and Applications, vol. 77, pp. 8551–8578, 2018. doi: 10.1007/s11042-017-4750-6.
  • [5] Chen X., Wang T., Zhu Y., Jin L., Luo C.: Adaptive embedding gate for attention-based scene text recognition, Neurocomputing, vol. 381, pp. 261–271, 2020. doi: 10.1016/j.neucom.2019.11.049.
  • [6] Coates A., Carpenter B., Case C., Satheesh S., Suresh B., Wang T., Wu D.J., Ng A.Y.: Text detection and character recognition in scene images with unsupervised feature learning. In: 2011 International Conference on Document Analysis and Recognition, pp. 440–445, IEEE, 2011. doi: 10.1109/icdar.2011.95.
  • [7] Elagouni K., Garcia C., Mamalet F., Sebillot P.: Combining multi-scale character recognition and linguistic knowledge for natural scene text OCR. In: DAS ’12: Proceedings of the 2012 10th IAPR International Workshop on Document Analysis Systems, pp. 120–124, IEEE, 2012. doi: 10.1109/das.2012.26.
  • [8] Goel V., Mishra A., Alahari K., Jawahar C.V.: Whole is greater than sum of parts: Recognizing scene text words. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 398–402, IEEE, 2013. doi: 10.1109/icdar.2013.87.
  • [9] Harizi R., Walha R., Drira F., Zaied M.: Convolutional neural network with joint stepwise character/word modelling based system for scene text recognition, Multimedia Tools and Applications, pp. 3091–3106, 2022. doi: 10.1007/s11042-021-10663-z.
  • [10] Jaderberg M., Simonyan K., Vedaldi A., Zisserman A.: Reading text in the wild with convolutional neural networks, International Journal of Computer Vision, vol. 116, pp. 1–20, 2016. doi: 10.1007/s11263-015-0823-z.
  • [11] Liao M., Zhang J., Wan Z., Xie F., Liang J., Lyu P., Yao C., Bai X.: Scene text recognition from two-dimensional perspective. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8714–8721, 2019. doi: 10.1609/aaai.v33i01.33018714.
  • [12] Liu X., Kawanishi T., Wu X., Kashino K.: Scene text recognition with CNN classifier and WFST-based word labeling. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 3999–4004, IEEE, 2016. doi: 10.1109/icpr.2016.7900259.
  • [13] Liu X., Kawanishi T., Wu X., Kashino K.: Scene text recognition with high performance CNN classifier and efficient word inference. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1322–1326, IEEE, 2016. doi: 10.1109/icassp.2016.7471891.
  • [14] Long S., He X., Yao C.: Scene Text Detection and Recognition: The Deep Learning Era, 2018, arXiv preprint arXiv:1811.04256.
  • [15] Novikova T., Barinova O., Kohli P., Lempitsky V.: Large-lexicon attribute-consistent text recognition in natural images, Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7–13, 2012, Proceedings, Part VI, pp. 752–765, 2012. doi: 10.1007/978-3-642-33783-3_54.
  • [16] Portaz M., Kohl M., Chevallet J.P., Quénot G., Mulhem P.: Object instance identification with fully convolutional networks, Multimedia Tools and Applications, vol. 78(3), pp. 2747–2764, 2019. doi: 10.1007/s11042-018-5798-7.
Notes
Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularisation of Science (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-9e5ed01d-4ed7-478e-97ba-f4c56a029a49