Article title

Improving Logical Structure Analysis of Visually Structured Documents with Textual Features

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
This paper introduces a new model that improves logical structure analysis of visually structured documents by extending the model of Koreeda and Manning [1]. To enhance the textual features, we define a new feature that uses the font size of the text as an indicator. We observe that font size is an important indicator of a document's structure. The new font size feature is combined with visual, textual, and semantic features to train an analyzer. Experimental results on four legal datasets show that the new font size feature contributes to the model and improves the F-scores. An ablation study also shows the contribution of each feature in our model.
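The abstract does not say how the font size feature is computed, so the following is only a minimal sketch of one plausible design, not the authors' method. It assumes each text line comes with a font size from a PDF extractor, treats the most frequent size as the body-text size, and derives three illustrative values per line. The names `TextLine`, `font_size_features`, and the feature keys are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TextLine:
    text: str
    font_size: float  # points, as reported by a PDF extractor (assumed input)


def font_size_features(lines):
    """Return one feature dict per line (hypothetical feature design)."""
    if not lines:
        return []
    # Assumption: the most common font size is the body-text size.
    body_size = Counter(line.font_size for line in lines).most_common(1)[0][0]
    features = []
    prev_size = None
    for line in lines:
        features.append({
            # Size relative to body text; headings tend to be > 1.0.
            "relative_size": line.font_size / body_size,
            "larger_than_body": line.font_size > body_size,
            # Change from the previous line; 0.0 for the first line.
            "size_change_from_prev": (
                0.0 if prev_size is None else line.font_size - prev_size
            ),
        })
        prev_size = line.font_size
    return features
```

In a pipeline like the one the abstract describes, values of this kind would be concatenated with the visual, textual, and semantic features before training; how the paper actually encodes them is not stated in this record.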
Year
Volume
Pages
151–156
Physical description
Bibliography: 14 items, charts.
Authors
author
  • Hung Yen University of Technology and Education, Hung Yen, Vietnam
  • Hanoi University of Science and Technology, Hanoi, Vietnam
  • Hung Yen University of Technology and Education, Hung Yen, Vietnam
Bibliography
  • [1] Y. Koreeda and C. Manning, “Capturing logical structure of visually structured documents with multimodal transition parser,” in Proceedings of the Natural Legal Language Processing Workshop 2021. Punta Cana, Dominican Republic: Association for Computational Linguistics, Nov. 2021, pp. 144–154. [Online]. Available: https://aclanthology.org/2021.nllp-1.15
  • [2] F. Obermaier, B. Obermayer, V. Wormer, and W. Jaschensky, “About the Panama Papers,” in Süddeutsche Zeitung, 2016.
  • [3] M.-T. Nguyen, D. T. Le, and L. Le, “Transformers-based information extraction with limited data for domain-specific business documents,” Engineering Applications of Artificial Intelligence, vol. 97, p. 104100, 2021.
  • [4] Y. Hatsutori, K. Yoshikawa, and H. Imai, “Estimating legal document structure by considering style information and table of contents,” in New Frontiers in Artificial Intelligence, S. Kurahashi, Y. Ohta, S. Arai, K. Satoh, and D. Bekki, Eds. Cham: Springer International Publishing, 2017, pp. 270–283.
  • [5] C. G. Stahl, S. R. Young, D. Herrmannova, R. M. Patton, and J. C. Wells, “DeepPDF: A deep learning approach to extracting text from PDFs,” Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States), Tech. Rep., 2018.
  • [6] C. Soto and S. Yoo, “Visual detection with context for document layout analysis,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 3464–3470.
  • [7] Y. Xu, M. Li, L. Cui, S. Huang, F. Wei, and M. Zhou, “LayoutLM: Pre-training of text and layout for document image understanding,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 1192–1200.
  • [8] Y. Xu, Y. Xu, T. Lv, L. Cui, F. Wei, G. Wang, Y. Lu, D. A. F. Florêncio, C. Zhang, W. Che, M. Zhang, and L. Zhou, “LayoutLMv2: Multi-modal pre-training for visually-rich document understanding,” CoRR, vol. abs/2012.14740, 2020. [Online]. Available: https://arxiv.org/abs/2012.14740
  • [9] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
  • [10] C. Sporleder and M. Lapata, “Automatic paragraph identification: A study across languages and domains,” in Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004, pp. 72–79.
  • [11] C. Abreu, H. Cardoso, and E. Oliveira, “FinDSE@FinTOC-2019 shared task,” in Proceedings of the Second Financial Narrative Processing Workshop (FNP 2019). Turku, Finland: Linköping University Electronic Press, Sep. 2019, pp. 69–73. [Online]. Available: https://aclanthology.org/W19-6410
  • [12] D. Ferrés, H. Saggion, F. Ronzano, and À. Bravo, “PDFdigest: An adaptable layout-aware PDF-to-XML textual content extractor for scientific articles,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018.
  • [13] M. Ostendorf, M. Collins, S. Narayanan, D. W. Oard, and L. Vanderwende, “Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics,” in Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009.
  • [14] S. Zhang, X. Ma, K. Duh, and B. V. Durme, “AMR parsing as sequence-to-graph transduction,” CoRR, vol. abs/1905.08704, 2019. [Online]. Available: http://arxiv.org/abs/1905.08704
Notes
PL
Record prepared with funds from MNiSW, agreement no. SONP/SP/546092/2022, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-58ca1fb9-fc04-440c-90b5-31f95bb10239