ReqTagger: A Rule-Based Tagger for Automatic Glossary of Terms Extraction from Ontology Requirements

Wiśniewski, Dawid; Potoniec, Jędrzej; Ławrynowicz, Agnieszka

doi:10.2478/fcds-2022-0003

Abstract

Glossary of Terms extraction from textual requirements is an important step in ontology engineering methodologies. Although it was initially intended to be performed manually, recent years have shown that some degree of automation is possible. Building on these promising approaches, we introduce a novel, human-interpretable, rule-based method named ReqTagger, which automatically extracts candidates for ontology entities (classes or instances) and relations (data or object properties) from textual requirements. We compare ReqTagger to existing automatic methods on an evaluation benchmark consisting of over 550 requirements tagged with over 1700 entities and relations expected to be extracted. We discuss the quality of ReqTagger and provide details showing why it outperforms other methods. We also publish both the evaluation dataset and the implementation of ReqTagger.
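
As an illustration of what rule-based extraction over part-of-speech tags can look like, here is a minimal Python sketch. It is not ReqTagger's actual rule set, which this record does not describe. It assumes the tokens already carry coarse part-of-speech tags (supplied by hand below), and the names extract_candidates, ENTITY_TAGS and RELATION_TAGS are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, coarse POS tag); tags are hand-supplied assumptions

# Hypothetical rule vocabulary, not taken from the paper.
ENTITY_TAGS = {"ADJ", "NOUN", "PROPN"}   # tags allowed inside an entity chunk
RELATION_TAGS = {"VERB", "ADP"}          # tags allowed inside a relation chunk


def extract_candidates(tokens: List[Token]) -> Tuple[List[str], List[str]]:
    """Return (entity candidates, relation candidates) from POS-tagged tokens."""
    entities: List[str] = []
    relations: List[str] = []
    i = 0
    while i < len(tokens):
        tag = tokens[i][1]
        if tag in ENTITY_TAGS:
            group = ENTITY_TAGS
        elif tag == "VERB":              # a relation chunk must start with a verb
            group = RELATION_TAGS
        else:
            i += 1
            continue
        # Extend the chunk while the following tags belong to the same group.
        j = i
        while j < len(tokens) and tokens[j][1] in group:
            j += 1
        span = tokens[i:j]
        if group is ENTITY_TAGS:
            # Drop trailing adjectives so an entity chunk ends in a noun.
            while span and span[-1][1] == "ADJ":
                span = span[:-1]
            if span:
                entities.append(" ".join(word for word, _ in span))
        else:
            relations.append(" ".join(word for word, _ in span))
        i = j
    return entities, relations


# Hand-tagged example requirement: "Which software tools support data formats?"
example = [
    ("Which", "DET"), ("software", "NOUN"), ("tools", "NOUN"),
    ("support", "VERB"), ("data", "NOUN"), ("formats", "NOUN"), ("?", "PUNCT"),
]
print(extract_candidates(example))
# prints (['software tools', 'data formats'], ['support'])
```

Because every decision in such a tagger is an explicit rule over tags, each extracted candidate can be traced back to the rule that produced it, which is the sense in which a rule-based method is human-interpretable.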

Keywords

competency questions; ontology requirements; ontology; information extraction; part-of-speech tagging

Publisher

Wydawnictwo Politechniki Poznańskiej

Journal

Foundations of Computing and Decision Sciences

Year

2022

Volume

Vol. 47, No. 1

Pages

65–86

Physical description

Bibliography: 28 items, figures, tables.

Contributors

author

Wiśniewski Dawid

dawid.wisniewski@cs.put.poznan.pl

Faculty of Computing and Telecommunication, Poznan University of Technology, Poland

https://orcid.org/0000-0003-1194-7921

author

Potoniec Jędrzej

Faculty of Computing and Telecommunication, Poznan University of Technology, Poland

https://orcid.org/0000-0002-6115-6485

author

Ławrynowicz Agnieszka

Faculty of Computing and Telecommunication, Poznan University of Technology, Poland

https://orcid.org/0000-0002-2442-345X

Bibliography

[1] Antoniou G. and Van Harmelen F. Web ontology language: Owl. In Handbook on ontologies, pages 67–92. Springer, 2004.
[2] Bezerra C., Santana F., and Freitas F. Cqchecker: A tool to check ontologies in owl-dl using competency questions written in controlled natural language. Learning and Nonlinear Models, 12:115–129, 2014.
[3] del Carmen Suárez-Figueroa M., de Cea G. A., Buil C., Dellschaft K., Fernández-López M., García A., Gómez-Pérez A., Herrero G., Montiel-Ponsoda E., Sabou M., Villazon-Terrazas B., and Yufei Z. D5.4.1 neon methodology for building contextualized ontology networks, Feb. 2008.
[4] Dwarakanath A., Ramnani R. R., and Sengupta S. Automatic extraction of glossary terms from natural language requirements. In 21st IEEE International Requirements Engineering Conference, RE 2013, Rio de Janeiro-RJ, Brazil, July 15-19, 2013, pages 314–319. IEEE Computer Society, 2013.
[5] Fernández-Izquierdo A., Poveda-Villalón M., and García-Castro R. CORAL: A corpus of ontological requirements annotated with lexico-syntactic patterns. In ESWC, 2019.
[6] Fernandez-Lopez M., Gomez-Perez A., and Juristo N. Methontology: from ontological art towards ontological engineering. In Proceedings of the AAAI97 Spring Symposium, pages 33–40, Stanford, USA, March 1997.
[7] Grishman R. Information extraction: Techniques and challenges. In International summer school on information extraction, pages 10–27. Springer, 1997.
[8] Gruninger M. Methodology for the design and evaluation of ontologies. In IJCAI 1995, 1995.
[9] Huang Z., Xu W., and Yu K. Bidirectional lstm-crf models for sequence tagging. arXiv preprint arXiv:1508.01991, 2015.
[10] Keet C. M., Mahlaza Z., and Antia M.-J. Claro: a data-driven cnl for specifying competency questions. arXiv preprint arXiv:1907.07378, 2019.
[11] Lafferty J. D., McCallum A., and Pereira F. C. N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 282–289, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.
[12] Lawrynowicz A. and Keet C. M. The TDDonto tool for test-driven development of DL knowledge bases. In Lenzerini M. and Peñaloza R., editors, Description Logics, volume 1577 of CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[13] Lenat D. B. and Guha R. V. Building Large Knowledge-Based Systems; Representation and Inference in the Cyc Project. Addison-Wesley Longman Publishing Co., Inc., USA, 1st edition, 1989.
[14] Ling X. and Weld D. S. Fine-grained entity recognition. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI’12, pages 94–100. AAAI Press, 2012.
[15] Malone J., Brown A., Lister A., Ison J., Hull D., Parkinson H., and Stevens R. The software ontology (SWO): A resource for reproducibility in biomedical data analysis, curation and digital preservation. Journal of biomedical semantics, 5:25, 06 2014.
[16] Miller G. A. WordNet: A lexical database for english. Commun. ACM, 38(11):39–41, 1995.
[17] Ochodek M. and Nawrocki J. R. Automatic transactions identification in use cases. In Meyer B., Nawrocki J. R., and Walter B., editors, Balancing Agility and Formalism in Software Engineering, Second IFIP TC 2 Central and East European Conference on Software Engineering Techniques, CEE-SET 2007, Poznan, Poland, October 10-12, 2007, Revised Selected Papers, volume 5082 of Lecture Notes in Computer Science, pages 55–68. Springer, 2007.
[18] Park Y., Byrd R. J., and Boguraev B. Automatic glossary extraction: Beyond terminology identification. In 19th International Conference on Computational Linguistics, COLING 2002, Howard International House and Academia Sinica, Taipei, Taiwan, August 24 - September 1, 2002, 2002.
[19] Petrucci G., Ghidini C., and Rospocher M. Ontology learning in the deep. In Knowledge Engineering and Knowledge Management - 20th International Conference, EKAW 2016, Bologna, Italy, November 19-23, 2016, Proceedings, pages 480–495, 2016.
[20] Potoniec J., Wisniewski D., Ławrynowicz A., and Keet C. M. Dataset of ontology competency questions to SPARQL-OWL queries translations. Data in Brief, 29, 2020.
[21] Ren Y., Parvizi A., Mellish C., Pan J. Z., van Deemter K., and Stevens R. Towards competency question-driven ontology authoring. In Presutti V., d’Amato C., Gandon F., d’Aquin M., Staab S., and Tordai A., editors, The Semantic Web: Trends and Challenges, pages 752–767, Cham, 2014. Springer International Publishing.
[22] Suárez-Figueroa M. C., Gómez-Pérez A., and Fernández-López M. The NeOn Methodology for Ontology Engineering, pages 9–34. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
[23] Sure Y., Staab S., and Studer R. On-To-Knowledge Methodology (OTKM), pages 117–132. Springer Berlin Heidelberg, Berlin, Heidelberg, 2004.
[24] Uschold M. and King M. Towards a methodology for building ontologies. In Workshop on Basic Ontological Issues in Knowledge Sharing, held in conjunction with IJCAI-95, 1995.
[25] Wisniewski D. Automatic translation of competency questions into SPARQL-OWL queries. In Companion Proceedings of the The Web Conference 2018, WWW ’18, pages 855–859, Republic and Canton of Geneva, CHE, 2018. International World Wide Web Conferences Steering Committee.
[26] Wisniewski D. et al. Analysis of ontology competency questions and their formalizations in SPARQL-OWL. JWS, 59, 2019.
[27] Wisniewski D. and Ławrynowicz A. A tagger for glossary of terms extraction from ontology competency questions. In Proc. of ESWC, Satellite Events, pages 181–185. Springer, 2019.
[28] Wisniewski D., Potoniec J., and Lawrynowicz A. BigCQ: A large-scale synthetic dataset of competency question patterns formalized into SPARQL-OWL query templates. CoRR, abs/2105.09574, 2021.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-19797be1-70ac-476b-b29c-bf959d6a7821