

Article title

Automated Creation of Parallel Bible Corpora with Cross-Lingual Semantic Concordance

Conference
16th Federated Conference on Computer Science and Information Systems (2-5 September 2021; online)
Publication languages
EN
Abstracts
EN
We present a novel approach to the automated creation of parallel New Testament corpora with cross-lingual semantic concordance based on Strong's numbers. Digital Biblical resources available to scholars are scarce. To address this, we present two approaches, a dictionary-based approach and a CRF model, together with a detailed evaluation on annotated and non-annotated translations. We discuss a proof of concept based on English and German New Testament translations. To our knowledge, the results presented in this paper are novel and unique. They show promising performance, although further research is necessary.
Pages
111–114
Physical description
Bibliography: 23 items, figures, tables
Authors
  • University of Pretoria, Faculty of Theology and Religion, Hatfield, Pretoria, South Africa
  • Faculty for Mathematics and Informatics, Fernuniversität Hagen, Germany
Bibliography
  • 1. S. Landes, C. Leacock, and R. I. Tengi, “Building semantic concordances,” WordNet: An electronic lexical database, vol. 199, no. 216, pp. 199–216, 1998.
  • 2. B. Metzger, The Bible in Translation: Ancient and English Versions, ser. Biblical studies. Baker Publishing Group, 2001.
  • 3. C. Clivaz, “Die Bibel im digitalen Zeitalter: Multimodale Schriften in Gemeinschaften,” Zeitschrift für Neues Testament, vol. 20, no. 39/40, pp. 35–57, 2017.
  • 4. C. Anderson, “Digital humanities and the future of theology,” 2018.
  • 5. C. Clivaz, A. Gregory, and D. Hamidović, Digital Humanities in Biblical, Early Jewish and Early Christian Studies. Brill, 2013.
  • 6. M. Cysouw, C. Biemann, and M. Ongyerth, “Using Strong’s numbers in the Bible to test an automatic alignment of parallel texts,” STUF - Language Typology and Universals, vol. 60, no. 2, pp. 158–171, 2007.
  • 7. B. Wälchli, “Similarity semantics and building probabilistic semantic maps from parallel texts,” Linguistic Discovery, vol. 8, no. 1, pp. 331–371, 2010.
  • 8. M. Simard, “Building and using parallel text for translation,” The Routledge Handbook of Translation and Technology, pp. 78–90, 2020.
  • 9. A. Yli-Jyrä, J. Purhonen, M. Liljeqvist, A. Antturi, P. Nieminen, K. M. Räntilä, and V. Luoto, “HELFI: a Hebrew-Greek-Finnish parallel Bible corpus with cross-lingual morpheme alignment,” arXiv preprint https://arxiv.org/abs/2003.07456, 2020.
  • 10. N. Rees and J. Riding, “Automatic concordance creation for texts in any language,” Proceedings of Translation and the Computer, vol. 31, 2009.
  • 11. M. Diab and S. Finch, “A statistical word-level translation model for comparable corpora,” University of Maryland, College Park, Institute for Advanced Computer Studies, Tech. Rep., 2000.
  • 12. P. Resnik, M. B. Olsen, and M. Diab, “The Bible as a parallel corpus: Annotating the ‘book of 2000 tongues’,” Computers and the Humanities, vol. 33, no. 1, pp. 129–153, 1999.
  • 13. C. Christodoulopoulos and M. Steedman, “A massively parallel corpus: the Bible in 100 languages,” Language Resources and Evaluation, vol. 49, no. 2, pp. 375–395, 2015.
  • 14. J. D. Riding, “Statistical glossing, language independent analysis in Bible translation,” Translating and the Computer, vol. 30, 2008.
  • 15. J. Renkema and C. van Wijk, “Converting the words of God: An experimental evaluation of stylistic choices in the new Dutch Bible translation,” Linguistica Antverpiensia, New Series–Themes in Translation Studies, no. 1, 2002.
  • 16. L. De Vries, “Bible translation and primary orality,” The Bible Translator, vol. 51, no. 1, pp. 101–114, 2000.
  • 17. G. G. Scorgie, M. L. Strauss, S. M. Voth et al., The challenge of Bible translation: Communicating God’s Word to the world. Zondervan Academic, 2009.
  • 18. A. McMillan-Major, “Automating gloss generation in interlinear glossed text,” Proceedings of the Society for Computation in Linguistics, vol. 3, no. 1, pp. 338–349, 2020.
  • 19. X. Zhao, S. Ozaki, A. Anastasopoulos, G. Neubig, and L. Levin, “Automatic interlinear glossing for under-resourced languages leveraging translations,” in Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 5397–5408.
  • 20. A. B. Muhammad, Annotation of conceptual co-reference and text mining the Qur’an. University of Leeds, 2012.
  • 21. E. Biagetti, C. Zanchi, and W. M. Short, “Toward the creation of wordnets for ancient Indo-European languages,” in Proceedings of the 11th Global Wordnet Conference, 2021, pp. 258–266.
  • 22. V. Perrone, M. Palma, S. Hengchen, A. Vatri, J. Q. Smith, and B. McGillivray, “GASC: Genre-aware semantic change for Ancient Greek,” in Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change. Florence, Italy: Association for Computational Linguistics, Aug. 2019, pp. 56–66. [Online]. Available: https://www.aclweb.org/anthology/W19-4707
  • 23. J. Lafferty, A. McCallum, and F. C. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” Proceedings of the Eighteenth International Conference on Machine Learning, 2001.
Notes
1. Track 1: Artificial Intelligence in Applications
2. Session: 15th International Symposium Advances in Artificial Intelligence and Applications
3. Short Paper
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b10b3263-9df5-4de5-b07b-77d57f31e518