Article title

Slovak Morphosyntactic Tagset

Publication languages
EN
Abstracts
EN
Morphological annotation is essential, very useful and very common linguistic information presented in corpora, especially for highly inflectional languages. The morphological tagset used in the Slovak National Corpus has been designed with several goals in mind: the tags are compact and easily human-readable, without sacrificing their informational content. The tags consist of ASCII letters, numbers and several other characters. In general, they have a variable number of symbols, but the order of the symbols is obligatory, and each category or specific feature is assigned a particular character, which can be shared among several parts of speech. The tagset is highly functional and pragmatic, although some allowances had to be made to accommodate the traditional analysis of Slovak morphology and part-of-speech categories.
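The tag structure described in the abstract (one character per feature, a fixed order of symbols, characters shared across parts of speech, and a length that varies by part of speech) can be illustrated with a minimal sketch. The feature inventory, the part-of-speech letters and the function name `decode` below are hypothetical and chosen only for illustration; they are not taken from the article and should not be read as the actual Slovak National Corpus tagset.

```python
# Hypothetical feature inventory (an assumption, not the real tagset):
# each character denotes one value of one grammatical category.
FEATURES = {
    "gender": {"m": "masculine", "f": "feminine", "n": "neuter"},
    "number": {"s": "singular", "p": "plural"},
    "case": {"1": "nominative", "2": "genitive", "4": "accusative"},
    "person": {"a": "first", "b": "second", "c": "third"},
}

# Hypothetical layouts: the first character names the part of speech,
# the remaining characters follow a fixed category order.
# Nouns and adjectives share the same gender/number/case characters.
LAYOUT = {
    "S": ("noun", ["gender", "number", "case"]),
    "A": ("adjective", ["gender", "number", "case"]),
    "V": ("verb", ["number", "person"]),
}


def decode(tag):
    """Decode a positional tag into a dict of category -> value."""
    if not tag or tag[0] not in LAYOUT:
        raise ValueError(f"unknown part of speech in tag {tag!r}")
    pos_name, categories = LAYOUT[tag[0]]
    symbols = tag[1:]
    if len(symbols) != len(categories):
        raise ValueError(f"tag {tag!r} has the wrong length for a {pos_name}")
    result = {"pos": pos_name}
    # The order is obligatory: the i-th symbol must belong to the i-th category.
    for category, symbol in zip(categories, symbols):
        values = FEATURES[category]
        if symbol not in values:
            raise ValueError(f"{symbol!r} is not a valid {category} in {tag!r}")
        result[category] = values[symbol]
    return result


if __name__ == "__main__":
    print(decode("Sfs1"))
    print(decode("Vpc"))
```

In this sketch the noun and adjective layouts reuse the same characters for gender, number and case, while the verb layout is shorter, so tag length depends on the part of speech. A tag whose symbols appear out of order, such as `Ss1f`, is rejected.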
Pages
41-63
Physical description
Bibliography: 19 items, tables.
Authors
author
  • Ľ. Štúr Institute of Linguistics of the Slovak Academy of Sciences, Bratislava, Slovakia
author
  • Ľ. Štúr Institute of Linguistics of the Slovak Academy of Sciences, Bratislava, Slovakia
Bibliography
  • [1] Vladimír Benko, Jana Hašanová, and Eduard Kostolanský (1998), Model morfologickej databázy slovenčiny. Počítačové spracovanie jazyka, Pedagogická fakulta Univerzity Komenského, Bratislava, Slovakia.
  • [2] Roger Comtet (1997), Grammaire du russe contemporain, Presses Universitaires du Mirail.
  • [3] Łukasz Dębowski (2001), Tagowanie i dezambiguacja, in Prace IPI PAN 934, Instytut Podstaw Informatyki PAN, Warsaw, Poland.
  • [4] Ludmila Dimitrova, Tomaž Erjavec, Nancy Ide, Heiki-Jaan Kaalep, Vladimir Petkevič, and Dan Tufiș (1998), Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages, in Proceedings of the COLING-ACL’98, pp. 315-319, Montréal, Québec, Canada.
  • [5] Ladislav Dvonč, Gejza Horák, František Miko, Jozef Mistrík, Ján Oravec, Jozef Ružička, and Milan Urbančok (1966), Morfológia slovenského jazyka, Vydavateľstvo Slovenskej akadémie vied, Bratislava, Slovakia.
  • [6] Sašo Džeroski, Tomaž Erjavec, and Jakub Zavrel (2000), Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagset, in Proceedings of the Second International Conference on Language Resources and Evaluation, pp. 1099-1104, ELRA, Paris, France.
  • [7] Radovan Garabík (2006), Slovak morphology analyzer based on Levenshtein edit operations, in Proceedings of the WIKT’06 conference, pp. 2-5, Institute of Informatics SAS, Bratislava, Slovakia.
  • [8] Radovan Garabík (2011), Slovak MULTEXT-East Morphology tagset, Jazykovedný časopis, (1):19-39.
  • [9] Radovan Garabík, Lucia Gianitsová, Alexander Horák, and Mária Šimková (2004), Tokenizácia, lematizácia a morfologická anotácia Slovenského národného korpusu, URL http://korpus.sk/attachments/publications/2004-garabik-gianitsova-horak-simkova-tokenizacia.pdf, Internal documentation.
  • [10] Radovan Garabík, Daniela Majchráková, and Ludmila Dimitrova (2009), Comparing Bulgarian and Slovak Multext-East morphology tagset, in Organization and Development of Digital Lexical Resources, pp. 38-46, Dovira Publishing House, Kyiv, Ukraine.
  • [11] Jan Hajič (2004), Disambiguation of Rich Inflection (Computational Morphology of Czech), Karolinum, Charles University Press, Prague, Czech Republic.
  • [12] Jan Hajič (2000), Morphological Tagging: Data vs. Dictionaries, in Proceedings of the 6th Applied Natural Language Processing and the 1st NAACL Conference, pp. 94-101.
  • [13] Jan Hajič and Barbora Vidová-Hladká (1997), Morfologické značkování korpusu českých textů stochastickou metodou, 4 (58):288-304.
  • [14] Matej Považaj, editor (2003), Krátky slovník slovenského jazyka. 4., doplnené a upravené vydanie, Veda, Bratislava, Slovakia.
  • [15] Emil Páleš (1994), SAPFO. Parafrázovač slovenčiny. Počítačový nástroj na modelovanie v jazykovede, Veda, Bratislava, Slovakia.
  • [16] Radek Sedláček (2001), A new Czech morphological analyser ajka, in Proceedings of the TSD, Czech Republic, pp. 100-107, Springer Verlag.
  • [17] Serge Sharoff, Mikhail Kopotev, Tomaž Erjavec, Anna Feldman, and Dagmar Divjak (2008), Designing and Evaluating a Russian Tagset, in Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, and Daniel Tapias, editors, Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), European Language Resources Association (ELRA), Marrakech, Morocco, URL http://www.lrec-conf.org/proceedings/lrec2008/.
  • [18] Pavel Šmerk (2010), A New Data Format for Czech Morphological Analysis, in Proceedings of Recent Advances in Slavonic Natural Language Processing, pp. 3-8, Tribun EU, Karlova Studánka, Czech Republic, URL http://www.fi.muni.cz/sojka/download/raslan2010/raslan10.pdf.
  • [19] Jan Votrubec (2006), Morphological Tagging Based on Averaged Perceptron, in WDS’06 Proceedings of Contributed Papers, pp. 191-195, Matfyzpress, Charles University, Praha, Czech Republic.
Notes
Record prepared with funds from MNiSW, agreement No. 461252, under the programme "Social Responsibility of Science" (Społeczna odpowiedzialność nauki), module: Popularisation of science and promotion of sport (2020).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-bbbcfb61-bf87-4ba0-b692-252769b153da