2016 | 77 | 2 | 83-101
Article title

Nová koncepce synchronních korpusů psané češtiny

Title variants
EN
A new design of synchronic corpora of written Czech
Publication languages
CS
Abstracts
EN
The paper describes SYN2015, the most recent 100-million-word corpus of contemporary written Czech. General notions of corpus representativeness and balance are discussed in this context, with a focus on the new approach to representativeness adopted for SYN2015. Unlike the previous synchronic corpora SYN2000, SYN2005 and SYN2010, which were balanced according to text reception (based on sociological surveys), SYN2015 has a composition based on the “texts-as-products” principle, with arbitrary proportions of the individual categories within a revised text classification scheme. The paper argues in favour of this solution by highlighting three major advantages: (1) this type of composition can be kept constant in the future, ensuring corpus comparability, whereas reception changes constantly; (2) it emphasises the diverse composition of the corpus as a language sample; (3) SYN2015 serves not only as a representative sample, but also as a large pool of texts from which different subsets (subcorpora) based on various linguist-specified criteria can be drawn.
Authors
  • Slovo a slovesnost, redakce, Ústav pro jazyk český AV ČR, v.v.i., Letenská 4, 118 51 Praha 1, Czech Republic
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.aaad68bb-ba00-406f-a0b4-3bcde2f7a39f