Results found: 17

Search results
Searched in keywords: korpusová lingvistika
1
Content available Protektorát v korpusu
100%
EN
Using corpus analysis, this paper describes current tendencies in the usage of four specific phenomena (in orthography and semantics) that appear rather inconsistently in contemporary discourse. Specifically, the research focuses on differences in the usage of: p/Protektorát [Protectorate], spáchat/provést atentát [to assassinate], Sudety/pohraničí [Sudetenland] and Benešovy dekrety / dekrety prezidenta republiky [presidential decrees].
2
Content available remote Distribuce předpon v českém sylabotónickém trocheji
84%
EN
The article deals with the use of prefixes in the Czech accentual syllabic trochee. We test a hypothesis raised by Miroslav Červenka, Květa Sgallová, and Petr Kaiser which states that some authors in the 19th century used prefixes to moderate rhythmical irregularities. In our analysis – based on automatic prefix recognition in a large body of poetic texts from the Corpus of Czech Verse – we observe a clear tendency in the work of some authors to employ prefixes in such contexts with a frequency significantly higher than would be expected merely by chance. Furthermore, we observe this technique to be very common in the first half of the 19th century, but to gradually disappear in later works.
3
Content available remote Jen popis s čísly? Perspektivy korpusové lingvistiky:
84%
EN
The aim of the article is both to point to the descriptive character of a majority of corpus linguistic analyses and to argue that this character (which is manifested by the classification, sorting or labelling of language data) represents a limit in corpus linguistic research. Further, an experimental approach (in the sense of empirical testing of a hypothesis) is proposed as a possible way of overcoming this limit. Finally, some methodological aspects of the current state of corpus linguistics, namely the notion of representativeness and the interpretation of quantification, are critically discussed.
EN
The aim of this paper is to provide a corpus-based analysis of one type of Czech proper noun (the Zubří type). We argue that adequate annotation (lemmatisation and morphological tagging) of proper nouns of the Zubří type depends on several circumstances: 1) the coverage of the dictionary of the automatic analyser; 2) an accurate description of the variability of inflectional forms; 3) the non-trivial disambiguation of numerous homonymous word forms. We believe that while meeting the first two conditions is possible, adequate disambiguation goes beyond the possibilities of automatic morphological analysis.
5
Content available remote Trajectories of change in paradigmatic cells in Czech
84%
EN
We examine a well-known phenomenon in the development of the Czech nominal declension system: the gradual supplanting of the original o-stem ending in the locative singular with the u-stem ending. We observe that, contrary to expectations from the literature based primarily on studies of English, this shift has been in progress for a millennium and, in the high-frequency nouns for which we have enough data to observe, the opposing trend is also frequently in evidence: the o-stem ending is introduced to lexemes where it was not found earlier. In the absence of a single, overriding motivation that could have derailed this shift from following the classic ‘S-curve’ pattern, we propose re-examining the retextualization model as a more fitting one for the complex interaction of factors and forms found in languages with complex inflectional morphology.
EN
The first aim of the article is to address major problems of current historical corpus linguistics such as representativeness in genre, place and time, transcription of historical texts, etc. The second goal is to introduce the reader to traditional and innovative historical corpora of Spanish, focusing on their characteristics, advantages and limitations.
7
84%
EN
The article reviews corpus-based research on spoken language, emphasizing issues in the description and conceptualization of the grammar of spoken language in relation to the grammar of written language. The review first briefly looks at the development of spoken corpora, from simply transcribed corpora without sound alignment to today’s sophisticated multi-modal corpora. The main part of the article deals with issues concerning the metalanguage for the description of spoken language, the choice of its basic descriptive unit, the status of basic linguistic categories such as parts of speech, and typical lexical and grammatical devices. The existing extensive research on spoken English is reviewed and, in line with it, illustrative examples based on Czech spoken corpora are provided. These are further contrasted with examples from written data to highlight the inherent differences between spoken and written language and the need to adjust the metalanguage of the description.
EN
This study deals with the confusion of the diphthongs /ow/ and /oj/ in the evolution of European Portuguese. These two diphthongs have different etymologies, but in Old Portuguese they began to be confused. We analyzed selected words containing these diphthongs in the diachronic corpus www.corpusdoportugues.org. The results of this analysis showed that the final form of the word depends on its phonological structure: in words with the final back vowels /o/ and /u/, the diphthong /ow/ predominates, and in words ending in a central or central back vowel, /ɐ/ or /ɨ/, the clearly preferred diphthong is /oj/.
9
Content available remote Definite associative anaphora in informal spoken Czech: a corpus-based study
67%
EN
The present study, couched within the framework of Löbner’s Concept Types and Determination theory (CTD) and relying both on corpus data and the questionnaire method, attempts to provide some evidence for the claim that there is a growing tendency in contemporary informal spoken Czech to use the emerging definite article ten with definite associative anaphora (DAA). Just like its Western Slavic cognates, the distance-neutral demonstrative ten appears to manifest characteristics typical of definite articles across languages (cf. Ortmann, 2014; Czardybon, 2017; Dvořák, 2020). One of these characteristics is the spreading of ten to contexts situated between pragmatic and semantic definiteness on Löbner’s definiteness scale (Löbner, 1985; 2011). DAA is part of these contexts. However, as the present study shows, marked differences exist between the three sub-types of DAA as defined by Löbner with regard to their willingness to accept ten. These are, respectively, the “part-whole,” the “relational” and the “situational” sub-type. Other factors must also be taken into account, such as the speaker’s emotional involvement and competing interpretations of the occurrence of ten.
EN
The article is devoted to applications of the InterCorp parallel corpus in Czech-Polish translation lexicography. Selected phrasemes exemplify possibilities of using a parallel corpus to identify translation counterparts. The examples provided in the article indicate various kinds of limitations which might cause problems in establishing reciprocal equivalence of Polish and Czech phrasemes. The article also describes some advantages and disadvantages of using subtitles, which are well represented in InterCorp, in the process of establishing equivalence. The author used selected phrasemes from Wielki czesko-polski słownik frazeologiczny (Great Czech-Polish Phraseological Dictionary), compiled by traditional means without reference to a corpus, as a starting point for the analysis, which demonstrated the value of corpus-based material and tools for creating dictionary entries. The article is a contribution to theoretical research into contemporary directions in the development of translation lexicography.
EN
This study addresses the competing endings -a and -u in the genitive singular of inanimate masculine nouns in the context of foreign language teaching. The emphasis is on a systematic description of the genitive endings -a and -u using corpus methods. In the first step, we analysed the part of the learner corpus containing texts by students who speak Slavic languages. The results show that students quite often confuse the two endings: the correct ending -u is frequently replaced by -a. Next, we examined the competition of genitive endings within the corpus of contemporary Czech using the Morfio tool, which identifies relevant word pairs for further analysis. The identified pairs were divided into three categories: a) nouns with the same etymological origin and meaning, b) nouns with the same etymological origin but different meaning, and c) nouns with inconclusive competition of genitive endings. A systematised list of pairs, along with their proportional and absolute frequencies, is a source of information on the use of appropriate endings with respect to frequency. This information is crucial for students of Czech as a foreign language, who need it to choose the variant whose ending is closest to current usage. Based on the analysed material, we propose three model corpus exercises: two direct exercises for determining the more frequent variant, and one indirect exercise that takes into account semantic differences in the usage of particular endings.
12
Content available Partikule v Pražském mluveném korpusu
67%
XX
The article deals with particles, usually considered to be a residual part of speech, and strives to reach some general conclusions on the occurrence and frequency of particles as well as their function in common spoken language. The basic source is the reference Prague Spoken Corpus (PSC), which is a part of the Czech National Corpus. The fact that particles are, after verbs and pronouns, the third most frequent word class in spoken Czech appears for the first time in the Frequency Dictionary of Spoken Czech, which is based on the PSC. This finding sets new tasks for linguists, especially a more detailed description of this frequently used part of speech, which has not so far been thoroughly analyzed on the basis of authentic data. The PSC provides a unique opportunity to describe the functions and meaning of particles in the direct authentic context and usage in which they naturally appear. A large contextual scope is the decisive criterion for their identification. The description of particles requires a practical approach, and by analyzing their real occurrence and co-occurrence one can confirm, refute or modify theoretical premises. What we have not found in the corpus is also positive knowledge: proof that in a corpus of almost three quarters of a million tokens a particular word did not appear. The article presents all types of particles which appear in the corpus and provides quantitative analyses both of original particles and of those homonymous with other parts of speech. It also deals with the existing treatment of particles in various linguistic manuals.
EN
The paper provides a thorough review of the corpus-linguistic approach to critical discourse analysis. It briefly presents the core of critical discourse analysis (CDA) and examines the possibilities of applying corpus tools to it. In the next step, critical commentaries on CDA are summarized and, at the same time, possible corpus-linguistic solutions are offered. The final part offers an illustrative application of corpus-assisted CDA focusing on language ideologies in Czech parliamentary discourse.
EN
This contribution discusses three ways of operationalising the notion of frequency as it relates to how often an item occurs in a corpus: the proportional frequency of forms (i.e. the percentage of instances in which one or another variant is found) and two ways of looking at absolute frequency. Working with data from unmotivated morphological variation in Czech case forms, we show that different types of data contribute to some extent to the way variation is perceived and implemented by native speakers, but suggest that proportional frequency seems most salient for speakers in forming their impressions and shaping their behaviour.
EN
The paper deals with the possibilities of the co-occurrence database CCDB for translation from Czech into German. The analysis is carried out on two practical examples (rukopis and opačný) which are difficult for translators as there is no proper German or Czech-German dictionary that would include collocations based on language use. These collocations would help in many cases more than an (often too vague) general description of the meaning or a list of lexicographic synonyms. That is why this paper offers a combination of two approaches: a dictionary search combined with a CCDB analysis which can reveal tendencies in language use.
16
67%
EN
The present survey, which focuses on selected methods of quantitative corpus analysis applied to literary texts, asks to what extent it is useful to employ quantification and statistics in literary studies. The study discusses two basic ways in which quantitative corpus methods might be utilized. The first of these is macroanalysis, a method employing research into a large base of material. Its results provide empirical data about macrostructural behaviour, predominantly the developmental tendencies and characteristics of a given literary category. The second is microanalysis, which focuses on smaller textual units, for instance, a group of texts by the same author. Its aim is to analyze empirically structural patterns within these texts. The research questions posed in studies using both methods include issues related to theme, authorship, narrative discourse, or gender. The study stresses the importance of employing these methods critically. Quantification must always be accompanied by relevant interpretation, based on appropriate methods of literary criticism.
EN
The study presents a model for a dictionary based on the work of Jan Čep in the context of Czech authorial dictionaries and quantitative research on fiction texts. Designed to make a functional link between perspectives from linguistics and literary criticism, the dictionary, with its conception, purposes and type, is a part of the development of Czech authorial lexicography. It is the first project of its kind among frequency-based authorial dictionaries that is aimed at providing tools for literary criticism to pre-interpret literary texts in a functional, objectivized way. The dictionary, however, is not (and cannot be) a substitute for the actual critical analysis of literary texts.