
Results found: 102

Search results
Searched for:
in keywords: corpus linguistics
EN
This paper presents a corpus-based linguistic study in lexical semantics. Our topic is the general scientific lexicon, the cross-disciplinary vocabulary characteristic of the academic genre. We show how a large corpus makes it possible to develop an inventory of this vocabulary, and we present the first semantic treatments performed with the help of the corpus, including a first experiment in natural language processing.
Mäetagused | 2017 | vol. 69 | 217-242
EN
The linking of Estonian language proficiency to the reference levels of the CEFR and to different educational stages does not rely on research but is based on deep-rooted perceptions. More reliable data can be obtained by comparing a native speaker’s language usage patterns with the morphological and lexical preferences characteristic of speakers at each language level. For this purpose, tools for automatic text processing (which are mainly created on the basis of English) and different techniques for data analysis are needed. The article introduces an original computer program called Cluster Catcher, developed at Tallinn University for finding usage patterns in Estonian written texts.
3
Content available remote Zdvojená slovesa v současné češtině
80%
EN
This paper presents an analysis of so-called double-paradigm verbs (muset – musit, bydlet – bydlit, myslet – myslit, šílet – šílit, kvílet – kvílit and hanět – hanit) in contemporary Czech, based on data from two Czech corpora: SYN2010 and SYN2009PUB. The literature commonly assumes that these verbs have two distinct paradigms: a “prosit-paradigm” and a “sázet-paradigm” (or in some cases a “trpět-paradigm”). The analysis shows that this assumption is false for contemporary Czech. The verbs behave differently from one another: muset, kvílet and šílet are used according to the “sázet-paradigm”, myslet and bydlet according to the “trpět-paradigm”, and the verb hanět is more specific still (present forms follow the “prosit-paradigm”, while infinitive forms vary between the stem suffixes -e- and -i-). It is thus demonstrated that these verbs do not form a distinct category in contemporary Czech.
EN
Based on a mega corpus, the Corpus of Contemporary American English (COCA), this study aims to determine the most frequent adjectives used in academic texts and to investigate whether these adjectives differ in frequency and function across social sciences, technology, and medical sciences. It also identifies evaluative adjectives in a list of the hundred most frequently used adjectives. A total of 839 adjectives, which comprise the list of frequently used adjectives in COCA, were searched using a search engine. Of these, 334 were found to appear more frequently in the academic sub-corpus than in the other sub-corpora (spoken, fiction, magazine, and newspaper). Only one adjective was used more frequently in technology and medical sciences than in social sciences. Some adjectives were highly dominant in a specific discipline of academic texts. The frequency of evaluative adjectives among the 100 most frequently used adjectives was also listed: almost 40% of these adjectives are evaluative. The results of the study are discussed in terms of frequency effects in language learning and writing in a foreign language, as providing learners with corpus data may improve language knowledge and the correct use of adjectives.
5
Content available remote Korpusomat : a Tool for Creating Searchable Morphosyntactically Tagged Corpora
80%
EN
The paper presents Korpusomat, a web application for building annotated corpora for corpus linguistic studies. Korpusomat combines existing tools, such as a morphological analyser, a tagger and a corpus search engine, and provides an easy-to-use environment for building corpora technically compatible with the National Corpus of Polish from almost any text, including texts in binary formats. In the paper we present the current state of the project, its features and functionalities, as well as some future plans and development tasks. A usage example is also presented.
EN
Advances in spoken corpora analysis have brought about new insights into language pedagogy and have led to an awareness of the characteristics of spoken language. Current findings have shown that the grammar of spoken language differs from that of written language. However, most listening and speaking materials are based on written grammar and lack core spoken language features. The aim of the present study was to explore whether awareness of spoken grammar features could affect learners’ comprehension of real-life conversations. To this end, 45 university students in two intact classes participated in a listening course employing corpus-based materials. The experimental group was taught the spoken grammar features overtly through awareness-raising tasks, whereas the control group, though exposed to the same materials, was not given such tasks. The results of independent-samples t-tests revealed that the learners in the experimental group comprehended everyday conversations much better than those in the control group. Additionally, the learners’ highly positive views of spoken grammar, elicited by means of a retrospective questionnaire, were generally comparable to those reported in the literature.
EN
This paper deals with debates about political correctness as they can be observed in the comment sections of the website “Zeit Online”. Articles on the topic of political correctness attract numerous critical comments, which are in turn answered with counter speech. On the basis of a corpus of 4791 comments on nine articles, in which the thread structures are also marked up, typical linguistic features of counter speech, summarized as characteristics of counterness, are determined with quantitative corpus linguistic methods. Selected findings are then enriched through fine-grained qualitative analyses. It will be shown that epistemic positioning, i.e., the indexing of one’s own and other people’s knowledge, and the associated acts of demarcation play an important role in the articulation of counter speech.
EN
This paper presents a corpus-based study of the translated language of tourism, focusing in particular on the stylistics of tourist landscapes. Through a comparative analysis of a specifically designed corpus of travel articles originally written in English (the TourEC – Tourism English Corpus) and a corpus of tourist texts translated from a variety of languages into English (the T-TourEC – Translational Tourism English Corpus), the study investigates a selection of collocates, concordances and keywords related to the description and representation of tourist settings in both corpora. The aim is to identify differences, as well as aspects or practices that could be improved, which characterize the translated language of tourism with respect to tourist texts originally written in English. The results show that the discursive patterns of translated texts differ from the stylistic strategies typically employed in native English for the linguistic representation of landscape and settings, owing to translation universals, and that these differences may affect the related communicative functions, properties and persuasive effects of tourist promotional discourse.
EN
Strelica je pogodila grješnika – a certain issue of the Croatian orthography in light of the Croatian National Corpus data
Based on data from the Croatian National Corpus, the author presents inconsistencies in the spelling rules for words like str(j)elica and gr(j)ešnik, as described in Croatian orthographic dictionaries. The paper also addresses the discrepancies between the orthographic norm and its reflection in real Croatian texts, as well as the ideological reasons for these differences.
11
Content available Uniwersalia przekładowe
71%
EN
Synaesthesia thus turns out to be a strategy for linguistic pleasure, representing a somatic impulse to engage with texts. Barthes, Nabokov and Robinson, daring to reveal their scandalously pleasurable literary habits, point to synaesthetic engagement with language as the source of translators’ intuitions, readers’ sensitivities, as well as, inseparably, textual pleasures, understood as an integral component of the experiential dimension of reading and translation.
PL
The article presents the issue of so-called translation universals, which emerged with the development of corpus linguistics. Mona Baker's hypothesis that such universals exist is controversial among researchers of translation phenomena, a debate the article also briefly summarizes.
12
Content available Massnahmen zur Dokumentation des Niedersorbischen
70%
EN
Lower Sorbian is one of the most endangered European languages. The article argues that a comprehensive documentation of this language is both necessary and urgent, and gives an overview of the relevant projects undertaken at the Lower Sorbian department of the Sorbian Institute. In addition to the building of text corpora representing the literary language and dialectal forms of Lower Sorbian, lexicographic projects are also described.
EN
The aim of this paper is to analyse the uses of the verb jít ‘go’ by non-native speakers of Czech. Using quantitative and qualitative analysis of data drawn from the CzeSL corpus, we point out the specific ways in which this important verb is used in the written communication of non-native speakers of Czech. This research is not concerned with error analysis, but instead focuses on the overall picture of the semantic uses of this verb by non-native speakers of Czech. The resulting analysis should contribute to a methodical description of the teaching of the verb jít within the grammatical and lexical system of Czech for foreigners. The paper is based on a learner corpus of Czech as a foreign language, the CzeSL-SGT (Czech as a Second Language). The CzeSL corpus contains 12,388 texts (960,000 words) and offers both linguistic and error annotation; the error annotation is based on two target hypotheses. From these texts, 5,785 occurrences of the verb jít were extracted and analysed for their valency and semantic patterns, prepositional co-occurrence, collocation and lexical use. The results of the analysis are discussed in the context of the individual levels of the Common European Framework of Reference.
EN
This study investigates recurrent language resources employed in corporate blogs to connect with readers and (to a lesser extent) express authorial positions. It is based on the premise that constructing identity and enhancing image underpin most, if not all, corporate discourse, and blogs are no exception. Based on a corpus of 500 different posts (totalling 318,296 words) from the Business Process Outsourcing and Information Technology sectors, we use standard Corpus Linguistics (Partington et al. 2013) techniques (keywords, cluster analysis, concordancing) to identify linguistic features associated with the expression of engagement: reader pronouns and their co-occurrence with selected modal verbs, questions, adverbs marking shared knowledge, and directives. These are then interpreted in terms of a model of textual interactions proposed in Hyland (2005). We argue that the communication found in this relatively new and underresearched genre is essentially one-way, establishing a pseudo-dialogue with little or no interactivity between blog writers and blog readers.
EN
This article examines the discursive construction of Scottish and British-English national identities in the printed press within the context of the planned Scottish independence referendum. Using Critical Discourse Analysis and informed by sociological and anthropological research, the study uses a Corpus Linguistics approach to analyse newspaper texts from the Scottish and British printed media to define the strategies used in the construction and disarticulation of these identities and the ideologies behind them. The results of the analysis will show that the Scottish broadsheets use a staunchly Scottish rhetoric with frequent examples of nation flagging, showing the palpable struggle for power and a certain sense of inferiority. Inadvertently or otherwise, these newspapers engender a sense of separateness by employing techniques of positive in-group identification. The Scottish editions of UK broadsheets, on the contrary, hold a more Anglocentric perspective and their treatment of the referendum is more political than ideological, frequently attributing negative evaluations to the independence issue and engaging in the practice of "tartanisation". To conclude, the UK broadsheets tend to provide a more balanced and objective point of view, thus being at the political centre of the social debate enacted by the referendum and the subsequent possible independence of Scotland.
Glottodidactica | 2021 | vol. 48 | no. 1 | 7-26
EN
A rule stating that we tend to avoid using go and come after the future marker going to appears again and again in many coursebooks and grammars used in English Language Teaching, and has done so for decades. This article attempts to show, using empirical evidence from corpora, why the rule is inaccurate, and different ways that this might be established. As the rule under consideration is typically framed as a tendency (like many other pedagogical grammar rules), an additional aim of the work is to outline the kinds of corpus analyses researchers and materials designers can potentially use in order to investigate the question of (claimed) linguistic tendencies. The article concludes by discussing why a rule that is apparently inaccurate nevertheless appears again and again in print, arguing that the existence of a well-established and widely accepted ‘canon’ of ELT grammar means that such inaccuracies in descriptions of grammar can be easily perpetuated.
EN
In the present paper we examine the extent to which age, gender, and education affect the use of the Spisz regional dialect. It is widely assumed that only elderly speakers use the pure dialect with no influence from the standard variety of Polish, whereas other generations mix dialectal and standard grammar. The data are drawn from the Spisz Corpus. Eight features were chosen, six pertaining to inflection and two to syntax. Though the number of non-dialectal features increases with each generation, it remains quite limited. This is not true, however, of the syntactic idiosyncrasies of the regional dialect, which are almost entirely abandoned by younger generations. Also, women are more prone to use dialectal forms than men. Finally, the higher the education of the speaker, the greater the number of non-dialectal forms, with the notable exception of academic degree holders, who master code-switching better. In general, however, the Spisz regional dialect is well preserved by its speakers.
18
Content available remote Contrastive word-formation today: Retrospect and prospect
70%
EN
This paper proposes an exploratory bird’s-eye view of contrastive word-formation research, an area which, to date, remains largely under-researched in the three fields in which it partakes, namely morphology, contrastive linguistics and lexicology. Studies in contrastive word-formation, as well as their meta-analysis in terms of scope, objectives and data, are presented in a critical survey of the literature, together with an extensive bibliography (1960–2010). A new contrastive methodology for future research is looked into and the major practical applications of contrastive word-formation in bilingual lexicography and translator training, among others, are overviewed. Contrastive word-formation, it is argued, should be set within a more rigorous theoretical and methodological framework, which would be characterised by a dynamic conception of the tertium comparationis and the use of empirical data drawn from multilingual corpora.
EN
The article presents a research project on linguistically profiled (quantitative and qualitative) analyses of the (sub)space of pandemic-related discourses, as well as the corpus of Polish texts concerning the SARS-CoV-2 pandemic that broke out in 2020, prepared for analytical purposes. The authors describe the following: 1. the reasons for the interest in this issue, the subject and purpose of the research, and its theoretical and methodological background: discourse linguistics (mainly from the perspective of Jürgen Spitzmüller and Ingo Warnke); 2. the source material of the project (mainly individual/non-institutional Internet statements that form the basis for shaping specific systems of meaning, i.e. comments posted under posts on Facebook or Twitter and the dialogical relations among them); 3. problems related to the development of the pandemic discourses corpus (criteria for the selection of texts, methods of balancing the corpus, categories of metadata to be used for describing the material); 4. conclusions drawn from an exemplary analytical procedure in which a section of the corpus was used; 5. the potential of the above-mentioned research and possible applications of its results.
20
70%
EN
Somatic idioms – those including a part of the body – have been traditionally studied from a synchronic perspective, yielding different explanations for their semantic value. The main objective of this paper is to highlight the diachronic origin of idiomatic meaning, by illustrating the process of phraseologization from a historical, usage-based perspective. As the first step, we will reflect on the general nature of phraseological meaning, and then on the semantic particularities of somatic idioms. Secondly, we will carry out a corpus-based diachronic analysis of the Catalan idiom tapar-se el nas (to hold one’s nose) within the framework of the Invited Inference Theory of Semantic Change. The different stages of the process will be exemplified and discussed. As a result, a new notion of somatic idioms as frozen human actions will be presented.