Results found: 11

Search results
Searched for:
in keywords: COMPUTATIONAL LINGUISTICS
EN
The phrase 'logico-computational philosophy' was coined by Witold Marciszewski to address problems arising from what may be characterised as a modern renaissance in science, under way since the emergence of cybernetics and computer science. This renaissance was made possible by the use of universal programming languages in different areas of science. The author claims that, thanks to the Polish logico-mathematical heritage (stemming from the Warsaw-Lwow School of Logic and Philosophy at the beginning of the 20th century), there is a chance that Polish linguistics might take part in this new age of science, on the condition that it adopts the computational paradigm as developed within artificial intelligence. Investigations of the Polish language at CELTA Sorbonne (Centre de Linguistique Théorique et Appliquée) are conducted within the framework of the CASK (Computer Aided Acquisition of Semantic Knowledge) method, supported by the SEMANA software, which includes a dynamic database builder and a bundle of algorithms for knowledge discovery in databases (KDD) specifically selected for semantic analysis. This paper sketches the results of research on Polish adjective declension, subject-verb agreement (the choice of the past tense verb forms -li or -ly) and impersonal sentences, thereby giving evidence of the need to distinguish, in Polish grammar, the category of gender from the three-valued category of animacy (non-animate, human animate and non-human animate). The approach revealed that the grammatical category of gender in Polish has only 3 values (masculine, feminine and neuter) rather than the 5, 7, 8 or even 9 values (which in fact correspond to declension classes) of a hybrid gender-like category proposed by today's grammarians of Polish. Thus, the CASK method makes it possible to describe Polish grammar using common European linguistic technology.
2
Corpus linguistics in Poland - the origins, the present, the prospects
100%
EN
The article identifies three sources of corpus engineering: (a) the theoretical and descriptive achievements of structural linguistics, (b) the formal apparatus of generative theories, and (c) the development of computational tools. Over the last few decades, the Polish language has been satisfactorily described in terms of both morphology and syntax. On that basis, two corpus search engines have recently been designed to annotate Polish text corpora (Poliqarp) or to disambiguate them morphologically (Holmes). The prospects of corpus engineering in Poland nevertheless do not look promising: unlike in neighbouring countries, not many people work in the area of computational linguistics. The article expresses the author's hope that young Polish linguists may find the job attractive, and not only intellectually.
3
CORPORA TOOLS IN BILINGUAL LEXICOGRAPHY
80%
EN
The use of monolingual electronic corpora in bilingual Polish and Russian lexicography is discussed. Starting from her own experience in dictionary building, the author provides evidence that electronic corpora are most useful for describing new words (even those not yet included in monolingual dictionaries). Electronic corpora have also proved to be an effective means of selecting equivalents in existing bilingual dictionaries. A typology of available electronic corpora is proposed.
4
A GENRE ANALYSIS OF THE WIKIPEDIA ONLINE ENCYCLOPEDIA
80%
EN
The paper presents the results of a genre analysis of an online encyclopedia - Wikipedia, whose form is completely shaped by the web. The aim of the analysis was to determine characteristic features of the encyclopedia and point out the ways in which it differs from other available online encyclopedias. Analysis covered structural and stylistic features of the articles. Structural analysis defined the characteristics of the visual side of the encyclopedia, graphics, constructional elements of the site as well as systemic mechanisms. Stylistic analysis verified the language of the encyclopedia, language of articles and means of communication with the users. Linguistic analysis verified the degrees of formality, lexis as well as the most frequent syntactic structures. The results show how users acting through mechanisms provided by the system can shape the features of the content in particular ways. The paper also includes a comparative analysis of the Polish version of Wikipedia with its English and German equivalents.
5
THE INTERNET AND ITS RESOURCES IN POLISH LINGUISTIC RESEARCH: OVERVIEW
80%
EN
The Internet's mediality, geography, and communicative specificity make it a socially interesting domain of study. Linguistics as a humanistic discipline may contribute significantly to the research on the new media and their impact on contemporary society and it has the necessary tools for the task. The article discusses the following areas of linguistic research on the Internet: secondary orality or oralisation of writing, understood as 'transformation of verbal expressions by electronic means' (a secondarily oral written text may be sent and received simultaneously in real time); tendency to abbreviation, evident in acronyms; emoticons as a fashionable means of conveying information about the attitude of senders of messages; netiquette, i.e. the emergence of new conventions of politeness observed by chat users; the role of the Internet in language counselling and guidance; creating a new type of dictionary (e-dictionary) as a task for lexicographers and metalexicographers; hypertext as a representation of a net text 'sensu stricto'; genres
6
TERMINOLOGICAL GRID FOR THE SELECTED SYSTEMS PROCESSING POLISH TEXTS
80%
EN
Several computer systems are discussed. 'PoMor' is a commercial product (a morphological analyser and spelling checker) developed by Robert Wolosz for Morphologic. 'Morfeusz' is a morphological analyser developed by Marcin Wolinski. 'Poliqarp' (POLyinterpretation Indexing Query and Retrieval Processor), developed by Zygmunt Krynicki and Daniel Janus, is used to search a corpus of Polish (http://korpus.pl). Both 'Morfeusz' and 'Poliqarp' are freely available for research purposes. The linguistic tools of Michal Rudolf are also mentioned. The paper advocates the use of 'lex' instead of 'word' or 'segment' for better precision and clarity.
7
Pojmenované entity v počítačové lingvistice a vlastní jména [Named entities in computational linguistics and proper names]
80%
EN
The article deals with the relationship between so-called 'named entities', a concept established by computational linguistics (namely its fields of Information Extraction and Natural Language Processing), and proper names as they are understood by onomasticians. Named entities are understood as those text units that are not included in dictionaries and therefore cause difficulties during automatic processing. Named entities include not only proper names but also numeric expressions, Internet and e-mail addresses and, according to some conceptions, biomedical terms, etc. The longest part of the article is devoted to the classification of named entities as it has been proposed by computational linguists and modified by onomasticians.
8
POLISH-UKRAINIAN PARALLEL CORPUS (POLUKR)
80%
EN
A considerable lack of Polish-Ukrainian electronic language resources, especially large dictionaries and parallel corpora, is currently felt among professional translators and those who learn either language. A morpho-syntactically annotated aligned corpus with a parallel concordancer can in many cases be used like a traditional dictionary. One of its possible advantages is that it is an up-to-date database of numerous examples of language use, which could help when dealing with neologisms and new lexical meanings. The authors' aim is to create such a corpus and make it publicly available through the Internet. At the moment one can consult the prototype of the parallel Polish-Ukrainian corpus online at the website http://corpus.domeczek.pl. It contains ca. 50 parallel bitexts compliant with the XCES standard. Not all sentence segments have been tagged yet, and the work on morphological annotation is ongoing as well. The corpus is currently aligned only at the paragraph level. The authors are convinced that such a parallel corpus with an efficient search engine, a flexible query language and a friendly user interface will be one of the most powerful tools for all Polish and Ukrainian researchers, translators and students.
9
80%
EN
The paper presents a system for automatic content extraction from mammogram reports written in Polish. The system combines general information extraction (IE) techniques with external post-processing aimed at structuring the results. The paper contains a characterisation of this specific type of text as well as a description of the results obtained, together with a short analysis of the advantages and disadvantages of shallow text processing.
10
K některým otázkám závislostní gramatiky [On some issues of dependency grammar]
70%
EN
The popularity of dependency-based syntax has grown in the last thirty years, in spite of the fact that phrase-structure-based descriptions have prevailed in so-called mainstream linguistics. Two factors are important here: (i) a growing interest in semantics, which results in the penetration of dependency-based notions into the original phrase-structure-based grammars, (ii) dependency offers a more perspicuous view of the sentence structure and as such has played an important role in computational linguistics. We first summarize the basic tenets of both theories mentioned above (Section 2) and point out the reasons for the growing interest in dependency-based grammars (Section 3). In Section 4, attention is focused on one of the issues often quoted as problematic in dependency-based analysis, namely cases in which the surface order of words is not in accordance with the condition of projectivity. The analysis, based on material from the Prague Dependency Treebank, supports the claim made by Functional Generative Description that this issue can be adequately solved by postulating a dependency-based underlying (tectogrammatical) syntactic structure that meets the condition of projectivity and by describing the relationship between this structure and the surface word order on the basis of certain contextual conditions.
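The condition of projectivity mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the article or from the Prague Dependency Treebank tooling: the function name `is_projective` and the head-list encoding (one head index per token, 0 for the root) are assumptions made for illustration, and the input is assumed to be a well-formed tree. A tree counts as projective here if every token lying between a head and its dependent is itself dominated by that head.

```python
def is_projective(heads):
    """Check surface projectivity of a dependency tree.

    heads[i] is the 1-based position of the head of token i+1;
    0 marks the root (hypothetical encoding, assumed for this sketch).
    """
    def dominated_by(node, ancestor):
        # Follow the chain of heads upwards from node until the root.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return False

    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        if head == 0:
            continue  # the arc from the artificial root is not checked
        lo, hi = sorted((head, dep))
        # Every token strictly between head and dependent must hang under head.
        for between in range(lo + 1, hi):
            if not dominated_by(between, head):
                return False
    return True


# Token 2 depends on token 4, but token 3 (the root) stands between them.
print(is_projective([2, 0, 2]))
print(is_projective([3, 4, 0, 3]))
```

The sketch only tests the surface word order; it does not model the underlying (tectogrammatical) structure that the abstract proposes as the solution.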
11
COMPUTATIONAL TOOLS FOR MANAGING LARGE TEXT CORPORA: THE SEARCH ENGINE 'HOLMES'
70%
EN
Managing large text corpora requires sophisticated computational tools. For highly inflecting languages like Polish, homonymy is a challenge that computer specialists have to face; in Polish texts, 42 words in every 100 are grammatically ambiguous. The search engine 'Holmes', designed by Michal Rudolf, works as a disambiguator rather than a tagger. It operates on texts that have already been morphologically tagged by dedicated programs. After the user types in a query, 'Holmes' examines the set of tags for each word, rejecting as many improper interpretations as possible. 'Holmes' uses linguistic rather than statistical methods of disambiguation: it is based on a number of rules formalising various contextual restrictions on words. Query results are available online.
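The idea of disambiguation by elimination described in this abstract can be sketched roughly as follows. This is not the actual 'Holmes' implementation: the tag names, the single rule and the functions `disambiguate` and `no_finite_verb_after_preposition` are all hypothetical and were invented for illustration. Each word carries a set of candidate tags, and contextual rules reject readings until nothing more can be removed.

```python
def no_finite_verb_after_preposition(tokens, i):
    """Hypothetical rule: directly after an unambiguous preposition,
    a verb reading is rejected."""
    if i > 0 and tokens[i - 1][1] == {"prep"}:
        return {"verb"}
    return set()


def disambiguate(tokens, rules):
    """tokens: list of (word, set_of_candidate_tags) pairs.

    Returns a new list in which the readings rejected by the rules
    have been removed; the input list is left unchanged.
    """
    result = [(word, set(tags)) for word, tags in tokens]
    changed = True
    while changed:  # repeat, since one rejection may enable another rule
        changed = False
        for i, (_, tags) in enumerate(result):
            for rule in rules:
                rejected = rule(result, i) & tags
                # Never remove the last remaining interpretation.
                if rejected and rejected != tags:
                    tags -= rejected  # in-place update of the set in result
                    changed = True
    return result


# 'mam' is ambiguous in Polish: 'I have' (verb) or genitive plural of 'mama' (noun).
sentence = [("dla", {"prep"}), ("mam", {"noun", "verb"})]
print(disambiguate(sentence, [no_finite_verb_after_preposition]))
```

Rules here only reject readings and never select one, which mirrors the abstract's description of a disambiguator rather than a tagger; a word keeps all the readings that no rule excludes.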