Search results
Searched for — in keywords: parsing
Results found: 24
EN
In this paper we present a parsing framework for extensions of Tree Adjoining Grammar (TAG) called TuLiPA (Tübingen Linguistic Parsing Architecture). In particular, besides TAG, the parser can process Tree-Tuple MCTAG with Shared Nodes (TT-MCTAG), a TAG extension which has been proposed to deal with scrambling in free word order languages such as German. The central strategy of the parser is to transform the incoming TT-MCTAG (or TAG) into an equivalent Range Concatenation Grammar (RCG), which, in turn, is then used for parsing. The RCG parser is an incremental Earley-style chart parser. In addition to the syntactic analysis, TuLiPA also computes an underspecified semantic analysis for grammars that are equipped with semantic representations.
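The RCG parser described above is an incremental Earley-style chart parser. As a rough sketch of what Earley-style chart parsing means, the following minimal recognizer works on a plain CFG; the toy grammar, item encoding, and function names are illustrative assumptions, not TuLiPA's actual RCG machinery.

```python
# A minimal Earley-style chart recognizer for a toy CFG.
# An item is (lhs, rhs, dot, origin); chart[i] holds items whose
# recognized part ends at input position i.

GRAMMAR = {            # nonterminal -> list of right-hand sides
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"], ["n"]],
    "VP": [["v", "NP"]],
}

def earley_recognize(tokens, start="S"):
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in GRAMMAR:                          # predictor
                    for alt in GRAMMAR[sym]:
                        new = (sym, tuple(alt), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
                elif i < len(tokens) and tokens[i] == sym:  # scanner
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:                                           # completer
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])

print(earley_recognize(["det", "n", "v", "n"]))  # True
```

The three classic operations (predictor, scanner, completer) run to saturation at each position, which is what makes the strategy incremental: chart cell i is complete before cell i+1 is processed.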
2. A simple CF formalism and free word order
EN
The first objective of this paper is to present a simple grammatical formalism named tree-generating Binary Grammar. The formalism is weakly equivalent to CFGs, yet it is capable of generating various kinds of syntactic trees, including dependency trees. Its strong equivalence to some other grammatical formalisms is discussed. The second objective is to show how some free word order phenomena in Polish can be captured within the proposed formalism.
EN
We prove a Chomsky-Schützenberger representation theorem for multiple context-free languages weighted over complete commutative strong bimonoids. Using this representation we devise a parsing algorithm for a restricted form of those devices.
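For orientation, the classical unweighted, context-free Chomsky-Schützenberger theorem, which the paper's weighted multiple context-free version generalizes, states that every context-free language L can be written as

```latex
L = h\,(D_k \cap R)
```

where D_k is a Dyck language over k pairs of brackets, R is a regular language, and h is a homomorphism; the weighted statement replaces these ingredients by suitably weighted counterparts.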
4. Tunnel parsing with counted repetitions
EN
This article describes a new and efficient parsing algorithm (called tunnel parsing) that parses from left to right on the basis of a context-free grammar without left recursion and without rules that recognize empty words. The algorithm is mostly applicable to domain-specific languages. Particular attention is paid to the parsing of grammar element repetitions. As a result of the parsing, a statically typed concrete syntax tree that accurately reflects the grammar is built from top to bottom. The parsing is done through iteration rather than recursion. The tunnel parsing algorithm uses the grammars directly, without prior refactoring, and has linear time complexity for deterministic context-free grammars.
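The tunnel parsing algorithm itself is not spelled out in the abstract, so the following is only a generic sketch of one idea it highlights: handling a grammar repetition (item*) by iteration rather than recursion, while building a statically typed syntax-tree node. All names and the toy rule are illustrative assumptions.

```python
# Sketch: parse the repetition rule  list <- NUMBER*  iteratively,
# producing one typed CST node that owns all repeated items.

from dataclasses import dataclass, field

@dataclass
class RepetitionNode:          # a statically typed CST node for "NUMBER*"
    items: list = field(default_factory=list)

def parse_number_list(tokens):
    node, pos = RepetitionNode(), 0
    while pos < len(tokens) and tokens[pos].isdigit():   # iterate, don't recurse
        node.items.append(int(tokens[pos]))
        pos += 1
    return node, pos           # the node plus how many tokens were consumed

node, consumed = parse_number_list(["1", "2", "3", "x"])
print(node.items, consumed)    # [1, 2, 3] 3
```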
EN
DPLL(k) automata have been introduced as a tool for inference support in pattern recognition-based real-time expert systems. The automata can be characterised by the two following features: they can recognise languages of great descriptive power (quasi-context-sensitive languages), and they are efficient (i.e. they are of linear computational complexity). These two features make the automata useful in many practical applications, such as the on-line analysis of complex trend functions describing the behaviour of industrial equipment. In this paper we present a method for constructing transition functions for the automata, together with a formal proof of its correctness.
6. Parsing Local Internal Contextual Languages with Context-Free Choice
EN
We extend the result from [8] by also giving a concrete polynomial parsing algorithm for a class of languages generated by a variant of contextual grammars, namely local internal contextual grammars with context-free choice.
EN
The paper focuses on the adjustment of NLP tools for Polish (e.g., morphological analyzers and parsers) to user-generated content (UGC). The authors discuss two rule-based techniques applied to improve their efficiency: pre-processing (text normalization) and parser adaptation (modified segmentation and parsing rules). A new solution for handling OOVs, based on inflectional translation, is also offered.
8. More About Converting BNF to PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. The parser has many useful properties. Converting a PEG to an executable parser is a rather straightforward task. Unfortunately, PEG is not well understood as a language definition tool. It is thus of practical interest to construct PEGs for languages specified in some familiar way, such as Backus-Naur Form (BNF). The problem was attacked by Medeiros in an elegant way by noticing that both PEG and BNF can be formally defined in very similar ways. Some of his results were extended in a previous paper by this author. We continue here with further extensions.
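A hedged illustration of why the conversion is non-trivial: the BNF rule A -> a | ab defines the set {a, ab}, but the same alternatives read as the PEG A <- a / ab can never consume "ab" as a whole, because ordered choice commits to "a". For this simple case, reordering the alternatives longest-first restores the intended behaviour (the paper treats the general problem); the combinator names below are assumptions for the sketch.

```python
# PEG ordered choice: alternatives are tried in order and the first
# success is committed to, unlike the unordered choice of BNF.

def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def ordered_choice(*alts):
    def parse(t, pos):
        for a in alts:
            r = a(t, pos)
            if r is not None:      # commit to the first alternative that matches
                return r
        return None
    return parse

A_naive = ordered_choice(lit("a"), lit("ab"))   # direct transcription of the BNF
A_fixed = ordered_choice(lit("ab"), lit("a"))   # longest alternative first

print(A_naive("ab", 0))   # 1 -- stops after "a"
print(A_fixed("ab", 0))   # 2 -- matches all of "ab"
```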
9. Cut Points in PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. It has recently been noticed that when the parser is to explore several alternatives one after another, no further alternatives need to be explored after the parser has reached a certain "cut point". This fact can be used to save both processing time and storage. The subject of the paper is the identification of cut points, which can also help in producing better diagnostics.
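A sketch of the cut-point idea on ordered choice: once the input passes a point where no later alternative could succeed, the parser can discard those alternatives and the state kept for backtracking into them. The manual first-token cut below is an assumption for illustration; the paper derives cut points from the grammar itself.

```python
def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def choice_with_cut(alts):
    # alts: list of (first_token, parser). Matching first_token acts as the
    # cut: once it matches, later alternatives are discarded, so no
    # backtracking state needs to be kept for them.
    def parse(text, pos):
        for first, parser in alts:
            if text.startswith(first, pos):
                return parser(text, pos)   # committed past the cut point
        return None
    return parse

stmt = choice_with_cut([("if", lit("if x then y")),
                        ("while", lit("while x do y"))])
print(stmt("while x do y", 0))   # 12 -- the whole statement consumed
```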
10. Image labelling by random graph parsing for syntactic scene description
EN
A new approach to scene labelling is proposed, involving parsing for graph grammars. To take into account all variations of an ambiguous (distorted) scene under study, a probabilistic description of the scene is needed; random graphs are proposed for such a description. An efficient, O(n²), parsing algorithm for random graphs is proposed as a tool for scene labelling. An example is provided.
EN
Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:2249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.
EN
Further results of research into parsable graph grammars used for syntactic pattern recognition (Pattern Recognition: 21, 623-629 (1988); 23, 765-774 (1990); 24, 12-23 (1991); 26, 1-16 (1993); 43, 2249-2264 (2010), Comput. Vision Graph. Image Process. 47, 1-21 (1989), Computer-Aided Design 27, 403-433 (1995), Theoret. Comp. Sci. 201, 189-231 (1998), Pattern Analysis Applications 17, 465-480 (2014)) are presented in the paper. The generative power of reduction-based parsable ETPR(k) graph grammars is investigated. The analogy between the triad of CF - LL(k) - LR(k) string languages and the triad of NLC - ETPL(k) - ETPR(k) graph languages is discussed.
13. Trying to Understand PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. Its properties are useful in many applications, but it is not well understood as a language definition tool. In its appearance, PEG is almost identical to a grammar in Extended Backus-Naur Form (EBNF), and one may expect it to define the same language. But, due to the limited backtracking, PEG may reject some strings defined by the EBNF, which gives an impression of PEG being unpredictable. We note that for some grammars the limited backtracking is "efficient", in the sense that it exhausts all possibilities. A PEG with efficient backtracking should therefore be easy to understand. There is no general algorithm to check whether a grammar has efficient backtracking, but it can often be checked by inspection. The paper outlines an interactive tool to facilitate such inspection.
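A standard illustration of a PEG rejecting a string its EBNF twin accepts (the grammar and combinator names are assumptions for this sketch): the EBNF S = ("a" | "ab") "c" accepts both "ac" and "abc", but read as the PEG S <- ("a" / "ab") "c", the ordered choice commits to "a" on input "abc" and the limited backtracking never reconsiders "ab", so the parse fails.

```python
def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def choice(*alts):                 # PEG ordered choice: commits to first success
    return lambda t, pos: next((r for a in alts
                                if (r := a(t, pos)) is not None), None)

def seq(*parts):
    def parse(t, pos):
        for p in parts:
            if pos is None:        # an earlier part already failed
                return None
            pos = p(t, pos)
        return pos
    return parse

# S <- ("a" / "ab") "c"
S = seq(choice(lit("a"), lit("ab")), lit("c"))
print(S("ac", 0))    # 2    -- accepted
print(S("abc", 0))   # None -- rejected by the PEG reading
```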
EN
In this article an algorithm for the automated detection of irrelevant or incorrect information on websites is introduced. The algorithm extracts semantic information from the webpage using third-party software and compares it with reliable resources. Reliable information is identified by means of majority voting or extracted from reliable databases.
EN
Technical Features of the Architecture of an Electronic Trilingual Dictionary. This article is devoted to the development of the software system used to create an English-Russian-Ukrainian terminological dictionary. Scanned and recognized documents in MS Word format were the input data for the dictionary. Issues which appeared during the parsing of the input data are analyzed, and solutions using regular expressions are identified. The article also describes the scheme of the dictionary's lexicographical database and its classes of models, views and view models. In addition, a detailed description of the software system from a user's perspective is included, the prospects for the usage of the dictionary are discussed, and the methods used during the development of the system are described. The software system is built using the Model-View-ViewModel design pattern. Through the use of this pattern, internal logic is separated from the user interface, so changes made in different parts of the software may be independent. The developed software system allows users to edit, to fill, and thus to create new thematic translation dictionaries in electronic form. The main advantage of the system is the equality of languages, i.e. each user can decide which language is to be the major one.
16. Evolutionary ordering of the mitochondrion-encoded proteins
EN
The parsing of a symbolic sequence into a set of short substrings called words, invented by the author, is used for a new definition of the distance between sequences. No sequence alignment is necessary. The most frequent words among the spectra of multiprotein sequences are selected and considered as a reference spectrum of the sequences. The distance between the reference spectrum and a protein sequence is considered as an estimate of the evolutionary distance of the protein. As an application, amino acid sequences of several mitochondrion-encoded proteins of mammal species are ordered according to their evolutionary distance. The statistical distribution of the distances exhibits some structure related to the evolutionary rate in the past.
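The author's word-parsing scheme is not specified in the abstract, so the following is only a generic sketch of the underlying idea: parse each sequence into short overlapping words, build a frequency spectrum, and compare spectra without any alignment. The word length and the distance measure are assumptions.

```python
# Alignment-free comparison of sequences via word-frequency spectra.

from collections import Counter

def spectrum(seq, k=3):
    # frequency spectrum of all overlapping length-k words
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_distance(s1, s2):
    # L1 distance between two word spectra
    words = set(s1) | set(s2)
    return sum(abs(s1[w] - s2[w]) for w in words)

a = spectrum("MKVLATT")
b = spectrum("MKVLSTT")
print(spectrum_distance(a, b))   # 6 -- three words differ in each sequence
```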
17. Parsing based on n-path tree-controlled grammars
EN
This paper discusses a recently introduced kind of linguistically motivated restriction placed on tree-controlled grammars: context-free grammars with some root-to-leaf paths in their derivation trees restricted by a control language. We deal with restrictions placed on n ≥ 1 paths controlled by a deterministic context-free language, and we recall several basic properties of such a rewriting system. Then, we study the possibilities of corresponding parsing methods working in polynomial time and demonstrate that some non-context-free languages can be generated by this regulated rewriting model. Furthermore, we illustrate the syntax analysis of LL grammars with controlled paths. Finally, we briefly discuss how to base parsing methods on bottom-up syntax analysis.
18.
EN
This paper presents the development of a grammar and a syntactic parser for the Vietnamese language. We first discuss the construction of a lexicalized tree-adjoining grammar using an automatic extraction approach. We then present the construction and evaluation of a deep syntactic parser based on the extracted grammar. This is a complete system that produces syntactic structures for Vietnamese sentences. A dependency annotation scheme for Vietnamese and an algorithm for extracting dependency structures from derivation trees are also proposed. This is the first Vietnamese parsing system capable of producing both constituency and dependency analyses. It offers encouraging performance: accuracy of 69.33% and 73.21% for constituency and dependency analysis, respectively.
EN
A syntactic pattern recognition model based on GDPLL(k) grammars has been proposed [6, 13] as an efficient tool for inference support in diagnostic and control expert systems. In this paper we discuss the software engineering aspects of the syntactic pattern recognition (sub)system. The architecture of the system should allow it to be embedded in real-time environments, to accumulate knowledge about the environment, and to react flexibly to changes in the environment. The object-oriented approach has been applied to design the system, and the Unified Modeling Language has been used for the specification of the software model. In the paper we present the model and its practical applications.
20.
EN
This paper is focused on the process of computing First Sets, which are used to build the structures that control a syntax analyser (also known as a parser). Three methods of creating First Sets were compared in terms of execution time: the known sequential algorithm, and the author's own methods of concurrently computing the sets for each non-terminal symbol (called the CEN method) and concurrently computing the sets for each production (called the CEP method). These methods were tested on a personal computer. Three programming languages (including the C language) were used in the research. The results and the analysis of the calculations allow the author to hypothesise that the problem of computing First Sets is hard to parallelise.
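The sequential baseline the paper compares against is, presumably, the classic fixpoint iteration over the productions; the sketch below shows that algorithm on a toy grammar (the grammar and names are assumptions, and the paper's CEN/CEP variants would partition this work per nonterminal or per production).

```python
# Classic sequential fixpoint computation of First Sets.
# "" denotes epsilon, both as a right-hand-side symbol and in the sets.

GRAMMAR = [                    # production list: (lhs, rhs)
    ("S", ["A", "b"]),
    ("A", ["a"]),
    ("A", [""]),               # A -> epsilon
]

def first_sets(grammar):
    nts = {lhs for lhs, _ in grammar}          # the nonterminal symbols
    first = {nt: set() for nt in nts}
    changed = True
    while changed:                             # iterate to a fixpoint
        changed = False
        for lhs, rhs in grammar:
            acc = set()
            for sym in rhs:
                if sym in nts:
                    acc |= first[sym] - {""}
                    if "" not in first[sym]:   # sym not (yet) nullable: stop
                        break
                else:
                    acc.add(sym)               # terminal (or epsilon) ends the scan
                    break
            else:
                acc.add("")                    # every symbol was nullable
            if not acc <= first[lhs]:
                first[lhs] |= acc
                changed = True
    return first

fs = first_sets(GRAMMAR)
print(sorted(fs["S"]), sorted(fs["A"]))   # ['a', 'b'] ['', 'a']
```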