Search results
Searched for — in keywords: parsing
Results found: 24
EN
In this paper we present a parsing framework for extensions of Tree Adjoining Grammar (TAG) called TuLiPA (Tübingen Linguistic Parsing Architecture). In particular, besides TAG, the parser can process Tree-Tuple MCTAG with Shared Nodes (TT-MCTAG), a TAG extension which has been proposed to deal with scrambling in free word order languages such as German. The central strategy of the parser is to transform the incoming TT-MCTAG (or TAG) into an equivalent Range Concatenation Grammar (RCG), which, in turn, is then used for parsing. The RCG parser is an incremental Earley-style chart parser. In addition to the syntactic analysis, TuLiPA also computes an underspecified semantic analysis for grammars that are equipped with semantic representations.
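The RCG parser described above is an incremental Earley-style chart parser. As a rough sketch of what Earley-style chart parsing means, the following minimal recognizer works on a plain CFG; the toy grammar, item encoding, and function names are illustrative assumptions, not TuLiPA's actual RCG machinery.

```python
# A minimal Earley-style chart recognizer for a toy CFG.
# An item is (lhs, rhs, dot, origin); chart[i] holds items whose
# recognized part ends at input position i.

GRAMMAR = {            # nonterminal -> list of right-hand sides
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"], ["n"]],
    "VP": [["v", "NP"]],
}

def earley_recognize(tokens, start="S"):
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in GRAMMAR:                          # predictor
                    for alt in GRAMMAR[sym]:
                        new = (sym, tuple(alt), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
                elif i < len(tokens) and tokens[i] == sym:  # scanner
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:                                           # completer
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])

print(earley_recognize(["det", "n", "v", "n"]))  # True
```

The three classic operations (predictor, scanner, completer) run to saturation at each position, which is what makes the strategy incremental: chart cell i is complete before cell i+1 is processed.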
2. A simple CF formalism and free word order
EN
The first objective of this paper is to present a simple grammatical formalism named tree-generating Binary Grammar. The formalism is weakly equivalent to CFGs, yet it is capable of generating various kinds of syntactic trees, including dependency trees. Its strong equivalence to some other grammatical formalisms is discussed. The second objective is to show how some free word order phenomena in Polish can be captured within the proposed formalism.
EN
We prove a Chomsky-Schützenberger representation theorem for multiple context-free languages weighted over complete commutative strong bimonoids. Using this representation we devise a parsing algorithm for a restricted form of those devices.
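For orientation, the classical unweighted, context-free Chomsky-Schützenberger theorem, which the paper's weighted multiple context-free version generalizes, states that every context-free language L can be written as

```latex
L = h\,(D_k \cap R)
```

where D_k is a Dyck language over k pairs of brackets, R is a regular language, and h is a homomorphism; the weighted statement replaces these ingredients by suitably weighted counterparts.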
4. Tunnel parsing with counted repetitions
EN
This article describes a new and efficient parsing algorithm (called tunnel parsing) that parses from left to right on the basis of a context-free grammar without left recursion and without rules that recognize empty words. The algorithm is mostly applicable to domain-specific languages. Particular attention is paid to the parsing of grammar element repetitions. As a result of the parsing, a statically typed concrete syntax tree that accurately reflects the grammar is built from top to bottom. The parsing is done through iteration rather than recursion. The tunnel parsing algorithm uses the grammars directly, without prior refactoring, and has linear time complexity for deterministic context-free grammars.
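The tunnel parsing algorithm itself is not spelled out in the abstract, so the following is only a generic sketch of one idea it highlights: handling a grammar repetition (item*) by iteration rather than recursion, while building a statically typed syntax-tree node. All names and the toy rule are illustrative assumptions.

```python
# Sketch: parse the repetition rule  list <- NUMBER*  iteratively,
# producing one typed CST node that owns all repeated items.

from dataclasses import dataclass, field

@dataclass
class RepetitionNode:          # a statically typed CST node for "NUMBER*"
    items: list = field(default_factory=list)

def parse_number_list(tokens):
    node, pos = RepetitionNode(), 0
    while pos < len(tokens) and tokens[pos].isdigit():   # iterate, don't recurse
        node.items.append(int(tokens[pos]))
        pos += 1
    return node, pos           # the node plus how many tokens were consumed

node, consumed = parse_number_list(["1", "2", "3", "x"])
print(node.items, consumed)    # [1, 2, 3] 3
```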
EN
DPLL(k) automata have been introduced as a tool for inference support in pattern recognition-based real-time expert systems. The automata can be characterised by the two following features: they can recognise languages of great descriptive power (quasi-context-sensitive languages), and they are efficient (i.e. they are of linear computational complexity). These two features make the automata useful in many practical applications, such as the on-line analysis of complex trend functions describing the behaviour of industrial equipment. In this paper we present a method for constructing transition functions for the automata, together with a formal proof of its correctness.
6. Parsing Local Internal Contextual Languages with Context-Free Choice
EN
We extend the result from [8] by also giving a concrete polynomial parsing algorithm for a class of languages generated by a variant of contextual grammars, namely local internal contextual grammars with context-free choice.
EN
The paper focuses on the adjustment of NLP tools for Polish (e.g., morphological analyzers and parsers) to user-generated content (UGC). The authors discuss two rule-based techniques applied to improve their efficiency: pre-processing (text normalization) and parser adaptation (modified segmentation and parsing rules). A new solution for handling OOVs, based on inflectional translation, is also offered.
8. More About Converting BNF to PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. The parser has many useful properties. Converting a PEG to an executable parser is a rather straightforward task. Unfortunately, PEG is not well understood as a language definition tool. It is thus of practical interest to construct PEGs for languages specified in some familiar way, such as Backus-Naur Form (BNF). The problem was attacked by Medeiros in an elegant way by noticing that both PEG and BNF can be formally defined in very similar ways. Some of his results were extended in a previous paper by this author. We continue here with further extensions.
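A hedged illustration of why the conversion is non-trivial: the BNF rule A -> a | ab defines the set {a, ab}, but the same alternatives read as the PEG A <- a / ab can never consume "ab" as a whole, because ordered choice commits to "a". For this simple case, reordering the alternatives longest-first restores the intended behaviour (the paper treats the general problem); the combinator names below are assumptions for the sketch.

```python
# PEG ordered choice: alternatives are tried in order and the first
# success is committed to, unlike the unordered choice of BNF.

def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def ordered_choice(*alts):
    def parse(t, pos):
        for a in alts:
            r = a(t, pos)
            if r is not None:      # commit to the first alternative that matches
                return r
        return None
    return parse

A_naive = ordered_choice(lit("a"), lit("ab"))   # direct transcription of the BNF
A_fixed = ordered_choice(lit("ab"), lit("a"))   # longest alternative first

print(A_naive("ab", 0))   # 1 -- stops after "a"
print(A_fixed("ab", 0))   # 2 -- matches all of "ab"
```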
9. Cut Points in PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. It has recently been noticed that when the parser is to explore several alternatives one after another, no further alternatives need to be explored after the parser has reached a certain "cut point". This fact can be used to save both processing time and storage. The subject of the paper is the identification of cut points, which can also help in producing better diagnostics.
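A sketch of the cut-point idea on ordered choice: once the input passes a point where no later alternative could succeed, the parser can discard those alternatives and the state kept for backtracking into them. The manual first-token cut below is an assumption for illustration; the paper derives cut points from the grammar itself.

```python
def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def choice_with_cut(alts):
    # alts: list of (first_token, parser). Matching first_token acts as the
    # cut: once it matches, later alternatives are discarded, so no
    # backtracking state needs to be kept for them.
    def parse(text, pos):
        for first, parser in alts:
            if text.startswith(first, pos):
                return parser(text, pos)   # committed past the cut point
        return None
    return parse

stmt = choice_with_cut([("if", lit("if x then y")),
                        ("while", lit("while x do y"))])
print(stmt("while x do y", 0))   # 12 -- the whole statement consumed
```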
10. Image labelling by random graph parsing for syntactic scene description
EN
A new approach to scene labelling is proposed, involving parsing for graph grammars. To take into account all variations of an ambiguous (distorted) scene under study, a probabilistic description of the scene is needed; random graphs are proposed for such a description. An efficient, O(n²), parsing algorithm for random graphs is proposed as a tool for scene labelling. An example is provided.
EN
Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:2249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.
EN
Further results of research into parsable graph grammars used for syntactic pattern recognition (Pattern Recognition: 21, 623-629 (1988); 23, 765-774 (1990); 24, 12-23 (1991); 26, 1-16 (1993); 43, 2249-2264 (2010), Comput. Vision Graph. Image Process. 47, 1-21 (1989), Computer-Aided Design 27, 403-433 (1995), Theoret. Comp. Sci. 201, 189-231 (1998), Pattern Analysis Applications 17, 465-480 (2014)) are presented in the paper. The generative power of reduction-based parsable ETPR(k) graph grammars is investigated. The analogy between the triad of CF - LL(k) - LR(k) string languages and the triad of NLC - ETPL(k) - ETPR(k) graph languages is discussed.
13. Trying to Understand PEG
EN
Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. Its properties are useful in many applications, but it is not well understood as a language definition tool. In its appearance, PEG is almost identical to a grammar in Extended Backus-Naur Form (EBNF), and one may expect it to define the same language. But, due to the limited backtracking, PEG may reject some strings defined by the EBNF, which gives an impression of PEG being unpredictable. We note that for some grammars the limited backtracking is "efficient", in the sense that it exhausts all possibilities. A PEG with efficient backtracking should therefore be easy to understand. There is no general algorithm to check whether a grammar has efficient backtracking, but it can often be checked by inspection. The paper outlines an interactive tool to facilitate such inspection.
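A standard illustration of a PEG rejecting a string its EBNF twin accepts (the grammar and combinator names are assumptions for this sketch): the EBNF S = ("a" | "ab") "c" accepts both "ac" and "abc", but read as the PEG S <- ("a" / "ab") "c", the ordered choice commits to "a" on input "abc" and the limited backtracking never reconsiders "ab", so the parse fails.

```python
def lit(s):
    return lambda t, pos: pos + len(s) if t.startswith(s, pos) else None

def choice(*alts):                 # PEG ordered choice: commits to first success
    return lambda t, pos: next((r for a in alts
                                if (r := a(t, pos)) is not None), None)

def seq(*parts):
    def parse(t, pos):
        for p in parts:
            if pos is None:        # an earlier part already failed
                return None
            pos = p(t, pos)
        return pos
    return parse

# S <- ("a" / "ab") "c"
S = seq(choice(lit("a"), lit("ab")), lit("c"))
print(S("ac", 0))    # 2    -- accepted
print(S("abc", 0))   # None -- rejected by the PEG reading
```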
EN
In this article an algorithm for the automated detection of irrelevant or incorrect information on websites is introduced. The algorithm extracts semantic information from the webpage using third-party software and compares it with reliable resources. Reliable information is identified by means of majority voting or extracted from reliable databases.
EN
Technical Features of the Architecture of an Electronic Trilingual Dictionary. This article is devoted to the development of the software system used to create an English-Russian-Ukrainian terminological dictionary. Scanned and recognized documents in MS Word format were the input data for the dictionary. Issues which appeared during the parsing of the input data are analyzed, and solutions using regular expressions are identified. The article also describes the scheme of the dictionary's lexicographical database and its classes of models, views and view models. In addition, a detailed description of the software system from a user's perspective is included, the prospects for the usage of the dictionary are discussed, and the methods used during the development of the system are described. The software system is built using the Model-View-ViewModel design pattern. Through the use of this pattern, internal logic is separated from the user interface, so changes made in different parts of the software may be independent. The developed software system allows users to edit, to fill, and thus to create new thematic translation dictionaries in electronic form. The main advantage of the system is the equality of languages, i.e. each user can decide which language is to be the major one.
16. Evolutionary ordering of the mitochondrion-encoded proteins
EN
The parsing of a symbolic sequence into a set of short substrings called words, invented by the author, is used for a new definition of the distance between sequences. No sequence alignment is necessary. The most frequent words among the spectra of multiprotein sequences are selected and considered as a reference spectrum of the sequences. The distance between the reference spectrum and a protein sequence is considered as an estimate of the evolutionary distance of the protein. As an application, amino acid sequences of several mitochondrion-encoded proteins of mammal species are ordered according to their evolutionary distance. The statistical distribution of the distances exhibits some structure related to the evolutionary rate in the past.
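The author's word-parsing scheme is not specified in the abstract, so the following is only a generic sketch of the underlying idea: parse each sequence into short overlapping words, build a frequency spectrum, and compare spectra without any alignment. The word length and the distance measure are assumptions.

```python
# Alignment-free comparison of sequences via word-frequency spectra.

from collections import Counter

def spectrum(seq, k=3):
    # frequency spectrum of all overlapping length-k words
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_distance(s1, s2):
    # L1 distance between two word spectra
    words = set(s1) | set(s2)
    return sum(abs(s1[w] - s2[w]) for w in words)

a = spectrum("MKVLATT")
b = spectrum("MKVLSTT")
print(spectrum_distance(a, b))   # 6 -- three words differ in each sequence
```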
17. Parsing based on n-path tree-controlled grammars
EN
This paper discusses a recently introduced kind of linguistically motivated restriction placed on tree-controlled grammars: context-free grammars with some root-to-leaf paths in their derivation trees restricted by a control language. We deal with restrictions placed on n ≥ 1 paths controlled by a deterministic context-free language, and we recall several basic properties of such a rewriting system. Then, we study the possibilities of corresponding parsing methods working in polynomial time and demonstrate that some non-context-free languages can be generated by this regulated rewriting model. Furthermore, we illustrate the syntax analysis of LL grammars with controlled paths. Finally, we briefly discuss how to base parsing methods on bottom-up syntax analysis.
18.
EN
This paper presents the development of a grammar and a syntactic parser for the Vietnamese language. We first discuss the construction of a lexicalized tree-adjoining grammar using an automatic extraction approach. We then present the construction and evaluation of a deep syntactic parser based on the extracted grammar. This is a complete system that produces syntactic structures for Vietnamese sentences. A dependency annotation scheme for Vietnamese and an algorithm for extracting dependency structures from derivation trees are also proposed. This is the first Vietnamese parsing system capable of producing both constituency and dependency analyses. It offers encouraging performance: accuracy of 69.33% and 73.21% for constituency and dependency analysis, respectively.
EN
A syntactic pattern recognition model based on GDPLL(k) grammars has been proposed [6, 13] as an efficient tool for inference support in diagnostic and control expert systems. In this paper we discuss the software engineering aspects of the syntactic pattern recognition (sub)system. The architecture of the system should allow it to be embedded in real-time environments, to accumulate knowledge about the environment, and to react flexibly to changes in the environment. The object-oriented approach has been applied to design the system, and the Unified Modeling Language has been used for the specification of the software model. In the paper we present the model and its practical applications.
20.
EN
This paper is focused on the process of computing First Sets, which are used to build the structures that control a syntax analyser (also known as a parser). Three methods of creating First Sets were compared in terms of execution time: the known sequential algorithm, and the author's own methods of concurrently computing the sets for each non-terminal symbol (called the CEN method) and concurrently computing the sets for each production (called the CEP method). These methods were tested on a personal computer. Three programming languages (including the C language) were used in the research. The results and the analysis of the calculations allow the author to hypothesise that the problem of computing First Sets is hard to parallelise.
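The sequential baseline the paper compares against is, presumably, the classic fixpoint iteration over the productions; the sketch below shows that algorithm on a toy grammar (the grammar and names are assumptions, and the paper's CEN/CEP variants would partition this work per nonterminal or per production).

```python
# Classic sequential fixpoint computation of First Sets.
# "" denotes epsilon, both as a right-hand-side symbol and in the sets.

GRAMMAR = [                    # production list: (lhs, rhs)
    ("S", ["A", "b"]),
    ("A", ["a"]),
    ("A", [""]),               # A -> epsilon
]

def first_sets(grammar):
    nts = {lhs for lhs, _ in grammar}          # the nonterminal symbols
    first = {nt: set() for nt in nts}
    changed = True
    while changed:                             # iterate to a fixpoint
        changed = False
        for lhs, rhs in grammar:
            acc = set()
            for sym in rhs:
                if sym in nts:
                    acc |= first[sym] - {""}
                    if "" not in first[sym]:   # sym not (yet) nullable: stop
                        break
                else:
                    acc.add(sym)               # terminal (or epsilon) ends the scan
                    break
            else:
                acc.add("")                    # every symbol was nullable
            if not acc <= first[lhs]:
                first[lhs] |= acc
                changed = True
    return first

fs = first_sets(GRAMMAR)
print(sorted(fs["S"]), sorted(fs["A"]))   # ['a', 'b'] ['', 'a']
```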