Results found: 30

Search results
Searched in keywords: NLP
2015 | Vol. 16 (2) | pp. 169-184 | EN
Text alignment and text quality are critical to the accuracy of Machine Translation (MT) systems, some NLP tools, and any other text processing tasks requiring bilingual data. This research proposes a language-independent bisentence filtering approach based on Polish (a language that is not position-sensitive) to English experiments. This cleaning approach was developed on the TED Talks corpus and also initially tested on the Wikipedia comparable corpus, but it can be used for any text domain or language pair. The proposed approach implements various heuristics for sentence comparison. Some of the heuristics leverage synonyms as well as semantic and structural analysis of text as additional information. Minimization of data loss has been ensured. An improvement in MT system scores with text processed using this tool is discussed.
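The abstract names length- and synonym-based heuristics only in general terms; the following is a minimal, hypothetical sketch of heuristic bisentence filtering (a length-ratio check plus a lexicon-overlap check), not the heuristics actually used in the paper. The function names, thresholds and the `lexicon` dictionary are assumptions.

```python
# Minimal sketch of heuristic bisentence filtering; thresholds and helper
# names are illustrative, not the paper's heuristics.
def length_ratio_ok(src: str, tgt: str, max_ratio: float = 2.0) -> bool:
    """Reject pairs whose token counts differ too much."""
    ls, lt = len(src.split()), len(tgt.split())
    if ls == 0 or lt == 0:
        return False
    return max(ls, lt) / min(ls, lt) <= max_ratio

def overlap_ok(src: str, tgt: str, lexicon: dict, min_overlap: float = 0.3) -> bool:
    """Require that a fraction of source words have a known translation in the target."""
    src_tokens = src.lower().split()
    tgt_tokens = set(tgt.lower().split())
    hits = sum(1 for w in src_tokens if any(t in tgt_tokens for t in lexicon.get(w, [])))
    return hits / max(len(src_tokens), 1) >= min_overlap

def filter_bisentences(pairs, lexicon):
    """Keep only sentence pairs that pass both heuristics."""
    return [(s, t) for s, t in pairs if length_ratio_ok(s, t) and overlap_ok(s, t, lexicon)]
```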
EN
Automatic text categorization presents many difficulties. Modern algorithms are getting better at extracting meaningful information from human language; however, they often significantly increase the complexity of computations. This increased demand for computational capability can be met by using hardware accelerators such as general-purpose graphics cards. In this paper we present a full processing flow for a document categorization system. Using the Gram-Schmidt process for signature calculation yields up to a 12-fold decrease in the computing time of system components.
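As an illustration of the signature-calculation step mentioned above, here is a minimal CPU-only sketch of the classical Gram-Schmidt process applied to document term vectors; the paper's GPU implementation and the rest of its processing flow are not reproduced, and the toy data are placeholders.

```python
import numpy as np

def gram_schmidt(vectors: np.ndarray) -> np.ndarray:
    """Orthonormalize the rows of `vectors` (e.g. document term-count vectors)
    with the classical Gram-Schmidt process; a CPU-only illustration of the
    step the paper accelerates on a GPU."""
    basis = []
    for v in vectors.astype(float):
        w = v - sum(np.dot(v, b) * b for b in basis)
        norm = np.linalg.norm(w)
        if norm > 1e-10:          # skip (near-)linearly dependent vectors
            basis.append(w / norm)
    return np.array(basis)

# Toy usage: three short "documents" as term-count vectors.
docs = np.array([[2, 0, 1, 0],
                 [0, 1, 1, 1],
                 [1, 1, 0, 2]])
signatures = gram_schmidt(docs)   # orthonormal signature vectors
```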
Towards Learning Word Representation
Vol. 25 | pp. 103-115 | EN
Continuous vector representations, as distributed representations of words, have gained a lot of attention in the Natural Language Processing (NLP) field. Although they are considered valuable methods to model both semantic and syntactic features, they may still be improved. For instance, an open issue seems to be developing different strategies to introduce knowledge about the morphology of words. This is a core point in the case of dense languages, where many rare words appear, and of texts with numerous metaphors or similes. In this paper, we extend a recent approach to representing word information. The underlying idea of our technique is to present a word in the form of a bag of syllable and letter n-grams. More specifically, we provide a vector representation for each extracted syllable-based and letter-based n-gram, and perform concatenation. Moreover, in contrast to the previous method, we accept n-grams of varied length n. Further, various experiments, such as word similarity ranking and sentiment analysis, show that our method is competitive with other state-of-the-art techniques and takes a step toward more informative word representation construction.
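A minimal sketch of the bag-of-n-grams idea described above, assuming letter n-grams of varied length and simple averaging of pre-existing n-gram vectors; the paper additionally uses syllable-based n-grams and concatenation, and the helper names and dimensionality below are hypothetical.

```python
import numpy as np

def letter_ngrams(word: str, n_min: int = 2, n_max: int = 4):
    """All letter n-grams of varied length n, with boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
                       for i in range(len(w) - n + 1)]

def word_vector(word: str, ngram_vectors: dict, dim: int = 50) -> np.ndarray:
    """Compose a word vector from the vectors of its n-grams (here: averaging;
    the paper concatenates syllable- and letter-based representations)."""
    grams = [g for g in letter_ngrams(word) if g in ngram_vectors]
    if not grams:
        return np.zeros(dim)
    return np.mean([ngram_vectors[g] for g in grams], axis=0)
```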
EN
Writing well-structured scientific documents, such as articles and theses, is vital for comprehending a document's argumentation and understanding its messages. Furthermore, it has an impact on the efficiency and time required for studying the document. Proper document segmentation also yields better results when employing automated Natural Language Processing (NLP) algorithms, including summarization and other information retrieval and analysis functions. Unfortunately, inexperienced writers, such as young researchers and graduate students, often struggle to produce well-structured professional documents. Their writing frequently exhibits improper segmentation or lacks semantically coherent segments, a phenomenon referred to as "mal-segmentation." Examples of mal-segmentation include improper paragraph or section divisions and unsmooth transitions between sentences and paragraphs. This research addresses the issue of mal-segmentation in scientific writing by introducing an automated method for detecting mal-segmentation, utilizing Sentence Bidirectional Encoder Representations from Transformers (sBERT) as the encoding mechanism. The experimental results section shows promising results for the detection of mal-segmentation using the sBERT technique.
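One way such a detector could be sketched is shown below, assuming the open-source sentence-transformers package as the sBERT encoder and a cosine-similarity threshold on adjacent sentences; the model name, threshold and decision rule are illustrative assumptions, not the method evaluated in the paper.

```python
# A minimal sketch of boundary checking with sentence embeddings; the model
# name, the threshold and the decision rule are assumptions.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed example model

def flag_weak_boundaries(sentences, threshold=0.25):
    """Return indices i where sentence i and i+1 look semantically unrelated,
    i.e. candidate mal-segmentation (or missing-transition) points."""
    emb = model.encode(sentences, normalize_embeddings=True)
    sims = (emb[:-1] * emb[1:]).sum(axis=1)       # cosine of adjacent pairs
    return [i for i, s in enumerate(sims) if s < threshold]
```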
Vol. 41 | No. 2 | pp. 58-66 | EN
The huge popularity of Neuro-Linguistic Programming (NLP) therapies and training has not been accompanied by knowledge of the empirical underpinnings of the concept. The article presents the concept of NLP in the light of empirical research in the Neuro-Linguistic Programming Research Data Base. From among 315 articles the author selected 63 studies published in journals from the Master Journal List of ISI. Out of 33 studies, 18.2% show results supporting the tenets of NLP, 54.5% show results non-supportive of the NLP tenets, and 27.3% bring uncertain results. The qualitative analysis indicates that the non-supportive studies carry greater weight and greater methodological worth than the ones supporting the tenets. The results contradict the claim of an empirical basis of NLP.
PL
The article concerns a particular kind of message encryption in which the ciphertext is hidden in the form of text. As a result, we obtain a ciphertext in the form of text that is stylistically and semantically correct, and thus close to natural text. In the course of the research we analyze the s-Tech encrypting-and-hiding method, in particular its φ indicator, which serves to assess the difficulty of generating the ciphertext and to estimate the quality of the resulting text, that is, the degree of naturalness of the produced ciphertext. The aim of the research is to verify the usefulness of this measure as a universal indicator of the complexity of the encryption process and of text quality. The φ indicator is examined by manipulating two system parameters: the length of n-grams in the n-gram base (ranging from n=1 to n=6, also denoted as LBS) and enabling (or disabling) preprocessing. We assess their combined influence not only on the course and difficulty of encryption, but also on the quality of the ciphertext. The analysis compares results for three preprocessing variants: hybrid encryption combined with LZW compression, SMAZ compression, and a reference situation in which ASCII plaintext is encrypted without preprocessing.
EN
The paper focuses on a unique encryption method combined with shaping the ciphertext as natural text, which is a form of steganography. We analyze the s-Tech encryption method and its φ indicator by evaluating the difficulty of ciphertext generation and the quality of the resulting natural text. The research aims to examine φ as a universal indicator of both encryption complexity and natural-text quality. The analysis involves three preprocessing variants: hybrid encryption with LZW compression, SMAZ compression, and a reference situation with no preprocessing.
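The s-Tech method and its φ indicator are not specified in the abstracts; the only standard component named there that can be sketched safely is the LZW compression used in one preprocessing variant. Below is a textbook LZW compressor in Python, purely as a reference for that step.

```python
def lzw_compress(text: str) -> list[int]:
    """Textbook LZW: returns a list of integer codes for `text`
    (the compression step applied before encryption in one variant)."""
    dictionary = {chr(i): i for i in range(256)}
    w, codes = "", []
    for ch in text:
        wc = w + ch
        if wc in dictionary:
            w = wc
        else:
            codes.append(dictionary[w])
            dictionary[wc] = len(dictionary)
            w = ch
    if w:
        codes.append(dictionary[w])
    return codes

print(lzw_compress("TOBEORNOTTOBEORTOBEORNOT"))
```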
PL
Broad access to the Internet and the existence of a huge number of texts in electronic form make it necessary to develop the discipline known as linguistic engineering, which deals with broadly understood processing of linguistic data. One aspect of processing such data is the generation of texts in natural language. Since the vast majority of newly created texts are available in electronic form, there is very high demand for programs that process them. The main aim of this article is to present the concept of a relational database that forms the basis of an experimental program which automatically generates descriptive grades in early school education.
EN
Common access to the Internet and the huge number of texts in electronic form make it necessary to advance the science known as linguistic engineering, which deals with natural language processing in its broad sense. One aspect of processing this kind of data is generating texts in natural language. Because most newly created texts are available in electronic form, there is large demand for programs that process them. The main point of this article is to present the conception of a relational database that forms the fundamental part of an experimental program automatically generating descriptive grades in elementary schools.
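The article does not publish its schema, so the following is only a hypothetical sqlite3 sketch of what a relational base feeding a descriptive-grade generator could look like; all table and column names are invented for illustration.

```python
import sqlite3

# Hypothetical toy schema; the article's actual tables are not published here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pupils      (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE skills      (id INTEGER PRIMARY KEY, area TEXT, description TEXT);
CREATE TABLE phrases     (id INTEGER PRIMARY KEY,
                          skill_id INTEGER REFERENCES skills(id),
                          level INTEGER, text TEXT);          -- sentence templates
CREATE TABLE assessments (pupil_id INTEGER REFERENCES pupils(id),
                          skill_id INTEGER REFERENCES skills(id), level INTEGER);
""")

def descriptive_grade(pupil_id: int) -> str:
    """Concatenate the phrase matching each assessed skill level for a pupil."""
    rows = conn.execute("""
        SELECT p.text FROM assessments a
        JOIN phrases p ON p.skill_id = a.skill_id AND p.level = a.level
        WHERE a.pupil_id = ?""", (pupil_id,)).fetchall()
    return " ".join(text for (text,) in rows)
```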
EN
This research paper deals with self-confidence and how the application of the Neuro-Linguistic Programming (NLP) technique of six-step reframing plays a crucial role in students' self-confidence in the English as a foreign language (EFL) classroom. It analyses the effectiveness of incorporating NLP six-step reframing on students' self-confidence levels across different age groups, cultural backgrounds, and academic settings. This provides an opportunity for learners to gain self-confidence, helping them to truly believe in themselves, and allowing them to feel 'seen, heard, and understood'. Reframing, drawing from Neuro-Linguistic Programming (NLP), enables learners to see mistakes as an integral part of the learning process and also helps them refine the skills that are crucial for language acquisition. The research includes sixty-six students who differ in age, cultural background, and learning experience. Quantitative methods, through questionnaires, and qualitative methods, through observation and interviews, have been helpful in conducting this research. The results have shown a significant effect of incorporating NLP's six-step reframing within the pedagogical framework, shedding light on its potential to address the self-confidence challenges commonly encountered in language learning environments. The study has shown that, even when students encounter difficulties or setbacks in language learning, the incorporation of the six-step reframing technique has proven to be a transformative approach. By systematically identifying dissatisfaction, establishing clear signals and alternative behaviors, and eliciting positive intentions, the technique encourages students' self-confidence in learning a foreign language. This psychological tool not only contributes to language proficiency but also empowers students to navigate the complexities of the EFL classroom with confidence and adaptability.
EN
In this paper, we experimentally demonstrate two types of pulses, dissipative soliton resonance (DSR) pulses and noise-like pulses (NLP), in a mode-locked fiber laser using a nonlinear optical loop mirror (NOLM). By appropriately adjusting the polarization states, switchable generation of DSR and NLP can be achieved from one mode-locked fiber laser. By adjusting the pump power, the pulse width of the DSR increases gradually from 2.45 to 13.35 ns with a constant peak intensity, while the NLP shows only a slight increase, even splitting into two narrower pulses at higher pump power. Both types of pulses, DSR and NLP, have the same pulse period of 1.29 μs, corresponding to the cavity length of the fiber laser. The obtained results display the evolution process of the DSR pulse and NLP in a mode-locked fiber laser and have applications in optical sensing, spectral reflectometry, micromachining, and other related domains.
EN
This paper presents cost-optimal project scheduling. The optimization was performed by the nonlinear programming (NLP) approach. The nonlinear total project cost objective function is subject to a rigorous system of activity precedence relationship constraints, activity duration constraints, and project duration constraints. The set of activity precedence relationship constraints was defined to comprise Finish-to-Start, Start-to-Start, Start-to-Finish and Finish-to-Finish precedence relationships between activities. The activity duration constraints determine relationships between the minimum, maximum and possible durations of the project activities. The project duration constraints define the maximum feasible project duration. A numerical example is presented at the end of the paper to demonstrate the applicability of the proposed approach.
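A generic way to write the constraint types listed above is shown below, with assumed symbols (S_i start time, d_i duration, ℓ_ij lag, T project duration); this is a standard formulation sketch, not the paper's exact NLP model.

```latex
% Generic form of the constraint types named in the abstract (symbols assumed).
\begin{align*}
  S_j &\ge S_i + d_i + \ell_{ij}       && \text{Finish-to-Start}\\
  S_j &\ge S_i + \ell_{ij}             && \text{Start-to-Start}\\
  S_j + d_j &\ge S_i + \ell_{ij}       && \text{Start-to-Finish}\\
  S_j + d_j &\ge S_i + d_i + \ell_{ij} && \text{Finish-to-Finish}\\
  d_i^{\min} &\le d_i \le d_i^{\max}   && \text{activity duration}\\
  S_i + d_i &\le T \le T^{\max}        && \text{project duration}
\end{align*}
```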
|
2023
|
tom 35
1-19
EN
In this article, a morphological linguistic resource for contemporary French called Morfetik is presented. The evolution of the resource and its various linguistic and technological characteristics are discussed. Additionally, an overview of the numerous tools integrated into the resource is provided. Morfetik represents a continuously evolving platform designed to progress and enhance the processing of textual data.
NLP – narzędzie perswazji czy sztuka manipulacji? (NLP – a tool of persuasion or an art of manipulation?)
2011 | Vol. 5 | pp. 19-24 | EN
Neurolinguistic programming, a technology once used mostly by psychotherapists, is nowadays a tool used increasingly in various fields of human activity. It is used above all by retailers, in whose work negotiations and communication techniques play a special role. Neurolinguistic programming aims at establishing positive contact with a correspondent, strengthening the forces of communication, and creating and modifying patterns of thinking and perception of other people. These actions are the cause of lively discussion between enthusiasts, who appreciate the effectiveness of NLP as a tool of persuasion, and its critics, who point to its negative aspects and its possible misuse for the purposes of manipulation.
2015 | Vol. 16 | No. 2 | pp. 133-141 | EN
The problems of optimization for nonlinear programming (NLP) with inequality constraints are considered. The definition of a condition-indicator as a quantitative criterion of the properties of the Lagrange function is justified. The application of the indicator to increasing the degree of completeness of the equation system in NLP finance and business problems with inequality constraints is presented. A new Lagrange function, in which each component of the vector of Lagrange multipliers is squared, is investigated for the nonlinear objective function, simultaneously with the criterion-indicator as a source of additional equations. Conditions under which the dimensionality of the vector of strategies and the number of constraints do not affect the uniqueness of the optimization problem solution are derived and discussed.
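One plausible reading of the squared-multiplier Lagrange function described above is sketched below next to the standard Lagrange function; the notation is assumed and may differ from the paper's.

```latex
% For min f(x) subject to g_i(x) <= 0: standard form (left) and a
% squared-multiplier variant (right), as one possible interpretation.
\begin{align*}
  L(x,\lambda)        &= f(x) + \sum_{i=1}^{m} \lambda_i\, g_i(x), &
  \tilde{L}(x,\lambda) &= f(x) + \sum_{i=1}^{m} \lambda_i^{2}\, g_i(x).
\end{align*}
```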
EN
This study aims to evaluate experimentally the word vectors produced by three widely used embedding methods for word-level semantic text similarity in Turkish. Three benchmark datasets, SimTurk, AnlamVer, and RG65_Turkce, are used to evaluate the word embedding vectors produced by three different methods, namely Word2Vec, Glove, and FastText. As a result of the comparative analysis, Turkish word vectors produced with Glove and FastText achieved better correlation in word-level semantic similarity. It is also found that the Turkish word coverage of FastText is ahead of the other two methods, because only a limited number of out-of-vocabulary (OOV) words were observed in the experiments conducted with FastText. Another observation is that FastText and Glove vectors showed great success in terms of the Spearman correlation value on the SimTurk and AnlamVer datasets, both of which were prepared and evaluated entirely by native Turkish individuals. This is another indicator that these datasets better represent the Turkish language in terms of morphology and inflection.
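The evaluation protocol implied above (cosine similarity from pretrained vectors compared with human scores via Spearman correlation) can be sketched as follows, assuming gensim and SciPy; the vector file name and the word-pair tuples are placeholders.

```python
# Sketch of word-level similarity evaluation: model cosine similarity versus
# human judgments, compared with Spearman correlation.
from gensim.models import KeyedVectors
from scipy.stats import spearmanr

vectors = KeyedVectors.load_word2vec_format("tr_vectors.vec")  # assumed file

def evaluate(pairs):
    """pairs: iterable of (word1, word2, human_score)."""
    model_scores, human_scores = [], []
    for w1, w2, gold in pairs:
        if w1 in vectors and w2 in vectors:       # skip OOV pairs
            model_scores.append(vectors.similarity(w1, w2))
            human_scores.append(gold)
    rho, _ = spearmanr(model_scores, human_scores)
    return rho
```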
Natural language processing for web search: fuzzy set approach
EN
This article investigates whether an innovatory fuzzy set approach to the syntactic analysis of natural language can be of service to information retrieval systems. It concentrates on Web search, as the Internet has become a vast resource of information. In addition, the article presents the syntactic analysis module of the TORCH project, in which fuzzy set disambiguation has been implemented and tested.
2020 | No. 11-12 | pp. 13-22 | EN
The article provides a review of the currently most popular text processing techniques, sketches their evolution, and compares sequence and dependency models in detecting semantic relationships between words.
PL
The article contains an overview of the most popular methods of text representation - sequential and graph models - in the context of detecting semantic relations between words.
EN
Lemmatization, morphological (or morphosyntactic) annotation (MSD) and disambiguation are basic and indispensable steps in Natural Language Processing of languages with a moderate level of inflection. We present a web interface demonstrating the de facto default lemmatization and MSD for Slovak, as used in major Slovak corpora (with several enhancements yet to be applied in the corpora). The interface can be used chiefly for presentation or pedagogical purposes, with the morphological tags expanded and explained in plain language in several languages, including two different terminological registers of Slovak (a professional linguistic one and a "common" one).
EN
This paper focuses on the implementation of a goal-oriented chatbot intended to prepare virtual resumes of candidates for a job position. In particular, the study was devoted to testing the feasibility of using Deep Q Networks (DQN) to prepare an effective chatbot conversation flow with the final system user. The results of the research confirmed that the use of the DQN model in training the conversational system made it possible to increase the level of success, measured as the acceptance of the resume by the recruiter and the finalization of the conversation with the bot. The success rate increased from 10% to 64% in the experimental environment and from 15% to 45% in the production environment. Moreover, the DQN model allowed the conversation to be shortened by an average of 4 questions, from 11 to 7.
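A minimal DQN-style update for choosing the next question could look like the sketch below, assuming a small discrete action space and PyTorch; the network size, state encoding and reward signal are placeholders rather than the system described in the paper.

```python
# Minimal DQN-style temporal-difference update for question selection.
import torch, torch.nn as nn, random

STATE_DIM, N_QUESTIONS = 16, 11          # placeholder dimensions
q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_QUESTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
GAMMA = 0.95

def select_action(state: torch.Tensor, epsilon: float = 0.1) -> int:
    """Epsilon-greedy choice of the next question to ask."""
    if random.random() < epsilon:
        return random.randrange(N_QUESTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def train_step(state, action, reward, next_state, done):
    """One temporal-difference update on a single (s, a, r, s') transition."""
    q_value = q_net(state)[action]                       # predicted Q(s, a)
    with torch.no_grad():
        bootstrap = 0.0 if done else GAMMA * q_net(next_state).max().item()
        target = torch.tensor(reward + bootstrap)        # r + gamma * max_a' Q(s', a')
    loss = nn.functional.mse_loss(q_value, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```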
EN
With software playing a key role in most modern, complex systems, it is extremely important to create and keep the software requirements precise and unambiguous. One of the key elements needed to achieve such a goal is to define the terms used in a requirement in a precise way. The aim of this study is to verify whether commercially available tools for natural language processing (NLP) can be used to create an automated process for identifying whether a term used in a requirement is linked with a proper definition. We found that, with relatively small effort, it is possible to create a model that detects domain-specific terms in software requirements with a precision of 87%. Using such a model, it is possible to determine whether a term is followed by a link to a definition.
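The paper evaluates commercial NLP tooling; purely as an illustration of the underlying idea, the sketch below uses open-source spaCy to extract candidate noun-phrase terms from a requirement and check them against a glossary of defined terms. The model name and glossary entries are assumptions.

```python
# Illustration only: candidate domain terms from a requirement, checked
# against a (hypothetical) glossary of defined terms.
import spacy

nlp = spacy.load("en_core_web_sm")                      # assumed small English model
glossary = {"actuation torque", "safety controller"}    # hypothetical defined terms

def undefined_terms(requirement: str):
    """Return noun-phrase candidates that have no entry in the glossary."""
    doc = nlp(requirement)
    candidates = {chunk.text.lower() for chunk in doc.noun_chunks}
    return sorted(candidates - glossary)

print(undefined_terms("The safety controller shall limit the actuation torque to 5 Nm."))
```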
EN
Sentiment analysis is an efficient technique for identifying users' opinions (neutral, negative or positive) regarding specific services or products. One of the important benefits of analyzing sentiment is appraising the comments that users provide about service providers or services. In this work, a solution known as the adaptive rider feedback artificial tree optimization-based deep neuro-fuzzy network (RFATO-based DNFN) is implemented for efficient sentiment grade classification. Here, the input is pre-processed by employing stemming and stop word removal. Then, important factors, e.g. SentiWordNet-based features, such as the mean value, variance, and kurtosis, spam word-based features, term frequency-inverse document frequency (TF-IDF) features and emoticon-based features, are extracted. In addition, angular similarity and the decision tree model are employed for grouping the reviewed data into specific sets. Next, the deep neuro-fuzzy network (DNFN) classifier is used to classify the sentiment grade. The proposed adaptive rider feedback artificial tree optimization (A-RFATO) approach is utilized for the training of the DNFN. The A-RFATO technique is a combination of the feedback artificial tree (FAT) approach and the rider optimization algorithm (ROA) with an adaptive concept. The effectiveness of the proposed A-RFATO-based DNFN model is evaluated based on such metrics as sensitivity, accuracy, specificity, and precision. The sentiment grade classification method developed achieves better sensitivity, accuracy, specificity, and precision rates when compared with existing approaches on the Large Movie Review Dataset, the Datafiniti Product Database, and Amazon reviews.
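Only the pre-processing and TF-IDF steps named above are easy to sketch from standard libraries; the example below uses NLTK and scikit-learn for those steps, while the DNFN classifier and the A-RFATO optimizer are not reproduced.

```python
# Sketch of the stemming / stop-word removal and TF-IDF feature steps.
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))    # requires nltk.download("stopwords")

def preprocess(review: str) -> str:
    """Lower-case, drop stop words, and stem the remaining tokens."""
    tokens = [stemmer.stem(t) for t in review.lower().split() if t not in stop_words]
    return " ".join(tokens)

reviews = ["The product arrived late and broken.",
           "Great service, very happy with the purchase!"]
tfidf = TfidfVectorizer()
features = tfidf.fit_transform([preprocess(r) for r in reviews])   # sparse TF-IDF matrix
```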