Search results - Biblioteka Nauki

Results found: 4

Searched for:
in keywords: thesaurus

Methods of automatic topic mining in publications in agriculture domain

100%

Karwowski W. , Wrzeciono P.

Vol. 6, No. 3

192--202

Today the vast majority of resources are available in digital form. Publications are frequently related to topics not stated in the title or even in the summary. In this paper we present and discuss examples of methods for finding the common topic of a publication in the field of agriculture using the AGROVOC dictionary. The focus is on publications in the Polish language and on the possibilities of using the semantics defined in the multi-language thesaurus AGROVOC. First, indexing tools are presented, especially Agrotagger, which is useful for documents in the field of agriculture, and the test results of Agrotagger are discussed. Next, the semantic technologies implemented in the AGROVOC thesaurus are discussed. In the final part, we describe the design and implementation of a system based on a Polish language dictionary and AGROVOC. Additionally, some tests of the implemented system are discussed.
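The idea of tagging a publication with thesaurus concepts can be illustrated with a minimal sketch. This is not the authors' system and does not use the AGROVOC data or Agrotagger: the dictionary `TOY_THESAURUS`, the function `tag_topics`, and the sample sentence are all hypothetical, chosen only to show how labels in several languages might map to one concept.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus: surface label (any language) -> concept name.
# The entries are illustrative only and are not taken from AGROVOC.
TOY_THESAURUS = {
    "wheat": "wheat",
    "pszenica": "wheat",
    "soil": "soil",
    "gleba": "soil",
    "fertilizer": "fertilizers",
    "nawóz": "fertilizers",
}


def tag_topics(text, thesaurus, top_n=3):
    """Return the most frequent thesaurus concepts whose labels occur in the text."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(thesaurus[t] for t in tokens if t in thesaurus)
    return [concept for concept, _ in counts.most_common(top_n)]


# "gleby" (an inflected form of "gleba") is not matched by exact lookup.
sample = "Pszenica ozima i gleba: pszenica wymaga żyznej gleby."
print(tag_topics(sample, TOY_THESAURUS))
```

Exact lookup misses inflected forms such as "gleby", which suggests why a system for Polish texts would plausibly need a language dictionary to map word forms to base forms before consulting the thesaurus.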

Innovative Training Practice in Electronics for Enterprises

75%

Raud Z. , Vodovozov V.

R. 88, No. 7b

166-170

An educational practice suitable for planning optimal staff training trajectories is described. Effective instruments are given for building professional thesauri and for finding an institution capable of providing training within the framework of such thesauri. Using them, new knowledge and skills can be introduced, the contents of the corresponding disciplines refreshed, and the borders between the disciplines shifted smoothly. This promotes the design of teaching modules in highly interdisciplinary areas and in areas with specific needs.

The article describes an educational practice useful for optimizing the form and scope of staff training. It also gives effective instruments for the professional construction of thesauri and ways of finding institutions capable of providing training within such thesauri. The thesauri can be supplemented with new knowledge and skills, the content of the corresponding disciplines can be refreshed, and the borders between disciplines shifted smoothly. This supports the design of teaching modules in multidisciplinary fields and in areas with specific needs.

Extending Word2Vec with domain-specific labels

51%

Švaňa M.

Vol. 30

157--160

Choosing a proper representation of textual data is an important part of natural language processing. One option is using Word2Vec embeddings, i.e., dense vectors whose properties can to a degree capture the “meaning” of each word. One of the main disadvantages of Word2Vec is its inability to distinguish between antonyms. Motivated by this deficiency, this paper presents a Word2Vec extension for incorporating domain-specific labels. The goal is to improve the ability to differentiate between embeddings of words associated with different document labels or classes. This improvement is demonstrated on word embeddings derived from tweets related to a publicly traded company. Each tweet is given a label depending on whether its publication coincides with a stock price increase or decrease. The extended Word2Vec model then takes this label into account. The user can also set the weight of this label in the embedding creation process. Experiment results show that increasing this weight leads to a gradual decrease in cosine similarity between embeddings of words associated with different labels. This decrease in similarity can be interpreted as an improvement of the ability to distinguish between these words.
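The cosine similarity used to compare embeddings in this abstract can be sketched as follows. The function name `cosine_similarity` and the three-dimensional vectors are assumptions made for illustration; they are not embeddings or results from the paper, and the sketch does not implement the label-weighted Word2Vec extension itself.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up vectors standing in for word embeddings (not data from the paper).
word_a = [0.9, 0.1, 0.3]
word_b_similar = [0.8, 0.2, 0.3]
word_b_distinct = [0.1, 0.9, 0.3]

print(cosine_similarity(word_a, word_b_similar))
print(cosine_similarity(word_a, word_b_distinct))
```

A lower value for a pair of words tied to different labels would correspond to the "decrease in cosine similarity" that the abstract interprets as better separation.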

44%

Niewiarowski A.

Y. 113, iss. 1-NP

159--173

This paper proposes a method of comparing short texts that uses the Levenshtein distance algorithm and a thesaurus to analyse the terms in the texts, instead of the popular methods that rely on a glossary of grammatical variations. The tested texts contain a variety of nouns and verbs together with grammatical or orthographical mistakes. The similarity of such texts is estimated with the proposed algorithm. The described technique is compared with the cosine, Dice and Jaccard distances computed with the term frequency method. The proposed method is competitive with well-known stemming and lemmatization algorithms.

The article proposes a method for comparing short text fragments based on the Levenshtein distance algorithm and a thesaurus of synonyms. The compared texts contain inflected terms as well as deliberate spelling and grammatical errors. The described mechanism is compared with popular text comparison methods such as the cosine, Dice and Jaccard distances, for which the vector values are computed using the term frequency method. Using a thesaurus of synonyms in the mechanism is an alternative to the well-known stemming and lemmatization algorithms in the analysis of text data.
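One way to combine Levenshtein distance with a thesaurus when comparing short texts is sketched below. This is not the algorithm from the paper: the synonym table `TOY_SYNONYMS`, the helper names, the one-directional matching rule, and the `threshold` value of 0.75 are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def term_similarity(a, b):
    """Edit distance normalised to the range [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Hypothetical toy thesaurus: term -> set of synonyms.
TOY_SYNONYMS = {"car": {"automobile"}, "fast": {"quick"}}


def text_similarity(text_a, text_b, synonyms, threshold=0.75):
    """Fraction of terms in text_a with a close match (or close synonym) in text_b."""
    terms_a = text_a.lower().split()
    terms_b = text_b.lower().split()
    if not terms_a:
        return 0.0
    matched = 0
    for term in terms_a:
        candidates = {term} | synonyms.get(term, set())
        if any(term_similarity(c, tb) >= threshold
               for c in candidates for tb in terms_b):
            matched += 1
    return matched / len(terms_a)


# "quik" is a misspelling of "quick"; "automobiles" is an inflected form.
print(text_similarity("fast car", "quik automobiles", TOY_SYNONYMS))
print(text_similarity("fast car", "quik automobiles", {}))
```

The fuzzy match tolerates misspellings and inflected endings without a stemmer, while the synonym lookup connects terms that share no characters. Note that this sketch is asymmetric in its two arguments.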