Search results - BazTech

1

Możliwości automatyzacji elektrostymulacji języka – podejście obliczeniowe [Possibilities of automating tongue electrostimulation: a computational approach]

Mikołajewski Dariusz, Mikołajewska Emilia

Studia i Materiały Informatyki Stosowanej | 2023 | T. 15, nr 2

PL

The aim of this work is to assess to what extent the current development of tongue electrostimulation automation provides a basis for developing a new group of clinical and technological solutions. Scientists and engineers can contribute to the development of effective, safe, and widely used tongue electrostimulation technologies with diverse applications in healthcare and consumer products.

2

Coding effects on changes in formant frequencies in Japanese speech signals

Kucharski Mateusz, Brachmański Stefan

Vibrations in Physical Systems | 2019 | Vol. 30, nr 1 | art. no. 2019131

EN

This paper presents the results of research on the effects of lossy coding on formant frequencies in Japanese speech signals. Additionally, changes in voice pitch were inspected. Four of the most popular lossy coding standards were chosen for this research (MP3, WMA, AAC and OGG) and compared to the original WAVE files. The audio files were created by the author based on the ITU-T P.501 recommendation at two sampling frequencies, 16 kHz and 48 kHz, and converted into the chosen codecs. The open-license software Praat was used to extract data from the audio files. Because differences in duration were found between the original and encoded files, and these also differed between individual codecs, only the OGG and WMA standards were compared directly. The MP3 and AAC files were divided into Japanese syllables, averaged, and then compared with similarly averaged WAVE files. The results were additionally compared to the FLAC lossless codec.

3

Selekcja cech osobniczych sygnału mowy z wykorzystaniem algorytmów genetycznych [Selection of individual speaker features of the speech signal using genetic algorithms]

Kamiński Kamil, Dobrowolski Andrzej P., Majda Ewelina

Inżynieria Bezpieczeństwa Obiektów Antropogenicznych | 2019 | Nr 1-2 | pp. 8-16

PL

The paper presents an automatic speaker recognition system implemented in the Matlab environment and shows how its individual components were implemented and optimized. The main emphasis is on selecting distinctive features of the speaker's voice with a genetic algorithm, which makes it possible to account for synergy between features during selection. Results of optimizing selected classifier elements are also shown, including the number of Gaussian distributions used to model each voice. In addition, a universal voice model was used when creating the voice models.

EN

The paper presents an automatic speaker recognition system implemented in the Matlab environment and demonstrates how to implement and optimize various elements of the system. The main emphasis is on selecting features of the speech signal with a genetic algorithm, which takes into account the synergy of features. The results of optimizing selected elements of the classifier are also shown, including the number of Gaussian distributions used to model each voice. In addition, a universal voice model was used when creating the voice models.

4

Polish emotional speech recognition based on the committee of classifiers

Kamińska D., Sapiński T.

Przegląd Elektrotechniczny | 2017 | R. 93, nr 6 | pp. 101-105

EN

This article presents a novel method for emotion recognition from Polish speech. We compared two different databases: spontaneous and acted-out speech. For the purpose of this research we gathered a set of audio samples with emotional information, which serves as the input database. Multiple Classifier Systems were used for classification, with commonly used speech descriptors and different groups of perceptual coefficients as features extracted from the audio samples.

PL

This work concerns the recognition of emotional states from the voice. In the article we compare spontaneous speech with acted speech. For the purposes of the research, emotional audio recordings were collected to form a comprehensive input database. We present a novel approach to emotion classification that uses classifier committees, describing emotions with commonly used speech signal descriptors and hybrid perceptual coefficients.

5

Automatic detection of stuttering in a speech

Wiśniewski M., Kuniszyk-Jóźkowiak W.

Journal of Medical Informatics & Technologies | 2015 | Vol. 24 | pp. 31-37

EN

In this work the authors applied speech recognition techniques to find disfluent events. A recognition system based on the Hidden Markov Model Toolkit was built and tested. A set of context-dependent HMM models was trained and used to locate speech disturbances. The authors did not concentrate on a specific disfluency type but tried to find any extraneous sounds in the speech signal. Patients read prepared sentences, the system recognized them, and the results were then compared to manual transcriptions. This made the system more robust and enabled it to find all disfluency types appearing at word boundaries. Such a system can be utilized in many ways, for example as a "preprocessor" that finds strange sounds in speech to be analyzed or classified later by other algorithms, to evaluate or track the therapy of people who stutter, or to evaluate the speech fluency of 'normal' speakers.

6

Analysis of the Arabic using neural networks: an overview

Soori H., Platos J., Snasel V.

Przegląd Elektrotechniczny | 2013 | R. 89, nr 11 | pp. 47-50

EN

This paper is a quick review of some of the scholarly work aiming at solving various problems of the Arabic language using neural networks. It includes some research work concerning online recognition of handwritten Arabic characters, speech recognition, offline character text recognition, text categorization and recognition of printed text. This paper concludes that more research should be conducted in this area considering the importance of the Arabic language, the rapid growth of internet users in the Arab world, and the widespread usage of Arabic characters by many languages other than Arabic.

PL

The article presents methods for analyzing the Arabic language using neural networks. The possibilities of recognizing handwritten text, printed text, and speech were analyzed.

7

Automatic detection and classification of phoneme repetitions using HTK toolkit

Wiśniewski M., Kuniszyk-Jóźkowiak W.

Journal of Medical Informatics & Technologies | 2011 | Vol. 17 | pp. 141-147

EN

The therapy of people who stutter is based on a proper selection of texts and then on practising their articulation by reading or narration. The texts are chosen on the basis of the kind and intensity of dysfluencies appearing in speech. Thus there is still a need for effective and objective methods of analysing dysfluent speech. Hidden Markov models are stochastic models widely used in recognizing patterns appearing in a signal. In this work a simple monophone system based on the Hidden Markov Model Toolkit was built and tested in the context of detection and classification of phoneme repetitions, a common speech disorder in the Polish language.

8

Speech nonfluency detection and classification based on linear prediction coefficients and neural networks

Kobus A., Kuniszyk-Jóźkowiak W., Smołka E., Codello I.

Journal of Medical Informatics & Technologies | 2010 | Vol. 15 | pp. 135-143

EN

The goal of the paper is to present a speech nonfluency detection method based on linear prediction coefficients obtained by using the covariance method. The application “Dabar” was created for the research. It implements three different LP methods and can pass the coefficients they compute to the input of Kohonen networks. Neural networks were used to classify utterances as fluent or nonfluent. The first was a Kohonen network (SOM), which reduced the LP coefficient representation of each window, fed to the SOM input layer, to a vector of winning neurons of the SOM output layer. Radial Basis Function (RBF) networks, linear networks and Multi-Layer Perceptrons were used as classifiers. The research was based on 55 fluent samples and 54 samples with blockades on plosives (p, b, d, t, k, g). The study ended with a classification rate of 76%.

9

Improved approach to automatic detection of speech disorders based on the hidden Markov models approach

Wiśniewski M., Kuniszyk-Jóźkowiak W., Smołka E., Suszyński W.

Journal of Medical Informatics & Technologies | 2010 | Vol. 15 | pp. 145-152

EN

In this work, algorithms commonly used in continuous speech recognition systems were applied to the detection of speech disorders. The algorithms used are briefly described and the final method of speech disorder detection is presented. The article includes the results of a short test performed to check the effectiveness and accuracy of the method. The aim of the test was the detection and classification of fricative phoneme prolongation, one of the most common speech disorders in the Polish language. It is worth emphasizing that this method not only enables detection of a category of speech disturbance (e.g. prolongation of fricatives, nasals, vowels, etc.) but also provides information about the specific phoneme being disturbed.

10

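Poprawa zrozumiałości mowy w obecności zakłóceń z wykorzystaniem algorytmu opartego na filtracji adaptacyjnej [Improving speech intelligibility in the presence of interference using an algorithm based on adaptive filtering]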

Elwart D., Czyżewski A.

Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej | 2009 | Nr 26 | pp. 33-36

PL

The paper describes a new way of using adaptive filtering to improve the quality of useful sounds recorded in the presence of interference. The adaptation algorithm that was developed is presented, the possibilities of processing the sound with additional algorithms are discussed, and the experiments carried out are described. The experimental results are reported and discussed. A way of integrating the developed method with acoustic monitoring systems for urban agglomerations is proposed.

EN

This paper describes a technique for improving the quality of speech signals recorded under interference, based on an adaptive filter algorithm. The proposed algorithm is described and additional possibilities for improving speech intelligibility are discussed. Results of the tests are presented. A way of integrating the developed method with an urban agglomeration acoustic monitoring system is proposed. The research is subsidized by the Polish Ministry of Science and Higher Education within Grant No. R00-O0005/3.

11

Method of the measurement of the time between centre of chosen sounds

Meyer A., Portalska H., Portalski M.

Elektronika : konstrukcje, technologie, zastosowania | 2008 | Vol. 49, nr 4 | pp. 71-73

EN

This paper presents a method and a program for measuring the time distances between the middle points of spoken or played sounds. The proposed algorithm can be used to assess the condition of persons with dysfunction of the central nervous system. The differences between the results obtained for healthy persons and impaired persons are shown.

PL

A method and a program for measuring the times between the centres of words or played sounds are presented. The proposed algorithm can be used to assess the condition of people with dysfunctions of the central nervous system. Differences between the results of healthy and ill persons are shown.
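
The abstract does not say how the centres of sounds are located. As a rough illustration only, the sketch below assumes a precomputed amplitude envelope and a fixed threshold, treats every run of samples above the threshold as one sound, and reports the time between the midpoints of consecutive runs. The function names, the threshold rule, and the envelope input are all assumptions made for this example, not the authors' method.

```python
def sound_centres(envelope, sample_rate_hz, threshold):
    """Return the centre time (in seconds) of each run of samples above threshold.

    Hypothetical helper: the thresholding rule is an assumption for illustration.
    """
    centres = []
    start = None  # index of the first sample of the current run, if any
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            # The run covered samples start .. i-1; take its midpoint.
            centres.append((start + i - 1) / 2.0 / sample_rate_hz)
            start = None
    if start is not None:
        # The signal ended while a sound was still active.
        centres.append((start + len(envelope) - 1) / 2.0 / sample_rate_hz)
    return centres


def centre_intervals(centres):
    """Return the time differences between consecutive centres."""
    return [later - earlier for earlier, later in zip(centres, centres[1:])]


# Toy envelope sampled at 10 Hz containing two short sounds.
toy_envelope = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
toy_centres = sound_centres(toy_envelope, sample_rate_hz=10, threshold=0.5)
toy_intervals = centre_intervals(toy_centres)
```

A real measurement would first need an envelope estimate (for example a smoothed absolute value of the waveform) and a threshold suited to the recording conditions.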

12

Automatic detection of prolonged fricative phonemes with the Hidden Markov Models approach

Wiśniewski M., Kuniszyk-Jóźkowiak W., Smołka E., Suszyński W.

Journal of Medical Informatics & Technologies | 2007 | Vol. 11 | pp. 293-297

EN

The Hidden Markov Model (HMM) is a stochastic approach to recognizing patterns appearing in an input signal. In this work the authors' implementation of the HMM was used to recognize a speech disorder: prolonged fricative phonemes. To achieve the best recognition effectiveness while keeping the computation time reasonable, two problems need to be addressed: the choice of the HMM and the proper preparation of the input data. Test results for recognition of the considered type of speech disorder are presented for HMM models with different numbers of states and for different codebook sizes.

13

Computer-supported individualised therapy of non-fluent speech

Dzieńkowski M., Kuniszyk-Jóźkowiak W., Smołka E., Suszyński W.

Biocybernetics and Biomedical Engineering | 2006 | Vol. 26, no. 4 | pp. 71-77

EN

The therapy of stuttering people is a time-consuming and long-lasting process which requires a great effort both from the logopaedist and patient. The process can be divided into three parts: recording of patient's utterances (reading, telling, conversation), 20-minute corrective exercises with the echo (reading, telling) and individual work of the stuttering person with difficult words. All of these tasks may be performed with the use of a computer, controlled by a special program elaborated for that purpose. The computer system for the logopaedic diagnosis and therapy (DTL) allows for recording and saving utterances as sound files, practice with acoustical or visual echo and performance of automatically generated tasks adjusted to individual difficulties of particular speakers. Examples of analyses performed at various periods of therapy, i.e. at the beginning, during and after the therapy, supply information concerning e.g. the stuttering intensity and types of the occurring errors. The results presented in this work concern the control recordings performed at 1-1.5-month periods of time for twelve patients.

14

Metody opisu sygnału mowy [Methods of describing the speech signal]

Kłys D., Wochowski A.

Mikroelektronika i Informatyka : prace naukowe | 2005 | Z. nr 5 | pp. 191-195

PL

The article contains a brief overview of methods for describing human speech, indicating their use at particular stages of human voice analysis. The aim of the publication is not to present detailed analytical descriptions of formulas and advanced mathematical apparatus, but to present the basic assumptions and capabilities of the individual methods. The reader can find a detailed description of each method in the bibliography listed at the end of the article.

15

Badania jakości mowy w połączeniach głosowych. Stara usługa - nowe problemy (artykuł wprowadzający) [Speech quality testing in voice connections. Old service, new problems (introductory article)]

Brachmański S., Kula S.

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne | 2003 | nr 8-9 | pp. 418-423

PL

Phenomena that affect the quality of speech transmitted in telecommunication networks are discussed. An overview of subjective and objective methods of speech quality assessment is presented.

EN

Causes of speech signal degradation in telephone chains are described. Some subjective and objective methods of speech quality evaluation are presented.

16

Wpływ czasu pogłosu na zrozumiałość mowy w pomieszczeniu zamkniętym [Influence of reverberation time on speech intelligibility in an enclosed room]

Nowoświat A.

Zeszyty Naukowe. Budownictwo / Politechnika Śląska | 2002 | z. 95 | pp. 437-444

PL

The paper presents the influence of reverberation time on the RASTI coefficients, which characterize speech intelligibility in a room. For this purpose, the dependence of the MTF function on the room's reverberation time, derived by Houtgast and Steeneken [4] and described by Rutkowski [7], was used and then applied to determine the RASTI coefficients. The calculation results are presented as plots of the RASTI speech intelligibility index against reverberation time and of the MTF function against amplitude modulation frequency for different reverberation times.

EN

The paper describes the influence of the reverberation time on the RASTI index, which characterizes speech intelligibility in rooms. To this end, the relation between the MTF function and the reverberation time, derived by Houtgast and Steeneken [4] and developed by Rutkowski [7], was used. The same relation was also applied to calculate the RASTI index.
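
The abstract does not reproduce the relation it uses. As a hedged illustration, the sketch below applies the form commonly attributed to Houtgast and Steeneken for an ideal exponential decay without background noise, m(F) = 1 / sqrt(1 + (2πFT/13.8)²), where F is the modulation frequency and T the reverberation time. The nine modulation frequencies, the clipping of the apparent signal-to-noise ratio to ±15 dB, the assumption of one reverberation time for both octave bands, and all function names are assumptions of this example and are not taken from the paper.

```python
import math

# Modulation frequencies (Hz) usually listed for RASTI: four for the 500 Hz
# octave band and five for the 2 kHz band. Assumed here, not from the abstract.
RASTI_MOD_FREQS_HZ = [1.0, 2.0, 4.0, 8.0, 0.7, 1.4, 2.8, 5.6, 11.2]


def mtf_reverb(mod_freq_hz, reverb_time_s):
    """Modulation transfer for pure exponential reverberation, no noise."""
    x = 2.0 * math.pi * mod_freq_hz * reverb_time_s / 13.8
    return 1.0 / math.sqrt(1.0 + x * x)


def transmission_index(m):
    """Convert a modulation index m into a transmission index in [0, 1]."""
    if m >= 1.0:
        snr_db = 15.0   # perfect modulation transfer, use the upper clip value
    elif m <= 0.0:
        snr_db = -15.0  # no modulation transferred, use the lower clip value
    else:
        snr_db = 10.0 * math.log10(m / (1.0 - m))
    snr_db = max(-15.0, min(15.0, snr_db))
    return (snr_db + 15.0) / 30.0


def rasti(reverb_time_s):
    """Average transmission index over the assumed RASTI modulation frequencies."""
    values = [transmission_index(mtf_reverb(f, reverb_time_s))
              for f in RASTI_MOD_FREQS_HZ]
    return sum(values) / len(values)


# Tabulate the index for a few reverberation times (seconds).
for t in (0.5, 1.0, 2.0):
    print(t, round(rasti(t), 3))
```

Under these assumptions the index falls as the reverberation time grows, which is the dependence the paper plots.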

17

Teorie aktów mowy [Theories of speech acts]

Osika G.

Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska | 2001 | z. 6 | pp. 95-113

PL

In this article the author tries to introduce theories that consider language in terms of its use. The basis for creating these theories was the recognition of the performative function of language, while the theories themselves are an attempt to understand how acts of this type are possible. The article describes two theories, those of Austin and Searle.

EN

In this article the author tries to introduce theories that consider language in terms of its use. These theories originated from recognizing the performative function of language, and they attempt to explain how acts of this kind are possible. The article describes two theories: Austin's and Searle's.

18

Filtracja sygnału mowy z wykorzystaniem algorytmu SVD [Speech signal filtering using the SVD algorithm]

Maćkowiak J.

Zeszyty Naukowe. Elektronika / Politechnika Śląska | 2000 | z. 12 | pp. 123-133

PL

A filtering technique based on noise spectrum estimation using singular value decomposition is one of the effective noise filtering methods. Singular value decomposition (SVD) is a technique that enables faster computation, since only diagonal matrices need to be inverted.

EN

A filtering technique that estimates the spectrum of clean speech from noisy speech using Singular Value Decomposition (SVD) is one of the efficient denoising methods. SVD reduces the computational burden because only diagonal matrices need to be inverted.

19

Rozpoznawanie poleceń głosowych z wykorzystaniem technik transformacji falkowej [Voice command recognition using wavelet transform techniques]

Binkowski M.

Zeszyty Naukowe. Automatyka / Politechnika Śląska | 2000 | z. 132 | pp. 5-19

PL

The paper describes an example voice command recognition system equipped with a knowledge base containing 21 words. In the system the speech signal is decomposed using the wavelet transform. The individual bands of the decomposed signal undergo cepstral analysis, which extracts features related to the information carried in the speech signal. These features are then subjected to two-level classification using a self-organizing neural network. The recognition effectiveness of the system, discussed at the end of the paper, is 39% (certain recognition) plus 43.5% (uncertain recognition). At the end of the paper we also suggest methods for potentially increasing the recognition effectiveness of the proposed system.

EN

This work describes an example voice command recognition system equipped with a database of 21 words. The speech signal in the system is decomposed using the wavelet transform. Individual sub-bands of the decomposed signal are then analysed using cepstral analysis, and features related to the spoken information are extracted. These features are then classified with a self-organizing map neural network. The recognition effectiveness is about 39% (sure recognition) plus about 43.5% (unsure recognition). Additionally, some potential improvements to the recognition effectiveness are proposed.