Search results - Biblioteka Nauki

1

Automatic disordered sound repetition recognition in continuous speech using CWT and Kohonen network

100%

Codello I. , Kuniszyk-Jóźkowiak W. , Smołka E. , Kobus A.

Annales Universitatis Mariae Curie-Skłodowska. Sectio AI, Informatica

|

2012

|

Vol. 12, no. 2

39--48

EN

Automatic disorder recognition in speech can be very helpful for a therapist monitoring the therapy progress of patients with disordered speech. This article focuses on sound repetitions. The signal is analyzed using the Continuous Wavelet Transform with 16 Bark scales. Using a silence-finding algorithm, only the speech fragments are automatically found and cut out. Each cut fragment is converted into a fixed-length vector and passed into a Kohonen network. Finally, the Kohonen winning-neuron result is fed into a 3-layer perceptron. Most of the analysis was performed, and the results were obtained, using the authors' program WaveBlaster. The STATISTICA package was used to find the best perceptron, which was then imported back into WaveBlaster and used for automatic blockade detection. The problem presented in this article is part of our research work aimed at creating an automatic disordered speech recognition system.

2

Speech nonfluency detection and classification based on linear prediction coefficients and neural networks

100%

Kobus A. , Kuniszyk-Jóźkowiak W. , Smołka E. , Codello I.

Journal of Medical Informatics & Technologies

|

2010

|

Vol. 15

135--143

EN

The goal of the paper is to present a speech nonfluency detection method based on linear prediction coefficients obtained by using the covariance method. The application “Dabar” was created for the research. It implements three different LP methods and can send the coefficients they compute to the input of Kohonen networks. Neural networks were used to classify utterances as fluent or nonfluent. The first was a Kohonen network (SOM), which took the LP coefficients of each window as input and reduced that representation to a vector of winning neurons of the SOM output layer. Radial Basis Function (RBF) networks, linear networks and Multi-Layer Perceptrons were then used as classifiers. The research was based on 55 fluent samples and 54 samples with blockades on plosives (p, b, d, t, k, g). The examination ended with a classification rate of 76%.

3

Improved approach to automatic detection of speech disorders based on the hidden Markov models approach

100%

Wiśniewski M. , Kuniszyk-Jóźkowiak W. , Smołka E. , Suszyński W.

Journal of Medical Informatics & Technologies

|

2010

|

Vol. 15

145--152

EN

In this work, algorithms commonly used in continuous speech recognition systems were applied to the detection of speech disorders. The algorithms used are briefly described and the final method of speech disorder detection is presented. The article includes the results of a short test performed to check the effectiveness and accuracy of the method. The aim of the test was the detection and classification of fricative phoneme prolongation, one of the most common speech disorders in the Polish language. It is worth emphasizing that this method not only enables detection of a category of speech disturbance (e.g. prolongation of fricatives, nasals, vowels, etc.) but also provides information about the specific phoneme being disturbed.

4

Computer-supported individualised therapy of non-fluent speech

100%

Dzieńkowski M. , Kuniszyk-Jóźkowiak W. , Smołka E. , Suszyński W.

Biocybernetics and Biomedical Engineering

|

2006

|

Vol. 26, no. 4

71-77

EN

The therapy of stuttering people is a time-consuming and long-lasting process which requires great effort from both the logopaedist and the patient. The process can be divided into three parts: recording of the patient's utterances (reading, telling, conversation), 20-minute corrective exercises with the echo (reading, telling) and individual work of the stuttering person with difficult words. All of these tasks may be performed with the use of a computer, controlled by a special program developed for that purpose. The computer system for logopaedic diagnosis and therapy (DTL) allows for recording and saving utterances as sound files, practice with acoustical or visual echo, and performance of automatically generated tasks adjusted to the individual difficulties of particular speakers. Examples of analyses performed at various periods of therapy, i.e. at the beginning, during and after the therapy, supply information concerning e.g. the stuttering intensity and the types of errors that occur. The results presented in this work concern the control recordings performed at intervals of 1-1.5 months for twelve patients.

5

Automatic detection of prolonged fricative phonemes with the Hidden Markov Models approach

100%

Wiśniewski M. , Kuniszyk-Jóźkowiak W. , Smołka E. , Suszyński W.

Journal of Medical Informatics & Technologies

|

2007

|

Vol. 11

293--297

EN

The Hidden Markov Model (HMM) is a stochastic approach to the recognition of patterns appearing in an input signal. In this work, the authors' implementation of the HMM was used to recognize speech disorders, namely prolonged fricative phonemes. To achieve the best recognition effectiveness while keeping the calculation time reasonable, two problems need to be addressed: the choice of the HMM and the proper preparation of the input data. Test results for recognition of the considered type of speech disorder are presented for HMM models with different numbers of states and for different codebook sizes.

6

Automatic prolongation recognition in disordered speech using CWT and Kohonen network

100%

Codello I. , Kuniszyk-Jóźkowiak W. , Smołka E. , Kobus A.

Journal of Medical Informatics & Technologies

|

2012

|

Vol. 20

137--144

EN

Automatic disorder recognition in speech can be very helpful for the therapist monitoring the therapy progress of patients with disordered speech. In this article we focus on prolongations. We analyze the signal using the Continuous Wavelet Transform with 18 Bark scales, divide the result into vectors (using windowing) and then pass these vectors into a Kohonen network. A fairly large search analysis was performed (5 variables were checked), during which recognition above 90% was achieved. All of the analysis was performed, and the results were obtained, using the authors' program "WaveBlaster". It is very important that the recognition ratio above 90% was obtained from continuous speech by a fully automatic algorithm (without a teacher). The presented problem is part of our research aimed at creating an automatic prolongation recognition system.
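
As a rough illustration of the step in which windowed CWT vectors are passed into a Kohonen network, the sketch below finds the winning neuron (best-matching unit) for each input vector by smallest squared Euclidean distance. It is not the WaveBlaster implementation: the function names, the toy weights and the choice of distance measure are assumptions made for illustration, and training of the network is left out.

```python
def winning_neuron(weights, x):
    """Return the index of the neuron whose weight vector is closest to x.

    weights: list of weight vectors, one per neuron (hypothetical, already trained)
    x:       one input vector, e.g. one windowed CWT frame
    """
    best_index, best_dist = 0, float("inf")
    for index, w in enumerate(weights):
        # Squared Euclidean distance between the neuron weights and the input.
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def winner_sequence(weights, vectors):
    """Map a sequence of input vectors to a sequence of winning-neuron indices."""
    return [winning_neuron(weights, v) for v in vectors]


# Toy data: three neurons with 2-dimensional weights, three input vectors.
toy_weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_vectors = [[0.1, 0.0], [0.9, 1.2], [4.0, 6.0]]
winners = winner_sequence(toy_weights, toy_vectors)
```

In such a scheme a long run of identical winners could hint at a sustained sound, but how the winner sequence is actually interpreted is described in the paper itself, not here.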

7

Disordered sound repetition recognition in continuous speech using CWT and Kohonen network

100%

Codello I. , Kuniszyk-Jóźkowiak W. , Smołka E. , Kobus A.

|

Vol. 17

123--130

EN

Automatic disorder recognition in speech can be very helpful for a therapist monitoring the therapy progress of patients with disordered speech. This article focuses on sound repetitions. The signal is analyzed using the Continuous Wavelet Transform with 16 Bark scales; the result is divided into vectors and passed into a Kohonen network. Finally, the Kohonen winning-neuron result is fed into a 3-layer perceptron. The recognition ratio was increased by about 20% by modifying the Kohonen network training process as well as the CWT computation algorithm. All of the analysis was performed, and the results were obtained, using the authors' program "WaveBlaster". The problem presented in this article is part of our research work aimed at creating an automatic disordered speech recognition system.

8

A new elliptical model of the vocal tract

88%

Kobus A. , Kuniszyk-Jóźkowiak W. , Smołka E. , Suszyński W. , Codello I.

|

Vol. 17

131--139

EN

In this paper a new model of the vocal tract is proposed. It is based on elliptical cylinders. It uses the vocal tract model based on PARCOR coefficients and midsagittal measurements of the voice tube. The PARCOR coefficients were obtained from linear prediction coefficients, which had been obtained by the Levinson-Durbin method. Midsagittal lengths, understood as the height of a real vocal tract, were taken from X-ray pictures and averaged over the vocal tracts of a few people who uttered the same vowels. The paper is based on the Polish vowels a, e, o, u, i, y.
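
The abstract mentions PARCOR coefficients derived with the Levinson-Durbin method. Below is a minimal sketch of that recursion, assuming the autocorrelation values of a speech frame are already available. The function name, the return layout and the sign convention for the reflection (PARCOR) coefficients are assumptions rather than details taken from the paper; sign conventions differ between texts.

```python
def levinson_durbin(r, order):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Returns (a, k, err):
      a   - prediction-error filter coefficients [1, a1, ..., a_order]
      k   - reflection (PARCOR) coefficients, one per recursion step
            (sign convention assumed here: k_i equals the last coefficient a_i)
      err - final prediction error energy
    """
    a = [1.0] + [0.0] * order
    k = []
    err = float(r[0])
    for i in range(1, order + 1):
        # Correlation between the current predictor's residual and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        # Update the lower-order coefficients from the previous step.
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        # The prediction error shrinks by a factor (1 - ki^2) at each step.
        err *= 1.0 - ki * ki
    return a, k, err


# Toy autocorrelation of a first-order process: r[n] = 0.5 ** n.
coeffs, parcor, error = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this toy input the second reflection coefficient vanishes, because a first-order predictor already explains the sequence.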