Wyniki wyszukiwania - BazTech

1

Sieci Bayesa w rozpoznawaniu mowy

Mermon A.

Pomiary Automatyka Robotyka

|

2011

|

R. 15, nr 12

109-111

PL

Problematyka rozpoznawania mowy nie doczekała się, jak dotąd, kompleksowego rozwiązania. Współczesne efektywne systemy rozpoznawania mowy korzystają najczęściej z metod stochastycznych opartych na ukrytych modelach Markowa. Alternatywą dla nich mogą być sieci Bayesa, będące odpowiednią strukturą do formułowania modeli probabilistycznych, które cechują się jednocześnie precyzją oraz zwartością. Sieci Bayesa mogą reprezentować rozkład prawdopodobieństwa dowolnego zbioru zmiennych losowych. Mnogość dostępnych obecnie algorytmów i narzędzi obliczeniowych sprawia, że testowanie i wdrażanie nowych rozwiązań staje się mniej pracochłonne. Zalety te determinują duże możliwości wykorzystania sieci Bayesa do rozwiązywania praktycznych problemów również w zakresie rozpoznawania mowy.

EN

Speech recognition problem hasn't been fully-scaled solved till nowadays. Contemporary effective speech recognition systems mostly use stochastic methods based on Hidden Markov Models. Bayes networks can be alternative to them. BN are appropriate structures to formulate probabilistic models, which are simultaneously precise and compact. They can represent a probability distribution of arbitrary set of random variables. Variety of algorithms and computational tools which are available to use makes testing and implementing new solutions less demanding. Those advantages determine that Bayes networks have potential to be used in solving practical problems also in the area of speech recognition.

2

Parametry identyfikacyjne umożliwiające automatyczne rozpoznawanie cyfr wypowiadanych w języku polskim

Dulas J.

Pomiary Automatyka Kontrola

|

2011

|

R. 57, nr 3

308-311

PL

Artykuł przedstawia najnowsze wyniki prac autora w dziedzinie automatycznego rozpoznawania sygnałów mowy. Wyniki badań prowadzonych na zbiorze 500 nagrań cyfr wypowiadanych w języku polskim przez 50 mówców różnej płci i w różnym wieku pozwalają na zaproponowanie zestawu parametrów niezbędnych do przeprowadzenia procesu ich identyfikacji. Jak pokazano w artykule zestaw kilku podstawowych cech identyfikujących jest wystarczający aby taki proces przeprowadzić. Zaproponowany zestaw parametrów jest łatwy do uzyskania przy niewielkiej mocy obliczeniowej.

EN

The paper describes a new author's method for automatic recognition of digits spoken in Polish. In this new approach there are no frequency analyses as used to be made in such systems but the image recognition of the time characteristic is applied. Investigations performed on 500 records of people of different sex and age showed that there was possibility of constructing an automatic recognition system based on a few parameters. The first is the number of voiced phonemes included in a recognized word (Tab. 1). In this group there are all wavelets and some consonants. They include basic periods inside their time characteristics. This parameter is obtained using the grid method designed by the author (Fig. 3). The second one is the number and position of noisy phonemes. To this group there belong phonemes without basic periods but with big signal variety. This parameter is calculated using the number of local extrema, the signal amplitude level and checking if there are no basic periods. The third parameter is the shape of a signal envelope (Tab. 2). As investigations showed, it is possible to find the envelope pattern for each Polish digit common for all tested speakers. It was proved that these parameters are sufficient for automatic speech recognition of digits spoken in Polish. This new method can also be applied to other systems with small number of recognized words. It is fast and lack of frequency analyses causes that it has low hardware demands.

3

Analiza obwiedni jako parametr wspomagający automatyczną identyfikację wyrażeń

Dulas J.

Pomiary Automatyka Kontrola

|

2009

|

R. 55, nr 5

308-309

PL

W badaniach nad automatycznym rozpoznawaniem sygnałów mowy notuje się stały postęp, choć różnorodność języków utrudnia wprowadzenie jednakowych rozwiązań. Przykładem rozwoju i upowszechnienia metod identyfikacji mowy może być system operacyjny Windows XP, w którym zamieszczono narzędzia do sterowania aplikacjami za pomocą sygnałów głosowych. Brak jednak nadal rozwiązań dla języka polskiego, co sprawia że potrzebne są badania zmierzające do opracowania niezawodnych algorytmów identyfikujących i sterujących. W artykule przedstawiono wyniki badań obwiedni sygnałów mowy, będących cyframi z zakresu 0-9, uzyskanych dla grupy 50-ciu osób różnych płci i w różnym wieku. Celem przeprowadzonych badań było uzyskanie odpowiedzi na pytanie, czy analiza obwiedni może stanowić parametr w procesie automatycznego rozpoznawania sygnałów mowy i czy jest możliwe stworzenie modeli obwiedni dla każdej z cyfr, które byłyby wspólne dla wszystkich (50) mówców.

EN

In scientific research on the speech signal recognition there can be noted great development, although differences between languages make it difficult to work out the same algorithms for all of them. A good example of the big progress in this field can be Windows XP, an operating system which enables controlling some applications by voice (but not in Polish). There is still lack of good working programs controlled by Polish. In this paper the results of investigations on the voice signal envelope are described. There were tested digital recordings, from the range 0 - 9, obtained for 50 persons of different age and sex . The main goal was to find out if the envelope analysis could be helpful in automatic speech recognition. During the investigations basing on the analysis of the digit time characteristic, each digit was divided into parts (from 2 to 5) having the similar envelope. Also the minimum duration and the amplitude range were found for each part. The results are given in Table 1. Table 2 contains the results of fitting the envelope to each digit. It is shown that the envelope patterns are common for all the speakers and digits. Although the envelope analysis is not sufficient alone for automatic speech recognition (some digit patterns fit to the others), it can be used as one of the parameters employed for this purpose.