Search results - BazTech


Results found: 2



Searched for:
in keywords: whispered speech recognition


Application of Teager Energy Operator on Linear and Mel Scales for Whispered Speech Recognition

Marković B. R., Galić J., Mijić M.

Archives of Acoustics

2018

Vol. 43, No. 1

3--9

This paper presents experimental results on whispered speech recognition using the Teager Energy Operator with linear and mel cepstral coefficients, including the Cepstral Mean Subtraction normalization technique. The feature vectors considered are Linear Frequency Cepstral Coefficients, Teager Energy based Linear Frequency Cepstral Coefficients, Mel Frequency Cepstral Coefficients and Teager Energy based Mel Frequency Cepstral Coefficients. A speaker-dependent scenario is used. Recognition is performed with Dynamic Time Warping and Hidden Markov Models. The results show a respectable improvement in whispered speech recognition when the Teager Energy Operator is used together with Cepstral Mean Subtraction.
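For readers unfamiliar with the two techniques named in this abstract, the following is a minimal sketch of the standard discrete Teager Energy Operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and of Cepstral Mean Subtraction as a per-coefficient mean removal over frames. The abstract does not say where in the feature pipeline the operator is applied, so this does not reproduce the authors' method; the function names `teager_energy` and `cepstral_mean_subtraction` are hypothetical and chosen only for illustration.

```python
import math


def teager_energy(x):
    """Standard discrete Teager Energy Operator (illustrative helper).

    psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for interior samples only,
    so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean, taken over all frames.

    `frames` is a list of equal-length cepstral vectors (one per frame).
    Assumes at least one frame; this is a sketch, not a robust implementation.
    """
    count = len(frames)
    dim = len(frames[0])
    means = [sum(frame[k] for frame in frames) / count for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]


if __name__ == "__main__":
    # For a pure tone A*cos(w*n), the operator yields the constant A^2 * sin(w)^2.
    amplitude, omega = 2.0, 0.3
    tone = [amplitude * math.cos(omega * n) for n in range(16)]
    print(teager_energy(tone)[:3])
    print(amplitude ** 2 * math.sin(omega) ** 2)
```

For a pure tone the operator output is constant and depends on both amplitude and frequency, which is the property that makes it usable as an energy measure in place of the plain squared signal.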

Acoustic model training, using Kaldi, for automatic whispery speech recognition

Kozierski P., Sadalla T., Drgas Sz., Dąbrowski A., Ziętkiewicz J., Giernacki W.

Annals of Computer Science and Information Systems

2018

Vol. 16

109--114

The article presents research on automatic whispery speech recognition. The main task was to find the dependence between the number of triphone classes (the number of leaves in the decision tree) and the total number of Gaussian distributions, and thereby to determine the values for which the quality of speech recognition is best. It was also determined how these dependences differ between normal and whispery speech, which had not been done earlier and is the innovative part of this work. Based on the experiments performed and the results obtained, the number of triphone classes (number of leaves) for whispered speech should be significantly lower than for normal speech.