Speaker identification based on Gaussian mixture model : experiments with Polish language utterances

Dąbrowska, A.; Drgas, S.; Cetnarowicz, D.; Chmielewska, I.

Title variants

Identyfikacja mówcy na podstawie sumy rozkładów normalnych : eksperymenty z wypowiedziami w języku polskim

Conference

Signal Processing Algorithms, Architectures, Arrangements, and Applications. 11th IEEE Signal Processing Workshop SPA 2007; 7 September 2007; Poznań, Poland

Abstract

This paper presents the results of experiments with a prototype speaker recognition system based on a Gaussian mixture model (GMM) and mel-cepstral coefficients (MFCCs), carried out on the Polish Corpora database [4]. The experiments determine the minimum amount of data needed to train a reliable model and the minimum signal length needed to recognize a speaker correctly. The speaker-discriminative properties of Polish phonemes are also investigated, and the phonemes that contribute most to correct identification are singled out.
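
The decision rule behind GMM-based speaker identification, as commonly described in the literature (for example in [9]), can be sketched as follows: each speaker is represented by one GMM, and a test utterance is assigned to the speaker whose model gives the highest total log-likelihood over its feature frames. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the diagonal covariances, the two-dimensional toy "features" and the hand-set model parameters are all hypothetical; a real system would use MFCC vectors and models trained on enrolment data, and this record does not state the configuration used in the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of frames under one GMM.

    gmm is a list of (weight, mean, var) components. Frames are treated
    as independent, so the per-frame log-likelihoods are summed.
    """
    total = 0.0
    for x in frames:
        comp = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comp)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total


def identify(frames, speaker_models):
    """Return the speaker whose GMM scores the frames highest."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )


# Hypothetical hand-set models with two components each (illustration only).
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
test_frames = [[5.2, 4.9], [5.8, 6.1], [5.5, 5.4]]
print(identify(test_frames, models))  # prints "speaker_b"
```

Because the score is a sum over frames, longer test signals give the decision more evidence, which is why the minimum usable signal length is a natural quantity to measure experimentally.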

Keywords

speaker identification; Gaussian mixture model; mel-cepstral coefficients; Corpora database

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2008

Volume

Vol. 49, no. 4

Pages

29-33

Physical description

Bibliography: 10 items, tables, charts.

Contributors

author

Dąbrowska A.

author

Drgas S.

author

Cetnarowicz D.

author

Chmielewska I.

Poznań University of Technology, Chair of Control and System Engineering

References

[1] Atal B.: Automatic speaker recognition based on pitch contours. J. Acoust. Soc. Am., 52, pp. 1687-1697, 1972.
[2] Furui S.: Speaker-dependent feature extraction, recognition and processing techniques. Speech Communication. Vol. 10, 5-6, pp. 505-520, 1991.
[3] Faundez-Zanuy M., Monte-Moreno E.: State-of-the-art in speaker recognition. IEEE Aerospace and Electronic Systems Magazine. Vol. 20, 5, pp. 7-12, 2005.
[4] Grocholewski S.: Statystyczne podstawy systemu ARM dla języka polskiego. Wyd. Politechniki Poznańskiej, Poznań 2001.
[5] Kinnunen T.: Joint acoustic-modulation frequency for speaker recognition. ICASSP2006, vol. 1, pp. I-665-I-668.
[6] Nabney I.T.: Netlab: Algorithms for pattern recognition. Springer 2004.
[7] Necioglu B., Clements M., Barnwell T.: Objectively measured descriptor applied to speaker characterisation. Proc. IEEE Int. Conf. ASSP, pp. 483-486, 1996.
[8] Pelecanos J., Slomka J., Sridharan S.: Enhancing automatic speaker identification using phoneme clustering and frame based parameter and frame size selection. Proc. Fifth International Symposium on Signal Processing and Its Applications (ISSPA '99), vol. 2, 22-25 Aug. 1999, pp. 633-636.
[9] Reynolds D., Rose R.: Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. Speech & Audio Proc. 3, 1995, pp. 72-83.
[10] Yu K.M., Oglesby J.: Speaker recognition models. Proc. Europ. Conf. Speech Comm. & Tech., pp. 629-632.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BWAD-8101-0005