Visualization of stages of determining cepstral factors in speech recognition systems

Proksa, R.

Abstract

The article presents two methods of determining cepstral parameters that are commonly applied in digital signal processing, particularly in speech recognition systems. The presented solutions are part of a project to develop applications that allow the Windows operating system to be controlled by voice using MSAA (Microsoft Active Accessibility). The analysed voice signal is presented visually at each of the crucial stages of computing the cepstral coefficients.

Keywords

speech recognition; cepstral coefficients; LPCC; MFCC; isolated word; speech signals

Publisher

University of Silesia, Institute of Informatics, Computer Systems Department

Journal

Journal of Medical Informatics & Technologies

Year

2009

Volume

Vol. 13

Pages

121-128

Physical description

Bibliography: 12 items; figures.

Contributors

author

Proksa R.

robert.proksa@us.edu.pl

Institute of Informatics, University of Silesia, Będzińska 39 St., 41-200 Sosnowiec, Poland

Bibliography

[1] AHMED N., RAO K. R., Orthogonal Transforms for Digital Signal Processing, Springer-Verlag, New York, 1985.
[2] AUGUSTYN G., Rekursywno adaptacyjna dyskretna transformata Fouriera jako nowe narzędzie analizy sygnałów, IX Międzynarodowe sympozjum reżyserii i inżynierii dźwięku ISSET, 2001.
[3] DUSTOR A., IZYDORCZYK J., Rozpoznawanie mówców, Przegląd Telekomunikacyjny, Vol. LXXVI, No. 2-3/2003, pp. 71-76.
[4] KAMM T., HERMANSKY H., ANDREOU A. G., Learning the Mel-scale and Optimal VTN Mapping, Johns Hopkins University, Center for Language and Speech Processing, 1997 workshop (WS97), 1997.
[5] PORWIK P., Isolated word descriptors as control parameters of the computer applications, Journal of Medical Informatics & Technologies, Vol. 10, 2006, pp. 35-46.
[6] PORWIK P., PROKSA R., Endpoints detection level for isolated words recognition, IMM 2008, Warsaw, Poland.
[7] PORWIK P., PROKSA R., Word extraction method in human speech processing, Journal of Medical Informatics & Technologies, Vol. 12, 2008, pp. 209-216.
[8] PROKSA R., Metody detekcji granic słowa dla zaszumionych sygnałów mowy, Systemy Wspomagania Decyzji 2008, Zakopane, Poland.
[9] SIGURDSSON S., PETERSEN K. B., LEHN-SCHIØLER T., Mel Frequency Cepstral Coefficients: An Evaluation of Robustness of MP3 Encoded Music, Proceedings of the Seventh International Conference on Music Information Retrieval (ISMIR), 2006.
[10] THRASYVOULOU T., BENTON S., Speech parameterization using the Mel scale Part II, 2003.
[11] TYCHTL Z., PSUTKA J., Speech Production Based on the Mel-Frequency Cepstral Coefficients, EUROSPEECH'99, pp. 2335-2338.
[12] ZHANG X., GUO Y., HOU X., A Speech Recognition Method of Isolated Words Based on Modified LPC Cepstrum, Proceedings of the 2007 IEEE International Conference on Granular Computing, 2007, p. 481.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA4-0002-0022