Article title

Speech Segmentation Algorithm Based on an Analysis of the Normalized Power Spectral Density

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
This article presents a new approach to speaker-independent phoneme detection. The core of the algorithm is to measure the distance between the normalized power spectral densities of adjacent short-time segments and to verify the candidate boundaries against the rate of change of the short-time signal energy. Experimental results indicate that the proposed algorithm reveals the phoneme structure of pronounced speech with high probability. Its advantages are that it requires no prior information about the signal and no phoneme or speaker models, which makes it speaker independent and keeps its computational complexity low.
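The sketch below illustrates the idea summarized in the abstract: normalized PSDs are computed for adjacent short-time frames, their distance is taken as a boundary cue, and candidates are verified by the rate of change of short-time energy. The frame and hop lengths, the Euclidean distance, both thresholds, and all function names are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of PSD-distance-based segmentation, assuming a Euclidean
# distance between normalized per-frame PSDs and ad hoc thresholds.
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split signal x into overlapping frames of frame_len samples, hop step."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def normalized_psd(frames):
    """Per-frame power spectral density, normalized to unit sum."""
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frames.shape[1]), axis=1)) ** 2
    return spectrum / (spectrum.sum(axis=1, keepdims=True) + 1e-12)

def detect_boundaries(x, fs, frame_ms=20, hop_ms=10,
                      psd_thresh=0.05, energy_rate_thresh=0.2):
    frame_len, hop = int(fs * frame_ms / 1e3), int(fs * hop_ms / 1e3)
    frames = frame_signal(x, frame_len, hop)

    psd = normalized_psd(frames)
    # Distance between normalized PSDs of adjacent frames (Euclidean here).
    psd_dist = np.linalg.norm(np.diff(psd, axis=0), axis=1)

    # Short-time energy and its relative rate of change, used for verification.
    energy = (frames ** 2).sum(axis=1)
    energy_rate = np.abs(np.diff(energy)) / (energy[:-1] + 1e-12)

    # A frame transition is a phoneme-boundary candidate when the spectral
    # distance is large and the energy is changing quickly.
    candidates = (psd_dist > psd_thresh) & (energy_rate > energy_rate_thresh)
    return np.flatnonzero(candidates) * hop  # boundary positions in samples

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    # Toy signal: two "phoneme-like" tones joined at 0.5 s.
    x = np.where(t < 0.5, np.sin(2 * np.pi * 300 * t),
                 0.5 * np.sin(2 * np.pi * 1200 * t))
    print(detect_boundaries(x, fs))
```

In this reading, the PSD distance proposes boundaries and the energy-rate test discards spurious ones; the actual distance measure and decision rule used by the authors may differ.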
Year
Volume
Pages
44–49
Physical description
Bibliography: 9 items, figures
Authors
author
Bibliography
  • [1] A. Saheli and A. Abolfazl, “Speech recognition from PSD using neural network”, in Proc. Int. MultiConf. Engin. Comp. Scient. IMECS 2009, Hong Kong, 2009, vol. 1, pp. 174–176.
  • [2] B. Gajic and K. Paliwal, “Robust parameters for speech recognition based on subband spectral centroid histograms”, in Proc. 7th Eur. Conf. Speech Commun. Technol. EUROSPEECH 2001, Aalborg, Denmark, 2001.
  • [3] C. Espy-Wilson and S. Manocha, “A new set of features for text-independent speaker identification”, in Proc. Int. Conf. Spoken Lang. Proces. INTERSPEECH 2006, Pittsburgh, USA, 2006.
  • [4] P. Labutin and S. Koval, “Speaker identification based on the statistical analysis of F0”, in Proc. 16th Annual Conference IAFPA 2007, Plymouth, UK, 2007.
  • [5] T. Becker and M. Jessen, “Forensic speaker verification using formant features and Gaussian mixture models”, in Proc. Int. Conf. Spoken Lang. Proces. INTERSPEECH 2008, Brisbane, Australia, 2008.
  • [6] E. H. Kim and K. H. Hyun, “Robust emotion recognition feature, frequency range of meaningful signal”, in Proc. IEEE Int. Worksh. Robot Human Interact. Commun., Nashville, USA, 2005.
  • [7] M. A. Al-Alaoui and L. Al-Kanj, “Speech recognition using artificial neural networks and hidden Markov models”, IEEE Multidiscipl. Engin. Educ. Mag., vol. 3, no. 3, 2008.
  • [8] N. Bhatnagar, “A modified spectral subtraction method combined with perceptual weighting for speech enhancement”, M.Sc. thesis, The University of Texas, Dallas, August 2002.
  • [9] M. S. Medvedev, “Ispolzovanije vejvlet-preobrazowanija dla postrojenia modelej fonem russkowo jazyka”, Wiestnik Sibirskogo Federalnego Universiteta, no. 9, p. 198, 2006 (in Russian).
Document type
YADDA identifier
bwmeta1.element.baztech-article-BAT8-0020-0016