Results found: 5

Search results
Searched for:
in keywords: speech coding
EN
In person identification or verification, the primary interest is not in recognizing the words but in determining who is speaking them. In person identification systems, a test signal from an unknown speaker is compared with all known speaker signals in the set. The known speaker whose signal yields the maximum probability is identified as the unknown speaker. In security systems based on person identification and verification, faultless identification is of great importance for safety. In person verification systems, a test signal from a known speaker is compared with the recorded signals in the set that are associated with the label of the tested person. The set contains more than one recorded signal for every user. To increase safety, this work proposes an original approach to person verification based on independent speech and facial asymmetry features. The audio features of a person's speech are extracted using cepstral speech analysis. The improvement in the effectiveness of the face recognition technique is based on processing information about face asymmetry in the most informative part of the face, the eye region.
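As a rough illustration of the cepstral analysis mentioned in this abstract, the sketch below computes the real cepstrum of a single frame as the inverse DFT of the log magnitude spectrum. The abstract does not describe the authors' actual feature pipeline, so the function names, the naive DFT, and the small `eps` floor are assumptions made purely for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum of `frame`.

    `eps` (an assumed safeguard) avoids log(0) for empty spectral bins.
    """
    n = len(frame)
    log_mag = [math.log(abs(v) + eps) for v in dft(frame)]
    # For a real input frame the log magnitude spectrum is symmetric,
    # so the inverse DFT is real up to rounding; keep the real part.
    return [
        sum(log_mag[m] * cmath.exp(2j * math.pi * m * q / n) for m in range(n)).real / n
        for q in range(n)
    ]
```

In practice a frame would normally be windowed first, and only the low-order cepstral coefficients would be kept as features; both choices are left out here to keep the sketch short.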
Vol. 30, No. 1, art. no. 2019131
EN
This paper presents the results of research on the effects of lossy coding on formant frequencies in Japanese speech signals. Changes in voice pitch were also examined. Four of the most popular lossy coding standards, MP3, WMA, AAC and OGG, were chosen and compared with the original WAVE files. The audio files were created by the author according to the ITU-T P.501 recommendation at two sampling frequencies, 16 kHz and 48 kHz, and converted to the chosen codecs. The open-license software Praat was used to extract data from the audio files. Because differences in duration were found between the original and encoded files, and these differences also varied between individual codecs, only the OGG and WMA standards were compared directly. The MP3 and AAC files were divided into Japanese syllables, averaged, and then compared with WAVE files averaged in the same way. The results were additionally compared with the lossless FLAC codec.
2006, Vol. 2, No. 1, pp. 55-64
EN
The purpose of this work is to explain the theoretical issues and implementation techniques related to the field of speech recognition. The discussion focuses on some of the well-established and widely used speech coding standards required for speech recognition and speaker identification. By studying the most successful standards and understanding their principles, performance and limitations, it is possible to apply a particular technique to a given situation according to the underlying constraints, with the ultimate goal of developing next-generation algorithms that improve on all aspects. The document presents original methods for determining the beginning and end of isolated words in recorded speech. Cepstral speech analysis is used to extract the audio features of a person's speech. Finally, the paper presents the results of speech coding.
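The abstract does not say how the authors locate word boundaries, so the following is only a minimal sketch of one common approach, a short-time energy threshold. The function names, the frame length of 160 samples, and the threshold `ratio` are all hypothetical choices made for illustration.

```python
def frame_energies(samples, frame_len):
    """Short-time energy of consecutive non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_endpoints(samples, frame_len=160, ratio=0.1):
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as speech when its energy exceeds `ratio` times the
    maximum frame energy (an assumed rule). `end` is exclusive.
    """
    energies = frame_energies(samples, frame_len)
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0] * frame_len, (active[-1] + 1) * frame_len
```

A relative threshold keeps the sketch independent of recording level, but it will misfire on noisy recordings; practical detectors often combine energy with zero-crossing rate or a noise-floor estimate.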
EN
This paper focuses on the class of speech enhancement systems that exploit the psychoacoustic properties of the human ear. More advanced, psychoacoustically motivated spectral weighting rules are described. The presented systems are analyzed and classified according to their similarity to a human auditory model. In particular, improvements in musical noise cancellation and in speech intelligibility are compared. Moreover, the advantages of the perceptual approaches over conventional ones are highlighted. Finally, perspectives for integrated, psychoacoustically motivated speech enhancement and coding systems are discussed. The paper shows that integrating a subband coder with a speech enhancement system based on a non-uniformly spaced filter bank leads to the most promising combined scheme.
PL
Perceptually motivated methods of speech signal enhancement are reviewed and compared. The shortcomings of psychoacoustic solutions that use classical spectral weighting methods are pointed out. Based on the literature, various ways of psychoacoustically optimizing these methods are presented. The presented systems are classified according to their degree of agreement with the human auditory model. The results of applying psychoacoustic solutions are also compared with respect to their ability to suppress environmental noise and to prevent distortion of the speech signal. The comparison also covers combined echo cancellation and noise reduction systems. Finally, prospects for integrating a speech enhancement system with a subband coding system are presented, emphasizing the use of psychoacoustic models as an element common to both systems.
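For readers unfamiliar with spectral weighting, the sketch below shows one classical, non-perceptual rule: power spectral subtraction expressed as a per-bin gain with a spectral floor. It is not taken from the paper; the function name, the `floor` value, and the assumption that a noise power estimate is already available are all illustrative.

```python
def spectral_subtraction_gains(noisy_power, noise_power, floor=0.05):
    """Per-bin gain G = max(1 - N / Y, floor) for power spectral subtraction.

    `noisy_power` and `noise_power` are per-bin power estimates of the
    noisy speech and of the noise; `floor` is an assumed lower bound that
    keeps gains from reaching zero or going negative.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        g = 1.0 - n / y if y > 0 else 0.0
        gains.append(max(g, floor))
    return gains
```

The enhanced power spectrum is then the noisy power multiplied by these gains, bin by bin. Perceptually motivated variants typically adapt such a rule using an auditory masking model, which this sketch does not attempt.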
PL
The article addresses the problem of building automatic speech recognition systems based on hidden Markov models. The mathematical foundations of HMMs are presented and related to a real problem. It is shown that an appropriate choice of the number of states and distributions in the system is extremely important. Test results are also presented that confirm the advantage of RASTA-PLP coefficients over MFCCs and the need to use delta and delta-delta parameters.
EN
The article discusses problems associated with automatic speech recognition systems based on hidden Markov models (HMMs). The mathematical basis of HMMs is presented, and it is shown how it can be applied to a real problem. Proper selection of the number of states and Gaussian distributions is extremely important. Test results are presented that indicate the advantage of RASTA-PLP coefficients over MFCCs and the necessity of using delta and delta-delta parameters.
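The delta and delta-delta parameters mentioned above are usually computed with a regression over neighbouring frames. The sketch below uses that common formula; the abstract does not state which variant the authors used, so the window size of 2 and the edge handling (repeating the first and last frame) are assumptions.

```python
def delta(features, window=2):
    """Regression-based delta coefficients for a list of feature vectors.

    d[t] = sum_{n=1..N} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..N} n^2),
    where N is `window`. Frames beyond the edges are replaced by the
    first or last frame (an assumed padding rule).
    """
    num_frames = len(features)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(num_frames):
        vec = []
        for d in range(len(features[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = features[min(t + n, num_frames - 1)][d]
                behind = features[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            vec.append(acc / denom)
        out.append(vec)
    return out
```

Delta-delta coefficients follow by applying the same function to its own output, for example `delta(delta(features))`, and the three sets are typically concatenated frame by frame before HMM training.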