Lossy coding and bitrate effects on changes in formant frequencies in Japanese and English speech signals

Kucharski, Mateusz Andrzej; Brachmański, Stefan

doi:10.24425/ijet.2024.149584

Article details

Content

Full texts:

IJET_2024_70_3_KUCHARSKI_BRACHMANSKI_Lossy coding and bitrate effects on changes.pdf

Identifiers

DOI

10.24425/ijet.2024.149584

Abstract

Since speaker recognition and verification have become heavily used technologies, both in professional applications such as forensics and in more everyday ones, the question has arisen of which factors can affect the results of those processes. One potentially important factor is lossy coding, because some of the information contained in the original file is lost in the coding process. In the era of globalization, researchers are interested not only in native languages or the languages of neighboring countries, but also in those of more distant regions, especially Asia, the biggest exporter of goods and services to Europe. Those economic relationships usually involve an interchange of personnel, which further shortens geographical distance. The article presents results that continue earlier research on the behavior of Japanese language formants. That earlier research focused on changes in the first and second formants; this article presents the changes observed for the third and fourth formants. Knowledge of these changes is useful in forensic speaker identification using the spectrographic method. At the Department of Acoustics and Multimedia, Wroclaw University of Science and Technology, and in many centers around the world, the auditory-spectrographic method is used, which combines the aural and spectrographic methods. In the spectrographic part, a person is identified by comparing formant trajectories.

Keywords

formants; formant frequency; bitrate; coding; lossy codecs; speech

Publisher

Polish Academy of Sciences, Committee of Electronics and Telecommunication

Journal

International Journal of Electronics and Telecommunications

Year

2024

Volume

Vol. 70, No. 3

Pages

597-602

Physical description

Bibliography: 14 items, figures.

Authors

author

Kucharski Mateusz Andrzej

mateusz.kucharski@pwr.edu.pl

Wroclaw University of Science and Technology, Wroclaw, Poland

author

Brachmański Stefan

stefan.brachmanski@pwr.edu.pl

Wroclaw University of Science and Technology, Wroclaw, Poland

https://orcid.org/0000-0002-9075-4337

References

[1] H. Tachibana and Y. Suzuki, ““Acoustical Science and Technology” – An improved version of the “Journal of Acoustical Society of Japan (E)” –,” Acoust. Sci. & Tech., 22, 1–1 (2001).
[2] L. G. Kersta, (1962) Voiceprint Identification, Nature, 196, 1253-1257.
[3] Y. Kinoshita, (2001) Testing Realistic Forensic Speaker Identification In Japanese: A Likelihood Ratio Based Approach Using Formants, PhD Thesis, Australian National University.
[4] H. Hollien, R. Schwarz, (2000) Aural-perceptual speaker identification: Problems with noncontemporary samples, Forensic Linguistics: The International Journal of Speech, Language and the Law, 7, 2, 199-211.
[5] S. Brachmański, (2015) Selected problems of speech transmission quality assessment (in Polish: Wybrane zagadnienia oceny jakości transmisji sygnału mowy), Wrocław: Oficyna Wyd. Politechniki Wrocławskiej.
[6] M. Kucharski, S. Brachmański, (2019) Coding Effects on Changes in Formant Frequencies in Japanese Speech Signals, Vibrations in Physical Systems, 1, 30, 243-250.
[7] ITU-T Recommendation P.501, (2017) Test signals for use in telephonometry.
[8] ITU-T Recommendation P.800, (1996) Method for subjective determination of transmission quality.
[9] M. Kucharski, (2017) Realization of Japanese sentences sets acoustical database for selected coding techniques, BSc Thesis, Wroclaw University of Science and Technology, Wroclaw.
[10] Y. Hirata, K. Tsukada, (2004) The Effects of Speaking Rates and Vowel Length on Formant Movements in Japanese, Proceedings of the 2003 Texas Linguistic Society Conference.
[11] T. Hirahara, R. Akahane-Yamada, (2004) Acoustic Characteristics of Japanese Vowels, 18th International Congress of Acoustics.
[12] P. Warden, (2018) Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition, http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz (access 22.09.2020).
[13] J. M. Valin, (2007) The Speex Codec Manual, Version 1.2 Beta 3, Xiph.org Foundation.
[14] https://docs.microsoft.com/en-us/windows/win32/medfound/about-the-windows-media-codecswindows-media-audio-9 (access 15.07.2020).

Notes

Record prepared with funds from the Ministry of Science and Higher Education (MNiSW), agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularisation of Science (2025).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-bf031de2-f6f8-46b5-8f41-e75fd8973adb