Search results - Biblioteka Nauki

1

Audio Stream Analysis for Deep Fake Threat Identification

100%

Jędrasiak K.

no. 1

21-35

EN

This article introduces a novel approach to identifying deep fake threats in audio streams, specifically targeting synthetic speech generated by text-to-speech (TTS) algorithms. The system has two core components: the Vocal Emotion Analysis (VEA) Network, which captures the emotional nuances expressed in speech, and the Supervised Classifier for Deepfake Detection, which uses the emotional features extracted by the VEA to distinguish authentic from fabricated audio tracks. The system exploits the limited ability of deepfake algorithms to replicate the emotional complexity of human speech, adding a semantic layer of analysis to the detection process. The robustness of the method was evaluated on a variety of datasets, so that its efficacy is not confined to controlled conditions but extends to realistic and challenging environments. This was achieved with data augmentation techniques, including additive white noise, which mimics the variability encountered in real-world audio processing. The results show that the system's performance is consistent across datasets and that accuracy remains high in the presence of background noise, particularly when the system is trained on noise-augmented datasets. By using emotional content as a distinctive feature and applying machine learning techniques, the system offers a robust framework for safeguarding against the manipulation of audio content. This contribution is intended to strengthen the integrity of digital communications as synthetic media proliferates.
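The noise augmentation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the noise levels or the implementation used, so the function names, the 10 dB signal-to-noise ratio, and the synthetic test tone below are assumptions chosen only for illustration.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with additive white Gaussian noise.

    The noise variance is chosen so that the result has approximately
    the requested signal-to-noise ratio (in dB).
    """
    if not signal:
        raise ValueError("signal must not be empty")
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR (dB) of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical input: one second of a 220 Hz tone sampled at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=10, seed=0)
print(round(measured_snr_db(clean, noisy), 1))
```

In an augmentation setting, a function like this would typically be applied to each training clip at several SNR values, so that the classifier sees both clean and degraded versions of the same speech.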

2

Acoustic Parameters in the Evaluation of Voice Quality of Choral Singers. Prototype of Mobile Application for Voice Quality Evaluation

75%

Szklanny K.

no. 3

3

Speech Emotion Recognition Based on Voice Fundamental Frequency

75%

Dimitrova-Grekow T., Klis A., Igras-Cybulska M.

no. 2

4

Classification of Parkinson’s disease and other neurological disorders using voice features extraction and reduction techniques

51%

Majdoubi O., Benba A., Hammouch A.

vol. 13, no. 3

16-22

EN

This study aimed to differentiate individuals with Parkinson's disease (PD) from those with other neurological disorders (ND) by analyzing voice samples, considering the association between voice disorders and PD. Voice samples were collected from 76 participants using different recording devices and conditions, with participants instructed to sustain the vowel /a/ comfortably. PRAAT software was employed to extract features including autocorrelation (AC), cross-correlation (CC), and Mel frequency cepstral coefficients (MFCC) from the voice samples. Principal component analysis (PCA) was utilized to reduce the dimensionality of the features. Classification Tree (CT), Logistic Regression, Naive Bayes (NB), Support Vector Machines (SVM), and Ensemble methods were employed as supervised machine learning techniques for classification. Each method provided distinct strengths and characteristics, facilitating a comprehensive evaluation of their effectiveness in distinguishing PD patients from individuals with other neurological disorders. The Naive Bayes kernel, using seven PCA-derived components, achieved the highest accuracy rate of 86.84% among the tested classification methods. It is worth noting that classifier performance may vary based on the dataset and specific characteristics of the voice samples. In conclusion, this study demonstrated the potential of voice analysis as a diagnostic tool for distinguishing PD patients from individuals with other neurological disorders. By employing a variety of voice analysis techniques and utilizing different machine learning algorithms, including Classification Tree, Logistic Regression, Naive Bayes, Support Vector Machines, and Ensemble methods, a notable accuracy rate was attained. However, further research and validation using larger datasets are required to consolidate and generalize these findings for future clinical applications.
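The reported accuracy can be checked against the sample size with simple arithmetic. The sketch below assumes the 86.84% figure was computed over all 76 participants, which the abstract does not state explicitly; the function name is illustrative.

```python
def correct_counts_for(accuracy_percent, n_samples, decimals=2):
    """Return every count k for which k / n_samples, expressed as a
    percentage, matches the reported accuracy at the given precision."""
    tolerance = 0.5 * 10 ** (-decimals)
    return [
        k
        for k in range(n_samples + 1)
        if abs(100 * k / n_samples - accuracy_percent) < tolerance
    ]


matches = correct_counts_for(86.84, 76)
print(matches)  # [66]
```

Under that assumption, 86.84% corresponds to 66 of 76 samples classified correctly (66 / 76 ≈ 0.8684), that is, 10 misclassifications.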

5

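Automatic assessment of voice emission disorders resulting from neurodegenerative processes, based on the analysis of isolated speech sounds (Automatyczna ocena zaburzeń emisji głosu będących wynikiem procesów neurodegeneracyjnych w oparciu o analizę wyizolowanych głosek)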

51%

Orzechowski T., Chmurzyńska K., Radkowski P.

vol. 10, issue 3

91-97

PL

The results presented here are the beginning of research on automatic voice classification. The paper outlines the theoretical physiological basis of voice and the pathological changes in speech caused by dysarthria, and then characterizes the selection of linguistic material in terms of place and manner of articulation in the phonetic system of Polish. Next, it describes the recording and preliminary analysis of the subjects' voices (changes in the realization of speech sounds, the intensity of sounds repeated multiple times in isolation, and spectral analysis of sustained sounds). Phenomena heard in the subjective examination by a speech pathologist or neurologist were confirmed by precise objective examination. The parameters obtained allow the examination results to be parameterized, enabling comprehensive classification. This will also allow accurate assessment of disease progression, which is not possible in a classical subjective examination.

EN

This paper presents the results of preliminary research on pathological voice changes caused by dysarthria. Computer analysis of voice may lead to the identification of parameters correlated with neurological diseases. The linguistic material was selected and characterized according to the place and manner of articulation in the phonetic system of Polish. The results of clinical examination made it possible to determine simple markers of neurodegenerative diseases, which will serve as a basis for constructing an objective examination model.