Acoustic features of speech are promising objective markers for mental health monitoring. Specialized smartphone apps can gather such acoustic data without disrupting patients' daily activities. However, psychiatric assessment of a patient's mental state is typically sporadic, taking place every few months. Consequently, only a small fraction of the acoustic data is labeled and usable for supervised learning. Most related work on mental health monitoring restricts consideration to labeled data within a predefined ground-truth period. Semi-supervised methods, by contrast, make it possible to utilize the entire dataset, exploiting regularities in the unlabeled portion of the data to improve a model's predictive power. To assess the applicability of semi-supervised learning, we discuss selected state-of-the-art semi-supervised classifiers: label spreading, label propagation, a semi-supervised support vector machine, and the self-training classifier. We use real-world data obtained from a patient with bipolar disorder to compare the performance of these methods with that of baseline supervised learning methods. The experiments show that semi-supervised learning algorithms can outperform supervised algorithms in predicting bipolar disorder episodes.
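The semi-supervised classifiers named in the abstract are all available in scikit-learn. The sketch below is a minimal illustration, not the authors' pipeline: it uses a synthetic two-class dataset as a stand-in for the acoustic feature vectors, and marks ~90% of samples as unlabeled (scikit-learn's `-1` convention) to mimic the sparse psychiatric assessments.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import LabelSpreading, SelfTrainingClassifier
from sklearn.svm import SVC

# Synthetic stand-in for acoustic feature vectors.
rng = np.random.RandomState(0)
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# Hide ~90% of the labels, mimicking sporadic clinical assessments.
y_partial = y.copy()
unlabeled = rng.rand(400) < 0.9
y_partial[unlabeled] = -1  # scikit-learn's marker for "unlabeled"

# Label spreading propagates labels along the kNN graph of all samples,
# exploiting the geometry of the unlabeled portion of the data.
spread = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)

# Self-training wraps a supervised base learner (here an SVM) and
# iteratively adds its most confident predictions as pseudo-labels.
self_train = SelfTrainingClassifier(SVC(probability=True)).fit(X, y_partial)

# Transductive accuracy on the samples whose labels were hidden.
acc_spread = (spread.transduction_[unlabeled] == y[unlabeled]).mean()
print(f"label spreading accuracy on unlabeled portion: {acc_spread:.2f}")
```

`LabelPropagation` is a drop-in alternative to `LabelSpreading`; a transductive SVM is not in scikit-learn and would require a third-party implementation.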
Automatic voice condition analysis systems have been developed to automatically discriminate pathological voices from healthy ones in the context of two disorders related to exudative lesions of Reinke’s space: nodules and Reinke’s edema. The systems are based on acoustic features extracted from sustained vowel recordings. Reduced subsets of features have been obtained from a larger set by a feature selection algorithm based on Whale Optimization in combination with Support Vector Machine classification. The robustness of the proposed systems is assessed by adding noise of two different types (synthetic white noise and actual noise recorded in a clinical environment) to corrupt the speech signals. Two speech databases were used for this investigation: the Massachusetts Eye and Ear Infirmary (MEEI) database and a second one specifically collected in Hospital San Pedro de Alcántara (Cáceres, Spain) for the scope of this work (UEX-Voice database). The results show that the prediction performance of the detection systems decreases appreciably when moving from MEEI to a database recorded in more realistic conditions. For both pathologies, prediction performance declines under noisy conditions, with the effect of white noise being more pronounced than that of noise recorded in the clinical environment.
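The robustness test described above hinges on corrupting a signal with noise at a controlled level. A minimal sketch of that step, assuming white Gaussian noise scaled to a target signal-to-noise ratio and a pure tone standing in for a sustained vowel recording:

```python
import numpy as np

def add_white_noise(signal: np.ndarray, snr_db: float, seed=None) -> np.ndarray:
    """Corrupt a signal with synthetic white noise at a target SNR in dB."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(signal ** 2)
    # Solve SNR_dB = 10*log10(P_signal / P_noise) for the noise power.
    p_noise = p_signal / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise

# Stand-in for a sustained vowel: a 100 Hz tone, 1 s at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
vowel = np.sin(2 * np.pi * 100 * t)
noisy = add_white_noise(vowel, snr_db=10, seed=0)
```

For the clinical-noise condition, the same scaling would be applied to a recorded noise segment instead of `rng.normal` samples.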
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) algorithm and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers was audio-video recorded, from which the speech of seven native speakers and phonology experts was selected for analysis. For the purpose of the present study, a sub-list of 103 words containing the English alveolar lateral phoneme /l/ was compiled. The list includes ‘dark’ (velarized) allophonic realizations (which occur before a consonant or at the end of a word before silence) and 52 ‘clear’ allophonic realizations (which occur before a vowel), as well as voicing variants. The recorded signals were segmented into allophones and parametrized using a set of descriptors originating from the MPEG-7 standard, plus dedicated time-based parameters as well as modified MFCC features proposed by the authors. ANNs, kNN and SOMs were employed to automatically detect the two types of allophones, and various sets of features were tested to achieve the best performance. In the final experiment, a selected set of features was used for automatic evaluation of the pronunciation of dark /l/ by non-native speakers.
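Of the classifiers listed, kNN is the simplest to demonstrate. The sketch below is purely illustrative: two synthetic Gaussian feature clusters stand in for the parametrized ‘dark’ and ‘clear’ /l/ segments (the actual MPEG-7/MFCC feature vectors are not reproduced here), and a 5-nearest-neighbor classifier separates them.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in for parametrized allophone segments:
# 12-dimensional feature vectors for 'dark' (0) and 'clear' (1) /l/.
rng = np.random.default_rng(0)
dark = rng.normal(loc=0.0, scale=1.0, size=(60, 12))
clear = rng.normal(loc=2.0, scale=1.0, size=(60, 12))
X = np.vstack([dark, clear])
y = np.array([0] * 60 + [1] * 60)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Each test segment is assigned the majority class of its 5 nearest
# training segments in feature space.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
acc = knn.score(X_te, y_te)
```

With real allophone data the feature sets would be varied, as in the study, to find the combination giving the best detection performance.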
The paper presents an analysis of audio features for speech processing systems in which the speech signal is contaminated by background noise. To determine the robustness of speech features in different audio environments, feature contours in clean and noisy conditions were compared using a mean-square error criterion. The obtained results were then exploited in a simple, low-complexity speech detection algorithm. Experimental results show that accurate determination of speech regions is highly dependent on recording conditions and speaker characteristics. Nevertheless, the approach is suitable for automatic detection of sentence boundaries in speech processing systems.
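A minimal sketch of the comparison described above, under illustrative assumptions: short-time energy stands in for the feature contour, a tone burst in silence stands in for an utterance, and a fixed energy threshold stands in for the low-complexity detector (the paper's actual feature set and detector are not specified here).

```python
import numpy as np

def frame_energy(signal: np.ndarray, frame_len: int = 256) -> np.ndarray:
    """Short-time energy contour: one value per non-overlapping frame."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(2 * fs) / fs
# Stand-in utterance: silence, a 200 Hz tone from 0.5 s to 1.5 s, silence.
speech = np.where((t > 0.5) & (t < 1.5), np.sin(2 * np.pi * 200 * t), 0.0)
noisy = speech + rng.normal(0.0, 0.05, size=speech.shape)

# Mean-square error between the clean and noisy feature contours
# measures how much the recording conditions distort the feature.
clean_contour = frame_energy(speech)
noisy_contour = frame_energy(noisy)
mse = np.mean((clean_contour - noisy_contour) ** 2)

# Low-complexity detection: mark frames whose energy exceeds a threshold.
threshold = 0.1
detected = noisy_contour > threshold
```

Features whose contours yield a small MSE between clean and degraded conditions are the ones worth using in the detector.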