Wyniki wyszukiwania - BazTech

1

Investigation of the Lombard effect based on a machine learning approach

Korvel Gražina, Treigys Povilas, Kąkol Krzysztof, Kostek Bożena

International Journal of Applied Mathematics and Computer Science

|

2023

|

Vol. 33, no. 3

479--492

EN

The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters related to speech changes produced by the Lombard effect are extracted. Mid-term statistics are built upon the parameters and used for the self-similarity matrix construction. They constitute input data for a convolutional neural network (CNN). The self-similarity-based approach is then compared with two other methods, i.e., spectrograms used as input to the CNN and speech acoustic parameters combined with the k-nearest neighbors algorithm. The experimental investigations show the superiority of the self-similarity approach applied to Lombard effect detection over the other two methods utilized. Moreover, small standard deviation values for the self-similarity approach prove the resulting high accuracies.

2

Pursuing analytically the influence of hearing aid use on auditory perception in various acoustic situations

Szymański Piotr, Poremski Tomasz, Kostek Bożena

Vibrations in Physical Systems

|

2022

|

Vol. 33, nr 1

art. no. 2022107

EN

The paper presents the development of a method for assessing auditory perception and the effectiveness of applying hearing aids for hard-of-hearing people during short-term (up to 7 days) and longer-term (up to 3 months) use. The method consists of a survey based on the APHAB questionnaire. Additional criteria such as the degree of hearing loss, technological level of hearing aids used, as well as the user experience are taken into consideration. A web-based application is developed, allowing answering the survey questions from any computer with Internet access. The results of the benefit obtained from the use of hearing aids in various acoustic environments, taking into account the time of their use, are presented and compared to the earlier outcomes. The research results show that in the first period of use of hearing aids, speech perception improves, especially in noisy environments. The perception of unpleasant sounds also increases, which may lead to deterioration of hearing aid acceptance by their users.

3

Technologia CyberOko do diagnozy, rehabilitacji i komunikowania się z pacjentami niewykazującymi oznak przytomności

Czyżewski Andrzej, Odya Piotr, Szczuko Piotr, Kostek Bożena, Kwiatkowska Agnieszka, Lech Michał, Kucewicz Michał

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne

|

2022

|

nr 3

60--69

PL

CyberOko jest rozwiązaniem opracowanym w Politechnice Gdańskiej, które umożliwia nawiązanie kontaktu i pracę z osobami głęboko upośledzonymi komunikacyjnie. W sposób inteligentny śledzi ruch gałek ocznych, dzięki czemu umożliwia rehabilitację i ocenę stanu świadomości pacjenta nawet w stanie całkowitego porażenia. Rozwiązanie obejmuje także analizę fal EEG, obiektywne badanie słuchu i badanie sygnałów z macierzy elektrod wszczepianych w głąb ludzkiego mózgu. Wspomaga komunikację z pacjentami niewykazującymi oznak przytomności i ich dalszą rehabilitację sposobami umożliwiającymi pokonanie istotnych ograniczeń, jakie mają metody i technologie będące w powszechnym użyciu, tzn. subiektywne skale ocen pacjentów (np. ocena w skali GCS – Glasgow Conciousness Scale), badanie procesów pamięciowych wewnątrz mózgu ludzkiego. Wdrożone urządzenie jest często jedyną szansą dla osoby chorej (np. w stanach podobnych do śpiączki, w przetrwałym stanie wegetatywnym, osoby sparaliżowanej, bez możliwości mówienia), aby mogła ona wyrazić swoje potrzeby.

EN

CyberOko (CyberEye) is a pioneering solution developed at the Gdansk University of Technology enabling contact and work with people with profound communication disabilities. It intelligently tracks eye movements, allowing for rehabilitation and assessment of the patient's state of consciousness even in a state of profound paralysis or locked-in syndrome. The technology engineered also includes the analysis of EEG waves, objective hearing testing, and examination of signals from an array of electrodes implanted deep into the human brain. It supports communication with unconscious patients and their further rehabilitation by means that overcome significant limitations of the methods and technologies in common use, i.e. subjective patient rating scales such as, e.g., Glasgow Conciousness Scale (GCS), study of memory processes inside the human brain. The device implemented is often the only chance for the sick person (e.g., in coma-like states, in a persistent vegetative state, paralyzed, unable to speak) to express their needs.

4

Z perspektywy nieco ponad 15 lat działalności Oddziału IEEE Gdańsk Computer Society (Chapter C16) na Wydziale Elektroniki, Telekomunikacji i Informatyki, Politechniki Gdańskiej

Kostek Bożena, Woźniak Józef

Elektronika : konstrukcje, technologie, zastosowania

|

2022

|

Vol. 63, Nr 11

23--26

PL

W pracy przywołano pokrótce najważniejsze działania, które towarzyszyły powstaniu i funkcjonowaniu Oddziału IEEE Gdańsk Computer Society (Chapter C16). Zaprezentowano skład Zarządu Oddziału w kolejnych kadencjach. Zwrócono uwagę między innymi na rolę Oddziału w promowaniu osiągnięć wybitnych naukowców, prezentujących swoje prace w ramach wykładów, odbywających się pod auspicjami Oddziału, jak też na współudział Oddziału w organizacji i sponsorowaniu konferencji. Podkreślono także udział Oddziału w promowaniu młodych naukowców, w ramach dorocznych konkursów na wyróżniające się prace dyplomowe. Ostatnim elementem krótkiego przeglądu jest przed- stawienie wniosków, które nasunęły się autorom z perspektywy ponad 15 lat działalności Oddziału, a także zasygnalizowanie działań planowanych do realizacji w kolejnych latach.

EN

The paper briefly recalls the most important activities that accompanied the establishment and operation of the IEEE Gdańsk Computer Society Chapter-C16. The composition of the Chapter Board in subsequent terms was presented. Attention was paid to, inter alia, the role of the Chapter in promotion of achievements of outstanding scientists who presented their works during lectures held under the auspices of the Chapter, as well as the participation of the Chapter in the organization and sponsorship of conferences. The participation of the Chapter in promoting young scientists as part of annual competitions for outstanding diploma theses was also emphasized. The last element of the short review is to present the conclusions that came to the authors from the perspective of over 15 years of the Chapter’s activity, as well as to signal the activities planned to be implemented in the coming years.

5

Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions – Headphones vs Loudspeakers: A Case Study

Mróz Bartłomiej, Kostek Bożena

Archives of Acoustics

|

2022

|

Vol. 47, No. 1

71--79

EN

This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are coupled with or without visual cues to the listeners. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology assumptions and the experimental setup are described to the participants. Then, the results are presented and analysed using statistical methods. The results indicate that the headphone trials showed much higher perceptual ambiguity for the listeners than when the sound is delivered via loudspeakers. The influence of the visual modality dominates the audio-visual evaluation when loudspeaker playback is employed. Moreover, when the visual stimulus is present, the headphone playback pattern of behavior is not always in response to the loudspeaker playback.

6

Analiza ruchu drogowego z wykorzystaniem analizy akustycznej

Marciniuk Karolina, Kostek Bożena

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne

|

2020

|

nr 7-8

370--373, CD

PL

Tematyka pracy porusza zagadnienia dotyczące pozyskiwania informacji o ruchu drogowym z wykorzystaniem monitoringu akustycznego. Przybliżono podstawowe techniki nadzoru nad ruchem drogowym. Przedstawiono założenia akustycznego detektora ruchu i zbadano jego skuteczność na trzech płaszczyznach działania – zliczania pojazdów, klasyfikacji rodzajowej i klasyfikacji warunków pogodowych panujących na nawierzchni.

EN

The subject of the work is related to the acquisition of traffic information using acoustic monitoring. Baseline techniques of road traffic supervision are presented. The assumptions of the acoustic motion detector are introduced, and its effectiveness is examined at three levels of operation - vehicle counting, generic classification, and classification of weather conditions on the surface.

7

Analiza parametrów sygnału mowy w kontekście ich przydatności w automatycznej ocenie jakości ekspresji śpiewu

Zaporowski Szymon, Kostek Bożena

Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej

|

2019

|

Nr 68

61--64

PL

Praca dotyczy podejścia do parametryzacji w przypadku klasyfikacji emocji w śpiewie oraz porównania z klasyfikacją emocji w mowie. Do tego celu wykorzystano bazę mowy i śpiewu nacechowanego emocjonalnie RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), zawierającą nagrania profesjonalnych aktorów prezentujących sześć różnych emocji. Następnie obliczono współczynniki mel-cepstralne (MFCC) oraz wybrane deskryptory niskopoziomowe MPEG 7. W celu selekcji cech, posiadających najlepsze wyniki rankingowe, wykorzystano las drzew. Następnie dokonano klasyfikacji emocji z za pomocą maszyny wektorów nośnych (SVM, Support Vector Machine). Stwierdzono, że parametryzacja skuteczna dla mowy nie jest skuteczna dla śpiewu. Wyznaczono podstawowe parametry, które zgodnie z otrzymanymi wynikami pozwalają na znaczną redukcję wymiarowości wektorów cech, jednocześnie podnosząc skuteczność klasyfikacji.

EN

This paper concerns the approach to parameterization for the classification of emotions in singing and comparison with the classification of emotions in speech. For this purpose, the RAVDESS database containing emotional speech and song was used. This database contains recordings of professional actors presenting six different emotions. Next, Mel Frequency Cepstral Coefficients and selected Low-Level MPEG 7 descriptors were calculated. Using the algorithm of Feature Selection based on a Forest of Trees, coefficients, and descriptors with the best ranking results were determined. Then, the emotions were classified using the Support Vector Machine. The classification was repeated several times, and the results were averaged. It was found that descriptors used for emotion detection in speech are not as useful for singing. Basic parameters for singing were determined which, according to the obtained results, allow for a significant reduction in the dimensionality of feature vectors while increasing the classification efficiency of emotion detection.

8

Analiza kolorów scen filmowych w kontekście color gradingu

Weber Dawid, Kostek Bożena

Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej

|

2019

|

Nr 68

57--60

PL

W artykule przedstawiono zagadnienia związane z kolorowaniem sceny filmowej. W pracy przedyskutowano główne aspekty obróbki koloru obrazu filmowego oraz omówiono definicje pojęć związanych z kolorowaniem sceny, tj.: color correction oraz color gradingu. Opisano teorie psychologii koloru oraz ich praktyczne wykorzystanie w filmie i odniesiono je do podstawowych gatunków filmowych i modeli emocji. Następnie przedyskutowano założenia metodologii analizy kolorów scen filmowych w kontekście color gradingu, obejmującej również zebranie przykładów scen filmowych i ich adnotację. Przedstawiono strukturę oraz opis algorytmu uzyskiwania najbardziej dominujących kolorów scen filmowych w produkcjach filmowych. Wynikiem pracy algorytmu jest ekstrakcja parametrów związanych z trzema najważniejszymi cechami koloru, tj.: luminancją, nasyceniem i odcieniem, czyli histogramów luminancji i saturacji wyznaczanymi dla kilku pasm osobno w skali logarytmicznej (np. dla luminancji: najbardziej dominujące kolory, średnie i cienie). W artykule zawarto wstępne wyniki analizy kolorów na podstawie przetwarzania obrazu uzyskane w wyniku implementacji algorytmu. Pracę kończy podsumowanie i wnioski dotyczące połączenia najbardziej dominujących kolorów w scenach filmowych wraz z psychologią kolorów oraz oddziaływaniem ich na ludzkie emocje.

EN

The article presents issues related to the film scene coloring. The main aspects of film image color processing are discussed in the paper and the definitions of concepts related to scene coloring, i.e., color correction and color grading, are brought. The theories of color psychology and their practical use in film and the relationship to basic film genres and emotion models were described. Next, the assumptions of the methodology for analyzing the color of film scenes in the context of color grading were discussed, including the collection of examples of film scenes and their annotation. The structure and description of the algorithm for obtaining the most dominant colors of film scenes in film productions is presented. The results of the algorithm performance are parameters extracted related to the three most essential color features, i.e., luminance, saturation and shade. Luminance and saturation histograms are determined for several bands separately on a logarithmic scale (e.g., for luminance: the most dominant colors, medium, and shadows). The article contains preliminary results of the color analysis based on image processing obtained as a result of the algorithm implementation. The paper ends with a summary regarding the possibility of combining the most dominant colors in movie scenes along with music based on color psychology and their impact on human emotions.

9

Machine learning-based analysis of English lateral allophones

Piotrowska Magdalena, Korvel Gražina, Kostek Bożena, Ciszewski Tomasz, Czyżewski Andrzej

International Journal of Applied Mathematics and Computer Science

|

2019

|

Vol. 29, no. 2

393--405

EN

Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’ and phonology experts’ speech was selected for analyses. For the purpose of the present study, a sub-list of 103 words containing the English alveolar lateral phoneme /l/ was compiled. The list includes ‘dark’ (velarized) allophonic realizations (which occur before a consonant or at the end of the word before silence) and 52 ‘clear’ allophonic realizations (which occur before a vowel), as well as voicing variants. The recorded signals were segmented into allophones and parametrized using a set of descriptors, originating from the MPEG 7 standard, plus dedicated time-based parameters as well as modified MFCC features proposed by the authors. Classification methods such as ANNs, the kNN and the SOM were employed to automatically detect the two types of allophones. Various sets of features were tested to achieve the best performance of the automatic methods. In the final experiment, a selected set of features was used for automatic evaluation of the pronunciation of dark /l/ by non-native speakers.

10

Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results

Korvel Gražina, Kurasova Olga, Kostek Bożena

Archives of Acoustics

|

2019

|

Vol. 44, No. 4

693--707

EN

The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are time- and frequency-domain descriptors. English language as an auxiliary language is used in the experiments. In the first part of the experiments, an analysis of Lithuanian and Polish language samples is carried out, features are extracted, and the most discriminating ones are determined. In the second part of the experiments, automatic classification of Lithuanian/English, Polish/English, and Lithuanian/Polish phonemes is performed.

11

Assessment of the Effectiveness of a Short-term Hearing Aid Use in Patients with Different Degrees of Hearing Loss

Poremski Tomasz, Szymański Piotr, Kostek Bożena

Archives of Acoustics

|

2019

|

Vol. 44, No. 4

719--729

EN

The study presents evaluating the effectiveness of the hearing aid fitting process in the short-term use (7 days). The evaluation method consists of a survey based on the APHAB (Abbreviated Profile of Hearing Aid Benefit) questionnaire. Additional criteria such as a degree of hearing loss, number of hours and days of hearing aid use as well as the user’s experience were also taken into consideration. The outcomes of the benefit obtained from the hearing aid use in various listening environments for 109 hearing aid users are presented, including a degree of their hearing loss. The research study results show that it is possible to obtain relevant and reliable information helpful in assessing the effectiveness of the short-term (7 days) hearing aid use. The overall percentage of subjects gaining a benefit when communicating in noise is the highest of all the analyzed and the lowest in the environment with reverberation. The statistical analysis performed confirms that in the listening environments in which conversation is held, a subjective indicator determined by averaging benefits for listening situations individually is statistically significant with respect to the degree of hearing loss. Statistically significant differences depending on the degree of hearing loss are also found separately for noisy as well as reverberant environments. However, it should be remembered that this study is limited to three types of hearing loss, i.e. mild, moderate and severe. The acceptance of unpleasant sounds gets the lowest rating. It has also been observed that in the initial period of hearing aid use, the perception of unpleasant sounds has a big influence on the evaluation of hearing improvement.