Search results - BazTech

1

Effect of Time-domain Windowing on Isolated Speech Recognition System Performance

Ananthakrishna Thalengala, Anitha H., Girisha T.

International Journal of Electronics and Telecommunications

|

2022

|

Vol. 68, No. 1

161--166

EN

Speech recognition systems extract textual data from the speech signal. Research in the speech recognition domain is challenging due to the large variability of the speech signal. A variety of signal processing and machine learning techniques have been explored to achieve better recognition accuracy. Speech is highly non-stationary in nature, and therefore analysis is carried out over a short time-domain window or frame. In the speech recognition task, cepstral features (Mel frequency cepstral coefficients, MFCC) are commonly used and are extracted for each short time frame. The effectiveness of the features depends upon the duration of the time window chosen. The present study investigates the optimal time-window duration for extraction of cepstral features in the context of the speech recognition task. A speaker-independent speech recognition system for the Kannada language has been considered for the analysis. In the current work, speech utterances from a Kannada news corpus recorded from different speakers have been used to create the speech database. The Hidden Markov Model Toolkit (HTK) has been used to implement the speech recognition system. The MFCCs along with their first and second derivative coefficients are used as feature vectors. The pronunciation dictionary required for the study has been built manually for a monophone system. Experiments have been carried out and results have been analyzed for different time-window lengths. An overlapping Hamming window has been used in this study. The best average word recognition accuracy of 61.58% has been obtained for a window length of 110 msec. This recognition accuracy is comparable with similar work found in the literature. The experiments have shown that the best word recognition performance can be achieved by tuning the window length to its optimum value.
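The framing step this abstract refers to (splitting the signal into overlapping, Hamming-weighted windows before feature extraction) can be sketched as follows. This is only an illustration and not the authors' HTK configuration: the 16 kHz sampling rate and the 10 ms frame shift are assumptions, since the abstract reports only the 110 msec window length, and the function names are hypothetical.

```python
import math


def hamming(n):
    """Return an n-point Hamming window, w[i] = 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, sample_rate, window_ms, shift_ms):
    """Split samples into overlapping frames and weight each by a Hamming window.

    Frames overlap whenever shift_ms is smaller than window_ms.
    Trailing samples that do not fill a whole window are dropped.
    """
    win_len = int(round(sample_rate * window_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    if win_len <= 0 or shift <= 0:
        raise ValueError("window and shift must each cover at least one sample")
    window = hamming(win_len)
    frames = []
    for start in range(0, len(samples) - win_len + 1, shift):
        frames.append([samples[start + i] * window[i] for i in range(win_len)])
    return frames


# Assumed setup: one second of a constant test signal at 16 kHz,
# a 110 ms window (from the abstract) and a 10 ms shift (assumption).
signal = [1.0] * 16000
frames = frame_signal(signal, sample_rate=16000, window_ms=110, shift_ms=10)
```

Each frame would then be passed to the cepstral (MFCC) analysis; a longer window gives finer frequency resolution per frame at the cost of time resolution, which is the trade-off the study tunes.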

2

An SFA-HMM performance evaluation method using state difference optimization for running gear systems in high-speed trains

Cheng Chao, Wang Meng, Wang Jiuhe, Shao Junjie, Chen Hongtian

International Journal of Applied Mathematics and Computer Science

|

2022

|

Vol. 32, no. 3

389--402

EN

The evaluation of system performance plays an increasingly important role in the reliability analysis of cyber-physical systems. Factors of external instability affect the evaluation results in complex systems. Taking the running gear in high-speed trains as an example, its complex operating environment is the most critical factor affecting the performance evaluation design. In order to optimize the evaluation while improving accuracy, this paper develops a performance evaluation method based on slow feature analysis and a hidden Markov model (SFA-HMM). SFA is used to screen out the slowest features as HMM inputs, based on which a new HMM is established for performance evaluation of running gear systems. In addition to the direct, classical performance evaluation of running gear systems in high-speed trains, a slow feature statistic is proposed to detect differences in the system state from test data and thereby eliminate erroneous evaluations by the HMM in the stable state. Furthermore, indicator planning and status classification of the data are performed using historical information and expert knowledge. Finally, a case study of the running gear system in high-speed trains is discussed. The comparison shows that the proposed method can enhance evaluation performance.

3

Land vehicle navigation using low-cost integrated smartphone GNSS MEMS and map matching technique

Mahmoud Mostafa, Abd Rabbou Mahmoud, El Shazly Adel

Artificial Satellites : Journal of Planetary Geodesy

|

2022

|

Vol. 57, No. 3

138--157

EN

The demand for smartphone positioning has grown rapidly due to the increase in applications that require accurate positioning, such as land vehicle navigation systems used for vehicle tracking, emergency assistance, and intelligent transportation systems. The integration between navigation systems is necessary to maintain a reliable solution. High-end inertial sensors are not preferred due to their high cost. Smartphone microelectromechanical systems (MEMS) are attractive due to their small size and low cost; however, they suffer from long-term drift, which highlights the need for additional aiding solutions using the road network that can perform efficiently for longer periods. In this research, the performance of the Xiaomi MI 8 smartphone's single-frequency precise point positioning was tested in kinematic mode using the between-satellite single-difference (BSSD) technique. A Kalman filter algorithm was used to integrate BSSD and an inertial navigation system (INS) based on smartphone MEMS. A map matching technique was proposed to assist navigation systems in global navigation satellite system (GNSS)-denied environments, based on the integration of BSSD-INS and road network models applying a hidden Markov model and the Viterbi algorithm. The results showed that BSSD-INS-map performed consistently better than the BSSD solution and the BSSD-INS integration, irrespective of whether simulated outages were added or not. The root mean square error (RMSE) values for 2D horizontal position accuracy when applying BSSD-INS-map integration improved by 29% and 22%, compared to the BSSD and BSSD-INS navigation solutions, respectively, with no simulated outages added. The overall average improvement of the proposed BSSD-INS-map integration was 91%, 96%, and 98% in 2D horizontal positioning accuracy, compared to the BSSD-INS algorithm for six simulated GNSS signal outages with durations of 10, 20, and 30 s, respectively.
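The Viterbi decoding that this abstract applies to map matching can be illustrated with a generic sketch. It is not the authors' implementation: the two road segments, the discretized "near_A"/"near_B" observations, and all probabilities below are made-up values chosen only to show how a single noisy position fix is smoothed out by the transition model.

```python
import math


def viterbi(states, start_p, trans_p, emit_p, observations):
    """Return the most likely hidden-state sequence for the observations.

    Scores are accumulated in log space to avoid numerical underflow.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the predecessor state that maximizes the path score into s.
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Backtrack from the best final state to recover the whole path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: hidden states are road segments, observations are
# position fixes labelled by the segment they fall closest to.
states = ["road_A", "road_B"]
start_p = {"road_A": 0.5, "road_B": 0.5}
trans_p = {
    "road_A": {"road_A": 0.9, "road_B": 0.1},
    "road_B": {"road_A": 0.1, "road_B": 0.9},
}
emit_p = {
    "road_A": {"near_A": 0.8, "near_B": 0.2},
    "road_B": {"near_A": 0.2, "near_B": 0.8},
}
fixes = ["near_A", "near_B", "near_A", "near_A"]
matched = viterbi(states, start_p, trans_p, emit_p, fixes)
```

In this toy case the second fix lies closer to road_B, but the decoded path stays on road_A throughout, because switching segments twice is less probable than one noisy measurement.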

4

HMM-based phoneme speech recognition system for the control and command of industrial robots

Naik Adwait

Technical Transactions

|

2021

|

Vol. 118, iss. 1

art. no. e2021002

EN

In recent years, the integration of human-robot interaction with speech recognition has gained a lot of pace in the manufacturing industries. Conventional methods to control the robots include semi-autonomous, fully-autonomous, and wired methods. Operating through a teaching pendant or a joystick is easy to implement but is not effective when the robot is deployed to perform complex repetitive tasks. Speech and touch are natural ways of communicating for humans and speech recognition, being the best option, is a heavily researched technology. In this study, we aim at developing a stable and robust speech recognition system to allow humans to communicate with machines (robotic-arm) in a seamless manner. This paper investigates the potential of the linear predictive coding technique to develop a stable and robust HMM-based phoneme speech recognition system for applications in robotics. Our system is divided into three segments: a microphone array, a voice module, and a robotic arm with three degrees of freedom (DOF). To validate our approach, we performed experiments with simple and complex sentences for various robotic activities such as manipulating a cube and pick and place tasks. Moreover, we also analyzed the test results to rectify problems including accuracy and recognition score.

5

Animal mimicry for covert communication with arbitrary output distribution: beyond the assumption of ignorance

Zuber Krzysztof Władysław, Opieliński Krzysztof J.

Vibrations in Physical Systems

|

2019

|

Vol. 30, nr 1

art. no. 2019119

EN

The paper describes a new method of embedding human communication in acoustic sequences mimicking animal communication. This is done to ensure a low probability of detection (LPD) transfer of covert messages. The proposed scheme mimics not only individual sounds, but also the imitated species’ communication structure. This paper presents a step forward in animal communication mimicry - from pure vocal imitation without regard for the plausibility of communication’s structure, through Zipf’s law-preserving scheme, to the mimicry of a known communication structure. Unlike previous methods, the updated scheme does not rely on third parties’ ignorance of the imitated species’ communication structure beyond Zipf’s law - instead, the new method enables one to encode information in a known zeroth-order Markov model. The paper describes a method of encoding an arbitrary message in a syntactically plausible, species-specific sequence of animal sounds through evolutionary means. A comparison with the previous iteration of the method is also presented.

6

Integration of hidden Markov models in the automated speaker recognition system for critical use

Kovtun Vjatcheslav V., Yukhimchuk Maria S., Kisała Piotr, Abisheva Akmaral, Rakhmetullina Saule

Przegląd Elektrotechniczny

|

2019

|

R. 95, nr 4

176--180

EN

In this article, the authors theoretically substantiate the possibility of integrating hidden Markov models (IHMM) into the structure of the automated speaker recognition system for critical use (ASRSCU) for the analysis of speech information from a plurality of independent input channels, which makes it possible, within the statistical conception of pattern recognition, to combine the accuracy of the approximation of input signals inherent in the apparatus of GMM models. The authors propose a mathematical apparatus for the integration of hidden Markov models, which allows an adequate description of the set of interacting processes in the Markov paradigm while preserving the temporal, asymmetric conditional probabilities between the chains.

PL

W tym artykule autorzy teoretycznie uzasadnili możliwość integracji ukrytych modeli Markowa (IHMM) w strukturze zautomatyzowanego systemu rozpoznawania głosu osoby mówiącej do zastosowań krytycznych (ASRSCU) do analizy informacji o mowie z wielu niezależnych kanałów wejściowych, które dopuszczają wewnątrz statystyczna koncepcję rozpoznawania wzorców w celu połączenia dokładności aproksymacji sygnałów wejściowych z aparatem modeli GMM. Autorzy zaproponowali aparat matematyczny do integracji ukrytych modeli Markowa, który pozwala odpowiednio opisać zestaw oddziałujących procesów w paradygmacie Markowa z zachowaniem czasowych, asymetrycznych warunkowych prawdopodobieństw między łańcuchami.

7

Times series averaging and denoising from a probabilistic perspective on time-elastic kernels

Marteau Pierre-Francois

International Journal of Applied Mathematics and Computer Science

|

2019

|

Vol. 29, no. 2

375--392

EN

In the light of regularized dynamic time warping kernels, this paper re-considers the concept of a time elastic centroid for a set of time series. We derive a new algorithm based on a probabilistic interpretation of kernel alignment matrices. This algorithm expresses the averaging process in terms of stochastic alignment automata. It uses an iterative agglomerative heuristic method for averaging the aligned samples, while also averaging the times of their occurrence. By comparing classification accuracies for 45 heterogeneous time series data sets obtained by first nearest centroid/medoid classifiers, we show that (i) centroid-based approaches significantly outperform medoid-based ones, (ii) for the data sets considered, our algorithm, which combines averaging in the sample space and along the time axes, emerges as the most significantly robust model for time-elastic averaging with a promising noise reduction capability. We also demonstrate its benefit in an isolated gesture recognition experiment and its ability to significantly reduce the size of training instance sets. Finally, we highlight its denoising capability using demonstrative synthetic data. Specifically, we show that it is possible to retrieve, from few noisy instances, a signal whose components are scattered in a wide spectral band.

8

Chemotherapy-induced fatigue estimation using hidden Markov model

Ameli Sina, Naghdy Fazel, Stirling David, Naghdy Golshah, Aghmesheh Morteza

Biocybernetics and Biomedical Engineering

|

2019

|

Vol. 39, no. 1

176--187

EN

Chemotherapy-induced fatigue undermines the physical performance and alters the gait behaviour of patients. In clinics, there is no well-established method to objectively assess the effects of chemotherapy-induced fatigue on gait characteristics. Clinical trials commonly use 6 Minute Walking Tests (6MWT) to assess patients' gait. However, these studies only measure the distance that patients can walk. The distance does not provide comprehensive information about variations in ambulatory motion characteristics and body postural behaviour, which can more appropriately describe the fatigue effects on general physical performance. Gait characteristics provide a manifestation of relationships between muscular and cardiovascular fitness status and physical motions. Hence, an assessment of gait characteristics provides more appropriate information about the effects of chemotherapy-induced fatigue on gait behaviour. A novel approach is proposed to objectively assess the impacts of chemotherapy-induced fatigue on cancer gait by analysing the gait characteristics during 6MWT. The joint angles of the lower body segments are measured by inertial sensors and modelled through a Hidden Markov Model (HMM) with Gaussian emissions. A Gaussian clustering method classifies the joint angles of the first gait cycle to determine the six gait phases of a normal gait as initial training values. A comparison of gait characteristics before and after chemotherapy-induced fatigue determines the gait abnormalities. The method is applied to four cancer patients and outcomes are benchmarked against the gait of a healthy subject before and after running program-induced fatigue. The results indicate a more accurate quantitative-based tool to measure the effects of chemotherapy-induced fatigue on gait and physical performance.

9

Multimodal face recognition method with two-dimensional hidden Markov model

Bobulski J.

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2017

|

Vol. 65, nr 1

121--128

EN

The paper presents a new solution for the face recognition based on two-dimensional hidden Markov models. The traditional HMM uses one-dimensional data vectors, which is a drawback in the case of 2D and 3D image processing, because part of the information is lost during the conversion to one-dimensional features vector. The paper presents a concept of the full ergodic 2DHMM, which can be used in 2D and 3D face recognition. The experimental results demonstrate that the system based on two dimensional hidden Markov models is able to achieve a good recognition rate for 2D, 3D and multimodal (2D+3D) face images recognition, and is faster than ICP method.

10

Predykcja stanu kanału z wykorzystaniem ukrytych Modeli Markowa w sieciach radia kognitywnego

Bednarczyk W., Gajewski P.

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne

|

2016

|

nr 6

536--539, CD

PL

Predykcja stanu kanału, czyli oszacowanie prawdopodobieństwa, czy kanał będzie wolny, czy zajęty, pozwala na skuteczniejsze zarządzanie widmem w sieciach radia kognitywnego. W artykule przedstawiono ocenę zastosowania ukrytych modeli Markowa do predykcji stanu kanału radiowego, na podstawie oszacowania prawdopodobieństwa poprawnej i fałszywej detekcji. Uzyskane prawdopodobieństwo predykcji stanu kanału potwierdza potencjalne możliwości modelu dla sieci radia kognitywnego z oportunistycznym dostępem do widma.

EN

Cognitive radio (CR) networks can be designed to manage the radio spectrum more efficiently by utilizing temporarily unused channels in licensed frequency bands. In this paper, we propose to use so-called Hidden Markov Models (HMM) to predict the spectrum occupancy of shared radio bands. The results obtained using HMM are very promising, and they show that HMMs offer a new paradigm for predicting channel behavior in cognitive radio.

11

Parametryzacja sygnału mowy w algorytmach rozpoznawania mowy

Wojtuń J., Ośka J., Piotrowski Z., Bernat M.

Elektronika : konstrukcje, technologie, zastosowania

|

2015

|

Vol. 56, nr 2

34-39

PL

Historia systemów automatycznego rozpoznawania mowy ma już kilkadziesiąt lat. Pierwsze prace badawcze z tego zakresu pochodzą z lat 50. XX wieku (prace w laboratoriach Bella oraz MIT). Pomimo iż zagadnieniem tym zajmuje się wiele zespołów badawczych na całym świecie, problem automatycznego rozpoznawania mowy nie został definitywne rozwiązany. Dostępne systemy rozpoznawania mowy nadal charakteryzują się gorszą skutecznością w porównaniu do umiejętności człowieka. W artykule przedstawiono schemat systemu rozpoznawania mowy na przykładzie rozpoznawania izolowanych słów języka polskiego. Zaprezentowano szczegółowy opis wyznaczania cech dystynktywnych sygnału mowy w oparciu o współczynniki mel – cepstralne oraz cepstralne współczynniki liniowej predykcji. Przedstawiono wyniki skuteczności rozpoznawania poszczególnych fraz.

EN

The first research in automatic speech recognition systems dates back to the fifties of the 20th century (the works of Bell Labs and MIT). Although this issue has been treated by many research teams, the problem of automatic speech recognition has not been definitively resolved and remains open. Available voice recognition systems still have a poorer efficiency compared to human skills. This article presents a diagram of speech recognition system for isolated words of the Polish language. A detailed description of the determination of distinctive features of the speech signal is presented based on the mel-frequency cepstral coefficient and linear predictive cepstral coefficients. Efficiency results are also presented.

12

Transient and stationary characteristics of a packet buffer modelled as an MAP/SM/1/b system

Rusek K., Janowski L., Papir Z.

International Journal of Applied Mathematics and Computer Science

|

2014

|

Vol. 24, no. 2

429--442

EN

A packet buffer limited to a fixed number of packets (regardless of their lengths) is considered. The buffer is described as a finite FIFO queuing system fed by a Markovian Arrival Process (MAP) with service times forming a Semi-Markov (SM) process (MAP/SM/1/b in Kendall’s notation). Such assumptions allow us to obtain new analytical results for the queuing characteristics of the buffer. In the paper, the following are considered: the time to fill the buffer, the local loss intensity, the loss ratio, and the total number of losses in a given time interval. Predictions of the proposed model are much closer to the trace-driven simulation results compared with the prediction of the MAP/G/1/b model.

13

Comparison of the Effectiveness of 1D and 2D HMM in the Pattern Recognition

Bobulski J.

Image Processing & Communications

|

2014

|

Vol. 19, no. 1

5--11

EN

The Hidden Markov Model (HMM) is a well-established technique for image recognition and has also been successfully applied in other domains such as speech recognition, signature verification and gesture recognition. The HMM is a widely used mechanism for pattern recognition based on 1D data. For images, one dimension is not satisfactory, because the conversion of two-dimensional data into one-dimensional data loses some information. This paper presents a solution to the problem of 2D data by developing the 2D HMM structure and the necessary algorithms.

14

Pipelined language model construction for Polish speech recognition

Sas J., Żołnierek A.

International Journal of Applied Mathematics and Computer Science

|

2013

|

Vol. 23, no. 3

649--668

EN

The aim of the work described in this article is to elaborate and experimentally evaluate a consistent method of Language Model (LM) construction for Polish speech recognition. In the proposed method we tried to take into account the features and specific problems experienced in practical applications of speech recognition in the Polish language: rich inflection, a loose word order and the tendency for short word deletion. The LM is created in five stages. Each successive stage takes the model prepared at the previous stage and modifies or extends it so as to improve its properties. At the first stage, typical methods of LM smoothing are used to create the initial model. The four most frequently used methods of LM construction are considered here. At the second stage, the model is extended in order to take into account words indirectly co-occurring in the corpus. At the next stage, LM modifications are aimed at reducing short word deletion errors, which occur frequently in Polish speech recognition. The fourth stage extends the model by inserting words that were not observed in the corpus. Finally, the model is modified so as to assure highly accurate recognition of very important utterances. The performance of the methods applied is tested in four language domains.

15

Statistical proper name recognition in Polish economic texts

Marcińczuk M., Piasecki M.

Control and Cybernetics

|

2011

|

Vol. 40, no 2

393-418

EN

In the paper we present a Proper Name Recognition algorithm based on the Hidden Markov Model (HMM). Recognition of Proper Names (PN) is treated as the basis for the Named Entity Recognition problem in general. The proposed method combines a domain-dependent method based on HMM with domain-independent methods based on gazetteers and hand-written rules for recognition and post-processing that capture the general properties of Polish PN structure. A large gazetteer with entries described morphologically was acquired from the web. The HMM re-scoring mechanism was applied as a basis for integration of different knowledge sources in PN recognition. Results of experiments on a domain corpus of Polish stock exchange reports, used for training and testing, are presented. A cross-domain evaluation on two other corpora is also presented. Adaptability of the method was analysed by applying the trained model to two other domain corpora.

16

Ukryte modele Markowa jako metoda eksploracji danych tekstowych

Mazurek M.

Biuletyn Instytutu Systemów Informatycznych

|

2010

|

nr 6

27-31

PL

W eksploracji danych tekstowych z dużym powodzeniem stosuje się probabilistyczne modele dokumentów. W artykule przedstawiony został jeden z podstawowych, dla tej dziedziny informatyki, sposobów reprezentacji dokumentu za pomocą ukrytych modeli Markowa. Przedstawiono definicję ukrytego modelu Markowa oraz sposób wyznaczenia podstawowych wielkości związanych z wykorzystaniem tego modelu, takich jak prawdopodobieństwo wystąpienia obserwowanej sekwencji symboli (słów), wyszukanie najbardziej prawdopodobnej sekwencji stanów procesu, czy też formuły reestymacji parametrów modelu używane w procesie uczenia modelu.

EN

In text mining applications, probabilistic models of documents are widely used. In this paper, Hidden Markov Models are described as a fundamental method for text processing. The definition of the HMM is presented, together with the algorithms for finding the parameters of the model. Some possible applications of HMMs are suggested.
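One of the basic HMM quantities listed in this record's Polish abstract, the probability of an observed sequence of symbols (words), is usually computed with the forward algorithm. The sketch below shows that standard recursion only; the two hidden states, the two-word vocabulary, and all probabilities are illustrative assumptions and are not taken from the paper.

```python
def forward_probability(states, start_p, trans_p, emit_p, observations):
    """Return P(observations) under the HMM using the forward recursion.

    alpha[s] holds the joint probability of the observations seen so far
    and of being in hidden state s at the current position.
    """
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state document model over a two-word vocabulary.
states = ["X", "Y"]
start_p = {"X": 0.6, "Y": 0.4}
trans_p = {"X": {"X": 0.7, "Y": 0.3}, "Y": {"X": 0.4, "Y": 0.6}}
emit_p = {"X": {"a": 0.5, "b": 0.5}, "Y": {"a": 0.1, "b": 0.9}}

p_single = forward_probability(states, start_p, trans_p, emit_p, ["a"])
p_sequence = forward_probability(states, start_p, trans_p, emit_p, ["a", "b", "a"])
```

For long documents the raw probabilities underflow, so practical implementations typically rescale alpha at each step or work with logarithms; that refinement is omitted here for brevity.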

17

Recognizing The Signal Of Speech Using The Hidden Markov Model In The Process Of Controlling The Mechanical Device

Rosik M.

Zeszyty Naukowe. Elektryka / Politechnika Opolska

|

2009

|

z. 62

59-60

EN

In this paper, the problems connected with the production, analysis and recognition of the speech signal are presented. The author briefly presents the theory of Hidden Markov Models and their implementation in a real-world system, namely steering a mechanical device.

18

Predicting access to materialized methods by means of hidden Markov model

Masewicz M., Andrzejewski W., Wrembel W., Królikowski Z.

Control and Cybernetics

|

2009

|

Vol. 38, no 1

127-152

EN

Method materialization is a promising data access optimization technique for multiple applications, including, in particular, object programming languages with persistence, object databases, distributed computing systems, object-relational data warehouses, multimedia data warehouses, and spatial data warehouses. A drawback of this technique is that the value of a materialized method becomes invalid when an object used for computing the value of the method is updated. As a consequence, the materialized value of the method has to be recomputed. The materialized value can be recomputed either immediately after updating the object or just before calling the method. The moment at which the method is recomputed has a strong impact on the overall system performance. In this paper we propose a technique of predicting access to materialized methods and objects, for the purpose of selecting the most appropriate recomputation technique. The prediction technique is based on the Hidden Markov Model (HMM). The prediction technique was implemented and evaluated experimentally. Its performance characteristics were compared to immediate recomputation, deferred recomputation, random recomputation, and our previous prediction technique, called PMAP.

19

Application of deformable grids and hidden Markov models for isolated word recognition from facial image sequences of a speaking person

Nowak H.

Zeszyty Naukowe. Elektryka / Politechnika Łódzka

|

2008

|

z. 115

87-93

EN

The paper reports a method of word recognition using only visual information derived from a video speech recording. A combination of the discriminative deformable grid approach to individual frame analysis with the Hidden Markov Model technique, applied to sequence analysis, is proposed to solve the lip-reading problem. The main research objective was to develop the deformable grid construction method and to extract visual speech characteristics from the mouth images that could be used in speech recognition. The visual speech recognition system has been described. The method of verification, with isolated phone and digit recognition experiments, has also been presented.

PL

Celem badań było opracowanie metody rozpoznawania słów na podstawie sekwencji obrazów twarzy z zarejestrowaną wypowiedzią. Do rozwiązania tak postawionego zadania zaproponowano koncepcję połączenia metody dyskryminacyjnej siatki deformowalnej do analizy pojedynczych klatek video oraz Niejawnych Modeli Markova (HMM) do analizy sekwencji. Głównym przedmiotem badań było opracowanie metody projektowania siatki deformowalnej i ekstrakcji charakterystyk wizualnych mowy na podstawie obrazów ust. Siatka deformowalna jest abstrakcyjną strukturą złożoną z elastycznie połączonych węzłów, które przechowują wartości lokalnej cechy obrazu. Odpowiednio skonstruowana siatka jest wykorzystywana do ekstrakcji deskryptora obrazu ust w procesie jej iteracyjnego dopasowania do obrazu. W przedstawionym systemie zaimplementowano procedury lokalizacji twarzy i ust oraz analizy sekwencji. W pierwszym kroku przetwarzania, siatka deformowalna jest wykorzystana do obliczenia deskryptora obrazu ust dla każdej klatki sekwencji. Uzyskane dane są następnie kodowane i analizowane za pomocą HMM. Podsumowując, zaproponowaną metodę rozpoznawania słów w oparciu jedynie o informację obrazową przetestowano przy użyciu eksperymentów z rozpoznawaniem pojedynczych głosek oraz wypowiadanych cyfr. Metoda może służyć rozpoznawaniu słów z większego słownika lub w systemach rozpoznawania na podstawie obrazu i dźwięku.

20

Zastosowanie metod grupowania sekwencji czasowych w rozpoznawaniu mowy na podstawie ukrytych modeli Markowa

Pałys T.

Biuletyn Instytutu Automatyki i Robotyki

|

2006

|

R. 12, nr 23

113-127

PL

Artykuł dotyczy problemu tworzenia ukrytych modeli Markowa na podstawie zarejestrowanych wypowiedzi. Kluczowym problemem jest tu wyznaczenie zbioru stanów modelu Markowa. Przyjęto, że stany modelu są określone przez skupienia obserwacji. Skupienia te można uzyskać drogą grupowania sekwencji obserwacji sygnału mowy.

EN

The problem of constructing hidden Markov models on the basis of recorded speech is considered in this paper. The key issue is the determination of the set of states of the Markov model. The assumption is that the HMM states are defined by clusters of observations. The clusters may be obtained by clustering observation sequences of the speech signal.