Search results - BazTech

1

Expansion of the area of practical application of the PLC control system with parallel architecture

Tymchuk Sergiy, Piskarev Oleksiy, Miroshnyk Oleksandr, Halko Serhii, Shchur Taras

Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska | 2022 | T. 12, nr 3 | 16--19

EN

The architecture is analyzed, and proposals are made for expanding the area of practical application of parallel-action PLCs. The proposed methodology for constructing a parallel-action logical control automaton, together with the developed models, algorithm, and structures, provides a theoretical platform for the practical implementation of information technology for parallel logical control of railway automation objects.

PL

The architecture was analyzed and an expansion of the area of practical application of parallel-action PLCs was proposed. A methodology for building a parallel-action logical control automaton was proposed, and models, an algorithm, and structures were developed that form a theoretical platform for the practical implementation of information technology for parallel logical control of railway automation objects.

2

FPGA implementation of logarithmic versions of Baum-Welch and Viterbi algorithms for reduced precision hidden Markov models

Pietras M., Klęsk P.

Bulletin of the Polish Academy of Sciences. Technical Sciences | 2017 | Vol. 65, nr 6 | 935--946

EN

This paper presents a programmable system-on-chip implementation for accelerating computations within hidden Markov models (HMMs). High-level synthesis (HLS) and “divide-and-conquer” approaches are presented for parallelizing the Baum-Welch and Viterbi algorithms. To avoid arithmetic underflows, all computations are performed in logarithmic space. Additionally, in order to carry out computations efficiently – i.e. directly in an FPGA system or a processor cache – we propose reducing the floating-point representations of HMMs. We state and prove a lemma about the length of numerically unsafe sequences for such reduced-precision models. Finally, special attention is devoted to the design of a multiple logarithm and exponent approximation unit (MLEAU). Using associative mapping, this unit allows simultaneous conversion of multiple values and thereby compensates for the computational cost of logarithmic-space operations. The design evaluation reveals the absolute stall delay caused by multiple hardware conversions to logarithms and to exponents, and the experimental evaluation reveals HMM computation boundaries related to their probabilities and floating-point representation. The performance differences at each stage of computation are summarized in a comparison between hardware acceleration using the MLEAU and a typical software implementation on an ARM or Intel processor.
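As an illustrative aside on the logarithmic-space idea mentioned in the abstract above, the following minimal Python sketch runs the Viterbi recursion on log-probabilities, so that products become sums and long observation sequences do not underflow. It is not the authors' FPGA design and does not model the MLEAU; the function names `viterbi_log` and `log_matrix`, the two-state toy model, and all numeric values are assumptions made purely for illustration.

```python
import math

def viterbi_log(obs, log_pi, log_A, log_B):
    """Most likely state path for an HMM, computed entirely in log space.

    obs    : list of observation indices
    log_pi : log initial-state probabilities, length N
    log_A  : log transition matrix, log_A[i][j] = log P(next=j | current=i)
    log_B  : log emission matrix, log_B[i][k] = log P(obs=k | state=i)
    Returns (path, log-probability of that path).
    """
    n = len(log_pi)
    # delta[j] = best log-probability of any path ending in state j
    delta = [log_pi[j] + log_B[j][obs[0]] for j in range(n)]
    back = []  # one list of best predecessors per time step after the first
    for o in obs[1:]:
        prev = delta
        ptr = [max(range(n), key=lambda i: prev[i] + log_A[i][j])
               for j in range(n)]
        delta = [prev[ptr[j]] + log_A[ptr[j]][j] + log_B[j][o]
                 for j in range(n)]
        back.append(ptr)
    # Trace back from the best final state
    state = max(range(n), key=lambda j: delta[j])
    best = delta[state]
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best

def log_matrix(m):
    """Element-wise log of a probability matrix; zero maps to -inf."""
    return [[math.log(x) if x > 0 else -math.inf for x in row] for row in m]

# Hypothetical two-state, three-symbol toy model (illustration only)
PI = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
LOG_PI = [math.log(p) for p in PI]
LOG_A = log_matrix(A)
LOG_B = log_matrix(B)
```

Calling `viterbi_log([0, 1, 2], LOG_PI, LOG_A, LOG_B)` returns the decoded path together with its log-probability; the same call on a sequence of several thousand symbols still yields a finite score, whereas multiplying the raw probabilities would underflow to zero.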

3

An Implementation of Procedure for Screen Out the Puffs in the Massively Parallel Architecture

Przybyła T.

Zeszyty Naukowe. Elektryka / Politechnika Opolska | 2013 | z. 69 | 87--88

EN

This paper presents the performance improvement achievable in the Gaussian puff model by parallelizing the puff screen-out procedure. Running the calculations on a massively parallel architecture accelerated the computations associated with eliminating puffs in the CALPUFF model.

4

Accelerated EMF Evaluation Using a SIMD Algorithm

Vilachá C., Otero A. F., Moreira J. C., Míguez E.

Przegląd Elektrotechniczny | 2013 | R. 89, nr 3a | 286--292

EN

This article presents a fast Single-Instruction Multiple-Data (SIMD) algorithm that evaluates electromagnetic fields (EMFs). It is based on the Method of Moments (MOM) adapted for execution on an SIMD architecture. The large speed-up obtained with this new implementation makes it possible to obtain results faster and to simulate more complex and realistic models while keeping the computing time within a reasonable range, which leads to better solutions for current EMF problems. The article gives a brief overview of generic massively parallel processors, considering their hardware architecture and the new computer languages for managing them. We describe the mathematical foundations of the algorithm in order to explain how the operations are distributed and performed by the GPU. Many cases are simulated to analyze the performance of the proposed method, and they are compared with a fully implemented CPU algorithm as well as with another CPU algorithm that uses the Intel MKL solvers for dense matrices. The difference in performance between single-precision and double-precision floating-point numbers is also studied, together with its influence on the accuracy of the results. The tests carried out suggest that the acceleration obtained grows with the complexity of the model. As a result, the proposed algorithm’s only limitations lie with the hardware features.

PL

The article presents a fast Single-Instruction Multiple-Data (SIMD) algorithm for computing the distribution of electromagnetic fields (EMF). It is based on the Method of Moments (MOM) adapted for execution on an SIMD architecture. The large speed-up obtained with this new implementation makes it possible to obtain results faster, for more complex simulations and more realistic models, while keeping the computing time within a reasonable range. The article contains a brief overview of massively parallel processors, considering their hardware architecture and the new programming languages for managing them. The mathematical foundations of the algorithm are described to explain how the operations are performed by the graphics processing unit (GPU). Many cases were simulated to analyze the performance of the proposed method, and the results were compared with known algorithms as well as with an algorithm that uses the Intel MKL solvers for dense matrices. The tests show that the speed-up obtained grows with the complexity of the model. The only limitations of the algorithm depend on the hardware capabilities.

5

Implementacja sztucznej sieci neuronowej w architekturze równoległej z wykorzystaniem protokołu MPI [Implementation of an artificial neural network in a parallel architecture using the MPI protocol]

Bartecki K., Czorny M.

Pomiary Automatyka Kontrola | 2011 | R. 57, nr 6 | 638-640

PL

The article points out certain aspects of implementing a feed-forward neural network in a parallel architecture using the MPI message-passing standard. The presented example application of the network concerns the classic problem of function approximation. The effect of the number of launched processes on the efficiency of the network's learning and operation was examined, and the negative effect of delays arising when data are sent over a LAN was demonstrated.

EN

The paper investigates some characteristic features of implementing a feed-forward neural network on a parallel computer architecture using the MPI communication protocol. Two fundamental methods of neural network parallelization are described: neural (Fig. 1) and synaptic (Fig. 2) parallelization. Based on these methods, an original application implementing a feed-forward multilayer neural network was built. The application includes a Java runtime interface (Fig. 3) and a computational module based on the MPI communication protocol. The simulation tests applied the neural network to the classical problem of nonlinear function approximation. The effect of the number of processes on the network learning efficiency was examined (Fig. 4, Tab. 1). The negative effect of transmission delays in the LAN is also demonstrated. The authors conclude that the computational advantages of parallelizing neural networks on a heterogeneous cluster of several personal computers become apparent only for very complex neural networks composed of many thousands of neurons.
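To make the neuron-level ("neural") parallelization named in the abstract above more concrete, here is a minimal Python sketch in which the neurons of one fully connected layer are split into contiguous slices, each slice is evaluated independently, and the partial outputs are concatenated. This is only a single-process simulation under stated assumptions: the loop over slices stands in for MPI processes, the concatenation stands in for a gather step, and the function names, the tanh activation, and the slicing scheme are hypothetical choices, not taken from the authors' application.

```python
import math

def layer_forward(weights, biases, x):
    """Serial reference: tanh(W x + b) for one fully connected layer."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def split_range(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, near-equal slices."""
    base, extra = divmod(n_items, n_parts)
    bounds, start = [], 0
    for p in range(n_parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def layer_forward_neuron_parallel(weights, biases, x, n_procs):
    """Neuron-level partitioning: each simulated process owns a slice of neurons.

    Every slice needs the full input vector x but only its own rows of the
    weight matrix. The loop below stands in for independent processes; joining
    the partial results stands in for a gather of the layer output.
    """
    out = []
    for start, stop in split_range(len(weights), n_procs):
        out.extend(layer_forward(weights[start:stop], biases[start:stop], x))
    return out
```

In a real message-passing setting, every layer would require the full activation vector to be exchanged between processes, which is consistent with the abstract's observation that LAN transmission delays can outweigh the computational gain for small networks.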

6

Architektury sprzętowych rdzeni dedykowanych do algorytmu pasowania bloków : nowe wyniki [Architectures of hardware cores dedicated to the block-matching algorithm: new results]

Mroczek K., Modelski J.

Elektronizacja : podzespoły i zastosowania elektroniki | 2003 | Vol. 23, nr 3 | 3-7

PL

The article describes a new architecture of a hardware accelerator dedicated to the full-search block-matching algorithm. The architecture is characterized by high computational throughput, low memory requirements, scalability, and the ability to work with blocks and search windows of variable sizes. Computational efficiency is 100% and is maintained even when consecutive search windows have no common area. The architecture can be used to implement motion estimation subsystems in image sequence encoders and in pattern-matching applications.

EN

In this paper, a new dedicated hardware architecture for a full-search block-matching (FSBM) processor core is presented. Based on a 2D SAD array, the architecture is easily scalable to handle different numbers of PEs, search ranges, and block sizes with 100% efficiency, simple control, simple data flow, and low memory requirements. Additionally, the architecture spends no idle time when two consecutively processed search areas have no common region. Compared to previously proposed FSBM architectures, this architecture is favourable in terms of both flexibility and throughput.
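For readers unfamiliar with full-search block matching, the following minimal Python sketch shows the computation that such a core accelerates: for one block of the current frame, every displacement inside the search range is tested and the one with the smallest sum of absolute differences (SAD) is kept. It is a plain software reference under stated assumptions, not the hardware architecture described above; the function names, the frame layout (lists of rows), and the tie-breaking rule (first minimum wins) are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
               for r in range(n) for c in range(n))

def full_search(cur, ref, bx, by, n, search):
    """Exhaustively test every displacement in [-search, search] x [-search, search]
    that keeps the candidate block inside `ref`.

    Returns (best_dx, best_dy, best_sad), or None if no candidate fits.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame
            if x < 0 or y < 0 or x + n > w or y + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Each candidate displacement is independent of the others, and so is each pixel difference inside a SAD, which is what makes the problem a natural fit for an array of processing elements.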