Search results - BazTech

1

Simulator of a Supercomputer Job Management System as a Scientific Service

Savin Gennadiy, Shabanov Boris, Lyakhovets Dmitriy, Baranov Anton, Telegin Pavel

Annals of Computer Science and Information Systems | 2020 | Vol. 21 | 413--416

EN

A job management system (JMS) is an important part of any supercomputer. The JMS creates a schedule for launching the jobs of different users. Real job management systems are complex software systems with many settings. These settings have a significant impact on various JMS metrics, such as supercomputer resource utilization and the mean waiting time of a job in the queue. JMS simulators are widely used to study how JMS settings or modifications, new scheduling algorithms, the parameters of the job input stream or the available computing resources influence JMS efficiency metrics. The article presents the results of a comparative analysis of current JMS simulators (Alea, ScSF, Batsim, AccaSim, Slurm simulator) and their application areas. The authors consider new ways to use a JMS simulator as a scientific service for researchers. With such a service, researchers can study various hypotheses about JMS efficiency, algorithms or parameters. This gives the following: (1) research is performed on the service side around the clock, (2) the accuracy or adequacy of the simulator is ensured by the service, (3) the reproducibility of research results is ensured, and the simulator-as-a-service becomes a single entry point for researchers.
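To make the two metrics named in the abstract concrete, here is a minimal sketch of a job queue simulation. It is not taken from any of the listed simulators: the `Job` fields, the `simulate_fcfs` function and the strict first-come-first-served policy without backfilling are all assumptions chosen for brevity.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    submit: float   # time the job enters the queue
    runtime: float  # how long it runs once started
    nodes: int      # number of nodes it occupies


def simulate_fcfs(jobs, total_nodes):
    """Strict first-come-first-served scheduling, no backfilling (assumed policy).

    Returns (mean waiting time, node utilization over the makespan).
    """
    ordered = sorted(jobs, key=lambda j: j.submit)
    if not ordered:
        return 0.0, 0.0
    running = []                      # min-heap of (end_time, nodes)
    free = total_nodes
    clock = ordered[0].submit
    first_submit = clock
    last_end = clock
    total_wait = 0.0
    busy_node_time = 0.0
    for job in ordered:
        if job.nodes > total_nodes:
            raise ValueError("job requests more nodes than the machine has")
        clock = max(clock, job.submit)
        # Return the nodes of jobs that have already finished.
        while running and running[0][0] <= clock:
            free += heapq.heappop(running)[1]
        # Not enough free nodes: advance the clock to the next completions.
        while free < job.nodes:
            end, nodes = heapq.heappop(running)
            clock = max(clock, end)
            free += nodes
        total_wait += clock - job.submit
        free -= job.nodes
        end = clock + job.runtime
        heapq.heappush(running, (end, job.nodes))
        last_end = max(last_end, end)
        busy_node_time += job.runtime * job.nodes
    makespan = last_end - first_submit
    utilization = busy_node_time / (total_nodes * makespan) if makespan > 0 else 0.0
    return total_wait / len(ordered), utilization


# Hypothetical three-job input stream on a two-node machine.
jobs = [Job(0, 10, 2), Job(0, 5, 2), Job(1, 5, 1)]
mean_wait, utilization = simulate_fcfs(jobs, total_nodes=2)
print(mean_wait, utilization)
```

Changing the policy, the node count or the input stream and comparing the two returned numbers is the kind of experiment the abstract describes, at toy scale.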

2

Parallel implementation of a PIC simulation algorithm using OpenMP

Suciu Alin, Hangan Anca, Marginean Anca, Joldos Marius, Voitcu Gabriel, Echim Marius

Annals of Computer Science and Information Systems | 2020 | Vol. 21 | 381--385

EN

Particle-in-cell (PIC) simulations follow the individual trajectories of a very large number of particles in self-consistent and external electric and magnetic fields; they are widely used, for example, in the study of plasma jets. The main disadvantage of PIC simulations is the long simulation runtime, which often requires a parallel implementation of the algorithm. The paper focuses on a PIC1d3v simulation algorithm and describes the successful implementation of a parallel version on a multi-core architecture using OpenMP, with very promising experimental and theoretical results.
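The abstract does not say how the particle update is done, so the following is only a sketch of why the particle loop of a 1d3v code parallelizes well. It assumes the standard Boris push, uniform fields E and B (a real PIC code interpolates them from the grid), and uses Python's `ThreadPoolExecutor` as a stand-in for an OpenMP parallel loop; `boris_push` and `push_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def boris_push(x, v, E, B, qm, dt):
    """Advance one particle by dt: x is the 1-D position, v a 3-vector.

    qm is the charge-to-mass ratio; E and B are assumed uniform here.
    """
    half = 0.5 * qm * dt
    v_minus = tuple(vi + half * Ei for vi, Ei in zip(v, E))       # first half electric kick
    t = tuple(half * Bi for Bi in B)
    t2 = sum(ti * ti for ti in t)
    s = tuple(2.0 * ti / (1.0 + t2) for ti in t)
    v_prime = tuple(a + b for a, b in zip(v_minus, cross(v_minus, t)))
    v_plus = tuple(a + b for a, b in zip(v_minus, cross(v_prime, s)))  # magnetic rotation
    v_new = tuple(vi + half * Ei for vi, Ei in zip(v_plus, E))    # second half electric kick
    return x + v_new[0] * dt, v_new


def push_all(particles, E, B, qm, dt, workers=4):
    """Push every particle; each chunk is independent of the others."""
    chunk = max(1, (len(particles) + workers - 1) // workers)
    chunks = [particles[i:i + chunk] for i in range(0, len(particles), chunk)]

    def work(part):
        return [boris_push(x, v, E, B, qm, dt) for x, v in part]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(work, chunks)
    return [p for part in results for p in part]


particles = [(0.0, (1.0, 0.0, 0.0)), (1.0, (0.0, 2.0, 0.0))]
print(push_all(particles, E=(0.0, 0.0, 0.0), B=(0.0, 0.0, 1.0), qm=1.0, dt=0.1))
```

Because no particle reads another particle's state during the push, the loop can be split across cores without synchronization. In CPython the threads above only illustrate the decomposition and do not give a real speed-up for this CPU-bound work.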

3

Agoge - zintegrowane środowisko programistyczne dla OpenCL C [Agoge: an integrated development environment for OpenCL C]

Bugała M.

Szybkobieżne Pojazdy Gąsienicowe | 2017 | No. 3 (45) | 49--65

PL

The article presents the capabilities and technological solutions of Agoge, an integrated development environment dedicated to OpenCL technology. It discusses how the creation of the OpenCL context and the handling of kernel arguments and of the computational grid are automated, and presents possible uses of the environment in numerical computation and image analysis.

EN

The article presents the capabilities and technological solutions of Agoge, an integrated development environment dedicated to OpenCL. The automation of OpenCL context creation and of the handling of kernel arguments and the computational grid is discussed, and the potential for using the environment in numerical calculations and image analysis is presented.

4

Zastosowanie obliczeń równoległych do klasyfikacji punktów overlap [Application of parallel computing to the classification of overlap points]

Bratuś R., Musialik P., Pióro P., Prochaska M., Rzonca A.

Archiwum Fotogrametrii, Kartografii i Teledetekcji | 2017 | Vol. 29 | 11--26

PL

The publication discusses novel methods for solving a technologically important problem: the classification of overlap points, that is, points in the double-coverage band between adjacent scanning strips. The presented approach is based on an efficient method of parallel computing on graphics processors (GPUs), which makes it possible to apply a more advanced algorithm during data analysis and processing. To check performance, the tool under study for overlap point classification was tested, and the results were compared with the capabilities of the widely used Terrascan software from Terrasolid. The proposed computational innovations are intended to improve the quality of scanning data acquired with flying platforms such as light aircraft and gyrocopters. Improving the quality of overlap point classification requires two preliminary processing stages. The first trims the edges of a strip strictly according to a given angle from the vertical; this approach gives more regular results than other methods. The second, based on a point-thinning algorithm, removes redundant scan profiles. The proposed solution is the classification of overlap points according to the angle of incidence of the scanner beam on the terrain and terrain objects. In summary, the research described here revised the methods used so far for overlap point classification. Drawing on practical comments and suggestions from contractors, a number of improvements were introduced, and their presentation and discussion are the subject of this publication.

EN

The paper presents innovative methods for solving an important technological problem: the classification of LiDAR points located in the overlapping area between two parallel scan strips. The presented approach is based on an efficient method of parallel computation on graphics processors, which allows more sophisticated algorithms to be applied to data analysis and processing. The algorithms were tested to verify the assumption that the innovative solutions presented in the paper can increase the efficiency and correctness of the data compared with well-known and popular technological solutions. The suggested computational innovations are applied to increase the quality of LiDAR data acquired by light airplanes and gyrocopters. Two approaches to increasing the quality of the classification of overlapping points have been proposed. The first cuts off the points at the strip borders strictly according to a defined angle measured from the vertical direction. The second thins the points to obtain a regular density in the resulting point cloud. The main subject of the paper is the classification of overlapping points according to the angle of incidence on the terrain and other objects, which requires calculating the normal vector for each scan point. This solution increases the quality of overlap classification and guarantees high efficiency thanks to parallel computation. In conclusion, three innovative approaches were tested during the research and reviewed against commonly used methods. The research confirmed that parallel computation can improve the quality and reduce the processing time of overlap classification.

5

Hadoop, narzędzie technologii Big Data i jego aplikacje [Hadoop, a Big Data technology tool, and its applications]

Nowakowski K., Nowakowski W.

Elektronika : konstrukcje, technologie, zastosowania | 2016 | Vol. 57, No. 3 | 33--36

PL

Big Data is one of the most important challenges of contemporary computer science. Given the massive influx of large amounts of information from various sources today, new data analysis techniques and technological solutions are needed. Hadoop software is an important tool in Big Data.

EN

Big Data is a term frequently used in the literature, but there is still no consensus on how such environments should be implemented. Hadoop is an important software tool in Big Data. There are many tools and technologies in this area. This paper reviews Big Data technologies.

6

Ocena szybkości i efektywności obliczeń wybranych systemów komputerowych w zakresie obciążeń impulsowych [Evaluation of the computing speed and efficiency of selected computer systems for impulse loads]

Panowicz R.

Modelowanie Inżynierskie | 2015 | Vol. 26, No. 57 | 47--53

PL

The article presents the results of a study of computing speed and efficiency as a function of the number of computations run simultaneously and the number of cores on which they run. Typical dynamics cases are considered for selected computer systems. Results of speed and efficiency analyses of the computing systems are presented for the Taylor test and for a deflector loaded by a pressure wave from the detonation of an explosive, modelled with the ConWep function. Results are also presented for the case used to evaluate the fastest computers. For the Taylor test and the pressure-wave loading of the deflector, the influence of the number of elements on computing speed was also examined.

EN

The paper presents results on computation speed and efficiency as a function of the number of calculations performed at the same time and the number of cores on which they are carried out. Typical cases of numerical analysis of dynamic phenomena are considered for selected computer systems. The results of speed and efficiency analyses of computing systems are presented for the Taylor test and for a blast wave interacting with a deflector; the ConWep function is used to model the blast wave. The paper also presents results for the case used to evaluate the fastest computers. For the Taylor test and the blast wave interacting with the structure, the influence of the number of elements on the speed of calculations is also examined.
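The abstract does not define its speed and efficiency measures. A common convention, assumed here, is speed-up as the ratio of one-core time to n-core time, and efficiency as speed-up divided by the number of cores. The timings below are hypothetical and are not results from the paper.

```python
def speedup(t_one_core, t_n_cores):
    """Ratio of the one-core run time to the n-core run time."""
    return t_one_core / t_n_cores


def efficiency(t_one_core, t_n_cores, cores):
    """Speed-up per core; 1.0 means perfect scaling."""
    return speedup(t_one_core, t_n_cores) / cores


# Hypothetical wall-clock times in seconds, NOT taken from the paper.
timings = {1: 120.0, 2: 64.0, 4: 40.0, 8: 30.0}
for cores, t in timings.items():
    print(cores,
          round(speedup(timings[1], t), 2),
          round(efficiency(timings[1], t, cores), 2))
```

Efficiency that falls as cores are added, as in these made-up numbers, is the usual sign that part of the workload does not scale.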

7

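Zastosowanie metod przetwarzania równoległego do prognozowania stref zjawisk atmosferycznych niebezpiecznych dla transportu lądowego [Application of parallel processing methods to forecasting zones of atmospheric phenomena hazardous to land transport]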

Chaładyniak D., Jasiński J., Krawczyk K., Pietrek S., Winnicki I.

Logistyka | 2015 | No. 4 | 2747--2754, CD2

PL

The article presents an original proposal for using parallel processing methods to analyse and forecast selected meteorological phenomena and elements that are hazardous to land transport. The study uses the Q-vector method, which serves to determine the areas where vertical currents occur and to determine their direction and intensity. The frontogenetic function calculated from the Q-vector method identifies areas of frontogenesis and frontolysis, that is, the zones where the atmospheric fronts that shape the weather form and decay. Processing data from numerical models and acquiring and interpreting satellite images require efficient computing systems. Building computers with parallel structures appears to be the right direction of development. The philosophy of parallel processing is to divide a program into fragments, each executed by a different processor, so that the whole operation is shortened in proportion to the number of computing units used. The effectiveness of forecasting the processes that shape the weather depends both on appropriate methodologies for determining frontogenetic and frontolytic areas and on methods for processing them quickly, effectively and reliably.

EN

The paper presents an original concept for applying parallel processing methods to the analysis and forecasting of some meteorological phenomena and elements that are hazardous for land transportation. The research is based on the Q-vector method, which is used to determine the zones where vertical air currents occur, together with their intensity and direction. The frontogenetic function computed using the Q-vector method determines areas of frontogenesis and frontolysis, i.e. the areas where atmospheric fronts form or dissipate. Processing data from numerical weather prediction models and interpreting satellite imagery require high-performance computing systems. Building computers with parallel structures seems to be the right direction of development. The philosophy of parallel processing is to divide the program code into fragments, each of which is executed by a different processor, so that the time needed for the whole operation is reduced in proportion to the number of computational units. The effectiveness of forecasting the processes that shape the weather is determined both by an appropriate methodology for determining the frontogenetic and frontolytic areas and by methods for their fast, efficient and reliable processing.
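The proportional reduction in time described in the abstract corresponds to the ideal case of Amdahl's law, which is not cited in the abstract and is added here only as a worked illustration. The symbols are assumptions: T_1 is the run time on one processor, T_p the run time on p processors, and f the fraction of the work that can be parallelized.

```latex
S(p) = \frac{T_1}{T_p} = \frac{1}{(1 - f) + \dfrac{f}{p}},
\qquad
S(p)\big|_{f=1} = p,
\qquad
S(8)\big|_{f=0.9} = \frac{1}{0.1 + 0.1125} \approx 4.7
```

With f = 1 the speed-up equals the number of computational units, as the abstract states; with 10% serial work, eight units give a speed-up of about 4.7 rather than 8.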

8

Optymalizacja czasu obliczeń symulacji elektromagnetycznych prowadzonych metodą FDTD [Optimization of computing time for electromagnetic simulations carried out with the FDTD method]

Sypniewski M., Celuch M.

Elektronika : konstrukcje, technologie, zastosowania | 2015 | Vol. 56, No. 7 | 30--34

PL

The article discusses the optimal choice of computer hardware for electromagnetic simulations using the FDTD method, and the optimization of simulator code to make the best use of the available computer systems. It outlines worldwide trends in research on these problems and shows practical solutions introduced by the authors into the code of the QW-3D and QW-V2D simulators.

EN

The paper discusses the optimal choice of computer hardware for electromagnetic simulations using the FDTD method, as well as the optimization of FDTD codes to obtain their best performance on the particular hardware available. Trends in worldwide research on these problems are outlined. Specific practical solutions applied by the authors in the QW-3D and QW-V2D simulators are also presented.

9

Analysis of parallelisation of 3D-CEMBS model using technologies like OpenACC and OpenMP

Piotrowski P.

Biuletyn Instytutu Morskiego w Gdańsku | 2015 | Vol. 30, No. 1 | 10--15

EN

Oceanographic models utilise parallel computing techniques to increase their performance. Computer hardware constantly evolves and software should follow to better utilise modern hardware potential. The number of CPU cores with access to shared memory increases with hardware evolution. To fully utilise the possibilities new hardware presents, parallelisation techniques employed in oceanographic models, which were designed with distributed memory systems in mind, have to be revised. This research focuses on analysing the 3D-CEMBS model to assess the feasibility of using OpenMP and OpenACC technologies to increase performance. This was done through static code analysis and profiling. The findings show that the main performance problems are attributed to task decomposition that was designed with distributed memory systems in mind. To fully utilise modern shared memory systems, other task decomposition strategies need to be employed. The presented 3D-CEMBS model analysis is a first stage in wider research of oceanographic models as a specific class of parallel applications. In the long term the research will result in proposing design patterns tailored for oceanographic models that would exploit their characteristics to achieve better hardware utilisation on evolving hardware architectures.

PL

Oceanographic models use parallel processing to increase performance. Computer hardware keeps evolving, so software should change with it to make full use of the potential of modern hardware. As hardware develops, the number of processor cores with access to shared memory increases. To make full use of the capabilities of new hardware, the parallelisation techniques used in oceanographic models must be revised. Oceanographic models were often designed with distributed memory systems in mind. This research focuses on analysing the 3D-CEMBS model for the possibility of using OpenMP and OpenACC technologies to raise the model's performance. To this end, static analysis of the model code and profiling were carried out. The results show that the model's main performance problem results from a task decomposition intended for distributed memory systems. To make full use of modern shared-memory computers, other task decomposition strategies should be introduced.

10

Obliczenia równoległe z wykorzystaniem GPU [Parallel computing using GPUs]

Schubring T.

Logistyka | 2014 | No. 6 | 9405--9412

PL

The article presents selected aspects of parallel processing in systems built on CPUs (Central Processing Units) and GPUs (Graphics Processing Units). It describes the CUDA (Compute Unified Device Architecture) parallel computing architecture and the problems associated with its use and with copying data from CPU RAM to GPU RAM. It discusses the general concept of a program and of a kernel in C++ built in accordance with the CUDA architecture and referring to grids, blocks and threads. Examples of graphics chips intended for HPC (High Performance Computing) parallel processing are presented. Sample applications of GPGPU (General-Purpose computation on Graphics Processing Units) are indicated. A list of fields and examples of applications that use GPGPU is included.

EN

The paper presents selected aspects of parallel processing in systems built on the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit). It describes the CUDA (Compute Unified Device Architecture) parallel computing architecture and the problems associated with its use and with copying data from CPU RAM to GPU RAM. It discusses the general concept of a program and of a kernel in C++ constructed in accordance with the CUDA architecture and referring to grids, blocks and threads. Examples of graphics chips for HPC (High Performance Computing) parallel processing are presented. Sample uses of GPGPU (General-Purpose computation on Graphics Processing Units) are indicated. A list of fields and examples of applications using GPGPU is provided.
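The relationship between grids, blocks and threads mentioned in the abstract can be sketched without a GPU. The code below is a plain-Python emulation, not CUDA code and not taken from the paper: `launch` and `vector_add` are hypothetical names, and the two loops run sequentially what a GPU would run in parallel. What it does reproduce is the usual one-dimensional indexing rule, global index = block index * block size + thread index, with a bounds guard for the last partially filled block.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Emulate a 1-D kernel launch: call the kernel once per (block, thread)."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out, n):
    """Each emulated thread handles at most one element."""
    i = block_idx * block_dim + thread_idx   # global element index
    if i < n:                                # threads past the end do nothing
        out[i] = a[i] + b[i]


n = 10
a = list(range(n))
b = [2 * x for x in a]
out = [0] * n
block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # round up so every element is covered
launch(vector_add, grid_dim, block_dim, a, b, out, n)
print(grid_dim, out)
```

Here three blocks of four threads cover ten elements, and the guard keeps the two surplus threads of the last block from writing out of range.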

11

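Redukcja czasu wykonania algorytmu Cannego dzięki zastosowaniu połączenia OpenMP z technologią NVIDIA CUDA [Reducing the execution time of the Canny algorithm by combining OpenMP with NVIDIA CUDA technology]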

Sychel D.

Zeszyty Naukowe Wydziału Elektroniki i Informatyki Politechniki Koszalińskiej | 2013 | No. 5 | 103--113

PL

The article presents an alternative approach to parallel programming that uses programmable graphics cards to support computation, and combines this approach with classical parallelization based on multi-core processors. The tests show the time gain that can be obtained by suitably combining OpenMP with CUDA technology in computations for edge detection in a raster image using the Canny algorithm. The tests were carried out on hardware of varying quality. The algorithms written are compatible with CC 1.0 (the compute capability of the graphics card).

EN

This paper presents an alternative approach to parallel programming that uses programmable graphics cards to support calculations, and combines this approach with classical parallelization based on multi-core processors. The tests show the time gain that can be achieved by combining OpenMP with CUDA technology in edge detection on a raster image using the Canny algorithm. Tests were carried out on hardware of varying quality. The algorithms are compatible with CC 1.0 (the compute capability of the graphics card).

12

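Porównanie metod programowania gniazd sieciowych i ich wydajności w systemach iSERIES oraz X86 [Comparison of network socket programming methods and their performance on iSeries and x86 systems]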

Adamczyk B., Zghidi H.

Studia Informatica | 2013 | Vol. 34, No. 3 | 37--58

PL

In recent years, clusters and various kinds of computer systems composed of many cheaper computers have become increasingly popular. It is therefore worth considering what advantages advanced server architectures such as IBM Power Series offer, particularly in the context of distributed network services. In this article we focus on how IBM iSeries computers running the i5/OS operating system handle network requests, and compare their performance with computers based on the x86 architecture and Linux. We examine how the techniques for programming basic network services differ between these platforms, and how the processor architecture and the operating system itself affect their performance.

EN

Clusters and other complex computer systems composed of many cheaper computers are becoming very popular. It is therefore worth analyzing what advantages advanced server architectures such as IBM Power Series bring, especially in the context of distributed network services. In this article we focus on the way an IBM iSeries computer with the i5/OS operating system services network requests, and compare its performance with a standard x86 computer running Linux. We summarize the differences between the methods of programming network services on both platforms and analyze the influence of the processor and the operating system on their performance.

13

A Parallel Branch and Bound Approach to Optimal Power Flow with Discrete Variables

Moreira J. C., Míguez E., Vilachá C., Otero A. F.

Przegląd Elektrotechniczny | 2013 | Vol. 89, No. 3a | 47--52

EN

An optimal power flow (OPF) with discrete variables is a non-convex, nonlinear combinatorial problem. Usually the discrete variables present in an OPF are treated as continuous variables. The solution obtained using this method is clearly infeasible, but it is considered to be close to the discrete real solution and can be attained easily without producing an excessive degradation in optimality. These hypotheses can easily be refuted by demonstrating the need for a more robust general mechanism for treating the discrete variables in the OPF. Finding the exact solution is intractable due to the high computing cost it requires--this fact causes the heuristic techniques to be seen as a natural way to obtain good solutions quickly. This article presents an algorithm based on a branch and bound technique that, with the help of the parallel computing power a personal computer (PC) provides, allows pseudo-optimal solutions to be attained with good calculating times. The numerical results obtained by applying the technique proposed in IEEE networks of 118 and 300 nodes and a real size network derived from the Spanish transport network, demonstrate that the algorithm proposed has good execution times, provides solutions close to the optimum, and naturally manages the infeasibilities that are produced during the process.

PL

Computing the optimal power flow with discrete variables is a nonlinear combinatorial problem. The discrete variables are usually treated as continuous ones. This method makes it impossible to obtain an exact result in an acceptable time. Hence many heuristic techniques have emerged that allow a result of good accuracy to be obtained quickly. The article presents an algorithm based on the branch and bound technique which, using parallel computing on modern PCs, allows a sub-optimal result to be obtained quickly. The algorithm was applied to IEEE networks of 118 and 300 nodes and to a real network, yielding short computing times and results close to the optimum.
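The interplay between a continuous relaxation and branch and bound can be shown on a much smaller discrete problem than optimal power flow. The sketch below uses the 0/1 knapsack problem as a stand-in; it is not the authors' algorithm, it is sequential, and `knapsack_bb` is a hypothetical name. The bound treats the remaining discrete choices as continuous (fractional items allowed), and a branch is pruned when even that optimistic bound cannot beat the best discrete solution found so far.

```python
def knapsack_bb(values, weights, capacity):
    """Best total value of a 0/1 knapsack, found by branch and bound.

    Assumes all weights are positive.
    """
    # Sort by value density so the fractional relaxation is easy to compute.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    n = len(items)
    best = 0

    def bound(k, value, room):
        """Optimistic bound: fill the remaining room, allowing a fractional item."""
        for v, w in items[k:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def branch(k, value, room):
        nonlocal best
        if value > best:
            best = value
        if k == n or bound(k, value, room) <= best:
            return                                  # leaf reached or branch pruned
        v, w = items[k]
        if w <= room:
            branch(k + 1, value + v, room - w)      # take item k
        branch(k + 1, value, room)                  # skip item k

    branch(0, 0, capacity)
    return best


print(knapsack_bb([60, 100, 120], [10, 20, 30], 50))
```

The two recursive calls explore independent subtrees, which is what makes branch and bound a natural candidate for parallel execution: subtrees can be handed to different cores as long as the best-known value is shared between them.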

14

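Analiza implementacji parametryzującego filtru klasy Volterry-Wienera na platformie CUDA [Analysis of an implementation of a Volterra-Wiener class parametrizing filter on the CUDA platform]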

Biernacki P.

Elektronika : konstrukcje, technologie, zastosowania | 2013 | Vol. 54, No. 9 | 116--118

PL

The article compares the performance of implementations of a nonlinear parametrizing filter of the Volterra-Wiener class on a single-processor CPU platform and a multiprocessor GPU platform. The influence of changes in the filter parameters on its operating speed was examined on both platforms. Parallelizing the computations is intended to allow the filter to run faster or its parameters to be changed freely during operation.

EN

The article compares the performance of implementations of a nonlinear parametrization filter of the Volterra-Wiener class on a multicore CPU platform and a multiprocessor GPU platform. The influence of changes in the filter parameters on its running speed was examined on both platforms. Parallel calculation makes it possible to speed up the filter or to change the filter parameters on-line.

15

Ewolucja ISA – wierzchołek góry lodowej [The evolution of ISA: the tip of the iceberg]

Komorowski W.

Zeszyty Naukowe Dolnośląskiej Wyższej Szkoły Przedsiębiorczości i Techniki. Studia z Nauk Technicznych | 2012 | No. 1 | 73--94

PL

The instruction set, the main attribute of any computer's architecture, has changed with the available technology and users' requirements. The article describes several ISA (Instruction-Set Architecture) designs that were key in the history of computing and points out the conditions prevailing when they were created. It presents the reasons for the CISC-RISC change of design paradigm in the 1980s. It characterizes the essence of parallel processing, from pipelining through superscalar execution and VLIW organizations to massively parallel processing in today's supercomputers.

EN

Instruction-set architecture is determined by many factors, such as technology and users' demands. The evolution of the ISA is illustrated with several examples that are milestones in computing history: EDSAC, VAX and Berkeley RISC. The CISC-RISC turning point in the architecture paradigm in the early 1980s is explained. A short characterization of parallel processing is given, from pipelining through superscalar and VLIW processors up to petaflops supercomputers using the Massively Parallel Processing technique.

16

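Realizacja przetwarzania w chmurze obliczeniowej na przykładzie systemu uczenia sieci neuronowej opartego na technologii Microsoft Windows Azure [Implementing cloud computing: the example of a neural network training system based on Microsoft Windows Azure technology]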

Augustyn D., Badura K.

Studia Informatica | 2012 | Vol. 33, No. 2A | 49--66

PL

The article presents the design of a neural network training system based on the concept of cloud computing. The implementation of the system is based on Microsoft Windows Azure technology. The solution uses a well-known learning algorithm, the error backpropagation method, adapted to distributed execution. A system architecture is proposed that uses cooperating processes (instances) of the WorkerRole type. The paper shows how to use the different data storage methods available through the Windows Azure Table, Queue and Blob Storage mechanisms. Parallel processing in the system is ensured not only by the use of many WorkerRole processes but also by the use of the Parallel Extension for .NET module in the implementation of the WorkerRole code.

EN

The paper presents a system for neural network learning based on the idea of cloud computing. The system implementation uses Microsoft Windows Azure technology. A well-known learning algorithm, the backpropagation method, was adapted for parallel and distributed execution. An architecture of cooperating WorkerRole processes is proposed. The paper describes the use of data storage methods such as Windows Azure Table, Queue and Blob. The advantages of parallelization result both from applying multiple WorkerRole processes (instances) and from applying the Parallel Extension for .NET module in the WorkerRole implementation.

17

Parallel implementation of exact formulae for magnetic field of axisymmetric current distributions

Krawczyk Z., Starzyński J.

Przegląd Elektrotechniczny | 2012 | Vol. 88, No. 11a | 56--58

EN

This paper discusses a parallel implementation of a Python program that computes the magnetic induction of cylindrical coils. The speed-ups obtained with two Python libraries, MPI4Py and Parallel Python, are compared. The use of exact analytical expressions and their parallel implementation makes it possible to achieve a computational speed appropriate for practical applications, significantly better than with general numerical methods such as FEM.

PL

The article presents a parallel implementation in Python of the computation of the magnetic induction of cylindrical coils and a figure-eight coil. It compares the speed-up of the program achieved with two libraries that make it possible to parallelize its code: MPI4Py and Parallel Python. The use of exact analytical expressions and their parallel implementation makes it possible to achieve a computing speed appropriate for practical applications, significantly higher than with general-purpose numerical methods such as the finite element method.
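The abstract does not reproduce the exact formulae used, so the sketch below substitutes the simplest exact axisymmetric result: the on-axis field of a circular current loop, B_z = mu0 * I * R^2 / (2 * (R^2 + d^2)^(3/2)), summed over a stack of loops as a rough model of a cylindrical coil. The function names are hypothetical, and the standard-library `ThreadPoolExecutor` stands in for MPI4Py or Parallel Python purely to show that each observation point can be evaluated independently.

```python
import math
from concurrent.futures import ThreadPoolExecutor

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (approximate value)


def loop_bz_on_axis(z, radius, current, z0=0.0):
    """On-axis field of one circular loop centred at z0 (exact formula)."""
    d = z - z0
    return MU0 * current * radius ** 2 / (2.0 * (radius ** 2 + d ** 2) ** 1.5)


def coil_bz_on_axis(z, radius, current, length, turns):
    """Model a coil as `turns` identical loops spread evenly over `length`."""
    step = length / turns
    centers = [-length / 2 + (k + 0.5) * step for k in range(turns)]
    return sum(loop_bz_on_axis(z, radius, current, c) for c in centers)


def field_profile(zs, radius, current, length, turns, workers=4):
    """Evaluate the field at many points; each point is an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda z: coil_bz_on_axis(z, radius, current, length, turns), zs))


zs = [-0.05, 0.0, 0.05]
print(field_profile(zs, radius=0.1, current=1.0, length=0.2, turns=10))
```

Because every observation point needs only closed-form arithmetic and no shared mesh, the work divides cleanly across processes, which is the property the abstract relies on. With CPython threads the example shows the decomposition only; a process-based or MPI backend would be needed for an actual speed-up.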

18

Realizacja urządzeń automatyki elektroenergetycznej na bazie układów FPGA [Implementation of power system automation devices based on FPGA devices]

Niklas P.

Pomiary Automatyka Kontrola | 2012 | Vol. 58, No. 1 | 84--87

PL

The article describes the use of an FPGA device to implement an automatic generator synchronization system. Using an FPGA provides a fully hardware implementation of the synchronization process. This guarantees deterministic and reliable execution of the synchronization process. The FPGA also allows the individual tasks of the synchronization process to be carried out in parallel.

EN

The paper describes the implementation of an automatic synchronizer for power objects using an FPGA chip. The FPGA is a programmable chip equipped with a specific set of logic elements, among which a network of connections can be defined (Fig. 1). In this way, a hardware implementation of the desired system functionality is obtained [3]. The task of the automatic synchronizer is to connect a synchronized power object to parallel operation in accordance with the amplitude, frequency and phase conditions. Given the very serious consequences of an erroneous synchronization [1], automatic synchronizers belong to the group of devices on which very high reliability demands are placed. The use of an FPGA provides a fully hardware realization of the synchronization process. One advantage is high reliability, resulting from the elimination of software layers that can be a potential source of errors. Another advantage is the truly parallel realization of each task of the synchronization process: each task is carried out in parallel by separate blocks of logic elements, as shown in Fig. 3. This solution also provides fully deterministic execution of the program code. The developed synchronizer enables full registration of the parameters of the synchronization process, which is realized by an application running on a PC. Communication between the synchronization process and the application takes place via the Internet and the direct memory access (DMA) mechanism. The communication diagram is shown in Fig. 5.

19

Using graphic processors for simulation of solidification process

Michalski G., Szczygiol N., Wawszczak A.

Hutnik, Wiadomości Hutnicze | 2012 | Vol. 79, No. 1 | 55--58

EN

This paper presents a simulation of the solidification process performed on a GPU. The new approach described in this paper makes it possible to divide the process of matrix building into two parts. The first is independent of the nodal temperature values determined in successive time steps; the second is performed on the basis of the nodal temperature values but does not require information about the finite element mesh. This separation of the two steps of building the conductivity matrix allows an efficient implementation of simulation software for modern multi- and many-core architectures. The simulation conducted shows that GPUs can be used successfully for such purposes.

PL

The article presents a simulation of the solidification process performed on graphics processors. The new approach described in the paper makes it possible to divide the matrix-building process into two parts. The first is independent of the nodal temperatures determined in successive time steps. The second is built from the nodal temperatures but does not require information about the finite element mesh. This division allows an efficient implementation of an application intended for modern multi-core architectures. The simulations carried out showed that graphics processors (GPUs) can be used successfully for such purposes.

20

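Realizacja nieliniowego ortogonalnego filtru klasy Volterry-Wienera na platformie wieloprocesorowej [Implementation of a nonlinear orthogonal Volterra-Wiener class filter on a multiprocessor platform]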

Biernacki P.

Elektronika : konstrukcje, technologie, zastosowania | 2012 | Vol. 53, No. 10 | 57--58

PL

The article presents a concept for implementing a nonlinear parametrizing filter of the Volterra-Wiener class on a multiprocessor platform. Two variants of the solution were considered: off-line and on-line operation. Parallelizing the computations is intended to allow the filter to run faster or its parameters to be changed freely during operation.

EN

The article presents a concept for implementing a nonlinear orthogonal parametrization filter of the Volterra-Wiener class on a multiprocessor platform. Two variants of the solution were examined: off-line and on-line operation. Parallelizing the calculations allows the filter to run faster or its parameters to be changed during operation.