Search results - BazTech

1

Parallel computing of two-parameter bifurcation diagrams of an electric arc model with chaotic dynamics using Nvidia CUDA and OpenMP technologies

Pala Artur, Machaczek Marek

Przegląd Elektrotechniczny | 2019 | Vol. 95, no. 3 | 138--142

EN

This paper presents parallel and massively parallel calculations of two-parameter bifurcation diagrams of an electric arc model. A simple dynamical model of an electric arc is used. Such a model can show complex two-parameter bifurcations with periodic and chaotic responses. Two different parallel computing technologies were used to implement the calculations. The parallel computations are implemented using the OpenMP library on CPUs. The massively parallel computations are implemented using the Nvidia CUDA technology on GPUs.

PL (translated)

The article presents parallel and massively parallel computations of two-parameter bifurcation diagrams for an electric arc model. A dynamical model of the electric arc with periodic and chaotic responses was used for the analysis. Two different technologies were used to carry out the computations. The parallel computations were implemented using the OpenMP library and CPUs. The massively parallel computations were implemented using the Nvidia CUDA technology and GPUs.

2

Preconditioned Conjugate Gradient Method for Solution of Large Finite Element Problems on CPU and GPU

Fialko S. Y., Zeglen F.

Journal of Telecommunications and Information Technology | 2016 | no. 2 | 26--33

EN

This article considers the preconditioned conjugate gradient (PCG) method implemented on a GPU and intended for the solution of large finite element problems of structural mechanics. The mathematical formulation of the problem leads to the solution of sets of linear equations with sparse symmetric positive definite matrices. The authors use an incomplete Cholesky factorization "by value" approach, based on sparse matrix techniques, to build an efficient preconditioner that ensures stable convergence for the ill-conditioned problems mentioned above. The research focuses on implementing the PCG solver on a GPU using the CUBLAS and CUSPARSE libraries. Taking into account the restricted amount of GPU core memory, the efficiency and reliability of the GPU PCG solver are checked and compared with data obtained from the CPU version of this solver, which works with a large amount of RAM. Large real-life problems taken from the SCAD Soft collection are considered for this comparison.
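As a rough illustration of the PCG iteration named in this abstract, the sketch below solves a small symmetric positive definite system in plain Python. It is not the authors' solver. It assumes a dense list-of-lists matrix instead of sparse storage, runs on the CPU only, and swaps in a simple Jacobi (diagonal) preconditioner where the paper uses incomplete Cholesky factorization. The names `pcg`, `dot`, and `matvec` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; a real solver would use sparse storage."""
    return [dot(row, x) for row in A]


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumption: Jacobi preconditioner M = diag(A), standing in for the
    incomplete Cholesky preconditioner described in the abstract.
    Returns the solution vector and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # M^{-1} for Jacobi
    x = [0.0] * n                                 # initial guess
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0              # avoid dividing by zero

    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:      # relative residual test
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    solution, iterations = pcg(A, b)
    print(solution, iterations)
```

The matrix-vector product and the inner products dominate the cost of each iteration, which is presumably why a GPU implementation would delegate them to sparse and dense linear algebra libraries.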

3

Passive Radar Parallel Processing Using General-Purpose Computing on Graphics Processing Units

Szczepankiewicz K., Malanowski M., Szczepankiewicz M.

International Journal of Electronics and Telecommunications | 2015 | Vol. 61, No. 4 | 357-363

EN

This paper presents an implementation of the signal processing chain for a passive radar. The passive radar, which was developed at the Warsaw University of Technology, uses FM radio and DVB-T television transmitters as "illuminators of opportunity". As the computational load associated with passive radar processing is very high, NVIDIA CUDA technology has been employed for an effective implementation using parallel processing. The paper describes the implementation of the algorithms and analyzes the performance results.

4

Stereoscopic video chroma key processing using NVIDIA CUDA

Sagan J.

Annales Universitatis Mariae Curie-Skłodowska. Sectio AI, Informatica | 2013 | Vol. 13, no. 1 | 81--87

EN

In this paper, I use the NVIDIA CUDA technology to perform the chroma key algorithm on stereoscopic images. NVIDIA CUDA allows parallel algorithms to be executed on a GPU. The input data are stereoscopic images with a monochromatic background and the destination background image. The output is the combination of the inputs produced by the chroma key. I compare the efficiency of the algorithm between GPU and CPU execution.
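The chroma key operation described in this abstract can be sketched as a per-pixel test, shown here sequentially in plain Python. This is not the paper's implementation. It assumes images are nested lists of `(r, g, b)` tuples, and that a pixel counts as background when its Euclidean distance to the key colour is within a threshold. The names `chroma_key` and `chroma_key_stereo`, the default key colour, and the default threshold are all hypothetical.

```python
def chroma_key(foreground, background, key=(0, 255, 0), threshold=100.0):
    """Replace pixels close to the key colour with the background pixel.

    Assumption: "close" means Euclidean RGB distance <= threshold.
    Both images are lists of rows of (r, g, b) tuples of equal size.
    """
    limit = threshold ** 2  # compare squared distances, no square root needed
    result = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for fg, bg in zip(fg_row, bg_row):
            dist2 = sum((c - k) ** 2 for c, k in zip(fg, key))
            row.append(bg if dist2 <= limit else fg)
        result.append(row)
    return result


def chroma_key_stereo(left, right, background, **options):
    """Apply the same keying to both views of a stereoscopic pair."""
    return (chroma_key(left, background, **options),
            chroma_key(right, background, **options))


if __name__ == "__main__":
    left = [[(0, 255, 0), (200, 30, 30)]]
    right = [[(200, 30, 30), (0, 255, 0)]]
    new_background = [[(1, 2, 3), (4, 5, 6)]]
    print(chroma_key_stereo(left, right, new_background))
```

Each output pixel depends only on the two input pixels at the same position, so the loop body could be assigned to one GPU thread per pixel, which is the kind of data parallelism CUDA is designed for.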

5

Multi-frontal multi-thread direct solver for finite element simulation of Step-and-Flash Imprint Lithography

Obrok P., Paszyński M.

Computer Methods in Materials Science | 2012 | Vol. 12, No. 1 | 1-8

EN

The paper presents a multi-thread multi-frontal direct solver for shared memory architectures. The solver algorithm consists of a sequence of tasks executing recursive forward eliminations and backward substitutions over the constructed elimination tree. The tasks have been grouped into sets of independent tasks that can be executed in parallel. The computational problem involves a two-dimensional model of linear elasticity with a thermal expansion coefficient. The finite element method model is used to simulate Step-and-Flash Imprint Lithography (SFIL), a modern patterning process utilizing photopolymerization to replicate the topography of a template onto a substrate. The multi-thread multi-frontal direct solver has been implemented and tested in the NVIDIA CUDA graphics card environment, delivering O(N log N) execution time and O(N^1.5) memory usage.

PL (translated)

The article presents a multi-thread multi-frontal solver for shared memory architectures. The solver algorithm is presented as a sequence of tasks that recursively perform partial eliminations and backward substitutions over the constructed elimination tree. The tasks have been grouped into sets of independent tasks executed concurrently. The solver algorithm was tested on a two-dimensional linear elasticity problem with a thermal expansion coefficient. The finite element method was used to simulate the Step-and-Flash Imprint Lithography process, a modern technology for manufacturing integrated circuits that exploits photopolymerization. The solver algorithm was implemented and tested in the NVIDIA CUDA graphics card environment, achieving an execution time of order O(N log N) and memory usage of order O(N^1.5).

6

High Performance Computing on New Accelerated Hardware Architectures

Błażewicz M., Kurowski K., Ludwiczak B., Napierała K.

Computational Methods in Science and Technology | 2010 | Vol. spec. iss. (1) | 71-79

EN

This paper presents recent work that has been performed in the context of high performance computing and hybrid architectures at Poznan Supercomputing and Networking Center. Three algorithms: JPEG2000 – compression/decompression, computational fluid mechanics and motion tracking have been parallelized on various architectures and compared to reference sequential applications. The performance results, implementation issues and best practices are discussed as well.