Search results - BazTech

1

Robust Line-Convex Polygon Intersection Computation in E2 using Projective Space Representation

Skala Vaclav

Machine Graphics and Vision

|

2023

|

Vol. 32, No. 3/4

3--16

EN

This paper describes modified robust algorithms for line clipping by a convex polygon in E2 and by a convex polyhedron in E3. The proposed algorithm is based on the Cyrus-Beck algorithm and uses homogeneous coordinates to increase the robustness of the computation. The algorithm can be computed entirely in projective space using homogeneous coordinates, and in general the line itself can be given in projective space. If the result can remain in projective space, no division operation is needed. The algorithm supports vector-vector operations, SSE/AVX instructions, and GPU implementation.
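As a rough illustration of how a line can be intersected with a convex polygon without any division, here is a minimal sketch in homogeneous coordinates. It is not the algorithm from the paper: the function names, the sign-change test, and the example polygon are assumptions made for illustration only. The line is given by coefficients (a, b, c) of a*x + b*y + c*w = 0, vertices are (x, y, w) with w > 0, and each intersection point is obtained as a cross product, so it stays in projective space.

```python
def cross(p, q):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def dot(p, q):
    """Dot product of two 3-vectors."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def clip_line_convex_polygon(line, vertices):
    """Intersect a line with the boundary of a convex polygon.

    line     -- (a, b, c), meaning a*x + b*y + c*w = 0 (hypothetical layout)
    vertices -- list of homogeneous points (x, y, w), w > 0, in boundary order
    Returns the intersection points as homogeneous triples; no division is used.
    """
    # The sign of dot(line, v) tells on which side of the line vertex v lies.
    side = [dot(line, v) for v in vertices]
    hits = []
    n = len(vertices)
    for i in range(n):
        j = (i + 1) % n
        # An edge is crossed when its endpoints lie on opposite sides.
        if (side[i] < 0) != (side[j] < 0):
            edge_line = cross(vertices[i], vertices[j])  # line through the edge
            hits.append(cross(line, edge_line))          # point where lines meet
    return hits


if __name__ == "__main__":
    square = [(0, 0, 1), (2, 0, 1), (2, 2, 1), (0, 2, 1)]
    # The line y = 1 written as 0*x + 1*y - 1*w = 0.
    for x, y, w in clip_line_convex_polygon((0, 1, -1), square):
        # Division happens only here, when Euclidean output is requested.
        print(x / w, y / w)
```

The conversion to Euclidean coordinates is deferred to the caller, which mirrors the abstract's remark that division is unnecessary if the result may remain in projective space.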

2

Exploiting multi-core and many-core parallelism for subspace clustering

Datta Amitava, Kaur Amardeep, Lauer Tobias, Chabbouh Sami

International Journal of Applied Mathematics and Computer Science

|

2019

|

Vol. 29, no. 1

81--91

EN

Finding clusters in high dimensional data is a challenging research problem. Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data. But the exponential increase in the number of subspaces with the dimensionality of data renders most of the algorithms inefficient as well as ineffective. Moreover, these algorithms have ingrained data dependency in the clustering process, which means that parallelization becomes difficult and inefficient. SUBSCALE is a recent subspace clustering algorithm which is scalable with the dimensions and contains independent processing steps which can be exploited through parallelism. In this paper, we aim to leverage the computational power of widely available multi-core processors to improve the runtime performance of the SUBSCALE algorithm. The experimental evaluation shows linear speedup. Moreover, we develop an approach using graphics processing units (GPUs) for fine-grained data parallelism to accelerate the computation further. First tests of the GPU implementation show very promising results.

3

Using GPU Accelerators for Parallel Simulations in Material Physics

Uchroński M., Potasz P., Szymańska-Kwiecień A., Hruszowiec M.

Computational Methods in Science and Technology

|

2018

|

Vol. 24, No. 4

249--258

EN

This work focuses on the parallel simulation of electron-electron interactions in materials with non-trivial topological order (i.e. Chern insulators). The problem of interacting electron systems can be solved by diagonalizing a many-body Hamiltonian matrix in a basis of configurations of electrons distributed among the possible single-particle energy levels (the configuration interaction method). The number of possible configurations increases exponentially with the number of electrons and energy levels; 12 electrons occupying 24 energy levels correspond to a Hilbert space of dimension about 10^6. Solving such a problem requires effective computational methods and highly efficient optimization of the source code. The work concentrates on many-body effects related to strongly interacting electrons on flat bands with non-trivial topology. Such systems are expected to be useful in the study and understanding of new topological phases of matter, and in the more distant future they could be used to design novel nanomaterials. A heterogeneous architecture based on GPU accelerators and MPI nodes will be used to improve performance and scalability when solving problems of interacting electron systems in parallel.
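The dimension quoted for 12 electrons in 24 levels can be checked with a short calculation. The sketch below assumes that each level holds at most one electron and that no symmetry is used to reduce the basis, so the number of configurations is a plain binomial coefficient; the function name `ci_dimension` is hypothetical and the abstract does not state these assumptions.

```python
from math import comb


def ci_dimension(n_levels, n_electrons):
    """Number of ways to place n_electrons on n_levels, one electron per level.

    Assumption: no spin degeneracy and no symmetry reduction of the basis.
    """
    return comb(n_levels, n_electrons)


if __name__ == "__main__":
    print(ci_dimension(24, 12))  # 2704156, i.e. on the order of 10^6
```

Under these assumptions the count is 2,704,156, which agrees with the order of magnitude given in the abstract.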

4

Application of the Lattice Boltzmann Method to the flow past a sphere

Kajzer A., Pozorski J.

Journal of Theoretical and Applied Mechanics

|

2017

|

Vol. 55, no. 3

1091--1099

EN

The results of fully resolved simulations and large eddy simulations of bluff-body flows obtained by means of the Lattice Boltzmann Method (LBM) are reported. A selection of Reynolds numbers has been investigated in unsteady laminar and transient flow regimes. Computed drag coefficients of a cube have been compared with the available data for validation purposes. Then, a more detailed analysis of the flow past a sphere is presented, including the determination of the vortex shedding frequency and the resulting Strouhal numbers. Advantages and drawbacks of the chosen geometry implementation technique, the so-called “staircase geometry”, are discussed. To maximize computational efficiency, all simulations have been carried out using in-house code executed on a GPU.

5

A Novel GPU-Enabled Simulator for Large Scale Spiking Neural Networks

Szynkiewicz P.

Journal of Telecommunications and Information Technology

|

2016

|

no. 2

34--42

EN

The understanding of the structural and dynamic complexity of neural networks is greatly facilitated by computer simulations. An ongoing challenge for simulating realistic models is, however, computational speed. This paper describes a framework for modeling and parallel simulation of biologically inspired large-scale spiking neural networks on high-performance graphics processors. The tool is implemented in OpenCL. It supports simulation studies with three neuron models: integrate-and-fire, Hodgkin-Huxley and Izhikevich. The results of extensive simulations are provided to illustrate the operation and performance of the presented software framework. Particular attention is paid to the computational speed-up factor.

6

Simulating P Systems on GPU Device: A Survey

Martínez-del-Amor M. A., García-Quismondo M., Macías-Ramos L. F., Valencia-Cabrera L., Riscos-Núñez A., Pérez-Jiménez M. J.

Fundamenta Informaticae

|

2015

|

Vol. 136, no. 3

269--284

EN

P systems have proven useful as modeling tools in many fields, such as Systems Biology and Ecological Modeling. For such applications, accelerating P system simulation is often desired, given the computational needs of these kinds of models. One promising solution is to implement the inherent parallelism of P systems on platforms with parallel architectures. In this respect, GPU computing has proved to be an alternative to more classic approaches in Parallel Computing. It provides a low-cost manycore platform with a high level of parallelism. The GPU has already been employed to speed up the simulation of P systems. In this paper, we survey the available parallel P system simulators on the GPU, with special emphasis on those included in the PMCGPU project, and analyze some useful guidelines for future implementations and developments.

7

Very Fast Non-Dominated Sorting

Smutnicki C., Rudy J., Żelazny D.

Decision Making in Manufacturing and Services

|

2014

|

Vol. 8, no. 1-2

13--23

EN

A new and very efficient parallel algorithm for the Fast Non-dominated Sorting of Pareto fronts is proposed. By decreasing the computational complexity of this procedure, the proposed method speeds up the best currently known Fast and Elitist Multi-Objective Genetic Algorithm (NSGA-II) by more than two orders of magnitude. Formal proofs of the time complexities of the basic as well as the improved versions of the procedure are presented. The provided experimental results fully confirm the theoretical findings.
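For orientation, the basic procedure that such work sets out to accelerate can be sketched as follows. This is the sequential textbook form of fast non-dominated sorting as used in NSGA-II, not the parallel algorithm proposed in the paper; it assumes that all objectives are minimized, and the function names are chosen for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    Assumption: all objectives are minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points):
    """Split points into Pareto fronts; returns lists of indices, best front first."""
    n = len(points)
    dominated = [[] for _ in range(n)]  # dominated[i]: indices that i dominates
    counts = [0] * n                    # counts[i]: how many points dominate i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated[i].append(j)
            elif dominates(points[j], points[i]):
                counts[i] += 1

    fronts = []
    current = [i for i in range(n) if counts[i] == 0]
    while current:
        fronts.append(current)
        following = []
        for i in current:
            for j in dominated[i]:
                # Removing front member i leaves j with one dominator fewer.
                counts[j] -= 1
                if counts[j] == 0:
                    following.append(j)
        current = following
    return fronts


if __name__ == "__main__":
    pts = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
    print(non_dominated_sort(pts))  # [[0, 1, 2], [3], [4]]
```

The pairwise comparison loop performs on the order of n^2 dominance checks for n points, and the checks are independent of one another, which is what makes this step a natural target for parallelization.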

8

G-DNA – a highly efficient multi-GPU/MPI tool for aligning nucleotide reads

Frohmberg W., Kierzynka M., Blazewicz J., Gawron P., Wojciechowski P.

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2013

|

Vol. 61, no. 4

989--992

EN

DNA/RNA sequencing has recently become a primary way for researchers to generate biological data for further analysis. Assembly algorithms are an integral part of this process. However, some of them require pairwise alignment to be applied to a great many reads. Although several efficient alignment tools have been released over the past few years, including those taking advantage of GPUs (Graphics Processing Units), none of them directly targets high-throughput sequencing data. As a result, a need arose to create software that could handle such data as effectively as possible. G-DNA (GPU-based DNA aligner) is the first highly parallel solution that has been optimized to process nucleotide reads (DNA/RNA) from modern sequencing machines. Results show that the software reaches up to 89 GCUPS (Giga Cell Updates Per Second) on a single GPU, which makes it the fastest tool in its class. Moreover, it scales well on multi-GPU systems, including MPI-based computational clusters, where its performance is counted in TCUPS (Tera CUPS).

9

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

Niewiadomska-Szynkiewicz E., Marks M., Jantura J., Podbielski M.

Journal of Telecommunications and Information Technology

|

2012

|

no. 3

32--39

EN

The main advantage of a distributed computing system over a standalone computer is the ability to share the workload between cores, processors and computers. In this paper we present a hybrid cluster system, a novel computing architecture in which multi-core CPUs work together with many-core GPUs. It integrates two types of CPU, Intel and AMD, with advanced graphics processing units, Nvidia Tesla and AMD FirePro (formerly ATI) respectively. Our CPU/GPU cluster is dedicated to performing massively parallel computations, which is a common approach in cryptanalysis and cryptography. The efficiency of parallel implementations of selected data encryption and decryption algorithms is presented to illustrate the performance of our system.

10

Cuda Based Fuzzy C-Means Acceleration for the Segmentation of Images with Fungus Grown in Foam Matrices

Rowińska Z., Gocławski J.

Image Processing & Communications

|

2012

|

Vol. 17, no. 4

191--200

EN

In this paper, the authors verify the advantages of GPU computing applied to fuzzy c-means (FCM) segmentation. Three different algorithms implementing the FCM method are compared by their execution times. All tests use images of polyurethane foam matrices filled with fungus (mould) and aim to separate the mould regions from the matrix base. The authors propose a method using CUDA programming tools that significantly speeds up FCM computations by using the multiple cores of a graphics card.
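To make the computation being accelerated more concrete, the standard fuzzy c-means iteration can be sketched in plain Python for one-dimensional grey-level values. This is a CPU-only sketch of the textbook update rules, not the authors' CUDA implementation; the function name `fcm_1d`, the fuzzifier m = 2, the fixed iteration count, and the sample data are all assumptions.

```python
def fcm_1d(values, centers, m=2.0, iterations=20):
    """Run fuzzy c-means on scalar values starting from the given centers.

    Returns (centers, memberships); memberships[j][i] is the degree to which
    values[j] belongs to cluster i, as computed in the last iteration.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The value coincides with a center: assign it there fully.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        # Center update: mean of the values weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total > 0.0:
                centers[i] = sum(w * x for w, x in zip(weights, values)) / total
    return centers, memberships


if __name__ == "__main__":
    pixels = [0.0, 0.1, 0.9, 1.0]  # hypothetical grey levels
    found, u = fcm_1d(pixels, [0.2, 0.8])
    print(found)
```

In the membership step each value is processed independently of the others, which is the kind of per-pixel work that maps well onto many GPU cores.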

11

Heterogeneous GPU&CPU cluster for High Performance Computing in cryptography

Marks M., Jantura J., Niewiadomska-Szynkiewicz E., Strzelczyk P., Góźdź K.

Computer Science

|

2012

|

Vol. 13 (2)

63--79

EN

This paper addresses issues associated with distributed computing systems and the application of mixed GPU&CPU technology to data encryption and decryption algorithms. We describe a heterogeneous cluster, HGCC, formed by two types of nodes: an Intel processor with an NVIDIA graphics processing unit, and an AMD processor with an AMD graphics processing unit (formerly ATI). We also describe a novel software framework that hides the heterogeneity of our cluster and provides tools for solving complex scientific and engineering problems. Finally, we present the results of numerical experiments. The case study considered is concerned with parallel implementations of selected cryptanalysis algorithms. The main goal of the paper is to show the wide applicability of GPU&CPU technology to large-scale computation and data processing.

12

G-PAS 2.0 - an improved version of protein alignment tool with an efficient backtracking routine on multiple GPUs

Frohmberg W., Kierzynka M., Blazewicz J., Wojciechowski P.

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2012

|

Vol. 60, no. 3

491--494

EN

Several highly efficient alignment tools have been released over the past few years, including those taking advantage of GPUs (Graphics Processing Units). G-PAS (GPU-based Pairwise Alignment Software) was one of them, with a couple of interesting features that made it unique. Nevertheless, some changes had to be introduced in order to adapt it to a new computational architecture. In this paper we present G-PAS 2.0, a new version of the software for performing high-throughput alignment. Results show that the new version is nearly a quarter faster on the same hardware, reaching over 20 GCUPS (Giga Cell Updates Per Second).