Search results - Biblioteka Nauki

1

Multiple tasks in FPGA-based programmable controller

100%

Hajduk Z. , Sadolewski J.

e-Informatica Software Engineering Journal

|

2011

|

Vol. 5, No. 1

77--85

EN

An FPGA-based execution platform for PLC controllers, capable of running multiple control tasks, is presented. The platform, called multi-CPCore, uses hardware virtual machines to execute control tasks defined in the CPDev engineering environment. The tasks consist of one or more programs written in IEC 61131-3 languages, such as ST, IL or FBD. They may run with different cycles and communicate via global variables. Parallel programming mechanisms such as a process image and semaphores are provided to handle potential conflicts when accessing shared resources.
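As a software analogy of the mechanisms named in this abstract, the sketch below shows two tasks with different cycle counts that exchange data through global variables. Each task copies the globals into a process image at the start of a cycle and publishes its outputs at the end, with a semaphore guarding the shared store. The names `SharedGlobals` and `run_task` are hypothetical; the abstract does not describe how multi-CPCore implements these mechanisms in hardware, so this is only an illustration of the idea.

```python
import threading


class SharedGlobals:
    """Global variables shared by several control tasks (hypothetical helper)."""

    def __init__(self):
        self._values = {}
        self._sem = threading.Semaphore(1)  # binary semaphore guarding the store

    def process_image(self):
        """Return a private snapshot of all globals for one task cycle."""
        with self._sem:
            return dict(self._values)

    def commit(self, updates):
        """Publish a task's outputs back to the shared globals."""
        with self._sem:
            self._values.update(updates)


def run_task(shared, name, cycles):
    """Run a task for a fixed number of cycles; each task owns one variable."""
    for _ in range(cycles):
        image = shared.process_image()   # consistent snapshot for this cycle
        count = image.get(name, 0) + 1   # compute on the local copy only
        shared.commit({name: count})     # write outputs at the end of the cycle


if __name__ == "__main__":
    shared = SharedGlobals()
    tasks = [
        threading.Thread(target=run_task, args=(shared, "fast", 100)),
        threading.Thread(target=run_task, args=(shared, "slow", 10)),
    ]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    print(shared.process_image())
```

Because each task writes only its own variable, the read-compute-write sequence needs no further locking; tasks that update the same variable would have to hold the semaphore across the whole cycle.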

2

A distributed neural network

100%

Włodarczyk M. , Piech H.

Informatyka Teoretyczna i Stosowana

|

2001

|

R. 1, No. 1

53-60

EN

In this paper, the authors present their concept for executing a neural network on many computers that communicate with each other over a computer network. The implementation of the neural network has been based on Microsoft's DCOM technology.

3

Parallelization of processes in neural networks

100%

Piech H. , Włodarczyk M. , Leks D.

Informatyka Teoretyczna i Stosowana

|

2001

|

R. 1, No. 1

41-51

EN

The paper presents methods for distributing neurons among the processors of parallel and distributed processing structures. Neural networks executed in parallel require frequent transfers of messages and data between neurons. When distributing the neurons, two factors should be taken into account: the speed-up gained from parallel processing and the associated uniform loading of the processors. In addition, the number of messages and the amount of data transferred between the processors should be considered. The time of inter-processor transmission depends not only on the size of the information being transferred, but also on the computational power of the communicating processing units. The paper does not include an analysis of computational power; it is limited to determining the components of the time needed to create and transmit information.
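The two criteria named in this abstract, uniform processor load and the amount of inter-processor traffic, can be illustrated with a small sketch. The block assignment strategy and the function names below are assumptions chosen for illustration; the abstract does not state which distribution methods the paper actually uses.

```python
def block_assignment(n_neurons, n_procs):
    """Assign neurons to processors in contiguous, near-equal blocks.

    Returns a list where owner[i] is the processor that holds neuron i.
    """
    base, extra = divmod(n_neurons, n_procs)
    owner = []
    for p in range(n_procs):
        # The first `extra` processors take one additional neuron.
        owner.extend([p] * (base + (1 if p < extra else 0)))
    return owner


def cross_processor_messages(connections, owner):
    """Count connections whose two neurons sit on different processors."""
    return sum(1 for src, dst in connections if owner[src] != owner[dst])


def load_imbalance(owner, n_procs):
    """Difference between the most and the least loaded processor."""
    loads = [owner.count(p) for p in range(n_procs)]
    return max(loads) - min(loads)


if __name__ == "__main__":
    owner = block_assignment(5, 2)
    links = [(0, 1), (1, 3), (3, 4), (2, 4)]
    print(owner, cross_processor_messages(links, owner), load_imbalance(owner, 2))
```

A distribution method would then search for an assignment that keeps the imbalance low while minimising the number of cross-processor messages.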

4

Cooperation of CUDA and Intel multi-core architecture in the independent component analysis algorithm for EEG data

100%

Gajos-Balińska A. , Wójcik G. M. , Stpiczyński P.

Bio-Algorithms and Med-Systems

|

2020

|

Vol. 16, No. 3

art. no. 20200044

EN

Objectives: The electroencephalographic signal is highly exposed to external disturbances. Therefore, thorough cleaning is an important element of its processing. Methods: One of the common methods of signal improvement is independent component analysis (ICA). However, it is a computationally expensive algorithm, so methods are needed to decrease its execution time. One of the ICA algorithms (fastICA) and parallel computing on the CPU and GPU were used to reduce the execution time. Results: This paper presents the results of a study on an implementation of fastICA that uses a multi-core architecture and GPU computation capabilities. Conclusions: The use of such a hybrid approach shortens the execution time of the algorithm.

5

Understanding 3D shapes being in motion

88%

Bedkowski J.

Journal of Automation Mobile Robotics and Intelligent Systems

|

2013

|

Vol. 7, No. 1

42--46

EN

This paper concerns the problem of classifying 3D shapes in motion. The goal is to develop a system with real-time capabilities that distinguishes basic shapes (corners, planes, cones, spheres, etc.) moving in front of an RGB-D sensor. An improvement of state-of-the-art (SoA) algorithms (normal vector computation using PCA, Principal Component Analysis, and SVD, Singular Value Decomposition; PFH, Point Feature Histogram) based on GPGPU (General Purpose Graphics Processing Unit) computation is introduced. This approach guarantees on-line computation of normal vectors. Unfortunately, the computation time of the PFH for each normal vector still prevents on-line operation; therefore, the paper shows how to find a region of movement and perform the classification on a reduced amount of data. The proposed approach can be a starting point for further development of systems able to recognize objects in dynamic environments.

6

A Duality between Forward and Adjoint MPI Communication Routines

88%

Cheng B.

Computational Methods in Science and Technology

|

2006

|

Vol. spec. issue

23-24

EN

In this article, we explore a natural duality that exists between MPI communication routines in parallel programs, and show the ease of its adjoint implementation via pointers.

7

Adding parallelism to sequential programs : a combined method

88%

Daszczuk W. B. , Czejdo D. B. , Grześkowiak W.

International Journal of Electronics and Telecommunications

|

2024

|

Vol. 70, No. 1

135--144

EN

The article outlines a contemporary method for creating software for multi-processor computers. It describes the identification of parallelizable sequential code structures. Three structures were found and then carefully examined. The algorithms used to determine whether or not certain parts of code may be parallelized result from static analysis. The techniques demonstrate how, if possible, existing sequential structures might be transformed into parallel-running programs. A dynamic evaluation is also a part of our process, and it can be used to assess the efficiency of the parallel programs that are developed. As a tool for sequential programs, the algorithms have been implemented in C#. All proposed methods were discussed using a common benchmark.

8

A parallel pipelined naive method for testing satisfiability

75%

Sadowski A. , Jakubski A. , Michalski G.

Przegląd Elektrotechniczny

|

2015

|

R. 91, No. 11

154-157

EN

Field Programmable Gate Array (FPGA) systems are highly suitable for solving satisfiability (SAT) problems. The paper presents the possibilities that programmable FPGA chips offer for testing satisfiability by means of parallelism and pipelining. Various approaches to this problem using the VHDL language are presented. For this purpose, the authors created a dedicated architecture connected to a PC via the UART protocol. The architecture was built on a Xilinx Spartan-3AN board, and the synthesis was performed in the Xilinx ISE 11.3 software.

PL

Owing to their architecture, FPGA devices are very well suited to solving satisfiability (SAT) problems. The article presents a concurrent solution to the satisfiability problem using programmable FPGA devices. For this task, a dedicated architecture was developed, based on an FPGA device (Xilinx Spartan-3AN) and communicating via the UART protocol.

9

Implementacja równoległa sumowania wolnobieżnego szeregu [Parallel implementation of the summation of a slowly convergent series]

75%

Walendziuk W.

Zeszyty Naukowe Politechniki Białostockiej. Elektryka

|

1999

|

Z. 15

PL

The three-dimensional temperature distribution of electric underfloor heating is described by a complicated, slowly convergent series. Summing the terms of the series becomes troublesome near the points of discontinuity (the wires are approximated by a two-dimensional Dirac delta) because of the number of terms that must be summed. The article compares a sequential algorithm for summing the slowly convergent series with a parallel algorithm. The parallel algorithm uses the ALEX VX2 multiprocessor machine.

EN

The three-dimensional temperature field in a concrete floor, excited by a current flowing in a thin lead, is described by a complicated trigonometric series. Summation of the series, especially in the neighbourhood of discontinuity points (the lead was approximated by the two-dimensional Dirac distribution), requires a large computational effort. To speed up the computations, the parallel computer ALEX AVX2 was applied. The article presents a comparison of the computation times obtained with the sequential algorithm and with the algorithm running on the parallel computer ALEX AVX2.

10

Analiza porównawcza współprogramów języka Kotlin z językami Java i Scala w przetwarzaniu równoległym [Comparative analysis of Kotlin coroutines with Java and Scala in parallel processing]

63%

Zieliński A. A.

Journal of Computer Sciences Institute

|

2020

|

Vol. 16

241--246

EN

The article presents a comparison of Kotlin coroutines with analogous solutions in Java and Scala in parallel programming using chosen metric and non-metric criteria. For that purpose, a multi-module project with corresponding implementations of selected algorithms in all three languages was created and then analyzed. The studies were preceded by a description of the created project.

PL

The article compares the use of Kotlin coroutines in parallel processing with analogous solutions in Java and Scala against selected measurable and non-measurable criteria. For this purpose, a multi-module application with corresponding implementations of selected algorithms in the three languages was created and analyzed. The analysis is preceded by a description of the created project.

11

Polyhedral Source-to-Source Compiler

63%

Adamski D. , Jabłoński G. , Perek P. , Napieralski A.

Elektronika : konstrukcje, technologie, zastosowania

|

2016

|

Vol. 57, No. 12

3--13

EN

This paper describes a novel Polyhedral Source-to-Source Compiler (PSSC) that automatically recognizes parallel regions of C/C++ code and annotates them with OpenMP/OpenACC pragmas. The proposed source-to-source compiler uses the polyhedral model to detect and optimize parallel loops. Loop optimization is done on the intermediate code representation by the Polly compiler and is then mapped back to the original source code. This approach combines the simplicity and efficiency of Intermediate Representation (IR) code optimization with the readability of the output code. Experimental results show that the proposed compiler reaches performance comparable to that of the original Polly compiler.

PL

The article describes a novel source-to-source compiler that uses the polyhedral model to automatically detect C/C++ code that can be executed in parallel. Fragments of source code that can be parallelized are annotated with OpenMP/OpenACC pragmas. The compiler tracks the changes that the Polly compiler makes to the intermediate code and then maps these transformations back onto the source code. This approach combines the advantages of optimizing intermediate code with the easy portability of high-level code across platforms. Performance measurements showed that the compiler parallelizes high-level code as efficiently as the underlying Polly compiler.

12

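Aspekty czasowe algorytmu SURF w wersji sekwencyjnej i równoległej zaimplementowanej w technologii CUDA [Timing aspects of the SURF algorithm in sequential and parallel versions implemented in CUDA technology]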

63%

Szymczyk M. , Szymczyk P.

Pomiary Automatyka Robotyka

|

2011

|

R. 15, No. 12

241-243

PL

The article presents the results of work aimed at examining the feasibility of implementing the SURF method for detecting characteristic points on the CUDA platform, and at comparing the computation times of the sequential and parallel implementations of this algorithm.

EN

This article presents the results of our work on the feasibility of implementing key point detection with the SURF algorithm in CUDA technology. The work also compares the execution times of the sequential and parallel implementations.

13

Pięć sposobów wprowadzenia współbieżności do programu w języku C# [Five ways of introducing concurrency into a C# program]

63%

Szyszko P. , Smołka J.

Journal of Computer Sciences Institute

|

2018

|

Vol. 6

62--67

PL

Today's processors in personal computers and mobile devices make it possible to parallelize operations ever more effectively in order to obtain results faster. Software developers have many different options for implementing concurrency, yet they usually stick to the single technique they know best. It is worth tracing how each of them works to discover when it can be used effectively and when it is better to look for an alternative. This article presents ways of implementing mathematical computations in parallel using threads, tasks, a thread pool, a task pool, and the parallel for loop from the Parallel class. All of them were written in C# on the Windows Presentation Foundation engine of the .NET platform. The implemented mathematical computation is the calculation of Pi using the Leibniz formula.

EN

Processors in today's personal computers and mobile devices allow for increasingly effective parallel computing. Developers have many different methods of implementing concurrency at their disposal, but usually use the one that they know best. It is beneficial to know when a particular technique works well and when it is better to find an alternative. This paper presents different ways of implementing parallel mathematical calculations using threads, tasks, a thread pool, a task pool and a parallel for loop. Each method was used in a C# application running on the Windows Presentation Foundation engine on the .NET platform. The implemented operation is the calculation of the value of Pi using Leibniz's formula.
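The thread-pool variant described in this abstract can be sketched as follows. The paper's implementations are in C#; this is a Python analogue, and the function names and the chunking scheme are assumptions made for illustration. In CPython, threads do not speed up CPU-bound code because of the global interpreter lock, so the sketch shows only the structure of the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor


def leibniz_partial(start, stop):
    """Sum the Leibniz terms (-1)^k / (2k + 1) for k in [start, stop)."""
    total = 0.0
    for k in range(start, stop):
        term = 1.0 / (2 * k + 1)
        total += term if k % 2 == 0 else -term
    return total


def parallel_pi(n_terms, workers=4):
    """Approximate pi by summing contiguous chunks of the series in a thread pool."""
    chunk = max(1, (n_terms + workers - 1) // workers)
    ranges = [(i, min(i + chunk, n_terms)) for i in range(0, n_terms, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda r: leibniz_partial(*r), ranges))
    # pi / 4 = 1 - 1/3 + 1/5 - 1/7 + ...
    return 4.0 * sum(partials)


if __name__ == "__main__":
    print(parallel_pi(200_000))
```

The partial sums are independent, so the same decomposition carries over to the other techniques the abstract lists (plain threads, tasks, or a parallel for loop).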

14

Schedule design for multiprocessor systems

63%

Globa L. , Lysenko D.

Pomiary Automatyka Kontrola

|

2010

|

R. 56, No. 12

1554-1556

EN

The efficiency of multiprocessor system usage depends strongly on the method of schedule design, that is, on how tasks are distributed across the processors to decrease the overall schedule time. This article is devoted to one part of this process: schedule design, illustrated by software development for LTE and WiMAX base stations.

PL

The efficiency of multiprocessor system usage depends strongly on the method of schedule design, i.e. on how tasks are distributed to each processor. This affects the reduction of the total task execution time. The article presents part of this process, i.e. schedule design, using the example of software development for LTE and WiMAX base stations. Four algorithms that can be applied using genetic algorithms are indicated. Simulation results for these algorithms are given, showing that good convergence is achieved within a limited number of generations. The main task analyzed in the work is shortening software development time through automatic schedule generation, error detection, simplified debugging, and visualization by means of a diagram. For the development of telecommunications software, an original method is proposed that can be applied in the form of an embedded system (SoC). The hardware platform is an SoC element and several different processing units. The digital signal processing algorithm is defined by a list of tasks together with dependency information. The type of processing unit and the processing time are defined in advance for each task.

15

Using Erlang in research and education in a technical university

63%

Petrov I. , Alexeyenko A. , Ivanova G.

Computer Science

|

2018

|

tom Vol. 19 (3)

333--343

EN

This paper addresses the problem of using functional programming (FP) languages for research and educational purposes. To identify the problems associated with the use of FP languages such as Erlang, an experiment consisting of two surveys was performed. The first survey was anonymous and aimed at establishing whether the participants prefer object-oriented or functional coding. The second survey was conducted after the students finished an Erlang course. The results of these two surveys demonstrate that functional programming is underrated for no apparent reason. Possible steps to address this problem are suggested.

16

Równoległe łamanie haseł metodą słownikową w środowiskach MPI, OPENMP i CUDA [Parallel dictionary-based password cracking in MPI, OpenMP and CUDA environments]

63%

Płażek J. , Podyma M.

Czasopismo Techniczne. Nauki Podstawowe

|

2011

|

R. 108, z. 1-NP

121-139

PL

The dictionary method of password cracking is a technique used for brute-force guessing of system passwords. Carrying it out requires considerable computational effort, so using parallel programming for this purpose is justified. The article presents and compares parallel implementations of this algorithm in three different parallel programming environments: MPI (Message Passing Interface), an environment based on the message-passing model; OpenMP (Open Multi Processing), which parallelizes computations at the data level; and the environment of graphics card processors.

EN

Password cracking by the dictionary method is a technique for discovering passphrases. Its implementation requires a large computational effort, so applying parallel programming to it is justified. In this article we describe and compare parallel implementations in three different parallel programming environments: MPI (Message Passing Interface), an environment that uses the message-passing model; OpenMP (Open Multi Processing), which is based on data parallelization; and the graphics processing unit environment.
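The data-parallel decomposition mentioned in this abstract can be sketched by splitting the dictionary among workers, each of which hashes its share of candidate words and compares the result with the target. The paper uses MPI, OpenMP and CUDA; the Python thread pool, the SHA-256 hash, and all function names below are assumptions for illustration, since the abstract does not name the hash function or the partitioning scheme.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_hex(word):
    """Hash a candidate password (SHA-256 is an assumption, not from the paper)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def search_chunk(words, target_hash):
    """Return the first word in this chunk whose hash matches, or None."""
    for word in words:
        if sha256_hex(word) == target_hash:
            return word
    return None


def parallel_dictionary_search(dictionary, target_hash, workers=4):
    """Split the dictionary into interleaved chunks and search them concurrently."""
    chunks = [dictionary[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda chunk: search_chunk(chunk, target_hash), chunks)
        for found in results:
            if found is not None:
                return found
    return None


if __name__ == "__main__":
    words = ["apple", "dragon", "letmein", "tiger", "zebra"]
    print(parallel_dictionary_search(words, sha256_hex("tiger")))
```

Each chunk is searched independently, which is why the problem maps naturally onto message passing, shared-memory threads, or GPU kernels.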

17

Wykorzystanie kompilacji iteracyjnej do optymalizacji warstwy programowej systemów wbudowanych [Using iterative compilation to optimize the software layer of embedded systems]

63%

Wierciński T. , Radziewicz M. , Burak D.

Pomiary Automatyka Kontrola

|

2010

|

R. 56, No. 7

701-704

PL

The article concerns the use of iterative compilation to optimize the software layer of embedded systems. Using WIZUTIC, a tool developed by the authors, the processing time of the DES encryption algorithm was reduced. The compiler takes sequential programs as input and produces programs parallelized according to the OpenMP standard and optimized for data locality. The parameter of the iterative compilation is the block size for the tiling loop transformation.

EN

Embedded systems are special-purpose computers that perform one or a few dedicated tasks. They are mostly part of larger electronic devices, such as communication devices, home appliances, office automation, business equipment, automobiles, etc. The complexity of computers has grown tremendously in recent years because multi-core processors are in widespread use. Programs must be parallelized to make the most of the computing power of multi-core processors. Using parallelizing compilers for automatic parallelization of sequential programs accelerates the design process and reduces the cost of the designed systems. This paper describes WIZUTIC, an iterative compiler developed at the Faculty of Computer Science and Information Technology of the West Pomeranian University of Technology. It uses the source code of the PLUTO parallel compiler developed at the Ohio State University by Uday Bondhugula. A simulated annealing algorithm is used to find optimization passes for the given program features. The parameters changed in each iteration are the tile sizes of the tiling loop transformation. Experimental tests are described and the speed-up results obtained for the DES encryption algorithm are given.
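The tiling transformation and the idea of tuning the tile size by repeated measurement can be sketched as below. This is not WIZUTIC or PLUTO: the matrix multiplication kernel, the function names, and the exhaustive search over candidate tile sizes (used here in place of the simulated annealing described in the abstract) are all assumptions for illustration. Pure Python also does not expose cache effects the way compiled code does, so the timings only demonstrate the search loop.

```python
import time


def matmul_tiled(a, b, tile):
    """Multiply matrices a (n x m) and b (m x p) using square loop tiles."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The three outer loops walk over tiles; the inner loops stay inside one tile.
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, m)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


def pick_tile_size(a, b, candidates):
    """Time the kernel once per candidate tile size and return the fastest."""
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        started = time.perf_counter()
        matmul_tiled(a, b, tile)
        elapsed = time.perf_counter() - started
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile


if __name__ == "__main__":
    size = 48
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i - j) for j in range(size)] for i in range(size)]
    print(pick_tile_size(a, b, [4, 8, 16, 48]))
```

An iterative compiler automates the same loop: generate a variant for each parameter value, run it, and keep the best one.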

18

Parallel implementation of the artificial ant colony algorithm applied to medical images segmentation

63%

Krawczyk Z. , Starzyński J.

Przegląd Elektrotechniczny

|

2017

|

R. 93, No. 4

6--9

EN

The Artificial Ant Colony (AAC) algorithm can be applied to the segmentation of bone structures from CT data series. The AAC procedure produces promising results in regions of adjacent bones and joints, which are hard to distinguish with common segmentation algorithms. The article presents a parallel implementation of the AAC that allows for a significant speed-up of the segmentation procedure. The results of the segmentation for various bone structures in the area of the human pelvis are presented.

PL

The ant colony algorithm (AAC) makes it possible to segment bone structures from a series of computed tomography images. AAC gives promising results for adjoining fragments of bones and joints, which are hard to distinguish with commonly used image filters. The article presents a parallel implementation of the algorithm that considerably speeds up the segmentation. It presents the results of the algorithm for selected bone structures in the pelvic region.

19

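Intel Manycore Testing Lab - środowisko sprzętowo-programowe do dydaktyki tworzenia i testowania efektywności równoleglizacji oprogramowania [Intel Manycore Testing Lab: a hardware and software environment for teaching the development and efficiency testing of software parallelization]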

63%

Świerczewski Ł.

Biuletyn Naukowy Wrocławskiej Wyższej Szkoły Informatyki Stosowanej. Informatyka

|

2013

|

Vol. 3

32--36

PL

Modern teaching of programming techniques often requires access to both up-to-date hardware and software. This applies in particular to parallel algorithms, whose properties can be observed to a much greater degree on efficient new-generation processors. To create an international academic community around this specialization, Intel made available a virtual test laboratory (Manycore Testing Lab - MTL). The article presents the architecture and practical use of MTL in multi-user work and focuses on empirical confirmation of the performance gains obtained through parallel programming and 10-core Westmere-EX processors. The study covered four classes of algorithms: a purely mathematical one concerning the Collatz problem, the 3DES cryptographic algorithm, Grover's quantum algorithm, and a classic genetic algorithm. For educational use, access to the laboratory is free of charge, and the platforms provided support all advanced technologies.

EN

The modern process of teaching programming techniques often requires access to modern hardware and software. This applies in particular to parallel algorithms, whose properties can be seen to a much greater extent on the new generation of high-performance processors. To create an international academic community associated with this specialization, Intel released a virtual test lab (Manycore Testing Lab - MTL). The paper presents the architecture and the practical application of MTL in multi-user work, and focuses on empirical confirmation of the gains obtained through parallel programming and 10-core Westmere-EX processors. The study covered four classes of algorithms: a purely mathematical one for the Collatz problem, 3DES cryptography, the quantum Grover algorithm and the classic genetic algorithm. For educational purposes, access to the laboratory is free, and the available platforms support all advanced technologies.

20

Effective biclustering on GPU - capabilities and constraints

51%

Orzechowski P. , Boryczko K.

Przegląd Elektrotechniczny

|

2015

|

R. 91, No. 8

131--134

PL

The article presents the benefits and limitations of designing a parallel biclustering algorithm intended for the GPU. A definition of biclustering is given and the GPU architecture is briefly described. Popular algorithm implementation strategy patterns useful for designing efficient GPU solutions are summarized. The publication also contains practical programming tips for implementing biclustering algorithms in CUDA/OpenCL.

EN

This article presents the benefits and limitations related to designing a parallel biclustering algorithm on a GPU. A definition of biclustering is provided together with a brief description of the GPU architecture. We then review algorithm strategy patterns, which are helpful in providing efficient implementations on a GPU. Finally, we highlight programming aspects of implementing biclustering algorithms in the CUDA/OpenCL programming languages.