Results found: 56

Search results
Searched for:
keywords: parallel processing
EN
This paper focuses on finding the best configuration for PSO and GA using different migration blocks as well as different sets of fuzzy-system rules. To achieve this goal, two optimization algorithms, particle swarm optimization (PSO) and the genetic algorithm (GA), were configured to run in parallel and to integrate a migration block that allows us to generate diversity within the subpopulations used by each algorithm. Dynamic parameter adjustment was also performed with a fuzzy system for the following PSO parameters: the cognitive parameter, the social parameter, and the inertia weight. In the GA case, only the crossover parameter was adjusted.
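As a rough illustration of the scheme described above, the sketch below runs a minimal PSO on the sphere function with a linearly decaying inertia weight standing in for the fuzzy-system adjustment; the decay schedule, coefficients, and test function are illustrative assumptions, not the paper's configuration:

```python
import random

def pso_sphere(dim=2, particles=10, iters=100, seed=1):
    """Minimal PSO minimizing the sphere function f(x) = sum(x_i^2).

    The inertia weight decays linearly from 0.9 to 0.4 as a crude
    stand-in for the fuzzy-system parameter adjustment; the cognitive
    and social coefficients (1.5) are likewise illustrative.
    """
    rng = random.Random(seed)
    f = lambda x: sum(v * v for v in x)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    V = [[0.0] * dim for _ in range(particles)]
    P = [x[:] for x in X]                      # personal bests
    g = min(P, key=f)[:]                       # global best
    for t in range(iters):
        w = 0.9 - 0.5 * t / iters              # inertia-weight schedule
        for i in range(particles):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + 1.5 * rng.random() * (P[i][d] - X[i][d])  # cognitive
                           + 1.5 * rng.random() * (g[d] - X[i][d]))    # social
                X[i][d] += V[i][d]
            if f(X[i]) < f(P[i]):
                P[i] = X[i][:]
                if f(P[i]) < f(g):
                    g = P[i][:]
    return g, f(g)
```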
2
Content available remote Simulator of a Supercomputer Job Management System as a Scientific Service
EN
A job management system (JMS) is an important part of any supercomputer: it creates a schedule for launching the jobs of different users. Modern job management systems are complex software systems with a large number of settings. These settings have a significant impact on various JMS metrics, such as supercomputer resource utilization and the mean waiting time of a job in the queue. Various JMS simulators are widely used to study the influence of JMS settings or modifications, new scheduling algorithms, job input stream parameters, or available computing resources on JMS efficiency metrics. The article presents the results of a comparative analysis of current JMS simulators (Alea, ScSF, Batsim, AccaSim, the Slurm simulator) and their application areas. The authors consider new ways to use a JMS simulator as a scientific service for researchers. With such a service, researchers are able to study various hypotheses about JMS efficiency, algorithms, or parameters. This gives the following benefits: (1) research is performed on the service side around the clock, (2) the simulator's accuracy and adequacy are guaranteed by the service, and (3) reproducibility of research results is ensured, with the simulator-as-a-service becoming a single entry point for researchers.
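The kind of JMS metric such simulators estimate can be illustrated with a toy first-come-first-served scheduler; everything here (the job format, the FCFS policy, the single metric) is a simplifying assumption far below what Alea or the Slurm simulator actually model:

```python
import heapq

def simulate_fcfs(jobs, cores):
    """First-come-first-served schedule of (submit, runtime, ncores)
    jobs on a fixed pool of cores; returns the mean waiting time."""
    free = cores
    running = []                    # min-heap of (finish_time, ncores)
    clock = 0.0
    waits = []
    for submit, runtime, need in sorted(jobs):
        clock = max(clock, submit)
        while free < need:          # wait for enough cores to free up
            finish, done = heapq.heappop(running)
            clock = max(clock, finish)
            free += done
        waits.append(clock - submit)
        heapq.heappush(running, (clock + runtime, need))
        free -= need
    return sum(waits) / len(waits)
```

For example, three 2-core jobs of length 10 submitted together to a 4-core pool leave the third job waiting 10 time units, for a mean wait of 10/3.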
3
Content available remote Parallel implementation of a PIC simulation algorithm using OpenMP
EN
Particle-in-cell (PIC) simulations track the individual trajectories of a very large number of particles in self-consistent and external electric and magnetic fields; they are widely used in the study of plasma jets, for example. The main disadvantage of PIC simulations is the long simulation runtime, which often requires a parallel implementation of the algorithm. The current paper focuses on a PIC1d3v simulation algorithm and describes the successful implementation of a parallel version of it on a multi-core architecture using OpenMP, with very promising experimental and theoretical results.
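A minimal sketch of the per-particle update that such a simulation parallelizes might look as follows; a unit charge-to-mass ratio and a 1D electric field are assumed for illustration. The per-particle loop is the natural OpenMP parallel-for region, since particles do not interact within a step:

```python
def push_particles(particles, E, dt, steps):
    """Semi-implicit Euler push of independent (x, v) particles in a
    1D electric field E(x); unit charge-to-mass ratio assumed."""
    out = list(particles)
    for _ in range(steps):
        nxt = []
        for x, v in out:            # in OpenMP: the parallel-for region
            v = v + E(x) * dt       # accelerate in the local field
            x = x + v * dt          # move with the updated velocity
            nxt.append((x, v))
        out = nxt
    return out
```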
EN
The big data concept has elicited studies on how to accurately and efficiently extract valuable information from such huge datasets. The major problem in big data mining is data dimensionality, due to the large number of dimensions in such datasets. The main consequence of high data dimensionality is that it affects the accuracy of machine learning (ML) classifiers; it also wastes time because of the many redundant features present in the dataset. This problem can be addressed with a fast feature reduction method. Hence, this study presents HP-PL, a new hybrid parallel feature reduction framework that utilizes Spark to facilitate feature reduction on shared/distributed-memory clusters. The evaluation of the proposed HP-PL on the KDD99 dataset showed the algorithm to be significantly faster than conventional feature reduction techniques. The proposed technique required just over a minute to select 4 dataset features from over 79 features and 3,000,000 samples on a 3-node cluster (21 cores in total), whereas the comparative algorithm required more than 2 hours to achieve the same feat. In the proposed system, Hadoop's distributed file system (HDFS) was used for distributed storage, while Apache Spark served as the computing engine. The model development was based on a parallel model with full consideration of the high performance and throughput of distributed computing. In conclusion, the proposed HP-PL method achieves good accuracy with less memory and time than conventional feature reduction methods. The tool is publicly available at https://github.com/ahmed/Fast-HP-PL.
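The per-feature scoring that such a framework distributes can be sketched with a thread pool and a simple variance filter; this is a generic stand-in, not the HP-PL algorithm, and the paper distributes the analogous work over a Spark cluster rather than local threads:

```python
from concurrent.futures import ThreadPoolExecutor

def top_k_by_variance(rows, k, workers=4):
    """Score each feature (column) by variance in parallel and return
    the indices of the k highest-variance features, sorted ascending."""
    ncols = len(rows[0])

    def variance(j):
        col = [row[j] for row in rows]
        mean = sum(col) / len(col)
        return sum((v - mean) ** 2 for v in col) / len(col)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(variance, range(ncols)))
    best = sorted(range(ncols), key=lambda j: -scores[j])[:k]
    return sorted(best)
```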
EN
The computing performance optimization of the Short-Lag Spatial Coherence (SLSC) method applied to ultrasound data processing is presented. The method is based on the theory that signals from adjacent receivers are correlated, drawing on a simplified conclusion of the van Cittert-Zernike theorem. It has been proven that it can be successfully used in ultrasound data reconstruction with despeckling. Previous work has shown that the SLSC method in its original form has two main drawbacks: time-consuming processing and low contrast in the area near the transceivers. In this study, we introduce a method that overcomes both of these drawbacks. The presented approach removes the dependency on the distance (the "lag" parameter value) between the signals used to calculate correlations. The approach has been tested by comparing its results with those of the original SLSC algorithm on data acquired from tissue phantoms. The modified method proposed here has constant complexity, so the execution time is independent of the lag parameter value, in contrast to the linear complexity of the original. The presented approach increases computation speed more than tenfold in comparison to the base SLSC algorithm for a typical lag parameter value. It also improves the output image quality in shallow areas without decreasing quality in deeper areas.
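The complexity argument resembles the classic prefix-sum trick, in which a per-output cost proportional to a window (or lag) length becomes constant; the sketch below illustrates the idea generically and is not the SLSC code:

```python
def sliding_sums(xs, w):
    """All window sums of width w in O(n) total: each window sum is a
    difference of two prefix sums, i.e. constant cost per output
    regardless of w (instead of O(w) per output by direct summation)."""
    pref = [0]
    for x in xs:
        pref.append(pref[-1] + x)   # pref[i] = sum of xs[:i]
    return [pref[i + w] - pref[i] for i in range(len(xs) - w + 1)]
```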
6
Content available remote Agoge - an integrated development environment for OpenCL C
EN
The article presents the capabilities and technological solutions of Agoge, an integrated development environment dedicated to OpenCL. The automation of OpenCL context creation and the handling of computational kernel arguments and the computational grid are discussed, and the potential for using the environment in numerical calculations and image analysis is presented.
EN
Mutation testing – a fault-based software testing technique – is a computationally expensive approach. One powerful way to improve the performance of mutation testing without reducing its effectiveness is to employ parallel processing, where mutants and tests are executed in parallel, reducing the total time needed to complete the mutation analysis. This paper proposes three strategies for the parallel execution of mutants on multicore machines using the Parallel Computing Toolbox (PCT) with the Matlab Distributed Computing Server. It aims to demonstrate that computationally intensive software testing schemes, such as mutation, can be facilitated by parallel processing. The experiments were carried out on eight different Simulink models, and the results demonstrate the efficiency of the proposed approaches in terms of execution time during the testing process.
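Since mutants are independent of one another, their execution maps directly onto a worker pool. A Python sketch of the idea follows; the paper itself uses Matlab's PCT, and the mutant/test representation here (plain callables) is an assumption for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def mutation_score(mutants, tests, workers=4):
    """Run every mutant against the full test suite in parallel; a
    mutant is 'killed' when at least one test fails on it.  Returns
    the fraction of killed mutants (the mutation score)."""
    def killed(mutant):
        return any(not test(mutant) for test in tests)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        kills = list(pool.map(killed, mutants))
    return sum(kills) / len(mutants)
```

For instance, for an original function `add(a, b) = a + b`, the mutant `a - b` is killed by a test asserting `f(2, 3) == 5`, while an equivalent mutant `b + a` would survive.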
10
Content available Depth images filtering in distributed streaming
EN
In this paper, we propose a distributed system for processing point clouds and transferring them over a computer network with respect to effectiveness-related requirements. We compare point cloud filters, focusing on their use for streaming optimization. For the filtering step of the stream processing pipeline we evaluate four filters: Voxel Grid, Radius Outlier Removal, Statistical Outlier Removal, and Pass Through. For each filter we perform a series of tests evaluating its impact on point cloud size and transmission frequency (analysed for various fps ratios). We present results of the optimization process used for point cloud consolidation in a distributed environment and describe the processing of the point clouds before and after transmission. Pre- and post-processing allow the user to send the cloud over the network without delays. The proposed pre-processing compression of the cloud and its post-processing reconstruction are focused on ensuring that the end-user application obtains the cloud with a given precision.
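Of the four filters, Voxel Grid is the one that directly trades cloud size for precision. A standalone re-implementation of its core idea (the paper uses the PCL filters, not this code) might look like:

```python
def voxel_grid(points, size):
    """Replace all points falling into the same cubic cell of edge
    `size` with their centroid, shrinking the cloud at a controlled
    loss of precision."""
    cells = {}
    for p in points:
        key = tuple(int(c // size) for c in p)   # integer cell index
        cells.setdefault(key, []).append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in cells.values()]
```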
EN
The paper presents an original concept of applying parallel processing methods to the analysis and forecasting of selected meteorological phenomena and elements hazardous for land transportation. The research is based on the Q-vector method, used to determine zones of vertical air currents and their direction and intensity. The frontogenetic function computed with the Q-vector method determines areas of frontogenesis and frontolysis, i.e. the zones where the atmospheric fronts that shape the weather form or dissipate. Processing data from numerical weather prediction models and acquiring and interpreting satellite imagery require high-performance computing systems. Building computers with parallel structures seems to be the right direction of development. The philosophy of parallel processing is to divide the program into fragments, each executed by a different processor, so that the time needed for the whole operation is reduced in proportion to the number of computational units. The effectiveness of forecasting the processes that shape the weather is determined both by appropriate methodologies for determining frontogenetic and frontolytic areas and by methods for their fast, efficient and reliable processing.
EN
The paper presents a discussion concerning the optimal choice of computer hardware for electromagnetic simulations using the FDTD method, as well as the optimization of FDTD codes to obtain their best performance on the particular hardware available. Worldwide research trends on these problems are outlined, and specific practical solutions applied by the authors in the QW-3D and QW-V2D simulators are presented.
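The hot loops that such hardware-oriented optimization targets are the nested field updates. A bare-bones 1D FDTD sketch in normalized units (generic, not the QW-3D/QW-V2D code) shows their shape:

```python
import math

def fdtd_1d(steps, n=200, src=100):
    """Normalized 1D Yee updates at the magic time step (c*dt = dx),
    with an additive Gaussian source at cell `src`; returns the E
    field.  These two inner loops are the memory-bandwidth-bound
    kernels that hardware-aware FDTD optimization focuses on."""
    ez = [0.0] * n
    hy = [0.0] * n
    for t in range(steps):
        for k in range(n - 1):                  # H-field update
            hy[k] += ez[k + 1] - ez[k]
        for k in range(1, n):                   # E-field update
            ez[k] += hy[k] - hy[k - 1]
        ez[src] += math.exp(-((t - 30) / 10.0) ** 2)   # Gaussian pulse
    return ez
```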
EN
Parallelization of Monte Carlo simulations of the Ising spin system, with the lattice distributed in stripes, is proposed. Message passing is applied, and one-sided MPI communication with the MPI memory window is exploited. The 2D Ising spin lattice model is used for testing purposes. The scalability of our simulations is tested in real-life computing on high-performance multicomputers and discussed on the basis of speedup and efficiency. The larger the lattice, the better the scalability obtained.
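The serial kernel that each MPI rank runs over its stripe of rows can be sketched as a Metropolis sweep with periodic boundaries; the boundary-row exchange through the one-sided MPI memory window is omitted here, so this is only the single-process core of the scheme:

```python
import math
import random

def ising_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice with periodic
    boundaries; each spin flips with probability min(1, exp(-beta*dE))."""
    n = len(spins)
    for i in range(n):
        for j in range(n):
            nb = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                  + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
            dE = 2.0 * spins[i][j] * nb          # energy cost of flipping
            if dE <= 0 or rng.random() < math.exp(-beta * dE):
                spins[i][j] = -spins[i][j]
    return spins
```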
14
Content available Parallel mesh generator for biomechanical purpose
EN
The analysis of biological structures with numerical methods based on an engineering approach (i.e. computational solid mechanics) is becoming more and more popular nowadays. The examination of complex, well-reproduced biological structures (e.g. bone) is impossible to perform on a single workstation: a Finite Element Method (FEM) mesh of the order of 10^6 elements is required to model even a small piece of trabecular bone. Homogenization techniques could be used to work around this problem, but they require several assumptions and simplifications. Hence, effective analysis of biological structures in a parallel environment is desirable. Software for structure simulation on cluster architectures is available; however, an FEM mesh generator was still inaccessible in that environment. The mesh generator for biological applications – Cosmoprojector – developed at the Division of Virtual Engineering, Poznan University of Technology, has been adapted for the parallel environment. Preliminary results of complex structure generation confirm the correctness of the proposed method. In this paper, the algorithm for computational mesh generation in a parallel environment is presented. The proposed system has been tested on a biological structure.
EN
Cost-efficient project management based on the Critical Chain Project Management method (CCPM) is investigated in this paper. This is a variant of the resource-constrained project scheduling problem (RCPSP) in which resources are only partially available and a deadline is given, but the cost of the project should be minimized. RCPSP is a well-known NP-hard problem, but in its original form it does not take the initial resource workload into consideration. A metaheuristic algorithm driven by a gain metric was adapted to solve the problem when applied to CCPM. Refinement methods enhancing the quality of the results were developed: the improvement expands the search space by inserting a task in place of an already allocated task if a better allocation can be found for it. The resulting increase in computation time is reduced by distributed calculations. The computational experiments showed significant efficiency of the approach in comparison with greedy methods and with a genetic algorithm, as well as a large reduction in the time needed to obtain results.
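A greedy baseline of the kind such a metaheuristic improves on can be sketched as earliest-fit allocation under a capacity limit; the task format and policy here are illustrative assumptions, and precedence constraints and the paper's gain metric are omitted:

```python
def greedy_schedule(tasks, capacity):
    """Earliest-fit allocation of (duration, resource_demand) tasks
    under a shared capacity limit; returns start times and makespan."""
    used = {}                       # time slot -> capacity in use
    starts = []
    for dur, need in tasks:
        t = 0
        while any(used.get(t + d, 0) + need > capacity for d in range(dur)):
            t += 1                  # slide right until the task fits
        for d in range(dur):
            used[t + d] = used.get(t + d, 0) + need
        starts.append(t)
    makespan = max(s + dur for s, (dur, _) in zip(starts, tasks))
    return starts, makespan
```

Re-inserting a task into the slot of an already allocated one, as the paper's refinement does, can shorten schedules that this earliest-fit rule locks into a poor shape.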
EN
This paper presents an alternative approach to parallel programming that uses programmable graphics cards to support calculations and combines this approach with classical parallelization based on multi-core processors. The tests show the time gain that can be achieved through an appropriate combination of OpenMP with CUDA technology in edge detection on raster images using the Canny algorithm. The tests were carried out on hardware of varying quality. The algorithms are compatible with CC 1.0 (the graphics card's compute capability).
17
Content available ISA evolution – the tip of the iceberg
EN
The instruction-set architecture is determined by many factors, such as technology and users' demands. The evolution of the ISA is illustrated with several milestones in computing history: EDSAC, VAX, and Berkeley RISC. The early-1980s CISC-to-RISC turning point in the architecture paradigm is explained. A short characterization of parallel processing is given – starting from pipelining, through superscalar and VLIW processors, up to petaflops supercomputers using the massively parallel processing technique.
18
Content available remote A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures
EN
In this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from a specific historical period and social background to evolve a set of face images potentially resembling the person whose image is sought. We then use portraits of the person's relatives to judge which of the evolved images are most likely to resemble his or her actual appearance. Unlike typical GAs, our algorithm uses a new supervised form of fitness function which is itself affected by the evolution process. An additional description of requested facial features can be provided to further influence the final solution (i.e. the virtual portrait). We present an example of a virtual portrait created by our algorithm. Finally, the performance of a parallel implementation developed for the KASKADA platform is presented and evaluated.
EN
The paper presents a system for neural network learning based on the idea of cloud computing. The system implementation uses Microsoft Windows Azure technology. The well-known back-propagation learning algorithm was adapted for parallel and distributed execution. An architecture of cooperating WorkerRole processes (instances) is proposed. The paper describes the use of the data storage methods available through the Windows Azure Table, Queue, and Blob Storage mechanisms. Parallel processing is ensured not only by applying multiple WorkerRole processes but also by using the Parallel Extensions for .NET module in the WorkerRole implementation.
EN
This paper discusses a parallel implementation of a Python program that computes the magnetic induction of cylindrical coils. The speed-up obtained with two Python libraries – MPI4Py and Parallel Python – is compared. The use of exact analytical expressions and their parallel implementation makes it possible to achieve computational speed appropriate for practical applications, significantly better than in the case of general numerical methods such as FEM.
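For the on-axis field of a single circular loop there is a closed-form expression, B(z) = μ0·I·R² / (2·(R² + z²)^(3/2)), and evaluating it at many points is embarrassingly parallel. The sketch below uses a thread pool as a minimal stand-in for the MPI4Py / Parallel Python setups compared in the paper, and covers only this simplest coil geometry:

```python
import math
from concurrent.futures import ThreadPoolExecutor

MU0 = 4e-7 * math.pi        # vacuum permeability [T*m/A]

def loop_field_on_axis(zs, current=1.0, radius=1.0, workers=4):
    """On-axis induction of a circular loop,
    B(z) = mu0*I*R^2 / (2*(R^2 + z^2)^(3/2)),
    evaluated at many axial positions in parallel."""
    def b(z):
        return MU0 * current * radius ** 2 / (2.0 * (radius ** 2 + z ** 2) ** 1.5)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(b, zs))
```

At the loop centre (z = 0, I = 1 A, R = 1 m) this reduces to B = μ0/2, which makes a convenient sanity check.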