Results found: 56

Search results
Searched for:
keywords: parallel processing
EN
This paper focuses on finding the best configuration for PSO and GA using different migration blocks as well as different sets of fuzzy-system rules. To achieve this goal, two optimization algorithms, particle swarm optimization (PSO) and the genetic algorithm (GA), were configured to run in parallel and to integrate a migration block that allows us to generate diversity within the subpopulations used by each algorithm. Dynamic parameter adjustment was also performed with a fuzzy system for the following PSO parameters: the cognitive parameter, the social parameter, and the inertia weight. In the GA case, only the crossover parameter was adjusted.
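As a rough illustration of the scheme described above, the sketch below runs a minimal PSO on the sphere function with a linearly decaying inertia weight standing in for the fuzzy-system adjustment; the decay schedule, coefficients, and test function are illustrative assumptions, not the paper's configuration:

```python
import random

def pso_sphere(dim=2, particles=10, iters=100, seed=1):
    """Minimal PSO minimizing the sphere function f(x) = sum(x_i^2).

    The inertia weight decays linearly from 0.9 to 0.4 as a crude
    stand-in for the fuzzy-system parameter adjustment; the cognitive
    and social coefficients (1.5) are likewise illustrative.
    """
    rng = random.Random(seed)
    f = lambda x: sum(v * v for v in x)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    V = [[0.0] * dim for _ in range(particles)]
    P = [x[:] for x in X]                      # personal bests
    g = min(P, key=f)[:]                       # global best
    for t in range(iters):
        w = 0.9 - 0.5 * t / iters              # inertia-weight schedule
        for i in range(particles):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + 1.5 * rng.random() * (P[i][d] - X[i][d])  # cognitive
                           + 1.5 * rng.random() * (g[d] - X[i][d]))    # social
                X[i][d] += V[i][d]
            if f(X[i]) < f(P[i]):
                P[i] = X[i][:]
                if f(P[i]) < f(g):
                    g = P[i][:]
    return g, f(g)
```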
2
Content available remote Simulator of a Supercomputer Job Management System as a Scientific Service
EN
A job management system (JMS) is an important part of any supercomputer: it creates a schedule for launching the jobs of different users. Modern job management systems are complex software systems with a large number of settings. These settings have a significant impact on various JMS metrics, such as supercomputer resource utilization and the mean waiting time of a job in the queue. Various JMS simulators are widely used to study the influence of JMS settings or modifications, new scheduling algorithms, job input stream parameters, or available computing resources on JMS efficiency metrics. The article presents the results of a comparative analysis of current JMS simulators (Alea, ScSF, Batsim, AccaSim, the Slurm simulator) and their application areas. The authors consider new ways to use a JMS simulator as a scientific service for researchers. With such a service, researchers are able to study various hypotheses about JMS efficiency, algorithms, or parameters. This gives the following benefits: (1) research is performed on the service side around the clock, (2) the simulator's accuracy and adequacy are guaranteed by the service, and (3) reproducibility of research results is ensured, with the simulator-as-a-service becoming a single entry point for researchers.
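The kind of JMS metric such simulators estimate can be illustrated with a toy first-come-first-served scheduler; everything here (the job format, the FCFS policy, the single metric) is a simplifying assumption far below what Alea or the Slurm simulator actually model:

```python
import heapq

def simulate_fcfs(jobs, cores):
    """First-come-first-served schedule of (submit, runtime, ncores)
    jobs on a fixed pool of cores; returns the mean waiting time."""
    free = cores
    running = []                    # min-heap of (finish_time, ncores)
    clock = 0.0
    waits = []
    for submit, runtime, need in sorted(jobs):
        clock = max(clock, submit)
        while free < need:          # wait for enough cores to free up
            finish, done = heapq.heappop(running)
            clock = max(clock, finish)
            free += done
        waits.append(clock - submit)
        heapq.heappush(running, (clock + runtime, need))
        free -= need
    return sum(waits) / len(waits)
```

For example, three 2-core jobs of length 10 submitted together to a 4-core pool leave the third job waiting 10 time units, for a mean wait of 10/3.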
3
Content available remote Parallel implementation of a PIC simulation algorithm using OpenMP
EN
Particle-in-cell (PIC) simulations track the individual trajectories of a very large number of particles in self-consistent and external electric and magnetic fields; they are widely used in the study of plasma jets, for example. The main disadvantage of PIC simulations is the long simulation runtime, which often requires a parallel implementation of the algorithm. The current paper focuses on a PIC1d3v simulation algorithm and describes the successful implementation of a parallel version of it on a multi-core architecture using OpenMP, with very promising experimental and theoretical results.
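A minimal sketch of the per-particle update that such a simulation parallelizes might look as follows; a unit charge-to-mass ratio and a 1D electric field are assumed for illustration. The per-particle loop is the natural OpenMP parallel-for region, since particles do not interact within a step:

```python
def push_particles(particles, E, dt, steps):
    """Semi-implicit Euler push of independent (x, v) particles in a
    1D electric field E(x); unit charge-to-mass ratio assumed."""
    out = list(particles)
    for _ in range(steps):
        nxt = []
        for x, v in out:            # in OpenMP: the parallel-for region
            v = v + E(x) * dt       # accelerate in the local field
            x = x + v * dt          # move with the updated velocity
            nxt.append((x, v))
        out = nxt
    return out
```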
EN
The big data concept has elicited studies on how to accurately and efficiently extract valuable information from such huge datasets. The major problem in big data mining is data dimensionality, due to the large number of dimensions in such datasets. The main consequence of high data dimensionality is that it affects the accuracy of machine learning (ML) classifiers; it also wastes time because of the many redundant features present in the dataset. This problem can be addressed with a fast feature reduction method. Hence, this study presents HP-PL, a new hybrid parallel feature reduction framework that utilizes Spark to facilitate feature reduction on shared/distributed-memory clusters. The evaluation of the proposed HP-PL on the KDD99 dataset showed the algorithm to be significantly faster than conventional feature reduction techniques. The proposed technique required just over a minute to select 4 dataset features from over 79 features and 3,000,000 samples on a 3-node cluster (21 cores in total), whereas the comparative algorithm required more than 2 hours to achieve the same feat. In the proposed system, Hadoop's distributed file system (HDFS) was used for distributed storage, while Apache Spark served as the computing engine. The model development was based on a parallel model with full consideration of the high performance and throughput of distributed computing. In conclusion, the proposed HP-PL method achieves good accuracy with less memory and time than conventional feature reduction methods. The tool is publicly available at https://github.com/ahmed/Fast-HP-PL.
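The per-feature scoring that such a framework distributes can be sketched with a thread pool and a simple variance filter; this is a generic stand-in, not the HP-PL algorithm, and the paper distributes the analogous work over a Spark cluster rather than local threads:

```python
from concurrent.futures import ThreadPoolExecutor

def top_k_by_variance(rows, k, workers=4):
    """Score each feature (column) by variance in parallel and return
    the indices of the k highest-variance features, sorted ascending."""
    ncols = len(rows[0])

    def variance(j):
        col = [row[j] for row in rows]
        mean = sum(col) / len(col)
        return sum((v - mean) ** 2 for v in col) / len(col)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(variance, range(ncols)))
    best = sorted(range(ncols), key=lambda j: -scores[j])[:k]
    return sorted(best)
```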
EN
The computing performance optimization of the Short-Lag Spatial Coherence (SLSC) method applied to ultrasound data processing is presented. The method is based on the theory that signals from adjacent receivers are correlated, drawing on a simplified conclusion of the van Cittert-Zernike theorem. It has been proven that it can be successfully used in ultrasound data reconstruction with despeckling. Previous work has shown that the SLSC method in its original form has two main drawbacks: time-consuming processing and low contrast in the area near the transceivers. In this study, we introduce a method that overcomes both of these drawbacks. The presented approach removes the dependency on the distance (the "lag" parameter value) between the signals used to calculate correlations. The approach has been tested by comparing its results with those of the original SLSC algorithm on data acquired from tissue phantoms. The modified method proposed here has constant complexity, so the execution time is independent of the lag parameter value, in contrast to the linear complexity of the original. The presented approach increases computation speed more than tenfold in comparison to the base SLSC algorithm for a typical lag parameter value. It also improves the output image quality in shallow areas without decreasing quality in deeper areas.
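The complexity argument resembles the classic prefix-sum trick, in which a per-output cost proportional to a window (or lag) length becomes constant; the sketch below illustrates the idea generically and is not the SLSC code:

```python
def sliding_sums(xs, w):
    """All window sums of width w in O(n) total: each window sum is a
    difference of two prefix sums, i.e. constant cost per output
    regardless of w (instead of O(w) per output by direct summation)."""
    pref = [0]
    for x in xs:
        pref.append(pref[-1] + x)   # pref[i] = sum of xs[:i]
    return [pref[i + w] - pref[i] for i in range(len(xs) - w + 1)]
```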
6
Content available remote Agoge - an integrated development environment for OpenCL C
EN
The article presents the capabilities and technological solutions of Agoge, an integrated development environment dedicated to OpenCL. The automation of OpenCL context creation and the handling of computational kernel arguments and the computational grid are discussed, and the potential for using the environment in numerical calculations and image analysis is presented.
EN
Mutation testing – a fault-based software testing technique – is a computationally expensive approach. One powerful way to improve the performance of mutation testing without reducing its effectiveness is to employ parallel processing, where mutants and tests are executed in parallel, reducing the total time needed to complete the mutation analysis. This paper proposes three strategies for the parallel execution of mutants on multicore machines using the Parallel Computing Toolbox (PCT) with the Matlab Distributed Computing Server. It aims to demonstrate that computationally intensive software testing schemes, such as mutation, can be facilitated by parallel processing. The experiments were carried out on eight different Simulink models, and the results demonstrate the efficiency of the proposed approaches in terms of execution time during the testing process.
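Since mutants are independent of one another, their execution maps directly onto a worker pool. A Python sketch of the idea follows; the paper itself uses Matlab's PCT, and the mutant/test representation here (plain callables) is an assumption for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def mutation_score(mutants, tests, workers=4):
    """Run every mutant against the full test suite in parallel; a
    mutant is 'killed' when at least one test fails on it.  Returns
    the fraction of killed mutants (the mutation score)."""
    def killed(mutant):
        return any(not test(mutant) for test in tests)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        kills = list(pool.map(killed, mutants))
    return sum(kills) / len(mutants)
```

For instance, for an original function `add(a, b) = a + b`, the mutant `a - b` is killed by a test asserting `f(2, 3) == 5`, while an equivalent mutant `b + a` would survive.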
10
Content available Depth images filtering in distributed streaming
EN
In this paper, we propose a distributed system for processing point clouds and transferring them over a computer network with respect to effectiveness-related requirements. We compare point cloud filters, focusing on their use for streaming optimization. For the filtering step of the stream processing pipeline we evaluate four filters: Voxel Grid, Radius Outlier Removal, Statistical Outlier Removal, and Pass Through. For each filter we perform a series of tests evaluating its impact on point cloud size and transmission frequency (analysed for various fps ratios). We present results of the optimization process used for point cloud consolidation in a distributed environment and describe the processing of the point clouds before and after transmission. Pre- and post-processing allow the user to send the cloud over the network without delays. The proposed pre-processing compression of the cloud and its post-processing reconstruction are focused on ensuring that the end-user application obtains the cloud with a given precision.
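Of the four filters, Voxel Grid is the one that directly trades cloud size for precision. A standalone re-implementation of its core idea (the paper uses the PCL filters, not this code) might look like:

```python
def voxel_grid(points, size):
    """Replace all points falling into the same cubic cell of edge
    `size` with their centroid, shrinking the cloud at a controlled
    loss of precision."""
    cells = {}
    for p in points:
        key = tuple(int(c // size) for c in p)   # integer cell index
        cells.setdefault(key, []).append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in cells.values()]
```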
EN
The paper presents an original concept of applying parallel processing methods to the analysis and forecasting of selected meteorological phenomena and elements hazardous for land transportation. The research is based on the Q-vector method, used to determine zones of vertical air currents and their direction and intensity. The frontogenetic function computed with the Q-vector method determines areas of frontogenesis and frontolysis, i.e. the zones where the atmospheric fronts that shape the weather form or dissipate. Processing data from numerical weather prediction models and acquiring and interpreting satellite imagery require high-performance computing systems. Building computers with parallel structures seems to be the right direction of development. The philosophy of parallel processing is to divide the program into fragments, each executed by a different processor, so that the time needed for the whole operation is reduced in proportion to the number of computational units. The effectiveness of forecasting the processes that shape the weather is determined both by appropriate methodologies for determining frontogenetic and frontolytic areas and by methods for their fast, efficient and reliable processing.
EN
The paper presents a discussion concerning the optimal choice of computer hardware for electromagnetic simulations using the FDTD method, as well as the optimization of FDTD codes to obtain their best performance on the particular hardware available. Worldwide research trends on these problems are outlined, and specific practical solutions applied by the authors in the QW-3D and QW-V2D simulators are presented.
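The hot loops that such hardware-oriented optimization targets are the nested field updates. A bare-bones 1D FDTD sketch in normalized units (generic, not the QW-3D/QW-V2D code) shows their shape:

```python
import math

def fdtd_1d(steps, n=200, src=100):
    """Normalized 1D Yee updates at the magic time step (c*dt = dx),
    with an additive Gaussian source at cell `src`; returns the E
    field.  These two inner loops are the memory-bandwidth-bound
    kernels that hardware-aware FDTD optimization focuses on."""
    ez = [0.0] * n
    hy = [0.0] * n
    for t in range(steps):
        for k in range(n - 1):                  # H-field update
            hy[k] += ez[k + 1] - ez[k]
        for k in range(1, n):                   # E-field update
            ez[k] += hy[k] - hy[k - 1]
        ez[src] += math.exp(-((t - 30) / 10.0) ** 2)   # Gaussian pulse
    return ez
```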
EN
Parallelization of Monte Carlo simulations of the Ising spin system, with the lattice distributed in stripes, is proposed. Message passing is applied, and one-sided MPI communication with the MPI memory window is exploited. The 2D Ising spin lattice model is used for testing purposes. The scalability of our simulations is tested in real-life computing on high-performance multicomputers and discussed on the basis of speedup and efficiency. The larger the lattice, the better the scalability obtained.
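The serial kernel that each MPI rank runs over its stripe of rows can be sketched as a Metropolis sweep with periodic boundaries; the boundary-row exchange through the one-sided MPI memory window is omitted here, so this is only the single-process core of the scheme:

```python
import math
import random

def ising_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice with periodic
    boundaries; each spin flips with probability min(1, exp(-beta*dE))."""
    n = len(spins)
    for i in range(n):
        for j in range(n):
            nb = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                  + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
            dE = 2.0 * spins[i][j] * nb          # energy cost of flipping
            if dE <= 0 or rng.random() < math.exp(-beta * dE):
                spins[i][j] = -spins[i][j]
    return spins
```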
14
Content available Parallel mesh generator for biomechanical purpose
EN
The analysis of biological structures with numerical methods based on an engineering approach (i.e. computational solid mechanics) is becoming more and more popular nowadays. The examination of complex, well-reproduced biological structures (e.g. bone) is impossible to perform on a single workstation: a Finite Element Method (FEM) mesh of the order of 10^6 elements is required to model even a small piece of trabecular bone. Homogenization techniques could be used to work around this problem, but they require several assumptions and simplifications. Hence, effective analysis of biological structures in a parallel environment is desirable. Software for structure simulation on cluster architectures is available; however, an FEM mesh generator was still inaccessible in that environment. The mesh generator for biological applications – Cosmoprojector – developed at the Division of Virtual Engineering, Poznan University of Technology, has been adapted for the parallel environment. Preliminary results of complex structure generation confirm the correctness of the proposed method. In this paper, the algorithm for computational mesh generation in a parallel environment is presented. The proposed system has been tested on a biological structure.
EN
Cost-efficient project management based on the Critical Chain Project Management method (CCPM) is investigated in this paper. This is a variant of the resource-constrained project scheduling problem (RCPSP) in which resources are only partially available and a deadline is given, but the cost of the project should be minimized. RCPSP is a well-known NP-hard problem, but in its original form it does not take the initial resource workload into consideration. A metaheuristic algorithm driven by a gain metric was adapted to solve the problem when applied to CCPM. Refinement methods enhancing the quality of the results were developed: the improvement expands the search space by inserting a task in place of an already allocated task if a better allocation can be found for it. The resulting increase in computation time is reduced by distributed calculations. The computational experiments showed significant efficiency of the approach in comparison with greedy methods and with a genetic algorithm, as well as a large reduction in the time needed to obtain results.
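A greedy baseline of the kind such a metaheuristic improves on can be sketched as earliest-fit allocation under a capacity limit; the task format and policy here are illustrative assumptions, and precedence constraints and the paper's gain metric are omitted:

```python
def greedy_schedule(tasks, capacity):
    """Earliest-fit allocation of (duration, resource_demand) tasks
    under a shared capacity limit; returns start times and makespan."""
    used = {}                       # time slot -> capacity in use
    starts = []
    for dur, need in tasks:
        t = 0
        while any(used.get(t + d, 0) + need > capacity for d in range(dur)):
            t += 1                  # slide right until the task fits
        for d in range(dur):
            used[t + d] = used.get(t + d, 0) + need
        starts.append(t)
    makespan = max(s + dur for s, (dur, _) in zip(starts, tasks))
    return starts, makespan
```

Re-inserting a task into the slot of an already allocated one, as the paper's refinement does, can shorten schedules that this earliest-fit rule locks into a poor shape.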
EN
This paper presents an alternative approach to parallel programming that uses programmable graphics cards to support calculations and combines this approach with classical parallelization based on multi-core processors. The tests show the time gain that can be achieved through an appropriate combination of OpenMP with CUDA technology in edge detection on raster images using the Canny algorithm. The tests were carried out on hardware of varying quality. The algorithms are compatible with CC 1.0 (the graphics card's compute capability).
17
Content available ISA evolution – the tip of the iceberg
EN
The instruction-set architecture is determined by many factors, such as technology and users' demands. The evolution of the ISA is illustrated with several milestones in computing history: EDSAC, VAX, and Berkeley RISC. The early-1980s CISC-to-RISC turning point in the architecture paradigm is explained. A short characterization of parallel processing is given – starting from pipelining, through superscalar and VLIW processors, up to petaflops supercomputers using the massively parallel processing technique.
18
Content available remote A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures
EN
In this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from a specific historical period and social background to evolve a set of face images potentially resembling the person whose image is sought. We then use portraits of the person's relatives to judge which of the evolved images are most likely to resemble his or her actual appearance. Unlike typical GAs, our algorithm uses a new supervised form of fitness function which is itself affected by the evolution process. An additional description of requested facial features can be provided to further influence the final solution (i.e. the virtual portrait). We present an example of a virtual portrait created by our algorithm. Finally, the performance of a parallel implementation developed for the KASKADA platform is presented and evaluated.
EN
The paper presents a system for neural network learning based on the idea of cloud computing. The system implementation uses Microsoft Windows Azure technology. The well-known back-propagation learning algorithm was adapted for parallel and distributed execution. An architecture of cooperating WorkerRole processes (instances) is proposed. The paper describes the use of the data storage methods available through the Windows Azure Table, Queue, and Blob Storage mechanisms. Parallel processing is ensured not only by applying multiple WorkerRole processes but also by using the Parallel Extensions for .NET module in the WorkerRole implementation.
EN
This paper discusses a parallel implementation of a Python program that computes the magnetic induction of cylindrical coils. The speed-up obtained with two Python libraries – MPI4Py and Parallel Python – is compared. The use of exact analytical expressions and their parallel implementation makes it possible to achieve computational speed appropriate for practical applications, significantly better than in the case of general numerical methods such as FEM.
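For the on-axis field of a single circular loop there is a closed-form expression, B(z) = μ0·I·R² / (2·(R² + z²)^(3/2)), and evaluating it at many points is embarrassingly parallel. The sketch below uses a thread pool as a minimal stand-in for the MPI4Py / Parallel Python setups compared in the paper, and covers only this simplest coil geometry:

```python
import math
from concurrent.futures import ThreadPoolExecutor

MU0 = 4e-7 * math.pi        # vacuum permeability [T*m/A]

def loop_field_on_axis(zs, current=1.0, radius=1.0, workers=4):
    """On-axis induction of a circular loop,
    B(z) = mu0*I*R^2 / (2*(R^2 + z^2)^(3/2)),
    evaluated at many axial positions in parallel."""
    def b(z):
        return MU0 * current * radius ** 2 / (2.0 * (radius ** 2 + z ** 2) ** 1.5)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(b, zs))
```

At the loop centre (z = 0, I = 1 A, R = 1 m) this reduces to B = μ0/2, which makes a convenient sanity check.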