Search results - BazTech

1

Robust simulation method of complex technical transport systems

Szpytko Janusz, Salgado Duarte Yorlandys

Transport Problems

|

2021

|

T. 16, z. 2

101--112

EN

In the optimization of technical systems focused on a specific functional purpose (reliability, safety, and availability) with the use of simulation methods, an important parameter is the digital simulation time of the system under study. As the complexity of the problem grows, the digital simulation time increases. The aim of the article is to present a method (a combination of parallel computing and variance reduction techniques) for reducing the computer simulation time of the technical object under study. An example application of the developed method is presented as the result of an experiment conducted for decision making and control processes aimed at optimizing the operation of overhead cranes in critical conditions. In this paper, by selecting parallel batch job computation and stratified sampling, we exponentially decreased the simulation time, finding fast and practical solutions and eliminating the time constraint in the search for solutions.
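
As an illustration of the stratified sampling named in the abstract above, the following minimal Python sketch compares a plain Monte Carlo estimate with a stratified one. It is not the authors' method: the function names and the toy model `u * u` are hypothetical stand-ins for a real simulation run, and the parallel batch part is only hinted at (each stratum is independent, so it could be dispatched as a separate batch job).

```python
import random


def plain_estimate(f, n, rng):
    """Plain Monte Carlo estimate of the mean of f(U), U uniform on [0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n


def stratified_estimate(f, n, strata, rng):
    """Stratified estimate: split [0, 1) into equal strata and sample each equally."""
    per_stratum = n // strata
    total = 0.0
    for k in range(strata):
        lo = k / strata
        # Draw samples uniformly inside the k-th stratum only.
        stratum_mean = sum(
            f(lo + rng.random() / strata) for _ in range(per_stratum)
        ) / per_stratum
        # Every stratum has the same width, hence the same weight 1/strata.
        total += stratum_mean / strata
    return total


def toy_model(u):
    # Hypothetical stand-in for one simulation run driven by a uniform input.
    return u * u


if __name__ == "__main__":
    rng = random.Random(0)
    print("plain     :", plain_estimate(toy_model, 1000, rng))
    print("stratified:", stratified_estimate(toy_model, 1000, 100, rng))
```

Both estimators target the same mean (1/3 for the toy model); for the same number of samples the stratified version typically has a much smaller spread, which is what allows fewer simulation runs for a given accuracy.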

2

Productivity of a low-budget computer cluster applied to overcome the n-body problem

Nowicki Tomasz, Gregosiewicz Adam, Łagodowski Zbigniew

Applied Computer Science

|

2021

|

Vol. 17, no 4

100--109

EN

The classical n-body problem in physics addresses the prediction of individual motions of a group of celestial bodies under gravitational forces and has been studied since Isaac Newton formulated his laws. Nowadays the n-body problem is recognized in many more fields of science and engineering: any problem of mutual interaction between objects forming a dynamic group is called an n-body problem. The cost of the direct algorithm for the problem is O(n^2), which is not acceptable from a practical point of view. For this reason, cheaper algorithms have been developed, successfully reducing the cost to O(n ln n) or even O(n). Because further improvement of the algorithms is unlikely, it is hardware solutions that can still accelerate the calculations. The obvious answer here is a computer cluster that can perform the calculations in parallel. This paper focuses on the performance of a low-budget computer cluster, created on an ad hoc basis, applied to n-body problem calculations. To obtain results of engineering value, a real technical problem was selected for study: the Discrete Vortex Method, which is used for simulating air flows. The presented research included writing original computer code, building a computer cluster, performing simulations and comparing the results.
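
The quadratic cost of the direct algorithm mentioned in the abstract above can be seen in a minimal Python sketch of pairwise summation. It assumes the classical gravitational case in 2D rather than the Discrete Vortex Method studied in the paper; the function name, the constant `g` and the softening term `eps` are illustrative assumptions.

```python
import math


def direct_accelerations(positions, masses, g=1.0, eps=1e-9):
    """Acceleration on every body by direct pairwise summation (2D gravity).

    The two nested loops visit every ordered pair of bodies, which is the
    source of the O(n^2) cost.
    """
    n = len(positions)
    acc = []
    for i in range(n):
        ax = ay = 0.0
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            # eps is a small softening term that avoids division by zero.
            r2 = dx * dx + dy * dy + eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            ax += g * masses[j] * dx * inv_r3
            ay += g * masses[j] * dy * inv_r3
        acc.append((ax, ay))
    return acc


if __name__ == "__main__":
    print(direct_accelerations([(0.0, 0.0), (1.0, 0.0)], [1.0, 2.0]))
```

The outer loop iterations are independent of each other, which is why this kind of computation is commonly split across the nodes of a cluster.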

3

Practical parallelization of Gear-Nordsieck and Brayton-Gustavson-Hatchel stiff ODE solver

Stabrowski Marek

Annals of Computer Science and Information Systems

|

2021

|

Vol. 25

313--316

EN

The paper compares two ODE solvers using an example of a heat transfer equation. The sequential version of the Brayton-Gustavson-Hatchel (BGH) solver was slightly inferior to the Gear-Nordsieck solver. Profiling of the algorithms led to the decision to parallelize the linear equation solving section and the function evaluation. The first approach (parallelizing linear equation solving) improves the performance of both algorithms. The second approach (parallelizing function evaluation) boosts the performance of the BGH solver. Finally, the fully parallel version of the BGH solver was shown to be more efficient with respect to processing time.

4

An efficient parallel global optimization strategy based on Kriging properties suitable for material parameters identification

Roux Emile, Tillier Yannick, Kraria Salim, Bouchard Pierre-Olivier

Archive of Mechanical Engineering

|

2020

|

LXVII, nr 2

169--195

EN

Material parameters identification by inverse analysis using finite element computations leads to the resolution of complex and time-consuming optimization problems. One way to deal with these complex problems is to use meta-models to limit the number of objective function computations. In this paper, the Efficient Global Optimization (EGO) algorithm is used. The EGO algorithm is applied to specific objective functions, which are representative of material parameters identification issues. Isotropic and anisotropic correlation functions are tested. For anisotropic correlation functions, it leads to a significant reduction of the computation time. Besides, they appear to be a good way to deal with the weak sensitivity of the parameters. In order to decrease the computation time, a parallel strategy is defined. It relies on a virtual enrichment of the meta-model, in order to compute q new objective functions in a parallel environment. Different methods of choosing the q new objective functions are presented and compared. Speed-up tests show that Kriging Believer (KB) and minimum Constant Liar (CLmin) enrichments are suitable methods for this parallel EGO (EGO-p) algorithm. However, it must be noted that the most interesting speed-ups are observed for a small number of objective functions computed in parallel. Finally, the algorithm is successfully tested on a real parameters identification problem.

5

Implementacja metody „Wir w Komórce” w środowisku wieloprocesorowym dla zagadnienia przepływu w komorze z ruchomą ścianką

Błoński Dominik

Zeszyty Energetyczne

|

2019

|

T. 6

27--44

PL

The paper presents an algorithm for solving the equations of fluid motion with the "Vortex in Cell" method, using a fourth-order compact finite difference scheme to solve the Poisson equation and the diffusion equation. The successive steps of the algorithm are described, together with accuracy studies of the individual difference schemes. The computational program was verified on the popular cavity flow problem. The results obtained were compared with results published by other authors.

6

Parallel computations and co-simulation in universal mechanism software. Part 2: Examples

Pogorelov Dmitry, Rodikov Alexander, Kovalev Roman

Transport Problems

|

2019

|

T. 14, z. 4

31--38

EN

The second part of the paper continues the discussion of parallel computations in railway dynamics. The algorithms described in the first part of the paper are applied to parallel simulation, on computers with multicore processors, of six different models of rail vehicles and trains with the number of degrees of freedom ranging from about one hundred to more than 20 thousand. A considerable simulation speedup is reported. In addition, an example of the evaluation of wheel profile wear on multicore processors and a comparison of different approaches to multi-variant computations are considered.

7

Parallel computations and co-simulation in universal mechanism software. Part 1: Algorithms and implementation

Pogorelov Dmitry, Rodikov Alexander, Kovalev Roman

Transport Problems

|

2019

|

T. 14, z. 3

163--175

EN

Parallel computations speed up simulation of multibody system dynamics, in particular the dynamics of railway vehicles and trains. This is important for reducing the time required at the design stage of new railway vehicles, for increasing the complexity of the problems studied, and for real-time applications. We consider the implementation of parallel computations in Universal Mechanism software in three different areas: simulation of rail vehicle and train dynamics, evaluation of wheel profile wear and multi-variant computations. The use of clusters for parallel running of multi-variant computations is illustrated. Co-simulation based on the interface between Universal Mechanism and Matlab/Simulink and other software tools is discussed.

8

Identification of local elastic parameters in heterogeneous materials using a parallelized femu method

Petureau L., Doumalin P., Bremand F.

International Journal of Applied Mechanics and Engineering

|

2019

|

Vol. 24, no. 4

140--156

EN

In this work, we explore the possibilities of the widespread Finite Element Model Updating method (FEMU) in order to identify the local elastic mechanical properties in heterogeneous materials. The objective function is defined as a quadratic error of the discrepancy between measured fields and simulated ones. We compare two different formulations of the function, one based on the displacement fields and one based on the strain fields. We use a genetic algorithm in order to minimize these functions. We prove that the strain functional associated with the genetic algorithm is the best combination. We then improve the implementation of the method by parallelizing the algorithm in order to reduce the computation cost. We validate the approach with simulated cases in 2D.

9

Realizacja metody cząstek wirowych w środowisku wieloprocesorowym z użyciem schematów różnicowych wysokiego rzędu

Błoński Dominik

Zeszyty Energetyczne

|

2018

|

T. 5

25--40

PL

The paper presents an algorithm for solving the equations of fluid motion using the compact, fourth-order "Vortex in Cell" method. The first part describes how the systems of linear equations are built using the high-performance hypre library. The successive steps of the algorithm are presented, together with studies of the accuracy and computational performance of the individual difference schemes. Finally, the accuracy of the method is examined on the Taylor-Green problem, which has a known exact solution, and the results obtained are compared with the exact values.

10

Wpływ zastosowania obliczeń równoległych na czas wykonywania rozmytych algorytmów sterowania

Dróżdż Ł., Roj J.

Przegląd Elektrotechniczny

|

2018

|

R. 94, nr 11

26--29

PL

W artykule poruszono zagadnienia związane z wykorzystaniem obliczeń równoległych przy implementacji algorytmów sterowania bazujących na zasadach logiki rozmytej. Przedstawiono realizację przykładowego regulatora, w którym wykorzystano standard specyfikacji przetwarzania współbieżnego OpenMP oraz QtConcurrent. Opisano przeprowadzone badania porównawcze standardowego algorytmu sekwencyjnego z algorytmem wykorzystującym obliczenia równoległe pod względem szybkości działania.

EN

The paper addresses issues related to the use of parallel computation in the implementation of control algorithms based on the principles of fuzzy logic. The implementation of an example controller that uses the OpenMP standard and QtConcurrent is presented. Comparative studies of the execution speed of a standard sequential algorithm and an algorithm that uses parallel computation are described.

11

Architektura węzłowa : superkomputer klasy Beowulf

Lenarczyk P., Piotrowski Z.

Elektronika : konstrukcje, technologie, zastosowania

|

2018

|

Vol. 59, nr 2

12--14

PL

Zapotrzebowanie na możliwości obliczeniowe nieustannie wzrasta w wielu dziedzinach wiedzy. Dotyczy to również działów, które wcześniej uznawane były za niewymagające obliczeniowo. Szczególną odpowiedzią jest technologia superkomputerów, których liczba dynamicznie zwiększa się. Możliwość rozwoju poszczególnych dziedzin może zostać w znacznym stopniu ułatwiona dzięki rozwojowi technologii Obliczeń Ogólnego Przeznaczenia z użyciem typowych graficznych procesorów masowo równoległych. W artykule zawarto kompletny opis budowy superkomputera w architekturze węzłowej, wraz z opisem problemów związanych z praktyczną implementacją.

EN

Nowadays computational demands are growing rapidly in many scientific areas. This also applies to engineering branches that were previously considered not very computationally demanding. One answer is supercomputer technology, with the number of supercomputers increasing dynamically. Particular attention should be given to supercomputers with General Purpose Graphics Processing Unit technology; such coprocessors can readily enhance computational power in many scientific areas of interest. The paper describes the node architecture of a supercomputer together with the problems of its practical implementation.

12

Multi-thread evolutionary computation for design optimization

Krenich S.

Technical Transactions

|

2017

|

Vol. 9(114)

197--206

EN

The paper presents multi-thread calculations using parallel evolutionary algorithms (EA) for single and multicriteria design optimization. This approach was implemented to avoid a negative influence of incorrectly chosen initial and EA’s control parameters on the accuracy of generated solutions and thereby to improve the effectiveness of the EA’s use. Parallel computation for single optimization problems relies just on running n threads with different randomly chosen parameters in order to find the best final solution. For multicriteria optimization problems, each thread generates a set of Pareto optimal solutions and at the end these sets are combined together, giving a real set of Pareto optimal solutions. During the run of the algorithm, random interactions between threads were applied. The experiments were carried out using ten-thread processes for different examples of single and multicriteria design optimization problems, two of which are presented in the paper.

PL

W artykule przedstawiono wielowątkowe obliczenia równoległe z wykorzystaniem algorytmów ewolucyjnych (AE) dla jedno- i wielokryterialnej optymalizacji konstrukcji. Przedstawioną metodę wykorzystano w celu uniknięcia negatywnego wpływu niewłaściwie dobranych parametrów inicjujących i sterujących w algorytmie ewolucyjnym na dokładność obliczeń, a tym samym w celu poprawy efektywności działania algorytmu. Obliczenia równoległe dla optymalizacji jednokryterialnej polegają na uruchomieniu n wątków z losowo dobranymi parametrami AE z przyjętych zakresów i zbiorów dyskretnych. Dla optymalizacji wielokryterialnej każdy wątek generuje niezależny zbiór rozwiązań Pareto, a następnie na końcu zbiory te są łączone w finalny zbiór rozwiązań Pareto. W trakcie obliczeń wprowadzono losowe interakcję między wątkami. Eksperymenty przeprowadzono z wykorzystaniem 10 wątków równoległych dla wielu przykładów, dwa przedstawiono w artykule.
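
The final step described in the abstracts above, combining the Pareto sets produced by individual threads into one set, can be sketched in Python as follows. This is only an illustration under stated assumptions: all objectives are minimized, each solution is represented by its tuple of objective values, and the function names are hypothetical rather than taken from the paper.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization in every objective)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def merge_pareto_fronts(fronts):
    """Combine per-thread fronts and keep only globally non-dominated points."""
    # A set removes duplicates found independently by several threads.
    candidates = {tuple(p) for front in fronts for p in front}
    return sorted(
        p for p in candidates
        if not any(dominates(q, p) for q in candidates)
    )


if __name__ == "__main__":
    thread_a = [(1, 5), (3, 3)]
    thread_b = [(2, 2), (5, 1)]
    # (3, 3) is dominated by (2, 2) and is dropped from the merged set.
    print(merge_pareto_fronts([thread_a, thread_b]))
```

A point that is Pareto optimal within one thread may be dominated by a point found in another thread, which is why the merged set has to be filtered again.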

13

Improving performance of non-interior point based optimal power flow algorithm computations

Połomski M.

Przegląd Elektrotechniczny

|

2017

|

R. 93, nr 1

320--332

EN

This paper presents parallelization and performance improvement experiments for an optimal power flow (OPF) algorithm based on the non-interior point (NIP) method. The aim is to investigate the impact of parallelization techniques on the overall speedup of OPF computations. The presented approach takes advantage of the structure of the algorithm and exploits it through multithreading to gain computation speedup. The results obtained give insight into the impact of multithreading techniques and algorithm initialization techniques.

PL

W artykule zaprezentowano wyniki eksperymentów prowadzących do redukcji czasu realizacji obliczeń algorytmu metody non-interior point (NIP) w zastosowaniu do zadania optymalizacji rozpływu mocy (OPF). Celem pracy było zbadanie wpływu zastosowania technik zrównoleglenia obliczeń na czas realizacji zadania OPF. W zaprezentowanym podejściu brano pod uwagę strukturę algorytmu oraz wykorzystano implementację wielowątkową. Uzyskane wyniki pokazują wpływ wielowątkowej implementacji oraz zastosowanych technik inicjalizacji algorytmu na czas obliczeń.

14

Zastosowanie sztucznych sieci neuronowych oraz architektury OPENCL w spektralnej i falkowej analizie prądu silnika LSPMSM

Pietrowski W., Wiśniewski G. D., Górny K.

Poznan University of Technology Academic Journals. Electrical Engineering

|

2017

|

No. 91

311--321

PL

W artykule przedstawiono autorskie algorytmy obliczeń równoległych które zostały zastosowane w oprogramowaniu do diagnostyki silnika LSPMSM. Oprogramowanie umożliwia spektralną i falkową analizę prądu maszyny a także posiada wbudowane mechanizmy sztucznych sieci neuronowych (SSN) które to mogą służyć jako element decyzyjny systemu diagnostycznego. Ponadto przybliżono tematykę związaną ze strukturą zastosowanej sieci neuronowej, algorytmami nauczania sztucznych sieci neuronowych oraz standardem OpenCL.

EN

The paper presents parallel computing algorithms used in a program for the diagnosis of an LSPMSM machine. The software supports spectral and wavelet analysis of the phase current of the LSPMSM motor. Moreover, the program has a built-in artificial neural network that serves as the decision element of the diagnostic system. In addition, the article introduces issues related to the structure and learning algorithms of artificial neural networks and to OpenCL.

15

Interpretable decision-tree induction in a big data parallel framework

Weinberg A. I., Last M.

International Journal of Applied Mathematics and Computer Science

|

2017

|

Vol. 27, no. 4

737--748

EN

When running data-mining algorithms on big data platforms, a parallel, distributed framework, such as MAPREDUCE, may be used. However, in a parallel framework, each individual model fits the data allocated to its own computing node without necessarily fitting the entire dataset. In order to induce a single consistent model, ensemble algorithms such as majority voting, aggregate the local models, rather than analyzing the entire dataset directly. Our goal is to develop an efficient algorithm for choosing one representative model from multiple, locally induced decision-tree models. The proposed SySM (syntactic similarity method) algorithm computes the similarity between the models produced by parallel nodes and chooses the model which is most similar to others as the best representative of the entire dataset. In 18.75% of 48 experiments on four big datasets, SySM accuracy is significantly higher than that of the ensemble; in about 43.75% of the experiments, SySM accuracy is significantly lower; in one case, the results are identical; and in the remaining 35.41% of cases the difference is not statistically significant. Compared with ensemble methods, the representative tree models selected by the proposed methodology are more compact and interpretable, their induction consumes less memory, and, as confirmed by the empirical results, they allow faster classification of new records.
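
The selection step described in the abstract above (choose the model that is most similar to all the others) can be sketched in Python as below. The abstract does not define the syntactic similarity measure used by SySM, so this sketch substitutes a plain Jaccard similarity over sets of rule strings; that measure, the rule representation and the function names are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (stand-in for the paper's similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def most_representative(models):
    """Index of the model with the highest total similarity to all the others."""
    def total_similarity(i):
        return sum(
            jaccard(models[i], models[j])
            for j in range(len(models))
            if j != i
        )
    return max(range(len(models)), key=total_similarity)


if __name__ == "__main__":
    # Each hypothetical local model is reduced to the set of rules it contains.
    local_models = [
        {"age<=30", "income>50k"},
        {"age<=30", "income>50k", "owns_home"},
        {"owns_home", "region=north"},
    ]
    print(most_representative(local_models))
```

Unlike an ensemble, this approach returns a single existing tree, which is what keeps the result compact and interpretable.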

16

A rapid algebraic 3D volume image reconstruction technique for cone beam computed tomography

Al-masni M. A., Al-antari M. A., Metwally M. K., Kadah Y. M., Han S. M., Kim T. S.

Biocybernetics and Biomedical Engineering

|

2017

|

Vol. 37, no. 4

619--629

EN

Computed tomography (CT) is a widely used imaging technique in medical diagnosis. Among the latest advances in CT imaging techniques, the use of cone-beam X-ray projections, instead of the usual planar fan beam, promises faster yet safer 3D imaging in comparison to the previous CT imaging methodologies. This technique is called Cone Beam CT (CBCT). However, these advantages come at the expense of a more challenging 3D reconstruction problem that is still an active research area to improve the speed and quality of image reconstruction. In this paper, we propose a rapid parallel Multiplicative Algebraic Reconstruction Technique (rpMART) via a vectorization process for CBCT which gives more accurate and faster reconstruction even with a lower number of projections via parallel computing. We have compared rpMART with the parallel version of Algebraic Reconstruction Technique (pART) and the conventional non-parallel versions of npART, npMART and Feldkamp, Davis, and Kress (npFDK) techniques. The results indicate that the reconstructed volume images from rpMART provide a higher image quality index of 0.99 than the indices of pART and npFDK of 0.80 and 0.39, respectively. Also the proposed implementation of rpMART and pART via parallel computing significantly reduce the reconstruction time from more than 6 h with npART and npMART to 580 and 560 s with the full 360° projections data, respectively. We consider that rpMART could be a better image reconstruction technique for CBCT in clinical applications instead of the widely used FDK method.

17

The influence of side thermal insulation on distribution of the temperature field in an electrical floor heater

Gołębiowski J., Forenc J.

Przegląd Elektrotechniczny

|

2016

|

R. 92, nr 12

271--277

EN

Two models of side thermal insulation (adiabatic and lossy) were examined in the analysis of the operation of an electrical floor heater. The temperature field distributions obtained in both cases were compared. The computational cost of accounting for the edge effects caused by insulation lossiness was estimated. The use of parallel operation of a traditional processor (CPU) and a graphics processing unit (GPU) enabled a significant reduction of the computation time.

PL

W analizie pracy elektrycznego grzejnika podłogowego rozpatrywano dwa modele bocznej izolacji termicznej (idealnej i rzeczywistej). Porównano rozkłady pola temperatury wyznaczone w wymienionych przypadkach. Oszacowano obliczeniowe koszty uwzględnienia efektów krawędziowych spowodowanych stratnością izolacji. Zastosowanie równoległej pracy tradycyjnego procesora (CPU) oraz procesora karty graficznej (GPU) umożliwiło znaczne skrócenie czasu obliczeń numerycznych.

18

Parallel computation of transient processes on OpenCL framework

Cegielski M.

Przegląd Elektrotechniczny

|

2016

|

R. 92, nr 7

75--78

EN

Parallel execution of transient analysis calculations is based on splitting the model into sub-systems which, in certain time increments, are calculated independently of each other. Each such process has a high computational complexity. The calculation process lends itself to parallel systems based on GPUs, whose dynamic growth has been observed for several years. The article presents a brief description of parallel computing systems based on the OpenCL platform that uses GPUs. The possibility of implementing the algorithm using this platform is described. The timing of operations performed on a GPU relative to calculations on a classic CPU is also discussed.

PL

Równoległa realizacja obliczeń analizy stanów przejściowych bazuje na podziale na poziomie modelu na pod-układy, które w określonych krokach czasowych obliczane są niezależnie od siebie. Każdy taki proces charakteryzuje się dużą złożonością obliczeniową. Proces realizacji obliczeń pozwala na zastosowanie do obliczeń systemów równoległych opartych o wykorzystanie GPU, których dynamiczny rozwój jest obserwowany od kilku lat. W artykule przedstawiono krótką charakterystykę równoległych systemów obliczeniowych opartych o platformę OpenCL wykorzystującą procesory GPU. Opisano możliwość implementacji algorytmu z wykorzystaniem tej platformy. Omówiono zależności czasowe realizacji obliczeń na procesorach graficznych w stosunku do obliczeń na klasycznych CPU.

19

Effectiveness of Fast Fourier Transform implementations on GPU and CPU

Puchała D., Stokfiszewski K., Szczepaniak B., Yatsymirskyy M.

Przegląd Elektrotechniczny

|

2016

|

R. 92, nr 7

69--71

EN

In this paper, we present the results of a comparison of the effectiveness of selected variants of radix-2 Fast Fourier Transform (FFT) algorithms implemented on both Graphics (GPU) and Central (CPU) Processing Units. The considered algorithms differ in memory consumption and the arrangement of data-flow paths, which affects global memory coalescing and cache memory exploitation. The obtained results make it possible to indicate the variants of FFT algorithms that are best suited for GPU and CPU architectures, to confirm the advisability of GPU-oriented calculations of FFT, and to formulate a guideline for implementations of fast algorithms of various linear transforms.

PL

W niniejszej pracy przedstawiono wyniki porównania efektywności wybranych wariantów algorytmów szybkiej transformaty Fouriera (FFT) typu radix-2 realizowanych zarówno dla procesorów graficznych (GPU) jak i typowych jednostek centralnych (CPU). Rozważane algorytmy różnią się zapotrzebowaniem pamięciowym oraz postaciami grafów przepływu danych, które mają wpływ na spójność wykorzystania pamięci globalnej oraz pamięci cache jednostek GPU i CPU. Uzyskane wyniki pozwalają na wskazanie wariantów algorytmów FFT, które są najlepiej dostosowane dla architektur GPU i CPU, pozwalają też potwierdzić celowość realizacji implementacji FFT zorientowanych na wykorzystanie jednostek GPU, a także sformułować ogólne wytyczne dla implementacji zorientowanych na wykorzystanie jednostek GPU algorytmów szybkich przekształceń liniowych.
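
For orientation, the radix-2 FFT family compared in the abstracts above can be illustrated with the textbook recursive decimation-in-time form in Python. This sketch is not one of the GPU or CPU variants evaluated in the paper and says nothing about their memory layouts; the function name is an assumption.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


if __name__ == "__main__":
    print(fft_radix2([1, 1, 1, 1]))
```

The butterflies within one stage are independent of each other, which is the property that parallel implementations exploit; how the data are arranged between stages is what distinguishes the variants the paper compares.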

20

Widmowa i falkowa analiza prądu silnika LSPMSM z wykorzystaniem OpenCL

Pietrowski W., Wiśniewski G. D., Górny K.

Poznan University of Technology Academic Journals. Electrical Engineering

|

2016

|

No. 85

355--364

PL

W artykule przedstawiono zastosowanie algorytmów obliczeń równoległych oraz funkcji zawartych w bibliotece OpenCL do analizy harmonicznej i analizy falkowej prądu fazowego silnika LSPMSM. Opisano interface programowania OpenCL oraz opracowane oprogramowanie w języku C++, w którym zaimplementowano zarówno algorytmy sekwencyjne realizowane przez CPU jak również algorytmy równoległe realizowane przez GPU. Przedstawiono porównanie czasu obliczeń algorytmem sekwencyjnym oraz algorytmem równoległym.

EN

The article presents a comparison of the computing time of a parallel and a sequential algorithm in spectral and wavelet analysis of the current of an LSPMSM motor. The test calculations were made on two different computer setups for different numbers of signal samples. The results of the harmonic analysis test calculations show that the parallel algorithm reduced the signal processing time several times compared to the sequential algorithm. The advantage of the parallel algorithm grows with the number of signal samples processed.