We present the implementation of the hardware ANS compressor in FPGAs. The main goal of the design was to propose a solution suitable to low-cost, low-energy embedded systems. We propose the streaming-rANS algorithm of the ANS family as a target for the implementation. Also, we propose a set of algorithm parameters that substantially reduce the use of FPGA resources, and we examine what is the influence of the chosen parameters on compression performance. Further, we compare our design to the lossless codecs found in literature, and to the streaming-rANS codecs with arbitrary parameters.
2
Dostęp do pełnego tekstu na zewnętrznej witrynie WWW
The main goal of this work was to implement a reliable machine learning algorithm that can classify the age of a dog, given only a photograph of its face. The problem, which seems simple for humans, presents itself as very difficult for the machine learning algorithms due to differences of facial features among the dogs population. As convolutional neural networks (CNNs) performed poorly in this problem, authors took other approach of creating novel architecture consisting of combination of CNN and vision transformer (ViT) and examining the age of the dogs separately for every breed. Authors were able to achieve better results than those in initial works covering the problem.
We address one of the weaknesses of the RSA ciphering systems \textit{i.e.} the existence of the private keys that are relatively easy to compromise by the attacker. The problem can be mitigated by the Internet services providers, but it requires some computational effort. We propose the proof of concept of the GPGPU-accelerated system that can help detect and eliminate users' weak keys. We have proposed the algorithms and developed the GPU-optimised program code that is now publicly available and substantially outperforms the tested CPU processor. The source code of the OpenSSL library was adapted for GPGPU, and the resulting code can perform both on the GPU and CPU processors. Additionally, we present the solution how to map a triangular grid into the GPU rectangular grid \textendash{} the basic dilemma in many problems that concern pair-wise analysis for the set of elements. Also, the comparison of two data caching methods on GPGPU leads to the interesting general conclusions. We present the results of the experiments of the performance analysis of the selected algorithms for the various RSA key length, configurations of GPU grid, and size of the tested key set.
In our study, we present the results of the implementation of the SHA-512 algorithm in FPGAs. The distinguished element of our work is that we conducted the work using OpenCL for FPGA, which is a relatively new development method for reconfigurable logic. We examine loop unrolling as an OpenCL performance optimization method and compare the efficiency of the different kernel implementation types: NDRange, Single-Work Item, and SIMD kernels. In our conclusions, we compare the metrics of the created FPGA accelerator to the corresponding GPGPU solutions. Also, our paper is accompanied by a source code repository to allow the reader to follow and extend our survey.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.