Search results - BazTech

1

Novel architecture for floating point accumulator with cancelation error detection

Jamro E., Dąbrowska-Boruch A., Russek P., Wielgosz M., Wiatr K.

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2018

|

Vol. 66, no. 5

579-587

EN

A floating point accumulator cannot be obtained straightforwardly due to its pipeline architecture and feedback loop. Therefore, an essential part of the proposed floating point accumulator is a critical accumulation loop which is limited to an integer adder and 16-bit shifter only. The proposed accumulator detects a catastrophic cancellation which occurs e.g. when two similar numbers are subtracted. Additionally, modules with reduced hardware resources for rough error evaluation are proposed. The proposed architecture does not comply with the IEEE-754 floating point standard but it guarantees that a correct result, with an arbitrarily defined number of significant bits, is obtained. The proposed calculation philosophy focuses on the desired result error rather than on calculation precision as such.
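The cancellation check mentioned in this abstract can be illustrated in software. What follows is only a minimal Python sketch, not the hardware architecture from the paper: it assumes IEEE-754 double precision (53-bit significand), and the function name `add_with_cancellation_check` and the `max_lost_bits` threshold are hypothetical names chosen for illustration.

```python
import math

SIGNIFICAND_BITS = 53  # assumption: IEEE-754 double precision


def add_with_cancellation_check(a: float, b: float, max_lost_bits: int = 20):
    """Add two floats and estimate how many leading bits cancelled.

    Returns (sum, lost_bits, flagged). `max_lost_bits` is a hypothetical
    threshold above which the result is reported as unreliable.
    """
    s = a + b
    if a == 0.0 or b == 0.0:
        return s, 0, False
    # Binary exponent of the larger operand.
    big_exp = max(math.frexp(a)[1], math.frexp(b)[1])
    if s == 0.0:
        # Total cancellation: no significant bit of the result survives.
        lost = SIGNIFICAND_BITS
    else:
        # A drop in exponent means that many leading bits cancelled out.
        lost = max(0, big_exp - math.frexp(s)[1])
    return s, lost, lost > max_lost_bits
```

Adding `1.0` and `2.0` loses no bits, while `1.0000001 + (-1.0)` loses most of the leading bits and is flagged with the default threshold.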

2

Wykorzystanie akceleracji sprzętowej przy implementacji metryk podobieństwa tekstów [Use of hardware acceleration in the implementation of text similarity metrics]

Iwanecki Ł., Koryciak S., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2014

|

Vol. 60, no. 7

426-428

PL (English translation)

The article describes research on text classifiers. The task was to design a hardware accelerator that would speed up the semantic classification of texts. The project was divided into two parts. The aim of the first part was to propose a hardware implementation of an algorithm realizing a metric for computing document similarity. In the second part, the complete hardware accelerator system was designed. The next design stage is the integration of the metric model with the acceleration system.

EN

The aim of this project is to propose a hardware acceleration system that improves the text categorization process. Text categorization is the task of assigning electronic documents to predefined groups based on their content. The process is complex and requires a high-performance computing system and a large number of comparisons. This paper suggests a method of improving text categorization using FPGA technology. The main disadvantage of common processing systems is that they are single-threaded: only one instruction can be executed per time unit. FPGA technology improves concurrency. In this case, hundreds of large numbers may be compared in one clock cycle. The project is divided into two independent parts. First, a hardware model of the required metrics is implemented. Two metrics are useful for computing the distance between two texts; they are given as equations (1) and (2). The formulas are similar and differ only in the denominator. This part results in two hardware models of the presented metrics. The main purpose of the second part of the project is to design a hardware acceleration system. The system is based on a Xilinx Zynq device. It consists of a Cortex-A9 ARM processor, a DMA controller and a dedicated IP core containing the accelerator. The block diagram of the system is presented in Fig. 4. The DMA controller provides duplex transmission between the DDR3 memory and the accelerating unit, bypassing the CPU. The project is still in development. The last step is to integrate the hardware metrics model with the acceleration system.
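The abstract does not reproduce equations (1) and (2), so the two metrics themselves are unknown here. As an assumed stand-in only, the Python sketch below uses the Jaccard index and the overlap coefficient, a well-known pair of set-similarity measures that share a numerator and differ only in the denominator. The function names and the representation of a document as a set of integer hashes are illustrative assumptions, not the authors' design.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B| (assumed example metric, not from the paper)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def overlap(a: set, b: set) -> float:
    """|A ∩ B| / min(|A|, |B|): same numerator, different denominator."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Two hypothetical documents represented as sets of term hashes.
doc_a = {1, 2, 3, 4}
doc_b = {3, 4, 5}
print(jaccard(doc_a, doc_b), overlap(doc_a, doc_b))
```

In hardware, the shared numerator (the intersection count) is the part that benefits from many parallel comparators; only the final division differs between the two metrics.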

3

Równoległa implementacja algorytmu winnowing dla operacji strumieniowej analizy tekstu [Parallel implementation of the winnowing algorithm for streaming text analysis]

Wielgosz M., Żurek D., Pietroń M., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2014

|

Vol. 60, no. 5

309-312

PL (English translation)

This work analyzes the possibility of using the winnowing algorithm for stream processing of textual information. Particular emphasis is placed on generating the fingerprint as a reduced representation of a text message. The authors carried out a series of experiments to determine the efficiency of the algorithm and the achievable computational speedup, using a node with Intel Xeon E5645 2.40 GHz processors and an Nvidia Tesla m2090 GPU card.

EN

Several models are available for information retrieval and text analysis, but two are considered dominant, namely the Boolean model and the vector space model (VSM). A model maps the existing words or text into a new representation space. This paper presents winnowing, a Boolean n-gram-based algorithm for fast text search and comparison of documents, with the main focus on its implementation and performance analysis. The algorithm is used to generate fingerprints (i.e. sets of hashes) of the analyzed documents. A dedicated test framework, which uses the PAN test corpus and programming environment, was designed and implemented to evaluate the algorithm. Several tests were conducted to determine the comparison quality for obfuscated and non-obfuscated text with the winnowing algorithm and different window and n-gram sizes. The tests revealed interesting properties of the algorithms with respect to document comparison and defined the limits of their applicability. Owing to their simplicity, n-gram-based algorithms are well suited for hardware implementation. Thus, the authors implemented the computationally demanding part of fingerprint generation on both CPU and GPU. Performance measurements for the Intel Xeon E5645 2.40 GHz and Nvidia Tesla m2090 implementations of the n-gram-based algorithm show an approximately 14x computational speedup.
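Fingerprint generation by winnowing can be sketched in a few lines. The Python example below is a sequential illustration of the general technique, not the authors' CPU/GPU implementation: it hashes every n-gram and, in each window of `w` consecutive hashes, keeps the rightmost minimum. The use of CRC32 as the n-gram hash and the function names are assumptions made for the sketch.

```python
import zlib


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-byte n-gram of the text (CRC32 is an assumed choice)."""
    data = text.encode("utf-8")
    return [zlib.crc32(data[i:i + k]) for i in range(len(data) - k + 1)]


def winnow(hashes: list[int], w: int) -> set[tuple[int, int]]:
    """Return the fingerprint as a set of (position, hash) pairs.

    For each window of w consecutive hashes, the minimum is selected;
    ties are broken by taking the rightmost occurrence.
    """
    fingerprint = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Index of the rightmost occurrence of the minimum in the window.
        offset = w - 1 - window[::-1].index(smallest)
        fingerprint.add((start + offset, smallest))
    return fingerprint


fp = winnow(kgram_hashes("a do run run run, a do run run", k=5), w=4)
print(len(fp), "hashes kept")
```

Because each window is processed independently of the others, the selection step parallelizes naturally, which is consistent with the abstract's remark that n-gram-based algorithms suit hardware and GPU implementation.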

4

Komunikacja ze sprzętowym akceleratorem haszowania n-gramów dla procesora ARM z wykorzystaniem portu ACP [Communication with a hardware n-gram hashing accelerator for an ARM processor using the ACP port]

Barszczowski M., Koryciak S., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2014

|

Vol. 60, no. 7

486-488

PL (English translation)

The article describes bringing up the ACP port in a Xilinx EPP device using a CDMA that manages transfers between the accelerator and the processor cores. The main goal of the research was to create a module performing so-called hashing of data sets. A Zynq 7000 device, which contains programmable logic resources and two ARM A9 cores, was used for this purpose. Two concepts for implementing the accelerator were developed. The first version assumed a direct flow of data from the source to the accelerator and then to the ARM cores. The second solution uses the ACP port.

EN

This paper introduces a new approach to hardware acceleration using the ACP (Accelerator Coherency Port) in the Xilinx Zynq-7000 EPP XC7Z020. The first prototype allocated BRAM memory and transferred data through the ACP. The second one used a hardware hashing module to process data outside the CPU. The module received and returned data through the ACP. The main task of the system is to replace a set of data with a shorter, constant-length representation without involving the processing unit. The main benefit of hashing data lies in the constant length of the function output, which leads to data compression. Compression is highly desirable when comparing large subsets of data, especially in data mining. Executing a hashing function demands high CPU performance because of the computational complexity of the algorithm. Two concepts were established. The first one assumed transferring data directly to the hardware accelerator and later to the ARM cores. This solution is attractive because it is simple and relatively fast. Unfortunately, the data cannot be processed before hashing by the same CPU without a significant speed reduction. The second approach used the ACP, which can transfer data very quickly between L2/L3 cache memory without flushing or invalidating the cache. The data can be processed by the software-driven CPU, sent to the accelerator and then sent back to the CPU for further processing. To accomplish this task, the Zynq 7000 EPP, with a dual-core ARM A9 and programmable logic in one chip, was used.

5

Medical Visualizer 3D: Hardware Controller for Dmd Module

Koryciak S., Barszczowski M., Dąbrowska-Boruch A., Wiatr K.

Image Processing & Communications

|

2014

|

Vol. 19, no. 2-3

15-23

EN

This paper describes an implementation of the module that controls a micro-mirror array for later use in projection. Existing technologies allow medical images in the Digital Imaging and Communications in Medicine format to be projected only as flat 2D images. The 3D Visualizer will display medical images in three dimensions using its own projection surface. The device controlling the matrix was developed largely on the basis of reverse engineering of a working system based on a driver from Texas Instruments. The driver is built on an FPGA with MicroBlaze, a soft processor from Xilinx.

6

Sprzętowa akceleracja wybranych algorytmów kompresji obrazu nieruchomego w standardzie JPEG [Hardware acceleration of selected still image compression algorithms in the JPEG standard]

Koryciak S., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2012

|

Vol. 58, no. 7

593-595

PL (English translation)

The article describes the development of an accelerator for selected still image compression algorithms. The hardware description language VHDL was used for its hardware implementation. The result of the work was a successful implementation, on a programmable device, of a decompressor for still images stored in the JPEG standard ISO/IEC 10918-1 (1993), in Baseline mode, which is the basic and mandatory mode of this standard. Particular attention was paid to the selection and implementation of the two algorithms in this standard that the author considers most important.

EN

Image compression is one of the most important topics in industry, commerce and scientific research. Image compression algorithms need to perform a large number of operations on a large amount of data. In the case of compression and decompression of still images, the time needed to process a single image is not critical. However, the assumption of this project was to build a solution which would be fully parallel, sequential and synchronous. The paper describes the development of an accelerator for selected still image compression algorithms. The hardware description language VHDL was used for its hardware implementation. The result of this work was a successful implementation, on a programmable device, of a decompressor for still images saved in the JPEG standard ISO/IEC 10918-1 (1993), Baseline mode, which is the primary, fundamental and mandatory mode of this standard. The modular system and the method of connection allow a continuous input data stream. Particular attention was paid to the selection and implementation of the two algorithms occurring in this standard that are, in the authors' opinion, the most important. The IDCT module uses the IDCT-SQ transform algorithm as modified by the authors of this paper. It provides full pipelining by applying the same kind of arithmetic operations between each stage. The module used to decode Huffman codes proved to be a bottleneck.

7

Hardware Implementation of IDCT Fast Algorithms for Still Images Decompression in the Jpeg Standard

Koryciak S., Dąbrowska-Boruch A., Wiatr K.

Image Processing & Communications

|

2012

|

Vol. 17, no. 4

103-108

EN

Many algorithms are used in the JPEG standard for compression of still images, but the most demanding one is the DCT. The fast discrete cosine transform is the basic transform, which occurs in most coding algorithms. In the case of images it is performed on 8×8 pixel blocks. The paper presents a comparison of IDCT algorithms, focused on the number of arithmetic operations, the number of multiplications, and the number of pipeline stages. The results were obtained by implementing each algorithm in an FPGA programmable device (xc6vlx240t).
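As a reference point for what the fast algorithms compute, the 8×8 inverse DCT can be written directly from its definition. The Python sketch below is a naive row-column baseline under the usual JPEG normalization, not any of the fast algorithms compared in the paper; the function names are illustrative.

```python
import math

N = 8  # JPEG block size


def idct_1d(coeffs):
    """Naive 8-point inverse DCT: 64 cosine terms per call."""
    out = []
    for x in range(N):
        total = 0.0
        for u in range(N):
            cu = math.sqrt(0.5) if u == 0 else 1.0  # DC scaling factor
            total += cu * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(total / 2.0)
    return out


def idct_2d(block):
    """Separable 2-D IDCT: 1-D transform on every row, then every column."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    # cols[x][y] holds pixel (y, x); transpose back to row-major order.
    return [[cols[x][y] for x in range(N)] for y in range(N)]


# A block with only a DC coefficient decodes to a flat block.
dc_only = [[0.0] * N for _ in range(N)]
dc_only[0][0] = 8.0
pixels = idct_2d(dc_only)
```

The row-column split needs 16 one-dimensional transforms per block; fast IDCT algorithms reduce the arithmetic inside each of those transforms, which is the kind of operation count the paper compares.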

8

Statistics in cyphertext detection

Gancarczyk G., Dąbrowska-Boruch A., Wiatr K.

Prace Instytutu Elektrotechniki

|

2011

|

Issue 251

67-85

EN

When the word "encrypted" appears in an article, the word "decryption" usually follows. However, knowledge of the plaintext is not always what matters most. An example is network data analysis, where all that is needed is the information that encrypted data were sent from one user to another, or the total amount of encrypted data on the observed path. Also, before any attempt at decryption, the data must somehow be distinguished from non-encrypted messages. This paper shows that encrypted information can be detected with high probability using only simple Digital Data Processing. That knowledge can be very helpful in preventing cyberattacks, ensuring safety and detecting security breaches in local networks, or even fighting software piracy on the Internet. Similar solutions are successfully used in steganalysis and network anomaly detection.

PL (English translation)

Modern cryptography uses sophisticated and computationally complex mathematical and logical transformations to hide important plaintext information from unauthorized persons. The vast majority of them still refer to the postulate formulated in 1949 by Claude E. Shannon that perfectly concealed information is characterized by the fact that none of the symbols appearing in it is more probable than any other symbol of the alphabet in use. According to this definition, perfectly encrypted data resemble random data with a uniform distribution, that is, their distribution resembles white noise. The detector concept is based on an algorithm that analyzes the input data for similarity to white noise. The reference values are very well known, and deriving them poses no difficulty. Determining tolerance limits for each parameter experimentally yields a fully working algorithm that performs a binary classification into plain/secret. The group of 14 Statistical Parameters presented includes energy, mean value, and central moments. On their basis, a first-order classifier can be built. The efficiency with which the first-order classifier correctly distinguishes data ranges from 80% to 90% (depending on the quantity used in the algorithm). To improve detection, a second-order classifier is proposed and then presented, based on two or more mutually uncorrelated Statistical Parameters. This solution increases the efficiency to about 95%. The algorithm proposed in the article can be used for cryptanalysis, statistical data analysis, and network data analysis.
The article also presents the concept of a third-order classifier, which additionally uses non-statistical information, for the correct detection of encrypted data.
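The idea of comparing statistical parameters of the data with their white-noise reference values can be sketched as follows. This Python example is a hedged illustration rather than the authors' detector: it uses only two parameters (mean and variance of byte values), takes the reference values from a uniform distribution over 0..255, and uses a hypothetical 10% tolerance where the paper determines tolerances experimentally.

```python
UNIFORM_MEAN = 127.5              # mean of a uniform distribution on 0..255
UNIFORM_VAR = (256 ** 2 - 1) / 12  # its variance, 5461.25


def byte_stats(data: bytes):
    """Return (mean, variance) of the byte values in a non-empty buffer."""
    n = len(data)
    mean = sum(data) / n
    var = sum((b - mean) ** 2 for b in data) / n
    return mean, var


def looks_encrypted(data: bytes, mean_tol: float = 0.1, var_tol: float = 0.1) -> bool:
    """Binary plain/secret decision; the tolerances are assumed values."""
    mean, var = byte_stats(data)
    return (abs(mean - UNIFORM_MEAN) <= mean_tol * UNIFORM_MEAN
            and abs(var - UNIFORM_VAR) <= var_tol * UNIFORM_VAR)


print(looks_encrypted(b"the quick brown fox jumps over the lazy dog" * 10))
```

Plain ASCII text fails both checks because its bytes cluster in a narrow range. Any buffer with a flat byte histogram passes, whether it is random or not, so a check this simple cannot by itself confirm encryption.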

9

Sprzętowy detektor szyfrowanej informacji przesyłanej w sieciach TCP/IP [Hardware detector of encrypted information transmitted in TCP/IP networks]

Gancarczyk G., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2011

|

Vol. 57, no. 8

923-925

PL (English translation)

The article presents the implementation, characteristic features, and principle of operation of a device that detects packets containing encrypted data sent in networks based on the TCP/IP protocol stack. The detector was built on the Digilent SPARTAN 3E Development Kit [1]. Its key component is the Xilinx xc3s1600e FPGA [2]. The article presents the block diagram of the detector, the detection efficiency of the software and hardware solutions, and the logic resources occupied by the design.

EN

The paper describes how to build a device that can detect encrypted data transfer in computer networks based on the TCP/IP protocol stack. Its features and principles of operation are given. The device is based on Digilent's SPARTAN 3E Development Kit [1], whose key element is the Xilinx xc3s1600e [2]. The available publications on distinguishing ciphertext from plaintext say only that methods typical for checking the randomness of encryption algorithms can be used [6]. Many interesting publications on related data-distinguishing problems, such as steganography [7] and the detection of computer worms and viruses [3, 4], can easily be found. Example FPGA implementations of these are not difficult to find either [8]. The lack of publications in the field of encrypted message detection was a partial motivation for this paper (Section 1). The presented algorithm for encrypted data detection is based on theorems from [9, 10]. Its advantages and disadvantages are discussed (Section 2). The detector (of the so-called 2nd order) chosen for implementation has good theoretical efficiency (Tab. 1). Its block diagram is shown in Fig. 1 (Section 3). The results of synthesis and implementation are given in Tab. 2, and its efficiency in Tab. 3. The functionality of all blocks of Fig. 1 is discussed (Sections 4 and 5). The efficiency of the implemented device is almost as good as the theoretical one. There are two main limitations: the lower (100 B) and upper (1460 B) length of the Ethernet frame data field, and the maximum frequency of the device clock, which makes it unable (as for xc3s1600) to operate in Gigabit Ethernet networks (Section 6). The presented device can be used as a network data analyzer, a ciphertext detector and a network anomaly detector.

10

Implementation of FCT transformation in JPEG-XR standard in programmable devices

Dąbrowska-Boruch A., Wiatr K.

Automatyka / Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie

|

2011

|

Vol. 15, no. 3

83-91

EN

JPEG-XR is a new standard for still image compression. The paper presents basic information on the FCT transformation, modifications of the FCT transformation blocks, and the results of a hardware implementation of the FCT transformation in the Xilinx V5LX110 chip.

PL (English translation)

JPEG-XR is the newest still image compression standard. The article presents basic information about the FCT transformation. It also presents the modifications introduced and the results obtained from implementing the FCT transformation in the Xilinx V5LX110 chip.

11

Efektywność parametrów statystycznych w detekcji informacji szyfrowanej [Efficiency of statistical parameters in the detection of encrypted information]

Gancarczyk G., Dąbrowska-Boruch A., Wiatr K.

Pomiary Automatyka Kontrola

|

2010

|

Vol. 56, no. 10

1137-1143

PL (English translation)

Encrypted information, like all other types of data, can be subjected to statistical analysis. Determining parameters such as its mean value, variance, or entropy poses no great difficulty. Modern numerical tools such as MATLAB, Mathcad, or Microsoft Excel can be used for this. The question this study sets out to answer is: "do these parameters carry knowledge that can be put to practical use?" One example application is determining whether information is encrypted (cipher text) or plain (plain text).

EN

A cipher text, like any other data, can be analysed using parameters typical of statistics. Values such as the mean, variance or entropy are easy to calculate, especially with numerical tools such as MATLAB, Mathcad or simply Microsoft Excel. The question this paper should answer is: "do those parameters provide any information that could be used in any useful way?" An example is the information whether the analysed data is cipher or plain text. The available publications on distinguishing cipher from plain text use only methods typical for testing the randomness of cipher text and random number generators, or resistance to cipher breaking. They are presented in the paper by the National Institute of Standards and Technology [1]. The other common method used for distinguishing the data is analysis based on entropy [2]. The lack of published results on the efficiency of methods based on e.g. entropy is an additional motivation for this paper (see Paragraph 1). The proposed algorithms use parameters and transformations typical of Statistics and Signal Processing to classify the analysed data as cipher/plain. The authors assume that cipher data are very similar to random numbers due to Shannon's Perfect Secrecy theorem [3]. Six types of plain and cipher data (text, music, image, video, archives and others), seven types of cipher cores (3DES, AES, Blowfish, CAST-128, RC4, Serpent, Twofish) and data of various lengths (1 B to 2323 B) were examined, and a group of so-called Statistic Parameters was formed (see Table 1). Definitions of all of them (and a few more) are given by equations (1) to (12). The efficiency of the Statistic Parameters after 1417 test samples is shown in Table 2. The most interesting results are also shown in Figs. 1 to 9 (see Paragraphs 2-4). The results show that using simple values such as energy, one can build a data distinguisher with an efficiency of 90% and low numerical complexity.
The lower bound for usability of this method was found to be 200 B. The upper bound was not found. The presented algorithm can be used for creating a network data analyser or cipher text detector (see Paragraph 5).
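Entropy, one of the parameters named in this abstract, is straightforward to compute over byte values. The Python sketch below shows the standard Shannon entropy in bits per byte; it is an illustration of the quantity only, and the function name is an assumption rather than anything taken from the paper.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


print(byte_entropy(b"plain text tends to use few distinct byte values"))
```

A buffer of n bytes contains at most n distinct values, so its measured entropy cannot exceed log2(n) bits; a 200-byte sample therefore stays below 8 bits per byte even for ideal cipher data, which may be one reason very short samples are hard to classify.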

12

Utilization of FPGA Architectures for High Performance Computations

Dąbrowska-Boruch A., Jamro E., Janiszewski M., Kuna D., Machaczek K., Russek P., Wiatr K., Wielgosz M.

Computational Methods in Science and Technology

|

2010

|

Vol. spec. iss. (1)

63-69

EN

The primary intention of this paper is to present the results of several cases where FPGA technology was used for high performance calculations. We gathered applications that had been developed over the last couple of years. Over this period we observed a rapid growth of interest in the use of FPGAs for HPC. Based on our expertise, we give selected metrics, results and conclusions which, in our opinion, are important for anyone interested in FPGAs as an alternative for faster computations. A brief description of the characteristics of FPGAs and of FPGA usage for acceleration is also included for novices on the subject.

13

Implementacja kodeka standardu MPEG-2 w układach FPGA [Implementation of an MPEG-2 standard codec in FPGA devices]

Dąbrowska-Boruch A., Wiatr K.

Automatyka / Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie

|

2008

|

Vol. 12, no. 3

615-623

PL (English translation)

The compression method used in the MPEG-2 standard is a combination of other standards, namely JPEG and H.261. Because the video signal is in this case a sequence of still images, compression techniques similar to those of the JPEG standard can be applied. The article presents the results of implementing a video signal processing path compliant with the ISO/IEC 13818 standard specification in the Xilinx XC2VP100(-6)FF1704 device.

EN

The compression method applied in the MPEG-2 standard is a combination of different standards, namely JPEG and H.261. Because the video signal is a sequence of still pictures, compression techniques similar to those of the JPEG standard can be used. The paper presents implementation results for a video signal processing path compatible with the ISO/IEC 13818 standard specification in the Xilinx XC2VP100(-6)FF1704 chip.