Wyniki wyszukiwania - BazTech

1

High Frequency Rule Synthesis in a Large Scale Multiple Database with MapReduce

Bisoyi Sudhanshu Shekhar, Mishra Pragnyaban, Mishra Saroja Nanda

International Journal of Electronics and Telecommunications

|

2022

|

Vol. 68, No. 2

177--186

EN

Increasing development in information and communication technology leads to the generation of large amount of data from various sources. These collected data from multiple sources grows exponentially and may not be structurally uniform. In general, these are heterogeneous and distributed in multiple databases. Because of large volume, high velocity and variety of data mining knowledge in this environment becomes a big data challenge. Distributed Association Rule Mining(DARM) in these circumstances becomes a tedious task for an effective global Decision Support System(DSS). The DARM algorithms generate a large number of association rules and frequent itemset in the big data environment. In this situation synthesizing highfrequency rules from the big database becomes more challenging. Many algorithms for synthesizing association rule have been proposed in multiple database mining environments. These are facing enormous challenges in terms of high availability, scalability, efficiency, high cost for the storage and processing of large intermediate results and multiple redundant rules. In this paper, we have proposed a model to collect data from multiple sources into a big data storage framework based on HDFS. Secondly, a weighted multi-partitioned method for synthesizing high-frequency rules using MapReduce programming paradigm has been proposed. Experiments have been conducted in a parallel and distributed environment by using commodity hardware. We ensure the efficiency, scalability, high availability and costeffectiveness of our proposed method.

2

Finding frequent items: novel method for improving Apriori algorithm

Karimtabar Noorollah, Fard Mohammad Javad Shayegan

Computer Science

|

2022

|

T. 23 (2)

161--177

EN

In this paper, we use an intelligent method for improving the Apriori algorithm in order to extract frequent itemsets. PAA (the proposed Apriori algorithm) pursues two goals: first, it is not necessary to take only one data item at each step – in fact, all possible combinations of items can be generated at each step; and second, we can scan only some transactions instead of scanning all of the transactions to obtain a frequent itemset. For performance evaluation, we conducted three experiments with the traditional Apriori, BitTableFI, TDM-MFI, and MDC-Apriori algorithms. The results exhibited that the algorithm execution time was significantly reduced due to the significant reduction in the number of transaction scans to obtain the itemset. As in the first experiment, the time that was spent to generate frequent items underwent a reduction of 52% as compared to the algorithm in the first experiment. In the second experiment, the amount of time that was spent was equal to 65%, while in the third experiment, it was equal to 46%.

3

Implementation of dynamic and fast mining algorithms on incremental datasets to discover qualitative rules

Naresh Pannangi, Suguna R.

Applied Computer Science

|

2021

|

Vol. 17, no 3

82--91

EN

Face recognition is one of the applications in image processing that recognizes or checks an individual's identity. 2D images are used to identify the face, but the problem is that this kind of image is very sensitive to changes in lighting and various angles of view. The images captured by 3D camera and stereo camera can also be used for recognition, but fairly long processing times is needed. RGB-D images that Kinect produces are used as a new alternative approach to 3D images. Such cameras cost less and can be used in any situation and any environment. This paper shows the face recognition algorithms’ performance using RGB-D images. These algorithms calculate the descriptor which uses RGB and Depth map faces based on local binary pattern. Those images are also tested for the fusion of LBP and DCT methods. The fusion of LBP and DCT approach produces a recognition rate of 97.5% during the experiment

4

A Contribution to the Use of Decision Diagrams for Loading and Mining Transaction Databases

Salleb-Aouissi A., Vrain C.

Fundamenta Informaticae

|

2007

|

Vol. 78, nr 4

575-594

EN

In this paper, we mainly address the problem of loading transaction datasets into main memory and estimating the density of such datasets. We propose BoolLoader, an algorithm dedicated to these tasks; it relies on a compressed representation of all the transactions of the dataset. For sake of efficiency, we have chosen Decision Diagrams as the main data structure to the representation of datasets into memory. We give an experimental evaluation of our algorithm on both dense and sparse datasets. Experiments have shown that BoolLoader is efficient for loading some dense datasets and gives a partial answer about the nature of the dataset before time-consuming pattern extraction tasks. We further investigate the use of Algebraic Decision Diagrams by studying the feasibility of current Data Mining operations, as for instance computing the support of an itemset and even mining frequent itemsets.