Wyniki wyszukiwania - BazTech

1

Basu T., Murthy C. A.

Fundamenta Informaticae

|

2015

|

Vol. 141, nr 4

275--295

EN

The similarity based decision rule computes the similarity between a new test document and the existing documents of the training set that belong to various categories. The new document is grouped to a particular category in which it has maximum number of similar documents. A document similarity based supervised decision rule for text categorization is proposed in this article. The similarity measure determine the similarity between two documents by finding their distances with all the documents of training set and it can explicitly identify two dissimilar documents. The decision rule assigns a test document to the best one among the competing categories, if the best category beats the next competing category by a previously fixed margin. Thus the proposed rule enhances the certainty of the decision. The salient feature of the decision rule is that, it never assigns a document arbitrarily to a category when the decision is not so certain. The performance of the proposed decision rule for text categorization is compared with some well known classification techniques e.g., k-nearest neighbor decision rule, support vector machine, naive bayes etc. using various TREC and Reuter corpora. The empirical results have shown that the proposed method performs significantly better than the other classifiers for text categorization.

2

Centrality Measures, Upper Bound, and Influence Maximization in Large Scale Directed Social Networks

Pal S.K., Kundu S., Murthy C.A.

Fundamenta Informaticae

|

2014

|

Vol. 130, nr 3

317--342

EN

The paper addresses the problem of finding top k influential nodes in large scale directed social networks. We propose two new centrality measures, Diffusion Degree for independent cascade model of information diffusion and Maximum Influence Degree. Unlike other existing centrality measures, diffusion degree considers neighbors' contributions in addition to the degree of a node. The measure also works flawlessly with non uniform propagation probability distributions. On the other hand, Maximum Influence Degree provides the maximum theoretically possible influence (Upper Bound) for a node. Extensive experiments are performed with five different real life large scale directed social networks. With independent cascade model, we perform experiments for both uniform and non uniform propagation probabilities. We use Diffusion Degree Heuristic (DiDH) and Maximum Influence Degree Heuristic (MIDH), to find the top k influential individuals. k seeds obtained through these for both the setups show superior influence compared to the seeds obtained by high degree heuristics, degree discount heuristics, different variants of set covering greedy algorithms and Prefix excluding Maximum Influence Arborescence (PMIA) algorithm. The superiority of the proposed method is also found to be statistically significant as per T-test.

3

Variance as a Stopping Criterion for Genetic Algorithms with Elitist Model

Bhandari D., Murthy C.A., Pal S.K.

Fundamenta Informaticae

|

2012

|

Vol. 120, nr 2

145-164

EN

Genetic Algorithm (GA) has now become one of the leading mechanisms in providing solution to complex optimization problems. Although widely used, there are very few theoretical guidelines for determining when to stop the algorithm. This article establishes theoretically that the variance of the best fitness values obtained in the iterations can be considered as a measure to decide the termination criterion of a GA with elitist model (EGA). The criterion automatically takes into account the inherent characteristics of the objective function. Implementation issues of the proposed stopping criterion are explained. Its difference with some other stopping criteria is also critically analyzed.

4

Tensor Framework and Combined Symmetry for Hypertext Mining

Saha S., Murthy C.A., Pal S.K.

Fundamenta Informaticae

|

2009

|

Vol. 97, nr 1/2

215-234

EN

We have made a case here for utilizing tensor framework for hypertext mining. Tensor is a generalization of vector and tensor framework discussed here is a generalization of vector space model which is widely used in the information retrieval and web mining literature. Most hypertext documents have an inherent internal tag structure and external link structure that render the desirable use of multidimensional representations such as those offered by tensor objects. We have focused on the advantages of Tensor Space Model, in which documents are represented using sixth-order tensors. We have exploited the local-structure and neighborhood recommendation encapsulated by the proposed representation. We have defined a similarity measure for tensor objects corresponding to hypertext documents, and evaluated the proposed measure for mining tasks. The superior performance of the proposed methodology for clustering and classification tasks of hypertext documents have been demonstrated here. The experiment using different types of similarity measure in the different components of hypertext documents provides the main advantage of the proposed model. It has been shown theoretically that, the computational complexity of an algorithm performing on tensor framework using tensor similarity measure as distance is at most the computational complexity of the same algorithmperforming on vector space model using vector similarity measure as distance.

5

A New Probabilistic Approach for Fractal Based Image Compression

Mitra S.K., Kundu M.K., Murthy C.A., Bhattacharya Bhargab.B., Acharya T.

Fundamenta Informaticae

|

2008

|

Vol. 87, nr 3-4

417-433

EN

Approximation of an image by the attractor evolved through iterations of a set of contractive maps is usually known as fractal image compression. The set of maps is called iterated function system (IFS). Several algorithms, with different motivations, have been suggested towards the solution of this problem. But, so far, the theory of IFS with probabilities, in the context of image compression, has not been explored much. In the present article we have proposed a new technique of fractal image compression using the theory of IFS and probabilities. In our proposed algorithm, we have used a multiscaling division of the given image up to a predetermined level or up to that level at which no further division is required. At each level, the maps and the corresponding probabilities are computed using the gray value information contained in that image level and in the image level higher to that level. A fine tuning of the algorithm is still to be done. But, the most interesting part of the proposed technique is its extreme fastness in image encoding. It can be looked upon as one of the solutions to the problem of huge computational cost for obtaining fractal code of images.

6

Rough set Based Ensemble Classifier for Web Page Classification

Saha S., Murthy C.A., Pal S.K.

Fundamenta Informaticae

|

2007

|

Vol. 76, nr 1-2

171-187

EN

Combining the results of a number of individually trained classification systems to obtain a more accurate classifier is a widely used technique in pattern recognition. In this article, we have introduced a rough set based meta classifier to classify web pages. The proposed method consists of two parts. In the first part, the output of every individual classifier is considered for constructing a decision table. In the second part, rough set attribute reduction and rule generation processes are used on the decision table to construct a meta classifier. It has been shown that (1) the performance of the meta classifier is better than the performance of every constituent classifier and, (2) the meta classifier is optimal with respect to a quality measure defined in the article. Experimental studies show that the meta classifier improves accuracy of classification uniformly over some benchmark corpora and beats other ensemble approaches in accuracy by a decisive margin, thus demonstrating the theoretical results. Apart from this, it reduces the CPU load compared to other ensemble classification techniques by removing redundant classifiers from the combination.

7

E - optimal stopping time for genetic algorithms

Murthy C.A., Bhandari D., Pal S.K.

Fundamenta Informaticae

|

1998

|

Vol. 35, nr 1-4

91-111

EN

In this article, the concept of e-optimal stopping time of a genetic algorithm with elitist model (EGA) has been introduced. The probability of performing mutation plays an important role in the computation of the Î-optimal stopping times. Two approaches, namely, pessimistic and optimistic have been considered here to find out the e-optimal stopping time. It has been found that the total number of strings to be searched in the optimistic approach to obtain e-optimal string is less than the number of all possible strings for sufficiently large string length. This observation validates the use of genetic algorithms in solving complex optimization problems.

8

A study on Partitioned Iterative Function Systems for image compression

Mitra S.K., Murthy C.A., Kundu M.K.

Fundamenta Informaticae

|

1998

|

Vol. 34, nr 4

413-428

EN

The technique of image compression using Iterative Function System (IFS) is known as fractal image compression. An extension of IFS theory is called as Partitioned or local Iterative Function System (PIFS) for coding the gray level images. The theory of PIFS appears to be different from that of IFS in the sense of application domain. Assuming the theory of PIFS is the same as that of IFS, several techniques of image compression have been developed. In the present article we have studied the PIFS scheme as a separate one and proposed a mathematical formulation for the existence of its attractor. Moreover the results of a Genetic Algorithm (GA) based PIFS technique is presented. This technique appears to be efficient in the sense of computational cost.