We have made a case here for utilizing tensor framework for hypertext mining. Tensor is a generalization of vector and tensor framework discussed here is a generalization of vector space model which is widely used in the information retrieval and web mining literature. Most hypertext documents have an inherent internal tag structure and external link structure that render the desirable use of multidimensional representations such as those offered by tensor objects. We have focused on the advantages of Tensor Space Model, in which documents are represented using sixth-order tensors. We have exploited the local-structure and neighborhood recommendation encapsulated by the proposed representation. We have defined a similarity measure for tensor objects corresponding to hypertext documents, and evaluated the proposed measure for mining tasks. The superior performance of the proposed methodology for clustering and classification tasks of hypertext documents have been demonstrated here. The experiment using different types of similarity measure in the different components of hypertext documents provides the main advantage of the proposed model. It has been shown theoretically that, the computational complexity of an algorithm performing on tensor framework using tensor similarity measure as distance is at most the computational complexity of the same algorithmperforming on vector space model using vector similarity measure as distance.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.