Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
Traditional information retrieval techniques are becoming inadequate for increasingly vast amounts of text data. We present a query processing method that retrieves not only documents containing the query terms but also documents containing their synonyms. The method processes a query by retrieving and scanning the document lists of the inverted index. We show that the response time for conjunctive Boolean queries can be dramatically reduced, at a cost in secondary storage, by applying the range partitioning feature of Oracle to reduce the primary memory needed to look up the inverted list. The proposed method uses fuzzy relations and fuzzy reasoning to retrieve only the top-ranking documents from the database, and groups the retrieved documents using suffix tree clustering.
Year
Volume
Pages
7--24
Physical description
Bibliography: 13 items, figures, tables.
Contributors
References
- [1] J. Zobel, A. Moffat, R. Sacks-Davis, Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files, Proc. 19th VLDB Conference, Dublin, August 1993, 290-301.
- [2] P. Bratley, Y. Choueka, Processing truncated terms in document retrieval systems, Information Processing and Management, 18, 5, 257-266, 1982.
- [3] A. Moffat, J. Zobel, Self-Indexing Inverted Files for Fast Text Retrieval, Presented in preliminary form at the 1994 Australian Database Conference and the 1994 IEEE Conference on Data Engineering, February 1994.
- [4] C. Faloutsos, D. Oard, A Survey of Information Retrieval and Filtering Methods, University of Maryland, College Park, MD 20742.
- [5] M. Martynov, B. Novikov, An Indexing Algorithm for Text Retrieval, Proc. Intern. Workshop on Advances in Databases and Information Systems (ADBIS'96), Moscow, September 1996, 10-13.
- [6] P. A. Ojala, Compression Basics, Available at http://www.cs.tut.fi/~albert/.
- [7] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice-Hall of India Private Limited, January 2000, 385-388.
- [8] G. Akrivas, G. Stamou, Fuzzy Semantic Association of Multimedia Document Descriptions, Proc. of Int. Workshop on Very Low Bitrate Video Coding, Athens, October 2001.
- [9] O. Zamir, O. Etzioni, Web Document Clustering: A Feasibility Demonstration, Research and Development in Information Retrieval, 1998, 46-54.
- [10] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001, 432-433.
- [11] P. Fenwick, Punctured Elias Codes for variable-length coding of the integers, Technical Report, ISSN 1173-3500, New Deli, December 5, 1996, 137.
- [12] B. Liu, C. Wee Chin, H. Tou Ng, Mining Topic-Specific Concepts and Definitions on the web, In Proceedings of the Twelfth International World Wide Web Conference (WWW'03), Budapest 2003.
- [13] P. Elias, Universal Codeword Sets and Representations of the Integers, IEEE Trans. Information Theory IT-21, 2, 194-203, March 1975.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPC1-0001-0048