Journal
2005 | Vol. 15, no. 4 | 673-680
Article title
Authors
Title variants
Conference
Human Language Technologies as a challenge for Computer Science and Linguistics (2; 21-23.04.2005; Poznań, Poland)
Publication languages
Abstracts
The paper describes and evaluates a new method for compressing linguistically annotated text. A semantically motivated transcoding scheme is proposed in which the text is split into three distinct streams of data. Applying the scheme reduces the compressed text length by up to 67% compared to the initial compression algorithm. An important advantage of the method is that the text can be processed in its compressed form.
Keywords
Physical description
Bibliography: 10 items, tables.
Contributors
author
- Szczecin University, Institute of Information Technology in Management, Mickiewicza 64, 71-101 Szczecin, jakubs@uoo.univ.szczecin.pl
Bibliography
- [1] J. Cheney: Compressing XML with Multiplexed Hierarchical PPM Models. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (2001), 163-172.
- [2] J. Davies, D. Fensel and F. van Harmelen (Eds.): Towards the Semantic Web: Ontology-Driven Knowledge Management. Chichester, John Wiley & Sons, 2003.
- [3] R. N. Horspool and G. V. Cormack: Constructing word-based text compression algorithms. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (1992), 62-71.
- [4] D. A. Huffman: A Method for the Construction of Minimum-Redundancy Codes. In Proc. IRE, 40 (1952), 1098-1101.
- [5] K. Nagao: Digital Content Annotation and Transcoding. Boston, Massachusetts, Artech House, 2003.
- [6] R. Richardson and A. F. Smeaton: Using WordNet in a Knowledge-Based Approach to Information Retrieval. Working Paper CA0395, School of Computer Applications, Dublin City University, 1995.
- [7] B. Santorini: Part-of-Speech Tagging Guidelines for the Penn Treebank Project. Technical Report MS-CIS-90-47, Department of Computer and Information Science, University of Pennsylvania, 1990.
- [8] P. Skibiński, Sz. Grabowski and S. Deorowicz: Revisiting dictionary-based compression. To appear in Software: Practice and Experience, (2005).
- [9] W. J. Teahan and J. G. Cleary: Tag Based Models of English Text. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (1998), 43-52.
- [10] P. M. Tolani and J. R. Haritsa: XGrind: A Query-friendly XML Compressor. In Proc. 18th IEEE Int. Conf. on Data Engineering, San Jose, California. IEEE Computer Society, (2002).
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-article-BSW3-0021-0038