Article title

Compressing annotated natural language text

Conference
Human Language Technologies as a challenge for Computer Science and Linguistics (2; 21-23.04.2005; Poznań, Poland)
Publication languages
EN
Abstracts
EN
This paper describes and evaluates a new method for compressing linguistically annotated text. It proposes a semantically motivated transcoding scheme in which the text is split into three distinct streams of data. Applying the scheme reduces the compressed text length by as much as 67% compared to the initial compression algorithm. An important advantage of the method is that text can be processed in its compressed form.
Pages
673-680
Physical description
Bibliography: 10 items, tables.
Bibliography
  • [1] J. Cheney: Compressing XML with Multiplexed Hierarchical PPM Models. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (2001), 163-172.
  • [2] J. Davies, D. Fensel and F. van Harmelen (eds.): Towards the Semantic Web: Ontology-Driven Knowledge Management. Chichester, John Wiley & Sons, 2003.
  • [3] R. N. Horspool and G. V. Cormack: Constructing word-based text compression algorithms. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (1992), 62-71.
  • [4] D. A. Huffman: A Method for the Construction of Minimum-Redundancy Codes. In Proc. IRE, 40 (1952), 1098-1101.
  • [5] K. Nagao: Digital Content Annotation and Transcoding. Boston, Massachusetts, Artech House, 2003.
  • [6] R. Richardson and A. F. Smeaton: Using WordNet in a Knowledge-Based Approach to Information Retrieval. Working Paper CA0395, School of Computer Applications, Dublin City University, 1995.
  • [7] B. Santorini: Part-of-Speech Tagging Guidelines for the Penn Treebank Project. Technical Report MS-CIS-90-47, Department of Computer and Information Science, University of Pennsylvania, 1990.
  • [8] P. Skibiński, Sz. Grabowski and S. Deorowicz: Revisiting dictionary-based compression. To appear in Software: Practice and Experience, (2005).
  • [9] W. J. Teahan and J. G. Cleary: Tag Based Models of English Text. In Proc. IEEE Data Compression Conf., Snowbird, Utah. IEEE Computer Society, (1998), 43-52.
  • [10] P. M. Tolani and J. R. Haritsa: XGrind: A Query-friendly XML Compressor. In Proc. 18th IEEE Int. Conf. on Data Engineering, San Jose, California. IEEE Computer Society, (2002).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BSW3-0021-0038