Text summarizing in Polish

Branny, E.; Gajęcki, M.

Title variants

Streszczanie tekstu w języku polskim

Abstracts

This article describes an existing implementation of a text summarizer for Polish, analyzes its results, and proposes directions for further development. Text summarization has already been studied, but until now no implementation had been designed for Polish. The implemented algorithm builds on existing work in the field and adds some improvements. It has been optimized for newspaper texts of approximately 10 to 50 sentences. Evaluation has shown that, when applied to Polish, it works better than known generic summarization tools.

The aim of the article is to present an algorithm for summarizing texts in Polish. Although text summarization algorithms exist, none has been dedicated to Polish. The presented algorithm is based on existing text summarization algorithms but includes several improvements. It is intended for summarizing newspaper texts of 10 to 50 sentences. The tests carried out show that the algorithm performs better than known algorithms applied to Polish.

Keywords

natural language processing, text summarizing

Publisher

Wydawnictwa AGH

Journal

Computer Science

Year

2005

Volume

Vol. 7

Pages

31–48

Physical description

Bibliography: 14 items, figures, charts.

Contributors

author

Branny E.

PhD Student EAIiE, AGH-UST, Kraków, Poland

author

Gajęcki M.

mag@agh.edu.pl

Institute of Computer Science, AGH-UST, Kraków, Poland

Bibliography

[1] "summarize" (entry) in Merriam-Webster Online Thesaurus, 15 Jun 2005, http://www.m-w.com/cgi-bin/thesaurus
[2] Van Dijk T.A.: Some Aspects of Text Grammars. A Study in Theoretical Linguistics and Poetics. Mouton, The Hague, 1972.
[3] Dalianis H., Hassel M., Smedt de K., Liseth A., Lech T.C., Wedekind J.: Porting and evaluation of automatic summarization. In Holmboe H. (ed.), Nordisk Sprogteknologi 2003. Arbog for Nordisk Sprakteknologisk Forskningsprogram 2000–2004, pp. 107–121.
[4] Dalianis H., Hassel M., Wedekind J., Haltrup D., Smedt de K., Lech T.C.: Automatic text summarization for the Scandinavian languages. In Holmboe H. (ed.), Nordisk Sprogteknologi 2002. Arbog for Nordisk Sprakteknologisk Forskningsprogram 2000–2004, pp. 153–163.
[5] Mazdak N.: FarsiSum – a Persian text summarizer. Master thesis, Department of Linguistics, Stockholm University, 2004.
[6] Pachantouris G.: GreekSum – A Greek Text Summarizer. Master thesis, Department of Computer and Systems Sciences, KTH–Stockholm University, 2005.
[7] Lin C.Y.: Training a Selection Function for Extraction. In the 8th International Conference on Information and Knowledge Management (CIKM99), Kansas City, Missouri, 1999.
[8] Luhn H.P.: The Automatic Creation of Literature Abstracts. IBM Journal of Research and Development, 1959, pp. 159–165.
[9] Edmundson H.P.: New Methods in Automatic Extraction. Journal of the ACM 16(2), 1969, pp. 264–285.
[10] Hassel M.: Evaluation of automatic text summarization – a practical implementation. Licentiate thesis, Stockholm, NADA-KTH, 2004.
[11] Dalianis H.: SweSum – A Text Summarizer for Swedish. http://www.dsv.su.se/%7Ehercules/papers/Textsumsummary.html, 2000.
[12] Dalianis H.: Aggregation in Natural Language Generation. Journal of Computational Intelligence, Vol. 15, No. 4, 1999, pp. 384–414.
[13] Smedt de K., Liseth A., Hassel M., Dalianis H.: How short is good? An evaluation of automatic summarization. In Holmboe H. (ed.), Nordisk Sprogteknologi 2004, pp. 267–287.
[14] Gajęcki M.: Serwer leksykalny języka polskiego. Computer Science, Rocznik AGH, 2001.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-AGH1-0010-0009