The method of automatic summarization from different sources

Shakhovska, N.; Cherna, T.

Abstracts

This article analyzes the technology of automatic text abstracting and annotation. It describes the role of annotation in the automatic search and classification of scientific articles. An algorithm for summarizing natural-language documents using the concept of importance coefficients is developed. This concept makes it possible to account for the peculiarities of the subject areas and topics found in different kinds of documents. A method for generating abstracts of a single document based on frequency analysis is developed. The recognition elements for unstructured text analysis are given. A method for the pre-processing analysis of several documents is developed. This technique simultaneously considers both statistical approaches to abstracting and the importance of terms in a particular subject domain. The quality of the generated abstracts is evaluated. An expert evaluation of the developed system was conducted; it was held only for texts in Ukrainian. The summary produced by the developed system has a higher aggregate score on all criteria. The architecture of the summarization system is built. The CASE tool AllFusion ERwin Data Modeler is used to build the information system model. A database schema for storing information was built. The system is designed to work primarily with Ukrainian texts, which is a significant advantage, since most modern systems are still oriented toward English texts.
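
The abstract names frequency analysis combined with importance coefficients but gives no formulas, so the following Python sketch is only an illustration of that general idea, not the authors' algorithm. It assumes that a sentence is scored by the average of term frequency multiplied by a per-term coefficient (defaulting to 1.0), and that the top-scoring sentences are returned in their original order. The function names, the `importance` dictionary, and the stop-word handling are all hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens; \\w is Unicode-aware, so Cyrillic text works too."""
    return re.findall(r"\w+", sentence.lower())


def summarize(text, importance=None, stop_words=frozenset(), max_sentences=2):
    """Return the highest-scoring sentences, kept in document order.

    importance: hypothetical mapping term -> coefficient for a subject
    domain; terms missing from it get a neutral weight of 1.0.
    """
    importance = importance or {}
    sentences = split_sentences(text)
    tokens_per_sentence = [
        [w for w in tokenize(s) if w not in stop_words] for s in sentences
    ]
    # Term frequency over the whole document.
    freq = Counter(w for toks in tokens_per_sentence for w in toks)

    def score(toks):
        # Average weighted frequency, so long sentences are not favoured.
        if not toks:
            return 0.0
        return sum(freq[w] * importance.get(w, 1.0) for w in toks) / len(toks)

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(tokens_per_sentence[i]),
        reverse=True,
    )
    chosen = sorted(ranked[:max_sentences])  # restore original order
    return [sentences[i] for i in chosen]
```

Calling `summarize(text, importance={"term": 3.0})` with a domain-specific dictionary shifts the selection toward sentences that contain the weighted terms, which is one plausible way to combine a statistical score with subject-domain knowledge.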

Keywords

annotation; abstracting; national system of abstracting; heterogeneous data analysis

Publisher

Polish Academy of Sciences, Branch in Lublin

Journal

ECONTECHMOD : An International Quarterly Journal on Economics of Technology and Modelling Processes

Year

2016

Volume

Vol. 5, No. 1

Pages

103–109

Physical description

Bibliography: 21 items; figures, tables, formulas.

Contributors

author

Shakhovska N.

natalya233@gmail.com

Information Systems and Networks Department, Lviv Polytechnic National University, S. Bandery str., 12, Lviv, 79013, Ukraine

author

Cherna T.

Information Systems and Networks Department, Lviv Polytechnic National University, S. Bandery str., 12, Lviv, 79013, Ukraine

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-8c3af7b6-2670-4bba-a00d-1bee3adbc4a4