Tytuł artykułu
Autorzy
Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
Abstrakty
We present a novel approach to automatic text mining on the World Wide Web. Considering the fact that the enormously dynamic growth of the WWW results in a need for new, more powerful information extraction tools we designed and implemented a system, which adapts techniques originally introduced in the field of data mining. We believe that similar systems, which usually base on machine learning or natural language processing methods, can prove to be ineffective when dealing with the very large numbers of hypertext documents of different structure and subject. Moreover, such systems tend to treat HTML documents as plain texts not taking into account the additional information contained in their markup tags.
Słowa kluczowe
Rocznik
Tom
Strony
67--87
Opis fizyczny
Bibliogr. 13 poz.
Twórcy
autor
autor
- Institute of Computing Science, Poznań University of Technology, Piotrowo 3A, 60-965 Poznań, Poland
Bibliografia
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-article-BPP1-0017-0089