Information monitoring based on web resources

Opaliński, A.; Turek, W.; Głowacki, M.


Article details

Journal

Czasopismo Techniczne. Mechanika

2013 | R. 110, z. 1-M | 277--284

Article title

Information monitoring based on web resources

Authors

Opaliński, A., Turek, W., Głowacki, M.

Selected full texts from this journal

http://repozytorium.biblos.pk.edu.pl/resources/35439

Title variants

Monitoring informacji w oparciu o zasoby sieci web

Publication languages

Abstracts

The paper summarizes a system for monitoring web resources against a defined query. An experiment compares the results returned by the proposed system with those provided by the Google Search and Google Alerts services. The results indicate that the system could be a solid base for developing and testing pattern detection and information retrieval mechanisms, while providing more data than the Google solutions. Drawbacks of the system and plans for further development are also presented.

The article presents the architecture of a system that monitors web resources with respect to a defined query. The system's results were compared with monitoring carried out over the same period using the mechanisms offered by Google. The results indicate that the system can be a useful base for studying pattern detection and information retrieval mechanisms, while providing more data than the Google mechanisms. Shortcomings of the current version of the system, which stem from the specific nature of the data sources, are also shown, and directions for its development are proposed.

Keywords

crawling, Internet monitoring, information retrieval

crawling, web monitoring, information retrieval

Publisher

Journal

Czasopismo Techniczne. Mechanika

Year

2013

Volume

R. 110, z. 1-M

Pages

277--284

Physical description

Bibliography: 18 items, illustrations, tables, charts.

Contributors

author

Opaliński, A.

Department of Applied Computer Science and Modelling, Faculty of Metals Engineering and Industrial Computer Science, AGH University of Science and Technology

author

Turek, W.

Department of Applied Computer Science and Modelling, Faculty of Metals Engineering and Industrial Computer Science, AGH University of Science and Technology

author

Głowacki, M.

Department of Applied Computer Science and Modelling, Faculty of Metals Engineering and Industrial Computer Science, AGH University of Science and Technology

Bibliography

[1] Kunder M., WorldWideWebSize.com, 12.2012.
[2] Alpert J., Hajaj N., We knew the web was big... (http://tinyurl.com/crzays7 ‒ 25.07.2008).
[3] Croft W.B., Metzler D., Strohman T., Search engines: Information retrieval in practice, Addison-Wesley 2010.
[4] Kobayashi M., Takeda K., Information retrieval on the web, ACM Computing Surveys (CSUR), 32(2), 2000, 144-173.
[5] Pandey S.K., Mishra R.B., Intelligent Web mining model to enhance knowledge discovery on the Web, In Parallel and Distributed Computing, Applications and Technologies, 2006. PDCAT’06. Seventh International Conference on, 339-343, IEEE.
[6] Manku G.S., Jain A., Das Sarma A., Detecting near-duplicates for web crawling, In Proceedings of the 16th international conference on WWW, 141-150, ACM, 2007.
[7] Broder A.Z., Najork M., Wiener J.L., Efficient URL caching for world wide web crawling, In Proc. of the 12th international conf. on WWW, 679-689, ACM, 2003.
[8] Menczer F., Belew R.K., Adaptive Information Agents in Distributed Textual Environments, Proc. of the 2nd Int. Conf. on Autonomous Agents, ACM, 1998, 157-164.
[9] Dong H., Hussain F.K., Chang E., State of the Art in Semantic Focused Crawlers, Computational Science and Its Applications, Seoul, Korea, 2009, 910-924.
[10] Dorosz K., Korzycki M., Latent Semantic Analysis Evaluation of Conceptual Dependency Driven Focused Crawling, Multimedia Communications, Services and Security, 5th International Conference, MCSS 2012, Krakow 2012, 77-84.
[11] Crawling Web content with the FAST Search Web crawler, MS SharePoint library (http://technet.microsoft.com/en-us/library/ff383271%28v=office.14%29.aspx).
[12] Mohr G., Stack M., Ranitovic I., Avery D., Kimpton M., Introduction to Heritrix, In 4th International Web Archiving Workshop, 2004.
[13] Miller R., Websphinx, a personal, customizable web crawler (http://www.cs.cmu.edu/~rcm/websphinx ‒ 2011-02-12).
[14] Girardi C., Ricca F., Tonella P., Web crawlers compared, International Journal of Web Information Systems, 2(2), 2006, 85-94.
[15] Getting Started Guide – What are Google Alerts? (http://tinyurl.com/csr4z3b).
[16] Turek W., Opaliński A., Kisiel-Dorohinicki M., Extensible Web Crawler ‒ Towards Multimedia Material Analysis, Multimedia Communications, Services and Security, 4th International Conference, MCSS 2011, Krakow 2011, 183-190.
[17] Wilaszek K., Wójcik T., Opaliński A., Turek W., Internet Identity Analysis and Similarities Detection, MCSS 2012, Krakow 2012, 369-379.
[18] Alexa – provider of global web metrics (http://www.alexa.com ‒ 01.2013).

Document type

Bibliography

Identifiers

YADDA identifier

bwmeta1.element.baztech-9daed8c7-f3d6-4c87-bab7-76604952ff76