An efficient Monte Carlo approach to compute PageRank for large graphs on a single PC

Sonobe, T.

doi:10.1515/fcds-2016-0002

Artykuł - szczegóły

Tytuł artykułu

An efficient Monte Carlo approach to compute PageRank for large graphs on a single PC

Autorzy

Sonobe T.

Wybrane pełne teksty z tego czasopisma

Identyfikatory

DOI

10.1515/fcds-2016-0002

Warianty tytułu

Języki publikacji

Abstrakty

This paper describes a novel Monte Carlo based random walk to compute PageRanks of nodes in a large graph on a single PC. The target graphs of this paper are ones whose size is larger than the physical memory. In such an environment, memory management is a difficult task for simulating the random walk among the nodes. We propose a novel method that partitions the graph into subgraphs in order to make them fit into the physical memory, and conducts the random walk for each subgraph. By evaluating the walks lazily, we can conduct the walks only in a subgraph and approximate the random walk by rotating the subgraphs. In computational experiments, the proposed method exhibits good performance for existing large graphs with several passes of the graph data.

Słowa kluczowe

graph Monte Carlo Page rank random walk

Wydawca

Wydawnictwo Politechniki Poznańskiej

Czasopismo

Foundations of Computing and Decision Sciences

Rocznik

2016

Tom

Vol. 41, No. 1

Strony

29--43

Opis fizyczny

Bibliogr. 24 poz., rys.

Twórcy

autor

Sonobe T.

tominlab@gmail.com)

National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan

Bibliografia

[1] K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. Monte carlo methods in pagerank computation: When one iteration is sufficient. SIAM Jounal on Numerical Analysis, (2):890-904, February 2007.
[2] Bahman Bahmani, Kaushik Chakrabarti and Dong Xin. Fast personalized pagerank on mapreduce. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pages 973-984, 2011.
[3] Bahman Bahmani, Abdur Chowdhury and Ashish Goel. Fast incremental and personalized pagerank. VLDB Endow., 4(3):173-184, 2010.
[4] Paolo Boldi and Sebastiano Vigna. The WebGraph framework I: Compression techniques. In Proceedings of the Thirteenth International World Wide Web Conference (WWW 2004), pages 595-601, 2004.
[5] Yenyu Chen, Qingqing Gan and Torsten Suel. Local methods for estimating pagerank values. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, pages 381-389, 2004.
[6] Shumo Chu and James Cheng. Triangle listing in massive networks and its applications. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 672-680, 2011.
[7] Atish Das Sarma, Sreenivas Gollapudi and Rina Panigrahy. Estimating pagerank on graph streams. In Proceedings of the Twenty-seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 69-78, 2008.
[8] Atish Das Sarma, AnisurRahaman Molla, Gopal Pandurangan, and Eli Upfal. Fast distributed pagerank computation. In Distributed Computing and Networking, pages 11-26. 2013.
[9] Dániel Fogaras and Balázs Rácz. Towards scaling fully personalized pagerank. In Algorithms and Models for the Web-Graph, pages 105-117. 2004.
[10] Wookshin Han, Sangyeon Lee, Kyungyeol Park, Jeonghoon Lee, Minsoo Kim, Jinha Kim and Hwanjo Yu. Turbograph: a fast parallel graph engine handling billion-scale graphs in a single pc. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 77-85, 2013.
[11] Sepandar Kamvar, Taher Haveliwala, Christopher Manning and Gene Golub. Exploiting the block structure of the web for computing pagerank. Stanford University Technical Report, 2003.
[12] U. Kang and Christos Faloutsos. Big graph mining: Algorithms and discoveries. SIGKDD Explorations Newsletter, (2):29-36, April 2013.
[13] Christian Kohlschütter, Paul-Alexandru Chirita, and Wolfgang Nejdl. Efficient parallel computation of pagerank. In Advances in Information Retrieval, pages 241-252. 2006.
[14] Haewoon Kwak, Changhyun Lee, Hosung Park and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web, pages 591-600, 2010.
[15] Aapo Kyrola, Guy Blelloch and Carlos Guestrin. Graphchi: Large-scale graph computation on just a pc. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, pages 31-46, 2012.
[16] Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, James C. Dehnert, Ilan Horn, Naty Leiser and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pages 135-146, 2010.
[17] Frank McSherry. A uniform approach to accelerated pagerank computation. In Proceedings of the 14th international conference on World Wide Web, pages 575-582, 2005.
[18] Julie L Morrison, Rainer Breitling, Desmond J. Higham and David R. Gilbert. Generank: using search engine technology for the analysis of microarray experiments. BMC bioinformatics, (1):233, 2005.
[19] Mark E., J. Newman. Fast algorithm for detecting community structure in networks. Physical review E, 69(6):066133, 2004.
[20] Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd. The pagerank citation ranking: Bringing order to the web. 1999.
[21] Jiayu Pan, Hyungjeong Yang, Christos Faloutsos and Pinar Duygulu. Automatic multimedia cross-modal correlation discovery. In Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 653-658, 2004.
[22] Marcin Sydow. Random surfer with back step. Fundamental Informaticae, 68(4):379-398, 2005.
[23] John Wicks and Amy Greenwald. Parallelizing the computation of pagerank. In Algorithms and Models for the Web-Graph, pages 202-208. 2007.
[24] Jaewon Yang and Jure Leskovec. Defining and evaluating network communities based on ground-truth. In Proceedings of ACM SIGKDD Workshop on Mining Data Semantics, pages 1-8, 2012.

Uwagi

Opracowanie ze środków MNiSW w ramach umowy 812/P-DUN/2016 na działalność upowszechniającą naukę.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-365d69ab-41c9-4ed8-9706-c2f690a207ce