PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

An efficient Monte Carlo approach to compute PageRank for large graphs on a single PC

Autorzy
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
This paper describes a novel Monte Carlo based random walk to compute PageRanks of nodes in a large graph on a single PC. The target graphs of this paper are ones whose size is larger than the physical memory. In such an environment, memory management is a difficult task for simulating the random walk among the nodes. We propose a novel method that partitions the graph into subgraphs in order to make them fit into the physical memory, and conducts the random walk for each subgraph. By evaluating the walks lazily, we can conduct the walks only in a subgraph and approximate the random walk by rotating the subgraphs. In computational experiments, the proposed method exhibits good performance for existing large graphs with several passes of the graph data.
Słowa kluczowe
Rocznik
Strony
29--43
Opis fizyczny
Bibliogr. 24 poz., rys.
Twórcy
autor
  • National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan
Bibliografia
  • [1] K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. Monte carlo methods in pagerank computation: When one iteration is sufficient. SIAM Jounal on Numerical Analysis, (2):890-904, February 2007.
  • [2] Bahman Bahmani, Kaushik Chakrabarti and Dong Xin. Fast personalized pagerank on mapreduce. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pages 973-984, 2011.
  • [3] Bahman Bahmani, Abdur Chowdhury and Ashish Goel. Fast incremental and personalized pagerank. VLDB Endow., 4(3):173-184, 2010.
  • [4] Paolo Boldi and Sebastiano Vigna. The WebGraph framework I: Compression techniques. In Proceedings of the Thirteenth International World Wide Web Conference (WWW 2004), pages 595-601, 2004.
  • [5] Yenyu Chen, Qingqing Gan and Torsten Suel. Local methods for estimating pagerank values. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, pages 381-389, 2004.
  • [6] Shumo Chu and James Cheng. Triangle listing in massive networks and its applications. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 672-680, 2011.
  • [7] Atish Das Sarma, Sreenivas Gollapudi and Rina Panigrahy. Estimating pagerank on graph streams. In Proceedings of the Twenty-seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 69-78, 2008.
  • [8] Atish Das Sarma, AnisurRahaman Molla, Gopal Pandurangan, and Eli Upfal. Fast distributed pagerank computation. In Distributed Computing and Networking, pages 11-26. 2013.
  • [9] Dániel Fogaras and Balázs Rácz. Towards scaling fully personalized pagerank. In Algorithms and Models for the Web-Graph, pages 105-117. 2004.
  • [10] Wookshin Han, Sangyeon Lee, Kyungyeol Park, Jeonghoon Lee, Minsoo Kim, Jinha Kim and Hwanjo Yu. Turbograph: a fast parallel graph engine handling billion-scale graphs in a single pc. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 77-85, 2013.
  • [11] Sepandar Kamvar, Taher Haveliwala, Christopher Manning and Gene Golub. Exploiting the block structure of the web for computing pagerank. Stanford University Technical Report, 2003.
  • [12] U. Kang and Christos Faloutsos. Big graph mining: Algorithms and discoveries. SIGKDD Explorations Newsletter, (2):29-36, April 2013.
  • [13] Christian Kohlschütter, Paul-Alexandru Chirita, and Wolfgang Nejdl. Efficient parallel computation of pagerank. In Advances in Information Retrieval, pages 241-252. 2006.
  • [14] Haewoon Kwak, Changhyun Lee, Hosung Park and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web, pages 591-600, 2010.
  • [15] Aapo Kyrola, Guy Blelloch and Carlos Guestrin. Graphchi: Large-scale graph computation on just a pc. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, pages 31-46, 2012.
  • [16] Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, James C. Dehnert, Ilan Horn, Naty Leiser and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pages 135-146, 2010.
  • [17] Frank McSherry. A uniform approach to accelerated pagerank computation. In Proceedings of the 14th international conference on World Wide Web, pages 575-582, 2005.
  • [18] Julie L Morrison, Rainer Breitling, Desmond J. Higham and David R. Gilbert. Generank: using search engine technology for the analysis of microarray experiments. BMC bioinformatics, (1):233, 2005.
  • [19] Mark E., J. Newman. Fast algorithm for detecting community structure in networks. Physical review E, 69(6):066133, 2004.
  • [20] Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd. The pagerank citation ranking: Bringing order to the web. 1999.
  • [21] Jiayu Pan, Hyungjeong Yang, Christos Faloutsos and Pinar Duygulu. Automatic multimedia cross-modal correlation discovery. In Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 653-658, 2004.
  • [22] Marcin Sydow. Random surfer with back step. Fundamental Informaticae, 68(4):379-398, 2005.
  • [23] John Wicks and Amy Greenwald. Parallelizing the computation of pagerank. In Algorithms and Models for the Web-Graph, pages 202-208. 2007.
  • [24] Jaewon Yang and Jure Leskovec. Defining and evaluating network communities based on ground-truth. In Proceedings of ACM SIGKDD Workshop on Mining Data Semantics, pages 1-8, 2012.
Uwagi
Opracowanie ze środków MNiSW w ramach umowy 812/P-DUN/2016 na działalność upowszechniającą naukę.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-365d69ab-41c9-4ed8-9706-c2f690a207ce
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.