Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save wolfram77/c76889c777a6031ed4e3ea1727eec2aa to your computer and use it in GitHub Desktop.
Save wolfram77/c76889c777a6031ed4e3ea1727eec2aa to your computer and use it in GitHub Desktop.
FrogWild! – Fast PageRank Approximations on Graph Engines : NOTES

In this paper, the authors discuss about FrogWild, their Monte Carlo based Approximate Distributed PageRank approach for obtaining top-k vertices in the graph (where k ∈ [100, 1000]). They discuss scenarios where obtaining the top-k vertices is desired. These include finding top-k customers for a business' customer loyalty program, or identifying top-k keywords/lines in a given text (where each unique word is a vertex and is linked to words in its close proximity in the text). A simple solution to obtaining top-k vertices is to run the standard PageRank algorithm for fewer iterations, but the authors indicate that their algorithm performs better.

Each random walker, called a frog, is initiated at uniformly random vertex (instead of at each vertex) with a lot of mass (frog?), and performs random walk for a life span that follows a geometric random variable (but their algorithm does not require waiting for the last frog to expire). Their algorithm relies on the Bulk Synchronous execution mode of the GraphLab PowerGraph, but is executed asynchronously (similar to HogWild, which is a lock-free asynchronous stochastic gradient descent algorithm).

ORG

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment