Skip to content

Instantly share code, notes, and snippets.

FrogWild! – Fast PageRank Approximations on Graph Engines : NOTES

In this paper, the authors discuss about FrogWild, their Monte Carlo based Approximate Distributed PageRank approach for obtaining top-k vertices in the graph (where k ∈ [100, 1000]). They discuss scenarios where obtaining the top-k vertices is desired. These include finding top-k customers for a business' customer loyalty program, or identifying top-k keywords/lines in a given text (where each unique word is a vertex and is linked to words in its close proximity in the text). A simple solution to obtaining top-k vertices is to run the standard PageRank algorithm for fewer iterations, but the authors indicate that their algorithm performs better.

Each random walker, called a frog, is initiated at uniformly random vertex (instead of at each vertex) with a lot of mass (frog?), and performs random walk for a life span that follows a geometric random variable (but their algorithm does not require waiting for the last frog to expire). Their algorithm relies on the Bulk Synchronous execution mode of the GraphLab PowerGraph, but is executed asynchronously (similar to HogWild, which is a lock-free asynchronous stochastic gradient descent algorithm).

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment