@riyadparvez
Created June 20, 2014 15:04

Pregel

Why Graph Computation Is Different

  • Poor locality of memory access
  • Very little work per vertex
  • Changing degree of parallelism over the course of execution

MapReduce

  • Even though much data might be unchanged from iteration to iteration, the data must be reloaded and reprocessed at each iteration, wasting I/O, network bandwidth, and processor resources.
  • The termination condition might involve detecting when a fixed point is reached. The check itself might require an extra MapReduce job on each iteration, again increasing resource use in terms of scheduling extra tasks, reading extra data from disk, and moving data across the network.

Bulk Synchronous Parallel (BSP) Model

  • Computations consist of a sequence of iterations, called supersteps.
  • During a superstep, the framework calls a user-defined Computation function on every vertex.
  • The Computation function specifies behaviour at a single vertex V and a single superstep S.
  • Messages are typically sent along outgoing edges.
  • A message may be sent to any vertex whose identifier is known.
  • All communication goes from superstep S to superstep S+1 (see the sketch below).
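A minimal sketch of these superstep semantics, assuming a toy single-threaded in-memory framework in Java (the paper's actual system is distributed C++; all names here are illustrative): messages sent while computing superstep S only become visible to their destinations at superstep S+1.

```java
import java.util.*;

// Toy model of the superstep barrier: sends during superstep S land in an
// outbox, which becomes the inbox for superstep S+1. Single-threaded and
// in-memory purely for illustration.
public class SuperstepSketch {
    public static void main(String[] args) {
        // Vertex 0 starts with one message; each vertex forwards to vertex+1.
        Map<Long, List<String>> inbox = Map.of(0L, List.of("start"));
        for (long superstep = 0; superstep < 3 && !inbox.isEmpty(); superstep++) {
            Map<Long, List<String>> outbox = new HashMap<>();
            for (Map.Entry<Long, List<String>> entry : inbox.entrySet()) {
                long vertexId = entry.getKey();
                System.out.println("superstep " + superstep + ": vertex "
                        + vertexId + " received " + entry.getValue());
                // "Computation": forward each message along a single out-edge.
                for (String msg : entry.getValue()) {
                    outbox.computeIfAbsent(vertexId + 1, k -> new ArrayList<>())
                          .add(msg + "->" + vertexId);
                }
            }
            inbox = outbox; // barrier: S's sends become S+1's receives
        }
    }
}
```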

Model of Computation

  • The algorithm terminates when all vertices have voted to halt and there are no messages to deliver.
  • Graph state is kept in memory, with occasional saving of data to disk.
  • In superstep 0, every vertex is active.
  • All active vertices participate in the computation in a superstep.
  • A vertex deactivates itself by voting to halt.
  • Deactivated vertices do not participate in computation().
  • Vertices are reactivated upon receiving a message.
  • "Think like a vertex."
  • Only vertex-centric computations.
  • No edge-centric computations (see the sketch below).
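These halting rules show up in the paper's maximum-value example (its Figure 2), rendered here as a hedged, runnable Java toy; the in-memory loop and all names are illustrative, not the real framework. Each vertex adopts the largest value it has seen, forwards it to its neighbours when the value changes, and votes to halt otherwise; a halted vertex is reactivated by an incoming message.

```java
import java.util.*;

// Maximum-value propagation on an undirected chain 0-1-2-3.
public class MaxValueSketch {
    public static void main(String[] args) {
        int[] value = {3, 6, 2, 1};                      // initial vertex values
        int[][] neighbours = {{1}, {0, 2}, {1, 3}, {2}};
        boolean[] active = {true, true, true, true};     // all active in superstep 0
        Map<Integer, List<Integer>> inbox = new HashMap<>();

        for (int superstep = 0; ; superstep++) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            boolean anyRan = false;
            for (int v = 0; v < value.length; v++) {
                List<Integer> msgs = inbox.getOrDefault(v, List.of());
                if (!active[v] && msgs.isEmpty()) continue; // stays halted
                active[v] = true;               // a message reactivates a halted vertex
                anyRan = true;
                int before = value[v];
                for (int m : msgs) value[v] = Math.max(value[v], m);
                if (superstep == 0 || value[v] > before) {
                    for (int n : neighbours[v]) // value changed: tell the neighbours
                        outbox.computeIfAbsent(n, k -> new ArrayList<>()).add(value[v]);
                } else {
                    active[v] = false;          // voteToHalt()
                }
            }
            if (!anyRan) break; // all halted, no messages: algorithm terminates
            inbox = outbox;
        }
        System.out.println(Arrays.toString(value)); // prints [6, 6, 6, 6]
    }
}
```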

Message Passing

  • Vertices communicate directly with one another by sending messages.
  • A vertex can send any number of messages.
  • Each message consists of a message value and the destination vertex identifier.
  • There is no guaranteed order of messages.
  • Messages are guaranteed to be delivered.
  • Messages are guaranteed not to be duplicated (see the sketch below).
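Because arrival order is not guaranteed, a Computation function should combine its messages with an order-insensitive operation. A hedged Java rendering of the paper's PageRank vertex (its Figure 4 is C++) illustrates this; the framework hooks are stubbed so the sketch is self-contained, and their names are assumptions rather than a real API.

```java
import java.util.List;

// Sketch of a PageRank vertex program: messages are rank contributions, and
// summing them is insensitive to the (unspecified) delivery order.
abstract class PageRankVertexSketch {
    // --- assumed framework hooks, stubbed for self-containment ---
    abstract long superstep();
    abstract long numVertices();
    abstract int numOutEdges();
    abstract double getValue();
    abstract void setValue(double v);
    abstract void sendMessageToAllNeighbors(double m);
    abstract void voteToHalt();

    void compute(List<Double> messages) {
        if (superstep() >= 1) {
            double sum = 0;                   // order-insensitive reduction
            for (double m : messages) sum += m;
            setValue(0.15 / numVertices() + 0.85 * sum);
        }
        if (superstep() < 30) {
            // fan the current rank out along the out-edges
            sendMessageToAllNeighbors(getValue() / numOutEdges());
        } else {
            voteToHalt();                     // run a fixed number of supersteps
        }
    }
}
```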

Master

  • Partitions the input and assigns one or more partitions to each worker.
  • Responsible for coordinating the workers.
  • Keeps a list of all workers known to be alive, each worker's unique identifier, its addressing information, and which portion of the graph is assigned to it.
  • The size of this data structure is proportional to the number of partitions, not to the number of vertices or edges.
  • Maintains statistics about the progress of the computation and the state of the graph.
  • Runs an HTTP server for user monitoring (see the sketch below).
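A hedged sketch of the master's coordination role, assuming a toy in-process stand-in for the worker RPCs (all types, method names, and addresses here are hypothetical): the master holds only per-partition bookkeeping and drives the superstep barrier, stopping when no vertex is active and no messages are in flight.

```java
import java.util.List;
import java.util.Map;

// The master stores per-worker metadata (proportional to the number of
// partitions) and coordinates superstep barriers; it holds no graph data.
public class MasterSketch {
    record WorkerInfo(String address, List<Integer> partitions) {}
    record SuperstepStats(long activeVertices, long sentMessages) {}

    // Stand-in for an RPC to a worker: run one superstep, report stats back.
    static SuperstepStats runSuperstep(WorkerInfo w, long superstep) {
        long remaining = Math.max(0, 3 - superstep); // dummy shrinking workload
        return new SuperstepStats(remaining, remaining);
    }

    public static void main(String[] args) {
        Map<Integer, WorkerInfo> workers = Map.of(
                0, new WorkerInfo("worker-0:4000", List.of(0, 1)),
                1, new WorkerInfo("worker-1:4000", List.of(2, 3)));
        for (long superstep = 0; ; superstep++) {
            long active = 0, messages = 0;
            for (WorkerInfo w : workers.values()) {  // barrier: wait for all workers
                SuperstepStats s = runSuperstep(w, superstep);
                active += s.activeVertices();
                messages += s.sentMessages();
            }
            System.out.printf("superstep %d: %d active, %d messages%n",
                    superstep, active, messages);
            if (active == 0 && messages == 0) break; // global termination test
        }
    }
}
```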

Worker

  • Maintains the state of its portion of the graph in memory.

Combiner

  • Sending a message incurs overhead, which can be reduced in some cases.
  • The system calls Combine() to merge several messages intended for a vertex V into a single message containing their combination.
  • User-defined and application-specific.
  • Not enabled by default.
  • There are no guarantees about which messages will be combined, or in what order.
  • Combiners should therefore only be enabled for commutative and associative operations (see the sketch below).
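A hedged Java sketch of a combiner for a min-based algorithm such as SSSP (the paper's combiners subclass a C++ Combiner class; the shape below is illustrative): because min is commutative and associative, any subset of the messages bound for one vertex can be pre-combined at the sending worker without changing the result.

```java
import java.util.List;

// For SSSP only the smallest candidate distance matters, so many messages
// headed to the same destination collapse into one before hitting the network.
public class MinCombinerSketch {
    static double combine(List<Double> messagesForSameVertex) {
        double min = Double.POSITIVE_INFINITY;
        for (double m : messagesForSameVertex) min = Math.min(min, m);
        return min;
    }

    public static void main(String[] args) {
        // Three candidate distances for one vertex become a single message.
        System.out.println(combine(List.of(7.0, 3.5, 9.2))); // prints 3.5
    }
}
```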

Aggregator

  • A mechanism for global communication, monitoring, and data.
  • Each vertex can provide a value to an aggregator in superstep S; the system combines these values using a reduction operator, and the resulting value is made available to all vertices in superstep S+1.
  • A new aggregator is defined by subclassing the "Aggregator" class.
  • By default, an aggregator only reduces input values from a single superstep.
  • It is possible to define a sticky aggregator that uses input values from all supersteps.
  • The reduction operator should be commutative and associative (see the sketch below).
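A hedged Java sketch of an aggregator (the paper's Aggregator class is C++; the method names here are assumptions): values reported during superstep S are reduced as they arrive, and the reduced value is what every vertex sees at superstep S+1.

```java
// A max aggregator: the reduction operator (max) is commutative and
// associative, so partial reductions on each worker combine correctly.
public class MaxAggregatorSketch {
    private long current = Long.MIN_VALUE; // running reduction for superstep S

    void aggregate(long valueFromVertex) { // called as vertices report values
        current = Math.max(current, valueFromVertex);
    }

    long getAggregatedValue() {            // visible to all vertices at S+1
        return current;
    }

    void reset() {                         // a non-sticky aggregator clears
        current = Long.MIN_VALUE;          // its state at each superstep
    }

    public static void main(String[] args) {
        MaxAggregatorSketch agg = new MaxAggregatorSketch();
        agg.aggregate(4);
        agg.aggregate(11);
        agg.aggregate(7);
        System.out.println(agg.getAggregatedValue()); // prints 11
    }
}
```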

Fault Tolerance

  • Achieved through checkpointing: at the start of a superstep, workers save their partition state (vertex values, edge values, and incoming messages) to persistent storage; after a worker failure, the master reassigns the lost partitions and workers reload state from the most recent checkpoint.

Mutation

  • Mutations become effective in the superstep after the one in which the requests are issued.
  • Within a superstep, removals are performed first: edge removal, then vertex removal.
  • Additions follow removals: vertex addition first, then edge addition.
  • All mutations are applied before computation() is called.
  • Local mutations (a vertex mutating its own edges) take effect immediately, since there is no possibility of conflict (see the sketch below).
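A hedged Java sketch of this resolution order, assuming a toy adjacency-map representation and illustrative queue names (the real system additionally resolves conflicting requests with a partial ordering plus user-defined handlers): requests queued during superstep S are applied in the prescribed order before computation() runs in superstep S+1.

```java
import java.util.*;

// Toy worker-side application of queued topology mutations, in the order the
// paper prescribes: edge removals, vertex removals, vertex additions, edge
// additions, and only then the Computation function.
public class MutationOrderSketch {
    record Edge(long src, long dst) {}

    public static void main(String[] args) {
        Map<Long, Set<Long>> adjacency = new HashMap<>();
        adjacency.put(1L, new HashSet<>(Set.of(2L)));
        adjacency.put(2L, new HashSet<>());

        // Requests issued by vertices during the previous superstep:
        List<Edge> edgeRemovals    = List.of(new Edge(1L, 2L));
        List<Long> vertexRemovals  = List.of(2L);
        List<Long> vertexAdditions = List.of(3L);
        List<Edge> edgeAdditions   = List.of(new Edge(1L, 3L));

        for (Edge e : edgeRemovals) {                      // 1. edge removals
            Set<Long> out = adjacency.get(e.src());
            if (out != null) out.remove(e.dst());
        }
        for (long v : vertexRemovals) adjacency.remove(v); // 2. vertex removals
        for (long v : vertexAdditions)                     // 3. vertex additions
            adjacency.putIfAbsent(v, new HashSet<>());
        for (Edge e : edgeAdditions)                       // 4. edge additions
            adjacency.computeIfAbsent(e.src(), k -> new HashSet<>()).add(e.dst());

        System.out.println(adjacency); // {1=[3], 3=[]}; computation() runs next
    }
}
```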

Experiment

  • A naive SSSP implementation was used (see the sketch after this list).
  • The time for initializing the cluster, generating the test graphs in memory, and verifying results is not included in the measurements.
  • Checkpointing was disabled.
  • The default partitioning was used.
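For reference, a hedged, runnable Java toy of the naive (simple relaxation-based) SSSP vertex program the paper benchmarks (its Figure 5 is C++); the in-memory loop, the graph, and all names are illustrative. Every vertex starts at infinity except the source; a vertex relaxes on incoming candidate distances, forwards improvements along its out-edges, and halts until a new message arrives.

```java
import java.util.*;

// SSSP from vertex 0 on a tiny weighted digraph. The computation ends when
// no distance improves, i.e. all vertices are halted and no messages remain.
public class SsspSketch {
    public static void main(String[] args) {
        double INF = Double.POSITIVE_INFINITY;
        // edges[v] = list of {neighbour, weight} pairs
        double[][][] edges = {
                {{1, 4}, {2, 1}},   // vertex 0 (the source)
                {{3, 1}},           // vertex 1
                {{1, 2}, {3, 5}},   // vertex 2
                {}                  // vertex 3
        };
        double[] dist = {0, INF, INF, INF};
        Map<Integer, List<Double>> inbox = new HashMap<>();
        boolean first = true;

        while (first || !inbox.isEmpty()) {
            Map<Integer, List<Double>> outbox = new HashMap<>();
            for (int v = 0; v < dist.length; v++) {
                List<Double> msgs = inbox.getOrDefault(v, List.of());
                if (!first && msgs.isEmpty()) continue; // vertex stays halted
                double best = dist[v];
                for (double m : msgs) best = Math.min(best, m);
                if (first || best < dist[v]) {          // improved: relax, fan out
                    dist[v] = best;
                    if (dist[v] < INF)
                        for (double[] e : edges[v])
                            outbox.computeIfAbsent((int) e[0], k -> new ArrayList<>())
                                  .add(dist[v] + e[1]);
                }
                // voteToHalt() is implicit: the vertex runs again only on messages
            }
            inbox = outbox;
            first = false;
        }
        System.out.println(Arrays.toString(dist)); // [0.0, 3.0, 1.0, 4.0]
    }
}
```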

Critique

  • No fault-tolerance mechanism for the master is described in the paper.
  • How message delivery is guaranteed is not mentioned.