Notes for "Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud" paper.

Introduction

  • GraphLab abstraction exposes asynchronous, dynamic, graph-parallel computation model in the shared-memory setting.
  • This paper extends the abstraction to the distributed setting.
  • Link to the paper.

Characteristics of MLDM (Machine Learning and Data Mining)

  • Graph Structured Computation
    • Sometimes computation requires modeling dependencies between data.
    • e.g., modeling dependencies between similar users in a recommendation use case.
  • Asynchronous Iterative Computation
    • In many cases, asynchronous procedures outperform synchronous ones.
    • e.g., linear systems, belief propagation, stochastic optimization, etc.
  • Dynamic Computation
    • Iterative computation converges asymmetrically.
    • Convergence can be accelerated by dynamic scheduling.
    • e.g., do not update parameters that have already converged.
  • Serializability
    • Ensuring that all parallel executions have an equivalent serial execution is desirable for both correctness and faster convergence.

GraphLab Abstraction

Data Graph

  • Store program state as a directed graph.
  • G = (V, E, D), where D is the user-defined data (model parameters, algorithm state, statistical data, etc.).
  • The graph data D is mutable, but the graph structure (V, E) is static and immutable (see the sketch below).
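
A minimal sketch of such a data graph, assuming a single-machine, dictionary-based representation; the class and field names are illustrative, not GraphLab's actual API:

```python
class DataGraph:
    """G = (V, E, D): static structure (V, E) plus mutable user data D."""

    def __init__(self):
        self.vertex_data = {}    # v -> mutable user data D_v
        self.edge_data = {}      # (u, v) -> mutable user data D_(u,v)
        self.out_neighbors = {}  # static structure: v -> set of successors
        self.in_neighbors = {}   # static structure: v -> set of predecessors

    def add_vertex(self, v, data):
        self.vertex_data[v] = data
        self.out_neighbors.setdefault(v, set())
        self.in_neighbors.setdefault(v, set())

    def add_edge(self, u, v, data=None):
        # The structure is fixed once the graph is loaded; only D mutates afterwards.
        self.edge_data[(u, v)] = data
        self.out_neighbors[u].add(v)
        self.in_neighbors[v].add(u)
```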

Update Function

  • Stateless procedure that modifies the data within the scope of a vertex and schedules the execution of the update function on other vertices.
  • Scope of a vertex v (S_v): the data stored in v, in its adjacent edges, and in its adjacent vertices.
  • update: f(v, S_v) -> (S_v, T), where T is the set of vertices on which the update function is scheduled to be invoked.
  • Scheduling of computation is decoupled from the movement of data, and no message passing is required between vertices (see the sketch below).
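
A hedged sketch of an update function with this signature, using PageRank as the example computation; the scope object is assumed to expose the `vertex_data`, `in_neighbors`, and `out_neighbors` fields of the data graph sketched above, not GraphLab's real interface:

```python
def pagerank_update(v, scope, damping=0.85, tolerance=1e-3):
    """f(v, S_v) -> T: mutate data inside the scope, return vertices to reschedule."""
    old_rank = scope.vertex_data[v]
    # Read adjacent vertex data available inside the scope S_v.
    new_rank = (1 - damping) + damping * sum(
        scope.vertex_data[u] / len(scope.out_neighbors[u])
        for u in scope.in_neighbors[v]
    )
    scope.vertex_data[v] = new_rank
    # Dynamic computation: only reschedule neighbors if the change was significant.
    if abs(new_rank - old_rank) > tolerance:
        return set(scope.out_neighbors[v])
    return set()
```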

Execution Model

  • Input to the model is G and T, the initial set of vertices to be updated.
  • During each step, a vertex is extracted from T, updated, and a set of vertices is added to T (for future computation).
  • Vertices in T can be executed in any order, with the only constraint that all vertices eventually be executed (see the loop sketched below).
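
A minimal sequential sketch of this loop, with a deque-based scheduler standing in for GraphLab's schedulers; any extraction order is permitted, which is what leaves room for parallel and distributed execution:

```python
from collections import deque

def execute(graph, update, initial_vertices):
    T = deque(initial_vertices)
    scheduled = set(initial_vertices)   # avoid scheduling the same vertex twice
    while T:
        v = T.popleft()                 # any extraction order is permitted
        scheduled.discard(v)
        for u in update(v, graph):      # update returns the vertices it reschedules
            if u not in scheduled:
                scheduled.add(u)
                T.append(u)
```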

Sync Operation

  • Sync operation runs in the background to maintain global aggregates concurrently.
  • These global values are read by update functions and written by the sync operation (see the sketch below).
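
A simplified sketch of a sync operation maintaining one global aggregate (here, the average vertex value) as a fold over the vertex data; in the distributed engines this runs continuously in the background rather than once, and the names below are illustrative:

```python
def sync_average(graph, globals_table):
    total = 0.0
    for v, data in graph.vertex_data.items():
        total += data                                    # fold step per vertex
    globals_table["average"] = total / max(len(graph.vertex_data), 1)  # apply step
```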

Consistency Models

  • Full consistency
    • Full read/write access in the scope.
    • Scopes of concurrently updating vertices cannot overlap.
  • Edge consistency
    • Read/write access on the vertex and the adjacent edges but only read access to adjacent vertices.
    • Scopes of concurrently updating vertices may partially overlap.
  • Vertex consistency
    • Write access to the vertex and read access to adjacent edges and vertices.
    • All vertices can run the update function simultaneously (see the sketch below).
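
An illustrative mapping from consistency model to the protection needed on each vertex of the scope S_v (R = read, W = write); this roughly mirrors how the distributed locking engine later realizes the models, with edge data protected by the locks on its endpoint vertices:

```python
def required_locks(model, v, graph):
    """Return {vertex: 'R' or 'W'} for one update of v under the given model."""
    neighbors = graph.in_neighbors[v] | graph.out_neighbors[v]
    if model == "full":
        return {v: "W", **{u: "W" for u in neighbors}}
    if model == "edge":
        return {v: "W", **{u: "R" for u in neighbors}}
    if model == "vertex":
        return {v: "W"}
    raise ValueError(f"unknown consistency model: {model}")
```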

Distributed Data Graph

  • Two-phase partitioning process for load balancing the graph on arbitrary cluster size.
  • In the first phase, partition the graph into k parts (k >> number of machines).
  • Each part, called an atom, is stored as a file of graph-generating commands.
  • Each atom also stores information about its ghosts (the set of vertices and edges adjacent to the partition boundary).
  • The atom index file stores the connectivity structure and file locations of the k atoms as a meta-graph.
  • In the second phase, this meta-graph is partitioned over the physical machines (see the sketch below).
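
A toy sketch of the two phases, using simple hashing as a stand-in for the actual graph partitioner and ignoring ghosts and the atom files themselves:

```python
def two_phase_partition(vertices, k, num_machines):
    # Phase 1: over-partition the vertices into k atoms (k >> num_machines).
    atom_of = {v: hash(v) % k for v in vertices}
    # Phase 2: place whole atoms onto machines (here, round-robin by atom id).
    machine_of_atom = {a: a % num_machines for a in range(k)}
    return {v: machine_of_atom[atom_of[v]] for v in vertices}
```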

Distributed GraphLab Engines

Chromatic Engine

  • A vertex coloring (no adjacent vertices have the same color) is constructed to serialize parallel execution of dependent tasks (in our case, vertices in the graph).
  • For the edge consistency model, execute all vertices of the same color before moving to the next color, and run the sync operation between color steps.
  • Changes to ghost vertices and edges are communicated asynchronously as they are made.
  • Vertex consistency is trivial: assign the same color to all vertices.
  • For full consistency, construct a second-order vertex coloring (no vertex shares the same color as any of its distance-two neighbors); see the sketch below.
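
A sketch of the idea behind the chromatic engine: greedily color the graph, then execute all vertices of one color (in parallel in the real engine) before advancing to the next color, with sync/barrier steps between colors; names are illustrative:

```python
def greedy_color(graph):
    color = {}
    for v in graph.vertex_data:
        taken = {color[u]
                 for u in graph.in_neighbors[v] | graph.out_neighbors[v]
                 if u in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c                      # no neighbor shares this color
    return color

def chromatic_sweep(graph, update, color):
    for c in sorted(set(color.values())):
        for v in [v for v in color if color[v] == c]:
            update(v, graph)              # same-color vertices are independent
        # barrier + sync operation would run here between color steps
```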

Distributed Locking Engine

  • Associate reader-writer locks on each vertex.
  • Each machine can update only the local vertices.
  • Optimisations
    • The ghosting system uses caching to eliminate waiting on remote data that has not changed.
    • Lock request and synchronization are pipelined to hide network latency.
      • Each machine maintains a pipeline of vertices for which locks have been requested but not granted.
      • A vertex is executed once lock acquisition and data synchronization are complete.
      • Nonblocking reader-writer locks, which signal completion through callback functions, are used (a simplified pipeline is sketched below).
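
A simplified, single-threaded sketch of lock pipelining: keep a bounded window of vertices with outstanding lock requests and execute a vertex only once its locks (and remote data) are ready. The callbacks `request_locks`, `locks_ready`, `release_locks`, and `run_update` are assumptions for this sketch; the real engine uses nonblocking reader-writer locks with completion callbacks rather than the polling loop shown here:

```python
from collections import deque

def pipelined_execute(vertices, request_locks, locks_ready,
                      release_locks, run_update, max_pending=64):
    waiting = deque(vertices)
    pending = deque()
    while waiting or pending:
        # Fill the pipeline: issue lock requests without blocking on them.
        while waiting and len(pending) < max_pending:
            v = waiting.popleft()
            request_locks(v)
            pending.append(v)
        # Retire every vertex whose locks and data synchronization are complete.
        still_pending = deque()
        for v in pending:
            if locks_ready(v):
                run_update(v)
                release_locks(v)
            else:
                still_pending.append(v)
        pending = still_pending
```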

Fault Tolerance

  • Distributed checkpointing via two modes:
  • Synchronous checkpointing
    • Suspend computation to save all modified data since the last checkpoint.
  • Asynchronous checkpointing based on the Chandy-Lamport snapshot algorithm.
    • The snapshot step becomes an update function in the GraphLab abstraction (see the sketch below).
    • Better than synchronous checkpointing, since computation does not have to be suspended.
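
A sketch of the asynchronous snapshot expressed as an update function: the first time a vertex is touched it saves its data and schedules its neighbors, so the snapshot spreads through the graph like any other computation. Edge/channel recording from Chandy-Lamport is omitted, and `saved` is an assumed per-snapshot table:

```python
def snapshot_update(v, graph, saved):
    if v in saved:
        return set()                       # this vertex is already captured
    saved[v] = graph.vertex_data[v]        # persist the vertex's current state
    # Schedule uncaptured neighbors so the snapshot propagates through the graph.
    return {u
            for u in graph.in_neighbors[v] | graph.out_neighbors[v]
            if u not in saved}
```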

System Design

  • One instance of GraphLab runs on each machine.
  • These processes are symmetric and communicate via RPC.
  • The first process additionally acts as the master and computes the placement of atoms based on the atom index.
  • Each process maintains a local scheduler (for its vertices) and a cache to access remote data.
  • A distributed consensus algorithm is used to decide when all the schedulers are empty (i.e., when execution has finished).

Observations

  • The biggest strength of the paper is its extensive experiments.
  • GraphLab benefits from the use of background asynchronous communication and pipelined locking but its communication layer is not as efficient as MPI's communication layer.