@friso
friso / disconnected-graph.txt
Created January 29, 2012 10:11
Graph in edge list representation
0,1
0,2
0,3
1,4
4,5
5,6
2,3
2,5
4,7
7,3
@friso
friso / gist:1698283
Created January 29, 2012 11:00
Edge list to adjacency list representation
public class PrepareJob {
public static int run(String input, String output) {
Scheme sourceScheme = new TextDelimited(new Fields("source", "target"), ",");
Tap source = new Hfs(sourceScheme, input);
Scheme sinkScheme = new TextDelimited(new Fields("partition", "source", "list"), "\t");
Tap sink = new Hfs(sinkScheme, output, SinkMode.REPLACE);
Pipe prepare = new Pipe("prepare");
// Parse strings to ints,
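The job above is cut off in this listing, so the following is only a plain-Java sketch of the same idea, not the Cascading flow itself. It assumes each input line is a `source,target` pair, as in disconnected-graph.txt, and groups targets by source. The class and method names (`EdgeListSketch`, `toAdjacency`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EdgeListSketch {
    // Groups "source,target" lines by source node, keeping targets in input order.
    public static Map<Integer, List<Integer>> toAdjacency(List<String> edgeLines) {
        Map<Integer, List<Integer>> adjacency = new TreeMap<>();
        for (String line : edgeLines) {
            String[] parts = line.trim().split(",");
            int source = Integer.parseInt(parts[0]);
            int target = Integer.parseInt(parts[1]);
            adjacency.computeIfAbsent(source, k -> new ArrayList<>()).add(target);
        }
        return adjacency;
    }

    public static void main(String[] args) {
        // The ten edges shown in disconnected-graph.txt.
        List<String> edges = Arrays.asList(
            "0,1", "0,2", "0,3", "1,4", "4,5", "5,6", "2,3", "2,5", "4,7", "7,3");
        toAdjacency(edges).forEach(
            (source, targets) -> System.out.println(source + "\t" + targets));
    }
}
```

Nodes that only ever appear as a target (3 and 6 in the sample) get no entry of their own in this sketch.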
@friso
friso / graph.txt
Created January 29, 2012 11:27
Adjacency list representation
part. source adjacency
ID node list
43 42 43,41
44 41 44
43 40 43,42,41
36 35 36
35 34 35
34 33 34
33 32 33
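In every sample row above, the partition ID equals the largest node ID among the source node and its adjacency list (for example, `43 42 43,41`). The sketch below reproduces that observation in plain Java. It is an assumption drawn from the sample rows, not code taken from the job, and the names `InitialPartitionSketch`, `initialPartition` and `toRecord` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class InitialPartitionSketch {
    // Largest node ID among the source node and its adjacency list.
    public static int initialPartition(int source, List<Integer> adjacency) {
        int partition = source;
        for (int node : adjacency) {
            partition = Math.max(partition, node);
        }
        return partition;
    }

    // Formats one tab-separated record: partition, source, comma-separated list.
    public static String toRecord(int source, List<Integer> adjacency) {
        String list = adjacency.stream()
            .map(n -> Integer.toString(n))
            .collect(Collectors.joining(","));
        return initialPartition(source, adjacency) + "\t" + source + "\t" + list;
    }

    public static void main(String[] args) {
        System.out.println(toRecord(42, Arrays.asList(43, 41)));
        System.out.println(toRecord(41, Arrays.asList(44)));
    }
}
```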
@friso
friso / step3.txt
Created January 29, 2012 13:03
Step 3: for each node find the largest partition ID that it belongs to
-- Input:
part. source adjacency
ID node list
29 28 29,23
26 20 26,25,24,23,22,21
-- Create records:
source node partition
node
28 28 29
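Step 3 can be sketched in plain Java as two small functions: one that creates a `(source node, node, partition)` record for the source itself and for every node in its list, and one that keeps the largest partition ID seen per node. This is an in-memory illustration under those assumptions, not the Cascading implementation. `StepThreeSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StepThreeSketch {
    // Emits {source, node, partition} for the source itself and for each listed node.
    public static List<int[]> explode(int partition, int source, int[] adjacency) {
        List<int[]> records = new ArrayList<>();
        records.add(new int[] {source, source, partition});
        for (int node : adjacency) {
            records.add(new int[] {source, node, partition});
        }
        return records;
    }

    // Largest partition ID seen for each node across all records.
    public static Map<Integer, Integer> largestPartitionPerNode(List<int[]> records) {
        Map<Integer, Integer> largest = new HashMap<>();
        for (int[] record : records) {
            largest.merge(record[1], record[2], Integer::max);
        }
        return largest;
    }

    public static void main(String[] args) {
        List<int[]> records = new ArrayList<>();
        records.addAll(explode(29, 28, new int[] {29, 23}));
        records.addAll(explode(26, 20, new int[] {26, 25, 24, 23, 22, 21}));
        // Node 23 is listed by both sources, so it keeps the larger ID.
        System.out.println(largestPartitionPerNode(records).get(23));
    }
}
```

With the two input rows shown above, node 23 appears under partitions 29 and 26, so its largest partition ID is 29.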
@friso
friso / step4.txt
Created January 29, 2012 13:14
Step 4: Set the partition ID of each record to the largest partition ID found in step 3
-- Input:
source node partition
node
28 28 29
28 29 29
28 23 29
20 20 26
20 26 26
20 25 26
20 24 26
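A minimal sketch of step 4, assuming the per-node maximum from step 3 is available as a map: each `(source node, node, partition)` record gets its partition replaced by the largest partition ID known for its node. How the records are regrouped afterwards is not shown in this listing, so the sketch stops at the relabeling. `StepFourSketch` and `relabel` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StepFourSketch {
    // Replaces each record's partition with the largest partition ID known for its node.
    // Records are {source, node, partition}; unknown nodes keep their current partition.
    public static List<int[]> relabel(List<int[]> records, Map<Integer, Integer> largestPerNode) {
        List<int[]> relabeled = new ArrayList<>();
        for (int[] record : records) {
            int largest = largestPerNode.getOrDefault(record[1], record[2]);
            relabeled.add(new int[] {record[0], record[1], largest});
        }
        return relabeled;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> largestPerNode = new HashMap<>();
        largestPerNode.put(23, 29); // node 23 was also seen in partition 29
        List<int[]> records = Arrays.asList(new int[] {20, 23, 26}, new int[] {20, 24, 26});
        for (int[] record : relabel(records, largestPerNode)) {
            System.out.println(record[0] + "\t" + record[1] + "\t" + record[2]);
        }
    }
}
```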
@friso
friso / gist:1698892
Created January 29, 2012 13:45
Step 3+4: a single iteration
public class IterateJob {
public static int run(String input, String output, int maxIterations) {
boolean done = false;
int iterationCount = 0;
while (!done) {
Scheme sourceScheme = new TextDelimited(new Fields("partition", "source", "list"), "\t");
Scheme sinkScheme = new TextDelimited(new Fields("partition", "source", "list"), "\t");
//SNIPPED SOME BOILERPLATE...
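Because the job body is snipped here, the following is a hedged in-memory sketch of what one pass of steps 3 and 4 repeated in a loop could look like. It assumes that a source's new partition is the largest value found among its own records, and that the loop stops when no partition changes or `maxIterations` is reached. Both are assumptions, not details confirmed by the listing. `IterateSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class IterateSketch {
    // adjacency maps each source node to the nodes in its list.
    public static Map<Integer, Integer> run(Map<Integer, int[]> adjacency, int maxIterations) {
        // Assumed starting point: largest ID among the source and its list.
        Map<Integer, Integer> partition = new TreeMap<>();
        for (Map.Entry<Integer, int[]> e : adjacency.entrySet()) {
            int p = e.getKey();
            for (int node : e.getValue()) p = Math.max(p, node);
            partition.put(e.getKey(), p);
        }
        boolean done = false;
        int iterationCount = 0;
        while (!done && iterationCount < maxIterations) {
            iterationCount++;
            // Step 3: largest partition ID seen for every node.
            Map<Integer, Integer> largest = new HashMap<>();
            for (Map.Entry<Integer, int[]> e : adjacency.entrySet()) {
                int p = partition.get(e.getKey());
                largest.merge(e.getKey(), p, Integer::max);
                for (int node : e.getValue()) largest.merge(node, p, Integer::max);
            }
            // Step 4: relabel, then take the largest value per source (assumed regrouping).
            done = true;
            for (Map.Entry<Integer, int[]> e : adjacency.entrySet()) {
                int p = largest.get(e.getKey());
                for (int node : e.getValue()) p = Math.max(p, largest.get(node));
                if (p != partition.get(e.getKey())) {
                    partition.put(e.getKey(), p);
                    done = false;
                }
            }
        }
        return partition;
    }

    public static void main(String[] args) {
        Map<Integer, int[]> adjacency = new TreeMap<>();
        adjacency.put(0, new int[] {1, 2, 3});
        adjacency.put(1, new int[] {4});
        adjacency.put(10, new int[] {11});
        System.out.println(run(adjacency, 10));
    }
}
```

Sources that end up with the same partition ID are in the same connected component under these assumptions.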
@friso
friso / graph.txt
Created February 6, 2012 21:23
Graph after indegree count and data preparation
3 0 200,2,1,100,3 0,1,1,0,1
4 1 200,100,4 0,0,1
5 2 5,3,200,100 1,1,0,0
3 3 200,100 0,0
7 4 100,5,7,200 0,1,1,0
6 5 100,200,6 0,0,1
6 6 100,200 0,0
7 7 100,3,200 0,1,0
11 10 100,11 0,1
12 11 12,100 1,0
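The listing does not say what the flags mean. In the sample rows above, though, the partition ID always equals the largest ID among the source node and the list entries whose flag is 1 (for example, `3 0 200,2,1,100,3 0,1,1,0,1`). The sketch below checks that observation for a single record. It is an inference from the sample data only, and `FlaggedRecordSketch` and `partitionFromFlags` are hypothetical names.

```java
public class FlaggedRecordSketch {
    // Largest ID among the source and the list entries whose flag is "1".
    // list and flags are the comma-separated third and fourth columns of a record.
    public static int partitionFromFlags(int source, String list, String flags) {
        String[] nodes = list.split(",");
        String[] bits = flags.split(",");
        if (nodes.length != bits.length) {
            throw new IllegalArgumentException("list and flags must have the same length");
        }
        int partition = source;
        for (int i = 0; i < nodes.length; i++) {
            if ("1".equals(bits[i].trim())) {
                partition = Math.max(partition, Integer.parseInt(nodes[i].trim()));
            }
        }
        return partition;
    }

    public static void main(String[] args) {
        // First sample row: entries 200 and 100 carry flag 0 and are skipped.
        System.out.println(partitionFromFlags(0, "200,2,1,100,3", "0,1,1,0,1"));
    }
}
```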
@friso
friso / IterateWithFlags.java
Created February 6, 2012 21:51
Iterate with flags
public class IterateWithFlagsJob {
public static int run(String input, String output, int maxIterations) {
boolean done = false;
int iterationCount = 0;
while (!done) {
Scheme sourceScheme = new TextDelimited(new Fields("partition", "source", "list", "flags"), "\t");
Tap source = new Hfs(sourceScheme, currentIterationInputPath);
Scheme sinkScheme = new TextDelimited(new Fields("partition", "source", "list", "flags"), "\t");
Tap sink = new Hfs(sinkScheme, currentIterationOutputPath, SinkMode.REPLACE);
@friso
friso / blog.ipynb
Created June 5, 2014 09:09
Blog classification notebook
{
"metadata": {
"name": "",
"signature": "sha256:fffb04a8605ccdef5610b121eeb72faf2288494ec4a8be9dc5e177d0432910b2"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [