Is a consistent order maintained in lines 9/10?

recommend.pig
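pairs = FOREACH pairs GENERATE elem1.follower AS follower,  -- one row per follower and pair of rated repos
                               elem1.repo AS repo1,
                               elem2.repo AS repo2,
                               elem1.rating AS rating1,
                               elem2.rating AS rating2;
by_repos = GROUP pairs BY (repo1, repo2);  -- collect all co-ratings of each repo pair into one bag
gt_5 = FILTER by_repos BY COUNT_STAR(pairs) > 2;  -- note: the threshold is 2 co-ratings, despite the alias name
pearson = FOREACH gt_5 GENERATE FLATTEN(group) AS (repo1, repo2),
                                udfs.cosine(pairs.rating1, pairs.rating2) AS similarity;  -- cosine, despite the alias name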
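On the ordering question: the body of udfs.cosine is not shown in the gist, so the following is only a minimal Python sketch of a cosine similarity over paired ratings, meant to illustrate why the alignment of rating1 and rating2 matters. The function name `cosine` and the sample ratings are hypothetical, not taken from the gist. Taking a single list of (rating1, rating2) tuples keeps each pair together, so the result does not depend on the order in which the pairs arrive.

```python
import math


def cosine(pairs):
    """Cosine similarity over a list of (rating1, rating2) tuples.

    Hypothetical stand-in for the gist's udfs.cosine, whose body is not shown.
    """
    dot = sum(r1 * r2 for r1, r2 in pairs)
    norm1 = math.sqrt(sum(r1 * r1 for r1, _ in pairs))
    norm2 = math.sqrt(sum(r2 * r2 for _, r2 in pairs))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # similarity is undefined for an all-zero vector; return 0 by convention
    return dot / (norm1 * norm2)


# Made-up ratings: each tuple is one follower's ratings of repo1 and repo2.
aligned = [(1, 2), (2, 4), (3, 6)]

# The same ratings, but with the second column in a different order,
# as could happen if two separately projected bags were not aligned.
misaligned = list(zip([1, 2, 3], [6, 2, 4]))

print(cosine(aligned))
print(cosine(list(reversed(aligned))))  # reordering whole pairs leaves the value unchanged
print(cosine(misaligned))               # breaking the pairing changes the value
```

Pig documents bags as unordered collections of tuples, so if the UDF can be changed to accept one bag of tuples, a projection such as pairs.(rating1, rating2) would avoid relying on two separate projections staying in step. Whether the existing udfs.cosine supports that form is an assumption.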