
Is a consistent tuple order maintained between pairs.rating1 and pairs.rating2 in lines 9/10?

recommend.pig
1 2 3 4 5 6 7 8 9 10
pairs = FOREACH pairs GENERATE elem1.follower AS follower,
                               elem1.repo AS repo1,
                               elem2.repo AS repo2,
                               elem1.rating AS rating1,
                               elem2.rating AS rating2;

by_repos = GROUP pairs BY (repo1, repo2);
gt_5 = FILTER by_repos BY COUNT_STAR(pairs) > 2;  -- note: the threshold is 2, despite the alias name gt_5
pearson = FOREACH gt_5 GENERATE FLATTEN(group) AS (repo1, repo2),
                                udfs.cosine(pairs.rating1, pairs.rating2) AS similarity;  -- cosine similarity, despite the alias name pearson
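
On the ordering question: Pig documents a bag as an unordered collection of tuples, so one way to avoid depending on the two projections lining up is to pass a single bag of (rating1, rating2) tuples, e.g. pairs.(rating1, rating2), so that each pair of ratings travels together. The sketch below shows what such a UDF could look like in Python. The name cosine_pairs and its single-bag signature are assumptions for illustration; the gist does not show the real udfs.cosine.

```python
import math


def cosine_pairs(rating_pairs):
    """Cosine similarity over an iterable of (rating1, rating2) tuples.

    Hypothetical replacement for udfs.cosine: each tuple already holds both
    ratings from the same follower, so the result does not depend on the
    order in which the tuples arrive.
    """
    dot = norm1 = norm2 = 0.0
    for r1, r2 in rating_pairs:
        dot += r1 * r2
        norm1 += r1 * r1
        norm2 += r2 * r2
    if norm1 == 0.0 or norm2 == 0.0:
        return None  # similarity is undefined for an all-zero vector
    return dot / (math.sqrt(norm1) * math.sqrt(norm2))
```

If this were registered as a Jython UDF, the call on line 10 would become something like udfs.cosine_pairs(pairs.(rating1, rating2)); an output schema declaration would also be needed, which is omitted here.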
