Created September 7, 2012 05:24
Hive streaming mapreduce
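-- Recreate the output table as tab-delimited text.
DROP TABLE IF EXISTS tim_hive_r_demo_new;
CREATE TABLE tim_hive_r_demo_new (
  users  DOUBLE,
  rating DOUBLE)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

-- Ship the R scripts to the task nodes.
-- Assumes both scripts are executable and start with an Rscript shebang;
-- otherwise invoke them as 'Rscript map.R' and 'Rscript reduce.R'.
ADD FILE map.R;
ADD FILE reduce.R;

-- Map phase: stream rows of my_hive_r_demo through map.R, then cluster by
-- users so that all rows with the same users value reach the same reducer.
-- Reduce phase: stream the clustered rows through reduce.R into the new table.
FROM (
  MAP users, rating, grp
  USING 'map.R'
  AS users, rating, grp
  FROM my_hive_r_demo
  CLUSTER BY users
) a
INSERT OVERWRITE TABLE tim_hive_r_demo_new
  REDUCE a.users, a.rating
  USING 'reduce.R'
  AS users, rating;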
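
The gist does not include map.R or reduce.R. By default, Hive passes each row to a streaming script on stdin as tab-separated text, one row per line, and parses the script's stdout the same way. The sketch below illustrates that contract only. It is written in Python rather than R purely for illustration, it is not the author's script, and the names `pass_through` and `run` as well as the sample rows are hypothetical.

```python
import io
import sys
from typing import Iterable, Iterator


def pass_through(lines: Iterable[str]) -> Iterator[str]:
    """Parse tab-separated rows and re-emit them unchanged (hypothetical mapper)."""
    for line in lines:
        row = line.rstrip("\n")
        if not row:
            continue  # ignore blank lines
        # Column order follows the MAP clause: users, rating, grp.
        users, rating, grp = row.split("\t")
        yield "\t".join((users, rating, grp))


def run(stream_in, stream_out) -> None:
    """Write each transformed row to stream_out, one row per line."""
    for out_row in pass_through(stream_in):
        stream_out.write(out_row + "\n")


# Demo on made-up, in-memory rows. A real streaming script would instead
# call run(sys.stdin, sys.stdout).
sample = io.StringIO("1.0\t4.5\ta\n2.0\t3.0\tb\n")
run(sample, sys.stdout)
```

A real mapper would replace the pass-through body with its own per-row logic. A reducer follows the same pattern, but can rely on rows with equal `users` values arriving together because of the `CLUSTER BY users` clause.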