@coppeliaMLA
Last active August 29, 2015 13:56
This is the HiveQL for the rag "Quick start Hadoop and Hive for analysts". You can find it here: http://www.ragscripts.com/2014/03/04/quick-start-hadoop-and-hive-for-analysts/
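-- Drop any existing table definitions so the script can be rerun.
-- These are external tables, so dropping them removes only the Hive metadata, not the files in S3.
drop table if exists recommender_set_num;
drop table if exists person_ids_full_names;
drop table if exists recom_names;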
-- Set up an external table over the recommendations data
create external table if not exists recommender_set_num
(
    userID bigint,
    itemID bigint
)
row format delimited fields terminated by ','
stored as textfile
location 's3n://crunchdata/input/recommendersetnum';
-- Set up an external table over the name lookup data
create external table if not exists person_ids_full_names
(
    userID bigint,
    nameKey string,
    displayName string
)
row format delimited fields terminated by ','
stored as textfile
location 's3n://crunchdata/input/personsidsfullname';
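-- Optional check (not part of the original script): a minimal sketch to confirm
-- that both source tables can read their S3 data before the join runs.
-- If columns come back as NULL, the delimiter or the column types probably
-- do not match the files.
select * from recommender_set_num limit 5;
select * from person_ids_full_names limit 5;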
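-- Set up an external table to hold the joined data.
-- Hive treats the location as a directory, so the output files are written
-- under recomnames.csv/ rather than to a single CSV file.
create external table if not exists recom_names
(
    userID bigint,
    itemID bigint,
    nameKey string,
    displayName string
)
row format delimited fields terminated by ','
stored as textfile
location 's3n://crunchdata/output/recomnames.csv';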
-- Join the recommendations to the name lookup on userID and write the result to recom_names
insert overwrite table recom_names
select A.userID,
       A.itemID,
       B.nameKey,
       B.displayName
from recommender_set_num A
join person_ids_full_names B
  on A.userID = B.userID;
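-- Optional sanity check (not part of the original script): a minimal sketch,
-- assuming the insert above has completed. The join is an inner join, so
-- recommendations whose userID has no match in person_ids_full_names are
-- dropped, and a userID that appears more than once in the lookup produces
-- extra rows. Either case makes the two counts differ.
select count(*) from recommender_set_num;
select count(*) from recom_names;
-- Preview a few joined rows
select * from recom_names limit 10;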