@mwinkle
Last active August 29, 2015 13:58
Storing from Pig to Mongo
Obtain the JARs from https://github.com/mongodb/mongo-hadoop
REGISTER wasb://{container}@{account}/mongo-hadoop/mongo-java-driver-2.11.4.jar;
REGISTER wasb://{container}@{account}/mongo-hadoop/mongo-hadoop-core-1.2.1-SNAPSHOT-hadoop_2.2.jar;
REGISTER wasb://{container}@{account}/mongo-hadoop/mongo-hadoop-pig-1.2.1-SNAPSHOT-hadoop_2.2.jar;
-- Pig code that builds the limited_summary relation goes here
STORE limited_summary INTO 'mongodb://myhappymongo.cloudapp.net:27017/pigtest.test' USING com.mongodb.hadoop.pig.MongoInsertStorage('','');
Querying Mongo from Hive
ADD JAR wasb://{container}@{account}/mongo-hadoop/mongo-java-driver-2.11.4.jar;
ADD JAR wasb://{container}@{account}/mongo-hadoop/mongo-hadoop-core-1.2.1-SNAPSHOT-hadoop_2.2.jar;
ADD JAR wasb://{container}@{account}/mongo-hadoop/mongo-hadoop-hive-1.2.1-SNAPSHOT-hadoop_2.2.jar;
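-- Each Hive column is mapped to a MongoDB field through 'mongo.columns.mapping'.
-- Note: as far as the mongo-hadoop documentation describes it, dropping a table
-- that is not declared EXTERNAL also drops the underlying MongoDB collection.
CREATE TABLE mongo_hive
(
  id INT,
  uri_stem STRING,
  number_of_requests INT,
  total_egress INT,
  average_time_taken DOUBLE
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id",
"uri_stem":"group","number_of_requests":"NumberOfRequests",
"total_egress":"TotalEgress",
"average_time_taken":"AverageTimeTaken"}')
TBLPROPERTIES('mongo.uri'='mongodb://yourmongo.cloudapp.net:27017/pigtest.test');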
SELECT COUNT(*) FROM mongo_hive;
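-- A further sketch, not part of the original gist: once the row count works,
-- the mapped columns can be queried like any other Hive table. The ordering
-- and the limit of 10 are illustrative assumptions; the column names are the
-- ones declared in the CREATE TABLE statement above.
SELECT uri_stem, number_of_requests, total_egress, average_time_taken
FROM mongo_hive
ORDER BY number_of_requests DESC
LIMIT 10;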