@randerzander
Last active August 29, 2015 14:10
An example of bidirectional communication between a MongoDB collection and Apache Hive.
# Clone and build the mongo-hadoop connector and the MongoDB Java driver, then copy the resulting jars into $HADOOP_HOME/lib (here /usr/lib/hadoop/lib) on every worker node
cd ~/
git clone https://github.com/mongodb/mongo-hadoop
cd mongo-hadoop
./gradlew jar
sudo cp build/libs/* /usr/lib/hadoop/lib
sudo cp core/build/libs/* /usr/lib/hadoop/lib
sudo cp hive/build/libs/* /usr/lib/hadoop/lib
cd ~/
git clone https://github.com/mongodb/mongo-java-driver
cd mongo-java-driver
./gradlew jar
sudo cp build/libs/* /usr/lib/hadoop/lib
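Before moving on to Hive, it can help to confirm that the copy steps above actually put the connector and driver jars in place on each node. The following is a minimal sketch, not part of the original gist: the function name `check_jars` is hypothetical, and the jar name patterns are assumptions based on the artifact names the builds typically produce (adjust them to whatever appears under `build/libs`).

```shell
# Hypothetical helper: report which expected jars are present in a lib directory.
# The three patterns below are assumed artifact names, not taken from the gist.
check_jars() {
  lib_dir="$1"
  missing=0
  for pattern in 'mongo-hadoop-core*.jar' 'mongo-hadoop-hive*.jar' 'mongo-java-driver*.jar'; do
    # The unquoted $pattern is expanded by the shell against the directory contents.
    if ls "$lib_dir"/$pattern >/dev/null 2>&1; then
      echo "found: $pattern"
    else
      echo "MISSING: $pattern"
      missing=1
    fi
  done
  # Non-zero exit status when at least one jar pattern matched nothing.
  return "$missing"
}

# Check the directory used in the copy steps above.
check_jars /usr/lib/hadoop/lib || echo "copy the missing jars before starting Hive"
```

Running the same check on every worker node (for example over ssh) is one way to catch a node that was skipped during the copy.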
hive> add jar /usr/lib/hadoop/lib/mongo-hadoop-hive-*.jar;
hive> add jar /usr/lib/hadoop/lib/mongo-java-driver.jar;
hive> create table mongo_table(example_column string)
> stored by 'com.mongodb.hadoop.hive.MongoStorageHandler'
> tblproperties('mongo.uri'='mongodb://localhost:27017/test.test');
hive> insert into table mongo_table select * from hive_table;
hive> create table mongo_mapped_table(hive_column string)
> stored by 'com.mongodb.hadoop.hive.MongoStorageHandler'
> with serdeproperties ('mongo.columns.mapping'='{"hive_column":"mongo_field"}')
> tblproperties('mongo.uri'='mongodb://n0.dev:27017/test.test');
-- If the connection needs an authSource parameter, the mongo.user and mongo.passwd table properties will not work. Put the user and password in the URI instead, as in: 'mongo.uri'='mongodb://user:password@localhost/test.test?authSource=admin'
hive> insert into table mongo_mapped_table select * from hive_table;
hive> select * from mongo_table;
hive> select * from mongo_mapped_table;
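The note above about authSource says the credentials have to be embedded in the `mongo.uri` itself. As a rough sketch of how that URI and the matching table definition could be assembled, the script below prints a `create table` statement with the credentials in place. The helper name `build_mongo_uri` and the table name `mongo_auth_table` are hypothetical, and the user, password, and host are the placeholder values from the note, not real settings.

```shell
# Hypothetical helper: assemble a mongo.uri that carries credentials and authSource.
# Characters such as '@', ':' or '/' in the user or password would need to be
# percent-encoded before being passed in; this sketch does not do that.
build_mongo_uri() {
  user="$1"; password="$2"; host="$3"; namespace="$4"; auth_source="$5"
  printf 'mongodb://%s:%s@%s/%s?authSource=%s\n' \
    "$user" "$password" "$host" "$namespace" "$auth_source"
}

# Placeholder values taken from the note above.
uri=$(build_mongo_uri user password localhost test.test admin)

# Print a table definition that uses the URI (mongo_auth_table is an assumed name).
cat <<EOF
create table mongo_auth_table(example_column string)
stored by 'com.mongodb.hadoop.hive.MongoStorageHandler'
tblproperties('mongo.uri'='$uri');
EOF
```

The printed statement can be pasted into the Hive CLI or saved to a file. Keep in mind that the password ends up in plain text in the table properties.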
@randerzander
Known to work with HDP 2.1.4; hive.execution.engine may be either mr or tez.
