@RobertWSaunders
Created October 8, 2018 18:11
#!/bin/bash
# Assignment #1 NoSQL Databases Part 2
# Class: CMPE 432 Advanced Databases
# Authors: Robert Saunders, Chris Jones
# Student Numbers: 10194030, 10177155
# Script to create a single MongoDB instance.
# This script also creates a database and imports the movie and ratings data obtained by downsampling dataset 1.
# This script requires mongo, mongod and mongoimport to be installed and available on your PATH.
# For the sake of the assignment we run this on our local machine, which keeps the machine specs the same across runs.
# Please accept the echo commands as comments to describe our code.
echo "Killing all mongod instances!"
killall mongod
echo "Waiting 5 seconds for mongod instances to shut down!"
sleep 5
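# Optional sketch (not part of the original assignment script): rather than relying on a fixed sleep,
# one could poll until no mongod process remains. wait_for_exit is a hypothetical helper name,
# and the sketch assumes pgrep is available on the machine.
wait_for_exit() {
  local name="$1" timeout="${2:-10}" waited=0
  # pgrep -x succeeds while a process with exactly this name is still running.
  while pgrep -x "$name" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
# Example usage, in place of the sleep above: wait_for_exit mongod 10 || echo "mongod is still running!"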
# NOTE: For the next three steps the path to the data files is dependent on where you store your mongo data locally.
echo "Removing existing data files for the Mongo instance!"
rm -rf ~/mongo-data/cmpe432-data/single-instance-database
echo "Making new directory for data files of mongo instance!"
mkdir -p ~/mongo-data/cmpe432-data/single-instance-database
echo "Starting up mongo instance!"
mongod --logpath "r0.log" --dbpath ~/mongo-data/cmpe432-data/single-instance-database --port 27017 --fork
echo "Waiting 5 seconds to ensure instances fully boot up!"
sleep 5
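# Optional sketch (not part of the original assignment script): a fixed sleep can be replaced by
# retrying a readiness check until it succeeds. retry_until is a hypothetical helper name.
retry_until() {
  local timeout="$1" waited=0
  shift
  # Run the given command about once per second until it succeeds or the timeout (in seconds) is reached.
  until "$@" > /dev/null 2>&1; do
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}
# Example usage, assuming a ping through the mongo shell is an acceptable readiness check:
# retry_until 30 mongo --port 27017 --quiet --eval "db.runCommand({ ping: 1 })"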
echo "Dropping cmpe432-database database from the instance in case this script has already been run before!"
mongo mongodb://localhost:27017/cmpe432-database --eval "db.dropDatabase();"
# NOTE: You can comment out the below two steps if your dataset is already tab separated.
# NOTE: This relies on you having a dataset1 directory with the data files wherever you are executing this script.
echo "Converting movies data to tab separated for mongoimport tool!"
LC_ALL=C sed -i -e $'s/|/\t/g' dataset1/movies.txt
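# Illustrative sketch (not part of the original assignment script): the same pipe-to-tab conversion
# applied to standard input instead of editing a file in place. pipes_to_tabs is a hypothetical
# helper name, and the sample record in the usage line is made up.
pipes_to_tabs() {
  local tab
  # printf produces a literal tab character, so the sed expression does not depend on \t support.
  tab=$(printf '\t')
  LC_ALL=C sed -e "s/|/${tab}/g"
}
# Example usage: printf '1|Some Title|01-Jan-1995\n' | pipes_to_tabs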
# NOTE: You can comment out the below two steps if your dataset is already imported into the database.
echo "Importing the ratings data into a ratings collection in a new database called cmpe432-database!"
mongoimport --uri mongodb://localhost:27017/cmpe432-database --collection "ratings" --columnsHaveTypes --fields "userId.int64(), movieId.int64(), rating.double(), timestamp.string()" --type tsv --file ../random_data/netIDs.data
# NOTE: The document design below is less than ideal; this is a limitation of the import tool, as outlined in the report.
echo "Importing the movies data into a movies collection into the new database called cmpe432-database!"
mongoimport --uri mongodb://localhost:27017/cmpe432-database --collection "movies" --columnsHaveTypes --fields "movieId.int64(), movieTitle.string(), releaseDate.string(), videoReleaseDate.string(), imdbUrl.string(), unknownGenre.boolean(), actionGenre.boolean(), adventureGenre.boolean(), animationGenre.boolean(), childrensGenre.boolean(), comedyGenre.boolean(), crimeGenre.boolean(), documentaryGenre.boolean(), dramaGenre.boolean(), fantasyGenre.boolean(), filmNoirGenre.boolean(), horrorGenre.boolean(), musicalGenre.boolean(), mysteryGenre.boolean(), romanceGenre.boolean(), scifiGenre.boolean(), thrillerGenre.boolean(), warGenre.boolean(), westernGenre.boolean()" --type tsv --file dataset1/movies.txt
echo "Mongo instance fully configured and all data has been imported!"
echo "Connecting to mongo shell of the single instance!"
mongo mongodb://localhost:27017/cmpe432-database