# Notes on Hive and Shark

## Usage Instructions

1. Unpack the `.tar` archive into your home directory (the archive is not ready yet)
2. `qsub hive-and-shark.pbs`
3. `ssh <job root node>`
4. `cd ~/hive-and-shark`
5. `source ./start-shark.sh` or `source ./start-hive.sh`
	- Use `source` rather than executing the script. `source` runs it in your current shell, so the environment variables it sets stay in your session.
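
To illustrate why step 5 needs `source`, here is a minimal sketch. The temporary script and the `DEMO_SHARK_HOME` variable are hypothetical stand-ins, not the contents of `start-shark.sh`. The sketch uses `.`, the POSIX spelling of `source`, so it also runs under plain `sh`.

```shell
# Hypothetical stand-in for start-shark.sh: it only exports one variable.
unset DEMO_SHARK_HOME
demo_script="$(mktemp)"
echo 'export DEMO_SHARK_HOME="$HOME/hive-and-shark/shark-0.8.0"' > "$demo_script"

# Executing runs the script in a child process; its exports end with it.
sh "$demo_script"
after_exec="${DEMO_SHARK_HOME:-unset}"

# Sourcing runs the script in the current shell, so the export persists.
. "$demo_script"
after_source="${DEMO_SHARK_HOME:-unset}"

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
rm -f "$demo_script"
```

After executing, the variable is still unset in the calling shell. After sourcing, it holds the exported path.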

## Files I had to edit

This list omits the steps already covered in [Running Shark on a Cluster](https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster).

1. `hive-and-shark.pbs`
2. `start-hive.sh` and `start-shark.sh`
3. `etc/core-site.xml` (fixed the `hadoop.tmp.dir` value)
4. `spark-0.8.0/conf/spark-env.sh` (point to the Scala library)
5. `shark-0.8.0/conf/shark-env.sh` (environment variables, etc.)
6. `hive/conf/hive-env.sh` (point to the configuration directory)

## Files that need to be updated per run

1. `spark-0.8.0/conf/spark-env.sh` (memory setting, updated with `sed`)
2. `shark-0.8.0/conf/shark-env.sh` (memory setting, updated with `sed`)
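
The per-run memory update could look roughly like the sketch below. These notes do not say which line `sed` rewrites, so the `SPARK_MEM` variable, the `4g` value, and the sample file contents are all assumptions. The sketch works on a temporary stand-in file rather than the real `shark-env.sh`.

```shell
# Stand-in for conf/shark-env.sh so the sketch runs anywhere (assumed contents).
conf_file="$(mktemp)"
printf '%s\n' 'export SPARK_MEM=1g' 'export SCALA_HOME=/path/to/scala' > "$conf_file"

new_mem="4g"   # hypothetical per-run value

# Rewrite only the SPARK_MEM line; every other line passes through unchanged.
# Writing to a new file and moving it avoids relying on a non-portable `sed -i`.
sed "s/^export SPARK_MEM=.*/export SPARK_MEM=${new_mem}/" "$conf_file" > "$conf_file.new"
mv "$conf_file.new" "$conf_file"

updated_line="$(grep '^export SPARK_MEM=' "$conf_file")"
other_line="$(grep '^export SCALA_HOME=' "$conf_file")"
echo "$updated_line"
rm -f "$conf_file"
```

In a real run, `conf_file` would point at the actual config file and `new_mem` would come from the job's resource request.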