ivansmf/README Secret

Last active October 7, 2020 01:57
How we hit one million Cassandra writes on Google Compute Engine - Reproducing Results
A. Disclaimers:
1. The scripts are offered for instructional purposes only.
2. cassandra.yaml and related configuration files are edited by one of the test scripts. We DO change the default settings to gain performance, and the new settings are specifically tuned for n1-standard-8 data nodes. For instance, we set the Java heap size to a large value that may not fit on other VM types.
B. Assumptions:
The scripts assume that your username on the target VMs is the same as on the local development server; more specifically, that the output of `whoami` on your development server matches its output on the VMs.
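A minimal sanity check for this assumption, runnable from the development server. The VM-side comparison is shown commented out because it needs a live instance; `l-1` is the loader created later in section D:

```shell
# Verify the username assumption the scripts depend on.
LOCAL_USER="$(whoami)"
echo "Local user: ${LOCAL_USER}"

# With a live cluster, compare against a VM, e.g.:
#   REMOTE_USER="$(gcutil --project=[project_name] ssh --zone=us-central1-b l-1 whoami)"
#   [ "$LOCAL_USER" = "$REMOTE_USER" ] || echo "username mismatch -- the scripts will fail"
```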
C. Prerequisites:
1. Download the test scripts archive to a local folder and untar it.
To download: wget
To untar: tar xzf cassandra_1m_writes_per_sec_gist.tgz
2. Download Cassandra binary distribution tarball into the tarballs folder. You can find detailed instructions at
$ curl -L | tar xz
3. Download the Oracle Java 7 tarball into the tarballs folder (we used server-jre-7u40-linux-x64.tar.gz). You can replace this step by installing the JDK, but we did not measure the performance impact of other releases. You can find the binary download here: (sign-up required).
4. You'll need to replace [project_name] with an actual project name or ID.
D. Creating the Cassandra cluster and data loaders:
1. Create disks.
$ gcutil --project=[project_name] adddisk --zone=us-central1-b --wait_until_complete --size_gb=1000 `for i in {1..300}; do echo -n pd1t-$i " "; done`
2. Create data nodes.
$ gcutil --project=[project_name] addinstance --zone=us-central1-b --add_compute_key_to_project --auto_delete_boot_disk --automatic_restart --use_compute_key --wait_until_running --image=debian-7-wheezy-v20131120 --machine_type=n1-standard-8 `for i in {1..300}; do echo -n cas-$i " "; done`
3. Create loaders.
$ gcutil --project=[project_name] addinstance --zone=us-central1-b --add_compute_key_to_project --auto_delete_boot_disk --automatic_restart --use_compute_key --wait_until_running --image=debian-7-wheezy-v20131120 --machine_type=n1-highcpu-8 `for i in {1..30}; do echo -n l-$i " "; done`
4. Attach the disks to data nodes.
$ for i in {1..300}; do gcutil --project=[project_name] attachdisk --zone=us-central1-b --disk=pd1t-$i cas-$i; done
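The backtick loops in steps 1-4 simply expand to a space-separated list of resource names before gcutil runs. A quick way to sanity-check that expansion (pure bash, no gcutil needed):

```shell
# Reproduce the name list that the backticked loop hands to gcutil.
names="$(for i in {1..300}; do echo -n "pd1t-$i "; done)"

# All 300 disk names should be present, from pd1t-1 to pd1t-300.
echo "$names" | wc -w                  # 300
echo "$names" | awk '{print $1, $NF}'  # pd1t-1 pd1t-300
```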
5. Authorize one of the loaders to ssh and rsync everywhere. Time to complete: about 5 minutes.
$ gcutil --project=[project_name] ssh --zone=us-central1-b l-1
$ ssh-keygen -t rsa
$ exit
6. Download the public key from l-1.
$ gcutil --project=[project_name] pull --zone=us-central1-b l-1 /home/`whoami`/.ssh/
7. Upload the key to all other VMs.
$ for i in {1..30}; do gcutil --project=[project_name] push --zone=us-central1-b l-$i /home/`whoami`/.ssh/; done
$ for i in {1..300}; do gcutil --project=[project_name] push --zone=us-central1-b cas-$i /home/`whoami`/.ssh/; done
8. Authorize l-1 to ssh into every VM in the project.
$ for vm in `gcutil --project=[project_name] listinstances | awk '{print $10;}' | sed ':a;N;$!ba;s/\n/ /g'`; do ssh -o UserKnownHostsFile=/dev/null -o CheckHostIP=no -o StrictHostKeyChecking=no -i /home/`whoami`/.ssh/google_compute_engine -A -p 22 `whoami`@$vm "cat /home/`whoami`/.ssh/ >> /home/`whoami`/.ssh/authorized_keys" ; done
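The address-extraction pipeline inside that one-liner is the tricky part: awk picks column 10 of the listinstances output, and the sed program `:a;N;$!ba;s/\n/ /g` slurps all lines into one pattern space and replaces the newlines with spaces. It can be exercised on fake output before touching real VMs (the hostnames below are made up):

```shell
# Stand-in for `gcutil listinstances` output: 10 whitespace-separated
# columns per row, with the address in column 10 (all values fabricated).
fake_listinstances() {
  echo "p1 p2 p3 p4 p5 p6 p7 p8 RUNNING cas-1.example"
  echo "p1 p2 p3 p4 p5 p6 p7 p8 RUNNING l-1.example"
}

# Same extraction as step 8: column 10, newlines flattened to spaces.
vms="$(fake_listinstances | awk '{print $10;}' | sed ':a;N;$!ba;s/\n/ /g')"
echo "$vms"   # cas-1.example l-1.example
```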
9. Generate the cluster configuration file.
$ echo SUDOUSER=\"`whoami`\" >benchmark.conf; echo DATA_FOLDER=\"cassandra_data\">>benchmark.conf ; for r in `gcutil 2>/dev/null --project=[project_name] listinstances --zone=us-central1-b | awk 'BEGIN {c=0; l=0;} /cas/ { print "CASSANDRA"++c"=\""$10":"$8":/dev/sdb\"";} /l\-[0-9]/ { print "LOAD_GENERATOR"++l"=\""$10"\""; }'`; do echo $r; done >> benchmark.conf
10. Upload all test scripts to l-1.
$ tar czf scripts.tgz *
$ gcutil --project=[project_name] push --zone=us-central1-b l-1 scripts.tgz /home/`whoami`
11. ssh to l-1 to set up the cluster.
$ gcutil --project=[project_name] ssh --zone=us-central1-b l-1
12. Unpack the scripts.
$ tar xzf scripts.tgz
13. Run the setup script. Please make sure that all nodes are up and running first.
$ ./
14. Run the tests.
$ ./
15. Gather results from each loader.
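One way to script that collection step, shown here as a dry run: `echo` prints each `gcutil pull` command instead of executing it, and the remote path `results-l-$i.log` is a hypothetical placeholder for wherever your test script actually writes its per-loader output.

```shell
# Dry run: print one `gcutil pull` per loader; drop the leading `echo`
# to actually fetch. results-l-$i.log is a hypothetical filename --
# substitute the real per-loader output file.
for i in {1..30}; do
  echo gcutil --project=[project_name] pull --zone=us-central1-b \
    l-$i /home/$(whoami)/results-l-$i.log .
done
```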
E. Deleting the cluster:
1. Delete data nodes.
$ gcutil --project=[project_name] deleteinstance --zone=us-central1-b `for i in {1..300}; do echo -n cas-$i " "; done` --force --delete_boot_pd
2. Delete data loaders.
$ gcutil --project=[project_name] deleteinstance --zone=us-central1-b `for i in {1..30}; do echo -n l-$i " "; done` --force --delete_boot_pd
3. Delete disks.
$ gcutil --project=[project_name] deletedisk --zone=us-central1-b `for i in {1..300}; do echo -n pd1t-$i " "; done` --force

tzach commented Jul 1, 2014

Thanks for sharing this.
I created a bash script based on the above.


pygupta commented Dec 9, 2014

Line 2: curl -L ... is missing at the end: | tar xz


Great effort! But can anyone explain how to choose the JVM heap size based on processor count and speed?
