Cassandra Install Instructions

Cassandra Install on Digital Ocean

The systems

The 512 MB/1 CPU system is too small. The 1 GB/1 CPU system can technically run Cassandra, but it is too small for anything beyond a quick test. You should likely start with 2 GB/2 CPU systems, and expect to upgrade.

Create all of them in the same region for now.

I went with Ubuntu 14, and the commands here presume that. The commands were run from the root account. This guide also presumes that you are building a multi-node cluster (3 nodes are used here).

Setting up the system

Install Oracle Java

I was able to get away with other Java flavors on a single-node install. On the cluster this did not work. Go with Oracle Java 7 until Cassandra/DataStax say otherwise.

Install Java

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer

Set environment variables

nano /etc/profile
    JAVA_HOME=/usr/lib/jvm/java-7-oracle
    JRE_HOME=/usr/lib/jvm/java-7-oracle
    PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
    export JAVA_HOME
    export JRE_HOME
    export PATH

Install and set alternative Javas

sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/java-7-oracle/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/java-7-oracle/bin/javac" 1
sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/lib/jvm/java-7-oracle/bin/javaws" 1
sudo update-alternatives --set java /usr/lib/jvm/java-7-oracle/bin/java
sudo update-alternatives --set javac /usr/lib/jvm/java-7-oracle/bin/javac
sudo update-alternatives --set javaws /usr/lib/jvm/java-7-oracle/bin/javaws

sudo reboot
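
After the reboot it is worth confirming that the Oracle JVM is the one actually being picked up. The following is a minimal sketch, not part of the original steps: `java_major` is a hypothetical helper name, and it assumes the usual banner format printed by `java -version` (for example `java version "1.7.0_80"`).

```shell
# Hypothetical helper: print the Java major version found in a
# "java -version" banner line such as: java version "1.7.0_80"
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # legacy scheme: 1.7.0_80 -> 7
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # newer scheme: 11.0.2 -> 11
  esac
}

# Only query the JVM if one is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  banner=$(java -version 2>&1 | head -n 1)
  echo "Java major version: $(java_major "$banner")"
  echo "JAVA_HOME: ${JAVA_HOME:-<not set>}"
fi
```

If the reported major version is not 7, or JAVA_HOME is not set, recheck /etc/profile and the update-alternatives settings before installing Cassandra.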

Install Cassandra

Note: the DataStax docs use nano rather than vi/vim, so nano is used here too.

First set up the iptables rules and make them persist

sudo iptables -A INPUT -p tcp --dport 7000 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 7199 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 9042 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 9160 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 61620 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 61621 -j ACCEPT

sudo apt-get install iptables-persistent 
Check out how to use iptables-persistent here: http://www.microhowto.info/howto/make_the_configuration_of_iptables_persistent_on_debian.html
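
The six rules above differ only in the port number, so they can be generated from a list. This is a sketch under that assumption; `cassandra_iptables_rules` is a made-up function name, and it only prints the commands so they can be reviewed before anything touches the firewall.

```shell
# Print (rather than run) one iptables ACCEPT rule per Cassandra/OpsCenter port.
cassandra_iptables_rules() {
  # 7000 inter-node traffic, 7199 JMX, 9042 CQL native clients,
  # 9160 Thrift clients, 61620/61621 OpsCenter and its agents
  for port in 7000 7199 9042 9160 61620 61621; do
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
  done
}

# Review the generated rules first.
cassandra_iptables_rules

# Then apply them (left commented out on purpose):
# cassandra_iptables_rules | sudo sh
```

Keeping the ports in one list makes it easier to repeat the same setup on every node.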

Add the DataStax Community repository to the /etc/apt/sources.list.d/cassandra.sources.list

echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list

Add the DataStax repository key to your apt trusted keys

curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -

Install the package.

sudo apt-get update
sudo apt-get install dsc20=2.0.11-1 cassandra=2.0.11

The version may change; check http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installDeb_t.html

nano /etc/cassandra/cassandra-env.sh

Uncomment the line starting with JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname= and replace the placeholder text after it with the IP of the machine

Stop the service on each node and clear the initial gossip history that gets populated by this initial start:

sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/system/*
sudo rm -rf /var/lib/cassandra/*

Update configuration file Node 0

nano /etc/cassandra/cassandra.yaml
    cluster_name: 'HackAndSlash'
    seeds: "xxx.xxx.xxx.xxx,xxx.xxx.xxx.xxx" **[Do not use all nodes as seed nodes]**
    listen_address: xxx.xxx.xxx.xxx  **[the ip of the node]**
    rpc_address: 0.0.0.0
    endpoint_snitch: RackInferringSnitch
    auto_bootstrap: false **[only if this is the first time the cluster is coming up]**

Update configuration file Node 1

nano /etc/cassandra/cassandra.yaml
    cluster_name: 'HackAndSlash'
    seeds: "xxx.xxx.xxx.xxx,xxx.xxx.xxx.xxx" **[Do not use all nodes as seed nodes]**
    listen_address: xxx.xxx.xxx.xxx  **[the ip of the node]**
    rpc_address: 0.0.0.0
    endpoint_snitch: RackInferringSnitch
    auto_bootstrap: false **[only if this is the first time the cluster is coming up]**

Update configuration file Node 2

nano /etc/cassandra/cassandra.yaml
    cluster_name: 'HackAndSlash'
    seeds: "xxx.xxx.xxx.xxx,xxx.xxx.xxx.xxx" **[Do not use all nodes as seed nodes]**
    listen_address: xxx.xxx.xxx.xxx  **[the ip of the node]**
    rpc_address: 0.0.0.0
    endpoint_snitch: RackInferringSnitch
    auto_bootstrap: false **[only if this is the first time the cluster is coming up]**
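
Since the three nodes differ only in listen_address, the edits can be scripted. The following is a rough sketch, not the author's method: `configure_node` is a hypothetical helper, the 10.0.0.x addresses are made up, and the scratch file only mimics the relevant lines of a stock cassandra.yaml (where `seeds` sits nested under `seed_provider`). It leaves auto_bootstrap alone, so that line would still be added by hand on first bring-up.

```shell
# Hypothetical helper: apply the settings above to a cassandra.yaml file.
# Usage: configure_node FILE LISTEN_IP "SEED_IP_1,SEED_IP_2"
configure_node() {
  sed -i.bak \
    -e "s|^cluster_name:.*|cluster_name: 'HackAndSlash'|" \
    -e "s|^\( *- seeds:\).*|\1 \"$3\"|" \
    -e "s|^listen_address:.*|listen_address: $2|" \
    -e "s|^rpc_address:.*|rpc_address: 0.0.0.0|" \
    -e "s|^endpoint_snitch:.*|endpoint_snitch: RackInferringSnitch|" \
    "$1"
}

# Demo on a scratch file (assumed layout, trimmed to the relevant keys).
demo=$(mktemp)
cat > "$demo" <<'EOF'
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
listen_address: localhost
rpc_address: localhost
endpoint_snitch: SimpleSnitch
EOF

# 10.0.0.x are made-up example addresses; two of three nodes act as seeds.
configure_node "$demo" 10.0.0.2 "10.0.0.1,10.0.0.2"
cat "$demo"
```

On a real node the first argument would be /etc/cassandra/cassandra.yaml; sed keeps the previous version next to it with a .bak suffix.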

Start the seed nodes first [very important], then start the remaining nodes

sudo service cassandra start

Check your node status using nodetool from the command line

nodetool status
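
In the status output, each node line starts with a two-letter code, and UN (Up/Normal) is what you want to see for all three nodes. A small sketch for counting them follows; `count_up_normal` is a hypothetical helper, and the sample text is an abbreviated, made-up stand-in for real output.

```shell
# Hypothetical helper: count nodes reported as Up/Normal ("UN") on stdin.
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Illustrative sample only; real output has more columns and real host IDs.
sample='Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load
UN  10.0.0.1  45.1 KB
UN  10.0.0.2  46.2 KB
DN  10.0.0.3  44.9 KB'

echo "Up/Normal nodes in sample: $(printf '%s\n' "$sample" | count_up_normal)"

# Against a live cluster, if nodetool is installed:
if command -v nodetool >/dev/null 2>&1; then
  echo "Up/Normal nodes in cluster: $(nodetool status | count_up_normal)"
fi
```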

Installing The OpsCenter (optional)

DataStax offers OpsCenter to help with monitoring the system. If you decide to use it, the smallest Digital Ocean droplet is enough for the OpsCenter server. I tried installing it on one of the Cassandra nodes instead, and that did not work well.

For the main server you would do this:

Add the DataStax Community repository to /etc/apt/sources.list.d/datastax.community.list

echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/datastax.community.list

Add the DataStax repository key to your apt trusted keys

curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -

Install the OpsCenter

sudo apt-get update
sudo apt-get install opscenter
sudo service opscenterd start

To activate authentication, edit the /etc/opscenter/opscenterd.conf file and change enabled = False to enabled = True. The default account is admin with a password of admin (change yours).
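
The same edit can be made non-interactively. This is only a sketch: `enable_opscenter_auth` is a hypothetical helper, and it assumes the `enabled` flag lives under an `[authentication]` section header, so check your own opscenterd.conf before relying on it. The scratch file below is a made-up, trimmed-down layout.

```shell
# Hypothetical helper: set "enabled = True" only inside the [authentication] section.
enable_opscenter_auth() {
  sed -i.bak '/^\[authentication\]/,/^\[/ s/^enabled *= *False/enabled = True/' "$1"
}

# Demo on a scratch file; the sections below are assumed examples.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[webserver]
port = 8888
interface = 0.0.0.0

[authentication]
enabled = False

[example_other_section]
enabled = False
EOF

enable_opscenter_auth "$conf"
cat "$conf"
```

Restricting the substitution to one section avoids flipping an unrelated `enabled` flag elsewhere in the file. Restart opscenterd afterwards so the change takes effect.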

Automatic installation of the agent can be done via the OpsCenter web page. If you want to try the manual install, you can do the following:

On each of the nodes you need to install the OpsCenter agent:

sudo apt-get update
sudo apt-get install datastax-agent

In address.yaml set stomp_interface to the IP address that OpsCenter is using. (You may have to create the file.)

echo "stomp_interface: <reachable_opscenterd_ip>" | sudo tee -a /var/lib/datastax-agent/conf/address.yaml

Start the agent

sudo service datastax-agent start

Tuning the JVM

Check out the documentation here: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_tune_jvm_c.html

Edit /etc/cassandra/cassandra-env.sh (nano /etc/cassandra/cassandra-env.sh) and set MAX_HEAP_SIZE and HEAP_NEWSIZE accordingly
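
As a starting point, it helps to know what Cassandra would pick on its own. The sketch below reproduces the default sizing rule as I understand it from the stock cassandra-env.sh in the 2.0 line: max heap is max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB)), and the new generation is min(100 MB per core, heap/4). Treat that rule as an assumption and verify it against your own cassandra-env.sh; `heap_sizes` is a hypothetical helper name.

```shell
# Sketch of the assumed default sizing rule.
# Usage: heap_sizes RAM_IN_MB CPU_CORES
heap_sizes() {
  ram_mb=$1
  cores=$2

  # Half of RAM, capped at 1024 MB.
  half=$((ram_mb / 2))
  if [ "$half" -gt 1024 ]; then half=1024; fi

  # Quarter of RAM, capped at 8192 MB.
  quarter=$((ram_mb / 4))
  if [ "$quarter" -gt 8192 ]; then quarter=8192; fi

  # Max heap is the larger of the two.
  if [ "$half" -gt "$quarter" ]; then max_heap=$half; else max_heap=$quarter; fi

  # New generation: 100 MB per core, but no more than a quarter of the heap.
  new_size=$((cores * 100))
  if [ "$new_size" -gt $((max_heap / 4)) ]; then new_size=$((max_heap / 4)); fi

  echo "MAX_HEAP_SIZE=${max_heap}M HEAP_NEWSIZE=${new_size}M"
}

heap_sizes 1024 1   # 1 GB / 1 CPU system
heap_sizes 2048 2   # 2 GB / 2 CPU system
```

Under this rule a 1 GB system ends up with only a 512 MB heap, which is consistent with the sizing advice at the top of this guide.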

Then run sudo service cassandra stop, count to 15, and run sudo service cassandra start
