@hengxin
Last active May 20, 2020 20:24
How to install Cassandra on Ubuntu?

0. Overview

This article aims to be a single entry point for installing and configuring Cassandra. It is meant to be self-contained, and it also links to the original articles.

Warning: This article is still being written.

Prerequisites

Install Oracle Java Development Kit (JDK)

See the original article on wikiHow.

Although JRE is sufficient to run Java applications (like Cassandra here), I recommend installing JDK in case you need to write or hack them.

  1. Clear OpenJDK if necessary

  2. Download JDK

A warning first: the latest version of the JDK as of 07-2014 is JDK_1.8.0_11 (JDK8u11). However, there is a known bug with Java 8: it is not compatible with the antlr version used to compile the Cassandra source. You should upgrade to antlr >= 2.5.2. I failed to do this. Therefore, in this article, I will stick to Java 7. (The commands below still show the jdk-8u11 file name; substitute the name of the archive you actually downloaded.)

Download it to (by default) the directory ~/Download.

Tips: Make sure you download the JDK build that matches your system's word size (32-bit/64-bit). Run getconf LONG_BIT to check it for your operating system (See here).

  3. Make a directory; copy and uncompress the Oracle JDK into it

sudo mkdir -p /usr/local/java    # make directory

sudo cp -r ~/Download/jdk-8u11-linux-i586.tar.gz /usr/local/java/    # copy

cd /usr/local/java/

sudo tar xvzf jdk-8u11-linux-i586.tar.gz    # uncompress

After that, double-check your directory: [no jre-]
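The make-directory, unpack, and double-check steps above can be sketched as one small shell function. This is only a sketch under assumptions, not the method of the original article: the name unpack_into is hypothetical, the archive is assumed to be a gzip-compressed tarball, and the intermediate copy is skipped by letting tar extract straight into the target directory with -C.

```shell
# unpack_into ARCHIVE DEST: extract a .tar.gz archive into DEST and list DEST.
# Hypothetical helper; assumes ARCHIVE is a gzip-compressed tarball.
unpack_into() {
    archive=$1
    dest=$2
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1                 # make directory
    tar xzf "$archive" -C "$dest" || return 1    # uncompress into DEST
    ls "$dest"                                   # double-check what was unpacked
}
```

For a system directory such as /usr/local/java the function needs root privileges, so it would have to run in a root shell, for example as unpack_into ~/Download/jdk-8u11-linux-i586.tar.gz /usr/local/java.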

Restart your computer.

Install Oracle Java SE Runtime Environment (JRE)

See the original article here.

Skip this step, if you have installed JDK.

You must configure your operating system to use the Oracle JRE, not the OpenJDK. The 64-bit version of Java 7 is officially recommended. The minimum supported version is 1.7.0_25. The latest Java 8 release, JRE 1.8.0_05, is also available.

Warning: There is a known bug

Tips: Make sure you download the JRE build that matches your system's word size (32-bit/64-bit). Run getconf LONG_BIT to check it for your operating system (See here).

In my setting, jre-8u5-linux-i586.tar.gz (32-bit) is used.
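The word-size check in the tip above can be scripted. The following is a minimal sketch, assuming Oracle's archive naming in which 32-bit Linux builds end in i586 (as in the file name above) and 64-bit builds end in x64; the function name arch_suffix is hypothetical.

```shell
# arch_suffix BITS: map the output of `getconf LONG_BIT` to the suffix used in
# Oracle's Linux archive names (assumed: i586 for 32-bit, x64 for 64-bit).
arch_suffix() {
    case "$1" in
        32) echo "i586" ;;
        64) echo "x64" ;;
        *)  echo "unsupported word size: $1" >&2; return 1 ;;
    esac
}

# Example: print the JRE file name expected for this machine.
echo "jre-8u5-linux-$(arch_suffix "$(getconf LONG_BIT)").tar.gz"
```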

  1. Make a directory for the JRE:

sudo mkdir -p /usr/lib/jvm

  2. Unpack the tarball and install the JRE:

sudo tar zxvf jre-8u5-linux-i586.tar.gz -C /usr/lib/jvm

Notice: The JRE files are installed into a directory called "/usr/lib/jvm/jre1.8.0_05".

  3. Tell the system that there's a new Java version available:

sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jre1.8.0_05/bin/java" 1

Notice: If updating from a previous version that was removed manually, execute the above command twice, because you'll get an error message the first time.

  4. Set the new JRE as the default:

sudo update-alternatives --set java /usr/lib/jvm/jre1.8.0_05/bin/java

  5. Make sure your system is now using the correct JRE.

Run java -version; you should get a message like this:

java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) Server VM (build 25.5-b02, mixed mode)
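The check in the last step can also be done by a script that compares the reported version with the minimum supported version, 1.7.0_25. This is a sketch under assumptions: both function names are hypothetical, and the comparison relies on sort -V (version sort), which is available in GNU coreutils but may be missing elsewhere.

```shell
# java_version_string: read `java -version` output on stdin and print the
# quoted version string from its first line.
java_version_string() {
    sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# at_least MIN VERSION: succeed if VERSION sorts at or after MIN.
# Assumes a `sort` that supports -V (version sort).
at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

Since java -version writes to standard error, a call would look like at_least 1.7.0_25 "$(java -version 2>&1 | java_version_string)" && echo "Java version is supported".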

Install Java Native Access (JNA)

Java Native Access (wiki) provides Java programs easy access to native shared libraries without using the Java Native Interface. Installing JNA can improve Cassandra memory usage.

To install JNA, just type the command sudo apt-get install libjna-java.

1. Install Cassandra using APT repositories

See the original article [here](http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installDeb_t.html?).

The following is a production-grade installation. For a quick test on Cassandra clusters on a local machine, go directly to the next section: Install Cassandra Cluster Manager (CCM).

Procedure of installing Cassandra

  1. Add the DataStax Community repository to "/etc/apt/sources.list.d/cassandra.sources.list":

echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list

  2. Add the DataStax repository key to your aptitude trusted keys:

curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -

Tips: cURL is a command line tool for transferring data with URL syntax, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, and so on.

  3. Install the package:

Run sudo apt-get update, and then sudo apt-get install dsc20.

Tips: dsc20 is a free packaged distribution of the Apache Cassandra database. The latest version as of 07-16-2014 is 2.0.9-1.
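One detail of the repository step: tee -a appends, so running that command a second time adds the same deb line twice. A minimal sketch of an append that is safe to repeat follows; the helper name add_line_once is hypothetical and not part of the DataStax instructions.

```shell
# add_line_once LINE FILE: append LINE to FILE unless an identical line is
# already there. Hypothetical helper; FILE is created if it does not exist.
add_line_once() {
    line=$1
    file=$2
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0                      # already present, nothing to do
    fi
    printf '%s\n' "$line" >> "$file"
}
```

Because the sources file is owned by root, the call add_line_once "deb http://debian.datastax.com/community stable main" /etc/apt/sources.list.d/cassandra.sources.list would need to run in a root shell.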

At this point, Cassandra has been installed and its service has been started automatically. Let us review some of Cassandra's files before playing with it.

Files in Cassandra

The file structure of Cassandra has been described in Packaged Install Directories.

The configuration files are located in the following directories: [image: cassandra-config]

The packaged releases install into these directories: [image: cassandra-release]

Tips: There are two common ways to find the Cassandra-related files. First, use the locate filename command; remember to update its index first with sudo updatedb. See this post. Second, run the dpkg -L cassandra command to list all the files in the package. See this post.
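The dpkg -L listing is long, so it can help to filter it. The sketch below assumes the listing has one path per line, as dpkg -L prints it, and keeps only the names of files directly under /usr/bin/; the function name bin_names is hypothetical.

```shell
# bin_names: read a list of installed paths on stdin (one per line) and print
# the base names of those that sit directly under /usr/bin/.
bin_names() {
    sed -n 's|^/usr/bin/\([^/][^/]*\)$|\1|p'
}
```

A typical call would be dpkg -L cassandra | bin_names.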

  • /usr/bin/~ :

    1. cassandra-cli: command-line interface
    2. cqlsh: the Python-based command-line client for CQL, run on the command line of a Cassandra node.

    Notice: As explained in Cassandra Wiki, the Cassandra CLI is a holdover from the pre-CQL Thrift API. You SHOULD migrate to cqlsh.

    3. cassandra-stress: (automatic) stress test of Cassandra
    4. nodetool: a utility for inspecting a cluster to determine whether it is properly configured, and to perform a variety of maintenance operations
    5. sstable~: utilities that work on SSTable data files (e.g. sstableloader, sstablescrub)
  • /var/log/cassandra: log files

  • /etc/cassandra/~:

  1. cassandra.yaml: configuration file

Notice: The default storage-conf.xml file in older Cassandra versions (<0.7 ?) has been replaced by cassandra.yaml. See the question.

  • More ...

Playing with Cassandra

To start the Cassandra service, run the command cassandra -f with /usr/bin/ as the current working directory (the -f option keeps the process in the foreground).

Notice: In case the service has been started (e.g. started automatically immediately after being installed), an error arises: Error: Exception thrown by the agent : java.lang.NullPointerException. See the post and the Fixed Bug.
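To avoid this error, one can check whether the service is already running before starting it. The following is a sketch under assumptions: it takes a process listing on standard input and looks for Cassandra's main class, org.apache.cassandra.service.CassandraDaemon, in the JVM command line; the function name cassandra_daemon_listed is hypothetical.

```shell
# cassandra_daemon_listed: succeed if the process listing on stdin contains
# Cassandra's main class. The [C] bracket keeps the pattern from matching the
# grep process itself when the listing comes from ps.
cassandra_daemon_listed() {
    grep -q 'org\.apache\.cassandra\.service\.[C]assandraDaemon'
}
```

For example, ps -eo args | cassandra_daemon_listed && echo "Cassandra is already running".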

To play with Cassandra, run the command cqlsh with /usr/bin/ as the current working directory. To connect cqlsh to a different node, specify its IP address and port, such as cqlsh 192.168.0.10 8080. See Starting cqlsh on Linux.

There is plenty of material on CQL in the DataStax documentation.

2. Install Cassandra Cluster Manager (CCM)

See CCM and README on GitHub.

Prerequisites

  • A working Python installation (tested to work with Python 2.7). Python is usually pre-installed on Ubuntu
  • Install python-pip: sudo apt-get install -y python-pip
  • Install six: sudo pip install six (and sudo pip install six --upgrade if necessary)
  • Install cql: sudo pip install cql
  • Install PyYAML: sudo pip install PyYAML

Possible error: libyaml is not found, or a compiler error: forcing --without-libyaml. See pip-PyYAML for complete information. For now I ignore this error, following the issue on GitHub: PyYAML will fall back to a pure-Python parser if libyaml is not available, so even when it says it failed, everything is fine. I will come back to this problem (maybe using easy_install) if it really matters.

  • Install git: sudo apt-get install git
  • Install ant: sudo apt-get install ant
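After working through the list above, a quick way to see whether anything is still missing is to look each command up on the PATH. This is a minimal sketch; the function name missing_tools is hypothetical, and the command names python, pip, git, and ant are assumed to be what the packages above install.

```shell
# missing_tools NAME...: print each NAME that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: check the commands the prerequisites above are expected to provide.
missing_tools python pip git ant
```

If the call prints nothing, all four commands were found. The Python modules (six, cql, PyYAML) are not covered by this check.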