@dmitrizagidulin
Last active August 29, 2015 14:21
Riak Cheat Sheet
{mode, max}.
{duration, 600}. % run for 10 hrs (duration is in minutes)
{concurrent, 25}. % concurrent worker threads
{driver, basho_bench_driver_riakc_pb}.
{riakc_pb_ips, [
{192,168,0,10},
{192,168,0,11}
]}.
{riakc_pb_search_queries, [
%% queries for everything, returning 50 rows
{<<"index1">>, "*:*", [{rows,50}]},
%% queries for documents where created_i >= 12345 (inclusive range)
{<<"index2">>, "created_i:[12345%20TO%20*]", [{rows,50}]}
]}.
{operations, [{search, 1}]}.

Creating a Cluster

Install Oracle Java 7 (Ubuntu 14.04 LTS)

apt-get install -y software-properties-common
add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-java7-installer

Install Riak (Ubuntu 14.04 LTS)

apt-get install -y libssl0.9.8 libpam0g-dev
wget http://s3.amazonaws.com/downloads.basho.com/riak/2.1/2.1.1/ubuntu/trusty/riak_2.1.1-1_amd64.deb
dpkg -i riak_2.1.1-1_amd64.deb

Relevant default directories:

  • Config: /etc/riak/riak.conf
  • Data Dir: /var/lib/riak
  • Solr index files: /var/lib/riak/yz
  • Logs: /var/log/riak/

Check the Riak config file

Typically located at /etc/riak/riak.conf.

  1. For a ~10 node cluster, increase the ring_size to 128 (the default is 64).
  2. Make sure search = on if you're going to use Riak Search / Solr
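
The two checks above can be scripted. The following is a minimal sketch, not an official Riak tool: the helper names `parse_riak_conf` and `check_settings` are hypothetical, and it assumes riak.conf uses plain `key = value` lines with `#` comments. It treats a missing `ring_size` line as the default of 64 and, as an assumption, a missing `search` line as `off`.

```python
def parse_riak_conf(text):
    """Parse 'key = value' lines from riak.conf-style text into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings, expected_ring_size=128):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    # ring_size falls back to 64 when the line is absent or commented out.
    ring_size = int(settings.get("ring_size", 64))
    if ring_size != expected_ring_size:
        problems.append("ring_size is %d, expected %d" % (ring_size, expected_ring_size))
    # Assumption: a missing 'search' line means search is off.
    if settings.get("search", "off") != "on":
        problems.append("search is not 'on'")
    return problems


# Hypothetical config fragment used only for illustration.
sample = """
## Number of partitions in the cluster
ring_size = 128
search = off
"""
print(check_settings(parse_riak_conf(sample)))
```

To check a real node, read /etc/riak/riak.conf into a string and pass it to `parse_riak_conf`.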

Bucket Types

Create & activate a bucket type named 'resources'

As of Riak 2.1, this can only be done from the command line.

riak-admin bucket-type create resources
riak-admin bucket-type activate resources

Riak Search (Solr/Yokozuna)

Creating / Uploading a custom schema

Assuming a file resource_schema.xml in local directory:

curl -XPUT http://localhost:8098/search/schema/resource_schema -H 'Content-Type:application/xml' --data-binary @resource_schema.xml

Index Creation

Create a new index named index1, using the schema named _yz_default. Using HTTP:

curl -XPUT -H 'Content-Type: application/json' http://localhost:8098/search/index/index1 -d '{"schema": "_yz_default"}'

In Python:

client.create_search_index('index1', '_yz_default')

Associate the index with EITHER the bucket type (recommended for production) OR a custom bucket:

Bucket Type: riak-admin bucket-type update resources '{"props": {"search_index":"index1"}}'

Bucket: via HTTP/command line:

curl -XPUT -H 'Content-Type: application/json' http://localhost:8098/types/default/buckets/my-bucket/props -d '{"props":{"search_index":"index1"}}'

or (python) bucket.set_properties({'search_index': 'index1'})

Search Queries

Remember, you're querying an index, not a bucket.

Match all documents in an index, return the first 20, and pipe the response through Python's JSON pretty-printer:

curl 'http://localhost:8098/search/query/index1?wt=json&q=*:*&rows=20' | python -m json.tool
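
Queries with spaces or brackets (such as range queries) need to be URL-encoded before they go into the query string. Here is a small sketch of building the same URL in Python using only the standard library. The function name `search_url` and its defaults are illustrative assumptions; the host, port and path come from the curl example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(index, query="*:*", rows=20, host="localhost", port=8098):
    """Build a Riak Search HTTP query URL for the given index."""
    # urlencode percent-encodes the Solr query (spaces, ':', '[', ']' ...).
    params = urlencode({"wt": "json", "q": query, "rows": rows})
    return "http://%s:%d/search/query/%s?%s" % (host, port, index, params)


url = search_url("index2", "created_i:[12345 TO *]", rows=50)
print(url)
# Round-trip check: decoding the query string gives back the raw Solr query.
print(parse_qs(urlsplit(url).query)["q"][0])
```

The resulting URL can be passed to curl or any HTTP client. Remember that the path names the index, not the bucket.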

Riak Python Client Installation

Riak Python Client Docs

Prerequisites (assumes pip is already installed; if not, run apt-get install -y python-pip):

apt-get install -y python-dev libffi-dev libssl-dev
pip install cryptography
pip install riak

Basho Bench

basho_bench Riak Benchmarking Docs

Installing Basho Bench

wget http://ps-tools.s3.amazonaws.com/basho-bench_0.10.0.83.gfffe40c-1_amd64.deb
dpkg -i basho-bench_0.10.0.83.gfffe40c-1_amd64.deb

Installing R (for generating graphs)

The version of the R language that comes with Ubuntu 14.04 LTS is old. To install the newest version, add a custom repo from an R repo mirror site:

sudo su
echo "deb http://lib.stat.cmu.edu/R/CRAN/bin/linux/ubuntu trusty/ #enabled-manually" >> /etc/apt/sources.list
exit
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9

Install the R package:

apt-get update
apt-get install -y r-base r-base-dev

Bench Configuration

The benchmarking config files live in /etc/basho_bench/. Sample test_pb.config file:

{mode, max}.
{duration, 1}.  % in minutes
{report_interval,1}.
{concurrent, 10}.  % # of concurrent workers

{driver, basho_bench_driver_riakc_pb}.

 % Generate keys from 1 to 1000 and stop
{key_generator, {int_to_bin_bigendian, {partitioned_sequential_int, 1000}}}.

{value_generator, {fixed_bin, 2000}}.  % 2kb object values (binary blob)

% List of IPs in the cluster. Note the commas instead of periods.
% If on AWS, use private instance IPs
{riakc_pb_ips, [
  {192,168,0,10},
  {192,168,0,11}
]}.  

% Operations:
% {get, 1}
% {put, 1}
% {update, 1}   <- this is 2 ops in one, read + put
{operations, [{put, 1}]}.

{pb_connect_options, [{auto_reconnect, true}]}.
{riakc_pb_replies, 1}.
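
Because riakc_pb_ips takes comma-separated tuples rather than dotted addresses, it is easy to mistype an entry by hand. The sketch below generates that term from ordinary dotted-quad strings. The helper names `erlang_ip_tuple` and `riakc_pb_ips_term` are hypothetical and not part of basho_bench.

```python
def erlang_ip_tuple(ip):
    """Convert '192.168.0.10' into the Erlang tuple text '{192,168,0,10}'."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and 0 <= int(o) <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: %r" % ip)
    return "{%s}" % ",".join(str(int(o)) for o in octets)


def riakc_pb_ips_term(ips):
    """Render a riakc_pb_ips config term for a list of dotted-quad IPs."""
    entries = ",\n".join("  " + erlang_ip_tuple(ip) for ip in ips)
    return "{riakc_pb_ips, [\n%s\n]}." % entries


print(riakc_pb_ips_term(["192.168.0.10", "192.168.0.11"]))
```

The printed term can be pasted into the config file in place of the hand-written list.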

Running Benchmarks

Create a directory that the test results will go into, for example, /home/ubuntu/bench_results.

cd /home/ubuntu
mkdir bench_results

Launch basho_bench with a given config file:

basho_bench --results-dir /home/ubuntu/bench_results/ /etc/basho_bench/test_pb.config

This creates a timestamp-based test directory in bench_results, and symlinks that directory to bench_results/current.
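
When scripting around several runs, it can help to resolve which timestamped directory `current` points at. This is a minimal sketch that assumes only the layout described above (a `current` symlink inside the results directory). The function name `latest_results_dir` and the run name in the demo are made up for illustration.

```python
import os
import tempfile


def latest_results_dir(results_root):
    """Return the real path of the run that 'current' points at."""
    current = os.path.join(results_root, "current")
    if not os.path.islink(current):
        raise FileNotFoundError("no 'current' symlink in %s" % results_root)
    return os.path.realpath(current)


# Demo against a throwaway directory that mimics the layout described above.
with tempfile.TemporaryDirectory() as root:
    run_dir = os.path.join(root, "20150520_120000")  # hypothetical run name
    os.mkdir(run_dir)
    os.symlink(run_dir, os.path.join(root, "current"))
    resolved = latest_results_dir(root)
    print(os.path.basename(resolved))
```

On a real machine, call `latest_results_dir("/home/ubuntu/bench_results")` after a benchmark run has finished.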

Creating Summary Graphs from Results

Rscript --vanilla /usr/lib/basho_bench/lib/basho_bench*/priv/summary.r -i /home/ubuntu/bench_results/current/

This generates a summary.png graph in the current results subdirectory.
