This Rally track is used to test the relationship between bulk indexing rejections and the following parameters:
- Number of concurrent clients indexing into Elasticsearch
- Number of shards actively being indexed into
- Number of data nodes in the cluster
- Size of bulk requests
The track contains a number of challenges, each indexing into an index with a set number of shards using an increasing number of concurrent client connections and two different bulk sizes.
For these benchmarks we have used clusters with one, two and three data nodes in Elastic Cloud, each data node with 8GB of RAM allocated (4GB heap, 4GB native memory). Rally has been invoked multiple times as follows (the challenge and user-tag parameters have been updated for each run to go through all challenges for each cluster):
esrally --track-path=$DIR/track.json --user-tag="data_nodes:1" --challenge=bulk-index-8_shards --target-hosts=$EC_CLUSTER:9243 --pipeline=benchmark-only --cluster-health=yellow --client-options="use_ssl:true,verify_certs:true,basic_auth_user:'$EC_USER',basic_auth_password:'$EC_PASSWORD'"
Here $DIR is the directory where the files from this gist reside, and $EC_CLUSTER, $EC_USER and $EC_PASSWORD are specific to the cluster being benchmarked. A separate Elastic Cloud cluster has been configured as the metrics store, which allows us to analyze the raw data using Kibana.
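
Since the same command is repeated for every challenge and every cluster size, the runs can be scripted. The following is a minimal sketch, not the script used for these benchmarks. It assumes $DIR, $EC_CLUSTER, $EC_USER and $EC_PASSWORD are set as described above. The function names run_challenge and run_all and the DRY_RUN switch are made up for this illustration, and the challenge names to pass in have to be taken from track.json.

```shell
# Run (or, with DRY_RUN=1, just print) one esrally invocation.
#   $1 = number of data nodes in the cluster, recorded in the user tag
#   $2 = challenge name as defined in track.json
run_challenge() {
  # $1 and $2 are expanded before "set --" replaces the positional
  # parameters, so the new argument list carries their values.
  set -- esrally \
    "--track-path=${DIR}/track.json" \
    "--user-tag=data_nodes:$1" \
    "--challenge=$2" \
    "--target-hosts=${EC_CLUSTER}:9243" \
    "--pipeline=benchmark-only" \
    "--cluster-health=yellow" \
    "--client-options=use_ssl:true,verify_certs:true,basic_auth_user:'${EC_USER}',basic_auth_password:'${EC_PASSWORD}'"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print one argument per line instead of running the benchmark.
    # Note that this includes the password.
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

# Run every challenge given on the command line against one cluster.
#   $1 = number of data nodes, remaining arguments = challenge names
run_all() {
  data_nodes="$1"
  shift
  for challenge in "$@"; do
    # Stop at the first failing run so a broken setup is noticed early.
    run_challenge "$data_nodes" "$challenge" || return 1
  done
}
```

For the single-node cluster this would be called as `run_all 1 bulk-index-8_shards` followed by the remaining challenge names. Setting DRY_RUN=1 first makes it easy to check the generated arguments before starting a long series of runs.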