@untergeek
Last active April 2, 2018 23:54
Rollover, Snapshot, and Curator

Snapshot, Rollover, and Curator

Snapshot

Create a repository

Name the repository whatever you like; this example uses testrepository.

PUT /_snapshot/testrepository
{
  "type": "azure"
}

Azure Repository Information

  • Repository Plugin
  • Configuration

Snapshot all indices

By default, Elasticsearch snapshots all open indices to the specified repository (testrepository here) under the specified snapshot name (snapshot_1). wait_for_completion=true makes the request block until the snapshot has completed, rather than returning as soon as the snapshot starts.

PUT /_snapshot/testrepository/snapshot_1?wait_for_completion=true

Snapshot selected indices

You can specify which indices to snapshot:

PUT /_snapshot/testrepository/snapshot_2?wait_for_completion=true
{
  "indices": "index_1,index_2",
  "ignore_unavailable": true,
  "include_global_state": false
}

ignore_unavailable tells Elasticsearch to skip any listed indices that do not exist instead of failing the request, and include_global_state controls whether the global cluster state is stored with the snapshot. To let a snapshot proceed even when some primary shards are unavailable, set partial to true. These and other settings are described in the documentation.
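As a rough illustration, the two requests above can be assembled programmatically before handing them to whatever HTTP client you use. The helper name build_snapshot_request and its defaults are assumptions made for this sketch, not part of any Elasticsearch client library, and nothing is sent over the network.

```python
import json
from urllib.parse import quote, urlencode


def build_snapshot_request(repository, snapshot, indices=None,
                           ignore_unavailable=False,
                           include_global_state=True,
                           wait_for_completion=True):
    """Return (method, path, body) for a create-snapshot call (hypothetical helper)."""
    query = urlencode({"wait_for_completion": str(wait_for_completion).lower()})
    path = "/_snapshot/{}/{}?{}".format(quote(repository), quote(snapshot), query)
    if indices is None:
        # No body at all: Elasticsearch falls back to snapshotting all indices.
        return "PUT", path, None
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": ignore_unavailable,
        "include_global_state": include_global_state,
    }
    return "PUT", path, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    print(build_snapshot_request("testrepository", "snapshot_1"))
    print(build_snapshot_request("testrepository", "snapshot_2",
                                 indices=["index_1", "index_2"],
                                 ignore_unavailable=True,
                                 include_global_state=False))
```

Keeping request construction separate from sending makes it easy to inspect exactly what would be issued before touching a real cluster.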

Rollover

Create a "rollover" index tied to an alias

A "rollover" index has a name that ends with a hyphen and an incrementable number, e.g. testindex-000001.

PUT /testindex-000001 
{
  "settings":{
    "number_of_shards":1,
    "number_of_replicas":0
  },
  "aliases": {
    "testalias": {}
  }
}
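The numeric suffix is what a rollover increments. A minimal sketch of that naming rule follows, assuming the new number is zero-padded to six digits; next_rollover_name is a hypothetical helper written for illustration, not an Elasticsearch or Curator function.

```python
import re


def next_rollover_name(index_name):
    """Compute the index name a rollover would produce (illustrative only)."""
    match = re.match(r"^(.*-)(\d+)$", index_name)
    if match is None:
        raise ValueError(
            "index name must end with '-' and a number: {!r}".format(index_name))
    prefix, number = match.groups()
    # Increment the trailing number and pad it to six digits.
    return "{}{:06d}".format(prefix, int(number) + 1)


if __name__ == "__main__":
    print(next_rollover_name("testindex-000001"))
```

An index whose name does not follow this pattern cannot have its next name derived automatically, which is why the sketch raises an error in that case.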

Rollover using the API

Call the _rollover endpoint on the alias (testalias) to roll over the index it points to. If any one of the specified conditions is met, the rollover happens. You must specify at least one condition, but you do not need to specify several.

max_age takes time values like 1s (1 second) or 1d (1 day), or any other time notation Elasticsearch supports. max_size is available only in Elasticsearch 6.1 and later, and takes a size in bytes or with a unit suffix, such as g for gigabytes.

POST /testalias/_rollover
{
  "conditions": {
    "max_age": "1s",
    "max_docs": 100000,
    "max_size": "1g"
  }
}
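The "any one condition" rule can be sketched as a simple logical OR. This is only an illustration of the semantics: the function names, the reduced set of unit suffixes (s, m, h, d for time; b, k, m, g for size, 1024-based), and the use of >= for each comparison are assumptions of the sketch, and the real evaluation happens inside Elasticsearch.

```python
TIME_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
SIZE_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def to_seconds(value):
    """Convert a time value such as '1s' or '1d' to seconds."""
    return int(value[:-1]) * TIME_UNITS[value[-1]]


def to_bytes(value):
    """Convert a size such as '1g', '512mb', or '100b' to bytes."""
    text = value.lower()
    if text.endswith("b") and len(text) > 1 and text[-2] in "kmg":
        text = text[:-1]  # treat 'gb' the same as 'g'
    return int(text[:-1]) * SIZE_UNITS[text[-1]]


def matched_conditions(age_seconds, doc_count, size_bytes, conditions):
    """Return a dict of condition name -> whether that condition is met."""
    results = {}
    if "max_age" in conditions:
        results["max_age"] = age_seconds >= to_seconds(conditions["max_age"])
    if "max_docs" in conditions:
        results["max_docs"] = doc_count >= conditions["max_docs"]
    if "max_size" in conditions:
        results["max_size"] = size_bytes >= to_bytes(conditions["max_size"])
    return results


def should_rollover(age_seconds, doc_count, size_bytes, conditions):
    """Rollover happens if at least one specified condition is met."""
    return any(matched_conditions(
        age_seconds, doc_count, size_bytes, conditions).values())


if __name__ == "__main__":
    conditions = {"max_age": "1s", "max_docs": 100000, "max_size": "1g"}
    print(matched_conditions(5, 10, 1024, conditions))
    print(should_rollover(5, 10, 1024, conditions))
```

With the conditions from the request above, an index that is five seconds old rolls over on max_age alone, even though it is far below the document and size thresholds.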

Curator

Curator can help simplify index selection for snapshot and rollover, as well as other actions.

Curator is executed at the command line:

$ curator [--config /path/to/client.yml] [--dry-run] /path/to/action.yml

--dry-run shows what Curator would do without actually changing anything.

Installation (RedHat)

Full documentation at https://www.elastic.co/guide/en/elasticsearch/client/curator/current/yum-repository.html

  • rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
  • Add the following in your /etc/yum.repos.d/ directory in a file with a .repo suffix, for example curator.repo
[curator-5]
name=CentOS/RHEL 7 repository for Elasticsearch Curator 5.x packages
baseurl=https://packages.elastic.co/curator/5/centos/7
gpgcheck=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
  • yum install elasticsearch-curator

Client Configuration

Curator requires a client configuration file. If placed in $HOME/.curator/curator.yml, then the user executing Curator does not need to specify the client configuration file. Otherwise, --config /path/to/client.yml is required.

A client configuration may look like this:

---
client:
  hosts:
    - 127.0.0.1
    - 127.0.0.2
  port: 9200
  url_prefix:
  use_ssl: false
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: false
  http_auth:
  timeout: 30
  master_only: false

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

Multiple hosts must be in the same cluster. Curator simply round-robins the requests across each specified host.

Acceptable loglevel values include DEBUG, INFO, WARNING, ERROR, and CRITICAL. Curator will be very quiet with WARNING level, and quite verbose with DEBUG.

If no logfile is specified, Curator will log to STDOUT.
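The loglevel names match the standard level names in Python's logging module, so their ordering can be illustrated with the standard library alone. The helper visible_levels is hypothetical and exists only for this sketch; it shows which message severities would pass a given threshold, not how Curator configures its own loggers.

```python
import logging

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def visible_levels(loglevel):
    """List the severities that are emitted when the threshold is `loglevel`."""
    threshold = getattr(logging, loglevel)
    return [name for name in LEVELS if getattr(logging, name) >= threshold]


if __name__ == "__main__":
    for level in LEVELS:
        print(level, "->", visible_levels(level))
```

A WARNING threshold passes only three of the five severities, which is why that setting is so quiet, while DEBUG passes everything.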

Action Configuration

Rollover

Here is an example rollover action file:

---
actions:
  1:
    action: rollover
    description: >-
      Rollover the index associated with alias 'testalias'
    options:
      name: testalias
      conditions:
        max_age: 1d
        max_docs: 1000000
        max_size: 1g

Snapshot

Here is an example snapshot action file:

---
actions:
  1:
    action: snapshot
    description: >-
      Select "testindex-" prefixed indices which are not associated with alias "testalias".
      Put them in a snapshot named 'testsnapshot-%Y%m%d%H%M%S'.
      Wait for the snapshot to complete.  
      Do not skip the repository filesystem access check.  
      Use the other options as specified.
    options:
      repository: testrepository
      name: "testsnapshot-%Y%m%d%H%M%S"
      ignore_unavailable: false
      include_global_state: true
      partial: false
      wait_for_completion: true
      skip_repo_fs_check: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: testindex-
    - filtertype: alias
      aliases: testalias
      exclude: true
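To make the action file above more concrete, here is a small sketch of two things it relies on: expanding the strftime-style snapshot name, and applying the filters one after another so that each narrows the previous result. The function names and the sample index-to-alias mapping are made up for illustration, and the timestamp is passed in explicitly rather than assuming which clock Curator uses.

```python
from datetime import datetime


def render_snapshot_name(pattern, now):
    """Expand strftime directives such as %Y%m%d%H%M%S in a snapshot name."""
    return now.strftime(pattern)


def select_indices(indices, prefix, excluded_alias):
    """Keep indices that start with `prefix` and do not carry `excluded_alias`.

    `indices` maps index name -> set of alias names (hypothetical sample data).
    """
    # First filter: pattern / prefix.
    selected = [name for name in indices if name.startswith(prefix)]
    # Second filter: alias with exclude: true.
    selected = [name for name in selected if excluded_alias not in indices[name]]
    return sorted(selected)


if __name__ == "__main__":
    sample = {
        "testindex-000001": set(),
        "testindex-000002": {"testalias"},
        "otherindex-000001": set(),
    }
    print(render_snapshot_name("testsnapshot-%Y%m%d%H%M%S",
                               datetime(2018, 4, 2, 23, 54, 0)))
    print(select_indices(sample, "testindex-", "testalias"))
```

Excluding the index that currently holds the alias means the index still being written to is left out of the snapshot, while indices that have already been rolled over are included.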

Action chaining

An action file may contain more than one action. Actions are processed in numerical order.

---
actions:
  1:
    action: rollover
    description: ...
    ...
  2: 
    action: snapshot
    description: ...
    ...
  3:
    action: delete_indices
    description: ...
    ...
  ...
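Numerical order here means ordering by the value of the action number, not by its position in the file. A minimal sketch of that idea, using a plain dictionary in place of a parsed action file (the ordered_actions helper is hypothetical and is not how Curator is implemented):

```python
def ordered_actions(actions):
    """Return action names sorted by their numeric key."""
    return [actions[key]["action"] for key in sorted(actions, key=int)]


if __name__ == "__main__":
    # Deliberately listed out of order to show that the key decides the order.
    actions = {
        3: {"action": "delete_indices"},
        1: {"action": "rollover"},
        2: {"action": "snapshot"},
    }
    print(ordered_actions(actions))
```

Ordering rollover first, then snapshot, then delete_indices means an index is rolled over before it is snapshotted, and snapshotted before anything is deleted.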