Elasticsearch introduced the Snapshot and Restore API in version 1.0. With this module you can easily back up and restore your data.
To take snapshots or to restore them, you first need to create a repository. A repository is simply a named location where your snapshots are stored.
A repository can contain as many snapshots as you would like, and you can create any number of repositories.
Each repository is mapped to a location where your snapshot files will be stored. Elasticsearch started with snapshots to a filesystem location only; now you can take snapshots directly to remote locations like AWS S3 as well. Elasticsearch currently supports snapshots to AWS S3, HDFS, and Azure.
```
curl -XPUT 'http://localhost:9200/_snapshot/my_fs_repository' -d '{
  "type": "fs",
  "settings": {
    "location": "/data/es-backups/fs-repository",
    "compress": true
  }
}'
```

where my_fs_repository is the repository name and /data/es-backups/fs-repository is the location of the repository.
You need to create this location, and the elasticsearch user should have permission to access it:

```
sudo mkdir -p /data/es-backups/fs-repository
sudo chown elasticsearch /data/es-backups/fs-repository
sudo chgrp elasticsearch /data/es-backups/fs-repository
```
To take a snapshot:

```
curl -XPUT "localhost:9200/_snapshot/my_fs_repository/snapshot_name?wait_for_completion=true" -d '{
  "indices": "index_1,index_2,index_3",
  "ignore_unavailable": "true",
  "include_global_state": "false"
}'
```
This way you can specify which indices you want to include in the snapshot. All indices will be part of the snapshot if the indices parameter is not specified. The include_global_state parameter decides whether to store the cluster metadata, which includes persistent cluster settings and index templates.
To restore indices from a snapshot:

```
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/snapshot_name/_restore" -d '{
  "indices": "index_1,index_2",
  "ignore_unavailable": "true"
}'
```
To restore indices under different names, use rename_pattern and rename_replacement:

```
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/snapshot_name/_restore" -d '{
  "indices": "index_1,index_2",
  "ignore_unavailable": "true",
  "include_global_state": "false",
  "rename_pattern": "index_(.+)",
  "rename_replacement": "restored_index_$1"
}'
```
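The rename pair above is a plain regex substitution. As an illustration only (this is not an Elasticsearch command), the same substitution can be reproduced locally with sed to preview what the restored index names will look like; "$1" in the API corresponds to "\1" here:

```shell
# Preview the effect of rename_pattern "index_(.+)" with
# rename_replacement "restored_index_$1" on two index names.
printf 'index_1\nindex_2\n' | sed -E 's/^index_(.+)$/restored_index_\1/'
# -> restored_index_1
#    restored_index_2
```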
If you have got the snapshot data from a different machine, you will have to place those files in the repository location. The snapshot name can be found by looking at the files:

```
$ ls -lh /data/es-backups/fs-repository
-rw-r--r-- 1 elasticsearch elasticsearch   34 Apr 19 17:01 index
drwxr-xr-x 3 elasticsearch elasticsearch 4.0K Apr 19 17:01 indices
-rw-r--r-- 1 elasticsearch elasticsearch   61 Apr 19 17:01 metadata-snapshot_name
-rw-r--r-- 1 elasticsearch elasticsearch  193 Apr 19 17:01 snapshot-snapshot_name
```
Or, you can get the list of all available snapshots in a repository with:

```
curl -XGET "localhost:9200/_snapshot/my_fs_repository/_all?pretty"
```
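If you save that response to a file, the snapshot names can be pulled out quickly with sed. This is a rough sketch: the sample response below is a minimal, hand-written stand-in for real output, and the sed pattern only handles one snapshot per line; use a proper JSON tool for anything serious.

```shell
# Hand-written stand-in for a real _all response:
cat > /tmp/snapshots.json <<'EOF'
{"snapshots":[{"snapshot":"snapshot_name","state":"SUCCESS"}]}
EOF
# Extract the snapshot name field:
sed -E 's/.*"snapshot":"([^"]+)".*/\1/' /tmp/snapshots.json
# -> snapshot_name
```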
To delete a snapshot:

```
curl -XDELETE "localhost:9200/_snapshot/my_fs_repository/snapshot_name"
```
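Snapshot names must be unique within a repository, so for scheduled backups a common trick is a date-stamped name. A small sketch (the cron wiring itself is up to you; the repository name follows the earlier examples):

```shell
# Generate a date-stamped snapshot name, e.g. snapshot-2015.04.19,
# so a daily cron job never collides with an earlier snapshot.
SNAPSHOT_NAME="snapshot-$(date +%Y.%m.%d)"
echo "$SNAPSHOT_NAME"
# The snapshot would then be created with:
#   curl -XPUT "localhost:9200/_snapshot/my_fs_repository/${SNAPSHOT_NAME}?wait_for_completion=true"
```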
The elasticsearch-cloud-aws plugin needs to be installed to be able to use S3 as a snapshot location. Add your AWS credentials and repository settings to /etc/elasticsearch/elasticsearch.yml:

```
cloud:
  aws:
    access_key: AWS_ACCESS_KEY_ID
    secret_key: AWS_SECRET_ACCESS_KEY
repositories:
  s3:
    bucket: "es-snapshots"
    region: "us-west-1"
```
S3 bucket policy:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::es-snapshots"]
    },
    {
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::es-snapshots/*"]
    }
  ]
}
```
Create a bucket and an optional directory (the base path) to use as the location for S3, then register the repository:

```
curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
  "type": "s3",
  "settings": {
    "bucket": "es-snapshots",
    "region": "us-west-1",
    "base_path": "my_cluster"
  }
}'
```