
Automatic snapshot creation and deletion in an Elasticsearch cluster

Updated 4 months and 3 days ago

Curator is a handy tool for performing index and snapshot operations on Elasticsearch with simple commands. It gives us a way to automate snapshots instead of taking them manually. To use Curator, install it on one of the nodes in your cluster (installing on a cluster node is recommended to avoid snapshot repository issues).

What is Curator:

Just as a museum curator manages the exhibits and collections on display, Elasticsearch Curator helps you curate, or manage, your Elasticsearch indices.

Tested Environment:

Curator: v3.4


ES Version: v1.5.2

To work with Curator, we first need pip, the Python package installer, which is then used to install Curator on the server.

How to install pip:

  • pip is not installed by default on CentOS, Ubuntu, or RHEL, so it has to be installed as follows.

For RHEL 7.x and CentOS 7.x, run the following command:

yum install epel-release

For RHEL 6.x and CentOS 6.x:

rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Now your server is ready to install pip.

Execute the following command:

yum install -y python-pip

After pip is installed, use it to install or upgrade Curator.
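The install command itself does not appear above. A minimal sketch, assuming Curator is installed from PyPI under the package name elasticsearch-curator and kept on the 3.4 series listed under Tested Environment (the variable and function names are only for illustration):

```shell
# Assumption: Curator is published on PyPI as "elasticsearch-curator".
# The range keeps pip on the 3.4 series listed under "Tested Environment";
# newer major versions use a different command-line syntax.
CURATOR_SPEC='elasticsearch-curator>=3.4,<3.5'

install_curator() {
    pip install "$CURATOR_SPEC"
}

echo "To install: pip install '$CURATOR_SPEC'"
# install_curator        # uncomment to run the installation
# curator --version      # confirm which version was installed
```

Pinning the version range matters here because the commands in this post use the Curator 3.x command-line syntax.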

Some index operations:

Note: By default, Curator connects to localhost. If that does not work, pass the --host and --port options to connect to a particular cluster.

To list the indices on the cluster with Curator:

curator --host <host> --port 9200 show indices --all-indices

Snapshot Commands

Get the list of available snapshots in the repository with the following command.

curator --host <host> --port 9200 show snapshots --repository es_repo
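The snapshot commands in this section assume that the repository is already registered in Elasticsearch. As a rough sketch of that step, the following uses the Elasticsearch snapshot API to register a shared-filesystem repository. The URL, the location /mnt/es_backups, and the function names are assumptions for illustration, not values from this setup.

```shell
# Assumptions (not from the post): Elasticsearch listens on localhost:9200 and
# /mnt/es_backups is a shared filesystem path mounted on every node.
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="${REPO:-es_repo}"
LOCATION="${LOCATION:-/mnt/es_backups}"

repo_body() {
    # JSON body for a shared-filesystem ("fs") snapshot repository
    printf '{"type":"fs","settings":{"location":"%s","compress":true}}' "$LOCATION"
}

register_repo() {
    # PUT /_snapshot/<name> registers (or updates) the repository
    curl -XPUT "$ES_URL/_snapshot/$REPO" -d "$(repo_body)"
}

# Print what would be sent; call register_repo to actually register it.
echo "PUT $ES_URL/_snapshot/$REPO"
repo_body; echo
```

The location must be reachable from every node in the cluster, which is one common source of the repository issues mentioned earlier.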

Creating a snapshot of all indices:

curator --host <host> --port 9200 --master-only --logfile /tmp/curator_snapshot_`date +\%Y\%m\%d`.log --loglevel WARN --loglevel INFO snapshot --repository es_snapshots --name all_indices_`date +\%Y\%m\%d` indices --all-indices

Logfile: a new log file is created each day the snapshot runs. To automate the process, schedule the command above in crontab at a time that suits the cluster. The backslashes in \%Y\%m\%d are required in crontab, where an unescaped % is treated as a newline. The log file records when the snapshot ran and any errors that occurred.


--host <host> --port 9200: the host and port that Curator connects to.

--master-only: Curator runs the operation only if the node it connects to is the elected master. The same job can therefore be scheduled on more than one node and only the current master performs it, which helps with failover.

--logfile /tmp/curator_snapshot_`date +\%Y\%m\%d`.log --loglevel WARN --loglevel INFO: logging options. Curator writes to this file so that the snapshot status can be checked later.

snapshot --repository es_snapshots: the snapshot command and the name of the repository registered in Elasticsearch.

--name all_indices_`date +\%Y\%m\%d`: overrides Curator's default snapshot name with our own custom name (for example, all_indices_20140205).

indices --all-indices: selects the indices to include. Here, all indices in the cluster are snapshotted.
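The crontab scheduling mentioned above can be sketched with a small wrapper script, which keeps the crontab entry short and avoids the \% escaping. The script path, the schedule, the localhost default, and the DRY_RUN switch are all assumptions made for this illustration. The sketch also passes a single --loglevel INFO rather than both log levels.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/es_snapshot.sh.
ES_HOST="${ES_HOST:-localhost}"   # assumption: Curator runs on a cluster node
ES_PORT="${ES_PORT:-9200}"
REPO="${REPO:-es_snapshots}"      # repository name used in this post

build_snapshot_cmd() {
    # Date stamp for both the log file and the snapshot name (YYYYMMDD)
    stamp=$(date +%Y%m%d)
    echo "curator --host $ES_HOST --port $ES_PORT --master-only" \
        "--logfile /tmp/curator_snapshot_${stamp}.log --loglevel INFO" \
        "snapshot --repository $REPO --name all_indices_${stamp}" \
        "indices --all-indices"
}

# DRY_RUN=1 (the default here) only prints the command; DRY_RUN=0 runs it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    build_snapshot_cmd
else
    sh -c "$(build_snapshot_cmd)"
fi

# Example crontab entry (assumed schedule: every day at 01:30):
# 30 1 * * * DRY_RUN=0 /usr/local/bin/es_snapshot.sh
```

Running the script once by hand with the default DRY_RUN=1 is a cheap way to check the generated command before cron starts executing it.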

Deleting specific snapshots from the repository:

curator --host <host> --port 9200 --logfile /tmp/curator_delete_`date +\%Y\%m\%d`.log --loglevel WARN --loglevel INFO --timeout 300 delete snapshots --repository es_snapshots --time-unit days --older-than 9 --timestring \%Y\%m\%d

--timeout 300: Elasticsearch can take a while to remove a snapshot. Curator's default timeout is 30 seconds, and if the operation does not complete within that time, Curator exits with an error. To avoid this, we set the timeout to 300 seconds (5 minutes).

--time-unit days --older-than 9: deletes snapshots that are more than 9 days old.

--timestring \%Y\%m\%d: the date pattern to look for in snapshot names (here, the YYYYMMDD suffix set with --name), which is used to work out each snapshot's age.

Run on a schedule, this command automatically sweeps old snapshots out of the repository, keeping only about the last 8 days of snapshots.
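To make the retention rule concrete, here is a rough sketch of a date-based check on snapshot names. It is not Curator's own code, and the function name is hypothetical. It assumes GNU date (as shipped with CentOS and RHEL) and names ending in _YYYYMMDD. It compares whole days only, whereas Curator compares against the current time, so a snapshot exactly on the boundary may be treated differently.

```shell
# Sketch only: approximates the --older-than / --timestring name matching.
# Nothing here calls Curator or Elasticsearch.
is_expired() {
    # $1 = snapshot name such as all_indices_20140205
    # $2 = retention in days
    # $3 = "today" as YYYYMMDD (defaults to the current UTC date)
    name=$1
    days=$2
    today=${3:-$(date -u +%Y%m%d)}

    # Take the part after the last underscore and require exactly 8 digits
    stamp=${name##*_}
    case $stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac

    # GNU date: the day that lies $days days before $today
    cutoff=$(date -u -d "$today $days days ago" +%Y%m%d)

    # Expired when the snapshot's date is strictly before the cutoff day
    [ "$stamp" -lt "$cutoff" ]
}

for snap in all_indices_20140203 all_indices_20140205 all_indices_20140214; do
    if is_expired "$snap" 9 20140214; then
        echo "$snap: delete"
    else
        echo "$snap: keep"
    fi
done
```

Passing a fixed "today" as the third argument makes it easy to check a retention setting against a list of existing snapshot names before scheduling the real delete command.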