Elasticsearch Cluster Restore
This folder contains a Terraform module to restore backups of an Elasticsearch cluster from snapshots saved in S3. The module deploys a Lambda function that calls the Elasticsearch API to perform the cluster restore tasks described in the Elasticsearch documentation.
Restoring snapshots
Deploying this module only creates the Lambda function that contains the cluster restore code. To perform a restore, you need to invoke that function with the ID of the snapshot you want to restore. The backup module generates an ID for each snapshot it saves to S3 and writes it to its CloudWatch logs; search the logs for the string "Saving snapshot: <SNAPSHOT>". The snapshot index files stored alongside the backup data in S3 also contain this information.
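As a rough illustration of the log search described above, the following Python sketch pulls snapshot IDs out of log messages. It assumes the log line has exactly the form "Saving snapshot: <SNAPSHOT>" and that the ID contains no whitespace; the helper names are hypothetical, and the log group name you pass in must be the one your backup Lambda actually writes to.

```python
import re

# Assumption: the backup module logs "Saving snapshot: <SNAPSHOT>" and the
# snapshot ID is a single run of non-whitespace characters.
SNAPSHOT_LINE = re.compile(r"Saving snapshot:\s*(\S+)")


def extract_snapshot_ids(messages):
    """Return the snapshot IDs found in an iterable of log messages, in order."""
    ids = []
    for message in messages:
        match = SNAPSHOT_LINE.search(message)
        if match:
            ids.append(match.group(1))
    return ids


def fetch_backup_log_messages(log_group_name):
    """Yield CloudWatch log messages that mention a saved snapshot."""
    import boto3  # imported lazily so the parsing helper works without boto3

    paginator = boto3.client("logs").get_paginator("filter_log_events")
    pages = paginator.paginate(
        logGroupName=log_group_name,
        filterPattern='"Saving snapshot"',  # quoted phrase match
    )
    for page in pages:
        for event in page["events"]:
            yield event["message"]
```

With AWS credentials configured, passing the result of fetch_backup_log_messages to extract_snapshot_ids should list the IDs that are candidates for a restore.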
Performing a restore is straightforward at this point: manually invoke the Lambda function via the AWS web interface or the AWS CLI. The ID of the snapshot to restore is specified in the event data passed to the Lambda:
{
"snapshotId": "<SNAPSHOT>"
}
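The same invocation can also be scripted. The sketch below uses boto3 to invoke the function with the event shown above; the function name is whatever you passed as the module's name input, and the helper names here are illustrative rather than part of the module.

```python
import json


def build_restore_event(snapshot_id):
    """Serialize the event the restore Lambda expects: {"snapshotId": ...}."""
    if not snapshot_id:
        raise ValueError("snapshot_id must be a non-empty string")
    return json.dumps({"snapshotId": snapshot_id}).encode("utf-8")


def invoke_restore(function_name, snapshot_id):
    """Invoke the restore Lambda synchronously and return the HTTP status code."""
    import boto3  # imported lazily; requires AWS credentials at call time

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=build_restore_event(snapshot_id),
    )
    return response["StatusCode"]
```

A synchronous invocation that was accepted returns status code 200; that only confirms the function ran, not that the restore itself has finished.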
Restoring to a different cluster
Snapshots created from one cluster can be restored to a completely different cluster. This module transparently sets up a backup repository (backed by the S3 bucket containing the snapshots) on the new cluster, after which the standard restore process described above works unchanged.
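The module handles repository setup for you. Purely to illustrate what registering an S3-backed repository looks like at the Elasticsearch API level, here is a sketch that builds the corresponding request from the module's inputs. It is not necessarily how the module does it internally, and it assumes the target cluster has S3 repository support installed.

```python
import json
import urllib.request


def build_repository_request(protocol, elasticsearch_dns, port, repository, bucket):
    """Build a PUT /_snapshot/<repository> request for an S3-backed repository."""
    url = f"{protocol}://{elasticsearch_dns}:{port}/_snapshot/{repository}"
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned request to urllib.request.urlopen would send it; the arguments mirror the protocol, elasticsearch_dns, elasticsearch_port, repository, and bucket inputs listed in the reference below.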
Be mindful of the difference between the Elasticsearch version of the cluster that created the snapshots and the version of the cluster they are being restored to. The Elasticsearch documentation contains more information on the compatibility matrix and on how to upgrade snapshots created with older versions of Elasticsearch.
Restore Notification
The time it takes to restore a snapshot depends on the volume of data in that snapshot. However, because the restore module is implemented as a Lambda function, which has a maximum execution time of 5 minutes, a separate notification Lambda is kicked off. The notification Lambda checks the status of the restore operation and re-invokes itself until the operation is complete, continuously logging the status of the restore operation to CloudWatch.
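The check-and-repeat behaviour can be approximated from a workstation as well. The sketch below is not the notification Lambda's code (that function re-invokes itself); it swaps in a plain polling loop and assumes the status comes from the Elasticsearch index recovery API (GET /_recovery), where shards restored from a snapshot have type SNAPSHOT and reach stage DONE when finished.

```python
import time


def restore_complete(recovery):
    """Return True when every snapshot-restored shard reports stage DONE."""
    stages = [
        shard.get("stage", "").upper()
        for index in recovery.values()
        for shard in index.get("shards", [])
        if shard.get("type", "").upper() == "SNAPSHOT"
    ]
    # No snapshot shards yet is treated as "not complete", since the restore
    # may simply not have started reporting.
    return bool(stages) and all(stage == "DONE" for stage in stages)


def wait_for_restore(fetch_recovery, interval_seconds=30, max_checks=120,
                     sleep=time.sleep):
    """Poll until the restore completes; return the number of checks made."""
    for check in range(1, max_checks + 1):
        if restore_complete(fetch_recovery()):
            return check
        print(f"check {check}: restore still in progress")
        sleep(interval_seconds)
    raise TimeoutError("restore did not complete within the allotted checks")
```

Here fetch_recovery is any callable that returns the parsed JSON of a GET /_recovery request against the cluster; keeping it injectable makes the loop easy to exercise without a live cluster.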
Reference
Inputs
Required
- bucket (string): The S3 bucket that the specified repository will be associated with and where all snapshots will be stored.
- elasticsearch_dns (string): The DNS name of the load balancer in front of the Elasticsearch cluster.
- name (string): The name of the Lambda function. Used to namespace all resources created by this module.
- repository (string): The name of the repository that will be associated with the created snapshots.
Optional
- elasticsearch_port (number, default: 9200): The port on which the API requests will be made to the Elasticsearch cluster.
- lambda_runtime (string, default: "nodejs14.x"): The runtime to use for the Lambda function. Should be a Node.js runtime.
- protocol (string, default: "http"): Specifies the protocol to use when making the request to the Elasticsearch cluster. Possible values are HTTP or HTTPS.
- run_in_vpc (bool, default: false): Set to true to give your Lambda function access to resources within a VPC.
- subnet_ids (list(string), default: []): A list of subnet IDs the Lambda function should be able to access within your VPC. Only used if run_in_vpc is true.
- vpc_id (string, default: null): The ID of the VPC the Lambda function should be able to access. Only used if run_in_vpc is true.