BACKING UP TANZU KUBERNETES GRID (TKG) ON VSPHERE WORKLOADS TO AWS WITH VELERO

Written by Sam McGeown on 7/10/2020 · Read in about 5 min (1061 words)
Published under Cloud-Native and VMware

As more services go live on my Kubernetes clusters and more people start relying on them, I get nervous. For the most part, I try to keep my applications and configurations stateless - relying on ConfigMaps, for example, to store application configuration. This means that with a handful of YAML files in my Git repository I can restore everything to working order. Sometimes, though, there’s no choice but to use a PersistentVolume to provide some data persistence where you can’t capture it in a config file. This is where a backup of the cluster - and specifically of the PersistentVolume - is really important.

Enter Velero - the artist formerly known as Heptio Ark.

Velero is an open source tool to safely backup and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes.

Velero uses plugins to integrate with various cloud providers, allowing you to back up to different targets - my aim is to back up my vSphere-based (CSI) Persistent Volumes to AWS S3.

Set up AWS

You can set up all the required components using the AWS console, but my preference is to use the AWS CLI.

Create a new Access Key

To use the AWS CLI you’ll need an Access Key. Log on to your AWS console, go to “My Security Credentials” and create an Access Key (if you’ve not already got one).

Keep the details safe (I store mine in my password manager).

Install AWS CLI

I’m using homebrew to install the AWS CLI, and other packages, because I’m on a Mac - check out the official install docs for other OSes.

brew install awscli

Configure a new profile

Note: I’m using a named profile as I’ve got a few accounts - you can omit this if you are just setting up the one.

Let’s set up some variables first:

BUCKET=prod-cluster-backup # The name of your S3 bucket to create
REGION=us-west-1 # AWS Region in which to create the S3 bucket
PROFILE=my-profile # Only needed if you're creating a named profile for AWS CLI

Configure your AWS profile (omit --profile $PROFILE if you’re using the default profile)

aws configure --profile $PROFILE
> AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
> AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
> Default region name [None]: us-west-2
> Default output format [None]: ENTER

Create an S3 bucket for the backups:

aws s3api create-bucket \
    --bucket $BUCKET \
    --region $REGION \
    --create-bucket-configuration LocationConstraint=$REGION \
    --profile $PROFILE
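One caveat with the command above: S3 rejects a LocationConstraint of us-east-1, so in that region the --create-bucket-configuration flag has to be left out. Here is a minimal sketch of a helper that picks the right arguments. The function name bucket_location_args is my own invention for illustration, not part of the AWS CLI.

```shell
# Hypothetical helper: print the region arguments for `aws s3api create-bucket`.
# us-east-1 is the one region where LocationConstraint must be omitted.
bucket_location_args() {
    region="$1"
    if [ "$region" = "us-east-1" ]; then
        echo "--region $region"
    else
        echo "--region $region --create-bucket-configuration LocationConstraint=$region"
    fi
}

bucket_location_args us-west-1
bucket_location_args us-east-1
```

To use it, leave the command substitution unquoted so the shell splits it into separate arguments: aws s3api create-bucket --bucket $BUCKET $(bucket_location_args $REGION) --profile $PROFILE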

Create an IAM user

I’m creating a user with the same name as my bucket - since this user and bucket will be used to back up a single cluster, it makes sense for me to be able to identify and link the two by name.

aws iam create-user --user-name $BUCKET --profile $PROFILE

Create a JSON file with a policy definition of the permissions Velero needs - note that it’s scoped to the specific bucket using the $BUCKET variable:

cat > velero-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}"
            ]
        }
    ]
}
EOF
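Because the heredoc relies on the shell expanding ${BUCKET}, it is worth checking the generated file before attaching it: an unset variable or a quoted heredoc delimiter would silently produce the wrong ARNs. A rough sketch of such a check follows. The function name policy_is_scoped is hypothetical, and it only does a text match rather than a full JSON validation.

```shell
# Hypothetical helper: succeed only if the policy file is scoped to the given bucket.
policy_is_scoped() {
    file="$1"
    bucket="$2"
    # A literal ${BUCKET} means the heredoc was quoted and nothing was expanded.
    if grep -qF '${BUCKET}' "$file"; then
        return 1
    fi
    # Both the object ARN (bucket/*) and the bucket ARN itself must be present.
    grep -qF "\"arn:aws:s3:::${bucket}/*\"" "$file" &&
        grep -qF "\"arn:aws:s3:::${bucket}\"" "$file"
}
```

For example: policy_is_scoped velero-policy.json "$BUCKET" && echo "policy OK"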

Attach the policy to the user to allow it to access the S3 bucket.

aws iam put-user-policy \
    --user-name $BUCKET \
    --policy-name $BUCKET \
    --policy-document file://velero-policy.json \
    --profile $PROFILE

We can then create an Access Key for the newly created account, which will be used by Velero to upload the data.

aws iam create-access-key --user-name $BUCKET --profile $PROFILE

The response should include an AccessKeyId and SecretAccessKey - make a note of them:

{
    "AccessKey": {
        "UserName": "prod-cluster-backup",
        "AccessKeyId": "AKIA4Z..snip..N7QGT5",
        "Status": "Active",
        "SecretAccessKey": "gqjNLeZ..snip..hnGJjNKU",
        "CreateDate": "2020-10-07T15:14:08+00:00"
    }
}

Next we create a credentials file using the Access Key created above - this will be imported into the Velero deployment as a secret.

cat > credentials-prod-cluster-backup <<EOF
[default]
aws_access_key_id=<AccessKeyId>
aws_secret_access_key=<SecretAccessKey>
EOF
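The credentials file holds a long-lived secret in plain text, so it is sensible to make it readable only by your own user while it sits on disk. The sketch below writes the same three lines with tight permissions. The helper name write_credentials and its argument order are assumptions of mine, not anything Velero or the AWS CLI provides.

```shell
# Hypothetical helper: write an AWS-style credentials file readable only by its owner.
write_credentials() {
    file="$1"
    key_id="$2"
    secret="$3"
    # The subshell keeps the tightened umask from leaking into the calling shell.
    (
        umask 077
        printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
            "$key_id" "$secret" > "$file"
    )
    # umask only affects newly created files, so fix up a pre-existing one too.
    chmod 600 "$file"
}
```

It would be called as write_credentials credentials-prod-cluster-backup <AccessKeyId> <SecretAccessKey>, substituting the values returned by create-access-key.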

Install and Configure Velero

Once again I’m using homebrew to install Velero - other installation instructions are available in the Velero documentation.

brew install velero

Install Velero into your Kubernetes cluster using the CLI - your kubectl context should be pointed to the cluster you want to install on!

velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --bucket $BUCKET \
    --backup-location-config region=$REGION \
    --secret-file ./credentials-prod-cluster-backup \
    --use-volume-snapshots=false \
    --use-restic
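Before moving on, it can help to confirm that the install actually came up. The sketch below waits for the Velero deployment and the restic daemonset, then lists the backup storage location. It assumes the default velero namespace and the resource names that an install with --use-restic creates in Velero releases of this era (deployment/velero and daemonset/restic); newer releases may name things differently. The run and verify_velero helpers are hypothetical names of my own.

```shell
# Run a command, or just print it when DRY_RUN is set (handy for reviewing first).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: wait for the Velero components, then list the backup location.
# Assumes the default "velero" namespace and the names deployment/velero, daemonset/restic.
verify_velero() {
    run kubectl -n velero rollout status deployment/velero &&
        run kubectl -n velero rollout status daemonset/restic &&
        run velero backup-location get
}
```

Set DRY_RUN=1 and call verify_velero to print the three commands without touching the cluster, or call it with DRY_RUN unset to run them against your current kubectl context.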

At this point you could use velero backup create to start backing things up, but Velero won’t automatically back up your persistent volumes - you need to tell it what to back up using an annotation. Without annotating the pods, the backup will complete and look successful, but it won’t include your data!

Annotate deployments, stateful sets or pods

Let’s take my Vault deployment, for example. It consists of a stateful set of three pods, and each pod has a persistent volume called “data”. Prior to deployment I can add the backup.velero.io/backup-volumes: <volume name> annotation to the pod template metadata in my YAML configuration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/instance: vault
    app.kubernetes.io/name: vault
  name: vault
  namespace: vault
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/instance: vault
      app.kubernetes.io/name: vault
  serviceName: vault-internal
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: data
      labels:
        app.kubernetes.io/instance: vault
        app.kubernetes.io/name: vault

Alternatively, I could annotate the pods directly, but remember that these annotations might be overwritten at a later point by the StatefulSet:

kubectl annotate pod/vault-0 -n vault backup.velero.io/backup-volumes=data
kubectl annotate pod/vault-1 -n vault backup.velero.io/backup-volumes=data
kubectl annotate pod/vault-2 -n vault backup.velero.io/backup-volumes=data
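StatefulSet pods are named <name>-<ordinal>, counting up from 0, so the per-pod commands can be generated rather than typed out. The sketch below only prints the commands so they can be reviewed first. The annotate_commands helper and its argument order are hypothetical.

```shell
# Hypothetical helper: print one `kubectl annotate` command per StatefulSet pod.
# Arguments: StatefulSet name, replica count, namespace, volume name(s).
annotate_commands() {
    name="$1"
    replicas="$2"
    namespace="$3"
    volumes="$4"
    i=0
    while [ "$i" -lt "$replicas" ]; do
        echo "kubectl annotate pod/${name}-${i} -n ${namespace} backup.velero.io/backup-volumes=${volumes}"
        i=$((i + 1))
    done
}

annotate_commands vault 3 vault data
```

Once the output looks right, it can be piped to sh to apply it. For pods with more than one volume, the annotation value takes a comma-separated list of volume names.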

Now I can generate a backup of my vault namespace using:

velero backup create vault-backup-test-1 --include-namespaces=vault

You can create a backup of your entire cluster using velero backup create whole-cluster-backup, or you can create scheduled backups using a cron-like schedule:

velero create schedule whole-cluster-backup-daily --schedule="0 7 * * *"
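The schedule string uses the usual five-field cron layout (minute, hour, day of month, month, day of week), so "0 7 * * *" means 07:00 every day. As a small illustration, the sketch below splits an expression into its named fields. The describe_cron function is just a hypothetical helper for reading schedules, not something Velero ships.

```shell
# Hypothetical helper: label the five fields of a cron expression.
describe_cron() {
    # Disable globbing so the "*" fields are not expanded to filenames.
    set -f
    set -- $1
    set +f
    echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}

describe_cron "0 7 * * *"
# prints: minute=0 hour=7 day-of-month=* month=* day-of-week=*
```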

It’s also worth noting you can exclude specific namespaces as well as include, using --exclude-namespaces.

Once I’ve created a backup, I can view it using:

velero backup describe vault-backup-test-1
Name:         vault-backup-test-1
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.19.1+vmware.2
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=19

Phase:  Completed

Errors:    0
Warnings:  0

Namespaces:
  Included:  vault
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2020-10-09 16:25:13 +0100 BST
Completed:  2020-10-09 16:25:32 +0100 BST

Expiration:  2020-11-08 15:25:13 +0000 GMT

Total items to be backed up:  48
Items backed up:              48

Velero-Native Snapshots: <none included>

Restic Backups (specify --details for more information):
  Completed:  3
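The two lines I look at first in that output are Phase and Errors. A backup can also end in a phase such as PartiallyFailed or Failed, so for scripted checks it helps to test those fields explicitly. The sketch below reads the describe output on stdin. The backup_ok helper is hypothetical, and it assumes the output layout shown above, which may change between Velero versions.

```shell
# Hypothetical helper: read `velero backup describe` output on stdin and
# succeed only if the phase is Completed and the error count is zero.
backup_ok() {
    awk '
        BEGIN           { errors = -1 }      # -1 means no Errors line was seen
        $1 == "Phase:"  { phase = $2 }
        $1 == "Errors:" { errors = $2 }
        END             { exit !(phase == "Completed" && errors == 0) }
    '
}
```

For example: velero backup describe vault-backup-test-1 | backup_ok && echo "backup looks healthy"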