Backing up Tanzu Kubernetes Grid (TKG) on vSphere Workloads to AWS with Velero
This article is now 5 years old! It is highly likely that this information is out of date and the author will have completely forgotten about it. Please take care when following any guidance to ensure you have up-to-date recommendations.
As more services go live on my Kubernetes clusters and more people start relying on them, I get nervous. For the most part, I try to keep my applications and configurations stateless - relying on ConfigMaps, for example, to store application configuration. This means that with a handful of YAML files in my Git repository I can restore everything to working order. Sometimes, though, there's no choice but to use a PersistentVolume to provide data persistence that can't be captured in a config file. This is where a backup of the cluster - and specifically of the PersistentVolume - is really important.
Enter Velero - the artist formerly known as Heptio Ark.
Velero is an open source tool to safely backup and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes.
Velero uses plugins to integrate with various cloud providers, allowing you to back up to different targets - my aim is to back up my vSphere-based (CSI) Persistent Volumes to AWS S3.
Set up AWS 
You can set up all the required components using the AWS console, but my preference is to use the AWS CLI.
Create a new Access Key 
To use the AWS CLI you'll need an Access Key. Log in to your AWS console, go to "My Security Credentials" and create an Access Key (if you've not already got one).


Keep the details safe (I store mine in my password manager).
Install AWS CLI 
I’m using homebrew to install the AWS CLI, and other packages, because I’m on a Mac - check out the official install docs for other OSes.
Note: I’m using a named profile as I’ve got a few accounts - you can omit this if you’re only setting up one.
Let’s set up some variables first:
```shell
BUCKET=prod-cluster-backup # The name of the S3 bucket to create
REGION=us-west-1 # AWS region in which to create the S3 bucket
PROFILE=my-profile # Only needed if you're creating a named profile for AWS CLI
```
Configure your AWS profile (omit --profile $PROFILE if you’re using the default profile)
```shell
aws configure --profile $PROFILE
> AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
> AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
> Default region name [None]: us-west-2
> Default output format [None]: ENTER
```
Create an S3 bucket for the backups: 
```shell
aws s3api create-bucket \
    --bucket $BUCKET \
    --region $REGION \
    --create-bucket-configuration LocationConstraint=$REGION \
    --profile $PROFILE
```
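One caveat worth knowing if you pick a different region: the S3 API treats us-east-1 as the default location and rejects an explicit LocationConstraint for it, so the --create-bucket-configuration flag must be dropped there. A small sketch that handles both cases:

```shell
# us-east-1 is S3's default location and must NOT be passed as a
# LocationConstraint; every other region requires it.
if [ "$REGION" = "us-east-1" ]; then
  aws s3api create-bucket --bucket "$BUCKET" --region "$REGION" --profile "$PROFILE"
else
  aws s3api create-bucket --bucket "$BUCKET" --region "$REGION" \
    --create-bucket-configuration "LocationConstraint=$REGION" --profile "$PROFILE"
fi
```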
Create an IAM user 
I’m creating a user with the same name as my bucket - since this user and bucket will be used to back up a single cluster, it makes sense for me to be able to identify and link the two by name.
```shell
aws iam create-user --user-name $BUCKET --profile $PROFILE
```
Create a JSON file with a policy definition of the permissions Velero needs - note that it’s scoped to the specific bucket using the $BUCKET variable:
```shell
cat > velero-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}"
            ]
        }
    ]
}
EOF
```
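Since the heredoc interpolates $BUCKET, it's worth a quick local check that the substitution actually happened and the file parses before attaching it to the user. This is an optional sanity check of my own; it assumes python3 is available for the JSON validation:

```shell
# Confirm $BUCKET was substituted into the bucket ARNs...
grep -q "arn:aws:s3:::${BUCKET}" velero-policy.json && echo "bucket ARN substituted"
# ...and that the file parses as JSON at all
python3 -m json.tool velero-policy.json > /dev/null && echo "policy JSON is valid"
```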
Attach the policy to the user to allow it to access the S3 bucket.
```shell
aws iam put-user-policy \
    --user-name $BUCKET \
    --policy-name $BUCKET \
    --policy-document file://velero-policy.json \
    --profile $PROFILE
```
We can then create an Access Key for the newly created account, which will be used by Velero to upload the data.
```shell
aws iam create-access-key --user-name $BUCKET --profile $PROFILE
```
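If you'd rather not copy the keys out of the JSON response by hand, the AWS CLI's --query and --output options can drop them straight into shell variables. The variable names here are my own choice:

```shell
# Create the key and capture both halves in one shot; --query filters the
# JSON response with a JMESPath expression and --output text prints the
# two values tab-separated, which read splits apart.
read -r AWS_ID AWS_SECRET <<< "$(aws iam create-access-key \
    --user-name "$BUCKET" \
    --profile "$PROFILE" \
    --query '[AccessKey.AccessKeyId,AccessKey.SecretAccessKey]' \
    --output text)"
echo "Created access key ${AWS_ID}"
```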
The response should include an AccessKeyId and SecretAccessKey - make a note of them:
```json
{
    "AccessKey": {
        "UserName": "prod-cluster-backup",
        "AccessKeyId": "AKIA4Z..snip..N7QGT5",
        "Status": "Active",
        "SecretAccessKey": "gqjNLeZ..snip..hnGJjNKU",
        "CreateDate": "2020-10-07T15:14:08+00:00"
    }
}
```
Next we create a credentials file using the Access Key created above - this will be imported into the Velero deployment as a secret.
```shell
cat > credentials-prod-cluster-backup <<EOF
[default]
aws_access_key_id=<AccessKeyId>
aws_secret_access_key=<SecretAccessKey>
EOF
```
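Since this file holds a live secret, it's worth locking the permissions down and double-checking the INI shape before handing it to Velero - a small optional check of my own:

```shell
# The file holds a live secret -- make it readable only by you
chmod 600 credentials-prod-cluster-backup
# Velero expects a standard AWS credentials INI with a [default] section
grep -q '^\[default\]$' credentials-prod-cluster-backup && echo "credentials file looks well-formed"
```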
Install Velero 
Once again I’m using homebrew to install Velero - other installation instructions are available here.
Install Velero into your Kubernetes cluster using the CLI - your kubectl context should be pointing at the cluster you want to install on!
```shell
velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --bucket $BUCKET \
    --backup-location-config region=$REGION \
    --secret-file ./credentials-prod-cluster-backup \
    --use-volume-snapshots=false \
    --use-restic
```
At this point you could use velero backup create to start backing things up, but Velero won’t automatically back up your persistent volumes - you need to tell it what to back up using an annotation. Without annotating the pods, the backup will complete and look successful, but it won’t include your data!
Annotate deployments, stateful sets or pods 
Let’s take my Vault deployment, for example. It consists of a StatefulSet of three pods, and each pod has a persistent volume called “data”. Prior to deployment I can add the backup.velero.io/backup-volumes: <volume name> annotation to the template metadata in my YAML configuration:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/instance: vault
    app.kubernetes.io/name: vault
  name: vault
  namespace: vault
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/instance: vault
      app.kubernetes.io/name: vault
  serviceName: vault-internal
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: data
      labels:
        app.kubernetes.io/instance: vault
        app.kubernetes.io/name: vault
```
Alternatively I could annotate the pods directly, but remember that these annotations might be overwritten later by the StatefulSet controller:
```shell
kubectl annotate pod/vault-0 backup.velero.io/backup-volumes=data
kubectl annotate pod/vault-1 backup.velero.io/backup-volumes=data
kubectl annotate pod/vault-2 backup.velero.io/backup-volumes=data
```
Now I can generate a backup of my vault namespace using:
```shell
velero backup create vault-backup-test-1 --include-namespaces=vault
```
You can create a backup of your entire cluster using velero backup create whole-cluster-backup, or you can create scheduled backups using a cron-like schedule:
```shell
velero create schedule whole-cluster-backup-daily --schedule="0 7 * * *"
```
It’s also worth noting you can exclude specific namespaces as well as include, using --exclude-namespaces.
Once I’ve created a backup, I can view it using:
```
velero backup describe vault-backup-test-1
Name:         vault-backup-test-1
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.19.1+vmware.2
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=19
Phase:  Completed
Errors:    0
Warnings:  0
Namespaces:
  Included:  vault
  Excluded:  <none>
Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto
Label selector:  <none>
Storage Location:  default
Velero-Native Snapshot PVs:  auto
TTL:  720h0m0s
Hooks:  <none>
Backup Format Version:  1.1.0
Started:    2020-10-09 16:25:13 +0100 BST
Completed:  2020-10-09 16:25:32 +0100 BST
Expiration:  2020-11-08 15:25:13 +0000 GMT
Total items to be backed up:  48
Items backed up:              48
Velero-Native Snapshots: <none included>
Restic Backups (specify --details for more information):
  Completed:  3
```