Written by Sam McGeown on 21/9/2020 · Read in about 7 min (1317 words)
Published under Cloud-Native

If you’re anything like me, your home lab is constantly changing, evolving, breaking, rebuilding. For the last year or so I’ve been running all my home Kubernetes workloads on a Raspberry Pi cluster - and it’s been working really well!

I’ve been through several iterations: first running on SD cards (tl;dr - it’s bad, they wear out really fast with Kubernetes on board!), then PXE booting them from my Synology, and now booting directly from SSDs. I’ve also moved from Raspberry Pi 3s to 4s, and played around with stacking cluster cases before landing on the current rack-mount format.

Previous iterations of my setup (at least from a configuration and installation point of view) are hidden away in a private git repository, and one day I’d like to make that public, but for now it’s too messy and holds too many skeletons, er, secrets. So this post is my “current state” as of September 2020, with my configuration more or less intact.


I’m running three Raspberry Pi nodes mounted inside a 1U 19” rack case, which I modified to accommodate the Pis. I also added a large case fan, wired to USB for power, on top of the case to drive cooling air through it. It’s a little hacky, but it works well.

The PoE HATs provide all the power that’s needed for the Pis, with the added benefit of having a temperature-controlled mini-fan onboard. I am using a Cisco SG300-10 PoE switch to provide 802.3af PoE.

Raspberry Pi Rack internal configuration
Raspberry Pi Rack external configuration

Booting from USB SSD

Booting the RPi4s directly from a USB SSD is now supported by the latest EEPROM (2020-09-03 or newer), which is a lot easier than the previous beta versions. Simply update the bootloader and reboot:

sudo apt update
sudo apt full-upgrade
sudo rpi-eeprom-update -a -d
sudo reboot now
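
After the reboot it can be worth confirming that the bootloader really is new enough. The following is a minimal sketch, assuming that `vcgencmd bootloader_version` prints a line of the form `timestamp <unix time>` and that comparing it against midnight UTC on 2020-09-03 is a fair stand-in for "2020-09-03 or newer". The function name and the threshold variable are my own, not part of any Raspberry Pi tooling.

```shell
#!/bin/sh
# Assumed threshold: 2020-09-03 00:00:00 UTC as a Unix timestamp.
MIN_TIMESTAMP=1599091200

# Reads the output of `vcgencmd bootloader_version` on stdin and succeeds
# if its "timestamp" line is at or after MIN_TIMESTAMP.
bootloader_supports_usb_boot() (
  ts="$(awk '$1 == "timestamp" { print $2; exit }')"
  [ -n "$ts" ] && [ "$ts" -ge "$MIN_TIMESTAMP" ]
)

# Usage on the Pi (not run here):
#   vcgencmd bootloader_version | bootloader_supports_usb_boot \
#     && echo "USB boot supported" || echo "update the EEPROM first"
```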

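Once you’ve rebooted you can use sudo raspi-config to set the boot order, assuming you do not want a more complicated boot configuration. This sets BOOT_ORDER=0xf41, where f = loop, 4 = USB, 1 = SD Card. The digits are read from right to left, so the SD card is tried first, then USB, and f restarts the sequence.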
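
To sanity-check a BOOT_ORDER value, a small helper can spell out the sequence the bootloader will try, starting from the least-significant hex digit. This is only a sketch: the function name is hypothetical, and it names just the modes mentioned here plus network boot (2), printing anything else as "other".

```shell
#!/bin/sh
# Decode a BOOT_ORDER value into the order the boot modes are tried.
# The bootloader consumes the hex digits from right to left.
decode_boot_order() (
  value="${1#0x}"                      # strip a leading 0x if present
  out=""
  while [ -n "$value" ]; do
    digit="${value#"${value%?}"}"      # last remaining character
    value="${value%?}"                 # drop it for the next round
    case "$digit" in
      1)   name="SD card" ;;
      2)   name="network" ;;
      4)   name="USB mass storage" ;;
      f|F) name="restart" ;;
      *)   name="other (0x$digit)" ;;
    esac
    out="${out:+$out -> }$name"
  done
  printf '%s\n' "$out"
)

decode_boot_order 0xf41   # prints: SD card -> USB mass storage -> restart
```

With no SD card inserted, the first step fails quickly and the Pi falls through to the USB SSD.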

Set the USB boot option via raspi-config

You can then install your desired OS flavour on the USB SSD - I’m currently running Raspbian Buster, but I’ve heard good things about the Ubuntu 20.04 arm64 build - it’s on my to-do list.

Installing Kubernetes

At the time of writing Kubernetes 1.19.2 is the most recent version. I’ve found that installing on Raspbian requires a little more fettling to get working than a “normal” distro.

Installing Docker

Docker needs to be installed, cgroups enabled and then the docker daemon configured:

# Obligatory updates
sudo apt-get update && sudo apt-get upgrade -y
# Install prerequisites
sudo apt-get install apt-transport-https ca-certificates software-properties-common vim -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
# Add Pi user to Docker group
sudo usermod -aG docker pi

# Backup the current cmdline.txt
sudo cp /boot/cmdline.txt /boot/cmdline_cgroup.txt
# Enable cgroups (cmdline.txt must stay a single line)
orig="$(head -n1 /boot/cmdline.txt) cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory"
echo "$orig" | sudo tee /boot/cmdline.txt

# Edit Docker daemon
sudo bash -c 'cat << EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF'

sudo reboot
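
One thing to watch with the cgroup step is that it appends the flags every time it runs, so re-running the script duplicates them in cmdline.txt. Here is a minimal sketch of an idempotent alternative, assuming the same three flags; the function name is hypothetical.

```shell
#!/bin/sh
# Append each cgroup flag to a kernel command line only if it is missing.
# Takes the current command line as $1 and prints the updated one.
add_cgroup_flags() (
  line="$1"
  for flag in cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory; do
    case " $line " in
      *" $flag "*) ;;                  # already present, leave it alone
      *) line="$line $flag" ;;
    esac
  done
  printf '%s\n' "$line"
)

# Usage on the Pi (not run here). Read the file fully before writing it back:
#   new="$(add_cgroup_flags "$(head -n1 /boot/cmdline.txt)")"
#   echo "$new" | sudo tee /boot/cmdline.txt
```

Computing the new line into a variable first matters: piping straight into tee could truncate cmdline.txt before it has been read.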

I’ve configured legacy versions of iptables for compatibility since Raspbian Buster uses nftables by default. I’ve also disabled swap and then added the Kubernetes repository to apt.

# Ensure legacy iptables binaries are installed and used
sudo apt-get install -y iptables arptables ebtables
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo update-alternatives --set arptables /usr/sbin/arptables-legacy
sudo update-alternatives --set ebtables /usr/sbin/ebtables-legacy

# Disable swap
sudo dphys-swapfile swapoff
sudo dphys-swapfile uninstall
sudo update-rc.d dphys-swapfile remove
sudo systemctl disable dphys-swapfile.service

# If you're using flannel...
# sudo sysctl -w net.bridge.bridge-nf-call-iptables=1

# Add the Kubernetes repo
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update -q

# Install Kubernetes binaries (1.19.2-00) and hold any upgrades
# (the apt package version has no leading "v")
sudo apt-mark unhold kubelet kubeadm kubectl
sudo apt-get install -y kubelet=1.19.2-00 kubeadm=1.19.2-00 kubectl=1.19.2-00
sudo apt-mark hold kubelet kubeadm kubectl

# Pre-stage the kubernetes images
sudo kubeadm config images pull
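
Since the kubelet will, by default, refuse to start while swap is active, a quick check before running kubeadm can save some head-scratching. This is a minimal sketch that reads /proc/swaps, which has one header line followed by one line per active swap area; the function name is hypothetical and the optional file argument exists only to make it testable.

```shell
#!/bin/sh
# Succeeds when no swap areas are listed in /proc/swaps (or the given file).
swap_is_off() (
  file="${1:-/proc/swaps}"
  # Line 1 is the column header; any later non-empty line is an active swap area.
  active="$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$file")"
  [ "$active" -eq 0 ]
)

# Usage on the Pi (not run here):
#   swap_is_off && echo "swap is off" || echo "swap is still enabled"
```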

I always like to add the members of the cluster to /etc/hosts for name resolution (especially as I use CoreDNS running on this cluster for my home DNS!)

# Add to /etc/hosts
192.168.20.9    rpi-vip rpi-vip.lab.definit.co.uk
192.168.20.12   rpi4-01 rpi4-01.lab.definit.co.uk
192.168.20.13   rpi4-02 rpi4-02.lab.definit.co.uk
192.168.20.14   rpi4-03 rpi4-03.lab.definit.co.uk
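
If you script this step rather than editing the file by hand, it helps to add each entry only when its IP is not already listed, so the script can be re-run safely on every node. A minimal sketch, with a hypothetical function name, that takes the hosts file as an argument:

```shell
#!/bin/sh
# Append "ip<TAB>names" to a hosts file unless a line for that IP already exists.
ensure_host_entry() (
  file="$1"; ip="$2"; names="$3"
  # awk exits 0 only if some line has the IP as its first field.
  if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"; then
    printf '%s\t%s\n' "$ip" "$names" >> "$file"
  fi
)

# Usage on the Pi (not run here; writing /etc/hosts needs root):
#   sudo sh -c '. ./hosts-helper.sh; ensure_host_entry /etc/hosts 192.168.20.9 "rpi-vip rpi-vip.lab.definit.co.uk"'
```

The `hosts-helper.sh` file name in the usage comment is just a placeholder for wherever you save the function.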

Configuring kube-vip

I’m using kube-vip to provide control plane load balancing on my Kubernetes cluster. The latest version is much simpler, needing very little configuration to get a basic setup working. It’s also got the option to use BGP for VIP failover rather than ARP, which I love - though I’ve not tested it yet.

# Generate the kube-vip manifest
sudo docker run --network host --rm plndr/kube-vip:0.1.8 kubeadm init \
--interface eth0 \
--vip \
--arp \
--leaderElection | sudo tee /etc/kubernetes/manifests/vip.yaml

# Initialise the first node
sudo kubeadm init --control-plane-endpoint "" --upload-certs

# Join other control plane nodes to cluster - if the token has expired, use
kubeadm token create --print-join-command --certificate-key $(sudo kubeadm init phase upload-certs --upload-certs | tail -1)

# Generate the kube-vip manifest on additional nodes
sudo docker run --network host --rm plndr/kube-vip:0.1.8 kubeadm init \
--interface eth0 \
--vip \
--arp \
--leaderElection | sudo tee /etc/kubernetes/manifests/vip.yaml
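
To see which node currently holds the VIP, you can look for the address on the interface. This is a sketch under the assumption that, in ARP mode, kube-vip binds the VIP to the interface on the elected leader; the function name is hypothetical and it simply parses the one-line-per-address output of `ip -o -4 addr show`.

```shell
#!/bin/sh
# Reads `ip -o -4 addr show` output on stdin and succeeds if the given
# address is assigned, ignoring the prefix length.
holds_vip() (
  vip="$1"
  awk -v vip="$vip" '
    $3 == "inet" { split($4, a, "/"); if (a[1] == vip) found = 1 }
    END { exit !found }
  '
)

# Usage on each node (not run here):
#   ip -o -4 addr show dev eth0 | holds_vip 192.168.20.9 \
#     && echo "this node holds the VIP"
```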

Network Plugin - Weave

I’ve chosen Weave to provide the network plugin for my cluster based on my requirements, but generally I would recommend Flannel for a small cluster like this. There are loads of considerations when picking a CNI: whether it has an ARM-compatible image and whether it works with MetalLB were two key ones for me! I really recommend looking at the ITNEXT benchmark results as well; there’s some really good information there.

Be sure to run this command on the RPi - if you run it on your laptop it’s very likely you’ll get the wrong architecture downloaded!

# Install Weave with NO_MASQ_LOCAL for metallb
sudo kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.NO_MASQ_LOCAL=1"

Load Balancing with MetalLB

Last year I covered using MetalLB with Contour for ingress and load balancing, and since then I’ve moved my MetalLB configuration over to using BGP (layer 3) to advertise load balanced VIPs to my EdgeRouter (rather than the ARP-based L2 load balancing). The installation process is really easy: install the manifests, then create a secret for encryption between nodes.

# Install metallb for ingress/load balancing
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
# On first install only
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

Once MetalLB is installed it will sit idle until you provide it with a ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address:
      peer-asn: 65000
      my-asn: 65000
    address-pools:
    - name: default
      protocol: bgp
      avoid-buggy-ips: true
      addresses:
      -
      bgp-advertisements:
      - aggregation-length: 32


From a storage perspective, I either use NFS (for persistent storage) or the local SSD (for more transient storage). Instead of creating a whole load of mount points on my Synology, I’m making use of the subPath property of my volumeMounts. I’m aware there are issues with using NFS around the number of connections created, but this is mitigated by the size of my cluster and the small number of persistent apps that are running.

For example, below I have a single NFS PersistentVolume mounted at three different subPaths in the container… hmm, I should switch the transcode to local SSD…

apiVersion: v1
kind: PersistentVolume
metadata:
  name: plex-system-pvc
  namespace: plex
  labels:
    app.kubernetes.io/name: plex-media-server
    app.kubernetes.io/instance: definit-plex
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 
    path: /volume1/kubernetes
---
# ... excerpt from the pod template that mounts the volume:
          volumeMounts:
            - name: system
              mountPath: /config
              subPath: plex/config
            - name: system
              mountPath: /transcode
              subPath: plex/transcode
            - name: system
              mountPath: /shared
              subPath: plex/shared
      volumes:
        - name: system
          persistentVolumeClaim:
            claimName: "plex-system-pvc"
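
If you would rather have the subPath directories exist on the NFS export, with ownership you control, before the pod first starts, they can be created up front. A minimal sketch with a hypothetical function name; the base directory is wherever the export is reachable from the machine you run it on.

```shell
#!/bin/sh
# Create each sub-directory (and any missing parents) under a base directory.
# Usage: make_subpaths BASE SUBPATH...
make_subpaths() (
  base="$1"; shift
  for sub in "$@"; do
    mkdir -p "$base/$sub"
  done
)

# Usage on a host with the export mounted (not run here):
#   make_subpaths /volume1/kubernetes plex/config plex/transcode plex/shared
```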

So - that’s a summary, more or less, of my current configuration and a little bit of how I got here. I hope you’ve found it interesting, and I’d love to hear a bit more about your setup!
