Umberto D'Ovidio

Kubernetes on Hetzner with Talos: Part 1 - Bootstrapping Control Plane Nodes

Overview

This is the first part of a multi-part series on setting up a Kubernetes cluster on Hetzner Cloud using Talos OS. In this article, we’ll focus on bootstrapping 3 control plane nodes that can also run workloads.

My motivations for this project are threefold:

  1. Learning - I wanted to understand Kubernetes internals by building a cluster from scratch
  2. Self-hosting - Having a place to run apps I use daily (like Navidrome for music)
  3. Side projects - A cluster to deploy and experiment with my own projects

If you don’t have similar requirements, this series probably isn’t worth your time. But if you’re a hobbyist looking to learn and host your own services for cheap, read on. I don’t administer Kubernetes clusters for a living - this is a hobby project. If you’re looking for enterprise-grade guidance from a professional Kubernetes administrator, this might not be the right series for you. You might also want to read Mathias Pius’ excellent series, which I consulted frequently whenever I got stuck, as well as the official Talos Hetzner tutorial.

Why Talos + Hetzner?

Talos OS is a minimal, hardened Linux distribution designed specifically for Kubernetes. Combine it with Hetzner’s affordable cloud infrastructure and you get a secure, manageable cluster at a fraction of the cost of the major cloud providers. To keep costs down further, we’ll use a three-node setup where each node runs both the control plane and the actual workloads. The total cost will be less than 20 euros per month. The trade-off is slightly less isolation between control plane and workloads, but for many use cases this is perfectly acceptable.

Prerequisites

  • Hetzner Cloud account and API token with read/write permissions. Set the HCLOUD_TOKEN environment variable to the API token, and make sure Hetzner’s hcloud tool is installed.
  • talosctl installed.
  • kubectl to talk to the cluster once it is up.
  • Basic understanding of Kubernetes concepts. If you are an absolute beginner you can still follow along, but I suggest supplementing this tutorial with the excellent book Kubernetes Up & Running.
  • packer to create the initial image snapshot.
  • helm to install the Hetzner Cloud Controller Manager.
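
Before starting, it can help to confirm that everything in the list above is actually in place. The following is a minimal sketch, not part of the original setup: the function names missing_tools and check_prereqs are made up for this example, and kubectl is included because later steps use it.

```shell
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report missing tools and an unset HCLOUD_TOKEN.
# Returns non-zero if anything is missing.
check_prereqs() {
  local missing status=0
  missing="$(missing_tools hcloud talosctl kubectl packer helm)"
  if [ -n "$missing" ]; then
    printf 'missing tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
    status=1
  fi
  if [ -z "${HCLOUD_TOKEN:-}" ]; then
    echo 'HCLOUD_TOKEN is not set' >&2
    status=1
  fi
  return "$status"
}

# Usage: check_prereqs && echo "ready to go"
```

The check only looks at PATH and the environment. It does not verify that the token is valid or has write permissions.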

Step 1: Create Talos Image with Packer

First, we need a custom Talos image that Hetzner can use. We’ll use Packer to build and upload this image.

packer {
  required_plugins {
    hcloud = {
      source  = "github.com/hetznercloud/hcloud"
      version = "~> 1"
    }
  }
}

variable "talos_version" {
  type    = string
  default = "v1.11.3"
}

variable "arch" {
  type    = string
  default = "amd64"
}

variable "server_type" {
  type    = string
  default = "cx23"
}

variable "server_location" {
  type    = string
  default = "nbg1"
}

locals {
  image = "https://factory.talos.dev/image/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba/${var.talos_version}/hcloud-${var.arch}.raw.xz"
}

source "hcloud" "talos" {
  rescue       = "linux64"
  image        = "debian-11"
  location     = "${var.server_location}"
  server_type  = "${var.server_type}"
  ssh_username = "root"

  snapshot_name   = "talos system disk - ${var.arch} - ${var.talos_version}"
  snapshot_labels = {
    type    = "infra",
    os      = "talos",
    version = "${var.talos_version}",
    arch    = "${var.arch}",
  }
}

build {
  sources = ["source.hcloud.talos"]

  provisioner "shell" {
    inline = [
      "apt-get install -y wget",
      "wget -O /tmp/talos.raw.xz ${local.image}",
      "xz -d -c /tmp/talos.raw.xz | dd of=/dev/sda && sync",
    ]
  }
}

Build the image:

packer build hcloud.pkr.hcl

This command does several things:

  1. Spin up a Debian server in Hetzner
  2. Enable rescue mode and reboot the server
  3. Download the Talos image from the Talos factory and write it over the Debian disk
  4. Shut down the server and create a snapshot of its disk
  5. Store the snapshot in your Hetzner project
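
If the build complains that the hcloud plugin is missing, the plugins declared in required_plugins have to be installed first with packer init. The sketch below wraps init, validate and build in one function. The name build_talos_image is made up for this example, and it assumes the template is saved as hcloud.pkr.hcl as in the command above.

```shell
# Install the plugins the template requires, validate it, then build the snapshot.
# Stops at the first step that fails.
build_talos_image() {
  local template="${1:-hcloud.pkr.hcl}"
  packer init "$template" &&
    packer validate "$template" &&
    packer build "$template"
}

# Usage: build_talos_image hcloud.pkr.hcl
```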

The whole process took 7 minutes for me, so feel free to take a break while the automation does its magic. Once it is done, note the snapshot ID. We’ll use it in the following steps.

export SNAPSHOT_ID=<your snapshot id>
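
If you would rather not copy the ID by hand, it can probably be looked up through the labels the Packer template attaches to the snapshot. This is a sketch under a few assumptions: the function name find_talos_snapshot is made up, the labels os and version are the ones set in snapshot_labels above, and only one snapshot is expected to match (with several, the last one listed is used).

```shell
# Print the ID of the Talos snapshot for the given Talos version,
# selected by the labels that the Packer template sets.
find_talos_snapshot() {
  local version="$1"
  hcloud image list --type snapshot \
    --selector "os=talos,version=${version}" \
    -o noheader -o columns=id | tail -n 1 | tr -d ' '
}

# Usage: export SNAPSHOT_ID="$(find_talos_snapshot v1.11.3)"
```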

Step 2: Set Up Load Balancer

We are going to use a load balancer for the Kubernetes API endpoint, as well as for ingress. This is not strictly necessary at the beginning and you could save $5 per month by not using it, but I decided to use it since I will need it anyway for my personal projects to be highly available.

# Create load balancer
hcloud load-balancer create \
  --name talos-control-plane \
  --network-zone eu-central \
  --type lb11 \
  --label 'type=controlplane'

# Add service for Kubernetes API (the load balancer name is a positional argument)
hcloud load-balancer add-service talos-control-plane \
  --protocol tcp \
  --listen-port 6443 \
  --destination-port 6443

# Add targets using labels
hcloud load-balancer add-target talos-control-plane --label-selector 'type=controlplane'

We first create the load balancer. After that we expose port 6443 and use the same port as the destination. We add targets using a label selector, so that every new server we create with that label is automatically added as a target.

Now we can find the IP of our newly created load balancer with the following command:

hcloud load-balancer list

Let’s store it in an environment variable. We are going to need it in the following steps.

export LB_IP=<your load balancer ip>
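
The lookup can also be scripted. The sketch below assumes that name and ipv4 are valid column names for hcloud load-balancer list (they match the headers the command prints by default, but check your hcloud version). The function name lb_ipv4 is made up for this example.

```shell
# Print the public IPv4 address of the load balancer with the given name.
# Assumption: `name` and `ipv4` are accepted by `-o columns=` for this command.
lb_ipv4() {
  hcloud load-balancer list -o noheader -o columns=name,ipv4 |
    awk -v name="$1" '$1 == name { print $2 }'
}

# Usage: export LB_IP="$(lb_ipv4 talos-control-plane)"
```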

Step 3: Generate Talos Configuration

We are now ready to generate the Talos configuration, which we’ll use to connect to the cluster and to spin up new nodes.

# Generate cluster configuration
talosctl gen config talos-k8s-hcloud-tutorial https://$LB_IP:6443 \
  --with-examples=false \
  --with-docs=false

This creates several files, including controlplane.yaml, worker.yaml, and talosconfig.

Step 4: Patch Configuration for Hetzner Integration

We are going to apply a few changes to the configuration. To do so, we’ll use patches. Let’s create a patches folder where we can keep track of all patches. We’ll start by patching the cluster to enable externalCloudProvider. This is required to use the Hetzner Cloud Controller Manager.

# patches/external_cloud_providers.yaml
cluster:
  externalCloudProvider:
    enabled: true

Next, we’ll create a patch to allow workloads to be scheduled on control plane nodes. This will prevent nodes from being tainted, which would stop workloads from scheduling.

# patches/allow_controlplane_workloads.yaml
cluster:
  allowSchedulingOnControlPlanes: true

Apply both patches:

talosctl machineconfig patch controlplane.yaml --patch @patches/allow_controlplane_workloads.yaml --patch @patches/external_cloud_providers.yaml -o controlplane.yaml
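
Since the patched file is handed to the servers as user data, a typo only shows up after the nodes fail to come up. A quick sanity check before creating servers might look like the sketch below. The function name check_patched_config is made up, and the grep calls are a plain text search for the two patched keys, not a real YAML parse.

```shell
# Check that both patches landed in the file, then let talosctl validate it
# for cloud platforms.
check_patched_config() {
  local file="$1"
  grep -q 'allowSchedulingOnControlPlanes: true' "$file" || {
    echo "scheduling patch missing in $file" >&2
    return 1
  }
  grep -q 'externalCloudProvider:' "$file" || {
    echo "cloud provider patch missing in $file" >&2
    return 1
  }
  talosctl validate --config "$file" --mode cloud
}

# Usage: check_patched_config controlplane.yaml
```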

Step 5: Deploy Control Plane Nodes

Now we create our 3 control plane nodes using the patched configuration:

# Create three control plane nodes
for i in 1 2 3; do
  hcloud server create \
    --name talos-control-plane-$i \
    --image $SNAPSHOT_ID \
    --type cx23 \
    --location nbg1 \
    --label 'type=controlplane' \
    --user-data-from-file controlplane.yaml
done

For my cluster I’m using the smallest instances available as they are more than enough for my needs.

Step 6: Bootstrap the Cluster

Let’s find the first node IP using hcloud server list | grep talos-control-plane and export it as an environment variable:

export FIRST_NODE_IP=<your control-plane-node-ip>
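
The same lookup can be done without grep and copy-paste by filtering on the type=controlplane label we attached to the servers. This is only a sketch: the function names control_plane_ips and server_ipv4 are made up, and it assumes name and ipv4 are valid output columns for hcloud server list.

```shell
# Print "name ipv4" pairs for all servers carrying the control plane label.
control_plane_ips() {
  hcloud server list --selector type=controlplane \
    -o noheader -o columns=name,ipv4
}

# Print the IPv4 address of one named server from that list.
server_ipv4() {
  control_plane_ips | awk -v name="$1" '$1 == name { print $2 }'
}

# Usage: export FIRST_NODE_IP="$(server_ipv4 talos-control-plane-1)"
```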

Now let’s configure the Talos client to connect to this node by setting both the endpoint and node:

talosctl --talosconfig talosconfig config endpoint $FIRST_NODE_IP
talosctl --talosconfig talosconfig config node $FIRST_NODE_IP

We are ready to bootstrap etcd on our first control plane node:

talosctl --talosconfig talosconfig bootstrap

You should then be able to see all three nodes with the following command:

talosctl --talosconfig talosconfig get members

We can then retrieve the kubeconfig and use kubectl with it:

talosctl --talosconfig talosconfig kubeconfig .
export KUBECONFIG=./kubeconfig
kubectl get nodes -o wide

Check that everything is working:

# Check node status
kubectl get nodes -o wide
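
Right after bootstrapping, the nodes may still show up as NotReady for a little while. Instead of re-running the command by hand, a small retry loop can poll until all nodes are ready. The helper names retry and nodes_ready are made up for this sketch, and the attempt count and delay in the usage line are arbitrary.

```shell
# Run a command until it succeeds: at most $1 attempts, $2 seconds between tries.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Succeeds once every node reports the Ready condition.
nodes_ready() {
  kubectl wait --for=condition=Ready nodes --all --timeout=10s >/dev/null 2>&1
}

# Usage: retry 30 10 nodes_ready && kubectl get nodes -o wide
```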

We can get a glimpse of cluster health using the talosctl dashboard command:

talosctl --talosconfig talosconfig dashboard

Step 7: Hetzner Cloud Controller Manager

Setting up the Hetzner Cloud Controller Manager was quite seamless. I just followed its official quickstart.md.

Create a secret containing your Hetzner API token:

kubectl -n kube-system create secret generic hcloud --from-literal=token=<hcloud API token>
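
Since HCLOUD_TOKEN is already exported from the prerequisites, the secret can be created from it instead of pasting the token, which keeps the literal value out of your shell history. A minimal sketch, with create_hcloud_secret being a made-up name for this example:

```shell
# Create the `hcloud` secret in kube-system from $HCLOUD_TOKEN.
# Refuses to run if the variable is empty, to avoid storing an empty token.
create_hcloud_secret() {
  if [ -z "${HCLOUD_TOKEN:-}" ]; then
    echo 'HCLOUD_TOKEN is not set' >&2
    return 1
  fi
  kubectl -n kube-system create secret generic hcloud \
    --from-literal=token="$HCLOUD_TOKEN"
}

# Usage: create_hcloud_secret
```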

Add the helm repository:

helm repo add hcloud https://charts.hetzner.cloud
helm repo update hcloud

Install the chart:

helm install hccm hcloud/hcloud-cloud-controller-manager -n kube-system

You should be able to see hcloud-cloud-controller-manager in the deployments:

kubectl get deployments -n kube-system
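
To go one step further than listing deployments, you can wait for the rollout to finish and then look at each node's provider ID, which the controller manager is expected to fill in with a value referring to the Hetzner server. The function name verify_hccm and the 120 second timeout are choices made for this sketch.

```shell
# Wait for the controller manager rollout, then show each node's provider ID.
verify_hccm() {
  kubectl -n kube-system rollout status \
    deployment/hcloud-cloud-controller-manager --timeout=120s &&
    kubectl get nodes \
      -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID
}

# Usage: verify_hccm
```

If the PROVIDER_ID column stays empty, the controller manager logs are the first place to look.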

What’s Next?

In the next parts of this series, we’ll cover:

Summary

We’ve successfully bootstrapped a 3-node Kubernetes cluster using Talos on Hetzner Cloud. Our control plane nodes can also run workloads, giving us a cost-effective, highly available setup.