Node Autoscaling

Overview

The cluster-autoscaler with Hetzner Cloud provider dynamically provisions and decommissions worker nodes based on pod scheduling demand. Each client gets a dedicated node pool — their app-web pods only run on nodes in their pool, and no other workloads can land on those nodes.

How scale-up happens

When a pod is unschedulable (for example, an app-web pod stays Pending because no existing node matches its nodeSelector and tolerations), the autoscaler simulates scheduling against each pool's node template. If the pod fits a pool, the autoscaler calls the Hetzner API to create a VM in that pool; cloud-init installs RKE2 and joins the node with the pool's labels and taints, and the pending pod is scheduled onto it.

Scale-down follows the reverse path: when a node's utilisation has stayed below the 0.5 threshold for 10 minutes, the autoscaler cordons the node, drains its pods, then calls the Hetzner API to destroy the VM.

Architecture: Dedicated Pool Per Client

Each client's node pool is isolated via a label + taint pair:

| Client | Pool name  | Node label | Node taint        | Instance type |
|--------|------------|------------|-------------------|---------------|
| wecare | wecare-web | wecare=""  | wecare:NoSchedule | CCX13         |

The client's app-web Deployment targets the pool with:

yaml
nodeSelector:
  <client>: ""
tolerations:
  - key: <client>
    operator: Exists
    effect: NoSchedule

Why dedicated pools?

  • Performance isolation — one client's traffic spike never affects another
  • Independent scaling — each pool has its own minSize/maxSize
  • Cost attribution — easy to track per-client infra spend

Why not node-role.kubernetes.io/app?

Kubelet forbids nodes from self-assigning labels in the kubernetes.io namespace (security restriction since K8s 1.24). Only a specific allowlist is permitted (node.kubernetes.io/*, kubelet.kubernetes.io/*, topology.kubernetes.io/*, etc.). Custom labels like wecare="" have no such restriction.

Terraform-managed nodes can use node-role.kubernetes.io/app because it's applied post-join via kubectl label from the control plane, bypassing kubelet validation. Autoscaler nodes set labels at kubelet startup via --node-labels, which is subject to the restriction.
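Concretely, the cloud-init for an autoscaler node can set the custom label and taint in the RKE2 agent config, which RKE2 forwards to the kubelet at startup. A sketch (the file path is standard RKE2; the values mirror the wecare pool):

```yaml
# /etc/rancher/rke2/config.yaml (written by cloud-init)
# Custom label outside the kubernetes.io namespace — allowed at kubelet startup.
node-label:
  - "wecare="
# key[=value]:effect syntax; empty value matches the taint in the pool table.
node-taint:
  - "wecare=:NoSchedule"
```

Trying to put node-role.kubernetes.io/app in node-label here is exactly what the kubelet restriction rejects.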

Components

1. Autoscaler Helm values — app-constructs/cluster-autoscaler/values.yaml

Defines the autoscaling groups (one per client pool):

yaml
autoscalingGroups:
  - name: wecare-web      # Must match nodeConfigs key in the secret
    minSize: 0
    maxSize: 4
    instanceType: CCX13   # 2 dedicated AMD vCPU, 8 GB RAM
    region: hel1

2. Cluster config secret — untracked/secrets/build-autoscaler-config.sh

Builds the cluster-autoscaler-config Secret containing HCLOUD_CLUSTER_CONFIG: a JSON blob with per-pool cloud-init, labels, and taints.

bash
# Rebuild after any cloud-init or label/taint change:
source ../../scripts/bw-unlock.sh && ./build-autoscaler-config.sh

The JSON structure:

json
{
  "imagesForArch": {"amd64": "ubuntu-24.04"},
  "nodeConfigs": {
    "wecare-web": {
      "cloudInit": "<raw cloud-init YAML>",
      "labels": {"wecare": "", "topology.kubernetes.io/region": "hel1", ...},
      "taints": [{"key": "wecare", "value": "", "effect": "NoSchedule"}]
    }
  }
}

Encoding chain: data.config = base64(base64(JSON)) — K8s decodes the outer base64 when exposing the Secret as an env var, the Hetzner provider decodes the inner base64 before JSON-parsing.
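The chain can be verified with a minimal shell round-trip (a sketch, not the real build script; GNU base64's -w0 flag is assumed):

```shell
# Minimal JSON standing in for the real cluster config.
CONFIG_JSON='{"imagesForArch":{"amd64":"ubuntu-24.04"},"nodeConfigs":{}}'

# Inner base64: the layer the Hetzner provider decodes before JSON-parsing.
INNER=$(printf '%s' "$CONFIG_JSON" | base64 -w0)

# Outer base64: the layer stored in data.config; Kubernetes strips it when
# exposing the Secret as the HCLOUD_CLUSTER_CONFIG env var.
OUTER=$(printf '%s' "$INNER" | base64 -w0)

# Two decodes recover the original JSON.
printf '%s' "$OUTER" | base64 -d | base64 -d
```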

cloudInit must be raw YAML, not base64. The provider passes it directly to the Hetzner API UserData field.

3. Scheduling patch — per-client overlay

Example: app-constructs/ecommercen-clients/wecare/adveshop4/prod/app-web-scheduling-patch.yaml

yaml
spec:
  template:
    spec:
      nodeSelector:
        wecare: ""
      tolerations:
        - key: wecare
          operator: Exists
          effect: NoSchedule
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule

Adding a New Client Pool

  1. values.yaml — add an autoscaling group:

    yaml
    autoscalingGroups:
      - name: wecare-web
        ...
      - name: clientb-web
        minSize: 0
        maxSize: 2
        instanceType: CCX13
        region: hel1
  2. build-autoscaler-config.sh — add a nodeConfigs entry with the client's cloud-init (same template, different label/taint in the RKE2 config), labels, and taints. The nodeConfigs key must match the pool name from step 1.

  3. Client scheduling patch — set nodeSelector: {clientb: ""} and tolerate clientb:NoSchedule.

  4. Rebuild and seal the secret, commit, push.
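For step 2, the new nodeConfigs entry mirrors the existing wecare-web one. An illustrative sketch (clientb and its values are placeholders):

```json
"clientb-web": {
  "cloudInit": "<raw cloud-init YAML — same template, clientb label/taint in the RKE2 config>",
  "labels": {"clientb": "", "topology.kubernetes.io/region": "hel1"},
  "taints": [{"key": "clientb", "value": "", "effect": "NoSchedule"}]
}
```

The "clientb-web" key must match the autoscaling group name from step 1, or the autoscaler will provision nodes without the pool's labels and taints.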

Operational Notes

  • Cloud-init takes ~20 min on CCX13 (package update + RKE2 download + image pull). Set max-node-provision-time accordingly (currently 45m for safety, reduce to 25m once stable).
  • Scale-down is enabled with 10m unneeded time and 0.5 utilization threshold.
  • Hetzner API quirk: user_data and ssh_keys are write-only fields — they're set at server creation but not returned by GET /servers/{id}.
  • To debug a stuck node: hcloud server enable-rescue <name> + reboot, then SSH in. Filesystem is at /mnt.
  • Autoscaler runs on control plane nodes (nodeSelector + toleration).
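The timing and threshold values above map to standard cluster-autoscaler flags. A sketch of how they might appear in the Helm values (the extraArgs key follows the upstream cluster-autoscaler chart's convention, which is an assumption here):

```yaml
extraArgs:
  scale-down-unneeded-time: 10m
  scale-down-utilization-threshold: "0.5"
  max-node-provision-time: 45m   # reduce to 25m once cloud-init timing is stable
```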

Internal documentation — Advisable only