Scale the cluster

Most scaling happens without you. KEDA watches request rate and CPU and adjusts Deployment replica counts; the cluster-autoscaler watches Pending pods and provisions Hetzner VMs. This page is for the cases where the automation needs tuning, or where you need to manually provision capacity the autoscaler can't reach: a new client's dedicated pool, a new DB node, a one-off test environment.

The two autoscalers

KEDA — pod scaling

Each client's app-web Deployment is driven by a KEDA ScaledObject. For wecare it lives at manifests_v1/app-constructs/ecommercen-clients/wecare/adveshop4/prod/app-web-scaledobject.yaml — the pattern repeats per client.

Verify KEDA is healthy:

bash
kubectl get scaledobject -A
kubectl -n ecommercen-clients-wecare describe scaledobject app-web-scaler

Current state for a specific client:

bash
kubectl -n ecommercen-clients-wecare get hpa
# KEDA manages a keda-hpa-app-web-scaler object behind the scenes
kubectl -n ecommercen-clients-wecare get deploy app-web

Changing the limits

Edit the ScaledObject:

yaml
minReplicaCount: 4              # floor — always at least this many
maxReplicaCount: 16             # ceiling
triggers:
  - type: prometheus
    metadata:
      query: sum(rate(traefik_service_requests_total{service=~"ecommercen-clients-wecare.*"}[2m]))
      threshold: "15"           # scale up once req/s per pod exceeds this

Commit, push, ArgoCD syncs. KEDA will re-evaluate within one polling interval (15s).
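For intuition, the trigger math works out as follows. The prometheus scaler defaults to AverageValue semantics, so with a sum() query the desired replica count is roughly ceil(total req/s ÷ threshold), clamped to the min/max. A sketch with a hypothetical load figure:

```shell
# Hypothetical numbers -- only min/max/threshold come from the ScaledObject above.
total_rps=130    # current total req/s across all app-web pods (made up)
threshold=15
min=4
max=16

desired=$(( (total_rps + threshold - 1) / threshold ))   # ceiling division
if [ "$desired" -lt "$min" ]; then desired=$min; fi
if [ "$desired" -gt "$max" ]; then desired=$max; fi
echo "desired replicas: $desired"
```

So a sustained 130 req/s would settle at 9 replicas; anything above 240 req/s pins the Deployment at the ceiling of 16.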

  • 🟢 The Deployment's manifest intentionally has no replicas: field — KEDA owns the count. Don't reintroduce it (ArgoCD will fight KEDA).
  • 🟠 Don't kubectl scale deploy/app-web — see the kubernetes rule. KEDA reverts it.
  • 🔴 Don't set maxReplicaCount higher than the number of app pods a single autoscaler pool can fit. If the ceiling is unreachable, Pods will sit Pending indefinitely.
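A back-of-envelope check for that last rule. All numbers below are placeholders: read the real allocatable CPU from `kubectl describe node` and the real request from the Deployment spec.

```shell
# Sanity-check maxReplicaCount against pool capacity -- placeholder numbers.
pool_max_nodes=12    # maxSize of the pool in values.yaml
node_cpu_m=2000      # allocatable CPU per node, millicores (assumed)
pod_cpu_m=450        # app-web CPU request, millicores (assumed)

pods_per_node=$(( node_cpu_m / pod_cpu_m ))
pool_ceiling=$(( pool_max_nodes * pods_per_node ))
echo "pool fits at most ${pool_ceiling} app-web pods"
# maxReplicaCount in the ScaledObject must stay <= this number,
# minus whatever DaemonSet/system pods consume per node.
```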

cluster-autoscaler — node scaling

The autoscaler watches for Pending pods, looks at their nodeSelector/tolerations, matches them to one of the pools in values.yaml, and provisions a VM via the Hetzner API. Config:

  • manifests_v1/app-constructs/cluster-autoscaler/values.yaml — pool definitions (name, minSize, maxSize, instanceType, region).
  • cluster-autoscaler-config Secret — assembled by untracked/secrets/build-autoscaler-config.sh from a Bitwarden-stored RKE2 join token, per-pool labels/taints, and raw cloud-init YAML.
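For orientation, the payload the script assembles is, to the best of my understanding of the Hetzner provider, JSON of roughly this shape (pool name, label, and cloud-init content are placeholders):

```json
{
  "nodeConfigs": {
    "newclient-web": {
      "cloudInit": "#cloud-config\nruncmd:\n  - ['sh', '-c', 'curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE=agent sh -']\n",
      "labels": { "newclient": "" },
      "taints": [
        { "key": "newclient", "value": "", "effect": "NoSchedule" }
      ]
    }
  }
}
```

Note that cloudInit stays raw inside the JSON; the JSON as a whole is base64-encoded into the Secret value, which the Secret itself base64-encodes again, which is where the double encoding mentioned below comes from.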

To change a pool's minSize or maxSize:

yaml
autoscalingGroups:
  - name: wecare-web
    minSize: 0
    maxSize: 12       # ← change this
    instanceType: CCX13
    region: hel1

Commit + push. The autoscaler Deployment rolls with the new values automatically (Reloader watches the ConfigMap).

To add a new pool for a new client, the pattern is automated by the client-onboarder agent — see Add a client. The manual version:

  1. Add an entry to autoscalingGroups in values.yaml.
  2. Add a nodeConfigs.<pool> entry in build-autoscaler-config.sh (labels + taints + cloud-init).
  3. Rebuild the Secret: source scripts/bw-unlock.sh && cd untracked/secrets && ./build-autoscaler-config.sh.
  4. Commit both changes. Wait for ArgoCD to sync, then restart the autoscaler Pod so it picks up the new config.
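Steps 3 and 4 can be sketched as a dry run that prints each command instead of executing it. The pool name and the autoscaler's namespace/Deployment name are assumptions; adjust to the real ones.

```shell
# Dry-run sketch: `run` echoes the command rather than executing it.
run() { echo "+ $*"; }

POOL=newclient-web   # hypothetical pool name

run source scripts/bw-unlock.sh
run ./untracked/secrets/build-autoscaler-config.sh
run git commit -am "autoscaler: add ${POOL} pool"
# after ArgoCD has synced, bounce the autoscaler so it reads the new config
run kubectl -n kube-system rollout restart deploy/cluster-autoscaler
```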
  • 🟠 Provisioning a CCX13 takes ~20 min (cloud-init + RKE2 download + image pull). max-node-provision-time is currently 45m to absorb that; don't lower it below 25m.
  • 🟠 cloudInit inside the config Secret must be raw YAML, not base64. The Hetzner provider passes it to the UserData field directly.
  • 🔴 Kubelet rejects labels in the kubernetes.io namespace at startup (K8s 1.24+). Stick to custom labels like wecare="" for pool identity.

See Node autoscaling for the full mechanics (how the per-client taint/label pattern isolates workloads, why the config Secret is double-base64-encoded).

Manually provisioning a dedicated node pool via Terraform

When the autoscaler pattern doesn't fit — a new client's DB nodes (stateful, local-path storage, never auto-provisioned), or an urgent one-off to absorb load:

bash
cd terraform

# Edit main.tf, add an hcloud_server block + matching null_resource for labels/taints
$EDITOR main.tf

# Plan (saved to a file — never apply without -out)
./tf.sh plan -out=plans/plan-add-<description>-$(date +%Y%m%d).tfplan

# Apply the saved plan
./tf.sh apply plans/plan-add-<description>-$(date +%Y%m%d).tfplan

tf.sh unlocks Bitwarden and injects the Hetzner API token + RKE2 join token. The null_resource provisioner waits for the new node to appear in the cluster, then applies labels and taints via kubectl. Cloud-init installs RKE2 agent and joins automatically.

Before assigning a static IP, always check both the Terraform state and the live cluster — the autoscaler also consumes IPs from the same 10.1.0.0/24 subnet:

bash
./tf.sh state list
kubectl get nodes -o wide

  • 🟠 Delegate Terraform operations to the terraform-manager agent. It knows the IP-allocation pattern, the null_resource shape, the plan-file naming convention, and will refuse to apply without a saved plan.
  • 🟠 Drain a node before destroying it: kubectl drain <node> --ignore-daemonsets --delete-emptydir-data.
  • 🔴 Never run bare terraform; ./tf.sh is the only way to get Bitwarden-sourced secrets injected.
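The free-IP hunt can be scripted. A sketch over example data; in practice build the `used` list from the two commands above rather than hard-coding it:

```shell
# Find the first free host address in 10.1.0.0/24, given IPs already in use.
# The `used` list here is example data -- collect the real one from
# `./tf.sh state list` / `terraform show` and `kubectl get nodes -o wide`.
used="10.1.0.2 10.1.0.3 10.1.0.7"

for i in $(seq 2 254); do
  candidate="10.1.0.$i"
  case " $used " in
    *" $candidate "*) continue ;;   # already taken
  esac
  echo "first free IP: $candidate"
  break
done
```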

PVC resizing (the annoying one)

Longhorn supports online expansion; local-path (used on our DB nodes for raw NVMe speed) does not. To grow a local-path PVC you need to:

  1. Cordon + drain the node holding the current volume.
  2. Orphan-delete the PVC (kubectl delete pvc <name> --wait=false, then remove the finalizer).
  3. Re-apply the manifest with the new size.
  4. Roll the owning StatefulSet pod so it gets the new PV.
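The four steps, sketched as a dry run. Node, namespace, PVC, and StatefulSet names are hypothetical; drop the `run` wrapper only once you have a verified snapshot or dump.

```shell
# Dry-run sketch: `run` prints each command instead of executing it.
run() { echo "+ $*"; }

NODE=wecare-db-1                 # hypothetical node holding the PV
NS=ecommercen-clients-wecare
PVC=data-mariadb-0               # hypothetical PVC name

run kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data   # step 1
run kubectl -n "$NS" delete pvc "$PVC" --wait=false                    # step 2
run kubectl -n "$NS" patch pvc "$PVC" -p '{"metadata":{"finalizers":null}}'
run kubectl -n "$NS" apply -f pvc.yaml                                 # step 3: new size
run kubectl -n "$NS" rollout restart statefulset mariadb               # step 4
run kubectl uncordon "$NODE"
```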

This is a destructive path — the data on the old PV is gone unless you snapshotted or dumped first. Delegate to the db-manager agent for MariaDB PVCs. Don't do this ad-hoc. The project memory has a past incident write-up; start there before the first attempt.

For Longhorn-backed PVCs, just edit spec.resources.requests.storage in the PVC (Argo-owned if via Helm values) and commit. Online expansion happens on the next reconcile.
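For example, the only field that changes (sizes hypothetical):

```yaml
spec:
  resources:
    requests:
      storage: 40Gi   # e.g. was 20Gi; Longhorn expands the volume online
```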

Delegation

  • 🟢 Pod scaling — edit the ScaledObject; commit. If the triggers aren't working, delegate to k8s-manager for HPA/KEDA inspection.
  • 🟢 Node pool tuning (min/max on an existing pool) — edit values.yaml; commit.
  • 🟢 New node pool / new client pool — the client-onboarder agent handles the full flow.
  • 🟠 One-off node add/remove — delegate to terraform-manager.
  • 🟠 Stateful storage resize — delegate to db-manager first; they'll loop in terraform-manager if the node itself needs resizing.

Further reading

Internal documentation — Advisable only