# Scale the cluster
Most scaling happens without you. KEDA watches request rate and CPU and adjusts Deployment replica counts; the cluster-autoscaler watches Pending pods and provisions Hetzner VMs. This page is for the cases where the automatics need tuning, or where you need to manually provision capacity the autoscaler can't reach for (a new client's dedicated pool, a new DB node, a one-off test environment).
## The two autoscalers
### KEDA — pod scaling
Each client's `app-web` Deployment is driven by a KEDA ScaledObject. For wecare it lives at `manifests_v1/app-constructs/ecommercen-clients/wecare/adveshop4/prod/app-web-scaledobject.yaml` — the pattern repeats per client.
Verify KEDA is healthy:

```shell
kubectl get scaledobject -A
kubectl -n ecommercen-clients-wecare describe scaledobject app-web-scaler
```

Current state for a specific client:

```shell
kubectl -n ecommercen-clients-wecare get hpa
# KEDA manages a keda-hpa-app-web-scaler object behind the scenes
kubectl -n ecommercen-clients-wecare get deploy app-web
```

#### Changing the limits
Edit the ScaledObject:

```yaml
minReplicaCount: 4    # floor — always at least this many
maxReplicaCount: 16   # ceiling
triggers:
  - type: prometheus
    metadata:
      query: sum(rate(traefik_service_requests_total{service=~"ecommercen-clients-wecare.*"}[2m]))
      threshold: "15"  # scale up once req/s per pod exceeds this
```

Commit, push, ArgoCD syncs. KEDA re-evaluates within one polling interval (15 s).
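Under the hood the numbers combine like this. A back-of-the-envelope sketch, not project code — the traffic figure is made up:

```shell
# The HPA that KEDA creates computes
#   desiredReplicas = ceil(metricValue / threshold)
# clamped to [minReplicaCount, maxReplicaCount]. With the values above:
total_reqs=210   # hypothetical current sum(rate(...)) across the service, req/s
threshold=15     # from the trigger
min=4; max=16
desired=$(( (total_reqs + threshold - 1) / threshold ))  # integer ceil → 14
(( desired < min )) && desired=$min
(( desired > max )) && desired=$max
echo "$desired"
```

So a sustained 210 req/s holds the Deployment at 14 replicas; traffic above 240 req/s pins it at the ceiling of 16.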
- 🟢 The Deployment's manifest intentionally has no `replicas:` field — KEDA owns the count. Don't reintroduce it (ArgoCD will fight KEDA).
- 🟠 Don't `kubectl scale deploy/app-web` — see the kubernetes rule. KEDA reverts it.
- 🔴 Don't set `maxReplicaCount` higher than the number of app pods a single autoscaler pool can fit. If the ceiling is unreachable, Pods will sit `Pending` indefinitely.
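To sanity-check that last point, a rough capacity calculation. The vCPU and per-pod request figures here are illustrative assumptions, not measured values — substitute the real node size and the pod's actual CPU request:

```shell
# Hypothetical numbers: ~2000m schedulable CPU per node, app pods
# requesting 500m each, and the pool's maxSize from values.yaml.
node_cpu_m=2000
pod_request_m=500
pool_max_nodes=12
pods_per_node=$(( node_cpu_m / pod_request_m ))      # 4 pods fit per node
pool_ceiling=$(( pods_per_node * pool_max_nodes ))   # 48 pods across the pool
echo "$pool_ceiling"  # keep maxReplicaCount at or below this
```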
### cluster-autoscaler — node scaling
The autoscaler watches for `Pending` pods, looks at their `nodeSelector`/`tolerations`, matches them to one of the pools in `values.yaml`, and provisions a VM via the Hetzner API. Config lives in two places:

- `manifests_v1/app-constructs/cluster-autoscaler/values.yaml` — pool definitions (`name`, `minSize`, `maxSize`, `instanceType`, `region`).
- `cluster-autoscaler-config` Secret — assembled by `untracked/secrets/build-autoscaler-config.sh` from a Bitwarden-stored RKE2 join token, per-pool labels/taints, and raw cloud-init YAML.
To change a pool's `minSize` or `maxSize`:

```yaml
autoscalingGroups:
  - name: wecare-web
    minSize: 0
    maxSize: 12       # ← change this
    instanceType: CCX13
    region: hel1
```

Commit + push. The autoscaler Deployment rolls with the new values automatically (Reloader watches the ConfigMap).
To add a new pool for a new client, the pattern is automated by the client-onboarder agent — see Add a client. The manual version:
- Add an entry to `autoscalingGroups` in `values.yaml`.
- Add a `nodeConfigs.<pool>` entry in `build-autoscaler-config.sh` (labels + taints + cloud-init).
- Rebuild the Secret: `source scripts/bw-unlock.sh && cd untracked/secrets && ./build-autoscaler-config.sh`.
- Commit both changes. Wait for ArgoCD to sync, then restart the autoscaler Pod so it picks up the new config.
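For the `nodeConfigs` step, the per-pool entry has roughly this shape — a sketch only; the exact keys live in `build-autoscaler-config.sh`, and the pool and label names here are placeholders. Mirror an existing pool's entry rather than copying this verbatim:

```yaml
nodeConfigs:
  newclient-web:                 # hypothetical pool name
    labels:
      newclient: ""              # custom pool-identity label (not kubernetes.io/*)
    taints:
      - key: newclient
        effect: NoSchedule       # keeps other clients' pods off the pool
    cloudInit: |                 # raw YAML, not base64
      #cloud-config
      # ...RKE2 agent install + join...
```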
- 🟠 Provisioning a CCX13 takes ~20 min (cloud-init + RKE2 download + image pull). `max-node-provision-time` is currently 45m to absorb that; don't lower it below 25m.
- 🟠 `cloudInit` inside the config Secret must be raw YAML, not base64. The Hetzner provider passes it to the `UserData` field directly.
- 🔴 Kubelet rejects labels in the `kubernetes.io` namespace at startup (K8s 1.24+). Stick to custom labels like `wecare=""` for pool identity.
See Node autoscaling for the full mechanics (how the per-client taint/label pattern isolates workloads, why the config Secret is double-base64-encoded).
## Manually provisioning a dedicated node pool via Terraform
When the autoscaler pattern doesn't fit — a new client's DB nodes (stateful, local-path storage, never auto-provisioned), or an urgent one-off to absorb load:
```shell
cd terraform
# Edit main.tf: add an hcloud_server block + matching null_resource for labels/taints
$EDITOR main.tf
# Plan (saved to a file — never apply without -out)
./tf.sh plan -out=plans/plan-add-<description>-$(date +%Y%m%d).tfplan
# Apply the saved plan
./tf.sh apply plans/plan-add-<description>-$(date +%Y%m%d).tfplan
```

`tf.sh` unlocks Bitwarden and injects the Hetzner API token + RKE2 join token. The `null_resource` provisioner waits for the new node to appear in the cluster, then applies labels and taints via kubectl. Cloud-init installs the RKE2 agent and joins automatically.
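The plan-file name is plain string assembly; a sketch with a made-up description, just to show the convention:

```shell
# "add-newclient-db" is a placeholder description — use something that
# identifies the change, e.g. the client and node role.
desc="add-newclient-db"
plan="plans/plan-${desc}-$(date +%Y%m%d).tfplan"
echo "$plan"   # e.g. plans/plan-add-newclient-db-20250101.tfplan
```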
Before assigning a static IP, always check both Terraform state and the live cluster — the autoscaler also consumes IPs from the same 10.1.0.0/24 subnet:

```shell
./tf.sh state list
kubectl get nodes -o wide
```

- 🟠 Delegate Terraform operations to the terraform-manager agent. It knows the IP-allocation pattern, the `null_resource` shape, and the plan-file naming convention, and it will refuse to apply without a saved plan.
- 🟠 Drain a node before destroying it: `kubectl drain <node> --ignore-daemonsets --delete-emptydir-data`.
- 🔴 Never run bare `terraform` — `./tf.sh` is the only way to get Bitwarden-sourced secrets injected.
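Once you have both lists, picking a free address is mechanical. A self-contained sketch with hard-coded used host octets — in practice you'd derive `used` from the output of the two commands above:

```shell
# Hypothetical allocations gathered from tf.sh state + kubectl get nodes;
# .1 is assumed reserved for the gateway, so scanning starts at .2.
used="2 3 4 6 7"
free=""
for i in $(seq 2 254); do
  echo "$used" | tr ' ' '\n' | grep -qx "$i" && continue
  free="10.1.0.$i"
  break
done
echo "$free"   # first gap in the list → 10.1.0.5
```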
## PVC resizing (the annoying one)
Longhorn supports online expansion; local-path (used for our DB nodes for NVMe speed) does not. To grow a local-path PVC you need to:
- Cordon + drain the node holding the current volume.
- Orphan-delete the PVC (`kubectl delete pvc <name> --wait=false`, then remove the finalizer).
- Re-apply the manifest with the new size.
- Roll the owning StatefulSet pod so it gets the new PV.
This is a destructive path — the data on the old PV is gone unless you snapshotted or dumped first. Delegate to the db-manager agent for MariaDB PVCs. Don't do this ad-hoc. The project memory has a past incident write-up; start there before the first attempt.
For Longhorn-backed PVCs, just edit `spec.resources.requests.storage` in the PVC (Argo-owned if via Helm values) and commit. Online expansion happens on the next reconcile.
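Concretely, the edit is one field. The sizes here are illustrative:

```yaml
# PVC spec fragment — only the storage request changes. Shrinking is not
# supported, so the new value must be larger than the old one.
spec:
  resources:
    requests:
      storage: 20Gi   # was 10Gi; Longhorn expands the volume online
```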
## Delegation
- 🟢 Pod scaling — edit the ScaledObject; commit. If the triggers aren't working, delegate to k8s-manager for HPA/KEDA inspection.
- 🟢 Node pool tuning (min/max on an existing pool) — edit `values.yaml`; commit.
- 🟢 New node pool / new client pool — the client-onboarder agent handles the full flow.
- 🟠 One-off node add/remove — delegate to terraform-manager.
- 🟠 Stateful storage resize — delegate to db-manager first; they'll loop in terraform-manager if the node itself needs resizing.
## Further reading
- Node autoscaling — the concept-level explanation of pools, labels, taints
- Add a client — the canonical "new pool" workflow
- Rules & guardrails — why KEDA owns replica counts
- Reboot & patch — the drain-before-destroy pattern