Add a client

Onboarding a new tenant (its own namespace, MariaDB, Redis, MaxScale, dedicated nodes, autoscaling, dashboards, TLS) is roughly 25 new files and 6 touch-ups to shared infra. The client-onboarder Claude agent automates the generation step — your job is mostly filling in a config file and reviewing diffs.

The short version

  1. Copy the template: cp untracked/client-onboarding/config-template.yaml untracked/client-onboarding/<client>.yaml
  2. Fill in the YAML (identity, domains, DB sizing, node types — see table below).
  3. Ask Claude: "Onboard client <client> using untracked/client-onboarding/<client>.yaml". The client-onboarder agent does the rest.
  4. Review the diff, walk through the delegation checklist the agent produces, commit.

The workflow

The config YAML — what you fill in

The template is commented, but here's the operator-view map:

| Section | Fields | Notes |
| --- | --- | --- |
| client | name, display_name | name is used everywhere (directory, namespace suffix, node label/taint, Redis prefix, S3 prefix). Lowercase alphanumeric + hyphens only. |
| domain | production, staging | Used in Ingress TLS hosts and the app base URL. Cert-manager issues a Let's Encrypt cert per domain via the DNS01 challenge. |
| database | name, replicas, image, storage_size, innodb_buffer_pool_size, resources | Rule of thumb: innodb_buffer_pool_size ≈ 80% of the container memory limit. storage_class: local-path because DB nodes use NVMe local storage, not Longhorn. |
| maxscale | replicas, external_access, ip_allowlist | Turn external_access on only if an ERP connector needs to reach the DB from outside the cluster — it opens a TCP IngressRoute with an IP allowlist. |
| redis.cache / redis.session | cluster_size, sentinel_size, eviction_policy, resources | Two independent Redis replication sets: allkeys-lru for the application cache, volatile-lru for PHP sessions. |
| app | image_tag, php_cli_tag, php_fpm_tag, s3_prefix | If s3_prefix is blank the agent generates a random hash. This isolates uploaded files per client in the shared S3 bucket. |
| scaling | min_replicas, max_replicas, cpu_limit, request_rate_threshold, cpu_threshold | Becomes the KEDA ScaledObject in <client>/adveshop4/prod/. Defaults to 4–16 pods, 15 req/s per pod. |
| nodes.db / nodes.web | count, type, autoscaler_min, autoscaler_max | Hetzner server types: ccx33 for DB (dedicated vCPU + more RAM), ccx13 for web. Autoscaler limits apply to the web pool only. |
| features | erp_access, phpmyadmin, redisinsight, physicalbackup, grafana_dashboards, staging_env | Per-feature toggles — set false to skip generating that component. |
| cloudflared | maxscale_gui, redisinsight, phpmyadmin | Hostnames for the management tools. Leave blank to auto-generate <tool>-<client>-ecnv4-mgmt.ecommercen.com. |
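Putting the table together, a filled-in config might look like the sketch below. The exact nesting and key names here are inferred from the table, not copied from the template — the commented template file remains the authoritative reference, and every value is illustrative:

```yaml
# untracked/client-onboarding/acme.yaml — illustrative values only
client:
  name: acme                      # lowercase alphanumeric + hyphens; used everywhere
  display_name: "Acme GmbH"
domain:
  production: shop.acme.example
  staging: stg.shop.acme.example
database:
  replicas: 2
  storage_size: 50Gi
  innodb_buffer_pool_size: 3G     # ~80% of the container memory limit
maxscale:
  replicas: 2
  external_access: false          # true only for external ERP connectors
redis:
  cache:
    cluster_size: 2
    eviction_policy: allkeys-lru
  session:
    cluster_size: 2
    eviction_policy: volatile-lru
scaling:
  min_replicas: 4
  max_replicas: 16
  request_rate_threshold: 15      # req/s per pod
nodes:
  db:  { count: 2, type: ccx33 }
  web: { count: 2, type: ccx13, autoscaler_min: 2, autoscaler_max: 6 }
features:
  erp_access: false
  staging_env: true
```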

What the agent generates

New files (~25): the full manifests_v1/app-constructs/ecommercen-clients/<client>/ tree — infrastructure/prod/ (MariaDB, MaxScale, Redis cache + session, phpMyAdmin, RedisInsight, backups), adveshop4/base/ (the big app.yaml + kustomization), adveshop4/prod/ and adveshop4/stg/ (overlays with ConfigMap patches, Ingress, KEDA scheduling). Plus two Grafana dashboards and a Kyverno policy.
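For orientation, the scaling block becomes a KEDA ScaledObject in the prod overlay. A plausible shape of that generated manifest is sketched below — the resource names, namespace, and the Prometheus query are assumptions, not copied from the generator:

```yaml
# Sketch of <client>/adveshop4/prod/ ScaledObject — names and query assumed
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: adveshop4                          # assumed Deployment/ScaledObject name
  namespace: ecommercen-clients-acme       # assumed namespace pattern
spec:
  scaleTargetRef:
    name: adveshop4
  minReplicaCount: 4                       # scaling.min_replicas
  maxReplicaCount: 16                      # scaling.max_replicas
  triggers:
    - type: cpu                            # scaling.cpu_threshold
      metricType: Utilization
      metadata:
        value: "80"
    - type: prometheus                     # scaling.request_rate_threshold
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed address
        query: sum(rate(traefik_service_requests_total{service=~"acme.*"}[2m]))  # assumed query
        threshold: "15"                    # 15 req/s per pod
```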

Modified shared files (~6): Longhorn values.yaml (add tolerations for the new taints), cluster-autoscaler values.yaml (register the new web pool), Cloudflared configmap.yaml (add tunnel entries for the mgmt tools, before the catch-all 404), Kyverno setup kustomization, kube-prometheus-stack dashboards kustomization, Ansible inventory/hosts.yml.
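The ordering note on the Cloudflared configmap matters because cloudflared evaluates ingress rules top-down: the first matching hostname wins, and the final rule must be the hostname-less catch-all. A hedged sketch of the added entries (the in-cluster service names and ports are assumptions):

```yaml
# Excerpt of the cloudflared ingress rules — new entries go BEFORE the catch-all
ingress:
  - hostname: phpmyadmin-acme-ecnv4-mgmt.ecommercen.com
    service: http://phpmyadmin.ecommercen-clients-acme-infrastructure:80      # assumed Service/port
  - hostname: redisinsight-acme-ecnv4-mgmt.ecommercen.com
    service: http://redisinsight.ecommercen-clients-acme-infrastructure:5540  # assumed Service/port
  # ...existing client entries...
  - service: http_status:404   # catch-all — must stay last
```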

Skeleton secrets: untracked/secrets/<client>/ — empty-value stubs for app-secrets, app-db-secrets, app-keycloak-secrets, and (if ERP is enabled) erp-db-secrets. Also adds commented-out cluster_seal lines to seal.sh for you to uncomment after filling in real values.
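As a rough idea of what an empty-value stub could look like — assuming the stubs are plain Secret manifests, which the onboarder docs don't actually specify, and with entirely made-up key names:

```yaml
# untracked/secrets/acme/app-secrets.yaml — stub; fill in values before sealing
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: ecommercen-clients-acme   # assumed namespace pattern
stringData:
  APP_KEY: ""          # key names here are illustrative, not the real schema
  SMTP_PASSWORD: ""
```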

The client-onboarder agent pauses after generation for you to eyeball the diff before moving on — use git status + git diff.

Post-generation checklist

The agent produces a delegation report with four follow-up tasks:

  • 🟢 terraform-manager — adds the DB and web hcloud_server resources to terraform/main.tf (with matching null_resource provisioners for labels and taints). You then run ./tf.sh plan -out=plans/<name>.tfplan and ./tf.sh apply.
  • 🟢 secrets-manager (client secrets) — you fill in the plaintext values in untracked/secrets/<client>/, uncomment the seal.sh lines, and it runs the seal + copies everything into place (don't forget the cluster-wide shared secrets: regcred, redis-auth-secret, ecn-bucket-access).
  • 🟢 secrets-manager (autoscaler config) — registers the new web pool in build-autoscaler-config.sh and re-seals the autoscaler config.
  • 🟢 gitops-commit-pusher — stages everything atomically and commits with a [App: <client>] Add client … message. Push once, ArgoCD picks it up.

The ApplicationSet (appset-ecommercen-clients) auto-discovers the new config.json files — no manual app-<client>.yaml to add.
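The auto-discovery works because an ApplicationSet with a Git files generator templates one Application per matching file. A minimal sketch of how appset-ecommercen-clients plausibly does this — the repo URL, glob, and template fields are assumptions:

```yaml
# Sketch of a Git-files-generator ApplicationSet — repoURL and paths assumed
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: appset-ecommercen-clients
spec:
  generators:
    - git:
        repoURL: https://git.example.com/infra.git    # assumed
        revision: HEAD
        files:
          - path: manifests_v1/app-constructs/ecommercen-clients/*/config.json
  template:
    metadata:
      name: "{{path.basename}}"          # one Application per discovered client dir
    spec:
      project: default
      source:
        repoURL: https://git.example.com/infra.git    # assumed
        path: "{{path}}"
      destination:
        server: https://kubernetes.default.svc
```

Dropping a new config.json into the clients tree is therefore sufficient; no per-client Application manifest needs to be hand-written.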

Verification once it's pushed

```bash
# 1. ArgoCD generated the apps?
argocd app list | grep <client>

# 2. Pods in infrastructure namespace
kubectl -n ecommercen-clients-<client>-infrastructure get pods

# 3. Pods in app namespace (after the infrastructure pods are Ready)
kubectl -n ecommercen-clients-<client> get pods

# 4. External reachability (once DNS is wired)
curl -sI https://<production-domain> | head -5
```

  • 🟠 DNS for the client's domain is not in the repo. Create the CNAME in the Cloudflare dashboard pointing at the tunnel — see DNS & Cloudflare.
  • 🟠 Seed data (initial DB dump, S3 media) is outside the onboarder's scope. Coordinate with the dev team before the first external smoke test.
  • 🔴 Don't cut Cloudflare DNS over before the app actually reaches Healthy in ArgoCD — a dangling CNAME to a 502 is noisier than an NXDOMAIN.

Further reading

Internal documentation — Advisable only