Rules & guardrails

Three short files under .claude/rules/ encode the non-negotiable conventions of this repo. The Claude agents enforce them automatically, and the rest of the team should internalise them before making changes. Reading them start to finish takes about five minutes; what follows is an operator's-view summary of each.

kubernetes.md — cluster operations

The cluster runs GitOps. That means ArgoCD is the authority on what's deployed: every resource's source of truth is a YAML file in manifests_v1/. Editing a running resource directly with kubectl apply, kubectl edit, or kubectl delete sets you up for a surprise — ArgoCD's selfHeal: true will revert your change within a minute, possibly while you're mid-debug. The rule is simple: for anything ArgoCD manages, edit the manifest and commit. Out-of-band kubectl is fine only for read operations (get, describe, logs, top) and for node-level operations that aren't in git (drain, uncordon, cordon).
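As a rough illustration of the split described above, the sketch below classifies a kubectl verb as either safe to run out-of-band or something that belongs in a manifest commit. The function name kubectl_verb_allowed is hypothetical, and the verb lists are copied from the paragraph above rather than read from kubernetes.md.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo: mirrors the prose rule above.
kubectl_verb_allowed() {
  case "$1" in
    get|describe|logs|top) return 0 ;;   # read-only operations
    drain|cordon|uncordon) return 0 ;;   # node-level operations that are not in git
    *) return 1 ;;                       # anything else: edit the manifest and commit
  esac
}

# Small demonstration over a handful of verbs.
for verb in get logs drain apply edit delete scale; do
  if kubectl_verb_allowed "$verb"; then
    echo "$verb: fine out-of-band"
  else
    echo "$verb: edit the manifest in manifests_v1/ and commit"
  fi
done
```

A check like this could sit in front of kubectl in a wrapper script, but it only encodes the verb lists. It does not know whether a given resource is actually managed by ArgoCD.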

KEDA owns replica counts for autoscaled Deployments. Running kubectl scale deploy/app-web --replicas=20 appears to work, but KEDA re-evaluates within a few seconds and sets the count back to whatever its triggers decide. The right answer is to edit the ScaledObject's minReplicaCount / maxReplicaCount in git and commit.

Other guardrails: always pass an explicit --context to kubectl when working across multiple clusters (our default context name is ecnv4, but your workstation may differ). Never drain or destroy a worker node without first confirming its pods can reschedule elsewhere. Never reboot or destroy two control-plane masters at once: etcd needs two of its three members to keep quorum, so the cluster can tolerate losing only one at a time.
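Two of the guardrails above reduce to small checks, sketched here under assumptions. The function names has_explicit_context and etcd_quorum are hypothetical. The quorum formula (a majority of members, floor(n/2) + 1) is standard etcd behaviour, not something taken from kubernetes.md.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the kubectl arguments name a context explicitly.
has_explicit_context() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --context|--context=*) return 0 ;;
    esac
  done
  return 1
}

# Majority quorum for an etcd cluster of N members: floor(N / 2) + 1.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

members=3
quorum=$(etcd_quorum "$members")
echo "members=$members quorum=$quorum tolerated_failures=$(( members - quorum ))"

if has_explicit_context get nodes --context ecnv4; then
  echo "context given explicitly"
fi
```

With three masters the tolerated failure count works out to one, which is why taking down two at once is off the table.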

Examples that violate the rule: kubectl apply -f app-web.yaml to apply a "quick fix" on a pod, kubectl scale deploy/app-web --replicas=0 to "pause" an app (edit the ScaledObject or disable the app's ArgoCD Application instead), kubectl delete pod wecare-web-* to force a rolling restart (edit the manifest or bump an annotation that Reloader watches).

Delegate to:

k8s-manager — live kubectl operations
argocd-manager — ArgoCD sync / diff / get / list
network-expert — Cilium / Hubble / Traefik debugging

secrets.md — Sealed Secrets workflow

Plaintext never leaves your machine. Secrets enter the repo only in encrypted form via Sealed Secrets, and the flow is strict. First, write the plaintext YAML to untracked/secrets/<name>.yaml (the untracked/ tree is gitignored by extension). Second, declare the scope with the plaintext's metadata.annotations["sealedsecrets.bitnami.com/cluster-wide"] annotation. Third, seal it with ./untracked/secrets/seal.sh seal <path> (namespace-scoped) or ./seal.sh cluster-seal <path> (cluster-wide), or run ./seal-all.sh to bulk-seal everything you have locally. Sealing produces sealed-<name>.yaml, which you copy into the correct manifests_v1/app-constructs/<app>/ location and commit. The in-cluster sealed-secrets controller decrypts that into a real Secret, and Reloader restarts the consumers.
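The naming step of the flow above can be sketched as follows. sealed_name is a hypothetical helper and app-db-secrets.yaml is only an example file name. Where seal.sh writes its output is not stated here, so the copy step stays a commented placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the sealed file name from a plaintext path,
# following the sealed-<name>.yaml convention described above.
sealed_name() {
  printf 'sealed-%s\n' "$(basename "$1")"
}

plaintext="untracked/secrets/app-db-secrets.yaml"   # example path, never committed
echo "plaintext: $plaintext"
echo "sealed:    $(sealed_name "$plaintext")"

# The sealing itself uses the repo's own script (needs cluster access):
#   ./untracked/secrets/seal.sh seal "$plaintext"
# Then copy the sealed file into manifests_v1/app-constructs/<app>/ and commit it.
```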

A few cluster-wide secrets — regcred, redis-auth-secret, ecn-bucket-access — must be copied to multiple app-construct locations. Forgetting one shows up as an ImagePullBackOff in a single tenant or a Redis auth error after a rotation. The secrets-manager agent's mapping table is authoritative.

Rules that compose with this: never commit a plaintext Secret to manifests_v1/ (only sealed-*.yaml); never output raw secret values in chat, logs, or commit messages (describe keys with <REDACTED> values); kubeseal requires cluster access because it fetches the controller's public key; and the argocd-secret is the one exception — it's patched imperatively, not through sealed secrets, because ArgoCD manages its own admin credential reconciliation.
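One way to catch the first of these rules before a commit is a simple text check for a top-level kind: Secret line, sketched below. The function name is_plaintext_secret is hypothetical and the check is deliberately naive: it would miss a quoted kind value and it does not parse YAML.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file declares a plain Kubernetes Secret.
# A SealedSecret has "kind: SealedSecret", which this pattern does not match.
is_plaintext_secret() {
  grep -Eq '^kind:[[:space:]]*Secret[[:space:]]*$' "$1"
}

# Check every file passed on the command line; with no arguments this does nothing.
for f in "$@"; do
  if is_plaintext_secret "$f"; then
    echo "REFUSE: $f looks like a plaintext Secret" >&2
  fi
done
```

Run it against candidate files under manifests_v1/ before committing; the output names the file only and never prints its contents.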

Examples that violate the rule: committing app-db-secrets.yaml (the plaintext) instead of sealed-app-db-secrets.yaml; pasting a password into a Chat message to share it with a colleague (use Bitwarden); checking .env files into a commit.

Delegate to:

secrets-manager — end-to-end sealing and rotation
vault-manager — if Bitwarden itself is misbehaving

terraform.md — infrastructure changes

Always use the ./tf.sh wrapper, never bare terraform. tf.sh pulls the Hetzner Cloud API token and the RKE2 join token from Bitwarden and exports them as environment variables before forwarding to terraform. Running terraform plan or terraform apply directly will either fail (no credentials) or, worse, succeed against the wrong cloud account if you have Hetzner tokens floating around in your shell.

Plans go in terraform/plans/ with the naming convention <action>-<description>-<YYYYMMDD>.tfplan. The apply step always takes a saved plan file — ./tf.sh apply plans/<name>.tfplan — never an inline apply. This means every applied change was previewed and (at least by convention) reviewed. For destructive operations, always scope with -target so you can't accidentally drop the whole stack, and confirm with the user before running apply.
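The naming convention above can be sketched as a small helper. plan_name is a hypothetical function, and the commented commands assume that tf.sh forwards -out to terraform plan unchanged, which the description of the wrapper suggests but does not state outright.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build <action>-<description>-<YYYYMMDD>.tfplan.
# The optional third argument overrides today's date (useful for testing).
plan_name() {
  local action="$1" description="$2" stamp="${3:-$(date +%Y%m%d)}"
  printf '%s-%s-%s.tfplan\n' "$action" "$description" "$stamp"
}

name=$(plan_name add worker-pool)   # "add" and "worker-pool" are example values
echo "plan file: plans/$name"

# Assumed workflow, run from the terraform/ directory:
#   ./tf.sh plan -out="plans/$name"
#   ./tf.sh apply "plans/$name"
```

Generating the name once and reusing the variable keeps the plan and apply steps pointed at the same file.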

State is local (terraform/terraform.tfstate, gitignored). If someone else holds a state lock, investigate why — never force-unlock without understanding the cause. Before assigning a static IP to a new node, cross-check both terraform/main.tf and the live cluster (kubectl get nodes -o wide) — the cluster-autoscaler provisions nodes via DHCP in the same 10.1.0.0/24 subnet, and IP collisions are uniquely painful to debug.
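The cross-check above can be sketched as a membership test over the addresses already in use. ip_in_use is a hypothetical helper and 10.1.0.42 is an example address. The commented kubectl and grep lines show one way to gather the inputs, and neither is taken from terraform.md.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the candidate IP appears in the remaining arguments.
ip_in_use() {
  local candidate="$1"; shift
  local ip
  for ip in "$@"; do
    [ "$ip" = "$candidate" ] && return 0
  done
  return 1
}

candidate="10.1.0.42"   # example address only

# Gathering live addresses (requires cluster access):
#   live_ips=$(kubectl --context ecnv4 get nodes \
#     -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
#   ip_in_use "$candidate" $live_ips && echo "collision with a live node"
# Checking the Terraform side (fixed-string match so dots are literal):
#   grep -nF "$candidate" terraform/main.tf

if ip_in_use "$candidate" 10.1.0.10 10.1.0.11; then
  echo "$candidate is taken"
else
  echo "$candidate is free among the sampled addresses"
fi
```

A free result only reflects the moment of the check, since the cluster-autoscaler can bring up a DHCP-addressed node at any time.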

Examples that violate the rule: running terraform apply directly (no secrets, no plan file); running ./tf.sh apply without a plan file saved by an earlier plan -out; running terraform destroy without -target flags; editing lifecycle.ignore_changes on an existing resource without flagging it to the user.

Delegate to:

terraform-manager — any Terraform operation or main.tf edit
vault-manager — if Bitwarden unlock is failing
hcloud-operator — for "what's actually running" queries that don't need to change state

Why these rules, not others

These three capture the differences between this cluster and a textbook Kubernetes + cloud setup: GitOps as a one-way authority flow, secrets as encrypted-at-rest from the repo onward, and cloud infra as state-file-managed with injected credentials. The rest — RBAC, network policies, PodSecurityStandards — is enforced in-cluster by Kyverno and RBAC bindings, not by these rules. You can find those in manifests_v1/app-constructs/kyverno/ and manifests_v1/app-constructs/argocd/.
