GitOps & ArgoCD
The repo is the source of truth. Everything else — the live cluster, Grafana, Google Chat alerts, Cloudflare DNS — is a downstream consequence of what's committed. If you remember only that, you'll be fine.
The one-sentence definition
GitOps is an operating model where the desired state of your system is stored in a Git repository, and a controller runs continuously to make the real system match what the repo says.
- You don't press a "deploy" button.
- You don't run kubectl apply by hand.
- You write a change, commit it, push it. Within a minute the cluster has applied it.
This is very different from how Plesk worked. In Plesk, you made a change by clicking a button — state lived in Plesk's database and the running system was modified directly. There was no "ground truth" outside the running system itself.
ArgoCD is the controller
ArgoCD is the process that does the matching. It:
- Watches this repo (git@github.com:Advisable-com/ecnv4_manifests) for changes.
- Compares what the repo says should exist vs. what's actually in the cluster.
- Applies the difference: creates missing resources, updates changed ones, deletes removed ones.
- Flags any drift (when the cluster has been changed outside the repo) and reverts it if selfHeal: true is enabled (which it is for us).
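
As a rough illustration of the behaviour listed above, here is a minimal sketch of an Application with automated sync, pruning, and self-healing turned on. Only the repo URL comes from this page; the application name, path, namespaces, and tracked revision are placeholders, and our real manifests may be laid out differently.

```yaml
# Hypothetical Application; name, path, and namespaces are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app              # hypothetical
  namespace: argocd              # assumes ArgoCD runs in the "argocd" namespace
spec:
  project: default
  source:
    repoURL: git@github.com:Advisable-com/ecnv4_manifests
    targetRevision: HEAD         # assumption: track the default branch
    path: path/to/manifests      # hypothetical directory inside the repo
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD itself runs in
    namespace: example           # hypothetical
  syncPolicy:
    automated:
      prune: true                # delete resources that were removed from the repo
      selfHeal: true             # revert changes made directly in the cluster
```

With both flags set, the repo wins in either direction: deleting a file removes the resource, and a manual in-cluster edit is overwritten on the next reconcile.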
Two cluster-admin-visible UIs:
- argocd.ecnv4-mgmt.ecommercen.com (web, sign in with Google)
- argocd app list (command line)
The argo.sh wrapper (local tooling note)
Throughout this manual you'll see commands like argocd app list, argocd app sync <name>, argocd app diff <name>. Those are the standard ArgoCD CLI — they work, but they assume you've logged in and that Cloudflare Access is happy with your session.
For day-to-day work we have a tiny shell wrapper at untracked/scripts/argo.sh that removes both of those friction points. It:
- Starts a kubectl port-forward to the argocd-server Service so traffic never touches the Cloudflare Zero Trust edge; CF Access can't block what it can't see.
- Reads the in-cluster admin secret and authenticates for you: no password prompt, no argocd login dance.
- Cleans up the port-forward on exit.
How to use it: it's a drop-in replacement. Any argocd <...> command in this manual can be invoked as ./untracked/scripts/argo.sh <...> and it "just works". To target a different kubectl context: KUBECTL_CONTEXT=<ctx> ./untracked/scripts/argo.sh app list.
When to use raw argocd instead: if you've already run argocd login argocd.ecnv4-mgmt.ecommercen.com --sso in your shell and your browser session has a valid CF Access cookie, raw argocd works too. For one-off commands the wrapper is simply less hassle.
The app-of-apps pattern
If we had to create and track 25+ individual Application resources by hand, you'd be constantly clicking around. Instead we use one entry point that fans out to all of them.
To enable/disable a component, you move its app-*.yaml file between apps-enabled/ and apps-disabled/. That's it — ArgoCD notices and applies (or prunes, with automated.prune: true).
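
To make the fan-out concrete, here is a minimal sketch of what a root "app of apps" Application could look like. It assumes the root points at the apps-enabled/ directory and picks up every app-*.yaml file there. The root's name, the exact repo-relative path, and the ArgoCD namespace are assumptions, not copied from our repo.

```yaml
# Hypothetical root Application that syncs every child Application manifest.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-apps                # hypothetical
  namespace: argocd              # assumes ArgoCD runs in the "argocd" namespace
spec:
  project: default
  source:
    repoURL: git@github.com:Advisable-com/ecnv4_manifests
    targetRevision: HEAD         # assumption: track the default branch
    path: apps-enabled           # assumption: the real path may have a prefix
    directory:
      include: 'app-*.yaml'      # only pick up child Application manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd            # child Application objects live next to ArgoCD
  syncPolicy:
    automated:
      prune: true                # moving a file out of apps-enabled/ prunes the child app
      selfHeal: true
```

Because the children are ordinary files in a directory the root watches, moving a file out of that directory looks to ArgoCD like a deleted resource, which is why pruning handles the "disable" case.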
ApplicationSets — one template, many apps
When we onboard a new client, we don't want to create 5 Application manifests by hand. appset-ecommercen-clients.yaml does the generating:
- Scans manifests_v1/app-constructs/ecommercen-clients/**/config.json for files.
- For each file found, generates a full Application resource using the config.json's fields.
- Adds/removes apps as config.json files appear/disappear.
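
The steps above match ArgoCD's Git "files" generator. The sketch below shows roughly how such an ApplicationSet could be written. It is not a copy of appset-ecommercen-clients.yaml. The config.json fields name and namespace are hypothetical, and the real file may generate several Applications per client rather than one.

```yaml
# Hypothetical ApplicationSet using the Git files generator.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: ecommercen-clients       # hypothetical
  namespace: argocd              # assumes ArgoCD runs in the "argocd" namespace
spec:
  generators:
    - git:
        repoURL: git@github.com:Advisable-com/ecnv4_manifests
        revision: HEAD           # assumption: track the default branch
        files:
          - path: "manifests_v1/app-constructs/ecommercen-clients/**/config.json"
  template:
    metadata:
      name: 'client-{{name}}'    # assumes config.json contains a "name" field
    spec:
      project: default
      source:
        repoURL: git@github.com:Advisable-com/ecnv4_manifests
        targetRevision: HEAD
        path: '{{path}}'         # directory that contains the matched config.json
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{namespace}}'   # assumes config.json contains a "namespace" field
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

Each matched config.json yields one set of template parameters, so adding a client directory creates an Application and deleting it removes one.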
This is why onboarding a client is "copy a directory + commit" rather than "write 15 YAML files by hand".
Key concepts you'll see
| Word | What it means |
|---|---|
| Sync | Apply the state in the repo to the cluster (manual or automatic). |
| Health | The applied resources' own health (pods Ready, Deployment replica count matches, etc). |
| OutOfSync | Live cluster state differs from the repo (e.g. someone ran kubectl edit, or a new commit hasn't been synced yet). |
| Degraded | Resources are applied but unhealthy (crashloop, missing secret, etc). |
| Progressing | Rollout in flight — transient, usually resolves in under a minute. |
| selfHeal: true | ArgoCD reverts out-of-sync cluster changes automatically. |
| automated.prune: true | Resources removed from the repo get deleted from the cluster. |
| Sync wave | Ordering hint; lower numbers sync first (useful for "install CRD before anything that uses it"). |
| ignoreDifferences | "Ignore these specific fields on this resource" — used for things that are patched in-cluster by other controllers. |
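
As a small illustration of the sync wave row, the sketch below orders two resources with the argocd.argoproj.io/sync-wave annotation. The resource names are hypothetical; the same annotation works on a CRD that must exist before the resources that use it.

```yaml
# Wave -1 syncs before the default wave (0).
apiVersion: v1
kind: Namespace
metadata:
  name: example                  # hypothetical
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # the value must be a quoted string
---
# Wave 1 syncs only after the earlier waves have been applied and are healthy.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config           # hypothetical
  namespace: example
  annotations:
    argocd.argoproj.io/sync-wave: "1"
data:
  greeting: hello
```

Resources without the annotation sit in wave 0, so negative numbers are the usual way to push prerequisites to the front.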
What you should (and shouldn't) do
- 🟢 Do: edit YAML in the repo, commit, push, let ArgoCD sync.
- 🟢 Do: click "Sync" in the UI if you're impatient and don't want to wait 60s for the auto-poll.
- 🟠 Careful with: kubectl edit on anything managed by ArgoCD. Your change will be reverted within a minute by selfHeal.
- 🔴 Don't: kubectl apply a file that contradicts what the repo says. You'll confuse yourself and any teammate trying to reconcile later.
When to break the rules
There are legitimate times to make an in-cluster change that isn't in the repo:
- ignoreDifferences resources: argocd-secret, app-kube-prometheus-stack-grafana (admin password), a handful of controller-initialised fields. These are documented case-by-case in our repo with a comment.
- Debugging in the moment: kubectl scale deploy foo --replicas=0 to stop a bleeding fire. But you must update the repo immediately after, or selfHeal reverts you.
- Secret rotation via Bitwarden (see Secrets & Bitwarden): plaintext never lands in the repo.
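
For the ignoreDifferences case, the sketch below shows how an Application might exclude the Grafana admin password from drift detection. The Secret name comes from the list above. The Application name, path, destination namespace, and the admin-password key are assumptions, so check the real manifest and its comment before relying on any of them.

```yaml
# Hypothetical Application that ignores one field of one Secret.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-kube-prometheus-stack        # assumption, inferred from the Secret name
  namespace: argocd                      # assumes ArgoCD runs in the "argocd" namespace
spec:
  project: default
  source:
    repoURL: git@github.com:Advisable-com/ecnv4_manifests
    targetRevision: HEAD                 # assumption: track the default branch
    path: path/to/kube-prometheus-stack  # hypothetical
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring                # hypothetical
  ignoreDifferences:
    - group: ""                          # core API group
      kind: Secret
      name: app-kube-prometheus-stack-grafana
      jsonPointers:
        - /data/admin-password           # assumption: key that holds the admin password
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - RespectIgnoreDifferences=true    # also leave the ignored field alone during sync
```

Without the RespectIgnoreDifferences option, the field is hidden from the diff but can still be overwritten when a sync is triggered for another reason.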
Further reading
- Official ArgoCD docs
- Our app status runbook — what to do when ArgoCD says something's OutOfSync.
- Next: Secrets & Bitwarden