Ingress, TLS & Cloudflare

Traffic into the cluster takes three very different paths depending on who's asking:

  • Path A — Public HTTP(S) sites like www.wecare.gr → Cloudflare proxy → Hetzner Load Balancer → Traefik HTTP entry point → app.
  • Path B — Management URLs like grafana-ecnv4-mgmt.ecommercen.com → Cloudflare Tunnel (cloudflared) → in-cluster service. Gated by Cloudflare Access SSO.
  • Path C — External ERP / database clients speaking raw MySQL protocol → Hetzner Load Balancer port 3306 → Traefik mysql TCP entry point → MaxScale → MariaDB. Does not pass through Cloudflare at all.

Paths A and B start at Cloudflare; Path C bypasses Cloudflare entirely because it's a raw TCP protocol, not HTTP. Public customers never touch the tunnel; operators never touch the Hetzner LB; ERP clients never touch Traefik's HTTP routers.

Path A — public traffic (customer browsing www.wecare.gr)

Public DNS records (www.wecare.gr, api.wecare.gr, and similar) are orange-cloud proxied through Cloudflare — they resolve to Cloudflare's anycast IPs, Cloudflare terminates TLS for the customer, and Cloudflare forwards to the origin (our Hetzner Cloud Load Balancer's public IP). The LB is an HCCM-managed Service of type LoadBalancer that fronts the Traefik pods; it distributes traffic across all cluster nodes and Traefik handles the in-cluster routing from there.
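For orientation, the Service behind that load balancer might look roughly like the sketch below. It is not copied from the repo: a Service like this is normally rendered by the Traefik Helm chart, and the Service name, annotation values, selector labels, and named target ports are all assumptions. Only the LB name ecnv4-lb, the traefik namespace, and the web / websecure / mysql entry point names appear elsewhere on this page.

```yaml
# Hypothetical sketch of the HCCM-managed Service that fronts the Traefik pods.
apiVersion: v1
kind: Service
metadata:
  name: traefik                       # assumption
  namespace: traefik
  annotations:
    # Annotations read by the Hetzner Cloud Controller Manager (HCCM).
    load-balancer.hetzner.cloud/name: ecnv4-lb
    load-balancer.hetzner.cloud/location: fsn1              # assumption
    load-balancer.hetzner.cloud/use-private-ip: "true"      # assumption
    load-balancer.hetzner.cloud/uses-proxyprotocol: "true"  # assumption
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: traefik   # assumption: label set by the chart
  ports:
    - name: web          # Path A, plain HTTP
      port: 80
      targetPort: web
    - name: websecure    # Path A, HTTPS from Cloudflare to origin
      port: 443
      targetPort: websecure
    - name: mysql        # Path C, raw MySQL over TCP
      port: 3306
      targetPort: mysql
```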

Path B — management URLs (operator visiting grafana)

Management hostnames live under *ecnv4-mgmt.ecommercen.com. Their DNS records are CNAMEs to the Cloudflare Tunnel address, not to a public IP. Cloudflare Access checks for a valid @advisable.com SSO session (or a Service Auth token for LLM clients), and the tunnel delivers the request to the cloudflared pod running inside the cluster. cloudflared looks at its ingress rules and sends the request directly to the target in-cluster Service — Grafana, ArgoCD server, Longhorn UI, Hubble UI, Keycloak admin, this docs site, etc.

No inbound ports are opened on the Hetzner side for any of this; the tunnel is initiated outbound from the cluster.

Management panels at a glance

Everything below is behind Cloudflare Access (@advisable.com SSO). Bookmark these — they're the day-to-day operator toolkit.

Cluster-wide (not tenant-scoped):

Panel | URL | What it's for
--- | --- | ---
ArgoCD | argocd-ecnv4-mgmt.ecommercen.com | GitOps sync state, app diffs, manual sync
Grafana | grafana-ecnv4-mgmt.ecommercen.com | Metrics + log dashboards
Traefik | traefik-ecnv4-mgmt.ecommercen.com | Live router state, middlewares, entry points
Longhorn | longhorn-ecnv4-mgmt.ecommercen.com | RWX volume state, replica placement, backups
Hubble | hubble-ecnv4-mgmt.ecommercen.com | Live network flow observability (Cilium)

Per-tenant (one set per client — wecare shown):

Panel | URL pattern | Wecare example
--- | --- | ---
MaxScale GUI | <tool>-<tenant>-ecnv4-mgmt.ecommercen.com | maxscale-wecare-ecnv4-mgmt.ecommercen.com
RedisInsight | <tool>-<tenant>-ecnv4-mgmt.ecommercen.com | redisinsight-wecare-ecnv4-mgmt.ecommercen.com
phpMyAdmin | <tenant>-<tool>-ecnv4-mgmt.ecommercen.com | wecare-phpmyadmin-ecnv4-mgmt.ecommercen.com

Naming inconsistency

Most per-tenant panels follow <tool>-<tenant>-... (maxscale-wecare, redisinsight-wecare) but phpMyAdmin uses <tenant>-<tool>-... (wecare-phpmyadmin). This is historical — when typing from memory, check both shapes if you get a 404.

All URLs above are defined in manifests_v1/app-constructs/cloudflared/configmap.yaml (the ingress rules) plus DNS CNAMEs in the Cloudflare dashboard and Access policies on the Cloudflare Access application.

Path C — external ERP clients (raw MySQL over TCP)

Wecare's ERP needs a direct MySQL connection to the database — not HTTP, not through a browser. For that we expose port 3306 on the Hetzner LB and route it through Traefik's TCP stack straight to MaxScale, completely bypassing Cloudflare.

Key points:

  • No Cloudflare in the path. Cloudflare's HTTP proxy can't meaningfully proxy the MySQL binary protocol. The LB is hit directly on port 3306.
  • Dedicated Traefik entry point. Traefik is declared with an extra mysql entry point (in manifests_v1/app-constructs/traefik/values.yaml) alongside web (HTTP) and websecure (HTTPS). TCP entry points don't look at Host: headers because TCP has none. Routing is by HostSNI (the TLS SNI value) or, as here, HostSNI(`*`) to catch all connections, which is the only option for a non-TLS MySQL connection because it carries no SNI.
  • IngressRouteTCP + MiddlewareTCP. The route lives at manifests_v1/app-constructs/ecommercen-clients/wecare/infrastructure/prod/maxscale-tcp-route.yaml. The middleware enforces a source-IP allowlist (currently opened to 0.0.0.0/0 as a temporary measure with a TODO to restore the per-ERP IPs).
  • MaxScale, not MariaDB. The route terminates at MaxScale, which load-balances reads and writes across the MariaDB replicas. MaxScale handles its own authentication using the MariaDB user accounts.
  • Proxy Protocol is enabled on the Traefik entry point (trustedIPs: 10.0.0.0/8 — the Hetzner private network), so MaxScale / MariaDB see the real client IP for audit logging.
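The key points above can be sketched as a pair of Traefik CRDs. This is an illustration, not the contents of maxscale-tcp-route.yaml: the route name, the MaxScale Service name, and its listener port are assumptions. The middleware name and namespace are taken from the troubleshooting section of this page.

```yaml
# Hypothetical sketch: catch-all TCP route from the mysql entry point to MaxScale.
apiVersion: traefik.io/v1alpha1      # older Traefik v2 releases use traefik.containo.us/v1alpha1
kind: MiddlewareTCP
metadata:
  name: maxscale-ipallowlist
  namespace: ecommercen-clients-wecare
spec:
  ipAllowList:                       # named ipWhiteList in Traefik v2
    sourceRange:
      - 0.0.0.0/0                    # temporary open state described above
      # - 203.0.113.10/32            # example shape of a per-ERP entry (documentation IP)
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: maxscale-tcp                 # assumption
  namespace: ecommercen-clients-wecare
spec:
  entryPoints:
    - mysql                          # the dedicated entry point on port 3306
  routes:
    - match: HostSNI(`*`)            # plain TCP has no SNI, so match everything
      middlewares:
        - name: maxscale-ipallowlist
      services:
        - name: maxscale             # assumption: MaxScale Service name
          port: 3306                 # assumption: MaxScale listener port
```

With the allowlist at 0.0.0.0/0 the middleware is effectively a no-op, so MaxScale's own authentication is the only gate until the per-ERP ranges are restored.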

When to use Path C

Only external data integrations that absolutely need raw SQL. Prefer Path A (HTTPS API) if any kind of application-level gateway is feasible — it's safer, observable through Traefik's normal metrics, and doesn't require opening a new firewall port per tenant.

Cloudflare — two distinct roles

  1. Proxy for public traffic. Orange-cloud DNS for *.wecare.gr / *.ecommercen.com customer-facing hostnames. Cloudflare runs the CDN, WAF, DDoS mitigation, and terminates public TLS. Origin is the Hetzner LB's public IP.
  2. Tunnel + Access for management traffic. *ecnv4-mgmt.ecommercen.com hostnames CNAME into the tunnel. Cloudflare Access enforces SSO on the URL before any request reaches cloudflared.

Tunnel configuration (which hostnames route to which in-cluster services) lives in manifests_v1/app-constructs/cloudflared/ — declared in the repo, synced like any other app. DNS records themselves live in the Cloudflare dashboard, not in the repo.

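Traefik — the in-cluster router (Hetzner LB paths only)

Traefik is our Ingress controller. It sits behind the Hetzner LB, so it handles public HTTP(S) traffic (Path A) and the MySQL TCP route (Path C), but not tunnel traffic (Path B). When a public HTTP(S) request arrives at a Traefik pod (via the Hetzner LB), Traefik: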

  1. Looks at the Host: header (e.g. www.wecare.gr).
  2. Matches it against Ingress or IngressRoute resources.
  3. Routes the request to the matching Service → backing pod.

Three resource kinds you'll see in the repo:

  • Ingress — the standard Kubernetes resource. We use it for simple hostname-based HTTP routing.
  • IngressRoute — Traefik's HTTP CRD. More expressive: Host(`...`) predicates, middleware, path rewriting, redirects.
  • IngressRouteTCP — Traefik's TCP CRD. Used for Path C (MaxScale on port 3306). Matches via HostSNI(`...`), not Host(`...`), because TCP has no HTTP headers to match on.
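As a point of comparison with the plain Ingress, a minimal IngressRoute might look like the sketch below. The resource name, the redirect middleware, the backing Service name and port, and the TLS secret name are hypothetical and are not taken from the repo.

```yaml
# Hypothetical sketch of an HTTP IngressRoute for a public hostname.
apiVersion: traefik.io/v1alpha1      # older Traefik v2 releases use traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: wecare-www                   # assumption
  namespace: ecommercen-clients-wecare
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`www.wecare.gr`) && PathPrefix(`/`)
      middlewares:
        - name: strip-legacy-prefix  # assumption: a Middleware in the same namespace
      services:
        - name: adveshop4            # assumption: Service name
          port: 80                   # assumption: Service port
  tls:
    secretName: www-wecare-gr-tls    # assumption: Secret written by cert-manager
```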

Each client's ingress rules live in their own manifests, e.g. manifests_v1/app-constructs/ecommercen-clients/wecare/adveshop4/prod/ingress*.yaml for HTTP and .../infrastructure/prod/maxscale-tcp-route.yaml for the MaxScale TCP route.

Management URLs (Path B) don't normally go through Traefik — cloudflared talks directly to the target Service.

cert-manager — internal TLS

  • Cloudflare handles the public TLS cert (browser ↔ Cloudflare) for Paths A and B. Path C never touches Cloudflare.
  • Inside the cluster, traffic is also encrypted end-to-end. cert-manager issues Let's Encrypt certificates for internal services.
  • We use the DNS01 challenge: cert-manager uses our Cloudflare API token to create a TXT record in the Cloudflare zone, and Let's Encrypt verifies that record to confirm domain ownership.
  • Renewal is automatic and starts about 30 days before expiry. cert-manager logs any failures; we have the ExternalCertExpiringSoon alert as a safety net.
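The DNS01 setup described above usually comes down to one issuer plus one Certificate per hostname set. The sketch below shows the general shape only: the issuer name, contact email, Secret names, and the explicit renewBefore value are assumptions, not values from the repo.

```yaml
# Hypothetical sketch: Let's Encrypt issuer using the Cloudflare DNS01 solver.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01                  # assumption
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                 # assumption
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key  # assumption
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:             # Secret holding the Cloudflare API token
              name: cloudflare-api-token   # assumption
              key: api-token               # assumption
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: www-wecare-gr                      # assumption
  namespace: ecommercen-clients-wecare
spec:
  secretName: www-wecare-gr-tls            # assumption: where the key pair is stored
  dnsNames:
    - www.wecare.gr
  renewBefore: 720h                        # 30 days, stated explicitly for illustration
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01
```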

You almost never interact with cert-manager directly. If you see a Certificate CR in a manifest, cert-manager is handling it.

Adding a new public hostname for an existing client

Scenario: wecare wants to add shop.wecare.gr alongside the existing domains.

  1. Cloudflare dashboard: add a DNS record for shop.wecare.gr. For a public site, use a proxied (orange-cloud) CNAME that points at an existing proxied hostname for the same origin — or an A record at the Hetzner LB's public IP. Either way, the orange cloud must be on.
  2. Repo: add shop.wecare.gr to the relevant Ingress / IngressRoute resource (copy an existing host: entry).
  3. TLS: no extra change is needed. cert-manager issues the internal cert automatically once the Ingress is applied.
  4. Commit + push.
  5. ArgoCD syncs the Ingress. Traefik picks up the new route within seconds.
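The repo change in step 2 can be sketched as follows, assuming the hostname is added to a standard Ingress. The resource name, issuer annotation value, TLS secret name, and backend Service are hypothetical. In practice, copy the values from the existing host entry in the wecare manifests.

```yaml
# Hypothetical sketch: an existing Ingress with shop.wecare.gr added alongside www.wecare.gr.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: adveshop4                      # assumption
  namespace: ecommercen-clients-wecare
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-dns01   # assumption: issuer name
spec:
  ingressClassName: traefik            # assumption: class name
  tls:
    - secretName: wecare-gr-tls        # assumption
      hosts:
        - www.wecare.gr
        - shop.wecare.gr               # new: cert-manager reissues with this SAN
  rules:
    - host: www.wecare.gr
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: adveshop4        # assumption
                port:
                  number: 80           # assumption
    - host: shop.wecare.gr             # new rule, same backend as the existing host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: adveshop4
                port:
                  number: 80
```

The new hostname has to appear in both the tls.hosts list and the rules list. Adding it only to rules would route the traffic but leave the certificate without the new name.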

Adding a new management (ecnv4-mgmt) hostname

Different path: Cloudflare Tunnel (cloudflared), not the Cloudflare proxy.

  1. Add an ingress rule for the new hostname in manifests_v1/app-constructs/cloudflared/configmap.yaml, pointing at the in-cluster Service URL.
  2. Add a CNAME in Cloudflare pointing at the tunnel address (<uuid>.cfargotunnel.com).
  3. Add the hostname to the Cloudflare Access application policy so SSO protects it.
  4. Commit + push. ArgoCD reloads cloudflared.
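Step 1 might look like the sketch below. The ConfigMap name and namespace match the kubectl commands in the troubleshooting section. The data key, tunnel reference, credentials path, new hostname, and Service URLs are assumptions for illustration.

```yaml
# Hypothetical sketch of the cloudflared ConfigMap with one new management hostname.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudflared-config
  namespace: cloudflared
data:
  config.yaml: |                       # assumption: key name
    tunnel: <uuid>                     # left elided, as on this page
    credentials-file: /etc/cloudflared/creds/credentials.json   # assumption
    ingress:
      # Existing rule (Service URL is an assumption).
      - hostname: grafana-ecnv4-mgmt.ecommercen.com
        service: http://grafana.monitoring.svc.cluster.local:80
      # New rule: hypothetical hostname and in-cluster Service URL.
      - hostname: newtool-ecnv4-mgmt.ecommercen.com
        service: http://newtool.newtool.svc.cluster.local:8080
      # Catch-all; cloudflared requires the last rule to have no hostname.
      - service: http_status:404
```

cloudflared evaluates ingress rules from top to bottom, so the new entry has to sit above the catch-all rule or it will never match.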

What to look at when a hostname doesn't work

Ask first: is this a public hostname or a management one? The debugging paths differ.

Public hostname (e.g. www.wecare.gr)

  • Step 1: external probe — is it up from outside? See the Wecare External Probes Grafana dashboard.
  • Step 2: DNS — dig www.wecare.gr should resolve to Cloudflare anycast IPs. If it doesn't, the DNS record is wrong in Cloudflare.
  • Step 3: Hetzner LB — is the load balancer healthy? (Delegate to the hcloud-operator agent or check the Hetzner console.)
  • Step 4: Traefik — kubectl get ingressroute -A (the -A flag lists all namespaces, so no -n is needed) and kubectl -n traefik logs deploy/traefik to confirm the route exists and requests are arriving.
  • Step 5: backend — check app status on the target Service's pods.

Management hostname (e.g. grafana-ecnv4-mgmt.ecommercen.com)

  • Step 1: can you even reach the Cloudflare login page? If no, DNS or tunnel is broken.
  • Step 2: cloudflared tunnel health — kubectl -n cloudflared logs deploy/cloudflared --tail=50.
  • Step 3: cloudflared ingress rules — kubectl -n cloudflared get configmap cloudflared-config -o yaml should list your hostname.
  • Step 4: target Service and pod — standard app status checks.

ERP can't connect to MariaDB on port 3306

  • Step 1: is the client IP in the allowlist? Check MiddlewareTCP/maxscale-ipallowlist in .../infrastructure/prod/maxscale-tcp-route.yaml — if the allowlist is strict, the client's source IP must be listed.
  • Step 2: is the LB listening on 3306? hcloud load-balancer describe ecnv4-lb should show the mysql service.
  • Step 3: is Traefik receiving traffic on the mysql entry point? kubectl -n traefik logs deploy/traefik | grep -i mysql and check the Traefik dashboard.
  • Step 4: is MaxScale healthy? kubectl -n ecommercen-clients-wecare get pods -l app.kubernetes.io/name=maxscale — delegate to the db-manager agent for MaxScale/MariaDB internals.
  • Step 5: MySQL auth — the ERP user must exist in MariaDB and MaxScale must be able to see it. erp-db-user is defined in .../infrastructure/prod/erp-persistence.yaml.

Internal documentation — Advisable only