# Databases
In Plesk, databases were a checkbox in a panel — you picked MySQL or PostgreSQL, got a hostname, a user, a password, and phpMyAdmin. The cluster is the same idea, just with more moving parts exposed: operators watch custom resources that describe the database you want, and they spin up the pods, volumes, users, and backups to match. You write YAML instead of clicking — but you still end up with a MySQL server, a user, a password, and (yes) phpMyAdmin.
This page explains what's running, how the pieces fit together, and why we run two Redis clusters per tenant instead of one.
## Four database layers
Four different operators manage four different database families. Each one is installed once at the cluster level; tenants reference it by creating CRs in their own namespace.
| Layer | Operator | Namespace | CRDs you'll see |
|---|---|---|---|
| MariaDB + MaxScale | mariadb-operator | op-mariadb | MariaDB, MaxScale, User, Grant, Database, Backup, PhysicalBackup, Restore, SqlJob |
| Redis | redis-operator (opstree) | op-redis | RedisReplication, RedisSentinel |
| PostgreSQL | Zalando postgres-operator | postgres-operator | postgresql |
| RabbitMQ | rabbitmq-cluster-operator | rabbitmq-system | RabbitmqCluster |
MariaDB is the primary Ecommercen database — this is where customer data actually lives. Redis is the cache and session store in front of it. PostgreSQL exists only for Keycloak and RabbitMQ is used by a couple of internal queues — you'll rarely touch either.
## MariaDB + MaxScale
The entire MariaDB stack is generated from three CRs in the wecare namespace: a MariaDB, a MaxScale, and (for the ERP user) a User + Grant pair. The operator watches those and keeps the pods, volumes, replication topology, users, and backup CronJobs in sync with what you wrote.
### Topology — primary + replica
We run spec.replicas: 2 with spec.replication.enabled: true. That produces two StatefulSet pods — app-db-0 and app-db-1 — connected by MariaDB GTID-based async replication. One of them is the current primary (writes go there), the other tails the binlog.
Four CR settings do the heavy lifting here:

- `gtidStrictMode: true` — replicas refuse to diverge silently; if a write lands on the wrong node, replication breaks loudly rather than drifting.
- `maxScaleRef` — the MariaDB CR hands off failover decisions to MaxScale (see below). Without this, the operator would choose which pod is primary; with it, MaxScale chooses and the operator defers.
- Pod anti-affinity — the two pods are forced onto different nodes (`topologyKey: kubernetes.io/hostname`). Losing one node never takes out both replicas.
- `updateStrategy: ReplicasFirstPrimaryLast` — rolling upgrades touch the replica first, wait for it to rejoin, then switch over and upgrade the old primary. The only write-unavailable window is the switchover itself (a few seconds).
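Condensed, the CR that produces this topology looks roughly like the following. This is a heavily abridged sketch (field paths per the mariadb-operator CRD; the MaxScale name is illustrative) — the real spec lives in mariadb.yaml:

```yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: app-db
  namespace: wecare
spec:
  replicas: 2
  replication:
    enabled: true            # GTID-based async primary/replica
  maxScaleRef:
    name: app-db-maxscale    # MaxScale, not the operator, picks the primary
  storage:
    storageClassName: local-path
  updateStrategy:
    type: ReplicasFirstPrimaryLast
```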
### MaxScale — why it's in the path
MaxScale is MariaDB's official database proxy. Every database client — the PHP app, the ERP connector, phpMyAdmin — talks to app-db-maxscale:3306, never directly to a MariaDB pod. MaxScale does three jobs we care about:
- Read-write split. The `readwritesplit` router (`rw-router`) parses each query and sends writes to the primary, reads to whichever replica has capacity. The app doesn't need to know which pod is primary.
- Automatic failover. The `monitor` section polls both MariaDB pods every 2 seconds. If the primary dies, MaxScale promotes the replica (`auto_failover: "true"`), rewires routing, and rejoins the old primary as a replica when it comes back (`auto_rejoin: "true"`). Because we set `maxScaleRef` on the MariaDB CR, the operator respects MaxScale's decision — no tug-of-war.
- Connection keepalive. MaxScale pings idle backend connections every 300s so PHP-FPM persistent connections don't get killed by MariaDB's `wait_timeout`.
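The corresponding MaxScale CR, again heavily abridged (field paths per the mariadb-operator MaxScale CRD; consult the real CR for the full spec):

```yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: MaxScale
metadata:
  name: app-db-maxscale
spec:
  replicas: 2
  mariaDbRef:
    name: app-db
  services:
    - name: rw-router
      router: readwritesplit   # writes -> primary, reads -> replicas
      listener:
        port: 3306
  monitor:
    interval: 2s               # poll both backends every 2 seconds
    params:
      auto_failover: "true"
      auto_rejoin: "true"
```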
MaxScale runs with 2 replicas of its own, also on the dedicated DB nodes. There's a web GUI at maxscale-wecare-ecnv4-mgmt.ecommercen.com — you can see replication lag, routing stats, and trigger a switchover from there.
### Where the data lives
spec.storage.storageClassName: local-path is a deliberate choice. local-path puts the PVC on the node's own disk — in our case the NVMe SSD of a dedicated DB node. We do not use Longhorn for MariaDB because:
- Latency. InnoDB fsyncs every commit. Network-replicated storage adds milliseconds to every write; NVMe is measured in microseconds.
- Redundancy elsewhere. Replication already gives us two copies of the data on two nodes. Double-replicating via Longhorn would be belt-and-braces with a performance tax.
The downside of local-path is that a pod is pinned to the node where its PVC was created. That's why the CR has node affinity on the wecare-mariadb label and a matching toleration for the wecare-mariadb:NoSchedule taint. The pool is provisioned separately — see Autoscaling / dedicated pools for how that works.
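In CR terms, that pinning is roughly the following sketch (the label and taint key are as described above; the exact structure is abridged, and the real CR may express the label match as a nodeAffinity block rather than a nodeSelector):

```yaml
spec:
  storage:
    storageClassName: local-path   # PVC lands on the node's local NVMe
  nodeSelector:
    wecare-mariadb: "true"         # label value illustrative
  tolerations:
    - key: wecare-mariadb          # matches the pool's NoSchedule taint
      operator: Exists
      effect: NoSchedule
```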
### Declarative users, grants, databases
Creating a database user in Plesk was a form; here it's a User + Grant CR pair. Example from erp-persistence.yaml (the ERP connector's database access):
```yaml
kind: User
spec:
  mariaDbRef: { name: app-db }
  passwordSecretKeyRef: { name: erp-db-secrets, key: ERP_DB_PWD }
  maxUserConnections: 20
---
kind: Grant
spec:
  mariaDbRef: { name: app-db }
  username: erp-db-user
  database: wecare_production
  privileges: ["ALL PRIVILEGES"]
```

The operator executes the equivalent `CREATE USER` + `GRANT` statements against MariaDB and keeps them in sync if you change the CR. The password itself is in a Sealed Secret. There's also a `Database` CRD for creating schemas, and a `SqlJob` CRD for one-shot SQL (migrations, one-off maintenance) — we don't lean on those for day-to-day work.
The same mechanism is how we provisioned the ERP's direct MySQL connection — cross-referenced in Ingress / Path C.
### Backups
MariaDB runs two backup tracks in parallel, both writing to the same Hetzner Object Storage bucket (S3-compatible, hel1). Different mechanisms, different cadences, different best-fit scenarios — we keep both.
| Track | CRD | Mechanism | Schedule | Retention | Best for |
|---|---|---|---|---|---|
| Logical | Backup | mysqldump of the app schema, gzip-compressed | daily at 04:00 | 30 days | point-in-time recovery, portable across MariaDB versions, table-level restore |
| Physical | PhysicalBackup | volume-level snapshot via mariadb-backup, gzip-compressed | hourly (0 * * * *) | 24 hours | fast full-cluster restore (no SQL replay), bootstrapping fresh replicas |
Both CRs share the same ecn-bucket-access Sealed Secret for S3 credentials.
#### Logical — `kind: Backup`
A scheduled mysqldump. Defined in backup-sql.yaml:
- Scope: the `wecare_production` database only (not MariaDB system tables).
- Target: bucket `ecn-private`, prefix `wecare/mariadb-sql-backup`, endpoint `hel1.your-objectstorage.com`.
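Abridged, the relevant fields look like this sketch (field names per the mariadb-operator Backup CRD; the secret key names are illustrative — check the actual backup-sql.yaml):

```yaml
kind: Backup
spec:
  mariaDbRef: { name: app-db }
  schedule:
    cron: "0 4 * * *"              # daily at 04:00
  maxRetention: 720h               # 30 days
  databases: [wecare_production]   # app schema only
  storage:
    s3:
      bucket: ecn-private
      prefix: wecare/mariadb-sql-backup
      endpoint: hel1.your-objectstorage.com
      accessKeyIdSecretKeyRef:
        name: ecn-bucket-access
        key: access-key            # key name illustrative
      secretAccessKeySecretKeyRef:
        name: ecn-bucket-access
        key: secret-key            # key name illustrative
```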
Logical backups are portable — you can restore a Backup into a completely different MariaDB version or even a different operator. They're the safety net you actually want when someone deletes a table at 3 AM.
#### Physical — `kind: PhysicalBackup`
A volume-level snapshot (via mariadb-backup), much faster to restore than replaying a mysqldump because it's a raw file copy, not SQL statements. Defined in physicalbackup-template.yaml:
- Target pod: `PreferReplica` — takes the snapshot from the replica so the primary isn't stalled.
- S3 prefix: `wecare/mariadb-backup` (distinct from the logical-backup prefix).
- Why hourly: physical backups are meant to be the "fast recover" path. The shorter the window since the last snapshot, the less binlog catch-up is needed after restoring it.
- Original purpose: to bootstrap new replicas — when a fresh pod joins the cluster, it pulls the latest physical backup from S3 instead of doing a full GTID stream from the primary (which can take hours on a large dataset).
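The physical track is the same shape as the logical one, with an hourly cron and shorter retention. A sketch under the same caveats (field names per the operator CRD; the PreferReplica target-pod setting and credentials are omitted here — see physicalbackup-template.yaml):

```yaml
kind: PhysicalBackup
spec:
  mariaDbRef: { name: app-db }
  schedule:
    cron: "0 * * * *"    # hourly
  maxRetention: 24h
  storage:
    s3:
      bucket: ecn-private
      prefix: wecare/mariadb-backup
      endpoint: hel1.your-objectstorage.com
```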
#### `bootstrapFrom` currently disabled — the backups themselves run fine
The hourly PhysicalBackup CronJob is running and snapshots land in S3 as expected. What's disabled is the MariaDB CR's bootstrapFrom: + recovery: blocks (commented out in mariadb.yaml:15-23) because the bootstrap pods that consume these snapshots fail on start. A fresh replica currently falls back to the native GTID streaming clone from the primary — slower, but correct. The snapshots still exist in S3 and can be restored manually via a Restore CR. Cause still under investigation; tracked in project memory.
#### S3 credentials
Both backup CRs reference a shared ecn-bucket-access Sealed Secret that holds the S3 access key and secret key. Rotating those is the standard rotate-a-secret flow.
#### Restoring
There's a Restore CRD that takes either a backupRef (points at a past Backup run) or a raw S3 location. Writing one kicks off a Job that runs the restore. Actually doing it in anger — which backup to pick, downtime window, GTID considerations — is an incident scenario; delegate to the db-manager agent.
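A restore is itself just a CR. A minimal sketch, assuming a past Backup run named app-db-backup (name illustrative):

```yaml
kind: Restore
spec:
  mariaDbRef: { name: app-db }
  backupRef:
    name: app-db-backup   # a past Backup run; an s3 block can be used instead
```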
## Redis — cache and session
Every tenant runs two Redis clusters, for different reasons:
| Cluster | What's in it | Eviction policy | Persistence | Size |
|---|---|---|---|---|
| redis-cache | L2 application cache (query results, rendered fragments) | allkeys-lru — any key can be evicted | disabled (save "" + appendonly no) | 14Gi pod, 9Gi maxmemory |
| redis-session | PHP session data | volatile-lru — only keys with TTL are evictable | default (AOF) | 4Gi pod, 3Gi maxmemory |
Why split them? They have very different durability requirements. Losing the cache means a few minutes of cache misses and higher DB load — annoying but harmless. Losing sessions means every logged-in user gets kicked out. Separating them lets us turn persistence off for the cache (no AOF fork stalls, no BGSAVE pressure, fast restarts) while keeping it on for sessions.
### Topology — replication + Sentinel
Each cluster is a RedisReplication with clusterSize: 2 (one primary, one replica) plus a matching RedisSentinel with clusterSize: 2. Sentinel is Redis's own failover coordinator: it watches the primary, promotes the replica if the primary disappears, and tells clients the new primary's address. The opstree redis-operator wires all of this together.
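Per cluster, the CR pair is roughly the following (opstree CRD fields; names and the auth wiring are abridged):

```yaml
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisReplication
metadata:
  name: redis-session
spec:
  clusterSize: 2                          # one primary, one replica
---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisSentinel
metadata:
  name: redis-session-sentinel
spec:
  clusterSize: 2
  redisSentinelConfig:
    redisReplicationName: redis-session   # which replication to watch
```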
The redis-cache pods live on a dedicated wecare-cache node pool (separate from the DB pool) so a Redis restart doesn't compete with MariaDB for page cache. The redis-session pods currently land on the wecare-mariadb pool via the node-role.kubernetes.io/db label — fine, given their small footprint. For dashboards, RedisInsight is available for peeking at keys, TTLs, and memory stats.
### The redis-auth secret
Both clusters share a single Sealed Secret (redis-auth) for the password. Sentinel also needs it to authenticate to the data nodes during failover — that's the redisReplicationPassword block on each RedisSentinel CR. Rotating it is the standard secret-rotation flow, but all four stateful sets need to come back up cleanly after Reloader restarts them, so do it during a quiet window.
## PostgreSQL and RabbitMQ
Both run via their own dedicated operators and are minor compared to MariaDB + Redis.
- PostgreSQL (Zalando operator) is used by Keycloak for user/realm storage. One `postgresql` CR per environment. Managed by the Keycloak app, not by tenants directly.
- RabbitMQ (cluster-operator) provides a small queue for internal messaging (a `RabbitmqCluster` CR per consumer). Low traffic; unlikely to demand operator attention.
If either one does misbehave: check the app's ArgoCD sync state first, then the operator pod logs in postgres-operator / rabbitmq-system.
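For reference, a minimal RabbitmqCluster is a one-field CR (the name here is illustrative):

```yaml
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: internal-queue
spec:
  replicas: 1
```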
## Credentials
Every database password in the cluster is a Sealed Secret referenced by a CR (passwordSecretKeyRef, redisSecret, accessKeyIdSecretKeyRef). Plaintext never lands in the repo. The secrets-manager agent handles the seal/rotate/re-seal flow end-to-end — see rotate a secret for the runbook. MariaDB system users (root, the operator's own service accounts) are generated by the operator into Secrets at install time and are reset automatically if you delete them.
## When to use which CR
| Need | Use |
|---|---|
| New Ecommercen tenant database (app data) | MariaDB + MaxScale (copy wecare as template) |
| App user for an existing tenant DB | User + Grant |
| Create a named schema | Database |
| One-shot SQL migration | SqlJob |
| Scheduled mysqldump to S3 | Backup |
| Scheduled volume snapshot to S3 | PhysicalBackup |
| Restore from either of the above | Restore |
| Cache or session store | RedisReplication + RedisSentinel |
| Keycloak backing store (already configured) | Zalando postgresql |
| Internal queue | RabbitmqCluster |
## When to delegate to the db-manager agent
Anything non-trivial touching MariaDB or MaxScale — version bumps, myCnf tuning, compatibility checks, debugging a weird replication lag, planning a downgrade — should go through the db-manager Claude agent. It knows the operator CRDs cold, queries container registries for available tags, validates myCnf against the target MariaDB version, and flags misconfigurations (memory overcommit, timeout mismatches, missing anti-affinity, removed parameters) before you apply them. See the agents reference for how to invoke it.
Day-to-day reading of pod status, logs, and describe output stays with the k8s-manager agent. Drawing the line: if the question is "is this pod up?" it's k8s-manager; if the question is "should I change innodb_buffer_pool_size before or after this upgrade?" it's db-manager.
## Further reading
- MariaDB server knowledge base — the canonical reference.
- mariadb-operator docs — every CRD field, with examples for replication, Galera, MaxScale topologies.
- MaxScale documentation — routers, monitors, failover semantics.
- opstree redis-operator docs — RedisReplication + RedisSentinel reference.
- Zalando postgres-operator — the CRD and its options.
- rabbitmq-cluster-operator — official RabbitMQ Kubernetes operator.
- Internal: incident response, rotate a secret, agents reference.
- Next: Autoscaling.