Modern backend infrastructure design

Realtime products usually need four things at once: HTTP APIs, long-lived connections, fast coordination storage, and an ACID system of record. When boundaries are fuzzy, you get dual-write drift between cache and database, WebSocket connections trapped on individual nodes, and incidents that end in “restart everything” with no idea which layer failed.

This note moves through four topics in order: module boundaries inside a deployable unit → data & consistency → realtime transport → operations & evolution. It targets a stack that still holds up in small-to-mid production: Spring Boot for business logic and ingress, MySQL as the single source of truth (SSOT), and Redis as a coordination layer. It does not preach microservices, but it leaves seams for later extraction.

1. Draw boundaries first: what stays in-process

1.1 Suggested layers (logical modules, not folder theater)

| Layer | Responsibility | Should not |
|-------|----------------|------------|
| Ingress | REST/gRPC, WS handshake, auth, rate limits, DTO validation | Embed complex SQL or Redis details |
| Application | Use cases, transaction boundaries, orchestration | Know HTTP status codes or WS frame formats |
| Domain | Entities, aggregates, domain events (pure Java) | Depend on Spring, JPA, Redis clients |
| Infrastructure | Repository impls, messaging, external adapters | Host business rules |
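The dependency direction in the table can be sketched in plain Java. This is a minimal illustration, not a prescribed layout: `Order`, `OrderRepository`, `PlaceOrderUseCase`, and `InMemoryOrderRepository` are hypothetical names, and the in-memory repository stands in for a JPA or JDBC implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain: plain Java, no Spring, JPA, or Redis imports.
final class Order {
    private final String id;
    private final long amountCents;

    Order(String id, long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        this.id = id;
        this.amountCents = amountCents;
    }

    String id() { return id; }
    long amountCents() { return amountCents; }
}

// Port used by the application layer; infrastructure implements it.
interface OrderRepository {
    void save(Order order);
    Optional<Order> findById(String id);
}

// Application: one use case, no HTTP status codes or WS frames in its signature.
final class PlaceOrderUseCase {
    private final OrderRepository orders;

    PlaceOrderUseCase(OrderRepository orders) { this.orders = orders; }

    Order place(String id, long amountCents) {
        Order order = new Order(id, amountCents); // the business rule lives in the domain
        orders.save(order);
        return order;
    }
}

// Infrastructure: stores rows, hosts no business rules.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<String, Order> rows = new HashMap<>();

    public void save(Order order) { rows.put(order.id(), order); }

    public Optional<Order> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }
}
```

Because the use case only sees the `OrderRepository` interface, swapping the in-memory class for a remote call later does not touch the application or domain code.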

Spring Boot’s strength is shipping ingress + application in one process while keeping interfaces that could become remote calls later. Splitting into five microservices early often imports network uncertainty before domain uncertainty is understood.

1.2 When a modular monolith wins

Prefer monolith + clear modules when most of these hold:

Small team; cross-service integration costs more than in-repo coupling;
Peak load fits vertical scale + read replicas;
Domain boundaries still move weekly.

Split when a path needs independent scaling (e.g. WS gateways) or release cadence truly diverges—not when “LOC > X”.

2. MySQL: system of record and consistency seams

2.1 What must be relational

Anything involving money, authorization, audit, or non-derivable facts belongs in MySQL (or an equivalent ACID store) as the single source of truth. Redis session/presence/rate-limit state is derived and must be rebuildable or safely expirable.

2.2 Keep transactions short and strict

Anti-pattern: external HTTP, Kafka publish, or Thread.sleep inside @Transactional.

Better:

DB reads/writes only inside the transaction;
Side effects after commit via domain events / transactional outbox.

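Outbox pattern: write the business row and an outbox row in one transaction; a poller then publishes pending rows to the MQ (or calls downstream) and marks them sent. This avoids “DB committed, message never sent” (and the reverse), at the cost of at-least-once delivery, so consumers must be idempotent.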
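The outbox flow can be sketched with in-memory lists. This is a simplified model under stated assumptions: the `synchronized` method stands in for a real MySQL transaction, and `OutboxStore`, `placeOrder`, and `publishPending` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class OutboxStore {
    static final class OutboxRow {
        final long id;
        final String payload;
        boolean published;

        OutboxRow(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    private final List<String> orders = new ArrayList<>();     // stand-in for the business table
    private final List<OutboxRow> outbox = new ArrayList<>();  // stand-in for the outbox table
    private long nextId = 1;

    // Stand-in for one DB transaction: validation happens first,
    // then the business row and the outbox row are written together.
    synchronized void placeOrder(String orderId) {
        if (orderId == null || orderId.isEmpty()) {
            throw new IllegalArgumentException("orderId required");
        }
        orders.add(orderId);
        outbox.add(new OutboxRow(nextId++, "OrderPlaced:" + orderId));
    }

    // Poller step: publish pending rows in insertion order and mark each one
    // only after the publisher returns. If the publisher throws, the row stays
    // pending and is retried on the next poll (at-least-once delivery).
    synchronized int publishPending(Consumer<String> publisher) {
        int sent = 0;
        for (OutboxRow row : outbox) {
            if (row.published) {
                continue;
            }
            publisher.accept(row.payload);
            row.published = true;
            sent++;
        }
        return sent;
    }

    synchronized long pendingCount() {
        return outbox.stream().filter(r -> !r.published).count();
    }
}
```

A crash between `publisher.accept` and marking the row would cause a second publish on the next poll, which is why the downstream consumer needs to deduplicate.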

2.3 Pools and slow queries

HikariCP: maximumPoolSize is not “as high as possible”; for OLTP, start near database-server cores * 2 and adjust from load-test results;
Slow query log + EXPLAIN as routine; kill ORM N+1 on hot paths (@EntityGraph, fetch joins);
Read-heavy workloads: read replicas for reporting; writes stay on primary.

3. Redis: coordination layer, not a cache plugin

3.1 Structure by use case

| Use case | Structure | Notes |
|----------|-----------|-------|
| Session / token metadata | String + TTL | Namespaces like session:{userId}; key rotation windows |
| Rate limiting | INCR + EXPIRE / Redis Cell | Per user, IP, global; return 429 + Retry-After |
| Presence | Hash / ZSET | TTL + heartbeat; do not rely only on disconnect events |
| Cross-node WS fan-out | Pub/Sub or Streams | Pub/Sub is fire-and-forget; Streams + consumer groups for at-least-once |
| Distributed locks | SET NX PX / Redisson | Short TTL; tiny critical sections; prefer optimistic DB locks |
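To make the rate-limiting row concrete, here is a minimal fixed-window counter that mirrors the INCR + EXPIRE idea in memory. It is a sketch, not a Redis client: the map stands in for Redis keys, the caller passes the clock value, and `FixedWindowRateLimiter` and `Decision` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class FixedWindowRateLimiter {
    static final class Decision {
        final boolean allowed;
        final long retryAfterSeconds; // value for the Retry-After header on a 429

        Decision(boolean allowed, long retryAfterSeconds) {
            this.allowed = allowed;
            this.retryAfterSeconds = retryAfterSeconds;
        }
    }

    private static final class Window {
        long expiresAtMillis;
        int count;
    }

    private final int limit;
    private final long windowMillis;
    private final Map<String, Window> windows = new HashMap<>(); // stand-in for Redis keys

    FixedWindowRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized Decision tryAcquire(String key, long nowMillis) {
        Window w = windows.get(key);
        if (w == null || nowMillis >= w.expiresAtMillis) {
            // First hit of a window: start the counter and set its expiry,
            // like INCR returning 1 followed by EXPIRE.
            w = new Window();
            w.expiresAtMillis = nowMillis + windowMillis;
            windows.put(key, w);
        }
        w.count++;
        if (w.count <= limit) {
            return new Decision(true, 0);
        }
        // Round the remaining window time up to whole seconds.
        long retryAfter = (w.expiresAtMillis - nowMillis + 999) / 1000;
        return new Decision(false, retryAfter);
    }
}
```

With `new FixedWindowRateLimiter(2, 10_000)`, a third call for the same key inside the window is rejected with the seconds left until the window resets. Against real Redis, the increment and expiry should be applied atomically (for example in a script) so a crash between the two cannot leave a counter without a TTL.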

3.2 Document cache policy

Cache-aside: miss loads DB, then set; on write update DB then delete cache (or delayed double-delete);
Do not “cache everything by default”: define keys, TTL, invalidation, stampede control (singleflight), and protection against repeated lookups for missing keys (bloom filter);
Hot keys: local Caffeine L1 + Redis L2, or prewarm before events.
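Cache-aside with singleflight stampede control can be sketched as follows. This is an in-process illustration under stated assumptions: a `ConcurrentHashMap` stands in for Redis, TTL handling is omitted, and `CacheAside`, `get`, and `invalidate` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CacheAside<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for the DB read

    CacheAside(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // another caller is loading this key; wait for its result
        }
        try {
            V value = cache.get(key); // re-check: a load may have finished in the meantime
            if (value == null) {
                value = loader.apply(key);
                if (value != null) {
                    cache.put(key, value);
                }
            }
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // waiters see the failure instead of hanging
            throw e;
        } finally {
            inFlight.remove(key, mine);
        }
    }

    // Write path: the caller updates the DB first, then deletes the cached entry.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Concurrent misses for the same key share one loader call, so a hot key that expires does not send every waiting request to the database at once.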

Stale presence is a product bug, not a “low hit rate” excuse.

4. WebSocket: ingress pattern and scale-out

4.1 Do not block business threads on I/O

Common Spring-era options:

WebFlux + Reactor Netty end-to-end reactive, or
Servlet API for HTTP + a separate Netty/Socket.IO gateway decoupled from MVC.

Handlers that block on long-lived sockets hold Tomcat worker threads; once the pool fills, HTTP and WS starve together.

4.2 Gateway responsibilities

Client ──WS──► Gateway ──pub/sub──► Redis ◄──pub/sub── other Gateways
                  │                         ▲
                  └── auth / cursor / heartbeat ┘
Business API ── write DB + emit event ──► publish to Redis channel

Handshake: validate JWT (short-lived access + refresh split); bind userId, deviceId, protocol version;
Subscriptions: per room:{id} or user:{id} Redis channels; business tier holds no socket handles;
Cursor / deltas: every downlink carries seq or revision; reconnect uses HTTP catch-up API, not client guessing;
Backpressure: per-connection send queue cap; when full, drop low priority (typing < read receipts < payloads) and signal degraded.
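The backpressure rule above can be sketched as a bounded per-connection queue. This is one possible policy, not the only one: the class names are hypothetical, and the eviction choice (drop the oldest message of the lowest priority, and only for a more important newcomer) is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Declaration order is ascending priority: typing < read receipts < payloads.
enum Priority { TYPING, READ_RECEIPT, PAYLOAD }

final class OutboundMessage {
    final Priority priority;
    final long seq;
    final String body;

    OutboundMessage(Priority priority, long seq, String body) {
        this.priority = priority;
        this.seq = seq;
        this.body = body;
    }
}

final class SendQueue {
    private final int capacity;
    private final Deque<OutboundMessage> queue = new ArrayDeque<>();
    private boolean degraded;

    SendQueue(int capacity) {
        this.capacity = capacity;
    }

    // Returns true if the message was queued, false if it was dropped.
    synchronized boolean offer(OutboundMessage msg) {
        if (queue.size() < capacity) {
            queue.addLast(msg);
            return true;
        }
        degraded = true; // something is being dropped; the client should be told to resync
        OutboundMessage victim = null;
        for (OutboundMessage m : queue) {
            if (victim == null || m.priority.compareTo(victim.priority) < 0) {
                victim = m; // strict comparison keeps the oldest among the lowest priority
            }
        }
        if (victim != null && victim.priority.compareTo(msg.priority) < 0) {
            queue.remove(victim);
            queue.addLast(msg);
            return true;
        }
        return false; // the newcomer is not more important than anything queued
    }

    synchronized OutboundMessage poll() {
        return queue.pollFirst();
    }

    synchronized boolean isDegraded() {
        return degraded;
    }
}
```

Once `isDegraded()` is true, the gateway can send a degraded signal, and the client can use the last `seq` it saw to fetch the gap through the HTTP catch-up API.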

4.3 Horizontal scale

Either sticky sessions at the load balancer, or
Stateless gateways (any instance, fan-out only via Redis)—prefer the latter.

Rolling deploys need graceful drain: stop accepting, wait, then terminate—avoid synchronized reconnect storms.
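A drain sequence can be modeled without any network code. In this sketch the gateway stops accepting, assigns each open connection a random close delay inside a drain window, and reports when it is safe to terminate. The random spread is an assumption added here to avoid all clients reconnecting at the same instant; `DrainingGateway` and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

final class DrainingGateway {
    private final Set<String> connections = new LinkedHashSet<>();
    private boolean accepting = true;

    // Returns false once draining has started, so new handshakes land on other instances.
    synchronized boolean accept(String connectionId) {
        if (!accepting) {
            return false;
        }
        return connections.add(connectionId);
    }

    // Step 1: stop accepting. Step 2: give every open connection a close delay
    // spread across the window so clients do not all reconnect together.
    synchronized Map<String, Long> beginDrain(long windowMillis, Random random) {
        accepting = false;
        Map<String, Long> closeDelays = new LinkedHashMap<>();
        for (String id : connections) {
            closeDelays.put(id, (long) (random.nextDouble() * windowMillis));
        }
        return closeDelays;
    }

    // Called when a connection has actually been closed.
    synchronized void closed(String connectionId) {
        connections.remove(connectionId);
    }

    // Step 3: terminate only after draining started and no connection remains.
    synchronized boolean safeToTerminate() {
        return !accepting && connections.isEmpty();
    }
}
```

A real gateway would schedule the closes on a timer and cap the total wait with the orchestrator's termination grace period.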

5. Async and integrations

@Async for non-critical paths only; critical flows use MQ + idempotent consumers (dedupe table);
LLM / third-party HTTP: dedicated pools + timeouts + circuit breakers (Resilience4j); API returns 202 + task id, result via WS or poll;
@Scheduled jobs need distributed locks (ShedLock / clustered Quartz) on multiple instances.
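The idempotent-consumer idea from the first item can be sketched like this. The `HashSet` is a stand-in for a dedupe table with a unique key on the message id; `IdempotentConsumer` and `onMessage` are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

final class IdempotentConsumer {
    // Stand-in for a dedupe table; in a real system the insert into this table
    // and the business effect would share one DB transaction.
    private final Set<String> processedIds = new HashSet<>();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    // Returns true if the message was handled, false if it was a duplicate.
    synchronized boolean onMessage(String messageId, String payload) {
        if (processedIds.contains(messageId)) {
            return false; // redelivery: acknowledge and skip
        }
        handler.accept(payload); // if this throws, the id is not recorded and the broker can redeliver
        processedIds.add(messageId);
        return true;
    }
}
```

With at-least-once delivery from an MQ, the same message id can arrive twice; the second call returns `false` and the handler does not run again.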

6. Observability: answer “which segment is slow?”

6.1 Three pillars

Metrics: QPS, P99, Hikari active/idle, Redis hit rate, WS connection count, per-channel push latency;
Logs: structured JSON with trace_id, span_id, redacted userId, route;
Traces: OpenTelemetry across HTTP → DB → Redis → outbound HTTP; separate WS gateway instrumentation.

HTTP and WS must share trace_id (client X-Trace-Id or gateway-issued on handshake).
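A small helper shows the "reuse the client's id or issue one" rule. This is a sketch under assumptions: header names are taken as already normalized to `X-Trace-Id`, and the accepted character set and length are illustrative choices, not a standard.

```java
import java.util.Map;
import java.util.UUID;

final class TraceIds {
    static final String HEADER = "X-Trace-Id";

    private TraceIds() {}

    // Reuse a well-formed client-supplied id; otherwise issue a new one.
    // Rejecting odd values keeps arbitrary client input out of structured logs.
    static String resolve(Map<String, String> headers) {
        String incoming = headers.get(HEADER);
        if (incoming != null && incoming.matches("[A-Za-z0-9-]{8,64}")) {
            return incoming;
        }
        return UUID.randomUUID().toString();
    }
}
```

Calling `resolve` once on the WS handshake and storing the result on the connection lets every later frame log the same trace_id as the HTTP request that preceded it.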

6.2 Health checks

| Probe | Checks |
|-------|--------|
| Liveness | Process up |
| Readiness | DB ping, Redis ping, sampled dependencies |
| Deep (optional) | Read-only query, Redis RW test key |

A failed readiness probe should take the instance out of load-balancer rotation, not restart-loop the pod; restarts belong to liveness failures only.
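A readiness endpoint can be modeled as a set of named checks. In this sketch each check is a `BooleanSupplier` standing in for a DB or Redis ping; `ReadinessProbe`, `register`, and `httpStatus` are hypothetical names, and mapping failure to 503 is a common convention assumed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

final class ReadinessProbe {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    ReadinessProbe register(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    // Runs every check; a check that throws counts as failed instead of crashing the probe.
    Map<String, Boolean> run() {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            boolean ok;
            try {
                ok = entry.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                ok = false;
            }
            results.put(entry.getKey(), ok);
        }
        return results;
    }

    // 200 keeps the instance in rotation; 503 asks the load balancer to stop routing to it.
    static int httpStatus(Map<String, Boolean> results) {
        return results.containsValue(false) ? 503 : 200;
    }
}
```

Returning the per-check map alongside the status makes it visible which dependency failed, which is the question the section title asks.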

7. Security and configuration

Secrets via env / K8s Secret / Vault—never in git;
TLS to Redis and MySQL even inside VPC; least-privilege DB users;
WSS only; validate the Origin header on handshake to block cross-site WebSocket hijacking (CSWSH);
Rate limits + alert on auth failure spikes.
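The Origin check from the list above can be sketched as an exact-match allowlist. Assumptions in this example: only `https` origins are accepted, a missing Origin is rejected (non-browser clients would need a separate path), and `OriginValidator` and the example hostnames are hypothetical.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class OriginValidator {
    private final Set<String> allowedOrigins; // entries like "https://app.example.com"

    OriginValidator(Set<String> allowedOrigins) {
        this.allowedOrigins = allowedOrigins;
    }

    boolean isAllowed(String originHeader) {
        if (originHeader == null || originHeader.isEmpty()) {
            return false;
        }
        try {
            URI uri = URI.create(originHeader);
            if (!"https".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
                return false;
            }
            // Normalize to scheme://host[:port] and compare exactly. Prefix or
            // substring matching would accept look-alike hosts.
            String normalized = "https://" + uri.getHost().toLowerCase(Locale.ROOT)
                    + (uri.getPort() == -1 ? "" : ":" + uri.getPort());
            return allowedOrigins.contains(normalized);
        } catch (IllegalArgumentException e) {
            return false; // malformed Origin value
        }
    }
}
```

The check runs during the handshake, before the connection is upgraded, so a page on another site cannot open a socket that rides on the user's cookies.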

8. Evolution without hype

Phase A: modular monolith, single Redis, MySQL primary/replica; WS gateway in-process or sidecar;
Phase B: independent WS scaling; outbox + MQ; read replicas for queries;
Phase C: split by domain only when metrics and org require it; mesh for mTLS/retries.

Each phase needs rollback and mixed HTTP+WS load tests, not gut-feel capacity.

Summary

Modern backend infrastructure is not “we picked Spring Boot and Redis.” It is:

One source of truth, derived state rebuildable or TTL-bound;
Realtime decoupled from business, fan-out via coordination, connection state out of the DB;
Short transactions, async side effects, correlated traces;
Modular monolith first, split on real bottlenecks.

Write these seams into ADRs early—it prevents more incidents than adding another framework.