Choosing Your Real-Time Transport
Before writing a line of code, the first decision is which protocol to use. The wrong choice adds unnecessary complexity.
Server-Sent Events (SSE)
One-directional: server pushes to client. HTTP-based — works through proxies and load balancers without special configuration. Perfect for live feeds, notifications, dashboards, and progress updates where the client never needs to send data back.
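The wire format is simple enough to write by hand: each event is one or more `data:` lines terminated by a blank line, optionally preceded by an `event:` line naming the event type. A minimal formatter/parser sketch (the function names `formatSSE` and `parseSSE` are illustrative, not part of any library):

```typescript
// Sketch of the SSE wire format: one event = optional "event:" line,
// "data:" payload line(s), terminated by a blank line.
function formatSSE(data: object, event?: string): string {
  const lines: string[] = []
  if (event) lines.push(`event: ${event}`)     // optional named event type
  lines.push(`data: ${JSON.stringify(data)}`)  // JSON payload line
  return lines.join('\n') + '\n\n'             // blank line ends the event
}

function parseSSE(frame: string): { event?: string; data: unknown } {
  let event: string | undefined
  let data = ''
  for (const line of frame.split('\n')) {
    if (line.startsWith('event: ')) event = line.slice(7)
    if (line.startsWith('data: ')) data += line.slice(6)
  }
  return { event, data: JSON.parse(data) }
}
```

On the client, the browser's built-in `EventSource` handles this parsing automatically.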
```typescript
// Express SSE endpoint
app.get('/events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream')
  res.setHeader('Cache-Control', 'no-cache')
  res.setHeader('Connection', 'keep-alive')

  const sendEvent = (data: object) => {
    // SSE events are terminated by a blank line, hence the double newline
    res.write(`data: ${JSON.stringify(data)}\n\n`)
  }

  const unsubscribe = eventBus.on('update', sendEvent)
  req.on('close', unsubscribe)
})
```
WebSockets
Bidirectional full-duplex. Required when the client sends real-time data back — collaborative editing, multiplayer, live chat, trading terminals. More complex to proxy, load balance, and scale.
Long Polling
HTTP request held open until data is available. The fallback for environments where WebSockets or SSE are blocked. Use only when necessary.
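The holding pattern can be sketched without a framework: wait for the next event, but resolve empty after a timeout so the client can issue a fresh request (the name `waitForEvent` is illustrative, not a library API):

```typescript
import { EventEmitter } from 'node:events'

// Long-polling core: resolve with the next event on `name`, or null after
// timeoutMs. The client is expected to immediately re-poll either way.
function waitForEvent<T>(bus: EventEmitter, name: string, timeoutMs: number): Promise<T | null> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      bus.off(name, onEvent)  // stop listening before giving up
      resolve(null)           // empty response; client re-polls
    }, timeoutMs)
    const onEvent = (data: T) => {
      clearTimeout(timer)
      resolve(data)
    }
    bus.once(name, onEvent)
  })
}
```

An Express handler would then be roughly `res.json(await waitForEvent(bus, 'update', 25_000))`, with the timeout kept below the proxy's idle-connection limit.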
Scaling WebSockets: The Sticky Session Problem
WebSocket connections are long-lived and server-specific. With multiple server instances, a message published on server A must reach clients connected to server B. This requires a message broker as the coordination layer.
```typescript
// Socket.io with Redis adapter — messages broadcast across all instances
import { createServer } from 'http'
import { Server } from 'socket.io'
import { createAdapter } from '@socket.io/redis-adapter'
import { createClient } from 'redis'

const httpServer = createServer()
const pubClient = createClient({ url: process.env.REDIS_URL })
const subClient = pubClient.duplicate()
await Promise.all([pubClient.connect(), subClient.connect()])

const io = new Server(httpServer)
io.adapter(createAdapter(pubClient, subClient))
// Now io.to('room:123').emit('update', data) works across all instances
```
The Redis adapter publishes messages to a Redis channel. All Socket.io instances subscribe and fan out to their locally connected clients.
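The mechanism can be simulated in-process: two "instances" subscribe to one shared channel, and each delivers only to its locally connected clients. A sketch of the pattern, with no real Redis involved (`SharedChannel` and `Instance` are illustrative stand-ins):

```typescript
type Message = { room: string; data: unknown }

// Stand-in for a Redis Pub/Sub channel shared by all server instances.
class SharedChannel {
  private subscribers: ((msg: Message) => void)[] = []
  subscribe(fn: (msg: Message) => void): void { this.subscribers.push(fn) }
  publish(msg: Message): void { this.subscribers.forEach((fn) => fn(msg)) }
}

// Each "server instance" holds its own local clients, keyed by room.
class Instance {
  localClients = new Map<string, ((data: unknown) => void)[]>()

  constructor(channel: SharedChannel) {
    // Fan out: deliver every published message to local members of the room.
    channel.subscribe(({ room, data }) => {
      for (const deliver of this.localClients.get(room) ?? []) deliver(data)
    })
  }

  join(room: string, deliver: (data: unknown) => void): void {
    const list = this.localClients.get(room) ?? []
    list.push(deliver)
    this.localClients.set(room, list)
  }
}
```

Publishing `{ room: 'room:123', data }` on the shared channel reaches clients on every instance, which is what `io.to('room:123').emit(...)` does through the adapter.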
Event-Driven Architecture: Decoupling at Scale
Real-time systems quickly develop tight coupling: the order service directly calls the notification service which directly calls the analytics service. One slow dependency cascades into all of them.
Event-driven architecture inverts this: services emit events to a shared bus. Consumers subscribe to what they care about. Publishers do not know or care who is listening.
                 Order placed
                      │
                      ▼
       Event Bus (Redis Streams / Kafka)
                      │
           ┌──────────┴──────────────┐
           ▼                         ▼
  Notification Service       Analytics Service
(sends push notification)   (updates dashboards)
           │                         │
           ▼                         ▼
 WebSocket push to user    SSE push to admin dashboard

Each service scales independently. If the analytics service falls behind, the notification service is unaffected. Events are durable — if a service restarts, it catches up from its last processed offset.
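The decoupling in the diagram can be sketched as a minimal in-process bus, where publishers and subscribers share only topic names (illustrative code standing in for Redis Streams or Kafka, without their durability guarantees):

```typescript
type Handler = (payload: unknown) => void

// Minimal event bus: the publisher does not know or care who is listening.
class EventBus {
  private handlers = new Map<string, Handler[]>()

  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? []
    list.push(handler)
    this.handlers.set(topic, list)
  }

  publish(topic: string, payload: unknown): void {
    for (const handler of this.handlers.get(topic) ?? []) handler(payload)
  }
}

// Wiring from the diagram: two independent consumers of one event.
const bus = new EventBus()
bus.subscribe('order.placed', (order) => { /* send push notification */ })
bus.subscribe('order.placed', (order) => { /* update dashboards */ })
bus.publish('order.placed', { orderId: 123 })
```

Unlike Kafka or Redis Streams, this in-memory sketch has no offsets or persistence, so a restarted consumer would miss events rather than catch up.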
Connection Limits at Scale
A single server can maintain 10,000–65,000 WebSocket connections depending on available file descriptors and memory. For 1M concurrent connections that works out to roughly 15–100 servers — manageable with Kubernetes autoscaling.
More important than raw connection count is reconnection behavior: what happens when a client disconnects and reconnects? Implement reconnection with exponential backoff on the client, and have the server re-subscribe reconnecting clients to their rooms/channels.
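The backoff schedule itself reduces to one function: base delay doubling per attempt, capped, plus random jitter so a fleet of clients does not reconnect in lockstep after an outage. A sketch with illustrative defaults (the constants are not a standard):

```typescript
// Delay before reconnect attempt n (0-based): base * 2^n, capped at maxMs,
// plus random jitter in [0, baseMs) to spread out simultaneous reconnects.
function backoffDelay(
  attempt: number,
  baseMs = 1000,
  maxMs = 30_000,
  jitter = Math.random()
): number {
  const exp = Math.min(baseMs * 2 ** attempt, maxMs)
  return exp + jitter * baseMs
}
```

On a successful reconnect, reset the attempt counter to 0 and replay the client's room/channel subscriptions.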
Monitoring Real-Time Systems
Standard HTTP metrics (p95 latency, error rate) do not capture real-time system health. Additional metrics to track:
- Active WebSocket connections per instance
- Message delivery latency (time from event to client receipt)
- Redis Pub/Sub channel subscriber count
- WebSocket reconnection rate (high rate signals instability)
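A minimal sketch of tracking these counters in-process (`RealtimeMetrics` is an illustrative name; in a real deployment these would be Prometheus gauges and histograms):

```typescript
// Illustrative in-memory metrics for a single instance.
class RealtimeMetrics {
  activeConnections = 0
  reconnects = 0
  private latencies: number[] = []

  onConnect(isReconnect: boolean): void {
    this.activeConnections++
    if (isReconnect) this.reconnects++
  }

  onDisconnect(): void {
    this.activeConnections--
  }

  // Delivery latency: timestamp the event at publish time, measure
  // against the clock when the client acknowledges receipt.
  recordDelivery(publishedAtMs: number, receivedAtMs: number): void {
    this.latencies.push(receivedAtMs - publishedAtMs)
  }

  p95DeliveryLatency(): number {
    const sorted = [...this.latencies].sort((a, b) => a - b)
    return sorted[Math.floor(sorted.length * 0.95)] ?? 0
  }
}
```

The reconnect-to-connect ratio is the "instability" signal from the list above: a spike in `reconnects` relative to `activeConnections` usually points at a flapping load balancer or an instance cycling.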