How do you build real-time features using WebSockets or Server-Sent Events?

Use WebSockets for bidirectional, low-latency interaction (chat, collaboration, presence). Use SSE for one-way server→client streams (notifications, feeds, live status): it is simpler, reconnects automatically, and runs over plain HTTP. Keep polling as the low-effort fallback. Choose by directionality, then handle reconnection, resync, and scale.

Real-time UIs come down to picking the right transport for the data flow, then engineering reliability around it.

The options

| Transport | Direction | Best for | Notes |
| --- | --- | --- | --- |
| Polling | client pulls | low-frequency, simple | wasteful, laggy; fine as a fallback |
| Long-polling | client pulls (held) | legacy fallback | better than polling, awkward |
| SSE | server → client | notifications, feeds, live status, dashboards | HTTP, auto-reconnect, simple; one-way only; ~6-connection limit on HTTP/1.1 |
| WebSocket | bidirectional | chat, collab editing, presence, games | full duplex, low latency; more infra (stateful, scaling) |

How to choose

  1. Do you need low-latency client→server messages over the same channel? Use WebSocket: chat, collaborative editing, multiplayer, live cursors.
  2. Only server→client updates? Use SSE. It's plain HTTP: built-in auto-reconnect, Last-Event-ID for resume, works through most proxies, and needs far less infra. Good for notifications, activity feeds, order/build status, live dashboards.
  3. Updates are infrequent, or you need a fallback? Use polling. Don't over-engineer a "new comment" badge.
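
The three questions above can be written down as a small decision function. This is only a sketch: the `DataFlow` shape and the `chooseTransport` name are hypothetical, and a real decision would also weigh infrastructure and team constraints.

```typescript
// Hypothetical types for illustration; not part of any library.
type Transport = "websocket" | "sse" | "polling";

interface DataFlow {
  // The client must send low-latency messages over the same channel.
  clientSendsOnSameChannel: boolean;
  // The server pushes updates down to the client.
  serverPushes: boolean;
  // Updates are rare enough that a periodic request is acceptable.
  updatesAreInfrequent: boolean;
}

function chooseTransport(flow: DataFlow): Transport {
  // Two-way, low-latency traffic is the case that needs a WebSocket.
  if (flow.clientSendsOnSameChannel) return "websocket";
  // One-way streams go to SSE, unless they are too infrequent to justify it.
  if (flow.serverPushes && !flow.updatesAreInfrequent) return "sse";
  // Everything else: plain polling.
  return "polling";
}
```

For example, a chat window sets `clientSendsOnSameChannel` and gets `"websocket"`, while a CI build-status stream gets `"sse"`.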

SSE is underrated — for the very common "stream updates down to the client" case it's simpler and more robust than WebSockets.
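
Part of what makes SSE simple is its wire format: each event is a few `field: value` lines followed by a blank line, and the `id` field is what the browser sends back as `Last-Event-ID` when it reconnects. The sketch below shows one way a server might serialize events and decide what to replay. `StoredEvent`, `formatSseEvent`, and `eventsToReplay` are hypothetical names, and the in-memory event log with numeric ids is an assumption made for illustration.

```typescript
// Hypothetical event record kept in a server-side log.
interface StoredEvent {
  id: number;   // monotonically increasing event id
  data: string; // payload; may contain newlines
}

// Serialize one event in the text/event-stream format.
// Multi-line payloads need one "data:" line per payload line.
function formatSseEvent(event: StoredEvent): string {
  const dataLines = event.data
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return `id: ${event.id}\n${dataLines}\n\n`;
}

// Decide which logged events to resend to a reconnecting client.
// lastEventId is the raw Last-Event-ID header value, or null on a first connection.
function eventsToReplay(log: StoredEvent[], lastEventId: string | null): StoredEvent[] {
  if (lastEventId === null) return []; // assumption: new clients load state elsewhere
  const lastSeen = parseInt(lastEventId, 10);
  if (isNaN(lastSeen)) return [];      // unparseable header: replay nothing
  return log.filter((event) => event.id > lastSeen);
}
```

If the log has been trimmed past the client's last seen id, replay cannot close the gap, and the client should refetch current state instead.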

Reliability (applies to all)

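  • Reconnection with exponential backoff + jitter. The browser's EventSource reconnects on its own, typically after a fixed delay that the server can tune with the retry field; with WebSockets you implement reconnection yourself.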
  • Resync on reconnect — you may have missed events. Use Last-Event-ID (SSE) or a sequence number, or refetch current state via REST. Never assume a gap-free stream.
  • Heartbeats to detect dead connections.
  • Graceful degradation — WS → SSE → polling.
  • Ordering/idempotency — sequence numbers; drop stale/duplicate events.
  • Cleanup — close connections on unmount/navigation.
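
As a sketch of the first bullet, the delay before each reconnect attempt could be computed like this. It uses the "full jitter" variant (a uniform random delay up to an exponentially growing ceiling), which is one common choice among several. The `reconnectDelayMs` name, the `BackoffOptions` shape, and the example numbers are assumptions; the random source is injectable only so the function can be tested.

```typescript
// Hypothetical options; tune the values for your own traffic.
interface BackoffOptions {
  baseMs: number; // ceiling for the first retry
  maxMs: number;  // upper bound on the ceiling for any retry
}

// Full jitter: a uniform delay in [0, min(maxMs, baseMs * 2^attempt)).
// attempt is 0 for the first retry, 1 for the second, and so on.
function reconnectDelayMs(
  attempt: number,
  options: BackoffOptions,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(options.maxMs, options.baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

A WebSocket client would typically call this in its close handler, schedule the next connect after the returned delay, and reset `attempt` to 0 once a connection succeeds. The randomness spreads clients out so they do not all reconnect at the same instant after a server restart.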

Scaling

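  • Both hold long-lived connections, so each server instance is stateful. You need a pub/sub backplane (Redis, Kafka) so that an event produced on any instance reaches clients connected to every other instance, plus sticky sessions or a dedicated connection layer.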
  • Connection count is the limiting resource — each costs memory.
  • For high fan-out, consider managed services (Ably, Pusher, PubNub) or a dedicated realtime tier.
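
To make the backplane idea concrete, here is a toy model in which an in-memory class stands in for Redis or Kafka. Every name here (`InMemoryBackplane`, `ServerInstance`, `Deliver`) is hypothetical, and a real backplane runs as a separate networked service. The point is only the shape: an instance publishes to the backplane instead of writing to its own clients, and every instance forwards what it receives to the connections it holds.

```typescript
// A function that writes a message to one destination (a client or an instance).
type Deliver = (message: string) => void;

// Stand-in for Redis/Kafka: every subscriber sees every published message.
class InMemoryBackplane {
  private subscribers: Deliver[] = [];

  subscribe(handler: Deliver): void {
    this.subscribers.push(handler);
  }

  publish(message: string): void {
    this.subscribers.forEach((handler) => handler(message));
  }
}

// One app server instance holding its own local client connections.
class ServerInstance {
  private clients: Deliver[] = [];
  private backplane: InMemoryBackplane;

  constructor(backplane: InMemoryBackplane) {
    this.backplane = backplane;
    // Forward everything from the backplane to locally connected clients.
    backplane.subscribe((message) => this.clients.forEach((send) => send(message)));
  }

  connectClient(send: Deliver): void {
    this.clients.push(send);
  }

  // Publish through the backplane so clients on other instances also get it.
  broadcast(message: string): void {
    this.backplane.publish(message);
  }
}
```

With two instances sharing one backplane, a `broadcast` on the first reaches a client connected to the second, which is exactly what breaks when each instance only writes to its own connections.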

The mental model

Directionality picks the transport (one-way → SSE, two-way → WebSocket, rare → polling). Reliability engineering — reconnect, resync, ordering, fallback — is the same regardless and is where most of the actual work is.

Follow-up questions

  • When is SSE the better choice over WebSockets?
  • How do you recover events missed during a disconnect?
  • Why does real-time get hard when you scale to multiple server instances?
  • What are the connection limits of SSE on HTTP/1.1?

Common mistakes

  • Defaulting to WebSockets when SSE (or polling) would be simpler and sufficient.
  • Assuming the stream is gap-free — no resync after reconnect.
  • No fallback when WS/SSE is blocked by a proxy.
  • Ignoring horizontal scaling — works on one server, breaks on many.

Performance considerations

  • Each open connection costs server memory, so connection count is the scaling bottleneck.
  • Polling wastes requests and adds latency.
  • A pub/sub backplane is required for multi-instance fan-out.
  • Managed services trade cost for offloading this.

Edge cases

  • Proxies/firewalls blocking WebSocket upgrades.
  • SSE's HTTP/1.1 per-domain connection limit (about six per browser), which is shared across tabs.
  • Reconnection storms after a server restart.
  • Out-of-order or duplicate event delivery.
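
Duplicate, stale, and missing events can all be handled on the client with a per-stream sequence number. The sketch below assumes the server stamps every event with a consecutive integer; `SequenceTracker` and `Verdict` are hypothetical names. For simplicity it treats any gap as a reason to resync rather than buffering out-of-order events.

```typescript
// What the client should do with an incoming event.
type Verdict = "apply" | "drop" | "resync";

class SequenceTracker {
  private lastApplied: number;

  // startAt is the sequence number reflected by the initially loaded state.
  constructor(startAt: number) {
    this.lastApplied = startAt;
  }

  // Classify an incoming event by its sequence number.
  check(seq: number): Verdict {
    if (seq <= this.lastApplied) return "drop";      // duplicate or stale
    if (seq > this.lastApplied + 1) return "resync"; // gap: something was missed
    this.lastApplied = seq;
    return "apply";
  }

  // Call after refetching state, with the sequence number that state reflects.
  reset(seq: number): void {
    this.lastApplied = seq;
  }
}
```

On `"resync"` the client would refetch current state (for example over REST) and call `reset` with the sequence number returned alongside it.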

Real-world examples

  • SSE for a notifications feed or CI build-status stream.
  • WebSockets for chat, Figma-style collaboration, live cursors.
  • Polling for a low-frequency 'unread count' badge.

Senior engineer discussion

Senior engineers choose the transport by directionality and explicitly champion SSE for the common one-way streaming case because it is HTTP-native, reconnects automatically, and needs far less infra. They then treat reliability (backoff + jitter, resync, ordering, fallback chain) and horizontal scaling (pub/sub backplane, connection limits, build-vs-buy) as the real engineering, independent of transport.
