What problems do Web Workers actually solve in a frontend application?

Web Workers run JavaScript on a separate thread, off the main thread. They solve one core problem: CPU-bound work that would otherwise block paint, input, and animation. They do NOT solve I/O latency (fetch is already async) or render-blocking layout. Real wins: parsing large JSON, image/audio processing, crypto, search indexing, syntax highlighting. Communication goes through postMessage (structured clone or transferables), so picking the boundary is the hard part.

The actual problem

JavaScript on a page runs on a single thread that also handles rendering, layout, paint, input, and animation. If JS runs for more than ~16ms, you drop a frame at 60 Hz. If it runs for more than 50ms, it counts as a long task and INP starts to regress. If it blocks for around 5s or more, the browser may show a "Page Unresponsive" prompt.

Web Workers move JS off that thread onto a separate one with its own event loop. The main thread stays free to paint and respond to input.

What workers DO solve

  1. CPU-bound parsing: large JSON, CSV, protobuf, XML.
  2. Image / audio processing: filters, resizing, FFT, decoding.
  3. Crypto: hashing, encryption, signing large payloads.
  4. Search / indexing: building a Lunr or Flexsearch index over thousands of docs.
  5. Syntax highlighting / linting: editors like VS Code Web, CodeSandbox.
  6. Diff / merge: 3-way merges on large strings.
  7. Compression / decompression.
  8. Simulation / physics / game loops in headless mode.

What workers DO NOT solve

  • Network latency: fetch is already async on the main thread; a worker doesn't make the network faster.
  • Layout / paint: still happens on the main thread (OffscreenCanvas is a partial exception).
  • DOM access: workers have no document/window — no DOM mutations at all.
  • Trivial loops: anything <5ms doesn't justify the postMessage round-trip.
  • State that needs to be shared: structured clone copies; SharedArrayBuffer is the exception but rare.

Why this matters for INP / responsiveness

INP (Interaction to Next Paint) measures the time from a user interaction to the next frame painted in response. If your click handler does JSON.parse(bigBlob) synchronously, INP spikes. Move the parse into a worker, and the click handler returns immediately, so INP stays in the green.

Basic shape

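js
// main.js
const worker = new Worker('/parser.worker.js', { type: 'module' });
// Attach handlers before posting so no reply or load error is missed.
worker.onmessage = (e) => render(e.data);
worker.onerror = (e) => console.error('parser worker failed:', e.message);
worker.postMessage({ kind: 'parse', payload: largeString });

// parser.worker.js
self.onmessage = (e) => {
  // Runs off the main thread; the result is structured-cloned on the way back.
  const result = JSON.parse(e.data.payload);
  self.postMessage(result);
};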

Communication cost

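Every postMessage structured-clones the data, and the serialization step runs synchronously on the sending thread. For a 10MB JSON payload, the clone alone can cost ~50ms. Mitigations:

  • Transferable objects: ArrayBuffer, ImageBitmap, and OffscreenCanvas transfer ownership instead of copying (zero-copy); the sender can no longer use them afterwards.
  • SharedArrayBuffer: truly shared memory; requires cross-origin isolation (COOP/COEP headers), so it is rarely used.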
  • Batch work: send the URL, let the worker fetch + parse; only the result crosses the boundary.

If the data transfer dominates the work, you've made things worse.
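The difference between copying and transferring can be seen without spinning up a worker. The sketch below uses the global structuredClone() function, which applies the same structured clone algorithm as postMessage, as a stand-in for the worker boundary. The object shape and the 1 MiB buffer size are illustrative assumptions.

```javascript
// structuredClone() applies the same algorithm postMessage uses, so it can
// stand in for a worker boundary in a quick experiment.

// 1. Plain data is copied: the receiver gets an independent deep copy.
const original = { rows: [1, 2, 3] };
const copy = structuredClone(original);
copy.rows.push(4);
const copyIsIndependent = original.rows.length === 3 && copy.rows.length === 4;

// 2. Transferables are moved: ownership changes hands and the sender's
//    buffer is detached, so no bytes are copied.
const buffer = new ArrayBuffer(1024 * 1024); // 1 MiB, illustrative size
const moved = structuredClone(buffer, { transfer: [buffer] });
const senderByteLength = buffer.byteLength; // detached buffers report 0
const receiverByteLength = moved.byteLength;

// With a real worker the equivalent call would be:
//   worker.postMessage(buffer, [buffer]);
```

After a transfer, any code on the sending side that still touches the buffer sees an empty, detached object, which is why transfers suit hand-off pipelines better than data both sides keep using.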

Worker types

  • Dedicated Worker: one main thread ↔ one worker.
  • Shared Worker: multiple tabs/contexts ↔ one worker (limited browser support).
  • Service Worker: a different beast — proxies network, runs offline. Not for CPU offload.

Architecture patterns

  • Worker pool: spawn N workers, round-robin tasks. Like a thread pool.
  • Library + worker: ship a library with a built-in worker (Comlink wraps postMessage as RPC).
  • OffscreenCanvas: hand a canvas to a worker for fully off-main-thread rendering. Powerful for visualizations.
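A worker pool along the lines described above might look roughly like the following. This is a minimal sketch, not a production implementation: WorkerPool, createWorker, and the { id, payload } / { id, result, error } message shapes are all hypothetical names chosen for illustration, and the worker script is assumed to echo the id back in its reply.

```javascript
// Minimal round-robin pool. `createWorker` is a hypothetical factory, e.g.
// () => new Worker('/parser.worker.js', { type: 'module' }).
class WorkerPool {
  constructor(createWorker, size) {
    this.pending = new Map(); // task id -> { resolve, reject }
    this.nextId = 0;
    this.cursor = 0; // index of the worker that gets the next task
    this.workers = Array.from({ length: size }, () => {
      const worker = createWorker();
      worker.onmessage = (e) => this.settle(e.data);
      return worker;
    });
  }

  // Assumed reply shape: { id, result } on success, { id, error } on failure.
  settle({ id, result, error }) {
    const task = this.pending.get(id);
    if (!task) return; // unknown or already settled id
    this.pending.delete(id);
    if (error !== undefined) task.reject(new Error(error));
    else task.resolve(result);
  }

  // Sends one task to the next worker in rotation and returns a promise.
  run(payload, transfer = []) {
    const worker = this.workers[this.cursor];
    this.cursor = (this.cursor + 1) % this.workers.length;
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      worker.postMessage({ id, payload }, transfer);
    });
  }

  // Stops all workers and rejects anything still in flight.
  terminate() {
    for (const worker of this.workers) worker.terminate();
    for (const task of this.pending.values()) task.reject(new Error('pool terminated'));
    this.pending.clear();
  }
}
```

Round-robin keeps the sketch short, but it ignores how busy each worker is; a pool that tracks idle workers and queues the rest is a common next step when task durations vary widely.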

Comlink (the ergonomics fix)

Raw postMessage with message-IDs and onmessage handlers is awful. Comlink wraps it as RPC:

js
import * as Comlink from 'comlink';
const api = Comlink.wrap(new Worker('/api.worker.js'));
const result = await api.search(query);

This is how most teams actually ship workers.
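To see what an RPC wrapper saves you from, here is a rough hand-rolled version of the same idea. It is an illustration only and not how Comlink is implemented: wrapWorker and the { id, method, args } / { id, result, error } message shapes are hypothetical, and the worker script is assumed to dispatch on method and echo the id back.

```javascript
// Hand-rolled RPC over postMessage: every call gets an id, and the reply
// with the matching id settles the corresponding promise.
function wrapWorker(worker) {
  let nextId = 0;
  const pending = new Map(); // call id -> { resolve, reject }

  worker.onmessage = (e) => {
    const { id, result, error } = e.data;
    const call = pending.get(id);
    if (!call) return; // reply for an unknown id
    pending.delete(id);
    if (error !== undefined) call.reject(new Error(error));
    else call.resolve(result);
  };

  // Any property access becomes a remote method call, e.g. api.search(q).
  return new Proxy({}, {
    get(_target, method) {
      // Returning undefined for `then` keeps the proxy from looking like a
      // thenable when it is awaited or returned from an async function.
      if (typeof method !== 'string' || method === 'then') return undefined;
      return (...args) => new Promise((resolve, reject) => {
        const id = nextId++;
        pending.set(id, { resolve, reject });
        worker.postMessage({ id, method, args });
      });
    },
  });
}
```

Even this small version needs id bookkeeping, error mapping, and a Proxy guard, and it still lacks transferables, callbacks, and cleanup, which is the argument for reaching for a maintained library.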

Common pitfalls

  • Spinning up a worker per task — startup is ~10-50ms; pool instead.
  • Forgetting that structured clone copies the data in both directions.
  • Trying to share DOM access — there is none.
  • Using a worker for trivial work — round-trip > compute.
  • No fallback when the worker fails to load.

When to reach for one

Profile first. If the flamegraph shows a >50ms task on the main thread that runs on user input, and that task is pure computation, that's the worker candidate. If the main thread is fine and you just want "to use threads," skip it.
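Alongside the browser's profiler, a small timing helper can flag candidates in code you already suspect. The sketch below is one possible shape, assuming the 50ms threshold from the paragraph above; measureSync and isWorkerCandidate are hypothetical names, and a real decision should still rest on a full profile.

```javascript
const LONG_TASK_MS = 50; // threshold taken from the text above

// Times a synchronous function and flags it when it exceeds the threshold.
function measureSync(label, fn) {
  const start = performance.now();
  const value = fn();
  const durationMs = performance.now() - start;
  return { label, value, durationMs, isWorkerCandidate: durationMs > LONG_TASK_MS };
}

// Example: wrap the suspect call inside an input handler.
// `bigBlob` is a placeholder for whatever payload the handler parses.
//   const report = measureSync('parse', () => JSON.parse(bigBlob));
//   if (report.isWorkerCandidate) console.warn(report.label, report.durationMs);
```

A single measurement on a fast development machine can be misleading, so it helps to repeat it with CPU throttling enabled or on a representative low-end device.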

Mental model

Workers are CPU offload, not magic concurrency. They make CPU-bound work invisible to the main thread at the cost of data-marshalling and orchestration complexity. Reach for them when responsiveness is regressing on a measurable, profiled bottleneck — not by default.

Follow-up questions

  • When would you NOT use a Web Worker?
  • How do you choose between Comlink and raw postMessage?
  • What is OffscreenCanvas and when does it matter?
  • How does SharedArrayBuffer change the trade-off?

Common mistakes

  • Spinning up a worker per task instead of pooling.
  • Using workers for I/O-bound work — fetch is already async.
  • Ignoring the structured clone cost — sending huge payloads.
  • Trying to touch the DOM from a worker.
  • No load fallback — page silently broken if the worker 404s.

Performance considerations

  • Worker startup: ~10-50ms.
  • postMessage: ~1ms base, plus ~5ms per MB for structured clone.
  • Transferable objects avoid the copy cost.
  • Break-even is around 50ms of CPU work; below that, the round-trip dominates.
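The figures above can be turned into a back-of-the-envelope estimate. The sketch below simply encodes those rough numbers; the function names are hypothetical, and the constants will vary by device and payload shape, so treat the output as a sanity check rather than a measurement.

```javascript
// Rough figures quoted above; real numbers vary by device and payload shape.
const POST_MESSAGE_BASE_MS = 1;
const CLONE_MS_PER_MB = 5;

// Estimated messaging overhead for one request and one reply, assuming both
// are structured-cloned (pass 0 MB for a side that is transferred instead).
function roundTripOverheadMs(requestMb, responseMb) {
  const request = POST_MESSAGE_BASE_MS + CLONE_MS_PER_MB * requestMb;
  const response = POST_MESSAGE_BASE_MS + CLONE_MS_PER_MB * responseMb;
  return request + response;
}

// True when moving the data is estimated to cost at least as much as the work.
function transferDominates(cpuMs, requestMb, responseMb) {
  return roundTripOverheadMs(requestMb, responseMb) >= cpuMs;
}
```

For example, sending a 10MB payload and getting a tiny reply back is estimated at about 52ms of overhead, so a 20ms task is a poor fit while a 200ms task still comes out ahead.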

Edge cases

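  • Workers (module workers in particular) may be unavailable or fail to start on older browsers such as old iOS Safari; feature-detect before relying on one.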
  • Worker scripts must be same-origin: a cross-origin script URL is rejected, so CDN-hosted worker code needs a same-origin wrapper script.
  • Worker errors surface on the Worker object; attach worker.onerror rather than relying on window.onerror.
  • Service Worker vs Web Worker confusion — different purposes.
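One way to cover the load-failure and error cases above is to fall back to the calling thread when the worker cannot be created or reports an error. The sketch below is a minimal version under stated assumptions: parseWithFallback and createWorker are hypothetical names, and the { kind: 'parse', payload } message shape is borrowed from the basic example earlier.

```javascript
// Parses JSON in a worker when possible and on the calling thread otherwise.
// `createWorker` is a hypothetical factory, e.g.
// () => new Worker('/parser.worker.js', { type: 'module' }).
function parseWithFallback(text, createWorker) {
  return new Promise((resolve, reject) => {
    // Fallback path: slower for the UI, but the page keeps working.
    const fallback = () => {
      try {
        resolve(JSON.parse(text));
      } catch (err) {
        reject(err);
      }
    };

    let worker;
    try {
      // Throws where Worker is undefined or the script URL is rejected.
      worker = createWorker();
    } catch {
      fallback();
      return;
    }

    worker.onmessage = (e) => {
      worker.terminate();
      resolve(e.data);
    };
    // Fires for load failures (such as a 404) and for uncaught errors thrown
    // inside the worker. Re-running locally either succeeds or rejects with
    // the real parse error.
    worker.onerror = (e) => {
      e.preventDefault?.();
      worker.terminate();
      fallback();
    };
    worker.postMessage({ kind: 'parse', payload: text });
  });
}
```

Creating a worker per call keeps the example short; in an application the same fallback logic would normally sit in front of a long-lived worker or pool.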

Real-world examples

  • VS Code Web runs the language server in a worker.
  • Figma offloads rendering math to workers.
  • Google Docs uses workers for spell-check and OT.
  • Stripe Elements uses workers for cryptographic operations.

Senior engineer discussion

Seniors only reach for workers after profiling shows a real bottleneck. They design the boundary carefully (small messages, transferables, batched work), pool workers instead of spawning ad-hoc, and use Comlink to keep call sites readable. They distinguish CPU offload (workers) from I/O concurrency (Promises) from offline (service workers).
