How do you handle large data from an API efficiently?
Don't fetch it all — paginate or stream, request only the fields you need, and push filtering/sorting/aggregation to the server. On the client: virtualize rendering, normalize and cache, and offload heavy processing to a Web Worker. Push as much data work to the server as possible.
"Large data from an API" is a problem at three layers — transport, processing, and rendering — and the best fixes push work away from the browser.
1. Don't transport it all — the server should send less
- Pagination — cursor-based (`?after=...&limit=50`), not offset, for stable, fast deep pages. Or infinite scroll.
- Field selection — GraphQL or REST endpoints that return only the fields the UI needs. Don't pull whole fat objects.
- Server-side filtering / sorting / aggregation — let the database do it and return just the slice/summary. This is the biggest lever — the server is built for this.
- Streaming — for genuinely large responses, stream and process chunks as they arrive instead of waiting for the whole payload (streaming fetch, NDJSON; see the sketch after this list).
- Compression (Brotli/gzip) on the response.
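A minimal sketch of the streaming approach, using only the standard `fetch` / `ReadableStream` / `TextDecoder` APIs; the `/api/rows/export` endpoint and the `Row` shape are hypothetical:

```ts
// Consume an NDJSON response incrementally instead of buffering the whole payload.
type Row = { id: string } & Record<string, unknown>;

async function* streamRows(url: string): AsyncGenerator<Row> {
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    throw new Error(`Request failed: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Every complete line is one JSON record; keep the trailing partial line buffered.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (line.trim()) yield JSON.parse(line) as Row;
    }
  }
  if (buffer.trim()) yield JSON.parse(buffer) as Row;
}

// Usage: handle records as they arrive, so memory stays bounded by what you keep.
// for await (const row of streamRows("/api/rows/export")) { handle(row); }
```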
2. Cache and reuse — don't re-fetch
- React Query / SWR — caching, dedup, background refetch, stale-while-revalidate. The same query across components hits the cache (see the sketch after this list).
- HTTP caching (`ETag`, `Cache-Control`) for cacheable endpoints.
- For very large client-side datasets, IndexedDB as a local store.
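A minimal caching sketch with React Query (assuming the v5 object-style API; the `/api/users` cursor endpoint and `UsersPage` shape are hypothetical, and SWR would look similar):

```ts
import { useQuery } from "@tanstack/react-query";

type UsersPage = {
  items: { id: string; name: string }[];
  nextCursor: string | null;
};

// Hypothetical cursor-paginated fetcher for the kind of endpoint from section 1.
async function fetchUsersPage(cursor: string | null): Promise<UsersPage> {
  const params = new URLSearchParams({ limit: "50" });
  if (cursor) params.set("after", cursor);
  const res = await fetch(`/api/users?${params}`);
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  return res.json();
}

export function useUsersPage(cursor: string | null) {
  return useQuery({
    // Components sharing this key share one cache entry: no duplicate requests.
    queryKey: ["users", cursor],
    queryFn: () => fetchUsersPage(cursor),
    staleTime: 30_000, // serve cached data for 30s before a background refetch
  });
}
```

Revisiting a page serves the cached result immediately and refetches in the background once it goes stale.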
3. Process off the main thread
- Heavy client-side transforms (parsing, sorting, aggregating 100k+ rows) → Web Worker, so the UI doesn't freeze (see the sketch after this list).
- Or chunk-and-yield across frames.
- Normalize the data (`{ [id]: item }`) so lookups and updates are O(1).
- Memoize derived computations.
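A minimal sketch of the worker hand-off, assuming a bundler that supports `new URL(..., import.meta.url)` workers; the `Row` shape and message contract are illustrative:

```ts
// worker.ts — the heavy transform runs here, off the main thread.
type Row = { id: string; value: number };

self.onmessage = (event: MessageEvent<Row[]>) => {
  const rows = event.data;
  // Expensive work: sort 100k+ rows without blocking the UI thread.
  const sorted = [...rows].sort((a, b) => b.value - a.value);
  // Normalized { [id]: item } map for O(1) lookups and updates.
  const byId: Record<string, Row> = Object.fromEntries(sorted.map((r) => [r.id, r]));
  self.postMessage({ sorted, byId });
};

// main.ts — hand the dataset to the worker and await the processed result.
const worker = new Worker(new URL("./worker.ts", import.meta.url), { type: "module" });

function processRows(rows: Row[]): Promise<{ sorted: Row[]; byId: Record<string, Row> }> {
  return new Promise((resolve) => {
    worker.onmessage = (event) => resolve(event.data);
    worker.postMessage(rows);
  });
}
```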
4. Render only what's visible
- Virtualization — render the visible window only; DOM stays small no matter the dataset size.
- This applies even after pagination — a "page" of 1,000 rows still shouldn't all hit the DOM (the sketch below shows the windowing math).
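The core of virtualization is just a window calculation. A minimal sketch assuming fixed row heights (in practice a library such as react-window or TanStack Virtual handles this, plus measurement and scroll handling):

```ts
// Compute which rows should exist in the DOM for the current scroll position.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 5 // a few extra rows above/below to avoid blank flashes while scrolling
): { start: number; end: number } {
  const start = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const end = Math.min(totalRows, Math.ceil((scrollTop + viewportHeight) / rowHeight) + overscan);
  return { start, end }; // render only rows[start, end): a few dozen DOM nodes, not 10,000
}

// The scroll container gets a spacer of height totalRows * rowHeight so the scrollbar stays
// correct, and each rendered row is absolutely positioned at index * rowHeight.
```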
5. Choose the right shape for the use case
- User browsing a list → paginate + virtualize.
- User searching/filtering → server-side filter, return matches.
- Need an aggregate/summary → compute server-side, send the summary.
- Bulk export/processing → stream, or do it server-side entirely.
The framing
"Three layers. Transport: don't send it all — cursor pagination, field selection, and especially server-side filtering/sorting/aggregation, plus streaming for huge payloads. Caching: React Query so we don't re-fetch. Processing: normalize, and move heavy transforms to a Web Worker. Rendering: virtualize so the DOM only holds the visible window. The guiding principle — push data work to the server and the network, and only ever hold/render what the user actually needs."
Follow-up questions
- Why is cursor-based pagination better than offset?
- When should data processing happen server-side vs in a Web Worker?
- How does virtualization help even after you've paginated?
- When would you use IndexedDB on the client?
Common mistakes
- Fetching the entire dataset in one request.
- Over-fetching fat objects when only a few fields are needed.
- Doing heavy sort/filter/aggregation on the client main thread.
- Rendering all rows without virtualization.
- Re-fetching the same data repeatedly with no caching.
Performance considerations
- Server-side filtering/aggregation is the highest-leverage fix — the DB is optimized for it. Pagination bounds the payload; virtualization bounds the DOM; Web Workers free the main thread; caching cuts request count. Each layer attacks a different bottleneck.
Edge cases
- Data too large for memory even after pagination.
- Real-time updates layered on top of paginated data.
- Streaming responses that need incremental rendering.
- Offline access requiring a local store.
Real-world examples
- A dashboard: server-side aggregated summaries + cursor-paginated detail tables, virtualized rendering, React Query caching.
- Parsing a large uploaded file in a Web Worker.