System Design
hard
mid

How would you design infinite scroll for a list with millions of items?

Combine cursor-based pagination (fetch in pages as the user nears the end) with virtualization (render only the visible window). Use IntersectionObserver to trigger loads, cache pages, handle loading/error/end states, and preserve scroll position. Never hold all items in the DOM or in one fetch.

Infinite scroll for millions of items is two independent problems that must be solved together: fetching incrementally (you can't load millions of items in one request) and rendering incrementally (you can't keep millions of nodes in the DOM).

1. Fetching — cursor-based pagination

  • Use cursor (keyset) pagination, not offset: ?after=<cursor>&limit=50. Offset pagination (LIMIT 50 OFFSET 500000) gets slow at depth because the database still has to walk past the skipped rows, and it skips or duplicates items when the list mutates between requests. Cursors stay stable and fast at any depth.
  • Trigger: an IntersectionObserver on a sentinel element near the list's end fires the next-page fetch before the user hits the bottom (preload a page ahead — no visible wait).
  • Page size tuned to balance request count vs payload (~20–50 typically).
  • Cache fetched pages (React Query useInfiniteQuery is built for exactly this): you get dedup, retries, and accumulated pages for free.
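To make the cursor-versus-offset point concrete, here is a minimal in-memory sketch. The Item shape, the fetchPage name, and the numeric id cursor are assumptions for illustration; a real backend would run the equivalent query (roughly WHERE id > :after ORDER BY id LIMIT :limit) against an indexed column.

```typescript
// Hypothetical item shape. Ids are assumed unique and the list sorted ascending by id.
interface Item {
  id: number;
  title: string;
}

interface Page {
  items: Item[];
  nextCursor: number | null; // null means the end of the list was reached
}

// Keyset pagination: return up to `limit` items whose id is strictly greater than `after`.
function fetchPage(source: Item[], after: number | null, limit: number): Page {
  const start = after === null ? 0 : source.findIndex((item) => item.id > after);
  if (start === -1) return { items: [], nextCursor: null };
  const items = source.slice(start, start + limit);
  const hasMore = start + limit < source.length;
  return { items, nextCursor: hasMore ? items[items.length - 1].id : null };
}

// Demo: the list mutates between two page requests.
const data: Item[] = [10, 20, 30, 40, 50].map((id) => ({ id, title: `item ${id}` }));

const first = fetchPage(data, null, 2); // ids 10, 20; nextCursor 20
data.unshift({ id: 5, title: "item 5" }); // a new row appears before the second request

const second = fetchPage(data, first.nextCursor, 2); // ids 30, 40: no duplicate, no gap
const offsetPage2 = data.slice(2, 4); // offset-style page 2 is now ids 20, 30: 20 repeats
```

The cursor anchors the next page to the last item the client actually saw, so rows inserted or deleted elsewhere do not shift the page boundary.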

2. Rendering — virtualization is mandatory

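Even if you have fetched only a few thousand items, appending them all to the DOM still degrades performance: layout and style work, memory use, and scroll smoothness all get worse as the node count grows. So: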

  • Virtualize: render only the visible window plus overscan (@tanstack/react-virtual). The DOM stays at a few dozen rows (whatever fits in the viewport, plus overscan) no matter how far the user scrolls.
  • Combined: fetched items live in memory/cache; the virtualizer renders the visible slice; the observer loads more as you approach the loaded edge.
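The core of virtualization is a small calculation from scroll offset to the slice of rows to render. The sketch below assumes fixed-height rows, which is a simplification; libraries such as @tanstack/react-virtual also handle measured, variable heights. The function and parameter names are illustrative, not any library's API.

```typescript
interface VisibleRange {
  start: number; // first index to render (inclusive)
  end: number; // last index to render (exclusive)
  offsetY: number; // pixels to translate the rendered slice down by
}

// Map the scroll position to the rows that need to exist in the DOM.
// Assumes every row is exactly `rowHeight` pixels tall.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  itemCount: number,
  overscan: number = 5
): VisibleRange {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight); // exclusive
  const start = Math.max(0, firstVisible - overscan);
  const end = Math.min(itemCount, lastVisible + overscan);
  return { start, end, offsetY: start * rowHeight };
}

// One million 40px rows in a 600px viewport, scrolled 400,000px down:
const range = visibleRange(400_000, 600, 40, 1_000_000);
// range.start === 9995, range.end === 10020: 25 rows rendered out of a million
```

The container keeps a spacer of height itemCount * rowHeight so the scrollbar reflects the full list, while only the rows in [start, end) are mounted.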

3. States the user must see

  • Loading — skeletons for the incoming page (not a layout-shifting spinner).
  • Error — a "couldn't load more — retry" row; don't kill the whole list.
  • End of list — an explicit "you've reached the end" so it's not an infinite spinner.
  • Empty — first page returns nothing.
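One way to keep these four states from being forgotten is to derive the footer row from a single function. This is a sketch; the ListStatus fields are assumed names that loosely mirror what a paginated query hook typically exposes.

```typescript
type FooterState = "loading" | "error" | "end" | "empty" | "idle";

// Hypothetical status snapshot for the list.
interface ListStatus {
  loadedCount: number; // items accumulated so far
  isFetching: boolean; // a page request is in flight
  hasError: boolean; // the most recent page request failed
  hasNextPage: boolean; // the server returned a next cursor
}

// Decide what the row after the last loaded item should show.
function footerState(status: ListStatus): FooterState {
  if (status.isFetching) return "loading"; // skeleton rows
  if (status.hasError) return "error"; // retry row; loaded items stay visible
  if (!status.hasNextPage) {
    return status.loadedCount === 0 ? "empty" : "end";
  }
  return "idle"; // nothing to show; the sentinel will trigger the next load
}
```

Because the return type is a closed union, a switch over it in the rendering code can be checked for exhaustiveness by the compiler.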

4. Scroll & UX correctness

  • Preserve scroll position on navigate-away-and-back (cache the pages + scroll offset, or use the router's scroll restoration).
  • Don't jump — appended content must not shift what the user is looking at.
  • New items at the top (feeds) — prepend carefully; consider a "new items" pill rather than auto-jumping.
  • Memory: for truly enormous sessions, evict far-offscreen pages from the cache (a windowed cache) and refetch them if the user scrolls back. This effectively makes the scroll bidirectional, loading in both directions.
  • Updating/removing an item must patch by id, not refetch everything.
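Patching by id can be done directly on the cached pages. The sketch below assumes the cache is an array of pages and uses a hypothetical Post shape; with React Query the same functions could be applied to the cached pages array, though that wiring is not shown here.

```typescript
// Hypothetical item shape for illustration.
interface Post {
  id: string;
  text: string;
  likes: number;
}

type Pages = Post[][];

// Return new pages with one item patched. Pages that do not contain the id
// are returned as the same array, so memoized rows in them need not re-render.
function patchById(pages: Pages, id: string, patch: Partial<Post>): Pages {
  return pages.map((page) =>
    page.some((post) => post.id === id)
      ? page.map((post) => (post.id === id ? { ...post, ...patch } : post))
      : page
  );
}

// Return new pages with one item removed, leaving other pages untouched.
function removeById(pages: Pages, id: string): Pages {
  return pages.map((page) =>
    page.some((post) => post.id === id) ? page.filter((post) => post.id !== id) : page
  );
}

const pages: Pages = [
  [{ id: "a", text: "first", likes: 0 }, { id: "b", text: "second", likes: 0 }],
  [{ id: "c", text: "third", likes: 0 }, { id: "d", text: "fourth", likes: 0 }],
];

const liked = patchById(pages, "c", { likes: 5 }); // only the second page is rebuilt
const trimmed = removeById(pages, "b"); // only the first page is rebuilt
```

Neither function mutates its input, which keeps the update cheap to reason about and avoids a full refetch that would reset scroll position.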

5. Accessibility & SEO

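  • Infinite scroll hides not-yet-loaded content from Ctrl+F, screen readers, and crawlers, and virtualization additionally removes loaded but off-screen rows from the DOM. Provide a fallback: a "Load more" button, real pagination links, or another route that keeps key content reachable. Manage focus when new content loads.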

Architecture summary

IntersectionObserver(sentinel) ──fires──▶ fetch next page (cursor)
        │                                      │
        ▼                                      ▼
   virtualizer renders visible window ◀── accumulated pages cache (useInfiniteQuery)
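The fetch side of the diagram needs a guard, because an observer can fire several times while one request is still pending. The sketch below is a framework-free approximation of what useInfiniteQuery provides; createLoader, FetchPage, and the string cursor are assumed names, not a library API.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means no more pages
}

type FetchPage<T> = (cursor: string | null) => Promise<Page<T>>;

// Accumulates pages and guarantees at most one request in flight.
function createLoader<T>(fetchPage: FetchPage<T>) {
  const items: T[] = [];
  let cursor: string | null = null;
  let done = false;
  let inFlight: Promise<void> | null = null;

  async function run(): Promise<void> {
    const page = await fetchPage(cursor);
    items.push(...page.items);
    cursor = page.nextCursor;
    done = page.nextCursor === null;
  }

  return {
    items,
    isDone: () => done,
    // Safe to call on every observer callback: repeated calls share one request.
    loadNext(): Promise<void> {
      if (done) return Promise.resolve();
      if (!inFlight) {
        const clear = () => {
          inFlight = null;
        };
        inFlight = run();
        inFlight.then(clear, clear); // on failure the cursor is unchanged, so a retry refetches the same page
      }
      return inFlight;
    },
  };
}

// Browser wiring (not executed here; `sentinel` and the 800px margin are assumptions):
//   new IntersectionObserver(
//     ([entry]) => { if (entry.isIntersecting) void loader.loadNext(); },
//     { rootMargin: "800px" }
//   ).observe(sentinel);
```

A generous rootMargin is one way to get the "page ahead" behavior: the sentinel counts as intersecting well before it is actually visible.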

The framing

"Two problems. Fetching: cursor-based pagination triggered by an IntersectionObserver a page ahead, with fetched pages cached. Rendering: virtualization so the DOM only ever holds the visible window — non-negotiable at this scale. Plus proper loading/error/end states, scroll restoration, id-keyed updates, and an accessibility fallback. For extreme sessions, evict far-offscreen pages from the cache. The rule: never hold millions of items in one fetch or in the DOM."

Follow-up questions

  • Why cursor-based pagination instead of offset?
  • Why do you need virtualization even with paginated fetching?
  • How do you preserve scroll position when navigating back to the list?
  • What accessibility problems does infinite scroll create?

Common mistakes

  • Offset pagination that's slow at depth and skips/dupes items as the list mutates.
  • Appending all fetched items to the DOM without virtualization.
  • No end-of-list state — an eternal spinner.
  • Losing scroll position on back-navigation.
  • Ignoring accessibility/SEO — content unreachable by Ctrl+F or crawlers.

Performance considerations

  • Cursor pagination keeps queries fast at any depth. Virtualization caps DOM size. Caching pages avoids refetching. For unbounded sessions, a windowed cache that evicts far-offscreen pages bounds memory. Preloading a page ahead hides fetch latency.
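The windowed cache can be sketched as a map of page index to items that is pruned around the page currently in view. The names and the radius parameter are assumptions; the sketch also assumes the starting cursor of each page is kept elsewhere so an evicted page can be refetched when the user scrolls back.

```typescript
// Which page a given item index falls in, for a fixed page size.
function pageOf(itemIndex: number, pageSize: number): number {
  return Math.floor(itemIndex / pageSize);
}

// Drop cached pages more than `radius` pages away from the current one.
// Returns the evicted page indexes so the caller can mark them as "needs refetch".
function evictFarPages<T>(
  cache: Map<number, T[]>,
  currentPage: number,
  radius: number
): number[] {
  const evicted: number[] = [];
  cache.forEach((_items, pageIndex) => {
    if (Math.abs(pageIndex - currentPage) > radius) {
      cache.delete(pageIndex); // deleting during Map iteration is safe in JavaScript
      evicted.push(pageIndex);
    }
  });
  return evicted;
}

// Ten cached pages of 50 items; the user is looking at item 370 (page 7).
const cache = new Map<number, string[]>();
for (let page = 0; page < 10; page++) {
  cache.set(page, Array.from({ length: 50 }, (_, i) => `item ${page * 50 + i}`));
}

const current = pageOf(370, 50); // 7
const evictedPages = evictFarPages(cache, current, 2); // pages 0 through 4 are dropped
```

Memory is then bounded by (2 * radius + 1) pages regardless of session length, at the cost of a refetch when the user scrolls back past the window.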

Edge cases

  • List mutates (insert/delete) between page fetches.
  • User scrolls back up far — evicted pages need refetch.
  • New items arriving at the top of a feed.
  • First page empty; a page fetch failing mid-scroll.
  • Very long sessions growing memory unbounded.

Real-world examples

  • Social feeds (Twitter/X, Instagram), search results, large activity logs.
  • React Query useInfiniteQuery + @tanstack/react-virtual + an IntersectionObserver sentinel.

Senior engineer discussion

Seniors insist on solving fetching and rendering together — cursor pagination + virtualization — and explain why each alone is insufficient. They cover cache eviction for unbounded sessions, scroll restoration, id-keyed updates, the IntersectionObserver-a-page-ahead trigger, and proactively raise the accessibility/SEO downside with a concrete fallback.
