How would you design an autocomplete like Flipkart's search bar at a high level?

Debounced input → cached, cancellable suggestion requests → ranked results rendered in an accessible combobox. Cover debouncing, request cancellation and race handling, caching, keyboard navigation and ARIA, match highlighting, recent searches, and a backend trie or search service for suggestions.


Search autocomplete is a deceptively deep system-design question. Break it into frontend interaction, network discipline, and the suggestion backend.

Frontend: the input → suggestions loop

Debounce the input (~150–300ms) — don't fire a request per keystroke.
Minimum query length — usually don't search for 1 character.
On a debounced change, fetch suggestions.
Render results in a combobox dropdown.
Keyboard: ArrowUp/Down to move highlight, Enter to select, Escape to dismiss, Tab to close.
Highlight the matched substring in each suggestion.
Loading / empty / error states in the dropdown.
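
The first steps of the loop above can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not a reference implementation: the names `debounce`, `createInputHandler`, `splitMatch`, the 200 ms delay, and the 2-character minimum are all hypothetical choices, and `requestSuggestions` stands in for whatever function actually issues the request.

```typescript
// Illustrative defaults, not values taken from any real product.
const MIN_QUERY_LENGTH = 2;
const DEBOUNCE_MS = 200;

interface Debounced<A extends unknown[]> {
  (...args: A): void;
  cancel(): void;
}

// Delays `fn` until `waitMs` have passed without another call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): Debounced<A> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };
  const debounced = ((...args: A): void => {
    cancel(); // every new keystroke restarts the wait
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  }) as Debounced<A>;
  debounced.cancel = cancel;
  return debounced;
}

// Wraps a (hypothetical) request function with debouncing and a length gate.
function createInputHandler(
  requestSuggestions: (query: string) => void,
  waitMs: number = DEBOUNCE_MS,
): Debounced<[string]> {
  return debounce((raw: string) => {
    const query = raw.trim();
    if (query.length >= MIN_QUERY_LENGTH) requestSuggestions(query);
  }, waitMs);
}

interface Segment {
  text: string;
  match: boolean;
}

// Splits a suggestion around the first case-insensitive occurrence of the
// query so the UI can wrap the matching segment in <mark> or <strong>.
// Assumes lowercasing does not change string length (true for ASCII).
function splitMatch(suggestion: string, query: string): Segment[] {
  const q = query.trim();
  const at = q ? suggestion.toLowerCase().indexOf(q.toLowerCase()) : -1;
  if (at < 0) return [{ text: suggestion, match: false }];
  const segments: Segment[] = [
    { text: suggestion.slice(0, at), match: false },
    { text: suggestion.slice(at, at + q.length), match: true },
    { text: suggestion.slice(at + q.length), match: false },
  ];
  return segments.filter((s) => s.text.length > 0);
}
```

Returning segments rather than an HTML string keeps the highlighting step free of injection concerns, since the rendering layer decides how to mark up each segment.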

Network discipline: the part that separates good answers from average ones

Cancel stale requests / handle races. The user types fast, and responses can arrive out of order. If the response for "ip" arrives after the response for "iphone", it overwrites the newer results and the dropdown shows suggestions for a query the user has already moved past. Fix: use AbortController to cancel the previous request, or tag each response with its query and drop any response that doesn't match the current input.
Cache results per query (an LRU map, or React Query). Re-typing or backspacing should be instant — no refetch.
Dedup in-flight identical requests.
Prefetch popular queries; consider returning suggestions for query prefixes.
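
One way to combine cancellation, query tagging, and a per-query cache is sketched below. It is an illustration under assumptions: `LruCache`, `Suggester`, and the `Fetcher` signature are hypothetical names, the cache capacity of 100 is arbitrary, and the actual network call is injected so the sketch does not presume any particular endpoint. In-flight deduplication and prefetching are left out to keep it short.

```typescript
// The injected function that performs the real request (for example a
// fetch() call that passes `signal` through). Its shape is an assumption.
type Fetcher = (query: string, signal: AbortSignal) => Promise<string[]>;

// Small LRU cache built on Map, which iterates in insertion order.
class LruCache<V> {
  private map = new Map<string, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
  }
}

class Suggester {
  private cache = new LruCache<string[]>(100);
  private controller: AbortController | undefined;
  private latest = "";
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Resolves with suggestions, or null when a newer query superseded this one.
  async suggest(query: string): Promise<string[] | null> {
    this.latest = query;
    const cached = this.cache.get(query);
    if (cached !== undefined) return cached; // backspacing hits this path

    this.controller?.abort(); // cancel the previous, now stale, request
    const controller = new AbortController();
    this.controller = controller;
    try {
      const results = await this.fetcher(query, controller.signal);
      this.cache.set(query, results);
      // Tag check: drop the response if the input has moved on.
      return this.latest === query ? results : null;
    } catch (err) {
      if (controller.signal.aborted) return null; // expected for stale requests
      throw err; // real failures surface so the UI can show an error state
    }
  }
}
```

The abort and the tag check overlap on purpose: aborting saves bandwidth, while the tag check still protects the UI if a response slips through before the abort takes effect.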

Accessibility

Put role="combobox" on the input, with aria-expanded and aria-controls; use role="listbox" on the dropdown and role="option" on each item; point aria-activedescendant at the highlighted option so focus stays in the input; announce result counts through an aria-live region.
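
The attribute wiring can be expressed as a few pure functions, sketched here in TypeScript. All names (`ComboboxState`, `inputAria`, `optionAria`, `nextActive`, `statusMessage`) and the id scheme are hypothetical. The sketch also sets aria-autocomplete="list" and aria-selected, which the list above does not mention but which belong to the WAI-ARIA combobox pattern.

```typescript
interface ComboboxState {
  open: boolean;
  activeIndex: number; // -1 means no option is highlighted
  optionCount: number;
  listboxId: string;
}

// Hypothetical id scheme so aria-activedescendant can reference an option.
const optionId = (listboxId: string, index: number): string =>
  `${listboxId}-opt-${index}`;

// Attributes for the text input.
function inputAria(state: ComboboxState): Record<string, string> {
  const attrs: Record<string, string> = {
    role: "combobox",
    "aria-autocomplete": "list",
    "aria-expanded": String(state.open),
    "aria-controls": state.listboxId,
  };
  if (state.open && state.activeIndex >= 0) {
    // DOM focus stays in the input; this tells assistive tech what is highlighted.
    attrs["aria-activedescendant"] = optionId(state.listboxId, state.activeIndex);
  }
  return attrs;
}

// Attributes for one option inside the role="listbox" container.
function optionAria(state: ComboboxState, index: number): Record<string, string> {
  return {
    id: optionId(state.listboxId, index),
    role: "option",
    "aria-selected": String(index === state.activeIndex),
  };
}

// ArrowDown / ArrowUp movement with wrap-around; other keys leave it unchanged.
function nextActive(current: number, key: string, count: number): number {
  if (count === 0) return -1;
  if (key === "ArrowDown") return (current + 1) % count;
  if (key === "ArrowUp") return current <= 0 ? count - 1 : current - 1;
  return current;
}

// Text for a polite live region that announces the result count.
function statusMessage(count: number): string {
  if (count === 0) return "No suggestions";
  return count === 1 ? "1 suggestion available" : `${count} suggestions available`;
}
```

Keeping these as plain functions makes them usable from any framework: the returned records can be spread onto elements, and the keyboard logic can be unit tested without a DOM.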

UX features

Recent searches (localStorage) shown on focus before typing.
Trending / popular suggestions when the box is empty.
Categories/scopes, product thumbnails, "search for X" affordance.
Click-outside to dismiss; reopen on focus.
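
Recent searches can be kept as a small, de-duplicated, most-recent-first list. The sketch below is one possible shape, with assumptions labeled: the storage key, the cap of 5 entries, and the function names are hypothetical, and storage is passed in through a minimal interface that `window.localStorage` happens to satisfy, so the logic also runs outside a browser.

```typescript
// Minimal slice of the Web Storage API that this sketch depends on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const RECENT_KEY = "recent-searches"; // hypothetical storage key
const MAX_RECENT = 5; // hypothetical cap

// Reads the stored list, tolerating missing or corrupted data.
function loadRecent(store: KeyValueStore): string[] {
  try {
    const parsed: unknown = JSON.parse(store.getItem(RECENT_KEY) ?? "[]");
    return Array.isArray(parsed)
      ? parsed.filter((x): x is string => typeof x === "string")
      : [];
  } catch {
    return [];
  }
}

// Moves the query to the front, removes case-insensitive duplicates, caps the list.
function addRecent(store: KeyValueStore, query: string): string[] {
  const q = query.trim();
  if (!q) return loadRecent(store);
  const lower = q.toLowerCase();
  const rest = loadRecent(store).filter((r) => r.toLowerCase() !== lower);
  const next = [q, ...rest].slice(0, MAX_RECENT);
  try {
    store.setItem(RECENT_KEY, JSON.stringify(next));
  } catch {
    // Storage can be full or disabled; recent searches are best-effort.
  }
  return next;
}
```

In a browser, `addRecent(window.localStorage, query)` would be called when the user submits a search, and `loadRecent(window.localStorage)` when the input gains focus with an empty value.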

Backend (HLD)

A suggestion service backed by a trie / prefix index or a search engine (Elasticsearch/Solr) for prefix matching.
Ranking — popularity, personalization, recency, typo-tolerance (fuzzy matching).
Caching at the edge/CDN for hot prefixes; the suggestion endpoint should be very low latency.
Analytics — log queries to improve ranking.
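
To make the trie option concrete, here is a minimal in-memory sketch with popularity-based ranking. It is illustrative only: the class names and the numeric popularity score are assumptions, and it ranks by walking the whole subtree under the prefix. A production service would more likely precompute the top results per node or delegate to a search engine, and this sketch does not attempt personalization or typo tolerance.

```typescript
class TrieNode {
  children = new Map<string, TrieNode>();
  // Popularity of the phrase ending here; undefined if no phrase ends here.
  score: number | undefined = undefined;
}

class SuggestionTrie {
  private root = new TrieNode();

  // Stores a phrase (lowercased) with its popularity score.
  insert(phrase: string, score: number): void {
    let node = this.root;
    for (const ch of phrase.toLowerCase()) {
      let child = node.children.get(ch);
      if (child === undefined) {
        child = new TrieNode();
        node.children.set(ch, child);
      }
      node = child;
    }
    node.score = score;
  }

  // Returns up to `limit` phrases starting with `prefix`, most popular first.
  suggest(prefix: string, limit: number = 10): string[] {
    const normalized = prefix.toLowerCase();
    let node = this.root;
    for (const ch of normalized) {
      const child = node.children.get(ch);
      if (child === undefined) return []; // no phrase has this prefix
      node = child;
    }
    // Depth-first walk of the subtree, collecting every complete phrase.
    const found: { phrase: string; score: number }[] = [];
    const stack: [TrieNode, string][] = [[node, normalized]];
    while (stack.length > 0) {
      const [current, text] = stack.pop() as [TrieNode, string];
      if (current.score !== undefined) {
        found.push({ phrase: text, score: current.score });
      }
      for (const [ch, child] of current.children) {
        stack.push([child, text + ch]);
      }
    }
    // Higher score first; ties broken alphabetically for stable output.
    found.sort((a, b) => b.score - a.score || (a.phrase < b.phrase ? -1 : 1));
    return found.slice(0, limit).map((f) => f.phrase);
  }
}
```

A short usage example with made-up phrases and scores:

```typescript
const trie = new SuggestionTrie();
trie.insert("iphone 15", 90);
trie.insert("ipad", 80);
trie.insert("iphone case", 70);
trie.suggest("ip", 2); // ["iphone 15", "ipad"]
```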

Performance

Debounce, cancellation, and caching together minimize requests.
Virtualize the dropdown only if it can be very long (usually it's capped at ~10).
Keep payloads tiny — suggestions are just strings + minimal metadata.

The framing

"Frontend: debounced input feeding a cancellable, cached suggestion request, rendered in an accessible combobox with keyboard nav and match highlighting. The critical detail is race handling — cancel stale requests or drop responses that don't match the current query, or fast typing shows wrong results. Add recent/trending searches and proper ARIA. Backend: a low-latency suggestion service over a trie or search engine with popularity-based ranking and edge caching for hot prefixes."

Follow-up questions

•How do you handle out-of-order responses from fast typing?
•Why and how do you cache suggestion results on the client?
•What ARIA roles make an autocomplete accessible?
•What backend data structure powers prefix suggestions?

Common mistakes

•No debounce — a request per keystroke.
•Not cancelling/ignoring stale responses — wrong results from races.
•No caching, so backspacing refetches.
•Missing keyboard navigation and ARIA.
•Not handling loading/empty/error states in the dropdown.

Performance considerations

•Debounce + request cancellation + caching minimize network load. Tiny payloads keep it snappy. The backend suggestion endpoint must be sub-100ms — trie/search-index plus edge caching. Cap dropdown size instead of virtualizing.

Edge cases

•Responses arriving out of order.
•Empty query (show recent/trending) vs no results.
•Very fast typing and rapid backspacing.
•Network failure mid-typing.
•Selecting via keyboard vs mouse.

Real-world examples

•Flipkart/Amazon/Google search bars with prefix suggestions, recent searches, and trending queries.

Senior engineer discussion

Senior engineers structure the answer as interaction + network discipline + backend, and call out race handling (cancel stale / drop mismatched responses) as the make-or-break detail. They cover caching, ARIA combobox semantics, recent/trending UX, and on the HLD side a trie/search-index suggestion service with popularity ranking and edge caching.