
How do you manage feature flags at scale across a frontend organization?

Use a flag service (LaunchDarkly, Unleash, Flagsmith, or in-house) with typed flag definitions and sensible defaults. Support targeting, gradual rollout, and kill switches; evaluate server-side where possible to avoid flicker; and enforce flag lifecycle hygiene so flags don't become permanent tech debt.


Feature flags decouple deploy from release. At scale, the hard parts aren't toggling a boolean — they're consistency, performance, and lifecycle hygiene.

Flag types (they're not all the same)

  • Release flags — short-lived, gate an in-progress feature. Delete after launch.
  • Ops / kill switches — long-lived, instantly disable a feature in an incident.
  • Experiment flags — A/B tests, tied to analytics.
  • Permission / entitlement flags — per-plan or per-tenant; effectively config, long-lived by design.

Treating all four the same is a common mistake — their lifecycles differ.

Architecture

  • A flag service — LaunchDarkly / Unleash / Flagsmith / Statsig, or in-house. It owns flag definitions, targeting rules, and rollout state.
  • SDK in the app — fetches the flag ruleset, evaluates locally (fast), updates via streaming/polling.
  • Typed flag registry — flags defined in code with types and explicit defaults, so a missing/failed flag fetch falls back safely.
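
A typed registry along these lines can be sketched in TypeScript. The flag names (`newCheckout`, `searchEnabled`, `ctaCopy`), their defaults, and the `getFlag` helper are hypothetical and not tied to any vendor SDK. The sketch only aims to show that every read has a compile-time type and a runtime fallback.

```typescript
// Hypothetical flag names and value types, for illustration only.
interface FlagTypes {
  newCheckout: boolean;
  searchEnabled: boolean;
  ctaCopy: string;
}

// Every flag must declare a default, so the compiler rejects a registry with gaps.
const DEFAULTS: FlagTypes = {
  newCheckout: false,  // release flag: off until launched
  searchEnabled: true, // kill switch: on unless someone disables it
  ctaCopy: "control",  // experiment: control variant
};

// Values as received from the flag service: untrusted and possibly missing.
type FetchedFlags = Partial<Record<keyof FlagTypes, unknown>>;

function getFlag<K extends keyof FlagTypes>(
  key: K,
  fetched: FetchedFlags | undefined,
): FlagTypes[K] {
  const fallback = DEFAULTS[key];
  const raw = fetched?.[key];
  // Accept the fetched value only if its runtime type matches the default's type.
  return typeof raw === typeof fallback ? (raw as FlagTypes[K]) : fallback;
}
```

With this shape, `getFlag("newCheckout", undefined)` returns the default when the fetch failed entirely, and a malformed value such as `{ newCheckout: "yes" }` is ignored rather than leaking a string into boolean logic.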

Evaluation: server-side where you can

  • Server-side / SSR evaluation avoids the client-side flicker where the UI renders the "off" state, then snaps to "on" after the flag loads.
  • If evaluating client-side, gate rendering on flag readiness or use sane defaults to avoid flash.
  • Bootstrapping — ship initial flag values with the HTML/SSR payload.
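
Bootstrapping can be sketched as a pair of small functions: one the server calls while rendering HTML, one the client calls before first render. The global name `__FLAGS__` and both function names are assumptions made for this example; vendor SDKs expose their own bootstrap options.

```typescript
type FlagValues = Record<string, boolean | string | number>;

// Server side: embed the evaluated flags in the HTML payload.
// "<" is escaped so a flag value containing "</script>" cannot close the tag early.
function renderBootstrapScript(flags: FlagValues): string {
  const json = JSON.stringify(flags).replace(/</g, "\\u003c");
  return `<script>window.__FLAGS__=${json}</script>`;
}

// Client side: read the embedded values, layering them over the defaults so a
// missing payload still yields a complete set of flags.
function readBootstrappedFlags(
  globalObj: { __FLAGS__?: FlagValues },
  defaults: FlagValues,
): FlagValues {
  return { ...defaults, ...(globalObj.__FLAGS__ ?? {}) };
}
```

In a browser the first argument to `readBootstrappedFlags` would be `window`. Because the values arrive with the document, the first client render can use the same flag state the server used, without waiting on a flag fetch.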

Targeting & rollout

  • Percentage rollouts, ring deployments (internal → beta → GA), targeting by user/tenant/region/plan.
  • Consistent bucketing — hash the user id so a given user always gets the same variant (no flickering between sessions).
  • Kill switch — every risky feature ships behind a flag you can flip off without a deploy.
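
Consistent bucketing for a percentage rollout can be sketched as follows. The hash here is a 32-bit FNV-1a variant applied to UTF-16 code units, chosen only because it is short and deterministic; production SDKs use their own hash functions. Salting the hash with the flag key is an added assumption so that different flags bucket users independently.

```typescript
// 32-bit FNV-1a over UTF-16 code units (a simplification of the byte-wise original).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // 32-bit multiply by the FNV prime
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Map a (flag, user) pair to a stable bucket in the range 0..99.
function bucketFor(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the percentage.
function isInRollout(flagKey: string, userId: string, percentage: number): boolean {
  return bucketFor(flagKey, userId) < percentage;
}
```

Because the bucket depends only on the flag key and user id, the same user gets the same answer in every session and on every server. Raising the percentage only adds users; nobody who already has the feature loses it.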

Performance & reliability

  • Evaluate locally from a cached ruleset — never a network call per flag check.
  • Fail safe — if the flag service is down, fall back to defaults; the app must not break.
  • Bound the ruleset size; lazy-load experiment configs.
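
The first two points can be sketched as an in-memory store that a streaming or polling layer updates in the background. The rule shape (`enabled`, `allowTenants`) and the names `createFlagStore` and `refresh` are hypothetical simplifications; real flag services use richer targeting rules.

```typescript
// Hypothetical, simplified rule shape.
interface Rule {
  enabled: boolean;
  allowTenants?: string[]; // if present, only these tenants get the flag
}
type Ruleset = Record<string, Rule>;

function createFlagStore(defaults: Record<string, boolean>) {
  let cached: Ruleset = {}; // last ruleset received; empty until the first update

  return {
    // Called by the streaming/polling layer whenever a new ruleset arrives.
    update(ruleset: Ruleset): void {
      cached = ruleset;
    },
    // Synchronous, in-memory check: no network call on this path.
    isEnabled(key: string, tenantId: string): boolean {
      const rule = cached[key];
      if (rule === undefined) return defaults[key] ?? false;
      if (!rule.enabled) return false;
      return rule.allowTenants ? rule.allowTenants.includes(tenantId) : true;
    },
  };
}

type FlagStore = ReturnType<typeof createFlagStore>;

// A failed fetch leaves the previous ruleset (or the defaults) in place.
async function refresh(
  store: FlagStore,
  fetchRuleset: () => Promise<Ruleset>,
): Promise<boolean> {
  try {
    store.update(await fetchRuleset());
    return true;
  } catch {
    return false;
  }
}
```

Callers only ever touch `isEnabled`, which cannot throw or block. An outage shows up as a `false` return from `refresh`, which can be logged or alerted on, rather than as a broken page.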

The discipline that actually matters: lifecycle hygiene

Flags rot. Without process you get hundreds of stale flags, dead code paths, and untestable combinatorial state.

  • Every release flag gets an owner and an expiry/cleanup ticket at creation.
  • Audit and remove stale flags regularly; lint for flags older than N days.
  • Test the flag-on and flag-off paths.
  • Don't nest flags into combinatorial explosions.
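
A stale-flag check of the kind described above might look like this sketch, which could run in CI against flag metadata exported from the flag service. The `FlagRecord` fields and the choice to check only release flags are assumptions made for the example.

```typescript
type FlagKind = "release" | "ops" | "experiment" | "entitlement";

// Hypothetical metadata record for one flag.
interface FlagRecord {
  key: string;
  kind: FlagKind;
  owner: string;
  createdAt: string; // ISO 8601 timestamp
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return release flags created more than maxAgeDays before `now`.
// Kill switches and entitlement flags are long-lived by design, so they are skipped.
function findStaleFlags(flags: FlagRecord[], maxAgeDays: number, now: Date): FlagRecord[] {
  const cutoff = now.getTime() - maxAgeDays * MS_PER_DAY;
  return flags.filter(
    (flag) => flag.kind === "release" && Date.parse(flag.createdAt) < cutoff,
  );
}
```

Failing the build, or opening a ticket for each flag's `owner`, when the result is non-empty turns cleanup from a good intention into an enforced step.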

Summary

A flag service + typed registry with safe defaults + server-side evaluation + consistent bucketing gets you correctness and performance. Lifecycle hygiene is what keeps flags from becoming the worst tech debt in the codebase.

Follow-up questions

  • How do you avoid client-side flag flicker (a flash of the wrong variant, similar to FOUC)?
  • What happens if the flag service is unreachable?
  • Why does consistent bucketing matter?
  • How do you stop feature flags from becoming permanent tech debt?

Common mistakes

  • No defaults — a failed flag fetch breaks or flickers the UI.
  • Client-side-only evaluation causing a flash of the wrong variant.
  • Never cleaning up flags — hundreds of stale flags and dead code paths.
  • Treating release flags, kill switches, and entitlements as the same thing.

Performance considerations

  • Flags must evaluate locally from a cached ruleset; a network round-trip per check is unacceptable.
  • Server-side evaluation removes client flicker and an extra request.
  • Keep the ruleset payload bounded; stream updates rather than poll aggressively.

Edge cases

  • Flag service outage — must fail safe to defaults.
  • A user's variant changing between sessions (inconsistent bucketing).
  • Nested/dependent flags creating untested combinations.
  • Flags evaluated differently on server vs client, so the hydrated UI disagrees with the server-rendered HTML.

Real-world examples

  • LaunchDarkly/Unleash gating a gradual rollout with a one-click kill switch.
  • SSR-evaluated flags so logged-in users never see the wrong variant flash.

Senior engineer discussion

Seniors classify flags by lifecycle (release / ops / experiment / entitlement), insist on typed registries with safe defaults and fail-safe behavior, and prefer server-side evaluation to kill flicker. The senior signal is treating lifecycle hygiene — owners, expiry, cleanup, testing both paths — as the central long-term problem, not the toggling mechanism.
