Which frontend performance metrics do you track in production?
Core Web Vitals (LCP, INP, CLS) at p75/p95, segmented by device/network/geo/route. Plus TTFB, FCP, total bundle size, JS error rate, long task count, custom marks for product-critical flows (time-to-search-result, time-to-checkout-ready). Collect via web-vitals.js + RUM, ship to analytics (Datadog/Sentry/internal). Alert on regression rate, not absolute thresholds. Pair lab (Lighthouse CI per PR) with field (RUM) for full picture.
Production perf metrics fall into a few buckets. Keep the set you actually track small enough that someone reads it weekly.
Core Web Vitals (the legal minimum)
| Metric | What it is | Good (p75) |
|---|---|---|
| LCP | Largest Contentful Paint — main content render | <2.5s |
| INP | Interaction to Next Paint — interaction responsiveness | <200ms |
| CLS | Cumulative Layout Shift — visual stability | <0.1 |
(INP replaced FID in March 2024.)
Track at p75 and p95. Median hides the slow tail; p95 captures the user with the laggy phone and patchy network.
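To make the gap between the median and the tail concrete, here is a minimal sketch that computes p50, p75, and p95 with the nearest-rank method. The `percentile` helper and the sample values are illustrative assumptions, not real data; a RUM backend would normally compute these for you.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical LCP samples in milliseconds (made up for illustration).
const lcpSamples = [900, 1100, 1200, 1300, 1500, 1800, 2100, 2600, 4200, 6100];

const summary = {
  p50: percentile(lcpSamples, 50),
  p75: percentile(lcpSamples, 75),
  p95: percentile(lcpSamples, 95),
};
console.log(summary); // → { p50: 1500, p75: 2600, p95: 6100 }
```

In this made-up batch the median looks healthy at 1.5s, while p75 already misses the 2.5s "good" bar and p95 is over 6s.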
Other browser-side metrics
- TTFB (time to first byte) — server + CDN latency floor.
- FCP (first contentful paint) — anything visible at all.
- TBT / TTI (lab-only main-thread blocking proxies for interactivity; Lighthouse 10 removed TTI, so prefer TBT).
- Long tasks count (>50ms tasks) — main-thread health.
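Long tasks can be counted in the field with a `PerformanceObserver`. The sketch below is one possible shape: `summarizeLongTasks` and the `observed` buffer are hypothetical names, and the observer is only wired up where the browser reports support for the `longtask` entry type.

```javascript
// A long task is a main-thread task that runs for more than 50 ms.
const LONG_TASK_MS = 50;

// Count long tasks and total the time by which each one exceeds 50 ms.
function summarizeLongTasks(entries) {
  const long = entries.filter((e) => e.duration > LONG_TASK_MS);
  return {
    count: long.length,
    blockingMs: long.reduce((sum, e) => sum + (e.duration - LONG_TASK_MS), 0),
  };
}

// Browser wiring: skipped automatically where 'longtask' is unsupported.
const observed = [];
if (
  typeof PerformanceObserver !== 'undefined' &&
  PerformanceObserver.supportedEntryTypes?.includes('longtask')
) {
  new PerformanceObserver((list) => observed.push(...list.getEntries()))
    .observe({ type: 'longtask', buffered: true });
}

// Example with hand-written entries (illustrative values).
console.log(summarizeLongTasks([{ duration: 30 }, { duration: 80 }, { duration: 120 }]));
// → { count: 2, blockingMs: 100 }
```

Calling `summarizeLongTasks(observed)` just before sending a beacon would give a per-page count to report alongside the other metrics.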
Custom marks for product flows
The metrics that matter are the ones that map to the user's actual job:
- Search box keystroke → results visible.
- Click "Add to cart" → cart count updates.
- Form submit → server confirmation.
- Route navigation → next route TTI.
Use performance.mark + performance.measure, ship the durations via beacon.
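performance.mark('search:start');
// … user typed, results returned, rendered
performance.mark('search:done');
// measure() returns the new entry, so repeated searches never read a stale one
const searchMeasure = performance.measure('search', 'search:start', 'search:done');
// Report the duration; '/rum' stands in for whatever collection endpoint you use
navigator.sendBeacon('/rum', JSON.stringify({ name: 'search.duration', value: searchMeasure.duration }));
Engineering health metrics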
- JS bundle size per route (gzip + brotli).
- Number of requests per page.
- Cache hit ratio at CDN.
- Error rate (JS exceptions + unhandled rejections + fetch failures).
- Deploy frequency vs perf regression rate — are we trading speed for delivery?
Segmentation
Average metrics across all users hide the truth. Segment by:
- Device class (low/mid/high; approximated from navigator.deviceMemory, navigator.hardwareConcurrency, or the Device-Memory client hint).
- Network (effective connection type: 4g, 3g, 2g — from Network Information API).
- Geography (continent or country).
- Route (/checkout vs /home have very different bars).
- First visit vs repeat (cache state changes everything).
- Logged in vs anonymous (auth payloads, personalization cost).
A 2s LCP "average" masks a 5s p95 in India on mid-range Android. Always segment.
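A small sketch of what segmenting looks like once the samples are collected. The `p75BySegment` helper and the sample rows are illustrative assumptions; the `route` and `connection` fields mirror the kind of dimensions listed above.

```javascript
// Nearest-rank p75 of a non-empty list of numbers.
function p75(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil(0.75 * sorted.length) - 1];
}

// Group samples by an arbitrary key and compute p75 per group.
function p75BySegment(samples, keyFn) {
  const groups = new Map();
  for (const s of samples) {
    const key = keyFn(s);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(s.value);
  }
  return Object.fromEntries([...groups].map(([key, values]) => [key, p75(values)]));
}

// Hypothetical LCP samples in milliseconds (made up for illustration).
const samples = [
  { route: '/home', connection: '4g', value: 1200 },
  { route: '/home', connection: '4g', value: 1400 },
  { route: '/home', connection: '4g', value: 1500 },
  { route: '/checkout', connection: '4g', value: 1600 },
  { route: '/checkout', connection: '4g', value: 1800 },
  { route: '/checkout', connection: '4g', value: 2100 },
  { route: '/home', connection: '3g', value: 3900 },
  { route: '/checkout', connection: '3g', value: 5200 },
];

const overall = p75(samples.map((s) => s.value));
const byConnection = p75BySegment(samples, (s) => s.connection);
console.log(overall);      // → 2100
console.log(byConnection); // → { '4g': 1800, '3g': 5200 }
```

The overall p75 of 2.1s looks fine, yet the 3g segment sits at 5.2s. The same helper works for any dimension by swapping the key function, for example `(s) => s.route`.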
How to collect
RUM (field data): web-vitals npm package.
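import { onLCP, onINP, onCLS, onTTFB, onFCP } from 'web-vitals';
// Register one reporter for every metric
onLCP(send); onINP(send); onCLS(send); onTTFB(send); onFCP(send);
function send({ name, value, rating, id, navigationType }) {
  // Attach segmentation fields so the backend can slice by route, device, and network
  navigator.sendBeacon('/rum', JSON.stringify({
    name, value, rating, id, navigationType,
    route: location.pathname,
    deviceMemory: navigator.deviceMemory, // Chromium-only; undefined elsewhere
    connection: navigator.connection?.effectiveType, // Chromium-only; undefined elsewhere
  }));
}
sendBeacon queues the request so it is still delivered when the page unloads, and it does not block the UI thread.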
Lab (synthetic): Lighthouse CI per PR; WebPageTest for deep dives. Useful for catching regressions before deploy but doesn't replace field data.
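For the per-PR lab check, a minimal `lighthouserc.js` sketch for Lighthouse CI could look like the following. The URL, the run count, and the TBT budget are assumptions to adapt; the LCP and CLS budgets simply reuse the "good" thresholds from the table above.

```javascript
// lighthouserc.js: read by `lhci autorun` in CI.
module.exports = {
  ci: {
    collect: {
      // Assumed local preview URL; point this at your PR build.
      url: ['http://localhost:3000/'],
      // Several runs smooth out run-to-run noise.
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        // Fail the build when lab LCP exceeds 2.5 s or CLS exceeds 0.1.
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
        'cumulative-layout-shift': ['error', { maxNumericValue: 0.1 }],
        // Warn only; 300 ms is an illustrative budget, not a standard.
        'total-blocking-time': ['warn', { maxNumericValue: 300 }],
      },
    },
    upload: {
      // Stores reports temporarily so reviewers can open them from the PR.
      target: 'temporary-public-storage',
    },
  },
};
```

Lab numbers come from a fixed emulated device, so these budgets guard against regressions rather than predict field p75.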
Alerting
Alert on rate of change, not absolute thresholds:
- p75 LCP regressed >10% week over week.
- Error rate doubled in the last hour.
- Long-task count increased on /checkout after deploy.
Absolute alerts ("LCP > 3s") fire all the time during traffic spikes or seasonal patterns; rate-of-change alerts catch the real regressions.
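A rate-of-change check can be as small as the sketch below. `checkRegressions`, the metric keys, and the weekly values are hypothetical; the 10% default matches the week-over-week example above.

```javascript
// Relative change from a baseline: 0.15 means 15% worse than before.
function relativeChange(baseline, current) {
  return (current - baseline) / baseline;
}

// Compare this week's p75 values with last week's and return every metric
// that regressed by more than `threshold` (a fraction, 0.10 = 10%).
function checkRegressions(lastWeek, thisWeek, threshold = 0.10) {
  const alerts = [];
  for (const [metric, baseline] of Object.entries(lastWeek)) {
    const current = thisWeek[metric];
    if (current === undefined || baseline <= 0) continue; // nothing to compare
    const change = relativeChange(baseline, current);
    if (change > threshold) alerts.push({ metric, baseline, current, change });
  }
  return alerts;
}

// Hypothetical weekly p75 values in milliseconds.
const alerts = checkRegressions(
  { lcp_p75: 2000, inp_p75: 180 },
  { lcp_p75: 2300, inp_p75: 185 },
);
console.log(alerts.map((a) => a.metric)); // → [ 'lcp_p75' ]
```

Here LCP went from 2.0s to 2.3s, a 15% regression that an absolute "LCP > 3s" alert would never see, while the small INP drift stays quiet.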
Tools
- Datadog RUM, New Relic, Sentry Performance — turnkey RUM with dashboards.
- Google Chrome User Experience Report (CrUX) — public field data from real Chrome users; useful for benchmarking.
- Cloudflare Web Analytics — privacy-friendly RUM.
- Internal: web-vitals + your own analytics pipeline if you have one.
Mental model
Lab tells you what could be fast. Field tells you what is. Always trust field for prioritization. Use lab to catch regressions in CI and to A/B-test fixes.
Follow-up questions
- Why did INP replace FID?
- How do you measure custom flows like 'search → results visible'?
- What's the difference between CrUX, RUM, and Lighthouse?
- How do you avoid alert fatigue with perf metrics?
Common mistakes
- Tracking only Lighthouse scores — synthetic vs real users diverge.
- Looking at averages instead of p75/p95.
- Not segmenting by device/geography — global average looks fine, mobile India is on fire.
- Alerting on absolute thresholds — fires all the time during traffic events.
- Sending RUM via sync XHR — blocks the page; use sendBeacon.
- Ignoring custom marks — Core Web Vitals don't measure your specific user flows.
Performance considerations
- RUM itself has a small perf cost (beacon, listeners). Sample appropriately.
- Don't ship the entire web-vitals library if you're using a vendor SDK that already includes it.
- Keep the metric set tight enough that someone actually reads it weekly: 10 metrics ignored is worse than 3 watched.
Edge cases
- INP is only reported after a real interaction; pages with no clicks, taps, or keypresses produce no INP.
- CLS: layout shifts that occur within 500 ms of user input (clicks, taps, keypresses) are excluded; only unexpected shifts count.
- Background tabs: web-vitals does not report FCP or LCP for pages that loaded in the background, so you don't get inflated numbers.
- PWAs and SPAs: route transitions need explicit instrumentation (web-vitals doesn't fire on client-side navigations by default).
- Sampling: at high traffic, a 1-10% RUM sample is plenty; full collection burns analytics quota.
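The sampling point can be sketched as a per-page-load decision made once, so that either all of a page's metrics are reported or none are. `createSampler` and `SAMPLE_RATE` are hypothetical names, and `/rum` is an assumed endpoint.

```javascript
// Decide once whether this page load reports RUM data. `random` is
// injectable so the decision can be made deterministic in tests.
function createSampler(rate, random = Math.random) {
  const sampled = random() < rate;
  return () => sampled;
}

const SAMPLE_RATE = 0.05; // 5%, inside the 1-10% range; tune to your traffic
const isSampled = createSampler(SAMPLE_RATE);

// Returns true when the metric was eligible for reporting.
function send(metric) {
  if (!isSampled()) return false;
  // Browser-only call, guarded so the sketch also loads outside a browser.
  if (typeof navigator !== 'undefined' && navigator.sendBeacon) {
    // Include the rate so the backend can scale counts back up.
    navigator.sendBeacon('/rum', JSON.stringify({ ...metric, sampleRate: SAMPLE_RATE }));
  }
  return true;
}
```

Percentiles stay representative under uniform sampling, but raw counts such as error totals need to be divided by `sampleRate` on the backend.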
Real-world examples
- Pinterest, Etsy, Shopify all publish their Core Web Vitals numbers and the work to maintain them.
- Vercel Speed Insights ships web-vitals + dashboards out of the box for Next.js apps.
- Google Search uses Core Web Vitals as a ranking signal — your CrUX data becomes a business metric.