How do you measure performance in real world projects?
Lab + field together. Lab: Lighthouse CI in PRs (catches regressions before deploy), WebPageTest for deep dives. Field: web-vitals data collected from real users and sent to analytics via sendBeacon; segment by device/network/geography/route. Watch p75 and p95, not averages. Add custom marks for product-critical flows. Tie metrics to business KPIs (conversion, bounce) so perf work gets prioritized. Alert on rate of change.
Real-world measurement combines synthetic tests for catching regressions before deploy with field data for prioritizing investments based on actual users.
Lab measurement (synthetic)
Lighthouse CI runs Lighthouse on every PR:
- name: Lighthouse CI
  uses: treosh/lighthouse-ci-action@v10
  with:
    urls: |
      https://staging.example.com/
      https://staging.example.com/checkout
    budgetPath: ./lighthouse-budget.json
    uploadArtifacts: true

lighthouse-budget.json:

[
  {
    "path": "/*",
    "resourceSizes": [
      { "resourceType": "script", "budget": 200 },
      { "resourceType": "image", "budget": 300 }
    ],
    "resourceCounts": [{ "resourceType": "third-party", "budget": 10 }],
    "timings": [{ "metric": "interactive", "budget": 3000 }]
  }
]

Fails the PR if a budget is exceeded. Pin Lighthouse to a specific version so scores are comparable across runs.
WebPageTest: deeper analysis — filmstrip, waterfall, request blocking. Great for one-off investigations; too slow for CI.
Field measurement (RUM)
web-vitals package, shipped to your analytics:
import { onCLS, onINP, onLCP, onFCP, onTTFB } from 'web-vitals';

// Serialize one metric together with the dimensions used for segmentation.
function send(metric) {
  const body = JSON.stringify({
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    id: metric.id,
    route: location.pathname,
    // deviceMemory and connection are not exposed by every browser; expect undefined there.
    deviceMemory: navigator.deviceMemory,
    hardwareConcurrency: navigator.hardwareConcurrency,
    connection: navigator.connection?.effectiveType,
    saveData: navigator.connection?.saveData,
  });
  // sendBeacon returns false when the browser could not queue the request.
  navigator.sendBeacon('/api/rum', body) ||
    fetch('/api/rum', { body, method: 'POST', keepalive: true });
}

onCLS(send); onINP(send); onLCP(send); onFCP(send); onTTFB(send);

sendBeacon queues the request so it survives page unload; if it returns false, the code falls back to a keepalive fetch.
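One variation on the snippet above is to queue metrics and send them in a single request when the page is hidden. The sketch below is illustrative only: `createMetricQueue` is a hypothetical helper, not part of web-vitals, and the transport is passed in so the logic can run outside a browser. In a page you would pass something like `body => navigator.sendBeacon('/api/rum', body)` and call `flush()` from a `visibilitychange` listener when `document.visibilityState` is `'hidden'`.

```javascript
// Hypothetical helper: collects metrics and sends them as one batch.
// `transport` is any function that accepts a JSON string.
function createMetricQueue(transport) {
  // Keyed by metric.id so a later report of the same metric replaces the earlier one.
  const pending = new Map();

  return {
    add(metric) {
      pending.set(metric.id, { name: metric.name, value: metric.value, id: metric.id });
    },
    // Sends everything queued so far; returns false when there was nothing to send.
    flush() {
      if (pending.size === 0) return false;
      transport(JSON.stringify([...pending.values()]));
      pending.clear();
      return true;
    },
  };
}

// Usage sketch with a stand-in transport that records what was sent.
const sent = [];
const queue = createMetricQueue(body => sent.push(body));
queue.add({ name: 'LCP', value: 1800, id: 'v4-1' });
queue.add({ name: 'CLS', value: 0.02, id: 'v4-2' });
queue.add({ name: 'CLS', value: 0.05, id: 'v4-2' }); // same metric reported again with a newer value
queue.flush();
```

Keying by `metric.id` keeps only the latest value per metric instance, which matters for metrics such as CLS that can report more than once during a page's lifetime.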
Custom marks for product flows
Web Vitals are generic. Add marks for user-visible flows:
performance.mark('checkout:start');
// user fills form, submits, gets confirmation
performance.mark('checkout:done');
performance.measure('checkout', 'checkout:start', 'checkout:done');
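// Filter by entry type so only the measure named 'checkout' is read.
const dur = performance.getEntriesByName('checkout', 'measure').pop().duration;
// `beacon` is not defined in this snippet; assume a reporting helper similar to send() above.
beacon('flow.checkout', dur);

Segmentation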
Aggregated metrics across all users hide real issues. Always segment by:
- Device class (low/mid/high via deviceMemory, hardwareConcurrency).
- Network (4g/3g/2g via effectiveType).
- Country / region.
- Route (/checkout vs /home have different bars).
- First visit vs repeat (cache state matters).
- Logged in vs anonymous.
A "global" p75 LCP of 2s can hide a 5s p95 LCP for users on slow phones in slow markets.
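The dimensions above can be folded into a segment key on the server before aggregating. The following is a minimal sketch: `deviceClass`, `segmentKey`, and `groupBySegment` are hypothetical names, the sample shape mirrors the RUM payload fields (`route`, `deviceMemory`, `hardwareConcurrency`, `connection`, `value`), and the low/mid/high cut-offs are illustrative assumptions rather than any standard.

```javascript
// Bucket a sample into a device class. Thresholds are assumptions for illustration.
function deviceClass({ deviceMemory, hardwareConcurrency }) {
  // Both hints are absent in browsers that do not expose them.
  if (deviceMemory == null && hardwareConcurrency == null) return 'unknown';
  const mem = deviceMemory ?? Infinity;
  const cores = hardwareConcurrency ?? Infinity;
  if (mem <= 2 || cores <= 2) return 'low';
  if (mem <= 4 || cores <= 4) return 'mid';
  return 'high';
}

// Build a key such as "/checkout|low|3g".
function segmentKey(sample) {
  return [sample.route, deviceClass(sample), sample.connection ?? 'unknown'].join('|');
}

// Group metric values by segment so each segment can be summarized separately.
function groupBySegment(samples) {
  const groups = new Map();
  for (const sample of samples) {
    const key = segmentKey(sample);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(sample.value);
  }
  return groups;
}

const groups = groupBySegment([
  { route: '/checkout', deviceMemory: 2, hardwareConcurrency: 8, connection: '3g', value: 5200 },
  { route: '/checkout', deviceMemory: 8, hardwareConcurrency: 8, connection: '4g', value: 1400 },
  { route: '/checkout', deviceMemory: 1, hardwareConcurrency: 4, connection: '3g', value: 6100 },
]);
```

Treating missing hints as `'unknown'` rather than guessing keeps browsers that hide these values from skewing the low or high buckets.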
Percentiles
p50/median hides the tail. p75 is the percentile Google uses to assess Core Web Vitals: a page passes when its p75 falls within the "good" threshold. p95 catches the worst-affected users. Track both.
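To make the difference concrete, here is a small sketch that computes percentiles with the nearest-rank method. The `percentile` helper and the sample LCP values are made up for illustration; analytics backends often use interpolation or approximate sketches instead, so exact numbers can differ slightly.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Ten hypothetical LCP samples in milliseconds, with a slow tail.
const lcpMs = [900, 1100, 1200, 1300, 1500, 1800, 2100, 2600, 4200, 7800];

const summary = {
  p50: percentile(lcpMs, 50),
  p75: percentile(lcpMs, 75),
  p95: percentile(lcpMs, 95),
};
// summary is { p50: 1500, p75: 2600, p95: 7800 }
```

The median looks healthy at 1500 ms, while p75 already exceeds 2.5 s and p95 shows the users who wait almost 8 s.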
Alerting
Alert on rate of change, not on absolute thresholds:
- "p75 LCP for /checkout regressed >10% week-over-week" ← real signal.
- "LCP > 3s" ← fires constantly during traffic events, useless.
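A rate-of-change rule like the first one can be sketched as a comparison of this week's p75 against last week's. The function name `weekOverWeekAlert`, the 10% default, and the `minSamples` guard are assumptions for illustration; a real alerting system would express the same idea in its own query language.

```javascript
// Compare the current p75 to the previous period and flag relative regressions.
// Skips the check when there is no usable baseline or too little data.
function weekOverWeekAlert(previousP75, currentP75, options = {}) {
  const { threshold = 0.1, sampleCount = Infinity, minSamples = 0 } = options;
  if (!(previousP75 > 0) || sampleCount < minSamples) {
    return { alert: false, change: null };
  }
  const change = (currentP75 - previousP75) / previousP75;
  return { alert: change > threshold, change };
}

// p75 LCP for /checkout went from 2000 ms to 2300 ms: a 15% regression.
const regression = weekOverWeekAlert(2000, 2300);

// A 5% drift stays below the threshold and does not alert.
const drift = weekOverWeekAlert(2000, 2100);

// Too few samples this week: do not alert on noise.
const lowTraffic = weekOverWeekAlert(2000, 2300, { sampleCount: 50, minSamples: 200 });
```

The minimum-sample guard is one way to keep low-traffic segments from paging anyone over a handful of slow sessions.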
CrUX — free real-user data for any URL
PageSpeed Insights → field data section shows CrUX p75 LCP/INP/CLS for any public URL that has enough Chrome traffic to appear in the dataset. Useful for:
- Benchmarking against competitors.
- Tracking your own progress over months.
- Confirming RUM matches independent measurement.
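The same data is available programmatically through the CrUX API, which makes the RUM cross-check scriptable. The sketch below reflects my understanding of the API's documented request and response shape (a POST to `records:queryRecord` with `url`, `formFactor`, and `metrics`; p75 values under `record.metrics.<name>.percentiles.p75`); verify it against the current reference before relying on it. The helper names are hypothetical and you need your own API key.

```javascript
const CRUX_ENDPOINT = 'https://chromeuxreport.googleapis.com/v1/records:queryRecord';

// Build the request body for one URL and form factor.
function buildCruxRequest(url, formFactor = 'PHONE') {
  return {
    url,
    formFactor,
    metrics: ['largest_contentful_paint', 'interaction_to_next_paint', 'cumulative_layout_shift'],
  };
}

// Pull the p75 of every metric out of a CrUX response.
function extractP75(response) {
  const metrics = response?.record?.metrics ?? {};
  const result = {};
  for (const [name, data] of Object.entries(metrics)) {
    // Some p75 values (CLS) arrive as strings, so normalize to numbers.
    result[name] = Number(data.percentiles?.p75);
  }
  return result;
}

// Query the API. `fetchImpl` is injectable so the function can be tested without network access.
async function fetchCruxP75(url, apiKey, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(`${CRUX_ENDPOINT}?key=${encodeURIComponent(apiKey)}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildCruxRequest(url)),
  });
  // URLs without enough traffic have no record, which surfaces as an error status.
  if (!res.ok) throw new Error(`CrUX request failed: ${res.status}`);
  return extractP75(await res.json());
}
```

Comparing the returned p75 values with the same percentiles from your own RUM for the same URL and form factor is a quick sanity check on the pipeline.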
Tools
| Need | Tool |
|---|---|
| PR regression catch | Lighthouse CI |
| Deep one-off investigation | WebPageTest |
| Field data, dashboards | Datadog RUM, New Relic, Sentry Performance |
| Public CrUX | PageSpeed Insights |
| Privacy-friendly RUM | Cloudflare Web Analytics, Plausible |
| Inline measurement | web-vitals + your analytics pipeline |
| Vercel-deployed Next.js apps | Vercel Speed Insights |
Tying to business
Get buy-in by tying metrics to revenue:
- "p75 LCP -800ms → projected +X% conversion (per Akamai/BBC studies)."
- "INP regression coincides with -Y% engagement on /checkout."
- "Mobile India p95 LCP improvement → +Z signups."
Without dollar attribution, perf work loses to feature work.
Process
- Ship RUM. Wait a week for baseline.
- Find the worst (route × device × geography) p75 metric.
- Open DevTools → Performance, reproduce on a throttled CPU.
- Identify the dominant cost (image, JS, third-party, server).
- Fix. Validate metric movement on real users (not just lab).
- Set a CI budget to prevent regression.
Mental model
Lab = leading indicator (catches regressions). Field = trailing indicator (proves real-user impact). Use both. Trust field for prioritization; trust lab for "did this change break anything."
Follow-up questions
- Why use sendBeacon over fetch for RUM?
- How do you avoid alert fatigue on perf metrics?
- What's the difference between Lighthouse and CrUX?
- How would you measure performance of a flow that spans multiple pages?
Common mistakes
- Measuring only in the lab: synthetic perf can pass while real users suffer.
- Looking at averages instead of p75/p95.
- Not segmenting: a global average hides the slow market.
- Alerting on absolute thresholds that fire all the time.
- Forgetting custom marks for product-critical flows.
- Letting RUM data collect dust. Measure to act, not just to measure.
Performance considerations
- RUM itself has a tiny perf cost (beacon, listener). Don't ship redundant analytics SDKs. Sample low-priority data. The biggest cost is misreading the data: chasing the wrong metric wastes weeks of engineering time.
Edge cases
- INP is reported only after a real interaction (click, tap, or key press); pages with no interactions produce no INP entry.
- SPAs: route transitions need explicit instrumentation; web-vitals doesn't fire automatically on client-side navigations.
- Cross-origin iframes have their own perf timing, invisible to the parent page.
- Bfcache (back/forward cache) restores behave differently from full loads; Chrome reports separate metrics for them.
- Sampling: at high traffic, a 1-10% RUM sample is enough; a full sample burns analytics quota.
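For the sampling point, one common approach is to decide per session rather than per event, so a sampled session reports all of its metrics. The sketch below hashes a session ID with FNV-1a (over UTF-16 code units) and compares the result to the sample rate; `shouldSample` and the session ID format are assumptions for illustration, and any stable hash would do.

```javascript
// FNV-1a hash of a string, returned as an unsigned 32-bit integer.
function fnv1a(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0;
}

// Deterministic decision: the same session ID always gets the same answer.
// `rate` is a fraction between 0 and 1, e.g. 0.05 for a 5% sample.
function shouldSample(sessionId, rate) {
  return fnv1a(sessionId) / 0x100000000 < rate;
}

// Usage sketch: only sampled sessions would register the web-vitals callbacks.
const sessionId = 'session-12345'; // hypothetical ID, e.g. generated once and kept in sessionStorage
const sampled = shouldSample(sessionId, 0.05);
```

Record the sample rate alongside each beacon so dashboards can scale counts back up; percentiles need no correction as long as the sample is unbiased.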
Real-world examples
- Pinterest, Etsy, Shopify all publish their RUM-driven perf wins.
- Vercel Speed Insights is built on web-vitals + per-deploy comparison.
- Sentry Performance ties RUM to error context.
- Cloudflare's free web analytics surfaces Core Web Vitals without tracking cookies.