How do you measure and improve Lighthouse scores for a production site?

Measure with Lighthouse (lab) plus field data (CrUX/RUM), and read the categories: Performance (lab metrics including the Core Web Vitals LCP and CLS, plus TBT as the lab proxy for INP), Accessibility, Best Practices, SEO. Improve by fixing the specific opportunities/diagnostics it lists, but optimize the real experience, not the score.


Lighthouse audits a page across Performance, Accessibility, Best Practices, and SEO (versions before Lighthouse 12 also included a PWA category). Knowing how to measure it and improve it, without gaming it, is the skill.

Measuring it

Chrome DevTools → Lighthouse, or the CLI, or PageSpeed Insights (which adds field data from CrUX alongside the lab run).
Lighthouse CI — run it in CI on key pages with score/budget thresholds so regressions fail the build.
Crucial caveat: Lighthouse is a lab test — synthetic, one URL, simulated device/network. Pair it with field data (RUM / CrUX) — real users on real devices. The score is a proxy, not the goal.
Run it consistently — incognito (no extensions), throttled, a few runs (it varies).
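
Because scores vary between runs, one common way to get a stable number is to take the median of several runs rather than trusting a single one. The sketch below assumes you have already collected the score and LCP from each run; the RunResult shape and function names are hypothetical and not part of any Lighthouse API.

```typescript
// Hypothetical shape: one entry per Lighthouse run of the same URL.
interface RunResult {
  performanceScore: number; // 0..1, the scale Lighthouse uses for category scores
  lcpMs: number;            // Largest Contentful Paint in milliseconds
}

// Median is less sensitive than the mean to a single slow outlier run.
function medianOf(values: number[]): number {
  if (values.length === 0) throw new Error("need at least one value");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Collapse several runs into one representative result, metric by metric.
function summarizeRuns(runs: RunResult[]): RunResult {
  return {
    performanceScore: medianOf(runs.map((r) => r.performanceScore)),
    lcpMs: medianOf(runs.map((r) => r.lcpMs)),
  };
}
```

With five runs where one was disturbed by background load, the outlier no longer drags the reported numbers the way it would with an average.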

The Performance score = weighted lab metrics (Core Web Vitals and their proxies)

LCP (Largest Contentful Paint) — largest element painted. Improve: optimize/preload the LCP image, faster TTFB (CDN/SSR/caching), eliminate render-blocking CSS/JS, fonts with font-display: swap.
CLS (Cumulative Layout Shift) — visual stability. Improve: set width/height/aspect-ratio on images and embeds, reserve space for ads/dynamic content, avoid inserting content above existing content, preload fonts.
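TBT (Total Blocking Time), the lab proxy for INP: main-thread blocking. Improve: code-split and reduce JS, defer/async non-critical scripts, break up long tasks, remove unused JS, minimize hydration cost.
FCP (First Contentful Paint) and Speed Index carry the remaining weight; they are not Core Web Vitals, and fixes for LCP and render-blocking resources usually improve them too.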
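
As a rough illustration of the weighting: Lighthouse first converts each raw metric to a 0 to 1 score, then takes a weighted average. The sketch below assumes the Lighthouse 10 weights (FCP 10%, Speed Index 10%, LCP 25%, TBT 30%, CLS 25%); weights change between versions, so check the scoring documentation for the version you run. The type, constant, and function names are illustrative only.

```typescript
type MetricId = "fcp" | "speedIndex" | "lcp" | "tbt" | "cls";

// Assumed weights (Lighthouse 10). They sum to 1.
const LH10_WEIGHTS: Record<MetricId, number> = {
  fcp: 0.10,
  speedIndex: 0.10,
  lcp: 0.25,
  tbt: 0.30,
  cls: 0.25,
};

// Input: per-metric scores already normalized to 0..1 by Lighthouse.
// Output: the 0..100 Performance score as a weighted average.
function weightedPerformanceScore(metricScores: Record<MetricId, number>): number {
  let total = 0;
  for (const id of Object.keys(LH10_WEIGHTS) as MetricId[]) {
    total += LH10_WEIGHTS[id] * metricScores[id];
  }
  return Math.round(total * 100);
}
```

Under these weights LCP, TBT, and CLS account for 80% of the score, which is why most of the improvement work concentrates on those three.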

How to improve — read the report, fix the items

Lighthouse literally lists Opportunities (estimated savings) and Diagnostics. Work the list:

"Eliminate render-blocking resources" → inline critical CSS, defer JS.
"Properly size images" / "Serve images in next-gen formats" → responsive srcset, AVIF/WebP, lazy-load offscreen images (never the LCP image).
"Reduce unused JavaScript" → code-split, tree-shake.
"Avoid an excessive DOM size" → virtualize, simplify markup.
Accessibility section → it runs axe; fix contrast, labels, alt text, ARIA.
SEO section → meta tags, crawlability, valid HTML.
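
To work the list in priority order, one option is to sort the opportunities by estimated savings from the JSON report. The sketch below assumes a report where opportunity audits carry details.type === "opportunity" and details.overallSavingsMs; that matches the report shape in the Lighthouse versions I am aware of, but newer versions restructure these audits, so verify against your own output. AuditLike and rankOpportunities are hypothetical names.

```typescript
// Minimal, assumed subset of one audit entry in a Lighthouse JSON report.
interface AuditLike {
  id: string;
  title: string;
  details?: { type?: string; overallSavingsMs?: number };
}

interface RankedOpportunity {
  id: string;
  title: string;
  savingsMs: number;
}

// Keep only opportunities with a positive estimated saving, largest first.
function rankOpportunities(audits: Record<string, AuditLike>): RankedOpportunity[] {
  return Object.values(audits)
    .filter(
      (a) =>
        a.details?.type === "opportunity" &&
        (a.details?.overallSavingsMs ?? 0) > 0
    )
    .map((a) => ({
      id: a.id,
      title: a.title,
      savingsMs: a.details?.overallSavingsMs ?? 0,
    }))
    .sort((x, y) => y.savingsMs - x.savingsMs);
}
```

The estimates are simulated, so treat the ordering as a starting point and confirm each fix by re-measuring.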

The senior caveat — don't game the score

You can chase 100 in ways that hurt users (lazy-loading the LCP image, deferring all JS so it's janky, testing only the simplest page). The score is a guide to common issues, not the objective. Optimize the real experience: use Lighthouse to find issues, validate fixes against field data and real-device testing, and remember that by default it audits only the initial load of one URL, so a standard run won't catch runtime interaction jank.
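
Validating against field data usually means looking at the 75th percentile of real-user samples, which is how CrUX reports Core Web Vitals, and comparing it with the published thresholds (LCP 2.5 s / 4 s, CLS 0.1 / 0.25, INP 200 ms / 500 ms). The sketch below assumes you already have raw RUM samples per metric and uses a nearest-rank percentile; the names are illustrative, and a real RUM pipeline would also segment by device and page.

```typescript
type Vital = "lcp" | "cls" | "inp";
type Rating = "good" | "needs-improvement" | "poor";

// Published Core Web Vitals thresholds (lcp and inp in ms, cls unitless).
const VITAL_THRESHOLDS: Record<Vital, { good: number; poor: number }> = {
  lcp: { good: 2500, poor: 4000 },
  cls: { good: 0.1, poor: 0.25 },
  inp: { good: 200, poor: 500 },
};

// Nearest-rank 75th percentile: the smallest sample such that at least
// 75% of all samples are less than or equal to it.
function percentile75(samples: number[]): number {
  if (samples.length === 0) throw new Error("need at least one sample");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length);
  return sorted[rank - 1];
}

function rateVital(vital: Vital, samples: number[]): { p75: number; rating: Rating } {
  const p75 = percentile75(samples);
  const { good, poor } = VITAL_THRESHOLDS[vital];
  const rating: Rating =
    p75 <= good ? "good" : p75 <= poor ? "needs-improvement" : "poor";
  return { p75, rating };
}
```

A page can pass the lab run and still rate "needs-improvement" here, which is the gap the field data is meant to expose.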

How to answer

"Measure with Lighthouse for the lab view but always pair it with field data (CrUX/RUM), since Lighthouse is synthetic and single-URL. The Performance score is mostly the Core Web Vitals and their lab proxies: improve LCP (image/TTFB/render-blocking), CLS (dimensions, reserved space), and TBT as the stand-in for INP (less JS, code-split, break up long tasks). Practically, I work the Opportunities and Diagnostics list it gives me. But I optimize the actual user experience, not the number: the score is a guide, and it can be gamed in ways that make the app worse."

Follow-up questions

•Why pair Lighthouse with field data?
•Which metrics does the Performance score weight, and how do you improve each?
•How can you game a Lighthouse score while hurting users?
•How do you prevent score regressions over time?

Common mistakes

•Treating the score as the goal rather than a proxy.
•Optimizing the score in ways that hurt UX (lazy-loading the LCP image).
•Only auditing the simplest page.
•Ignoring field data and runtime interaction performance.
•Not running it consistently (extensions, no throttling) so results are noisy.

Performance considerations

•Lighthouse catches load-time issues; field tools (CrUX/RUM) catch real experience including INP. Lighthouse CI with budgets prevents regressions. The Opportunities list quantifies estimated savings to prioritize work.

Edge cases

•Score varies between runs: take the median of several.
•Authenticated or data-heavy pages, which Lighthouse can't easily audit without extra setup.
•A great lab score but poor field INP.
•Third-party scripts dragging the score.

Real-world examples

•Lighthouse CI failing a PR that regressed bundle size / LCP.
•Fixing 'properly size images' and 'reduce unused JS' opportunities to move LCP and TBT.
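
The idea behind a CI gate can be sketched as a plain budget check: compare measured values with agreed limits and fail the build when any limit is exceeded. Lighthouse CI has its own assertion and budget configuration for this; the code below is not that API, only a minimal illustration with hypothetical metric names and limits.

```typescript
// Hypothetical budget entry: a metric key and the largest acceptable value.
interface PerfBudget {
  metric: string;
  max: number;
}

// Returns one message per violated budget; an empty array means the gate passes.
// A budgeted metric with no measurement is reported too, so gaps are not silent.
function checkBudgets(
  measured: Record<string, number>,
  budgets: PerfBudget[]
): string[] {
  const failures: string[] = [];
  for (const { metric, max } of budgets) {
    const value = measured[metric];
    if (value === undefined) {
      failures.push(`${metric}: no measurement found`);
    } else if (value > max) {
      failures.push(`${metric}: ${value} exceeds budget ${max}`);
    }
  }
  return failures;
}
```

In a CI script, a non-empty result would be logged and followed by a non-zero exit code so the pull request is blocked until the regression is fixed or the budget is deliberately changed.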

Senior engineer discussion

Seniors measure with Lighthouse but insist on field data alongside it, map the Performance score to its underlying metrics (LCP, CLS, TBT) with concrete fixes for each, and work the Opportunities/Diagnostics list methodically. The senior signal is refusing to game the score: Lighthouse is a guide for finding issues; the real experience (validated on real devices and RUM) is the objective.