Lighthouse scores tell you what's possible in a clean lab. Real User Monitoring (RUM) tells you what your visitors actually experience — on flaky 4G in a tunnel, on a five-year-old Android, with twelve browser extensions injecting scripts. If you ship to production without RUM, you're flying blind on the only metrics Google ranks you on: field data from the Chrome User Experience Report (CrUX).
This guide walks through a complete, production-ready RUM setup using Google's web-vitals library (v4.x). You'll capture LCP, INP, CLS, FCP, and TTFB from real users, ship them to an analytics endpoint, alert on regressions, and reconcile field data with your Lighthouse lab scores. Every example is current as of 2026 and uses the modern attribution API — which, honestly, is the part that finally makes RUM actionable instead of just decorative.
Why RUM Beats Synthetic Testing for Core Web Vitals
Synthetic tools like Lighthouse, WebPageTest, and PageSpeed Insights run controlled experiments. They throttle the CPU, emulate a mid-range phone (a Moto G Power in current Lighthouse versions) on Slow 4G, and produce a reproducible score. That's perfect for catching regressions in CI — but at the end of the day, it's a single, simulated visitor.
RUM aggregates measurements from every real visit. The differences matter:
- Device diversity — your p75 user might be on a 2019 budget phone with thermal throttling that no synthetic test models accurately.
- Network reality — synthetic profiles don't replicate spotty mobile coverage, captive portals, or corporate proxies that rewrite half your headers.
- Interaction patterns — INP (Interaction to Next Paint) is the only Core Web Vital that depends on what users actually click, type, and tap. Lighthouse can simulate one interaction; RUM captures thousands.
- Geographic spread — TTFB varies by 400ms+ between regions when your CDN misses. Lighthouse runs from a single data center.
- Long-tail bugs — that 1% of users who trigger a 12-second INP because of a third-party A/B testing script? Only RUM finds them.
Google's CrUX dataset is already RUM, technically — but it only includes Chrome users who opted in, aggregates over 28 days, and excludes pages with too little traffic. Your own RUM gives you per-deploy granularity, real-time alerts, and per-route breakdowns. It's the difference between a monthly bank statement and a live balance.
The web-vitals Library: What It Captures
The web-vitals library, maintained by the Chrome team, is the canonical way to measure Core Web Vitals in the browser. As of v4, it exposes:
- onLCP() — Largest Contentful Paint (loading)
- onINP() — Interaction to Next Paint (interactivity; replaced FID in March 2024)
- onCLS() — Cumulative Layout Shift (visual stability)
- onFCP() — First Contentful Paint (perceived load)
- onTTFB() — Time to First Byte (server responsiveness)
Each function returns a metric object with the value, a rating (good / needs-improvement / poor), and — critically — an attribution field that pinpoints the offending element, event, or shifted node. That last part is what makes the v4 API a real step change over hand-rolled PerformanceObserver code. I spent the better part of a Saturday a couple of years back writing my own LoAF observer; the v4 attribution API does it better in three lines.
Step 1: Install and Initialize web-vitals
Install the library. It's tiny — under 2KB gzipped, even with attribution:
npm install web-vitals
Create a single entry point that registers all five metrics. Use the /attribution sub-path to get debug information:
// src/rum.js
import {
  onLCP,
  onINP,
  onCLS,
  onFCP,
  onTTFB,
} from 'web-vitals/attribution';

function sendToAnalytics(metric) {
  const body = {
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    delta: metric.delta,
    id: metric.id,
    navigationType: metric.navigationType,
    url: location.href,
    attribution: metric.attribution,
  };

  // Use sendBeacon when available — survives page unloads.
  const url = '/rum';
  const data = JSON.stringify(body);
  if (navigator.sendBeacon) {
    navigator.sendBeacon(url, data);
  } else {
    fetch(url, { body: data, method: 'POST', keepalive: true });
  }
}
onLCP(sendToAnalytics);
onINP(sendToAnalytics);
onCLS(sendToAnalytics);
onFCP(sendToAnalytics);
onTTFB(sendToAnalytics);
Load rum.js as low priority — it shouldn't compete with your critical resources. Module scripts are deferred by default, so no defer attribute is needed:
<script type="module" src="/rum.js" fetchpriority="low"></script>
Why sendBeacon Matters
Core Web Vitals are finalized when the page is hidden or unloaded — that's the moment LCP is locked in, CLS stops accumulating, and the worst INP is known. A normal fetch() call can be canceled when the user navigates away. navigator.sendBeacon() hands the request to the browser, which queues it and delivers it even after the document is gone. (Yes, even if the user has already opened a new tab and is halfway down a Reddit thread.)
Step 2: Build the Collection Endpoint
You can ship to Google Analytics 4, but a custom endpoint gives you full control and avoids consent banners blocking your data. Here's a minimal Node.js endpoint using Express:
// server/rum.js
import express from 'express';
import { Pool } from 'pg';

const app = express();
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Beacons send Content-Type: text/plain — accept raw body.
app.use('/rum', express.text({ type: '*/*', limit: '32kb' }));

app.post('/rum', async (req, res) => {
  try {
    const m = JSON.parse(req.body);
    await pool.query(
      `INSERT INTO rum_events
         (metric, value, rating, url, navigation_type, attribution, ua, ts)
       VALUES ($1, $2, $3, $4, $5, $6, $7, NOW())`,
      [
        m.name,
        m.value,
        m.rating,
        m.url,
        m.navigationType,
        JSON.stringify(m.attribution),
        req.get('user-agent'),
      ]
    );
    res.status(204).end();
  } catch (err) {
    res.status(400).end();
  }
});

app.listen(3000);
For higher traffic, write to ClickHouse, BigQuery, or pipe into Kafka. RUM volume is roughly 5 events × pageviews — at a million pageviews a day that's 5 million inserts. Postgres handles it fine, but starts to feel cramped once you're trying to run percentile queries over 90 days of data.
Schema That Matters
Index on (metric, ts) for time-series queries and on url (or a route hash) for per-page breakdowns. Store attribution as JSONB so you can query into the LCP element selector without reshaping the table later. Future-you will thank present-you.
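Here's one possible shape for that table, assuming Postgres. Column names match the endpoint in Step 2; route is a hypothetical normalized-path column you'd populate at insert time:
-- Sketch of a rum_events table. Names match the Step 2 endpoint,
-- except `route`, a hypothetical normalized-path column.
CREATE TABLE rum_events (
  id              bigserial PRIMARY KEY,
  metric          text NOT NULL,              -- 'LCP', 'INP', ...
  value           double precision NOT NULL,  -- ms, or unitless for CLS
  rating          text,                       -- 'good' / 'needs-improvement' / 'poor'
  url             text,
  route           text,                       -- e.g. '/product/:id'
  navigation_type text,
  attribution     jsonb,                      -- queryable later, no reshaping
  ua              text,
  ts              timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX rum_events_metric_ts_idx ON rum_events (metric, ts);
CREATE INDEX rum_events_route_idx ON rum_events (route);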
Step 3: Use Attribution Data to Find the Real Culprits
This is where RUM stops being a vanity dashboard and starts being a debugging tool. Here's what each metric exposes in v4:
LCP Attribution
// metric.attribution for LCP
{
  element: '#hero-image',         // CSS selector
  url: 'https://cdn.example/hero.avif',
  timeToFirstByte: 280,
  resourceLoadDelay: 120,         // delay before request started
  resourceLoadDuration: 540,      // request itself
  elementRenderDelay: 40,         // time from response to paint
  lcpEntry: { /* PerformanceEntry */ },
}
Sum the four sub-parts and you get LCP. If resourceLoadDelay dominates, your hero image isn't being discovered early enough — add fetchpriority="high" or a <link rel="preload">. If elementRenderDelay is huge, you're blocked by render-blocking CSS, or the image is sitting below a hydration boundary.
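If resourceLoadDelay is the bottleneck, the fix is usually one line in the head (the URL here is the illustrative one from above):
<!-- Lets the preload scanner discover the hero image immediately -->
<link rel="preload" as="image" href="https://cdn.example/hero.avif" fetchpriority="high">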
INP Attribution
// metric.attribution for INP
{
  interactionTarget: 'button.add-to-cart',
  interactionType: 'pointer',
  inputDelay: 12,
  processingDuration: 184,    // your event handler
  presentationDelay: 96,      // browser paint after handler
  longAnimationFrameEntries: [/* LoAF entries */],
}
The longAnimationFrameEntries field is the killer feature here. Each LoAF entry includes script attribution showing which JS file and function blocked the main thread. You can finally answer the eternal question — "which third party broke INP this week?" — without firing up DevTools on a phone you don't own.
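Here's a sketch of how you might rank those scripts client-side, assuming the Step 1 setup. The sourceURL, invoker, and duration fields are standard on LoAF script entries; the helper itself is illustrative:
// Sketch: rank the scripts that blocked the interaction, using the
// `scripts` array on each Long Animation Frame entry.
function blockingScripts(attribution) {
  return (attribution.longAnimationFrameEntries ?? [])
    .flatMap((loaf) => loaf.scripts ?? [])
    .sort((a, b) => b.duration - a.duration)
    .map((s) => `${s.sourceURL || s.invoker}: ${Math.round(s.duration)}ms`);
}

// e.g. inside sendToAnalytics, for INP metrics:
//   blockingScripts(metric.attribution)
//   -> ['https://thirdparty.example/ab.js: 142ms', ...] (hypothetical output)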
CLS Attribution
{
  largestShiftTarget: '.ad-slot-leaderboard',
  largestShiftTime: 1840,
  largestShiftValue: 0.18,
  largestShiftSource: { node: /* Element */, previousRect, currentRect },
  loadState: 'dom-interactive',
}
Group by largestShiftTarget in your dashboard and the worst layout shifters bubble up immediately. Nine times out of ten it's an ad slot, a late-loading web font (you forgot size-adjust, didn't you?), or a banner being hydrated above the fold.
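That grouping is where the JSONB column from Step 2 pays off. A sketch of the query, assuming the rum_events schema above:
-- Worst layout shifters over the last 7 days
SELECT
  attribution->>'largestShiftTarget' AS target,
  count(*) AS hits,
  percentile_cont(0.75) WITHIN GROUP (ORDER BY value) AS p75_cls
FROM rum_events
WHERE metric = 'CLS'
  AND ts > now() - interval '7 days'
GROUP BY 1
ORDER BY p75_cls DESC
LIMIT 20;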
Step 4: Sample Smartly to Control Costs
At scale, RUM volume gets expensive fast. A 100% sample is rarely necessary — Core Web Vitals are statistical and converge quickly. Sample at the session level, not the event level, so that all metrics from a given session are either captured together or dropped together:
// 25% session sampling
const sessionId =
  sessionStorage.getItem('rumSession') ||
  (() => {
    const id = crypto.randomUUID();
    sessionStorage.setItem('rumSession', id);
    return id;
  })();

const SAMPLE_RATE = 0.25;

// Hash the session id deterministically into [0, 1).
const bucket = parseInt(sessionId.slice(0, 8), 16) / 0xffffffff;
const enabled = bucket < SAMPLE_RATE;

if (enabled) {
  onLCP(sendToAnalytics);
  onINP(sendToAnalytics);
  onCLS(sendToAnalytics);
  onFCP(sendToAnalytics);
  onTTFB(sendToAnalytics);
}
One trick worth stealing: sample 100% on slow pages. The long tail is where the bugs hide. Detect slowness with the Network Information API and override the sample:
const slowConnection =
  navigator.connection?.effectiveType &&
  ['slow-2g', '2g', '3g'].includes(navigator.connection.effectiveType);

// navigator.connection is Chromium-only; the optional chain keeps this
// safe in Safari and Firefox.
if (enabled || slowConnection) {
  onLCP(sendToAnalytics);
  onINP(sendToAnalytics);
  onCLS(sendToAnalytics);
  onFCP(sendToAnalytics);
  onTTFB(sendToAnalytics);
}
Step 5: Compute p75 and Alert on Regressions
Google ranks pages by the 75th percentile of each Core Web Vital across a 28-day window. Mirror that in your dashboard. Here's a SQL query for daily p75 LCP per route:
SELECT
  date_trunc('day', ts) AS day,
  route,
  percentile_cont(0.75) WITHIN GROUP (ORDER BY value) AS p75_lcp,
  count(*) AS samples
FROM rum_events
WHERE metric = 'LCP'
  AND ts > now() - interval '14 days'
GROUP BY 1, 2
HAVING count(*) >= 100
ORDER BY day DESC, p75_lcp DESC;
The HAVING count(*) >= 100 guard prevents low-traffic routes from spiking the dashboard with noisy percentiles computed over a handful of samples. Without it, every obscure /admin/legacy/foo route will look like an emergency.
For alerting, compute a 7-day rolling p75 and trigger when it crosses the "good" threshold (LCP > 2.5s, INP > 200ms, CLS > 0.1) or jumps more than 15% week-over-week. Wire that to PagerDuty, Slack, or whatever your team actually watches.
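Here's a sketch of that check in SQL, assuming the rum_events schema above and millisecond values as web-vitals reports them (CLS is unitless):
-- Flag metrics whose 7-day p75 breaches "good" or jumped >15% week-over-week
WITH now7 AS (
  SELECT metric,
         percentile_cont(0.75) WITHIN GROUP (ORDER BY value) AS p75
  FROM rum_events
  WHERE ts > now() - interval '7 days'
  GROUP BY metric
),
prev7 AS (
  SELECT metric,
         percentile_cont(0.75) WITHIN GROUP (ORDER BY value) AS p75
  FROM rum_events
  WHERE ts BETWEEN now() - interval '14 days' AND now() - interval '7 days'
  GROUP BY metric
)
SELECT n.metric, n.p75 AS p75_now, p.p75 AS p75_prev
FROM now7 n
JOIN prev7 p USING (metric)
WHERE n.p75 > p.p75 * 1.15                 -- >15% week-over-week jump
   OR (n.metric = 'LCP' AND n.p75 > 2500)  -- "good" threshold, ms
   OR (n.metric = 'INP' AND n.p75 > 200)   -- ms
   OR (n.metric = 'CLS' AND n.p75 > 0.1);  -- unitless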
Step 6: Reconcile Lab vs. Field
You'll often see Lighthouse reporting LCP of 1.2s while RUM p75 sits at 3.4s. That gap isn't a bug — it's the synthetic-vs-real divergence, and it's totally normal. Common causes, in rough order of frequency:
- Cache state. Lighthouse runs cold; real users return.
- Geography. Lighthouse runs from one region; users span continents.
- Device. Lighthouse emulates a single mid-range phone; the field includes flagship phones and $80 Androids.
- Login state. Lighthouse hits the marketing page; real users are inside the app with personalized content.
- Third parties. Some scripts only load for opted-in users (analytics, A/B tests) and don't show up in lab runs.
The fix is to segment your RUM dashboard by these dimensions and find the slice that matches your Lighthouse config. If they still disagree, your Lighthouse profile is unrealistic — adjust it. (I've seen teams spend weeks chasing a "regression" that turned out to be a single mis-configured Lighthouse run on a stale CI image. Don't be those teams.)
Step 7: Privacy and Compliance
RUM data is performance telemetry, not behavioral tracking — it's measurements about the page, not the person. Most regulators treat it as legitimate interest under GDPR, but consult your legal team. To stay clearly on the right side:
- Don't capture IP addresses; truncate at the edge if your load balancer adds them.
- Don't store user-agent strings indefinitely. Hash to a device-class bucket and discard.
- Don't include URL query parameters that may contain PII (search terms, email tokens). Strip them client-side before sending (see the sketch after this list).
- Document RUM in your privacy policy as performance telemetry.
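For the query-parameter point, a minimal client-side scrub you could drop into sendToAnalytics from Step 1:
// Strip query string and fragment before the beacon leaves the browser.
// Query params can carry search terms, tokens, or email addresses.
function scrubbedUrl() {
  return location.origin + location.pathname;
}

// In sendToAnalytics, replace `url: location.href` with:
//   url: scrubbedUrl(),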
Common RUM Mistakes That Burn Budgets
- Sending one beacon per metric. Five separate beacons per pageview is wasteful. Buffer until visibilitychange and send a single batched beacon (see the sketch after this list).
- Forgetting bfcache restores. Pages restored from back/forward cache fire metrics with navigationType: 'back-forward-cache' — those are typically near-instant and will skew your p75 down. Either filter them out or report them separately.
- Not capturing soft navigations. SPA route changes don't trigger a new pageload. Use the experimental onLCP({reportSoftNavs: true}) flag in v4 if you're on a single-page app.
- Logging without indexing. Writing 5M rows a day to an unindexed Postgres table works for about a week. Then your dashboard times out, and you're scrambling to add indexes on a busy table during the morning standup. Add the indexes on day one.
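Here's a sketch of that batching pattern, reusing the Step 1 names; the /rum endpoint would then parse an array instead of a single object:
// Buffer metrics and flush once, instead of one beacon each.
const queue = [];

function enqueue(metric) {
  queue.push({
    name: metric.name,
    value: metric.value,
    rating: metric.rating,
    id: metric.id,
    navigationType: metric.navigationType,
    attribution: metric.attribution,
  });
}

function flush() {
  if (queue.length === 0) return;
  const data = JSON.stringify(queue.splice(0)); // drain the buffer
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/rum', data);
  } else {
    fetch('/rum', { body: data, method: 'POST', keepalive: true });
  }
}

// visibilitychange fires on tab switch, navigation, and (usually) close.
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flush();
});

onLCP(enqueue);
onINP(enqueue);
onCLS(enqueue);
onFCP(enqueue);
onTTFB(enqueue);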
FAQ
What's the difference between RUM and synthetic monitoring?
RUM measures real user sessions in production; synthetic monitoring runs scripted tests from controlled environments. Use synthetic for regression gating in CI, and RUM for measuring actual user experience and ranking signals. They're complementary, not interchangeable.
Does the web-vitals library work in all browsers?
It works in Chromium-based browsers (Chrome, Edge, Brave, Opera, Samsung Internet) for the full Core Web Vitals set. Safari and Firefox return only the metrics their browsers support — TTFB and FCP work everywhere; LCP works in Safari 16+; INP and CLS are still Chromium-only at the time of writing in 2026. Roughly 70-75% of global traffic is covered.
How much does RUM cost to run?
A million pageviews a day at 25% sampling produces about 1.25M events. On managed Postgres that's roughly 5-10GB/month of storage and pennies of CPU. On BigQuery or ClickHouse, costs are similar. The bigger cost is engineering time to maintain dashboards and alerts — budget a few days of setup plus ongoing tuning.
Should I use Google Analytics 4 instead of a custom endpoint?
GA4 has built-in web-vitals support and is fine for getting started. The downsides: cookie banners block roughly 20-40% of beacons in EU traffic, GA4's free tier samples aggressively above 10M events/month, and you can't query individual attribution payloads. For serious performance work, run both — GA4 for marketing context, custom RUM for engineering.
Why do my RUM numbers differ from PageSpeed Insights?
PageSpeed Insights field data comes from CrUX, which is a 28-day rolling p75 of Chrome users who opted into reporting. Your own RUM has different windowing, sampling, and browser coverage. Expect differences within 10-20%; larger gaps usually point to sampling bias (e.g., your RUM excludes a region or device class that CrUX includes).
The Bottom Line
Lab metrics ship clean code. Field metrics ship clean experiences. The web-vitals library plus a small endpoint and a couple of indexed tables gives you the same data Google uses to rank you — except yours updates in real time and tells you exactly which DOM node is hurting LCP.
Set this up once, wire it to your alerting, and Core Web Vitals stops being a quarterly fire drill. It just becomes another metric you actually own.