Debugging INP regressions with the Long Animation Frames API

INP (Interaction to Next Paint) became a Core Web Vital in March 2024, and most of us reached for the same tools we'd been using for FID: the Long Tasks API and event entries from the Event Timing API. For a lot of regressions, that's enough. For the trickier ones — the ones where your handler is fast but the next paint is slow — you'll stare at clean traces and lose a week.

This post walks through what the Long Animation Frames API (LoAF) gives you that long tasks don't, and how I've been using it to find INP culprits that nothing else surfaces.

Why Long Tasks API isn't enough for INP

A "long task" is a single task that runs for >50ms on the main thread. INP, on the other hand, measures the full time from user input until the browser paints the next frame that reflects the response. That window can include:

  • The event handler script
  • requestAnimationFrame callbacks scheduled by the handler
  • Forced style recalculation and layout
  • Paint and compositing
  • Other tasks that happen to run before the next presentation opportunity

If your slow work is split across several tasks that each stay under 50ms, Long Tasks reports nothing. Ten 40ms tasks queued ahead of the next paint never trip the threshold, yet INP reports 400ms.
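
A toy model of that mismatch, as a rough sketch. The task durations are made up, summarizeTasks is a hypothetical helper, and the model ignores input delay and rendering time, which a real INP measurement would also include:

```javascript
// Toy model: each number is the duration (ms) of one main-thread task
// that runs between the user input and the next paint.
const LONG_TASK_THRESHOLD_MS = 50;

function summarizeTasks(taskDurations) {
  // Long Tasks only reports tasks that individually exceed the threshold.
  const longTasks = taskDurations.filter((ms) => ms > LONG_TASK_THRESHOLD_MS);
  // The interaction still has to wait for every task before it can paint.
  const totalMs = taskDurations.reduce((sum, ms) => sum + ms, 0);
  return { longTaskCount: longTasks.length, totalMs };
}

// Ten tasks of 40ms each: none is "long", together they block for 400ms.
const chunked = summarizeTasks(Array(10).fill(40));
console.log(chunked); // { longTaskCount: 0, totalMs: 400 }
```

Chunking work this way is usually a good idea, since it gives the browser chances to handle input. The point here is only that the Long Tasks signal goes quiet while the interaction can still be slow.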

What LoAF actually gives you

A LoAF entry covers a single animation frame that took >50ms from start to presentation. The shape:

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log({
      duration: entry.duration,
      blockingDuration: entry.blockingDuration,
      renderStart: entry.renderStart,
      styleAndLayoutStart: entry.styleAndLayoutStart,
      scripts: entry.scripts.map(s => ({
        invokerType: s.invokerType,
        sourceURL: s.sourceURL,
        sourceFunctionName: s.sourceFunctionName,
        duration: s.duration,
        forcedStyleAndLayoutDuration: s.forcedStyleAndLayoutDuration,
        pauseDuration: s.pauseDuration,
      })),
    });
  }
});

observer.observe({ type: 'long-animation-frame', buffered: true });

The two fields I keep coming back to:

  • blockingDuration: the sum, across the long tasks in this frame, of each task's time beyond 50ms, with the frame's rendering time counted toward the longest task. Correlates much better with INP than raw duration.
  • scripts[].forcedStyleAndLayoutDuration: time this script spent in synchronous (forced) style and layout. This is where layout-thrashing third-party libs hide.
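
One way to put the second field to work is to pull out the script that forced the most style and layout in a given frame. This is a minimal sketch: worstForcedLayoutScript is a hypothetical helper, and sampleEntry is a hand-written stand-in with invented URLs and function names, not a captured entry.

```javascript
// Returns a summary of the script that spent the most time in forced
// style/layout within one LoAF entry, or null if no script forced any.
function worstForcedLayoutScript(entry) {
  let worst = null;
  for (const script of entry.scripts ?? []) {
    const forcedMs = script.forcedStyleAndLayoutDuration ?? 0;
    if (forcedMs > 0 && (worst === null || forcedMs > worst.forcedMs)) {
      worst = {
        forcedMs,
        sourceURL: script.sourceURL,
        sourceFunctionName: script.sourceFunctionName,
        // Share of this script's own run time spent in forced style/layout.
        shareOfScript: script.duration > 0 ? forcedMs / script.duration : 0,
      };
    }
  }
  return worst;
}

// Hand-written stand-in for a real entry; every value here is made up.
const sampleEntry = {
  duration: 320,
  blockingDuration: 250,
  scripts: [
    { sourceURL: 'https://example.com/app.js', sourceFunctionName: 'onClick',
      duration: 25, forcedStyleAndLayoutDuration: 0 },
    { sourceURL: 'https://example.com/vendor/logger.js', sourceFunctionName: 'readPositions',
      duration: 290, forcedStyleAndLayoutDuration: 280 },
  ],
};

console.log(worstForcedLayoutScript(sampleEntry));
```

In a PerformanceObserver callback you would call the helper on each entry and keep only the results where forcedMs is a large share of the frame.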

A regression I'd never have found otherwise

The trigger for me writing this up: a dashboard's p75 INP jumped from 180ms to 410ms after a release. The release added a logging wrapper that, on every click, scheduled a requestIdleCallback to read element positions and ship them with the event.

The click handler itself ran in ~25ms. No long task. The Event Timing processingEnd - processingStart was a clean ~30ms. Everything looked fine.

LoAF showed the actual frame: the click handler finished, then the idle callback ran within the same animation frame, ahead of rendering, and called getBoundingClientRect() on a tree that React had just dirtied. forcedStyleAndLayoutDuration was 280ms. The browser couldn't paint the response until that resolved.
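
Finding the frame that belongs to a slow interaction comes down to a time-overlap check between the Event Timing entry and the LoAF entries. A rough sketch of that matching, assuming you have already collected both kinds of entries; framesForInteraction and the sample numbers are illustrative, not taken from the dashboard above:

```javascript
// True when two [start, end) time ranges share any time.
function overlaps(aStart, aEnd, bStart, bEnd) {
  return aStart < bEnd && bStart < aEnd;
}

// eventEntry: an Event Timing entry (uses startTime and duration).
// loafEntries: long-animation-frame entries collected so far.
// Returns the frames whose time range intersects the interaction.
function framesForInteraction(eventEntry, loafEntries) {
  const interactionEnd = eventEntry.startTime + eventEntry.duration;
  return loafEntries.filter((frame) =>
    overlaps(
      eventEntry.startTime,
      interactionEnd,
      frame.startTime,
      frame.startTime + frame.duration
    )
  );
}

// Made-up timings: one click and three long frames around it.
const click = { name: 'click', startTime: 1000, duration: 408 };
const frames = [
  { startTime: 900, duration: 60 },   // ends before the click starts
  { startTime: 1005, duration: 400 }, // overlaps the click
  { startTime: 1500, duration: 80 },  // starts after the click has painted
];

console.log(framesForInteraction(click, frames).length); // 1
```

Event Timing rounds duration to 8ms, so treat the match as approximate and expect an occasional neighbouring frame to sneak in.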

// The fix — defer the read past the next paint
requestIdleCallback(() => {
  requestAnimationFrame(() => {
    requestAnimationFrame(() => {
      // now safe to read layout — previous frame has presented
      reportPositions();
    });
  });
});

Ugly, but it dropped p75 INP back under 200ms within a day of rolling out.

Gotchas worth knowing

  1. Chromium only right now (Chrome 123+, Edge equivalent). Safari and Firefox haven't shipped it. So you're sampling roughly 70% of traffic depending on your audience.
  2. Synthetic events hide attribution. React's event delegation means the script entry is usually attributed to the 'event-listener' registered on React's root container (check the invoker field), not to your component handler. Cross-reference sourceURL and sourceFunctionName, or instrument your handlers with performance.mark() and correlate by timestamp.
  3. pauseDuration captures alerts, sync XHR, and prompt(). If you see it non-zero in prod, you have a bigger problem than INP.
  4. Buffer it. Use buffered: true so you don't miss frames before your observer attached.
  5. Sample, don't ship everything. LoAF entries with full scripts[] arrays can be hundreds of bytes. Rate-limit before posting to your RUM endpoint.
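
A minimal sketch of gotchas 1, 4 and 5 together: feature-detect, buffer, sample per page load, and trim each entry before sending. The '/rum/loaf' endpoint, the 10% rate, the three-script cap and the helper names are all assumptions for illustration, not the production snippet mentioned below.

```javascript
// Keep only the fields needed for triage and cap the number of scripts.
function trimLoafEntry(entry, maxScripts = 3) {
  const scripts = [...(entry.scripts ?? [])]
    .sort((a, b) => b.duration - a.duration) // longest scripts first
    .slice(0, maxScripts)
    .map((s) => ({
      invokerType: s.invokerType,
      sourceURL: s.sourceURL,
      sourceFunctionName: s.sourceFunctionName,
      duration: Math.round(s.duration),
      forcedStyleAndLayoutDuration: Math.round(s.forcedStyleAndLayoutDuration ?? 0),
    }));
  return {
    startTime: Math.round(entry.startTime),
    duration: Math.round(entry.duration),
    blockingDuration: Math.round(entry.blockingDuration ?? 0),
    scripts,
  };
}

// Decide once per page load whether this session reports at all.
// `random` is injectable so the decision can be tested.
function shouldSample(rate, random = Math.random) {
  return random() < rate;
}

// Browser wiring. Skipped entirely where LoAF or sendBeacon is unavailable.
if (
  typeof PerformanceObserver !== 'undefined' &&
  PerformanceObserver.supportedEntryTypes?.includes('long-animation-frame') &&
  typeof navigator !== 'undefined' &&
  typeof navigator.sendBeacon === 'function' &&
  shouldSample(0.1) // assumed 10% sample rate
) {
  new PerformanceObserver((list) => {
    const payload = list.getEntries().map((entry) => trimLoafEntry(entry));
    // '/rum/loaf' is a hypothetical endpoint.
    navigator.sendBeacon('/rum/loaf', JSON.stringify(payload));
  }).observe({ type: 'long-animation-frame', buffered: true });
}
```

Sampling per page load rather than per entry keeps each reported session complete, which makes it easier to line frames up with that session's interactions. A real setup would probably also batch entries instead of sending one beacon per observer callback.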

Where to go from here

The full walkthrough on the original post includes the production RUM snippet I'm using (with sampling + payload trimming), a side-by-side of LoAF vs Long Tasks on the same regression, and the benchmark numbers from before/after the layout-read fix.

Read the full version with benchmarks and the production RUM snippet at webperfclinic.com.

About the author: Nadia El-Sayed

Core Web Vitals specialist focused on real-user monitoring. Believes synthetic-only perf testing is a comforting lie.