How to Find and Fix Flaky Tests in Jest

The most common root causes of Jest flakiness — and battle-tested fixes you can apply today.

Jest is the most popular JavaScript testing framework, powering test suites at companies from startups to Fortune 500s. It’s also one of the most common sources of flaky tests. The combination of parallel test execution, module mocking, and shared Node.js process state creates a perfect storm for intermittent failures.

If your CI pipeline randomly fails with tests that pass on re-run, this guide is for you. We’ll cover why Jest tests become flaky, how to identify the culprits, and concrete fixes for each pattern — with code you can copy into your codebase today.

Want to skip the guesswork?

Before you start debugging one test at a time, get a ranked list of your flakiest tests. Kleore scans your CI history and shows you exactly which Jest tests are flaky, how often they fail, and how much each one costs in wasted CI minutes and developer time.

Why Jest tests become flaky

Jest runs test files in parallel worker processes by default. Each worker gets its own Node.js instance, but tests within a file share the same process. This architecture means that any state leaking between tests in the same file — or any resource shared across workers — becomes a flakiness vector.

The four most common root causes:

  1. Shared mutable state — Global variables, module-level caches, singleton instances, or database records that persist between tests.
  2. Timer and date dependencies — Tests that rely on setTimeout, setInterval, Date.now(), or real clock time. CI runners are slower than your laptop, and timing assumptions break.
  3. Async race conditions — Tests that don’t properly wait for async operations to complete. The test asserts before the state has updated, and it fails intermittently depending on execution speed.
  4. Module mocking leaks — jest.mock() calls that bleed across tests because mocks aren’t properly reset between test cases.

How to identify flaky Jest tests

Before you fix anything, you need to know which tests are flaky. Here are the tools Jest gives you to flush out non-deterministic tests.

Use --detectOpenHandles to find leaked resources

If Jest hangs after tests complete or you see “Jest did not exit one second after the test run has completed,” you have open handles — unclosed database connections, running servers, or pending timers.

Detect leaked handles
# Find open handles that prevent Jest from exiting cleanly
npx jest --detectOpenHandles --forceExit

# Run tests sequentially to isolate ordering issues
npx jest --runInBand

Run tests repeatedly to reproduce flakes

A test that passes once might fail on the 50th run. Jest has no built-in repeat flag, so use a simple shell loop to stress-test suspected flaky tests.

Stress-test a suspected flaky test
# Run the test 50 times, stopping at the first failure
for i in $(seq 1 50); do
  if ! npx jest path/to/suspected-flaky.test.ts; then
    echo "FAILED on run $i"
    exit 1
  fi
done

Randomize test order to find hidden dependencies

Jest 29.4+ supports --randomize to shuffle test order within each file. If a test only passes when another test runs before it, randomization will expose it.

Randomize and isolate
# Randomize test order within files, printing the seed used
npx jest --randomize --showSeed

# If a test fails, re-run with the same seed to reproduce
npx jest --randomize --seed=12345

Use jest-circus retry for automatic detection

The jest-circus test runner (default since Jest 27) supports retries. While retries are a bandaid, they’re useful for identifying which tests need attention: any test that passes on retry is flaky by definition.

jest.setup.ts — retry configuration
// jest.setup.ts (loaded via setupFilesAfterEnv in jest.config.ts)
// Retries are a runtime API, not a config option, and need jest-circus.
// Any test that passes only on retry is flaky — track it.
jest.retryTimes(2, { logErrorsBeforeRetry: true });

Common patterns and fixes

Pattern 1: Shared mutable state between tests

Symptom: Test passes with it.only, fails when run with the full suite. Or it only fails when another specific test runs before it.

Root cause: A module-level variable, database record, or in-memory cache is modified by one test and not cleaned up before the next.

Bad — shared state across tests
// userService.ts
let cachedUsers: User[] = []; // Module-level cache

export function getUsers() {
  if (cachedUsers.length) return cachedUsers;
  cachedUsers = fetchUsersFromDB();
  return cachedUsers;
}

// test file — Test A populates the cache, Test B reads stale data
it("fetches users", () => {
  const users = getUsers(); // Populates cache
  expect(users).toHaveLength(5);
});

it("handles empty state", () => {
  // This SHOULD test empty state, but cache is warm from previous test
  const users = getUsers(); // Returns cached data!
  expect(users).toHaveLength(0); // FAILS
});
Fix — reset state in beforeEach
// resetCache is a small test-only helper you export from userService
import { resetCache } from "./userService";

beforeEach(() => {
  resetCache(); // Clear module-level state
  jest.clearAllMocks(); // Clear all mock state
});

// Or use jest.isolateModules for complete isolation
it("handles empty state", () => {
  jest.isolateModules(() => {
    const { getUsers } = require("./userService");
    // Fresh module instance — no cached data
    const users = getUsers();
    expect(users).toHaveLength(0);
  });
});

Pattern 2: Timer-dependent tests

Symptom: Tests involving debounce, throttle, setTimeout, or animations pass locally but fail in CI where runners are slower.

Root cause: The test relies on real clock time. A 200ms debounce might take 300ms on a loaded CI runner, causing the assertion to fire too early.

Bad — real timers in tests
it("debounces search input", async () => {
  render(<SearchBox />);
  fireEvent.change(input, { target: { value: "hello" } });

  // Hoping 300ms is enough on CI... it's not
  await new Promise(r => setTimeout(r, 300));
  expect(mockSearch).toHaveBeenCalledWith("hello");
});
Fix — jest.useFakeTimers()
it("debounces search input", () => {
  jest.useFakeTimers();
  render(<SearchBox />);
  fireEvent.change(input, { target: { value: "hello" } });

  // Advance the clock by exactly 250ms (debounce delay)
  jest.advanceTimersByTime(250);

  expect(mockSearch).toHaveBeenCalledWith("hello");
  jest.useRealTimers();
});

Always call jest.useRealTimers() in an afterEach to prevent fake timers from leaking into other tests. Better yet, set it up globally in your Jest setup file.
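That global guard is a one-liner. A minimal sketch, assuming your jest.config.ts lists the file under setupFilesAfterEnv:

```typescript
// jest.setup.ts — loaded once per test file via setupFilesAfterEnv
afterEach(() => {
  // Restore real timers even when a test forgets to clean up,
  // so fake timers never leak into the next test
  jest.useRealTimers();
});
```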

Pattern 3: Async race conditions

Symptom: Tests using React Testing Library’s getBy* queries fail because the element hasn’t rendered yet. The test passes when you add a small delay.

Root cause: The test asserts synchronously against an asynchronously updated DOM. On faster machines it works; on slower CI runners, the render hasn’t completed yet.

Bad — synchronous assertion on async DOM
it("shows success message after submit", () => {
  render(<Form />);
  fireEvent.click(screen.getByText("Submit"));

  // Element hasn't rendered yet — race condition!
  expect(screen.getByText("Success")).toBeInTheDocument();
});
Fix — use waitFor or findBy
it("shows success message after submit", async () => {
  render(<Form />);
  fireEvent.click(screen.getByText("Submit"));

  // waitFor retries until the assertion passes or times out
  await waitFor(() => {
    expect(screen.getByText("Success")).toBeInTheDocument();
  });

  // Or use findBy* which combines getBy + waitFor
  const message = await screen.findByText("Success");
  expect(message).toBeInTheDocument();
});

Pattern 4: Port conflicts in integration tests

Symptom: EADDRINUSE: address already in use errors. Tests pass individually but fail when multiple test files run in parallel.

Root cause: Multiple test workers trying to bind to the same hardcoded port.

Bad — hardcoded port
// Every test file tries to use port 3000
beforeAll(() => {
  server = app.listen(3000);
});
Fix — dynamic port allocation
import { AddressInfo } from "net";

beforeAll(() => {
  // Port 0 = OS assigns an available port
  server = app.listen(0);
  const { port } = server.address() as AddressInfo;
  baseUrl = `http://localhost:${port}`;
});

afterAll((done) => {
  // close() is async — wait for it so the socket handle doesn't leak
  server.close(done);
});

Pattern 5: Snapshot drift

Symptom: Snapshot tests fail in CI but pass locally. Or they fail after someone else’s PR merges but no one updated the snapshots.

Root cause: Snapshots contain environment-specific data (timestamps, random IDs, absolute paths) or were committed from a different OS/locale.

Fix — deterministic snapshots
// Mock Date so snapshots don't change daily
beforeAll(() => {
  jest.useFakeTimers();
  jest.setSystemTime(new Date("2025-01-01T00:00:00Z"));
});

// Use property matchers for dynamic values. They apply to object
// snapshots, not serialized DOM, so snapshot the data that contains
// the dynamic fields (buildUserPayload is an illustrative helper)
it("builds the user payload", () => {
  expect(buildUserPayload(testUser)).toMatchSnapshot({
    id: expect.any(String),
    createdAt: expect.any(String),
  });
});

// Add to CI: fail if snapshots are outdated
// package.json script:
// "test:ci": "jest --ci"
// --ci flag makes Jest fail if snapshots need updating
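For environment-specific strings such as absolute paths, a custom snapshot serializer can rewrite them before they are saved. A sketch, assuming it runs from your Jest setup file and that <rootDir> is an acceptable placeholder:

```typescript
// jest.setup.ts — strip absolute paths so snapshots match on any machine
expect.addSnapshotSerializer({
  // Only handle strings that contain the machine-specific project path
  test: (val: unknown): boolean =>
    typeof val === "string" && val.includes(process.cwd()),
  // Replace that prefix with a stable placeholder before serializing
  serialize: (val: string): string =>
    JSON.stringify(val.split(process.cwd()).join("<rootDir>")),
});
```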

How to quarantine flaky Jest tests

Sometimes you can’t fix a flaky test immediately. Maybe it’s in a complex integration test that requires a larger refactor, or the root cause is an upstream dependency. In these cases, quarantining the test prevents it from blocking your team while you work on a fix.

A basic quarantine approach with Jest:

Manual quarantine with test.skip
// Mark flaky tests so they don't block CI
// TODO: Fix flaky — https://linear.app/team/issue/ENG-1234
describe.skip("payment webhook handler", () => {
  it("processes refund events", async () => {
    // This test flakes due to Stripe webhook timing
  });
});
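To keep skips from piling up silently, a small CI guard can enforce a budget. A sketch that fails the build once the skip count passes a threshold (it assumes tests live under src/; adjust the path and budget for your repo):

```shell
# Fail the build when quarantined Jest tests exceed a budget
SKIP_BUDGET=5
skips=$(grep -rhoE "(describe|it|test)\.skip\(" src --include="*.test.ts" | wc -l)
echo "Quarantined tests: $skips (budget: $SKIP_BUDGET)"
if [ "$skips" -gt "$SKIP_BUDGET" ]; then
  echo "Quarantine budget exceeded: fix or delete skipped tests" >&2
  exit 1
fi
```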

The problem with manual quarantine is that it’s easy to forget about skipped tests. They accumulate, and eventually you have a pile of tests that no one runs. Kleore automates this by detecting flaky tests from your CI history, tracking them over time, and alerting you when quarantined tests haven’t been addressed. No manual tagging required.

Prevention: jest.config settings that reduce flakiness

The right Jest configuration can prevent flaky tests from sneaking in. Here’s a production-hardened jest.config.ts with annotations explaining each setting.

jest.config.ts — production-hardened
import type { Config } from "jest";

const config: Config = {
  // Run tests in random order to catch hidden dependencies
  randomize: true,

  // In CI, run tests sequentially to reduce resource contention
  // Locally, use all cores for speed
  maxWorkers: process.env.CI ? 1 : "50%",

  // Fail fast — stop after first failure in CI
  bail: process.env.CI ? 1 : 0,

  // Clear mocks between every test automatically
  clearMocks: true,
  restoreMocks: true,

  // Fail if snapshots are outdated (CI only)
  ci: !!process.env.CI,

  // Timeout per test — generous enough for CI, strict enough to catch hangs
  testTimeout: process.env.CI ? 30000 : 10000,

  // Detect open handles that prevent clean exit
  detectOpenHandles: true,

  // Force Jest to exit after all tests complete
  // (safety net — fix the root cause, don't rely on this)
  forceExit: !!process.env.CI,
};

export default config;

The key insight: --maxWorkers=1 in CI eliminates parallelism-related flakiness. Yes, it’s slower. But a 10-minute reliable suite is better than a 5-minute suite that fails 20% of the time and gets re-run. If you need speed, invest in splitting your test suite into parallel CI jobs with GitHub Actions matrix strategy instead.
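The split itself can come from Jest. Jest 28+ supports --shard, so each matrix job runs one equal slice of the suite. A sketch of the per-job commands, assuming three jobs (in GitHub Actions the index would come from the matrix variable):

```shell
# Each CI matrix job runs exactly one of these shard commands
TOTAL_SHARDS=3
for i in $(seq 1 "$TOTAL_SHARDS"); do
  echo "npx jest --ci --shard=$i/$TOTAL_SHARDS"
done
```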

Stop guessing which Jest tests are flaky.

Kleore scans your GitHub Actions history and gives you a ranked list of every flaky test — with failure rates, cost estimates, and fix priority. Free to start.
