How to Find and Fix Flaky Tests in Jest
The most common root causes of Jest flakiness — and battle-tested fixes you can apply today.
Jest is the most popular JavaScript testing framework, powering test suites at companies from startups to Fortune 500s. It’s also one of the most common sources of flaky tests. The combination of parallel test execution, module mocking, and shared Node.js process state creates a perfect storm for intermittent failures.
If your CI pipeline randomly fails with tests that pass on re-run, this guide is for you. We’ll cover why Jest tests become flaky, how to identify the culprits, and concrete fixes for each pattern — with code you can copy into your codebase today.
Want to skip the guesswork?
Before you start debugging one test at a time, get a ranked list of your flakiest tests. Kleore scans your CI history and shows you exactly which Jest tests are flaky, how often they fail, and how much each one costs in wasted CI minutes and developer time.
Why Jest tests become flaky
Jest runs test files in parallel worker processes by default. Each worker gets its own Node.js instance, but tests within a file share the same process. This architecture means that any state leaking between tests in the same file — or any resource shared across workers — becomes a flakiness vector.
The four most common root causes:
- Shared mutable state — Global variables, module-level caches, singleton instances, or database records that persist between tests.
- Timer and date dependencies — Tests that rely on setTimeout, setInterval, Date.now(), or real clock time. CI runners are slower than your laptop, and timing assumptions break.
- Async race conditions — Tests that don’t properly wait for async operations to complete. The test asserts before the state has updated, and it fails intermittently depending on execution speed.
- Module mocking leaks — jest.mock() calls that bleed across tests because mocks aren’t properly reset between test cases.
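To make the first root cause concrete, here is a minimal, self-contained sketch (all names invented for illustration) of how a module-level cache makes results depend on the order "tests" run in:

```typescript
// Minimal sketch of root cause #1: a module-level cache that makes
// results depend on call order. Names are illustrative.
type Fetcher = () => string[];

let cache: string[] = []; // module-level state shared by every caller

function getItems(fetch: Fetcher): string[] {
  if (cache.length) return cache; // stale data survives between calls
  cache = fetch();
  return cache;
}

// "Test A" warms the cache with two users...
const usersA = getItems(() => ["alice", "bob"]);

// ..."Test B" expects an empty result but reads A's leftovers.
const usersB = getItems(() => []);

console.log(usersA.length, usersB.length); // 2 2 — B is order-dependent
```

Run Test B alone and it would see an empty cache and pass; run it after Test A and it fails. That order-dependence is the signature of shared mutable state.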
How to identify flaky Jest tests
Before you fix anything, you need to know which tests are flaky. Here are the tools Jest gives you to flush out non-deterministic tests.
Use --detectOpenHandles to find leaked resources
If Jest hangs after tests complete or you see “Jest did not exit one second after the test run has completed,” you have open handles — unclosed database connections, running servers, or pending timers.
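For a concrete picture of what counts as an open handle, here is a sketch runnable under plain Node (no Jest involved) of the most common offender, a pending timer:

```typescript
// A pending interval is exactly the kind of "open handle"
// --detectOpenHandles reports: it keeps the Node event loop — and
// Jest's worker process — alive after tests finish.
const handle = setInterval(() => {}, 1000);

// hasRef() reports whether this handle is keeping the process alive
console.log(handle.hasRef()); // true — the process can't exit yet

// This is the cleanup an afterEach/afterAll hook should perform
clearInterval(handle);
```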
# Find open handles that prevent Jest from exiting cleanly
npx jest --detectOpenHandles --forceExit
# Run tests sequentially to isolate ordering issues
npx jest --runInBand

Run tests repeatedly to reproduce flakes
A test that passes once might fail on the 50th run. Jest’s CLI has no built-in repeat option, so use a simple shell loop to stress-test a suspected flaky test.
# Run the test 50 times; stop at the first failure
for i in $(seq 1 50); do
npx jest path/to/suspected-flaky.test.ts || { echo "FAILED on run $i"; exit 1; }
done

Randomize test order to find hidden dependencies
Jest 29.4+ supports --randomize to shuffle test order within each file. If a test only passes when another test runs before it, randomization will expose it.
# Randomize test order within files
npx jest --randomize
# If a test fails, re-run with the same seed to reproduce
npx jest --randomize --seed=12345

Use jest-circus retry for automatic detection
The jest-circus test runner (the default since Jest 27) supports retries via jest.retryTimes(). While retries are a band-aid, they’re useful for identifying which tests need attention: any test that passes on a retry is flaky by definition.
// jest.setup.ts — registered via setupFilesAfterEnv in jest.config.ts
// Retry failed tests up to 2 times.
// Any test that needs a retry is flaky — track it.
jest.retryTimes(2, {
// Log each failure before retrying so flakes show up in CI output
logErrorsBeforeRetry: true,
});

Common patterns and fixes
Pattern 1: Shared mutable state between tests
Symptom: Test passes with it.only, fails when run with the full suite. Or it only fails when another specific test runs before it.
Root cause: A module-level variable, database record, or in-memory cache is modified by one test and not cleaned up before the next.
// userService.ts
let cachedUsers: User[] = []; // Module-level cache
export function getUsers() {
if (cachedUsers.length) return cachedUsers;
cachedUsers = fetchUsersFromDB();
return cachedUsers;
}
export function resetCache() {
cachedUsers = []; // Test hook: clear the module-level cache
}
// test file — Test A populates the cache, Test B reads stale data
it("fetches users", () => {
const users = getUsers(); // Populates cache
expect(users).toHaveLength(5);
});
it("handles empty state", () => {
// This SHOULD test empty state, but cache is warm from previous test
const users = getUsers(); // Returns cached data!
expect(users).toHaveLength(0); // FAILS
});

The fix: reset shared state before every test.

import { resetCache } from "./userService";
beforeEach(() => {
resetCache(); // Clear module-level state
jest.clearAllMocks(); // Clear all mock state
});
// Or use jest.isolateModules for complete isolation
it("handles empty state", () => {
jest.isolateModules(() => {
const { getUsers } = require("./userService");
// Fresh module instance — no cached data
const users = getUsers();
expect(users).toHaveLength(0);
});
});

Pattern 2: Timer-dependent tests
Symptom: Tests involving debounce, throttle, setTimeout, or animations pass locally but fail in CI where runners are slower.
Root cause: The test relies on real clock time. A 200ms debounce might take 300ms on a loaded CI runner, causing the assertion to fire too early.
it("debounces search input", async () => {
render(<SearchBox />);
fireEvent.change(input, { target: { value: "hello" } });
// Hoping 300ms is enough on CI... it's not
await new Promise(r => setTimeout(r, 300));
expect(mockSearch).toHaveBeenCalledWith("hello");
});

The fix: control time with fake timers.

it("debounces search input", () => {
jest.useFakeTimers();
render(<SearchBox />);
fireEvent.change(input, { target: { value: "hello" } });
// Advance the clock by exactly 250ms (debounce delay)
jest.advanceTimersByTime(250);
expect(mockSearch).toHaveBeenCalledWith("hello");
jest.useRealTimers();
});

Always call jest.useRealTimers() in an afterEach to prevent fake timers from leaking into other tests. Better yet, set it up globally in your Jest setup file.
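If it helps to see why virtual time is deterministic, here is a dependency-free sketch (all names invented for illustration) of the idea behind jest.advanceTimersByTime: a manual clock that fires scheduled callbacks only when you explicitly advance it, so no real waiting is involved:

```typescript
// A manual "clock" driving a debounce deterministically — the same
// principle jest.useFakeTimers() applies to the real setTimeout.
type Task = { at: number; fn: () => void };

class ManualClock {
  private now = 0;
  private tasks: Task[] = [];
  setTimeout(fn: () => void, ms: number) {
    this.tasks.push({ at: this.now + ms, fn });
  }
  advance(ms: number) {
    this.now += ms;
    const due = this.tasks.filter((t) => t.at <= this.now);
    this.tasks = this.tasks.filter((t) => t.at > this.now);
    due.forEach((t) => t.fn());
  }
}

const clock = new ManualClock();
const calls: string[] = [];

// Simplistic debounce: only the most recently scheduled call wins
function debounce(fn: (v: string) => void, ms: number) {
  let latest = 0;
  return (v: string) => {
    const id = ++latest;
    clock.setTimeout(() => {
      if (id === latest) fn(v);
    }, ms);
  };
}

const search = debounce((v) => calls.push(v), 250);
search("he");
clock.advance(100); // typing continues before the delay elapses
search("hello");
clock.advance(250); // the debounce window closes — only "hello" fires
console.log(calls.length); // 1
```

The test controls exactly when time passes, so the result is identical on a fast laptop and a loaded CI runner.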
Pattern 3: Async race conditions
Symptom: Tests using React Testing Library’s getBy* queries fail because the element hasn’t rendered yet. The test passes when you add a small delay.
Root cause: The test asserts synchronously against an asynchronously updated DOM. On faster machines it works; on slower CI runners, the render hasn’t completed yet.
it("shows success message after submit", () => {
render(<Form />);
fireEvent.click(screen.getByText("Submit"));
// Element hasn't rendered yet — race condition!
expect(screen.getByText("Success")).toBeInTheDocument();
});

The fix: await the asynchronous update.

it("shows success message after submit", async () => {
render(<Form />);
fireEvent.click(screen.getByText("Submit"));
// waitFor retries until the assertion passes or times out
await waitFor(() => {
expect(screen.getByText("Success")).toBeInTheDocument();
});
// Or use findBy* which combines getBy + waitFor
const message = await screen.findByText("Success");
expect(message).toBeInTheDocument();
});

Pattern 4: Port conflicts in integration tests
Symptom: EADDRINUSE: address already in use errors. Tests pass individually but fail when multiple test files run in parallel.
Root cause: Multiple test workers trying to bind to the same hardcoded port.
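You can verify the OS-assigned-port behavior that the fix below relies on with plain Node’s net module; a standalone sketch (function names are illustrative):

```typescript
import { createServer, AddressInfo } from "net";

// Listening on port 0 asks the OS for any free port, so two
// concurrent listeners can never collide.
function listenOnAnyPort(): Promise<{ port: number; close: () => void }> {
  return new Promise((resolve) => {
    const server = createServer();
    server.listen(0, () => {
      const { port } = server.address() as AddressInfo;
      resolve({ port, close: () => server.close() });
    });
  });
}

async function demo(): Promise<[number, number]> {
  // Two "workers" bind at the same time — no EADDRINUSE
  const a = await listenOnAnyPort();
  const b = await listenOnAnyPort();
  a.close();
  b.close();
  return [a.port, b.port];
}

const ports = demo();
ports.then(([p1, p2]) => console.log(p1 !== p2)); // distinct ports
```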
// Every test file tries to use port 3000
beforeAll(() => {
server = app.listen(3000);
});

The fix: let the OS choose a free port.

import { AddressInfo } from "net";
beforeAll(() => {
// Port 0 = OS assigns an available port
server = app.listen(0);
const { port } = server.address() as AddressInfo;
baseUrl = `http://localhost:${port}`;
});
afterAll(() => {
server.close();
});

Pattern 5: Snapshot drift
Symptom: Snapshot tests fail in CI but pass locally. Or they fail after someone else’s PR merges but no one updated the snapshots.
Root cause: Snapshots contain environment-specific data (timestamps, random IDs, absolute paths) or were committed from a different OS/locale.
// Mock Date so snapshots don't change daily
beforeAll(() => {
jest.useFakeTimers();
jest.setSystemTime(new Date("2025-01-01T00:00:00Z"));
});
// Use property matchers for dynamic values
it("renders user card", () => {
const { container } = render(<UserCard user={testUser} />);
expect(container).toMatchSnapshot({
// Allow these properties to be any value
props: expect.objectContaining({
id: expect.any(String),
createdAt: expect.any(String),
}),
});
});
// Add to CI: fail if snapshots are outdated
// package.json script:
// "test:ci": "jest --ci"
// --ci flag makes Jest fail if snapshots need updating

How to quarantine flaky Jest tests
Sometimes you can’t fix a flaky test immediately. Maybe it’s in a complex integration test that requires a larger refactor, or the root cause is an upstream dependency. In these cases, quarantining the test prevents it from blocking your team while you work on a fix.
A basic quarantine approach with Jest:
// Mark flaky tests so they don't block CI
// TODO: Fix flaky — https://linear.app/team/issue/ENG-1234
describe.skip("payment webhook handler", () => {
it("processes refund events", async () => {
// This test flakes due to Stripe webhook timing
});
});

The problem with manual quarantine is that it’s easy to forget about skipped tests. They accumulate, and eventually you have a pile of tests that no one runs. Kleore automates this by detecting flaky tests from your CI history, tracking them over time, and alerting you when quarantined tests haven’t been addressed. No manual tagging required.
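If you do rely on manual skips for a while, a periodic audit keeps them from rotting. A sketch (the directory layout and sample file are illustrative; the demo creates its own fixture so it runs anywhere):

```shell
# Self-contained demo: create a sample skipped test, then audit for it.
mkdir -p /tmp/flaky-audit/src
cat > /tmp/flaky-audit/src/payments.test.ts <<'EOF'
// TODO: Fix flaky — https://linear.app/team/issue/ENG-1234
describe.skip("payment webhook handler", () => {});
it.skip("processes refund events", () => {});
EOF

# List every skipped/todo test with its file and line number
grep -rn --include='*.test.*' -E '(describe|it|test)\.(skip|todo)\(' /tmp/flaky-audit/src
```

Point the grep at your real test directory and run it in a scheduled CI job so skipped tests surface for triage instead of disappearing.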
Prevention: jest.config settings that reduce flakiness
The right Jest configuration can prevent flaky tests from sneaking in. Here’s a production-hardened jest.config.ts with annotations explaining each setting.
import type { Config } from "jest";
const config: Config = {
// Run tests in random order to catch hidden dependencies
randomize: true,
// In CI, run tests sequentially to reduce resource contention
// Locally, use all cores for speed
maxWorkers: process.env.CI ? 1 : "50%",
// Fail fast — stop after first failure in CI
bail: process.env.CI ? 1 : 0,
// Clear mocks between every test automatically
clearMocks: true,
restoreMocks: true,
// Fail if snapshots are outdated (CI only)
ci: !!process.env.CI,
// Timeout per test — generous enough for CI, strict enough to catch hangs
testTimeout: process.env.CI ? 30000 : 10000,
// Detect open handles that prevent clean exit
detectOpenHandles: true,
// Force Jest to exit after all tests complete
// (safety net — fix the root cause, don't rely on this)
forceExit: !!process.env.CI,
};
export default config;

The key insight: --maxWorkers=1 in CI eliminates parallelism-related flakiness. Yes, it’s slower. But a 10-minute reliable suite is better than a 5-minute suite that fails 20% of the time and gets re-run. If you need speed, invest in splitting your test suite into parallel CI jobs with GitHub Actions matrix strategy instead.
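If you go the split-suite route, Jest 28+ ships a --shard flag that pairs naturally with a CI matrix; a sketch of the invocations, one per parallel job (the three-way split is an arbitrary example):

```shell
# Jest 28+: run one shard of the suite per parallel CI job
npx jest --shard=1/3   # matrix job 1
npx jest --shard=2/3   # matrix job 2
npx jest --shard=3/3   # matrix job 3
```

Each job runs a disjoint subset of test files, so you keep full parallel throughput at the CI level even with maxWorkers pinned to 1 inside each job.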
Stop guessing which Jest tests are flaky.
Kleore scans your GitHub Actions history and gives you a ranked list of every flaky test — with failure rates, cost estimates, and fix priority. Free to start.
Further reading
- How to Fix Flaky Tests in GitHub Actions — Framework-agnostic patterns for CI-level flakiness.
- How to Find and Fix Flaky Tests in pytest — The Python equivalent of this guide.
- How Much Do Flaky Tests Actually Cost? — The dollar math to justify the fix.
- Flaky Test Cost Calculator — Plug in your team’s numbers and see the impact.