// testing · strategy

Testing strategy in 2026 — the pyramid that still works.

Many test suites are slow and brittle, and still miss the bugs that actually ship. Engineers spend hours writing tests that pass but prove nothing. The testing pyramid (unit, integration, end-to-end) remains the right mental model, but the ratios and the boundaries have shifted significantly since 2010. This is the 2026 working strategy: what to test, what to mock, where the boundaries are, and how to ship with confidence without spending half your time on tests.

3 test layers · real ratios · 0 dogma · © use freely

01 What tests are actually for

Tests serve two purposes, in this order:

  1. Catch regressions. When you change one part of the system, tests tell you what else broke.
  2. Document intent. Tests show how the code is supposed to be used.

Tests are not for: proving correctness, achieving coverage numbers, satisfying process requirements, or punishing the previous developer. Tests written for these reasons are usually bad — they're slow, brittle, and don't actually catch the bugs that matter.

02 The 2026 pyramid

The classic ratios were 70% unit, 20% integration, 10% e2e. In 2026, with better integration testing tools and slower e2e environments, the practical ratio is closer to:

  • 50% unit — pure functions, business logic, validators
  • 40% integration — code with real dependencies (DB, cache, HTTP)
  • 10% end-to-end — full user flows through the browser

The shift is from unit-heavy to integration-heavy. Modern tooling (Testcontainers, Vitest, Playwright component testing) makes integration tests fast enough to run on every change, while being far more representative of real failures than mock-heavy unit tests.
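
To see how far an existing suite sits from that mix, the arithmetic can be sketched as below. This is a minimal illustration, not a tool the article prescribes: the helper names (layerShares, driftFromTarget) are hypothetical, and the targets are simply the rough 50/40/10 figures from the list above.

```typescript
// Hypothetical helper: compare a suite's layer mix against the rough targets above.
type Layer = "unit" | "integration" | "e2e";

// Targets taken from the list above, as percentages.
const TARGET_SHARE: Record<Layer, number> = { unit: 50, integration: 40, e2e: 10 };

// Returns each layer's share of the suite as a whole-number percentage.
function layerShares(counts: Record<Layer, number>): Record<Layer, number> {
  const total = counts.unit + counts.integration + counts.e2e;
  if (total === 0) {
    return { unit: 0, integration: 0, e2e: 0 };
  }
  const pct = (n: number): number => Math.round((n / total) * 100);
  return {
    unit: pct(counts.unit),
    integration: pct(counts.integration),
    e2e: pct(counts.e2e),
  };
}

// Difference from target in percentage points (positive means over target).
function driftFromTarget(counts: Record<Layer, number>): Record<Layer, number> {
  const shares = layerShares(counts);
  return {
    unit: shares.unit - TARGET_SHARE.unit,
    integration: shares.integration - TARGET_SHARE.integration,
    e2e: shares.e2e - TARGET_SHARE.e2e,
  };
}

// Example: a 400-test suite still on the classic 70/20/10 split.
const drift = driftFromTarget({ unit: 280, integration: 80, e2e: 40 });
// drift is { unit: 20, integration: -20, e2e: 0 }
```

Treat the result as a conversation starter, not a gate: the ratios are a rule of thumb, and a codebase that is mostly pure logic will reasonably stay more unit-heavy.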

03 Unit tests: when they actually help

Unit tests test a single function or class with no real dependencies. They run in milliseconds, in parallel, with zero setup.

Unit tests shine for:

  • Pure functions. Inputs in, outputs out. Calculations, parsers, formatters, validators.
  • Business logic with branching. Discount calculation, permission checks, eligibility rules — anything with many conditional paths.
  • Algorithms. Sort, search, deduplication, anything where the implementation is non-obvious.
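
✓ a unit test that earns its place
// Assumes Vitest (with Jest, describe/test/expect are globals and this import is dropped).
import { describe, test, expect } from 'vitest';
// './discount' is a hypothetical module path for the function under test.
import { calculateDiscount } from './discount';

describe('calculateDiscount', () => {
  test('no discount for first-time customers', () => {
    expect(calculateDiscount(100, { isFirstTime: true })).toBe(0);
  });

  test('10% for returning customers', () => {
    expect(calculateDiscount(100, { isReturning: true })).toBe(10);
  });

  test('caps at 50% for VIPs over $1000', () => {
    // 50% of $2000 is $1000
    expect(calculateDiscount(2000, { vip: true })).toBe(1000);
  });
});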

Unit tests stop paying off when the code under test has lots of dependencies. Mocking three or four collaborators to test one function is a sign you should be writing an integration test instead.

04 Integration tests: the actual workhorse

Integration tests use real dependencies — a real database, real HTTP server, real Redis. They test behavior across module boundaries.

These are the tests that catch the bugs that actually ship. SQL queries that compile but return wrong results. Code paths that work in isolation but fail when wired together. Edge cases that mock-based tests gloss over.

✓ integration test against real DB
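// Assumes Vitest and supertest; './app' and './db' are hypothetical module paths.
import { describe, beforeEach, test, expect } from 'vitest';
import request from 'supertest';
import { app } from './app';
import { db } from './db';

describe('POST /users', () => {
  // Start every test from a clean schema so tests stay independent.
  beforeEach(async () => {
    await db.resetSchema();
  });

  test('creates user with email verification token', async () => {
    const res = await request(app)
      .post('/users')
      .send({ email: 'a@b.com', password: 'secret123' });

    expect(res.status).toBe(201);

    // Verify side effects in the DB
    const user = await db.users.findByEmail('a@b.com');
    expect(user.emailVerified).toBe(false);
    expect(user.verificationToken).toMatch(/^[a-f0-9]{32}$/);
  });
});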

Modern tooling makes this fast. Testcontainers spins up real PostgreSQL/Redis in Docker per test run. pg-mem emulates Postgres in memory for unit-like speed, at the cost of not supporting every Postgres feature. Vitest runs test files in parallel, so a large integration suite can still finish in minutes.

05 End-to-end tests: used sparingly

E2E tests drive a real browser through the application, typically with Playwright or Cypress. They're the only tests that exercise the whole stack together: JavaScript bundles, CSS, API calls, the database, the browser.

Use them for:

  • Critical user paths only. Signup. Login. Checkout. Cancellation. The 5-10 flows that, if broken, are catastrophic.
  • Smoke testing after deploy. Quick verification that the new build actually loads.

Don't use them for:

  • Form validation edge cases (unit test)
  • API responses (integration test)
  • Component visual states (component test)

E2E tests are slow (5-30 seconds per test) and flaky (network timing, animation, race conditions). Keep them few and important.

06 What to mock, and what NOT to mock

Mock external services you don't control:

  • Third-party APIs (Stripe, SendGrid, OpenAI)
  • Email/SMS sending
  • File uploads to cloud storage

Don't mock things you do control:

  • Your database — use a real test database
  • Your own services in integration tests — use the real ones
  • Your HTTP routes — request through the real router

The pattern: mock at the system boundary, not in the middle. Tests with five layers of mocking end up testing the mocks, not the code.
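
One way to picture "mock at the boundary" is a thin interface around the third-party service, with a fake substituted only there. The sketch below is an illustration under assumptions: EmailSender, FakeEmailSender, and SignupService are hypothetical names, and the calls are kept synchronous for brevity where a real email client would be async.

```typescript
// All names here are hypothetical; the point is where the fake sits.

// The system boundary: a thin interface over the third-party email provider.
interface EmailSender {
  send(to: string, subject: string): void;
}

// Test double for the boundary. It records calls instead of sending anything.
class FakeEmailSender implements EmailSender {
  readonly sent: { to: string; subject: string }[] = [];

  send(to: string, subject: string): void {
    this.sent.push({ to, subject });
  }
}

// Code you own. The test runs it for real, with no mocks in the middle.
class SignupService {
  private readonly emailSender: EmailSender;

  constructor(emailSender: EmailSender) {
    this.emailSender = emailSender;
  }

  signUp(rawEmail: string): string {
    const address = rawEmail.trim().toLowerCase();
    if (!address.includes("@")) {
      throw new Error("invalid email: " + rawEmail);
    }
    this.emailSender.send(address, "Verify your account");
    return address;
  }
}

// Usage in a test: real service, fake only at the edge.
const fakeSender = new FakeEmailSender();
const signedUp = new SignupService(fakeSender).signUp("  A@B.com ");
// signedUp is "a@b.com", and fakeSender.sent holds one message addressed to "a@b.com"
```

Because the fake replaces only the outermost call, the normalization and validation logic inside signUp is exercised exactly as it would be in production.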

07 Why tests become brittle

Brittle tests are tests that break when behavior didn't actually change. Common causes:

  • Testing implementation, not behavior. Asserting on private methods or internal state. When the implementation is refactored, the test breaks, even though behavior is identical.
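  • Time-dependent assertions. expect(date).toEqual(new Date()) fails as soon as the clock ticks between the two calls (and toBe would fail every time, because it compares object identity). Freeze time with fake timers (sinon's fake clock, vi.useFakeTimers, jest.useFakeTimers), or assert on relative time.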
  • Order dependencies. Test B passes only if Test A ran first. Each test must be independent — set up its own state, tear it down.
  • Network calls in unit tests. Real network calls in unit tests fail randomly. If you need the network, write an integration test.
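
For the time-dependent case in the list above, an alternative to fake timers is to pass the clock in as a parameter. The following is a minimal sketch of that idea; Clock, LoginSession, createSession, and the 30-minute TTL are all hypothetical choices made for the example.

```typescript
// A clock is just a function returning the current time; production uses the default.
type Clock = () => Date;
const systemClock: Clock = () => new Date();

interface LoginSession {
  userId: string;
  expiresAt: Date;
}

const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes, an arbitrary value for this sketch

function createSession(userId: string, now: Clock = systemClock): LoginSession {
  return { userId, expiresAt: new Date(now().getTime() + SESSION_TTL_MS) };
}

function isExpired(session: LoginSession, now: Clock = systemClock): boolean {
  return now().getTime() >= session.expiresAt.getTime();
}

// In a test, freeze time by passing a fixed clock instead of reading the real one.
const fixedNow = new Date("2026-01-01T00:00:00.000Z");
const session = createSession("user-1", () => fixedNow);
// session.expiresAt.toISOString() is "2026-01-01T00:30:00.000Z"
```

With the clock injected, the assertion compares against a known instant, so it gives the same answer no matter when or how slowly the test runs.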

The rule of thumb: test what the function does, not how it does it. If you can completely rewrite the implementation and the tests still pass, your tests are valuable. If a refactor that preserves behavior breaks the tests, they were measuring the implementation.
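
The "rewrite the implementation and the tests still pass" idea can be shown with two interchangeable implementations checked by one behavior-level test. This is an illustrative sketch only; the dedupe functions and checkDedupe are hypothetical names invented for the example.

```typescript
// Two interchangeable implementations of the same behavior (hypothetical example).
function dedupeWithSet(items: string[]): string[] {
  // A Set keeps the first occurrence of each value, in insertion order.
  return Array.from(new Set(items));
}

function dedupeWithLoop(items: string[]): string[] {
  const out: string[] = [];
  for (const item of items) {
    if (!out.includes(item)) {
      out.push(item);
    }
  }
  return out;
}

// A behavior-level check: it looks only at inputs and outputs,
// and knows nothing about whether a Set or a loop is used.
function checkDedupe(dedupe: (items: string[]) => string[]): boolean {
  const result = dedupe(["a", "b", "a", "c", "b"]);
  return result.join(",") === "a,b,c";
}

// Both implementations satisfy the same check, so swapping one for the other breaks nothing.
const bothPass = checkDedupe(dedupeWithSet) && checkDedupe(dedupeWithLoop);
```

A test that instead asserted "a Set was constructed" would fail the moment someone switched to the loop version, even though callers could not tell the difference.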

08 Coverage: useful metric or vanity number?

Coverage tells you what code is exercised by tests. It does NOT tell you whether that code is correctly tested. Coverage of 90% with weak assertions is worse than coverage of 60% with strong ones.
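
A small sketch makes the difference concrete. The function below is hypothetical and contains a deliberate bug; both checks execute every line of it, so a coverage tool would report the same number for each.

```typescript
// Hypothetical function with a deliberate bug: the discount is added instead of subtracted.
function applyDiscount(price: number, percent: number): number {
  return price + price * (percent / 100); // BUG: should subtract
}

// Weak check: runs every line of applyDiscount but accepts any number at all.
function weakCheck(): boolean {
  const result = applyDiscount(200, 10);
  return typeof result === "number";
}

// Strong check: identical coverage, but it pins the expected value.
function strongCheck(): boolean {
  return applyDiscount(200, 10) === 180;
}

// weakCheck() returns true (the bug ships); strongCheck() returns false (the bug is caught).
```

The coverage report cannot distinguish these two checks, which is why the number on its own says little about how well the code is tested.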

Use coverage to:

  • Find code with zero tests (definitely worth adding some)
  • Track coverage over time, so you notice when new code starts landing untested

Don't use coverage to:

  • Set a hard threshold for PR merging (engineers will write garbage tests to hit the number)
  • Compare quality between codebases (different code structures have different natural coverage)

09 Keeping the test suite fast

Slow test suites kill the discipline. If running tests takes 10 minutes, engineers run them at the end of the day, not after each change. Bugs slip through.

Targets:

  • Unit tests: entire suite under 30 seconds. Parallelize. Avoid setup.
  • Integration tests: entire suite under 5 minutes. Parallelize aggressively. Use Testcontainers or test-specific schemas.
  • E2E tests: entire suite under 15 minutes. Parallelize across multiple browser instances. Run in CI, not locally.

Strategies that work:

  • Parallel execution. Vitest, Jest, Playwright all support this natively.
  • Isolated test data. Each test creates the data it needs, in its own transaction or schema, and rolls back.
  • Smart test selection. Run only tests affected by changed files. Vitest does this in watch mode, and vitest --changed or jest --onlyChanged do it for a one-off run against uncommitted changes.
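  • Fast database resets. Truncating every table in one statement (in PostgreSQL, TRUNCATE ... RESTART IDENTITY CASCADE; in MySQL, TRUNCATE with FOREIGN_KEY_CHECKS switched off) is usually much faster than dropping and recreating the schema.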
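
The fast-reset strategy above might look like the following in practice. This sketch assumes PostgreSQL and only builds the SQL string; quoteIdent and buildTruncateSql are hypothetical helper names, and running the statement is left to whatever database client the project already uses.

```typescript
// Quote an identifier for PostgreSQL: wrap in double quotes and double any embedded ones.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build one statement that empties every listed table.
// RESTART IDENTITY resets the tables' sequences; CASCADE also truncates
// any tables that reference the listed ones through foreign keys.
function buildTruncateSql(tables: string[]): string {
  if (tables.length === 0) {
    throw new Error("no tables to truncate");
  }
  return "TRUNCATE TABLE " + tables.map(quoteIdent).join(", ") + " RESTART IDENTITY CASCADE;";
}

const sql = buildTruncateSql(["users", "orders"]);
// sql is: TRUNCATE TABLE "users", "orders" RESTART IDENTITY CASCADE;
```

Issuing a single statement for all tables sidesteps foreign-key ordering problems, and keeping the schema in place avoids re-running migrations between tests.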

The discipline

Tests are infrastructure. They earn their keep by giving you confidence to change code without fear. A test suite that doesn't give you that confidence isn't worth running — and a test suite that gives you that confidence is worth almost any amount of investment.

Write fewer tests, but write better ones. Test behavior, not implementation. Run them often. Trust them when they pass. Take them seriously when they fail.