Error Handling Patterns

01 Errors are not one thing

The first conceptual leap: stop treating all errors the same. There are at least three categories that need different handling:

Expected errors: things that fail in normal operation. User typed a bad email, payment was declined, file not found. These are part of the protocol.
Unexpected errors: bugs, contract violations. The caller passed null where it shouldn't be, an invariant was broken, an assertion failed.
Catastrophic errors: the world is broken. Database is down, disk is full, network is partitioned.

Each calls for a different mechanism. Treating "user typed a bad email" the same as "database is down" gets you code that's noisy AND fragile.

02 Exceptions vs Result types

Two models for representing failures in the type system:

Exceptions: errors are out-of-band. Function signature says "returns X." On error, control flow jumps to a catch handler.

Result types: errors are in-band. Function signature says "returns Result<X, Error>." Caller must explicitly handle both cases.

✓ exceptions

function parseEmail(s: string): Email {
  if (!isValid(s)) throw new ValidationError('Invalid email');
  return s as Email;
}

// Caller chooses where to catch
try {
  const email = parseEmail(input);
  await sendWelcomeEmail(email);
} catch (e) {
  if (e instanceof ValidationError) showFieldError(e.message);
  else throw e; // rethrow anything this handler doesn't understand
}

✓ Result type

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function parseEmail(s: string): Result<Email, string> {
  if (!isValid(s)) return { ok: false, error: 'Invalid email' };
  return { ok: true, value: s as Email };
}

// Caller must handle both
const result = parseEmail(input);
if (!result.ok) {
  showFieldError(result.error);
  return;
}
await sendWelcomeEmail(result.value);

Result types make the error path visible in the type system. The compiler won't let you read value until you have checked ok, so the failure case can't be skipped by accident. The trade-off: boilerplate. Every operation that can fail needs unwrapping.
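Much of that unwrapping boilerplate can be absorbed by a couple of small helpers. The following is a minimal sketch, not a standard API: ok, err, map, andThen, parseAge, requireAdult and describeVisitor are all names invented here for illustration.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Constructors so call sites don't repeat the object literals.
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Transform the success value; pass a failure through untouched.
function map<T, U, E>(r: Result<T, E>, fn: (value: T) => U): Result<U, E> {
  return r.ok ? ok(fn(r.value)) : r;
}

// Chain a second fallible step; the first failure short-circuits the rest.
function andThen<T, U, E>(
  r: Result<T, E>,
  fn: (value: T) => Result<U, E>
): Result<U, E> {
  return r.ok ? fn(r.value) : r;
}

// Hypothetical validation steps, each of which can fail in normal operation.
function parseAge(s: string): Result<number, string> {
  const n = Number(s);
  return Number.isInteger(n) && n >= 0 ? ok(n) : err(`not a valid age: ${s}`);
}

function requireAdult(age: number): Result<number, string> {
  return age >= 18 ? ok(age) : err('must be 18 or older');
}

// Two fallible steps and one transformation, with a single check at the end.
function describeVisitor(input: string): Result<string, string> {
  return map(andThen(parseAge(input), requireAdult), (age) => `adult, age ${age}`);
}
```

The caller still has to check ok once, but intermediate steps no longer need their own if blocks.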

Rule of thumb: use Result types (or equivalent) for expected errors. Use exceptions for unexpected errors (bugs, programmer mistakes). This matches Rust's distinction between Result and panic!, and Go's distinction between returned errors and panic.
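A minimal sketch of how the two mechanisms can sit side by side in one function. The names withdraw, WithdrawError and assertInvariant are hypothetical, chosen only to illustrate the split.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Expected failure: a normal outcome that the caller must handle.
type WithdrawError = 'insufficient_funds';

// Unexpected failure: a broken precondition means a bug, so it throws.
function assertInvariant(condition: boolean, message: string): asserts condition {
  if (!condition) throw new Error(`Invariant violated: ${message}`);
}

function withdraw(
  balanceCents: number,
  amountCents: number
): Result<number, WithdrawError> {
  // Callers must never pass a non-positive or fractional amount.
  assertInvariant(
    Number.isInteger(amountCents) && amountCents > 0,
    'amount must be a positive integer'
  );

  // Running out of money is part of normal operation, so it is returned.
  if (amountCents > balanceCents) return { ok: false, error: 'insufficient_funds' };
  return { ok: true, value: balanceCents - amountCents };
}
```

The signature advertises only the failure a caller can reasonably act on; the invariant violation is left to propagate to whatever boundary reports bugs.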

03 Error boundaries — where to catch

The most common mistake: catching errors too early. Code wraps every operation in try-catch and logs the error. Result: errors get hidden in deep layers, and operations continue with broken state.

Better: define explicit error boundaries. Catch at the boundary, not at every internal call.

✓ HTTP handler as the boundary

// Each layer just throws — no try-catch
async function createOrder(req: Request) {
  const user = await authenticate(req);    // throws on bad auth
  const cart = await getCart(user.id);     // throws on not found
  const payment = await charge(cart);       // throws on decline
  return await saveOrder(user, cart, payment);
}

// Single boundary catches everything
app.post('/orders', async (req, res) => {
  try {
    const order = await createOrder(req);
    res.status(201).json(order);
  } catch (e) {
    respondWithError(res, e);
  }
});

One catch block, in the right place. Each layer of business logic is clean. The boundary maps errors to HTTP responses.

04 Custom error classes

Throwing strings or generic Error makes catch handlers messy. Custom error classes let the catch handler dispatch cleanly:

✓ typed error hierarchy

class AppError extends Error {
  constructor(message: string, public code: string) {
    super(message);
    this.name = new.target.name; // subclasses report their own class name in logs
  }
}

class NotFoundError extends AppError {
  constructor(resource: string) {
    super(`${resource} not found`, 'not_found');
  }
}

class ValidationError extends AppError {
  constructor(public field: string, message: string) {
    super(message, 'validation_error');
  }
}

// Boundary handler. `Response` is assumed to be Express's response type.
function respondWithError(res: Response, e: unknown): void {
  if (e instanceof NotFoundError) res.status(404).json({ error: e.code });
  else if (e instanceof ValidationError) res.status(422).json({ error: e.code, field: e.field });
  else { logger.error(e); res.status(500).json({ error: 'internal' }); }
}

05 Retry — exponential backoff with jitter

Transient failures (network blips, momentary overload) should be retried. Permanent failures (validation errors, auth failures) should not. Knowing which is which is half the challenge.

The retry pattern that works:

✓ exponential backoff with jitter

async function retry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelay = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      if (!isRetryable(e)) throw e;
      // Last attempt failed: give up without sleeping first
      if (attempt === maxAttempts - 1) break;

      // Exponential delay between attempts: 100, 200, 400, 800 ms
      // Plus jitter: ±50% of the delay
      const delay = baseDelay * 2 ** attempt;
      const jitter = delay * (Math.random() - 0.5);
      await sleep(delay + jitter);
    }
  }
  throw lastError;
}

Three things make this work:

Exponential delay. Each retry waits twice as long. Doesn't hammer a struggling service.
Jitter. Random variance prevents thundering herd when many clients retry simultaneously after an outage.
Retryability check. Don't retry validation errors. Only retry network/timeout/5xx errors.
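The retry function above calls sleep and isRetryable without defining them. One way they could look, as a sketch: HttpError and TimeoutError are hypothetical error classes standing in for whatever your HTTP client actually throws, and the policy of treating unknown errors as permanent is an assumption.

```typescript
// Resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical error carrying an HTTP status code.
class HttpError extends Error {
  constructor(public readonly status: number, message: string) {
    super(message);
    this.name = 'HttpError';
  }
}

// Hypothetical error raised when a call exceeds its deadline.
class TimeoutError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'TimeoutError';
  }
}

// Retry timeouts and 5xx responses; treat everything else as permanent.
function isRetryable(e: unknown): boolean {
  if (e instanceof TimeoutError) return true;
  if (e instanceof HttpError) return e.status >= 500;
  // Assumption: an error we can't classify is more likely a bug than a blip.
  return false;
}
```

Defaulting to "not retryable" keeps validation errors and programmer mistakes from being retried five times before they surface.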

06 Circuit breaker — stop retrying when broken

Retries help with momentary blips. But if the downstream service has been down for 5 minutes, hammering it with retries makes things worse. Circuit breakers solve this.

The state machine:

Closed: requests pass through normally. Failures counted.
Open: failure threshold exceeded. All requests fail fast without trying. Wait period starts.
Half-open: after the wait, allow one test request. If it succeeds, close. If it fails, reopen.

Libraries: opossum (Node), Polly (.NET), Resilience4j (Java), Hystrix (legacy). Don't roll your own — the state management is finicky to get right.
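To make the state machine concrete, here is a deliberately small sketch. It is for illustration only and is not a substitute for the libraries above. The class name, the default threshold and wait period, the injectable clock, and the choice to count consecutive failures (rather than a failure rate over a window) are all assumptions.

```typescript
type State = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private state: State = 'closed';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold = 5,      // assumed default
    private readonly resetTimeoutMs = 30_000,   // assumed default
    private readonly now: () => number = () => Date.now() // injectable for tests
  ) {}

  getState(): State {
    return this.state;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    // Half-open: one test request is already in flight, so fail fast.
    if (this.state === 'half-open') throw new Error('Circuit half-open: probe in flight');

    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Circuit open: failing fast');
      }
      this.state = 'half-open'; // wait is over: let this one request through
    }

    try {
      const result = await fn();
      this.state = 'closed'; // success closes the circuit and resets the count
      this.failures = 0;
      return result;
    } catch (e) {
      this.failures++;
      // A failed probe reopens immediately; otherwise open at the threshold.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw e;
    }
  }
}
```

Even this toy version has to decide what counts as a failure, when counters reset, and how concurrent callers behave during a probe, which is the finicky part the libraries handle for you.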

07 Timeouts — set them everywhere

Every operation that can hang must have a timeout. Without timeouts, a slow downstream service ties up your threads/connections until your entire service falls over.

✓ HTTP client with timeout

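async function fetchWithTimeout(url: string, timeoutMs = 5000): Promise<Response> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);

  try {
    // Rejects with an AbortError if the timer fires before the response arrives
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timeoutId);
  }
}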

Default timeouts: HTTP outbound 5-30 seconds, database queries 5-15 seconds, background jobs 5-30 minutes. Adjust per use case but never use "infinite."
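Not every operation accepts an abort signal. For those, a generic wrapper can at least bound how long the caller waits. This is a sketch under that assumption; withTimeout and TimeoutError are names made up here, and the wrapper only stops waiting, it does not cancel the underlying work.

```typescript
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms} ms`);
    this.name = 'TimeoutError';
  }
}

// Reject with TimeoutError if `promise` has not settled within `ms`.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Usage would look like await withTimeout(db.query(sql), 10_000), where db.query is a placeholder for any promise-returning call. Because the original operation keeps running in the background, prefer a native timeout or cancellation option when the client offers one.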

08 Log errors with context, not just the message

logger.error('Failed') is useless. Errors need context — what was the operation, what were the inputs, who was the user, what was the request ID.

✓ structured error log

logger.error({
  msg: 'Payment charge failed',
  err: error,                       // stack trace via library
  user_id: user.id,
  amount_cents: cart.total,
  payment_method: payment.id,
  request_id: req.id,
  retry_attempt: attempt,
});
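The "stack trace via library" comment matters because a plain JSON.stringify of an Error produces {}: message and stack are not enumerable properties. If your logger doesn't serialize errors for you, a helper along these lines can fill the gap. serializeError and formatErrorLog are hypothetical names, and the output shape is an assumption.

```typescript
// Flatten an error into plain data that survives JSON.stringify.
function serializeError(e: unknown): Record<string, unknown> {
  if (e instanceof Error) {
    return { name: e.name, message: e.message, stack: e.stack };
  }
  // Non-Error values can be thrown too; keep whatever we were given.
  return { message: String(e) };
}

// Build one structured log line from a message, an error, and its context.
function formatErrorLog(
  msg: string,
  err: unknown,
  context: Record<string, unknown>
): string {
  return JSON.stringify({ level: 'error', msg, err: serializeError(err), ...context });
}
```

Each context field ends up as a top-level key, so log tooling can filter on user_id or request_id without parsing the message text.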

09 Anti-patterns to spot

Swallowing exceptions. catch (e) { /* nothing */ } or worse, catch (e) { console.log(e); } in production code. The error happened; it should surface somewhere visible.

Returning null on error. Caller can't distinguish "no result" from "operation failed." Use exceptions or Result types.

Generic catch-and-rethrow. catch (e) { throw new Error('oops'); } destroys the original stack trace and context. Either handle the error or let it propagate.
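When a layer genuinely needs to add context, it can wrap the error while keeping the original attached. A minimal sketch, with hypothetical names throughout (OrderError, loadOrder, and readOrderFromDisk, which here is a stand-in that always fails):

```typescript
// Wrapper that adds context and keeps a reference to the underlying error.
class OrderError extends Error {
  constructor(message: string, public readonly original: unknown) {
    super(message);
    this.name = 'OrderError';
  }
}

// Stand-in for a lower-level call that fails.
function readOrderFromDisk(id: string): string {
  throw new Error(`ENOENT: no such file, open 'orders/${id}.json'`);
}

function loadOrder(id: string): string {
  try {
    return readOrderFromDisk(id);
  } catch (e) {
    // Add what this layer knows (which order) without discarding `e`.
    throw new OrderError(`Failed to load order ${id}`, e);
  }
}
```

The boundary can then log both the high-level message and original, including its stack. On runtimes that support ES2022, the built-in form new Error(message, { cause: e }) serves the same purpose without a custom field.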

Mixing error types. Code that returns null for some failures, throws for others, returns {error: ...} for others. Pick one model per layer and apply it consistently.

Logging in the middle. Code logs the same error at three levels as it propagates up. Result: every error appears three times in the logs, and you can't tell how often it actually happened. Log once, at the boundary.

∞ The discipline

Error handling is the part of code that runs least often and matters most. The happy path runs millions of times in dev. The error path runs once at 3am when something has gone wrong. The code that's designed for that moment is the code that lets you sleep through the night.

Distinguish error types. Define error boundaries. Use typed errors. Retry transient failures with backoff. Time out everything. Log with context. None of this is glamorous, but it's what separates code that survives production from code that crashes when it's most needed.