
Fitness Functions

First written Apr 13, 2026

A fitness function is any automated check that protects an architectural property. Unit tests verify what code does. Fitness functions verify how code is structured — module boundaries, layer direction, coupling, contracts.

The Missing Layer

The testing pyramid has a gap. Linters catch style. Unit tests catch logic. Integration tests catch contracts. Nothing catches structural decay — boundary erosion, layer violations, coupling growth.

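Fitness functions fill that gap. They sit at the top of the escalation ladder (convention → documentation → linter → fitness function): a convention can be forgotten and documentation can go unread, but a failing fitness function blocks the change until it is fixed.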
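To make the idea concrete, here is a minimal sketch of a layer-direction check. The layer names (`ui`, `application`, `domain`) and the rule that a layer may only import from layers below it are assumptions chosen for illustration, not something prescribed above.

```python
import ast

# Hypothetical layer order, outermost first. Assumed rule: a layer may
# import only from layers that come after it in this list.
LAYERS = ["ui", "application", "domain"]


def imported_modules(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def layer_violations(layer: str, source: str) -> list[str]:
    """List imports that point upward, from `layer` to a layer above it."""
    rank = LAYERS.index(layer)
    higher = set(LAYERS[:rank])
    return sorted(f"{layer} -> {name}" for name in imported_modules(source) & higher)


# A domain module reaching up into the UI layer breaks the direction rule.
bad_domain = "from ui.widgets import Button\nimport domain.orders\n"
print(layer_violations("domain", bad_domain))  # ['domain -> ui']
```

Run against every file in a layer and fail the build on a non-empty result, a check like this protects a structural property that unit tests never look at.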

Architectural Dimensions

Architecture spans multiple dimensions. Each can degrade independently. Select fitness functions based on which dimensions are critical to the project.

| Dimension | What degrades without it | What to check |
|---|---|---|
| Structure | Boundaries blur, coupling grows | Import direction, circular dependencies, coupling thresholds |
| Contracts | API changes break clients | Breaking change detection, schema-to-code drift |
| Data | Migrations destroy data | Destructive operation blocking, entity-schema consistency |
| Maintainability | Files grow, complexity hides | Complexity limits, file size caps, dead export detection |
| Performance | Bundles bloat, queries slow | Bundle size budgets, query cost limits, response time thresholds |
| Security | Secrets leak, endpoints unprotected | Secret scanning, CVE detection, auth enforcement |
| Reliability | Errors cascade, timeouts missing | Error boundary requirements, timeout enforcement |
| Observability | Incidents take hours to diagnose | Structured log format, required trace spans |

Most teams only cover maintainability (via linters). Everything else is trust and manual review.
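As an illustration of a dimension beyond maintainability, the sketch below covers the Contracts row: a breaking change detector that compares two versions of a response schema. The flat field-to-type representation and the rule that added fields are non-breaking are simplifying assumptions; a real API schema would need a richer comparison.

```python
def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Compare two field -> type maps and report changes that would break clients.

    Assumption: removing a field or changing its type is breaking;
    adding a field is not.
    """
    problems = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(f"type changed: {field} ({old_type} -> {new[field]})")
    return problems


# Hypothetical schemas for two versions of the same endpoint.
v1 = {"id": "int", "email": "str", "name": "str"}
v2 = {"id": "str", "email": "str", "created_at": "str"}

for problem in breaking_changes(v1, v2):
    print(problem)
# type changed: id (int -> str)
# removed field: name
```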

The Ratchet Pattern

Architecture quality should only go up:

  1. Clean up a module
  2. Add it to an enforcement list (violations become errors)
  3. New modules are enforced from creation
  4. The list grows. Quality never decreases.

Never write fitness functions for the codebase you want — write them for the one you have. Clean up first, then enforce. The function is a lock, not a goal.

For migrations, the same list tracks progress: migrated modules go on the enforcement list, where violations are errors. Unmigrated modules get warnings. As migration progresses, the list grows.
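The ratchet can be sketched in a few lines. The module names and the `ENFORCED` set are hypothetical, and the sketch assumes some other check has already produced a list of violations per module. It does not cover how new modules get onto the list at creation.

```python
from dataclasses import dataclass, field

# Hypothetical enforcement list: modules that have been cleaned up and are now locked.
ENFORCED = {"billing", "orders"}


@dataclass
class Result:
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Only errors fail the check; warnings are reported but do not block.
        return not self.errors


def ratchet(violations: dict[str, list[str]], enforced: set[str]) -> Result:
    """Enforced modules produce errors; all other modules produce warnings."""
    result = Result()
    for module, messages in sorted(violations.items()):
        target = result.errors if module in enforced else result.warnings
        target.extend(f"{module}: {message}" for message in messages)
    return result


def list_only_grows(previous: set[str], current: set[str]) -> bool:
    """The ratchet itself: no module may ever leave the enforcement list."""
    return previous <= current


found = {"billing": ["imports ui"], "legacy_reports": ["imports ui"]}
outcome = ratchet(found, ENFORCED)
print(outcome.passed)    # False
print(outcome.errors)    # ['billing: imports ui']
print(outcome.warnings)  # ['legacy_reports: imports ui']
```

Comparing the list in a pull request against the list on the main branch with `list_only_grows` is one way to keep the lock from being loosened quietly.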

Scope

  • Atomic — single check, single property. Fast. “This file has a lint error.”
  • Holistic — combines signals across the system. “All boundaries respected AND tests pass AND contracts valid.”

Both are needed. Atomic catches individual violations. Holistic catches emergent issues.
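One way to picture the relationship is a holistic check built by composing atomic ones. In this sketch the three atomic checks are placeholder lambdas standing in for real tool runs; their names mirror the example above and are otherwise hypothetical.

```python
from typing import Callable

# An atomic check: one property, one pass/fail answer.
Check = Callable[[], bool]


def holistic(checks: dict[str, Check]) -> tuple[bool, list[str]]:
    """Run every atomic check; pass only if all pass, and report which failed."""
    failed = [name for name, check in checks.items() if not check()]
    return (not failed, failed)


# Hypothetical atomic checks; each lambda stands in for a real tool invocation.
atomic_checks = {
    "boundaries respected": lambda: True,
    "tests pass": lambda: True,
    "contracts valid": lambda: False,
}

ok, failed = holistic(atomic_checks)
print(ok, failed)  # False ['contracts valid']
```

A plain AND over atomic results is the simplest form of holistic check. It will not by itself surface emergent issues that only show up when signals are correlated across the system.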

Timing

| Timing | When | What it catches |
|---|---|---|
| Triggered | On file edit, commit, or PR | Individual violations as they happen |
| Continuous | Daily or weekly schedule | Gradual decay: dead code, dependency drift, convention erosion |
| Temporal | Pre-release or quarterly | Accumulated risk: security audits, architecture reviews |

All three are needed. Triggered catches mistakes. Continuous catches drift. Temporal catches risk.
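A small registry can make the timing of each check explicit. The check names and event names below are hypothetical, chosen to match the table. The sketch only shows how an event selects the checks to run, not how the checks are implemented or scheduled.

```python
from enum import Enum


class Timing(Enum):
    TRIGGERED = "triggered"
    CONTINUOUS = "continuous"
    TEMPORAL = "temporal"


# Hypothetical registry: every check declares when it runs.
REGISTRY = {
    "import-direction": Timing.TRIGGERED,
    "dead-code": Timing.CONTINUOUS,
    "dependency-drift": Timing.CONTINUOUS,
    "security-audit": Timing.TEMPORAL,
}

# Assumed mapping from the event that fires to the timing class it selects.
EVENT_TIMING = {
    "file_edit": Timing.TRIGGERED,
    "commit": Timing.TRIGGERED,
    "pull_request": Timing.TRIGGERED,
    "daily": Timing.CONTINUOUS,
    "weekly": Timing.CONTINUOUS,
    "pre_release": Timing.TEMPORAL,
    "quarterly": Timing.TEMPORAL,
}


def checks_for(event: str) -> list[str]:
    """Return the names of the checks that should run for a given event."""
    timing = EVENT_TIMING[event]
    return sorted(name for name, when in REGISTRY.items() if when is timing)


print(checks_for("commit"))       # ['import-direction']
print(checks_for("weekly"))       # ['dead-code', 'dependency-drift']
print(checks_for("pre_release"))  # ['security-audit']
```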

Why AI Makes This Critical

AI amplifies whatever patterns exist. One boundary violation becomes the pattern the agent copies for every new file. Without fitness functions, a single shortcut propagates at machine speed.

AI can’t review its own architecture. Generated code can pass linters and tests while its structural properties go unchecked, unless a fitness function checks them.

The cost of not enforcing compounds. When multiple AI agents generate code in parallel, every unenforced rule produces violations in every session.

Connection to Severity-Gated Review

When a fitness function fails, it becomes a finding in the severity-gated review. The fitness function detects the violation. The severity framework classifies it (CRITICAL / MAJOR / MINOR) and gates progress (NO-GO / CONDITIONAL / GO). Fitness functions are the automated sensors; severity gates are the judgment layer. See severity-gated-review.
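The handoff from sensor to judgment layer might look like the sketch below. The severity levels and gate outcomes are the ones named above, but the mapping between them (any CRITICAL gives NO-GO, any MAJOR gives CONDITIONAL, otherwise GO) is an assumption made for illustration. The actual rules belong to the severity-gated review.

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def gate(findings: list[Severity]) -> str:
    """Turn classified findings into a gate decision.

    Assumed rule: the worst finding decides the outcome.
    """
    worst = max(findings, default=None)
    if worst is Severity.CRITICAL:
        return "NO-GO"
    if worst is Severity.MAJOR:
        return "CONDITIONAL"
    return "GO"


print(gate([]))                                   # GO
print(gate([Severity.MINOR, Severity.MAJOR]))     # CONDITIONAL
print(gate([Severity.CRITICAL, Severity.MINOR]))  # NO-GO
```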

Decision Criteria

When choosing which dimensions to protect: ask “if this degrades silently, does the system fail in a way that matters?” If yes, add a fitness function.

When deciding scope and timing: use atomic + triggered for fast per-edit feedback. Use holistic + triggered for CI validation. Use atomic + continuous for drift detection. Use manual review on a temporal schedule for what can’t be automated.

When timing enforcement: don’t build fitness functions for problems you don’t have yet. Identify the current constraint. Build for it. Evolve when the constraint changes.

Anti-patterns

  • Aspirational fitness functions — encoding what you wish the codebase looked like. Breaks everything on day one.
  • Only maintainability coverage — linters check style but not architecture. Most structural decay is invisible to linters.
  • No continuous checks — triggered checks catch per-edit violations but miss gradual drift.
  • All-or-nothing enforcement — either enforce everything (breaks existing code) or nothing (allows regression). The ratchet pattern is the middle path.
  • Fitness functions without cleanup — enforcement without migration work creates noise, not quality.