
Harness Engineering

First written Apr 13, 2026

An AI coding agent is two things: the model and the harness. The model generates code. The harness is everything around it — rules it reads, checks that run when it acts, guardrails that prevent damage. When output is wrong, fix the harness, not the prompt.

Guides and Sensors

The harness has two sides:

  • Guides steer before action — rules files, architecture docs, conventions, examples. Context that prevents mistakes.
  • Sensors check after action — linters, type checkers, tests, structural validations. Verification that catches mistakes.

Most teams have guides but no sensors wired into the agent loop. The tools exist (linters, type checkers, test suites) but run only in CI. Wire them into the agent’s feedback cycle so it self-corrects before a human sees the output.

Sensor Timing

When                           | What runs                                                 | Budget
After each file edit           | Linter (single file, cached) + type checker (incremental) | Under 5 seconds
When the agent finishes a task | Linter + type checker + related tests                     | Under 120 seconds
Before dangerous commands      | Pattern matching on the command                           | Under 1 second

Per-edit sensors must be fast. If they exceed 5 seconds, engineers disable them. End-of-task sensors can take longer. Safety sensors are pattern matching — near instant.
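The safety sensor can literally be a list of regexes checked before the agent runs a shell command — no model call, no network. A sketch; these patterns are illustrative, not a complete denylist:

```python
import re

# Illustrative deny patterns; a real list would be tuned to the environment.
DANGEROUS = [
    r"\brm\s+-rf\s+/",          # recursive delete from the filesystem root
    r"\bgit\s+push\s+--force",  # history rewrite on a shared branch
    r"\bDROP\s+TABLE\b",        # destructive SQL
]

def is_dangerous(command: str) -> bool:
    """Pattern match runs in well under a millisecond."""
    return any(re.search(p, command, re.IGNORECASE) for p in DANGEROUS)
```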

The Escalation Ladder

Rules exist at different enforcement levels. When a rule fails repeatedly at one level, promote it to the next:

Convention → Documentation → Linter rule → Fitness function

  • Convention: lives in people’s heads. Invisible to AI. Breaks silently.
  • Documentation: written in rule files the agent reads. Works most of the time.
  • Linter rule: encoded in tooling. Immediate feedback. Cannot be forgotten.
  • Fitness function: automated structural test. Cannot be bypassed. Protects permanently.

When documentation fails to prevent a mistake, promote the rule into code. Each promotion pays off in every future session.
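At the top of the ladder, a fitness function is just a test over structure rather than behavior. A minimal sketch, assuming a hypothetical rule that modules under `core/` must not import from `api/`:

```python
import ast
from pathlib import Path

def forbidden_imports(root: str, package: str, banned: str) -> list[str]:
    """Return files in `package` that import from `banned` --
    a structural check that runs with the regular test suite."""
    violations = []
    for path in Path(root, package).rglob("*.py"):
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            names = []
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                names = [node.module]
            if any(n == banned or n.startswith(banned + ".") for n in names):
                violations.append(str(path))
    return violations

# In the test suite, the fitness function is one assertion:
# assert forbidden_imports("src", "core", "api") == []
```

Once this test exists, the boundary rule cannot be forgotten, bypassed, or argued with — by the agent or by anyone else.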

The Self-Improving Loop

Mistake → sensor catches it → “what prevents this class of mistake?” → promote rule → next session: can’t happen

When a mistake happens, the question is not “how do I fix this code?” The question is “what harness change prevents this from ever happening again?” Each failure improves the system, not just the code.

Circuit Breakers

After three consecutive identical failures, stop retrying and escalate to a human. This prevents infinite retry loops. Every escalation is a signal — the harness has a gap. Fix the gap, not just the code.
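A minimal sketch of such a breaker, keyed on a failure signature (here assumed to be the normalized error message — the signature scheme is a design choice, not a fixed convention):

```python
from collections import deque

class CircuitBreaker:
    """Escalate to a human after N consecutive identical failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)  # sliding window of signatures

    def record(self, failure_signature: str) -> bool:
        """Record a failure; return True when it is time to escalate."""
        self.recent.append(failure_signature)
        return (len(self.recent) == self.threshold
                and len(set(self.recent)) == 1)

    def success(self) -> None:
        """Any success resets the streak."""
        self.recent.clear()
```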

Accidental vs Essential

The harness targets accidental complexity — mechanical mistakes that follow patterns. Essential complexity (which pattern should exist, where boundaries should be, how to model the domain) stays with humans.

Harness solves                      | Humans solve
Code follows the wrong pattern      | Which pattern should exist
Import violates a boundary          | Where the boundary should be
Missing type annotation             | What the domain model should look like
Test doesn't exist for changed code | What behavior to test for

Severity Gates: The Judgment Layer

The harness detects problems (sensors catch violations). Severity-gated review classifies and gates progress. When a sensor fires, the finding enters the severity framework: CRITICAL → NO-GO, MAJOR → CONDITIONAL, MINOR → GO. The harness is the runtime enforcement. The severity gate is the judgment layer on top.
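The gate itself reduces to a small pure function over the findings. A sketch assuming only the three tiers named above:

```python
from enum import Enum

class Severity(Enum):
    CRITICAL = 3
    MAJOR = 2
    MINOR = 1

def verdict(findings: list[Severity]) -> str:
    """Gate on the worst finding: CRITICAL -> NO-GO,
    MAJOR -> CONDITIONAL, MINOR or none -> GO."""
    if Severity.CRITICAL in findings:
        return "NO-GO"
    if Severity.MAJOR in findings:
        return "CONDITIONAL"
    return "GO"
```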

See severity-gated-review for the full framework — severity tiers, verdict logic, multi-agent orchestration, and structured findings format.

Decision Criteria

When building a harness: start with guides (rules files, documentation). When a documented rule gets violated more than once, promote it to a linter rule or fitness function. When the agent can’t self-correct after 3 attempts, add a circuit breaker. Prioritize sensors that give the fastest feedback — per-edit linting before end-of-task testing.

When deciding what to enforce: only enforce what already works. Clean up first, then add the fitness function. Never write enforcement for aspirational states — that breaks the existing codebase.

Anti-patterns

  • Fixing prompts instead of the environment — prompt changes are session-specific. Harness changes are permanent.
  • Guides without sensors — telling the agent what to do without checking if it did it right.
  • Sensors too slow — per-edit checks over 5 seconds get disabled. Keep them fast.
  • No circuit breakers — agents retrying the same error endlessly waste time and tokens.
  • Aspirational enforcement — writing fitness functions for a codebase state that doesn’t exist yet.
  • Building the harness all at once — the harness grows with the codebase. Start small, compound over time.

Experience Notes

Previously: guides only (dozens of rule files, zero sensors). Every PR required full manual review. Now: sensors wired into the agent loop — lint after every edit, type check incrementally, run related tests before task completion. Review shifted from catching mechanical mistakes to verifying design decisions.