
When Intent Engineering Fails

Intent Engineering does not prevent specs and agent instructions from falling out of sync with the codebase. It makes that gap visible and recoverable, which is a meaningfully weaker claim, and the honest one.

The five failure modes below survive good initial setup. They are not beginner mistakes. They emerge once the initial discipline wears off, gradually enough that nothing alerts you before the damage compounds. This chapter sits here, before the practices, so the rest of the book does not read as sales material for itself.

Agent instructions rot

The entry point goes stale. The agent instructions say to follow the old module layout, while an ADR reversed that layout months ago. Nobody updated the agent instructions, so the agent reads them, not the ADR, and produces code shaped for the old system.

The fix is structural and slightly painful. Treat agent instructions as part of the architecture, not part of the initial setup. Any PR that changes something the agent instructions describe must update them in the same PR. What to put in AGENTS.md and .agents/instructions/... is covered in the Agent Instructions topic. No CI check catches "the convention you describe no longer matches the code", so keeping the instructions current stays your job.
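A co-change reminder is about as far as automation goes here. The sketch below assumes a hypothetical mapping from each instruction file to the path patterns it describes (the `WATCHED` table and its patterns are illustrative, not taken from any tool). It flags a change set that touches described paths without touching the instruction file. It cannot tell whether the instructions are still accurate.

```python
from fnmatch import fnmatch

# Hypothetical mapping: instruction file -> path patterns it describes.
# The patterns are placeholders; a real repo would list its own.
WATCHED = {
    "AGENTS.md": ["src/*", "docs/adr/*"],
}


def stale_instruction_candidates(changed_files, watched=WATCHED):
    """Return instruction files whose described paths changed without them.

    changed_files: paths touched by one PR, relative to the repo root.
    """
    changed = set(changed_files)
    flagged = []
    for instructions, patterns in watched.items():
        touches_described = any(
            fnmatch(path, pattern) for path in changed for pattern in patterns
        )
        # Only a reminder: the described area moved and the instructions did not.
        if touches_described and instructions not in changed:
            flagged.append(instructions)
    return sorted(flagged)
```

A flagged file means "a human should look", not "the instructions are wrong". A PR that changes code without changing any convention will trip the check too, so it likely belongs as a review prompt rather than a hard gate.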

Dead specs

Open openspec/changes/ and find a pile of directories: implemented changes, canceled changes, an implemented change never archived, a partially done change from before the original author left, and competing proposals for the same change.

Without an archive step, the agent has no signal to distinguish a canceled spec from an active one. Whatever it reads, it reads as live instruction. De Schryver's case for keeping agentic workflows simple lands here: the clutter compounds with every change the team leaves un-archived.

A dead spec is worse than no spec. It tells the agent authoritatively about behavior the system no longer has, decisions that were reversed, and acceptance criteria never proven. Worse, it does so as the agent's first read of the change folder. Archive immediately after implementation. The Spec Lifecycle chapter builds the archive discipline that prevents this.
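The archive decision can be sketched as a partition over the change folder. This assumes each change carries an explicit status marker, which is a hypothetical convention here, as are the status names in `ACTIVE`. Anything not explicitly active is queued for archiving, because the agent would otherwise read it as live.

```python
from dataclasses import dataclass

# Hypothetical status values that mean "the agent should still read this".
ACTIVE = {"proposed", "in-progress"}


@dataclass(frozen=True)
class Change:
    name: str    # directory name under the changes folder
    status: str  # hypothetical marker, e.g. read from a file in that directory


def split_changes(changes):
    """Partition changes into (keep, archive) lists of names.

    Unknown or missing statuses land in `archive` on purpose: a change
    nobody can vouch for should not sit where the agent treats it as live.
    """
    keep = [c.name for c in changes if c.status in ACTIVE]
    archive = [c.name for c in changes if c.status not in ACTIVE]
    return keep, archive
```

Competing proposals for the same change would both show up in `keep`. Choosing between them is a human decision the sketch does not attempt.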

Sources: De Schryver, "Keep Agentic AI Simple" (2026), clutter as a compounding factor in agent context.

Agent-accelerated tech debt

Without spec-first discipline, the agent produces code that satisfies the immediate ask and quietly violates an architectural decision nobody read out loud. At human speed, this kind of drift accumulated across quarters. At agent speed, a day of merged PRs adds the architectural contradictions that once took weeks of hand-written changes to produce. Yegge's framing of the agentic shift fits: velocity amplifies whatever discipline is already there, and whatever is missing.

The Spec-Driven topic exists because of this mode. Writing the spec before the agent implements gives the agent the intention it needs. Decisions written down as ADRs are constraints the agent will follow. Intent and constraints living only in human memory will be violated.

Sources: Yegge, "Revenge of the junior developer," Sourcegraph blog (Mar 22, 2025), agent velocity as amplifier.

Over-spec

The team writes multi-page specs for a config rename. The spec becomes the bottleneck. Review cycles stretch. The agent, asked to implement from requirements buried late in the spec, drifts during the long reading pass and misses the requirement that mattered.

Spec length is a cost, not a quality signal. Every token spent reading the spec is a token unavailable for reasoning about the code. LeanSpec's framing applies here: if the spec is longer than the implementation would be, something has gone wrong. Match formality to risk. Payment processing earns a thorough spec. A config-key rename does not.
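The length heuristic above can be sketched as a crude comparison of non-blank line counts. The function name and the `high_risk` override are assumptions made for illustration. Line count is only a proxy for token cost, and the override stands in for the "match formality to risk" judgment, which stays a human call.

```python
def nonblank_lines(text):
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in text.splitlines() if line.strip())


def spec_too_long(spec_text, impl_text, high_risk=False):
    """Flag a spec that is longer than the implementation it describes.

    high_risk: hypothetical override for areas such as payment processing,
    where a thorough spec is worth its length.
    """
    if high_risk:
        return False
    return nonblank_lines(spec_text) > nonblank_lines(impl_text)
```

A flag here is a prompt to ask whether the spec earns its length, not a verdict that it should be cut.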

The "Why Small" chapter in the Spec-Driven topic goes further.

Sources: LeanSpec (lean-spec.dev), small-spec discipline and formality-to-risk matching.

Drift with no detection

The team has agent instructions, ADRs, specs, and good initial intentions. Six months later, the repo has six ADRs from the first month and nothing since, a design doc last touched in March, and a docs/INDEX.md last updated when someone new joined. Nobody violated a rule because there is no rule about update frequency. There is only drift, and nothing detecting it.

A convention check in CI closes part of this loop. It catches structural violations before they reach the main branch. The check does not catch ADRs that should have been written and were not. It does not detect an architecture overview that was accurate a year ago and is now misleading. Detection of content drift is harder than detection of structural drift, and most of it remains a human responsibility.
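One cheap signal that is available is document age. The sketch below assumes the last-updated date of each document is already known (for example, taken from version control history) and uses a 90-day threshold chosen arbitrarily for illustration. It reports which documents have gone quiet. It says nothing about whether their content is still true.

```python
from datetime import date, timedelta


def stale_docs(last_updated, today, max_age_days=90):
    """Return paths whose last update is older than max_age_days.

    last_updated: dict mapping a document path to a datetime.date.
    max_age_days: hypothetical threshold; pick one that fits the team's pace.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        path for path, updated in last_updated.items() if updated < cutoff
    )
```

An old document can still be accurate and a freshly edited one can still mislead, so the output is a reading list for a human reviewer rather than a pass/fail result.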

The agent-evaluation chapter covers what detection is available and where the limits are.

Why the rest of the book is organized the way it is

Each topic targets one or more of these modes directly:

| Failure mode | Topic that addresses it |
| --- | --- |
| Agent instructions rot | Agent Instructions |
| Dead specs | Spec-Driven Development |
| Agent-accelerated tech debt | Spec-Driven Development |
| Over-spec | Spec-Driven Development (Why Small) |
| Drift with no detection | Quality and Verification |

ThoughtWorks Radar Vol 34 names the cost that accrues when these modes go unaddressed: cognitive debt, the agentic-era version of the undocumented decision that quietly breaks a deploy. Keeping the agent's context coherent enough to hold that debt down is what the Radar calls "harness engineering". The rest of this book is about building that harness, one failure mode at a time.

Zero drift is not the goal. Catching it before it compounds is.

Six months of accumulated mismatch traces back to what the agent was reading. What the agent should have been reading is the question the Agent Instructions topic exists to answer.

Sources: ThoughtWorks Technology Radar Vol 34 (April 2026), cognitive debt and harness engineering as the frame for drift that no check catches.