IT Governance | 8 May 2026 | 13 min read

Spec-Driven Development: Governing AI Coding Agents at Enterprise Scale

AI coding agents now write the majority of code — the question is no longer whether, but under what control. Spec-Driven Development provides the governance framework that lets mid-size organizations master agentic delivery.

R&D Team

Alev-B Research & Development

From Prompt to Specification: Why the Delivery Paradigm Is Shifting

Within roughly two years, the way software is produced has shifted fundamentally. AI coding agents generate pull requests, refactor modules, and write tests at a pace human reviewers can no longer follow line by line. The initial euphoria around "vibe coding" — freely improvised prompts that yield runnable code — has given way to a sobering operational question: how does an organization retain control over code it did not write itself and has not fully read?

The answer that emerged as an enterprise standard across 2025 and 2026 is Spec-Driven Development (SDD). Its core is an inversion: the primary artifact is no longer the prompt, but a versioned, machine- and human-readable specification. The AI agent is the executor — the specification is the source of truth against which every generated change is checked. InfoQ describes this transition as the decisive maturity step that moves AI code generation out of experimentation and into the regulated delivery pipeline.

This shift is more than a tool change. It changes where in the value chain the intellectual work happens. Previously, diligence flowed into writing the code; going forward, it flows into precisely formulating intent, constraints, and acceptance criteria. The industry has coined the term "intent engineering" for this — the discipline of specifying a business intent so completely and unambiguously that an agent can implement it correctly and a human can check the result against that intent.

For those accountable in IT delivery management, this is a first-order governance question. When the executing instance is no longer a person but a non-deterministic model, traceability, auditability, and accountability shift from the person to the process. This very bridge — from strategic AI readiness to operational delivery reality — is the subject of this article.

Spec-Driven Development moves the primary artifact from the ephemeral prompt to the versioned specification. For the first time, AI code generation becomes auditable, reproducible, and governance-ready.

"Vibe Coding Is Dead": What the Phrase Actually Means

The phrase "vibe coding is dead" has circulated as a slogan since the most recent Thoughtworks Technology Radar. It is deliberately sharp, but it captures a real finding. "Vibe coding" described the early practice of giving a model unstructured instructions and judging the result intuitively — fast, creative, and often brilliant for prototypes. For production systems that must be maintained, however, this way of working produces a structural problem: code with no traceable intent.

Thoughtworks and others summarize the consequence under the term cognitive debt. This does not mean defective code; the code may be technically flawless. It means the shared mental model of the system erodes: the team can no longer reliably explain why a component behaves the way it does. This debt is insidious because it surfaces only at the next increment, the next incident, or the next staffing change. We treat this mechanism in depth in our discussion of modern technical debt; in the present context, it is enough to note that vibe coding systematically produces cognitive debt.

Spec-Driven Development is the direct answer to this. By capturing intent explicitly, in a versioned and reviewable form before generation, the system's mental model remains externally documented — regardless of who or what wrote the code. The death of vibe coding is therefore not a ban on AI assistance. It is the transition from improvised to specified AI assistance.

The distinction matters: SDD does not replace agile ways of working, nor the product backlog. It replaces the uncontrolled intermediate step between requirement and generated code with a binding, reviewable intermediate artifact. For organizations that already examine their DevOps maturity in a structured way, SDD is the logical extension of feedback and version-control discipline onto the AI layer.

Spec-Driven, Prompt-Driven, Vibe Coding: A Direct Comparison

The three ways of working are best sorted not into "good" and "bad" but by controllability and fitness for regulated production environments. The comparison below makes the governance-relevant differences explicit.

| Criterion | Vibe Coding | Prompt-Driven | Spec-Driven Development |
| --- | --- | --- | --- |
| Primary artifact | None (ephemeral conversation) | Prompt history, usually not versioned | Versioned specification in the repo |
| Source of truth | In the developer's head | Implicit in the prompt context | Explicit, reviewed specification |
| Auditability | Not given | Limited, hard to reproduce | Complete, reproducible via Git |
| Review focus | Lines of code, ad hoc | Lines of code plus prompt | Specification plus high-value gates |
| Scaling with agents | Breaks down early | Limited, context-dependent | Explicit design goal of the method |
| Fitness for regulated delivery | Not suitable | Conditionally suitable | Suitable, governance-ready |

The Governance Model: Versioned Spec Repositories as the Control Point

The architectural core of Spec-Driven Development is mundane and effective precisely for that reason: the specification lives in the version control system, on equal footing with the code. It is changed through pull requests, undergoes review, has a commit history, and is therefore reconstructable for any point in the past. What has been self-evident for code for two decades is here applied to intent.

This solves a problem that AI assistance otherwise aggravates: the diffusion of accountability. When an agent produces code without a spec, the question "Who decided it should be this way?" has no answer. With a versioned spec repository it does: the decision sits in the spec, the spec has an author and a reviewer, and the generated code is a traceable derivation from it. This is the foundation of any robust AI governance, and the point at which a structured governance documentation model should formally anchor the spec discipline.

A second effect is the shift in review focus. Classic code review does not scale against agent-generated volumes — nobody reads five thousand generated lines with the same care as fifty hand-written ones. SDD does not answer this with "more review," but with concentrated review at a few, high-value control points: the specification is reviewed thoroughly because it is compact, substantive, and long-lived. The generated code is validated primarily against the spec, ideally automatically.
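Concentrating review on a few control points can be made mechanical. The following is a minimal sketch, assuming a hypothetical repository layout in which specifications, security code, and personal-data handling live under recognizable paths; the pattern table and the function name `route_review` are illustrative and not taken from any particular tool.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns that mark high-value checkpoints.
# These are assumptions for illustration; adapt them to your repository.
HUMAN_REVIEW_PATTERNS = {
    "specification": ["specs/*"],
    "security": ["auth/*", "*/crypto/*"],
    "data_protection": ["*/pii/*"],
    "architecture": ["docs/adr/*"],
}


def route_review(changed_paths):
    """Split changed files into human checkpoints and automated validation.

    Returns (human, automated): `human` maps each path to the checkpoint
    names it triggers, `automated` lists paths left to the pipeline.
    """
    human = {}
    automated = []
    for path in changed_paths:
        reasons = [
            name
            for name, patterns in HUMAN_REVIEW_PATTERNS.items()
            if any(fnmatchcase(path, pattern) for pattern in patterns)
        ]
        if reasons:
            human[path] = reasons
        else:
            automated.append(path)
    return human, automated


if __name__ == "__main__":
    human, automated = route_review(
        ["specs/refund.md", "auth/login.py", "src/orders/handler.py"]
    )
    print("human review:", human)
    print("automated only:", automated)
```

In this sketch, anything that matches no pattern is left to automated validation against the spec, which mirrors the idea that human attention goes where decisions are made.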

From this follows a concrete governance approach that is implementable without a FAANG budget:

  1. Establish a spec repository: keep specifications as versioned artifacts in the same repository as the code or in a tightly coupled one. Define a binding, lean spec format covering intent, acceptance criteria, constraints, and explicit non-goals.
  2. Move the review obligation onto the spec: make spec review a mandatory gate before any agent generation. Concentrate human review capacity where a decision is made, not where it is merely executed.
  3. Define high-value checkpoints: identify the few places where human judgment is indispensable, such as security and data-protection boundaries, business correctness, and architectural decisions. Everything else is checked automatically.
  4. Anchor spec-to-code conformance in CI/CD: have the pipeline verify that the generated code satisfies the acceptance criteria and tests defined in the spec. A build that violates the spec is a red build.
  5. Constrain agent permissions: grant agents only the rights their role requires. That means no direct push to protected branches, no bypass of spec gates, and no production access without human approval.
  6. Close the audit trail: link spec commit, agent run, and resulting pull request through stable references, so that every code change is traceable, without gaps, to a reviewed intent.
  7. Recalibrate periodically: treat the spec format and gate criteria as living governance artifacts and review them at fixed intervals against real incidents and reviews.
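The lean spec format from step 1 could be sketched as a small data structure with a validation function that the review gate calls. This is an illustration under assumptions: the field names, the `Spec` class, and `validate_spec` are hypothetical, and the rule that author and reviewer must differ is an added four-eyes assumption rather than something the method prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical lean spec: intent, acceptance criteria, constraints, non-goals."""

    spec_id: str
    intent: str
    acceptance_criteria: tuple
    constraints: tuple = ()
    non_goals: tuple = ()
    author: str = ""
    reviewer: str = ""


def validate_spec(spec):
    """Return a list of problems; an empty list means the format is complete."""
    problems = []
    if not spec.intent.strip():
        problems.append("intent is empty")
    if not spec.acceptance_criteria:
        problems.append("no acceptance criteria")
    if not spec.non_goals:
        problems.append("non-goals not stated explicitly")
    if not spec.author or not spec.reviewer:
        problems.append("author and reviewer are both required")
    elif spec.author == spec.reviewer:
        # Assumption: four-eyes principle, the author may not self-approve.
        problems.append("author must not review their own spec")
    return problems


if __name__ == "__main__":
    spec = Spec(
        spec_id="SPEC-1",
        intent="Customers can request a refund within 14 days of purchase.",
        acceptance_criteria=("AC-1: refund allowed on day 14",),
        non_goals=("partial refunds",),
        author="alice",
        reviewer="bob",
    )
    print(validate_spec(spec))
```

A check like this only confirms that the format is filled in. Whether the intent is right remains a human judgment made in the spec review.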

CI/CD Integration: Where the Specification Engages the Pipeline

Spec-Driven Development only realizes its governance effect in the pipeline. A specification that no one checks automatically is a wish list. The decisive step is therefore to make spec-to-code conformance a hard gate in the CI/CD process — with the same authority with which a failed unit test stops the build.

In practice this means: an agent's pull request references the spec version it was generated against. The pipeline derives the acceptance criteria from the spec — as executable tests, contract checks, or policy validations — and blocks the merge if the code deviates from the specified intent. Microsoft describes such an end-to-end, agent-supported delivery lifecycle, in which specification, generation, and validation are orchestrated as a connected lifecycle rather than as loosely coupled individual steps.
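The gate logic described here can be sketched in a few lines. The sketch below assumes a hypothetical convention in which each pull request carries a `spec_ref` string that identifies a spec version, and in which the pipeline has already produced a pass or fail result per acceptance criterion; `PullRequest` and `merge_gate` are illustrative names, not part of any CI product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PullRequest:
    number: int
    spec_ref: Optional[str]  # hypothetical convention, e.g. "SPEC-1@abc123"


def merge_gate(pr, reviewed_spec_refs, criteria_for, results):
    """Decide whether a pull request may merge.

    reviewed_spec_refs: set of spec versions that passed spec review
    criteria_for:       maps a spec version to its acceptance criterion ids
    results:            maps a criterion id to True (passed) or False (failed)
    Returns (allowed, reasons).
    """
    if pr.spec_ref is None:
        return False, ["pull request does not reference a spec version"]
    if pr.spec_ref not in reviewed_spec_refs:
        return False, [f"spec version {pr.spec_ref} has not passed spec review"]

    reasons = []
    for criterion in criteria_for[pr.spec_ref]:
        outcome = results.get(criterion)
        if outcome is None:
            # A criterion nobody checked counts as a violation, not a pass.
            reasons.append(f"{criterion}: no check was run")
        elif not outcome:
            reasons.append(f"{criterion}: check failed")
    return (not reasons), reasons


if __name__ == "__main__":
    criteria = {"SPEC-1@abc123": ["AC-1", "AC-2"]}
    reviewed = {"SPEC-1@abc123"}
    pr = PullRequest(number=7, spec_ref="SPEC-1@abc123")
    print(merge_gate(pr, reviewed, criteria, {"AC-1": True, "AC-2": False}))
```

The design choice worth noting is that a missing result blocks the merge just like a failed one, so a criterion cannot pass silently because nobody wrote a check for it.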

For organizations without a highly specialized platform team, the pragmatic sequence is decisive. Do not start with a fully automated agent factory, but with the simplest effective gate: spec in the repo, mandatory spec review, a CI step that runs the acceptance tests stored in the spec. This minimal stack already produces the bulk of the governance value, because it establishes traceability and automatic deviation detection.

Organizations that have not yet reliably established the DevOps fundamentals (automated tests, clean version control, fast feedback loops) should stabilize these first. SDD amplifies a mature pipeline, but it cannot substitute for an immature one. A structured assessment of DevOps maturity is therefore the sensible first step before agentic generation is released at larger scale.

A specification that is not automatically checked in the pipeline is documentation without enforcement power. The CI/CD gate turns intent into an enforced control.

The Tooling Landscape: Spec Kit, Kiro, and BMAD at a Glance

Three approaches shape the current Spec-Driven discussion, each with a different emphasis. The point is not to crown the "best" tool, but to understand which model fits which level of organizational maturity. The overview below positions the options along their governance emphasis; specific feature sets evolve quickly.

| Approach | Emphasis | Governance Contribution | Typical Entry Context |
| --- | --- | --- | --- |
| GitHub Spec Kit | Specification as a repo artifact, tightly coupled to the Git workflow | Spec in version control, PR-based spec reviews | Teams with an established Git and PR process |
| AWS Kiro | Spec-centric development environment with structured phases | Separation of intent, plan, and implementation as process steps | Organizations that want a guided SDD workflow |
| BMAD | Methodical framework for agent-based, specification-driven delivery | Role and phase model for human-agent collaboration | Teams that adopt the method before a tool |

Choosing Tools Without a FAANG Budget: The Pragmatic Decision

Mid-size organizations face a different reality than large enterprises with dedicated platform and tooling teams. They have neither the budget nor the capacity to build their own agentic delivery chain, and they cannot afford a failed large-scale project. The good news from practice: the governance value of SDD arises overwhelmingly from the principle, not from the tool.

Concretely: a versioned spec, a mandatory spec review, and a CI gate can be implemented with the tools already in use — Git, pull requests, an existing CI pipeline. Spec Kit is attractive for teams that start exactly here and want to extend their established Git workflow with only the spec artifact. Kiro addresses organizations that prefer a more guided, phase-oriented flow. BMAD makes sense for those who want to anchor the method and role model first, before committing to a specific tool.

The consulting recommendation is consistently incremental: choose not the most comprehensive model but the most easily adoptable one that fits your existing pipeline. Prove the governance benefit on a real, non-critical value stream. Scale only once the spec gate demonstrably catches deviations that a classic code review would have missed. CGI describes this change as the transition from improvised coding to intent engineering — and precisely that transition can start small.

A common mistake is the assumption that tool selection is the central decision. It is not. The central decision is organizational: who may approve specifications, which checkpoints are non-delegable, and how the audit trail is secured. That is a governance question, not a tooling question — and it should be answered before the tool is chosen.

The Bridge: From AI Readiness to Agentic Delivery

Spec-Driven Development is the operational fulfillment of what a strategic AI readiness assessment demands. An AI Readiness Assessment Guide asks, among other things: are processes structured so that AI outputs can flow in under control? Is there an accountability model for algorithmic decisions? SDD is the concrete answer to exactly these questions in the delivery domain.

This closes a gap many organizations underestimate. Strategic AI readiness without operational delivery governance produces precisely the pattern known from failed AI initiatives: impressive pilot results that are not controllable in production. SDD is the mechanism that converts a successful agent pilot into a reproducible, auditable production process — the point at which most AI initiatives otherwise fail.

The order of the maturity steps is not arbitrary. An organization with weak DevOps maturity — unstable pipelines, patchy test automation, slow feedback loops — becomes faster with agents, but not safer. The DORA finding that AI is an amplifier and not a repair applies directly here: agentic generation amplifies the maturity that is already present and widens the gaps that were not closed. A DevOps Maturity Assessment baseline is therefore not a side issue but a precondition.

For leaders in the mid-size segment, a clear path follows: first make the delivery fundamentals reliable, then introduce spec discipline as a control point, then expand agentic generation incrementally and under gate control. This sequence is not the fastest, but it is the only one that holds sustainably without large-enterprise resources.

A Practical Roadmap: Agentic SDLC Without Large-Enterprise Resources

For mid-size organizations, the path to governed agentic delivery is a sequence of manageable stages, not a mega-project. In consulting practice, one approach has proven effective: its first stage requires neither new tools nor an organizational restructuring.

In the first weeks, what counts is establishing the principle, not building a platform. Choose a single, non-critical value stream. Introduce a lean, binding spec format for it and make spec review the mandatory gate before any agent generation. Add a CI step that runs the acceptance tests stored in the spec. Nothing more is needed for the first proof.
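A CI step of this kind can be very small. The following is a minimal sketch, assuming the acceptance tests are stored as callables keyed by criterion id; `refund_allowed` stands in for generated code under test, and all names here are hypothetical.

```python
import sys


def refund_allowed(days_since_purchase):
    """Stand-in for generated code under test (hypothetical business rule)."""
    return days_since_purchase <= 14


# Acceptance checks keyed by the criterion ids used in the spec (assumed format).
CHECKS = {
    "AC-1": lambda: refund_allowed(14) is True,
    "AC-2": lambda: refund_allowed(15) is False,
}


def run_acceptance(checks):
    """Run every check; a check that raises counts as failed."""
    results = {}
    for criterion_id, check in checks.items():
        try:
            results[criterion_id] = bool(check())
        except Exception:
            results[criterion_id] = False
    return results


def exit_code(results):
    """0 only if at least one check ran and all passed; otherwise 1 (red build)."""
    return 0 if results and all(results.values()) else 1


if __name__ == "__main__":
    outcome = run_acceptance(CHECKS)
    for criterion_id, passed in outcome.items():
        print(f"{criterion_id}: {'pass' if passed else 'FAIL'}")
    sys.exit(exit_code(outcome))
```

Because the script exits with a nonzero status on any failure, most CI systems will treat it as a failed step without further integration work. An empty set of checks is deliberately treated as a failure, so a spec without acceptance tests cannot produce a green build.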

The second phase is about consolidation and the audit trail. Link spec commit, agent run, and pull request via stable references, so that every change is traceable, without gaps, to a reviewed intent. Constrain agent permissions restrictively and define the few non-delegable checkpoints — typically security, data-protection, and architectural decisions — explicitly and in writing.
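One way to make "without gaps" checkable is to store each link as a record and search for merged pull requests that have no complete record. This is a sketch under assumptions: the `AuditLink` fields and the `find_gaps` function are hypothetical, and where the records are stored (commit trailers, PR labels, a database) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditLink:
    """One audit record: which spec commit and agent run produced which PR."""

    spec_commit: str
    agent_run_id: str
    pull_request: int


def find_gaps(merged_prs, links, reviewed_spec_commits):
    """Return merged PR numbers that cannot be traced to a reviewed spec commit."""
    by_pr = {link.pull_request: link for link in links}
    gaps = []
    for pr in merged_prs:
        link = by_pr.get(pr)
        if (
            link is None
            or not link.agent_run_id
            or link.spec_commit not in reviewed_spec_commits
        ):
            gaps.append(pr)
    return gaps


if __name__ == "__main__":
    links = [
        AuditLink(spec_commit="abc123", agent_run_id="run-1", pull_request=7),
        AuditLink(spec_commit="zzz999", agent_run_id="run-2", pull_request=8),
    ]
    # PR 8 points at an unreviewed spec commit; PR 9 has no record at all.
    print(find_gaps([7, 8, 9], links, reviewed_spec_commits={"abc123"}))
```

Run periodically, a check like this turns the audit trail from a convention into something that can be reported on.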

Only in the third phase do you scale across further value streams and assess whether the manual effort has grown enough to justify a dedicated SDD tool. This decision is made on evidence: if the spec gate demonstrably catches deviations that classic review would have missed, the tooling investment is justifiable. Before that, it is speculation.

Throughout, internal expectation management is indispensable. The message to teams and leadership is not that AI replaces engineering, but that it shifts the locus of diligence from the line of code to the specification. Those who do not actively lead this narrative risk having spec discipline perceived as bureaucracy rather than as what it is: the precondition for using AI speed without losing control.

Key Takeaways

  • Spec-Driven Development replaces the ephemeral prompt with a versioned, reviewed specification as the source of truth — the precondition for auditable AI code generation.
  • "Vibe coding is dead" does not mean the end of AI assistance, but the transition from improvised to specified assistance that avoids cognitive debt.
  • The governance lever lies in shifting the review focus: thorough review of a few high-value spec and security checkpoints instead of line-by-line review of agentic volumes.
  • The governance value arises from the principle, not the tool — a versioned spec plus a CI gate is implementable with existing Git and an existing pipeline.
  • Agentic delivery is the operational fulfillment of strategic AI readiness; without mature DevOps fundamentals it amplifies existing weaknesses rather than fixing them.

Frequently Asked Questions

How does an SDD specification differ from classic requirements?

Classic requirements are usually a document that is created before development and goes stale afterward. An SDD specification is a versioned, living artifact in the same repository as the code, with commit history, mandatory review, and automated conformance checking in the pipeline. The decisive difference is enforcement: the spec is not merely a description but an active control point against which every generated change is automatically validated.

Do we need a specialized tool to adopt Spec-Driven Development?

No. The bulk of the governance value arises from the principle (a versioned spec, mandatory spec review, a CI gate for spec-to-code conformance) and is implementable with existing Git and an existing CI pipeline. Specialized tools become worthwhile once the manual effort they would save demonstrably justifies them. Tool selection should follow, not precede, the organizational decision about approval rights and non-delegable checkpoints.

How does SDD counter cognitive debt?

Cognitive debt arises when the shared mental model of a system erodes because code is produced without traceable intent. SDD captures intent explicitly, in versioned and reviewable form, before generation. The mental model therefore remains externally documented and reconstructable, regardless of whether a human or an agent wrote the code. The spec becomes the durable, reviewable explanation of the why.

Is SDD only practical for large enterprises?

No, the opposite. Mid-size organizations benefit precisely because they cannot absorb failed large-scale projects. The recommended entry is deliberately small: a non-critical value stream, a lean spec format, a mandatory review gate, a CI step. This minimal stack already produces most of the traceability and deviation detection, without dedicated platform teams.

What must be in place before agentic generation is rolled out broadly?

Robust DevOps fundamentals are the precondition: automated tests, clean version control, and fast feedback loops. AI is an amplifier, not a repair: it widens existing weaknesses just as it widens existing strengths. A structured DevOps maturity assessment before broad agent release prevents speed from scaling without control.

How is AI-generated code kept auditable?

Through a closed audit trail: every code change references the spec version it was generated against; the spec has an author and a reviewer; spec commit, agent run, and pull request are linked via stable references. Every production code change is therefore traceable, without gaps, to a reviewed, documented intent, which is the basis for any robust compliance statement about AI-generated code.

Does SDD replace agile ways of working or the product backlog?

SDD replaces neither agility nor the product backlog. It replaces only the uncontrolled intermediate step between requirement and generated code with a binding, reviewable artifact. A backlog item still leads to a substantive clarification; what is new is that this clarification is captured as a versioned specification and becomes the control point of generation.

Tags: Spec-Driven Development, AI Coding Agents, Agentic SDLC, Intent Engineering, AI Governance, CI/CD
