- Scrum solved late-90s delivery failures by shrinking feedback cycles and coordinating humans who couldn’t plan or build large systems quickly.
- Over time, Scrum’s flexible practices hardened into rigid doctrine, turning coordination tools (sprints, standups, story points) into performative rituals.
- Estimation rituals like planning poker often create the illusion of precision and productivity while work remains poorly specified and misunderstood.
- AI changes the underlying constraints Scrum was built for by reducing the cost of execution and helping teams manage complexity, making traditional ceremonies less necessary.
- Leading teams aren’t “fixing Scrum” so much as replacing it with spec-first development plus continuous validation and faster feedback loops measured in hours, not sprints.
Three things you'll walk away with after reading this:
- Scrum solved a real problem, but the problem has changed. The ceremonies existed because humans couldn't plan large systems or build them fast enough. AI removes both constraints, and the methodology hasn't caught up.
- Waterfall isn't the answer either. What's emerging is something new: spec-first development with continuous validation, where the upfront thinking is deep but the execution is fluid, not rigid.
- The teams pulling ahead aren't tweaking Scrum. They're replacing it. Small pods, full specs, AI-driven execution, and feedback loops measured in hours instead of sprints. The velocity gap between these teams and everyone else is widening fast.
The most telling responses to my earlier piece on Scrum's decline came from Scrum Masters who disagreed. Their pushback was interesting because the workflows they described bore almost no resemblance to the Scrum framework they were defending. They'd dropped estimation sessions. They'd shortened sprints to one week or collapsed them entirely. Standups happened asynchronously. The vocabulary survived, but the methodology underneath had already been gutted and replaced with something else. That something else is worth naming, because it keeps showing up independently across teams that have nothing in common except that they build software with AI assistance.
What the framework actually solved
The conversation around Agile's decline has gotten imprecise, so it's worth being specific about what Scrum addressed. In the late 1990s, waterfall projects collapsed at alarming rates. Teams spent months drafting specifications nobody referenced, built for a year, and delivered products that missed what customers actually wanted. The distance between "what we assumed they needed" and "what turned out to be useful" was measured in quarters. Scrum shortened that distance to two weeks. Build something small, put it in front of people, learn from the reaction, adjust. The ceremonies existed to synchronize humans who would otherwise drift apart. Story points existed because estimation was genuinely difficult and relative sizing outperformed false precision. Every piece of the framework had a reason when it was introduced.
How reasonable practices became rigid doctrine
The trouble started when Scrum’s pragmatic adaptations hardened into rules. The two-week sprint shifted from useful default to mandatory cadence. Daily standups stopped being optional coordination and became compulsory attendance regardless of whether anyone had anything to contribute. Story points, originally rough guides, were plotted on burndown charts and presented to executives as if they quantified something real.

Brian Carpizo captured it precisely: the original principle was “plan less and course-correct more” because detailed upfront planning consistently failed. Over time, that principle degraded into “we don’t need to think carefully upfront.” Those statements sound similar. They produce radically different outcomes. I’ve watched teams spend half a day in sprint planning negotiating story points for work nobody fully understood. The negotiation created the feeling of productivity without the substance. Two weeks later, during the retrospective, the team would acknowledge that the estimates were wrong and the stories were inadequately specified. Then they’d repeat the process the following sprint.

Planning poker captures this perfectly. Engineers reveal Fibonacci cards simultaneously to prevent anchoring bias. It’s a negotiation disguised as measurement. The output lacks units, lacks a universal definition, and lacks any dependable correlation with time or effort. An entire industry of coaching, tooling, and certification grew around it.
The assumption AI invalidates
Scrum was engineered around a specific cognitive constraint: humans cannot hold large, complex systems in working memory. So everything got decomposed into small pieces to be handled in isolation. User stories. Acceptance criteria. Sprint-sized increments. AI doesn’t operate under that constraint. Provide Claude or GPT with a complete architecture document, a data model, a dependency map, and a set of constraints, and it reasons about the entire system simultaneously. It identifies edge cases you overlooked. It designs interfaces that align cleanly across modules. It achieves consistency that would have required weeks of collaborative whiteboarding.

Provide it with a decontextualized user story (“As a user, I want to click the button so that the thing happens”) and you get exactly that: a button that performs an action, disconnected from everything surrounding it. The user story, Scrum’s fundamental unit of work, is the wrong input format for AI-driven development. It’s too narrow and too stripped of surrounding information. AI output quality scales directly with context volume. Scrum’s entire design philosophy removes context in pursuit of manageability.
Why the old alternative doesn't work either
Waterfall isn't resurfacing as a viable option. It failed because humans couldn't produce complete, accurate specifications before building. The effort was genuine. The results were 500-page requirements documents that contained errors by page 50 and were outdated before development began. The flaw wasn't the ambition to think upfront. It was that human cognition couldn't execute upfront thinking at the required depth and speed. AI shifts that equation without rehabilitating waterfall's core structure. A spec-first approach with AI doesn't mean "write everything, build everything, test everything" in rigid sequence. Waterfall collapsed not because of upfront thinking but because there was no mechanism to change direction once the spec proved wrong.
The pattern that keeps appearing
The fastest-moving teams I work with have abandoned both frameworks. They’re converging on a pattern that lacks an established name but shows enough consistency to describe concretely.

- Planning produces a genuine architecture document. Not a 500-page waterfall artifact, but a 10 to 20 page living specification describing the system, its constraints, its data model, and its integration points. AI participates in writing it. A senior engineer with AI assistance can produce a coherent system architecture in an afternoon that would have demanded weeks of whiteboarding a year ago.
- With the spec in hand, implementation compresses dramatically. Not to sprint-speed. To hours. A capable engineer working with Claude Code or comparable tooling can implement a well-specified feature in a single session. The two-week sprint becomes an anachronism when the actual construction takes a day.
- Feedback survives. Showing work to people and learning from their responses remains essential. But the cycle is no longer bound to sprint boundaries. Ship a feature Tuesday morning, collect feedback Tuesday afternoon, revise Wednesday. The ceremony disappears. The learning loop persists.
- Team structure shifts accordingly. Three to five people, each full-stack capable, each working with AI agents. No dedicated Scrum Master role. No separate QA function.

Cursor operates this way internally. So does a growing cohort of startups that never adopted Scrum to begin with.
The amplification problem
Carpizo raises a difficult truth about what this means for team composition. AI functions as a multiplier, not a leveler. A mediocre engineer with AI produces mediocre work faster. An exceptional engineer with AI generates output that a five-person team would have struggled to match six months ago. Scrum implicitly treated contributors as interchangeable. Story points were supposed to be team-relative. Velocity belonged to the team, not the individual. That framing made sense when the productivity variance between individual contributors was roughly 3x. With AI, that variance widens to 10x or 20x. An engineer who thinks architecturally, writes clean specifications, directs AI agents with precision, and evaluates output critically operates in an entirely different category from one who prompts and accepts. The team-velocity abstraction collapses when a single person with AI outproduces a five-person team without it. This is uncomfortable territory. The ceremonies, the pair programming sessions, the code reviews served genuine socialization and mentoring functions alongside their productivity purpose. What fills that role when teams contract to small pods and the performance gap between top and average performers widens? There's no clean resolution. But ignoring the question is how organizations end up with three-person pods handling the workload of ten while the remaining seven attend standups with nothing to discuss.
The spec-first workflow in practice
- Write an actual document. Not user stories. A genuine specification: system architecture, data model, API contracts, edge cases. Plain language that a non-technical stakeholder can read and a model can reason about. AI assists in drafting it. This replaces the Jira backlog as the source of truth.
- An engineer selects a section, works with AI to implement it, and ships the result.
- The engineer collects feedback from actual users and updates the specification based on what they learned. The spec evolves with the project, not during a grooming session, but whenever reality contradicts the plan.
- Human judgment focuses on reviewing the AI’s interpretation of the specification. Did it parse the requirements as intended? Did it catch an edge case you missed, or introduce a new one? That review happens continuously rather than concentrating in a sprint review at the close of two weeks.
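To make the “spec as source of truth” idea concrete, here is a minimal, hypothetical sketch in Python. The required section names (system architecture, data model, API contracts, edge cases) come from the workflow above; everything else, the `## ` heading convention, the helper names, and the example spec fragment, is an illustrative assumption, not a prescribed tool.

```python
# Hypothetical sketch: treat a living markdown spec as the source of truth.
# Before an engineer (or an AI agent) picks up a section, check that the
# spec still covers the areas the workflow expects. All names illustrative.

REQUIRED_SECTIONS = {
    "System Architecture",
    "Data Model",
    "API Contracts",
    "Edge Cases",
}

def spec_sections(spec_text: str) -> set[str]:
    """Collect top-level '## ' headings from a markdown spec."""
    return {
        line.removeprefix("## ").strip()
        for line in spec_text.splitlines()
        if line.startswith("## ")
    }

def missing_sections(spec_text: str) -> set[str]:
    """Return the required sections the spec does not yet cover."""
    return REQUIRED_SECTIONS - spec_sections(spec_text)

# Example spec fragment (invented for illustration, not a real project):
spec = """\
# Invoicing Service Spec

## System Architecture
Single service, Postgres, event bus for downstream consumers.

## Data Model
Invoice, LineItem, Customer; soft deletes everywhere.

## API Contracts
POST /invoices, GET /invoices/{id}; errors follow RFC 7807.
"""

print(missing_sections(spec))  # the fragment above still lacks an Edge Cases section
```

The point of a check like this is not enforcement bureaucracy; it is that a document used as the actual input to implementation can be validated the way code is, instead of drifting the way a Jira backlog does.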
The coordination objection
Every time this topic comes up, someone from a large enterprise asks how to coordinate 200 engineers without Scrum. The question is fair. The honest answer is that coordinating 200 engineers was always the wrong solution to the underlying problem. Most of those 200 engineers existed because human-speed development demanded headcount. When each person becomes 3 to 5 times more productive with AI, you don't need 200. You need 40 to 60, organized in pods with clear ownership boundaries and shared specifications. Block reduced its workforce by 40%. Not because AI replaced the engineers directly, but because the organizational overhead, the coordination cost, the ceremony-servicing roles couldn't be justified once AI compressed the actual building work. The Scrum infrastructure was keeping people occupied, not productive. That's difficult to say about people's jobs. It isn't a celebration. But organizations pretending they can sustain 200-person engineering teams running Scrum while competitors ship with pods of three will learn the same lesson Block learned, just later and with less control over the outcome.
Naming something that doesn't have a name yet
Various labels have been proposed: "spec-driven development," "architecture-first development," "AI-native development," and Carpizo's contribution, "specification-first, iteration-on-feedback." None have gained traction. The label probably matters less than the pattern it describes: think deeply, build quickly, validate with real people, update the specification when reality diverges from the plan. Everything beyond that is ceremony. Two decades from now, story points will likely occupy the same historical position as 500-page waterfall requirements documents. A well-intentioned response to constraints that no longer apply.
Audit your process for rituals that exist mainly to coordinate human limitations (story-point negotiation, mandatory meetings, sprint-bound planning) and cut or redesign them. Shift effort toward clearer specs and tighter validation loops: define what “done” means, get rapid user feedback, and let execution stay fluid as AI accelerates build time. And watch the teams that keep the Scrum vocabulary but operate in small pods with full specs and near-continuous delivery, because that’s where the velocity gap is likely to widen.