
I Tried to One-Shot an Entire Web App with Claude Code

20,000 lines of code in 12 hours. It worked. It was also terrible. Here’s what I learned about the gap between “functional” and “shippable.”

I wanted to see how far I could push one-shot development. Not a component, not a landing page—an entire web application. A tier list ranking platform with multiple ranking modes, community consensus, user profiles, image uploads, and a leaderboard. I gave Claude Code a mega-prompt and let it rip.

Twelve hours later I had a working product. Four days later I had a shippable one. The gap between those two things taught me more about AI-assisted development than anything I’d read online.


The Experiment

The project was Tier Flock—a site where you rank things into S/A/B/C/D/F tiers. Not a toy: four ranking modes (classic drag-and-drop, tournament brackets, “this or that” merge sort, timed challenges), community consensus heatmaps, user profiles, image uploads, embeddable widgets, and a points system.
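The "this or that" mode is, at its core, a comparison sort where the comparator is a human. A minimal sketch of the idea (my own illustration, not Tier Flock's actual code): run a merge sort, but let each comparison be a "which do you prefer?" callback, so a full ranking emerges from pairwise picks.

```typescript
// "This or that" ranking as a merge sort: the comparator is the user's
// pick between two items, so a total order emerges from pairwise choices.
// Illustrative sketch only -- not Tier Flock's actual implementation.
type Prefers<T> = (a: T, b: T) => boolean; // true if the user picks `a`

function rankByPairs<T>(items: T[], prefers: Prefers<T>): T[] {
  if (items.length <= 1) return items;
  const mid = Math.floor(items.length / 2);
  const left = rankByPairs(items.slice(0, mid), prefers);
  const right = rankByPairs(items.slice(mid), prefers);

  // Merge the two halves, asking the "this or that" question at each step.
  const merged: T[] = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    merged.push(prefers(left[i], right[j]) ? left[i++] : right[j++]);
  }
  return merged.concat(left.slice(i), right.slice(j));
}

// Example: a scripted "user" who always prefers the alphabetically earlier item.
const order = rankByPairs(["gamma", "alpha", "beta"], (a, b) => a < b);
console.log(order); // -> ["alpha", "beta", "gamma"]
```

Merge sort is a natural fit here because it minimizes redundant questions: n items resolve in roughly n·log(n) pairwise picks.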

I used Claude Code with Opus 4.6. I wrote the most thorough prompt I could—stack choices (Next.js 16, Drizzle ORM, Neon Postgres, Tailwind v4, better-auth), feature requirements, database schema concepts, UX expectations. Then I let it go.

The Timeline

March 10 – 13, 2026

- 12:34 AM: `create-next-app` (the blank canvas)
- 12:26 PM: The One-Shot (20,686 lines land)
- Day 1 PM: First Fixes (audit, migration, polish)
- Day 2: Heavy Iteration (18 commits, mobile, SEO)
- Day 3–4: Polish & Ship (security, embeds, final bugs)

That first commit landed at 12:26 PM on March 10th. One hundred two files. Twenty thousand lines of code. Full database schema, API routes, page components, auth, image handling. Four ranking modes, community consensus, user profiles, leaderboard.

It compiled. It ran. You could create a template, rank items, see community results.

On paper, the one-shot worked.

The Reality

In practice, it was nowhere close to shippable. The logic was sound—database queries worked, auth flowed correctly, the data model was reasonable. The AI understood the objective perfectly. It knew what a tier list platform should do, what features it needed, how the data should flow.

But every single feature was half-baked. Nothing was finished. The UX was the biggest casualty.

It was like looking at a product through frosted glass—you could tell what it was supposed to be, but the details were all smeared.

The /loop Experiment

I also tried using Claude Code’s /loop command to have it iteratively improve the codebase on its own. The idea: let it identify gaps and fill them autonomously.

It made things worse.

What /loop actually did:

Mega-prompt → generate 20K lines → /loop ("improve it") → more features, half-baked → more features, still half-baked. Breadth, when it should have optimized for depth.

Instead of polishing existing features, Claude kept adding new half-baked ones. It would see a page and think “this needs a widget” instead of “this layout needs to not be terrible.” More features, same lack of polish.

The Fix: Four Days of Iteration

What followed was 44 more commits over four days. Here’s the velocity:

- Day 1 (Mar 10): 15 commits
- Day 2 (Mar 11): 18 commits
- Day 3 (Mar 12): 7 commits
- Day 4 (Mar 13): 5 commits

One-Shot vs Iteration

The one-shot commit: 102 files changed, +20,686 lines added, −944 lines deleted.

Total lines touched: one-shot 20.7K vs. iteration 38.2K.

Nearly 2x the work went into fixing and polishing vs. the initial generation.


What the AI Understands

Claude understood what to build with impressive accuracy. The architecture was clean. The database schema was sensible. The API routes were well-structured. It picked reasonable libraries and wired them together correctly. The bones were solid.

It also understood context deeply. When I said “tier list platform,” it didn’t just make a drag-and-drop grid. It built community consensus aggregation, comparison views, heatmaps, social sharing. It filled in features I hadn’t explicitly asked for—some of which were genuinely useful.
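Consensus aggregation of this kind is conceptually simple, which is probably why Claude nailed it. The idea can be sketched in a few lines (again my own illustration, not the generated code): map each tier to a numeric score, average the scores per item across all users, and round the mean back to a tier.

```typescript
// Community consensus: average everyone's tier placements per item and
// map the mean back onto a tier. Illustrative sketch, not Tier Flock's code.
const TIERS = ["S", "A", "B", "C", "D", "F"] as const;
type Tier = (typeof TIERS)[number];

function consensusTiers(
  rankings: Array<Record<string, Tier>> // one placements map per user
): Record<string, Tier> {
  const scores: Record<string, number[]> = {};
  for (const placements of rankings) {
    for (const [itemId, tier] of Object.entries(placements)) {
      (scores[itemId] ??= []).push(TIERS.indexOf(tier)); // S=0 ... F=5
    }
  }
  const consensus: Record<string, Tier> = {};
  for (const [itemId, values] of Object.entries(scores)) {
    const mean = values.reduce((a, b) => a + b, 0) / values.length;
    consensus[itemId] = TIERS[Math.round(mean)];
  }
  return consensus;
}

// Three users place "pizza" at S, A, A -> mean index ~0.67 -> rounds to "A".
console.log(consensusTiers([{ pizza: "S" }, { pizza: "A" }, { pizza: "A" }]));
// -> { pizza: "A" }
```

The same per-item score distributions feed a heatmap directly: instead of collapsing to one tier, render the count of votes in each tier cell.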

What the AI Doesn’t Understand

The gap is between “working” and “polished.” Claude can implement a feature checklist, but it can’t tell when a page feels cluttered, when a flow has too many steps, when a button is in the wrong place, or when an animation is too slow.

This isn’t a small gap. It’s the gap between a prototype and a product. And it’s the gap that takes the majority of the time in real software development.

The Real Time Split

AI nails architecture and logic. The craft takes human iteration.

Architecture & Logic: ~20% · Craft & Polish: ~80%

The Finished Product

Here’s an actual live embed from Tier Flock—the same embeddable widget system I built into the platform. Meta, right?

[Embed: live from Tier Flock]

This is the actual product — embedded via the widget system I built into it.

35+ pages, four ranking modes, community consensus, heatmaps, achievement system, embeddable widgets, short URLs, image exports—all shipped. But not a single one of these features was shippable in the one-shot. Every one required human iteration.

Would I Do This Again?

No. Not like this.

The amount of planning required to one-shot something well exceeds the effort of just building it iteratively. To get a good one-shot, I would have needed to specify every page layout, every interaction detail, every error state, every mobile breakpoint. At that point I’m writing a design doc more detailed than the code itself.

Iterative development is just better for this. Build a feature, use it, feel what’s wrong, fix it, repeat. The feedback loop matters. A one-shot can’t simulate the experience of actually using your own product.

What I Want From the Next Generation

The dream is an AI that acts more like a PM than a coder:

🎯Determine the goal, not just execute instructions
🔍Identify what’s most lacking and prioritize it
⚖️Understand “feature exists” vs “feature is good”
🎨Iterate with taste, not just add more stuff

The /loop failure was instructive. When I told Claude to improve the project autonomously, it optimized for breadth when it should have optimized for depth. A better agent would look at the product, identify the weakest point, and make it better—not bolt on something new.

The Bottom Line

One-shot development with AI is real. You can generate a working application—database, API, frontend, auth—in hours. But “working” and “shippable” are separated by a chasm of polish, and crossing that chasm still requires human judgment and iterative refinement.

The 20,000-line commit was impressive. The 38,000 lines of fixes that followed were the actual product.