Building a UI framework in Go with Claude

TL;DR

I built godom, a Go framework for local apps, mostly with Claude (Anthropic) as an implementation partner. Not autocomplete; a real partner that wrote whole files.
The architecture, the design constraints, and the editorial calls stayed human. Speed of execution came from the AI.
The full breakdown of what was AI and what was human is in docs/AI_USAGE.md. This post is the longer reflection.

What “AI did most of the code” actually means

There’s a wide gap between “AI wrote some code” and “AI wrote a working framework.” I keep seeing AI-coding posts that flatten the experience into either “magic” or “useless.” It’s neither, and the difference matters.

I started godom with a clear thesis (the post “Why I built godom” has the long version): the Go process should own the DOM, the browser should just render, and there should be no complex JS frontend to author. From there, the actual implementation (the VDOM, the diff algorithm, the WebSocket protocol, the template parser, the directive validator, and most of the tests) was largely written by Claude.

But “written by Claude” is doing a lot of work in that sentence. Claude wrote files. I read them, redirected them, kept the parts that fit the design, threw out the parts that didn’t. The AI_USAGE doc describes the relationship as “AI was redirected, corrected, and had wrong turns rejected throughout development.” That’s accurate. Without that steering, you don’t end up with a coherent framework; you end up with a pile of plausible code.

I wish AI could do everything perfectly. It has deep reasoning. It can write a lot of code in a hurry. But it still misses the mark, and once your tool grows up, it misses a lot of context too. The bigger the codebase gets, the more often “looks correct” is wrong in a way only a human who’s been carrying the design can see.

There were also times where AI challenged a decision I’d made and turned out to be right. That’s a humbling experience to admit, and worth admitting. Plenty of my “no, do it this way” instructions got walked back after Claude pushed on them.

The frustrating part of building something large is that AI doesn’t always pause where an experienced engineer would. It charges forward, takes a tangent, builds a clever thing on top of a shaky premise, and you spend your time rolling that back. Steering is constant work, not a one-time prompt.

I had wanted to run with the “let AI do everything” school of thought. I tried it on another project. It was a disaster I scrapped without even reviewing the code in detail. Some of it was probably good; the whole was beyond rescue. godom was built mostly with Claude Opus 4.6, and the same class of issues shows up with 4.7. The model has gotten better at individual tasks. It has not stopped needing direction.

When you’re a stressed reviewer, it’s easy to let things slip because the code works. That instinct may have served me fine with other human engineers; experienced engineers don’t usually drift architecturally between reviews. With AI the stress is higher, and the checks (tests, validations, harnesses) have to be very deliberately crafted. A missed review compounds: something gets built on a shaky foundation, nobody pauses to question it, and rolling it back later is expensive.

The thing that mattered most: holding the bridge to its job

Of all the patterns I had to keep correcting, the one that mattered most was keeping the JS bridge dumb.

The thin-bridge principle (the bridge executes commands, every decision happens in Go) is load-bearing for godom. Without it, the framework’s whole pitch collapses; you end up with state on two sides and the friction I was trying to escape. Early on, the bridge was the place Claude kept trying to add intelligence. Each individual addition looked reasonable in isolation. Each one was a step away from the thin-bridge principle.

I had to point this out many times in the early weeks. What eventually made it self-enforcing wasn’t more rules; it was the architecture itself maturing. Once the protocol was tight and the patch types were small and well-named, “make the bridge smarter” started looking obviously misplaced. The constraint became visible in the code, not just in my prompts.

That’s a more general lesson than just godom: the editorial work an engineer does on AI output is heaviest when the design is fuzzy. Once the design is sharp, the AI starts producing more of what fits.

The single rule that actually mattered

If I had to compress everything into one sentence: the AI is the implementor; you are the editor.

That’s not the same as saying you read the diff. Editors cut. Editors redirect. Editors reject the whole draft and ask for a different approach. If you don’t do that, you end up with code that mostly works and an architecture that mostly doesn’t.

A few patterns that consistently helped:

Have a thesis before you write a prompt. “Go owns the DOM, browser is dumb” was mine. Everything got evaluated against it.
Force the simplest version first. Layer features on as concrete examples demand them, not because the framework “should” have them.
Test against intent, not implementation. A test that only passes when the code is shaped exactly as it currently is doesn’t catch regressions; it locks in details.
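The last pattern is easier to see in code. Below is a minimal sketch, assuming a hypothetical attribute differ; the Change type and the Diff, Apply, and CheckRoundTrip functions are made up for illustration and are not godom’s API. The check states the intent (applying the diff to the old attributes must yield the new ones) rather than the exact list of changes, so it keeps passing if the differ is rewritten to emit changes in a different order or batch them differently.

```go
package main

import "fmt"

// Change is a hypothetical attribute patch: set Key to Value, or delete Key.
type Change struct {
	Key    string
	Value  string
	Delete bool
}

// Diff returns changes that turn prev into next. The order of the result
// is unspecified because it follows Go's map iteration order.
func Diff(prev, next map[string]string) []Change {
	var out []Change
	for k, v := range next {
		if pv, ok := prev[k]; !ok || pv != v {
			out = append(out, Change{Key: k, Value: v})
		}
	}
	for k := range prev {
		if _, ok := next[k]; !ok {
			out = append(out, Change{Key: k, Delete: true})
		}
	}
	return out
}

// Apply returns a copy of attrs with the changes applied.
func Apply(attrs map[string]string, changes []Change) map[string]string {
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		out[k] = v
	}
	for _, c := range changes {
		if c.Delete {
			delete(out, c.Key)
		} else {
			out[c.Key] = c.Value
		}
	}
	return out
}

// CheckRoundTrip encodes the intent: Apply(prev, Diff(prev, next)) == next.
// It never inspects the order or count of the individual changes.
func CheckRoundTrip(prev, next map[string]string) error {
	got := Apply(prev, Diff(prev, next))
	if len(got) != len(next) {
		return fmt.Errorf("got %d attrs, want %d", len(got), len(next))
	}
	for k, v := range next {
		if gv, ok := got[k]; !ok || gv != v {
			return fmt.Errorf("attr %q: got %q, want %q", k, gv, v)
		}
	}
	return nil
}

func main() {
	prev := map[string]string{"class": "btn", "id": "save", "disabled": ""}
	next := map[string]string{"class": "btn primary", "id": "save", "title": "Save"}
	if err := CheckRoundTrip(prev, next); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok") // prints "ok"
}
```

A test that instead compared Diff’s output against a fixed slice would be flaky here, since map iteration order is not stable, and it would break on any harmless refactor. In a real project the same check would sit in a _test.go file and be called from a test function.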

For the inverse direction (using AI agents to write apps on top of godom, rather than AI helping build the framework itself), there’s a separate post, “How to use godom with AI agents”, on what that workflow looks like. godom ships a docs/llm-reference.md aimed at AI agents writing godom apps.

If you want to look at the actual code, the godom repo is on GitHub. The AI_USAGE doc is the honest accounting of who did what. Use it with understanding, not blind trust.