Sean Burn

Posted on Jun 24

13 years in, AI nearly left me behind. So I stopped fighting it and started building.

#ai #career #devjournal #programming


I've been writing code for thirteen years. I mention that not to flex, but because it's exactly why the last couple of years rattled me.

When AI coding tools started getting genuinely good, I felt something I hadn't felt since I was a junior: behind. Thirteen years of hard-won instinct, and suddenly a chatbot could scaffold in thirty seconds what used to take me an afternoon. You tell yourself it's hype. Then you watch a feature ship in a day that would've taken a week, and the floor shifts a little under you.

So I did what a lot of us did. I started dabbling.

The dabbling phase

At first it was small, low-stakes stuff. Summarising things. Turning a wall of server logs into a readable incident report when a site fell over at 2am: paste the logs, one-line prompt, done. Little wins, low trust required. I wasn't betting anything important on it.

Then the wins got bigger, and so did my usage. I settled on Claude, in a browser tab that was basically always open. I'd describe a small problem, like a bit of functionality a site needed or a fiddly plugin to handle one specific thing, and download the artifact it produced. Copy, paste, tweak, ship. It became a real part of how I worked.

But a browser tab has a ceiling. You're forever copying things in and out, losing context, re-explaining your codebase every single session.

Down the rabbit hole

So I went into the terminal properly: learning MCP servers and wiring the AI into my actual environment, my actual files, my actual tools. It was not a weekend project. It was a long, frustrating slog of making AI work the way I needed it to, rather than the way the demos pretended it already did.

And somewhere in there, I hit the wall I think every serious developer using AI eventually hits:

The time you save prompting, you lose reviewing.

Because AI is confidently wrong in very specific, very repeatable ways. After enough hours you start to recognise the tells:

It calls a method that doesn't exist, a function the library never had, written so plausibly you almost don't check.
It imports a package you never installed, or invents a config option that was never in the docs.
It reaches for a deprecated pattern from three years ago, because that's what the training data was thick with.
It "helpfully" refactors code you didn't ask it to touch, and quietly reintroduces a bug you fixed last week.
And the one that started to genuinely worry me: it leaves user input unsanitised, logs things it shouldn't, and hardcodes a secret right there in the source, because it's optimising for "works", not "safe".

I found myself spending nearly as many hours auditing AI output as I used to spend writing the code myself. I couldn't find the balance between prompting and policing. And in an industry this competitive, "slower but I trust it" and "faster but I'm anxious about it" are both losing positions.

That security thread turned out to be the loose end that unravelled everything else.

How Ghostables was born

A client raised their hand and asked the question I'd been quietly dreading: "If you're using AI to build our software, how do we know our data is safe?"

Fair question. And I didn't have a clean answer.

So I went and built one. That became Ghostables, a way to make the data a project holds genuinely defensible, so that "we used AI" and "your customers' data is protected" can both be true at once. I won't go deep here; the point is it wasn't a startup idea on a whiteboard. It was a real problem a real client had, and I built the thing that solved it.

But solving it exposed something bigger about how we were working.

The problem that became Snagger

The agency I was at delivered projects… backwards. Feedback lived in screenshots with red circles scrawled on them. Client changes arrived as "the button on the about page, the second one, not that one." Sign-off was a verbal "yeah, that's fine" that evaporated the second a dispute started. Nothing was tracked. Everyone re-explained everything five times.

If you've ever sent a developer a screenshot with a red circle and the note "this bit looks off", you already know the exact problem I'm describing.

I genuinely tried not to build this

Before I wrote a line of code, I did the responsible thing and went looking for the tool that already did the job. I didn't want to build a product; I wanted to use one. So I tried a lot of them.

The visual-feedback tools are good at what they do. BugHerd, Marker.io, Usersnap, Pastel, Ruttl and the rest all do their job well. But they nearly all work one of two ways: either you install a script or widget on the site, or they freeze the page into a static screenshot you annotate on top of. The first needs developer cooperation and falls over the moment you're reviewing a client's existing site you don't control, a staging build behind auth, or anything with a strict security policy that throws the snippet straight out. The second hands you a dead image: the menus don't open, the forms don't submit, the JavaScript doesn't run, and you've lost the exact behaviour you were trying to give feedback on. The thing I most needed to review was usually the thing I either couldn't install on, or couldn't afford to flatten into a screenshot.

And even when they worked, they only ever solved one slice of the project: the bug list. The feedback lived in BugHerd, but the design review lived in Figma comments, the sitemap lived in Slickplan or FlowMapp, the wireframes lived in Balsamiq or back in Figma, the tasks lived in Trello or ClickUp, and the client approval lived… nowhere, really, beyond an email that said "looks good." Five or six subscriptions, none of them talking to each other, and the context dying every time it crossed a boundary.

The part that finally pushed me over the edge was sign-off. Not one of them gave me a clean, formal record saying "the client approved this page, on this date, at this viewport", the kind of record that ends a dispute instead of starting one. So every approval was a verbal maybe, and every change was an argument waiting to happen.

I didn't set out to build a product. I set out to find one that covered the whole arc, from audit and planning to wireframing, build review, content, and client sign-off, all without making me install anything on a site I didn't own. It didn't exist. So I built it.

What Snagger actually does

Snagger lets anyone point, click, and comment directly on a live website. And it works on any site, with zero installation, because it serves the real, fully-interactive site back through a reverse proxy instead of a dead screenshot or a snippet you have to beg a developer to install. Every comment is anchored to the exact element, on the exact page, at the exact viewport, with a screenshot, an author, a status, a due date. "The spacing is wrong on mobile" becomes a precise, trackable task pinned to the actual thing.

And then it keeps going, because feedback was only half the mess. Snagger carries a project across its whole life: audit an existing site, plan the new sitemap, wireframe it on a proper canvas, review the build, stage copy changes, and, the part that protects you, formal client sign-off, per page, per viewport, with revision rounds tracked. New feedback automatically revokes approval, so "but you said it was approved" is finally over. Agencies can white-label the whole thing and run it as their own tool.
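To make the "new feedback revokes approval" rule concrete, here is a minimal sketch of how it could be modelled. It is an illustration under assumptions, not Snagger's actual code: the `PageApproval` class, its fields, and the choice to bump the revision round on revocation are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PageApproval:
    """Approval state for one page at one viewport (hypothetical model)."""
    page: str
    viewport: str
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None
    revision_round: int = 1
    comments: list = field(default_factory=list)

    @property
    def approved(self) -> bool:
        # A page counts as approved only while a sign-off timestamp is set.
        return self.approved_at is not None

    def sign_off(self, client: str) -> None:
        # Record who approved and when, in UTC.
        self.approved_by = client
        self.approved_at = datetime.now(timezone.utc)

    def add_feedback(self, author: str, text: str) -> None:
        # Store the comment; if the page was approved, the new feedback
        # revokes that approval and opens the next revision round
        # (an assumed policy for this sketch).
        self.comments.append((author, text))
        if self.approved:
            self.approved_by = None
            self.approved_at = None
            self.revision_round += 1


about_mobile = PageApproval(page="/about", viewport="mobile")
about_mobile.sign_off("client@example.com")
about_mobile.add_feedback("client@example.com", "Spacing is wrong under the hero")
print(about_mobile.approved, about_mobile.revision_round)
```

Keeping approval as state on the page-and-viewport pair, rather than as a message in an inbox, is what lets a later comment invalidate it automatically.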

Stop screenshotting. Start shipping. That's the entire pitch.

The third piece: keeping AI on the rails

Building two real products with AI meant living inside that prompting-vs-policing problem every single day. I needed a way to keep the agent on the rails: to hold it to what I'd actually asked for, stop it wandering off and rewriting things, and (not a small thing when you're paying per token) stop it burning context re-reading everything on every turn.

So I built a third thing: a control layer that sits around the coding agent. It keeps my intent written down where the AI has to follow it, checkpoints work so I can pick up cleanly instead of dragging a giant conversation around, and cuts my token usage dramatically in the process. (It's got a name and a home coming; I'm just not announcing it until the domain's mine. Bear with me.)

And building that is where the whole thing clicked into one idea.

Where I think this is all going

As AI writes more and more of the code that runs the world, the scarce thing won't be generating software. It'll be trusting it. Who verified this? Against what rules? Was the data handled properly? Did the agent actually do what it was told, or just something that looked close enough to pass review?

Right now almost nobody can answer those questions with a straight face. That's the gap I'm building toward: verification, attestation, and trust as first-class parts of how software gets delivered. Ghostables is the data side of it. The control layer is the process side. Snagger is where the work and the client meet in the open, on the record. Different products, one thesis: trust has to become something you can show, not just claim.

Why I'm telling you all this

I don't want passive users. I've built these tools out of real problems, but the surest way to build the wrong thing is to build it alone in a room. I want developers, designers, and agencies who'll put Snagger on actual client work and tell me, bluntly, where it falls short. The people who do this every day know things I can only guess at.

So I'm opening up early access, and the people who come in now to genuinely help shape it get the highest tier, free, and a direct line to me. Not a focus group. Collaborators. If the feedback gap, the client-sign-off chaos, or the AI-trust problem is something you live with, I'd love your hands on this.

Take a look at snagger.io, or just reach out and tell me what you'd want it to do. I'm a developer who fell behind, clawed his way back, and built the tools he wished existed along the way.

If any of that sounds like your week, come build it with me.

Top comments (2)

Mike Czerwinski • Jun 24

"Trust has to become something you can show, not just claim" is the part I keep watching teams skip. The prompting-vs-policing gap is the same shape as every operator-side substrate problem I keep landing on from a different angle.

The pattern in your three products is what makes the thesis credible. Ghostables for data trust, Snagger for client sign-off, control layer for AI rails. Same shape, different surfaces. Trust ledger that survives the work, not a claim that decorates it.

The control layer is the one I'd want to read about. Specifically: how it answers "which layer made the call" when you reconstruct a decision later. Keyword pass writes its verdict, router writes its verdict, agent writes its verdict, human writes their verdict. Honest implementations leave that trail visible, not subsumed by the last layer. Curious how yours handles that.

Sean Burn • Jun 25

Hey Mike, thanks for your input. So refreshing to get a well-crafted and thorough response.

This is the question I care most about, so let me answer it straight.

You framed it as keyword, router, agent, human. In mine the shape's a little different: three voices, not four, and none of them gets to speak for the others.

The deterministic scan runs first and writes its findings as first-class records: the rule, the file, the line, the severity, and a plain-English reason it fired. It's its own artifact, not a paragraph buried inside something else.
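As a rough illustration of findings as first-class records, a deterministic scan could look something like the sketch below. The `Finding` fields mirror the ones listed above; the two regex rules, their names, and their severities are made up for the example and are not the real rule set.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One scan result: rule, file, line, severity, plain-English reason."""
    rule: str
    file: str
    line: int
    severity: str
    reason: str


# Hypothetical rules: (name, severity, pattern, reason).
RULES = [
    ("hardcoded-secret", "high",
     re.compile(r"(?i)(api_key|secret|password)\s*=\s*['\"][^'\"]+['\"]"),
     "A credential-like name is assigned a string literal."),
    ("eval-call", "medium",
     re.compile(r"\beval\("),
     "eval() executes whatever string it is given."),
]


def scan(file: str, source: str) -> list[Finding]:
    """Run every rule over every line; same input always gives same output."""
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for rule, severity, pattern, reason in RULES:
            if pattern.search(text):
                findings.append(Finding(rule, file, lineno, severity, reason))
    return findings


for finding in scan("app.py", 'x = 1\nAPI_KEY = "abc123"\n'):
    print(finding)
```

Because each finding is its own immutable record, a later layer can cite it but has no way to rewrite it.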

The model verdict judges the change against the committed contract, one ruling per guardrail: honored, violated, or not-evaluable, each with a cited file and hunk or an explicit "nothing here to cite." What keeps it honest is the stance it's handed: it's prompted as an auditor that did not write the change, told to judge rather than defend, and not to overclaim. And the safety findings stay the scan's own layer; the model doesn't get to re-derive or relabel them.

The human layer is its own record too: the contract is versioned, sign-off is an explicit acknowledged checkpoint with a timestamp, and governance actions land in an audit trail.

Every verdict then lands as one entry in an append-only ledger, and each entry carries its own provenance: which model ruled, which coding agent produced the change being judged, whether it was self-run by the operator or attested in CI, a content fingerprint of the entry, and a tamper-evident signature. That ledger is public and append-only, each entry linked to the one before it, so a past entry can't be altered or dropped without breaking the chain. The trail genuinely survives the work; the newest entry never rewrites the earlier ones, because they're separate records, not a collapsed summary.
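The chaining idea can be sketched in a few lines. This is a generic hash-chain illustration, not the real ledger: the entry layout and function names are hypothetical, and the HMAC with a shared key is only a stand-in for the signature (a public log would more plausibly use an asymmetric signature so that anyone can verify it).

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous fingerprint" for the first entry


def _fingerprint(body: dict) -> str:
    # Canonical JSON so the same content always hashes to the same value.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


def append_entry(ledger: list, verdict: dict, key: bytes) -> dict:
    """Append a verdict, linking it to the fingerprint of the prior entry."""
    prev = ledger[-1]["fingerprint"] if ledger else GENESIS
    body = {"prev": prev, "verdict": verdict}
    fp = _fingerprint(body)
    sig = hmac.new(key, fp.encode(), hashlib.sha256).hexdigest()
    entry = {**body, "fingerprint": fp, "signature": sig}
    ledger.append(entry)
    return entry


def verify(ledger: list, key: bytes) -> bool:
    """Recompute every link; any edited or dropped entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        body = {"prev": entry["prev"], "verdict": entry["verdict"]}
        fp = _fingerprint(body)
        sig = hmac.new(key, fp.encode(), hashlib.sha256).hexdigest()
        if entry["prev"] != prev or entry["fingerprint"] != fp:
            return False
        if not hmac.compare_digest(entry["signature"], sig):
            return False
        prev = fp
    return True


key = b"demo-key-not-for-production"
ledger = []
append_entry(ledger, {"guardrail": "no-secrets", "ruling": "violated",
                      "source": "self-run"}, key)
append_entry(ledger, {"guardrail": "no-secrets", "ruling": "honored",
                      "source": "ci-attested"}, key)
print(verify(ledger, key))
```

Each entry's fingerprint covers the previous fingerprint, so editing or removing an earlier entry changes every hash after it, and `verify` catches the mismatch.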

The honesty bit I won't dress up, and it's right there in the public report: a self-run verdict is labelled as exactly that, something the operator could have re-run, while the CI-attested ones are the independently verifiable kind. I'd rather show the seam than hide it.

What's live today is each verdict rendered with its per-rule evidence, the source-attributed ledger, and the public append-only log. The part I'm still hardening is folding those into a single "show me why this was allowed" view, so you read one reconstruction instead of cross-referencing the scan, the verdict, the contract version, and the audit trail by hand. That's exactly where I'd value your pressure.

You clearly come at this from the operator side, and that's the perspective I'd most want to pressure-test the design against. I'll be straight, though: the control layer isn't something I'm opening up to people yet. It's the newest of the three, and I'm keeping it close while that reconstruction story gets sharper. But I'd genuinely value comparing notes properly.