Andrej Karpathy, OpenAI co-founder and former Tesla AI director, called Claude Tag the third major redesign of LLM UI/UX. First the LLM was a websi...
Attestation as record-not-approval is the primitive most teams will skip when they retrofit Claude Tag into existing review practice. It is also the part of your post I want to keep, because it names a distinction the rest of the conversation collapses: approval implies endorsement of the decision, attestation only commits the reviewer to having read the artifact and accepted accountability for what happens next. Those are different liabilities and they need different surfaces.
The hash convergence on two-reviewer attestation does interesting work I had not seen named explicitly. Two reviewers attesting the same session get the same hash by construction, because the artifact is content-addressed. That gives you a verifiable record of independence-via-content, which is different from independence-via-different-judgment. Independence-via-content says "we both saw the same thing"; independence-via-judgment says "we agreed despite seeing differently." Most agentic verification setups conflate these and end up with neither.
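To make the "same hash by construction" point concrete, here is a minimal sketch. It assumes the session artifact is a JSON-serializable dict and that the content address is SHA-256 over a canonical encoding. The names artifact_hash and attest, and the placeholder field values, are hypothetical and not taken from any existing tool.

```python
import hashlib
import json


def artifact_hash(artifact: dict) -> str:
    """Content address: SHA-256 over a canonical JSON encoding of the artifact."""
    # sort_keys plus fixed separators make the encoding independent of key order.
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def attest(reviewer: str, artifact: dict) -> dict:
    """Record that `reviewer` read this exact artifact. There is no verdict field by design."""
    return {"reviewer": reviewer, "artifact_sha256": artifact_hash(artifact)}


# Hypothetical five-element frame; the values are placeholders.
session = {
    "original_promise": "add retry to uploader",
    "acceptance_criteria": ["retries 3 times", "logs each failure"],
    "diff": "+ retry(upload, attempts=3)",
    "evidence": ["test_retry passed"],
    "unresolved_assumptions": ["server is idempotent"],
}

# The second reviewer receives the same content with keys in a different order.
reordered = dict(reversed(list(session.items())))

first = attest("reviewer-a", session)
second = attest("reviewer-b", reordered)
same_content = first["artifact_sha256"] == second["artifact_sha256"]

# Any change to the artifact produces a different address.
edited = dict(session, diff="+ retry(upload, attempts=5)")
changed = artifact_hash(edited) != artifact_hash(session)
```

Matching hashes only establish independence-via-content: both reviewers attested the same bytes. Nothing in the record says whether their judgments agreed.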
The piece that maps directly from your single-agent framework to the ambient-team case, and where I think the harder problem hides, is: at team scale, who authors the five-element frame matters as much as who attests to it. If the same agent that made the decision also assembles "original promise / acceptance criteria / diff / evidence / unresolved assumptions," the attestation is over a self-curated artifact. Structural separation between deciding agent and frame author is the part that needs ambient infrastructure most teams don't have yet, and it does not get solved by "tag a different agent to write the summary" because the deciding agent's outputs are still the only source the framer sees.
Gail Weiner's "skeptic on record as approver" mechanism is also the social version of your hash receipt: a public commitment a third party can point at later. That is the same primitive at a different layer, content-addressed accountability at the technical layer, name-addressed accountability at the organizational layer. Both have the same structural property: the accountability is hard to relocate after the fact.
Honest stage marker on this side: I work adjacent (operator-side decision audit on dev.to, with a parallel decision-audit primitive set called jugeni-contracts published this week). Reading your post sharpened how I think about the attestation-vs-approval split, which is a distinction I had been leaving implicit.
You're right, and it's worse in my own repo than the Claude Tag case makes obvious. The ledger rows are written by the same process that made the decisions. ReviewSurface reads self-reported evidence and calls it independent verification. Separating framer from decider means an observer outside the agent's own process: something logging tool calls and token counts at the system level, not trusting the agent to log itself honestly. I haven't built that yet. Where would you put the boundary?
The boundary is the transport seam, not the code path. The agent can lie about what it did internally; it cannot lie about what bytes left its process. So the observer has to live where the agent's output crosses into the world — the syscall boundary, the MCP transport, the LLM API call, the file write fsync. Not inside the agent's loop, calling it "instrumented."
Concretely, three properties the observer needs:
Different process, append-only sink it cannot rewrite. If the agent can edit the log, the log is self-report under a different name. OTel collector in another container, or a unix socket to a separate writer with one-way pipe. The agent doesn't get a handle to the past.
Capture at the wire, not the wrapper. Token counts and tool calls logged by whatever talks to the model provider, not by the agent's own SDK wrapper. The wrapper is on the agent's side of the seam. Anthropic's API logs, OpenAI usage records, MCP server-side logs — those sit on the world's side.
Content-addressed, so the ledger row points at the bytes that actually went over the wire. Hash of the prompt, hash of the response, signed by the transport layer. If ReviewSurface reads evidence_hash and the observer's log has a different hash for the same call, you have a discrepancy without trusting either side.
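The third property can be sketched roughly as follows. This is an illustration under assumptions, not a description of any existing transport: it assumes the observer sees the raw prompt and response bytes, and it stands in for "signed by the transport layer" with an HMAC whose key the agent never holds. The function names and record fields are hypothetical.

```python
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _payload(body: dict) -> bytes:
    # Canonical encoding so signer and verifier hash identical bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def observer_record(call_id: str, prompt: bytes, response: bytes, key: bytes) -> dict:
    """Build a record that points at the exact bytes seen on the wire."""
    body = {
        "call_id": call_id,
        "prompt_sha256": sha256_hex(prompt),
        "response_sha256": sha256_hex(response),
    }
    # Assumption: `key` lives only in the observer process, never in the agent.
    body["sig"] = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return body


def verify_record(record: dict, key: bytes) -> bool:
    """Recompute the signature over everything except `sig` and compare."""
    body = {k: v for k, v in record.items() if k != "sig"}
    expected = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("sig", ""))


key = b"observer-only-secret"  # placeholder value for the sketch
record = observer_record("call-1", b"prompt bytes", b"response bytes", key)

# A record whose hash was edited after the fact no longer verifies.
tampered = dict(record, response_sha256=sha256_hex(b"different bytes"))
```

The signature only shows that the record came from whoever holds the key. The append-only property still has to come from where the record is written, which this sketch does not cover.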
The honest version of "I haven't built that yet" is that almost nobody has — most agent observability is wrapper-side, which means it is still self-report with extra latency. The closest production examples I have seen are the ones that piggyback on existing transport layers nobody owns: provider usage records, MCP server logs, syscall auditing. Those exist because somebody else wrote them for a different reason.
The shape that would land in your repo without rewriting everything: a sidecar process subscribing to the MCP server's stdio, hashing each call and response, writing append-only to a sink the agent's runtime cannot reach. ReviewSurface keeps reading its own ledger; an external diff job compares the two. Disagreement is the signal, agreement is not.
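A minimal sketch of that external diff job, assuming both sides can be reduced to a mapping from call id to content hash. The function name diff_ledgers, the bucket names, and the sample data are hypothetical; how ReviewSurface actually stores evidence_hash is not something I know.

```python
from __future__ import annotations


def diff_ledgers(agent_ledger: dict[str, str], observer_log: dict[str, str]) -> dict[str, list[str]]:
    """Compare the agent's self-reported hashes with the observer's hashes.

    Both arguments map call_id -> content hash. The result lists the call ids
    in each kind of disagreement; all lists empty means no discrepancy found.
    """
    agent_ids = set(agent_ledger)
    observer_ids = set(observer_log)
    shared = agent_ids & observer_ids
    return {
        # Both sides logged the call but disagree on the bytes.
        "mismatched": sorted(i for i in shared if agent_ledger[i] != observer_log[i]),
        # The agent claims a call the observer never saw.
        "unobserved": sorted(agent_ids - observer_ids),
        # The observer saw a call the agent never reported.
        "unreported": sorted(observer_ids - agent_ids),
    }


# Hypothetical data: short strings stand in for real SHA-256 digests.
agent_ledger = {"call-1": "aaa", "call-2": "bbb", "call-4": "ddd"}
observer_log = {"call-1": "aaa", "call-2": "xxx", "call-3": "ccc"}

report = diff_ledgers(agent_ledger, observer_log)
has_discrepancy = any(report.values())
```

An empty report is weak evidence at best, since it only says the two records agree. A non-empty one is the signal worth acting on.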
Three-actor structure stays: frame author (agent), attester (sidecar), reader (anyone). What changes is that the attester is not running the same code as the framer.
The "wire not wrapper" distinction is the part I needed. My repo doesn't run MCP; it calls the Anthropic SDK directly, so the seam is the HTTPS call to api.anthropic.com, not stdio. That means the sidecar has to terminate TLS or proxy the connection, not just wrap the client in the same interpreter, since a same-process wrapper is still self-report with extra steps. Does the diff-job design still hold once the proxy has to decrypt to hash anything, or does that just relocate the trust problem to whoever holds the cert?
Yes, the trust problem relocates. That isn't a bug to solve, it's the property to design around. No software-only solution gives you an observer that needs zero trust; you can only push trust to actors with different motives, fewer conflicts of interest, or harder-to-compromise positions. A same-process wrapper has no independent motive and is self-report under another name. Different process, different motive, harder-to-compromise position: that is the gradient.
Three options for your Anthropic-SDK-direct setup, ranked by infra cost:
Anthropic-side usage records as already-existing observer. They log request/response metadata server-side because they have to bill you and detect abuse. That motive is not identical to your agent's motive of self-favorable reporting. You don't control what they log, you don't get content hashes by default — but it's zero new infra, and the structural property (separate party, different motive) is real. Cheap; partial coverage. Worth pulling first because someone already paid for it.
eBPF uprobes on SSL_write / SSL_read. These are user-space TLS library functions, so the hook is a uprobe rather than a kprobe: the eBPF program runs in the kernel and captures the cleartext bytes at the call, before they are encrypted. The sidecar reads that pre-encryption stream from the kernel hook. You never terminate TLS, so the cert ownership question doesn't apply. Trust relocates to the kernel, which is the threat-model boundary you were already implicitly trusting. Cilium and Falco run eBPF hooks in production for security observability. Same primitive, applied to honesty rather than security.
TLS-terminating proxy with separately-owned cert. Works, but only buys you independence if the cert owner is a different operator/team/org than the agent's owner. Same team holds the cert = security theater. Different team holds the cert = real gain, plus production complexity (cert rotation, latency, SPOF).
The cleanest of the three for a small ops surface is #2. The trust relocation lands on the kernel, which is where most threat models already draw the line, and you skip the TLS termination problem entirely. The cleartext is in process memory for a few microseconds anyway — the kernel just gets to see it without you having to MITM yourself.
Honest stage: I haven't shipped this at the SSL_write boundary in production. The eBPF pattern works for security observability — applying it to honesty observability is the same primitive, but I am reasoning by analogy on the production-readiness side. The Anthropic-side usage path I am more sure of, because it costs nothing to start using.
The eBPF route is the cleanest design, but it's a kernel-level commitment that the audience of a "minimal library, $5/month" tutorial repo won't make. Anthropic's own usage records are the only option that matches the project's existing philosophy of zero added infrastructure. Is partial coverage (metadata only, no content hash) still worth shipping as a real v2 step, or is that thin enough that it's not meaningfully different from having nothing?
The unwinnable-frame observation is the real insight here, sharper than the tooling debate. Good output reads as outsourcing, bad output confirms the fear, and there's no third result that converts a skeptic on its own. Gail's move works because it changes who's on record, not because it changes the tech. One thing I'd add: the skeptic's 'this added value' only holds if they can see what the agent did and why. Trust survives when the work is legible, not just when the output happens to be good.
Right, and it's the same requirement Mike's working through in the other thread, just at a different layer. His fix for technical trust is an independently auditable trail: something a different process can check without trusting the agent's own account. The skeptic's version of legible isn't a ledger, though. They're not opening a diff tool. What does "I can see what it did and why" actually look like for someone who was never going to read the logs?
Deep stuff, and I have the feeling you're right ... when is Anthropic going to hire you?
They don't have to. I already gave the code away.🤣
They don't have to - but they might want to, when they recognize you've got some serious "skillz" ... ;-)
Just curious, if they'd offer you a nice "position" (remote), would you consider it? :-)
(but maybe it's only possible if you relocate to the US, I don't know)
Remote, yes. Relocating to the US, no. That's the version I'd actually consider...
Oh yes that's so true, I couldn't agree more ...
I would also NEVER consider relocating to the US, even if they'd beg me (vanishingly small chance of that, lol) - especially not with the current Trump administration and their ICE insanity and all that ... thanks but no thanks!
(well it's completely theoretical, because they seem to have decided that they really do NOT want any foreigners in their country anymore, not even the best and the brightest, or the most hardworking)
lol, let's leave that one to the comments section and let the repo do the talking.