Sleep through your next deploy.

A deploy can quietly break your signup, login, or checkout — and you usually hear about it from a customer who already left. Prufa is an AI QA engineer that runs those flows in a real browser, with assertions on every outcome. Every release, witnessed.

Free 60-second audit, no signup. Performance, accessibility, SEO, analytics, consent — machine-verified and graded A–F; AI judgments are labeled advisory. Public pages only; reports are unlisted and noindex by default.

In June 2026 we audited 49 fresh Show HN launches — 38 had a critical bug on day one.

What is Prufa?

Prufa is an AI QA engineer for web products: it runs your critical user journeys — signup, login, checkout — in a real browser after every deploy and machine-verifies the result. An LLM-backed agent navigates your app like a real user; plain code asserts what actually happened, so you get findings backed by evidence, not an AI's opinion. Verified findings are machine-checked; the AI's judgments are labeled advisory and never graded as facts.

Real-browser execution · assertions on every outcome · screenshots, console and network logs · CI deploy-hook and scheduled monitors · Slack alerts · CLI, HTTP API and MCP server. One product, two surfaces — a dashboard for people, an API for agents.

Inside the product

Real screens from the Prufa dashboard, shown with a demo workspace. Everything pictured is also an API object — the dashboard and the JSON are two views of the same state. The links below open the app — sign in to use them.

A test you can read — and confirm before it runs

Write the flow in plain English; Prufa compiles it to a reviewable spec — numbered steps, real selectors, explicit expectations, severity-tagged assertions. You confirm exactly what will run: the runner executes the spec, the LLM never improvises steps. Once confirmed, the spec replays as plain code, run after run — the model re-engages only when your UI changes and a selector needs re-resolving. Credentials are stored encrypted and write-only.
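To make "reviewable spec" concrete, here is a minimal sketch of how numbered steps, selectors, explicit expectations, and severity-tagged assertions could be modeled as plain data. The field names (`action`, `selector`, `expect_url`, `severity`), the action verbs, and the selectors are assumptions for illustration only; they are not taken from Prufa's flow-spec v1 format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    number: int
    action: str                        # hypothetical verbs: "goto", "fill", "click"
    selector: Optional[str] = None     # CSS selector the runner would target
    value: Optional[str] = None        # text to type, or a path for "goto"
    expect_url: Optional[str] = None   # explicit expectation checked after the step


@dataclass(frozen=True)
class Assertion:
    description: str
    severity: str                      # hypothetical levels: "critical", "warning"


@dataclass
class FlowSpec:
    name: str
    steps: List[Step]
    assertions: List[Assertion] = field(default_factory=list)
    confirmed: bool = False            # nothing should run until a human confirms

    def render(self) -> str:
        """Return the human-readable listing a reviewer would confirm."""
        lines = [f"Flow: {self.name}"]
        for s in self.steps:
            parts = [f"{s.number}. {s.action}"]
            if s.selector:
                parts.append(f"[{s.selector}]")
            if s.value:
                parts.append(repr(s.value))
            if s.expect_url:
                parts.append(f"expect url {s.expect_url}")
            lines.append(" ".join(parts))
        for a in self.assertions:
            lines.append(f"assert ({a.severity}): {a.description}")
        return "\n".join(lines)


spec = FlowSpec(
    name="login-happy-path",
    steps=[
        Step(1, "goto", value="/login"),
        Step(2, "fill", selector="input[name=email]", value="qa@example.com"),
        Step(3, "click", selector="button[type=submit]", expect_url="/dashboard"),
    ],
    assertions=[Assertion("no console errors during the flow", "critical")],
)
print(spec.render())
```

The point of the shape is that every step is inspectable before it runs: a reviewer reads the rendered listing, and only a confirmed spec is handed to the runner.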

Write your first flow →

Screenshot: flow review panel in the Prufa dashboard, showing a confirmed login-happy-path flow with six numbered spec steps, mono selectors, explicit URL and text expectations, and two severity-tagged post-flow assertions.

A confirmed login flow, compiled from a plain-text test case, ready to run and to attach to monitors.

Every flow, re-checked around the clock

Monitors run your flows on a schedule — hourly or daily — and flag the first regression. One look says what's green, what's failing, and since when.

Start watching your site →

Screenshot: Monitors page in the Prufa dashboard, showing a red failing banner, a watch-a-site form, and a list of monitored URLs with status chips and next-check countdowns.

Four monitors, one failing: checkout has regressed for three consecutive runs.
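Answering "what's failing, and since when" comes down to finding the current run of consecutive failures in a monitor's history. The sketch below shows one way to compute it; the `Run` and `Streak` types are hypothetical and say nothing about how Prufa stores runs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Run:
    finished_at: datetime
    ok: bool


@dataclass(frozen=True)
class Streak:
    failing_since: datetime      # timestamp of the first failure in the streak
    consecutive_failures: int


def current_failure_streak(runs: List[Run]) -> Optional[Streak]:
    """Expects runs sorted oldest first. Returns None if the latest run passed."""
    count = 0
    since: Optional[datetime] = None
    for run in reversed(runs):   # walk back from the newest run
        if run.ok:
            break
        count += 1
        since = run.finished_at
    return Streak(since, count) if since is not None else None


def at(hour: int) -> datetime:
    return datetime(2026, 6, 10, hour, 0, tzinfo=timezone.utc)


history = [Run(at(16), True), Run(at(17), False), Run(at(18), False), Run(at(19), False)]
streak = current_failure_streak(history)
print(streak)
```

With the sample history, the streak starts at the 17:00 run and spans three runs, which is the situation the caption above describes for the checkout monitor.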

Failures in plain English, re-verified on every deploy

Each run states what actually happened — "Stripe checkout opened, but plan copy changed" — with the full report one click away. POST the deploy hook from CI and the monitor re-verifies immediately, instead of waiting for the next scheduled run.
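From a CI job, posting the deploy hook can be a few lines of standard-library Python. This is a sketch under assumptions: the header name `X-Hook-Secret` and the environment variable names are placeholders, so substitute the webhook URL and secret header shown on your monitor's deploy hook card.

```python
import os
import urllib.request


def build_deploy_hook_request(hook_url: str, secret: str) -> urllib.request.Request:
    # ASSUMPTION: "X-Hook-Secret" is a placeholder header name, not Prufa's real one.
    return urllib.request.Request(
        hook_url,
        data=b"",                 # empty body; the POST itself is the signal
        method="POST",
        headers={"X-Hook-Secret": secret},
    )


def trigger(hook_url: str, secret: str, timeout: float = 10.0) -> int:
    """Send the POST and return the HTTP status. Raises on 4xx/5xx or network errors."""
    req = build_deploy_hook_request(hook_url, secret)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical variable names; keep the secret in your CI secret store.
    url = os.environ.get("DEPLOY_HOOK_URL")
    secret = os.environ.get("DEPLOY_HOOK_SECRET")
    if url and secret:
        print("hook responded with HTTP", trigger(url, secret))
    else:
        print("DEPLOY_HOOK_URL / DEPLOY_HOOK_SECRET not set; skipping")
```

Run it as the last step of the deploy job so the re-verification starts only after the new release is live.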

Set up a deploy hook →

Screenshot: monitor detail page in the Prufa dashboard, showing a recent-runs list with failed and succeeded chips, and a deploy hook card with the webhook URL and secret header.

The failing checkout monitor: recent runs with reasons, and its deploy hook for CI.

See pricing.

Findings from real audits

Three findings from recent free audits, sites anonymized — the page-health pass that runs before your flows. Every one was sitting on a live production site, unnoticed. Flow checks like the board above come with monitoring.

Warning: 6 JavaScript errors on page load (✓ verified)

ux.console_errors · indie SaaS app · 2026-06-10 19:23 UTC

Errors at load time often mean broken features visitors never report — they just leave.

Warning: 5 broken internal links (✓ verified)

links.internal · developer-tools site · 2026-06-10 19:22 UTC

Dead links bleed trust and waste crawl budget — and nobody clicks every link on their own site to check.

Critical: No analytics events detected (✓ verified)

tracking.no_events · open-source project site · 2026-06-10 19:23 UTC

No visitor data is being collected — every ad click and signup is invisible to the team.

60 seconds, no signup — see what's sitting on yours.

What does Prufa check?

Your flows first, then page health — a fixed, deterministic suite. Every check either passes, fails with evidence, or says it couldn't run. Never a vibe.

User flows: Describe a flow in plain English, for example "go to /signup, fill email, click submit, expect /dashboard." Prufa runs it in a real browser and verifies the result. No selectors to write, no test boilerplate.
Forms & console errors: JavaScript errors at load and on interaction, the silent feature-breakers visitors never report.
Broken links & UX: Dead internal links, viewport overflow, contrast and tap-target accessibility. Checked, not eyeballed.
SEO: Title, meta, canonical, social cards, headings, robots.txt, sitemap. Deterministic, no Lighthouse noise.
Tracking & pixels: GA4, GTM, Meta, TikTok, LinkedIn. Present, firing once, with the right account IDs.
Consent: Cookie banners that actually gate tracking, checked through beacons-before-consent and other provable consent signals.
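A check with three possible outcomes (passes, fails with evidence, or says it couldn't run) can be sketched as a pure function over captured network traffic. The example below illustrates a beacons-before-consent check. The tracker host list, the `Request` and `Verdict` types, and the status strings are assumptions for illustration, not Prufa's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
from urllib.parse import urlparse

# ASSUMPTION: a tiny illustrative host list; a real suite would maintain its own.
TRACKER_HOSTS = ("google-analytics.com", "googletagmanager.com", "facebook.com")


@dataclass(frozen=True)
class Request:
    url: str
    t_ms: int                 # milliseconds since navigation start


@dataclass(frozen=True)
class Verdict:
    status: str               # "pass", "fail", or "could_not_run"
    evidence: Tuple[str, ...] = ()


def is_tracker(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return any(host == h or host.endswith("." + h) for h in TRACKER_HOSTS)


def beacons_before_consent(requests: List[Request], consent_t_ms: Optional[int]) -> Verdict:
    """Fail if any tracker request fired before the visitor accepted the banner."""
    if consent_t_ms is None:
        # No observable consent moment: report that honestly instead of guessing.
        return Verdict("could_not_run")
    early = tuple(r.url for r in requests if r.t_ms < consent_t_ms and is_tracker(r.url))
    return Verdict("fail", early) if early else Verdict("pass")


captured = [
    Request("https://example.com/app.js", 100),
    Request("https://www.google-analytics.com/g/collect?v=2", 420),
    Request("https://www.googletagmanager.com/gtm.js", 1900),
]
print(beacons_before_consent(captured, consent_t_ms=1500))
print(beacons_before_consent(captured, consent_t_ms=None))
```

The failing verdict carries the offending URLs as evidence, and a missing consent timestamp yields `could_not_run` rather than a guess.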

How does Prufa work?

The AI navigates your site like a user — loads pages, fills forms, clicks through. It explores; it doesn't judge.
Plain code verifies what actually happened — captured network traffic, console output, response codes. Deterministic checks, same input, same verdict.
You get findings in two clearly separated tiers: verified (machine-checked, with evidence) and advisory (the AI's opinion, labeled as such — never phrased as broken).
Confirmed flows replay as plain code — pinned selectors, deterministic verdicts, no model in the loop. The AI re-engages only when your UI changes and a step needs re-resolving. Same flow, same code path, every run.
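The split between verified and advisory findings can be illustrated with a small data model. In this sketch the check is a pure function over captured console output, so the same log always produces the same finding, and only verified findings are allowed to fail a run. The `Finding` type and the `consent.state` check id are hypothetical; `ux.console_errors` is the id shown in the sample findings on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Finding:
    check_id: str
    tier: str                 # "verified" or "advisory"
    severity: str             # labels as used on this page: "critical", "warning", "opinion"
    title: str
    evidence: Tuple[str, ...] = ()


def console_errors_check(console_log: List[Dict[str, str]]) -> List[Finding]:
    """Deterministic: no model call, no clock, no randomness."""
    errors = tuple(e.get("text", "") for e in console_log if e.get("level") == "error")
    if not errors:
        return []
    noun = "error" if len(errors) == 1 else "errors"
    title = f"{len(errors)} JavaScript {noun} on page load"
    return [Finding("ux.console_errors", "verified", "warning", title, errors)]


def failing(findings: List[Finding]) -> List[Finding]:
    """Advisory findings are reported but never counted as failures."""
    return [f for f in findings if f.tier == "verified"]


log = [
    {"level": "error", "text": "TypeError: x is undefined"},
    {"level": "info", "text": "app booted"},
]
findings = console_errors_check(log) + [
    # ASSUMPTION: "consent.state" is a made-up id for the advisory example.
    Finding("consent.state", "advisory", "opinion", "Consent state could not be verified"),
]
for f in findings:
    print(f.tier, "|", f.severity, "|", f.title)
```

Keeping the tier on the finding itself means a report can show both kinds side by side while any pass/fail decision reads only the verified ones.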

What advisory looks like

Opinion: Consent state could not be verified

The cookie banner's accept flow didn't expose a readable consent state. (An observation, not a verdict — worth a look.)

Why this exists: AI made shipping faster — it didn't make verifying faster. Development runs at agent speed while QA still runs at human speed, and the surface area to verify grows faster than anyone can click through. The fix isn't more clicking. It's an engineer who never stops checking.

Find the bugs you never wrote a test for

Flows defend the paths you know to check. Gremlin mode covers the ones you didn't: an LLM-backed agent pokes your app like a deliberately difficult user — confused, impatient, fat-fingered — while plain code grades what breaks. Unscripted, dry-run by default, never touches real data. We pointed it at our own site and it caught a mobile-layout bug a green CI had just shipped.

See Gremlin mode →

A Pro feature. Read the dogfooding write-up.

Then keep it that way

Continuous monitoring runs your flows — signup, login, checkout — plus every check above, on a schedule and on every deploy, and alerts you when something breaks. That's where "around the clock" gets real.

The free audit is step one: it shows you what Prufa sees. Monitoring is the product: you stop checking, Prufa doesn't.

Start watching your site →

Plans start at $29/mo. See pricing.

For your agent

If your AI ships code, your AI should also verify it. The CLI, HTTP API, MCP server and skill all hit the same product.

$ prufa init --base-url https://prufa.dev
$ prufa audit https://yourapp.com     # one-shot, JSON to stdout
$ prufa watch https://yourapp.com     # 1-click continuous monitor (paid)

→ Agent skill (SKILL.md) · MCP server: prufa-mcp · HTTP API: /api/v1/docs · OpenAPI: /api/v1/openapi.json · Specs: BeaconEvent v1 · flow-spec v1
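Because `prufa audit` writes JSON to stdout, an agent or CI step can gate on the result. The sketch below is hedged heavily: the report shape (a top-level `findings` list whose items carry `severity` and `title`) is an assumption made for illustration, so check the OpenAPI document for the real schema before relying on it.

```python
import json
import subprocess
import sys
from typing import List


def critical_titles(report_json: str) -> List[str]:
    """Extract titles of critical findings from an audit report.

    ASSUMPTION: the report is an object with a "findings" list whose items
    have "severity" and "title" keys. This shape is hypothetical.
    """
    report = json.loads(report_json)
    return [
        f.get("title", "")
        for f in report.get("findings", [])
        if f.get("severity") == "critical"
    ]


def main(url: str) -> int:
    # Runs the one-shot audit command shown above and captures its stdout.
    proc = subprocess.run(
        ["prufa", "audit", url], capture_output=True, text=True, check=True
    )
    titles = critical_titles(proc.stdout)
    for title in titles:
        print("critical:", title)
    return 1 if titles else 0     # non-zero exit fails the CI step


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

Invoked as `python gate.py https://yourapp.com` (the script name is arbitrary), it exits non-zero when the assumed report contains a critical finding.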

Frequently asked questions

The questions we hear most before someone runs the first audit — answered honestly, including where Prufa is not the right tool.

Does Prufa replace Playwright or my existing end-to-end tests?

No — Prufa is complementary. It runs your critical flows in a real browser and machine-verifies the outcome, and confirmed flows replay as plain code with no model in the loop. If you already maintain Playwright or Cypress suites, Prufa adds continuous, plain-English flow coverage on top — it doesn't ask you to throw away code you trust.

Can Prufa test Stripe checkout and other payment flows?

Yes. Prufa walks a checkout flow in a real browser and verifies the payment step behaves as expected — for example, that Stripe checkout opens after a plan upgrade. It never uses real payment instruments or places real orders, so monitoring your money path is safe to run continuously, on a schedule and on every deploy.

Does Prufa use AI, and are the results deterministic?

Yes to both. An LLM-backed agent navigates your app like a user, but plain code does the verifying, so verdicts are deterministic and reproducible. Confirmed flows replay as pinned-selector code that makes zero model calls; the model re-engages only when your UI changes. Findings come in two tiers: verified (machine-checked, with evidence) and advisory (the AI's opinion, always labeled as such).

Can Prufa test pages behind a login?

Yes. Give Prufa a set of test credentials and it signs in to exercise authenticated flows such as dashboards, settings, and billing. Credentials are stored encrypted and write-only, so your login details stay out of the report you share. The runner executes the spec you confirmed rather than improvising.

Is there a free version, and how do I start?

Yes. The free 60-second audit takes a URL — no signup, no card — and returns machine-verified findings on page health, broken links, console errors, SEO, and tracking. Continuous monitoring of signup, login, and checkout flows, with Slack alerts and deploy-hook re-checks, starts at $29/mo (see pricing). Agents can start through the CLI, HTTP API, or MCP server.

See what Prufa sees

60 seconds, free, no signup. The report is yours to share.

or see pricing →