A2A, MCP, AG-UI, A2UI: The Essential 2026 AI Agent Protocol Stack

A2A→MCP→AG-UI→A2UI: Agentic Architecture Patterns & Tradeoffs. The Complete Protocol Stack Every AI Agent Builder Needs in 2026

6 min read · Jan 15, 2026

In 2026, AI agent development faces protocol overload: A2A, MCP, AG-UI, A2UI, Open-JSON-UI, and others each promise a different piece of the agent stack, but their overlaps and gaps create real confusion. New standards emerge monthly while teams struggle to choose the right combination for their agentic systems. This article provides a clear architectural map of these protocols, their precise roles, and how they layer together from agent coordination (A2A) → tools (MCP) → runtime (AG-UI) → UI (A2UI).

I will also present a custom, innovative approach to cross-protocol UI: SimpleA2UI, which can render both A2UI and Open-JSON-UI by mapping the latter into the A2UI format.

I’ve covered A2A and MCP extensively in prior work — including my custom Java implementations a2java and tools4ai. See here. This article focuses on AG-UI and A2UI — the critical user-facing layer where agents finally escape plain text chat into interactive UIs.

Quick Summary:

Agents talk to agents        → A2A
Agents call tools & systems → MCP
Agents talk to users        → AG-UI
Agents describe UI          → A2UI / Open-JSON-UI

Why do we need AG-UI?

AG-UI is a runtime protocol that carries structured events and messages between the agent backend and the user interface. It can carry generative UI specs such as A2UI, Open-JSON-UI, or others over its event streams.

Agent A ──A2A──► Agent B
         │
         ▼
      AG-UI
      (for display only)

AG-UI defines how agents and UIs communicate at runtime. It is an event-driven protocol.

Handles:

Agent lifecycle (RUN_STARTED, RUN_FINISHED)
State deltas (STATE_DELTA)
Tool calls
User actions

Works over:

SSE
WebSockets
HTTP streams

Think of AG-UI as a real-time event bus / transport layer.
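To make the event-bus analogy concrete, here is a minimal TypeScript sketch of a client reading events from an SSE stream body. It assumes the simplified event shapes used in this article rather than the full AG-UI schema, and `parseSseChunk` is a hypothetical helper, not part of any AG-UI SDK. It also assumes each event fits on a single `data:` line.

```typescript
// Simplified event shapes from this article (assumption: not the full AG-UI schema).
type AgUiEvent =
  | { type: "RUN_STARTED"; runId: string }
  | { type: "STATE_DELTA"; delta: Record<string, unknown> }
  | { type: "RUN_FINISHED"; runId: string };

// Hypothetical helper: pull JSON events out of a chunk of SSE text.
// Assumes one complete "data: <json>" line per event.
function parseSseChunk(chunk: string): AgUiEvent[] {
  const events: AgUiEvent[] = [];
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf("data:") !== 0) continue; // skip blank lines, comments, ids
    const json = trimmed.slice("data:".length).trim();
    if (json.length === 0) continue;
    events.push(JSON.parse(json) as AgUiEvent);
  }
  return events;
}

const sampleStream =
  'data: {"type":"RUN_STARTED","runId":"run_42"}\n\n' +
  'data: {"type":"STATE_DELTA","delta":{"status":"thinking"}}\n\n' +
  'data: {"type":"RUN_FINISHED","runId":"run_42"}\n\n';

const parsedEvents = parseSseChunk(sampleStream);
console.log(parsedEvents.map((e) => e.type).join(" -> "));
// RUN_STARTED -> STATE_DELTA -> RUN_FINISHED
```

The same parsed events could come from a WebSocket or an HTTP stream; only the framing step would change.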

Why do we need A2UI?

A2UI is a declarative UI specification — a format for describing UI components and associated application state that an agent wants the client to render. It doesn’t define transport rules (like bi-directional messaging) or lifecycle events by itself.

Open-JSON-UI — OpenAI  
A JSON-based UI description format aligned with OpenAI model response schemas.

A2UI — Google  
A declarative UI language designed for streaming, safety, and agent-generated interfaces.

AG-UI — CopilotKit  
A runtime interaction protocol that transports UI specs and agent events between agents and users.

A2UI defines what UI should be rendered; AG-UI defines how the agent and UI talk to each other in real time. Implementations can use A2UI payloads over AG-UI transport.

A2UI defines what the UI should look like and mean. It is a declarative UI spec.

Describes:

UI intent (surfaces, components)
Data model / state

It does not define:

Transport
Lifecycle events
Bidirectional interaction semantics

Think of A2UI as HTML / JSX for agents.

Step-by-step AG-UI flow

1️⃣ Agent execution starts

The backend agent begins a run.

{
  "type": "RUN_STARTED",
  "runId": "run_42",
  "timestamp": 1736891000
}

✅ Tip: AG-UI explicitly models agent runs and lifecycles.

2️⃣ Agent streams state changes (incremental)

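Instead of re-sending the full state, AG-UI sends state deltas. In the AG-UI spec these deltas are JSON Patch (RFC 6902) operations; the example here uses a simplified merge-style object for readability.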

{
  "type": "STATE_DELTA",
  "delta": {
    "status": "thinking",
    "progress": 0.2
  }
}

UI reacts immediately (spinner, progress bar, etc.).

✅ Tip: AG-UI is event-driven and favors deltas over re-sending full snapshots.
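On the client side, handling a delta can be as small as a merge. The sketch below assumes merge-style deltas like the `status` / `progress` object shown in this step; `applyStateDelta` is a hypothetical name, and a real client would need patch semantics if the deltas arrive as patch operations.

```typescript
// Client-side UI state; the fields mirror the STATE_DELTA example in this step.
type UiState = Record<string, unknown>;

// Shallow-merge a delta into the previous state and return a new object,
// so UI frameworks that compare references can detect the change.
// Assumption: merge-style deltas as drawn in this article.
function applyStateDelta(previous: UiState, delta: UiState): UiState {
  return { ...previous, ...delta };
}

let uiState: UiState = {};
uiState = applyStateDelta(uiState, { status: "thinking", progress: 0.2 });
uiState = applyStateDelta(uiState, { progress: 0.6 });

console.log(JSON.stringify(uiState));
// {"status":"thinking","progress":0.6}
```

Only `progress` was re-sent in the second event, yet `status` is still present in the merged state.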

3️⃣ Agent decides to render UI (via a UI spec)

The agent generates UI intent (for example, A2UI) and sends it inside an AG-UI event.

{
  "type": "STATE_DELTA",
  "delta": {
    "ui": {
      "spec": "A2UI",
      "surfaceUpdate": {
        "surface": "main",
        "components": [
          {
            "type": "chart",
            "props": { "title": "Revenue by Month" }
          }
        ]
      }
    }
  }
}

As stated earlier, one important fact:
AG-UI does not define the UI format. It only transports it.

This is where people might confuse AG-UI with a UI framework — it isn’t one.
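Because AG-UI only transports the payload, the client has to decide which renderer understands it. A rough sketch of that routing, assuming the `ui.spec` envelope field used in this article (the registry and the string results are hypothetical placeholders for real renderers):

```typescript
// Envelope used in this article: a spec name plus spec-specific fields.
interface UiPayload {
  spec: string;
  [key: string]: unknown;
}

type UiRenderer = (payload: UiPayload) => string;

// Hypothetical registry: real renderers would build DOM or native views.
const uiRenderers: Record<string, UiRenderer> = {
  "a2ui": () => "rendered with the A2UI renderer",
  "open-json-ui": () => "rendered with the Open-JSON-UI renderer",
};

// Pick a renderer by spec name (case-insensitive); never guess at unknown formats.
function routeUiPayload(payload: UiPayload): string {
  const key = payload.spec.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(uiRenderers, key)) {
    return "no renderer for spec: " + payload.spec;
  }
  return uiRenderers[key](payload);
}

console.log(routeUiPayload({ spec: "A2UI", surfaceUpdate: { surface: "main" } }));
console.log(routeUiPayload({ spec: "open-json-ui", content: { type: "chart" } }));
console.log(routeUiPayload({ spec: "something-else" }));
```

The transport code never inspects the payload body, which is what keeps AG-UI independent of any particular UI format.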

4️⃣ User interacts with the UI

User clicks “Refine”.

That user action is sent back to the agent via AG-UI.

{
  "type": "USER_ACTION",
  "action": "refine_chart",
  "payload": {
    "timeRange": "last_6_months"
  }
}

✅ Fact: AG-UI is bidirectional — not just agent → UI.
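A small sketch of the UI → agent direction, using the simplified `USER_ACTION` shape from this step. The transport is injected as a plain function so the sketch does not assume WebSockets, HTTP POST, or any specific SDK; `createActionSender` is a hypothetical name.

```typescript
// Simplified user-action event, matching the example in this step.
type UserActionEvent = {
  type: "USER_ACTION";
  action: string;
  payload: Record<string, unknown>;
};

// The transport is injected, so the same code works over any channel.
type SendFn = (serialized: string) => void;

function createActionSender(send: SendFn) {
  return (action: string, payload: Record<string, unknown> = {}): UserActionEvent => {
    const ev: UserActionEvent = { type: "USER_ACTION", action, payload };
    send(JSON.stringify(ev));
    return ev;
  };
}

// Usage with an in-memory "transport" that just records outgoing messages.
const outbox: string[] = [];
const sendAction = createActionSender((msg) => {
  outbox.push(msg);
});

sendAction("refine_chart", { timeRange: "last_6_months" });
console.log(outbox[0]);
```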

5️⃣ Agent calls a tool

The agent decides it needs data.

{
  "type": "TOOL_CALL_START",
  "tool": "fetchRevenue",
  "args": { "range": "6m" }
}

Later:

{
  "type": "TOOL_CALL_END",
  "tool": "fetchRevenue",
  "result": "success"
}

UI can visualize tool execution in real time.
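One way a UI might track this, sketched under the simplified event shapes above (real AG-UI tool events carry more fields, and `trackToolCalls` is a hypothetical helper):

```typescript
// Simplified tool events, matching the two examples in this step.
type ToolEvent =
  | { type: "TOOL_CALL_START"; tool: string; args: Record<string, unknown> }
  | { type: "TOOL_CALL_END"; tool: string; result: string };

// Fold a sequence of tool events into a "tool name -> status" map.
// A started tool is "running" until its END event supplies the result.
function trackToolCalls(events: ToolEvent[]): Record<string, string> {
  const statusByTool: Record<string, string> = {};
  for (const ev of events) {
    statusByTool[ev.tool] = ev.type === "TOOL_CALL_START" ? "running" : ev.result;
  }
  return statusByTool;
}

const startEvent: ToolEvent = { type: "TOOL_CALL_START", tool: "fetchRevenue", args: { range: "6m" } };
const endEvent: ToolEvent = { type: "TOOL_CALL_END", tool: "fetchRevenue", result: "success" };

console.log(trackToolCalls([startEvent])["fetchRevenue"]);           // running
console.log(trackToolCalls([startEvent, endEvent])["fetchRevenue"]); // success
```

Keying by tool name is a simplification; concurrent calls to the same tool would need a per-call identifier.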

6️⃣ Agent finishes

{
  "type": "RUN_FINISHED",
  "runId": "run_42"
}

UI knows the interaction is complete.
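The lifecycle events at the start and end of this flow can be modeled as a tiny state machine on the client. This is a sketch under the simplified `RUN_STARTED` / `RUN_FINISHED` shapes used here; `reduceRun` and the phase names are assumptions for illustration.

```typescript
type RunPhase = "idle" | "running" | "finished";
type RunTracker = { phase: RunPhase; runId: string | null };
type LifecycleEvent = { type: "RUN_STARTED" | "RUN_FINISHED"; runId: string };

// Advance the tracker for one lifecycle event.
// Out-of-order events or events for a different run are ignored.
function reduceRun(tracker: RunTracker, ev: LifecycleEvent): RunTracker {
  if (ev.type === "RUN_STARTED" && tracker.phase !== "running") {
    return { phase: "running", runId: ev.runId };
  }
  if (ev.type === "RUN_FINISHED" && tracker.phase === "running" && tracker.runId === ev.runId) {
    return { phase: "finished", runId: ev.runId };
  }
  return tracker;
}

let runTracker: RunTracker = { phase: "idle", runId: null };
runTracker = reduceRun(runTracker, { type: "RUN_STARTED", runId: "run_42" });
runTracker = reduceRun(runTracker, { type: "RUN_FINISHED", runId: "run_99" }); // ignored: wrong run
runTracker = reduceRun(runTracker, { type: "RUN_FINISHED", runId: "run_42" });

console.log(runTracker.phase); // finished
```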

AG-UI + Open-JSON-UI works identically — just swap the UI payload format.

Here’s the step 3 modification for Open-JSON-UI:

3️⃣ Agent decides to render UI (via Open-JSON-UI)

The agent generates Open-JSON-UI instead of A2UI and sends it inside an AG-UI event.

{
  "type": "STATE_DELTA",
  "delta": {
    "ui": {
      "spec": "open-json-ui",
      "content": {
        "type": "chart",
        "title": "Revenue by Month",
        "data": { "series": [...] }
      }
    }
  }
}

Open-JSON-UI to A2UI Mapper

In my simplea2ui project, I bridge the gap between Open-JSON-UI and A2UI by mapping Open-JSON-UI payloads into the A2UI format with a custom mapper.

The Open-JSON-UI renderer works by acting as a translation layer between the “agent-heavy” flattened JSON standard and the “developer-heavy” A2UI protocol.

Here is the high-level flow of how it processes my JSON:

1. Detection & Parsing

When you click Render OpenJSONUI, the application first parses the JSON. It specifically looks for the type: “screen” property, which identifies it as the new Open-JSON-UI standard.
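The detection step can be sketched in a few lines. This is an illustration of the check described above, not the actual simplea2ui code; `isOpenJsonUiScreen` is a hypothetical name, and the only rule it applies is the `type: "screen"` marker.

```typescript
// Return true when the raw text parses as a JSON object whose top-level
// "type" is "screen". Invalid JSON or any other shape returns false.
function isOpenJsonUiScreen(raw: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  return (parsed as { type?: unknown }).type === "screen";
}

console.log(isOpenJsonUiScreen('{"type":"screen","content":[]}')); // true
console.log(isOpenJsonUiScreen('{"surfaceUpdate":{}}'));           // false
console.log(isOpenJsonUiScreen("not json"));                       // false
```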

2. The Mapping Process (mapOpenJsonToA2UI)

Since the A2UI rendering engine expects a highly structured component tree (with explicit IDs, child references, and protocol-specific wrappers), I implemented a recursive mapper that transforms the flattened structure:

Flattened to Hierarchical: It takes the linear content array and builds a tree. For example, a card in Open-JSON-UI has a content array; the mapper transforms this into an A2UI Card component that points to a Column containing the child components.
Protocol Wrapping: A2UI requires text to be wrapped in { literalString: "..." } and children to be in an { explicitList: [...] }. The mapper handles all this “boilerplate” automatically so the LLM doesn’t have to.
Component Decomposition: Complex types like form are decomposed into individual A2UI components (TextField for inputs and Button for the submit action), ensuring they lay out correctly in the UI.
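The first two ideas above (tree building and protocol wrapping) can be sketched as a small recursive function. This is a simplified illustration, not the real mapOpenJsonToA2UI: the input node shape, the `component` wrapper, the `child` / `children` field names, and the ID scheme are all assumptions, and form decomposition is left out.

```typescript
// Assumed Open-JSON-UI node shape, based only on the description above.
type OjNode = { type: string; text?: string; content?: OjNode[] };

// Assumed A2UI component entry: a flat list item with an ID and a typed body.
type A2Component = { id: string; component: Record<string, unknown> };

function sketchMapOpenJsonToA2UI(root: OjNode): { rootId: string; components: A2Component[] } {
  const components: A2Component[] = [];
  let counter = 0;
  const nextId = (prefix: string): string => prefix + "-" + counter++;

  function visit(node: OjNode): string {
    if (node.type === "text") {
      const id = nextId("text");
      // Protocol wrapping: plain strings become { literalString: ... }.
      components.push({ id, component: { Text: { text: { literalString: node.text ?? "" } } } });
      return id;
    }
    // Containers ("screen", "card", ...): map children first, then reference them by ID.
    const childIds = (node.content ?? []).map((child) => visit(child));
    const columnId = nextId("column");
    components.push({ id: columnId, component: { Column: { children: { explicitList: childIds } } } });
    if (node.type === "card") {
      const cardId = nextId("card");
      components.push({ id: cardId, component: { Card: { child: columnId } } });
      return cardId;
    }
    return columnId;
  }

  const rootId = visit(root);
  return { rootId, components };
}

const mapped = sketchMapOpenJsonToA2UI({
  type: "screen",
  content: [{ type: "card", content: [{ type: "text", text: "Revenue" }, { type: "text", text: "Up 12%" }] }],
});
console.log(mapped.rootId, mapped.components.map((c) => c.id).join(","));
// column-4 text-0,text-1,column-2,card-3,column-4
```

Children are emitted before their parents, so every ID in an explicitList already exists in the flat component list by the time the parent is added.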

3. A2UI Protocol Generation

The mapper outputs a standard A2UI message sequence:

surfaceUpdate: A single message containing an array of all generated components (the card, the text nodes, the inputs, etc.), each with a unique generated ID.
beginRendering: A message telling the UI which component is the “root” (the main container) and where to start drawing.
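A minimal sketch of assembling that two-message sequence. The `surfaceUpdate` and `beginRendering` message names come from the description above; the `surfaceId`, `components`, and `root` field names are my assumptions about the envelope, and `buildA2uiMessages` is a hypothetical helper.

```typescript
type A2Component = { id: string; component: Record<string, unknown> };

type A2Message =
  | { surfaceUpdate: { surfaceId: string; components: A2Component[] } }
  | { beginRendering: { surfaceId: string; root: string } };

// Build the message pair: first all components, then the pointer to the root.
// Fails fast if the root ID is not among the components being sent.
function buildA2uiMessages(surfaceId: string, rootId: string, components: A2Component[]): A2Message[] {
  if (!components.some((c) => c.id === rootId)) {
    throw new Error("root component " + rootId + " is not in the component list");
  }
  return [
    { surfaceUpdate: { surfaceId, components } },
    { beginRendering: { surfaceId, root: rootId } },
  ];
}

const demoComponents: A2Component[] = [
  { id: "title", component: { Text: { text: { literalString: "Hello" } } } },
  { id: "root", component: { Column: { children: { explicitList: ["title"] } } } },
];

const a2uiMessages = buildA2uiMessages("main", "root", demoComponents);
for (const msg of a2uiMessages) console.log(JSON.stringify(msg));
```

Sending the components before the root pointer means the renderer never receives a root it cannot resolve.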

4. Rendering

Finally, these generated messages are passed to the MessageProcessor (the core A2UI engine). The engine sees standard A2UI components and renders them into the a2ui-surface in the main viewport.

Why this is better for Agents:

Token Efficiency: The LLM only needs to emit type: "text" instead of the verbose A2UI structure.
Structured Output: It aligns perfectly with OpenAI’s JSON Schema/Structured Outputs because the properties are at the top level of the objects.
Simplicity: The agent focuses on what to show (content), while the client handles how to show it (protocol/layout).
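As a rough illustration of the token-efficiency point, the snippet below compares the serialized size of one text node in a flat form against a wrapped A2UI-style form. Character count is only a proxy for tokens, and both shapes are assumptions modeled on the examples in this article rather than exact schemas.

```typescript
// Flat form the LLM would emit (assumed Open-JSON-UI-style text node).
const flatTextNode = { type: "text", text: "Hello" };

// Wrapped form the client would generate (assumed A2UI-style component entry).
const wrappedTextNode = {
  id: "t1",
  component: { Text: { text: { literalString: "Hello" } } },
};

const flatSize = JSON.stringify(flatTextNode).length;
const wrappedSize = JSON.stringify(wrappedTextNode).length;

console.log("flat form:", flatSize, "characters");
console.log("wrapped form:", wrappedSize, "characters");
console.log("wrapped is larger:", wrappedSize > flatSize);
```

The gap grows with every node in the tree, since each one repeats the same wrapper keys.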