# Agent UX Principles

> A practical, evidence-graded reference for designing the interface of an AI agent.
> Distilled from p0stman's Agent UX research series. Free to use, no attribution required.
>
> Source: https://p0stman.com/agent-ux
> Last updated: 2026-06-19

---

## How to use this file

Drop it into your agent's context, add it to your repo, or paste the link into any chat and ask:

> "Use these Agent UX principles to review and improve our agent's interface: https://p0stman.com/agent-ux/agent-ux-principles.md"

It is written to be read by a person designing an agent, or by an agent reviewing its own UX. Each pattern has a one-line principle, a Do list, a Don't list, and an honest note on how well evidenced it is.

---

## Why agents need their own UX

For thirty years we designed for deterministic software: click a button, the same thing happens every time. Agents broke that contract. They are:

- **Conversational** — the chat input became the primary surface.
- **Uncertain** — you have to communicate how sure the agent is.
- **Tool-using** — they take real-world actions, so you must show what they are doing and ask before they do it.
- **Metered** — they cost money per turn, so usage became something users watch.

None of these were mainstream interface problems before 2022. Each pattern below is a response to one of them.

---

## Evidence legend

Not every pattern is equally proven. Build accordingly.

- **[Verified spine]** — backed by peer-reviewed research or explicit design-authority guidance. Safe to build on.
- **[Observed convention]** — widely practiced and shipped in real products, but not yet authoritatively named. Strong, but it is convention, not law.
- **[Contested frontier]** — the products are real, but the surrounding theory is oversold. Use the proven part, hedge the manifesto.

---

## 1. Calibrated confidence  ·  [Verified spine]

**Principle:** Surface how sure the agent is in a way that prevents overreliance, rather than manufacturing it.

The naive fix is to show the full reasoning trace and let the user judge. Research shows it backfires: a fluent explanation reads as a competence signal and makes people trust a wrong answer *more*. That is the "transparency backfire."

**Do**
- Give a short, honest certainty signal (high / medium / low, or a one-line hedge).
- Name the specific variable the agent is unsure about, not a generic "I might be wrong."
- Ground the claim in a source the user can open.
- Make verification one tap.
- Match the confidence display to actual reliability — no green badges on hedged answers.
- Default to caution on anything irreversible or regulated (money, legal, medical, security).

**Don't**
- Show decorative confidence scores ("94% confident") with nothing behind them.
- Make a persuasive reasoning essay the default for everyone.
- Use the same certain tone for a guess and a fact.

**Evidence:** Microsoft Design (2025) requires the certainty and reasoning behind a recommendation to be visible to avoid overreliance. The transparency-backfire counter-finding is peer-reviewed ACM research (2026).
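
A minimal sketch of how the Do list could translate into a display signal. The thresholds, the `ConfidenceSignal` type and the `build_signal` / `render` helpers are all illustrative assumptions, not calibrated values or a published API; the reliability number is assumed to come from your own evaluation data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical domain labels, mirroring "money, legal, medical, security".
HIGH_STAKES_DOMAINS = {"money", "legal", "medical", "security"}


@dataclass
class ConfidenceSignal:
    level: str                   # "high", "medium" or "low"
    unsure_about: Optional[str]  # the specific variable in doubt, if any
    source_url: Optional[str]    # a source the user can open
    needs_verification: bool     # show a one-tap verify action


def build_signal(reliability: float, domain: str,
                 unsure_about: Optional[str] = None,
                 source_url: Optional[str] = None) -> ConfidenceSignal:
    """Map a measured reliability estimate (0.0 to 1.0) to a display signal."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0.0 and 1.0")

    # Assumed cut-offs; tune them against observed accuracy.
    if reliability >= 0.9:
        level = "high"
    elif reliability >= 0.6:
        level = "medium"
    else:
        level = "low"

    # A "high" badge requires an openable source and no named doubt.
    if level == "high" and (source_url is None or unsure_about is not None):
        level = "medium"

    # High-stakes domains always get the verify action, whatever the level.
    needs_verification = domain in HIGH_STAKES_DOMAINS or level != "high"
    return ConfidenceSignal(level, unsure_about, source_url, needs_verification)


def render(signal: ConfidenceSignal) -> str:
    """Render the signal as a single short line of text."""
    parts = [f"Confidence: {signal.level}"]
    if signal.unsure_about:
        parts.append(f"Unsure about: {signal.unsure_about}")
    if signal.source_url:
        parts.append(f"Source: {signal.source_url}")
    if signal.needs_verification:
        parts.append("[Verify]")
    return " | ".join(parts)
```

The design choice worth noting: the level is derived from measured reliability and then only ever downgraded, so the display cannot claim more certainty than the evidence supports.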

---

## 2. Tool-call transparency and write-confirm  ·  [Verified spine]

**Principle:** Show what the agent is doing, and confirm before it changes anything in the real world. Read freely; write on confirm.

**Do**
- Surface every tool call as a discrete, labelled card.
- Let read tools (lookups, fetches) execute immediately — reading is safe.
- Make write tools (send, book, pay, modify, delete) short-circuit into an explicit confirmation before they run.
- Show the inputs being approved: the recipient, the amount, the body, the file.
- Gate account access behind an explicit connect-and-authorise step.
- Always confirm the irreversible and the regulated, regardless of confidence.
- Keep an auditable log of what the agent did and when.

**Don't**
- Report actions only after they have already happened.
- Treat writes like reads.
- Let blanket account access stand in for per-action approval.
- Ask "Proceed?" with no detail to inspect.

**Evidence:** The best-evidenced action pattern. Microsoft Design (2025): an agent's actions must be visible and controllable via dashboards, settings or log-type UX. Reinforced by Anthropic's plan-review / permission model and EU AI Act transparency obligations (effective Aug 2026).
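
Read freely, write on confirm can be sketched as a small gate that sits between the agent and its tools. Everything here is hypothetical naming (`Tool`, `ToolGate`, `AuditEntry`); the `confirm` callback stands in for whatever UI shows the inputs and waits for the user's answer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    name: str
    run: Callable[..., Any]
    writes: bool  # True if the tool changes anything outside the agent


@dataclass
class AuditEntry:
    at: str                 # ISO 8601 timestamp, UTC
    tool: str
    inputs: Dict[str, Any]
    outcome: str            # "executed" or "declined"


@dataclass
class ToolGate:
    # Receives the tool name and the exact inputs; returns True to approve.
    confirm: Callable[[str, Dict[str, Any]], bool]
    log: List[AuditEntry] = field(default_factory=list)

    def call(self, tool: Tool, /, **inputs: Any) -> Any:
        # Write tools short-circuit into confirmation before they run.
        if tool.writes and not self.confirm(tool.name, inputs):
            self._record(tool, inputs, "declined")
            return None
        # Read tools, and approved writes, execute here.
        result = tool.run(**inputs)
        self._record(tool, inputs, "executed")
        return result

    def _record(self, tool: Tool, inputs: Dict[str, Any], outcome: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.log.append(AuditEntry(stamp, tool.name, dict(inputs), outcome))
```

Passing the inputs to `confirm` is the point: the user approves the recipient and the amount, not a bare "Proceed?". Declined calls are logged too, so the audit trail covers what the agent tried as well as what it did.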

---

## 3. Inline citations and source grounding  ·  [Observed convention]

**Principle:** Attach the source to the claim, in the answer, as an openable pill or footnote — not a bibliography at the bottom.

A list of links at the end is filing, not grounding. The user cannot tell which source backs which claim, so they trust everything or check nothing.

**Do**
- Put the citation next to the specific sentence it supports.
- Make the source identity legible (gov.uk, HMRC, the document title), not a bare superscript.
- Open sources in a new tab so the conversation is not lost.
- Deduplicate and rank sources; prefer primary and trusted domains.
- Treat "no source found" as a signal — lower the confidence and say so.
- Pull citations from a real retrieval step; never let the model invent them after the fact.

**Don't**
- Collect sources only in a footer.
- Over-cite one sentence with eight links.
- Link to a page that does not actually support the claim.

**Evidence:** Widely practiced — Perplexity, ChatGPT search and Claude all ship it. Sits underneath the verified explainability principle but is not yet authoritatively named as a standalone pattern.
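
One way to sketch the deduplicate, rank and attach steps. The `Source` type, the `TRUSTED_DOMAINS` allow-list and the two-pill limit are assumptions for illustration; the sources are assumed to come from a real retrieval step, never from the model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple
from urllib.parse import urlparse

# Hypothetical allow-list of primary domains; replace with your own.
TRUSTED_DOMAINS = {"gov.uk"}


@dataclass(frozen=True)
class Source:
    url: str
    title: str
    score: float  # relevance score from the retrieval step


def domain_of(url: str) -> str:
    return urlparse(url).netloc.lower()


def is_trusted(url: str) -> bool:
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def rank_sources(sources: List[Source], limit: int = 2) -> List[Source]:
    """Deduplicate by URL, then rank trusted domains first, then by score."""
    best: Dict[str, Source] = {}
    for s in sources:
        if s.url not in best or s.score > best[s.url].score:
            best[s.url] = s
    ranked = sorted(best.values(),
                    key=lambda s: (not is_trusted(s.url), -s.score))
    return ranked[:limit]  # a small limit avoids over-citing one sentence


def cite_sentence(sentence: str, sources: List[Source]) -> Tuple[str, bool]:
    """Return the sentence with inline markdown pills, and whether it is grounded."""
    picked = rank_sources(sources)
    if not picked:
        # No source is a signal: say so, and let the caller lower confidence.
        return f"{sentence} (no source found)", False
    pills = " ".join(f"[{s.title} ({domain_of(s.url)})]({s.url})" for s in picked)
    return f"{sentence} {pills}", True
```

The returned flag is meant to feed the confidence display: an ungrounded sentence should not be shown with the same certainty as a grounded one.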

---

## 4. Naming the agent  ·  [Observed convention]

**Principle:** A named agent with a defined role and a consistent voice is a product. A generic "AI assistant" is a feature nobody remembers.

The name carries a role, and the role sets expectations about what the agent can do. The trade-off is anthropomorphism: over-promise personality and you over-promise capability.

**Do**
- Give it a name, a role and a consistent tone.
- Let the name set capability expectations — then meet them.
- Be warm, but honest about the limits.
- Keep one persona across every surface (chat, voice, video).
- Degrade gracefully and in character when a request is out of scope.

**Don't**
- Claim feelings, caring or sentience.
- Give it a name that out-promises what it can actually do.
- Use charm to paper over gaps in competence.
- Let the persona shift between channels.

**Evidence:** Observed convention. Anthropomorphism trade-offs are well established in HCI; "name the agent as the product" is a practitioner pattern.

---

## 5. Token and cost-transparency UX  ·  [Observed convention]

**Principle:** Show the balance and what each turn costs. Metered AI creates real anxiety, and users hesitate before a long prompt the way they hesitate before a taxi meter.

**Do**
- Show a visible balance.
- Give the user a sense of what a turn will cost before they send it.
- Provide a calm "running low" state instead of a hard wall.
- Make value, not just consumption, legible.

**Don't**
- Hide the meter and then cut the user off mid-task.
- Surface cost so aggressively it kills the will to use the product.

**Evidence:** Observed convention with near-zero authoritative coverage. A genuine gap in the literature.
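
A rough sketch of a pre-send cost check with a calm "running low" state. The credit pricing, the four-characters-per-token heuristic and the names (`Meter`, `preflight`) are assumptions; a real product would use its provider's tokenizer and price list.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Meter:
    balance: int                # credits remaining
    credits_per_1k_tokens: int  # hypothetical price
    low_threshold: int          # below this, show the "running low" state


def estimate_tokens(prompt: str) -> int:
    # Crude heuristic (about 4 characters per token); an assumption only.
    return max(1, math.ceil(len(prompt) / 4))


def estimate_cost(meter: Meter, prompt: str,
                  expected_reply_tokens: int = 500) -> int:
    """Estimate the credits one turn will cost, rounded up."""
    tokens = estimate_tokens(prompt) + expected_reply_tokens
    return (tokens * meter.credits_per_1k_tokens + 999) // 1000


def preflight(meter: Meter, prompt: str) -> Dict[str, Any]:
    """Work out what to show the user before the turn is sent."""
    cost = estimate_cost(meter, prompt)
    remaining = meter.balance - cost
    if remaining < 0:
        state = "insufficient"   # tell the user now, not mid-task
    elif remaining < meter.low_threshold:
        state = "running_low"    # a gentle warning; the turn still sends
    else:
        state = "ok"
    return {"cost": cost, "balance": meter.balance,
            "state": state, "can_send": remaining >= 0}
```

Running the check before the send is what prevents the mid-task cut-off: the user learns the turn is unaffordable while they can still shorten it or top up.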

---

## 6. The video-call frame for voice agents  ·  [Observed convention]

**Principle:** Borrow the video-call layout so talking to an AI feels familiar: the agent is the main frame, the user is the small self-view.

**Do**
- Give the agent a visual anchor instead of a bare voice with no face.
- Put the agent in the main frame and the user in a corner self-view.
- Use a speaking indicator so the user knows when the agent is listening and when it is responding.
- Inherit the muscle memory of FaceTime and Zoom.

**Don't**
- Ship a voice agent with no visual presence at all.
- Block the call if the user's camera fails; degrade to an agent-only view instead.

**Evidence:** Observed convention, near-zero authoritative coverage. p0stman ships this in production with the Zee video agent.
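
The layout rules above fit in a few lines. This is a hypothetical sketch (`CallState` and `layout` are invented names), showing only the decisions: the agent always holds the main frame, a failed camera removes the self-view without ending the call, and the indicator tracks who is speaking.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class CallState:
    agent_speaking: bool
    user_speaking: bool
    camera_ok: bool


def layout(state: CallState) -> Dict[str, Any]:
    """Decide what the call screen shows for the current state."""
    if state.agent_speaking:
        indicator = "agent_speaking"
    elif state.user_speaking:
        indicator = "agent_listening"
    else:
        indicator = "idle"
    return {
        "main_frame": "agent",  # the agent is always the visual anchor
        # No camera means no self-view; the call itself carries on.
        "self_view": "user_camera" if state.camera_ok else None,
        "indicator": indicator,
        "call_active": True,
    }
```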

---

## 7. Artifacts versus the chat stream  ·  [Contested frontier]

**Principle:** Some outputs cannot live in a scrolling transcript. A document, a chart, a working app or a long editable table needs its own surface beside the conversation.

**Do**
- Move structured output out of the transcript into a dedicated panel.
- Keep the division clear: chat drives, the artifact holds the output.
- Make the artifact editable and persistent across turns.
- Version it, with rollback.
- Let users export and own the output.
- Reserve the surface for output that genuinely needs it, not every reply.

**Don't**
- Cram a long table or document into a chat bubble.
- Bet your architecture on the generative-UI "paradigm shift" — it is largely single-vendor positioning.
- Build on MCP-UI / AG-UI / A2UI as if they were settled standards. They are early, volatile, competing bets.

**Evidence:** Mixed. The products (Claude Artifacts, ChatGPT Canvas) are real and documented. The generative-UI paradigm and the agent-UI protocol layer have not been validated as consensus. Cover the surfaces; hedge the manifesto.
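
A minimal sketch of "persistent, versioned, with rollback", assuming the artifact is plain text. The `Artifact` class is hypothetical and is not how Claude Artifacts or ChatGPT Canvas store their output; it only illustrates the behaviour the Do list asks for.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artifact:
    title: str
    versions: List[str] = field(default_factory=list)  # append-only history

    @property
    def current(self) -> str:
        if not self.versions:
            raise ValueError("artifact has no content yet")
        return self.versions[-1]

    def edit(self, content: str) -> int:
        """Save a new version and return its 1-based number."""
        self.versions.append(content)
        return len(self.versions)

    def rollback(self, version: int) -> int:
        """Restore an earlier version by saving a copy as the newest one."""
        if not 1 <= version <= len(self.versions):
            raise IndexError(f"no version {version}")
        return self.edit(self.versions[version - 1])

    def export(self) -> str:
        """Return the current version as a markdown document the user can keep."""
        return f"# {self.title}\n\n{self.current}\n"
```

Rollback appends rather than truncates, so the history never loses a version and a rollback can itself be undone.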

---

## How we got here: 2022 to 2026

- **2022** — ChatGPT makes the conversation, not the form, the primary interface.
- **2023** — Streaming, stop / regenerate, thumbs feedback, and inline citations (Bing, Perplexity) become standard.
- **2024** — Output leaves the chat: Claude Artifacts and ChatGPT Canvas. The Model Context Protocol standardises tool calling.
- **2025** — Agents act and ask first: read-freely / write-confirm, voice and video call frames, Microsoft's explicit agent UX guidance.
- **2026** — Trust and cost get serious: token transparency, the calibrated-confidence debate, EU AI Act transparency obligations.

---

## The foundations (citable sources)

- Microsoft — Guidelines for Human-AI Interaction (CHI 2019): https://www.microsoft.com/en-us/haxtoolkit/ai-guidelines/
- Microsoft — HAX Toolkit Design Library: https://www.microsoft.com/en-us/haxtoolkit/design-patterns/
- Microsoft Design — UX Design for Agents (2025): https://microsoft.design/articles/ux-design-for-agents/
- Google PAIR — People + AI Guidebook: https://pair.withgoogle.com/
- Nielsen Norman Group — Explainable AI: https://www.nngroup.com/articles/explainable-ai/
- Smashing Magazine — Designing Agentic AI (Feb 2026): https://www.smashingmagazine.com/2026/02/designing-agentic-ai-practical-ux-patterns/

---

## A note on terminology

Lead with **"Agent UX"** or **"AI UX"** for search and conversation. Anchor authority on **"Human-AI Interaction"**, the academic lineage from the Microsoft / CHI work. Be wary of treating "Agent Experience (AX)" as an established discipline; it is largely one vendor's framing.

---

## About p0stman

p0stman is an AI-native product studio. We build named voice, video and chat agents with inline citations, calibrated confidence and read-freely / write-confirm tool flows in production, for our own products and for clients.

If you want an agent built on the patterns that hold up, talk to us: https://p0stman.com/contact

Read the full series, with live demos of each pattern: https://p0stman.com/agent-ux