01

Designing the trust layer

Nava's AI handles the analytical work: parsing invoices, comparing census data, and surfacing discrepancies. My job was designing the experience that sits between what the system produces and the humans who need to act on it.
SERVICES
User research
AI-enabled UX
UI
Systems thinking
Prototyping
outcome
Audit turnaround reduced from hours to under 3 minutes. Shipped AI-first planning and results experience. Established a review-and-decision model that kept scope in check and set the product up to learn.
Benefits audits used to take hours, sometimes days. With AI doing the comparison work, that window collapses to minutes. The harder problem was designing an experience users could actually trust.
ROLE
Lead designer
COMPANY
Nava Benefits
platform
Web - internal
type
AI-first product
context

An AI-first product

Nava Benefits is an AI-enabled healthcare benefits platform that helps HR administrators manage and audit employee benefits programs. The audit tool is AI-first: the system parses carrier invoices and employee census data, generates a structured audit plan, runs the comparison, and surfaces discrepancies for human review.

I was the primary designer for the audit experience, owning both the planning flow and the results workspace end to end.

THE PROBLEM

Automation creates a new design problem

The product could automate a process that previously took hours or days. But because the AI was doing the analytical work, users needed to understand how the audit was configured and why specific discrepancies were surfaced.

Without that visibility, the output feels arbitrary. No HR administrator is going to act on findings they don't understand. AI wasn't just a feature here. It was the system doing the work. My job was designing the layer between that system and the humans who need to trust it. Three tensions shaped every design decision:

Tension 01
Automation vs. transparency
Tension 02
Speed vs. control
Tension 03
Summary vs. validation
The constraint

A platform shift mid-project

Early in the project, I explored a dedicated planning interface. Then the strategy shifted.

The product team prioritized an assistant-first interaction model, and the decision was made to deliver planning within the existing chat interface rather than introducing a new standalone surface. The dedicated planning UI wasn't going to ship.

The underlying design thinking didn't get thrown out. It got translated.

The need for a clear, scannable summary before execution, the separation of high-level understanding from detailed configuration, and the support for both quick execution and deeper review all carried over into the chat-based model. That constraint is what led to the Plan Summary Artifact.

Design — audit planning

Structure inside the assistant

Instead of exposing the AI's configuration directly, the assistant translates it into something readable and scannable.

When the audit plan is generated, it surfaces in chat as a structured summary artifact. Users who trust the system can execute immediately. Users who want more control can open the full plan in a right panel. Two surfaces, doing different jobs:

chat
System communication and editing
Generates the plan summary, communicates what the system is doing, and handles lightweight editing interactions.
right panel
Structured review and execution
The full plan lives here. A sticky footer keeps the primary execution action visible regardless of scroll depth.

Not every user needs to inspect the plan before running an audit. The summary card accommodates both paths, quick execution and detailed review, without making either feel like the wrong choice.

The three-minute processing window shaped this experience specifically: it is long enough that users need to see the audit is progressing. That drove the design of the task tracker, which shows meaningful milestones without exposing unnecessary system detail or making the experience feel unpredictable.

Planning: Chat view
Quick execution path: the summary card surfaces enough to act without requiring a full plan review.
Planning: Full plan open
Detailed review path: users who want to inspect mappings before committing can open the full plan without leaving the flow.
Design — audit results

Review, not task management

A task says "do this." A discrepancy says "something looks off. Review it and decide." That distinction shaped everything about the results workspace.

Not every discrepancy is an error. Some are explained by timing, carrier billing behavior, retroactive changes, or acceptable variance. The system surfaces them. The administrator applies judgment. A two-column structure reflects that:

Left column - triage and navigation
Each record shows the employee name, current status, number of flagged issues, and benefit type chips. Pure triage: who needs attention, how many issues, and what type of coverage is involved.
Right column - context and decision support
Employee context, a plain-language "What we found" summary, structured billing vs. enrollment comparisons, and a lightweight decision interface. The tag describes what the system found. The status records what the administrator decided to do about it.
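
As a rough illustration of how the left-column triage list could be derived from flagged issues, here is a minimal TypeScript sketch. Every name in it (`FlaggedIssue`, `TriageRow`, `buildTriageRows`, the benefit types, the row statuses) is hypothetical and chosen for illustration. It is not the production schema.

```typescript
// Hypothetical shapes for illustration only; not the production data model.
type BenefitType = "medical" | "dental" | "vision";

interface FlaggedIssue {
  employeeId: string;
  employeeName: string;
  benefitType: BenefitType;
  reviewed: boolean; // true once the administrator has made a decision
}

interface TriageRow {
  employeeId: string;
  employeeName: string;
  rowStatus: "needs review" | "reviewed";
  issueCount: number;
  benefitChips: BenefitType[];
}

// Groups issues by employee so each row answers the triage questions:
// who needs attention, how many issues, and what coverage is involved.
function buildTriageRows(issues: FlaggedIssue[]): TriageRow[] {
  const rowsById: Record<string, TriageRow> = {};
  const rows: TriageRow[] = [];
  for (const issue of issues) {
    let row = rowsById[issue.employeeId];
    if (row === undefined) {
      row = {
        employeeId: issue.employeeId,
        employeeName: issue.employeeName,
        rowStatus: "reviewed",
        issueCount: 0,
        benefitChips: [],
      };
      rowsById[issue.employeeId] = row;
      rows.push(row);
    }
    row.issueCount += 1;
    // A single unreviewed issue keeps the whole row in "needs review".
    if (!issue.reviewed) {
      row.rowStatus = "needs review";
    }
    // Show each benefit type chip once per employee.
    if (row.benefitChips.indexOf(issue.benefitType) === -1) {
      row.benefitChips.push(issue.benefitType);
    }
  }
  return rows;
}
```

The row carries only what triage needs. Everything else, including the comparison detail, stays in the right column.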

For v1, administrators have two resolution actions:

Action 01
Mark resolved
The discrepancy was a real issue and has been addressed. Doesn't mean the system fixed it automatically. It means the administrator completed their review and took whatever action was needed.
Action 02
Dismissed
Reviewed and determined not to require action. Not every discrepancy is a true error. Timing, billing lag, and retroactive changes all count.
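
The separation between what the system found and what the administrator decided can be sketched as a small data model. This is a minimal TypeScript sketch under stated assumptions: the tag values, status names, and the `applyResolution` helper are all hypothetical, invented here to illustrate the review-and-decision model rather than to describe the shipped implementation.

```typescript
// Hypothetical names and values; assumptions for illustration only.
type FindingTag =
  | "billed but not enrolled"
  | "enrolled but not billed"
  | "amount mismatch";

type ReviewStatus = "needs review" | "resolved" | "dismissed";

interface Discrepancy {
  readonly id: string;
  readonly tag: FindingTag; // what the system found; review never changes it
  readonly status: ReviewStatus; // what the administrator decided
}

// The two v1 resolution actions.
type ResolutionAction = "mark resolved" | "dismiss";

// Returns a new record with the decision applied. The tag is carried over
// unchanged, so the original finding stays visible after review.
function applyResolution(
  item: Discrepancy,
  action: ResolutionAction
): Discrepancy {
  const nextStatus: ReviewStatus =
    action === "mark resolved" ? "resolved" : "dismissed";
  return { id: item.id, tag: item.tag, status: nextStatus };
}
```

There are no assignees, due dates, or sub-steps in this sketch, which is the point: the model records a decision and nothing more.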

We intentionally kept v1 as a review-and-decision model rather than a task management system. Building something heavier would have meant making assumptions about user behavior we didn't yet have data to support.

Results workspace: Needs review
When a discrepancy represents a real issue, the system surfaces what was found, why it matters, and what the administrator should do next.
Results workspace: Dismissed
Not every flag requires action. When a variance is reviewed and determined to be benign, the administrator can dismiss it and move on.
my design workflow

Validated across clinical and product

I used Figma Make and Claude Code to prototype dynamic and state-based interactions more quickly. That helped the team pressure-test edge cases in the planning and results flows without waiting on engineering builds.

outcome

A foundation and a product stance

What shipped was a foundation. What it established was a product stance.

Usability testing confirmed that earlier designs, which exposed too much system detail during planning and used more narrative descriptions in results, created friction the final model resolved. The summary-first planning approach and simplified triage list both emerged from consistent feedback that earlier iterations were harder to follow and slower to navigate.

There are no post-launch metrics. But the experience was designed to generate them, particularly around whether users engage with the full plan before executing or proceed directly from the summary.
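
One way that question could be answered from interaction events is sketched below in TypeScript. The event names, the `PlanEvent` shape, and `countExecutionPaths` are hypothetical. The sketch assumes events arrive in chronological order and says nothing about how the product actually logs them.

```typescript
// Hypothetical event names and shapes; assumptions for illustration only.
type PlanEventKind = "summary_shown" | "full_plan_opened" | "audit_executed";

interface PlanEvent {
  auditId: string;
  kind: PlanEventKind;
}

interface ExecutionPathCounts {
  executedAfterFullPlan: number; // opened the right panel before running
  executedFromSummary: number; // ran directly from the summary card
}

// Assumes events are in chronological order. For each execution, checks
// whether the same audit's full plan was opened earlier in the stream.
function countExecutionPaths(events: PlanEvent[]): ExecutionPathCounts {
  const openedFullPlan: Record<string, boolean> = {};
  const counts: ExecutionPathCounts = {
    executedAfterFullPlan: 0,
    executedFromSummary: 0,
  };
  for (const planEvent of events) {
    if (planEvent.kind === "full_plan_opened") {
      openedFullPlan[planEvent.auditId] = true;
    } else if (planEvent.kind === "audit_executed") {
      if (openedFullPlan[planEvent.auditId] === true) {
        counts.executedAfterFullPlan += 1;
      } else {
        counts.executedFromSummary += 1;
      }
    }
  }
  return counts;
}
```

The ratio between the two counts would indicate how often the summary alone is enough to act on.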

The broader contribution was establishing that the right product stance for v1 was a review-and-decision model rather than a task management system. That call kept scope in check, avoided overbuilding on assumptions, and left room for the product to learn what a more mature system should actually look like.
