Case study · Research specialist

Meridian Operations · read-only audit research

Reconciliation in 30 seconds, not 8 minutes

The Research Specialist answers exactly one class of question: what happened between what we submitted and what came back? It cannot do anything else. That constraint is the feature.

30 sec · Average lookup time vs. 8 minutes by hand

0 · Mutation actions, read-only by design

99% · Question coverage on the variance rubric

The problem

Two browser tabs, an eight-minute scroll, and a typed comment from memory

The team did variance investigations as a daily ritual. Pull the submitted record. Pull what came back. Compare by eye. Type the difference into the audit ticket. Most of the eight minutes per investigation went into mechanical tab-switching and value-by-value comparison. The cognitive part — was this difference expected? — happened in the last ten seconds.

The eight-minute lookup

Finding the difference between a submitted record and an upstream response meant opening two browser tabs, scrolling to the relevant fields, and comparing by hand. If the record had been resubmitted, three tabs. The comparison lived in someone's head, not in a diff.

The copy-paste chain

After finding the difference, the analyst typed the finding into a comment on the audit ticket from memory, occasionally with a field name wrong. As a result, the audit ticket could say something different from what the record said.

Re-reading the same record twice

Different team members ran the same lookup on the same submission because there was no shared research surface. Two people could spend a combined 16 minutes reaching the same conclusion, with no way for the second to know the first had already done it.

Write access in a research context

Before role scoping, a research session used the same agent instance as edit sessions. An analyst running a comparison could accidentally call a mutation tool with the wrong argument. Accidental writes were rare, but on submitted records they could not be undone.

Active reconciliations · 11:42 am · 2 dupes in progress

MO-48217 · in flight · 3:42 elapsed
MO-48214 · tab-switching · 1:18 elapsed
MO-48211 · typing into ticket · 5:03 elapsed
MO-48209 · in flight · 0:47 elapsed
MO-48205 · re-reading · 8:12 elapsed
MO-48204 · re-reading · 7:56 elapsed

Two analysts · same question · two answers · No shared surface

The pipeline

From question to field-by-field diff in one pass

Six steps. Five of them are deterministic. The model only owns the planning step.

01 Pick

Research specialist selected

The user opens the assistant and picks the read-only research persona from the specialist grid. The option is visible only to users who hold the required grant.

02 Whitelist

Read-only tools only

Tool list filtered server-side to lookup, variance compute, and source-span fetch. No mutation tools are present in the API request — not gated, absent.

Claude
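A minimal sketch of the server-side filtering described in step 02. The tool names, the `ToolDefinition` shape, and the `toolsForResearch` helper are all hypothetical, chosen for illustration; the case study does not publish the real registry.

```typescript
// Hypothetical tool registry. Names and shapes are assumptions, not the production API.
type ToolKind = "read" | "mutation";

interface ToolDefinition {
  name: string;
  kind: ToolKind;
  description: string;
}

const ALL_TOOLS: ToolDefinition[] = [
  { name: "lookup_record", kind: "read", description: "Fetch a submission or upstream record" },
  { name: "compute_variance", kind: "read", description: "Typed-field diff of two records" },
  { name: "fetch_source_span", kind: "read", description: "Load an original document segment" },
  { name: "update_record", kind: "mutation", description: "Edit a submitted record" },
];

// Explicit allowlist for the research persona. Anything not named here is never sent.
const RESEARCH_ALLOWLIST: ReadonlySet<string> = new Set([
  "lookup_record",
  "compute_variance",
  "fetch_source_span",
]);

function toolsForResearch(all: ToolDefinition[]): ToolDefinition[] {
  // Check both the allowlist and the kind, so a mutation tool that is
  // allowlisted by mistake is still dropped before the request is built.
  return all.filter((t) => RESEARCH_ALLOWLIST.has(t.name) && t.kind === "read");
}

// This filtered list is what would go into the model request's tool list.
const requestTools = toolsForResearch(ALL_TOOLS);
console.log(requestTools.map((t) => t.name));
// prints [ 'lookup_record', 'compute_variance', 'fetch_source_span' ]
```

Because the filter runs before the request is assembled, the model never sees a mutation tool's name or schema, which is the difference between "absent" and "gated".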

03 Question

Natural-language ask

User asks: 'What changed between our submission and the upstream response on packet 48217?' No field names required; the specialist knows the schema.

04 Plan

Diff plan composed

Specialist plans the lookup: pull the submission record, pull the upstream response, run a typed-field diff across the shared field set.

Claude

05 Fetch

Records and source spans pulled

Tool calls execute in sequence: submission record → upstream response → diff computed. Source spans loaded from blob storage when needed.

Postgres + Azure Blob
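One way to picture steps 04 and 05 is a plan expressed as data and executed deterministically, in order. The sketch below assumes hypothetical step names (`lookup_submission`, `lookup_upstream`, `compute_variance`) and uses synchronous in-memory stubs in place of the Postgres and blob lookups; real tool calls would be asynchronous.

```typescript
type Fields = Record<string, string | number | null>;

// The plan is plain data: the model composes it, the executor runs it.
type PlanStep =
  | { tool: "lookup_submission"; packetId: string }
  | { tool: "lookup_upstream"; packetId: string }
  | { tool: "compute_variance" };

// Hypothetical read-only tool surface (an assumption, not the real interface).
interface ReadOnlyTools {
  lookupSubmission(packetId: string): Fields;
  lookupUpstream(packetId: string): Fields;
  computeVariance(submitted: Fields, upstream: Fields): string[];
}

function runPlan(plan: PlanStep[], tools: ReadOnlyTools): string[] {
  let submitted: Fields | undefined;
  let upstream: Fields | undefined;
  let differingFields: string[] = [];
  for (const step of plan) {
    switch (step.tool) {
      case "lookup_submission":
        submitted = tools.lookupSubmission(step.packetId);
        break;
      case "lookup_upstream":
        upstream = tools.lookupUpstream(step.packetId);
        break;
      case "compute_variance":
        // Refuse to diff unless both records were fetched earlier in the plan.
        if (submitted === undefined || upstream === undefined) {
          throw new Error("compute_variance ran before both lookups");
        }
        differingFields = tools.computeVariance(submitted, upstream);
        break;
    }
  }
  return differingFields;
}

// In-memory stubs with made-up values, standing in for the real lookups.
const stubTools: ReadOnlyTools = {
  lookupSubmission: () => ({ amount: 1250, currency: "USD" }),
  lookupUpstream: () => ({ amount: 1205, currency: "USD" }),
  computeVariance: (a, b) => Object.keys(a).filter((k) => k in b && a[k] !== b[k]),
};

const plan: PlanStep[] = [
  { tool: "lookup_submission", packetId: "48217" },
  { tool: "lookup_upstream", packetId: "48217" },
  { tool: "compute_variance" },
];

const differing = runPlan(plan, stubTools);
console.log(differing); // prints [ 'amount' ]
```

Keeping the plan as a closed union of step types means an unexpected step cannot be expressed at all, which fits the claim that only the planning step is model-owned.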

06 Respond

Field-by-field diff returned

Matching fields omitted; differing fields shown with submitted value, upstream value, and field identifier. Output is paste-ready into the audit ticket.

Claude
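The response shape in step 06 (matching fields omitted, differing fields shown with both values) can be sketched as a small typed diff over the shared field set. The record shape, field names, and ticket format below are assumptions for illustration.

```typescript
type FieldValue = string | number | boolean | null;
type PacketRecord = Record<string, FieldValue>;

interface FieldDiff {
  field: string;
  submitted: FieldValue;
  upstream: FieldValue;
}

// Compare only fields present in both records; omit fields whose values match.
function diffSharedFields(submitted: PacketRecord, upstream: PacketRecord): FieldDiff[] {
  const diffs: FieldDiff[] = [];
  const entries = Object.entries(submitted).sort(([a], [b]) => a.localeCompare(b));
  for (const [field, submittedValue] of entries) {
    const upstreamValue = upstream[field];
    if (upstreamValue === undefined) continue; // not in the shared field set
    // Strict comparison keeps the diff typed: "100" and 100 count as different.
    if (submittedValue !== upstreamValue) {
      diffs.push({ field, submitted: submittedValue, upstream: upstreamValue });
    }
  }
  return diffs;
}

// One line per differing field, ready to paste into a ticket comment.
function formatForTicket(diffs: FieldDiff[]): string {
  if (diffs.length === 0) return "No differences on shared fields.";
  return diffs
    .map((d) => `${d.field}: submitted=${JSON.stringify(d.submitted)} upstream=${JSON.stringify(d.upstream)}`)
    .join("\n");
}

// Made-up example records.
const submittedRecord: PacketRecord = { amount: 1250, currency: "USD", payee: "Acme Corp", internalNote: "rush" };
const upstreamRecord: PacketRecord = { amount: 1205, currency: "USD", payee: "Acme Corp", receivedAt: "2024-01-05" };

const ticketText = formatForTicket(diffSharedFields(submittedRecord, upstreamRecord));
console.log(ticketText); // prints: amount: submitted=1250 upstream=1205
```

Fields that exist on only one side (`internalNote`, `receivedAt`) are skipped here; whether the real specialist reports them is not stated in the case study.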

Clear packet reference

Diff returned in under thirty seconds

The specialist resolves the packet, retrieves both records, runs the diff, and returns a clean field-by-field comparison that can be pasted straight into the audit ticket.

Ambiguous reference or missing upstream

Decline rather than guess

If the upstream response hasn't been recorded yet, the specialist says so explicitly rather than returning a partial diff. If the packet reference is ambiguous (multiple resubmissions), the specialist lists the timestamps and asks the user to confirm. It does not guess.
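The decline-rather-than-guess behaviour maps naturally onto a result type with explicit non-answer cases. This is a sketch under assumed names (`Submission`, `resolvePacket`, the `status` values); the `not_found` case is an addition for completeness and is not described in the case study.

```typescript
interface Submission {
  packetId: string;
  submittedAt: string; // ISO 8601 timestamp
  upstreamRecorded: boolean;
}

type Resolution =
  | { status: "ok"; submission: Submission }
  | { status: "ambiguous"; candidates: string[] } // timestamps for the user to choose from
  | { status: "upstream_missing"; submittedAt: string }
  | { status: "not_found" };

function resolvePacket(packetId: string, all: Submission[]): Resolution {
  const [first, ...rest] = all.filter((s) => s.packetId === packetId);
  if (first === undefined) return { status: "not_found" };
  if (rest.length > 0) {
    // Multiple resubmissions: list the timestamps and ask, never pick one silently.
    const candidates = [first, ...rest].map((s) => s.submittedAt).sort();
    return { status: "ambiguous", candidates };
  }
  if (!first.upstreamRecorded) {
    // No upstream response yet: say so instead of returning a partial diff.
    return { status: "upstream_missing", submittedAt: first.submittedAt };
  }
  return { status: "ok", submission: first };
}

// Made-up submissions for illustration.
const submissions: Submission[] = [
  { packetId: "48217", submittedAt: "2024-03-02T09:15:00Z", upstreamRecorded: true },
  { packetId: "48217", submittedAt: "2024-03-01T16:40:00Z", upstreamRecorded: true },
  { packetId: "48214", submittedAt: "2024-03-02T10:05:00Z", upstreamRecorded: false },
  { packetId: "48209", submittedAt: "2024-03-02T11:00:00Z", upstreamRecorded: true },
];

console.log(resolvePacket("48217", submissions).status); // prints "ambiguous"
console.log(resolvePacket("48214", submissions).status); // prints "upstream_missing"
console.log(resolvePacket("48209", submissions).status); // prints "ok"
```

Only the `ok` branch carries a submission, so downstream code cannot run a diff on an ambiguous or incomplete packet without first handling those cases.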

Validation review

Coverage per lookup type

The specialist handles five types of variance question. Each is scored against a weekly sample for first-pass-correct rate.

Field-level confidence · Pass 2: Claude self-review

Confidence bands: High · Medium · Low

Field diff · Submission vs. upstream response · 98% · High
Historical lookup · Reference number search · 97% · High
Variance attribution · Which pipeline step introduced the change · 92% · High
Source-span retrieval · Original document segment · 95% · High
Audit-trail recap · Sequence of events on a packet · 66% · Low

Audit-trail recap is routed to human review. Re-imported records can carry timestamps from the original submission, which creates a false ordering of events. The fix is a tool that strips import-time timestamps before the sequence is passed to the model, not a prompt change.

4 of 5 lookup types cleared the 0.85 threshold
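The routing rule behind that summary can be sketched in a few lines. The scores are the first-pass-correct rates listed above; the names (`LookupScore`, `routeByThreshold`) are hypothetical, and treating a score exactly at the threshold as cleared is an assumption.

```typescript
const REVIEW_THRESHOLD = 0.85;

interface LookupScore {
  lookupType: string;
  firstPassCorrect: number; // fraction of the weekly sample answered correctly on the first pass
}

const weeklyScores: LookupScore[] = [
  { lookupType: "Field diff", firstPassCorrect: 0.98 },
  { lookupType: "Historical lookup", firstPassCorrect: 0.97 },
  { lookupType: "Variance attribution", firstPassCorrect: 0.92 },
  { lookupType: "Source-span retrieval", firstPassCorrect: 0.95 },
  { lookupType: "Audit-trail recap", firstPassCorrect: 0.66 },
];

function routeByThreshold(scores: LookupScore[], threshold: number) {
  // Assumption: a score exactly at the threshold counts as cleared.
  const cleared = scores.filter((s) => s.firstPassCorrect >= threshold);
  const humanReview = scores.filter((s) => s.firstPassCorrect < threshold);
  return { cleared, humanReview };
}

const { cleared, humanReview } = routeByThreshold(weeklyScores, REVIEW_THRESHOLD);
console.log(`${cleared.length} of ${weeklyScores.length} lookup types cleared ${REVIEW_THRESHOLD}`);
// prints "4 of 5 lookup types cleared 0.85"
console.log(humanReview.map((s) => s.lookupType)); // prints [ 'Audit-trail recap' ]
```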

The stack

Boring tech, glued together well

Each vendor handles what it's best at. Aisyst owns the orchestration layer in between.

Claude

Plans diffs and generates structured field-by-field responses

PostgreSQL

Submission records, upstream response records, audit history

Drizzle ORM

Type-safe queries against the records and audit tables

Azure Blob

Source-span retrieval for original document segments

Third-party logos are trademarks of their respective owners and appear here only to indicate integration.

Outcomes

What changed when the research surface became shared

Two analysts asking the same question now get the same answer. Neither of them can accidentally write.

30 sec · Average lookup time

0 · Mutation actions, read-only by design

99% · Coverage on variance-investigation rubric

Roughly 16x faster than manual research (8 minutes by hand vs. about 30 seconds)

Watch rubric coverage on variance investigations

Currently 99%. If it falls, the specialist is encountering a new common variance pattern (a new upstream response format, a new field added to the submission schema, a new resubmission flow) and needs a new tool or an updated prompt. Coverage decay is an earlier warning than user complaints.

If your team spends afternoons reconciling 'what we sent' vs. 'what came back,' this pattern fits.

The constraint is architectural: there are no write tools in the whitelist, so the worst outcome of a wrong answer is a wrong answer. Not a wrong write.