Case study · Form extraction

Meridian Operations · high-volume document submission

Field re-keying: 22× faster, 94% match

A team that spent its days copying numbers from PDFs into a system of record now reviews exceptions instead — and the extraction gets more accurate every week without a single prompt change.

94% · Extraction match vs. reviewer truth

22× · Faster than manual re-keying

0.97 · Reviewer agreement on auto-posted records


The problem

Every packet meant a person reading a PDF and retyping it three times

Coordinators triaged thousands of submission packets a month. The shape was always the same: an inbound document attached to a ticket, fields copy-pasted from the PDF into the ticket, then retyped into the submission form. Most of the cognitive load was visual pattern-matching plus catalog lookups from memory, which is exactly the part you shouldn't be paying humans to do, and exactly where the same client kept hitting the same error week after week.

Re-keying the same number three times
Every inbound PDF was opened in a viewer, copy-pasted into the ticket, then retyped a third time into the submission form. The third one was always the one that was wrong.
Classification codes entered by memory
Coordinators looked up codes in a separate tab and typed them by hand. A transposed digit only surfaced days later in review, well after the packet had moved downstream.
No audit trail to the source
When a reviewer questioned a declared value, there was no way to point at which line of which page it came from. The manual PDF search took as long as the original entry.
Per-account corrections that didn't stick
A reviewer's correction applied to that one packet. The next packet from the same account had the same error. No feedback loop from correction back to extraction.

Repeat-error log · last 6 weeks

Account · Recurring error · Running

Hartwell Trading Co. · Code 8471.40 vs. 8471.30 · 4w

Westridge Imports · Total $/qty mismatch · 3w

Cascade Logistics · Origin missing · 6w

Northbay Group · Date format flipped · 2w

Olmsted & Sons · Code 9018 vs. 9019 · 5w

Same accounts. Same errors. No feedback loop.

The pipeline

From inbox to verified record in one pass

Six orchestrated steps. Two Claude calls. Reviewer corrections become enrichment suggestions for the next packet from the same account.

01 Trigger

Packet arrives

Zoho Desk webhook fires on a new ticket; attachment URLs and metadata land in the processing queue.

Zoho Desk

02 Detect

File type routed

PDF, DOCX, XLSX, and email-body attachments each route through CloudConvert into a searchable PDF.

CloudConvert

03 Extract

Typed schema with source spans

Claude returns every field as a typed value plus the exact page and span it came from.

Claude

04 Enrich

Parties matched, codes validated

Submitting accounts matched against the parties reference table; classification codes validated against the code list.

05 Verify

Second-pass confidence

Claude emits a 0.00–1.00 score per field with a one-sentence reasoning trace.

Claude

06 Learn

Corrections re-applied

Reviewer corrections are stored as enrichment suggestions and auto-applied to the next packet from the same account.
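
A minimal sketch of how the Learn step could work, assuming corrections are keyed by account, field, and the value that was extracted. The `SuggestionStore` class and its method names are hypothetical, and an in-memory dict stands in for the enrichment-suggestions table. The direction of the Hartwell code correction (8471.40 corrected to 8471.30) is also an assumption taken from the repeat-error log.

```python
from dataclasses import dataclass, field


@dataclass
class SuggestionStore:
    """In-memory stand-in for the enrichment-suggestions table (hypothetical)."""
    # (account, field name, extracted value) -> reviewer-corrected value
    _by_key: dict = field(default_factory=dict)

    def record_correction(self, account, field_name, extracted, corrected):
        """Remember what a reviewer changed for this account and field."""
        self._by_key[(account, field_name, extracted)] = corrected

    def apply(self, account, fields):
        """Return a corrected copy of `fields` and the names that changed."""
        out = dict(fields)
        changed = []
        for name, value in fields.items():
            fix = self._by_key.get((account, name, value))
            if fix is not None:
                out[name] = fix
                changed.append(name)
        return out, changed


store = SuggestionStore()
# Week N: a reviewer corrects the code on one Hartwell packet.
store.record_correction(
    "Hartwell Trading Co.", "classification_code", "8471.40", "8471.30"
)
# Week N+1: the same mistake shows up again and is fixed before review.
next_packet = {"classification_code": "8471.40", "effective_date": "2026-03-14"}
fixed, changed = store.apply("Hartwell Trading Co.", next_packet)
```

Keying on the extracted value keeps a stored correction from overwriting a field that was already right on the next packet.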

High confidence

Auto-post to the system of record

Every field scores at or above 0.70. The record is written to the submission form and the Zoho ticket is annotated with the extraction summary and source page references. No human touches it. Reviewer agreement on these auto-posted records is 0.97.

Low confidence

Routed to the reviewer queue with reasoning trace

Any field below 0.70 — or any packet flagged with a structural anomaly — lands in the queue with the score, the source span highlighted in the PDF, and the reasoning visible inline. Reviewers correct the specific field, not the whole form.
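
The routing rule above can be sketched in a few lines. The 0.70 cut-off comes from the text. The function name, field names, and return shape are illustrative assumptions, and the scores are the ones shown for the example packet.

```python
AUTO_POST_THRESHOLD = 0.70  # cut-off stated in the case study


def route(scores, structural_anomaly=False):
    """Decide where a packet goes and which fields need a reviewer.

    `scores` maps field name -> confidence in the range 0.00 to 1.00.
    """
    low = sorted(name for name, s in scores.items() if s < AUTO_POST_THRESHOLD)
    if low or structural_anomaly:
        return "review", low
    return "auto_post", []


example_scores = {
    "submitting_account": 0.98,
    "effective_date": 0.96,
    "classification_code": 0.91,
    "total_declared_value": 0.61,
    "party_signature": 0.88,
}
destination, flagged = route(example_scores)
```

Returning the flagged field names alongside the destination is what lets a reviewer correct one field instead of the whole form.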

Step 03 · Extraction

What Claude saw → what the system received

Each region Claude reads on the document maps into a field of the structured submission record.

Source attachment: packet-MO-48217.pdf

Extracted by: Claude

Target schema: Submission record

Submitting account · Hartwell Trading Co. · ✓ extracted

Effective date · 2026-03-14 · ✓ extracted

Classification code · 8471.30.0100 · ✓ extracted

Total declared value · $142,380.00 · ✓ extracted

Party signature · M. Okafor · ✓ extracted

5 / 5 fields populated · ✓ ready for review
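
One way to picture "typed value plus source span" is a small record type. This is a sketch only: the class names, the character-offset convention, and every page and offset value below are placeholders, not the production schema. The field values are the ones from the example packet.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Union


@dataclass(frozen=True)
class SourceSpan:
    page: int   # 1-based page number in the source PDF
    start: int  # character offsets on that page (an assumed convention)
    end: int


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: Union[str, date, Decimal]
    span: SourceSpan


def parse_money(text):
    """Turn a string like '$142,380.00' into an exact Decimal."""
    return Decimal(text.replace("$", "").replace(",", ""))


# Values come from the example packet; every page and offset is a placeholder.
record = [
    ExtractedField("submitting_account", "Hartwell Trading Co.", SourceSpan(1, 0, 20)),
    ExtractedField("effective_date", date.fromisoformat("2026-03-14"), SourceSpan(1, 40, 50)),
    ExtractedField("classification_code", "8471.30.0100", SourceSpan(2, 10, 22)),
    ExtractedField("total_declared_value", parse_money("$142,380.00"), SourceSpan(4, 5, 16)),
    ExtractedField("party_signature", "M. Okafor", SourceSpan(5, 0, 9)),
]
```

Carrying the span with each value is what gives a reviewer the audit trail back to a specific page, and typing the total as a `Decimal` avoids float rounding when totals are checked later.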

Step 05 · Accuracy review

The system tells you when it's unsure

A second Claude pass scores every field against the source spans. High-confidence results post automatically. Anything ambiguous routes to a human with the reasoning attached.

Field-level confidence

Pass 2 — Claude self-review

High · Medium · Low

Submitting account · Hartwell Trading Co. · 98% · High

Effective date · 2026-03-14 · 96% · High

Classification code · 8471.30.0100 · 91% · High

Total declared value · $142,380.00 · 61% · Low

Routed to human review: line-item subtotals sum to $138,240.00, not the stated $142,380.00, and one row is partially obscured by a red stamp on page 4.

Party signature · M. Okafor · 88% · High

4 of 5 fields cleared the 0.85 threshold

model: Claude
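
The low score on the total comes down to simple arithmetic that can also be checked outside the model. A sketch of that check follows. The two line-item amounts are hypothetical values chosen only so that they sum to the $138,240.00 quoted above; the actual rows of the packet are not shown in this case study.

```python
from decimal import Decimal


def total_mismatch(line_items, stated_total):
    """Return (computed total, stated total minus computed total)."""
    computed = sum(line_items, Decimal("0"))
    return computed, stated_total - computed


# Hypothetical subtotals; only their sum is taken from the case study.
line_items = [Decimal("96000.00"), Decimal("42240.00")]
computed, gap = total_mismatch(line_items, Decimal("142380.00"))
needs_review = gap != 0
```

A non-zero gap is consistent with a missing or unreadable row, which matches the reasoning trace about the stamped line on page 4.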

The stack

Boring tech, glued together well

Each vendor handles what it's best at. Aisyst owns the orchestration layer in between.

Zoho Desk

Inbound ticket queue + webhook source; extraction summary written back as ticket annotation

CloudConvert

File-type routing and OCR preprocessing for scans, DOCX, XLSX, and email-body attachments

Claude

Extracts typed fields with source spans, then scores confidence with reasoning

Azure

Azure Functions hosting the extraction pipeline, suggestion re-application, and reviewer routing

Postgres + Drizzle ORM

Parties reference table, enrichment suggestions, per-packet extraction records

Third-party logos are trademarks of their respective owners and appear here only to indicate integration.

Outcomes

What changed for the team

The pipeline runs every minute against new tickets. Coordinators graduated from re-keying every field to validating the rare flagged ones — and the system gets sharper week over week as their corrections feed back in.

94% · Extraction match vs. ground truth

22× · Faster than manual re-keying

0.97 · Reviewer agreement on auto-posted records

Week-12 error reduction on enriched fields

Watch the week-12 error reduction, not the day-one accuracy

Day-one accuracy says how well the prompt is written. Week-12 reduction says how well the loop is working. When a reviewer corrects a product field, that correction is stored and re-applied automatically to the next packet from the same account, so the per-account error rate falls each week without any prompt change. If the error rate flattens before week 8, the suggestions are saturating a narrow catalog; if it climbs after week 12, the catalog is drifting. Either signal calls the curator in.
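
A rough sketch of how those two signals could be watched automatically. Week 8 and week 12 come from the text. The tolerance that defines "flat" or "climbing", the function name, and the sample rates are all assumptions made for illustration.

```python
def loop_signal(weekly_error_rates, tol=0.005):
    """Classify a per-account error-rate series; index 0 is week 1.

    Returns ("saturating", week), ("drifting", week), or ("healthy", None).
    """
    for week in range(2, len(weekly_error_rates) + 1):
        delta = weekly_error_rates[week - 1] - weekly_error_rates[week - 2]
        if week < 8 and abs(delta) <= tol:
            return "saturating", week  # stopped falling too early
        if week > 12 and delta > tol:
            return "drifting", week    # rising again after week 12
    return "healthy", None


# Made-up series: a steady two-point drop each week for 12 weeks.
steady = [0.30 - 0.02 * i for i in range(12)]
flat_early = [0.20, 0.15, 0.15]
climbing_late = steady + [0.12]
```

Here `loop_signal(steady)` reports a healthy loop, `flat_early` trips the saturation check at week 3, and `climbing_late` trips the drift check at week 13.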

Related cases


Form-extraction pipelines

Deadline-driven declarations

Datasheet detection in the queue and ten-field extraction in under a minute, before the deadline.


Form-extraction pipelines

Multi-line catalog reconciliation

Line-item extraction matched against an internal catalog, with threshold validation on every row.


Autonomous & infrastructure

Self-checking queue sweeps

Four reasoning sweeps run every two hours, with a confidence floor of 0.91 before any auto-action.


If your team retypes the same fields from PDFs every day, this pattern fits

We'll scope it on a 30-minute call. No deck, no discovery doc — bring a sample packet and we'll walk through where each field goes and where the loop closes.