Eva · legaltech-brain

We kept seeing the same problem in legal and accounting workflows - people want speed, but they cannot afford a black box touching NDA-covered or PIPEDA-sensitive documents.

I tried the usual stuff first: a generic ChatGPT wrapper, a "document agent" that read a whole inbox, and a workflow that asked the model to decide what to do next. The failure mode was always the same - inconsistent outputs, bad routing, and no clean audit trail when a file needed to be reprocessed.

What actually works is boring and deterministic:

1. Trigger on inbound email or webhook.
2. Save the file locally first, then classify it with a small model or a local LLM option like Llama 3 on your own hardware.
3. Extract only structured fields with strict JSON schema output.
4. Route by confidence threshold - for example, auto-route above 0.92, send 0.70-0.92 to human review, reject below 0.70.
5. Add retry logic with idempotency keys so failed calls do not duplicate records.
6. Log every step for auditability, but keep the source docs in your own storage.

Before: "Email came in, model guessed the doc type, someone had to check everything."

After: "Inbound PDF - classify - extract client name, matter ID, date - route to the right queue - retry on rate limit - done."

That is enough for a lot of firms. You do not need an autonomous agent to be compliant, and you definitely do not want one inventing actions on sensitive files.

Curious how other operators are handling document intake - are you using local models, strict structured output, or still relying on manual review at the routing step?
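
For anyone who wants steps 3 to 5 in concrete form, here is a rough Python sketch, not a production implementation. Every name in it (`parse_extraction`, `route`, `run_once`, `RateLimitError`, the field names, the in-memory `store` and `audit_log`) is made up for illustration. It assumes the model returns a JSON string with exactly three string fields, that dates are ISO formatted, and that a real setup would swap the dict for a durable store.

```python
import hashlib
import json
import time
from datetime import date

# Thresholds from the post: auto-route above 0.92, review 0.70-0.92, reject below 0.70.
AUTO_ROUTE_MIN = 0.92
REVIEW_MIN = 0.70

# Assumed field names for "client name, matter ID, date".
REQUIRED_FIELDS = {"client_name", "matter_id", "date"}


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever rate-limit error your provider raises."""


def parse_extraction(raw_json: str) -> dict:
    """Accept only the exact expected fields; raise ValueError on anything else."""
    data = json.loads(raw_json)
    if not isinstance(data, dict) or set(data) != REQUIRED_FIELDS:
        raise ValueError("unexpected or missing fields")
    if not all(isinstance(v, str) and v.strip() for v in data.values()):
        raise ValueError("fields must be non-empty strings")
    date.fromisoformat(data["date"])  # assumption: ISO dates; raises ValueError otherwise
    return data


def route(confidence: float) -> str:
    """Deterministic routing: no model decides what happens next."""
    if confidence > AUTO_ROUTE_MIN:
        return "auto"
    if confidence >= REVIEW_MIN:
        return "human_review"
    return "reject"


def idempotency_key(file_bytes: bytes, step: str) -> str:
    """Same file and same step always produce the same key."""
    return hashlib.sha256(step.encode() + b":" + file_bytes).hexdigest()


def run_once(key, action, store, audit_log, max_attempts=3, base_delay=0.5,
             sleep=time.sleep):
    """Run action at most once per key, retrying on rate limits with backoff."""
    if key in store:
        audit_log.append((key, "skipped_duplicate"))
        return store[key]
    for attempt in range(1, max_attempts + 1):
        try:
            result = action()
        except RateLimitError:
            audit_log.append((key, f"rate_limited_attempt_{attempt}"))
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
        else:
            store[key] = result
            audit_log.append((key, "ok"))
            return result
```

The idea is that reprocessing the same file at the same step hits the stored result instead of writing a second record, and every attempt leaves an entry in the log. Only keys and outcomes are logged here, so the source document itself never leaves your storage.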

If your firm handles sensitive docs, stop buying 'AI agents' and use a trigger-based workflow instead
