Vol. 01  ·  No. 1  ·  Est. 2025  ·  Wilder Mountain
Wilder Mountain Bit Farm


A custom eDiscovery platform at a fraction of vendor cost.

A multi-million-document matter. The off-the-shelf options would have eaten the budget. So we built our own.

Client: Henry Law Firm
Year: 2019 – 2020
Role: Solo engineer (direct contract)
Outcome: Vendor-class capability, in-house cost

The situation

Henry Law Firm was preparing for litigation involving several million documents. The realistic options for discovery at that scale are well known and well priced: Relativity and its peers will happily run the matter for you, and they'll charge per gigabyte to do it. For a single case at this volume, that's a six-figure line item before anyone has read a single page.

They asked a simple question: does this need to be a vendor problem, or could we own it? I was hired to find out.

What I built

Working solo on a direct contract, I built an in-house eDiscovery platform around the open-source pieces that matter: Django for the application layer and access controls, PostgreSQL for case metadata, Elasticsearch for full-text and faceted search, Celery for the ingestion pipeline, and Apache Tika plus OCR for extracting text from the wild zoo of formats that production sets show up in (emails with attachments-of-attachments, scanned PDFs, image-only TIFFs, the lot).
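
To make the "attachments-of-attachments" problem concrete, here is a minimal sketch of the unpacking step that has to happen before any text extraction. It is not the platform's code: it uses only Python's standard `email` package, and the names `LeafPart`, `iter_leaf_parts` and `build_sample` are hypothetical. In a pipeline like the one described, each leaf would then be handed to a text extractor such as Tika or an OCR pass.

```python
from dataclasses import dataclass
from email.message import EmailMessage, Message
from typing import Iterator, Optional


@dataclass
class LeafPart:
    """One non-container part of an email, with its position in the MIME tree."""
    path: str                 # e.g. "0/1/0/1": child indexes from the root message
    content_type: str
    filename: Optional[str]
    payload: bytes


def iter_leaf_parts(msg: Message, path: str = "0") -> Iterator[LeafPart]:
    """Recursively yield every leaf part, descending into forwarded emails.

    Both multipart containers and message/rfc822 attachments report
    is_multipart() as True, so a single recursion covers both cases.
    """
    if msg.is_multipart():
        for index, child in enumerate(msg.get_payload()):
            yield from iter_leaf_parts(child, f"{path}/{index}")
        return
    yield LeafPart(
        path=path,
        content_type=msg.get_content_type(),
        filename=msg.get_filename(),
        payload=msg.get_payload(decode=True) or b"",
    )


def build_sample() -> EmailMessage:
    """Build an illustrative email that carries another email, which carries a PDF."""
    inner = EmailMessage()
    inner["Subject"] = "Fwd: signed contract"
    inner.set_content("See the attached scan.")
    inner.add_attachment(
        b"%PDF-1.4 placeholder",
        maintype="application",
        subtype="pdf",
        filename="scan.pdf",
    )
    outer = EmailMessage()
    outer["Subject"] = "Contract thread"
    outer.set_content("Forwarding the earlier email.")
    outer.add_attachment(inner)  # attached as message/rfc822
    return outer


if __name__ == "__main__":
    for part in iter_leaf_parts(build_sample()):
        print(part.path, part.content_type, part.filename)
```

Keeping the tree path on every leaf is the design point: a reviewer looking at a scanned PDF usually needs to know which email, inside which forwarded email, it arrived in.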

On top of that I added lightweight ML classifiers to triage documents by relevance, so attorney review could focus on the small slice that actually bore on the case rather than spending hours on company holiday-party invitations and out-of-office replies.
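
As an illustration of what relevance triage can look like, the sketch below trains a tiny multinomial Naive Bayes model in pure Python and sorts a review queue by predicted relevance. This is a stand-in technique chosen for brevity, not a description of the models used on the matter. The training examples, the labels and the names `NaiveBayesTriage` and `rank_for_review` are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, Iterable, List, Tuple

TOKEN = re.compile(r"[a-z']+")

# Hypothetical attorney-labelled examples; real training data would be far larger.
TRAINING: List[Tuple[str, str]] = [
    ("supply agreement pricing amendment draft", "relevant"),
    ("termination clause in the supply agreement", "relevant"),
    ("pricing dispute over delivery terms", "relevant"),
    ("holiday party invitation rsvp friday", "noise"),
    ("out of office reply back monday", "noise"),
    ("lunch order friday office party", "noise"),
]


def tokenize(text: str) -> List[str]:
    return TOKEN.findall(text.lower())


class NaiveBayesTriage:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha
        self.word_counts: Dict[str, Counter] = {}
        self.doc_counts: Counter = Counter()
        self.vocab: set = set()

    def fit(self, docs: Iterable[Tuple[str, str]]) -> "NaiveBayesTriage":
        for text, label in docs:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text: str) -> Dict[str, float]:
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokens:
                score += math.log((counts[token] + self.alpha) / denom)
            scores[label] = score
        return scores

    def relevance(self, text: str) -> float:
        """Return P(relevant | text), normalised stably from the log scores."""
        scores = self.log_scores(text)
        top = max(scores.values())
        weights = {label: math.exp(s - top) for label, s in scores.items()}
        return weights.get("relevant", 0.0) / sum(weights.values())


def rank_for_review(model: NaiveBayesTriage, texts: Iterable[str]) -> List[str]:
    """Order documents so the most likely relevant ones are reviewed first."""
    return sorted(texts, key=model.relevance, reverse=True)
```

A model like this ranks rather than decides: low-scoring documents drop to the back of the queue, and attorneys still make every relevance and privilege call.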

The product side mattered as much as the pipeline. The interface had to make sense to attorneys, not engineers: faceted search, batch tagging, privilege flagging, and audit trails, the muscle memory of a litigation team in a tool they'd never seen before.
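
Batch tagging and audit trails fit together naturally, and a small sketch shows how. It is an in-memory illustration with hypothetical names (`ReviewStore`, `AuditEntry`, `batch_tag`, `facet_counts`), not the platform's schema. In a Django application these would more plausibly be database models, which is an assumption on my part here.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class AuditEntry:
    """An immutable record of who did what to which document, and when."""
    timestamp: datetime
    user: str
    doc_id: int
    action: str  # e.g. "tag:privileged"


@dataclass
class ReviewStore:
    tags: Dict[int, Set[str]] = field(default_factory=dict)
    audit_log: List[AuditEntry] = field(default_factory=list)

    def batch_tag(self, user: str, doc_ids: Iterable[int], tag: str) -> int:
        """Apply a tag to many documents and log each real change.

        Documents that already carry the tag are skipped, so the audit log
        records only actual state changes. Returns the number changed.
        """
        now = datetime.now(timezone.utc)
        changed = 0
        for doc_id in doc_ids:
            doc_tags = self.tags.setdefault(doc_id, set())
            if tag in doc_tags:
                continue
            doc_tags.add(tag)
            self.audit_log.append(AuditEntry(now, user, doc_id, f"tag:{tag}"))
            changed += 1
        return changed

    def facet_counts(self) -> Dict[str, int]:
        """Count documents per tag, the numbers a facet sidebar would show."""
        counts: Counter = Counter()
        for doc_tags in self.tags.values():
            counts.update(doc_tags)
        return dict(counts)
```

The log is only ever appended to, so a privilege flag set by one reviewer can be traced later even after other reviewers have tagged the same documents.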

What changed

Henry Law went into the matter with vendor-class capability at in-house cost. The platform handled the document set end to end, from ingestion through review, so the firm wasn't spending the case's budget on the case's tooling. For a litigation practice that runs cases like this regularly, owning the platform pays back fast.

Looking at a vendor invoice that doesn't pencil?

Sometimes the right call is to pay the vendor. Sometimes it isn't. If you'd like a second opinion before you sign, get in touch.
