DecoverAI Blog - Why Document Review Is Still Priced Like It's 2009

When the Model Was Built

Document review pricing was designed in an era when “review” meant a contract attorney clicking through TIFFs in a hosted Concordance database. The infrastructure required to host that data was genuinely expensive: on-premise servers, enterprise SAN storage, dedicated IT staff, and data center real estate that billed at a premium. The labor to review documents was also expensive in a different sense — not because contract attorneys commanded high hourly rates, but because there were a lot of them, working a lot of hours, on a lot of documents. At $35–60/hour for contract review and 50–75 documents per hour per reviewer, the economics of volume review were unavoidably labor-intensive.

The per-GB and per-document pricing model emerged as a rational response to those cost realities. Per-GB pricing was a proxy for infrastructure cost: more data meant more servers, more bandwidth, more storage. Per-document pricing was a proxy for labor cost: more documents meant more reviewer hours, more project management, more QC cycles. The model was not a fiction. It was a reasonably accurate approximation of what it actually cost a national eDiscovery vendor to process, host, and review a set of documents in 2003 or 2007. When the first commercial eDiscovery platforms were being built — Concordance, Summation, early Relativity — Amazon S3 did not exist. The cost structure the vendors were pricing against was real.

Understanding this history matters because it explains why the model has proven so durable. The rates were not arbitrary: they were set to recover real costs, yield a margin, and compete against other vendors doing the same calculation. The problem is not that the model was ever wrong. The problem is that it stopped being right approximately ten to twelve years ago, and the pricing has not moved.

What Changed — And What Didn't

Three things have changed categorically since 2009. The first is storage economics. Amazon S3 Standard currently lists at roughly $0.023 per GB per month. Even with the overhead of enterprise data management, access controls, redundancy, and audit logging, a sophisticated SaaS platform cannot credibly argue that the fully loaded cost of hosting legal documents runs to $10, $25, or $40 per GB per month. Raw storage, the cost that once justified per-GB rates, is now well under 1% of the rates vendors continue to charge.
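As a rough check on the storage argument, the sketch below divides the S3 price quoted above by each of the three hosting rates. It covers raw object storage only: the overhead items named above (data management, access controls, redundancy, audit logging) are not modeled, and the constant and function names are illustrative rather than taken from any vendor or AWS API.

```python
# Figures quoted in the text; names are illustrative, not from any vendor API.
S3_STANDARD_PER_GB_MONTH = 0.023           # USD per GB per month, raw storage only
QUOTED_HOSTING_RATES = (10.0, 25.0, 40.0)  # USD per GB per month billed to clients


def storage_share(hosting_rate: float,
                  storage_cost: float = S3_STANDARD_PER_GB_MONTH) -> float:
    """Raw storage cost as a fraction of the billed hosting rate."""
    return storage_cost / hosting_rate


def rate_multiple(hosting_rate: float,
                  storage_cost: float = S3_STANDARD_PER_GB_MONTH) -> float:
    """How many times the raw storage cost the billed rate represents."""
    return hosting_rate / storage_cost


if __name__ == "__main__":
    for rate in QUOTED_HOSTING_RATES:
        print(f"${rate:.0f}/GB/month: storage is {storage_share(rate):.2%} of the rate, "
              f"a multiple of about {rate_multiple(rate):,.0f}x")
```

Even at the lowest quoted rate of $10 per GB per month, raw storage works out to about 0.23% of what is billed. How much of the remainder is genuine overhead is something this sketch cannot say.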

The second change is labor bifurcation. The contract review market has not disappeared — but it has been compressed. Senior associates at AmLaw 100 firms bill $400–900 per hour for judgment-intensive work. Contract reviewers still bill $35–60 per hour for volume review tasks. The middle tier — experienced senior contract reviewers doing complex first-level classification at $80–120 per hour — has thinned substantially. What this means in practice is that the work that per-document pricing was designed to price has become more binary: either it requires a senior attorney (in which case per-document pricing is the wrong unit entirely), or it is volume classification work that AI now performs more accurately and at a cost measured in fractions of a cent per document.

The third change is the automation of the responsiveness pass itself. A multi-model GenAI classifier operating on extracted text can complete the first-pass responsiveness review for a 250,000-document corpus in hours, not weeks. The classification accuracy on well-scoped matters routinely exceeds that of human contract reviewers on volume work, and the output — a document-level relevance determination with confidence scoring — is directly defensible as a TAR-equivalent methodology under the post-Rio Tinto framework. The work that generated the bulk of per-document billing in the traditional model has been automated. What has not changed is the pricing. Vendors still quote per-GB and per-document rates that bear no relationship to the marginal cost of processing the data.

The Intentional Opacity

The persistence of legacy pricing is not an oversight. It is structural. A pricing model built on multiple opaque line items — processing surcharges, per-document review fees, per-entry privilege log charges, per-hour project management, per-user seat fees, per-production output charges — makes it effectively impossible for clients to challenge any individual rate, because no single rate looks unreasonable in isolation. A $100/GB processing fee sounds defensible. A $25/GB/month hosting fee sounds defensible. A $1.50/document review fee sounds defensible. It is only when you stack all the line items together on a real matter — say, a 100 GB commercial dispute with 250,000 documents — that the aggregate becomes indefensible. The total in that scenario is $460,000. See the full line-item breakdown in The $460,000 vs. $36,000 Benchmark.

The opacity is compounded by two structural features of the vendor engagement model. First, the final cost is unknown at engagement inception. Vendors quote per-unit rates, not matter totals. A client who signs a contract at $1.50/document has no idea what the total document count will be after processing and deduplication — and neither does the vendor, who has every economic incentive to process broadly rather than narrowly. Second, the line items flex upward as the matter progresses. Project management hours expand. The privilege review population grows when the responsiveness pass is broader than expected. Production charges accumulate as the matter drags on. Vendors who cannot give you a single all-in number for your matter are vendors whose pricing model depends on you not doing that math.

The single most important question to ask your eDiscovery vendor: Can you give me a single all-in number for this matter? If the answer is no, the line items are designed to flex upward as the matter progresses.

The Labor Tier Split and Why It Matters

The bifurcation of legal labor into senior-associate and contract-reviewer tiers has created a cost arbitrage opportunity that the per-document model is specifically designed to capture. Senior attorneys do not review 250,000 documents for responsiveness. They do not have the time, the tolerance, or, frankly, the comparative advantage. Contract attorneys do that work, and they do it at $35–60/hour with a throughput of 50–75 documents per hour. That translates to a labor cost of approximately $0.47–$1.20 per document ($35 ÷ 75 at the low end, $60 ÷ 50 at the high end) before the vendor's platform margin, project management overhead, QC layers, and profit. The rate billed to the client is $1.00–$3.00/document, which is roughly a 2–6x markup over the low-end labor cost.
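The per-document arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the hourly rates, throughput figures, and billed rates quoted in the paragraph. The helper name and the choice to show all four endpoint pairings are assumptions made for illustration.

```python
def cost_per_document(hourly_rate: float, docs_per_hour: float) -> float:
    """Reviewer labor cost for one document, in USD."""
    return hourly_rate / docs_per_hour


# Endpoints quoted in the text.
labor_low = cost_per_document(hourly_rate=35, docs_per_hour=75)   # cheapest, fastest
labor_high = cost_per_document(hourly_rate=60, docs_per_hour=50)  # priciest, slowest
billed_low, billed_high = 1.00, 3.00                              # USD per document

# The markup depends on which endpoints are paired, so all four pairings are shown.
markups = {
    "billed low / labor low": billed_low / labor_low,
    "billed high / labor high": billed_high / labor_high,
    "billed high / labor low": billed_high / labor_low,
    "billed low / labor high": billed_low / labor_high,
}

if __name__ == "__main__":
    print(f"labor cost per document: ${labor_low:.2f} to ${labor_high:.2f}")
    for pairing, multiple in markups.items():
        print(f"{pairing}: {multiple:.2f}x")
```

The multiple is sensitive to which endpoints are compared. The 2–6x range matches the billed range measured against the low-end labor cost of about $0.47, while pairing like with like (low with low, high with high) gives a narrower band of roughly 2.1x to 2.5x.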

The arbitrage the per-document model captures is real, but it has changed in character. In 2009, the markup was justified in part by the genuine infrastructure costs of hosting a Concordance database and by the project management overhead of coordinating a contract review team in a physical review facility. Both of those costs have declined substantially. Cloud hosting is cheap. Remote review teams are now the norm, and coordinating a distributed review team carries less overhead than coordinating an in-person one. But the markup has not declined commensurately, because the pricing model is sticky and because clients typically lack the internal benchmarking data to challenge it.

AI changes the arbitrage entirely. A multi-model GenAI classifier does not bill per document; its underlying cost is compute. At current cloud LLM API rates, that compute cost is approximately $0.002–$0.01 per document, depending on document length and the number of classification passes. At those rates, the raw compute for a 250,000-document responsiveness pass is $500–$2,500, compared to $375,000 at $1.50/document in the traditional vendor model, a difference of 150x to 750x. The per-document billing model survives only because clients have not yet systematically demanded the single all-in number that would make the comparison visible.
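The comparison above is simple multiplication, sketched here so the inputs can be swapped for a different corpus size or rate. The figures are the ones quoted in the paragraph, and the names are illustrative. Note that this is raw compute only, not an all-in price for an AI-augmented review.

```python
DOCUMENTS = 250_000              # corpus size used in the text
COMPUTE_PER_DOC = (0.002, 0.01)  # USD per document, low and high estimates
VENDOR_RATE_PER_DOC = 1.50       # USD per document, traditional vendor model


def pass_cost(documents: int, rate_per_doc: float) -> float:
    """Total cost of one responsiveness pass at a flat per-document rate."""
    return documents * rate_per_doc


compute_low, compute_high = (pass_cost(DOCUMENTS, r) for r in COMPUTE_PER_DOC)
vendor_total = pass_cost(DOCUMENTS, VENDOR_RATE_PER_DOC)

if __name__ == "__main__":
    print(f"raw compute: ${compute_low:,.0f} to ${compute_high:,.0f}")
    print(f"vendor review fee: ${vendor_total:,.0f}")
    print(f"ratio: {vendor_total / compute_high:,.0f}x to "
          f"{vendor_total / compute_low:,.0f}x")
```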

Why the Model Is Becoming a Litigation Liability

The 2015 amendments to Federal Rule of Civil Procedure 26(b)(1) made proportionality a primary constraint on the scope of discovery, not a backstop for objections. The six-factor test now applies at the threshold of any discovery demand: the importance of the issues at stake, the amount in controversy, the parties' relative access to relevant information, the parties' resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit. Under the post-2015 framework articulated in Henry v. Morgan's Hotel Group, 2016 WL 303114 (S.D.N.Y. 2016), the requesting party bears the burden of demonstrating that its request is proportional to the needs of the case.

A $460,000 review estimate on a $1.2 million dispute, roughly 38% of the amount in controversy, is no longer just expensive; it is potentially disproportionate as a matter of law. That is not a theoretical risk. Courts applying the amended Rule 26(b)(1) have been increasingly willing to limit discovery scope on proportionality grounds, particularly where the cost-to-controversy ratio is extreme and where the producing party can demonstrate that a less expensive methodology exists. A producing party who can point to an AI-augmented all-in cost of $36,000 for the same matter (3% of the amount in controversy) is in a structurally different position in that proportionality argument than one who cannot.
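The cost-to-controversy ratio mentioned above can be computed directly from the figures in this section. The sketch below is illustrative only: Rule 26(b)(1) sets no numeric threshold, the ratio speaks to only some of the six factors, and the names are assumptions rather than part of any established methodology.

```python
AMOUNT_IN_CONTROVERSY = 1_200_000  # USD, the dispute size used in the text
REVIEW_ESTIMATES = {
    "traditional line-item model": 460_000,  # USD
    "AI-augmented all-in": 36_000,           # USD
}


def cost_to_controversy(review_cost: float,
                        amount: float = AMOUNT_IN_CONTROVERSY) -> float:
    """Review cost as a fraction of the amount in controversy."""
    return review_cost / amount


if __name__ == "__main__":
    for label, cost in REVIEW_ESTIMATES.items():
        print(f"{label}: {cost_to_controversy(cost):.1%} of the amount in controversy")
```

Running the same calculation on a real matter only requires replacing the two estimates and the dispute size with the figures from the engagement.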

The vendors who built their business model on per-document pricing did not build it with the expectation that clients would have a credible, court-defensible alternative methodology at a fraction of the cost. That alternative now exists. Courts are now asking whether discovery costs are proportional to the needs of the case. Vendors who cannot defend their pricing in a Rule 26(b)(1) discussion — because their model was built on the assumption that no one would ask — are facing a structural challenge that is not going to resolve in their favor. For the full proportionality framework and the leading case authorities, see Proportionality in eDiscovery: How Courts Are Redefining "Reasonable".
