Proof of Concept results

Proof is proof,
whether it says yes or no.

The following case studies document real Proof of Concept (PoC) engagements. Three returned positive results. One returned a negative result. We are publishing it because a negative result that saves a client $74,000 is the most important story we can tell about what this model is actually for.

⚠ Company names and identifying details have been anonymised at client request. All performance figures are drawn as-is from documented PoC reports.


Law office · Positive result

Case study 01

From four hours to forty minutes without a quality drop.

A boutique litigation firm discovers that its most time-intensive workflow can be AI-augmented without compromising the rigour its partners demand.

Client: Garrison & Associates LLP
Size: 11 staff · 4 partners
Revenue: ~$2.1M
Location: Midwest, USA
Workflow tested: Contract review summaries

A profitable firm with a billable hour problem.

Garrison & Associates had built a strong reputation in commercial litigation over 14 years. Associates were spending an average of four hours reviewing each incoming contract before producing a summary memo for the supervising partner, handling 18 to 22 contracts per month. That was roughly 80 hours of non-billable associate time per month consumed by a single workflow.

The managing partner had reviewed a $65,000 proposal from a large AI agency and declined. The ROI case was not clear enough to justify the spend without evidence first.

Legal work demands precision. The bar was non-negotiable.

The partners were not opposed to AI. They were opposed to AI that produced legally imprecise summaries. The firm had a specific memo format refined over a decade that any AI output had to match.

We agreed on a blind evaluation rubric before the PoC: five quality dimensions including legal accuracy, clause identification, and structural adherence. Both AI and associate outputs were scored by two senior partners who did not know which was which.
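The blind scoring step described above can be sketched in a few lines. This is a minimal illustration, not the tooling used in the engagement: the function names, the `memo-000` ID scheme, the sample scores, and the dimension keys are all hypothetical, and only the three rubric dimensions named in the text appear.

```python
import random
from statistics import mean


def blind_pairs(items, seed=0):
    """Shuffle (source, text) pairs; reviewers see only an opaque ID and the text."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    key = {}      # opaque ID -> true source, withheld from reviewers
    blinded = []  # what reviewers actually receive
    for i, (source, text) in enumerate(shuffled):
        memo_id = f"memo-{i:03d}"
        key[memo_id] = source
        blinded.append((memo_id, text))
    return blinded, key


def unblind_scores(scores, key):
    """Average each source's scores across memos, reviewers and dimensions.

    scores: {memo_id: {reviewer: {dimension: score_out_of_10}}}
    """
    by_source = {}
    for memo_id, per_reviewer in scores.items():
        for dims in per_reviewer.values():
            by_source.setdefault(key[memo_id], []).extend(dims.values())
    return {source: round(mean(vals), 1) for source, vals in by_source.items()}


# Hypothetical data: one AI memo, one associate memo, two reviewers.
items = [("ai", "memo text A"), ("associate", "memo text B")]
blinded, key = blind_pairs(items, seed=42)
scores = {
    memo_id: {
        "partner_1": {"legal_accuracy": 8, "clause_identification": 9, "structural_adherence": 8},
        "partner_2": {"legal_accuracy": 9, "clause_identification": 8, "structural_adherence": 9},
    }
    for memo_id, _ in blinded
}
result = unblind_scores(scores, key)
print(result)
```

The point of the design is that the mapping from ID to author lives in `key`, which is only consulted after every score has been recorded.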

Workflow benchmarked · Contract review summary memo

Before · Human baseline

Associate reviews contract · Drafts summary memo · Partner reviews · Average: 4.1 hrs per contract

After · AI-augmented

Contract uploaded · AI generates memo to firm template · Associate quality checks · Average: 41 min per contract

Time per contract

4.1 hrs → 41 min

83% reduction

Blind quality score

AI: 8.6 / Human: 8.2

Out of 10 across 5 dimensions

Annual hrs reclaimed

~720 hrs

At 20 contracts/month avg.

"I was looking at two memos and genuinely could not identify which one had been written by our associate of three years. That was the moment I knew we had a real result, not a demo."

— Managing Partner, Garrison & Associates LLP

What they did with the playbook

Immediate rollout across all contract types

Associates began using the validated workflow on every incoming contract within two weeks. Average time dropped to 41 minutes, with a 22-minute quality-check step replacing the previous 4-hour drafting process.

Redeployed capacity to billable work

Time freed was redirected toward case preparation, client communication, and research. The firm estimates an additional $180,000 in annual billable capacity from the same headcount.

Commissioned a second PoC

The firm engaged Perceptrus for a second PoC covering client intake documentation and matter update letter drafting, both validated within the same 21-day model.

Accounting firm · Failed AI project + Negative PoC

Case study 02

They lost $47,000 learning the hard way. Then we saved them $74,000 more.

A mid-size accounting firm attempts an internal AI implementation without validation, absorbs a costly failure, then uses the Proof of Concept to avoid repeating the mistake on a second project, this time with a negative result that proves its worth.

Client: Hartwell Advisory Group
Size: 18 staff · 3 directors
Revenue: ~$3.4M
Location: Southeast, USA
PoC workflow: Automated audit prep AI
PoC result: Negative, not viable

Why we are publishing a negative result

Hartwell Advisory Group's story demonstrates both what unvalidated AI implementation costs and what a timely negative result is worth. We publish it because it is the most honest case we have ever documented, and the most instructive.

A $47,000 lesson in skipping validation.

In early 2025, Hartwell Advisory Group decided to pursue AI-driven automation of its audit preparation workflow: collating, categorising, and summarising client financial data ahead of annual audits. The workflow consumed approximately 120 hours of senior staff time per audit cycle.

Rather than validating the concept first, the firm moved directly to implementation. They engaged a generalist software contractor through a referral, allocated an internal project manager, and began building a custom AI workflow tool designed to ingest client data and produce structured audit prep summaries.

Seven months in, the tool had produced inconsistent outputs across different client data structures, failed to handle the firm's non-standard chart of accounts formats, and generated summaries that required more time to correct than the original manual process.

The firm abandoned the project. The tool was never deployed. The contractor was terminated. The internal project manager had spent an estimated 30% of their working hours on the initiative across seven months, none of it recoverable.

Total cost of the failed implementation

Cost item | Detail | Amount
Contractor development fees | 7-month engagement, fixed + variable billing | $28,500
Internal project management time | Est. 30% of PM salary x 7 months | $11,200
Senior accountant testing time | Review cycles, error documentation, feedback sessions | $5,400
Software licences purchased | Annual licences acquired for the intended tool stack | $2,100
Total direct and indirect cost of failed project | | $47,200

The outcome: failed implementation

$47,200 spent. Zero workflows changed. Seven months lost.

The firm returned to manual audit preparation with no improvement to throughput, a damaged internal appetite for AI, and a senior team openly sceptical of any future AI initiative. The managing director described it as "the most expensive lesson we never needed to learn."

"We did everything the wrong way. We trusted a referral instead of evidence. We built before we tested. We spent seven months and nearly fifty thousand dollars finding out what a three-week proof of concept would have told us for a fraction of the cost. We won't make that mistake again."

— Managing Director, Hartwell Advisory Group

Twelve months later: a second AI idea, validated before a single dollar was spent on it.

Twelve months after the failed implementation, a director identified a new AI opportunity: automating the narrative commentary in their quarterly management accounts, the written analysis accompanying financial data packages for clients. Writing that commentary consumed approximately 180 senior hours per quarter.

This time, the firm did not move directly to implementation. They engaged Perceptrus to run a Proof of Concept first, designed to test one specific hypothesis: can AI produce management accounts commentary that meets Hartwell's quality standard across their diverse client base?

The Proof of Concept ran across 28 real management accounts packages, a genuine cross-section including SMEs, holding companies, and not-for-profits. AI and human commentary were scored blind by three senior directors on technical accuracy, client-appropriate tone, variance explanation quality, and actionability of insights.

The result was unambiguous: AI performed adequately on simple single-entity accounts but failed materially on consolidated group accounts, entities with non-standard revenue recognition, and clients requiring industry-specific regulatory language. On 11 of the 28 packages, 39% of the sample, AI outputs fell below the firm's minimum acceptable threshold.
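The headline failure rate follows directly from the package counts quoted above. A minimal sketch of that arithmetic, using only the 11 and 28 from the text (the function name is illustrative):

```python
def failure_rate(below_threshold, total):
    """Share of evaluated packages that fell below the minimum threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    return below_threshold / total


packages_total = 28   # packages evaluated in the PoC
packages_below = 11   # packages scoring below the firm's threshold

rate = failure_rate(packages_below, packages_total)
passed = packages_total - packages_below
print(f"{rate:.0%} below threshold, {passed} of {packages_total} passed")
```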

PoC verdict · Management accounts narrative commentary

Simple accounts - AI performance

Adequate · Score 7.2 avg · Passed threshold on 17 of 28 packages

Complex accounts - AI performance

Below standard · Failed threshold on 11 of 28 packages · Material inaccuracies in 4 cases

Packages below threshold

11 / 28

39% failure rate

Overall quality score

AI: 6.8 / Human: 8.4

1.6 pt gap - unacceptable

Implementation cost avoided

$74,100

Projected build + rollout

What a failed implementation would have cost them this time

Projected cost item | Detail | Amount
AI platform implementation | Vendor-quoted custom deployment scope | $38,000
Internal integration and testing | Estimated 4-month staff time allocation | $18,500
Client data migration and validation | Per vendor scoping document | $9,200
Training and change management | External facilitation + internal hours | $8,400
Total avoided by negative PoC result | | $74,100
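Both cost tables in this case study can be checked by summing their line items. The sketch below copies the amounts from the two tables; the dictionary and function names are illustrative only.

```python
# Line items from the failed implementation table (USD).
failed_project = {
    "Contractor development fees": 28_500,
    "Internal project management time": 11_200,
    "Senior accountant testing time": 5_400,
    "Software licences purchased": 2_100,
}

# Line items from the projected (avoided) implementation table (USD).
avoided_project = {
    "AI platform implementation": 38_000,
    "Internal integration and testing": 18_500,
    "Client data migration and validation": 9_200,
    "Training and change management": 8_400,
}


def total(costs):
    """Sum the amounts of a cost table."""
    return sum(costs.values())


print(f"Failed project:  ${total(failed_project):,}")
print(f"Avoided project: ${total(avoided_project):,}")
```

The totals match the $47,200 and $74,100 figures shown in the tables.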

The outcome: negative PoC

A negative result, and the best return on investment the firm made all year.

The PoC cost a fraction of the projected implementation. It returned a clear, evidence-based verdict before a single line of code was written or a vendor contract signed. Hartwell did not proceed. They saved $74,100, preserved client-facing quality standards, and retained their senior team's trust in AI decision-making, because this time, the decision was made with data.

"The PoC told us no. And that answer was worth more than a yes would have been. We would have spent seventy thousand dollars building something that would have damaged our reputation with clients. A small experiment stopped a large mistake."

— Managing Director, Hartwell Advisory Group

What they did next

Identified a viable subset from the PoC data

The Proof of Concept report documented precisely which account types AI handled adequately and which it failed on. The firm identified approximately 40 simple single-entity clients for whom AI commentary was viable and commissioned a targeted follow-up Proof of Concept on that cohort. That Proof of Concept returned a positive result.

Deployed a scoped, targeted implementation

Rather than a firm-wide rollout, Hartwell implemented AI commentary generation for the 40-client subset only, reducing quarterly narrative workload by approximately 35 hours without exposing complex clients to substandard output.

Adopted Shadow PoC as mandatory governance

The board formally adopted the Proof of Concept model as a mandatory validation step before any future AI initiative is approved for implementation funding. The negative result became the policy.

Dental clinic · Positive result

Case study 03

Giving clinicians back eight hours a week, without touching patient care.

A family dental practice discovers its administrative backlog is not a staffing problem. It is a workflow problem, with a measurable AI solution.

Client: Northside Family Dental
Size: 9 staff · 2 dentists
Revenue: ~$1.6M
Location: Pacific Northwest, USA
Workflow tested: Patient follow-up communications

Admin was eating clinical time. Something had to give.

Northside Family Dental's practice manager had identified a persistent problem: clinicians were completing post-appointment documentation and follow-up correspondence outside of clinical hours, adding 45 to 75 minutes to their daily schedule.

The workflows in question were high-volume and repetitive: post-treatment care instructions, insurance pre-authorisation letter drafts, and appointment re-engagement messages. None required clinical judgement to produce. All required clinical language and HIPAA-aware communication standards.

Healthcare language has zero tolerance for imprecision.

Patient communications carry both a compliance obligation and a trust obligation. The practice principal made clear that any AI output would need to pass his personal review before the playbook was accepted.

We structured the Proof of Concept around three discrete document types, each with its own quality rubric. All outputs were reviewed blind by both the practice principal and practice manager independently before scores were compared.

Workflow benchmarked · Patient post-appointment communications

Before · Human baseline

Clinician drafts follow-up content manually · Admin sends · Average: 22 min per patient across 3 document types

After · AI-augmented

Treatment type selected · AI generates all 3 documents · Clinician spot-checks in 4 min · Admin sends · Average: 9 min per patient

Time per patient set

22 min → 9 min

58% reduction

Clinician admin hrs/week

~11 hrs → ~4.5 hrs

Across both dentists

Blind quality score

AI: 8.8 / Human: 8.5

Post-treatment instructions led

"I told them I was looking for reasons to reject it. The post-treatment instructions were more thorough than what we had been sending. The re-engagement letters had better tone. I ran out of objections."

— Practice Principal, Northside Family Dental

What they did with the playbook

Eliminated after-hours admin for both dentists

Within three weeks, neither dentist was completing documentation outside clinical hours. The recovered time was split between additional patient appointments and personal time, and two additional daily slots were opened across both chairs.

Recall rate improved 18% in the first quarter

AI-generated re-engagement messages were more consistently sent and better timed than the previous manual process. Recall bookings increased 18% in the first full quarter, contributing directly to revenue.

Insurance pre-auth turnaround dropped from 6 days to 2

Pre-authorisation drafts were now produced same-day instead of queued. Average time from procedure planning to insurance submission dropped from six days to two, accelerating approvals and the associated revenue cycle.

Recruiting agency · Positive result

Case study 04

Three times the candidate output. The same three recruiters.

A specialist talent agency validates AI-augmented candidate brief production and discovers a workflow advantage significant enough to change how they price their service.

Client: Vantage Talent Partners
Size: 7 staff · 3 senior recruiters
Revenue: ~$1.9M
Location: Northeast, USA
Workflow tested: Candidate briefing documents

A strong pipeline limited by how long briefs took to write.

Vantage Talent Partners had carved out a niche in mid-level financial services and operations placements. Their competitive advantage was the quality of their candidate briefs: detailed documents presenting shortlisted candidates to hiring managers with enough context to make confident interview decisions.

A thorough brief took a senior recruiter between 75 and 110 minutes to write from interview notes and CV data. With 15 to 20 active roles at any time and three to five candidates per role, the team was spending 50 to 80 hours per month on brief writing alone.

The brief was the product. Mediocre output would undermine the brand.

Candidate briefs were client-facing and the primary evidence of the agency's value-add. A brief that read as generic would signal exactly the opposite of what Vantage was known for.

The Proof of Concept tested AI brief generation across 30 real candidate profiles. Evaluators were three hiring managers at existing client companies who scored both sets blind on presentation quality, insight depth, usefulness for interview preparation, and overall impression.

Workflow benchmarked · Candidate briefing document

Before · Human baseline

Recruiter reviews notes and CV · Drafts brief from scratch · Edits and formats · Average: 88 min per candidate

After · AI-augmented

Notes and CV structured · AI generates full brief to agency template · Recruiter adds insight layer · Average: 19 min per candidate

Time per brief

88 min → 19 min

78% reduction

Client blind score

AI: 8.9 / Human: 8.4

Scored by 3 active hiring managers

Monthly capacity unlocked

+60 hrs

Across 3 senior recruiters
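The time-per-brief reduction can be reproduced from the before and after averages. The sketch below also shows how a monthly saving scales with volume; the 52 briefs per month is a hypothetical volume chosen for illustration, not a figure from the PoC report.

```python
def percent_reduction(before, after):
    """Percentage drop from a baseline value to a new value."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


def monthly_hours_saved(before_min, after_min, briefs_per_month):
    """Hours saved per month given per-brief times in minutes."""
    return (before_min - after_min) * briefs_per_month / 60


reduction = percent_reduction(88, 19)                        # minutes per brief
saved = monthly_hours_saved(88, 19, briefs_per_month=52)     # 52 is an assumed volume

print(f"Reduction per brief: {reduction:.0f}%")
print(f"Hours saved at 52 briefs/month: {saved:.1f}")
```

At that assumed volume the saving lands close to the reported 60 hours per month.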

"Our clients scored the AI briefs higher. That told us two things: the AI is better at the format, and our recruiters' value is in the insight layer, not the writing layer. That changes how you think about the business."

— Director, Vantage Talent Partners

What they did with the playbook

Active role capacity increased from 20 to 34

With 60 hours of monthly senior time recovered, the team redirected effort toward sourcing and business development. Roles managed simultaneously increased from approximately 20 to 34, a 70% capacity expansion with no new hires.

Repriced the service tier upward

The independently confirmed quality improvement gave the director confidence to introduce a premium brief tier priced 15% above standard. Three of their seven anchor clients opted in within the first billing cycle.

Redefined what a recruiter's job actually is

By separating writing from insight, the team recognised their true differentiation: reading candidates, understanding cultural fit, identifying potential. The AI handled the presentation layer. The recruiters owned the intelligence layer. Both improved.

A negative result is still
the right result.

Every case study above started with one question: does this actually work for our operation? We ran the experiment and told the truth about what we found. Apply and we will do the same for yours, whatever the answer turns out to be.

Apply for Your PoC →