How to read AI phone agent success rate
“Success rate” sounds simple. In practice, vendors can make the number look wildly different depending on what they include, what they exclude, and whether they quietly swap full traffic for a polished customer slice. This page shows how smart teams read the number before trusting it.
1) Start with definitions
2) Ask which audience the number represents
A single percentage can hide very different realities. Ask whether the reported number includes all production traffic, only mature customers, or some other curated slice.
Production-wide view
This shows how the system performs across the full customer base, including customers who are still learning and testing the product.
Mature-customer view
This helps teams understand what performance looks like after the testing phase, once the workflows are more settled.
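To see how far the two views can diverge, here is a minimal sketch that computes the same rate on both audiences. The `Call` record, the outcome labels, the sample data, and the idea of a `mature_customers` set are all assumptions made for illustration; a real vendor export will look different.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Call:
    customer_id: str
    outcome: str  # assumed labels: "resolved", "redirected", "hang_up", "voicemail"

def resolved_rate(calls):
    """Share of calls labelled resolved; 0.0 for an empty list."""
    if not calls:
        return 0.0
    return sum(c.outcome == "resolved" for c in calls) / len(calls)

# Hypothetical sample: customer "a" is settled, customer "b" is still testing.
calls = [
    Call("a", "resolved"), Call("a", "resolved"),
    Call("a", "resolved"), Call("a", "hang_up"),
    Call("b", "resolved"), Call("b", "redirected"),
    Call("b", "hang_up"), Call("b", "voicemail"),
]
mature_customers = {"a"}  # how "mature" is decided is itself a definition to ask about

production_wide = resolved_rate(calls)
mature_only = resolved_rate([c for c in calls if c.customer_id in mature_customers])

print(f"production-wide: {production_wide:.0%}")
print(f"mature-only:     {mature_only:.0%}")
```

With this sample the production-wide rate is 50% and the mature-only rate is 75%. Both are honest numbers, but they answer different questions, so a headline should say which one it is.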
Use the same framework on Shopify and beyond
This framework works whether you start on Shopify or deploy enterprise AI agents outside Shopify. The call types may change, but the discipline does not: define outcomes clearly, label exclusions, and compare vendors on the same denominator.
3) Why hang-ups still matter
Hang-ups are a normal way for calls to end. A customer may hang up because they got the answer quickly, because they’re in a noisy environment, because a package arrived mid-call, or because they decided to switch channels. Treating hang-ups as “not real calls” is a convenient way to inflate the resolution rate.
Disqualify “resolved-call billing” that hides outcomes
If a vendor says they charge only for resolved calls, that can sound aligned with your incentives. It is not aligned if the vendor controls the definition of “resolved” and removes hang-ups, redirects, voicemail, or short conversations that would make the rate look worse.
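The effect of a shrinking denominator is easy to check with arithmetic. The sketch below uses made-up monthly counts and a hypothetical `resolution_rate` helper. The number of resolved calls never changes; only the set of calls counted in the denominator does.

```python
def resolution_rate(counts, excluded=()):
    """Resolved calls divided by all calls whose label is not excluded."""
    denominator = sum(n for label, n in counts.items() if label not in excluded)
    return counts.get("resolved", 0) / denominator if denominator else 0.0

# Hypothetical month of traffic (illustrative numbers, not real data).
counts = {"resolved": 520, "redirected": 180, "hang_up": 220, "voicemail": 80}

full = resolution_rate(counts)
curated = resolution_rate(counts, excluded=("hang_up", "voicemail"))

print(f"all calls counted:              {full:.1%}")
print(f"hang-ups and voicemail removed: {curated:.1%}")
```

Here the same 520 resolved calls read as 52.0% against all 1,000 calls and as 74.3% once hang-ups and voicemail are dropped. That is a gain of more than 22 points with no change in how the agent performed.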
4) How headline numbers get manufactured
The point of reporting should not be to manufacture the biggest-looking headline. It should be to help teams understand the operating reality behind the number they are being asked to trust.
5) Vendor checklist
Ask for this report
- Resolved / Redirected / Hang-up / Voicemail (counts + %)
- Definitions for each label
- Breakdown by call reason
- Breakdown by time-to-resolution and transfers
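As a rough illustration of what such a report could contain, the sketch below builds counts and percentages per label, grouped by call reason. The label names, the `(reason, outcome)` input shape, and the sample rows are assumptions, not a vendor format.

```python
from collections import Counter, defaultdict

# Assumed label set, taken from the report headings above.
LABELS = ("resolved", "redirected", "hang_up", "voicemail")

def outcome_breakdown(outcomes):
    """Return {label: (count, share)} for a list of outcome labels."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        # Refuse to drop unlabelled calls silently: that would be a hidden exclusion.
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = len(outcomes)
    return {
        label: (counts[label], counts[label] / total if total else 0.0)
        for label in LABELS
    }

def breakdown_by_reason(calls):
    """Group (reason, outcome) pairs by call reason, then break each group down."""
    grouped = defaultdict(list)
    for reason, outcome in calls:
        grouped[reason].append(outcome)
    return {reason: outcome_breakdown(outs) for reason, outs in grouped.items()}

# Hypothetical sample data.
calls = [
    ("order_status", "resolved"),
    ("order_status", "resolved"),
    ("order_status", "hang_up"),
    ("order_status", "voicemail"),
    ("returns", "resolved"),
    ("returns", "redirected"),
]

report = breakdown_by_reason(calls)
for reason, rows in report.items():
    for label, (count, share) in rows.items():
        print(f"{reason:<14}{label:<12}{count:>3}  {share:.0%}")
```

Every label appears in every group, including those with zero calls, so a missing row cannot be mistaken for a missing category.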
Auditability
- Transcript + summary on every call
- End reason (“who hung up?” / “why did the call end?”)
- Operator workflow: review, mark done, and improve skills
- Integration outcomes (tickets/CRM updates) in the call record
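One way to picture an auditable call record is as a structure in which each item above has a field that can be checked. The sketch below is hypothetical throughout: the `CallRecord` fields, the `end_reason` values, and the `audit_gaps` helper are invented for illustration and do not describe any vendor's schema.

```python
from dataclasses import dataclass, field

@dataclass
class CallRecord:
    call_id: str
    transcript: str = ""
    summary: str = ""
    end_reason: str = ""    # hypothetical values: "caller_hung_up", "agent_transferred"
    reviewed: bool = False  # set once an operator has reviewed and marked the call done
    integration_events: list = field(default_factory=list)  # e.g. ticket or CRM updates

def audit_gaps(record):
    """List the per-call audit fields that are empty on this record."""
    gaps = []
    if not record.transcript.strip():
        gaps.append("transcript")
    if not record.summary.strip():
        gaps.append("summary")
    if not record.end_reason:
        gaps.append("end_reason")
    return gaps

complete = CallRecord(
    "c-1",
    transcript="Caller asked about a refund.",
    summary="Refund status given.",
    end_reason="caller_hung_up",
    integration_events=["ticket_updated"],
)
partial = CallRecord("c-2", transcript="Hello?")

print(audit_gaps(complete))
print(audit_gaps(partial))
```

The check covers only the fields expected on every call. Integration events and operator review are recorded but not required, since not every call will produce a ticket or have been reviewed yet.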
6) A note on headline percentages
A higher number does not automatically mean a better system. A serious vendor should be able to explain the audience, denominator, exclusions, billing logic, and workflow quality behind the headline.