Most teams underestimate AI cost for the same reason:
They start with models instead of behavior.
The Scenario: Customer Support Chatbot (Production)
Imagine a customer support chatbot embedded in a SaaS product.
Assumptions
- ~25,000 conversations per month
- Uneven traffic (launches, outages, renewals)
- Long-tail queries with growing context
- Retries and fallbacks are expected
This already rules out “average cost per request” thinking: an average hides the bursts, retries, and growing context that drive the bill.
Step 1 — Start With Traffic, Not Tokens
Tokens matter — but only after you understand behavior.
The real cost drivers show up when:
- Many users hit the system at once
- Context grows over time
- Retries compound during partial failures
Production traffic is spiky by default.
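To see why these drivers compound rather than add, here is a minimal Python sketch. It assumes every turn resends the full conversation history and that each failed call is retried independently. The function names and all numbers are hypothetical illustrations, not ModelIndex internals.

```python
def conversation_input_tokens(turns: int, tokens_per_turn: int, system_tokens: int = 0) -> int:
    """Total input tokens across a conversation when each turn resends the full history.

    Assumption: every turn adds `tokens_per_turn` tokens to the history
    (user message plus the previous reply), and the whole history is billed
    as input on each call.
    """
    total = 0
    history = system_tokens
    for _ in range(turns):
        history += tokens_per_turn  # this turn's text joins the context
        total += history            # the whole context is sent as input
    return total


def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected calls per request when a failed call is retried up to `max_retries` times.

    Assumption: each attempt fails independently with probability `failure_rate`.
    """
    # Attempt k (0-indexed) only happens if the previous k attempts all failed.
    return sum(failure_rate ** k for k in range(max_retries + 1))


one_turn = conversation_input_tokens(turns=1, tokens_per_turn=100)
ten_turns = conversation_input_tokens(turns=10, tokens_per_turn=100)
print(ten_turns / one_turn)  # 55.0
print(expected_attempts(failure_rate=0.5, max_retries=2))  # 1.75
```

Under these assumptions, a conversation ten times longer bills 55 times the input tokens, not 10, and a partial failure that drops half of all calls nearly doubles the number of calls on top of that.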
Step 2 — Define a Planning Baseline
In ModelIndex, this is the Expected scenario.
Expected means:
- Normal usage patterns
- Planned retries
- Reasonable context growth
It is not optimistic.
It is not worst-case.
This is the number a team should feel comfortable budgeting against.
Step 3 — Acknowledge Risk Boundaries
Now look at Best and Worst.
These are not modes or performance levels.
They exist to answer one question:
How does cost behave when assumptions break?
- Best assumes cleaner traffic and fewer retries
- Worst compounds burst traffic, retries, and larger contexts
Worst is not failure — it’s where cost starts to feel uncomfortable.
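One way to picture the three scenarios is as the same cost formula run with different assumptions. The sketch below is illustrative only: the 25,000 conversations come from the scenario above, while the token counts, retry rates, burst multipliers, and price are made-up placeholders, and the formula is an assumption rather than how ModelIndex computes its numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    conversations: int            # conversations per month
    tokens_per_conversation: int  # hypothetical average, including context growth
    attempts_per_request: float   # 1.0 means no retries
    burst_multiplier: float       # extra volume from launches, outages, renewals


def monthly_cost(s: Scenario, price_per_1k_tokens: float) -> float:
    """Monthly cost as volume x tokens x retries x bursts (an assumed, simplified model)."""
    tokens = (
        s.conversations
        * s.tokens_per_conversation
        * s.attempts_per_request
        * s.burst_multiplier
    )
    return tokens / 1000 * price_per_1k_tokens


PRICE = 0.002  # hypothetical price per 1,000 tokens

scenarios = [
    Scenario("Best", 25_000, 4_000, 1.0, 1.0),
    Scenario("Expected", 25_000, 6_000, 1.1, 1.2),
    Scenario("Worst", 25_000, 12_000, 1.5, 2.0),
]

for s in scenarios:
    print(f"{s.name:<9} ${monthly_cost(s, PRICE):,.0f}")
```

With these placeholder inputs, Worst lands at roughly 4.5 times Expected although no single input more than doubled. That compounding is the point of looking at the boundaries and not only the baseline.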
Step 4 — Ask the Right Question
The useful question is not:
Which model is cheapest?
It is:
What breaks first when usage scales or degrades?
That’s the insight teams usually miss until after launch.
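One rough way to approach that question is a one-at-a-time stress test: start from the Expected assumptions, push each one to a stressed value by itself, and rank which change adds the most cost. This is a sketch with hypothetical parameter names and placeholder numbers, not a description of ModelIndex.

```python
def cost(params: dict[str, float]) -> float:
    """Monthly cost under an assumed, simplified model (all inputs hypothetical)."""
    tokens = (
        params["conversations"]
        * params["tokens_per_conversation"]
        * params["attempts_per_request"]
    )
    return tokens / 1000 * params["price_per_1k_tokens"]


def rank_drivers(expected: dict[str, float], stressed: dict[str, float]) -> list[tuple[str, float]]:
    """Stress one assumption at a time and rank by the extra cost it adds."""
    base = cost(expected)
    impact = {}
    for key, value in stressed.items():
        trial = {**expected, key: value}  # change only this one assumption
        impact[key] = cost(trial) - base
    return sorted(impact.items(), key=lambda kv: kv[1], reverse=True)


expected = {
    "conversations": 25_000,
    "tokens_per_conversation": 6_000,
    "attempts_per_request": 1.1,
    "price_per_1k_tokens": 0.002,
}
stressed = {
    "conversations": 40_000,            # usage scales
    "tokens_per_conversation": 15_000,  # context grows
    "attempts_per_request": 1.5,        # retries during partial failures
}

for name, extra in rank_drivers(expected, stressed):
    print(f"{name:<26} +${extra:,.0f}")
```

In this made-up example, context growth adds more cost than either higher volume or retries. The ranking depends entirely on the inputs, so the value is in running it with a team's own assumptions before launch.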
Step 5 — Make an Intentional Decision
With this view, teams typically choose to:
- Ship as planned
- Constrain usage
- Defer the feature
The important part is not the outcome.
It’s that the decision is intentional — not reactive.
Why This Matters
AI costs don’t surprise teams because they’re unpredictable.
They surprise teams because no one models real behavior early enough.
That’s what ModelIndex is for.