AI Strategy · Advanced · 7 min read

AI Pricing Experiments

AI pricing experiments test how to price AI products themselves and how to use AI to test pricing on non-AI products. The two are different sports. For pricing AI products: the canonical pattern is OpenAI's tier experimentation across Free, Plus ($20), Pro ($200), and Enterprise. Each tier tests willingness-to-pay against feature differentiation. For using AI to optimize pricing on other products: the pattern is Adobe-style ML personalization, where prices are tested per segment with bandits or A/B tests against a holdout. In both cases the trap is changing pricing without measurement infrastructure to detect cannibalization.

Also known as: Dynamic Pricing for AI, AI Tier Testing, ML Price Optimization

The Trap

The trap is launching a new AI tier (e.g., 'Pro at $200/mo') without a measurement plan to detect cannibalization from the existing tier. OpenAI's $200 Pro plan was an experiment to extract surplus from power users. But if 30% of $200 buyers were prior $20 Plus subscribers churning UP, that's cannibalization at a net positive; if 30% of NEW $20 Plus signups were diverted DOWN from Pro because of confusion, that's cannibalization at a net negative. Without segment-level cohort tracking, you don't know which.
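The two directions can be made concrete with back-of-envelope cohort arithmetic. The cohort sizes and 30% rates below are invented for illustration, not OpenAI data:

```python
# Hypothetical cohort arithmetic for the two cannibalization directions;
# the 1,000-buyer cohorts and 30% rates are assumptions.

def net_mrr_delta(movers: int, price_from: float, price_to: float) -> float:
    """MRR change when `movers` subscribers shift from one tier to another."""
    return movers * (price_to - price_from)

# Upward churn: 30% of 1,000 Pro buyers were Plus subscribers upgrading.
upgrade_delta = net_mrr_delta(movers=300, price_from=20, price_to=200)
print(upgrade_delta)    # 54000 -> cannibalization, but net positive

# Downward diversion: 30% of 1,000 would-be Pro signups took Plus instead.
diversion_delta = net_mrr_delta(movers=300, price_from=200, price_to=20)
print(diversion_delta)  # -54000 -> same headline signups, net negative
```

The headline signup counts can look identical in both cases; only the per-cohort deltas distinguish them, which is why segment-level tracking matters.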

What to Do

Before any AI pricing change, instrument three measurement layers: (1) cohort revenue per visitor (total revenue divided by visitors who saw the new pricing), (2) tier mix shift (% of new signups by tier vs. baseline), (3) net new MRR, accounting for cannibalization. Run the change as a 50/50 split for 4 weeks minimum. Real signal usually emerges at week 3 because of consideration cycles. Don't trust week 1 numbers: they over-index on power users.
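A minimal sketch of the three layers on a 50/50 split. All visitor, revenue, and signup figures here are invented, and the data shapes (a per-arm stats record, a price map) are assumptions for illustration:

```python
# Sketch of the three measurement layers for a 50/50 pricing split;
# all visitor, revenue, and signup figures are invented.
from dataclasses import dataclass

@dataclass
class ArmStats:
    visitors: int
    revenue: float
    signups_by_tier: dict[str, int]

def revenue_per_visitor(arm: ArmStats) -> float:
    """Layer 1: cohort revenue per visitor."""
    return arm.revenue / arm.visitors

def tier_mix(arm: ArmStats) -> dict[str, float]:
    """Layer 2: share of new signups by tier."""
    total = sum(arm.signups_by_tier.values())
    return {tier: n / total for tier, n in arm.signups_by_tier.items()}

def net_new_mrr(arm: ArmStats, prices: dict[str, float]) -> float:
    """Layer 3: MRR from new signups, all tiers netted together."""
    return sum(n * prices[tier] for tier, n in arm.signups_by_tier.items())

baseline = ArmStats(visitors=50_000, revenue=100_000.0,
                    signups_by_tier={"plus": 5_000})
treatment = ArmStats(visitors=50_000, revenue=124_000.0,
                     signups_by_tier={"plus": 4_200, "pro": 200})
prices = {"plus": 20, "pro": 200}

print(revenue_per_visitor(treatment) - revenue_per_visitor(baseline))  # 0.48
print(net_new_mrr(treatment, prices) - net_new_mrr(baseline, prices))  # 24000
```

Note that layer 3 can be positive while layer 2 shows the lower tier shrinking; seeing all three per arm is what exposes cannibalization.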

Formula

Net Pricing Lift = (Revenue per Visitor_new − Revenue per Visitor_baseline) × Visitor Volume
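Plugged through with invented numbers (a test where the new pricing lifts revenue per visitor from $2.00 to $2.48 over 100,000 monthly visitors):

```python
# Worked instance of the Net Pricing Lift formula; all figures are invented.
rpv_new = 2.48            # revenue per visitor under the new pricing, $
rpv_baseline = 2.00       # revenue per visitor in the holdout, $
visitor_volume = 100_000  # monthly visitors exposed to the pricing page

net_pricing_lift = (rpv_new - rpv_baseline) * visitor_volume
print(round(net_pricing_lift))  # 48000 -> ~$48k/month incremental revenue
```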

In Practice

OpenAI launched ChatGPT Pro at $200/month in December 2024 as a tier experiment, sitting alongside Plus at $20. Sam Altman publicly stated that OpenAI was losing money on Pro because power users consumed more inference than expected. That admission demonstrates the value of price experimentation: even at a 10x price multiple, willingness-to-pay among heavy users exceeded marginal cost in ways the company hadn't fully modeled. The experiment generated learning that retroactively reshaped capacity planning.

Pro Tips

  • 01

Test price up before testing price down. Testing up is almost always reversible (you can drop a price next quarter), but raising prices that you've previously dropped destroys trust.

  • 02

    Bandit algorithms (Thompson Sampling, contextual bandits) are appropriate for personalized pricing on commodity SKUs but inappropriate for SaaS subscriptions because the customer notices the variation. Use bandits for one-time purchases; use A/B tests for subscription tiers.

  • 03

Always grandfather existing customers. A 'price up' that backfills to existing customers triggers churn that wipes out the new pricing's gains. Adobe learned this in 2012 with the Creative Cloud transition; the customer outrage was a multi-year headwind.
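The bandit approach from tip 02 can be sketched minimally with Thompson Sampling over a few price points for a one-time-purchase SKU. The price points and the hidden demand curve below are simulated assumptions, not real data:

```python
# Thompson Sampling over three price points for a one-time-purchase SKU;
# the hidden demand curve is simulated for the demo.
import random

prices = [19, 29, 39]
wins = [0, 0, 0]    # purchases observed per price arm
losses = [0, 0, 0]  # non-purchases observed per price arm

def simulated_conversion(price: float) -> float:
    """Hidden linear demand curve (an assumption for this sketch)."""
    return max(0.0, 0.30 - 0.005 * price)

random.seed(7)
for _ in range(5_000):
    # Score each arm by price * a conversion rate sampled from its
    # Beta posterior, i.e. a draw of expected revenue per visitor.
    scores = [p * random.betavariate(wins[i] + 1, losses[i] + 1)
              for i, p in enumerate(prices)]
    i = scores.index(max(scores))
    if random.random() < simulated_conversion(prices[i]):
        wins[i] += 1
    else:
        losses[i] += 1

pulls = [wins[i] + losses[i] for i in range(3)]
print(pulls)  # traffic concentrates on the revenue-maximising price over time
```

This shifting of traffic between price points is exactly why the technique is fine for one-off checkouts and wrong for subscriptions: a subscriber who discovers they were routed to a higher arm notices.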

Myth vs Reality

Myth

“AI lets you optimize pricing in real time per customer”

Reality

It can, but most contexts don't reward it. SaaS customers compare notes; B2B procurement teams audit your pricing. Real-time personalized pricing is appropriate for hotels, airlines, and ride-share, and almost nothing else.

Myth

“A pricing test has clear winners after a week”

Reality

Pricing has long consideration cycles. A 1-week test over-weights impulsive buyers. Minimum 4 weeks for SaaS; 8+ weeks for enterprise. Anything shorter is noise.

Try it

Run the numbers.


Knowledge Check

You launch a $200 'Pro' tier alongside your existing $20 'Plus' tier. After 30 days: 200 Pro signups, but Plus signups dropped from 5,000/mo to 4,200/mo. What's the most likely net MRR impact?
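One way to run the month-one arithmetic, treating each missing Plus signup as $20/mo forgone and ignoring churn from the existing base (both simplifying assumptions):

```python
# Month-one arithmetic for the scenario above; assumes the Plus shortfall
# is revenue forgone at $20/mo and ignores existing-base churn.
pro_signups, pro_price = 200, 200
plus_shortfall, plus_price = 5_000 - 4_200, 20

gross_new_mrr = pro_signups * pro_price          # $40,000
forgone_plus_mrr = plus_shortfall * plus_price   # $16,000
net_new_mrr = gross_new_mrr - forgone_plus_mrr
print(net_new_mrr)  # 24000 -> likely net positive, pending cohort data
```

The caveat is the whole point of the concept: only cohort tracking tells you whether the 800 missing Plus signups became Pro buyers, bought nothing, or whether some Pro buyers are upgraded Plus subscribers worth only $180/mo incremental.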

Real-world cases

Companies that lived this.

Verified narratives with the numbers that prove (or break) the concept.


OpenAI

December 2024

mixed

OpenAI launched ChatGPT Pro at $200/month, a 10x premium over the existing $20 Plus tier. Sam Altman publicly admitted on X that the company was losing money on Pro because power users consumed more inference than the price covered, a candid signal that the tier was an experiment in extracting surplus from the heaviest users, not a settled pricing decision. The experiment yielded reusable knowledge: capacity planning changed, and subsequent product tiers (Sora, Operator) priced into the Pro bundle reflected the learning.

Pro Tier Price

$200/mo

Plus Tier Price

$20/mo (10x lower)

Disclosed Outcome

Losing money: tier underpriced for heavy users

Pricing experiments are valuable even when they 'fail' financially in the short term: they reveal willingness-to-pay segments and consumption patterns that are otherwise invisible.


Adobe (Sensei AI in Creative Cloud)

2017-present

success

Adobe uses ML (Sensei) to test promotional pricing and bundle composition across Creative Cloud customer segments. Personalized retention offers, free-month promos for at-risk users, and bundle upgrades are tested against held-out controls before being rolled out broadly. The discipline came from the painful 2012 Creative Cloud transition, when a forced subscription move triggered widespread customer backlash; Adobe rebuilt its pricing-testing infrastructure to never repeat that.

Pricing Test Cadence

Continuous

Methodology

Personalized offers + holdout

AI-driven pricing personalization works best for retention offers, not net-new acquisition pricing. Customers don't compare retention offers; they do compare list prices.



Beyond the concept

Turn AI Pricing Experiments into a live operating decision.

Use this concept as the framing layer, then move into a diagnostic if it maps directly to a current bottleneck.

Typical response time: 24h · No retainer required
