The Subsidy Era of Agentic Coding Just Ended. Nobody Built FinOps for It.
hatch

The best deal in software just ended, and the replacement bill is non-deterministic.
For two years, flat-rate coding-agent pricing was the deal of the decade. Twenty bucks a month, point an agent at your codebase, let it churn. You were almost certainly consuming more value than you paid for. That wasn’t generosity. It was customer acquisition, financed by venture money, priced below cost on purpose. And in a two-week window this spring, two major vendors quietly clawed it back, while a marquee buyer blew through its budget and imposed a hard cap.
A week ago I wrote that enterprise AI found product-market fit — that the tell was customers abandoning subscriptions to pay by the token, because the subscription ceiling had become a constraint on their business. That post was written from the seller’s chair. The signal was good news for the vendor: PMF, real fit, infrastructure-grade demand.
This one is written from yours. Because here’s the part that wasn’t fun to write: vendor PMF is buyer-side metering. The same shift that made Anthropic a near-trillion-dollar company is the shift that’s about to put a per-task meter on your engineering org — and almost nobody on the buyer side has built the dashboard to read it.
Quick roadmap:
- Why flat-rate agent pricing was always a loss-leader, and the exact two-week window it disappeared in
- The buyer-side flip of the PMF story: seller fit equals your metered bill
- The FinOps gap nobody has filled — why “fails silently” is the scariest phrase in the changelogs
- Three proof points: Copilot, Claude Code, and the Uber number that should stop you cold
- The playbook, because we’ve run exactly this fire drill before — it was called EC2
The subsidy was the product
Be honest about what flat-rate agentic pricing actually was. It was not a price. It was a bet that getting you hooked was worth eating the inference cost while you got hooked.
Non-deterministic agents are expensive to run, and the cost is wildly uneven per task. A trivial one-line fix might cost a penny. A “refactor this module and write the tests” run can chew through dollars of inference in a single invocation, looping, re-reading context, retrying. The vendor knew this. You, on the $20 plan, did not — because the whole point of a flat rate is that it hides the variance.
Translation: you were never buying a subscription. You were being paid to form a habit. The bill was always going to come due. It just came due in a two-week window, all at once, across the industry:
- June 1 — Microsoft moves Copilot off its all-you-can-eat posture to usage-based AI Credits.
- June 15 — Anthropic ends flat-rate access for the Claude Code / agent-SDK tier, replacing it with per-user credit pools. No rollover. And when the pool depletes, requests don’t error loudly — they fail silently.
- And bracketing both — Uber, having burned its entire 2026 agent budget in four months, capped engineers at $1,500/month per tool.
Two vendors and one marquee buyer. Two weeks. One direction. That is not coincidence. That is an industry retiring a CAC line item on the same schedule.
The flip: seller’s PMF is your meter
A week ago the story was: customers are converting from subscriptions to pay-per-token because they’re consuming so much the ceiling became a constraint. I called that an EC2-level signal, and I stand by it.
But run that sentence again from the buyer’s seat. “Consuming so much that pay-per-token makes sense” is the vendor’s victory condition. From where you sit, it reads differently: you are now metered, and your consumption is non-deterministic.
The seller knows their margin to four decimal places. That’s why they can confidently move you onto a meter — they’ve done the math. The question is whether you’ve done yours. When the contract flips from flat to metered, the entire risk of cost variance transfers from the vendor’s balance sheet to yours. The vendor de-risked. You got handed the variance, and the variance is the expensive part.
Vendor PMF means a metered bill for the buyer. The subscription protected you from the agent’s appetite. The meter removes the protection and hands you the appetite.
That’s the sequel. The PMF post was the diagnosis from the lab. This is the patient getting the invoice.
You cannot govern what you cannot see
Here’s the uncomfortable structural fact. A traditional software license is deterministic: N seats, fixed price, you can model it on a napkin. A non-deterministic agent has no per-task cost model at all. The same prompt, run twice, can cost different amounts. Two engineers doing “the same” work can differ by an order of magnitude. There is no unit you can hand finance and say “this is what a query costs,” because there isn’t one.
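If there is no single unit cost, the honest thing to hand finance is a distribution. Here is a minimal Python sketch of that idea, assuming you can log input and output token counts per run. The prices and token counts are made up for illustration and are not any vendor's actual rates.

```python
from statistics import mean

# Hypothetical per-million-token prices in USD (assumption, not real vendor rates).
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one agent run under the assumed prices."""
    total = (input_tokens * PRICE_PER_MTOK["input"]
             + output_tokens * PRICE_PER_MTOK["output"])
    return total / 1_000_000


def summarize(costs: list[float]) -> dict[str, float]:
    """Describe the spread of costs instead of quoting one 'cost per query'."""
    ordered = sorted(costs)
    return {
        "min": ordered[0],
        "mean": mean(ordered),
        "max": ordered[-1],
        "spread": ordered[-1] / ordered[0],  # most expensive run vs. cheapest
    }


# Illustrative (invented) token counts for the same prompt run four times.
runs = [(40_000, 2_000), (55_000, 3_500), (310_000, 18_000), (48_000, 2_500)]
costs = [run_cost(i, o) for i, o in runs]
summary = summarize(costs)
print(summary)
```

The number to watch is the spread, not the mean. A flat rate hid that spread; a meter bills you for it.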
Now layer on the detail buried in the Anthropic changelog: when a credit pool depletes, requests fail silently. No alert. No invoice spike to flag it. The agent just quietly stops being as good, and the engineer relying on it has no idea why their afternoon suddenly got slow and dumb.
I’ve been beating this drum for years: the job of an operator is to raise the visibility bar — to make the invisible measurable before it becomes a 2 a.m. page. “Fails silently when depleted” is the exact opposite of that. It’s a system designed to be invisible at precisely the moment it’s costing you the most. You can’t govern what you can’t see, and right now most orgs can’t see any of it.
I learned this the expensive way back in March, watching a token meter spin on a single agent task I’d assumed was free. The gap is not theoretical, and the June reporting finally put a number on it: one developer, one tool, a plan that nominally cost $20, and roughly $236 of measured API value running underneath it. That gap was the subsidy. On June 15, for a lot of teams, that gap closes and lands on a cost center that has no model for it.
The Uber number
If you want the whole thesis in one data point, it’s Uber.
Uber burned its entire 2026 agent-tooling budget in four months. Four. Then it capped engineers at $1,500 per month, per tool. Sit with the magnitude: at that cap, a fully-loaded senior engineer’s agent allowance runs to something like 11% of their total compensation. That is not a software line item anymore. That is a material fraction of the cost of the human.
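The percentage depends on what you assume total compensation to be, and the cap itself does not tell you that. Here is a rough sketch of the arithmetic so you can rerun it with your own numbers. The compensation inputs are illustrative assumptions, not Uber data.

```python
MONTHLY_CAP_PER_TOOL = 1_500             # USD, the per-tool cap cited above
annual_cap = MONTHLY_CAP_PER_TOOL * 12   # 18,000 USD per tool per year


def cap_share(total_comp: float, tools: int = 1) -> float:
    """Annual agent allowance as a fraction of total compensation."""
    return annual_cap * tools / total_comp


# Total compensation at which one tool's cap equals 11% (back-solved, not sourced).
implied_comp = annual_cap / 0.11

# Sensitivity check with a hypothetical 300,000 USD package.
one_tool = cap_share(300_000)
two_tools = cap_share(300_000, tools=2)

print(f"annual cap per tool: ${annual_cap:,}")
print(f"comp implied by 11%: ${implied_comp:,.0f}")
print(f"share at $300k: {one_tool:.0%} (one tool), {two_tools:.0%} (two tools)")
```

Because the cap is per tool, the share scales with how many tools an engineer actually uses. Plug in your own fully-loaded figure before quoting a percentage to finance.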
And Uber is one of the most operationally disciplined engineering organizations on the planet. If they blew the budget in a third of the year and had to slam a hard cap on it, what exactly is your napkin math protecting you from?
The answer, for most companies, is nothing. There is no cap. There is no per-team allocation. There is no tag on the spend. There’s a finance team that thinks it bought a few hundred subscriptions, and an engineering org about to discover it bought a metered utility — with no meter installed on the buyer’s side of the line.
We have run this fire drill before
The good news: this problem is not new. The asset is new; the discipline is not.
In 2006, EC2 turned servers from a capital purchase into a metered utility, and within a few years a generation of companies had set fire to their cloud bills because they treated elastic, metered infrastructure like the fixed-cost servers it replaced. Untagged instances. Forgotten dev boxes spinning at 3 a.m. Nobody able to answer “what does this workload cost.” We built an entire discipline to fix it. We called it FinOps: tag everything, attribute spend to a team, set budgets and alerts, make the meter visible before the bill, kill what nobody can justify.
Coding agents are the new untagged EC2 sprawl. Same shape, faster clock. Non-deterministic spend, no attribution, no alerting, failures that hide instead of page. Every lesson cloud cost governance taught us applies, and it applies now:
- Tag and attribute. Spend you can’t assign to a team is spend you can’t govern. Get per-team, per-tool visibility before the metered tiers go live.
- Budget and alert, loudly. “Fails silently” is the vendor’s default. Your job is to make depletion impossible to miss — your alert, not their silence.
- Find the rate-limiting workflow. As the PMF post argued for model choice, know specifically where agent spend buys outcomes and where it is just churning. Cap the churn, fund the outcomes.
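The first two items in that list can be sketched in a few lines of Python. This is a minimal illustration, assuming you can export usage records that carry a team tag and a dollar cost. The `UsageEvent` shape, the team and tool names, the budgets, and the 80% warning threshold are all hypothetical, not any vendor's export format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One tagged usage record (hypothetical shape)."""
    team: str
    tool: str
    cost_usd: float


def attribute(events: list[UsageEvent]) -> dict[tuple[str, str], float]:
    """Tag and attribute: total spend per (team, tool)."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for e in events:
        totals[(e.team, e.tool)] += e.cost_usd
    return dict(totals)


def budget_alerts(totals, budgets, warn_at: float = 0.8) -> list[str]:
    """Budget and alert, loudly: flag unbudgeted, near-limit, and depleted spend."""
    alerts = []
    for (team, tool), spent in sorted(totals.items()):
        budget = budgets.get((team, tool))
        if budget is None:
            alerts.append(f"UNBUDGETED {team}/{tool}: ${spent:.2f} with no budget set")
        elif spent >= budget:
            alerts.append(f"DEPLETED {team}/{tool}: ${spent:.2f} of ${budget:.2f}")
        elif spent >= warn_at * budget:
            alerts.append(f"WARNING {team}/{tool}: ${spent:.2f} of ${budget:.2f}")
    return alerts


# Invented sample data for illustration.
events = [
    UsageEvent("payments", "agent-a", 600.0),
    UsageEvent("payments", "agent-a", 250.0),
    UsageEvent("search", "agent-a", 1200.0),
    UsageEvent("infra", "agent-b", 40.0),
]
budgets = {("payments", "agent-a"): 1000.0, ("search", "agent-a"): 1000.0}

totals = attribute(events)
alerts = budget_alerts(totals, budgets)
for line in alerts:
    print(line)
```

The design choice that matters is the unbudgeted branch: spend with no owner gets an alert of its own rather than being skipped, since that is the spend nobody is watching.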
FinOps for agents is not a nice-to-have you’ll get to next quarter. It is infrastructure. It’s the meter on your side of the wire. And the vendors have already installed theirs.
You budgeted for a subscription. What you actually bought is a metered, non-deterministic utility: one that hides its own cost, fails without telling you, and just had its training-wheels subsidy removed by two major vendors in a fortnight.
The PMF post asked which side of your competitors’ deals you were on. This one asks something more immediate: when the meter starts on June 15, who in your organization can tell you what a query costs, which team spent it, and what happens when the pool runs dry?
If the answer is “nobody,” that’s not a pricing problem. That’s a governance gap. And you have until June 15 to close it.