You Picked the Premium AI. The App Switched It Back.
It happened to four of my clients in two weeks. The picker quietly flipped, and nobody noticed.
TL;DR: AI companies have buried the best models behind confusing pickers, and most apps default new users to the cheapest, fastest version. Worse, when paying users do pick the premium model, several apps silently reset them on the next chat. The result is that senior professionals at firms paying $20-$125 per seat per month are getting entry-level answers without realizing it. The fix is one rule per platform. Claude users want Opus 4.7 with Adaptive Thinking on. ChatGPT users want Thinking at the deepest level, or Pro. Gemini users want Pro. Grok and Copilot users, same logic: skip Auto, find the deepest reasoning option, set it. Then check the picker on every new chat, because the apps don’t always remember.
I was presenting to a law firm this week, walking through how the different Claude models work and when you’d use each one. About ninety seconds in, the lead attorney held up a hand.
“Stop. Just stop. We don’t need to know. We don’t care. We will never change the model. Just tell us the one we should use.”
She wasn’t being rude. She was being honest. The firm runs Claude Teams. They handle work where quality matters more than token economics. They weren’t in the mood for a recipe. They wanted a rule.
I told her: Opus 4.7 with Adaptive Thinking on. That’s the one. Set it. Don’t touch it again.
And the thing is, four other clients had told me versions of the same story in the past two weeks. A managing partner pressure-testing a hiring decision. A COO trying to draft a board update. A practice group leader summarizing a deposition. A general counsel reviewing a vendor contract. Different platforms, different uses. Each one, in a slightly bewildered tone, said something like: “There are four options in the menu and I have no idea which one to pick, so I just use whatever comes up first.”
That’s the problem. And it’s quietly costing them the thing they’re paying for.
The labs built complexity into the front door
Open Claude. The picker shows Opus 4.7, Sonnet 4.6, Haiku 4.5, plus an Adaptive Thinking toggle. ChatGPT: Instant, Thinking (with multiple levels of thinking inside it), Pro. Gemini: Fast, Thinking, Pro. Grok: Auto, Fast, Expert, Grok 4.3 Beta. Copilot is its own animal: Auto, Quick Response, Think Deeper, Opus, and a GPT option that opens five more sub-models underneath it.
If you live inside AI every day, you know what these mean. If you open Claude twice a day to pressure-test a memo, you don’t. You see a dropdown and pick whatever’s at the top. And in most apps, the top option is a cheaper, faster, lower-capability model designed to keep cloud costs down across millions of free users.
I get why the labs do this, by the way. They’re balancing rate limits, latency, infrastructure cost across hundreds of millions of free accounts, and what their best models can crank out per second. There’s also the inference economics question, which I’ve spent more time than I probably should thinking about: every reasoning-model query costs the labs meaningfully more compute than an instant answer, and at scale that math has to come out of somebody’s pocket. From their seat, defaulting most users to the lighter model is a survival tactic. From the seat of the executive who paid full freight for the premium subscription, it’s something else entirely.
The math is worse than you’d think. You’re probably paying $20-$125 per user per month for one of these subscriptions. Fifty seats is $12,000 a year at the bottom of that range, and up to $75,000 at the top. Most of those seats are getting answers from a model a generation behind what they’re paying for.
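If you want the arithmetic spelled out, here is a quick sketch. The per-seat prices are the subscription range quoted above; the 50-seat firm size is illustrative, not a real client.

```python
# Annual subscription cost for a firm, using the per-seat price
# range quoted in the article ($20-$125 per user per month).
def annual_cost(seats: int, per_seat_monthly: float) -> float:
    return seats * per_seat_monthly * 12

seats = 50  # illustrative firm size
low = annual_cost(seats, 20)    # cheapest common tier
high = annual_cost(seats, 125)  # top tier
print(f"{seats} seats: ${low:,.0f}-${high:,.0f} per year")
# 50 seats: $12,000-$75,000 per year
```

Every one of those dollars buys access to the premium model; the question is whether anyone in the firm is actually using it.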
It’s like buying a sports car and never shifting out of first.
And the apps actively work against you
Here’s the part that genuinely drives me nuts.
Even when a user has done the right thing and picked the premium model, set their preference, and gotten comfortable with the answers, the apps will quietly reset them. Open a new chat in several of these clients and the model selector flips back to Auto, or Instant, or Fast. The default the user picked is gone. They’re back on the cheap model, and they often don’t notice for a while.
I had three clients get caught by this in the past week. They opened new chats to continue work they’d been doing successfully the day before, got back answers that were thin or partly wrong, and emailed me with some version of “WTF, I thought we were paying for the good model.” They were. The app had silently flipped them back to the entry-level default.
The labs will tell you this is about managing cost across hundreds of millions of users. Fine. From their seat, that’s true. From the customer’s seat, the message is: we sold you the premium tier, then handed you the entry-level product on the next page. Nobody designs a paid subscription that way on purpose. Or if they do, they shouldn’t.
So here’s what I tell clients. Claude: Opus 4.7 with Adaptive Thinking on. ChatGPT: pick Thinking and run it at the deepest level your plan offers, or jump to Pro if you have access. Gemini: pick Pro (you’ll be on 3.1 Pro). Grok and Copilot users, same logic. Skip Auto. Find the deepest reasoning option in the menu. Leave it set.
Then check the picker every time you open a new chat. Yes, that’s a tax. Yes, it’s annoying. It still beats the alternative.
“But what about quick lookups?”
I had this exact exchange with an associate at a client firm last month. “Isn’t the cheap model fine for emails and quick stuff?” Honest answer: if a question is genuinely small enough that the cheap model is “fine,” Google can answer it faster than any LLM. The reason you opened Claude is because the question wasn’t small. You wanted synthesis. You wanted real writing. You wanted something thought through.
For all of that, you want the best model in the box. Every time.
The standard advice is to match the model to the task. I don’t buy it. Asking busy executives to triage their questions before they ask them is asking the wrong people to do the wrong work. They want to type a question and get a great answer. The picker should never be their problem.
The thinking models actually work the problem before they answer. They follow multi-step instructions. They catch their own mistakes. They hallucinate less. The gap between a thinking model and an instant one isn’t subtle on real questions. It’s the difference between an answer you can use and one you have to redo.
What to do Monday morning
Send your firm one note. Tell people which model to use on each platform you’ve licensed, tell them to set it as the default, tell them to check the picker at the start of every new chat because the app won’t always remember, and tell them why in one line: you’re paying for the top model, so use it.
That’s the whole memo. It pays for itself the first time someone asks a substantive question and gets a substantive answer instead of a confident-sounding skim.
The labs need to fix both halves of this. Premium subscriptions should default to premium models, and a user’s chosen model should stick when they open a new chat. The fact that neither is true is on them. The fact that you’re letting it stay that way at your firm is on you.
You’re paying for the best AI in the box. Make sure your firm is using it. Send the memo, set the rule, and check the picker on every new chat until the labs fix what they should have fixed at launch. If you want help rolling a one-page rule out across your seats without turning it into a meeting about a meeting, reach out: steve@intelligencebyintent.com. The model you picked should be the model you get.


