Context is Capacity: How to Stop Paying for an AI That Can't Remember
Forget the marketing hype. We break down the one number that actually determines an AI's power for real work, the context window, and why the gap between tiers is wider than you think.
You know that feeling when a colleague nods along in a meeting, then five minutes later asks a question you already answered? That is what an AI does when you run past its context window. It is not being difficult; it is out of short-term memory. If you are using these tools for real work, that lapse turns into wasted time, missed facts, and rework you quietly pay for in salaries and missed deadlines.
I have been hammer-testing the latest models across free, plus, pro, max, ultra, API, and enterprise plans. The gap is not just price. The size of the memory window changes what the tool can actually do for you.
What a context window really is
Think of the context window as the model’s working memory. It is the amount of text the model can “see” at once. That context includes all of the files you uploaded as well as all of the input and output as you go back and forth with the model. Numbers help:
32K tokens is roughly 24,000 words, about 50 pages.
128K tokens is around 96,000 words, about 200 pages.
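Those conversions can be sketched in a few lines. The ratios below, about 0.75 words per token and about 480 words per page, are the rough rules of thumb behind the numbers above; real tokenizers vary by model and by language.

```python
# Rough capacity math: ~0.75 words per token, ~480 words per page.
# These ratios are approximations; actual tokenization varies.

def window_capacity(tokens: int, words_per_token: float = 0.75,
                    words_per_page: int = 480) -> tuple[int, int]:
    """Return (approx_words, approx_pages) for a given window size."""
    words = int(tokens * words_per_token)
    pages = round(words / words_per_page)
    return words, pages

print(window_capacity(32_000))   # (24000, 50)  -> about 50 pages
print(window_capacity(128_000))  # (96000, 200) -> about 200 pages
```

Run your own document sizes through it before picking a tier.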
Now the catch. The window rolls. When you go past the limit, the earliest parts slide off the back to make room for new text. That is why a long chat starts sharp, then drifts. The model did not ignore you; it forgot the beginning.
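A minimal sketch of that rolling behavior, assuming the simplest policy of dropping the oldest messages first (real products may truncate, summarize, or refuse instead, and these "token" counts are naive word counts, not true tokenizer output):

```python
from collections import deque

def rolling_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within max_tokens."""
    window: deque[tuple[str, int]] = deque()
    used = 0
    for msg in messages:
        cost = len(msg.split())          # crude stand-in for token count
        window.append((msg, cost))
        used += cost
        while used > max_tokens:         # oldest messages slide off the back
            _, dropped_cost = window.popleft()
            used -= dropped_cost
    return [m for m, _ in window]

chat = ["project brief with key IDs",   # 5 "tokens"
        "question one",                 # 2
        "long answer " * 5,             # 10
        "follow-up question"]           # 2
print(rolling_window(chat, max_tokens=15))
```

Note what gets evicted first: the project brief with your key IDs. That is exactly why a long thread "forgets" the anchor facts you set at the start.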
Where things stand, September 2025
Here is the practical view by model and tier. I am focusing on what you get, not the marketing line.
ChatGPT-5: free at 16K tokens, Plus at 32K, Pro and Enterprise at 128K in the app. If you build on the API, you can step up to 400K. That difference matters if you work with board decks, multi-year contracts, or bundles of PDFs. If you are using the thinking model, the context window is 196K tokens across all tiers.
Claude 4.1 Opus: paid tiers standardize at 200K. The free tier does not include Opus. Enterprise teams using Claude Sonnet 4 can hit 500K. If you live in long policy docs or large codebases, that ceiling is useful.
Gemini 2.5 Pro: paid users sit at 1 million tokens, free at 32K. Google reportedly has 2 million on deck, the ceiling it previously offered under 1.5 Pro. Translate that to the real world, and you are talking roughly 1,500 pages loaded at once.
Grok 4: 128K tokens in the chat window, and 256K tokens via API.
I am not ranking them on philosophy or features here. This is about how much they can remember in one shot.
How to choose without overbuying
Use your actual workflow as the yardstick, not a feature grid.
If your tasks live under 30–40 pages at a time, Plus-level plans are fine.
If you regularly upload long docs, run multi-doc comparisons, or keep a complex thread going across a week, step up to the larger tiers.
If your team builds internal tools or automations, the API levels unlock larger windows and predictable throughput.
Sanity-check your common file sizes. A 38 MB discovery PDF will break a small context, even if your question is short.
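For that sanity check, a quick estimate works on extracted text, assuming the common rule of thumb of about 4 characters per token. (Byte size on disk is a poor proxy for PDFs, which are mostly layout and images; extract the text first.)

```python
# Quick fit check for a document's extracted text, assuming the
# common ~4 characters per token rule of thumb. Hypothetical helper,
# not any provider's official API.

def fits_in_window(text: str, window_tokens: int) -> bool:
    """Estimate whether a document's text fits a context window."""
    est_tokens = len(text) / 4
    return est_tokens <= window_tokens

# A ~50-page doc (about 480 words/page, ~5 chars per word with spaces):
fifty_pages = "word " * (50 * 480)
print(fits_in_window(fifty_pages, 32_000))    # True  -> 32K handles it
print(fits_in_window(fifty_pages, 16_000))    # False -> free tier will roll
```

If the check fails, split the document or step up a tier before you start the thread, not after it drifts.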
A rough ROI check helps. If two analysts spend an extra 30 minutes a day re-feeding context, that is five hours a week. At fully loaded cost, that is thousands per quarter to babysit memory. The math flips quickly in favor of the higher tier.
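Here is that math worked through. The $75/hour fully loaded rate and 65 workdays per quarter are illustrative assumptions; plug in your own numbers.

```python
# Worked version of the ROI check above. The hourly rate and
# workdays-per-quarter figures are assumptions for illustration.

def context_tax(analysts: int, minutes_per_day: float,
                hourly_rate: float, workdays_per_quarter: int = 65) -> float:
    """Quarterly cost of re-feeding context the model forgot."""
    hours = analysts * (minutes_per_day / 60) * workdays_per_quarter
    return hours * hourly_rate

print(f"${context_tax(2, 30, 75):,.0f} per quarter")  # $4,875 per quarter
```

Compare that figure to the annual price difference between tiers and the upgrade usually pays for itself within the quarter.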
A few operating tips
Front-load the anchor facts. Since the window rolls, put IDs, definitions, and key metrics at the top of your thread so you can paste them back if needed.
Bundle with intent. Combine the documents you actually want compared, not the entire folder, then ask for the exact comparison you need.
Reset on purpose. When a thread gets long and wobbly, start a fresh chat and re-post the core context. It feels redundant, but it saves time.
The headline is simple. Context is capacity. If you want the model to think with you across real-world inputs, buy the memory to match the work. The subscription cost is visible; the hidden tax of short windows is not, and that is the one you are already paying.
If you enjoyed this article, please subscribe to my newsletter and share it with your network! Looking for help to really drive the adoption of AI in your organization? Want to use AI to transform your team’s productivity? Reach out to me at: steve@intelligencebyintent.com