I Tried to Build One PowerPoint Template in Claude. It Failed Three Times.
Three failed attempts in Claude. One pass in Codex. Here's what that taught me about every AI decision I've made since.
How I’m Using Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro Right Now
TL;DR
Claude Opus 4.7 still owns writing, Word docs, and the memory-based answers that actually feel considered. GPT-5.5 took over my Excel work and beat Claude in a PowerPoint template build I tried in three different Claude surfaces first. Codex has wormed its way into my daily routine, which surprised me. Google is teasing something big. Stack today is roughly 60/30/10 across Claude, GPT-5.5, and Gemini. Ask me again in a month.
Here’s what I’m reaching for, and why.
What Still Belongs to Claude
Writing. Full stop.
GPT-5.5 got better, and I’d put it at a solid 8 out of 10 on prose. But Opus 4.7 still hits a 9-9.5. The difference shows up in voice, in pacing, in the moments where a sentence needs to breathe. I can feel the gap.
Memory is the other big one. When I ask Claude about me, my business, the shape of a project I’ve been working on for months, the answer comes back thoughtful. Not just accurate. Considered. GPT-5.5 has improved here too, but Claude still pulls ahead on the kind of reflection that makes an answer actually useful.
And Word documents are not even close. Out of the box, with no template guidance, the polish I get from Claude is in a different league. Margins, headings, structure, the way a memo reads when I open it. That matters when I’m running a legal demo or sending something to a partner.
Where GPT-5.5 Took Over
Excel is the cleanest example. I’ve moved every spreadsheet task to GPT-5.5. The formulas are tighter, the error checking is better, it follows direction more carefully, and it goes deeper into the work. Claude is still good. GPT-5.5 is great. When the answer needs to add up, I want the tool that adds up best (5.5-Pro!). It’s also amazing at Deep Research.
Then there’s the PowerPoint story.
The PowerPoint Template That Broke Claude
I had Claude Design build me a design spec for a website redesign. It nailed it. I handed that spec to Claude Code and we rebuilt the entire Intelligence by Intent site from it. You can see the result at intelligencebyintent.com.
Then I tried something simple. I asked Claude to take that same design spec and turn it into a matching PowerPoint template. Same colors, same fonts, same look and feel as the website.
Three attempts. Three complete fails.
It would generate slides, sure. But it never touched the actual underlying template. The master slides, the theme, the color palette. None of that got built. I tried in the Claude desktop app. I tried in Claude Cowork. I tried in Claude Code with Opus 4.7 on max thinking. Same result every time. Slides on top of a default theme. Unusable.
Out of frustration, I opened the latest Codex app with GPT-5.5. Same prompt, same spec.
One shot. Exactly what I asked for. Master slides, theme colors, layouts, the whole thing. Done in a single pass.
I came away genuinely impressed. So impressed that Codex has worked its way into my daily routine. I’m extending my memory system with hooks and MCPs to plug into Codex, and I’ll write that up properly soon. Most general users won’t touch Codex, and I get that. But for me, it’s become hard to live without, and it only took a few days. I prefer it to the Claude desktop app right now. That’s a sentence I didn’t expect to write a month ago.
Where’s Google?
Google is sending signals. They’re hinting at a new model, and from what I can tell, they’re putting real focus on coding. That’s the area where they’ve been weakest. Their design output is already excellent. The raw thinking? Amazing. The raw code, less so. If they close that gap, things get interesting.
I’ll also say what I’ve said before. Gemini in Workspace still feels like a second-class experience compared to consumer Gemini (where’s my NotebookLM integration as the new ‘projects’?). Where’s memory for Workspace users? The polish, the pacing, the responsiveness all feel a step behind. I want that to change. I really do, because I love this model, but they make it hard to use.
A couple of fresh data points though. Google I/O is around the corner, and they just held their Cloud Next event in Vegas, which I owe you a write-up on. And I just confirmed something new (at least new to me!): you can now generate .xlsx, .docx, and .pptx files directly from Gemini (when did this drop and how did I miss it?). That’s a real shift. Combined with the model rumors, I think we’re about to see something serious from Google. I cannot wait.
Images Update
I was blown away by Nano Banana 2, but now ChatGPT Images 2.0 has become my default. It’s been more consistent than Nano Banana 2 from Google, especially for the kind of professional and brand-adjacent images I generate day to day. Google will catch up. They always do, and I expect a jump in the next month.
What This Means for My Stack
A month ago: 85-90% Claude, 10-15% everything else.
Today: roughly 60% Claude, 30% GPT-5.5 (almost all in Codex), 10% Gemini (almost all in AI Studio with a paid API key and high thinking turned on).
I don’t pick a tool out of loyalty. I pick the tool that gives me the best answer for the work in front of me. As the models change, my stack changes with them.
What to Do This Week
If you’re running a single-tool AI strategy, here are three moves worth making before Friday.
1. Run a head-to-head test on your single most important workflow. Whatever you do most often, push it through both Claude and GPT-5.5 and compare the outputs honestly. Not vibes. Outputs.
2. If you do anything serious in Excel, try GPT-5.5 with extended (or heavy) thinking. Just once.
3. If you’ve never opened Codex, install it and spend an hour on one real task. Not a demo task. Then form an opinion.
The tools are moving fast. The right answer this month may not be the right answer next month. Stay flexible.
More soon. A deep dive on Codex is coming, and a recap of Google Cloud Next is on the way. And if you have a minute, take a look at the new Intelligence by Intent site and let me know what you think.
My stack will probably look different a month from now. I’m okay with that.
The firms making real progress on AI right now aren’t picking a vendor and settling in. They’re running honest tests on their actual work, switching when a better tool shows up, and being willing to be wrong about what they thought six weeks ago. That last part is the hard one. Most partnerships are built to commit to a stack and move on. The cost of doing that is going up, not down.
If you want to compare notes on your stack, or pressure-test the workflow that matters most to your firm, send me a note at steve@intelligencebyintent.com.