The Instruction File That Replaced My Analyst
I got tired of typing the same prompts every quarter. So I stopped.
Image created by ChatGPT Image 1.5
How I Automated My RevOps Analysis with Claude Code (And You Can Too)
TL;DR: I turned a manual, hour-long metrics analysis into a one-click automation using Claude Code. The key insight: create an instruction file, point Claude at your data, and let it build Excel outputs with real formulas. If you need even more horsepower, subagents can run multiple analyses in parallel. Here’s the exact workflow and instruction file I used.
The Problem Every RevOps Leader Knows
For the past seven months, I’ve been running RevOps for a B2B pet wellness and insurance company. Great work, interesting challenges, amazing leadership. And like most people in this role, I spend a lot of time in spreadsheets.
Back in December, I used Claude to crunch two years of sales data. Pipeline conversion rates. Win rates. Top loss reasons. RFP performance. Broker versus direct deals. State-by-state breakdowns. Industry splits. The whole picture.
I’d download the data, open Claude, and type in prompts one at a time. “Calculate pipeline conversion by quarter.” “Now break it down by broker versus direct.” “Show me win rates.” Claude did a great job. Really impressive Excel analysis.
But here’s the thing. When January rolled around, and I needed to run the same analysis on fresh data, I found myself typing in the same prompts all over again. Same questions, same structure, same format. Just different numbers.
There had to be a better way.
Enter Claude Code
I’d been using Claude Code for all sorts of automation tasks. Renaming batches of files. Redacting PDFs. Consolidating spreadsheets from multiple sources. So I thought: why not point it at my RevOps data?
First attempt: I loaded up Claude Code, told it where my data file lived, and entered the same prompts I’d been typing into regular Claude. It worked fine. But I was still doing the typing. No real time savings there.
Second attempt: I created an instruction file. A markdown document with every analysis I wanted, spelled out step by step. Then I told Claude Code to read that file and execute it. That’s when things got interesting.
The prompt looked something like this:
“I have created a revops analysis instruction file. It’s called ‘revops_instructions.md’. I want you to read that and do all of the analysis step by step. The source data file is called ‘revops_data.xlsx’ and I want you to create a new output file called ‘revops_output.xlsx’. For each analysis in that instruction file create a new tab.”
Claude read the file, processed the data, and built me a complete Excel workbook with eight analysis tabs. It even created an executive summary and charts when I asked.
One prompt. Eight analyses. Professional formatting. Done.
The Subagent Trick (When You Need More Horsepower)
My worry was context limits. What happens when the analysis gets so big that Claude runs out of room to think? That’s when I discovered subagents.
Here’s what most people don’t realize: each subagent in Claude Code gets its own full context window. So instead of one Claude working through eight analyses sequentially, you can spin up eight Claudes working in parallel.
The prompt change was tiny:
“I have created a revops analysis instruction file. It’s called ‘revops_instructions.md’. I want you to read that and do all of the analysis. For each analytic step in the file, launch a subagent to run that analysis. There are 8 of them total. The source data file is called ‘revops_data.xlsx’ and I want you to create a new output file called ‘revops_output.xlsx’. For each analysis in that instruction file create a new tab.”
I watched it spin up eight subagents simultaneously. Fast. Impressive to watch. All the results were compiled into one spreadsheet.
But there was a catch.
The Formula Trade-off
I wanted my spreadsheets to have actual Excel formulas in the cells. Not just calculated values, but the underlying logic. That way, if the source data changes, the numbers update automatically. That’s how real analysts work.
The subagent approach couldn’t deliver this. The technical reason: openpyxl can’t handle multiple processes writing to the same file, so parallel agents can’t build one workbook together. Instead, each would calculate its answers in memory and paste in the final numbers. Accurate, but brittle.
So I went back to the single-agent approach with a twist: the instruction file now explicitly requires formulas. Every calculated cell must contain an actual Excel formula referencing the source data tab. No hardcoded numbers. No Python calculations pasted as values.
The result? A workbook I can trust. Change the source data, and all the metrics recalculate automatically.
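Concretely, the difference comes down to what gets written into each cell. Here’s a minimal openpyxl sketch of the two approaches (the sheet name and cell references are illustrative, not from my actual workbook):

```python
# A minimal sketch of the two approaches with openpyxl.
# Sheet names and cell references are illustrative.
from openpyxl import Workbook

wb = Workbook()
src = wb.active
src.title = "Source Data"           # raw data gets copied here first
ws = wb.create_sheet("Pipeline Conversion ($)")

# Subagent approach: paste a pre-calculated number. Accurate today, stale
# the moment the source data changes.
# ws["B5"] = 1234567.89

# Formula approach: openpyxl stores the string; Excel evaluates it on open,
# so the cell recalculates whenever the Source Data tab changes.
ws["B5"] = "=SUMIFS('Source Data'!$D:$D,'Source Data'!$Z:$Z,A5)"

wb.save("revops_output.xlsx")
```

One catch worth knowing: openpyxl never computes formula values itself, which is why the instruction file below ends with a recalculation step.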
The Complete Instruction File
I’m a believer in sharing what works. Below is the actual instruction file I created. If you don’t need formulas, spin up the subagents for speed. If you want a living spreadsheet that updates with new data, use this version.
## These are the instructions for running pipeline analysis on this data in a consistent format.
## There are 8 different analyses in here.
## For each analysis, create a new tab in the output spreadsheet.
## CRITICAL REQUIREMENTS - MUST FOLLOW:
### 1. FORMULAS ONLY - NO HARDCODED CALCULATIONS
**YOU MUST USE EXCEL FORMULAS FOR ALL CALCULATIONS. DO NOT CALCULATE IN PYTHON AND PASTE NUMBERS.**
- Every calculated cell MUST contain an Excel formula (e.g., =SUM(), =SUMIFS(), =COUNTIFS(), etc.)
- DO NOT use Python/pandas to calculate values and hardcode them
- DO NOT use subagents to pre-calculate and store results
- Example CORRECT: Cell B5 contains "=SUMIFS('Source Data'!$D:$D,'Source Data'!$Z:$Z,A5)"
- Example WRONG: Cell B5 contains "1234567.89" (hardcoded number)
### 2. WORKBOOK STRUCTURE
**First tab must be "Source Data" containing all raw data from the input file:**
- Copy ALL rows and columns from the source Excel file
- Keep all original column headers
- This is the ONLY place where raw data appears
- All other tabs reference this data with formulas
**Then create analysis tabs (each with formulas only):**
- Pipeline Conversion ($)
- Pipeline Conversion (Units)
- Win Rate Analysis
- RFP Timeline
- Lost RFPs
- Sales Cycle Analysis
- Industry Analysis
- Geographic Analysis
- Executive Summary (last, after all analyses are done)
- Key Insights - Charts (final tab)
### 3. FORMULA EXAMPLES
Use Excel formulas like these (adapt to your specific needs):
- **Sum with criteria**: `=SUMIFS('Source Data'!$D:$D, 'Source Data'!$Z:$Z, "2024-Q1", 'Source Data'!$AD:$AD, "")`
- **Count with criteria**: `=COUNTIFS('Source Data'!$Z:$Z, "2024-Q1", 'Source Data'!$G:$G, "Contract Signed")`
- **Division with error handling**: `=IF(B5=0, 0, C5/B5)`
- **Percentage**: `=IF(B5=0, 0, C5/B5)` (then format cell as percentage)
### 4. WORKFLOW
1. Use openpyxl to create workbook
2. Create "Source Data" tab and copy all source data
3. Create analysis tabs with formulas as STRINGS (e.g., ws['B5'] = '=SUM(A1:A4)')
4. Save the workbook
5. Run: `python recalc.py output.xlsx` to calculate all formula values (a sketch of this helper follows below)
6. Verify no formula errors (#REF!, #DIV/0!, etc.)
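**Note on `recalc.py`:** openpyxl writes formula strings but never evaluates them, so the saved workbook has no cached values until something recalculates it. A minimal sketch of what this helper could look like, assuming xlwings and a local Excel installation:

```python
# recalc.py - hypothetical sketch, assuming xlwings + a local Excel install.
# Opens the workbook in a real Excel instance, forces a full recalculation,
# and saves so every formula's computed value is cached in the file.
import sys

import xlwings as xw

def recalc(path: str) -> None:
    app = xw.App(visible=False)
    try:
        book = app.books.open(path)
        app.calculate()  # recalculate every formula in the open workbook
        book.save()
        book.close()
    finally:
        app.quit()

if __name__ == "__main__":
    recalc(sys.argv[1])
```

Run it as `python recalc.py revops_output.xlsx`. If Excel isn’t available locally, LibreOffice in headless mode is another possible route.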
### 5. WHAT NOT TO DO
❌ Do NOT calculate in Python and paste numbers
❌ Do NOT use pandas groupby/sum and hardcode results
❌ Do NOT store pre-calculated values in pickle files
❌ Do NOT use subagents to calculate - they will hardcode values
✅ DO write Excel formulas that reference the Source Data tab
✅ DO use openpyxl to write formulas as strings
✅ DO run recalc.py to populate formula values
### 6. VERIFICATION STEPS BEFORE WRITING FORMULAS
**ALWAYS verify these before creating formulas:**
1. **Check actual column positions** - Load source data with pandas and print column indices
```python
import pandas as pd

df = pd.read_excel('source.xlsx')
for i, col in enumerate(df.columns):
    print(f"Column {i+1} (Excel): {col}")
```
- Excel columns are 1-indexed (Column A = 1, Column B = 2, etc.)
- Python DataFrame columns are 0-indexed (Column A = index 0)
- To convert: Excel column = Python index + 1
- Example: If RFP is at df.columns[41], it's Excel column 42 (AP), not 41 (AO); see the letter-mapping sketch at the end of this section
2. **Verify data values** - Check actual values before writing criteria
- For RFP: Check if values are 0/1, True/False, or text
- For Stage: Check exact text (e.g., "Contract Signed" vs "Closed Won")
- For dates: Verify they're datetime objects, not strings
3. **Test formula logic** - Calculate sample results in Python first to verify logic
```python
# Test: How many RFPs created in Jan 2024?
test = df[(df['Created Date'].dt.month == 1) &
          (df['Created Date'].dt.year == 2024) &
          (df['RFP'] == 1)]
print(f"Expected count: {len(test)}")
```
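A small helper sketch for step 1, mapping each pandas column index to its Excel letter with openpyxl's `get_column_letter` (the file name is illustrative):

```python
# Hypothetical verification helper: print each column's Excel letter next to
# its 0-based pandas index so formula references can be double-checked.
import pandas as pd
from openpyxl.utils import get_column_letter

df = pd.read_excel('revops_data.xlsx')
for i, col in enumerate(df.columns):
    # pandas index i is 0-based; Excel columns are 1-based.
    print(f"{get_column_letter(i + 1)} (index {i}): {col}")
```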
### 7. EXCEL FORMULA BEST PRACTICES
**Use specific ranges, NOT entire columns:**
- ❌ WRONG: `=SUMPRODUCT((MONTH('Source Data'!$Y:$Y)=1)*...)`
- ✅ CORRECT: `=SUMPRODUCT((MONTH('Source Data'!$Y$2:$Y$354)=1)*...)`
- Entire column ranges ($Y:$Y) cause performance issues and errors
- Always specify exact data range (e.g., $Y$2:$Y$354 for 353 data rows)
**Handle date calculations with helper columns:**
- ❌ WRONG: Date arithmetic in SUMPRODUCT: `('Source Data'!$E:$E-'Source Data'!$Y:$Y)`
- ✅ CORRECT: Create helper column "Sales Cycle Days" in Source Data with formula: `=IF(AND(E2<>"",Y2<>""),E2-Y2,"")`
- Then use: `=AVERAGEIFS('Source Data'!$AQ:$AQ,'Source Data'!$J:$J,A3,...)`
- Excel doesn't reliably handle date subtraction in array formulas (a helper-column sketch follows at the end of this section)
**For filtered data displays (like Lost RFPs table):**
- Option A: Use Excel 365 FILTER function if available: `=FILTER('Source Data'!A:E,('Source Data'!$AP:$AP=1)*('Source Data'!$G:$G="Closed Lost"))`
- Option B: Copy filtered data directly from Source Data as static snapshot with note
- ❌ Do NOT leave placeholder text like "[Use Source Data tab filtered...]"
- Choose Option B if unsure about Excel version compatibility
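A sketch of the helper-column step, assuming the column letters (E, Y, AQ) used in the examples above:

```python
# Hypothetical sketch: write the "Sales Cycle Days" helper column into the
# Source Data tab row by row, so downstream tabs can use AVERAGEIFS instead
# of date arithmetic inside array formulas.
from openpyxl import load_workbook

wb = load_workbook('revops_output.xlsx')
ws = wb['Source Data']

ws['AQ1'] = 'Sales Cycle Days'
for row in range(2, ws.max_row + 1):
    ws[f'AQ{row}'] = f'=IF(AND(E{row}<>"",Y{row}<>""),E{row}-Y{row},"")'

wb.save('revops_output.xlsx')
```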
### 8. EXECUTIVE SUMMARY REQUIREMENTS
**The Executive Summary MUST contain actual data-driven insights, NOT metadata.**
❌ WRONG - Metadata about what was created:
```
This workbook contains 8 comprehensive analyses:
1. Pipeline Conversion ($)
2. Pipeline Conversion (Units)
...
```
✅ CORRECT - Data-driven insights with numbers:
```
EXECUTIVE SUMMARY
KEY FINDINGS:
Pipeline Performance:
- Direct channel achieves 20% pipeline conversion vs broker channel at 7%
- Q1 2025 showed significant improvement with conversion rate of 34%
- Broker deals have higher average value ($245K) but lower conversion
Sales Cycle Analysis:
- Direct deals close in average 52 days vs 85 days for broker deals
- Contract Signed stage deals average 67 day sales cycle
- Closed Lost deals average 98 days before loss
Geographic Performance:
- Top performing states: California ($2.3M), Texas ($1.8M), New York ($1.5M)
- 5 states represent 65% of total closed won revenue
STRATEGIC RECOMMENDATIONS:
- Focus resources on direct channel given 3x higher conversion rate
- Investigate Q1 2025 success factors to replicate performance
- Review broker qualification process to improve conversion
```
**How to create insights:**
1. Read key metrics from analysis tabs using formulas (see the sketch below)
2. Calculate comparisons (direct vs broker, quarter over quarter)
3. Identify top/bottom performers
4. Write strategic recommendations based on the data
5. Use actual numbers and percentages from the data
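A sketch of step 1, assuming the workbook has already been saved and recalculated (`data_only=True` returns cached formula results, not formula strings; the tab and cell addresses are illustrative):

```python
# Hypothetical sketch: read computed metrics back out of the analysis tabs
# so the Executive Summary can cite real numbers.
from openpyxl import load_workbook

wb = load_workbook('revops_output.xlsx', data_only=True)
ws = wb['Win Rate Analysis']

direct_win_rate = ws['B5'].value  # None if the file was never recalculated
if direct_win_rate is not None:
    print(f"Direct win rate: {direct_win_rate:.1%}")
```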
## Make sure each page in the spreadsheet is professionally formatted.
## Use the Excel skill.
### 9. WORKFLOW - SEQUENTIAL EXECUTION
**Do NOT use parallel subagents for this task.** The workflow should be sequential:
1. **Load and verify source data**
- Read source data with pandas
- Print all column positions to verify indices
- Test sample formulas to verify logic
2. **Create workbook and Source Data tab**
- Create new workbook with openpyxl
- Create "Source Data" tab as first sheet
- Copy all source data (all rows and columns)
- Add helper column for Sales Cycle Days: `=IF(AND(E2<>"",Y2<>""),E2-Y2,"")`
3. **Create analysis tabs sequentially (1-8)**
- Writing formulas is fast - no need for parallelization
- Each tab references 'Source Data' sheet
- Use formulas, not hardcoded values
- Apply professional formatting to each tab
4. **Create Executive Summary**
- Must be created AFTER all analyses are complete
- Read key metrics from analysis tabs
- Include data-driven insights with actual numbers
5. **Create Charts tab**
- Must be created LAST
- References completed analysis tabs
6. **Save and recalculate**
- Save workbook
- Run: `python recalc.py output.xlsx`
- Verify no formula errors
**Why sequential, not parallel:**
- openpyxl can't handle multiple processes writing to same file
- Writing formulas is fast (not compute-intensive)
- Simpler error handling and debugging
- Easier to maintain proper execution order
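Putting sections 1–9 together, a condensed skeleton of the sequential workflow (file names, column letters, and the single sample formula are the illustrative ones used throughout this file):

```python
# Hypothetical end-to-end skeleton. The real tabs, columns, and formulas
# come from the eight analyses defined below.
import subprocess

import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# 1. Load and verify source data.
df = pd.read_excel('revops_data.xlsx')

# 2. Create the workbook with Source Data as the first tab.
wb = Workbook()
ws_src = wb.active
ws_src.title = 'Source Data'
for row in dataframe_to_rows(df, index=False, header=True):
    ws_src.append(row)

# 3. Create analysis tabs sequentially, formulas only.
ws = wb.create_sheet('Pipeline Conversion ($)')
ws['A2'] = '2024-Q1'
ws['B2'] = "=SUMIFS('Source Data'!$D:$D,'Source Data'!$Z:$Z,$A2)"

# 4-5. Executive Summary and Charts tabs would be added here, last.

# 6. Save, then recalculate so the formula strings get cached values.
wb.save('revops_output.xlsx')
subprocess.run(['python', 'recalc.py', 'revops_output.xlsx'], check=True)
```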
## Analysis #1:
I want to do some analysis on this spreadsheet. First, calculate pipeline conversion rates by created fiscal quarter (column Z). As a reminder, pipeline conversion has the following formula: the denominator is the sum of amounts of all deals created in that quarter; the numerator is the sum of amounts of all deals created in that quarter that are now closed won (stage 6, contract signed), no matter when they won. Show me the total % for each fiscal quarter, then do an additional breakdown showing pipeline conversion for broker versus direct deals. There's a column called "Broker". If there's data in it, it was a broker deal; if it's blank, it's a direct deal.
Do a standalone table of non-broker deals first.
Then a table below it of broker deals.
Then a table below it of the combined analysis (all deals analyzed at once).
## Analysis #2:
Build on analysis #1. Now run those calculations based on unit numbers of deals (not $ Amounts) just so I can see the comparison.
Do a standalone table of non-broker deals first.
Then a table below it of broker deals.
Then a table below it of the combined analysis (all deals analyzed at once).
## Analysis #3:
Now let's calculate win rate for each quarter for total, then broker, then direct. As a reminder, win rate is calculated by the total number of deals won in the quarter divided by the sum of the total number of deals won and lost in the quarter regardless of creation date. Make sure you use column J for the fiscal period (don't caluclate it yourself).
Do a standalone table of non-broker deals first.
Then a table below it of broker deals.
Then a table below it of the combined analysis (all deals analyzed at once).
## Analysis #4:
Create a table that shows when RFPs came in. The table should show the year (e.g. 2024), then each month in a new row (Jan, Feb), then a column showing the total # of RFPs created in that month, then the number of those that are closed won, then the % that are closed won. It should cover both years (2024, 2025) with twelve months each. Create charts that highlight the trends.
## Analysis #5:
Make a table of every RFP we've lost. The table should include account name, opportunity name, closed lost date, opportunity owner, broker name. Do them in order of closed lost date.
## Analysis #6:
Calculate average sales cycle for closed lost and closed won deals by fiscal quarter. Do total, broker, direct.
Do a standalone table of non-broker deals first.
Then a table below it of broker deals.
Then a table below it of the combined analysis (all deals analyzed at once).
## Analysis #7:
Give me a breakdown of closed won deals by industry. Create a table that shows industry, total # of deals won, and total $ value of deals won.
## Analysis #8:
Create a table that shows state name and # of deals won and total amount won in that state.
Once all 8 analyses have been completed, create a new "Executive Summary" tab as the first tab in the workbook and highlight all of the key findings from these analyses. Write them formally, as they will be used for senior executives and board members. Polish the page so it's highly formatted.
Finally, after the executive summary is done, create a tab with charts highlighting the top insights from the data analyzed.
What This Changes for Your Team
Think about what just happened. An analysis that used to take me an hour of interactive prompting now runs in minutes with a single command. And the output isn’t some one-time snapshot. It’s a living document with real formulas that recalculate when the data changes.
For RevOps specifically, this means monthly or quarterly reporting becomes almost trivial. Update the source data file, run the prompt, get a board-ready workbook. The executive summary writes itself based on the actual numbers. The charts update automatically.
But the pattern applies far beyond RevOps. Any recurring analysis you do in spreadsheets can work this way. Financial modeling. Marketing attribution. Customer churn analysis. Inventory planning. If you’re doing the same analysis repeatedly, you should be automating it.
The Honest Tradeoffs
This approach isn’t perfect. A few things to keep in mind.
Getting the instruction file right takes time upfront. The file I shared is the result of many iterations, not a first draft. The formula syntax has to be precise. Column references need to be exact. Error handling matters. Expect to spend an afternoon building and testing your first instruction file.
The formula-based approach is slower than the subagent approach. If you’re doing a massive one-time analysis and don’t care about auditability, parallel subagents will get you there faster. But you’ll have hardcoded numbers that can’t update.
Claude Code requires some technical comfort. You’re working in a command line environment, not a chat interface. If that feels foreign, there’s a learning curve (though it’s smaller than you might think). Go try this in Claude Cowork - I think you’ll be pleasantly surprised!
What to Do Monday Morning
If you’re ready to try this yourself, here’s where to start.
Pick one recurring analysis you do manually today. Something you run monthly or quarterly that follows a consistent structure.
Write out every step of that analysis in plain English. Be specific about data sources, calculations, and output format. This becomes your instruction file.
Install Claude Code and run a simple test. Point it at a sample data file and ask for one piece of your analysis. Make sure the basics work.
Build your full instruction file incrementally. Start with one analysis, verify it works, then add the next. Debug as you go.
Decide whether you need formulas or speed. If auditability matters and data refreshes regularly, go with formulas (sequential execution). If you need a fast one-time analysis, try subagents.
The Bigger Picture
Claude Code and tools like Claude Cowork are still early. But the productivity gains are already real. What used to require a data analyst and hours of manual work can now happen in minutes with the right setup.
The key insight isn’t about any specific tool. It’s about recognizing that most analytical work follows patterns. Once you see the pattern, you can automate it. Once you automate it, you can focus on the decisions instead of the data wrangling.
The people who figure this out first will move faster than everyone else. That’s not hype. That’s just math.
Note: I’ll do a video showing all of this later this week. I’ll send out a link once it’s done because seeing this - especially the subagents - is really interesting.
Why I write these articles:
I write these pieces because senior leaders don’t need another AI tool ranking. They need someone who can look at how work actually moves through their organization and say: here’s where AI belongs, here’s where your team and current tools should still lead, and here’s how to keep all of it safe and compliant.
In this article, we looked at the gap between using AI as an interactive assistant and building AI into your operational infrastructure. Most teams are stuck in the first mode, typing the same prompts over and over, getting useful outputs that die the moment the conversation ends. The real unlock is recognizing which analytical patterns repeat and encoding them into reusable instructions. That’s not hype. That’s just process improvement with better tools.
If you want help sorting this out:
Reply to this or email me at steve@intelligencebyintent.com. Tell me what recurring analyses are eating your team’s time and where spreadsheet work is getting stuck. I’ll tell you what I’d automate first, whether Claude Code or another approach fits, and whether it makes sense for us to go further than that first conversation.
Not ready to talk yet?
Subscribe to my daily newsletter at smithstephen.com. I publish short, practical takes on AI for business leaders who need signal, not noise.
4) I always wonder if using AI like this or if modern BI tools like Snowflake and Tableau are both more reliable and efficient (e.g for snapshot trend analysis, declarative logic accuracy, and no need to have human intervention to routinely run). What is built with Claude Code is cool, but wondering how much time in total was spend learning, iterating to get right, and maintaining versus doing it with modern BI tools?