mirror of
https://github.com/asgeirtj/system_prompts_leaks.git
synced 2025-10-23 01:11:59 +00:00
Create xlsx.md
This commit is contained in:
committed by
GitHub
parent
4402f2ea4b
commit
14448f4257
289
Anthropic/xlsx.md
Normal file
289
Anthropic/xlsx.md
Normal file
@@ -0,0 +1,289 @@
|
||||
# Excel Spreadsheet Handler (/mnt/skills/public/xlsx/SKILL.md)
|
||||
|
||||
---
|
||||
name: Excel Spreadsheet Handler
|
||||
description: Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization
|
||||
when_to_use: "When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas"
|
||||
version: 0.0.1
|
||||
dependencies: openpyxl, pandas
|
||||
---
|
||||
|
||||
# Requirements for Outputs
|
||||
|
||||
## All Excel files
|
||||
|
||||
### Zero Formula Errors
|
||||
- Every Excel model MUST be delivered with ZERO formula errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?)
|
||||
|
||||
### Preserve Existing Templates (when updating templates)
|
||||
- Study and EXACTLY match existing format, style, and conventions when modifying files
|
||||
- Never impose standardized formatting on files with established patterns
|
||||
- Existing template conventions ALWAYS override these guidelines
|
||||
|
||||
## Financial models
|
||||
|
||||
### Color Coding Standards
|
||||
Unless otherwise stated by the user or existing template
|
||||
|
||||
#### Industry-Standard Color Conventions
|
||||
- **Blue text (RGB: 0,0,255)**: Hardcoded inputs, and numbers users will change for scenarios
|
||||
- **Black text (RGB: 0,0,0)**: ALL formulas and calculations
|
||||
- **Green text (RGB: 0,128,0)**: Links pulling from other worksheets within same workbook
|
||||
- **Red text (RGB: 255,0,0)**: External links to other files
|
||||
- **Yellow background (RGB: 255,255,0)**: Key assumptions needing attention or cells that need to be updated
|
||||
|
||||
### Number Formatting Standards
|
||||
|
||||
#### Required Format Rules
|
||||
- **Years**: Format as text strings (e.g., "2024" not "2,024")
|
||||
- **Currency**: Use $#,##0 format; ALWAYS specify units in headers ("Revenue ($mm)")
|
||||
- **Zeros**: Use number formatting to make all zeros "-", including percentages (e.g., "$#,##0;($#,##0);-")
|
||||
- **Percentages**: Default to 0.0% format (one decimal)
|
||||
- **Multiples**: Format as 0.0x for valuation multiples (EV/EBITDA, P/E)
|
||||
- **Negative numbers**: Use parentheses (123) not minus -123
|
||||
|
||||
### Formula Construction Rules
|
||||
|
||||
#### Assumptions Placement
|
||||
- Place ALL assumptions (growth rates, margins, multiples, etc.) in separate assumption cells
|
||||
- Use cell references instead of hardcoded values in formulas
|
||||
- Example: Use =B5*(1+$B$6) instead of =B5*1.05
|
||||
|
||||
#### Formula Error Prevention
|
||||
- Verify all cell references are correct
|
||||
- Check for off-by-one errors in ranges
|
||||
- Ensure consistent formulas across all projection periods
|
||||
- Test with edge cases (zero values, negative numbers)
|
||||
- Verify no unintended circular references
|
||||
|
||||
#### Documentation Requirements for Hardcodes
|
||||
- Comment or in cells beside (if end of table). Format: "Source: [System/Document], [Date], [Specific Reference], [URL if applicable]"
|
||||
- Examples:
|
||||
- "Source: Company 10-K, FY2024, Page 45, Revenue Note, [SEC EDGAR URL]"
|
||||
- "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"
|
||||
- "Source: Bloomberg Terminal, 8/15/2025, AAPL US Equity"
|
||||
- "Source: FactSet, 8/20/2025, Consensus Estimates Screen"
|
||||
|
||||
# XLSX creation, editing, and analysis
|
||||
|
||||
## Overview
|
||||
|
||||
A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks.
|
||||
|
||||
## Important Requirements
|
||||
|
||||
**LibreOffice Required for Formula Recalculation**: You can assume LibreOffice is installed for recalculating formula values using the `recalc.py` script. The script automatically configures LibreOffice on first run
|
||||
|
||||
## Reading and analyzing data
|
||||
|
||||
### Data analysis with pandas
|
||||
For data analysis, visualization, and basic operations, use **pandas** which provides powerful data manipulation capabilities:
|
||||
```python
|
||||
import pandas as pd
|
||||
|
||||
# Read Excel
|
||||
df = pd.read_excel('file.xlsx') # Default: first sheet
|
||||
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict
|
||||
|
||||
# Analyze
|
||||
df.head() # Preview data
|
||||
df.info() # Column info
|
||||
df.describe() # Statistics
|
||||
|
||||
# Write Excel
|
||||
df.to_excel('output.xlsx', index=False)
|
||||
```
|
||||
|
||||
## Excel File Workflows
|
||||
|
||||
## CRITICAL: Use Formulas, Not Hardcoded Values
|
||||
|
||||
**Always use Excel formulas instead of calculating values in Python and hardcoding them.** This ensures the spreadsheet remains dynamic and updateable.
|
||||
|
||||
### ❌ WRONG - Hardcoding Calculated Values
|
||||
```python
|
||||
# Bad: Calculating in Python and hardcoding result
|
||||
total = df['Sales'].sum()
|
||||
sheet['B10'] = total # Hardcodes 5000
|
||||
|
||||
# Bad: Computing growth rate in Python
|
||||
growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue']
|
||||
sheet['C5'] = growth # Hardcodes 0.15
|
||||
|
||||
# Bad: Python calculation for average
|
||||
avg = sum(values) / len(values)
|
||||
sheet['D20'] = avg # Hardcodes 42.5
|
||||
```
|
||||
|
||||
### ✅ CORRECT - Using Excel Formulas
|
||||
```python
|
||||
# Good: Let Excel calculate the sum
|
||||
sheet['B10'] = '=SUM(B2:B9)'
|
||||
|
||||
# Good: Growth rate as Excel formula
|
||||
sheet['C5'] = '=(C4-C2)/C2'
|
||||
|
||||
# Good: Average using Excel function
|
||||
sheet['D20'] = '=AVERAGE(D2:D19)'
|
||||
```
|
||||
|
||||
This applies to ALL calculations - totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.
|
||||
|
||||
## Common Workflow
|
||||
1. **Choose tool**: pandas for data, openpyxl for formulas/formatting
|
||||
2. **Create/Load**: Create new workbook or load existing file
|
||||
3. **Modify**: Add/edit data, formulas, and formatting
|
||||
4. **Save**: Write to file
|
||||
5. **Recalculate formulas (MANDATORY IF USING FORMULAS)**: Use the recalc.py script
|
||||
```bash
|
||||
python recalc.py output.xlsx
|
||||
```
|
||||
6. **Verify and fix any errors**:
|
||||
- The script returns JSON with error details
|
||||
- If `status` is `errors_found`, check `error_summary` for specific error types and locations
|
||||
- Fix the identified errors and recalculate again
|
||||
- Common errors to fix:
|
||||
- `#REF!`: Invalid cell references
|
||||
- `#DIV/0!`: Division by zero
|
||||
- `#VALUE!`: Wrong data type in formula
|
||||
- `#NAME?`: Unrecognized formula name
|
||||
|
||||
### Creating new Excel files
|
||||
```python
|
||||
# Using openpyxl for formulas and formatting
|
||||
from openpyxl import Workbook
|
||||
from openpyxl.styles import Font, PatternFill, Alignment
|
||||
|
||||
wb = Workbook()
|
||||
sheet = wb.active
|
||||
|
||||
# Add data
|
||||
sheet['A1'] = 'Hello'
|
||||
sheet['B1'] = 'World'
|
||||
sheet.append(['Row', 'of', 'data'])
|
||||
|
||||
# Add formula
|
||||
sheet['B2'] = '=SUM(A1:A10)'
|
||||
|
||||
# Formatting
|
||||
sheet['A1'].font = Font(bold=True, color='FF0000')
|
||||
sheet['A1'].fill = PatternFill('solid', start_color='FFFF00')
|
||||
sheet['A1'].alignment = Alignment(horizontal='center')
|
||||
|
||||
# Column width
|
||||
sheet.column_dimensions['A'].width = 20
|
||||
|
||||
wb.save('output.xlsx')
|
||||
```
|
||||
|
||||
### Editing existing Excel files
|
||||
```python
|
||||
# Using openpyxl to preserve formulas and formatting
|
||||
from openpyxl import load_workbook
|
||||
|
||||
# Load existing file
|
||||
wb = load_workbook('existing.xlsx')
|
||||
sheet = wb.active # or wb['SheetName'] for specific sheet
|
||||
|
||||
# Working with multiple sheets
|
||||
for sheet_name in wb.sheetnames:
|
||||
sheet = wb[sheet_name]
|
||||
print(f"Sheet: {sheet_name}")
|
||||
|
||||
# Modify cells
|
||||
sheet['A1'] = 'New Value'
|
||||
sheet.insert_rows(2) # Insert row at position 2
|
||||
sheet.delete_cols(3) # Delete column 3
|
||||
|
||||
# Add new sheet
|
||||
new_sheet = wb.create_sheet('NewSheet')
|
||||
new_sheet['A1'] = 'Data'
|
||||
|
||||
wb.save('modified.xlsx')
|
||||
```
|
||||
|
||||
## Recalculating formulas
|
||||
|
||||
Excel files created or modified by openpyxl contain formulas as strings but not calculated values. Use the provided `recalc.py` script to recalculate formulas:
|
||||
```bash
|
||||
python recalc.py <excel_file> [timeout_seconds]
|
||||
```
|
||||
|
||||
Example:
|
||||
```bash
|
||||
python recalc.py output.xlsx 30
|
||||
```
|
||||
|
||||
The script:
|
||||
- Automatically sets up LibreOffice macro on first run
|
||||
- Recalculates all formulas in all sheets
|
||||
- Scans ALL cells for Excel errors (#REF!, #DIV/0!, etc.)
|
||||
- Returns JSON with detailed error locations and counts
|
||||
- Works on both Linux and macOS
|
||||
|
||||
## Formula Verification Checklist
|
||||
|
||||
Quick checks to ensure formulas work correctly:
|
||||
|
||||
### Essential Verification
|
||||
- [ ] **Test 2-3 sample references**: Verify they pull correct values before building full model
|
||||
- [ ] **Column mapping**: Confirm Excel columns match (e.g., column 64 = BL, not BK)
|
||||
- [ ] **Row offset**: Remember Excel rows are 1-indexed (DataFrame row 5 = Excel row 6)
|
||||
|
||||
### Common Pitfalls
|
||||
- [ ] **NaN handling**: Check for null values with `pd.notna()`
|
||||
- [ ] **Far-right columns**: FY data often in columns 50+
|
||||
- [ ] **Multiple matches**: Search all occurrences, not just first
|
||||
- [ ] **Division by zero**: Check denominators before using `/` in formulas (#DIV/0!)
|
||||
- [ ] **Wrong references**: Verify all cell references point to intended cells (#REF!)
|
||||
- [ ] **Cross-sheet references**: Use correct format (Sheet1!A1) for linking sheets
|
||||
|
||||
### Formula Testing Strategy
|
||||
- [ ] **Start small**: Test formulas on 2-3 cells before applying broadly
|
||||
- [ ] **Verify dependencies**: Check all cells referenced in formulas exist
|
||||
- [ ] **Test edge cases**: Include zero, negative, and very large values
|
||||
|
||||
### Interpreting recalc.py Output
|
||||
The script returns JSON with error details:
|
||||
```json
|
||||
{
|
||||
"status": "success", // or "errors_found"
|
||||
"total_errors": 0, // Total error count
|
||||
"total_formulas": 42, // Number of formulas in file
|
||||
"error_summary": { // Only present if errors found
|
||||
"#REF!": {
|
||||
"count": 2,
|
||||
"locations": ["Sheet1!B5", "Sheet1!C10"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Library Selection
|
||||
- **pandas**: Best for data analysis, bulk operations, and simple data export
|
||||
- **openpyxl**: Best for complex formatting, formulas, and Excel-specific features
|
||||
|
||||
### Working with openpyxl
|
||||
- Cell indices are 1-based (row=1, column=1 refers to cell A1)
|
||||
- Use `data_only=True` to read calculated values: `load_workbook('file.xlsx', data_only=True)`
|
||||
- **Warning**: If opened with `data_only=True` and saved, formulas are replaced with values and permanently lost
|
||||
- For large files: Use `read_only=True` for reading or `write_only=True` for writing
|
||||
- Formulas are preserved but not evaluated - use recalc.py to update values
|
||||
|
||||
### Working with pandas
|
||||
- Specify data types to avoid inference issues: `pd.read_excel('file.xlsx', dtype={'id': str})`
|
||||
- For large files, read specific columns: `pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])`
|
||||
- Handle dates properly: `pd.read_excel('file.xlsx', parse_dates=['date_column'])`
|
||||
|
||||
## Code Style Guidelines
|
||||
**IMPORTANT**: When generating Python code for Excel operations:
|
||||
- Write minimal, concise Python code without unnecessary comments
|
||||
- Avoid verbose variable names and redundant operations
|
||||
- Avoid unnecessary print statements
|
||||
|
||||
**For Excel files themselves**:
|
||||
- Add comments to cells with complex formulas or important assumptions
|
||||
- Document data sources for hardcoded values
|
||||
- Include notes for key calculations and model sections
|
Reference in New Issue
Block a user