Create pptx.md

2025-10-23 01:11:59 +00:00 · 2025-10-02 08:12:47 +00:00
parent 14448f4257
commit 3964e72269
1 changed files with 410 additions and 0 deletions
--- a/Anthropic/pptx.md
+++ b/Anthropic/pptx.md
@@ -0,0 +1,410 @@
+# PowerPoint Suite (/mnt/skills/public/pptx/SKILL.md)
+
+---
+name: PowerPoint Suite
+description: Presentation creation, editing, and analysis.
+when_to_use: "When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks"
+version: 0.0.3
+---
+
+# PPTX creation, editing, and analysis
+
+## Overview
+
+A user may ask you to create, edit, or analyze the contents of a .pptx file. A .pptx file is essentially a ZIP archive containing XML files and other resources that you can read or edit. You have different tools and workflows available for different tasks.
+
+## Reading and analyzing content
+
+### Text extraction
+If you just need to read the text contents of a presentation, you should convert the document to markdown:
+```bash
+# Convert document to markdown
+python -m markitdown path-to-file.pptx
+```
+
+### Raw XML access
+You need raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting. For any of these features, you'll need to unpack a presentation and read its raw XML contents.
+
+#### Unpacking a file
+`python ooxml/scripts/unpack.py <office_file> <output_dir>`
+
+**Note**: The unpack.py script is located at `skills/pptx/ooxml/scripts/unpack.py` relative to the project root. If the script doesn't exist at this path, use `find . -name "unpack.py"` to locate it.
+
+#### Key file structures
+* `ppt/presentation.xml` - Main presentation metadata and slide references
+* `ppt/slides/slide{N}.xml` - Individual slide contents (slide1.xml, slide2.xml, etc.)
+* `ppt/notesSlides/notesSlide{N}.xml` - Speaker notes for each slide
+* `ppt/comments/modernComment_*.xml` - Comments for specific slides
+* `ppt/slideLayouts/` - Layout templates for slides
+* `ppt/slideMasters/` - Master slide templates
+* `ppt/theme/` - Theme and styling information
+* `ppt/media/` - Images and other media files
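
Because a .pptx is a ZIP archive, the parts listed above can also be enumerated without unpacking. The following is a minimal sketch using only the standard library; `list_slide_parts` is a hypothetical helper name and is not part of the skill's scripts.

```python
import re
import zipfile


def list_slide_parts(pptx):
    """Return slide XML part names from a .pptx, sorted by slide number.

    `pptx` may be a filesystem path or a binary file-like object.
    """
    pattern = re.compile(r"ppt/slides/slide(\d+)\.xml")
    with zipfile.ZipFile(pptx) as archive:
        names = archive.namelist()
    slides = []
    for name in names:
        match = pattern.fullmatch(name)
        if match:
            # Sort numerically so slide10.xml comes after slide2.xml.
            slides.append((int(match.group(1)), name))
    return [name for _, name in sorted(slides)]
```

Calling `list_slide_parts("template.pptx")` returns names such as `ppt/slides/slide1.xml`; relationship files under `ppt/slides/_rels/` are skipped because the pattern must match the whole name.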
+
+#### Typography and color extraction
+**When given an example design to emulate**: Always analyze the presentation's typography and colors first using the methods below:
+1. **Read theme file**: Check `ppt/theme/theme1.xml` for colors (`<a:clrScheme>`) and fonts (`<a:fontScheme>`)
+2. **Sample slide content**: Examine `ppt/slides/slide1.xml` for actual font usage (`<a:rPr>`) and colors
+3. **Search for patterns**: Use grep to find color (`<a:solidFill>`, `<a:srgbClr>`) and font references across all XML files
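
The theme lookup in step 1 can be sketched with the standard library's ElementTree. This is an illustration under stated assumptions, not one of the skill's scripts: `theme_colors_and_fonts` is a hypothetical name, and only color slots defined with `<a:srgbClr>` are collected (slots defined with `<a:sysClr>` are skipped).

```python
import xml.etree.ElementTree as ET

# DrawingML namespace bound to the a: prefix in theme and slide XML.
A_NS = {"a": "http://schemas.openxmlformats.org/drawingml/2006/main"}


def theme_colors_and_fonts(theme_xml):
    """Extract sRGB scheme colors and major/minor Latin fonts from theme XML."""
    root = ET.fromstring(theme_xml)
    colors = {}
    scheme = root.find(".//a:clrScheme", A_NS)
    if scheme is not None:
        for slot in scheme:
            rgb = slot.find("a:srgbClr", A_NS)
            if rgb is not None:
                # Drop the "{namespace}" part so keys read like "dk2" or "accent1".
                colors[slot.tag.split("}")[1]] = rgb.get("val")
    fonts = {}
    for role in ("majorFont", "minorFont"):
        latin = root.find(f".//a:fontScheme/a:{role}/a:latin", A_NS)
        if latin is not None:
            fonts[role] = latin.get("typeface")
    return colors, fonts
```

To use it on an unpacked presentation, read `ppt/theme/theme1.xml` as bytes and pass the contents to the function.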
+
+## Creating a new PowerPoint presentation **without a template**
+
+When creating a new PowerPoint presentation from scratch, use the **html2pptx** workflow to convert HTML slides to PowerPoint with accurate positioning.
+
+### Design Principles
+
+**CRITICAL**: Before creating any presentation, analyze the content and choose appropriate design elements:
+1. **Consider the subject matter**: What is this presentation about? What tone, industry, or mood does it suggest?
+2. **Check for branding**: If the user mentions a company/organization, consider their brand colors and identity
+3. **Match palette to content**: Select colors that reflect the subject
+4. **State your approach**: Explain your design choices before writing code
+
+**Requirements**:
+- ✅ State your content-informed design approach BEFORE writing code
+- ✅ Use web-safe fonts only: Arial, Helvetica, Times New Roman, Georgia, Courier New, Verdana, Tahoma, Trebuchet MS, Impact
+- ✅ Create clear visual hierarchy through size, weight, and color
+- ✅ Ensure readability: strong contrast, appropriately sized text, clean alignment
+- ✅ Be consistent: repeat patterns, spacing, and visual language across slides
+
+#### Color Palette Selection
+
+**Choosing colors creatively**:
+- **Think beyond defaults**: What colors genuinely match this specific topic? Avoid autopilot choices.
+- **Consider multiple angles**: Topic, industry, mood, energy level, target audience, brand identity (if mentioned)
+- **Be adventurous**: Try unexpected combinations - a healthcare presentation doesn't have to be green, finance doesn't have to be navy
+- **Build your palette**: Pick 3-5 colors that work together (dominant colors + supporting tones + accent)
+- **Ensure contrast**: Text must be clearly readable on backgrounds
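
The skill does not define "clearly readable" numerically. One common way to quantify it, offered here as an assumption rather than a requirement of this skill, is the WCAG 2.x contrast ratio, for which WCAG suggests at least 4.5:1 for normal-size text. The function names below are illustrative.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as 'RRGGBB' or '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)
```

A candidate text/background pair from a palette can be checked with `contrast_ratio("#1C2833", "#F4F6F6")` before committing to it.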
+
+**Example color palettes** (use these to spark creativity - choose one, adapt it, or create your own):
+
+1. **Classic Blue**: Deep navy (#1C2833), slate gray (#2E4053), silver (#AAB7B8), off-white (#F4F6F6)
+2. **Teal & Coral**: Teal (#5EA8A7), deep teal (#277884), coral (#FE4447), white (#FFFFFF)
+3. **Bold Red**: Red (#C0392B), bright red (#E74C3C), orange (#F39C12), yellow (#F1C40F), green (#2ECC71)
+4. **Warm Blush**: Mauve (#A49393), blush (#EED6D3), rose (#E8B4B8), cream (#FAF7F2)
+5. **Burgundy Luxury**: Burgundy (#5D1D2E), crimson (#951233), rust (#C15937), gold (#997929)
+6. **Deep Purple & Emerald**: Purple (#B165FB), dark blue (#181B24), emerald (#40695B), white (#FFFFFF)
+7. **Cream & Forest Green**: Cream (#FFE1C7), forest green (#40695B), white (#FCFCFC)
+8. **Pink & Purple**: Pink (#F8275B), coral (#FF574A), rose (#FF737D), purple (#3D2F68)
+9. **Lime & Plum**: Lime (#C5DE82), plum (#7C3A5F), coral (#FD8C6E), blue-gray (#98ACB5)
+10. **Black & Gold**: Gold (#BF9A4A), black (#000000), off-white (#F4F6F6)
+11. **Sage & Terracotta**: Sage (#87A96B), terracotta (#E07A5F), cream (#F4F1DE), charcoal (#2C2C2C)
+12. **Charcoal & Red**: Charcoal (#292929), red (#E33737), light gray (#CCCBCB)
+13. **Vibrant Orange**: Orange (#F96D00), light gray (#F2F2F2), charcoal (#222831)
+14. **Forest Green**: Black (#191A19), green (#4E9F3D), dark green (#1E5128), white (#FFFFFF)
+15. **Retro Rainbow**: Purple (#722880), pink (#D72D51), orange (#EB5C18), amber (#F08800), gold (#DEB600)
+16. **Vintage Earthy**: Mustard (#E3B448), sage (#CBD18F), forest green (#3A6B35), cream (#F4F1DE)
+17. **Coastal Rose**: Old rose (#AD7670), beaver (#B49886), eggshell (#F3ECDC), ash gray (#BFD5BE)
+18. **Orange & Turquoise**: Light orange (#FC993E), grayish turquoise (#667C6F), white (#FCFCFC)
+
+#### Visual Details Options
+
+**Geometric Patterns**:
+- Diagonal section dividers instead of horizontal
+- Asymmetric column widths (30/70, 40/60, 25/75)
+- Rotated text headers at 90° or 270°
+- Circular/hexagonal frames for images
+- Triangular accent shapes in corners
+- Overlapping shapes for depth
+
+**Border & Frame Treatments**:
+- Thick single-color borders (10-20pt) on one side only
+- Double-line borders with contrasting colors
+- Corner brackets instead of full frames
+- L-shaped borders (top+left or bottom+right)
+- Underline accents beneath headers (3-5pt thick)
+
+**Typography Treatments**:
+- Extreme size contrast (72pt headlines vs 11pt body)
+- All-caps headers with wide letter spacing
+- Numbered sections in oversized display type
+- Monospace (Courier New) for data/stats/technical content
+- Condensed fonts (Arial Narrow) for dense information
+- Outlined text for emphasis
+
+**Chart & Data Styling**:
+- Monochrome charts with single accent color for key data
+- Horizontal bar charts instead of vertical
+- Dot plots instead of bar charts
+- Minimal gridlines or none at all
+- Data labels directly on elements (no legends)
+- Oversized numbers for key metrics
+
+**Layout Innovations**:
+- Full-bleed images with text overlays
+- Sidebar column (20-30% width) for navigation/context
+- Modular grid systems (3×3, 4×4 blocks)
+- Z-pattern or F-pattern content flow
+- Floating text boxes over colored shapes
+- Magazine-style multi-column layouts
+
+**Background Treatments**:
+- Solid color blocks occupying 40-60% of slide
+- Gradient fills (vertical or diagonal only)
+- Split backgrounds (two colors, diagonal or vertical)
+- Edge-to-edge color bands
+- Negative space as a design element
+
+### Layout Tips
+**When creating slides with charts or tables:**
+- **Two-column layout (PREFERRED)**: Use a header spanning the full width, then two columns below - text/bullets in one column and the featured content in the other. This provides better balance and makes charts/tables more readable. Use flexbox with unequal column widths (e.g., 40%/60% split) to optimize space for each content type.
+- **Full-slide layout**: Let the featured content (chart/table) take up the entire slide for maximum impact and readability
+- **NEVER vertically stack**: Do not place charts/tables below text in a single column - this causes poor readability and layout issues
+
+### Workflow
+1. **MANDATORY - READ ENTIRE FILE**: Read [`html2pptx.md`](html2pptx.md) completely from start to finish. **NEVER set any range limits when reading this file.** Read the full file content for detailed syntax, critical formatting rules, and best practices before proceeding with presentation creation.
+2. Create an HTML file for each slide with proper dimensions (e.g., 720pt × 405pt for 16:9)
+   - Use `<p>`, `<h1>`-`<h6>`, `<ul>`, `<ol>` for all text content
+   - Use `class="placeholder"` for areas where charts/tables will be added (render with gray background for visibility)
+   - **CRITICAL**: Rasterize gradients and icons as PNG images FIRST using Sharp, then reference in HTML
+   - **LAYOUT**: For slides with charts/tables/images, use either full-slide layout or two-column layout for better readability
+3. Create and run a JavaScript file using the [`html2pptx.js`](scripts/html2pptx.js) library to convert HTML slides to PowerPoint and save the presentation
+   - Use the `html2pptx()` function to process each HTML file
+   - Add charts and tables to placeholder areas using PptxGenJS API
+   - Save the presentation using `pptx.writeFile()`
+4. **Visual validation**: Generate thumbnails and inspect for layout issues
+   - Create thumbnail grid: `python scripts/thumbnail.py output.pptx workspace/thumbnails --cols 4`
+   - Read and carefully examine the thumbnail image for:
+     * Text overflow or truncation
+     * Misaligned elements
+     * Incorrect colors or fonts
+     * Missing content
+     * Layout problems
+   - If issues found, diagnose and fix before proceeding
+
+## Creating a new PowerPoint presentation **from a template**
+
+When given a PowerPoint template, you can create a new presentation by replacing the text content in the template slides.
+
+### Workflow
+
+1. **Unpack the template**: Extract the template's XML structure
+```bash
+   python ooxml/scripts/unpack.py template.pptx unpacked_template
+```
+
+2. **Read the presentation structure**: Read `unpacked_template/ppt/presentation.xml` to understand the overall structure and slide references
+
+3. **Examine template slides**: Check the first few slide XML files to understand the structure
+```bash
+   # View slide structure
+   python -c "from lxml import etree; tree = etree.parse('unpacked_template/ppt/slides/slide1.xml'); print(etree.tostring(tree, pretty_print=True, encoding='unicode'))"
+```
+
+4. **Copy template to working file**: Make a copy of the template for editing
+```bash
+   cp template.pptx working.pptx
+```
+
+5. **Generate text shape inventory**:
+```bash
+   python scripts/inventory.py working.pptx > template-inventory.json
+```
+   
+   The inventory provides a structured view of ALL text shapes in the presentation:
+```json
+   {
+     "slide-0": {
+       "shape-0": {
+         "shape_id": "2",
+         "shape_name": "Title 1",
+         "placeholder_type": "TITLE",
+         "text_content": "Original title text here...",
+         "default_font_size": 44.0,
+         "default_font_name": "Calibri Light"
+       },
+       "shape-1": {
+         "shape_id": "3",
+         "shape_name": "Content Placeholder 2",
+         "placeholder_type": "BODY",
+         "text_content": "Original content text...",
+         "default_font_size": 18.0
+       }
+     },
+     "slide-1": {
+       ...
+     }
+   }
+```
+   
+   **Understanding the inventory**:
+   - Each slide is identified as "slide-N" (zero-indexed, so "slide-0" is the first slide)
+   - Each text shape within a slide is identified as "shape-N" (zero-indexed by occurrence)
+   - `placeholder_type` indicates the shape's role: TITLE, BODY, SUBTITLE, etc.
+   - `text_content` shows the current text (useful for identifying which shape to replace)
+   - `default_font_size` and `default_font_name` show the shape's default formatting
+
+6. **Create replacement text JSON**: Based on the inventory, create a JSON file specifying which shapes to update with new text
+   - **IMPORTANT**: Reference shapes using the slide and shape identifiers from the inventory (e.g., "slide-0", "shape-1")
+   - **CRITICAL**: Each shape's "paragraphs" field must contain **properly formatted paragraph objects**, not plain text strings
+   - Each paragraph object can include:
+     - `text`: The actual text content (required)
+     - `alignment`: Text alignment (e.g., "CENTER", "LEFT", "RIGHT")
+     - `bold`: Boolean for bold text
+     - `italic`: Boolean for italic text
+     - `bullet`: Boolean to enable bullet points (when true, `level` is also required)
+     - `level`: Integer for bullet indent level (0 = no indent, 1 = first level, etc.)
+     - `font_size`: Float for custom font size
+     - `font_name`: String for custom font name
+     - `color`: String for RGB color (e.g., "FF0000" for red)
+     - `theme_color`: String for theme-based color (e.g., "DARK_1", "ACCENT_1")
+   - **IMPORTANT**: When bullet: true, do NOT include bullet symbols (•, -, *) in text - they're added automatically
+   - **ESSENTIAL FORMATTING RULES**:
+     - Headers/titles should typically have `"bold": true`
+     - List items should have `"bullet": true, "level": 0` (level is required when bullet is true)
+     - Preserve any alignment properties (e.g., `"alignment": "CENTER"` for centered text)
+     - Include font properties when different from default (e.g., `"font_size": 14.0`, `"font_name": "Lora"`)
+     - Colors: Use `"color": "FF0000"` for RGB or `"theme_color": "DARK_1"` for theme colors
+     - The replacement script expects **properly formatted paragraphs**, not just text strings
+     - **Overlapping shapes**: Prefer shapes with larger default_font_size or more appropriate placeholder_type
+   - Save the updated inventory with replacements to `replacement-text.json`
+   - **WARNING**: Different template layouts have different shape counts - always check the actual inventory before creating replacements
+
+   Example paragraphs field showing proper formatting:
+```json
+   "paragraphs": [
+     {
+       "text": "New presentation title text",
+       "alignment": "CENTER",
+       "bold": true
+     },
+     {
+       "text": "Section Header",
+       "bold": true
+     },
+     {
+       "text": "First bullet point without bullet symbol",
+       "bullet": true,
+       "level": 0
+     },
+     {
+       "text": "Red colored text",
+       "color": "FF0000"
+     },
+     {
+       "text": "Theme colored text",
+       "theme_color": "DARK_1"
+     },
+     {
+       "text": "Regular paragraph text without special formatting"
+     }
+   ]
+```
+
+   **Shapes not listed in the replacement JSON are automatically cleared**:
+```json
+   {
+     "slide-0": {
+       "shape-0": {
+         "paragraphs": [...] // This shape gets new text
+       }
+       // shape-1 and shape-2 from inventory will be cleared automatically
+     }
+   }
+```
+
+   **Common formatting patterns for presentations**:
+   - Title slides: Bold text, sometimes centered
+   - Section headers within slides: Bold text
+   - Bullet lists: Each item needs `"bullet": true, "level": 0`
+   - Body text: Usually no special properties needed
+   - Quotes: May have special alignment or font properties
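
The paragraph rules above can be checked before running the replacement step. The following is a minimal sketch that assumes only the rules stated in this step; `check_paragraphs` is a hypothetical helper, not part of `replace.py`.

```python
BULLET_SYMBOLS = ("•", "-", "*")


def check_paragraphs(paragraphs):
    """Return a list of problems found in a shape's "paragraphs" list."""
    problems = []
    for i, para in enumerate(paragraphs):
        if not isinstance(para, dict):
            problems.append(f"paragraph {i}: must be an object, not a plain string")
            continue
        text = para.get("text")
        if not isinstance(text, str):
            problems.append(f"paragraph {i}: missing required 'text'")
            continue
        if para.get("bullet"):
            if "level" not in para:
                problems.append(f"paragraph {i}: 'level' is required when 'bullet' is true")
            # Bullets are added automatically, so a leading symbol would be doubled.
            if text.lstrip().startswith(BULLET_SYMBOLS):
                problems.append(f"paragraph {i}: remove the leading bullet symbol from 'text'")
    return problems
```

An empty result means the list follows the rules. The bullet-symbol check is deliberately simple and will also flag bulleted text that legitimately starts with a hyphen, such as a negative number.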
+
+7. **Apply replacements using the `replace.py` script**
+```bash
+   python scripts/replace.py working.pptx replacement-text.json output.pptx
+```
+
+   The script will:
+   - First extract the inventory of ALL text shapes using functions from inventory.py
+   - Validate that all shapes in the replacement JSON exist in the inventory
+   - Clear text from ALL shapes identified in the inventory
+   - Apply new text only to shapes with "paragraphs" defined in the replacement JSON
+   - Preserve formatting by applying paragraph properties from the JSON
+   - Handle bullets, alignment, font properties, and colors automatically
+   - Save the updated presentation
+
+   Example error output, first for invalid shape references and then for replacement text that worsens overflow:
+```
+   ERROR: Invalid shapes in replacement JSON:
+     - Shape 'shape-99' not found on 'slide-0'. Available shapes: shape-0, shape-1, shape-4
+     - Slide 'slide-999' not found in inventory
+```
+```
+   ERROR: Replacement text made overflow worse in these shapes:
+     - slide-0/shape-2: overflow worsened by 1.25" (was 0.00", now 1.25")
+```
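
The shape-reference check can be approximated ahead of time by comparing the keys of the two JSON files. This sketch is not the logic of `replace.py`; `find_invalid_shapes` is a hypothetical name, and the messages only mimic the format of the first example.

```python
def find_invalid_shapes(inventory, replacements):
    """List slide and shape keys in `replacements` that are absent from `inventory`."""
    errors = []
    for slide_id, shapes in replacements.items():
        if slide_id not in inventory:
            errors.append(f"Slide '{slide_id}' not found in inventory")
            continue
        for shape_id in shapes:
            if shape_id not in inventory[slide_id]:
                available = ", ".join(inventory[slide_id])
                errors.append(
                    f"Shape '{shape_id}' not found on '{slide_id}'. "
                    f"Available shapes: {available}"
                )
    return errors
```

Both arguments are plain dictionaries, for example the results of `json.load` on `template-inventory.json` and `replacement-text.json`.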
+
+## Creating Thumbnail Grids
+
+To create visual thumbnail grids of PowerPoint slides for quick analysis and reference:
+```bash
+python scripts/thumbnail.py template.pptx [output_prefix]
+```
+
+**Features**:
+- Creates: `thumbnails.jpg` (or `thumbnails-1.jpg`, `thumbnails-2.jpg`, etc. for large decks)
+- Default: 5 columns, max 30 slides per grid (5×6)
+- Custom prefix: `python scripts/thumbnail.py template.pptx my-grid`
+  - Note: The output prefix should include the path if you want output in a specific directory (e.g., `workspace/my-grid`)
+- Adjust columns: `--cols 4` (range: 3-6, affects slides per grid)
+- Grid limits: 3 cols = 12 slides/grid, 4 cols = 20, 5 cols = 30, 6 cols = 42
+- Slides are zero-indexed (Slide 0, Slide 1, etc.)
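
The documented grid limits all match cols × (cols + 1). Treating that as an observed pattern rather than the script's confirmed logic, the number of grid images for a deck can be estimated as below; both function names are illustrative.

```python
import math


def slides_per_grid(cols):
    """Slides per grid image; cols * (cols + 1) reproduces the documented limits."""
    if not 3 <= cols <= 6:
        raise ValueError("--cols accepts 3 to 6")
    return cols * (cols + 1)


def grid_count(slide_count, cols=5):
    """Number of grid images needed for a deck with `slide_count` slides."""
    return math.ceil(slide_count / slides_per_grid(cols))
```

For instance, `grid_count(45, cols=4)` gives 3, since each 4-column grid holds 20 slides.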
+
+**Use cases**:
+- Template analysis: Quickly understand slide layouts and design patterns
+- Content review: Visual overview of entire presentation
+- Navigation reference: Find specific slides by their visual appearance
+- Quality check: Verify all slides are properly formatted
+
+**Examples**:
+```bash
+# Basic usage
+python scripts/thumbnail.py presentation.pptx
+
+# Combine options: custom name, columns
+python scripts/thumbnail.py template.pptx analysis --cols 4
+```
+
+## Converting Slides to Images
+
+To visually analyze PowerPoint slides, convert them to images using a two-step process:
+
+1. **Convert PPTX to PDF**:
+```bash
+   soffice --headless --convert-to pdf template.pptx
+```
+
+2. **Convert PDF pages to JPEG images**:
+```bash
+   pdftoppm -jpeg -r 150 template.pdf slide
+```
+   This creates files like `slide-1.jpg`, `slide-2.jpg`, etc. For PDFs with 10 or more pages, pdftoppm zero-pads the page number (e.g., `slide-01.jpg`).
+
+Options:
+- `-r 150`: Sets resolution to 150 DPI (adjust for quality/size balance)
+- `-jpeg`: Output JPEG format (use `-png` for PNG if preferred)
+- `-f N`: First page to convert (e.g., `-f 2` starts from page 2)
+- `-l N`: Last page to convert (e.g., `-l 5` stops at page 5)
+- `slide`: Prefix for output files
+
+Example for specific range:
+```bash
+pdftoppm -jpeg -r 150 -f 2 -l 5 template.pdf slide  # Converts only pages 2-5
+```
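
The two steps can be wrapped in a small Python helper. This is a sketch that uses only the flags shown in this section; `build_commands` and `convert` are hypothetical names, and it assumes `soffice` writes the PDF into the current working directory.

```python
import subprocess
from pathlib import Path


def build_commands(pptx_path, prefix="slide", dpi=150, first=None, last=None):
    """Return the soffice and pdftoppm argument lists for one presentation."""
    # Assumption: the PDF lands in the current directory under the deck's stem.
    pdf_name = Path(pptx_path).with_suffix(".pdf").name
    to_pdf = ["soffice", "--headless", "--convert-to", "pdf", str(pptx_path)]
    to_jpeg = ["pdftoppm", "-jpeg", "-r", str(dpi)]
    if first is not None:
        to_jpeg += ["-f", str(first)]
    if last is not None:
        to_jpeg += ["-l", str(last)]
    to_jpeg += [pdf_name, prefix]
    return to_pdf, to_jpeg


def convert(pptx_path, **options):
    """Run both steps in order, raising CalledProcessError if either tool fails."""
    for command in build_commands(pptx_path, **options):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it possible to inspect the argument lists without LibreOffice or Poppler installed.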
+
+## Code Style Guidelines
+**IMPORTANT**: When generating code for PPTX operations:
+- Write concise code
+- Avoid verbose variable names and redundant operations
+- Avoid unnecessary print statements
+
+## Dependencies
+
+Required dependencies (should already be installed):
+
+- **markitdown**: `pip install "markitdown[pptx]"` (for text extraction from presentations)
+- **pptxgenjs**: `npm install -g pptxgenjs` (for creating presentations via html2pptx)
+- **playwright**: `npm install -g playwright` (for HTML rendering in html2pptx)
+- **react-icons**: `npm install -g react-icons react react-dom` (for icons)
+- **sharp**: `npm install -g sharp` (for SVG rasterization and image processing)
+- **LibreOffice**: `sudo apt-get install libreoffice` (for PDF conversion)
+- **Poppler**: `sudo apt-get install poppler-utils` (for pdftoppm to convert PDF to images)
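
Since the dependencies are only expected to be installed, a quick availability check can run before starting a workflow. The default names below are assumptions: `node` is included because the npm packages need a Node.js runtime, and the globally installed npm packages themselves are not checked.

```python
import importlib.util
import shutil


def missing_dependencies(commands=("soffice", "pdftoppm", "node"), modules=("markitdown",)):
    """Return the command-line tools and Python modules that cannot be found."""
    missing = [cmd for cmd in commands if shutil.which(cmd) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing
```

An empty list means every listed tool is on `PATH` and every listed module is importable.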