⚡ Promptolis Original · AI Agents & Automation

🧩 Claude Skill Designer — Build Progressive-Disclosure Skills That Actually Ship

The structured skill-design system for building production Claude Skills (SKILL.md, scripts, references) — covering the 4 skill patterns, progressive-disclosure architecture, and the invocation-clarity rules that determine whether Claude actually uses your skill when it should.

⏱️ 15 min to design a skill blueprint 🤖 ~2 min in Claude 🗓️ Updated 2026-04-20

Why this is epic

Most custom Claude Skills fail because of invocation mismatch — Claude doesn't know when to trigger them. This Original produces the specific SKILL.md front-matter, invocation phrases, and trigger examples that match how Claude actually decides to load a skill. Based on analysis of official Anthropic skills (SDK examples, document-creation skills, repo-intelligence skills) and 80+ community skills.

Names the 4 skill patterns (tool-wrapper / workflow-orchestrator / knowledge-reference / agent-sub-behavior) with the architectural trade-offs. Wrong pattern = unusable skill. This Original diagnoses which pattern fits your use case before you write a single line.

Produces the complete skill scaffold — SKILL.md with progressive-disclosure sections, reference docs structure, scripts organization, and the invocation test plan — so you ship a working skill in hours, not days of trial-and-error.

The prompt

Promptolis Original · Copy-ready
<role> You are a Claude Skills architect with deep experience designing production skills for Claude Code, Claude Agent SDK, and Anthropic's desktop/mobile Claude apps. You've shipped 30+ skills internally at tech companies and consulted on 100+ community skills. You know the invocation patterns that work vs. fail, the progressive-disclosure architecture Anthropic recommends, and the common mistakes that make skills unusable. You are direct. You will name when a skill is over-engineered, when the invocation description is too vague, when the pattern is wrong for the use case, and when something should be a simpler pattern (prompt, tool, agent instruction) instead of a skill. </role> <principles> 1. 4 skill patterns: tool-wrapper, workflow-orchestrator, knowledge-reference, agent-sub-behavior. Pick one. 2. INVOCATION DESCRIPTION is 80% of skill success. Specific verbs, concrete trigger scenarios. 3. Progressive disclosure: SKILL.md (100-300 lines) → references (focused, loaded-as-needed) → scripts (called). 4. Test invocation with 10+ natural prompts before building out. Fix description first if invocation fails. 5. Match complexity to problem. Many skills = 80 lines + 1 script + 0 references. 6. Scripts are deterministic and return clear errors Claude can reason about. 7. References: one concept per file. Don't dump everything in one PATTERNS.md. 8. For shipped skills, version and document breaking changes. 
</principles> <input> <what-the-skill-should-do>{describe the skill's purpose in plain language}</what-the-skill-should-do> <when-should-claude-use-it>{specific trigger scenarios — what user intent maps to this skill}</when-should-claude-use-it> <target-users>{Claude Code users / Agent SDK devs / Claude app users / internal team}</target-users> <existing-tools-or-code>{APIs, scripts, docs you already have that the skill will wrap/reference}</existing-tools-or-code> <success-criteria>{how do you know the skill is working}</success-criteria> <constraints>{security, latency, data handling, org-specific requirements}</constraints> <delivery-format>{public/shared on GitHub / internal org-only / Claude app plugin}</delivery-format> </input> <output-format> # Skill Blueprint: [Skill name] ## Pattern Selection Which of the 4 patterns + why. ## Is This Actually A Skill? Honest check: should this be a prompt, tool, agent instruction, or skill. ## SKILL.md Structure Full SKILL.md scaffold with front-matter + sections. ## Invocation Design Description + trigger phrases + example scenarios. ## Progressive Disclosure Architecture What lives in SKILL.md vs references vs scripts. ## Reference Files Organization Which references, what's in each, how Claude discovers them. ## Scripts Organization Which scripts, what they do, interface, error handling. ## Invocation Test Plan 10+ natural-language prompts to validate skill triggers correctly. ## Common Failure Modes + Fixes Where this skill will likely fail + how to prevent. ## Shipping + Versioning Front-matter, semver, release notes. ## Key Takeaways 5 bullets. </output-format> <auto-intake> If input incomplete: ask for what the skill does, when Claude should use it, target users, existing tools, success criteria, constraints, delivery format. </auto-intake> Now, design:

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input
<what-the-skill-should-do>Help engineers at our company (mid-sized B2B SaaS, ~80 engineers) do incident post-mortems in a consistent, high-quality way. When an incident happens, Claude should be able to guide the on-call engineer through gathering facts, writing timeline, identifying root cause, action items, and producing the final post-mortem doc in our specific format.</what-the-skill-should-do>
<when-should-claude-use-it>When user says: 'help me write a post-mortem,' 'we had an incident yesterday, need to document,' 'our service went down, what do we do now,' 'I need to do an RCA,' 'blameless post-mortem for yesterday.' Generally when someone is in recovery/documentation mode after an incident.</when-should-claude-use-it>
<target-users>Internal engineers, mostly Claude Code users. Some will use Claude.ai web app. Skill should work in both environments.</target-users>
<existing-tools-or-code>We have: (1) a Slack webhook for posting post-mortems to #engineering, (2) a Google Docs template with our format, (3) an internal Retrospective tool that creates a Notion page with action items auto-assigned. Scripts for each exist in Python.</existing-tools-or-code>
<success-criteria>Post-mortem completed in 1-2 hours vs current 4-8. Consistent format. No missing sections. Action items make it to Notion. Engineers find it easier (not harder) than current process.</success-criteria>
<constraints>Security: no incident details should leave our infrastructure (can't send to external APIs). The skill needs to work with our internal service URLs. Latency: Real-time response required — can't be slow.</constraints>
<delivery-format>Internal only. Distributed via our Anthropic workspace. Eventually may open-source a sanitized version.</delivery-format>
🤖 Output

Skill Blueprint: Incident Post-Mortem Skill

A Claude Skill blueprint is the complete architectural plan — SKILL.md front-matter, progressive-disclosure structure, reference files, scripts, and invocation test plan — producing a production-ready skill that ships in hours rather than days. Based on analysis of official Anthropic skills (SDK examples, document-creation skills) and 80+ community skills, the #1 failure mode is invocation mismatch (Claude doesn't reach for the skill when it should), which comes from vague invocation descriptions. Your use case — incident post-mortems for an 80-engineer B2B SaaS — is a classic WORKFLOW-ORCHESTRATOR skill with clear trigger phrases, existing tooling, and measurable success criteria. This blueprint produces the complete skill architecture with invocation design tested against 12 trigger scenarios.

Pattern Selection

Pattern: Workflow-Orchestrator (with tool-wrapper elements).

Why this pattern:

  • Multi-step process (gather → timeline → root cause → action items → publish)
  • Decision points (is this a P0? is root cause technical vs. process? who owns action items?)
  • Checkpointable state (can pause mid-workflow, resume next day)
  • Wraps existing tools (Google Docs template, Notion API, Slack webhook)

Not tool-wrapper only: your use case has meaningful reasoning steps between tool calls.

Not knowledge-reference: domain knowledge is small; process is large.

Not agent-sub-behavior: not modifying how Claude operates generally — specific workflow.

Is This Actually A Skill?

Honest check:

Could this be a prompt? No. Too multi-step, needs persistent state across interactions, and invokes tools.

Could this be a tool? Partially — pieces of it (generate_post_mortem_doc, create_notion_page) should be tools. But the orchestration layer (asking the right questions, guiding the engineer through the process, adapting to their answers) needs reasoning — that's skill territory.

Could this be an agent instruction? No. Too specialized to put in every system prompt. Skill is correct pattern.

Verdict: Yes, this is a skill. Specifically, workflow-orchestrator that wraps your existing tools.

SKILL.md Structure

---
name: incident-post-mortem
description: >
  Guides engineers through conducting a blameless post-mortem after an
  incident. Use this when a user mentions writing a post-mortem, RCA,
  incident documentation, or describes a recent production incident they
  need to document. Walks through timeline construction, root cause
  analysis, action items, and publishing to the team's Google Doc +
  Notion + Slack channels using the standard internal format.
version: 1.0.0
triggers:
  - "write a post-mortem"
  - "incident documentation"
  - "RCA"
  - "root cause analysis"
  - "blameless post-mortem"
  - "our service went down"
  - "we had an incident"
  - "document yesterday's outage"
works_in:
  - Claude Code
  - Claude.ai web app
requires:
  - Internal Notion API access token (env: NOTION_TOKEN)
  - Google Docs template ID: [template_id]
  - Slack webhook for #engineering
---

# Incident Post-Mortem Skill

## When to use this skill

Trigger phrases (non-exhaustive):
- "help me write a post-mortem"
- "we had an incident — need to document"
- "our [service] went down yesterday, now what"
- "I need to do an RCA"
- "blameless post-mortem for [date/event]"

If user mentions ANY of: post-mortem, RCA, incident retrospective, or describes a recent production incident and the need to document it — use this skill.

Do NOT use this skill for: active incident response (use incident-response skill instead), general engineering retrospectives (different template), or minor bugs that don't warrant post-mortem.

## The 5-phase workflow

This skill guides the engineer through 5 phases. Skill maintains state across phases so engineer can pause + resume.

### Phase 1: Incident scope (5 min)
- What happened (1-2 sentences)
- When (UTC timestamps start + end)
- Severity (P0 / P1 / P2 — see references/SEVERITY.md)
- Who was on-call + who was paged

### Phase 2: Timeline (15-30 min)
- Chronological events from detection to resolution
- Focus on detection lag, decision points, recovery steps
- Claude asks follow-up questions for gaps
- See references/TIMELINE.md for format

### Phase 3: Root cause analysis (20-40 min)
- Use 5-whys methodology (see references/ROOT_CAUSE.md)
- Distinguish technical root cause vs. process root cause
- Identify blast radius + why detection was delayed (if it was)

### Phase 4: Action items (15-20 min)
- Preventive items (stop this class of incident)
- Detection items (catch earlier next time)
- Response items (respond faster)
- Each with owner + priority + target date
- See references/ACTION_ITEMS.md for format

### Phase 5: Publish (5 min)
- Generate Google Doc from template (scripts/generate_doc.py)
- Create Notion page for action items (scripts/create_notion.py)
- Post summary to Slack #engineering (scripts/post_slack.py)

## Skill state

The skill maintains state in a local file `.post-mortem-state.json` that tracks current phase + collected data + decisions. Engineer can pause at any phase and resume by saying 'continue the post-mortem.'

## Constraints

- All incident details stay local (no external APIs except Notion and Slack, which are routed through internal infrastructure)
- Engineer controls what goes into the doc — skill guides, doesn't decide
- Blameless framing enforced: see references/BLAMELESS.md for language patterns

## Common patterns

- If engineer is frustrated or tired, acknowledge and reduce scope
- If incident was caused by recent change, focus root cause on process (how did change get through), not person
- If similar incident recurring, reference previous post-mortem + highlight repeat pattern

SKILL.md size: ~120 lines. Appropriate for workflow-orchestrator complexity.

Invocation Design

The description field is the key trigger. It must tell Claude WHEN to use this skill, not WHAT the skill does.

Good description (used above):

'Guides engineers through conducting a blameless post-mortem after an incident. Use this when a user mentions writing a post-mortem, RCA, incident documentation, or describes a recent production incident they need to document.'

Why this works:

  • Starts with action verb ('Guides')
  • Names the domain ('blameless post-mortem')
  • Critical: 'Use this when a user mentions...' — gives Claude explicit trigger logic
  • Lists 3-4 specific trigger scenarios
  • Scoped ('after an incident they need to document') prevents over-triggering

Bad description (avoid):

'A skill for incident management and retrospectives.' — Too vague. Doesn't tell Claude when to trigger.

Progressive Disclosure Architecture

What lives where:

SKILL.md (entry point, ~120 lines):

  • The 5-phase workflow overview
  • When to use / when NOT to use
  • State management mention
  • High-level constraints
  • Links to references for depth

references/ (loaded on-demand):

  • SEVERITY.md — severity rubric (P0/P1/P2 criteria)
  • TIMELINE.md — timeline format + examples
  • ROOT_CAUSE.md — 5-whys methodology + templates
  • ACTION_ITEMS.md — action-item format + priority guidance
  • BLAMELESS.md — blameless language patterns + antipatterns
  • EXAMPLES.md — 2-3 anonymized past post-mortems for reference

scripts/ (called by skill):

  • generate_doc.py — produces Google Doc from template
  • create_notion.py — creates Notion action-items page
  • post_slack.py — posts summary to Slack
  • save_state.py — saves phase state
  • load_state.py — loads phase state for resume

Reference Files Organization

references/SEVERITY.md (~50 lines)
# Incident Severity Rubric

## P0 (Critical)
- User-facing outage affecting >10% of users
- Data loss or corruption
- Security breach
- Revenue-impacting billing failure

## P1 (High)
- User-facing degradation
- Critical feature broken
- <10% users affected

## P2 (Medium)
- Minor feature broken
- Internal-only issue
- Performance degradation without outage

## What NOT to post-mortem
- P3 issues (minor bugs) — use regular bug process
- User error / third-party outages (unless we failed to handle gracefully)

references/TIMELINE.md (~80 lines)

Timeline format, UTC timestamp convention, detection-vs-impact timing, examples.

references/ROOT_CAUSE.md (~100 lines)

5-whys methodology, technical-vs-process root cause distinction, contributing factors template.

references/ACTION_ITEMS.md (~60 lines)

Owner assignment, priority rubric (P0 actions within 30 days, P1 within 90), target date format.

references/BLAMELESS.md (~80 lines)

Blameless language: 'the system allowed X' (not 'Bob did X'). Examples of blameful vs. blameless phrasing.

references/EXAMPLES.md (~150 lines)

2-3 anonymized past post-mortems. Powerful for Claude to pattern-match quality.

Total references: ~500 lines across 6 files. Progressive disclosure intact.

Scripts Organization

scripts/generate_doc.py
#!/usr/bin/env python3
"""Generate post-mortem Google Doc from template."""
import sys, json, argparse
from google_docs_api import create_doc_from_template  # internal wrapper

TEMPLATE_ID = "[your_template_id]"

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--state-file', required=True)
    args = parser.parse_args()

    try:
        with open(args.state_file) as f:
            state = json.load(f)
        doc_url = create_doc_from_template(
            template_id=TEMPLATE_ID,
            variables={
                'incident_title': state['title'],
                'incident_date': state['date'],
                'severity': state['severity'],
                'timeline_items': state['timeline'],
                'root_cause': state['root_cause'],
                'action_items': state['action_items'],
            }
        )
        print(json.dumps({'doc_url': doc_url, 'status': 'success'}))
    except (OSError, KeyError, json.JSONDecodeError) as exc:
        # Clear error + non-zero exit, per the script interface pattern
        print(json.dumps({'status': 'error', 'message': str(exc)}))
        sys.exit(1)

if __name__ == '__main__':
    main()

Script interface pattern:

  • Input: JSON state file path (via --state-file)
  • Output: JSON to stdout with clear success/error format
  • Errors: raise with clear message, non-zero exit

Repeat pattern for create_notion.py and post_slack.py.
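As one illustration of that pattern, a minimal post_slack.py sketch might look like the following. The summary format and the SLACK_WEBHOOK_URL env var are assumptions, not part of the original spec — adapt both to your internal webhook.

```python
#!/usr/bin/env python3
"""Post a post-mortem summary to Slack (sketch).

Invoke as: python post_slack.py --state-file .post-mortem-state.json
"""
import argparse
import json
import os
import sys
import urllib.request

# Assumption: webhook URL supplied via env var, not hardcoded
WEBHOOK_URL = os.environ.get("SLACK_WEBHOOK_URL", "")

def build_payload(state: dict) -> dict:
    """Turn collected post-mortem state into a Slack message payload."""
    return {
        "text": (
            f"*Post-mortem published:* {state['title']}\n"
            f"Severity: {state['severity']} | Date: {state['date']}\n"
            f"Action items: {len(state.get('action_items', []))}"
        )
    }

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--state-file", required=True)
    args = parser.parse_args()
    try:
        with open(args.state_file) as f:
            state = json.load(f)
        req = urllib.request.Request(
            WEBHOOK_URL,
            data=json.dumps(build_payload(state)).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
        print(json.dumps({"status": "success"}))
    except Exception as exc:
        # JSON error + non-zero exit, so Claude can reason about the failure
        print(json.dumps({"status": "error", "message": str(exc)}))
        sys.exit(1)
```

The same --state-file in / JSON out / non-zero exit shape carries over to create_notion.py unchanged.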

Invocation Test Plan

Before shipping, test with these 12 prompts. Skill should trigger on prompts 1-9, NOT on 10-12:

Should trigger:

1. 'I need help writing a post-mortem for our auth service going down yesterday'

2. 'Can you walk me through an RCA for this morning's deployment issue'

3. 'We had a big incident last night — need to document it'

4. 'Help me do a blameless post-mortem'

5. 'Our payment system was down for 2 hours, what do I do now'

6. 'I need to write an incident retrospective'

7. 'Documentation for yesterday's outage'

8. 'Post-mortem time — customer-facing outage 10am-11:30am'

9. 'Resume my post-mortem from yesterday' (tests resume flow)

Should NOT trigger:

10. 'How do I prevent incidents?' (general advice question, not documentation)

11. 'Our service is down right now' (active incident — different skill)

12. 'Write me a story about an incident' (creative, not real)

Test procedure:

  • In Claude Code or Claude app with skill installed
  • Use each prompt
  • Verify: did skill invoke? (should for 1-9, should not for 10-12)
  • If mismatch, adjust description in SKILL.md
  • Re-run test

Common Failure Modes + Fixes

Failure 1: Skill triggers on 'prevent incidents' type questions.

  • Fix: add to description: 'NOT for general incident prevention advice — only for documenting a specific recent incident'

Failure 2: Skill doesn't trigger on 'our service went down' without 'post-mortem' word.

  • Fix: add more trigger phrases including recovery-mode language

Failure 3: Engineer pauses mid-workflow, loses state.

  • Fix: state file must be robust. Script save_state.py should save every phase. load_state.py should handle missing/malformed files.
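One way to get that robustness is an atomic write plus a defensive loader — a sketch assuming the .post-mortem-state.json schema implied by generate_doc.py (the exact field names are illustrative):

```python
"""State persistence helpers (sketch).

File name follows the blueprint's .post-mortem-state.json convention;
the field names below are assumptions based on generate_doc.py.
"""
import json
import os

STATE_FILE = ".post-mortem-state.json"

EMPTY_STATE = {
    "phase": 1, "title": None, "date": None, "severity": None,
    "timeline": [], "root_cause": None, "action_items": [],
}

def save_state(state: dict, path: str = STATE_FILE) -> None:
    # Write to a temp file, then rename: an interrupted save can't corrupt state
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, path)

def load_state(path: str = STATE_FILE) -> dict:
    # Missing or malformed file falls back to a fresh phase-1 state
    try:
        with open(path) as f:
            state = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return dict(EMPTY_STATE)
    # Backfill keys a partial or older state file may lack
    return {**EMPTY_STATE, **state}
```

With this shape, "continue the post-mortem" just calls load_state and resumes at the recorded phase, even if the previous session died mid-save.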

Failure 4: Generated Google Doc is malformed.

  • Fix: template variables must be exhaustively tested. Handle missing data gracefully.
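A cheap way to "handle missing data gracefully" is a pre-publish check that refuses to generate the doc while state is incomplete. The field names mirror the generate_doc.py example above; the helper itself is hypothetical:

```python
# Pre-publish validation (sketch): field names mirror generate_doc.py.
REQUIRED_FIELDS = ["title", "date", "severity", "timeline",
                   "root_cause", "action_items"]

def missing_fields(state: dict) -> list:
    """Return names of required fields that are absent or empty."""
    return [k for k in REQUIRED_FIELDS if not state.get(k)]
```

Run this at the top of Phase 5; if it returns anything, the skill loops back to the phase that owns the missing data instead of producing a half-filled template.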

Failure 5: Notion page created but action items missing owners.

  • Fix: Phase 4 must require owners before allowing progression.

Failure 6: Skill feels verbose — too many questions.

  • Fix: skill should adapt. If engineer provides rich info upfront, skip questions. If terse, ask more.

Shipping + Versioning

Front-matter versioning:

version: 1.0.0

Distribution:

  • Internal: push to your Anthropic workspace skills dir. Engineers opt-in.
  • Versioning: semver. 1.0.0 → 1.1.0 for non-breaking enhancements. 2.0.0 for breaking changes (e.g., new phases added, state format change).
  • Release notes in CHANGELOG.md at skill root.

Open-source path (future):

  • Sanitize: remove internal URLs, Notion template IDs, Slack webhook.
  • Replace with env-var configuration so users set their own.
  • Add EXAMPLES.md with generic examples.
  • License: Apache 2.0 or MIT.

Key Takeaways

  • Pattern: Workflow-orchestrator with tool-wrapper elements. 5-phase workflow with state. Not a prompt, not a tool — skill is correct.
  • Invocation description is 80% of success. Start with verb, name domain, use 'Use this when user mentions...' + specific trigger scenarios. Test with 12 prompts before shipping.
  • Progressive disclosure: SKILL.md (120 lines) + 6 reference files + 5 scripts. Not everything in SKILL.md.
  • State file + resume flow is essential for multi-session use. Don't force engineer to complete in one sitting.
  • Test invocation against 12 scenarios (9 should-trigger, 3 should-NOT-trigger). Fix description before building more.

Common use cases

  • Developers building custom Claude Skills for internal teams
  • Claude Agent SDK users creating reusable skill libraries
  • Consultants shipping Claude-based products to clients
  • Solo founders building Claude-powered tools for their own workflows
  • Engineering teams standardizing Claude usage across the org
  • Documentation teams turning runbooks into Claude Skills
  • AI-first product teams building skill-based product features
  • Researchers wrapping research tooling into Claude-invocable form
  • Developer-relations teams creating demo skills for customer education

Best AI model for this

Claude Opus 4 or Sonnet 4.5. Skill design requires reasoning about Claude's invocation logic, file organization, and progressive-disclosure UX simultaneously. Top-tier reasoning matters.

Pro tips

  • The INVOCATION DESCRIPTION in SKILL.md is 80% of whether Claude uses the skill. Write it as if you're describing when a junior engineer should reach for this tool. Specific verbs and concrete trigger scenarios, not abstract descriptions.
  • Progressive disclosure isn't optional — it's the architecture. SKILL.md is the entry point (100-300 lines max). References are deep-linked docs Claude loads ONLY when needed. Scripts are called, not inlined. If your SKILL.md is 1,000 lines, you haven't progressively disclosed.
  • The 4 skill patterns: (1) TOOL-WRAPPER = invokes an external tool/API with specific calling conventions; (2) WORKFLOW-ORCHESTRATOR = multi-step process with decision points; (3) KNOWLEDGE-REFERENCE = expert-domain knowledge Claude references; (4) AGENT-SUB-BEHAVIOR = modifies how Claude operates in a context. Pick ONE pattern per skill.
  • Test invocation with 10+ natural-language prompts BEFORE building out scripts. If Claude doesn't reach for the skill on 'I want to do X' phrasing that should trigger it, fix the SKILL.md description before writing more code.
  • Scripts (python/bash/etc) should be deterministic and well-tested. Skills fail when a script errors silently and Claude continues without knowing. Return clear error messages that Claude can reason about.
  • References should be focused docs — one concept per reference file. A 'PATTERNS.md' covering 10 patterns is harder for Claude to use than 10 separate pattern files. Claude loads what it needs; help it find it.
  • Don't over-engineer. Many skills should be 80 lines of SKILL.md + 1 script + 0 references. Not every skill needs the full scaffold. Match complexity to the actual problem.
  • For public/shipped skills, include versioning in front-matter. Users pin skill versions; breaking changes need semver. For internal skills, simpler is fine.

Customization tips

  • Test your skill's invocation BEFORE building scripts/references. If Claude doesn't trigger the skill correctly with just a SKILL.md stub, no amount of script development will fix it. Description first, then depth.
  • Look at Anthropic's own shipped skills (in the SDK repo and Claude apps) for reference on style. Their invocation descriptions are consistently good examples of the pattern.
  • For internal-only skills, you can be less defensive about edge cases. For public/shipped skills, handle every edge case because you're not there to debug user issues.
  • Version your skill from day 1 (1.0.0). Even internal skills benefit from versioning when changes break engineer workflows. Semver is free discipline.
  • If your skill has more than 3-4 reference files, consider whether it should be 2 skills instead of 1. Complexity scales super-linearly — 2 focused skills ship better than 1 mega-skill.

Variants

Tool-Wrapper Mode

For skills that invoke external tools/APIs (Stripe, Notion, internal service). Focuses on call conventions, error handling, auth patterns.

Workflow Mode

For multi-step workflow skills (incident-response, onboarding, code-review). Emphasizes state, decision points, checkpoint logic.

Knowledge-Reference Mode

For expert-domain skills (legal reasoning, medical protocols, compliance). Emphasizes reference-doc organization and citation.

Sub-Agent Behavior Mode

For skills that modify Claude's behavior (specialized reviewer, domain-specific coder). Emphasizes behavioral constraints and output format.

Frequently asked questions

How do I use the Claude Skill Designer — Build Progressive-Disclosure Skills That Actually Ship prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with Claude Skill Designer — Build Progressive-Disclosure Skills That Actually Ship?

Claude Opus 4 or Sonnet 4.5. Skill design requires reasoning about Claude's invocation logic, file organization, and progressive-disclosure UX simultaneously. Top-tier reasoning matters.

Can I customize the Claude Skill Designer — Build Progressive-Disclosure Skills That Actually Ship prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: the invocation description in SKILL.md (80% of whether Claude uses the skill — write it as if describing when a junior engineer should reach for the tool, with specific verbs and concrete trigger scenarios) and the progressive-disclosure architecture (SKILL.md as a 100-300-line entry point, references loaded only when needed, scripts called rather than inlined — if your SKILL.md is 1,000 lines, you haven't progressively disclosed).

← All Promptolis Originals