Claude Code Kit

Skills

What Are Skills?

Skills are .claude/skills/<name>/SKILL.md files that Claude Code loads automatically via semantic matching. When a task matches a skill's description, Claude gets the skill's knowledge injected into context — no manual loading needed.

Skills vs Lessons

tasks/lessons/.claude/skills/
TriggerUser corrects ClaudeClaude discovers something non-obvious
ContentWhat went wrong + rule to followProblem + context + verified solution
FormatYAML frontmatter + Issue / Root Cause / Rule (one file per lesson)YAML frontmatter + structured sections
Loading_index.md → Top Rules at session boot; individual lesson files on-demandAutomatic via semantic matching
ScopeMistakes to avoidKnowledge to apply proactively
LifetimeGrows per-project, refreshed via /lesson-refreshGrows per-project, can be shared

Both systems complement each other. Lessons prevent repeated mistakes. Skills proactively surface relevant knowledge.

The Skill Extractor

The kit includes a built-in skill-extractor skill at .claude/skills/skill-extractor/SKILL.md. It:

  • Can be invoked manually with /skill-extractor
  • Activates automatically via semantic matching when discovery-worthy knowledge appears
  • Uses a structured template for consistent skill formatting
  • Has quality gates to prevent low-value extractions

Enabling the Reminder Hook

An optional UserPromptSubmit hook reminds Claude to consider skill extraction at each prompt. It's not enabled by default (same as auto-lint and auto-format).

To enable, add to .claude/settings.json under the hooks key:

"UserPromptSubmit": [
  {
    "hooks": [
      {
        "type": "command",
        "command": ".claude/hooks/skill-extract-reminder.sh"
      }
    ]
  }
]

Writing Good Skills

The 7 Principles

Every skill — whether hand-written or generated — should follow these principles:

  1. Explain the WHY — Every rule needs a rationale. "Don't use any" is weak. "Don't use any because it disables type checking and hides bugs that only surface in production" is strong.

  2. Be concrete — Real code examples over abstract descriptions. A developer should be able to copy-paste a pattern and have it work.

  3. Be project-specific — Reference actual file paths, dependency versions, and config from the project. Generic advice belongs in docs, not skills.

  4. Be opinionated — One best approach, enforced. Don't present a menu of options — pick the right one and explain why.

  5. Be testable — Every rule should be verifiable via lint, test, or code review. If you can't check it, it's not a rule — it's a suggestion.

  6. Show both sides — For critical rules, show both correct AND incorrect code. Developers learn faster from contrast.

  7. Keep it focused — Each skill covers one concern. If it's over 500 lines, split it into multiple skills.

Do Extract

  • Framework quirks that caused hard-to-debug issues
  • Config combinations that are required but undocumented
  • Workarounds for known dependency bugs
  • Project patterns that deviate from convention
  • API behaviors that differ from documentation

Don't Extract

  • Anything easily found in official docs
  • One-time fixes unlikely to recur
  • User preferences (put in CLAUDE.md)
  • User corrections (put in tasks/lessons/<YYYY-MM-DD>-<slug>.md using tasks/lessons/_TEMPLATE.md)
  • Unverified guesses

Quality Checklist

  • Problem is clear to someone unfamiliar with the context
  • Solution has been verified in this session
  • Doesn't duplicate CLAUDE.md rules or existing lessons
  • YAML description is router-style — capability + a Use when… trigger (+ Do NOT use… / use /X instead when it overlaps another skill). See § Skill Conventions → Router-style description
  • Includes verification steps
  • Every rule has a rationale (WHY, not just WHAT)
  • Code examples use the project's actual stack and versions

Skill Conventions (v1.11.0+)

Four conventions all skills should follow. The scripts/validate-skills.sh validator warns when these are missing.

1. Core Rule (required for every skill)

Every SKILL.md opens with a ## Core Rule section — one sentence (max ~25 words) that anchors the skill's ethical scope.

## Core Rule

Form a single hypothesis at a time. Validate with a falsifiable test before fixing. Never patch a symptom you haven't reproduced.

Goes immediately after the title (and the optional intro paragraph), before any other ## section. The rule states what must be preserved or what's forbidden — the deal-breaker that distinguishes this skill from a free-form prompt. Inspired by codex-complexity-optimizer.

2. Router-style description (required for every skill)

The description is the skill's router — Claude reads it (not the body) to decide whether to fire. Write it as capability + trigger, not a bare capability statement:

  • Capability — what the skill does, in one clause.
  • Use when… — the concrete situations, user phrasings, or file types that should trigger it.
  • Do NOT use… / use /other-skill instead — a negative trigger whenever the skill overlaps a sibling. This is what stops the wrong skill from firing.
description: Audit code for smells, error-handling gaps, and maintainability issues with actionable fixes. Use for a general code-quality pass not tied to project rules. To audit against the project's golden-principles.yaml use /quality-audit instead.

Keep it under ~500 characters (the validator warns past that). A capability-only description like "Systematic root-cause debugging" matches far worse than one that also says when to reach for it. For overlapping families — the *-audit skills, retro / pulse / scorecard, skill-extractor / skill-generator, doc-gardening / documentation-audit — the negative trigger is mandatory, not optional. Pattern borrowed from kepano/obsidian-skills, whose descriptions read as explicit routing instructions.

3. Default Behavior (required for audit-class skills)

Audit-class skills (skills with ## Output Format that produce a structured report) include a ## Default Behavior section telling the agent what to do autonomously when invoked. This removes friction from "audit / scan / review / give me a report" requests.

## Default Behavior

When the user asks to audit, scan, review, or "give me a report" for <DOMAIN>, produce the full <skill-name> report automatically using the Process and Output Format sections below. Do not require the user to specify fields.

Only modify files when the user explicitly requests implement / fix / apply / refactor. By default, this skill is **report-only**.

The 10 audit-class skills today: code-quality-audit, performance-audit, architecture-review, accessibility-audit, testing-audit, dependency-audit, documentation-audit, design-review, project-health-report, dead-code-audit.

4. Phase 1 Inventory naming (audit-class skills)

Audit-class skills use uniform Phase 1 naming and an explicit "candidates, not findings" framing — discouraging the agent from reporting raw scanner output as final findings.

### Phase 1: Inventory (first-pass leads)

This pass produces **candidates**, not findings. Treat counts as leads for deeper inspection in later phases. Do not report Phase 1 raw output as the final result.

<phase-specific content>

Anti-staleness: defer to --help

Skills that wrap a CLI (ship, references-sync, web-read) should point the agent at the tool's own --help / help output rather than hardcoding an exhaustive flag list:

Run `<tool> --help` for the current flags — it stays in sync with the installed version.

A hardcoded command catalog rots the moment the tool updates; the tool's own help never does. Document the 3–5 common invocations inline for muscle memory, and defer the long tail to --help. Pattern borrowed from kepano/obsidian-skills (obsidian-cli: "Run obsidian help… This is always up to date.").

Reusable Shared Blocks

When writing a new skill, copy patterns from the shared blocks:

  • .claude/skills/_shared/blocks/core-rule.md — Core Rule template + examples
  • .claude/skills/_shared/blocks/default-behavior.md — Default Behavior pattern with <DOMAIN> / <skill-name> placeholders
  • .claude/skills/_shared/blocks/inventory-framing.md — Phase 1 framing

For templated skills (.tmpl files in _templates/), inline the actual text — bespoke per skill, not block-substituted.


Skill Structure

Simple (default)

.claude/skills/<skill-name>/
  SKILL.md

Most skills are a single SKILL.md file. Use this for focused, specific discoveries.

Extended (for complex skills)

.claude/skills/<skill-name>/
  SKILL.md                  # Main instructions (< 500 lines, lean orchestration)
  references/               # Loaded on-demand at specific phases
    patterns.md             # Approved patterns with code examples
    anti-patterns.md        # Forbidden patterns with severity ratings
    checklist.md            # Pre-commit/merge verification checklist
    error-patterns.md       # Lookup tables, schema specs, language quirks
  assets/                   # Verbatim content copied/embedded by the skill
    output-template.md      # Templates the skill writes out (with placeholders)

Use the extended structure when:

  • The skill has 5+ rules with code examples
  • You need both correct AND incorrect examples side by side
  • A pre-commit checklist would prevent recurring mistakes
  • A lookup table is large but only needed at one phase (extract it to references/)
  • The skill writes a file from a template (put the template in assets/)

Templates for all files are in .claude/skills/skill-extractor/resources/.

When to split content into references/

references/ holds on-demand reading material — the agent loads it only when a specific phase asks for it. This keeps the SKILL.md focused on orchestration logic instead of catalog content.

A section is a good candidate to move when all three are true:

  1. Long — 30+ lines of mostly tabular or list content (lookup-style, not narrative)
  2. Phase-specific — only used at one or two points in the Process section, not throughout
  3. Catalog-like — could grow over time (language-specific patterns, anti-patterns by category, framework quirks)

Counter-examples — keep these inline in SKILL.md:

  • Output format — load-bearing structural anchor; readers scan SKILL.md to know what the output looks like
  • Process phases — the orchestration sequence is the skill
  • Run Mode tables — small, decision-critical, must be visible to anyone reading the skill

When you do split, replace the inline content with a short pointer at the phase that needs it:

### Phase 2: Hypothesize
...
4. **Check the language pattern lookup** — if the error matches a category in
   `references/error-patterns.md`, use it to seed hypotheses

See .claude/skills/debug/SKILL.md (line ~140) and .claude/skills/debug/references/error-patterns.md for a worked example.

Bloat audit recipe

Run periodically to spot SKILL.md files that have grown beyond useful density:

cd .claude/skills
for f in */SKILL.md; do
  lines=$(wc -l < "$f" | tr -d ' ')
  examples=$(grep -c '^```' "$f")
  if [ "$lines" -ge 200 ] || [ "$examples" -ge 6 ]; then
    echo "  $f$lines lines, $examples code blocks"
  fi
done

For each flagged skill:

  1. Open the SKILL.md and look for any section matching the three split criteria above
  2. If found → extract to references/<topic>.md and replace with a one-line pointer
  3. If not found → the skill is genuinely dense, leave it alone
  4. If the SKILL.md exceeds 500 lines → consider splitting into multiple skills entirely (see "Keep it focused" in the 7 Principles)

Template-Based Skill Generation

For skills that share common kit rules (preamble, scope discipline, verification order), the kit provides a template system that prevents doc drift across skills.

How It Works

  1. Shared blocks in .claude/skills/_shared/blocks/ contain reusable content (e.g., preamble.md, scope-rules.md)
  2. Templates in .claude/skills/_templates/*.tmpl reference blocks via {{PLACEHOLDER}} tags
  3. Build script (./scripts/build-skills.sh) assembles templates into final SKILL.md files

Available Blocks

PlaceholderFileContent
{{PREAMBLE}}preamble.mdSession boot context check
{{SCOPE_RULES}}scope-rules.mdRead-only analysis, scope discipline
{{VERIFICATION_ORDER}}verification-order.mdTypecheck > lint > test > smoke
{{PLAN_FIRST}}plan-first.mdPlan-first methodology for multi-file changes
{{CONTEXT_GATHERING}}context-gathering.mdProject config and structure analysis
{{REPORT_FOOTER}}report-footer.mdReport formatting guidelines
{{MODE_DETECTION}}mode-detection.mdInteractive vs mode:headless run mode contract

Usage

# List available blocks and templates
./scripts/build-skills.sh --list

# Build all templated skills
./scripts/build-skills.sh

# Preview without writing
./scripts/build-skills.sh --dry-run

When a shared block changes (e.g., verification order gets a new step), run build-skills.sh once and all templated skills update automatically.


Headless Mode Contract

Interactive skills can opt into a mode:headless run mode for automation, scheduled runs, and skill-to-skill orchestration. The canonical contract lives in .claude/skills/_shared/blocks/mode-detection.md and is summarized below.

Detection

A skill is in headless mode when its argument bag contains the literal token mode:headless. The token is a flag, not content — strip it before treating the remainder as user input.

/<skill>                        # interactive
/<skill> mode:headless          # headless
/<skill> mode:headless context  # headless, with context hint

Contract

In headless mode, a skill must:

  • Skip every "ask the user" step. Substitute the default documented in its own ## Run Mode section.
  • Never prompt for confirmation. Decisions are pre-committed by the defaults.
  • Replace the interactive end ("What's next?") with a structured terminal report listing what was done.
  • Exit non-zero with a clear error rather than blocking on user input if a default cannot be satisfied.

Headless mode applies for the entire run once detected — do not switch back to interactive mid-skill.

Adding headless mode to a skill

  1. Identify the skill's interactive decision points (anywhere it asks the user a question or waits for approval).
  2. For each, document a deterministic default that a human would consider reasonable for unattended runs.
  3. Add a ## Run Mode section to the skill's SKILL.md with a table of Decision point | Interactive default | Headless default.
  4. Reference the shared contract: see .claude/skills/_shared/blocks/mode-detection.md.

Some skills should remain interactive-only — those whose value is the dialogue (e.g., office-hours, deepening-review). Do not force headless mode where it would defeat the skill's purpose.

Skills with headless support

SkillUseful headless trigger
skill-extractorHook reminder firing after a session with verified discoveries
retroWeekly cron (/loop weekly) producing a strictly-from-artifacts report
shipCI pipeline after pre-validation outside the skill
review-pipelineScheduled PR sweeps, CI pre-merge gates, skill-to-skill orchestration

Extending the Kit (Resolution Order)

The kit ships ~25 core skills. Projects regularly need to (a) add new skills the kit doesn't ship, (b) tweak existing skills for a project's stack, or (c) override a skill entirely for a single project. The kit handles all three through a fixed resolution order — at runtime, Claude Code reads the first match it finds.

The four layers

Priority  Layer                              Location                                Owner
⬆ 1       Project-local overrides            .claude/skills/<name>/SKILL.md          Project
2         Community extensions               .claude/extensions/<name>/SKILL.md      Third-party
3         Project-overlay slot               .claude/skills/<name>/project/          Project (additive)
⬇ 4       Kit core                           .claude/skills/<name>/SKILL.md          Kit

When two layers define the same <name>, the higher-priority layer wins entirely — there is no field-level merging. The lower-priority version is shadowed, not deleted; removing the override restores the lower layer automatically on the next read.

Each layer in detail

Layer 1 — Project-local overrides. A file at .claude/skills/<name>/SKILL.md in the project repo that shadows the kit's version of the same name. Use this when one project needs a fundamentally different version of a skill the kit ships. The override is one file; the rest of the kit's <name>/templates/ directory still applies unless the override also redefines those. Mark project-managed: commit it normally; install.sh only writes to skill directories that don't exist, so existing project overrides survive upgrades.

Layer 2 — Community extensions. A folder under .claude/extensions/<name>/SKILL.md containing a skill the kit does not ship. Source: a teammate's PR, a community plugin, or a Claude Code plugin marketplace entry mirrored locally. Same shape as a core skill, just sitting in a different directory. The kit's installer never touches .claude/extensions/ — its lifecycle is owned by whoever published it.

Layer 3 — Project-overlay slot. A folder at .claude/skills/<name>/project/ next to a core skill. Used for additive content the skill explicitly looks for at runtime (e.g. .claude/skills/quality-audit/project/golden-principles.local.yaml). Unlike Layer 1, this doesn't shadow the SKILL.md — it adds project-specific content the kit-managed skill knows how to pick up. Hooks already use this pattern at .claude/hooks/project/ and install.sh is wired to preserve it.

Layer 4 — Kit core. Default. Everything ships at this layer.

When to use which

You want toUse
Add a new skill the kit doesn't haveLayer 2 (extension) — keeps it separate from kit upgrades
Tweak a kit skill's behaviour for one projectLayer 3 (overlay) if the skill supports it, else Layer 1 (override)
Replace a kit skill entirelyLayer 1 (override)
Contribute a new skill back to the kitOpen a PR; it becomes Layer 4

Conflict resolution

The kit's scripts/validate-skills.sh warns when two layers define the same <name>. The warning is informational — the priority order is deterministic — but it's a signal that a kit upgrade may be hiding a project-local intent. Resolve by either renaming the override or by upstreaming the customization (open a PR to move it to Layer 4 or a community extension).

See ADR-015 for the decision history behind this model.


Skill Lifecycle

  1. Discovery — Claude encounters something non-obvious during a session
  2. Verification — The discovery is confirmed to be correct
  3. Extraction — Written as a SKILL.md using the template
  4. Loading — Automatically picked up by Claude Code's semantic matching
  5. Maintenance — Update or remove skills as the project evolves

Maintenance Tips

  • Review .claude/skills/ periodically — delete stale skills
  • If a skill's solution becomes the documented approach, remove it
  • If a skill applies to multiple projects, consider extracting it to ~/.claude/skills/ (user-level)

Periodic Cleanup ("Spa Day")

As rules, skills, and lessons accumulate, agent performance degrades. This is context bloat — the agent reads too much conflicting or irrelevant information before doing actual work.

When to Clean Up

  • Agent starts ignoring rules it used to follow
  • Agent behavior becomes inconsistent across sessions
  • Session boot requires reading 10+ files
  • Skills or lessons contradict each other
  • You notice the agent "forgetting" things mid-session more often

Cleanup Checklist

Run this periodically (every 2-4 weeks or when symptoms appear):

1. Audit CLAUDE.md

  • Is every instruction still relevant?
  • Are there rules that no longer apply to the project?
  • Are there contradictions between rules?
  • Is the agent_docs directory listing still accurate?

2. Consolidate Lessons

  • Read all entries in tasks/lessons/ (or use /lesson-refresh)
  • Merge duplicate or similar lessons
  • Remove lessons that have been encoded as rules in CLAUDE.md
  • Remove lessons for bugs that were structurally fixed (no longer possible)

3. Prune Skills

  • List all skills: ls .claude/skills/
  • For each skill: is the problem still relevant?
  • For each skill: has the solution become standard (in docs or framework)?
  • For each skill: does the YAML description still match what it does?
  • Remove stale or redundant skills

4. Check for Contradictions

  • Do any skills contradict CLAUDE.md rules?
  • Do any lessons contradict skills?
  • Do any agent_docs files give conflicting advice?
  • Resolve every contradiction — pick one source of truth

5. Measure Context Load

Count how many files the agent reads at session boot + task start:

  • Healthy: 3-5 files
  • Concerning: 6-10 files
  • Critical: 10+ files — consolidate immediately

How to Run a Cleanup Session

Tell the agent:

Read all files in agent_docs/, tasks/lessons/, all skills in .claude/skills/,
and CLAUDE.md. Identify contradictions, stale content, and redundancies.
Present a consolidation plan. Do not make changes until I confirm.

This leverages the agent's ability to cross-reference content while keeping you in control of what gets removed.