Lesson 11 · Claude Skills

Testing Your Skills

A skill that works once isn't necessarily reliable: it needs to produce the same quality of output across different inputs. This lesson covers how to test slash command invocation, auto-detection, and output consistency, and how to keep a test log so that regressions show up when you edit a skill.

Try in chat · Claude Code testing · auto-detection · regression log · common failures

Test invocation

Verify the skill triggers correctly — both by slash command and auto-detection.

Test output quality

Check that the format, length, and content match what you specified.

Keep a test log

Record inputs, expected outputs, and results so regressions are visible.

Method 1 — “Try in chat” from the skills page

The fastest way to test a skill after creating or editing it is to use the Try in chat button in the claude.ai skills manager. It opens a fresh conversation with the skill pre-loaded so you can immediately type a test prompt.

claude.ai/customize/skills

Your skills

/code-review: Review code for bugs, security issues, and style violations.

/commit: Write a Conventional Commit message from staged changes.

/standup: Generate a daily standup update.

Menu (•••) for /standup: ▶ Try in chat, Edit with Claude, Replace, Download, Uninstall

▶ Try in chat opens a new conversation with /standup ready to run.

💡 Tip: After clicking “Try in chat”, type exactly what a real user would type — not /standup, but something like “I need my standup for today” — to test auto-detection at the same time.

Method 2 — test in Claude Code

For skills in ~/.claude/skills/, start Claude Code in your project and run the slash command directly. Test three things in sequence.

Slash command

> /code-review

Skill must activate immediately.

Auto-detection

> can you check src/auth.py for me?

Skill should activate without the /.

With a file

> /code-review src/auth.py

Output must match your format spec.

# Open Claude Code in your project root
$ claude
> /code-review src/auth.py
  Running /code-review on src/auth.py…
  ## Code Review — src/auth.py
  🔴 Critical (1)
  └─ Line 42 — SQL query built with f-string. Use parameterised queries.
  🟡 Warning (2)
  └─ Line 18 — password compared with == instead of secrets.compare_digest
  └─ Line 67 — JWT secret read from env but no fallback check
  🔵 Style (0)
  1 critical, 2 warnings, 0 style notes.

ℹ️ Info: If the slash command does not appear in the autocomplete list inside Claude Code, the skill file is probably in the wrong folder or has a syntax error in the YAML front matter. Run cat ~/.claude/skills/code-review/SKILL.md to confirm the file exists and to inspect its front matter.
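The same check can be scripted. The sketch below is a rough illustration, not an official validator: check_skill_file is a hypothetical helper, it only looks for the opening and closing --- lines and for top-level name and description keys (the two fields this lesson relies on), and it does not parse YAML properly.

```python
from pathlib import Path


def check_skill_file(skill_dir: Path) -> list[str]:
    """Return a list of problems found with skill_dir/SKILL.md (empty if none)."""
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return [f"missing file: {skill_file}"]

    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ["front matter must start with '---' on the first line"]

    # Find the closing '---' that ends the front matter block.
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return ["front matter has no closing '---'"]

    # Collect top-level keys. This is a rough scan for unindented 'key: value'
    # lines, not a real YAML parse.
    keys = set()
    for line in lines[1:end]:
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0].strip())

    problems = []
    for required in ("name", "description"):
        if required not in keys:
            problems.append(f"front matter is missing '{required}'")
    return problems
```

Calling check_skill_file(Path.home() / ".claude" / "skills" / "code-review") returns an empty list when the file is where Claude Code expects it and both fields are present. A real YAML parser would catch more subtle syntax errors than this scan does.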

What to check on every test run

Use this checklist for each test. If any row fails, the fix usually lives in the corresponding SKILL.md section.

☐ Slash command activates the skill

If it fails: Check the name field in front matter

☐ Auto-detection phrase triggers the skill

If it fails: Add more trigger phrases to the description

☐ Output uses the correct format (headings, bullets, table, etc.)

If it fails: Add or update the ## Output format section

☐ Output length is appropriate (not too short or too long)

If it fails: Set a word or line limit in ## Output format

☐ Claude stays within the skill's scope (doesn't go off-topic)

If it fails: Add a ## Constraints section

☐ File access works when expected (Read tool available)

If it fails: Add Read to allowed-tools in front matter

☐ Output is consistent across multiple runs

If it fails: Make steps and output format more explicit
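The format and length rows of the checklist can be checked mechanically once you have saved a run's output as text. The following is a minimal sketch under a few assumptions: the section labels are copied from the sample /code-review output earlier in this lesson, check_review_output is a hypothetical helper name, and MAX_LINES is a placeholder for whatever limit your own ## Output format section sets.

```python
import re

# Section labels taken from the sample /code-review output in this lesson.
REQUIRED_SECTIONS = ("Critical", "Warning", "Style")
MAX_LINES = 40  # hypothetical limit; use the one from your ## Output format section


def check_review_output(output: str) -> list[str]:
    """Return the checklist items the output fails (empty list means all pass)."""
    failures = []
    lines = [line.strip() for line in output.strip().splitlines() if line.strip()]

    if not any(line.startswith("## Code Review") for line in lines):
        failures.append("missing '## Code Review' heading")

    for section in REQUIRED_SECTIONS:
        # Expect a line ending in the label plus a count, e.g. 'Critical (1)'.
        pattern = re.compile(rf"\b{section} \(\d+\)$")
        if not any(pattern.search(line) for line in lines):
            failures.append(f"missing '{section} (n)' section")

    if len(lines) > MAX_LINES:
        failures.append(
            f"output has {len(lines)} non-empty lines, limit is {MAX_LINES}"
        )

    return failures
```

Paste a run's output into a string and pass it to check_review_output; an empty list means the format and length rows pass. Scope and content quality still need a human read.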

Common failures and how to fix them

⚑ Skill never auto-detects

Why: Trigger phrases are too vague or not listed in the description.

Fix: Add specific trigger phrases to the description, for example: Triggered by: "standup", "daily update", "/standup".

⚑ Output format changes every run

Why: No ## Output format section, or it is too vague.

Fix: Specify exact structure: headings, bullet depth, max word count.

⚑ Claude rewrites the whole file instead of annotating

Why: No constraint preventing full rewrites.

Fix: Add to ## Constraints: "Never rewrite the entire file — annotate only."

⚑ Skill runs but ignores allowed-tools

Why: allowed-tools is missing or misspelled in front matter.

Fix: Check YAML: allowed-tools: [Read]  (capital R, list format).

⚑ Slash command not in autocomplete

Why: Skill is in the wrong directory or has a YAML parse error.

Fix: Verify path: ~/.claude/skills/<name>/SKILL.md and validate YAML front matter.
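For the "output format changes every run" failure, it helps to compare the structure of several runs rather than their exact wording. The sketch below is one possible approach, not a standard tool: skeleton and same_structure are hypothetical helpers that reduce each output to its heading levels and its section lines ending in a count, then check that every run has the same shape. A finding that happens to end in a parenthesised number would be miscounted as a section line.

```python
import re


def skeleton(output: str) -> list[str]:
    """Reduce an output to its structure: heading levels and counted sections."""
    shape = []
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # Keep the heading level only; the heading text may name another file.
            shape.append("#" * (len(stripped) - len(stripped.lstrip("#"))))
        elif re.search(r"\(\d+\)$", stripped):
            # Section line such as 'Warning (2)': keep the label, mask the count.
            shape.append(re.sub(r"\(\d+\)$", "(n)", stripped))
    return shape


def same_structure(outputs: list[str]) -> bool:
    """True if every output has the same skeleton as the first one."""
    shapes = [skeleton(text) for text in outputs]
    return all(shape == shapes[0] for shape in shapes)
```

Run the skill three or four times on the same input, collect the outputs in a list, and pass it to same_structure. A False result suggests the ## Output format section needs to be more explicit.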

Keep a simple test log

Every time you edit a skill, re-run your previous test cases. A plain Markdown file stored alongside the skill is enough — no framework needed. Include the input, expected output, and result (✅ PASS / ⚠️ FAIL).

~/.claude/skills/code-review/TEST_LOG.md

# Skill Test Log — /code-review

## Test 1 — basic Python file
Input:  sample.py (20 lines, one SQL injection)
Expect: at least one Critical finding mentioning SQL injection
Result: ✅ PASS — flagged on line 14, provided fix snippet

## Test 2 — clean file
Input:  utils.py (no issues)
Expect: "No critical issues found" or similar clean message
Result: ✅ PASS — responded "Code looks clean. 0 critical, 0 warnings."

## Test 3 — trigger phrase auto-detection
Input:  "can you check my auth.py for problems?"
Expect: skill auto-activates and reviews auth.py
Result: ✅ PASS — auto-detected and ran /code-review

## Test 4 — large file (500 lines)
Input:  api_server.py
Expect: structured findings, does not summarise or rewrite the whole file
Result: ⚠️  FAIL — rewrote entire file instead of annotating
Fix:    Added "Do NOT rewrite the file" to ## Constraints

💡 Tip: Test 4 above shows the value of a log: the failure revealed a missing constraint. Without the log, that failure would only surface in production when Claude unexpectedly rewrites a real file.
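Once the log grows, a short script can list the tests that still need attention. This is a minimal sketch that assumes the log keeps the layout shown above: one "## Test ..." heading per case and a "Result:" line containing PASS or FAIL. The function names summarize_log and failing_tests are hypothetical.

```python
import re


def summarize_log(log_text: str) -> dict[str, str]:
    """Map each '## Test ...' heading to 'PASS', 'FAIL', or 'NO RESULT'."""
    results: dict[str, str] = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("## "):
            # A new test case starts; assume no result until a Result: line appears.
            current = line[3:].strip()
            results[current] = "NO RESULT"
        elif current and line.startswith("Result:"):
            match = re.search(r"\b(PASS|FAIL)\b", line)
            if match:
                results[current] = match.group(1)
    return results


def failing_tests(log_text: str) -> list[str]:
    """Names of tests that failed or were never given a result."""
    return [
        name
        for name, status in summarize_log(log_text).items()
        if status != "PASS"
    ]
```

Read TEST_LOG.md into a string and pass it to failing_tests after each edit to the skill. Any name it returns is a case to re-run before you rely on the skill again.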

What's Next

Testing ensures your skills behave correctly. Next: learn how to pass arguments so a single skill can handle many different inputs.
