Claude Skills, Lesson 11 of 15

Testing Your Skills

A skill that works once is not yet reliable: it needs to produce the same quality of output across different inputs. Test slash command invocation, auto-detection, and output consistency, then keep a test log so you notice regressions when you edit a skill.

Three steps to test any skill

Run through these three steps every time you create or edit a skill.

| Step | Focus | What to verify |
| --- | --- | --- |
| 1 | Test invocation | Slash command activates; auto-detection triggers on natural phrases |
| 2 | Test output quality | Format, length, and content match your SKILL.md spec |
| 3 | Keep a test log | Record inputs, expected outputs, and results for regression checks |

Method 1 — “Try in chat” from the skills page

The fastest way to test a skill after creating or editing it is the Try in chat button in the claude.ai skills manager. It opens a fresh conversation with the skill pre-loaded so you can type a test prompt immediately.

Skills page (claude.ai/customize/skills), Your skills:

  • /code-review: Review code for bugs, security issues, and style violations.
  • /commit: Write a Conventional Commit message from staged changes.
  • /standup: Generate a daily standup update.

Button: Try in chat

After clicking Try in chat, type what a real user would type: not /standup, but something like “I need my standup for today”. This tests auto-detection at the same time.

Method 2 — test in Claude Code

For skills in ~/.claude/skills/, start Claude Code in your project and run the slash command directly. Test three things in sequence.

| Test | Prompt | Expected |
| --- | --- | --- |
| A | /code-review | Skill activates immediately |
| B | can you check src/auth.py for me? | Skill activates without the slash |
| C | /code-review src/auth.py | Output matches your format spec |

Claude Code session:
    # Open Claude Code in your project root
    $ claude
    > /code-review src/auth.py
    Running /code-review on src/auth.py…
    ## Code Review — src/auth.py
    [Critical] (1)
    └─ Line 42 — SQL query built with f-string. Use parameterised queries.
    [Warning] (2)
    └─ Line 18 — password compared with == instead of secrets.compare_digest
    └─ Line 67 — JWT secret read from env but no fallback check
    [Style] (0)
    1 critical, 2 warnings, 0 style notes.

If the slash command does not appear in autocomplete, the skill file is in the wrong folder or has a YAML front matter error. Run cat ~/.claude/skills/code-review/SKILL.md to verify.
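
Both causes named above (wrong folder, broken front matter) can also be checked with a short script. The following is a minimal sketch, not an official tool: it assumes the ~/.claude/skills/<name>/SKILL.md layout used in this lesson and a front matter block delimited by two --- lines. The function name check_skill_file is hypothetical, treating name and description as required fields is an assumption, and the line-based parsing is a deliberate simplification rather than a real YAML parser.

```python
from pathlib import Path


def check_skill_file(skill_dir: Path) -> list[str]:
    """Return a list of problems found in <skill_dir>/SKILL.md (empty if none).

    Assumption: front matter is a block of `key: value` lines between two
    `---` lines at the top of the file. This is a simplified check, not a
    full YAML parse.
    """
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return [f"missing file: {skill_file}"]

    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ["front matter must start with '---' on the first line"]

    try:
        # Index of the closing '---' line, searching from the second line on.
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        return ["front matter is never closed with a second '---'"]

    # Collect top-level keys only; indented, comment, and list lines are skipped.
    keys = set()
    for line in lines[1:end]:
        if ":" in line and not line.startswith((" ", "\t", "#", "-")):
            keys.add(line.split(":", 1)[0].strip())

    problems = []
    for required in ("name", "description"):  # assumed to be required
        if required not in keys:
            problems.append(f"front matter has no '{required}' field")
    return problems


if __name__ == "__main__":
    skill = Path.home() / ".claude" / "skills" / "code-review"
    for problem in check_skill_file(skill) or ["no problems found"]:
        print(problem)
```

An empty result only means the file is where this lesson says it should be and the front matter looks structurally plausible; a YAML error inside a value would still need a proper YAML validator.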

What to check on every test run

Use this checklist for each test. If any row fails, the fix usually lives in the corresponding SKILL.md section.

| Check | If it fails |
| --- | --- |
| Slash command activates the skill | Check the name field in front matter |
| Auto-detection phrase triggers the skill | Add more trigger phrases to the description |
| Output uses the correct format | Add or update ## Output format |
| Output length is appropriate | Set a word or line limit in ## Output format |
| Claude stays within the skill's scope | Add a ## Constraints section |
| File access works when expected | Add Read to allowed-tools in front matter |
| Output is consistent across multiple runs | Make steps and output format more explicit |
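
The format and consistency rows of the checklist are easier to judge when the check is written down once and applied to every run. The sketch below is one possible way to do that for the /code-review sample output shown earlier in this lesson. It assumes the section labels [Critical], [Warning], and [Style] and the └─ finding marker from that sample; the function name check_review_output is hypothetical, and a real skill would need rules that follow its own ## Output format section.

```python
import re

# Matches section headers such as "[Critical] (1)" from the sample output.
SECTION_RE = re.compile(r"^\[(Critical|Warning|Style)\] \((\d+)\)$")


def check_review_output(text: str) -> list[str]:
    """Return format problems in one /code-review output (empty if none)."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    if not any(line.startswith("## Code Review") for line in lines):
        problems.append("missing '## Code Review' heading")

    declared = {}  # section name -> count stated in the section header
    found = {}     # section name -> number of finding lines actually listed
    current = None
    for line in lines:
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            declared[current] = int(match.group(2))
            found[current] = 0
        elif line.startswith("└─") and current is not None:
            found[current] += 1

    for name in ("Critical", "Warning", "Style"):
        if name not in declared:
            problems.append(f"missing [{name}] section")
        elif declared[name] != found[name]:
            problems.append(
                f"[{name}] declares {declared[name]} findings but lists {found[name]}"
            )
    return problems


if __name__ == "__main__":
    # A deliberately broken output: wrong count and no [Style] section.
    sample = "## Code Review\n[Critical] (2)\n└─ Line 42: SQL built with f-string\n[Warning] (0)\n"
    print(check_review_output(sample))
    # → ['[Critical] declares 2 findings but lists 1', 'missing [Style] section']
```

Saving the output of several runs on the same input and passing each one through the same function gives a rough consistency check: if some runs return problems and others do not, the steps or the output format in SKILL.md are probably not explicit enough.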

Common failures and how to fix them

| Symptom | Cause | Fix |
| --- | --- | --- |
| Skill never auto-detects | Trigger phrases are too vague or not listed in the description | Add: Triggered by: "standup", "daily update", "/standup" |
| Output format changes every run | No ## Output format section, or it is too vague | Specify exact structure: headings, bullet depth, max word count |
| Claude rewrites the whole file instead of annotating | No constraint preventing full rewrites | Add to ## Constraints: "Never rewrite the entire file — annotate only." |
| Skill runs but ignores allowed-tools | allowed-tools is missing or misspelled in front matter | Check YAML: allowed-tools: [Read] (capital R, list format) |
| Slash command not in autocomplete | Skill is in the wrong directory or has a YAML parse error | Verify path: ~/.claude/skills/<name>/SKILL.md and validate YAML |

Keep a simple test log

Every time you edit a skill, re-run your previous test cases. A plain Markdown file stored alongside the skill is enough — no framework needed. Include the input, expected output, and result (PASS or FAIL).

~/.claude/skills/code-review/TEST_LOG.md
    # Skill Test Log — /code-review

    ## Test 1 — basic Python file
    Input:  sample.py (20 lines, one SQL injection)
    Expect: at least one Critical finding mentioning SQL injection
    Result: PASS — flagged on line 14, provided fix snippet

    ## Test 2 — clean file
    Input:  utils.py (no issues)
    Expect: "No critical issues found" or similar clean message
    Result: PASS — responded "Code looks clean. 0 critical, 0 warnings."

    ## Test 3 — trigger phrase auto-detection
    Input:  "can you check my auth.py for problems?"
    Expect: skill auto-activates and reviews auth.py
    Result: PASS — auto-detected and ran /code-review

    ## Test 4 — large file (500 lines)
    Input:  api_server.py
    Expect: structured findings; does not summarise or rewrite the whole file
    Result: FAIL — rewrote entire file instead of annotating
    Fix:    Added "Do NOT rewrite the file" to ## Constraints

Test 4 above shows the value of a log: the failure revealed a missing constraint. Without the log, that failure would only surface when Claude unexpectedly rewrites a real file.
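
Because the log follows a fixed pattern (a "## Test N" heading followed by a "Result:" line), a few lines of Python can list the entries that still need attention after each edit. This is a minimal sketch that assumes exactly the layout of the TEST_LOG.md example above; the function names summarize_log and failing_tests are hypothetical, and entries with no Result line are reported as MISSING so that half-written tests are not mistaken for passes.

```python
import re
from pathlib import Path

HEADING_RE = re.compile(r"^## (Test \d+)\b")          # e.g. "## Test 4 ..."
RESULT_RE = re.compile(r"^Result:\s*(PASS|FAIL)\b")   # e.g. "Result: FAIL ..."


def summarize_log(text: str) -> dict[str, str]:
    """Map each test label (e.g. 'Test 4') to 'PASS', 'FAIL', or 'MISSING'."""
    results = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        heading = HEADING_RE.match(line)
        if heading:
            current = heading.group(1)
            results[current] = "MISSING"  # until a Result line is seen
            continue
        result = RESULT_RE.match(line)
        if result and current is not None:
            results[current] = result.group(1)
    return results


def failing_tests(text: str) -> list[str]:
    """Return labels of tests that did not record a PASS."""
    return [label for label, status in summarize_log(text).items() if status != "PASS"]


if __name__ == "__main__":
    log_path = Path.home() / ".claude" / "skills" / "code-review" / "TEST_LOG.md"
    if log_path.is_file():
        failed = failing_tests(log_path.read_text(encoding="utf-8"))
        print("failing:", ", ".join(failed) if failed else "none")
    else:
        print(f"no log found at {log_path}")
```

The script only reads what was recorded by hand; it does not run the skill. Its purpose is to make a FAIL or an unfinished entry hard to overlook when the log grows past a handful of tests.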

Before you continue

  • Test invocation (slash command and auto-detection) before output quality.
  • Use Try in chat for quick checks; use Claude Code for project-level skills.
  • Failed checks usually map to a specific SKILL.md section — description, steps, or constraints.
  • Keep a TEST_LOG.md so regressions are visible after every edit.
  • Next lesson: Pass Arguments to Skills.

What's Next

Testing ensures your skills behave correctly. Next: pass arguments so a single skill can handle many different inputs.