Skill Creator
This skill helps you design, author, validate, and package high-quality skills.
Version: 1.5.0
Last updated: 2026-02-12
Working agreement (skills + shell + compaction)
- Follow the repo's AGENTS.md (treat it as a map, not a megadoc).
- For long-running work: also follow ~/.codex/instructions/shell-skills-compaction.md (or a repo-local copy).
- Use an artifact boundary:
  - Local Codex CLI: write deliverables to ./artifacts/
  - Hosted shell: write deliverables to /mnt/data/
- After major milestones, write a short status note to an artifact (e.g., ./artifacts/STATUS.md) so the thread can be compacted safely.
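As a minimal sketch of the artifact boundary and status-note habit described above: the function names `resolve_artifact_dir` and `write_status_note`, and the layout of the note itself, are assumptions made for illustration and are not part of this skill's scripts.

```python
from datetime import datetime, timezone
from pathlib import Path


def resolve_artifact_dir(hosted: bool,
                         local_root: Path = Path("./artifacts"),
                         hosted_root: Path = Path("/mnt/data")) -> Path:
    """Pick the artifact boundary: hosted shell vs. local Codex CLI."""
    return hosted_root if hosted else local_root


def write_status_note(artifact_dir: Path, milestone: str, next_steps: list[str]) -> Path:
    """Write a short STATUS.md so the thread can be compacted safely."""
    artifact_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    # Hypothetical note layout: a timestamped title, the milestone, then next steps.
    lines = [f"# Status ({stamp})", "", f"Milestone: {milestone}", "", "Next steps:"]
    lines += [f"- {step}" for step in next_steps]
    status_path = artifact_dir / "STATUS.md"
    status_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return status_path
```

Calling `write_status_note(resolve_artifact_dir(hosted=False), "scaffold done", ["run validators"])` would leave a small note in ./artifacts/STATUS.md that a later turn can read after compaction.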
Scope and triggers
Use this skill to:
- Create a new skill (instruction-only, script-backed, or router-style).
- Revise an existing skill for better triggering, portability, or reliability.
- Audit/upgrade a skill to meet “gold standard” structure, progressive disclosure, and validation.
- Package a skill into a distributable .skill archive.
Required inputs
- Desired skill goal (what the user wants to accomplish).
- 3–10 example user prompts:
- 2–5 happy-path prompts
- 1–3 edge-cases
- 1–3 “should NOT trigger” prompts (negative examples)
- Target environment(s): codex, claude, or portable subset.
- Any required assets, schemas, APIs, CLIs, or “house style” constraints.
- Compatibility posture (default: canonical-only for unreleased/greenfield projects; add backwards compatibility only when explicitly required).
If any of the above are missing, ask only the minimum questions required to proceed safely.
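The prompt counts above can be checked mechanically before any authoring starts. The following is a rough sketch under assumptions: the `PromptSet` class and `check_prompt_set` function are hypothetical names, and only the numeric ranges come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSet:
    """Hypothetical container for the example prompts a requester supplies."""
    happy: list[str] = field(default_factory=list)
    edge: list[str] = field(default_factory=list)
    negative: list[str] = field(default_factory=list)


# (min, max) counts taken from the Required inputs list.
LIMITS = {"happy": (2, 5), "edge": (1, 3), "negative": (1, 3)}
TOTAL_LIMITS = (3, 10)


def check_prompt_set(prompts: PromptSet) -> list[str]:
    """Return a list of problems; an empty list means the set is usable."""
    problems = []
    for kind, (low, high) in LIMITS.items():
        count = len(getattr(prompts, kind))
        if not low <= count <= high:
            problems.append(f"{kind}: expected {low}-{high} prompts, got {count}")
    total = len(prompts.happy) + len(prompts.edge) + len(prompts.negative)
    if not TOTAL_LIMITS[0] <= total <= TOTAL_LIMITS[1]:
        problems.append(
            f"total: expected {TOTAL_LIMITS[0]}-{TOTAL_LIMITS[1]} prompts, got {total}"
        )
    return problems
```

Each returned problem maps naturally to one targeted question, which keeps the follow-up to the minimum needed.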
Deliverables
Depending on the request, produce one or more of:
- A skill folder containing:
  - SKILL.md (required)
  - agents/openai.yaml (recommended for OpenAI/Codex UI + MCP dependencies)
  - scripts/ (optional)
  - references/ (optional but recommended for non-trivial skills)
  - assets/ (optional)
  - workflows/ (optional for router-style skills)
- references/contract.yaml (output contract) and references/evals.yaml (eval cases) when the skill is non-trivial.
- references/plan.md (plan artifact) for non-trivial skill builds; store $create-plan output here when available.
- A validation report (what passed/failed and what to fix).
- An operational-readiness + security-risk report (OpenClaw-style summary: critical/warn/info).
- A packaged .skill file (optional).
Response format (required)
Always start responses with these headings (no text before them):
Scope and triggers
- 1–3 bullets on when this skill applies (confirm scope).
Required inputs
- List required inputs and ask targeted questions if needed.
Deliverables
- List deliverables you will produce.
Failure mode
If the request is out of scope:
- Use the headings above.
- Under Required inputs, explain what’s missing or why it’s out of scope.
- Under Deliverables, propose the closest appropriate next step or skill.
Operating principles
Humans steer. Agents execute.
Your goal is leverage: translate vague intent into a workflow the agent can execute repeatedly. When something fails, the fix is almost never “try harder”—it’s usually missing scaffolding, missing constraints, or missing feedback loops.
Keep SKILL.md short and treat it as a map
Context is scarce. Treat SKILL.md as the high-signal “table of contents,” and push depth into:
- references/ (system of record)
- scripts/ (deterministic helpers)
- assets/ (templates/boilerplate)
This matches the “AGENTS.md as table of contents” approach: point to structured sources of truth instead of growing a single blob.
Descriptions are routing logic
The description is effectively the model’s decision boundary. It should be concrete about:
- Use-when vs don’t-use-when
- Outputs/artifacts
- Success criteria
Default compatibility posture
- For unreleased/greenfield projects, default to canonical implementations and guidance.
- Do not add compatibility shims, adapter layers, migration bridges, or dual-write flows unless explicitly requested or required by an existing released contract.
Put templates/examples inside the skill
Do not cram templates into system prompts. Put them inside the skill so they load only when needed.
Design for long runs
Plan for multi-step continuity:
- Reuse the same environment/container when you want stable deps and cached intermediate files.
- Use compaction as a default long-run primitive, not an emergency fallback.
Treat skills + networking as high-risk
Default posture:
- Skills: allowed
- Shell: allowed
- Network: enabled only when required, behind strict allowlists, and never echo secrets.
Skill creation process (follow by default)
Skip steps only with a clear reason.
0) Confirm target + artifact boundary
- Confirm where the skill lives:
  - Repo: .agents/skills/<skill-name>/
  - User: ~/.agents/skills/<skill-name>/
- Confirm artifact boundary (local ./artifacts/ vs hosted /mnt/data/).
1) Lock down triggers early (with negative examples)
- Collect 3–10 prompts (happy, edge, and negative).
- Ensure the description contains:
  - trigger keywords
  - explicit “don’t use when …” near-misses
  - output artifacts and success criteria
- Encode compatibility stance in the trigger boundary: default to canonical-only for unreleased work; require explicit language to trigger compatibility-preserving outputs.
- For non-trivial skills, write references/evals.yaml early (RED → GREEN → REFACTOR).
2) Choose the skill structure
- Single-file: one intent, one workflow, < ~200 lines.
- Router style: multiple intents/workflows, heavy domain knowledge, or multiple output contracts.
Router layout:
skill-name/
  SKILL.md
  workflows/
  references/
  scripts/
  assets/
  agents/openai.yaml
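For illustration only, the router layout can be created with a few lines of pathlib. This sketch is not how scripts/init_skill.py works (its internals are not shown here); `scaffold_router_skill` is a hypothetical helper, and the empty placeholder files are an assumption.

```python
from pathlib import Path

# Directory names taken from the router layout above.
ROUTER_DIRS = ("workflows", "references", "scripts", "assets", "agents")
ROUTER_FILES = ("SKILL.md", "agents/openai.yaml")


def scaffold_router_skill(root: Path, skill_name: str) -> Path:
    """Create an empty router-style skill folder and return its path."""
    skill_dir = root / skill_name
    for sub in ROUTER_DIRS:
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    for rel in ROUTER_FILES:
        target = skill_dir / rel
        if not target.exists():
            # Empty placeholder; real content is authored in later steps.
            target.write_text("", encoding="utf-8")
    return skill_dir
```

The helper is idempotent: running it twice leaves existing files untouched, which matters when a skill is being revised instead of created.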
3) Scaffold the folder
Use the initializer:
python scripts/init_skill.py <skill-name> --target codex --run-type instruction --path <output-dir>
Then delete any unused folders and example files.
4) Author SKILL.md
Frontmatter:
- name: kebab-case, matches folder name.
- description: single line; WHAT + WHEN + outputs + success criteria; include negative triggers.
- Prefer minimal frontmatter (default: only name + description).
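The frontmatter rules above lend themselves to a quick check. The sketch below assumes the frontmatter has already been parsed into a dict by some YAML loader; `check_frontmatter` is a hypothetical name and is not the logic of scripts/quick_validate.py.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def check_frontmatter(frontmatter: dict, folder_name: str) -> list[str]:
    """Return rule violations for already-parsed SKILL.md frontmatter."""
    problems = []
    name = frontmatter.get("name", "")
    description = frontmatter.get("description", "")
    if not isinstance(name, str) or not KEBAB_CASE.match(name):
        problems.append("name must be kebab-case")
    elif name != folder_name:
        problems.append(f"name {name!r} does not match folder {folder_name!r}")
    if not isinstance(description, str) or not description.strip():
        problems.append("description is required")
    elif "\n" in description.strip():
        problems.append("description must be a single line")
    # Extra keys are a preference violation, not a hard error.
    extra = sorted(set(frontmatter) - {"name", "description"})
    if extra:
        problems.append(f"prefer minimal frontmatter; extra keys: {', '.join(extra)}")
    return problems
```

Whether the description actually contains negative triggers and success criteria still needs a human or eval-based review; this check only covers the mechanical rules.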
Body:
- Include a short Principles section before the workflow.
- Keep the workflow minimal and reliable.
- Link to references/ instead of pasting long docs (progressive disclosure).
- Store templates/examples in the skill bundle, not in prompts.
5) Add resources (as needed)
- references/: schemas, style guides, evals, contracts, deep docs.
- scripts/: deterministic helpers (token-efficient + repeatable).
- assets/: templates, boilerplate, fixtures.
Prefer relative paths so the skill works anywhere.
6) Validate (fail fast)
Stop at the first failed gate and fix it before proceeding.
~/.venvs/pyyaml/bin/python scripts/quick_validate.py <path/to/skill-folder>
~/.venvs/pyyaml/bin/python scripts/skill_gate.py <path/to/skill-folder>
Optional (if available):
- scripts/analyze_skill.py for a quality score
- scripts/run_skill_evals.py for eval execution (--dual-run --capture-jsonl for cross-runner scorecards)
7) Package (optional)
python scripts/package_skill.py <path/to/skill-folder> dist/
Script-backed security rules (required)
When a skill includes executable code (scripts/ or containers):
- Offline by default. If network is required, gate behind --allow-network and document allowed domains.
- Never echo secrets (no os.environ, no token values).
- Destructive actions require explicit confirmation:
  - Prefer --dry-run by default
  - Require --confirm/--force to execute
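One way these flags might be wired up with argparse is sketched below. The flag names come from the rules above; the helper script itself, `build_parser`, and `plan_action` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical script-backed helper with safe defaults."""
    parser = argparse.ArgumentParser(description="Hypothetical cleanup helper")
    parser.add_argument("--allow-network", action="store_true",
                        help="opt in to network access (offline by default)")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--dry-run", action="store_true",
                      help="preview only (this is also the default)")
    mode.add_argument("--confirm", "--force", dest="confirm", action="store_true",
                      help="actually perform destructive actions")
    return parser


def plan_action(argv: list[str]) -> str:
    """Summarize what the helper would do for the given arguments."""
    args = build_parser().parse_args(argv)
    # Anything short of an explicit --confirm/--force stays a dry run.
    mode = "execute" if args.confirm else "dry-run"
    network = "network-allowed" if args.allow_network else "offline"
    return f"{mode}/{network}"
```

With no arguments, `plan_action([])` reports a dry run in offline mode, so a forgotten flag can never trigger a destructive or networked action.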
What to avoid
- Bloating AGENTS.md or SKILL.md with encyclopedic content; keep them as maps to deeper sources of truth.
- Writing marketing-style descriptions; treat them as routing logic.
- Putting templates/examples in system prompts; put them inside the skill.
- Assuming network access; keep allowlists tight and explicit.
- Printing logs that could contain secrets.
- Adding backward-compatibility work by default when the project is unreleased/greenfield.
Constraints
- Redact secrets/credentials/PII by default. Never print raw tokens or environment values.
- Keep frontmatter valid and explicit (name + description as single-line scalars; use agents/openai.yaml for UI/dependency metadata).
- Do not invent external facts; if uncertain, add a verification step.
- For script-backed skills, default to offline behavior and require explicit confirmation for destructive actions.
- Default to canonical implementations for unreleased/greenfield projects; only include backwards-compatibility requirements when explicitly requested.
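A redaction pass before logging is one way to honor the first constraint. The sketch below is deliberately narrow: the two patterns (secret-looking `KEY=value` pairs and bearer tokens) are assumptions chosen for illustration, and real skills would likely need a broader set.

```python
import re

# Assumed pattern: assignments whose key looks secret, e.g. GITHUB_TOKEN=...
SECRET_ASSIGNMENT = re.compile(
    r"\b([A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD|API_KEY)[A-Z0-9_]*)\s*=\s*\S+",
    re.IGNORECASE,
)
# Assumed pattern: HTTP bearer credentials.
BEARER_TOKEN = re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)


def redact(text: str) -> str:
    """Replace secret-looking values with a fixed marker before printing."""
    text = SECRET_ASSIGNMENT.sub(r"\1=[REDACTED]", text)
    return BEARER_TOKEN.sub("Bearer [REDACTED]", text)
```

Routing every log line through a helper like this keeps the redaction rule in one place instead of relying on each call site to remember it.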
Validation
Fail fast: stop at the first failed gate, fix it, and rerun.
~/.venvs/pyyaml/bin/python scripts/quick_validate.py <path/to/skill-folder>
~/.venvs/pyyaml/bin/python scripts/skill_gate.py <path/to/skill-folder>
~/.venvs/pyyaml/bin/python scripts/openclaw_skill_guard.py <path/to/skill-folder> --mode both
Optional deep checks:
~/.venvs/pyyaml/bin/python scripts/analyze_skill.py <path/to/skill-folder>
~/.venvs/pyyaml/bin/python scripts/run_skill_evals.py <path/to/skill-folder> --dual-run --capture-jsonl
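The fail-fast rule can be sketched as a small runner that stops at the first non-zero exit code. The script paths and the `--mode both` flag are taken from the commands above; `run_gates` and `default_gates` are hypothetical helpers, and the interpreter path is left as a parameter because it depends on the local environment.

```python
import subprocess
import sys


def default_gates(skill_dir: str, python: str = sys.executable) -> list[list[str]]:
    """Build the three required gate commands for a skill folder."""
    return [
        [python, "scripts/quick_validate.py", skill_dir],
        [python, "scripts/skill_gate.py", skill_dir],
        [python, "scripts/openclaw_skill_guard.py", skill_dir, "--mode", "both"],
    ]


def run_gates(gates: list[list[str]]) -> tuple[bool, list[str]]:
    """Run gates in order; return (all_passed, commands that passed)."""
    passed = []
    for command in gates:
        result = subprocess.run(command, check=False)
        if result.returncode != 0:
            # Fail fast: later gates are not run until this one is fixed.
            return False, passed
        passed.append(" ".join(command))
    return True, passed
```

After fixing the failing gate, rerunning `run_gates(default_gates("<path/to/skill-folder>"))` starts again from the first gate, which matches the "fix it and rerun" instruction.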
Examples
- “Create a new skill called foo-bar under utilities/ with eval cases and an output contract.”
- “Audit this skill for trigger quality and tighten the description so it routes correctly.”
- “Fix validation failures (quick_validate.py / skill_gate.py) with the smallest safe patch and rerun gates.”
Reference map (skill-creator internal)
Use these files when needed:
- references/about-skills.md: background on skills, intent, and structure.
- references/portable-skills.md: strict subset for cross-platform portability.
- references/skill-structure.md: router vs single-file patterns.
- references/progressive-disclosure-patterns.md: how to split SKILL.md into references/scripts.
- references/quality-tools.md: how to run validators/evals and interpret output.
- references/iteration-and-testing.md: eval-driven iteration patterns.
- references/evals-v2-migration.md: eval schema v2 fields, migration rules, and tiered gating.
- references/tiered-gating-policy.md: week-by-week rollout policy and promotion rules for tier 2.
- references/security-hardening.md: offline defaults, redaction, destructive action confirmations.
- references/examples.md: calibrated examples for phrasing and structure.
- references/anti-patterns.md: common failure modes + remediation patterns.
Philosophy and tradeoffs
- Use this skill when consistent decision-making matters more than one-off execution because project context should drive the approach.
- Principle and mindset: prioritize tradeoffs and constraints over rigid checklists; understand why each step exists.
- Ask these questions to keep outcomes robust:
  - Why is this the right default, and what could change this outcome?
  - How do we adapt if constraints shift?
  - What evidence is needed before choosing one path over another?
Anti-patterns and caveats
- Avoid applying this playbook generically without checking repository-specific context.
- NEVER skip required validation gates when behavior changes.
- DO NOT use this skill as a rigid replacement for engineering judgment.
- DON'T ignore warnings or assume one pattern fits all repos.
- DON'T introduce legacy-preservation code paths unless the user explicitly asks for compatibility.
- Common pitfall: treating anti-patterns as optional.
- Incorrect assumptions here can lead to fragile guidance.
- Warning: wrong sequencing can create avoidable regressions.
Variation and adaptation
- Vary the workflow by team size, risk, and deployment target.
- Use different strategies for small, medium, and large changes.
- Adapt recommendations to the specific environment and avoid repetitive templates.
- Avoid generic or cookie-cutter responses; craft context-specific alternatives.
- Keep outputs diverse and not repetitive.
- Converge on a custom path only after evidence review.
- Different constraints should produce different, non-generic recommendations.
Empowering execution style
- Be capable of exploring multiple options and enabling the team to make safe decisions.
- Unlock confidence by explaining options and tradeoffs clearly.
- Feel free to be creative while staying rigorous and precise.
- Push boundaries with practical alternatives when simple recipes fail.
- Enable outcomes-oriented problem solving.
