2026 Hermes Agent Skills Advanced Guide: SKILL.md, Bundles, GEPA Evolution & Tap Publishing

Why the Skills system deserves its own deep dive — Skills are not Prompts or Memory

Unlike one-off prompts, Hermes Skills follow the agentskills.io open standard and port across Hermes, Claude Code, and Cursor. The mental model is simple: Prompt = sticky note (valid for one turn), Memory = notebook (permanent notes, auto-injected every session), Skill = SOP manual (step-by-step procedure, loaded only when needed).

| Dimension | Plain Prompt | Memory | Skills |
| --- | --- | --- | --- |
| Persistence | Current conversation | Cross-session, permanent | Cross-session, permanent |
| Load timing | Always in context | Auto-injected every session | On demand (key difference) |
| Token cost | Consumed every turn | Small and stable | Zero before activation |
| Content type | Any intent description | User preferences / facts | Procedural steps (how to do) |
| Shareability | Hard to reuse | Private | Publishable as community Tap |

01. Pain point: Treating a Skill like a full prompt dumps the entire file into context, so token cost scales linearly with skill count.
02. Pain point: Related skills require triggering each one with /skill-name, breaking complex workflows.
03. Pain point: Free DuckDuckGo and paid web_search both appear in the prompt, wasting tokens on redundant tool descriptions.
04. Pain point: Every team member rebuilds the same skills with no one-click subscribe-and-share path.
05. Pain point: Skills never evolve, so the same mistakes repeat across sessions.
06. What this delivers: A complete advanced roadmap from SKILL.md standards through GEPA self-evolution, covering every core mechanism.

SKILL.md format deep dive and Progressive Disclosure three-level loading

All Hermes Skills follow the agentskills.io standard. Recommended layout: ~/.hermes/skills/my-category/my-skill/ with SKILL.md (core steps, aim for ≤500 lines), references/ (API docs loaded on demand), templates/ (reusable templates), and scripts/ (scripts the agent can execute directly).

yaml — SKILL.md frontmatter

---
name: my-skill                    # required: lowercase + hyphens, max 64 chars
description: |                    # required: max 1024 chars, start with "Use when..."
  Use when the user needs to [...].
version: 1.0.0
metadata:
  hermes:
    tags: [devops, automation]
    requires_toolsets: [terminal]
    fallback_for_toolsets: [web]
---

# My Skill Title
## Overview / When to Use / Procedure / Common Pitfalls / Verification Checklist
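
The frontmatter limits above are easy to check before a skill is ever installed. Below is a minimal sketch in Python, not the official validator: `validate_frontmatter` is a hypothetical helper, and allowing digits in `name` is an assumption (the text above only says "lowercase + hyphens").

```python
import re

# Assumption: lowercase letters or digits, joined by single hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means both fields pass."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be lowercase words separated by hyphens")
    if len(name) > 64:
        problems.append("name exceeds 64 characters")

    text = description.strip()
    if not text:
        problems.append("description is required")
    if len(text) > 1024:
        problems.append("description exceeds 1024 characters")
    if text and not text.startswith("Use when"):
        problems.append('description should start with "Use when"')
    return problems


print(validate_frontmatter("my-skill", "Use when the user needs a changelog."))
print(validate_frontmatter("My_Skill", "A skill for changelogs."))
```

The first call returns an empty list; the second reports both the malformed name and the description that describes what the skill is rather than when to use it.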

| Load level | Content | Trigger | Token cost |
| --- | --- | --- | --- |
| Level 0 | `name` + `description` | Session start, all skills | ~3K tokens total across all skills |
| Level 1 | Full SKILL.md body | User runs `/skill-name` or LLM decides it is needed | Depends on file length |
| Level 2 | `references/`, `scripts/` | LLM decides during execution | On demand, per file |
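
To make the three levels concrete, here is a small Python sketch of the idea. It is an illustration under stated assumptions, not Hermes's loader: the `Skill` class and the three functions are hypothetical names, and real skills live on disk rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str
    body: str  # full SKILL.md body (Level 1)
    references: dict[str, str] = field(default_factory=dict)  # Level 2 files


def level0_index(skills: list[Skill]) -> str:
    """Level 0: one line per skill, built from name + description only."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def load_level1(skills: list[Skill], name: str) -> str:
    """Level 1: return the full body of a single activated skill."""
    for s in skills:
        if s.name == name:
            return s.body
    raise KeyError(name)


def load_level2(skill: Skill, filename: str) -> str:
    """Level 2: return one reference file, only when it is asked for."""
    return skill.references[filename]


skills = [
    Skill("db-migrate", "Use when a schema change must be rolled out.",
          body="step 1 ...\n" * 400, references={"api.md": "long API notes"}),
]
index = level0_index(skills)
print(len(index), len(load_level1(skills, "db-migrate")))
```

The index size depends only on the descriptions, so a 400-line body adds nothing to the context until the skill is activated.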

Writing tip: description is the only Level 0 signal — the LLM uses it to decide whether to load the full skill. State when to use it, not just what it is. Split skills over 1,000 lines into references/; files over 15KB exceed GEPA evolution limits and must be split.

Skill Bundles and conditional activation — one command for a full workflow

Skill Bundles (2026 addition, still underused)

A Bundle is a lightweight YAML file that packs multiple related skills into a single slash command. Running /bundle-name loads every listed skill simultaneously. File location: ~/.hermes/skill-bundles/<slug>.yaml. When a Bundle and a single Skill share the same name, the Bundle wins; missing skills are skipped silently; Bundles do not modify the system prompt, keeping token use lean.

yaml — backend-dev bundle

name: backend-dev
description: Full backend feature workflow — code review, TDD, and PR management.
skills:
  - github-code-review
  - test-driven-development
  - github-pr-workflow
instruction: |
  Always write failing tests first before implementation.
  Never push directly to main.

# Quick CLI create:
# hermes bundles create backend-dev \
#   --skills github-code-review,test-driven-development,github-pr-workflow
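
The two resolution rules stated earlier (a Bundle shadows a same-named Skill, and missing skills are skipped silently) can be sketched in a few lines of Python. This is an assumption-laden illustration, not the Hermes resolver: `resolve_slash_command` and the in-memory dictionaries are hypothetical, and the `instruction` field is left out.

```python
def resolve_slash_command(
    command: str,
    bundles: dict[str, list[str]],
    skills: dict[str, str],
) -> list[str]:
    """Return the skill bodies that `/command` would load."""
    if command in bundles:
        # A Bundle wins over a single Skill with the same name;
        # skills listed in the Bundle but not installed are skipped silently.
        return [skills[name] for name in bundles[command] if name in skills]
    if command in skills:
        return [skills[command]]
    return []


skills = {
    "github-code-review": "review body",
    "github-pr-workflow": "pr body",
    "backend-dev": "a single skill that shares the bundle's name",
}
bundles = {
    "backend-dev": [
        "github-code-review",
        "test-driven-development",  # not installed in this example
        "github-pr-workflow",
    ],
}
print(resolve_slash_command("backend-dev", bundles, skills))
```

Here `/backend-dev` loads the two installed skills from the Bundle and ignores both the missing `test-driven-development` entry and the same-named single skill.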

Conditional activation

Skills can auto-show or hide based on tool availability in the current session. Configure four rule types under metadata.hermes:

| Field | Behavior |
| --- | --- |
| `requires_toolsets` | Hide this skill when listed toolsets are missing |
| `requires_tools` | Hide this skill when listed tools are missing |
| `fallback_for_toolsets` | Hide when listed toolsets exist (fallback role) |
| `fallback_for_tools` | Hide when listed tools exist (fallback role) |

Classic scenario: after setting FIRECRAWL_KEY / BRAVE_SEARCH_KEY, paid web_search activates and the DuckDuckGo skill disappears via fallback_for_tools: [web_search], saving tokens. When the API is unavailable, the fallback resurfaces automatically. Platform-aware skills can use requires_toolsets: [messaging] plus platforms: [telegram, discord], with per-platform toggles in the hermes skills TUI.
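
The four rules reduce to a single visibility check. The Python sketch below is one plausible reading, not Hermes's implementation: `is_visible` is a hypothetical function, and it assumes that a `requires_*` rule hides the skill if any listed entry is missing, while a `fallback_for_*` rule hides it if any listed entry is present.

```python
def is_visible(hermes_meta: dict, toolsets: set[str], tools: set[str]) -> bool:
    """Decide whether a skill should be shown for the current session."""
    requires_toolsets = set(hermes_meta.get("requires_toolsets", []))
    requires_tools = set(hermes_meta.get("requires_tools", []))
    fallback_toolsets = set(hermes_meta.get("fallback_for_toolsets", []))
    fallback_tools = set(hermes_meta.get("fallback_for_tools", []))

    if not requires_toolsets <= toolsets:  # a required toolset is missing
        return False
    if not requires_tools <= tools:        # a required tool is missing
        return False
    if fallback_toolsets & toolsets:       # the preferred toolset is available
        return False
    if fallback_tools & tools:             # the preferred tool is available
        return False
    return True


duckduckgo_skill = {"fallback_for_tools": ["web_search"]}
print(is_visible(duckduckgo_skill, toolsets=set(), tools={"web_search"}))
print(is_visible(duckduckgo_skill, toolsets=set(), tools=set()))
```

With `web_search` available the DuckDuckGo fallback is hidden; once the paid tool drops out of the session, the same skill becomes visible again.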

Skills Hub ecosystem, Tap publishing, and GEPA self-evolution

Official install commands and open-source skill repos

bash

hermes skills install official/research/arxiv
hermes skills install github:openai/skills/k8s
hermes skills tap add github:my-org/my-skills
hermes skills tap update && hermes skills tap list

| Repository | Highlights | Stars |
| --- | --- | --- |
| ChuckSRQ/awesome-hermes-skills | Curated production skills: Deep Research, MLOps, Apple integration | 67 |
| amanning3390/hermeshub | Community skill registry with prompt-injection screening per skill | 166 |
| kevinnft/ai-agent-skills | 191 skills, 28 categories, cross Hermes / Claude / Cursor | 10 |
| NousResearch/hermes-agent | Official source of truth, includes all built-in Skills | - |

Publish your own Skill Tap

Create a GitHub repo as a Tap and let your team subscribe in one command: hermes skills tap add github:your-org/your-skills-tap. For private repos, add --token $GH_TOKEN. Optional skills.sh.json controls Hub category display. Version-control ~/.hermes/skills/ in Git for cross-device sync.

GEPA + DSPy: automatic skill evolution (ICLR 2026 Oral)

GEPA (Genetic-Pareto Prompt Evolution) does not fine-tune model weights. It improves SKILL.md text by analyzing execution traces, generating variants, and running multi-objective Pareto optimization. Each optimization run costs roughly $2–10 in API calls — no GPU required. The five-stage pipeline: (1) execution trace collection (SQLite) → (2) reflective failure analysis → (3) targeted mutation (10–20 variants) → (4) multi-objective Pareto evaluation (success rate × token efficiency × speed) → (5) human PR review.
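
Stage 4 is the least intuitive step, so here is a generic Pareto-front filter in Python. It illustrates the general technique only and is not GEPA's code: the `Variant` fields, the scores, and the "higher is better" convention for all three objectives are assumptions made for the example.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    label: str
    success_rate: float      # higher is better
    token_efficiency: float  # higher is better (assumed normalized)
    speed: float             # higher is better (assumed normalized)


def dominates(a: Variant, b: Variant) -> bool:
    """a dominates b if it is at least as good everywhere and better somewhere."""
    a_scores, b_scores = a[1:], b[1:]
    at_least_as_good = all(x >= y for x, y in zip(a_scores, b_scores))
    strictly_better = any(x > y for x, y in zip(a_scores, b_scores))
    return at_least_as_good and strictly_better


def pareto_front(variants: list[Variant]) -> list[Variant]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants
            if not any(dominates(other, v) for other in variants)]


candidates = [
    Variant("A", 0.9, 0.5, 0.5),
    Variant("B", 0.8, 0.4, 0.4),  # worse than A on every objective
    Variant("C", 0.7, 0.9, 0.6),  # less reliable than A, but much leaner
]
print([v.label for v in pareto_front(candidates)])  # ['A', 'C']
```

B is dropped because A beats it everywhere, while A and C both survive because each wins on a different objective. That trade-off is what a reviewer then weighs in the PR.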

bash — GEPA quick start

git clone https://github.com/NousResearch/hermes-agent-self-evolution
cd hermes-agent-self-evolution && pip install -r requirements.txt
export HERMES_AGENT_PATH=~/.hermes

# Synthetic data entry point
python -m evolution.skills.evolve_skill \
    --skill github-code-review --iterations 10 --eval-source synthetic

# Real session data (better results)
python -m evolution.skills.evolve_skill \
    --skill github-code-review --iterations 10 --eval-source sessiondb

# Combined Claude/Gemini traces (experimental)
python -m evolution.skills.evolve_skill \
    --skill github-code-review --eval-source mixed \
    --trace-dirs ~/.claude/traces,~/.hermes/sessions

Four safety guardrails gate every evolved variant before a PR is generated: (1) the full test suite must pass at 100%; (2) skills stay at or under 15KB and tool descriptions at or under 500 characters; (3) the change must not break Prompt Cache compatibility; (4) a semantic preservation check ensures the skill does not drift from its original purpose. Human PR review, the final pipeline stage, then acts as the last gate.

Official evolution roadmap: Phase 1 Skill files (done) → Phase 2 tool descriptions → Phase 3 system prompt → Phase 4 tool implementation code → Phase 5 fully automated continuous improvement.
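
The size guardrail is simple enough to run locally before spending API budget on an evolution run. The following Python sketch is a pre-flight check under assumptions, not the guardrail code from the evolution repo: `size_violations` is a hypothetical helper, and "15KB" is read as 15 × 1024 bytes of UTF-8.

```python
MAX_SKILL_BYTES = 15 * 1024          # assumption: 15KB means 15 * 1024 bytes
MAX_TOOL_DESCRIPTION_CHARS = 500


def size_violations(skill_text: str, tool_descriptions: dict[str, str]) -> list[str]:
    """Return one message per size limit that is exceeded."""
    violations = []
    size = len(skill_text.encode("utf-8"))
    if size > MAX_SKILL_BYTES:
        violations.append(f"skill is {size} bytes, limit is {MAX_SKILL_BYTES}")
    for tool, description in tool_descriptions.items():
        if len(description) > MAX_TOOL_DESCRIPTION_CHARS:
            violations.append(
                f"description for '{tool}' is {len(description)} chars, "
                f"limit is {MAX_TOOL_DESCRIPTION_CHARS}"
            )
    return violations


print(size_violations("# My Skill\n" * 2000, {"web_search": "Search the web."}))
```

A skill that fails this check is a candidate for moving detail into references/ before evolution is attempted.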

Plugin skills and skill_manage self-maintenance

Plugin skills load under the plugin:skill namespace (e.g. skill_view("superpowers:writing-plans")). They do not appear in the default list and require opt-in activation. The agent can dynamically maintain skills via skill_manage(action='patch'|'create', ...). Set skills.agent_writes_require_approval: true in config.yaml to require human approval before writes land in production.
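
Conceptually, the approval flag turns agent writes into a pending queue. The Python sketch below models that behavior for illustration only; `SkillStore`, `agent_write`, and `approve` are hypothetical names and do not reflect how Hermes implements `skill_manage` internally.

```python
from dataclasses import dataclass, field


@dataclass
class SkillStore:
    require_approval: bool
    skills: dict[str, str] = field(default_factory=dict)
    pending: list[tuple[str, str]] = field(default_factory=list)

    def agent_write(self, name: str, content: str) -> str:
        """Apply an agent-authored change, or queue it if approval is required."""
        if self.require_approval:
            self.pending.append((name, content))
            return "pending"
        self.skills[name] = content
        return "applied"

    def approve(self, name: str) -> bool:
        """A human approves the oldest pending change for `name`."""
        for i, (pending_name, content) in enumerate(self.pending):
            if pending_name == name:
                self.skills[name] = content
                del self.pending[i]
                return True
        return False


store = SkillStore(require_approval=True)
print(store.agent_write("github-code-review", "patched body"))
print("github-code-review" in store.skills)
store.approve("github-code-review")
print(store.skills["github-code-review"])
```

With the flag on, nothing the agent writes reaches the live skill set until a person approves it, which is the property that matters in production.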

Eight-step rollout: from writing SKILL.md to team Tap and GEPA evolution

01. Write frontmatter per agentskills.io: name (lowercase hyphens, max 64 chars), description starting with "Use when..." (max 1,024 chars), including trigger conditions and exclusion cases.
02. Build a modular directory: main file ≤500 lines, detailed API docs in references/, executable scripts in scripts/. Validate with skills-ref validate ./my-skill.
03. Create a Skill Bundle: write YAML in ~/.hermes/skill-bundles/ or use the hermes bundles create CLI to pack related workflow skills.
04. Configure conditional activation: set requires_toolsets / fallback_for_tools in metadata.hermes for smart free/paid tool switching.
05. Publish a Tap repo: create a GitHub repo with categorized directories plus an optional skills.sh.json; teammates then run hermes skills tap add github:your-org/tap.
06. Version-control sync: cd ~/.hermes/skills && git init, then git pull && hermes skills reset across devices.
07. Run GEPA evolution: clone hermes-agent-self-evolution, run evolve_skill against failure traces, and review the PR manually before merge.
08. Enable the approval gate: set skills.agent_writes_require_approval: true in production, and write concrete failure modes and fixes in the Pitfalls section. That section is the quality dividing line.

Example: blog-workflow bundle

yaml — blog-workflow bundle

name: blog-workflow
description: Full tech blog writing workflow.
skills:
  - seo-keyword-research
  - outline-generator
  - code-example-validator
  - bilingual-checker
  - publish-to-platform
instruction: |
  Always research SEO keywords before writing.
  Ensure all code examples are tested and runnable.

Level 0 token budget: all skill descriptions combined total roughly 3K tokens — the first cost gate.
GEPA per-run cost: about $2–10 in API fees, no GPU — suitable for scheduled evolution on an always-on host.
GEPA size limit: skill files must be ≤15KB or guardrails block PR generation.
Cross-platform reuse: the same SKILL.md copies to ~/.claude/skills/ or installs via kevinnft/ai-agent-skills for multi-client use.
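
For budgeting scheduled evolution, the per-run figure scales linearly. Here is a quick Python sketch of the arithmetic; the $2 to $10 range comes from the figure above, while the skill count and run frequency are made-up inputs for illustration.

```python
def monthly_cost_range(
    skills: int,
    runs_per_skill_per_month: int,
    low_per_run: float = 2.0,
    high_per_run: float = 10.0,
) -> tuple[float, float]:
    """Return the (low, high) monthly API spend in dollars."""
    runs = skills * runs_per_skill_per_month
    return runs * low_per_run, runs * high_per_run


# Hypothetical schedule: 5 skills, each evolved weekly (4 runs a month).
print(monthly_cost_range(skills=5, runs_per_skill_per_month=4))  # (40.0, 200.0)
```

Twenty runs a month lands between $40 and $200 in API fees, so the frequency of scheduled runs, not hardware, is the main cost lever.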

Skills are procedural knowledge; MCP is a tool interface — they complement each other. After editing a skill, the current session still uses the cached version until you run /reset or install with --now. Keep descriptions in English (or bilingual) — underlying LLMs match English descriptions more precisely.

A sleeping laptop, a low-spec VPS without native macOS paths, or home Wi-Fi dropouts will kill Gateway and GEPA evolution jobs at the worst moment. For production environments that need stable 24/7 Hermes Skills compounding with native macOS launchd daemons, NodeMini Mac Mini M4 cloud rental is usually more reliable than a makeshift laptop plus manual restarts — see rental rates for current specs.

FAQ


Q: How do Skills differ from MCP?

Skills are procedural knowledge documents that teach the agent how to do something. MCP is a tool interface that gives the agent additional tool-calling capability. They complement each other: MCP provides database access; a Skill teaches how to run a database migration correctly.

Q: Why does my skill change not take effect?

The current session caches the old version. Run /reset to start a new session, or add --now during install to force a refresh (this invalidates Prompt Cache and uses more tokens).

Q: What host is best for running Hermes Skills self-evolution?

GEPA evolution and Gateway 24/7 uptime need stable connectivity. NodeMini offers dedicated Mac Mini M4 monthly rental; see rental rates for current pricing and the help center for access questions.

Q: Are GEPA-evolved skills safe?

Automated guardrails cover a full test suite pass, size limits, Prompt Cache compatibility, and semantic preservation, where drift detection ensures the skill does not stray from its original purpose. Every change then goes through human PR review, so still read every PR diff manually before merge.