Published about 23 hours ago 15 min read

🔮 Hermes Agent 🤖: A Practical Guide 🔥 — and How It Stacks Up Against OpenClaw & GoClaw 📊

MayFest2026

A practical deep-dive for engineers, founders, and curious builders.

Hermes Agent is the agent framework that, in roughly twelve weeks since its February 2026 release, has gone from a niche Nous Research project to 140,000+ GitHub stars and the most-used agent on OpenRouter. That growth is not just hype — it reflects a meaningful design shift away from "agents as orchestrated prompt graphs" toward agents as long-lived, self-improving processes that own their own learning artifacts.

Companion reads: 🔮 Hermes Agent 🤖 — Deep Dive & Build-Your-Own Guide 📘 and 🏗️ Building High-Quality AI Agents 🤖 — A Comprehensive, Actionable Field Guide 📚.

This article is a working engineer's tour:

🧠 What Hermes is, and what genuinely separates it from LangGraph / CrewAI / AutoGen
🏗️ Its core architecture
⚔️ How it compares with two adjacent open-source projects: OpenClaw and GoClaw — when to pick which
🌍 Real-world and personal use cases
🔌 Integration patterns into existing apps and SaaS
🛠️ A setup / extend / customize playbook
💭 An opinion on what open, capable agent systems mean for the future of AI development

1. 🧠 What Hermes Agent Actually Is

Hermes is an open-source, model-agnostic, long-running AI agent built by Nous Research. The tagline — "the agent that grows with you" — is technically literal: Hermes is the only mainstream agent framework with a built-in learning loop that creates, edits, and improves its own skills during normal use.

It ships as:

A CLI / TUI you run locally (hermes).
A messaging gateway that turns Telegram / Discord / Slack / WhatsApp / Signal / Email / Matrix into agent surfaces.
A web UI and an Agent Client Protocol (ACP) endpoint for AI-native editors.
A cron scheduler for unattended work.
A pluggable terminal backend layer: local, Docker, SSH, Singularity, Modal, Daytona, Vercel Sandbox — including serverless backends that hibernate when idle, so a 24/7 agent can cost essentially nothing.

It supports 200+ models through Nous Portal, OpenRouter, OpenAI, Anthropic, NVIDIA NIM, Hugging Face, NovitaAI, z.ai/GLM, Kimi, MiniMax, xAI Grok, and any OpenAI-compatible endpoint. Switching providers is hermes model — no code change.

✨ What separates it from LangGraph, CrewAI, AutoGen

The popular frameworks treat an agent as a graph or crew you define ahead of time. You design nodes, you wire edges, you ship. The agent's capability is bounded by what you prompted into it.

Hermes treats an agent as a process that accumulates capability over time. Concretely:

| Dimension | LangGraph / CrewAI / AutoGen | Hermes Agent |
| --- | --- | --- |
| Primary abstraction | Graph / crew / message-passing topology you author | Long-running loop with self-edited memory & skills |
| Where capability lives | In the code you wrote and the prompts you crafted | In skills (markdown procedural memory) the agent writes and improves itself |
| Learning | None built-in; re-runs are stateless unless you wire it | Closed learning loop: skills self-curate; cross-session recall via FTS5 + LLM summarization; Honcho-style user modeling |
| Surfaces | You build them (FastAPI, Streamlit, etc.) | CLI, TUI, messaging gateway (20+ platforms), web UI, ACP, cron, all included |
| Execution | Your process | Pluggable: local, Docker, SSH, Modal, Daytona, Vercel Sandbox |
| Persistence | DIY (sqlite, Redis, vector store) | Frozen-snapshot memory + SessionDB (FTS5) + pluggable provider (Honcho / mem0 / supermemory) |
| Distribution of skills | Re-implement in code per project | Portable markdown skills via agentskills.io open standard |
| Sweet spot | Multi-agent orchestration, deterministic pipelines, research pipelines | Personal assistant, always-on operator, long-horizon tasks, knowledge work |

Said differently: LangGraph is a build-time framework. Hermes is a run-time being. The two are not competitors so much as different scales of the same problem — LangGraph is excellent for building a deterministic flow inside an enterprise app; Hermes is excellent when you want an agent that lives somewhere, hears you across channels, and gets better at you over months.

2. 🏗️ Core Architecture

Hermes' architecture is deceptively simple — almost every "feature" is a thin layer over a single, stable agent loop.

                            ┌─────────────────────────────────┐
                            │         User Surfaces           │
                            │  CLI · TUI · Gateway · Web ·    │
                            │     ACP · Cron · Subagents      │
                            └────────────────┬────────────────┘
                                             │
                            ┌────────────────▼────────────────┐
                            │          Agent Loop             │
                            │  prompt → think → tool → obs →  │
                            │   memory write → continue       │
                            └──┬──────────────┬───────────┬───┘
                               │              │           │
        ┌──────────────────────▼─┐  ┌─────────▼────┐  ┌───▼─────────────────┐
        │     System Prompt      │  │    Tools     │  │   Skills (Markdown) │
        │  (cache-stable header) │  │  70+ builtin │  │ ~/.hermes/skills/   │
        │                        │  │  + MCP + you │  │ self-edited         │
        └────────────────────────┘  └──────┬───────┘  └─────────────────────┘
                                           │
                              ┌────────────▼─────────────┐
                              │  Execution Environment   │
                              │ local · Docker · SSH ·   │
                              │ Modal · Daytona · Vercel │
                              └──────────────────────────┘
                                           │
                              ┌────────────▼─────────────┐
                              │         Memory           │
                              │ Frozen-snapshot · FTS5   │
                              │  SessionDB · Honcho      │
                              └──────────────────────────┘

The pieces worth understanding in depth:

2.1 🔄 The Agent Loop

A textbook think → act → observe loop, but with two non-obvious decisions baked in:

Cache-friendly prompt layout. The system prompt header is deliberately stable across turns so provider-side prompt caching (especially Anthropic's) hits 80–95% of the time. This is the single biggest cost lever — on Hermes' default Claude config, prompt caching alone yields up to ~90% input-token savings on long sessions.
Skill nudges. The loop periodically prompts itself to reflect on whether the current trajectory should be captured as a reusable skill — that is what gives it the "self-improving" property.
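The cache-friendly layout described above can be sketched in a few lines. This is a minimal illustration, not Hermes' actual prompt builder: `STABLE_HEADER`, `build_prompt`, and `prefix_fingerprint` are hypothetical names, and the hash simply stands in for "the bytes a provider would cache".

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical stable header: identical bytes on every turn.
STABLE_HEADER = (
    "You are a long-running assistant.\n"
    "Tools: shell, fetch, search.\n"
)


def build_prompt(history: list[str], now: datetime) -> tuple[str, str]:
    """Return (cacheable_prefix, volatile_suffix) for one turn."""
    # Volatile values such as the clock go AFTER the stable prefix,
    # so they never change the bytes a provider would cache.
    volatile = f"Current time: {now.isoformat()}\n" + "\n".join(history)
    return STABLE_HEADER, volatile


def prefix_fingerprint(prefix: str) -> str:
    """Hash of the cacheable prefix; equal hashes mean equal prefixes."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


turn1 = build_prompt(
    ["user: hi"],
    datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc),
)
turn2 = build_prompt(
    ["user: hi", "assistant: hello"],
    datetime(2026, 5, 1, 9, 5, tzinfo=timezone.utc),
)
same_prefix = prefix_fingerprint(turn1[0]) == prefix_fingerprint(turn2[0])
```

The two turns differ in history and timestamp, yet the prefix fingerprint stays the same. Moving the timestamp into `STABLE_HEADER` would change the prefix on every turn, which is the failure mode that hurts cache hit rate.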

2.2 🧰 Tools

70+ built-in tools across filesystem, shell, browser, search, fetch, code execution, image/audio/video generation, and orchestration (spawnable subagents). Tools are self-registering: drop a Python module into tools/, the registry picks it up. You can also wire any MCP server; tool filters let you allow-list per-session.
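To make "self-registering" and "allow-list per session" concrete, here is a small sketch of the general pattern. It is an assumption-laden toy, not the real Hermes registry: `TOOL_REGISTRY`, `tool`, and `tools_for_session` are hypothetical names chosen for illustration.

```python
from typing import Callable

# Hypothetical in-memory registry; not the real Hermes registry.
TOOL_REGISTRY: dict[str, Callable[..., object]] = {}


def tool(name: str, description: str = "") -> Callable:
    """Decorator that registers a function under `name` when it is defined."""
    def decorator(fn):
        fn.description = description
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@tool(name="echo", description="Return the input unchanged.")
def echo(text: str) -> str:
    return text


@tool(name="word_count", description="Count whitespace-separated words.")
def word_count(text: str) -> int:
    return len(text.split())


def tools_for_session(allow_list: set[str]) -> dict[str, Callable[..., object]]:
    """Expose only allow-listed tools to a given session."""
    return {n: f for n, f in TOOL_REGISTRY.items() if n in allow_list}


session_tools = tools_for_session({"word_count"})
```

Because registration happens as a side effect of defining the function, importing a module is enough to make its tools visible; the allow-list then narrows what any one session can actually call.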

2.3 📚 Skills — the killer feature

A skill is a markdown file with optional YAML frontmatter that the agent stores under ~/.hermes/skills/<skill-name>/SKILL.md. The agent invokes skills by reference, sometimes nested. Three reasons this is bigger than it looks:

Procedural memory. The agent doesn't just remember facts — it remembers how to do things you've taught it.
Progressive disclosure. Skills can have multiple disclosure levels — a one-line description for retrieval, an expanded body when triggered, and deep references loaded on demand. This keeps the context window tight.
Self-improvement loop. Via the skill_manage tool, the agent can edit, fork, or retire its own skills based on what worked. v0.10.0 ships 118 bundled skills; the community Skills Hub (agentskills.io) tracks thousands more.
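Progressive disclosure is easier to see in code. The sketch below assumes a skill file with flat `key: value` frontmatter and uses a deliberately tiny parser instead of a YAML library; `Skill`, `parse_skill`, `retrieval_index`, and `expand` are hypothetical names, not Hermes APIs.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str


def parse_skill(text: str) -> Skill:
    """Split a SKILL.md string into flat `key: value` frontmatter and body."""
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return Skill(meta.get("name", ""), meta.get("description", ""), body.strip())


def retrieval_index(skills: list[Skill]) -> str:
    """Level 1: only names and one-line descriptions enter the context."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def expand(skills: list[Skill], name: str) -> str:
    """Level 2: load the full body only for the skill that was triggered."""
    return next(s.body for s in skills if s.name == name)


SAMPLE = """---
name: weekly-review
description: Run a Friday weekly review with the user
---

When triggered:
1. Pull the last 7 days of journal entries.
"""

skills = [parse_skill(SAMPLE)]
index = retrieval_index(skills)
```

Only `index` would sit in the prompt by default; `expand(skills, "weekly-review")` pulls in the full procedure when the skill is actually needed, which is what keeps the context window small as the skill count grows.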

2.4 🗂️ Memory

Three independent mechanisms, intentionally layered:

Frozen-snapshot persistent memory — a stable, append-only log inserted into the cache-friendly portion of the prompt.
SessionDB — FTS5-indexed full-text store of every past session; recall is "search + LLM summarize the hits".
Pluggable provider — Honcho (dialectic user-model framework), mem0, or supermemory if you want fancier semantics.
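The "search + LLM summarize the hits" recall pattern can be approximated with Python's built-in `sqlite3`, assuming your SQLite build includes the FTS5 extension (most do). The table name, column names, and the `summarize` stand-in are illustrative assumptions; the real SessionDB schema is not shown in this article.

```python
import sqlite3

# In-memory database; table and column names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO sessions (session_id, content) VALUES (?, ?)",
    [
        ("s1", "Debugged the flaky login test by pinning the clock"),
        ("s2", "Drafted the quarterly report outline"),
        ("s3", "Flaky test again: race condition in the cache warmup"),
    ],
)


def recall(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Full-text search over past sessions, best matches first."""
    rows = conn.execute(
        "SELECT session_id, content FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


def summarize(hits: list[tuple[str, str]]) -> str:
    """Stand-in for the LLM summarization step: it just joins the hits."""
    return " | ".join(f"{sid}: {text}" for sid, text in hits)


hits = recall("flaky")
summary = summarize(hits)
```

In a real deployment the `summarize` step would be a model call that condenses the matched sessions before they are placed into the prompt; the search step is what keeps that call cheap.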

2.5 🌐 Surfaces

Hermes treats "how the user reaches the agent" as a separate concern from the loop:

TUI — the most polished terminal UI in the open-source agent space, with streaming, slash-command autocomplete, and multimodal output.
Gateway — bridges 20+ messaging platforms. This is what makes Hermes feel like a person you message rather than a tool you launch.
Cron — ~/.hermes/cron/ schedules unattended runs.
Subagents — spawnable, isolated peers for parallel workstreams (e.g., one searches, one drafts, one critiques).

2.6 🧪 RL & self-evolution

The companion project hermes-agent-self-evolution (ICLR 2026 Oral) uses DSPy + GEPA to optimize Hermes' skills, prompts, and even agent code against benchmarks. This is the research substrate behind "the agent improves itself" — and it is open.

3. ⚔️ Hermes vs OpenClaw vs GoClaw

These three projects rhyme, but they target different builders. Quick orientation:

Hermes — research-grade, Python/TS, self-improving, model-agnostic, ships as "the agent itself."
OpenClaw — TypeScript / Node, messaging-first, "your personal assistant on every channel you use," local-first daemon.
GoClaw — Go reimplementation of OpenClaw aimed at multi-tenant production: row-level isolation, 5-layer security, single ~25 MB binary, PostgreSQL + pgvector. CC BY-NC license.

3.1 📊 Feature matrix

| Feature | Hermes Agent | OpenClaw | GoClaw |
| --- | --- | --- | --- |
| Language | Python (88%) + TS | TypeScript / Node 24 | Go 1.26 + React |
| License | MIT | MIT | CC BY-NC 4.0 (non-commercial) |
| GitHub stars (May 2026) | ~140k | very high (the dominant "personal assistant" repo) | ~3.1k |
| Primary metaphor | Long-lived self-improving agent | Personal assistant on every channel | Enterprise multi-tenant agent platform |
| Tenancy | Single user | Single user (local-first) | Multi-tenant with workspace isolation |
| Memory | Frozen snapshot + FTS5 + Honcho/mem0 | Workspace `AGENTS.md`/`SOUL.md`/`TOOLS.md` | 3-tier (working/episodic/semantic) + pgvector |
| Channels | 20+ via Gateway | 23+ (WhatsApp, iMessage, Matrix, Tlon, Nostr, Twitch, WeChat, QQ…) | 7 (Telegram, Discord, Slack, Zalo, Feishu, WhatsApp, native WS) |
| Skills | Self-improving, agentskills.io standard, 118 bundled | ClawHub registry (~13.7k+ skills) | Skills + Knowledge Vault with `[[wikilinks]]` |
| Voice | Transcription | Wake-word on macOS/iOS, continuous on Android, ElevenLabs + system TTS | (less emphasized) |
| Canvas/UI surface | Web UI, TUI | Live Canvas (A2UI) rendered into companion apps | React dashboard |
| Execution backends | local, Docker, SSH, Modal, Daytona, Singularity, Vercel | Docker, SSH, OpenShell | Docker; static binary deploy |
| Security model | Tool approval, sandboxing per backend | Default-permissive `main` session; non-main is sandboxed | 5-layer (rate limit, prompt-injection detect, SSRF, AES-256-GCM, RBAC) plus row-level DB isolation |
| Self-improvement | Skill loop + DSPy/GEPA research path | Skills are user-authored | "Self-evolution within guardrails" (auto-adapt style/expertise; identity locked) |
| Best for | Personal long-running agent that learns you | Always-on personal assistant across every device & channel | Multi-tenant SaaS, enterprise teams of agents |

3.2 🎯 When to pick which

Pick Hermes if:

You want the strongest learning loop in the open-source space — skills, memory, self-improvement are the headline.
You want a single agent that grows with you over months and years.
You want to swap models freely (200+ supported) or run on serverless backends with near-zero idle cost.
You're building on top of an agent platform and want active research velocity (Nous Research is shipping fast, ICLR-grade work).
You're comfortable with Python.

Pick OpenClaw if:

Your dominant requirement is "I want the assistant to live where I already chat" — every messenger, every device.
You want first-class voice and Canvas rendering on Mac / iOS / Android.
You prefer TypeScript and the npm ecosystem; you want an installable daemon (openclaw onboard --install-daemon).
The agent's job is "respond reliably across channels" more than "plan autonomously over hours."

Pick GoClaw if:

You're shipping a product or SaaS that runs many agents for many users — multi-tenancy, row-level isolation, encrypted per-user API keys, and audit-friendly security matter.
You want enterprise operational characteristics: 25 MB single binary, sub-second startup, native concurrency, OTLP tracing, PostgreSQL durability.
You're a Go shop, or you want a runtime your platform/ops team can love.
⚠️ Note the CC BY-NC 4.0 license — commercial use requires a separate arrangement. If your business is for-profit SaaS, do due diligence before committing.

Pick more than one:

Hermes + OpenClaw is a credible pairing: Hermes as the brain (learning, skills, planning) routed into OpenClaw's channel/device surfaces.
Hermes for personal use + GoClaw for product is a common split: your team meets the same agent concepts twice, once as the user and once as the operator.

4. 🌍 Real-World Use Cases

4.1 🏢 Common production use cases

Always-on engineering operator. Wired to GitHub + Slack + your CI: triages issues, summarizes PRs, runs flaky-test bisection, files draft fixes, reports back in-channel.
Customer-facing support copilot. Behind a WhatsApp or Telegram gateway, handling Tier-1 support with sandboxed tool access to your knowledge base + ticket system.
Internal ops bot. Cron-driven: every morning pulls metrics dashboards, summarizes anomalies, drops a note in the team channel; runs ad-hoc investigations on demand.
Research assistant. Long-running, scopes literature reviews, maintains a personal knowledge base of summaries, and notices when new papers contradict prior ones.
Sales/CRM concierge. Watches inbound channels, drafts replies in your voice, schedules follow-ups via cron, hands hot leads to humans with a packaged brief.
Devrel / community manager. Across Discord + Twitter/X + GitHub, drafts responses, escalates real issues, maintains FAQ skills that improve every week.

4.2 👤 Personal / "agent for one" use cases

A second brain that talks back. Journals, recalls past projects via FTS5 SessionDB, surfaces patterns ("you've burned out the last three Aprils — want to lighten this week?").
Calendar / inbox triage. Connect Email + Telegram. The agent ingests, classifies, and drafts replies, and it never sends without approval until you trust it.
Personal trainer / coach. Skills like weekly-review, progressive-overload-plan, recovery-check accumulate over months — literally a coach that learns you.
Home automation brain. Webhook / MCP into Home Assistant. Natural-language schedules, anomaly alerts ("there's been a leak sensor spike, do you want me to close the main valve?").
Travel concierge. Pulls fare data, drafts itineraries, books via tool calls behind your confirmation, files receipts to a notes app.
Writing / creative partner. A long-running collaborator that remembers your style and last 80,000 words of context; skills can encode editing rules ("never use the word 'leverage'").
Tax / finance helper. Skills capture your accounting policies; one cron runs monthly reconciliations against bank exports; nothing leaves your machine.
Family group assistant. Sit Hermes (or OpenClaw) in a family Signal group: shared lists, reminders, photo organization, vacation planning.

5. 🔌 Integration Patterns for Existing Systems / SaaS

Hermes is intentionally open at every seam. The integration shapes you'll most likely use fall into two groups, inbound and outbound:

5.1 📥 Inbound integrations — letting the agent reach into your systems

MCP servers (recommended default). Wrap your internal APIs as MCP tools — your stack stays untouched and any agent (Hermes, Claude Desktop, Cursor, etc.) can consume it. Hermes filters MCP tools per session.
Custom Hermes tools (Python). Drop a module into tools/, declare a schema, the registry picks it up. Use this when you want first-class tool ergonomics, streaming, or tool-side caching.
Webhooks via the cron / event bus. Schedule pulls (every 10 min, fetch open tickets) or expose webhook endpoints that drop an event onto the agent's queue.
The Gateway as inbox. Treat Telegram/Slack/Email as the input plane — your existing messaging surface becomes the agent's UI without you building one.

5.2 📤 Outbound — embedding the agent into your product

ACP (Agent Client Protocol). Hermes speaks ACP, so AI-native editors (Cursor-style) and any ACP client can drive it. This is the cleanest way to embed an agent into a desktop or editor product.
Web UI iframe / API. hermes web exposes a usable UI; for deeper integration, wrap the agent process and proxy I/O.
Subagents as microservices. Spawn a subagent per request from your backend; let it run isolated in a Daytona/Modal sandbox; collect the trajectory.
Trajectory export → fine-tuning. Hermes ships batch trajectory generation; you can use real production runs to fine-tune cheaper local models for your domain.

5.3 🧱 Architecture sketch for a SaaS

A pragmatic three-tier embedding:

[Your SaaS]
   │
   ├── /api/* (your existing app)
   │
   └── /agent/* ── proxy ──► [Hermes process]
                              │
                              ├── MCP ──► your internal API (Stripe, Postgres, S3, etc.)
                              ├── Sandbox: Modal / Daytona (per-tenant)
                              └── Memory: Postgres + pgvector (per-tenant namespace)

For multi-tenant scenarios specifically (one agent per customer), this is where GoClaw earns its keep: it gives you tenant isolation, encrypted per-user keys, and row-level DB security out of the box, so you don't have to build them.

5.4 ⚠️ Common gotchas

Cache invalidation. Anything that mutates the cache-stable prompt header (timestamps, dynamic counters) tanks prompt-cache hit rate. Keep volatile content below the cache boundary.
Skill explosion. Without grooming, an agent will accumulate 500 mediocre skills. A periodic skill_manage review (or a cron job that runs one) is worth its weight in gold.
Tool approval UX. In a user-facing product, "agent wants to run X" prompts need real product thought — don't paper over with auto-approve.
Cost. Skills + memory + long sessions = many tokens. Lean hard on prompt caching, and consider mixing a small local model for routine turns.
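The cost point is easy to sanity-check with arithmetic. The sketch below assumes cached input tokens are billed at 0.1x the base input price and uncached tokens (which must be written to the cache) at 1.25x; those multipliers are assumptions for illustration, so check your provider's current price sheet before relying on them.

```python
def input_cost_ratio(
    hit_rate: float,
    read_mult: float = 0.10,   # assumed price multiplier for cache reads
    write_mult: float = 1.25,  # assumed price multiplier for cache writes
) -> float:
    """Input cost relative to an uncached run (1.0 means no change)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * read_mult + (1.0 - hit_rate) * write_mult


def savings(hit_rate: float) -> float:
    """Fraction of input-token spend saved by caching (negative = costs more)."""
    return 1.0 - input_cost_ratio(hit_rate)


for rate in (0.0, 0.5, 0.8, 0.95, 1.0):
    print(f"hit rate {rate:.0%}: savings {savings(rate):+.1%}")
```

Under these assumptions the savings only approach 90% as the hit rate approaches 100%, and a prompt that never hits the cache costs more than running without caching at all. That is why a volatile value in the stable header is so expensive.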

6. 🛠️ Setup / Run / Customize / Extend

6.1 🚀 Install (Linux / macOS / WSL2 / Termux)

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc
hermes

Windows native: a PowerShell one-liner installs uv, Python 3.11, Node, ripgrep, ffmpeg, and a bundled MinGit.

For contributors:

git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh

6.2 ⌨️ Day-1 commands

| Action | Command |
| --- | --- |
| Interactive chat | `hermes` |
| TUI mode | `hermes --tui` |
| Pick model/provider | `hermes model` |
| Configure tools | `hermes tools` |
| Start messaging gateway | `hermes gateway` |
| Open web UI | `hermes web` |
| Migrate from OpenClaw | `hermes claw migrate` |
| In-chat: reset | `/new` or `/reset` |
| In-chat: change model | `/model anthropic:claude-opus-4-7` |
| In-chat: skills | `/skills` or `/<skill-name>` |
| In-chat: compress context | `/compress` |
| In-chat: set persona | `/personality coach` |

6.3 ✍️ Writing a skill

Skills are just markdown. The smallest useful one:

---
name: weekly-review
description: Run a Friday weekly review with the user
triggers: ["weekly review", "friday review"]
---

When triggered:
1. Pull the last 7 days of journal entries from SessionDB.
2. Group by theme; surface 3 wins, 3 frictions, 1 pattern.
3. Ask the user one sharp question, then propose next week's top 3.

Drop it into ~/.hermes/skills/weekly-review/SKILL.md. The agent will discover it via progressive disclosure (description first; full body when relevant). To share, publish to the Skills Hub.
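If you would rather script that step than copy the file by hand, a small helper along these lines would do it. `install_skill` is a hypothetical helper, not a Hermes command, and the demo writes to a throwaway directory so it never touches a real `~/.hermes` folder.

```python
import tempfile
from pathlib import Path

SKILL_TEXT = """---
name: weekly-review
description: Run a Friday weekly review with the user
triggers: ["weekly review", "friday review"]
---

When triggered:
1. Pull the last 7 days of journal entries from SessionDB.
2. Group by theme; surface 3 wins, 3 frictions, 1 pattern.
3. Ask the user one sharp question, then propose next week's top 3.
"""


def install_skill(skills_root: Path, name: str, text: str) -> Path:
    """Write `text` to <skills_root>/<name>/SKILL.md and return the path."""
    target = skills_root / name / "SKILL.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


# Demo against a throwaway directory; for a real install you would pass
# Path.home() / ".hermes" / "skills" as the root instead.
demo_root = Path(tempfile.mkdtemp())
installed = install_skill(demo_root, "weekly-review", SKILL_TEXT)
```

Keeping the skills root as a parameter also makes it easy to stage skills in a repository and sync them to several machines.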

6.4 🔧 Writing a custom tool

A tool is a Python module that the self-registering registry picks up. Pattern:

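# tools/jira_search.py
from hermes.tools import tool

# NOTE: `jira_client` is not defined in this snippet. It stands in for
# whatever Jira client object your project already configures.

@tool(name="jira_search", description="Search Jira issues by JQL.")
def jira_search(jql: str, limit: int = 20) -> list[dict]:
    """Run a JQL query and return up to `limit` matching issues."""
    return jira_client.search(jql=jql, limit=limit)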

Reload tools (hermes tools) and the agent can call it. For shared/installable tools, prefer MCP.

6.5 🎭 Customizing personality & context

Personalities: ~/.hermes/personalities/<name>.md — slot in via /personality <name>.
Context files: project-level markdown that becomes part of every conversation in that project (think CLAUDE.md, but Hermes-native).
Cron: ~/.hermes/cron/ — drop YAML/markdown schedules; the daemon runs the agent unattended.

6.6 🧩 Extending the runtime itself

Memory provider. Swap to Honcho, mem0, or supermemory via config.
Execution backend. Switch from local → Docker → Modal/Daytona with a config change; no code rewrite.
Surface. Add an ACP client, expose /v1/agent over your own HTTP layer, or write a new gateway adapter (the gateway is a clean adapter pattern).
Plugins. The plugin system + COMMAND_REGISTRY pattern lets you add slash commands and entirely new subsystems without forking core.

6.7 ✅ Production checklist

Pin a specific Hermes version; don't ride main in production.
Run in Docker (or Modal/Daytona) — never local backend for shared agents.
Set explicit tool allow-lists per session/profile.
Turn on prompt caching at the provider level; verify cache hit rate > 80%.
Cron a skill-grooming run weekly.
Log trajectories (cheap) — they become training data and audit trail.
Wrap external API tools with rate limits & circuit breakers; agents will hammer broken endpoints harder than humans.
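For the last checklist item, a circuit breaker is the smaller of the two pieces to build (rate limiting is not shown here). The sketch below is a generic, framework-independent version; `CircuitBreaker`, `CircuitOpenError`, and `broken_endpoint` are illustrative names, and the thresholds are arbitrary defaults.

```python
import time
from typing import Callable


class CircuitOpenError(RuntimeError):
    """Raised when calls are blocked because the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None if closed

    def call(self, fn: Callable[..., object], *args, **kwargs) -> object:
        # While open, fail fast until the cool-down period has passed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("endpoint temporarily disabled")
            # Cool-down over: allow one trial call; one failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the breaker fully
        return result


def broken_endpoint() -> None:
    """Simulates an external API that is currently down."""
    raise ConnectionError("upstream is down")
```

Wrapping a tool's outbound call in `breaker.call(...)` means that after `max_failures` consecutive errors the agent gets an immediate `CircuitOpenError` instead of hammering the endpoint, and it can report that in its observation rather than retrying in a loop.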

7. 💭 Opinion — What an Open, Capable Agent System Means for AI Development

Three years ago, "agent framework" meant "fancy retry loop around a chat completion." Hermes — and the OpenClaw/GoClaw lineage — represent something genuinely different, and it's worth naming:

1. The unit of software is shifting from "app" to "agent." An app is a UI + business logic + persistence. An agent is a process + tools + memory + a way to be reached. Hermes treats every surface (CLI, messaging, web, ACP, cron) as interchangeable adapters to the same underlying being. Once you internalize that, building "an app" and "an agent that does the app's job" stop being separate disciplines — and the agent wins almost every time, because it composes with everything else the user has.

2. Self-improvement, when it's just markdown, is real. The deepest insight in Hermes' design is unglamorous: skills are markdown files the agent writes. No vector store gymnastics, no opaque fine-tunes — just a folder of text files that the loop edits. That's enough for a closed learning loop, because LLMs are extraordinarily good at reading and writing their own instructions. The implication is that a long-lived open agent will, in practice, become as capable as proprietary ones — not by matching their base model, but by accumulating thousands of small procedural wins their stateless competitors can't.

3. Openness changes the economics. With serverless backends like Modal/Daytona that idle at near-zero cost, support for 200+ models, and an MIT license, the marginal cost of running a personal Hermes is approaching nothing. We are roughly one user-experience cycle away from the world where running your own agent is more natural than using a hosted one, the same way self-hosting a wiki briefly was, before it wasn't, and then was again with Obsidian. The companies that bet exclusively on hosted agent moats are going to have to find a different moat.

4. The interesting frontier moves from models to artifacts. The model is becoming a commodity input. What differentiates one user's agent from another is the artifact graph that accumulates around it — their skills, their memories, their personalities, their tool wiring, their channel presence. That graph is portable, exportable, forkable, gift-able. It is the part that's yours. Hermes is the first major framework to take that seriously by design.

5. The risks compound the same way. A self-editing agent with tool access is exactly as much of a security problem as it sounds. The trio of agent runs tools + agent edits its own instructions + agent persists across sessions is a genuinely new threat surface. GoClaw's 5-layer model (rate limits, prompt-injection detection, SSRF guards, AES-256-GCM, RBAC) plus row-level DB isolation is the floor, not the ceiling, for anyone running this for other people. Expect "agent security" to become a discipline with its own conferences within 18 months.

6. The community wins. The agentskills.io standard is the part of this story I'd watch closest. A portable, vendor-neutral skill format means a skill someone wrote for Hermes can run inside OpenClaw, can run inside your in-house framework, can be inspected and forked. Compare to the alternative — every vendor's "GPTs / Agents / Assistants" being a walled garden. The open-skill bet is the same bet HTTP made against AOL: more chaotic in the short run, structurally inevitable in the long.

The bottom line. Hermes is not "the best agent framework" the way React is "the best UI framework." It's the first credible attempt at a living agent — a piece of software that runs continuously, reaches you where you already are, edits itself, and gets noticeably better at you over time. That's a different product category, and the next five years of personal/professional AI use are going to be defined by whoever masters it. If you build software for a living, spend a weekend with Hermes — not because you'll necessarily adopt it, but because the shape of what you're building is changing, and this is one of the clearest views of the new shape that exists today.


If you found this helpful, let me know by leaving a 👍 or a comment! If you think this post could help someone, feel free to share it. Thank you very much! 😃
