0

🦊GoClaw Deep Dive πŸ€– β€” A Builder's Guide to a Multi-Tenant AI Agent Platform πŸ“˜

Source: https://github.com/nextlevelbuilder/goclaw β€” a Go-based, multi-tenant AI agent gateway with 20+ LLM providers, 7 messaging channels, an 8-stage pipeline, 3-tier memory, and 5-layer security.

This document distills GoClaw's architecture into the principles, patterns, and concrete building blocks you need to build a similar platform from scratch. Read top-to-bottom for theory, jump to Part 4 β€” Build-It-Yourself Blueprint for a sequenced implementation plan.


Table of Contents

  1. 🧠 What GoClaw Actually Is (mental model)
  2. βš™οΈ The 11 Core Principles
    1. πŸ”„ The Agent Loop: Think β†’ Act β†’ Observe
    2. πŸ”§ The 8-Stage Pluggable Pipeline
    3. πŸ€– Provider Abstraction & Resilience
    4. πŸ› οΈ The Tool Registry Pattern
    5. 🧠 3-Tier Memory (L0/L1/L2)
    6. 🏒 Multi-Tenant Isolation by Default
    7. πŸ›‘οΈ 5-Layer Defense-in-Depth Security
    8. πŸ’Ύ Persistence: Interface-First, Dual Backend
    9. πŸ“‘ Channels as Pluggable Adapters
    10. 🀝 Teams, Delegation, and Subagents
    11. 🌱 Self-Evolution with Guardrails
  3. πŸ” Cross-Cutting Patterns
  4. πŸ—ΊοΈ Build-It-Yourself Blueprint
  5. ⚠️ Anti-Patterns to Avoid
  6. πŸ“š Reference Map

Part 1 β€” 🧠 What GoClaw Actually Is

GoClaw is not a chatbot or "wrapper around OpenAI." It is an AI agent gateway β€” a backend service that sits between your application and LLM providers + tools + storage, and exposes a stable RPC/HTTP surface to the outside world.

[Browser / Telegram / Discord / Your SaaS Backend / CLI]
            β”‚ (WebSocket RPC, HTTP REST, OpenAI-compat /v1/chat/completions)
            β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚        GoClaw Gateway           β”‚
   β”‚  Auth Β· RBAC Β· Rate-limit       β”‚
   β”‚  Tenant Isolation Layer         β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚        Agent Engine             β”‚
   β”‚  Loop Β· Pipeline Β· Router       β”‚
   β”‚  Tools Β· Memory Β· Skills Β· MCP  β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚  PostgreSQL  Β·  Redis  Β·  Files β”‚
   β”‚  (sessions Β· agents Β· memory Β·  β”‚
   β”‚   traces Β· KG Β· vault Β· keys)   β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
        20+ LLM Providers (Anthropic, OpenAI, Gemini, …)

Three sentences that capture the design

  1. Agents are configurations, not code β€” defined by rows in a DB plus a few markdown bootstrap files (SOUL.md, IDENTITY.md, AGENTS.md, TOOLS.md).
  2. Everything is multi-tenant from day one β€” every table carries tenant_id, every query enforces it, and tenant scope flows through context.Context.
  3. Every concern is an interface with at least one implementation β€” providers, stores, channels, tools, all behind small interfaces so they can be swapped or mocked.

Part 2 β€” βš™οΈ The 11 Core Principles

2.1 πŸ”„ The Agent Loop: Think β†’ Act β†’ Observe

The fundamental shape of any agent is a loop. GoClaw caps it at 20 iterations by default and structures each iteration as three actions:

loop (≀ 20 times):
    THINK   β†’ Build prompt β†’ call LLM β†’ get response (text + tool calls?)
    if no tool calls: BREAK
    ACT     β†’ Execute tool calls (parallel if multiple)
    OBSERVE β†’ Append tool results back into the message history
finalize β†’ sanitize output, persist messages, emit completion event

Key implementation details:

Detail Value Why
Max iterations 20 Prevents runaway loops; configurable per-agent and per-request
Parallel tools goroutines + result sort by index Latency win when LLM calls 3+ tools at once
Single tool sequential Goroutine overhead isn't worth it
Mid-loop compaction trigger at 75% of context window Summarize first ~70% of history in-place to avoid overflow
Cancel handling context.Background() fallback for trace finalize Ensures the trace record always saves even on /stop

Build it yourself: start here. A loop with one provider, one tool (echo), one in-memory session store, and a for i := 0; i < 20; i++ is a 200-line program that already works.


2.2 πŸ”§ The 8-Stage Pluggable Pipeline

The V3 architecture turns the monolithic loop into 8 independent stages. Each stage is a Stage interface implementation that mutates a shared RunState.

Setup (once)
└─ ContextStage      Inject ctx (agentID, userID, locale), resolve workspace,
                     ensure per-user files exist, persist IDs on session.

Iteration loop (≀ 20)
β”œβ”€ ThinkStage        Build system prompt (15+ sections), filter tools via policy,
β”‚                    call LLM, record span, emit `chunk` events.
β”œβ”€ PruneStage        If context > 25%: soft-trim oversized tool results.
β”‚                    If > 50%: hard-clear. Run sanitizeHistory after.
β”œβ”€ ToolStage         Execute tool calls (parallel for multi-call).
β”‚                    Emit `tool.call` / `tool.result`.
β”œβ”€ ObserveStage      Append tool results to message buffer.
β”‚                    Handle `NO_REPLY` convention (silent completion).
└─ CheckpointStage   Increment iteration. Break on max-iters or ctx cancel.

Finalize (once)
└─ FinalizeStage     7-step output sanitization, atomic message flush,
                     update session metadata, emit `run.completed`.

Why this matters:

  • Each stage is testable in isolation (stages_test.go per stage).
  • New behavior (e.g. a RagStage) is one file β€” no surgery on a 2k-line runLoop().
  • Both V2 (monolithic) and V3 (pipeline) can coexist behind a feature flag.

Stage interface (sketch):

type Stage interface {
    Name() string
    Run(ctx context.Context, state *RunState) (StageResult, error)
}

type StageResult int
const (
    Continue   StageResult = iota // proceed to next stage
    BreakLoop                      // exit iteration loop
    AbortRun                       // abort the entire run
)

Lesson: Pluggable pipelines beat monolithic loops once the loop has more than ~3 conditional branches. Pay the abstraction cost early.


2.3 πŸ€– Provider Abstraction & Resilience

A Provider is a tiny interface. Everything that's hard about LLMs lives inside this seam.

type Provider interface {
    Name() string
    DefaultModel() string
    Chat(ctx context.Context, req ChatRequest) (ChatResponse, error)
    ChatStream(ctx context.Context, req ChatRequest, onChunk func(Chunk)) (ChatResponse, error)
}

Every backend β€” Anthropic native HTTP+SSE, OpenAI-compatible (Groq, DeepSeek, Gemini, Mistral via the same wire format), Claude CLI subprocess, ACP JSON-RPC, DashScope wrapper β€” implements this interface. The agent loop never knows which one it's talking to.

Resilience layers wrapped around providers:

Layer Purpose
Retry Exponential backoff with jitter; honors Retry-After; retries 5xx + network errors only (not 4xx)
Cooldown Per-model cooldown timer after repeated failures β€” skip the model for N seconds
Failover 2-tier: rotate API profiles, then degrade to a fallback model
Cache Composable middleware β€” caches identical prompts within a TTL
Service tier Middleware that picks priority/flex/auto tier per request
Error classify Map raw provider errors to 9 canonical reasons (rate-limit, context-overflow, auth, etc.)

Wire-format quirks live in the adapter, not the loop. Examples:

  • Anthropic uses x-api-key; OpenAI-compat uses Bearer; Codex uses OAuth + token refresh.
  • Claude CLI is a subprocess speaking stdio; ACP is JSON-RPC 2.0 over stdio.
  • DashScope wraps Qwen with a custom thinking-budget mapping.

Lesson: When you support N providers, the spread of behaviors is enormous. Force every quirk through one interface and you keep the agent loop boringly simple.


2.4 πŸ› οΈ The Tool Registry Pattern

Tools are the agent's hands. Every tool call goes through one place: Registry.ExecuteWithContext. The registry mediates every invocation.

Agent Loop
    β”‚ ExecuteWithContext(name, args, channel, chatID, ...)
    β–Ό
[Registry]
    1. Inject per-call context (channel, chatID, peerKind, sandbox key, workspace)
    2. Rate-limit check (token bucket per session key)
    3. Policy check (RBAC: is this tool allowed for this agent?)
    4. Execute the Tool.Execute(ctx, args)
    5. Scrub credentials from output (regex + dynamic registered values)
    6. Return Result{ ForLLM, ForUser, IsError, MediaRefs, ... }

Tool capabilities (metadata that drives policy):

Capability Examples
read-only read_file, web_search, memory_search β€” safe to retry
mutating write_file, exec, cron, team_tasks
async spawn β€” returns immediately, result delivered later
mcp-bridged Anything proxied to an external MCP server

The Policy Engine filters tools through 7 layers before sending the list to the LLM:

  1. Global profile (full / coding / messaging / minimal)
  2. Provider profile override
  3. Global allow list
  4. Provider allow override
  5. Agent allow
  6. Agent + provider allow
  7. Group allow β†’ then deny lists β†’ then AlsoAllow (additive) β†’ then subagent deny β†’ final list

The 4-tier config overlay (most specific wins):

  1. Per-agent override (agents.builtin_tool_settings)
  2. Per-tenant override (builtin_tool_tenant_configs)
  3. Global default (builtin_tools.settings)
  4. Hardcoded fallback (in tool code)

Built-in tool inventory (the floor you should aim for):

Group Tools
fs read_file, write_file, list_files, edit, send_file
runtime exec (with credentialed CLI mode for secret injection)
web web_search, web_fetch (with allow/block domains)
memory memory_search, memory_get, memory_expand
sessions sessions_list, sessions_history, sessions_send, spawn
automation cron, datetime, heartbeat
messaging message, create_forum_topic, list_group_members
team team_tasks (create/list/claim/complete/comment/attach/...)
media-gen create_image, create_audio, create_video, tts
media-read read_image, read_audio, read_document, read_video
knowledge vault_search, vault_read, knowledge_graph_search, skill_search

Custom tools are shell commands with Go-template placeholders, stored in custom_tools table. Hot-reloaded via pub/sub on change. Supports encrypted env vars for credentials.

Virtual filesystem interceptors route specific paths to the database, not disk:

  • ContextFileInterceptor β†’ routes SOUL.md, IDENTITY.md, etc. to agent_context_files / user_context_files.
  • MemoryInterceptor β†’ routes MEMORY.md, memory/* to memory_documents. Writing a .md triggers chunking + embedding automatically.

Path security: every filesystem op runs through resolvePath() which filepath.Clean()s and verifies the result starts with the workspace prefix. Blocks path traversal.

Lesson: the tool registry is where security lives. If every tool call doesn't go through one chokepoint, you have no place to enforce rate-limit / RBAC / scrubbing.


2.5 🧠 3-Tier Memory (L0/L1/L2)

GoClaw treats memory as a progressive loading problem: cheap context first, expensive context only when asked.

L0 β€” Working Memory                L1 β€” Episodic                L2 β€” Semantic
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Current session    β”‚             β”‚ Session summariesβ”‚         β”‚ Knowledge Graph  β”‚
β”‚ messages           β”‚   ───────►  β”‚ + L0 abstracts   β”‚ ──────► β”‚ entities +       β”‚
β”‚ (auto-injected     β”‚             β”‚ (~50 tokens)     β”‚         β”‚ relations        β”‚
β”‚  if relevant)      β”‚             β”‚ + embeddings     β”‚         β”‚ + temporal       β”‚
β”‚ Threshold-based    β”‚             β”‚ 90-day retention β”‚         β”‚   validity       β”‚
β”‚ compaction         β”‚             β”‚ Hybrid search    β”‚         β”‚ (valid_from/to)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β–²                                  β–²                            β–²
   auto-inject                        memory_search                memory_expand
   (ContextStage)                     (tool, top-K)                (tool, full doc)

The progressive flow:

  1. L0 auto-injection β€” On every turn, ContextStage runs AutoInjector which scores the user message against episodic summaries + KG entities. If relevance β‰₯ 0.3, inject up to 5 entries / 200 tokens at the top of the system prompt. Free for the agent β€” no tool call.
  2. L1 unified search β€” When the agent calls memory_search(query), it runs hybrid search (BM25 + vector) across both episodic L0 abstracts and KG entities. Returns top K within score threshold.
  3. L2 deep retrieval β€” When the agent calls memory_expand(episodic_id), it loads the full summary plus linked KG edges.

Hybrid search formula:

combined_score = vector_score * 0.7 + fts_score * 0.3
                                              β”‚
              FTS: PostgreSQL tsvector +     β”‚
                   plainto_tsquery('simple')  β”‚
              Vector: pgvector with <=>       β”‚
                      cosine distance         β”‚
              Per-user boost: 1.2x            β”‚
              Dedup: per-user wins over globalβ”‚

Event-driven consolidation (the magic that fills L1/L2 over time):

run.completed event
       β”‚
       β–Ό
EpisodicWorker β†’ extract summary + L0 abstract via LLM
       β”‚
       β”‚ episodic.created event
       β–Ό
SemanticWorker β†’ extract entities/relations from summary, write to KG
       β”‚
       β”‚ entity.upserted event
       β–Ό
DedupWorker β†’ embedding-similarity merge, redirect relations

(separately, debounced 10m)
DreamingWorker β†’ batch unpromoted summaries scored by:
                 0.30 * frequency + 0.35 * relevance +
                 0.20 * recency  + 0.15 * freshness  (14-day half-life)
              β†’ LLM synthesis β†’ write to long-term memory / vault

Two compaction strategies for L0:

When Trigger Strategy
Mid-loop prompt_tokens >= 75% of context window during iteration Summarize first ~70% of in-memory messages, keep last ~30%
Post-run > 50 messages OR > 75% context window after run Per-session try-lock β†’ memory flush β†’ background summarize β†’ save summary + truncate to last 4 messages

Lesson: Memory is not a single tier. Treat it as a hierarchy with cost gradients (free auto-inject β†’ tool call for L1 β†’ tool call for L2). Use embeddings + FTS together, not either-or.


2.6 🏒 Multi-Tenant Isolation by Default

This is the single most consequential design decision β€” and the one most projects skip until it's painful.

Three rules, never broken:

  1. Every isolatable table has tenant_id NOT NULL. 40+ tables in GoClaw enforce this.
  2. Every query includes WHERE tenant_id = $N. No exceptions. Fail-closed.
  3. Tenant flows through context.Context. Resolved at the gateway, propagated everywhere, never taken from client headers (which can be spoofed).

Tenant resolution at the gateway:

Credential How tenant is resolved
Tenant-bound API key Auto from api_keys.tenant_id (the recommended path)
System-level API key + X-GoClaw-Tenant-Id header From header (UUID or slug); only system keys can do this
Gateway token + owner user ID All tenants (cross-tenant admin)
Channel webhook (Telegram, Discord, …) Baked into channel_instances.tenant_id at registration
No credentials Master tenant only (dev mode)

Per-tenant overrides β€” each tenant gets its own:

  • LLM provider configs and API keys
  • Tool settings (web_search providers, TTS voice, etc.)
  • Skills enabled/disabled
  • MCP servers + per-user credentials
  • Channel instances

API key flow:

[Your SaaS Backend] ── Bearer goclaw_sk_abc... ── [GoClaw]
                                                       β”‚
                                                       β–Ό
                                         api_keys table:
                                         hash = SHA-256(key)
                                         tenant_id = UUID
                                         scopes = [...]
                                                       β”‚
                                                       β–Ό
                                         ctx = WithTenantID(parent, tenantID)
                                                       β”‚
                                                       β–Ό
                                    All downstream queries:
                                    WHERE tenant_id = $N

Storage hardening:

  • API keys: SHA-256 at rest, constant-time compare for validation (crypto/subtle.ConstantTimeCompare).
  • Provider/MCP/custom-tool secrets: AES-256-GCM with aes-gcm: prefix + 12-byte nonce + ciphertext + tag, base64'd.
  • Master scope guard: writes to global tables (builtin_tools, config.*) require IsMasterScope(ctx) β€” otherwise tenant admin only.

Identity propagation pattern: GoClaw doesn't authenticate end-users. The upstream service (your SaaS backend, your auth proxy) provides user_id, opaque, max 255 chars. The recommended convention for multi-tenant deployments is tenant.{tenantId}.user.{userId}.

Lesson: Retrofitting multi-tenancy is one of the most painful migrations in software. Make tenant_id a column on day one, even if you only have one tenant.


2.7 πŸ›‘οΈ 5-Layer Defense-in-Depth Security

Each layer is independent β€” even if one is bypassed, the others still protect.

Layer 1 β€” 🌐 Transport

  • CORS allow-list validation
  • WebSocket message size limit: 512 KB
  • HTTP body limit: MaxBytesReader 1 MB
  • Timing-safe token comparison (crypto/subtle)
  • Rate limiting (token bucket, per user / per IP)
  • Ping/pong every 30s; read deadline 60s; write deadline 10s

Layer 2 β€” πŸ” Input Validation (InputGuard)

6 regex patterns scan every user message:

Pattern Catches
ignore_instructions "Ignore all previous instructions"
role_override "You are now a different assistant"
system_tags <\|im_start\|>system, [SYSTEM]
instruction_injection "New instructions:", "override:"
null_bytes \x00
delimiter_escape </instructions>, "end of system"

4 action modes: off / log / warn (default) / block.

Layer 3 β€” βš™οΈ Tool Execution

  • Shell deny groups β€” 15 classes, all denied by default: destructive_ops, data_exfiltration, reverse_shell, code_injection, privilege_escalation, dangerous_paths, env_injection, container_escape, crypto_mining, filter_bypass, network_recon, package_install, persistence, process_control, env_dump. Live-reloadable via pub/sub.
  • Path traversal prevention β€” resolvePath() cleans + prefix-checks every filesystem op.
  • SSRF guards β€” validateProviderURL() blocks 127.0.0.1/localhost for provider base URLs.
  • Credentialed CLI gate β€” when calling registered binaries (gh, gcloud, aws, kubectl, terraform), the exec tool injects encrypted env vars directly into the child process (no shell), unwraps sh -c wrappers up to depth 3 to prevent bypass, and fails-closed on DB error.
  • Domain allow/block β€” web_fetch honors per-tenant allow_domains / block_domains.

Layer 4 β€” 🧹 Output Sanitization

  • Credential scrubber β€” static regex patterns for OpenAI, Anthropic, GitHub, AWS keys + dynamic registry of runtime values. Replaces with [REDACTED]. Always-on.
  • Output sanitizer (7 steps applied to LLM output before delivery):
    1. Strip garbled tool XML (<tool_call>, <minimax:tool_call>, etc. from broken models)
    2. Strip downgraded text-format tool calls ([Tool Call: ...])
    3. Strip thinking tags (<think>, <thinking>, <antThinking>)
    4. Strip final wrapper tags (preserve inner content)
    5. Strip echoed [System Message] blocks
    6. Collapse consecutive duplicate paragraphs (model stuttering)
    7. Strip leading blank lines

Layer 5 β€” πŸ”’ Isolation

  • Per-user workspace β€” base + "/" + sanitize(userID), injected via WithToolWorkspace(ctx)
  • Docker sandbox β€” read-only root, dropped capabilities, scoped per-session
  • Subagent depth limit β€” max depth 1, max children 5/parent, max concurrent 8 system-wide

Lesson: Don't pick one security strategy. Layer them. Assume each one will fail and ask "what's the next line of defense?"


2.8 πŸ’Ύ Persistence: Interface-First, Dual Backend

Every store is a Go interface. Each interface has both a PostgreSQL implementation (server) and a SQLite implementation (Lite desktop). Selected at compile time via //go:build tags.

type SessionStore interface {
    GetOrCreate(ctx context.Context, key string) (*Session, error)
    AddMessage(ctx context.Context, key string, msg Message) error
    SetSummary(ctx context.Context, key, summary string) error
    Save(ctx context.Context, key string) error
    Delete(ctx context.Context, key string) error
    List(ctx context.Context, opts ListOpts) ([]*Session, error)
}

// PG: writes through to PostgreSQL, in-memory write-behind cache
// SQLite: same interface, plain SQLite, no FTS5/vector

Why this matters:

  • Write the agent loop once, ship a server edition (PG) and a desktop edition (SQLite + Wails app).
  • Tests use mocks against the interface.
  • Replace any backend without touching call sites.

The 22+ stores in the system:

Store What it owns
SessionStore Conversation history (with in-memory write-behind cache)
AgentStore Agent definitions, soft-delete, RBAC sharing
ProviderStore LLM provider configs, encrypted keys
MemoryStore Memory docs + chunks (FTS + pgvector hybrid)
EpisodicStore Session summaries with embeddings + recall scoring
KnowledgeGraphStore Entities + relations with temporal validity
VaultStore Knowledge vault docs + bidirectional wikilinks
TeamStore Teams, tasks (atomic claim), members, messages
CronStore Scheduled jobs + run logs
TracingStore Traces + spans (LLM, tool, agent)
MCPServerStore MCP server configs + grants
CustomToolStore Dynamic shell-based tools
ChannelInstanceStore Channel configs (Telegram bot tokens, Discord guild IDs, …)
ConfigSecretsStore Encrypted config values
BuiltinToolStore System tool metadata + per-tenant settings
PendingMessageStore Offline group-chat queue with auto-compaction
ContactStore Cross-channel contact dedup + merge
ActivityStore Audit log
SnapshotStore Hourly usage aggregations for dashboards
SecureCLIStore Credentialed binary configs (encrypted env)
APIKeyStore Gateway API keys (SHA-256 hashed)
HookStore Lifecycle hook definitions + execution audit

Two power patterns from the PG layer:

  1. xmax trick for "is this row new?"

    INSERT INTO user_agent_profiles (...) VALUES (...) 
    ON CONFLICT (...) DO UPDATE SET last_seen_at = NOW()
    RETURNING xmax = 0 AS is_new
    

    is_new = true means a real INSERT happened β†’ trigger first-time setup (seed context files). false means it was an UPDATE β†’ returning user.

  2. Atomic task claim (race-safe without distributed locks):

    UPDATE team_tasks
    SET status = 'in_progress', owner_agent_id = $1
    WHERE id = $2 AND status = 'pending' AND owner_agent_id IS NULL
    -- 1 row updated = claimed; 0 rows = someone else got it
    

Other PG conventions:

  • No ORM. database/sql with pgx/v5/stdlib. Raw SQL, $1/$2/$3 positional params.
  • Nullable columns via Go pointers (*string, *time.Time); helpers like nilStr() convert zero-values to nil.
  • execMapUpdate(map[string]any) builds dynamic UPDATE statements without one-function-per-field-combo.
  • UUID v7 (time-ordered) for all primary keys via GenNewID().
  • Required extensions: pgvector + pgcrypto.

Session caching pattern (write-behind):

Read:    GetOrCreate(key) β†’ cache miss? load from DB into cache β†’ return
Write:   AddMessage / SetSummary β†’ in-memory only (no DB write)
Save:    Save(key) β†’ snapshot under read lock β†’ flush to DB via UPDATE
Delete:  Delete(key) β†’ remove from cache + DB

Reads of List() go straight to DB to avoid stale results.

Lesson: Define stores as interfaces from line one. You'll thank yourself when you need a desktop edition, an in-memory test, or to swap PG for CockroachDB.


2.9 πŸ“‘ Channels as Pluggable Adapters

Each external messaging platform is an adapter that converts platform-specific events to a unified InboundMessage and platform-specific replies from a unified OutboundMessage.

7 supported channels:

Channel Transport DM Group STT Streaming
Telegram Long polling (telego) βœ“ βœ“ βœ“ βœ“
Feishu/Lark WebSocket / webhook βœ“ βœ“ βœ“ βœ“
Discord Gateway WebSocket βœ“ βœ“ βœ“ β€”
Slack Socket Mode βœ“ βœ“ β€” βœ“
WhatsApp Multi-device protocol βœ“ βœ“ βœ“ β€”
Zalo OA Webhook βœ“ β€” β€” β€”
Zalo Personal Reverse-engineered βœ“ βœ“ β€” β€”

4 internal channels (cli, system, subagent, browser) are silently skipped by the outbound dispatcher β€” they never reach an external platform.

Three DM access policies: pairing (8-character code, 60-min validity) / allowlist / open.

Session key format encodes everything you need:

agent:{agentId}:{channel}:direct:{peerId}     ← DM
agent:{agentId}:{channel}:group:{groupId}     ← Group
agent:{agentId}:subagent:{label}              ← Subagent
agent:{agentId}:cron:{jobId}:run:{runId}      ← Cron run
agent:{agentId}:main                          ← Default/main session

This single key fully scopes session state and enables cross-channel deduplication.

Lesson: Channels look diverse but reduce to two functions: Listen() -> InboundMessage and Send(OutboundMessage) -> error. Keep the agent loop ignorant of platform specifics.


2.10 🀝 Teams, Delegation, and Subagents

Three orchestration modes determine which inter-agent tools are available:

Mode Tools available When
Spawn (default) spawn No team, no delegate links
Delegate spawn, delegate agent_links table has rows for this agent
Team spawn, delegate, team_tasks teams table has a row for this agent

Resolution priority: Team > Delegate > Spawn.

Subagents (parallel child agents):

Limit Default
Max concurrent (system-wide) 8
Max spawn depth 1
Max children per parent 5
Auto-archive after 60 min
Max iterations per subagent 20

Subagent actions: spawn (async), run (sync), list, cancel (id/all/last), steer (cancel + respawn with new message). Subagents share the parent's SecureCLIStore β€” credentialed binary gate cannot be bypassed by delegation.

Teams (collaborative multi-agent with a shared task board):

User β†’ Team Lead (sees TEAM.md with member list + roles)
         β”‚
         β–Ό creates task on board
       team_tasks table
         β”‚  status: pending
         β–Ό atomic claim (SQL row lock)
       Member Agent β†’ works in their own session
         β”‚
         β–Ό on completion: result via message bus with "teammate:" prefix
       Team Lead β†’ synthesizes results β†’ replies to user

Only the lead receives TEAM.md in its system prompt. Members discover context through tools (team_tasks list, list_group_members). This saves tokens on idle agents.

Task states: pending / in_progress / in_review / completed / failed / cancelled / blocked / stale.

Task dependencies via blocked_by UUID[]: completing a task auto-unblocks dependents whose blockers are all complete.

Lesson: Don't overload a single agent with everything. Start with spawn for simple parallelism. Add delegate when agents have distinct skills. Add team_tasks when you need a board (work tracking, dependencies, peer messages).


2.11 🌱 Self-Evolution with Guardrails

Agents adapt their behavior based on metrics β€” within strict bounds.

Three rules for the suggestion engine:

Rule Detects Suggests
LowRetrievalUsageRule memory_search / knowledge_graph_search underused Enable vault, adjust retrieval weights
ToolFailureRule Frequently failing tools Limit tool set or reword tool descriptions
RepeatedToolRule Same tool called many times in a row (loop) Adjust prompt to break the loop

Adaptation guardrails (in agents.other_config.evolution_guardrails):

Field Default Purpose
max_delta_per_cycle 0.1 Max parameter change per cycle (no wild swings)
min_data_points 100 Need β‰₯ N metrics before applying
rollback_on_drop_pct 20.0 Auto-revert if quality drops > 20% after change
locked_params [] Names that cannot auto-change (e.g. temperature)

The workflow:

  1. SuggestionEngine.Analyze() runs over a 7-day metrics window.
  2. Generates EvolutionSuggestion records with status="pending".
  3. Admin reviews in dashboard, approves/rejects.
  4. On approval, the auto-adapt worker applies and records baseline metrics.
  5. Next cycle detects regression and rolls back if rollback_on_drop_pct exceeded.

Lesson: "Self-evolving agents" without guardrails is a recipe for production incidents. Bound the change rate, require admin approval, and always keep a rollback path.


Part 3 β€” πŸ” Cross-Cutting Patterns

A handful of patterns repeat across every module. They're worth internalizing as habits.

Pattern A β€” πŸ”— Context Propagation, Not Mutable State

Everything per-request flows through context.Context:

ctx = store.WithTenantID(ctx, tenantID)
ctx = store.WithUserID(ctx, userID)
ctx = store.WithAgentID(ctx, agentID)
ctx = store.WithAgentType(ctx, "predefined")
ctx = store.WithLocale(ctx, "en")
ctx = tools.WithToolChannel(ctx, "telegram")
ctx = tools.WithToolChatID(ctx, chatID)
ctx = tools.WithToolWorkspace(ctx, "/data/workspaces/u_123")

Tools and store calls read from ctx, never from globals. This is what makes per-tenant + per-user concurrent execution thread-safe without mutexes.

Pattern B β€” πŸ“’ Event Bus for Decoupling

Agent run completion fires run.completed on a domain event bus. Workers subscribe asynchronously:

  • EpisodicWorker β†’ extract summary
  • SemanticWorker β†’ extract entities
  • DedupWorker β†’ merge duplicates
  • DreamingWorker β†’ debounced batch synthesis

The agent loop never imports any of them. New workers just subscribe.

Pattern C β€” πŸ“ System Prompt as 19+ Composable Sections

The system prompt is assembled at request time from these sections (build order matters):

  1. Identity (channel-aware)
  2. First-run bootstrap notice (if BOOTSTRAP.md exists)
  3. Persona (SOUL.md, IDENTITY.md) β€” early "primacy zone"
  4. Tooling (filtered + sandbox-aware)
  5. Credentialed CLI context (optional)
  6. Safety preamble + identity anchoring
  7. Self-Evolution rules (predefined agents only)
  8. Skills inline (≀ 15 skills) OR via skill_search tool
  9. MCP tools inline OR via mcp_tool_search
  10. Workspace info
  11. Team workspace (team agents)
  12. Sandbox container info
  13. User identity / owner IDs
  14. Time (UTC)
  15. Channel formatting hints
  16. Extra context (<extra_context> tags)
  17. Project/bootstrap context files (defensive preamble)
  18. Sub-agent spawning rules
  19. Runtime info (agent ID, model, pricing)
  20. Persona reminder β€” late "recency zone" β€” fights "lost in the middle"
  21. Memory reminders (run memory_search first)

Two modes: PromptFull (main runs) and PromptMinimal (subagents, cron, memory flush β€” only AGENTS.md + TOOLS.md).

Two reinforcement zones (primacy + recency) are the cheapest reliability win in agent prompting.

Pattern D β€” 🧹 Always Sanitize, Always Trace, Always Scrub

Three callbacks that wrap every run:

  1. Sanitize output (7 steps) before delivery.
  2. Record a span for every LLM call and every tool call. Trace tree mirrors the run shape.
  3. Scrub credentials from every tool result via static + dynamic patterns.

Pattern E β€” βš›οΈ Atomic, Race-Safe Mutations via SQL, Not Locks

Don't reach for distributed locks. Instead:

  • Atomic claim: UPDATE … WHERE status = 'pending' (row-level lock, 1 winner)
  • Upsert: INSERT … ON CONFLICT … DO UPDATE (idempotent)
  • Dynamic update: execMapUpdate(map[string]any) β€” no one-function-per-field-combo

Pattern F β€” πŸ”’ Per-Session Try-Lock for Long-Running Side Effects

When a run finishes and decides to compact:

if !sessionLock.TryLock(sessionKey) { return }   // someone else is already compacting
defer sessionLock.Unlock(sessionKey)
runMemoryFlush()
go runSummarize(ctx, ...)

Try-lock instead of blocking lock β€” skip if another concurrent run is already doing it.

Pattern G β€” ⚑ Write-Behind Cache for Hot Data

Session messages are written to memory only during a run. One Save(key) flushes to DB at the end. This collapses 10–20 individual INSERTs into 1 UPDATE.

Pattern H β€” πŸ”€ Two-Phase Tool Registry (Global + Per-Agent)

Global tools loaded at startup into a shared registry. Per-agent custom tools merged on first agent access into a clone of the global registry β€” never mutating the shared one.


Part 4 β€” πŸ—ΊοΈ Build-It-Yourself Blueprint

A concrete, sequenced plan to build a similar system. Each milestone is a runnable, testable deliverable.

Milestone 0 β€” πŸ—οΈ Foundation (1–2 days)

  • [ ] Pick the language (Go is a great fit; Python is too).
  • [ ] Pick the DB (PostgreSQL + pgvector if you want vector search).
  • [ ] Set up project skeleton: cmd/, internal/, pkg/, migrations/, docs/, Makefile, docker-compose.yml.
  • [ ] Define the Provider interface (4 methods).
  • [ ] Implement one provider β€” start with OpenAI-compatible (covers Groq, DeepSeek, Together, etc. for free).
  • [ ] Wire a cmd/serve that loads config, makes one HTTP request to the provider, and prints the response.

Milestone 1 β€” πŸ”„ Minimum Viable Agent Loop (1 week)

  • [ ] Define Tool interface: Name() string, Description() string, Schema() JSONSchema, Execute(ctx, args) (Result, error).
  • [ ] Implement 3 tools: read_file, write_file, list_files (workspace-scoped, with resolvePath() traversal guard).
  • [ ] Build the loop: Loop.Run(req) β†’ for i := 0; i < 20; i++ { think; if no tools break; act; observe }.
  • [ ] Persist sessions: SessionStore interface + in-memory implementation. Add PG implementation behind it.
  • [ ] Emit events via callback (onEvent func(EventType, payload)). Just three: run.started, tool.call, run.completed.
  • [ ] Build cmd/serve HTTP /v1/chat/completions (OpenAI-compatible). One agent. No streaming yet.

You should now have an LLM that can read/write files in a workspace.

Milestone 2 β€” πŸ“ System Prompt Architecture (3–4 days)

  • [ ] Bootstrap files in DB: agent_context_files (agent-level) + user_context_files (per-user). 6 known files: SOUL, IDENTITY, AGENTS, TOOLS, BOOTSTRAP, USER.
  • [ ] ContextFileInterceptor β€” when a tool reads/writes one of these names, route to DB instead of disk.
  • [ ] System prompt builder β€” assemble from sections (start with 5–6, grow as needed). Persona early, persona reminder late.
  • [ ] Two modes: PromptFull and PromptMinimal.
  • [ ] Per-user file seeding on first chat (use the xmax trick with PG; on SQLite use last_insert_rowid() after INSERT ... ON CONFLICT DO NOTHING).

Milestone 3 β€” 🏒 Multi-Tenancy from the Start (3–4 days)

  • [ ] tenants and api_keys tables. UUID v7 PKs.
  • [ ] tenant_id NOT NULL on every table that holds tenant data (agents, sessions, memory_documents, traces, agent_context_files, …).
  • [ ] Add WithTenantID(ctx) / TenantIDFromContext(ctx) helpers.
  • [ ] At the gateway: resolve API key β†’ SHA-256 lookup β†’ set tenant on ctx.
  • [ ] Update every store query to add WHERE tenant_id = $N. Audit the diff.
  • [ ] Master tenant for legacy/single-user data. Master scope guard for global writes.

Milestone 4 β€” πŸ”§ Pipeline Refactor (1 week)

Once your monolithic loop has > 3 conditional branches, split it:

  • [ ] Define Stage interface, StageResult enum, RunState struct.
  • [ ] Implement: ContextStage, ThinkStage, ToolStage, ObserveStage, CheckpointStage, FinalizeStage. Add PruneStage later.
  • [ ] Pipeline.Run orchestrates: setup β†’ iteration loop β†’ finalize.
  • [ ] Add a feature flag (pipeline_enabled) so V2 (monolithic) and V3 (pipeline) coexist during the migration.

Milestone 5 β€” 🧠 Memory & Search (1–2 weeks)

  • [ ] memory_documents + memory_chunks tables. tsvector (FTS) + vector(1536) (pgvector) columns.
  • [ ] MemoryInterceptor β€” auto-chunks + embeds on .md writes inside memory/*.
  • [ ] Hybrid search: 0.7 * vector + 0.3 * fts, with per-user 1.2x boost and dedup (per-user wins).
  • [ ] memory_search and memory_get tools.
  • [ ] (Later) episodic_summaries table + EpisodicWorker subscribed to run.completed.
  • [ ] (Later) kg_entities + kg_relations with valid_from / valid_until for L2.

Milestone 6 β€” πŸ› οΈ Tool Registry Hardening (1 week)

  • [ ] Funnel every tool call through Registry.ExecuteWithContext.
  • [ ] Add rate limiting (token bucket per session key, defaults: 60/min, burst 5).
  • [ ] Add credential scrubber β€” start with 5–10 high-value patterns (OpenAI sk-, Anthropic sk-ant-, GitHub ghp_, AWS AKIA, generic 64-char hex).
  • [ ] Add policy engine: profiles (full / coding / messaging / minimal), groups (fs, runtime, web, …), allow/deny lists.
  • [ ] Add shell deny groups (start with: destructive_ops, reverse_shell, dangerous_paths, package_install).
  • [ ] Capability metadata on every tool (read-only / mutating / async).

Milestone 7 β€” πŸ“‘ Channels (per channel, ~2 days each)

  • [ ] Define Channel interface: Name() string, Listen(ctx, onMessage), Send(ctx, OutboundMessage) error.
  • [ ] Telegram first (simplest, long-polling library exists).
  • [ ] Add channel_instances table with tenant_id baked in.
  • [ ] Outbound dispatcher routes by channel_instance_id. Internal channels (cli, system, subagent) silently skipped.
  • [ ] Pairing flow: 8-char code, 60-min TTL, paired-device tracking.
  • [ ] Then add: Discord (websocket), Slack (Socket Mode), WhatsApp, Feishu, Zalo.

Milestone 8 β€” πŸ”­ Observability (3–4 days)

  • [ ] traces and spans tables. Three span types: agent, llm_call, tool_call.
  • [ ] Wrap every LLM call in a span. Wrap every tool call in a span.
  • [ ] BatchCreateSpans in batches of 100; on batch failure, retry individually.
  • [ ] Verbose mode (TRACE_VERBOSE=1) records full input/output truncated at 50 KB.
  • [ ] Optional: OpenTelemetry exporter for spans.

Milestone 9 β€” πŸ’ͺ Resilience (3–4 days)

  • [ ] Wrap providers with retry middleware (exponential backoff, jitter, honor Retry-After, only retry 5xx + network).
  • [ ] Per-model cooldown β€” track failures per model, skip cooldown'd models for N seconds.
  • [ ] Failover β€” try API profile A, then profile B, then degraded model.
  • [ ] Mid-loop compaction at 75% context. Post-run compaction at 50 messages or 75% context.
  • [ ] Per-session TryLock for compaction goroutine.

Milestone 10 β€” 🀝 Multi-Agent (1–2 weeks)

  • [ ] subagent table for spawn tracking. Limits: depth 1, max 5 children, max 8 concurrent.
  • [ ] spawn tool (async return), delegate tool (sync with timeout).
  • [ ] agent_links table for delegation eligibility.
  • [ ] When ready: teams, agent_team_members, team_tasks, team_messages.
  • [ ] Atomic task claim: UPDATE … WHERE status = 'pending' AND owner_agent_id IS NULL.
  • [ ] team_tasks tool with actions: create / list / claim / complete / comment / attach / approve / reject.

Milestone 11 β€” πŸ” Production Hardening (ongoing)

  • [ ] Add the remaining 4 security layers (input guard, output sanitizer, isolation).
  • [ ] AES-256-GCM encryption for all at-rest secrets. aes-gcm: prefix convention.
  • [ ] API keys: 16 random bytes, SHA-256 hash, constant-time compare.
  • [ ] Activity log for every admin action.
  • [ ] Hourly SnapshotStore aggregations.
  • [ ] Per-tenant config UI.
  • [ ] Self-evolution suggestion engine (only after you have β‰₯ 100 metrics per agent).

Milestone 12 β€” 🌟 Optional Surface Area

  • [ ] Knowledge Vault with wikilinks ([[target]] syntax).
  • [ ] MCP bridge (stdio + SSE + streamable-http transports, per-agent + per-user grants).
  • [ ] Custom shell tools (DB-stored, hot-reloaded).
  • [ ] Cron jobs (cron expressions + cron_run_logs).
  • [ ] Browser automation (headless Chrome, browser.act / browser.snapshot / browser.screenshot).

Part 5 β€” ⚠️ Anti-Patterns to Avoid

GoClaw earns its design by not doing these things:

Anti-pattern Why it's a trap What GoClaw does instead
Hard-coding one LLM provider You'll need 5 within a year Provider interface; adapters per provider
Single-tenant first, "we'll add it later" Migration is brutal β€” every query, every test, every cache key tenant_id NOT NULL on day one
Mutable global agent state Race conditions across concurrent runs Per-call data lives in context.Context
Bypassing the tool registry "just for this one call" Loses scrubbing, rate-limit, RBAC Every tool call through Registry.ExecuteWithContext, no exceptions
Trusting the model's tool-call format Models hallucinate <tool_call> XML, [Tool Call: ...] text, etc. 7-step output sanitizer strips them all
Storing secrets unencrypted because "it's the same DB" Database dumps leak; insider access widens blast radius AES-256-GCM with aes-gcm: prefix on every secret
One giant runLoop() function 2k-line functions become untestable 8-stage pipeline, each stage isolated
Using time.Sleep between LLM retries Wastes time + cost; no jitter β†’ thundering herd Exponential backoff with jitter, honors Retry-After
One memory tier ("just embeddings") Slow, expensive, irrelevant matches L0 auto-inject + L1 hybrid search + L2 deep retrieval
Distributed lock for "claim this task" Adds Redis/Zookeeper dependency; race conditions still possible Atomic SQL UPDATE with WHERE status = 'pending'
Trusting client-supplied tenant_id header Spoofable; cross-tenant leakage Tenant resolved from API key at gateway, never from clients
Loading the full agent config on every request Slow; chatty Router cache with TTL + pub/sub invalidation
Synchronous summarization on the request path User waits 10+ seconds Synchronous memory flush, asynchronous summarization in background goroutine
Letting the agent self-modify its prompts One bad cycle and quality cratters Suggestion engine + admin approval + rollback_on_drop_pct guardrail

Part 6 β€” πŸ“š Reference Map

πŸ“ Repo structure (the parts that matter)

goclaw/
β”œβ”€β”€ cmd/                                  130+ files: serve, onboard, migrate
β”‚   β”œβ”€β”€ gateway*.go                       Gateway lifecycle + setup + wiring
β”‚   └── tui_*.go                          TUI for onboarding/setup
β”œβ”€β”€ internal/
β”‚   β”œβ”€β”€ agent/                            V2 monolithic loop, router, system prompt,
β”‚   β”‚                                     resolver, sanitize, compaction, evolution
β”‚   β”œβ”€β”€ pipeline/                         V3 8-stage pipeline (context_stage.go,
β”‚   β”‚                                     think_stage.go, tool_stage.go, …)
β”‚   β”œβ”€β”€ providers/                        Provider interface + adapters per backend
β”‚   β”‚                                     + retry, cooldown, failover, middleware
β”‚   β”œβ”€β”€ tools/                            Registry, capabilities, policy engine,
β”‚   β”‚                                     scrubber, rate limiter, custom tools
β”‚   β”œβ”€β”€ memory/                           3-tier memory + auto-injector + embeddings
β”‚   β”œβ”€β”€ consolidation/                    Episodic/semantic/dreaming workers
β”‚   β”œβ”€β”€ vault/                            Knowledge vault + wikilinks + FS sync
β”‚   β”œβ”€β”€ knowledgegraph/                   KG entities + relations + traversal
β”‚   β”œβ”€β”€ store/                            Store interfaces (the contract)
β”‚   β”‚   β”œβ”€β”€ pg/                           PostgreSQL implementations
β”‚   β”‚   └── sqlitestore/                  SQLite implementations
β”‚   β”œβ”€β”€ gateway/                          WS server, HTTP mux, method router,
β”‚   β”‚                                     rate limiter, client lifecycle
β”‚   β”œβ”€β”€ http/                             HTTP API handlers (/v1/*)
β”‚   β”œβ”€β”€ channels/                         Telegram, Discord, Slack, WhatsApp,
β”‚   β”‚                                     Feishu, Zalo OA, Zalo Personal
β”‚   β”œβ”€β”€ mcp/                              MCP bridge (stdio/sse/http transports)
β”‚   β”œβ”€β”€ crypto/                           AES-256-GCM with `aes-gcm:` prefix
β”‚   β”œβ”€β”€ permissions/                      RBAC: viewer/operator/admin
β”‚   β”œβ”€β”€ eventbus/                         Domain event bus for consolidation
β”‚   β”œβ”€β”€ tracing/                          Trace + span hierarchy
β”‚   β”œβ”€β”€ tokencount/                       tiktoken-based counter
β”‚   β”œβ”€β”€ workspace/                        Per-user workspace resolver
β”‚   β”œβ”€β”€ bootstrap/                        SOUL/IDENTITY system prompt loading
β”‚   β”œβ”€β”€ config/                           JSON5 config + env overlay
β”‚   β”œβ”€β”€ i18n/                             EN/VI/ZH backend message catalog
β”‚   β”œβ”€β”€ audio/                            TTS provider layer (5 providers)
β”‚   β”œβ”€β”€ media/                            Image / audio / video generation
β”‚   └── sandbox/                          Docker sandbox for shell exec
β”œβ”€β”€ pkg/
β”‚   β”œβ”€β”€ browser/                          Browser automation
β”‚   └── protocol/                         Frame types, RPC method names, errors
β”œβ”€β”€ migrations/                           PostgreSQL migrations (45+)
β”œβ”€β”€ docker/                               Docker compose variants
β”œβ”€β”€ docs/                                 31 architecture docs (00-architecture-overview, 
β”‚                                         01-agent-loop, 03-tools-system, …)
└── ui/
    β”œβ”€β”€ web/                              React SPA (Vite, Tailwind, Radix, Zustand)
    └── desktop/                          Wails v2 desktop app (SQLite, embedded gateway)

πŸ—οΈ Key files to read first (in order)

  1. docs/00-architecture-overview.md β€” system map
  2. docs/01-agent-loop.md β€” the loop in detail (V2 + V3)
  3. docs/03-tools-system.md β€” tool registry, policy, security
  4. docs/06-store-data-model.md β€” every table and store interface
  5. docs/09-security.md β€” the 5 layers
  6. docs/23-multi-tenant-architecture.md β€” tenant resolution + isolation
  7. docs/24-knowledge-vault.md β€” vault, wikilinks, hybrid search
  8. docs/04-gateway-protocol.md β€” RPC + HTTP API surface
  9. docs/02-providers.md β€” provider abstraction + resilience
  10. docs/codebase-summary.md β€” module map

πŸ’‘ The shortest possible "what is GoClaw"

A multi-tenant AI agent gateway in Go that exposes WebSocket RPC + HTTP REST + OpenAI-compatible APIs. Behind a single Provider interface it talks to 20+ LLM backends. Behind a single Tool registry it offers 50+ built-in tools plus MCP and custom shell tools, all gated by RBAC + rate limits + credential scrubbing + path/SSRF/shell-deny guards. Agent runs flow through an 8-stage pluggable pipeline (thinkβ†’pruneβ†’toolβ†’observeβ†’checkpointβ†’finalize). Memory is 3-tier (working / episodic / semantic) with hybrid BM25+vector search. Every isolatable table carries tenant_id; every query enforces it; tenant scope flows through context.Context. Channels (Telegram, Discord, Slack, …) are pluggable adapters. Teams of agents collaborate on a SQL-claimed task board.


πŸ’­ Closing Thoughts

GoClaw is a study in disciplined boundaries. The agent loop never knows which provider it's talking to. The provider never knows which channel a message came from. The tool never knows which tenant owns the data. Each layer reduces to a small interface and a context-propagated set of values.

If you take only one thing from this document: make every concern an interface from line one, and make multi-tenancy and security non-optional from line one. Everything else can be added incrementally β€” those two cannot.


If you found this helpful, let me know by leaving a πŸ‘ or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! πŸ˜ƒ


All Rights Reserved

Viblo
Let's register a Viblo Account to get more interesting posts.