Multica Deep Dive – How to Build a Managed-Agents Platform
A complete, actionable build guide derived from a deep read of multica-ai/multica (~22k stars, ~42 MB, dual-language Go + TypeScript monorepo). If you read only one section before coding, read §3 (The Core Idea) and §5 (The Agent Backend Interface). Everything else hangs off those two ideas.
Table of Contents
- What Multica Is – and What It Is Not
- The 30-Second Mental Model
- The Core Idea – Don't Build the Agent Loop, Wrap It
- Architecture at a Glance
- The Agent Backend Interface (the keystone abstraction)
- The Local Daemon – Polling, Wakeups, Concurrency
- Per-Task Workdir + Native Config Injection
- Skills – the Compounding Capability Layer
- Resumable Sessions and Workdir Reuse
- The Server – Data Model, Realtime, Multi-Tenancy
- Autopilots – Scheduled and Triggered Automation
- Frontend – Strict State Boundaries
- Packaging, Release, Self-Host
- Engineering Practices Worth Stealing
- Step-by-Step Build Plan (12 Phases)
  - Phase 1 – Skeleton (1 day)
  - Phase 2 – Issues CRUD (2 days)
  - Phase 3 – User-Facing WebSocket (1 day)
  - Phase 4 – The Agent Backend Interface (1 day)
  - Phase 5 – Local Daemon Skeleton (2 days)
  - Phase 6 – Task Lifecycle End-to-End (3 days)
  - Phase 7 – Skills + Per-Provider Config Injection (1 day)
  - Phase 8 – Daemon Wakeup over WS (½ day)
  - Phase 9 – Resumable Sessions (1 day)
  - Phase 10 – Add a Second + Third Backend (1 day)
  - Phase 11 – Autopilots (1 day)
  - Phase 12 – Packaging + Self-Host (1 day)
- Common Pitfalls and Hard-Won Guardrails
- Cheat Sheet
1. What Multica Is – and What It Is Not
Tagline. "The open-source managed agents platform. Turn coding agents into real teammates – assign tasks, track progress, compound skills."
Positioning. A Linear-shaped project-management surface (issues, projects, comments, inbox, real-time updates) where AI coding agents are first-class citizens alongside humans:
- An agent has a profile, shows up on the board, and can be @-mentioned.
- You assign an issue to an agent the same way you assign it to a colleague.
- A local daemon on the user's laptop picks up the work, runs the chosen agent CLI (Claude Code, Codex, Cursor, Gemini, Copilot, OpenCode, …), streams progress, and reports back.
- Skills (markdown bundles) are injected into every task so capabilities compound.
- Autopilots are cron- or webhook-triggered automations that fire agent runs without human assignment.
It IS:
- A control plane / orchestration layer
- A managed-teammate UI (Linear-clone with agents)
- A daemon that runs agent CLIs and streams events
- A skills + autopilots system
It IS NOT:
- An agent loop (no LLM calls, no tool-use parser, no RAG)
- A library – it's a deployable platform
- Tied to one model provider – it supports 11 different agent CLIs
The closest cousin in spirit is Linear × LangGraph – but the LangGraph part is delegated to whichever third-party agent CLI is installed on the user's machine. This decision is the most important one in the entire codebase. Internalize it before going further.
2. The 30-Second Mental Model
┌────────────────────────┐
│     Browser / Desk     │
│    (Next.js / EL)      │
└───────────┬────────────┘
            │ HTTPS + WS
┌───────────┴────────────┐
│ Server (Go: Chi + WS)  │  ← source of truth
│ Postgres + (opt) Redis │
└──────┬──────────┬──────┘
       │ WS push  │ HTTPS poll
       │ wakeup   │ (every 3s)
┌──────┴──────────┴──────┐
│ Daemon on user's laptop│  ← runs the agents
│ (same Go binary, cobra)│
└───────────┬────────────┘
            │ exec.Command
   ┌─────┬──┴──┬──────┬─────────┐
   ▼     ▼     ▼      ▼         ▼
claude codex cursor gemini  opencode ...
Three runtime artifacts, all from the same monorepo:
| Artifact | Built from | Runs where |
|---|---|---|
| Server binary | `server/cmd/server` | Your infra (Docker / VPS / k8s) |
| `multica` CLI + daemon | `server/cmd/multica` | User's laptop (Homebrew / install.sh) |
| Web app | `apps/web` (Next.js) + `apps/desktop` (Electron) | Browser / Mac / Win / Linux |
3. The Core Idea – Don't Build the Agent Loop, Wrap It
The single decision that lets a small team ship this much surface area:
Stop trying to be an agent runtime. Be the control plane that dispatches to existing agent CLIs.
Concretely:
- Define one Go interface – `Backend` – with a streaming `Execute` method.
- Write one implementation per CLI (claude, codex, cursor, gemini, …). Each implementation is just an `exec.Command` plus a streaming-stdout parser.
- Translate every CLI's idiosyncratic JSON dialect into your own unified message taxonomy (text / thinking / tool-use / tool-result / status / log / error).
- Everything above this layer (assignment, scheduling, comments, autopilots, skills, UI) treats agents uniformly.
If you only adopt one architectural idea from Multica, this is it. It's what makes the project tractable, vendor-neutral, and trivially extensible (one new file = one new agent).
The README explicitly cites the inspiration: "It mirrors the happy-cli AgentBackend pattern, translated to idiomatic Go."
4. Architecture at a Glance
4.1 Process / Service Topology
[Frontend] ↔ [Go API + WS] ↔ [Postgres + pgvector]
                  │
                  │ Redis streams (optional, for multi-node fanout)
                  │
                  │ Daemon WS + HTTP poll
                  │
[Local Daemon] ── spawns ──> [agent CLIs]
4.2 Repo Layout (top-level)
apps/
web/ Next.js 16 App Router
desktop/ Electron (electron-vite)
docs/ Mintlify/MDX docs
packages/
core/ Headless logic β zustand stores, react-query, api client (zero react-dom)
ui/ Atomic primitives (shadcn / Base UI; zero business logic)
views/ Business components/pages (zero next/* or react-router)
server/
cmd/server/ HTTP API entry
cmd/multica/ CLI + daemon (cobra) entry
cmd/migrate/ Migration runner
internal/
handler/ HTTP handlers (Chi)
service/ Business logic
daemon/ Local daemon
daemonws/ Daemon-side WS hub
realtime/ User-facing WS hub + Redis stream relay
cli/ CLI helpers
auth/ JWT + Google OAuth
middleware/ Auth, CSP, request log
events/ In-process event bus
pkg/
agent/ *** The Backend interface + 11 implementations ***
db/queries/ sqlc input
db/generated/ sqlc output
migrations/ 156 SQL files (Postgres)
sqlc.yaml
e2e/ Playwright (against full docker-compose)
.github/workflows/ ci.yml, desktop-smoke.yml, release.yml
.goreleaser.yml
Makefile
docker-compose.{,selfhost.,selfhost.build.}yml
4.3 Tech Stack (the load-bearing pieces)
Server (Go 1.26)
- `github.com/go-chi/chi/v5` – router + middleware chain
- `jackc/pgx/v5` + `pgxpool` – Postgres
- `sqlc` – typed SQL → Go (input: `pkg/db/queries/`, output: `pkg/db/generated/`)
- `gorilla/websocket` – both user-facing and daemon-facing WS
- `redis/go-redis/v9` – optional fanout
- `golang-jwt/jwt/v5` – auth
- `spf13/cobra` – CLI for the `multica` binary
- `robfig/cron/v3` – autopilot scheduler
- `resend-go` – email
- `aws-sdk-go-v2/s3` + CloudFront signed URLs
- `prometheus/client_golang` – metrics
- stdlib `log/slog` + `lmittmann/tint` (pretty in dev)
Frontend (TS / React 19)
- React 19, TS 5.9, Vite, Tailwind v4
- Zustand 5 for client state, TanStack Query 5 for server state β strict split
- TanStack Table 8
- Vitest 4 + Testing Library, Playwright for e2e
- Turborepo for orchestration, pnpm catalog for unified version pinning
Infra
- PostgreSQL 17 + pgvector
- Redis 7 (optional)
- GoReleaser for CLI binaries (mac/linux/win × amd64/arm64)
- Homebrew tap (`multica-ai/homebrew-tap`) auto-published on tag
- Docker images on GHCR for self-host
5. The Agent Backend Interface (the keystone abstraction)
Everything below is in server/pkg/agent/. Read agent.go first when reproducing this project.
5.1 The Interface
package agent
type Backend interface {
Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error)
}
type ExecOptions struct {
Cwd string
Model string
SystemPrompt string
MaxTurns int
Timeout time.Duration
SemanticInactivityTimeout time.Duration // kill if no semantic event in N
ResumeSessionID string // resume previous agent session
CustomArgs []string // appended after our flags
McpConfig json.RawMessage // written to temp file, --mcp-config <path>
}
type Session struct {
Messages <-chan Message // streamed; closes when agent exits
Result <-chan Result // exactly one Result, then closes
}
type Message struct {
Type MessageType // text | thinking | tool-use | tool-result | status | error | log
Content string
Tool string
CallID string
Input map[string]any
Output string
Status string
Level string
SessionID string
}
type Result struct {
Status string // completed | failed | aborted | timeout | cancelled
Output string
Error string
DurationMs int64
SessionID string
Usage map[string]TokenUsage // per-model: input/output/cache_read/cache_write
}
5.2 The Factory
func New(name string, cfg Config) (Backend, error) {
switch name {
case "claude": return newClaude(cfg)
case "codex": return newCodex(cfg)
case "cursor": return newCursor(cfg)
case "gemini": return newGemini(cfg)
case "copilot": return newCopilot(cfg)
case "opencode": return newOpenCode(cfg)
case "openclaw": return newOpenClaw(cfg)
case "hermes": return newHermes(cfg)
case "pi": return newPi(cfg)
case "kimi": return newKimi(cfg)
case "kiro": return newKiro(cfg)
}
return nil, fmt.Errorf("unknown backend %q", name)
}
5.3 The Canonical Implementation Pattern (Claude Code)
claude.go (~17 KB) is the cleanest backend to study. The streaming loop is the template:
cmd := exec.CommandContext(ctx, c.path, args...)
cmd.Dir = opts.Cwd
cmd.Env = mergedEnv
stdout, _ := cmd.StdoutPipe()
stdin, _ := cmd.StdinPipe()
stderrTail := newStderrTail(64 * 1024) // bounded ring buffer
cmd.Stderr = stderrTail
cmd.Start()
io.WriteString(stdin, prompt) // pipe prompt over stdin
stdin.Close()
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 1024*1024), 10*1024*1024) // 10 MB lines
for scanner.Scan() {
var msg claudeSDKMessage
  if err := json.Unmarshal(scanner.Bytes(), &msg); err != nil { continue }
switch msg.Type {
case "assistant": handleAssistant(msg) // text / thinking / tool-use; tally tokens
case "user": handleUser(msg) // tool-result
case "system": trySend(MessageStatus{...})
case "result": finalOutput, finalStatus, finalSessionID = ...
case "log": trySend(MessageLog{...})
}
}
exitErr := cmd.Wait()
result := Result{
Status: classify(exitErr, finalStatus, ctx.Err()),
Output: finalOutput,
Error: errorWithStderrTail(exitErr, stderrTail), // critical: V8/bun aborts only show "exit 3"
SessionID: finalSessionID,
Usage: usageMap,
DurationMs: ...,
}
5.4 Per-Backend Quirks Worth Knowing
| Backend | Notable detail |
|---|---|
| `claude.go` | Uses `--output-format stream-json` (NDJSON over stdout); auto-approves all tool-use control requests because human approval happens at issue/comment level. |
| `codex.go` (33 KB) | Spawns `codex app-server`; per-task `CODEX_HOME` so skills don't pollute the system one; sandbox policy varies by detected version (`codex_sandbox.go`). |
| `hermes.go` / `kimi.go` / `kiro.go` | Speak the ACP protocol. |
| `cursor.go` | Has platform-specific files (`cursor_invocation_windows.go`) for Windows quirks. |
| `openclaw.go` | Doesn't read AGENTS.md from the workdir, so the system prompt is passed inline. |
| `models.go` (27 KB) | Static catalog + `ListModels()` that the daemon queries on heartbeat for the UI's model picker. |
| `version.go` | `DetectVersion(ctx, path)` runs `<bin> --version`; `CheckMinVersion(name, version)` is the gate that prevents the daemon from registering a runtime that's too old. |
| `stderr_tail.go` | Bounded 64 KB ring buffer. Critical: without this, native crashes in the underlying CLI bubble up as "exit status 3" with no diagnostic. |
| `proc_other.go` / `proc_windows.go` | Cross-platform process-group and window-hide helpers. |
5.5 Why This Design Wins
- Adding an agent = one Go file. That's it. No protocol changes, no DB migrations, no UI changes.
- No vendor lock. Users keep their own subscriptions / API keys / config for whichever CLI they prefer.
- No risk of being out of date. The agent CLI gets better → your platform gets better, for free.
- Failure surface is bounded. A CLI crash doesn't crash your server.
6. The Local Daemon – Polling, Wakeups, Concurrency
server/internal/daemon/daemon.go (~53 KB). Runs on the user's machine via multica daemon start.
6.1 Lifecycle (Daemon.Run)
1. Bind health port early (default :19514)
   → /health endpoint
   → fail-fast if another daemon is already running
2. resolveAuth() → load token from ~/.multica/config.json
3. syncWorkspacesFromAPI → for each workspace the user belongs to:
   - probe each agent CLI via exec.LookPath
   - run agent.DetectVersion + CheckMinVersion
   - POST /api/daemon/register with {name, type, version, status}
   - cache returned runtimeIDs
4. Start background goroutines:
   - workspaceSyncLoop (30s) → re-sync workspace membership
   - taskWakeupLoop → open daemon WS, listen for instant wakeups
   - heartbeatLoop (15s) → POST /api/daemon/heartbeat
     (response may piggyback PendingUpdate, PendingModelList,
     PendingLocalSkills, PendingLocalSkillImport)
   - gcLoop → clean ~/multica_workspaces/ for done issues
   - serveHealth → local /health JSON (uptime, active task count)
5. Enter pollLoop (the heart of the daemon)
6.2 The Poll Loop
sem := make(chan struct{}, cfg.MaxConcurrentTasks) // default 20
for {
runtimeIDs := d.allRuntimeIDs()
for i := 0; i < len(runtimeIDs); i++ {
sem <- struct{}{} // acquire slot (blocks if full)
rid := runtimeIDs[(pollOffset+i)%len(runtimeIDs)] // round-robin
task, _ := d.client.ClaimTask(ctx, rid)
if task != nil {
wg.Add(1); d.activeTasks.Add(1)
go func(t Task) {
defer wg.Done()
defer d.activeTasks.Add(-1)
defer func() { <-sem }() // release slot
d.handleTask(ctx, t)
}(*task)
break // claimed something; sleep before next round
} else {
<-sem // nothing claimed; release slot
}
}
sleepWithContextOrWakeup(ctx, cfg.PollInterval, taskWakeups)
}
Defaults: PollInterval = 3s, MaxConcurrentTasks = 20, AgentTimeout = 2h.
Wakeup channel. taskWakeups is fed by the daemon WS – when the server enqueues a task for a runtime owned by this daemon, it sends a wakeup, and sleepWithContextOrWakeup returns immediately. This gets you sub-second pickup latency without giving up polling's robustness.
6.3 Per-Task Pipeline (handleTask → runTask)
1. POST /api/daemon/tasks/{id}/start
2. Post progress: "Launching {provider} (1/2)"
3. spawn cancellation watcher goroutine:
every 5s: GET /api/daemon/tasks/{id}/status
   if status == "cancelled": call runCancel() → kill process group
4. SECURITY GUARD: refuse if task.WorkspaceID == ""
(no silent fallback to user-global config across workspaces)
5. Build TaskContext (issue, agent, skills, repos, autopilot/chat/quick-create flags)
6. execenv.Prepare or execenv.Reuse:
- {WorkspacesRoot}/{workspace_id}/{task_id_short}/{workdir,output,logs}/
- For codex: also seed per-task CODEX_HOME
7. execenv.InjectRuntimeConfig → write CLAUDE.md / AGENTS.md / GEMINI.md
   into workdir; write skill bundles into native skills dirs
8. daemon.BuildPrompt(task) → prompt string
9. Build agentEnv:
MULTICA_TOKEN, MULTICA_SERVER_URL, MULTICA_DAEMON_PORT
MULTICA_WORKSPACE_ID, MULTICA_AGENT_NAME, MULTICA_AGENT_ID, MULTICA_TASK_ID
[optional] MULTICA_AUTOPILOT_*, MULTICA_QUICK_CREATE_TASK_ID
CODEX_HOME (codex only)
PATH-prepend so the spawned agent can call `multica` itself
Merge agent.CustomEnv with a BLOCKLIST so users can't override daemon vars
10. backend, _ := agent.New(provider, cfg)
session, _ := backend.Execute(ctx, prompt, execOpts)
11. executeAndDrain(session):
for msg := range session.Messages {
batch = append(batch, msg)
if shouldFlush(batch) { client.ReportTaskMessages(taskID, batch) }
}
result := <-session.Result
12. As soon as the agent emits its first SessionID:
client.PinTaskSession(taskID, sessionID) // crash-safe resume pointer
13. Resume fallback: if Status==failed && PriorSessionID!="" && SessionID==""
retry once with ResumeSessionID = ""
14. POST /usage, then /complete (output, branch_name, session_id, work_dir)
or /fail (error, session_id, work_dir, failure_reason)
15. Persist .gc_meta.json (issue_id, workspace_id, completed_at) so GC
    can map workdir → issue and reap when the issue is done|cancelled
6.4 Auto-Detection of Installed CLIs
LoadConfig walks a list of known providers and probes each via exec.LookPath. Only those present register as runtimes. Per-provider env overrides exist:
MULTICA_<PROVIDER>_PATH # override binary path
MULTICA_<PROVIDER>_MODEL # override default model
So the daemon adapts to whatever's installed without user config – and users can pin specific binaries when they want.
6.5 Stable Daemon ID
EnsureDaemonID(profile) writes a UUID to ~/.multica/profiles/<name>/daemon.id once and reuses it forever. Without this, hostname drift (e.g. .local suffix appearing/disappearing on macOS) would mint duplicate runtime rows on the server. LegacyDaemonIDs(host, profile) is sent at register-time so the server can merge old hostname-derived rows.
6.6 Profiles
multica setup self-host --profile staging lets one machine talk to multiple servers. Each profile gets its own ~/.multica/profiles/<name>/ with config, daemon ID, health port, and workspace root.
7. Per-Task Workdir + Native Config Injection
This is the second most important design decision after §3. Each agent self-bootstraps via its own native config-file convention – you don't invent a protocol.
7.1 Per-Task Workdir
~/multica_workspaces/
  {workspace_id}/
    {task_id_short}/
      workdir/        – cwd of the agent process; git checkout lives here
      output/         – collected outputs
      logs/           – captured stdout/stderr
      .gc_meta.json   – {issue_id, workspace_id, completed_at}
Isolation is per-task, not per-issue. Reuse on the same agent+issue is opt-in via task.PriorWorkDir.
7.2 The "Meta-Skill" – Native Config File per Provider
execenv.InjectRuntimeConfig writes a config file at the workdir root that each agent reads natively at startup:
| Provider | Config file written |
|---|---|
| claude | CLAUDE.md |
| codex / copilot / opencode / openclaw / hermes / pi / cursor / kimi / kiro | AGENTS.md |
| gemini | GEMINI.md |
The content is built by buildMetaSkillContent(provider, ctx) and is essentially a system prompt teaching the agent to act as a Multica teammate:
- Identity block – "You are: {agent name} (ID: …)" + the agent's persona instructions.
- CLI catalog – every `multica` subcommand the agent may use:
  - Read: `issue get`, `issue list`, `issue comment list`, `workspace members`
  - Write: `issue create`, `issue update`, `issue assign`, `issue label add`, `issue subscriber add`, `issue comment add`, `label create`, `autopilot create|update|trigger|delete`
- Hard rule: always pass `--output json` so the agent gets stable IDs.
- Multi-line content rule: must use `--content-stdin` with HEREDOCs (because bash doesn't expand `\n` in double-quoted strings – observed empirically, hard-coded as a guard).
- Provider-specific gotchas – e.g. Codex tends to follow a per-turn reply command literally – instruct it to use `--content-stdin`.
- Workflow section – branches on task kind: `chat`, `quick-create`, `autopilot run-only`, `comment-triggered`, default.
The agent now knows who it is, what tools it has, and how to use them – all via the file format it already reads natively. Zero protocol invention.
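The provider-to-filename mapping from the table above reduces to a small switch; a sketch, where the fall-through to AGENTS.md for unknown providers is an assumption:

```go
package main

import "fmt"

// configFileFor maps a provider to the native config file it reads at
// startup, matching the table above. Everything that isn't claude or
// gemini shares the AGENTS.md convention.
func configFileFor(provider string) string {
	switch provider {
	case "claude":
		return "CLAUDE.md"
	case "gemini":
		return "GEMINI.md"
	default: // codex, copilot, opencode, openclaw, hermes, pi, cursor, kimi, kiro
		return "AGENTS.md"
	}
}

func main() {
	fmt.Println(configFileFor("claude"), configFileFor("codex"), configFileFor("gemini"))
}
```

`InjectRuntimeConfig` would then write `buildMetaSkillContent(provider, ctx)` to that filename at the workdir root.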
7.3 Skill Files in Native Skill Directories
Skills are written into each agent's native skills directory:
| Provider | Skills directory |
|---|---|
| claude | .claude/skills/ |
| codex | .codex/skills/ |
| cursor | .cursor/skills/ |
| openclaw | .openclaw/skills/ |
| opencode | .config/opencode/skills/ |
| copilot | .github/skills/ |
| pi | .pi/skills/ |
| hermes (fallback) | .agent_context/skills/ |
Each agent discovers them through its own native mechanism. You write to disk; the agent CLI does the rest.
8. Skills – the Compounding Capability Layer
A Skill is just:
{ name: string, content: string /* markdown */, files: { path: string, content: string }[] }
That's it. The platform value comes from management (per-workspace catalog, agent linkage, marketplace install, lockfile), not from format complexity.
8.1 Reproducible Installs via Lockfile
skills-lock.json at repo root pins each marketplace skill:
{
"skills": {
"frontend-design": {
"source": "github.com/anthropics/skills",
      "ref": "abc123…",
      "computedHash": "sha256:…"
},
...
}
}
Sources include anthropics/skills, shadcn/ui, vercel-labs/agent-skills. computedHash makes installs verifiable.
8.2 The Prompt vs Skill Split
A subtle but important discipline: the prompt is minimal; skills carry context. BuildPrompt(task) is one short paragraph per task kind. Everything that describes how the platform works lives in the meta-skill (CLAUDE.md / AGENTS.md), which you'd otherwise have to re-emit in every prompt.
8.3 Per-Agent Customization
The agent table stores the dials a user has over an agent's behavior:
- `instructions` – persona / system prompt
- `skills[]` – linked skill IDs (joined to the per-workspace skill catalog)
- `custom_env` – k/v injected per task (with a daemon-side blocklist)
- `custom_args` – appended after the daemon's built-in CLI args
- `mcp_config` – raw JSON, written to a temp file and passed as `--mcp-config <path>`
- `model`
- `max_concurrent_tasks`
- `visibility` – `workspace` | `private`
LaunchHeader(provider) is shown in the UI so users see the skeleton their custom_args extend.
9. Resumable Sessions and Workdir Reuse
Coding agents have expensive context. Throwing it away on each turn is wasteful. Multica handles this with two pieces of forwarded state:
9.1 Mid-Flight Session Pinning
As soon as a backend emits a SessionID, the daemon calls client.PinTaskSession(taskID, sessionID) – the server stores it on the task row. Crash-safe: if the daemon dies mid-task, the resume pointer is already on the server.
9.2 Resume on Next Claim
When the server hands out the next task on the same agent+issue, it includes:
- `PriorSessionID` – passed back as `ExecOptions.ResumeSessionID` (e.g. `claude --resume <id>`)
- `PriorWorkDir` – the daemon calls `execenv.Reuse(...)` instead of `execenv.Prepare(...)` – same git checkout, same scratchpad
9.3 Resume Fallback
If a resume fails before establishing a session (`Status == failed && PriorSessionID != "" && SessionID == ""`), the daemon retries once with `ResumeSessionID = ""` – a fresh start. This rescues the user from a stale session ID without infinite-looping.
9.4 GC
gcLoop cleans ~/multica_workspaces/:
- Workdirs whose issue is `done|cancelled` and older than `MULTICA_GC_TTL` (default 24h)
- Orphan dirs (no `.gc_meta.json`) older than `MULTICA_GC_ORPHAN_TTL` (default 72h)
- Server returning 404 on the issue → immediate clean
10. The Server – Data Model, Realtime, Multi-Tenancy
10.1 Polymorphic Actors
The single most enabling schema decision:
issues.assignee_type CHECK (assignee_type IN ('member', 'agent'))
issues.assignee_id UUID
comments.author_type CHECK (author_type IN ('member', 'agent'))
inbox.recipient_type ...
Once you commit to polymorphism on every actor field, agents are first-class citizens everywhere in the API – no special endpoints, no parallel UI.
10.2 Multi-Tenancy
- Every query filters by `workspace_id`.
- A membership table gates access (a `member` row joins `user` and `workspace` with a `role`).
- The frontend sends `X-Workspace-ID` on every request to route to the active workspace.
- Middleware:
  - `Auth(queries)` – JWT or PAT
  - `DaemonAuth(queries)` – daemon token
  - `RequireWorkspaceMemberFromURL(queries, "id")`
  - `RequireWorkspaceRoleFromURL(queries, "id", "owner", "admin")`
10.3 Persistence Layer
- 156 numbered SQL migration files (`server/migrations/001_init.up.sql` …) – immutable history; never edit an applied migration.
- sqlc turns `pkg/db/queries/*.sql` into typed Go code in `pkg/db/generated/`.
- pgxpool throughout; no ORM.
- pgvector enabled for embedding-based search (skills, issues).
10.4 Layering: Handler → Service → Repo
handler (Chi routes) – HTTP/WS adapters; never touch the DB
    ↓
service – business logic; transactions; calls multiple queries
    ↓
queries (sqlc) – typed SQL only
Constructor-based DI:
taskSvc := service.NewTaskService(queries, pool, hub, bus, daemonWakeup)
autoSvc := service.NewAutopilotService(queries, taskSvc, ...)
No globals. No init().
10.5 In-Process Event Bus
events.Bus is a synchronous publisher with topic-based listeners. Order of registration matters and is documented in cmd/server/main.go:
// Subscribers MUST register BEFORE notifications, because notifications
// depend on the subscriber list being up to date.
events.RegisterSubscriberListeners(bus, queries)
events.RegisterNotificationListeners(bus, queries, ...)
events.RegisterActivityListeners(bus, queries)
events.RegisterAutopilotListeners(bus, queries, autoSvc)
When a service emits an event, listeners write derived state (inbox items, activity rows) and emit broadcaster events that flow out over WS.
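A synchronous, registration-ordered bus like `events.Bus` can be tiny. This sketch shows why registration order is observable: listeners run inline, in the order they were added, so a notification listener registered before the subscriber listener would see a stale subscriber list:

```go
package main

import "fmt"

// Bus is a minimal synchronous, topic-based publisher. Publish runs
// every listener inline, in registration order – which is exactly why
// registration order is load-bearing in the real codebase.
type Bus struct {
	listeners map[string][]func(payload any)
}

func NewBus() *Bus { return &Bus{listeners: map[string][]func(any){}} }

func (b *Bus) Subscribe(topic string, fn func(payload any)) {
	b.listeners[topic] = append(b.listeners[topic], fn)
}

func (b *Bus) Publish(topic string, payload any) {
	for _, fn := range b.listeners[topic] { // synchronous, ordered
		fn(payload)
	}
}

func main() {
	bus := NewBus()
	var order []string
	// Subscribers MUST register before notifications (see comment above).
	bus.Subscribe("issue.assigned", func(any) { order = append(order, "subscribers") })
	bus.Subscribe("issue.assigned", func(any) { order = append(order, "notifications") })
	bus.Publish("issue.assigned", "issue-42")
	fmt.Println(order)
}
```

A production bus would add mutexes and probably panic-recovery per listener; this sketch keeps only the ordering semantics.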
10.6 Two WebSocket Subsystems

| Path | Audience | Auth | Purpose |
|---|---|---|---|
| `/ws` | Browser / Desktop | JWT (PAT or session cookie); origin check against `ALLOWED_ORIGINS` | Stream updates: new issues, comments, presence, task progress |
| `/api/daemon/ws` | Daemon | Daemon token | Server → daemon wakeups when a task is queued |
10.7 Single-Node vs Multi-Node Realtime
Without REDIS_URL: in-process Hub → single API node.
With REDIS_URL: realtime.NewShardedStreamRelay uses Redis streams to fan out events across nodes. Sharding key + per-shard consumer groups. The same daemon-wakeup channel routes through daemonws.NewRelayNotifier(hub, sharded) so a runtime connected to API node A can be woken when node B ingests its task.
There's a legacy / dual / sharded env switch (REALTIME_RELAY_MODE) for safe rollouts.
Key principle: don't make Redis required. Single-node self-host should run with just Postgres.
10.8 Strict UUID Parsing (a real bug in disguise)
CLAUDE.md documents a family of named helpers, born from bug #1661 where a generic util.ParseUUID silently returned the zero UUID, causing DELETEs to return 204 while matching zero rows:
parseUUIDOrBadRequest(s) // for user input β returns 400 on invalid
parseUUID(s) // for trusted round-trips β panics β caught by Recoverer
loadIssueForUser(ctx, queries, key) // accepts UUID or "MUL-123" human ID
loadAgentForUser(...)
The lesson: typed parsers at every trust boundary. Never roll a generic helper that hides errors.
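The trust-boundary split can be sketched as two wrappers over one validator; the regex here stands in for a real UUID library, and the error-signalling shapes are assumptions:

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRe is a stand-in validator; the real code presumably delegates
// to a UUID library. The point is the trust-boundary split below.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// parseUUIDOrBadRequest: for user input. Returns ok=false so the
// handler can answer 400 instead of silently matching zero rows.
func parseUUIDOrBadRequest(s string) (string, bool) {
	if !uuidRe.MatchString(s) {
		return "", false
	}
	return s, true
}

// parseUUID: for trusted round-trips. Panics (to be caught by the
// router's Recoverer) because an invalid value here is a programmer bug,
// not bad user input.
func parseUUID(s string) string {
	id, ok := parseUUIDOrBadRequest(s)
	if !ok {
		panic(fmt.Sprintf("invalid uuid in trusted path: %q", s))
	}
	return id
}

func main() {
	_, ok := parseUUIDOrBadRequest("not-a-uuid")
	id := parseUUID("123e4567-e89b-12d3-a456-426614174000")
	fmt.Println(ok, id != "")
}
```

Two names, two failure modes, zero silent zero-values – that is the whole fix for bug #1661.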
11. Autopilots – Scheduled and Triggered Automation
server/internal/service/autopilot.go + cron.go. Two modes:
- `create_issue` – the scheduler creates a new issue and assigns it to the agent. Normal task flow follows.
- `run_only` – no issue exists; the scheduler enqueues a task in `agent_task_queue` with autopilot context. The daemon picks it up; the meta-skill detects `MULTICA_AUTOPILOT_RUN_ID` and switches to the autopilot workflow (no `multica issue get` calls).
The `triggers` table holds:
- `cron` – robfig/cron expression + timezone
- `webhook` – endpoint hash (data model exists, dispatch not wired yet per `CLI_AND_DAEMON.md`)
- `api` – manual API trigger (same status)
runAutopilotScheduler(ctx, queries, autopilotSvc) ticks; due triggers call autopilotSvc.RunOnce.
CLI exposes only cron triggers today:
multica autopilot trigger-add \
--cron "0 9 * * 1-5" \
--timezone "America/New_York"
12. Frontend – Strict State Boundaries
This is where the project's discipline really shows. The rules are codified in CLAUDE.md and enforced via package boundaries.
12.1 The Three-Package Split
packages/core/ headless logic
- zustand stores (ALL of them, even view-related)
- react-query hooks
- api client
- StorageAdapter, NavigationAdapter (interfaces)
- ZERO react-dom
- ZERO localStorage (use StorageAdapter)
- ZERO process.env
packages/ui/ atomic primitives (shadcn / Base UI variant)
- components/ui/button.tsx, card.tsx, ...
- ZERO @multica/core imports
- ZERO business logic
packages/views/ business components/pages
- One component per route (IssuesPage, AutopilotsPage, ...)
- ZERO next/* imports
- ZERO react-router-dom
- ZERO direct store imports (read via core hooks)
- Routing via NavigationAdapter
apps/web/ Next.js wiring
apps/desktop/ Electron wiring
- Each provides StorageAdapter, NavigationAdapter, CoreProvider
- This is the ONLY layer where Next.js / Electron APIs appear
12.2 Server State vs Client State
- TanStack Query for everything API-derived. Always.
- Zustand for UI-only state (selection, modals, drafts, presence).
- WebSocket events invalidate Query. They never write directly to stores.
- All workspace-scoped queries key on `wsId`, so workspace switching invalidates automatically.
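The key discipline can be sketched in a few lines of TypeScript; the factory shape and event names are assumptions, not Multica's actual code:

```typescript
// Workspace-scoped query keys: every server-state hook keys on wsId,
// so switching workspaces invalidates automatically, and WS events can
// invalidate precise slices instead of writing into client stores.
const issueKeys = {
  all: (wsId: string) => ["issues", wsId] as const,
  list: (wsId: string, filter: string) =>
    ["issues", wsId, "list", filter] as const,
  detail: (wsId: string, issueId: string) =>
    ["issues", wsId, "detail", issueId] as const,
};

// On a WebSocket event, invalidate the relevant key prefix rather than
// mutating a zustand store (hypothetical wiring):
//   ws.on("issue.updated", (e) =>
//     queryClient.invalidateQueries({
//       queryKey: issueKeys.detail(e.wsId, e.issueId),
//     }));

console.log(issueKeys.detail("ws-1", "MUL-123").join("/"));
```

Because `wsId` is the second key segment, `invalidateQueries({ queryKey: ["issues", wsId] })` wipes exactly one workspace's cache and nothing else.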
12.3 Internal Packages Pattern
Packages export raw .ts / .tsx. Consumer's bundler (Vite / Next) compiles directly. Zero-config HMR, instant go-to-definition, no build step between packages.
12.4 pnpm Catalog
pnpm-workspace.yaml declares a catalog of pinned versions. Every package imports "react": "catalog:". Bumps happen in one place.
12.5 The No-Duplication Rule
"If the same logic exists in both apps, it must be extracted to a shared package."
Frequently restated in CLAUDE.md. This is what keeps a web + desktop app from diverging.
13. Packaging, Release, Self-Host
13.1 GoReleaser for the CLI
.goreleaser.yml builds:
- darwin / linux / windows × amd64 / arm64
- Both legacy-named and versioned tarballs (legacy keeps old `multica update` working – backwards compat)
- Checksums
- Auto-publishes a Homebrew formula to `multica-ai/homebrew-tap` on tag
User install paths:
- `brew install multica-ai/tap/multica`
- `curl https://multica.ai/install.sh | sh`
- `iwr https://multica.ai/install.ps1 | iex`
- All scripts support `--with-server` to bring up the full stack alongside the CLI.
13.2 Docker for the Server
- `Dockerfile` (server) + `Dockerfile.web` (frontend) – published to GHCR (`ghcr.io/multica-ai/multica-backend`, `multica-web`).
- Three compose files:
  - `docker-compose.yml` – dev (only Postgres)
  - `docker-compose.selfhost.yml` – production self-host
  - `docker-compose.selfhost.build.yml` – override that builds locally
13.3 The Makefile (the workflow tour)
Unusually polished at 12.5 KB:
make dev # start dev stack
make selfhost # production self-host
make selfhost-build # build locally instead of pulling
make selfhost-stop
make check # full CI pipeline locally
make sqlc # regenerate typed SQL
make migrate-up / migrate-down / migrate-status
make migrate-new name=add_foo_table
make db-reset # refuses if DATABASE_URL points to remote
make worktree-env # generate .env.worktree with unique DB name + ports
# β run multiple git worktrees in parallel against one Postgres
13.4 CI
.github/workflows/ci.yml – two jobs:
- frontend – pnpm + Node 22 + `turbo build typecheck test --filter='!@multica/docs'`
- backend – Go 1.26 + Postgres 17 + pgvector + Redis 7 services; `go build ./...`, run migrations, `go test ./...`. Separate `REDIS_TEST_URL=redis://localhost:6379/1` for runtime-local-skill tests.
.github/workflows/release.yml – auto-fires on a v* tag: Go tests → GoReleaser → GitHub Releases + Homebrew tap.
.github/workflows/desktop-smoke.yml – Electron build/package per platform.
13.5 Self-Host Gating
ALLOW_SIGNUP=false
ALLOWED_EMAIL_DOMAINS=acme.com
ALLOWED_EMAILS=alice@example.com,bob@example.com
Plus MULTICA_DEV_VERIFICATION_CODE for local dev (rejected when APP_ENV=production).
π 14. Engineering Practices Worth Stealing
A grab bag, ranked by leverage:
- `CLAUDE.md` as the engineering bible (21 KB). Every architectural rule is documented with the bug number that motivated it. Hard rules, hard reasons. `AGENTS.md` is a 2 KB pointer that just tells agents to read `CLAUDE.md`. Single source of truth, thin pointers everywhere else.
- Constructor-based DI everywhere. No globals. No `init()`. Mockability comes for free.
- Test placement is rule-bound: shared-logic tests live in the package they test; framework-specific wiring tests live in the app. Every Go file has a `_test.go` peer (often the same size or bigger).
- CI uses real Postgres + Redis services (not testcontainers). Faster, simpler.
- Bounded stderr ring buffer for every spawned process. Without this, native crashes show only "exit status 3".
- Polymorphic actor fields from day one (`*_type` + `*_id`). Retrofitting is painful.
- Workspace-scoped query keys. Switching tenant invalidates cache automatically.
- Zero-config monorepo. Packages export raw TS; the consumer's bundler compiles. Instant HMR + go-to-definition.
- Mid-flight pinning. Pin volatile state (session ID) to the server as soon as it's produced — don't wait for completion.
- Worktree-friendly Makefile. Generate `.env.worktree` with a unique DB name + ports. Run N branches in parallel against one Postgres.
- Don't make Redis required. Optional fanout, single-node default.
- Two-tier model resolution: explicit override > daemon-wide env > CLI default. No mandatory choice.
- `MULTICA_*` env vars + `agent.CustomEnv` merge with a blocklist. Users can set their own env without overriding daemon-set vars.
- Auto-detect installed CLIs via `exec.LookPath`. The daemon adapts to whatever's installed; explicit overrides exist when needed.
- `chi.Recoverer`, so panics from `parseUUID` (the trusted variant) don't crash the server — they're logged and 500'd.
- Listener registration order is documented in code comments, because it's load-bearing.
- Per-tenant security guard: the daemon refuses to spawn if `task.WorkspaceID == ""`. No silent fallback to user-global config across workspaces.
- Health port bound first. Detects another daemon already running before doing anything else.
- Stable daemon ID persisted to disk. Hostname drift is a real source of duplicate runtime rows.
- Backwards-compat legacy-named tarballs, so old `multica update` keeps working forever.
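The bounded stderr tail is small enough to show inline: it's just an `io.Writer` that keeps only the last N bytes, attached to the spawned process's stderr. A hypothetical sketch — `tailBuffer` is not the repo's type name, and a production version would need a mutex for concurrent writes:

```go
package main

import "fmt"

// tailBuffer keeps only the last `max` bytes written to it — a bounded
// "ring" for a child process's stderr, so a native crash leaves a usable
// diagnostic tail instead of just "exit status 3".
type tailBuffer struct {
	max int
	buf []byte
}

// Write implements io.Writer: append, then trim from the front if over budget.
func (t *tailBuffer) Write(p []byte) (int, error) {
	t.buf = append(t.buf, p...)
	if len(t.buf) > t.max {
		t.buf = t.buf[len(t.buf)-t.max:] // drop the oldest bytes
	}
	return len(p), nil
}

func (t *tailBuffer) String() string { return string(t.buf) }

func main() {
	// In real use you would wire it up as: cmd.Stderr = &tailBuffer{max: 64 << 10}
	tb := &tailBuffer{max: 8}
	fmt.Fprint(tb, "0123456789abcdef")
	fmt.Println(tb.String()) // 89abcdef
}
```

Memory cost is fixed (64 KB per process in Multica's case), no matter how chatty the crash is.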
15. Step-by-Step Build Plan (12 Phases)
Build a minimum-viable Multica clone. Each phase is shippable. Don't skip ahead.
Phase 1 — Skeleton (1 day)
- Init monorepo: `apps/web`, `packages/core`, `packages/ui`, `packages/views`, `server/`.
- pnpm workspace + Turborepo.
- Postgres locally; one migration: `user`, `workspace`, `member`.
- Email + password (or magic-link) auth → JWT.
- Health endpoint. Basic Chi router. Structured logging via `slog`.
Done when: `make dev` brings up Postgres + the Go server + Next.js, and you can sign up and see your workspace.
Phase 2 — Issues CRUD (2 days)
- Migrations: `issue`, `issue_label`, `comment`. Polymorphic `assignee_type` + `assignee_id`.
- sqlc + queries.
- Handler → service → repo for issues + comments.
- Linear-shaped UI: list, detail, create modal.
- TanStack Query for everything API-derived.
Done when: humans can create, assign, and comment on issues — a tiny Linear.
Phase 3 — User-Facing WebSocket (1 day)
- `/ws` endpoint with JWT auth + an origin check.
- In-process `events.Bus`. Listeners that emit broadcaster events on issue/comment changes.
- Frontend WS client invalidates Query on relevant events.
Done when: two browser tabs see each other's edits in real time.
Phase 4 — The Agent Backend Interface (1 day)
This is the keystone. Get it right.
- `server/pkg/agent/agent.go` — interface, types, factory.
- `claude.go` — first implementation: streaming stdout parser, bounded stderr tail, per-message-type translation to your taxonomy.
- `version.go`, `models.go`.
- Unit tests with a fake CLI (a shell script that prints canned NDJSON).
Done when: a unit test can run `Backend.Execute("hello")` against a fake stdout fixture and observe the unified message stream + final result.
Phase 5 — Local Daemon Skeleton (2 days)
- Cobra CLI: `multica daemon start`.
- Health port bind (fail-fast). Stable daemon ID persisted to disk.
- `LoadConfig` probes installed CLIs via `exec.LookPath`.
- POST `/api/daemon/register`.
- Heartbeat loop.
Done when: the daemon starts, registers a runtime, and the server shows it online.
Phase 6 — Task Lifecycle End-to-End (3 days)
- DB: `agent`, `agent_task_queue`, `runtime`, `task` tables.
- Server endpoints: claim task, start, messages (batch), usage, complete, fail, status.
- Daemon poll loop with a semaphore + round-robin.
- Per-task workdir: `~/multica_workspaces/{ws}/{task}/workdir/`.
- Inject `CLAUDE.md` (or `AGENTS.md`) at the workdir root with a minimal meta-skill.
- Build agentEnv with `MULTICA_*` vars; merge `agent.CustomEnv` with a blocklist.
- Run the agent → stream messages → report.
Done when: the UI shows live token-by-token output for a real assigned issue.
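The agentEnv merge is worth pinning down precisely, because it's a security boundary: user-supplied `CustomEnv` must never override daemon-set vars. A sketch under those rules; `mergeEnv`, the prefix-based blocklist, and the var names are illustrative assumptions, not the repo's exact mechanism:

```go
package main

import (
	"fmt"
	"strings"
)

// mergeEnv combines daemon-owned env vars with the agent's CustomEnv.
// Custom keys matching a blocked prefix are dropped; daemon vars are
// applied last so they always win on collision.
func mergeEnv(daemonEnv, customEnv map[string]string, blockPrefixes []string) map[string]string {
	out := make(map[string]string, len(daemonEnv)+len(customEnv))
	for k, v := range customEnv {
		blocked := false
		for _, p := range blockPrefixes {
			if strings.HasPrefix(k, p) {
				blocked = true
				break
			}
		}
		if !blocked {
			out[k] = v
		}
	}
	for k, v := range daemonEnv { // daemon vars always win
		out[k] = v
	}
	return out
}

func main() {
	env := mergeEnv(
		map[string]string{"MULTICA_TASK_ID": "t-1"},                        // daemon-set
		map[string]string{"MULTICA_TASK_ID": "spoofed", "EDITOR": "vim"},   // user CustomEnv
		[]string{"MULTICA_"},                                               // blocklist
	)
	fmt.Println(env["MULTICA_TASK_ID"], env["EDITOR"]) // t-1 vim
}
```

Users keep their `EDITOR`-style customizations; the spoofed `MULTICA_TASK_ID` never reaches the spawned agent.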
Phase 7 — Skills + Per-Provider Config Injection (1 day)
- Skill model: `{ name, content, files[] }`. Per-workspace catalog.
- Write skills into native dirs (`.claude/skills/`, etc.).
- Build the meta-skill content: identity + CLI catalog + workflow.
- Add `multica issue` CLI subcommands so the agent can call them: `get`, `list`, `comment add` (with `--content-stdin`), `update`, `assign`, `label add`.
Done when: an agent on an assigned issue calls `multica issue get` and `multica issue comment add`, and the comments appear in the UI authored as the agent.
Phase 8 — Daemon Wakeup over WS (½ day)
- `/api/daemon/ws` endpoint.
- `daemonws.Hub` with per-runtime task-wakeup channels.
- `sleepWithContextOrWakeup` returns immediately on wakeup.
Done when: latency from "assign" to "agent message arrives" is < 1 s, not 3 s.
Phase 9 — Resumable Sessions (1 day)
- Mid-flight `PinTaskSession`.
- Forward `PriorSessionID` + `PriorWorkDir` on the next claim.
- `execenv.Reuse` vs `execenv.Prepare`.
- Resume fallback: retry once with an empty `ResumeSessionID` if resume fails before establishing a session.
- GC loop for `~/multica_workspaces/`.
Done when: two consecutive comments on the same issue don't lose context, and finished issues' workdirs are cleaned up.
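The resume fallback is easy to get wrong, so here it is isolated. This sketch simplifies deliberately: it retries on any error when a resume ID was supplied, whereas the guide's rule only retries when the failure happens before a session was established. `runWithResumeFallback` and `run` are illustrative names:

```go
package main

import (
	"errors"
	"fmt"
)

// runWithResumeFallback attempts a resumed run first; if that fails (e.g.
// the session ID is stale on this machine), it retries exactly once with
// an empty resume ID, i.e. a fresh session, instead of failing the task.
func runWithResumeFallback(resumeID string, run func(resumeID string) (string, error)) (string, error) {
	out, err := run(resumeID)
	if err != nil && resumeID != "" {
		// Stale session ID is the common cause — fall back to a fresh session.
		// (The real rule also checks that no session was established yet.)
		return run("")
	}
	return out, err
}

func main() {
	calls := 0
	out, err := runWithResumeFallback("stale-session", func(id string) (string, error) {
		calls++
		if id != "" {
			return "", errors.New("no conversation found with session id")
		}
		return "fresh session ok", nil
	})
	fmt.Println(out, err == nil, calls) // fresh session ok true 2
}
```

The "exactly once" property matters: retrying in a loop would mask real failures, and retrying with the same stale ID would loop forever.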
Phase 10 — Add a Second + Third Backend (1 day)
- `gemini.go` (simpler, stream-json).
- `codex.go` (more complex: app-server mode + a per-task `CODEX_HOME`).
- Verify the abstraction holds — no schema changes, no UI changes.
Done when: the UI shows a model picker with multiple providers, and assigning to a different agent uses a different CLI.
Phase 11 — Autopilots (1 day)
- `autopilot` + `trigger` tables.
- `robfig/cron/v3` scheduler in a goroutine.
- `RunOnce` mode: enqueue a task with autopilot context (`MULTICA_AUTOPILOT_*` env).
- Meta-skill branch for autopilot runs.
- `CreateIssue` mode: the scheduler creates an issue and assigns it.
- CLI: `multica autopilot create / trigger-add / list / delete`.
Done when: a cron-triggered autopilot fires and produces output in the UI without human intervention.
Phase 12 — Packaging + Self-Host (1 day)
- GoReleaser config: mac/linux/win × amd64/arm64.
- Homebrew tap auto-publish on tag.
- `install.sh` and `install.ps1` that detect Homebrew if available.
- GHCR images for the server + web.
- `docker-compose.selfhost.yml` for end-users.
- Auth gating: `ALLOW_SIGNUP`, `ALLOWED_EMAILS`, `ALLOWED_EMAIL_DOMAINS`.
Done when: a stranger can `brew install you/tap/yourcli && yourcli setup self-host` against a Docker-Compose'd backend.
16. Common Pitfalls and Hard-Won Guardrails
These are real bugs Multica documents in `CLAUDE.md` — borrow them rather than re-discover them.
| Pitfall | Guardrail |
|---|---|
| Generic `ParseUUID` returns a zero UUID silently — DELETEs return 204 matching nothing. | Three named helpers: `parseUUIDOrBadRequest` (input boundary), `parseUUID` (trusted, panics), `loadXForUser` (accepts a UUID or a human ID like `MUL-123`). |
| Native CLI crashes show as "exit status 3" with no diagnostic. | Bounded stderr ring buffer; attach the last 64 KB to `Result.Error`. |
| Hostname drift mints duplicate runtime rows. | Persist the daemon ID to disk; report legacy hostname-derived IDs at register time so the server can merge. |
| Daemon silently uses user-global config across workspaces. | Refuse to spawn if `task.WorkspaceID == ""`. |
| Two daemons running on one machine — race. | Bind the health port first; fail fast. |
| Agent CLI users override daemon-set env vars. | Blocklist on the merge of `agent.CustomEnv` into agentEnv. |
| Bash `\n` in double-quoted strings doesn't expand — multi-line agent comments get mangled. | Hard-coded rule in the meta-skill: always use `--content-stdin` with HEREDOCs. |
| Resume with a stale session ID fails silently. | Resume fallback: retry once with an empty `ResumeSessionID`. |
| Workdirs grow unbounded. | GC loop with `MULTICA_GC_TTL` (default 24 h) and an orphan TTL. 404 on the issue — immediate clean. |
| Daemon WS dies — wakeups silently lost. | Always-on poll loop as the floor; WS is just an accelerator. |
| Listener registration order causes notifications to miss subscribers. | Document the order in code comments; subscribers register before notifications. |
| Contributors running multiple worktrees collide on Postgres. | `make worktree-env` generates `.env.worktree` with a unique DB name + ports. |
| Old CLI binaries break after a rename. | Legacy-named tarballs alongside versioned ones — `multica update` keeps working. |
| Codex skills pollute `~/.codex/`. | Per-task `CODEX_HOME`. |
| Single-node prod self-host gets blocked by a Redis dependency. | Optional Redis; in-memory hub by default. |
| Agents loop on each other's pure-ack comments. | Meta-skill rule: "If the prior comment was a pure ack/thanks AND you produced no work, do NOT reply — silence is preferred." |
| Server-state writes from WS events corrupt cache. | WS events invalidate Query. They never write directly to stores. |
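The first pitfall in the table deserves code. The split between a boundary parser that errors and a trusted parser that panics is the whole fix; a hypothetical sketch using a regexp stand-in for a real UUID library (helper names mirror the table, not the repo):

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRe validates the canonical 8-4-4-4-12 hex form. A real codebase
// would use a UUID library; the regexp keeps this sketch dependency-free.
var uuidRe = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// parseUUIDOrError is the input-boundary helper: invalid input becomes an
// explicit error (mapped to a 400), never a silent zero value that makes
// a DELETE match nothing and still return 204.
func parseUUIDOrError(s string) (string, error) {
	if !uuidRe.MatchString(s) {
		return "", fmt.Errorf("invalid UUID %q", s)
	}
	return s, nil
}

// mustParseUUID is the trusted-path variant: it panics loudly on bad input.
// A recovery middleware (chi.Recoverer in Multica's case) turns the panic
// into a logged 500 instead of crashing the server.
func mustParseUUID(s string) string {
	id, err := parseUUIDOrError(s)
	if err != nil {
		panic(err)
	}
	return id
}

func main() {
	if _, err := parseUUIDOrError("not-a-uuid"); err != nil {
		fmt.Println("boundary helper rejects:", err)
	}
	fmt.Println(mustParseUUID("123e4567-e89b-12d3-a456-426614174000"))
}
```

The discipline is in the call sites: handlers touching request input may only use the erroring variant; the panicking one is reserved for values the server itself produced.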
17. Cheat Sheet
Files to read first (in order)
- `server/pkg/agent/agent.go` — the interface.
- `server/pkg/agent/claude.go` — the canonical implementation.
- `server/internal/daemon/daemon.go` — the lifecycle + poll loop.
- `server/internal/daemon/execenv/runtime_config.go` — meta-skill builder.
- `server/internal/daemon/prompt.go` — task-kind-branched prompt.
- `server/cmd/server/main.go` — server bootstrap.
- `server/cmd/server/router.go` — full route tree.
- `server/migrations/001_init.up.sql` — core schema.
- `CLAUDE.md` — every rule that matters, with the bug that motivated it.
- `Makefile` — the workflow.
Default config values
| Setting | Default | Env var |
|---|---|---|
| Poll interval | 3 s | MULTICA_DAEMON_POLL_INTERVAL |
| Heartbeat interval | 15 s | MULTICA_DAEMON_HEARTBEAT_INTERVAL |
| Agent timeout | 2 h | MULTICA_AGENT_TIMEOUT |
| Codex semantic-inactivity timeout | 10 m | MULTICA_CODEX_SEMANTIC_INACTIVITY_TIMEOUT |
| Max concurrent tasks per daemon | 20 | MULTICA_DAEMON_MAX_CONCURRENT_TASKS |
| Health port | 19514 | (CLI flag) |
| Workspaces root | ~/multica_workspaces/ | MULTICA_WORKSPACES_ROOT |
| GC TTL (done issues) | 24 h | MULTICA_GC_TTL |
| GC orphan TTL | 72 h | MULTICA_GC_ORPHAN_TTL |
The unified message taxonomy (don't deviate)
- `text` — assistant prose
- `thinking` — assistant reasoning
- `tool-use` — tool call (Tool, CallID, Input)
- `tool-result` — tool output (CallID, Output)
- `status` — lifecycle event (model loaded, sandbox ready, …)
- `error` — non-fatal error
- `log` — debug log
The unified result statuses
- `completed` — happy path
- `failed` — agent returned non-zero
- `aborted` — ctx cancelled by the user
- `timeout` — hit AgentTimeout / SemanticInactivityTimeout
- `cancelled` — server-side cancel
The agent's CLI vocabulary (what the meta-skill teaches)
multica issue get <id> --output json
multica issue list --output json
multica issue comment list <id> --output json
multica workspace members --output json
multica issue create --title ... --content-stdin <<EOF ... EOF --output json
multica issue update <id> ... --output json
multica issue assign <id> --to <member-or-agent> --output json
multica issue label add <id> --label ... --output json
multica issue subscriber add <id> --user ... --output json
multica issue comment add <id> --content-stdin <<EOF ... EOF --output json
multica label create --name ... --color ... --output json
multica autopilot create / update / trigger / delete ...
The polymorphic-actor pattern
CREATE TABLE issue (
id UUID PRIMARY KEY,
workspace_id UUID NOT NULL REFERENCES workspace,
title TEXT NOT NULL,
content TEXT,
status TEXT NOT NULL,
assignee_type TEXT CHECK (assignee_type IN ('member', 'agent')),
assignee_id UUID,
creator_type TEXT CHECK (creator_type IN ('member', 'agent')),
creator_id UUID NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
...
);
Hard rules (non-negotiable)
- Every server query filters by `workspace_id`.
- Every TanStack Query key includes `wsId`.
- `packages/core/` has zero `react-dom`, zero `localStorage`, zero `process.env`.
- `packages/views/` has zero `next/*`, zero `react-router-dom`.
- `packages/ui/` has zero `@multica/core` imports.
- Listener registration order: subscribers before notifications.
- The daemon refuses to spawn if `task.WorkspaceID == ""`.
- Always pass `--output json` from the agent's CLI calls.
- Always use `--content-stdin` with HEREDOCs for multi-line content.
- WS events invalidate Query; they never write directly to stores.
- Migrations are append-only. Never edit an applied migration.
Closing Thought
Multica's superpower isn't novel ML — it's discipline:
- One interface for agents (`Backend.Execute`), eleven implementations.
- One workdir convention (`~/multica_workspaces/{ws}/{task}/`); every agent self-bootstraps via its native config-file format.
- One source of truth (Postgres), one event bus, two WS subsystems with distinct audiences.
- One engineering bible (`CLAUDE.md`), every rule annotated with the bug that produced it.
If you internalize §3 (don't build the loop, wrap it) and §5 (the Backend interface), and you keep that discipline as you grow, you can recreate this in ~10–14 days of focused work for a v1.
Now go build.
If you found this helpful, let me know by leaving a like or a comment! And if you think this post could help someone, feel free to share it. Thank you very much!
All rights reserved