
Posted on Mar 15, 3:59 AM · 6 min read


Dissecting the DeerFlow 2.0 Architecture: When ByteDance Builds a SuperAgent Framework

Introduction

On 28 Feb 2026, a ByteDance Python repo climbed to #1 on GitHub Trending overnight. It was DeerFlow 2.0: a complete from-scratch rewrite that keeps not a single line of code from v1.

DeerFlow 1.x was a Deep Research framework: you ask, it searches the web, synthesizes, and answers. Simple. DeerFlow 2.0 is a different thing entirely: a SuperAgent Harness that orchestrates sub-agents, manages long-term memory, runs code in a sandbox, and extends through skills. It is closer to a personal AI OS than to a research tool.

This article analyzes the architecture from a senior developer's point of view, reading the source code directly rather than the marketing copy.

Architecture Overview

┌─────────────────────────────────────────────┐
│           Frontend (Next.js 15+)             │
│   React 19 · Radix UI · @xyflow (graph viz)  │
│          LangGraph SDK (SSE streaming)        │
└──────────────────┬──────────────────────────┘
                   │ SSE / REST
┌──────────────────▼──────────────────────────┐
│            FastAPI Gateway                   │
│  /agents  /memory  /skills  /mcp  /channels  │
└──────────────────┬──────────────────────────┘
                   │
┌──────────────────▼──────────────────────────┐
│        LangGraph Agent Server                │
│   (langgraph dev · lead_agent graph)         │
└──────────────────┬──────────────────────────┘
                   │
┌──────────────────▼──────────────────────────┐
│           Lead Agent (LangChain)             │
│  create_agent() + Middleware Chain           │
│  ┌─────────────────────────────────────────┐ │
│  │ ThreadDataMiddleware                    │ │
│  │ UploadsMiddleware                       │ │
│  │ DanglingToolCallMiddleware              │ │
│  │ SummarizationMiddleware                 │ │
│  │ TodoMiddleware (plan mode only)         │ │
│  │ TitleMiddleware                         │ │
│  │ MemoryMiddleware                        │ │
│  │ ViewImageMiddleware (vision models)     │ │
│  │ SubagentLimitMiddleware (max 3-4)       │ │
│  │ LoopDetectionMiddleware                 │ │
│  │ ClarificationMiddleware (always last)   │ │
│  └─────────────────────────────────────────┘ │
└────────────┬──────────┬───────────┬──────────┘
             │          │           │
     ┌───────▼──┐ ┌─────▼───┐ ┌─────▼──────┐
     │ Subagent │ │Sandbox  │ │  Memory    │
     │ Executor │ │(local / │ │  Updater   │
     │ThreadPool│ │ Docker /│ │ (LLM+JSON) │
     │max: 3-4  │ │  K8s)   │ │            │
     └──────────┘ └─────────┘ └────────────┘

The actual tech stack, as read from the source:

Backend: Python 3.12+, FastAPI, LangGraph, LangChain
Frontend: Next.js 15, React 19, @xyflow/react (visualize agent graph)
LLM: Multi-provider — OpenAI, Anthropic, Gemini, DeepSeek, Doubao (config-driven)
Channels: Telegram, Slack, Feishu (Lark)
MCP: langchain-mcp-adapters + OAuth

Deep Dive: The Most Interesting Parts

1. Lead Agent và Middleware Chain

This is the best part of the architecture. DeerFlow does not hand-build a LangGraph graph directly; it calls langchain.agents.create_agent() and injects a Middleware Chain into it.

# The middleware order is deliberate, not arbitrary
def _build_middlewares(config, model_name, agent_name=None):
    # NOTE: model_config and subagent_enabled are not defined in this abridged
    # excerpt; presumably they are resolved in the elided parts of the function.
    middlewares = build_lead_runtime_middlewares(lazy_init=True)
    # ... SummarizationMiddleware (shrinks the context early)
    # ... TodoMiddleware (only when plan_mode=True)
    middlewares.append(TitleMiddleware())
    middlewares.append(MemoryMiddleware(agent_name=agent_name))
    if model_config.supports_vision:
        middlewares.append(ViewImageMiddleware())
    if subagent_enabled:
        middlewares.append(SubagentLimitMiddleware(max_concurrent=3))
    middlewares.append(LoopDetectionMiddleware())
    middlewares.append(ClarificationMiddleware())  # always last
    return middlewares

Each middleware has before_model, after_model, and after_agent hooks. The pattern is clean: easy to extend, and each middleware is easy to test in isolation. Comments in the source even spell out the reason for each middleware's position in the chain, which is rare in open source.

SubagentLimitMiddleware is a good example: instead of relying on prompt engineering to keep the LLM from spawning too many subagents, it hard-enforces the limit in code by truncating the excess tool calls directly on the AIMessage. That is more pragmatic and far more reliable.
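The truncation idea can be sketched in a few lines. This is an illustration under stated assumptions, not the DeerFlow implementation: `AIMessageSketch` and `ToolCall` are stand-in dataclasses rather than LangChain classes, and the subagent tool name `"task"` is hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolCall:
    id: str
    name: str


@dataclass(frozen=True)
class AIMessageSketch:
    """Stand-in for a model reply that carries tool calls."""
    content: str
    tool_calls: tuple[ToolCall, ...] = ()


def limit_subagent_calls(message: AIMessageSketch, max_concurrent: int = 3,
                         subagent_tool: str = "task") -> AIMessageSketch:
    """Keep all other tool calls, but at most `max_concurrent` subagent calls."""
    kept = []
    subagent_seen = 0
    for call in message.tool_calls:
        if call.name == subagent_tool:
            subagent_seen += 1
            if subagent_seen > max_concurrent:
                continue  # drop the excess spawn request
        kept.append(call)
    return replace(message, tool_calls=tuple(kept))


# Five spawn requests plus one unrelated tool call in a single reply.
msg = AIMessageSketch(
    "fan out",
    tuple(ToolCall(f"c{i}", "task") for i in range(5)) + (ToolCall("c5", "web_search"),),
)
limited = limit_subagent_calls(msg)
```

The point of doing this after the model call is that the limit holds no matter what the model decides to emit; the prompt can still ask for restraint, but correctness no longer depends on it.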

2. Memory System

DeerFlow's memory is structured JSON plus LLM-powered updates:

{
  "user": {
    "workContext": { "summary": "...", "updatedAt": "..." },
    "personalContext": { "summary": "...", "updatedAt": "..." },
    "topOfMind": { "summary": "...", "updatedAt": "..." }
  },
  "history": {
    "recentMonths": { "summary": "..." },
    "earlierContext": { "summary": "..." },
    "longTermBackground": { "summary": "..." }
  },
  "facts": [
    { "id": "fact_abc123", "content": "...", "confidence": 0.9, "category": "context" }
  ]
}

After each conversation, MemoryMiddleware queues an async job. The job calls the LLM with the conversation plus the current memory, and the LLM returns a diff (shouldUpdate + newFacts + factsToRemove). Facts are sorted by confidence before being injected into the context.
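A rough sketch of how such a diff might be applied is shown below. The field names shouldUpdate, newFacts, and factsToRemove come from the article; everything else is an assumption for illustration: that factsToRemove holds fact IDs, the helper names `apply_memory_update` and `facts_for_prompt`, the ID format, and the `limit` parameter.

```python
import uuid


def apply_memory_update(memory: dict, update: dict) -> dict:
    """Return a new memory dict with the LLM-proposed diff applied."""
    if not update.get("shouldUpdate"):
        return memory
    removed = set(update.get("factsToRemove", []))  # assumed to be fact IDs
    facts = [f for f in memory.get("facts", []) if f["id"] not in removed]
    for new_fact in update.get("newFacts", []):
        facts.append({"id": f"fact_{uuid.uuid4().hex[:8]}", **new_fact})
    return {**memory, "facts": facts}


def facts_for_prompt(memory: dict, limit: int = 10) -> list[str]:
    """Highest-confidence facts first, capped at `limit` entries."""
    ranked = sorted(memory.get("facts", []),
                    key=lambda f: f["confidence"], reverse=True)
    return [f["content"] for f in ranked[:limit]]


memory = {"facts": [
    {"id": "fact_a", "content": "Prefers TypeScript",
     "confidence": 0.6, "category": "preference"},
    {"id": "fact_b", "content": "Works on a payments team",
     "confidence": 0.9, "category": "context"},
]}
update = {
    "shouldUpdate": True,
    "factsToRemove": ["fact_a"],
    "newFacts": [{"content": "Migrating services to Go",
                  "confidence": 0.75, "category": "context"}],
}
updated = apply_memory_update(memory, update)
prompt_facts = facts_for_prompt(updated)
```

Having the LLM return a diff instead of a full rewritten memory keeps the update small and makes it harder for one bad generation to wipe out everything stored so far.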

A subtle touch: _strip_upload_mentions_from_memory() automatically removes mentions of uploaded files from memory, because uploaded files are session-scoped and do not need to persist. This kind of detail suggests the team has run this in production and knows the real pain points.

The cache uses an mtime check instead of a TTL: if the file is modified externally, the cache invalidates itself on the next read.
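The mtime-based invalidation can be sketched as follows. This is a generic illustration of the technique, not DeerFlow's cache: the class name `MtimeCachedJson` and the `loads` counter are hypothetical, and the demo bumps the mtime explicitly with `os.utime` so that it does not depend on filesystem timestamp resolution.

```python
import json
import os
import tempfile


class MtimeCachedJson:
    """Cache a parsed JSON file; reload when its modification time changes."""

    def __init__(self, path: str):
        self.path = path
        self._mtime_ns = None
        self._data = None
        self.loads = 0  # number of real disk reads, for demonstration

    def get(self):
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self.path, encoding="utf-8") as fh:
                self._data = json.load(fh)
            self._mtime_ns = mtime_ns
            self.loads += 1
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "memory.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"facts": []}, fh)

    cache = MtimeCachedJson(path)
    cache.get()
    cache.get()
    loads_after_two_reads = cache.loads  # second read is served from cache

    # Simulate an external edit and push the mtime forward by one second.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"facts": [{"id": "fact_1"}]}, fh)
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

    reloaded = cache.get()
    loads_after_edit = cache.loads
```

Compared with a TTL, this costs one `stat` call per read but never serves stale data for a fixed window. One caveat of the technique in general: two writes that land within the same timestamp tick can go unnoticed.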

3. Subagent System

from pydantic import BaseModel

class SubagentsAppConfig(BaseModel):
    timeout_seconds: int = 900  # default: 15 minutes
    agents: dict[str, SubagentOverrideConfig] = {}  # per-agent overrides

Subagents run in a ThreadPoolExecutor with 2 pools (scheduler + execution). Each subagent can:

Inherit the model from its parent or override it
Have per-agent memory isolation (its own agent_name)
Filter tools through an allowlist/denylist

Max concurrency defaults to 3, clamped to the range [2, 4]. The hardcoded range has a reason: too many parallel subagents easily hit API rate limits and make the context window explode.
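The clamping and the bounded pool can be sketched like this. It is a simplification under stated assumptions: the function names are hypothetical, and a single pool is used here where the source reportedly uses two (scheduler + execution).

```python
from concurrent.futures import ThreadPoolExecutor


def clamp_concurrency(requested: int, low: int = 2, high: int = 4) -> int:
    """Force the requested concurrency into the range [low, high]."""
    return max(low, min(high, requested))


def run_subagents(tasks, requested_concurrency: int = 3,
                  timeout_seconds: float = 900):
    """Run callables on a bounded thread pool and return results in order.

    `Future.result(timeout=...)` raises TimeoutError if a task overruns,
    but it does not stop the worker thread; real cancellation needs
    cooperation from the task itself.
    """
    workers = clamp_concurrency(requested_concurrency)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result(timeout=timeout_seconds) for f in futures]


# Asking for 10 workers still yields a pool of 4.
results = run_subagents([lambda i=i: i * i for i in range(5)],
                        requested_concurrency=10)
```

Tasks beyond the pool size simply wait in the executor's queue, so the cap limits parallel API calls without rejecting work.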

4. Sandbox

An abstract Sandbox class with 3 implementations: local subprocess, Docker, and Kubernetes. The security tests have their own file (test_sandbox_tools_security.py), so security is not an afterthought.
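The shape of such an abstraction might look like the sketch below. Only the idea of an abstract `Sandbox` with a local subprocess backend comes from the article; the method name `execute_command`, its signature, and the `run_snippet` helper are assumptions, and the Docker and Kubernetes backends are not shown.

```python
import subprocess
import sys
from abc import ABC, abstractmethod


class Sandbox(ABC):
    """Hypothetical interface: every backend runs a command, returns stdout."""

    @abstractmethod
    def execute_command(self, args: list[str], timeout: float = 30.0) -> str:
        ...


class LocalSandbox(Sandbox):
    """Runs the command as a local subprocess. This gives no real isolation."""

    def execute_command(self, args: list[str], timeout: float = 30.0) -> str:
        done = subprocess.run(args, capture_output=True, text=True,
                              timeout=timeout, check=True)
        return done.stdout


def run_snippet(sandbox: Sandbox, code: str,
                interpreter: str = sys.executable) -> str:
    """Caller code depends only on the interface, not on the backend."""
    return sandbox.execute_command([interpreter, "-c", code])


output = run_snippet(LocalSandbox(), "print(6 * 7)")
```

A container-based backend would implement the same method and be swapped in through configuration, which is what lets the same agent code move from a laptop to a cluster.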

Comparison with OpenClaw

OpenClaw is a personal AI assistant framework: a Node.js daemon that runs as a resident process on your machine and connects to Telegram/Signal/Discord/WhatsApp.

Criterion	DeerFlow 2.0	OpenClaw
Runtime	Python 3.12, FastAPI + LangGraph server	Node.js, single daemon process
Agent core	LangChain `create_agent` + Middleware Chain	Claude/GPT via Anthropic SDK, session-based
Memory	Structured JSON + LLM-powered update	Markdown files (MEMORY.md + daily notes)
Skills/Tools	Python packages in `community/` + MCP	SKILL.md files (instructions for the LLM)
Subagents	Native ThreadPoolExecutor, max 3-4 concurrent	`sessions_spawn` → isolated subprocess
Sandbox	Local/Docker/K8s, abstract interface	No built-in sandbox
Channels	Telegram, Slack, Feishu	Telegram, Signal, WhatsApp, Discord, IRC, Line
Frontend	Next.js with graph visualization	Web chat UI
Config	`config.yaml` (model, sandbox, memory, tools)	JSON config + plugin system
Deploy	Docker Compose / K8s	Single npm process, systemd/pm2

DeerFlow's Strengths

1. A clear, extensible middleware architecture. Adding a new middleware only requires implementing 2-3 hook methods, without understanding the whole graph. Comments in the code explain the ordering, a sign of a seriously maintained codebase.

2. A production-grade memory system. A structured schema with confidence scoring, per-agent isolation, and LLM-powered diff updates: more careful thinking about long-term memory than in most agent frameworks.

3. Subagents with hard enforcement. SubagentLimitMiddleware truncates tool calls in code rather than relying on the prompt, which is far more reliable.

4. Sandbox abstraction. Local, Docker, and K8s are supported behind the same interface, which helps with scaling.

OpenClaw's Strengths

1. Simple to run. A single npm install -g openclaw gets you running. DeerFlow requires setting up Python 3.12 + uv + pnpm + Docker.

2. Native multi-channel support. 7+ messaging platforms work out of the box. DeerFlow has only 3 (Telegram, Slack, Feishu).

3. Skills = plain Markdown. SKILL.md files are instructions for the LLM, so adding a new capability requires no code. DeerFlow skills require writing a Python package.

4. Built-in cron + heartbeat. Scheduling and proactive behavior are designed into the core. DeerFlow has no equivalent.

5. Personal assistant focus. It is designed to run 24/7 on a personal machine and knows about the user's calendar/email/files. DeerFlow focuses on task execution more than on personal context.

When to Use Which?

Use DeerFlow when:

You need to run complex long-running tasks (research, code generation pipelines)
You need real sandbox isolation (running untrusted code)
You have a team/enterprise setup where many users share one agent server
You need to visualize the agent execution flow (the Next.js graph UI is very polished)
You want to customize the LLM reasoning pipeline in depth

Use OpenClaw when:

You want a personal assistant running 24/7 on your own machine
You need to connect to many messaging platforms
You want to add capabilities quickly without writing code (SKILL.md)
You need integration with local tools (file system, calendar, SSH, smart home)
You want a lightweight setup and do not want to maintain a Docker stack

Conclusion

DeerFlow 2.0 is a serious engineering effort. Code quality is high, comments are clear, and design decisions come with reasons. The middleware pattern for agents is worth learning: clean separation of concerns and easy to test.

The most interesting part is that the two projects solve different problems even though they look alike from the outside: DeerFlow is a task execution engine (complex work, needs isolation, can scale), while OpenClaw is a personal AI OS (always available, knows about you, connected to your digital life).

They do not compete; they complement each other. And both show that in 2026, "AI agent framework" has become a real software category, no longer just a buzzword.

This article is based on an analysis of the DeerFlow 2.0 source code at github.com/bytedance/deer-flow · commit from March 2026
