
Posted on Feb 18, 12:15 AM · 2 min read


TradingAgents Analysis: Architecture, Reasoning & Communication

1. Overall Architecture (LangGraph State Machine)

The system uses LangGraph to orchestrate the agents as a state machine over a directed graph. The graph is not acyclic: the tool-call loops and the debate loops are cycles. The pipeline looks like this:

START
  │
  ▼
[Market Analyst] ──tool calls──→ [tools_market] ──→ (loop)
  │ (done)
  ▼
[Social Analyst] ──tool calls──→ [tools_social] ──→ (loop)
  │ (done)  
  ▼
[News Analyst]
  │
  ▼
[Fundamentals Analyst]
  │
  ▼
[Bull Researcher] ◄──────────────────────────────────┐
  │                                                   │
  ▼                                                   │
[Bear Researcher] ──→ (count < 2*rounds) ────────────┘
  │ (count >= limit)
  ▼
[Research Manager]  ← Deep LLM (gpt-5.2)
  │
  ▼
[Trader]
  │
  ▼
[Aggressive] → [Conservative] → [Neutral] → (loop while count < 3*rounds)
  │ (count >= limit)
  ▼
[Risk Judge]  ← Deep LLM
  │
  ▼
 END → process_signal() → BUY/SELL/HOLD

2. How Agents Communicate: Shared State Pattern

Agents never call each other directly. They communicate through a shared state dict (AgentState):

from langgraph.graph import MessagesState

# InvestDebateState and RiskDebateState are TypedDicts defined elsewhere in the project.
class AgentState(MessagesState):
    company_of_interest: str      # Ticker
    trade_date: str

    # Written by the analysts:
    market_report: str
    sentiment_report: str
    news_report: str
    fundamentals_report: str

    # Read and written by the debate team:
    investment_debate_state: InvestDebateState
    investment_plan: str

    # Read and written by the risk team:
    risk_debate_state: RiskDebateState
    final_trade_decision: str
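
The shared-state pattern can be sketched without LangGraph at all. In the minimal illustration below, each node is a plain function that reads the state dict and returns only the keys it wants to update. The `run_pipeline` runner, the node bodies, and the sample ticker and date are hypothetical stand-ins, not the project's code.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Node = Callable[[State], State]


def market_analyst(state: State) -> State:
    # Hypothetical analyst: writes only its own report field.
    return {"market_report": f"Technical view on {state['company_of_interest']}"}


def news_analyst(state: State) -> State:
    return {"news_report": f"Headlines for {state['company_of_interest']}"}


def research_manager(state: State) -> State:
    # Reads what earlier nodes wrote; never calls them directly.
    summary = f"{state['market_report']} | {state['news_report']}"
    return {"investment_plan": f"Plan based on: {summary}"}


def run_pipeline(nodes: List[Node], state: State) -> State:
    # Merge each node's partial update into the shared state, in order.
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_pipeline(
    [market_analyst, news_analyst, research_manager],
    {"company_of_interest": "NVDA", "trade_date": "2024-05-10"},
)
print(final_state["investment_plan"])
```

Because every node only sees the state, nodes can be reordered or swapped out without touching each other, which is the property the pipeline relies on.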

Data flow:

Analysts write reports → stored in market_report, news_report, ...
Researchers read the reports → write to investment_debate_state.history
Research Manager reads the history → writes investment_plan
Risk agents read investment_plan → write to risk_debate_state
Risk Judge reads risk_debate_state.history → writes final_trade_decision

3. Debate Reasoning: The Debate Mechanism

3.1 Investment Debate (Bull vs Bear)

InvestDebateState:
{
  "history": "the full debate transcript",
  "bull_history": "bull arguments only",
  "bear_history": "bear arguments only",
  "current_response": "the most recent argument",
  "count": 2,          # number of turns taken so far
  "judge_decision": ""
}

ConditionalLogic drives the loop:

def should_continue_debate(self, state):
    debate = state["investment_debate_state"]
    if debate["count"] >= 2 * self.max_debate_rounds:  # 2 turns by default
        return "Research Manager"       # hand off to the judge
    if debate["current_response"].startswith("Bull"):
        return "Bear Researcher"        # Bear rebuts
    return "Bull Researcher"            # Bull rebuts
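
To see the turn order this routing produces, the sketch below simulates the loop with stubbed researchers. The `route` function applies the same decision rule to a plain dict; the stub responses and the assumption that Bull speaks first (as in the pipeline diagram) are illustrative only.

```python
def route(debate: dict, max_debate_rounds: int = 1) -> str:
    # Same decision rule as should_continue_debate, on a plain dict.
    if debate["count"] >= 2 * max_debate_rounds:
        return "Research Manager"
    if debate["current_response"].startswith("Bull"):
        return "Bear Researcher"
    return "Bull Researcher"


def simulate(max_debate_rounds: int) -> list:
    debate = {"history": "", "current_response": "", "count": 0}
    speaker = "Bull Researcher"  # assumed: Bull opens the debate
    order = []
    while speaker != "Research Manager":
        side = speaker.split()[0]  # "Bull" or "Bear"
        response = f"{side} Analyst: argument #{debate['count'] + 1}"
        debate = {
            "history": debate["history"] + "\n" + response,
            "current_response": response,
            "count": debate["count"] + 1,
        }
        order.append(speaker)
        speaker = route(debate, max_debate_rounds)
    order.append(speaker)
    return order


print(simulate(1))
print(simulate(2))
```

With one round the debate is a single Bull/Bear exchange before the judge steps in; each extra round adds one more exchange.
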

3.2 Risk Debate (Aggressive → Conservative → Neutral)

Same idea, but with three parties rotating in a fixed order:

Aggressive → Conservative → Neutral → Aggressive... (3*rounds turns in total)
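
A minimal sketch of this three-way rotation follows. It assumes the router keys off a `latest_speaker` field and a `count` of turns taken; both field names and the `max_risk_rounds` parameter are assumptions made for illustration.

```python
ROTATION = ["Aggressive", "Conservative", "Neutral"]


def next_risk_speaker(risk_state: dict, max_risk_rounds: int = 1) -> str:
    # After 3 * rounds turns, hand the transcript to the judge.
    if risk_state["count"] >= 3 * max_risk_rounds:
        return "Risk Judge"
    latest = risk_state["latest_speaker"]
    if latest not in ROTATION:
        return ROTATION[0]  # nobody has spoken yet
    # Fixed order: each speaker is followed by the next one, wrapping around.
    return ROTATION[(ROTATION.index(latest) + 1) % len(ROTATION)]


def simulate_risk(max_risk_rounds: int) -> list:
    state = {"latest_speaker": "", "count": 0}
    order = []
    while True:
        speaker = next_risk_speaker(state, max_risk_rounds)
        order.append(speaker)
        if speaker == "Risk Judge":
            return order
        state = {"latest_speaker": speaker, "count": state["count"] + 1}


print(simulate_risk(1))
```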

4. Memory System: BM25 Retrieval

Five independent memory stores, retrieved with BM25 rather than vector embeddings:

bull_memory, bear_memory, trader_memory
invest_judge_memory, risk_manager_memory

How it works:

Write to memory (after each trade, once P&L is known): memory.add_situations([(market_situation_text, reflection_text)])

Read from memory (before an agent reasons): past_memories = memory.get_memories(current_situation, n_matches=2)

BM25 returns the 2 most similar past situations by keyword matching.

Pattern for injecting the retrieved memories into the prompt:

prompt = f"""...
Reflections from similar situations: {past_memory_str}
You must address reflections and learn from lessons and mistakes...
"""

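The write/read cycle in this section can be sketched with a small hand-rolled BM25 scorer. This is a simplified illustration under stated assumptions, not the project's memory class: `TinyBM25Memory`, the tokenizer, the `k1`/`b` defaults, and the shape of the returned dicts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyBM25Memory:
    """Illustrative stand-in for a BM25-backed memory store."""

    def __init__(self, k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs = []     # tokenized situations
        self.lessons = []  # reflection text paired with each situation

    def add_situations(self, pairs):
        for situation, reflection in pairs:
            self.docs.append(tokenize(situation))
            self.lessons.append(reflection)

    def _score(self, query_tokens, doc):
        n = len(self.docs)
        avg_len = sum(len(d) for d in self.docs) / n
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            df = sum(1 for d in self.docs if term in d)
            # "1 +" keeps the IDF non-negative for very common terms.
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf[term] + self.k1 * (1 - self.b + self.b * len(doc) / avg_len)
            score += idf * tf[term] * (self.k1 + 1) / norm
        return score

    def get_memories(self, current_situation, n_matches=2):
        q = tokenize(current_situation)
        scored = [(self._score(q, d), i) for i, d in enumerate(self.docs)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Drop zero-score entries: they share no keyword with the query.
        return [
            {"recommendation": self.lessons[i], "score": s}
            for s, i in scored[:n_matches]
            if s > 0
        ]


memory = TinyBM25Memory()
memory.add_situations([
    ("rates rising, tech stocks selling off, high volatility",
     "Cut position size when volatility spikes."),
    ("strong earnings beat, guidance raised, low volatility",
     "Do not fade strong guidance."),
    ("oil supply shock, energy sector rally",
     "Energy rallies on supply shocks tend to fade."),
])
hits = memory.get_memories("tech selling off as rates keep rising, volatility high")
past_memory_str = "\n".join(h["recommendation"] for h in hits)
print(past_memory_str)

# A paraphrase with no shared keywords retrieves nothing.
print(memory.get_memories("equities tumble amid hawkish central bank"))
```

The last call illustrates the limitation of keyword retrieval: a situation described in different words scores zero even when it means the same thing.
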
5. Reflection Loop: Learning from Outcomes

Once P&L (returns_losses) is known, reflect_and_remember() is triggered:

def reflect_and_remember(self, returns_losses):
    # The LLM reviews each role's decision: was it right or wrong, and why?
    self.reflector.reflect_bull_researcher(self.curr_state, returns_losses, self.bull_memory)
    self.reflector.reflect_bear_researcher(self.curr_state, returns_losses, self.bear_memory)
    self.reflector.reflect_trader(self.curr_state, returns_losses, self.trader_memory)
    self.reflector.reflect_invest_judge(self.curr_state, returns_losses, self.invest_judge_memory)
    self.reflector.reflect_risk_manager(self.curr_state, returns_losses, self.risk_manager_memory)

The reflection prompt asks the LLM to:

Judge whether the decision was right or wrong based on P&L
Analyze the causes (market data, news, sentiment, fundamentals)
Propose improvements
Distill a condensed lesson (≤1000 tokens) → stored in memory
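
One reflection step might look like the sketch below. The `llm` argument is any callable from prompt string to text (a stub here), `ListMemory` is a hypothetical stand-in exposing the same add_situations method, and treating returns_losses as a fractional return is an assumption.

```python
from typing import Callable, List, Tuple


class ListMemory:
    """Hypothetical memory exposing the add_situations interface."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def add_situations(self, pairs: List[Tuple[str, str]]) -> None:
        self.entries.extend(pairs)


def reflect(llm: Callable[[str], str], situation: str, decision: str,
            returns_losses: float, memory: ListMemory) -> str:
    outcome = "gain" if returns_losses > 0 else "loss"
    prompt = (
        f"Decision: {decision}\n"
        f"Outcome: {outcome} of {returns_losses:+.2%}\n"
        f"Market situation: {situation}\n"
        "Judge whether the decision was right, explain why, propose "
        "improvements, and condense the lesson."
    )
    lesson = llm(prompt)
    # Store (situation, lesson) so later retrieval can match on the situation.
    memory.add_situations([(situation, lesson)])
    return lesson


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call: echoes the outcome line.
    return "Lesson: " + prompt.splitlines()[1]


trader_memory = ListMemory()
reflect(stub_llm, "rates rising, tech selling off", "BUY", -0.034, trader_memory)
print(trader_memory.entries)
```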

6. Debug Mode

Enable debug:

graph = TradingAgentsGraph(debug=True)

Inside propagate():

if self.debug:
    trace = []
    for chunk in self.graph.stream(init_agent_state, **args):
        chunk["messages"][-1].pretty_print()  # print each step
        trace.append(chunk)

Debug mode uses graph.stream() instead of graph.invoke(), printing the latest message each time the state changes.

7. Two LLM Tiers: Deep vs Quick

LLM                  Default model   Used for
deep_thinking_llm    gpt-5.2         Research Manager (judge), Risk Judge
quick_thinking_llm   gpt-5-mini      Analysts, Researchers, Risk debaters, Reflector, Signal Processor

Business insight: the final decisions (judge roles) use the strongest model, while data gathering and the other stages use a lighter model to save cost.
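
The role-to-tier split can be expressed as a small lookup. This is only a sketch of the idea: `LLMTiers`, `DEEP_ROLES`, and `model_for` are hypothetical names, and the model strings are the defaults quoted in this section.

```python
from dataclasses import dataclass

# Roles assigned to the deep model; everything else gets the quick one.
DEEP_ROLES = frozenset({"Research Manager", "Risk Judge"})


@dataclass(frozen=True)
class LLMTiers:
    deep_model: str = "gpt-5.2"
    quick_model: str = "gpt-5-mini"

    def model_for(self, role: str) -> str:
        return self.deep_model if role in DEEP_ROLES else self.quick_model


tiers = LLMTiers()
for role in ["Market Analyst", "Bull Researcher", "Research Manager", "Risk Judge"]:
    print(f"{role:18s} -> {tiers.model_for(role)}")
```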

8. Why the System Performs Well (Business Perspective)

┌─────────────────────────────────────────────────────┐
│  LAYER 1: INFORMATION (4 independent dimensions)    │
│  Market Tech + Sentiment + News + Fundamentals      │
└──────────────────────┬──────────────────────────────┘
                       │ 4 reports
┌──────────────────────▼──────────────────────────────┐
│  LAYER 2: DEBATE (adversarial reasoning)            │
│  Bull vs Bear → avoids confirmation bias            │
│  Memory: learns from past mistakes                  │
└──────────────────────┬──────────────────────────────┘
                       │ investment_plan
┌──────────────────────▼──────────────────────────────┐
│  LAYER 3: RISK VALIDATION (3 perspectives)          │
│  Aggressive vs Conservative vs Neutral              │
│  → tests the plan against different risk appetites  │
└──────────────────────┬──────────────────────────────┘
                       │ final_trade_decision
┌──────────────────────▼──────────────────────────────┐
│  LAYER 4: EXTRACTION                                │
│  SignalProcessor → BUY / SELL / HOLD                │
└─────────────────────────────────────────────────────┘

The three most important anti-bias mechanisms:

Adversarial debate: forces the LLM to argue against itself rather than settle into agreement
Memory injection: past mistakes are injected into the prompt so they are not repeated
Separation of concerns: Researchers know nothing about Risk, and Risk knows nothing about how data is gathered → less contamination

Weaknesses to keep in mind:

Memory uses BM25 (keyword) rather than semantic search → it can miss similar situations described in different words
Memory lives only within the session (in-memory) and resets on restart
max_debate_rounds=1 by default → only 2 Bull/Bear turns, which may not go deep enough
risk_manager.py:14 assigns fundamentals_report = state["news_report"], which looks like a bug: the Risk Judge would read the news report where the fundamentals report is presumably intended

trading-agent memory-ai