> 📌 **Official channel**: This GitHub repository is the sole official source for GenericAgent. We have no affiliation with any third-party website using the GenericAgent name.
---
## 🌟 Overview
**GenericAgent** is a minimal, self-evolving autonomous agent framework. Its core is just **~3K lines of code**. Through **9 atomic tools + a ~100-line Agent Loop**, it grants any LLM system-level control over a local computer — covering browser, terminal, filesystem, keyboard/mouse input, screen vision, and mobile devices (ADB).
Its design philosophy: **don't preload skills — evolve them.**
Every time GenericAgent solves a new task, it automatically crystallizes the execution path into an skill for direct reuse later. The longer you use it, the more skills accumulate — forming a skill tree that belongs entirely to you, grown from 3K lines of seed code.
> **🤖 Self-Bootstrap Proof** — Everything in this repository, from installing Git and running `git init` to every commit message, was completed autonomously by GenericAgent. The author never opened a terminal once.
## 📋 Core Features
- **Self-Evolving**: Automatically crystallizes each task into an skill. Capabilities grow with every use, forming your personal skill tree.
- **Minimal Architecture**: ~3K lines of core code. Agent Loop is ~100 lines. No complex dependencies, zero deployment overhead.
- **Strong Execution**: Injects into a real browser (preserving login sessions). 9 atomic tools take direct control of the system.
- **High Compatibility**: Supports Claude / Gemini / Kimi / MiniMax and other major models. Cross-platform.
- **Token Efficient**: <30K context window — a fraction of the 200K–1M other agents consume. Layered memory ensures the right knowledge is always in scope. Less noise, fewer hallucinations, higher success rate — at a fraction of the cost.
## 🧬 Self-Evolution Mechanism
This is what fundamentally distinguishes GenericAgent from every other agent framework.
```
[New Task] --> [Autonomous Exploration] (install deps, write scripts, debug & verify) -->
[Crystallize Execution Path into skill] --> [Write to Memory Layer] --> [Direct Recall on Next Similar Task]
```
| What you say | What the agent does the first time | Every time after |
|---|---|---|
| *"Read my WeChat messages"* | Install deps → reverse DB → write read script → save skill | **one-line invoke** |
| *"Monitor stocks and alert me"* | Install mootdx → build selection flow → configure cron → save skill | **one-line start** |
| *"Send this file via Gmail"* | Configure OAuth → write send script → save skill | **ready to use** |
After a few weeks, your agent instance will have a skill tree no one else in the world has — all grown from 3K lines of seed code.
##### 🎯 Demo Showcase
| 🧋 Food Delivery Order | 📈 Quantitative Stock Screening |
|:---:|:---:|
| | |
| *"Order me a milk tea"* — Navigates the delivery app, selects items, and completes checkout automatically. | *"Find GEM stocks with EXPMA golden cross, turnover > 5%"* — Screens stocks with quantitative conditions. |
| 🌐 Autonomous Web Exploration | 💰 Expense Tracking | 💬 Batch Messaging |
| | | |
| Autonomously browses and periodically summarizes web content. | *"Find expenses over ¥2K in the last 3 months"* — Drives Alipay via ADB. | Sends bulk WeChat messages, fully driving the WeChat client. |
## 📅 Latest News
- **2026-04-21:** 📄 [Technical Report released on arXiv](https://arxiv.org/abs/2604.17091) — *GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization*
- **2026-04-11:** Introduced **L4 session archive memory** and scheduler cron integration
- **2026-03-23:** Support personal WeChat as a bot frontend
- **2026-03-10:** [Released million-scale Skill Library](https://mp.weixin.qq.com/s/q2gQ7YvWoiAcwxzaiwpuiQ?scene=1&click_id=7)
- **2026-03-08:** [Released "Dintal Claw" — a GenericAgent-powered government affairs bot](https://mp.weixin.qq.com/s/eiEhwo-j6S-WpLxgBnNxBg)
- **2026-03-01:** [GenericAgent featured by Jiqizhixin (机器之心)](https://mp.weixin.qq.com/s/uVWpTTF5I1yzAENV_qm7yg)
- **2026-01-16:** GenericAgent V1.0 public release
---
## 🚀 Quick Start
#### Method 1: Standard Installation
```bash
# 1. Clone the repo
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
# 2. Install dependencies
pip install requests streamlit pywebview # Desktop GUI (launch.pyw)
pip install requests textual # Terminal UI (tuiapp.py)
# 3. Configure API Key
cp mykey_template.py mykey.py
# Edit mykey.py and fill in your LLM API Key
# 4. Launch
python launch.pyw
```
#### Method 2: uv (for experienced Python users)
If you prefer a modern Python workflow, GenericAgent also provides a minimal `pyproject.toml`:
```bash
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
uv venv
uv pip install -e ".[ui]" # Core + GUI dependencies
cp mykey_template.py mykey.py
python launch.pyw
```
> GenericAgent is meant to grow its environment through the Agent itself, not by pre-installing every possible package.
Full guide: [GETTING_STARTED.md](GETTING_STARTED.md)
---
## 🖥️ Desktop Frontends
### Terminal UI
A lightweight, keyboard-driven interface built on [Textual](https://github.com/Textualize/textual). Supports multiple concurrent sessions, real-time streaming, and runs anywhere a terminal does — no browser needed.
```bash
python frontends/tuiapp.py
```
### Other Desktop Frontends
```bash
python frontends/qtapp.py # Qt-based desktop app
streamlit run frontends/stapp2.py # Alternative Streamlit UI
```
### Codeg
[Codeg](https://github.com/yiqi-017/codeg) (`feat/genericagent-integration` branch) is a desktop/web UI that connects GenericAgent alongside other agents (Claude Code, Gemini, Codex, etc.) in a unified interface with a polished, modern UI.
> This integration is usable now. Some features are still being refined — feedback welcome.
Place your GenericAgent directory alongside the codeg project. Codeg will auto-detect `frontends/genericagent_acp_bridge.py` and launch GenericAgent as a local ACP agent.
---
## 💬 Bot Interface (IM)
### Telegram Bot
```python
# mykey.py
tg_bot_token = 'YOUR_BOT_TOKEN'
tg_allowed_users = [YOUR_USER_ID]
```
```bash
python frontends/tgapp.py
```
### Common Chat Commands
The default Streamlit desktop UI started by `python launch.pyw`, plus the QQ / Telegram / Feishu / WeCom / DingTalk frontends, support these chat commands:
- `/new` - start a fresh conversation and clear the current context
- `/continue` - list recoverable conversation snapshots
- `/continue N` - restore the `N`th recoverable conversation
## 📊 Comparison with Similar Tools
| Feature | GenericAgent | OpenClaw | Claude Code |
|------|:---:|:---:|:---:|
| **Codebase** | ~3K lines | ~530,000 lines | Open-sourced (large) |
| **Deployment** | `pip install` + API Key | Multi-service orchestration | CLI + subscription |
| **Browser Control** | Real browser (session preserved) | Sandbox / headless browser | Via MCP plugin |
| **OS Control** | Mouse/kbd, vision, ADB | Multi-agent delegation | File + terminal |
| **Self-Evolution** | Autonomous skill growth | Plugin ecosystem | Stateless between sessions |
| **Out of the Box** | A few core files + starter skills | Hundreds of modules | Rich CLI toolset |
## 🧠 How It Works
GenericAgent accomplishes complex tasks through **Layered Memory × Minimal Toolset × Autonomous Execution Loop**, continuously accumulating experience during execution.
1️⃣ **Layered Memory System**
> _Memory crystallizes throughout task execution, letting the agent build stable, efficient working patterns over time._
- **L0 — Meta Rules**: Core behavioral rules and system constraints of the agent
- **L1 — Insight Index**: Minimal memory index for fast routing and recall
- **L2 — Global Facts**: Stable knowledge accumulated over long-term operation
- **L3 — Task Skills / SOPs**: Reusable workflows for completing specific task types
- **L4 — Session Archive**: Archived task records distilled from finished sessions for long-horizon recall
2️⃣ **Autonomous Execution Loop**
> _Perceive environment state → Task reasoning → Execute tools → Write experience to memory → Loop_
The entire core loop is just **~100 lines of code** (`agent_loop.py`).
3️⃣ **Minimal Toolset**
> _GenericAgent provides only **9 atomic tools**, forming the foundational capabilities for interacting with the outside world._
| Tool | Function |
|------|------|
| `code_run` | Execute arbitrary code |
| `file_read` | Read files |
| `file_write` | Write files |
| `file_patch` | Patch / modify files |
| `web_scan` | Perceive web content |
| `web_execute_js` | Control browser behavior |
| `ask_user` | Human-in-the-loop confirmation |
> Additionally, 2 **memory management tools** (`update_working_checkpoint`, `start_long_term_update`) allow the agent to persist context and accumulate experience across sessions.
4️⃣ **Capability Extension Mechanism**
> _Capable of dynamically creating new tools._
Via `code_run`, GenericAgent can dynamically install Python packages, write new scripts, call external APIs, or control hardware at runtime — crystallizing temporary abilities into permanent tools.
GenericAgent Workflow Diagram
## ⭐ Support
If this project helped you, please consider leaving a **Star!** 🙏
You're also welcome to join our **GenericAgent Community Group** for discussion, feedback, and co-building 👏
## 🚩 友情链接
感谢 **LinuxDo** 社区的支持!
[](https://linux.do/)
## 📄 许可
MIT License — 详见 [LICENSE](LICENSE)
*声明:本项目未构建任何商业站点;除 DintalClaw 外,目前未官方授权任何机构、组织或个人以 GenericAgent 名义从事商业活动。*
## 📈 Star History
================================================
FILE: TMWebDriver.py
================================================
import json, threading, time, uuid, queue, socket, requests, traceback
from typing import Any
from simple_websocket_server import WebSocketServer, WebSocket
import bottle
from bottle import request
class Session:
def __init__(self, session_id, info, client=None):
self.id = session_id
self.info = info
self.connect_at = time.time()
self.disconnect_at = None
self.type = info.get('type', 'ws')
self.ws_client = client if self.type in ('ws', 'ext_ws') else None
self.http_queue = client if self.type == 'http' else None
@property
def url(self): return self.info.get('url', '')
def is_active(self):
if self.type == 'http' and time.time() - self.connect_at > 60: self.mark_disconnected()
return self.disconnect_at is None
def reconnect(self, client, info):
self.info = info
self.type = info.get('type', 'ws')
if self.type in ('ws', 'ext_ws'):
self.ws_client = client
self.http_queue = None
elif self.type == 'http':
self.http_queue = client
self.connect_at = time.time()
self.disconnect_at = None
def mark_disconnected(self):
if self.is_active(): print(f"Tab disconnected: {self.url} (Session: {self.id})")
self.disconnect_at = time.time()
class TMWebDriver:
def __init__(self, host: str = '127.0.0.1', port: int = 18765):
self.host, self.port = host, port
self.sessions, self.results, self.acks = {}, {}, {}
self.default_session_id = None
self.latest_session_id = None
self.is_remote = socket.socket().connect_ex((host, port+1)) == 0
if not self.is_remote:
self.start_ws_server()
self.start_http_server()
else:
self.remote = f'http://{self.host}:{self.port+1}/link'
def start_http_server(self):
self.app = app = bottle.Bottle()
@app.route('/api/longpoll', method=['GET', 'POST'])
def long_poll():
data = request.json
session_id = data.get('sessionId')
session_info = {'url': data.get('url'), 'title': data.get('title', ''), 'type': 'http'}
if session_id not in self.sessions:
session = Session(session_id, session_info, queue.Queue())
print(f"Browser http connected: {session.url} (Session: {session_id})")
self.sessions[session_id] = session
session = self.sessions[session_id]
if session.disconnect_at is not None and session.type != 'http': session.reconnect(queue.Queue(), session_info)
session.disconnect_at = None
if session.type == 'http': msgQ = session.http_queue
else: return json.dumps({"id": "", "ret": "use ws"})
session.connect_at = start_time = time.time()
while time.time() - start_time < 5:
try:
msg = msgQ.get(timeout=0.2)
try: self.acks[json.loads(msg).get('id','')] = True
except Exception: traceback.print_exc()
return msg
except queue.Empty: continue
return json.dumps({"id": "", "ret": "next long-poll"})
@app.route('/api/result', method=['GET','POST'])
def result():
data = request.json
if data.get('type') == 'result':
self.results[data.get('id')] = {'success': True, 'data': data.get('result'), 'newTabs': data.get('newTabs', [])}
elif data.get('type') == 'error':
self.results[data.get('id')] = {'success': False, 'data': data.get('error'), 'newTabs': data.get('newTabs', [])}
return 'ok'
@app.route('/link', method=['GET','POST'])
def link():
data = request.json
if data.get('cmd') == 'get_all_sessions': return json.dumps({'r': self.get_all_sessions()}, ensure_ascii=False)
if data.get('cmd') == 'find_session':
url_pattern = data.get('url_pattern', '')
return json.dumps({'r': self.find_session(url_pattern)}, ensure_ascii=False)
if data.get('cmd') == 'execute_js':
session_id = data.get('sessionId')
code = data.get('code')
timeout = float(data.get('timeout', 10.0))
try:
result = self.execute_js(code, timeout=timeout, session_id=session_id)
print('[remote result]', (str(code)[:50] + ' RESULT:' +str(result)[:50]).replace('\n', ' '))
return json.dumps({'r': result}, ensure_ascii=False)
except Exception as e:
return json.dumps({'r': {'error': str(e)}}, ensure_ascii=False)
return 'ok'
def run():
from wsgiref.simple_server import make_server, WSGIServer, WSGIRequestHandler
from socketserver import ThreadingMixIn
class _T(ThreadingMixIn, WSGIServer): pass
class _H(WSGIRequestHandler):
def log_request(self, *a): pass
make_server(self.host, self.port+1, app, server_class=_T, handler_class=_H).serve_forever()
http_thread = threading.Thread(target=run, daemon=True)
http_thread.start()
def clean_sessions(self):
sids = list(self.sessions.keys())
for sid in sids:
session = self.sessions[sid]
if not session.is_active() and time.time() - session.disconnect_at > 600:
del self.sessions[sid]
def start_ws_server(self) -> None:
driver = self
class JSExecutor(WebSocket):
def handle(self) -> None:
try:
data = json.loads(self.data)
if data.get('type') == 'ready':
session_id = data.get('sessionId')
session_info = {'url': data.get('url'), 'title': data.get('title', ''),
'connected_at': time.time(), 'type': 'ws'}
driver._register_client(session_id, self, session_info)
elif data.get('type') in ['ext_ready', 'tabs_update']:
tabs = data.get('tabs', [])
current_tab_ids = {str(tab['id']) for tab in tabs}
print(f"Received tabs update: {current_tab_ids}")
for sid in list(driver.sessions.keys()):
sess = driver.sessions[sid]
if sess.type == 'ext_ws' and sid not in current_tab_ids:
sess.mark_disconnected()
for tab in tabs:
session_id = str(tab['id'])
session_info = {'url': tab.get('url'), 'title': tab.get('title', ''), 'connected_at': time.time(), 'type': 'ext_ws'}
sess = driver.sessions.get(session_id)
if sess and sess.is_active(): sess.info = session_info
else: driver._register_client(session_id, self, session_info)
elif data.get('type') == 'ack': driver.acks[data.get('id','')] = True
elif data.get('type') == 'result':
driver.results[data.get('id')] = {'success': True, 'data': data.get('result'), 'newTabs': data.get('newTabs', [])}
elif data.get('type') == 'error':
driver.results[data.get('id')] = {'success': False, 'data': data.get('error'), 'newTabs': data.get('newTabs', [])}
except Exception as e:
print(f"Error handling message: {e}")
if hasattr(self, 'data'): print(self.data)
def connected(self): (f"New connection from {self.address}")
def handle_close(self):
print(f"WS Connection closed: {self.address}")
driver._unregister_client(self)
self.server = WebSocketServer(self.host, self.port, JSExecutor)
server_thread = threading.Thread(target=self.server.serve_forever)
server_thread.daemon = True
server_thread.start()
print(f"WebSocket server running on ws://{self.host}:{self.port}")
def _register_client(self, session_id: str, client: WebSocket, session_info) -> None:
is_new_session = session_id not in self.sessions
if is_new_session:
session = Session(session_id, session_info, client)
self.sessions[session_id] = session
print(f"New tab connected: {session.url} (Session: {session_id})")
else:
session = self.sessions[session_id]
session.reconnect(client, session_info)
print(f"Tab reconnected: {session.url} (Session: {session_id})")
self.latest_session_id = session_id
if self.default_session_id is None: self.default_session_id = session_id
def _unregister_client(self, client: WebSocket) -> None:
for session in self.sessions.values():
if session.ws_client == client: session.mark_disconnected()
def execute_js(self, code, timeout=15, session_id=None) -> Any:
if session_id is None: session_id = self.default_session_id
if self.is_remote:
print('remote_execute_js')
response = self._remote_cmd({"cmd": "execute_js", "sessionId": session_id,
"code": code, "timeout": str(timeout)}).get('r', {})
if response.get('error'): raise Exception(response['error'])
return response
session = self.sessions.get(session_id)
if not session or not session.is_active():
time.sleep(3)
session = self.sessions.get(session_id)
if not session or not session.is_active():
alive_sessions = [s for s in self.sessions.values() if s.is_active()]
if alive_sessions:
session = alive_sessions[0]
print(f"会话 {session_id} 未连接,自动切换到最新活动会话: {session.id}")
session_id = self.default_session_id = session.id
if not session or not session.is_active():
raise ValueError(f"会话ID {session_id} 未连接")
tp = session.type
if tp not in ('ws', 'http', 'ext_ws'):
raise ValueError(f"Unsupported session type: {tp}")
exec_id = str(uuid.uuid4())
payload_dict = {'id': exec_id, 'code': code}
if tp == 'ext_ws': payload_dict['tabId'] = int(session.id)
payload = json.dumps(payload_dict)
if tp in ['ws', 'ext_ws']: session.ws_client.send_message(payload)
elif tp == 'http': session.http_queue.put(payload)
start_time = time.time()
self.clean_sessions()
hasjump = acked = False
while exec_id not in self.results:
time.sleep(0.2)
if not acked and exec_id in self.acks:
acked = True; start_time = time.time()
if tp in ['ws', 'ext_ws']:
if not session.is_active(): hasjump = True
if hasjump and session.is_active():
return {'result': f"Session {session_id} reloaded.", "closed":1}
if time.time() - start_time > timeout:
if tp in ['ws', 'ext_ws']:
if hasjump: return {'result': f"Session {session_id} reloaded and new page is loading...", 'closed':1}
if acked: return {"result": f"No response data in {timeout}s (ACK received, script may still be running)"}
return {"result": f"No response data in {timeout}s (no ACK, script may not have been delivered)"}
elif tp == 'http':
if acked: return {"result": f"Session {session_id} no response in {timeout}s (delivered but no result)"}
return {"result": f"Session {session_id} no response in {timeout}s (script not polled)"}
result = self.results.pop(exec_id)
if exec_id in self.acks: self.acks.pop(exec_id)
if not result['success']: raise Exception(result['data'])
rr = {'data': result['data']}
newtabs = result.get('newTabs', []); [x.pop('ts', None) for x in newtabs]
if newtabs: rr['newTabs'] = newtabs
return rr
def _remote_cmd(self, cmd):
try: return requests.post(self.remote, headers={"Content-Type": "application/json"}, json=cmd, timeout=30).json()
except (ConnectionError, requests.exceptions.ConnectionError):
raise ConnectionError("TMWebDriver master未运行,看tmwebdriver_sop启动master")
def get_all_sessions(self):
if self.is_remote:
return self._remote_cmd({"cmd": "get_all_sessions"}).get('r', [])
return [{'id': session.id, **session.info} for session in self.sessions.values()
if session.is_active()]
def get_session_dict(self):
return {session['id']: session['url'] for session in self.get_all_sessions()}
def find_session(self, url_pattern: str):
if url_pattern == '':
session = self.sessions.get(self.latest_session_id)
return [(session.id, session.info)] if session else []
matching_sessions = []
for session in self.sessions.values():
if not session.is_active(): continue
if 'url' in session.info and url_pattern in session.info['url']:
matching_sessions.append((session.id, session.info))
return matching_sessions
def set_session(self, url_pattern: str) -> bool:
if self.is_remote:
matched = self._remote_cmd({"cmd": "find_session", "url_pattern": url_pattern}).get('r', [])
else:
matched = self.find_session(url_pattern)
if not matched: return print(f"警告: 未找到URL包含 '{url_pattern}' 的会话")
if len(matched) > 1: print(f"警告: 找到多个URL包含 '{url_pattern}' 的会话,选择第一个")
self.default_session_id, info = matched[0]
print(f"成功设置默认会话: {self.default_session_id}: {info['url']}")
return self.default_session_id
def jump(self, url, timeout=10): self.execute_js(f"window.location.href='{url}'", timeout=timeout)
if __name__ == "__main__":
driver = TMWebDriver(host='127.0.0.1', port=18765)
================================================
FILE: agent_loop.py
================================================
import json, re, os
from dataclasses import dataclass
from typing import Any, Optional
@dataclass
class StepOutcome:
data: Any
next_prompt: Optional[str] = None
should_exit: bool = False
def try_call_generator(func, *args, **kwargs):
ret = func(*args, **kwargs)
if hasattr(ret, '__iter__') and not isinstance(ret, (str, bytes, dict, list)): ret = yield from ret
return ret
class BaseHandler:
def tool_before_callback(self, tool_name, args, response): pass
def tool_after_callback(self, tool_name, args, response, ret): pass
def turn_end_callback(self, response, tool_calls, tool_results, turn, next_prompt, exit_reason): return next_prompt
def dispatch(self, tool_name, args, response, index=0):
method_name = f"do_{tool_name}"
if hasattr(self, method_name):
args['_index'] = index
prer = yield from try_call_generator(self.tool_before_callback, tool_name, args, response)
ret = yield from try_call_generator(getattr(self, method_name), args, response)
_ = yield from try_call_generator(self.tool_after_callback, tool_name, args, response, ret)
return ret
elif tool_name == 'bad_json': return StepOutcome(None, next_prompt=args.get('msg', 'bad_json'), should_exit=False)
else:
yield f"未知工具: {tool_name}\n"
return StepOutcome(None, next_prompt=f"未知工具 {tool_name}", should_exit=False)
def json_default(o): return list(o) if isinstance(o, set) else str(o)
def exhaust(g):
try:
while True: next(g)
except StopIteration as e: return e.value
def get_pretty_json(data):
if isinstance(data, dict) and "script" in data:
data = data.copy(); data["script"] = data["script"].replace("; ", ";\n ")
return json.dumps(data, indent=2, ensure_ascii=False).replace('\\n', '\n')
def agent_runner_loop(client, system_prompt, user_input, handler, tools_schema, max_turns=40, verbose=True, initial_user_content=None):
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": initial_user_content if initial_user_content is not None else user_input}
]
turn = 0; handler.max_turns = max_turns
while turn < handler.max_turns:
turn += 1; turnstr = f'LLM Running (Turn {turn}) ...'
if handler.parent.task_dir: turnstr = f'Turn {turn} ...'
if verbose: turnstr = f'**{turnstr}**'
yield f"\n\n{turnstr}\n\n"
if turn%10 == 0: client.last_tools = '' # 每10轮重置一次工具描述,避免上下文过大导致的模型性能下降
response_gen = client.chat(messages=messages, tools=tools_schema)
if verbose:
response = yield from response_gen
yield '\n\n'
else:
response = exhaust(response_gen)
cleaned = _clean_content(response.content)
if cleaned: yield cleaned + '\n'
if not response.tool_calls: tool_calls = [{'tool_name': 'no_tool', 'args': {}}]
else: tool_calls = [{'tool_name': tc.function.name, 'args': json.loads(tc.function.arguments), 'id': tc.id}
for tc in response.tool_calls]
tool_results = []; next_prompts = set(); exit_reason = {}
for ii, tc in enumerate(tool_calls):
tool_name, args, tid = tc['tool_name'], tc['args'], tc.get('id', '')
if tool_name == 'no_tool': pass
else:
if verbose: yield f"🛠️ Tool: `{tool_name}` 📥 args:\n````text\n{get_pretty_json(args)}\n````\n"
else: yield f"🛠️ {tool_name}({_compact_tool_args(tool_name, args)})\n\n\n"
handler.current_turn = turn
gen = handler.dispatch(tool_name, args, response, index=ii)
try:
v = next(gen)
def proxy(): yield v; return (yield from gen)
if verbose: yield '`````\n'
outcome = (yield from proxy()) if verbose else exhaust(proxy())
if verbose: yield '`````\n'
except StopIteration as e: outcome = e.value
if outcome.should_exit:
exit_reason = {'result': 'EXITED', 'data': outcome.data}; break
if not outcome.next_prompt:
exit_reason = {'result': 'CURRENT_TASK_DONE', 'data': outcome.data}; break
if outcome.next_prompt.startswith('未知工具'): client.last_tools = ''
if outcome.data is not None and tool_name != 'no_tool':
datastr = json.dumps(outcome.data, ensure_ascii=False, default=json_default) if type(outcome.data) in [dict, list] else str(outcome.data)
tool_results.append({'tool_use_id': tid, 'content': datastr})
next_prompts.add(outcome.next_prompt)
if len(next_prompts) == 0 or exit_reason:
if len(handler._done_hooks) == 0 or exit_reason.get('result', '') == 'EXITED': break
next_prompts.add(handler._done_hooks.pop(0))
next_prompt = handler.turn_end_callback(response, tool_calls, tool_results, turn, '\n'.join(next_prompts), exit_reason)
messages = [{"role": "user", "content": next_prompt, "tool_results": tool_results}] # just new message, history is kept in *Session
if exit_reason: handler.turn_end_callback(response, tool_calls, tool_results, turn, '', exit_reason)
return exit_reason or {'result': 'MAX_TURNS_EXCEEDED'}
def _clean_content(text):
if not text: return ''
def _shrink_code(m):
lines = m.group(0).split('\n')
lang = lines[0].replace('```','').strip()
body = [l for l in lines[1:-1] if l.strip()]
if len(body) <= 6: return m.group(0)
preview = '\n'.join(body[:5])
return f'```{lang}\n{preview}\n ... ({len(body)} lines)\n```'
text = re.sub(r'```[\s\S]*?```', _shrink_code, text)
for p in [r'[\s\S]*?', r'[\s\S]*?', r'(\r?\n){3,}']:
text = re.sub(p, '\n\n' if '\\n' in p else '', text)
return text.strip()
def _compact_tool_args(name, args):
a = {k: v for k, v in args.items() if k != '_index'}
for k in ('path',):
if k in a: a[k] = os.path.basename(a[k])
if name == 'update_working_checkpoint': s = a.get('key_info', ''); return (s[:60]+'...') if len(s)>60 else s
if name == 'ask_user':
q = str(a.get('question', ''))
cs = a.get('candidates') or []
if cs: q += '\ncandidates:\n' + '\n'.join(f'- {c}' for c in cs)
return q
s = json.dumps(a, ensure_ascii=False); return (s[:120]+'...') if len(s)>120 else s
================================================
FILE: agentmain.py
================================================
import os, sys, threading, queue, time, json, re, random, locale
os.environ.setdefault('GA_LANG', 'zh' if any(k in (locale.getlocale()[0] or '').lower() for k in ('zh', 'chinese')) else 'en')
if sys.stdout is None: sys.stdout = open(os.devnull, "w")
elif hasattr(sys.stdout, 'reconfigure'): sys.stdout.reconfigure(errors='replace')
if sys.stderr is None: sys.stderr = open(os.devnull, "w")
elif hasattr(sys.stderr, 'reconfigure'): sys.stderr.reconfigure(errors='replace')
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from llmcore import reload_mykeys, LLMSession, ToolClient, ClaudeSession, MixinSession, NativeToolClient, NativeClaudeSession, NativeOAISession, resolve_client
from agent_loop import agent_runner_loop
from ga import GenericAgentHandler, smart_format, get_global_memory, format_error, consume_file
script_dir = os.path.dirname(os.path.abspath(__file__))
def load_tool_schema(suffix=''):
global TOOLS_SCHEMA
TS = open(os.path.join(script_dir, f'assets/tools_schema{suffix}.json'), 'r', encoding='utf-8').read()
TOOLS_SCHEMA = json.loads(TS if os.name == 'nt' else TS.replace('powershell', 'bash'))
load_tool_schema()
lang_suffix = '_en' if os.environ.get('GA_LANG', '') == 'en' else ''
mem_dir = os.path.join(script_dir, 'memory')
if not os.path.exists(mem_dir): os.makedirs(mem_dir)
mem_txt = os.path.join(mem_dir, 'global_mem.txt')
if not os.path.exists(mem_txt): open(mem_txt, 'w', encoding='utf-8').write('# [Global Memory - L2]\n')
mem_insight = os.path.join(mem_dir, 'global_mem_insight.txt')
if not os.path.exists(mem_insight):
t = os.path.join(script_dir, f'assets/global_mem_insight_template{lang_suffix}.txt')
open(mem_insight, 'w', encoding='utf-8').write(open(t, encoding='utf-8').read() if os.path.exists(t) else '')
cdp_cfg = os.path.join(script_dir, 'assets/tmwd_cdp_bridge/config.js')
if not os.path.exists(cdp_cfg):
try:
os.makedirs(os.path.dirname(cdp_cfg), exist_ok=True)
open(cdp_cfg, 'w', encoding='utf-8').write(f"const TID = '__ljq_{hex(random.randint(0, 99999999))[2:8]}';")
except Exception as e: print(f'[WARN] CDP config init failed: {e} — advanced web features (tmwebdriver) will be unavailable.')
def get_system_prompt():
with open(os.path.join(script_dir, f'assets/sys_prompt{lang_suffix}.txt'), 'r', encoding='utf-8') as f: prompt = f.read()
prompt += f"\nToday: {time.strftime('%Y-%m-%d %a')}\n"
prompt += get_global_memory()
return prompt
class GenericAgent:
def __init__(self):
os.makedirs(os.path.join(script_dir, 'temp'), exist_ok=True)
self.lock = threading.Lock()
self.task_dir = None
self.history = []; self.handler = None;
self.task_queue = queue.Queue()
self.is_running = False; self.stop_sig = False
self.llm_no = 0; self.inc_out = False; self.verbose = True
self.peer_hint = True
self.log_path = os.path.join(script_dir, f'temp/model_responses/model_responses_{int(time.time()*1e6)%1000000:06d}.txt')
self.load_llm_sessions()
def load_llm_sessions(self):
mykeys, changed = reload_mykeys()
if not changed and hasattr(self, 'llmclients'): return
try: oldhistory = self.llmclient.backend.history
except: oldhistory = None
llm_sessions = []
for k, cfg in mykeys.items():
if not any(x in k for x in ['api', 'config', 'cookie']): continue
try:
if 'mixin' in k: llm_sessions += [{'mixin_cfg': cfg}]
elif c := resolve_client(k): llm_sessions += [c]
except: pass
for i, s in enumerate(llm_sessions):
if isinstance(s, dict) and 'mixin_cfg' in s:
try:
mixin = MixinSession(llm_sessions, s['mixin_cfg'])
if isinstance(mixin._sessions[0], (NativeClaudeSession, NativeOAISession)): llm_sessions[i] = NativeToolClient(mixin)
else: llm_sessions[i] = ToolClient(mixin)
except Exception as e: print(f'\n\n\n[ERROR] Failed to init MixinSession with cfg {s["mixin_cfg"]}: {e}!!!\n\n')
self.llmclients = llm_sessions
self.llmclient = self.llmclients[self.llm_no%len(self.llmclients)]
if oldhistory: self.llmclient.backend.history = oldhistory
def next_llm(self, n=-1):
self.load_llm_sessions()
self.llm_no = ((self.llm_no + 1) if n < 0 else n) % len(self.llmclients)
lastc = self.llmclient
self.llmclient = self.llmclients[self.llm_no]
try: self.llmclient.backend.history = lastc.backend.history
except: raise Exception('[ERROR] BAD Mixin config: Check your mykey.py')
self.llmclient.last_tools = ''
name = self.get_llm_name(model=True)
if 'glm' in name or 'minimax' in name or 'kimi' in name: load_tool_schema('_cn')
else: load_tool_schema()
def list_llms(self):
self.load_llm_sessions()
return [(i, self.get_llm_name(b), i == self.llm_no) for i, b in enumerate(self.llmclients)]
def get_llm_name(self, b=None, model=False):
b = self.llmclient if b is None else b
if isinstance(b, dict): return 'BADCONFIG_MIXIN'
if model: return b.backend.model.lower()
return f"{type(b.backend).__name__}/{b.backend.name}"
def abort(self):
if not self.is_running: return
print('Abort current task...')
self.stop_sig = True
if self.handler is not None: self.handler.code_stop_signal.append(1)
def put_task(self, query, source="user", images=None):
display_queue = queue.Queue()
self.task_queue.put({"query": query, "source": source, "images": images or [], "output": display_queue})
return display_queue
# i know it is dangerous, but raw_query is dangerous enough it doesn't enlarge
def _handle_slash_cmd(self, raw_query, display_queue):
if not raw_query.startswith('/'): return raw_query
if _sm := re.match(r'/session\.(\w+)=(.*)', raw_query.strip()):
k, v = _sm.group(1), _sm.group(2)
vfile = os.path.join(script_dir, 'temp', v)
if os.path.isfile(vfile): v = open(vfile, encoding='utf-8').read().strip()
try: v = json.loads(v) # cover number parsing
except (json.JSONDecodeError, ValueError): pass
setattr(self.llmclient.backend, k, v)
display_queue.put({'done': smart_format(f"✅ session.{k} = {repr(v)}", max_str_len=500), 'source': 'system'})
return None
if raw_query.strip() == '/resume':
return r'帮我看看最近有哪些会话可以恢复。读model_responses/目录,按修改时间取最近10个文件,从每个文件里找最后一个...块,用一句话总结每个会话在聊什么,列表给我选。注意读文件后要把字面的\n替换成真换行才能正确匹配。'
return raw_query
def run(self):
while True:
task = self.task_queue.get()
raw_query, source, display_queue = task["query"], task["source"], task["output"]
raw_query = self._handle_slash_cmd(raw_query, display_queue)
if raw_query is None:
self.task_queue.task_done(); continue
self.is_running = True
rquery = smart_format(raw_query.replace('\n', ' '), max_str_len=200)
self.history.append(f"[USER]: {rquery}")
sys_prompt = get_system_prompt() + getattr(self.llmclient.backend, 'extra_sys_prompt', '')
if self.peer_hint: sys_prompt += f"\n[Peer] 用户提及其他会话/后台任务状态时: temp/model_responses/ (只找近期修改的文件尾部)\n"
handler = GenericAgentHandler(self, self.history, os.path.join(script_dir, 'temp'))
if self.handler and 'key_info' in self.handler.working:
ki = re.sub(r'\n\[SYSTEM\] 此为.*?工作记忆[。\n]*', '', self.handler.working['key_info']) # strip the stale marker
handler.working['key_info'] = ki
handler.working['passed_sessions'] = ps = self.handler.working.get('passed_sessions', 0) + 1
if ps > 0: handler.working['key_info'] += f'\n[SYSTEM] 此为 {ps} 个对话前设置的key_info,若已在新任务,先更新或清除工作记忆。\n'
self.handler = handler # a fresh handler each task, but the **full** history lives in llmclient, so nothing is lost
self.llmclient.log_path = self.log_path
gen = agent_runner_loop(self.llmclient, sys_prompt, raw_query,
handler, TOOLS_SCHEMA, max_turns=70, verbose=self.verbose)
try:
full_resp = ""; last_pos = 0
for chunk in gen:
if consume_file(self.task_dir, '_stop'): self.abort()
if self.stop_sig: break
full_resp += chunk
if len(full_resp) - last_pos > 50 or 'LLM Running' in chunk:
display_queue.put({'next': full_resp[last_pos:] if self.inc_out else full_resp, 'source': source})
last_pos = len(full_resp)
if self.inc_out and last_pos < len(full_resp): display_queue.put({'next': full_resp[last_pos:], 'source': source})
if '' in full_resp: full_resp = full_resp.replace('', '\n\n')
if '' in full_resp: full_resp = re.sub(r'\s*(.*?)\s*', r'\n````\n\n\1\n\n````', full_resp, flags=re.DOTALL)
display_queue.put({'done': full_resp, 'source': source})
self.history = handler.history_info
except Exception as e:
print(f"Backend Error: {format_error(e)}")
display_queue.put({'done': full_resp + f'\n```\n{format_error(e)}\n```', 'source': source})
finally:
if self.stop_sig: print('User aborted the task.')
self.is_running = self.stop_sig = False
self.task_queue.task_done()
if self.handler is not None: self.handler.code_stop_signal.append(1)
GeneraticAgent = GenericAgent
if __name__ == '__main__':
import argparse
from datetime import datetime
parser = argparse.ArgumentParser()
parser.add_argument('--task', metavar='IODIR', help='one-shot task mode (file IO)')
parser.add_argument('--reflect', metavar='SCRIPT', help='reflect mode: load a monitor script; send a task whenever check() triggers')
parser.add_argument('--input', help='prompt')
parser.add_argument('--llm_no', type=int, default=0)
parser.add_argument('--verbose', action='store_true')
parser.add_argument('--nobg', action='store_true')
args = parser.parse_args()
if args.task and not args.nobg:
import subprocess, platform
cmd = [sys.executable, os.path.abspath(__file__)] + [a for a in sys.argv[1:]] + ['--nobg']
d = os.path.join(script_dir, f'temp/{args.task}'); os.makedirs(d, exist_ok=True)
p = subprocess.Popen(cmd, cwd=script_dir,
creationflags=0x08000000 if platform.system() == 'Windows' else 0,
stdout=open(os.path.join(d, 'stdout.log'), 'w', encoding='utf-8'),
stderr=open(os.path.join(d, 'stderr.log'), 'w', encoding='utf-8'))
print(p.pid); sys.exit(0)
agent = GenericAgent()
agent.next_llm(args.llm_no)
agent.verbose = args.verbose
threading.Thread(target=agent.run, daemon=True).start()
if args.task:
agent.peer_hint = False
agent.task_dir = d = os.path.join(script_dir, f'temp/{args.task}'); nround = ''
infile = os.path.join(d, 'input.txt')
if args.input:
os.makedirs(d, exist_ok=True)
import glob; [os.remove(f) for f in glob.glob(os.path.join(d, 'output*.txt'))]
with open(infile, 'w', encoding='utf-8') as f: f.write(args.input)
if (fh := consume_file(d, '_history.json')): agent.llmclient.backend.history = json.loads(fh)
with open(infile, encoding='utf-8') as f: raw = f.read()
while True:
dq = agent.put_task(raw, source='task')
while 'done' not in (item := dq.get(timeout=300)):
if 'next' in item and random.random() < 0.95: # probabilistically persist an intermediate snapshot
with open(f'{d}/output{nround}.txt', 'w', encoding='utf-8') as f: f.write(item.get('next', ''))
with open(f'{d}/output{nround}.txt', 'w', encoding='utf-8') as f: f.write(item['done'] + '\n\n[ROUND END]\n')
consume_file(d, '_stop') # stopped cleanly; avoid interrupting the next reply
for _ in range(300): # wait for reply.txt, 10-minute timeout
time.sleep(2)
if (raw := consume_file(d, 'reply.txt')): break
else: break
nround = nround + 1 if isinstance(nround, int) else 1
elif args.reflect:
agent.peer_hint = False
import importlib.util
spec = importlib.util.spec_from_file_location('reflect_script', args.reflect)
mod = importlib.util.module_from_spec(spec); spec.loader.exec_module(mod)
_mt = os.path.getmtime(args.reflect)
print(f'[Reflect] loaded {args.reflect}')
while True:
if os.path.getmtime(args.reflect) != _mt:
try: spec.loader.exec_module(mod); _mt = os.path.getmtime(args.reflect); print('[Reflect] reloaded')
except Exception as e: print(f'[Reflect] reload error: {e}')
time.sleep(getattr(mod, 'INTERVAL', 5))
try: task = mod.check()
except Exception as e:
print(f'[Reflect] check() error: {e}'); continue
if task and task == '/exit': break
if task is None: continue
print(f'[Reflect] triggered: {task[:80]}')
dq = agent.put_task(task, source='reflect')
try:
while 'done' not in (item := dq.get(timeout=180)): pass
result = item['done']
print(result)
except Exception as e:
if getattr(mod, 'ONCE', False): raise
print(f'[Reflect] drain error: {e}'); result = f'[ERROR] {e}'
log_dir = os.path.join(script_dir, 'temp/reflect_logs'); os.makedirs(log_dir, exist_ok=True)
script_name = os.path.splitext(os.path.basename(args.reflect))[0]
open(os.path.join(log_dir, f'{script_name}_{datetime.now():%Y-%m-%d}.log'), 'a', encoding='utf-8').write(f'[{datetime.now():%m-%d %H:%M}]\n{result}\n\n')
if (on_done := getattr(mod, 'on_done', None)):
try: on_done(result)
except Exception as e: print(f'[Reflect] on_done error: {e}')
if getattr(mod, 'ONCE', False): print('[Reflect] ONCE=True, exiting.'); break
else:
try: import readline
except Exception: pass
agent.inc_out = True
while True:
q = input('> ').strip()
if not q: continue
try:
dq = agent.put_task(q, source='user')
while True:
item = dq.get()
if 'next' in item: print(item['next'], end='', flush=True)
if 'done' in item: print(); break
except KeyboardInterrupt:
agent.abort()
print('\n[Interrupted]')
================================================
FILE: assets/SETUP_FEISHU.md
================================================
# Feishu Agent Setup Guide
> Turn your personal computer into the brain of a Feishu bot, and control it from anywhere through a Feishu conversation.
---
## 📋 Contents
1. [Prerequisites](#prerequisites)
2. [Choosing a Plan](#choosing-a-plan)
3. [Enterprise User Setup](#enterprise-user-setup)
4. [Personal User Setup](#personal-user-setup)
5. [Project Configuration](#project-configuration)
6. [Running and Testing](#running-and-testing)
7. [FAQ](#faq)
---
## Prerequisites
### Required Environment
- Python 3.8+
- A complete copy of this project
- An LLM API key (Claude/OpenAI etc., already configured in `llmcore/mykeys`)
### Install Dependencies
```bash
pip install lark-oapi
```
---
## Choosing a Plan
| Your situation | Recommended plan | Estimated time |
| ------------------ | -------------------------- | --------- |
| Your company already uses Feishu Enterprise | [Enterprise User Setup](#enterprise-user-setup) | 5-10 min |
| Personal user / learning & testing | [Personal User Setup](#personal-user-setup) | 10-15 min |
---
## Enterprise User Setup
> For you if: your company uses Feishu and you can create apps or get admin approval
### Step 1: Create an App
1. Visit the [Feishu Open Platform](https://open.feishu.cn/)
2. Log in with your enterprise Feishu account
3. Click "Create App" in the top-right corner → "Custom App"
4. Fill in the app info:
   - App name: `My Agent Assistant` (customizable)
   - App description: `Personal AI assistant`
   - App icon: optional
### Step 2: Add Bot Capability
1. Open the app details page
2. Choose "Add Features" in the left menu
3. Find "Bot" and click "Add"
4. Configure the bot (defaults are fine)
### Step 3: Configure Permissions
1. Left menu "Permissions" → "API Permissions"
2. Search for and enable the following scopes:
   - `im:message` - read and send direct and group messages
   - `im:message:send_as_bot` - send messages as the app
   - `contact:user.id:readonly` - read user IDs
### Step 4: Get Credentials
1. Left menu "Credentials & Basic Info"
2. Record the following:
   - **App ID**: `cli_xxxxxxxx`
   - **App Secret**: `xxxxxxxxxxxxxxxx`
### Step 5: Publish the App
1. Left menu "Version Management & Release"
2. Click "Create Version"
3. Fill in the version info and submit it for review
4. **Ask your enterprise admin to approve it** (or approve it yourself if you are the admin)
### Step 6: Get Your Open ID
1. Once the app is approved, search for your bot in Feishu
2. Send it any message
3. Run the following snippet to obtain your Open ID:
```python
# Run once to obtain your open_id
import lark_oapi as lark
from lark_oapi.api.im.v1 import *
client = lark.Client.builder().app_id("YOUR_APP_ID").app_secret("YOUR_APP_SECRET").build()
# Listen for messages and print the sender's open_id
def handle(data):
    print(f"Your Open ID: {data.event.sender.sender_id.open_id}")
# ... or simply check the log output of frontends/fsapp.py while it runs
```
---
## Personal User Setup
> For you if: you have no enterprise Feishu account and want to test personally
### Step 1: Create a Test Enterprise
1. Visit the [Feishu Open Platform](https://open.feishu.cn/)
2. Register / log in with your personal phone number
3. Click your avatar in the top-right corner → "Create Test Enterprise"
4. Fill in an enterprise name (e.g. `My Test Workspace`)
5. Once created, you are the **admin** of this test enterprise
### Step 2: Create an App
> Same as the enterprise steps
1. Click "Create App" → "Custom App"
2. Fill in the app info
### Step 3: Add Bot Capability
1. Open the app details page
2. "Add Features" → "Bot" → "Add"
### Step 4: Configure Permissions
1. "Permissions" → "API Permissions"
2. Enable:
   - `im:message`
   - `im:message:send_as_bot`
   - `contact:user.id:readonly`
### Step 5: Get Credentials
1. "Credentials & Basic Info"
2. Copy the **App ID** and **App Secret**
### Step 6: Publish the App (self-approval in a test enterprise)
1. "Version Management & Release" → "Create Version"
2. After submitting, open the [Feishu Admin Console](https://feishu.cn/admin)
3. "Workplace" → "App Review" → approve your app
### Step 7: Use It in the Feishu Client
1. Download the [Feishu client](https://www.feishu.cn/download)
2. Log in with your test-enterprise account
3. Search for the bot you created
4. Start chatting!
---
## Project Configuration
### Configure Feishu Credentials
Edit `mykey.py` in the project root and add:
```python
# Feishu app credentials
fs_app_id = "cli_xxxxxxxxxxxxxxxx"  # replace with your App ID
fs_app_secret = "xxxxxxxxxxxxxxxx"  # replace with your App Secret
# Allow-list of user Open IDs (optional; leave empty to allow everyone)
fs_allowed_users = [
    "ou_xxxxxxxxxxxxxxxxxxxxxxxx",  # your Open ID
]
```
### Verify the LLM Configuration
Make sure an LLM API key is configured in `llmcore/mykeys`:
```python
# Example: Claude API
claude_config = {
    'apikey': 'sk-ant-xxxxx',
    'apibase': 'https://api.anthropic.com',
    'model': 'claude-sonnet-4-20250514'
}
```
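The `fs_allowed_users` list works as a simple allow-list: an empty list admits everyone, otherwise only listed Open IDs get through. A minimal sketch of that semantics (the `is_allowed` helper is illustrative, not the project's actual code):

```python
# Hypothetical allow-list check mirroring the fs_allowed_users semantics:
# an empty list means "allow everyone"; otherwise the sender must be listed.
fs_allowed_users = ["ou_xxxxxxxxxxxxxxxxxxxxxxxx"]

def is_allowed(open_id, allowed=fs_allowed_users):
    return not allowed or open_id in allowed

print(is_allowed("ou_xxxxxxxxxxxxxxxxxxxxxxxx"))  # listed ID  -> True
print(is_allowed("ou_someone_else"))              # not listed -> False
print(is_allowed("ou_anyone", allowed=[]))        # empty list -> True
```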
---
## Running and Testing
### Start the Service
```bash
cd /path/to/pc-agent-loop
python frontends/fsapp.py
```
### Expected Output
```
==================================================
Feishu Agent started (long-connection mode)
App ID: cli_xxxxxxxxxxxxxxxx
Waiting for messages...
==================================================
```
### Test a Conversation
1. Open the Feishu client
2. Find your bot
3. Send: `Hello`
4. Wait for the reply (the first one may take a few seconds)
---
## Available Commands
While chatting with the bot, the following special commands are available:
| Command | Description |
| ---- | ---- |
| `/new` | Start a new conversation and clear the current context |
| `/stop` | Abort the task currently running |
| `/restore <keyword>` | Restore a previous conversation context (searches history by keyword) |
### Command Examples
```
/new                       # clear the conversation and start over
/stop                      # stop the running task
/restore yesterday's task  # restore the conversation whose history contains the keyword
```
### Message Display Notes
- ⏳ means a task is still running
- Messages update in real time; no need to wait for completion
- Very long replies are automatically split into multiple messages
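A front end can route these slash commands before forwarding everything else to the agent. A minimal sketch of such a dispatcher (the `dispatch` helper and its return values are illustrative, not the project's actual implementation):

```python
import re

# Hypothetical dispatcher mirroring the /new, /stop, /restore commands above.
def dispatch(text):
    text = text.strip()
    if text == '/new':
        return ('new', None)            # clear context
    if text == '/stop':
        return ('stop', None)           # abort the running task
    if m := re.match(r'/restore\s+(.+)', text):
        return ('restore', m.group(1))  # keyword to search history for
    return ('chat', text)               # everything else goes to the agent

print(dispatch('/restore yesterday'))   # ('restore', 'yesterday')
```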
---
## FAQ
### Q: "App not published" or "no permission" errors
**A:** Make sure the app has been published and approved by an admin. Test-enterprise users must approve it manually in the admin console.
### Q: No reply after sending a message
**A:** Check:
1. Is `frontends/fsapp.py` running?
2. Are there error logs in the terminal?
3. Is the LLM API key configured correctly?
### Q: "invalid app_id" error
**A:** Check that `fs_app_id` in `mykey.py` was copied correctly (including the `cli_` prefix)
### Q: How do I find my own Open ID?
**A:** Run `frontends/fsapp.py`, send the bot a message, and look for `open_id` in the terminal log
### Q: Can multiple people use it at the same time?
**A:** No. One app holds a single long connection to a single computer. Each user needs to create their own app.
---
## Architecture
```
Your Feishu ←→ Feishu Cloud ←→ long connection ←→ frontends/fsapp.py ←→ Agent ←→ your computer
                                                          ↑
                                                runs on your computer
```
- Messages are forwarded through the Feishu cloud to `frontends/fsapp.py` running on your machine
- The Agent handles the request and replies through the Feishu API
- **Your computer must keep** `frontends/fsapp.py` **running** to respond to messages
---
## Next Steps
- Customize agent behavior: edit `assets/sys_prompt.txt`
- Add new tools: edit `assets/tools_schema.json`
- Inspect logs: watch the terminal output at runtime
---
*Document version: v1.1 | Updated: 2026-03-07*
**v1.1 changes:**
- Added the "Available Commands" section (/new, /stop, /restore)
- Added message display notes (⏳ in-progress marker, real-time updates, etc.)
================================================
FILE: assets/agent_bbs.py
================================================
# agent_bbs.py — minimal agent bulletin board (multi-board edition)
# Start: uvicorn agent_bbs:app --host 0.0.0.0 --port 58800
# Or:    python agent_bbs.py
import sqlite3, uuid, time, json, os
from threading import Lock
from fastapi import FastAPI, HTTPException, Query, Body, UploadFile, File
from fastapi.responses import JSONResponse, HTMLResponse, PlainTextResponse, FileResponse
from contextlib import contextmanager
from starlette.requests import Request
from starlette.responses import Response
from starlette.middleware.base import BaseHTTPMiddleware
# key → board config; edit boards.json to hot-reload and add boards
BOARDS_FILE = "boards.json"
DEFAULT_BOARDS = {"agent-bbs-test": {"name": "default", "db": "agent_bbs.db"}}
BOARDS, BOARDS_MTIME_NS, BOARDS_LOCK = DEFAULT_BOARDS, None, Lock()
def load_boards_if_changed():
global BOARDS, BOARDS_MTIME_NS
with BOARDS_LOCK:
if not os.path.exists(BOARDS_FILE):
json.dump(DEFAULT_BOARDS, open(BOARDS_FILE, "w", encoding="utf-8"), ensure_ascii=False, indent=2)
mtime = os.stat(BOARDS_FILE).st_mtime_ns
if mtime == BOARDS_MTIME_NS: return BOARDS
try:
new = json.load(open(BOARDS_FILE, "r", encoding="utf-8"))
assert isinstance(new, dict) and all(isinstance(v, dict) and "db" in v and "name" in v for v in new.values())
BOARDS, BOARDS_MTIME_NS = new, mtime; init_db()
print(f"[boards] reloaded {len(BOARDS)} boards")
except Exception as e: print(f"[boards] reload failed, keep old config: {e}")
return BOARDS
UPLOAD_DIR = "bbs_files"
os.makedirs(UPLOAD_DIR, exist_ok=True)
app = FastAPI(title="Agent BBS", docs_url=None, redoc_url=None, openapi_url=None)
class ApiKeyMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request: Request, call_next):
key = request.headers.get("x-api-key") or request.query_params.get("key")
board = load_boards_if_changed().get(key)
if not board: return Response("Not Found", status_code=404)
request.state.board = board
return await call_next(request)
app.add_middleware(ApiKeyMiddleware)
HTML_PAGE = """
Agent BBS
Agent BBS
"""
README_TEXT = "Agent BBS API\tAuth: ALL requests require header X-API-Key: or pass ?key= as query parameter.\t1. Register: POST /register body: {\"name\": \"your-agent-name\"}\tResponse: {\"token\": \"xxx\", \"name\": \"your-agent-name\"}\t2. Post: POST /post body: {\"token\": \"xxx\", \"content\": \"your message\"}\tResponse: {\"id\": 1, \"author\": \"your-agent-name\"}\t3. Poll new: GET /poll?since_id=0&limit=50\tReturns posts with id > since_id, ordered by id asc. Keep track of the last id you received, use it as since_id next time.\t4. Query: GET /posts?author=xxx&limit=50\tauthor is optional. Returns posts ordered by id desc. 5. Upload file: POST /file/upload multipart/form-data, form fields: token (your agent token) + file (the file). Requires X-API-Key. Response: {\"ref\": \"a1b2c3/filename.ext\"}. Paste ref into post content to reference the file. 6. Download file: GET /file/{rand_id}/{filename} Requires X-API-Key. e.g. /file/a1b2c3/filename.ext"
@app.get("/readme")
def readme(): return PlainTextResponse(README_TEXT)
@app.get("/", response_class=HTMLResponse)
def index(): return HTML_PAGE
@contextmanager
def get_db(db_path):
conn = sqlite3.connect(db_path)
conn.row_factory = sqlite3.Row
try:
yield conn
conn.commit()
finally: conn.close()
def _db(request): return request.state.board["db"]
def init_db():
for board in BOARDS.values():
with get_db(board["db"]) as db:
db.execute("""CREATE TABLE IF NOT EXISTS users (
token TEXT PRIMARY KEY, name TEXT UNIQUE NOT NULL, created_at REAL)""")
db.execute("""CREATE TABLE IF NOT EXISTS posts (
id INTEGER PRIMARY KEY AUTOINCREMENT, author TEXT NOT NULL,
content TEXT NOT NULL, created_at REAL,
FOREIGN KEY(author) REFERENCES users(name))""")
db.execute("CREATE INDEX IF NOT EXISTS idx_posts_id ON posts(id)")
def verify_token(token, db_path):
with get_db(db_path) as db:
row = db.execute("SELECT name FROM users WHERE token=?", (token,)).fetchone()
if not row: raise HTTPException(401, "invalid token")
return row["name"]
@app.on_event("startup")
def startup(): load_boards_if_changed()
@app.post("/register")
def register(request: Request, name=Body(..., embed=True)):
token = uuid.uuid4().hex[:16]
try:
with get_db(_db(request)) as db:
db.execute("INSERT INTO users VALUES(?,?,?)", (token, name, time.time()))
except sqlite3.IntegrityError:
with get_db(_db(request)) as db:
row = db.execute("SELECT token FROM users WHERE name=?", (name,)).fetchone()
return {"token": row["token"], "name": name}
return {"token": token, "name": name}
@app.post("/post")
def create_post(request: Request, token=Body(...), content=Body(...)):
author = verify_token(token, _db(request))
with get_db(_db(request)) as db:
cur = db.execute("INSERT INTO posts(author,content,created_at) VALUES(?,?,?)",
(author, content, time.time()))
post_id = cur.lastrowid
return {"id": post_id, "author": author}
@app.get("/poll")
def poll(request: Request, since_id=Query(0), limit=Query(50)):
with get_db(_db(request)) as db:
rows = db.execute("SELECT id,author,content,created_at FROM posts WHERE id>? ORDER BY id LIMIT ?",
(since_id, limit)).fetchall()
return [dict(r) for r in rows]
@app.get("/count")
def count_posts(request: Request, author=Query(None)):
with get_db(_db(request)) as db:
q, p = ("SELECT COUNT(*) c FROM posts WHERE author=?", (author,)) if author else ("SELECT COUNT(*) c FROM posts", ())
return {"total": db.execute(q, p).fetchone()["c"]}
@app.get("/authors")
def get_authors(request: Request):
with get_db(_db(request)) as db:
return [r["author"] for r in db.execute("SELECT DISTINCT author FROM posts ORDER BY author").fetchall()]
@app.get("/posts")
def get_posts(request: Request, author=Query(None), limit=Query(50), offset=Query(0)):
with get_db(_db(request)) as db:
if author:
rows = db.execute("SELECT id,author,content,created_at FROM posts WHERE author=? ORDER BY id DESC LIMIT ? OFFSET ?",
(author, limit, offset)).fetchall()
else:
rows = db.execute("SELECT id,author,content,created_at FROM posts ORDER BY id DESC LIMIT ? OFFSET ?",
(limit, offset)).fetchall()
return [dict(r) for r in rows]
@app.post("/file/upload")
def upload_file(request: Request, token=Body(...), file: UploadFile = File(...)):
verify_token(token, _db(request))
rand_id = uuid.uuid4().hex[:6]
safe_name = os.path.basename(file.filename)
dest = os.path.join(UPLOAD_DIR, rand_id)
os.makedirs(dest, exist_ok=True)
with open(os.path.join(dest, safe_name), "wb") as f:
f.write(file.file.read())
return {"ref": f"{rand_id}/{safe_name}"}
@app.get("/file/{rand_id}/{filename}")
def download_file(rand_id: str, filename: str):
path = os.path.join(UPLOAD_DIR, rand_id, os.path.basename(filename))
if not os.path.exists(path):
raise HTTPException(404, "not found")
return FileResponse(path, filename=filename)
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=58800)
================================================
FILE: assets/code_run_header.py
================================================
import sys, os, json, re, time, subprocess
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'memory'))
_r = subprocess.run
def _d(b):
if not b: return ''
if isinstance(b, str): return b
try: return b.decode()
except UnicodeDecodeError: return b.decode('gbk', 'replace')
def _run(*a, **k):
t = k.pop('text', 0) | k.pop('universal_newlines', 0)
enc = k.pop('encoding', None)
k.pop('errors', None)
if enc: t = 1
if t and isinstance(k.get('input'), str):
k['input'] = k['input'].encode()
r = _r(*a, **k)
if t:
if r.stdout is not None: r.stdout = _d(r.stdout)
if r.stderr is not None: r.stderr = _d(r.stderr)
return r
subprocess.run = _run
_Pi = subprocess.Popen.__init__
def _pinit(self, *a, **k):
if os.name == 'nt': k['creationflags'] = (k.get('creationflags') or 0) | 0x08000000
_Pi(self, *a, **k)
subprocess.Popen.__init__ = _pinit
sys.excepthook = lambda t, v, tb: (sys.__excepthook__(t, v, tb), print(f"\n[Agent Hint]: NO GUESSING! You MUST probe first. If missing common package, pip.")) if issubclass(t, (ImportError, AttributeError)) else sys.__excepthook__(t, v, tb)
================================================
FILE: assets/configure_mykey.py
================================================
#!/usr/bin/env python3
"""
GenericAgent — interactive setup wizard (configure.py)
One-stop configuration of LLM models + messaging platforms; auto-generates mykey.py
Usage:
    python configure.py
"""
import os
import sys
import shutil
import json
import urllib.request
import time
from datetime import datetime
# ── ANSI colors ──────────────────────────────────────────────────────────────
C = {
'reset': '\033[0m', 'bold': '\033[1m', 'dim': '\033[2m',
'red': '\033[91m', 'green': '\033[92m', 'yellow': '\033[93m',
'blue': '\033[94m', 'magenta': '\033[95m', 'cyan': '\033[96m', 'white': '\033[97m',
}
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
MYKPY_PATH = os.path.join(PROJECT_ROOT, 'mykey.py')
# ── LLM provider definitions ───────────────────────────────────────────────
LLM_PROVIDERS = [
{
'id': 'deepseek',
'name': 'DeepSeek V4 Flash (推荐首选)',
'desc': '国产开源模型,速度快、性价比高,原生 OAI 协议',
'type': 'native_oai',
'template': {
'name': 'deepseek-flash', 'apikey': 'sk-',
'apibase': 'https://api.deepseek.com', 'model': 'deepseek-v4-flash',
'api_mode': 'chat_completions', 'reasoning_effort': 'high',
},
'key_hint': '在 https://platform.deepseek.com/api_keys 获取',
'model_choices': ['deepseek-v4-flash', 'deepseek-v3-premium'],
},
{
'id': 'openai',
'name': 'OpenAI GPT-5 / o 系列',
'desc': 'OpenAI 官方,支持 GPT-5、o 系列推理模型',
'type': 'native_oai',
'template': {
'name': 'gpt-native', 'apikey': 'sk-',
'apibase': 'https://api.openai.com/v1', 'model': 'gpt-5.4',
'api_mode': 'chat_completions', 'reasoning_effort': 'high',
'max_retries': 3, 'connect_timeout': 10, 'read_timeout': 120,
},
'key_hint': '在 https://platform.openai.com/api-keys 获取',
'model_choices': ['gpt-5.4', 'o4-mini-high', 'o4-mini'],
},
{
'id': 'anthropic',
'name': 'Anthropic Claude 官方直连',
'desc': 'Claude 官方 API,sk-ant- 开头,原生 tool 协议',
'type': 'native_claude',
'template': {
'name': 'anthropic-direct', 'apikey': 'sk-ant-',
'apibase': 'https://api.anthropic.com', 'model': 'claude-opus-4-7',
'thinking_type': 'adaptive', 'max_tokens': 32768, 'temperature': 1,
},
'key_hint': '在 https://console.anthropic.com/ 获取',
'model_choices': ['claude-opus-4-7', 'claude-sonnet-4-6'],
},
{
'id': 'cc_relay',
'name': 'CC Switch 透传 (社区常用)',
'desc': '社区 Claude Code 透传渠道,需要 fake_cc_system_prompt=True',
'type': 'native_claude',
'template': {
'name': 'cc-relay', 'apikey': 'sk-user-',
'apibase': 'https:///claude/office',
'model': 'claude-opus-4-7', 'fake_cc_system_prompt': True,
'thinking_type': 'adaptive',
},
'key_hint': '从你的 CC Switch 服务商获取 apikey 和 apibase',
'model_choices': ['claude-opus-4-7', 'claude-sonnet-4-6'],
'extra_fields': [
{'key': 'apibase', 'label': 'API 地址 (apibase)', 'default': 'https://your-host/claude/office'},
{'key': 'fake_cc_system_prompt', 'label': 'fake_cc_system_prompt', 'type': 'bool', 'default': True},
],
},
{
'id': 'zhipu',
'name': '智谱 GLM (Anthropic 兼容)',
'desc': '智谱 GLM-5.1,走 Anthropic 兼容协议',
'type': 'native_claude',
'template': {
'name': 'zhipu-glm', 'apikey': 'sk-',
'apibase': 'https://open.bigmodel.cn/api/anthropic',
'model': 'GLM-5.1-Cloud', 'fake_cc_system_prompt': False,
'thinking_type': 'adaptive', 'max_retries': 3,
'connect_timeout': 10, 'read_timeout': 180,
},
'key_hint': '在 https://open.bigmodel.cn/usercenter/apikeys 获取',
'model_choices': ['GLM-5.1-Cloud', 'GLM-5.1-Edge'],
},
{
'id': 'minimax',
'name': 'MiniMax (推荐 Anthropic 路径)',
'desc': 'MiniMax M2.7,Anthropic 路径无 标签',
'type': 'native_claude',
'template': {
'name': 'minimax-anthropic', 'apikey': 'eyJh...',
'apibase': 'https://api.minimaxi.com/anthropic',
'model': 'MiniMax-M2.7', 'max_retries': 3,
},
'key_hint': '在 https://platform.minimaxi.com/user-center/basic-information 获取',
'model_choices': ['MiniMax-M2.7', 'MiniMax-M2.5'],
},
{
'id': 'minimax_oai',
'name': 'MiniMax (OpenAI 兼容路径)',
'desc': 'MiniMax M2.7,走 /v1/chat/completions',
'type': 'native_oai',
'template': {
'name': 'minimax-oai', 'apikey': 'eyJh...',
'apibase': 'https://api.minimaxi.com/v1', 'model': 'MiniMax-M2.7',
'context_win': 50000,
},
'key_hint': '在 https://platform.minimaxi.com/user-center/basic-information 获取',
'model_choices': ['MiniMax-M2.7', 'MiniMax-M2.5'],
},
{
'id': 'kimi',
'name': 'Kimi for Coding (Anthropic 兼容)',
'desc': 'Kimi 官方 CC 兼容端点,kimi-for-coding 模型',
'type': 'native_claude',
'template': {
'name': 'kimi-coding', 'apikey': 'sk-kimi-',
'apibase': 'https://api.kimi.com/coding',
'model': 'kimi-for-coding', 'fake_cc_system_prompt': True,
'thinking_type': 'adaptive',
},
'key_hint': '在 https://kimi.com/code 获取 API Key',
'model_choices': ['kimi-for-coding', 'kimi-thinking-plus'],
},
{
'id': 'moonshot_oai',
'name': 'Kimi / Moonshot (OAI 兼容)',
'desc': 'Moonshot OAI 端点,kimi-k2 系列,温度强制 1.0',
'type': 'native_oai',
'template': {
'name': 'kimi-k2', 'apikey': 'sk-',
'apibase': 'https://api.moonshot.cn/v1', 'model': 'kimi-k2-turbo-preview',
},
'key_hint': '在 https://platform.moonshot.cn/ 获取',
'model_choices': ['kimi-k2-turbo-preview', 'kimi-k2'],
},
{
'id': 'openrouter',
'name': 'OpenRouter (多模型中继)',
'desc': '一个 Key 用所有模型,支持 Claude/GPT/Gemini 等',
'type': 'native_oai',
'template': {
'name': 'openrouter', 'apikey': 'sk-or-',
'apibase': 'https://openrouter.ai/api/v1',
'model': 'anthropic/claude-opus-4-7',
'max_retries': 3, 'connect_timeout': 10, 'read_timeout': 120,
},
'key_hint': '在 https://openrouter.ai/keys 获取',
'model_choices': ['anthropic/claude-opus-4-7', 'openai/gpt-5.4'],
},
{
'id': 'crs',
'name': 'CRS 反代 Claude Max',
'desc': 'CRS 协议的反代 Claude,需要 fake_cc_system_prompt=True',
'type': 'native_claude',
'template': {
'name': 'crs-claude-max', 'apikey': 'cr_',
'apibase': 'https:///api',
'model': 'claude-opus-4-7[1m]', 'fake_cc_system_prompt': True,
'thinking_type': 'adaptive', 'max_tokens': 32768,
'max_retries': 3, 'read_timeout': 180,
},
'key_hint': '从你的 CRS 服务商获取 key 和 host',
'model_choices': ['claude-opus-4-7[1m]', 'claude-sonnet-4-6'],
'extra_fields': [
{'key': 'apibase', 'label': 'API 地址 (apibase)', 'default': 'https://your-crs-host/api'},
],
},
{
'id': 'crs_gemini',
'name': 'CRS Gemini Ultra (Antigravity 通道)',
'desc': 'CRS 包装的 Google Antigravity,不支持 SSE 流式,必须 stream=False',
'type': 'native_claude',
'template': {
'name': 'crs-gemini-ultra', 'apikey': 'cr_',
'apibase': 'https:///antigravity/api',
'model': 'claude-opus-4-7-thinking', 'stream': False,
'max_tokens': 32768, 'max_retries': 3, 'read_timeout': 180,
},
'key_hint': '从你的 CRS 服务商获取 Gemini Ultra key 和 host',
'model_choices': ['claude-opus-4-7-thinking', 'claude-opus-4-7[1m]', 'claude-opus-4-7'],
'extra_fields': [
{'key': 'apibase', 'label': 'API 地址 (apibase)', 'default': 'https://your-crs-gemini-host/antigravity/api'},
],
},
]
# ── Messaging platform definitions ──────────────────────────────────────────
PLATFORMS = [
{
'id': 'none',
'name': '不使用消息平台(纯终端 REPL)',
'desc': '直接用 python agentmain.py 在终端交互',
'deps': [],
},
{
'id': 'telegram',
'name': 'Telegram 机器人',
'desc': '通过 Telegram Bot 与 Agent 对话',
'file': 'frontends/tgapp.py',
'deps': ['python-telegram-bot'],
'env_vars': [
{'key': 'tg_bot_token', 'label': 'Bot Token', 'hint': '从 @BotFather 获取'},
{'key': 'tg_allowed_users', 'label': '允许的用户 ID(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
{
'id': 'qq',
'name': 'QQ 机器人',
'desc': '通过 QQ 官方机器人 API 接入',
'file': 'frontends/qqapp.py',
'deps': ['qq-botpy'],
'env_vars': [
{'key': 'qq_app_id', 'label': 'App ID', 'hint': 'QQ 开放平台获取'},
{'key': 'qq_app_secret', 'label': 'App Secret'},
{'key': 'qq_allowed_users', 'label': '允许的用户 OpenID(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
{
'id': 'feishu',
'name': '飞书机器人',
'desc': '通过飞书应用与 Agent 对话',
'file': 'frontends/fsapp.py',
'deps': ['lark-oapi'],
'env_vars': [
{'key': 'fs_app_id', 'label': 'App ID', 'hint': '飞书开放平台获取'},
{'key': 'fs_app_secret', 'label': 'App Secret'},
{'key': 'fs_allowed_users', 'label': '允许的用户(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
{
'id': 'wecom',
'name': '企业微信机器人',
'desc': '通过企业微信 Bot 接入',
'file': 'frontends/wecomapp.py',
'deps': ['wecombot'],
'env_vars': [
{'key': 'wecom_bot_id', 'label': 'Bot ID'},
{'key': 'wecom_secret', 'label': 'Bot Secret'},
{'key': 'wecom_allowed_users', 'label': '允许的用户(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
{
'id': 'dingtalk',
'name': '钉钉机器人',
'desc': '通过钉钉应用接入',
'file': 'frontends/dingtalkapp.py',
'deps': ['dingtalk-sdk'],
'env_vars': [
{'key': 'dingtalk_client_id', 'label': 'Client ID (App Key)'},
{'key': 'dingtalk_client_secret', 'label': 'Client Secret (App Secret)'},
{'key': 'dingtalk_allowed_users', 'label': '允许的用户 StaffID(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
{
'id': 'discord',
'name': 'Discord 机器人',
'desc': '通过 Discord Bot 接入',
'file': 'frontends/dcapp.py',
'deps': ['discord.py'],
'env_vars': [
{'key': 'dc_bot_token', 'label': 'Bot Token', 'hint': 'Discord Developer Portal 获取'},
{'key': 'dc_allowed_users', 'label': '允许的用户 ID(逗号分隔, 留空=所有人)', 'default': '[]', 'is_list': True},
],
},
]
def _read_char():
"""Read a single character, cross-platform (use getwch on Windows to avoid CRLF byte-splitting issues)."""
if os.name == 'nt':
import msvcrt
return msvcrt.getwch()
else:
import tty
import termios
fd = sys.stdin.fileno()
old = termios.tcgetattr(fd)
try:
tty.setraw(fd)
return sys.stdin.read(1)
finally:
termios.tcsetattr(fd, termios.TCSADRAIN, old)
def _masked(v, reveal, tail):
"""Build a masked string: first `reveal` chars in plaintext + '*' + last `tail` chars in plaintext"""
if len(v) > reveal + tail:
return v[:reveal] + '*' * min(len(v) - reveal - tail, 8) + v[-tail:]
elif len(v) > reveal:
return v[:reveal] + '*' * (len(v) - reveal)
return v
def masked_input(prompt, reveal=6, tail=4):
"""Masked input: echo a masked rendering after every character; supports both typing and pasting.
prompt must be a single line (no \\n).
"""
sys.stdout.write(prompt)
sys.stdout.flush()
chars = []
def _repaint():
m = _masked(''.join(chars), reveal, tail)
# \r → start of line; write prompt+m; trailing spaces overwrite leftovers from a previously longer render
sys.stdout.write(f'\r{prompt}{m} \r{prompt}{m}')
sys.stdout.flush()
while True:
c = _read_char()
if c in ('\r', '\n'):
break
if c in ('\x03', '\x04'):
raise KeyboardInterrupt
if c in ('\x08', '\x7f'):
if chars:
chars.pop()
_repaint()
elif c.isprintable() or c == ' ':
chars.append(c)
_repaint()
value = ''.join(chars)
_repaint()
sys.stdout.write('\n')
sys.stdout.flush()
return value
# ═══════════════════════════════════════════════════════════════════════════
# UI Helpers
# ═══════════════════════════════════════════════════════════════════════════
def cprint(text, color=None, bold=False, end='\n'):
parts = []
if color: parts.append(C.get(color, ''))
if bold: parts.append(C['bold'])
parts.append(text)
parts.append(C['reset'])
print(''.join(parts), end=end)
def banner():
print('\033[2J\033[H', end='') # ANSI clear screen, cross-platform
print(f"{C['cyan']}{C['bold']}")
print(" ╔═══════════════════════════════════════════════════════════╗")
print(" ║ GenericAgent — 交互式初始化向导 v1.1 ║")
print(" ║ 一键配置 LLM 模型 + 消息平台,自动生成 mykey.py ║")
print(" ╚═══════════════════════════════════════════════════════════╝")
print(f"{C['reset']}")
print(f"{C['dim']} 项目目录: {PROJECT_ROOT}{C['reset']}")
print()
def _check_python():
"""Check the Python version; return (ok, msg)"""
vi = sys.version_info
if vi < (3, 10):
return False, f"Python {vi.major}.{vi.minor} 不满足最低要求 (≥ 3.10)"
if vi >= (3, 14):
return True, f"⚠ Python {vi.major}.{vi.minor} 可能与 pywebview 等依赖不兼容,推荐 3.11/3.12"
return True, f"✓ Python {vi.major}.{vi.minor}.{vi.micro}"
def ask_choice(prompt, choices, allow_multi=False, default=None):
"""Interactive choice; returns selected_id or [selected_ids]"""
print(f"\n{C['bold']}{prompt}{C['reset']}")
if allow_multi:
print(f"{C['dim']} (可多选,输入序号用逗号分隔,如: 1,3,5;输入 a 全选;回车跳过){C['reset']}")
else:
print(f"{C['dim']} (输入序号,如: 1){C['reset']}")
for i, c in enumerate(choices, 1):
desc = c.get('desc', '')
print(f" {C['green']}{i}.{C['reset']} {C['bold']}{c['name']}{C['reset']} {C['dim']}{desc}{C['reset']}")
while True:
raw = input(f"\n {C['yellow']}►{C['reset']} ").strip()
if not raw and default is not None:
return default
if allow_multi:
if raw.lower() == 'a':
return [c['id'] for c in choices]
parts = [p.strip() for p in raw.split(',') if p.strip()]
selected = []
for p in parts:
try:
idx = int(p) - 1
if 0 <= idx < len(choices):
selected.append(choices[idx]['id'])
except ValueError:
pass
if selected:
return selected
else:
try:
idx = int(raw) - 1
if 0 <= idx < len(choices):
return choices[idx]['id']
except ValueError:
pass
print(f" {C['red']}✗ 请输入有效序号{C['reset']}")
def ask_input(prompt, default=None, secret=False, hint=None):
"""Interactive input. Uses masked input when secret=True."""
# Print hints first (kept out of prompt so prompt stays single-line)
if hint:
cprint(f" {hint}", 'dim')
if default is not None:
cprint(f" [默认: {default}]", 'dim')
    # Single-line prompt so '\r' correctly returns to the start of the line
prompt_line = f" {C['yellow']}►{C['reset']} {prompt}: "
while True:
if secret:
val = masked_input(prompt_line).strip()
else:
val = input(prompt_line).strip()
if not val and default is not None:
return default
if val:
return val
cprint("✗ 此项不能为空", 'red')
def ask_yesno(prompt, default=True):
hint = "Y/N"
raw = input(f"\n {C['yellow']}►{C['reset']} {prompt} ({hint}): ").strip().lower()
if not raw:
return default
return raw.startswith('y')
# ═══════════════════════════════════════════════════════════════════════════
# LLM configuration logic
# ═══════════════════════════════════════════════════════════════════════════
def _get_proxy_handler():
    """Read proxy settings from environment variables; returns a ProxyHandler or None."""
for var in ('HTTPS_PROXY', 'https_proxy', 'HTTP_PROXY', 'http_proxy'):
url = os.environ.get(var)
if url:
return urllib.request.ProxyHandler({'https': url, 'http': url})
return None
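# The proxy lookup order above (HTTPS_PROXY, https_proxy, HTTP_PROXY,
# http_proxy; first non-empty value wins) can be expressed over a plain
# dict. _pick_proxy_url is an illustrative sketch, not used by the wizard:
def _pick_proxy_url(env):
    for var in ('HTTPS_PROXY', 'https_proxy', 'HTTP_PROXY', 'http_proxy'):
        if env.get(var):
            return env[var]
    return None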
def probe_models(provider, apikey, apibase=None):
    """Probe the API for available models; returns a list of model IDs or None."""
ptype = provider.get('type', 'native_oai')
base = (apibase or provider['template'].get('apibase', '')).rstrip('/')
if ptype == 'native_claude':
        # Anthropic protocol: try /v1/models (most relays support this path)
url = f"{base}/v1/models"
headers = {'x-api-key': apikey, 'anthropic-version': '2023-06-01'}
else:
url = f"{base}/models"
headers = {'Authorization': f'Bearer {apikey}'}
print(f"\n {C['dim']}🔍 正在探测可用模型 ({url})...{C['reset']}", end='', flush=True)
time.sleep(0.3)
opener = urllib.request.build_opener()
ph = _get_proxy_handler()
if ph:
opener = urllib.request.build_opener(ph)
print(f" {C['dim']}(via proxy){C['reset']}", end='', flush=True)
try:
req = urllib.request.Request(url, headers=headers, method='GET')
with opener.open(req, timeout=8) as resp:
data = json.loads(resp.read().decode())
            # Accept both response shapes: {data: [{id: ...}]} and {object: 'list', data: [...]}
models = data.get('data', [])
ids = sorted(set(m['id'] for m in models if isinstance(m, dict) and m.get('id')))
if ids:
print(f" {C['green']}✓ 发现 {len(ids)} 个模型{C['reset']}")
return ids
print(f" {C['yellow']}⚠ 返回为空{C['reset']}")
return None
except Exception as e:
print(f" {C['yellow']}⚠ 探测失败: {type(e).__name__}(将使用预设列表){C['reset']}")
return None
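# The endpoint/header selection in probe_models reduces to a pure
# function: the Anthropic-style protocol probes {base}/v1/models with an
# x-api-key header, everything else probes {base}/models with a Bearer
# token. _probe_target is a sketch for illustration only:
def _probe_target(ptype, base, apikey):
    base = base.rstrip('/')
    if ptype == 'native_claude':
        return (f"{base}/v1/models",
                {'x-api-key': apikey, 'anthropic-version': '2023-06-01'})
    return (f"{base}/models", {'Authorization': f'Bearer {apikey}'})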
def _normalize_model_choices(choices):
    """Normalize model_choices entries to [{'id': str, 'name': str}]."""
if not choices:
return []
result = []
for item in choices:
if isinstance(item, str):
result.append({'id': item, 'name': item})
elif isinstance(item, dict):
result.append(item)
elif isinstance(item, (tuple, list)) and len(item) >= 1:
result.append({'id': item[0], 'name': item[1] if len(item) > 1 else item[0]})
return result
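# For reference, the normalization contract: plain strings become
# {'id': s, 'name': s}, dicts pass through unchanged, and 1- or 2-element
# tuples/lists map to id/name with the name defaulting to the id.
# _normalize_demo mirrors _normalize_model_choices for standalone
# testing; illustrative only:
def _normalize_demo(choices):
    out = []
    for item in choices or []:
        if isinstance(item, str):
            out.append({'id': item, 'name': item})
        elif isinstance(item, dict):
            out.append(item)
        elif isinstance(item, (tuple, list)) and item:
            out.append({'id': item[0],
                        'name': item[1] if len(item) > 1 else item[0]})
    return out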
def _configure_advanced(provider, cfg):
    """Configure advanced optional fields: proxy, context_win, stream, user_agent, thinking_budget_tokens."""
print(f"\n {C['dim']}── 高级选项(回车跳过,使用默认值){C['reset']}")
proxy = ask_input("HTTP 代理地址 (proxy)", default='', hint='如 http://127.0.0.1:2082,留空跳过')
if proxy:
cfg['proxy'] = proxy
cw = ask_input("上下文窗口阈值 (context_win)", default='', hint='NativeClaude 默认 28000,其他默认 24000')
if cw:
cfg['context_win'] = int(cw)
if cfg.get('thinking_type') == 'enabled':
tbt = ask_input("thinking_budget_tokens", default='', hint='low≈4096, medium≈10240, high≈32768')
if tbt:
cfg['thinking_budget_tokens'] = int(tbt)
    if provider.get('type') == 'native_claude':
ua = ask_input("User-Agent 版本号", default='', hint='某些中转按 UA 白名单校验,pin 老版本用')
if ua:
cfg['user_agent'] = ua
stream_default = cfg.get('stream', True)
if ask_yesno("启用 SSE 流式 (stream)", default=stream_default):
cfg['stream'] = True
else:
cfg['stream'] = False
def configure_llm(provider):
    """Walk the user through configuring a single model."""
print(f"\n{C['cyan']}{'─'*60}{C['reset']}")
print(f"{C['bold']} 配置: {provider['name']}{C['reset']}")
print(f" {C['dim']}{provider['desc']}{C['reset']}")
print(f"{C['cyan']}{'─'*60}{C['reset']}")
cfg = dict(provider['template'])
    # API key (masked input)
cfg['apikey'] = ask_input(
f"API Key",
hint=provider.get('key_hint', ''),
secret=True,
)
    # Extra fields
for field in provider.get('extra_fields', []):
if field['key'] == 'apibase':
cfg['apibase'] = ask_input(
field['label'],
default=field.get('default', cfg.get('apibase', '')),
)
elif field.get('type') == 'bool':
cfg[field['key']] = ask_yesno(
field['label'],
default=field.get('default', True)
)
    # Model selection
model_list = probe_models(provider, cfg['apikey'], cfg.get('apibase'))
if model_list:
refresh_choice = {'id': '__refresh__', 'name': '🔄 重新探测模型列表'}
choices = [refresh_choice] + [{'id': m, 'name': m} for m in model_list]
while True:
picked = ask_choice("API 探测到以下可用模型,请选择:", choices)
if picked == '__refresh__':
print(f" {C['dim']}再次探测...{C['reset']}")
model_list = probe_models(provider, cfg['apikey'], cfg.get('apibase'))
if not model_list:
print(f" {C['yellow']}⚠ 再次探测失败,回退到预设列表{C['reset']}")
picked = _fallback_model(provider)
break
choices = [refresh_choice] + [{'id': m, 'name': m} for m in model_list]
else:
break
cfg['model'] = picked
else:
cfg['model'] = _fallback_model(provider)
    # Alias
default_name = cfg.get('name', provider['id'])
name = ask_input("此配置的别名 (name,Mixin 引用用)", default=default_name)
if name:
cfg['name'] = name
    # Advanced options
if ask_yesno("配置高级选项(proxy / context_win / stream 等)?", default=False):
_configure_advanced(provider, cfg)
return cfg
def _fallback_model(provider):
    """Let the user pick from the preset model list."""
normalized = _normalize_model_choices(provider.get('model_choices', []))
if normalized:
return ask_choice("选择模型:", normalized)
return ask_input("请输入模型名称", default=provider['template'].get('model', ''))
def configure_llms():
    """Configure the LLM models."""
print(f"\n{C['bold']}{C['magenta']}╔══════════════════════════════════════╗")
print(f"║ 第一步: 配置 LLM 模型 ║")
print(f"╚══════════════════════════════════════╝{C['reset']}")
print(f"\n{C['dim']} 你可以配置最多 2 个模型组成故障转移 (Mixin) 列表。{C['reset']}")
all_cfgs = []
provider_id = ask_choice("选择模型厂商 (配置第 1 个模型):", LLM_PROVIDERS)
provider = next(p for p in LLM_PROVIDERS if p['id'] == provider_id)
cfg = configure_llm(provider)
all_cfgs.append(cfg)
if ask_yesno("再添加一个模型做故障转移?", default=False):
providers_ext = [{'id': '__stop__', 'name': '✓ 不需要备选了', 'desc': ''}] + LLM_PROVIDERS
provider_id = ask_choice(
"选择模型厂商 (配置第 2 个模型 — 或选「不需要备选了」跳过):",
providers_ext
)
if provider_id != '__stop__':
provider = next(p for p in LLM_PROVIDERS if p['id'] == provider_id)
cfg = configure_llm(provider)
all_cfgs.append(cfg)
return all_cfgs
# ═══════════════════════════════════════════════════════════════════════════
# Messaging platform configuration logic
# ═══════════════════════════════════════════════════════════════════════════
def configure_platforms():
    """Configure messaging platforms; returns (platform_configs, pip_hints)."""
print(f"\n{C['bold']}{C['magenta']}╔══════════════════════════════════════╗")
print(f"║ 第二步: 配置消息平台 ║")
print(f"╚══════════════════════════════════════╝{C['reset']}")
print(f"\n{C['dim']} 消息平台用于从聊天软件与 Agent 交互。{C['reset']}")
print(f"{C['dim']} 你也可以跳过此步,直接用终端 REPL。{C['reset']}")
platform_ids = ask_choice(
"选择消息平台 (可多选,选 '不使用' 则跳过):",
PLATFORMS,
allow_multi=True,
default=['none']
)
if 'none' in platform_ids:
return [], set()
selected_platforms = []
pip_hints = set()
for pid in platform_ids:
platform = next(p for p in PLATFORMS if p['id'] == pid)
pip_hints.update(platform.get('deps', []))
print(f"\n{C['cyan']}{'─'*60}{C['reset']}")
print(f"{C['bold']} 配置: {platform['name']}{C['reset']}")
print(f"{C['cyan']}{'─'*60}{C['reset']}")
env_vals = {}
        # Feishu QR-scan app creation
if pid == 'feishu' and ask_yesno("使用一键扫码创建应用?(推荐)", default=True):
env_vals = _feishu_scan(platform)
        # Fill in fields the scan did not return (or everything manually if the scan failed)
for var in platform['env_vars']:
if var['key'] not in env_vals:
env_vals.update(_manual_platform_var(var))
        # WeCom-specific: welcome message
if pid == 'wecom' and ask_yesno("设置欢迎消息?", default=False):
env_vals['wecom_welcome_message'] = ask_input("欢迎消息内容", default='你好,我在线上。')
selected_platforms.append({'platform': platform, 'config': env_vals})
return selected_platforms, pip_hints
def _manual_platform_var(var):
    """Prompt for a single platform variable manually."""
val = ask_input(var['label'], hint=var.get('hint', ''), default=var.get('default'))
if var.get('is_list'):
if val == '[]' or not val:
return {var['key']: []}
return {var['key']: [x.strip() for x in val.split(',') if x.strip()]}
return {var['key']: val}
def _feishu_scan(platform):
    """Create a Feishu app via one-click QR scan; returns env_vals or an empty dict."""
try:
import lark_oapi as lark
import qrcode, threading
from io import StringIO
except ImportError:
print(f"\n {C['yellow']}⚠ lark-oapi 未安装,降级为手动配置{C['reset']}")
return {}
print(f"\n {C['cyan']}📱 正在启动一键创建...{C['reset']}")
print(f" {C['dim']} 请用飞书 App 扫描终端二维码,完成授权后自动获取凭据。{C['reset']}\n")
qr_printed = threading.Event()
result_holder = {'data': None}
def handle_qr(info):
url = info['url']
expire = info['expire_in']
qr = qrcode.QRCode(border=1, box_size=1)
qr.add_data(url)
buf = StringIO()
qr.print_ascii(out=buf)
qr_art = buf.getvalue()
print(f"\n {C['bold']}请用飞书扫描下方二维码,或复制链接在浏览器打开:{C['reset']}")
print(f" {C['green']}{qr_art.replace(chr(27), '')}{C['reset']}")
print(f" {C['dim']} 链接: {url}{C['reset']}")
print(f" {C['dim']} 有效期 {expire} 秒{C['reset']}")
qr_printed.set()
def handle_status(info):
status = info['status']
if status == 'polling':
print(f" {C['yellow']}⏳ 等待扫码...{C['reset']}")
elif status == 'slow_down':
print(f" {C['yellow']}⏳ 等待中... (间隔 {info.get('interval', '?')}s){C['reset']}")
elif status == 'domain_switched':
print(f" {C['cyan']}🌐 已切换认证域名{C['reset']}")
def run_register():
try:
result = lark.register_app(
on_qr_code=handle_qr,
on_status_change=handle_status,
)
result_holder['data'] = result
except Exception as e:
print(f"\n {C['red']}✗ 创建失败: {e}{C['reset']}")
thread = threading.Thread(target=run_register, daemon=True)
thread.start()
qr_printed.wait(timeout=15)
thread.join(timeout=300)
if result_holder['data']:
result = result_holder['data']
print(f"\n {C['green']}✅ 应用创建成功!{C['reset']}")
print(f" App ID: {C['bold']}{result['client_id']}{C['reset']}")
print(f" App Secret: {C['bold']}{result['client_secret']}{C['reset']}")
return {
'fs_app_id': result['client_id'],
'fs_app_secret': result['client_secret'],
}
else:
print(f"\n {C['yellow']}⚠ 扫码创建未完成,降级为手动填写...{C['reset']}")
return {}
# ═══════════════════════════════════════════════════════════════════════════
# mykey.py generation
# ═══════════════════════════════════════════════════════════════════════════
def _var_type_info(cfg):
    """Return (var_prefix, session_type) based on the config type."""
cfg_type = cfg.get('type', 'native_oai')
if cfg_type == 'native_claude':
return 'native_claude_config', 'NativeClaudeSession'
elif cfg_type == 'claude':
return 'claude_config', 'ClaudeSession'
elif cfg_type == 'oai':
return 'oai_config', 'LLMSession'
else:
return 'native_oai_config', 'NativeOAISession'
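# The type dispatch above is a fixed table; an equivalent dict-based
# sketch (illustrative only, not used by the generator):
_TYPE_TABLE = {
    'native_claude': ('native_claude_config', 'NativeClaudeSession'),
    'claude': ('claude_config', 'ClaudeSession'),
    'oai': ('oai_config', 'LLMSession'),
    'native_oai': ('native_oai_config', 'NativeOAISession'),
}
def _var_type_info_demo(cfg):
    # Unknown types fall back to the native OpenAI pairing, as above
    return _TYPE_TABLE.get(cfg.get('type', 'native_oai'),
                           _TYPE_TABLE['native_oai'])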
def generate_mykey(llm_cfgs, platform_configs):
    """Generate the contents of mykey.py."""
lines = []
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
lines.append(f"# GenericAgent — mykey.py (由 configure.py 自动生成 @ {datetime.now().strftime('%Y-%m-%d %H:%M')})")
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
lines.append("")
lines.append("# ── 停止符 ──────────────────────────────────────────────────────────────────")
lines.append("_SETUP_DONE = 'configure.py' # 删除此行可重新触发配置向导")
lines.append("")
    # Mixin configuration
names = [c['name'] for c in llm_cfgs]
lines.append("# ── Mixin 故障转移 ──────────────────────────────────────────────────────────")
lines.append("mixin_config = {")
lines.append(f" 'llm_nos': {names},")
lines.append(" 'max_retries': 10,")
lines.append(" 'base_delay': 0.5,")
lines.append("}")
lines.append("")
    # Per-model configs
    # Numeric suffixes when multiple instances share a type
type_counts = {}
for cfg in llm_cfgs:
cfg_type = cfg.get('type', 'native_oai')
type_counts[cfg_type] = type_counts.get(cfg_type, 0) + 1
type_indices = {}
for i, cfg in enumerate(llm_cfgs):
cfg_type = cfg.get('type', 'native_oai')
var_prefix, session_type = _var_type_info(cfg)
idx = type_indices.get(cfg_type, 0)
type_indices[cfg_type] = idx + 1
        # No suffix when the type has a single instance; numeric suffixes otherwise
if type_counts[cfg_type] > 1:
var_name = f"{var_prefix}_{idx}"
else:
var_name = var_prefix
lines.append(f"# ── {cfg['name']} ({session_type}) ─────────────────────────────────────────────")
lines.append(f"{var_name} = {{")
_write_config_fields(lines, cfg)
lines.append("}")
lines.append("")
    # Platform configs
if platform_configs:
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
lines.append("# 聊天平台集成")
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
lines.append("")
for pc in platform_configs:
for key, val in pc['config'].items():
_write_platform_value(lines, key, val)
lines.append("")
    # Footer
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
lines.append("# 配置完毕!运行: python agentmain.py (终端 REPL)")
if platform_configs:
for pc in platform_configs:
p = pc['platform']
lines.append(f"# 或: python {p['file']} ({p['name']})")
lines.append("# ══════════════════════════════════════════════════════════════════════════════")
return '\n'.join(lines)
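# Variable naming when several configs share a type: a lone instance
# keeps the bare prefix, duplicates get 0-based numeric suffixes.
# _suffix_names sketches that rule over a list of prefixes (illustrative
# only, not called by generate_mykey):
def _suffix_names(prefixes):
    counts = {}
    for p in prefixes:
        counts[p] = counts.get(p, 0) + 1
    seen, names = {}, []
    for p in prefixes:
        idx = seen.get(p, 0)
        seen[p] = idx + 1
        names.append(f"{p}_{idx}" if counts[p] > 1 else p)
    return names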
def _write_config_fields(lines, cfg):
    """Append the config dict's key-value pairs (indented "'key': value," format)."""
for key in ['name', 'apikey', 'apibase', 'model', 'api_mode',
'fake_cc_system_prompt', 'thinking_type', 'thinking_budget_tokens',
'reasoning_effort', 'max_tokens', 'max_retries', 'connect_timeout',
'read_timeout', 'temperature', 'context_win',
'proxy', 'user_agent', 'stream']:
if key not in cfg:
continue
val = cfg[key]
if isinstance(val, bool):
lines.append(f" '{key}': {str(val)},")
elif isinstance(val, (int, float)):
lines.append(f" '{key}': {val},")
        elif isinstance(val, str):
            lines.append(f" '{key}': {val!r},")  # repr() escapes embedded quotes
else:
lines.append(f" '{key}': {repr(val)},")
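# Sanity check for the emitter: every formatted line must parse back as
# a Python dict entry. A hardened sketch that uses repr() for all values
# (so quotes inside strings are escaped); illustrative only:
def _format_fields_demo(cfg, keys):
    return [f" '{k}': {cfg[k]!r}," for k in keys if k in cfg]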
def _write_platform_value(lines, key, val):
    """Append a top-level variable (platform config, etc.)."""
if isinstance(val, list):
if val:
lines.append(f"{key} = {repr(val)}")
else:
lines.append(f"{key} = [] # 允许所有用户")
    elif isinstance(val, str):
        lines.append(f"{key} = {val!r}")  # repr() escapes embedded quotes
else:
lines.append(f"{key} = {repr(val)}")
# ═══════════════════════════════════════════════════════════════════════════
# Main
# ═══════════════════════════════════════════════════════════════════════════
def main():
banner()
    # Python version check
ok, msg = _check_python()
if not ok:
print(f" {C['red']}✗ {msg}{C['reset']}")
sys.exit(1)
color = 'yellow' if '⚠' in msg else 'green'
print(f" {C[color]}{msg}{C['reset']}\n")
    # Detect an existing configuration
if os.path.exists(MYKPY_PATH):
print(f" {C['yellow']}⚠ 检测到已有 mykey.py{C['reset']}")
if not ask_yesno("是否重新配置?", default=False):
print(f"\n {C['dim']} 退出。如需重新配置请删除 mykey.py 后重试。{C['reset']}\n")
sys.exit(0)
    # ── Top-level menu ──
scope = ask_choice(
"你想配置什么?",
[
{'id': 'llm', 'name': 'LLM 模型', 'desc': '选择厂商、填写 API Key、探测模型列表'},
{'id': 'platform', 'name': '消息平台 (Telegram/QQ/飞书等)', 'desc': '配置聊天机器人接入'},
{'id': 'both', 'name': '两项都配置 (推荐)', 'desc': 'LLM + 平台,完整初始化'},
],
default='both',
)
llm_cfgs = []
platform_configs = []
platform_deps = set()
    # ── Execute ──
if scope in ('llm', 'both'):
llm_cfgs = _do_llm()
if scope == 'llm':
if ask_yesno("是否继续配置消息平台?", default=True):
platform_configs, platform_deps = configure_platforms()
if scope == 'both':
platform_configs, platform_deps = configure_platforms()
if scope == 'platform':
platform_configs, platform_deps = configure_platforms()
if ask_yesno("是否继续配置 LLM 模型?", default=True):
llm_cfgs = _do_llm()
    # ── Generate mykey.py ──
if not llm_cfgs and not platform_configs:
print(f"\n {C['yellow']}⚠ 没有配置任何内容,退出。{C['reset']}")
sys.exit(0)
content = generate_mykey(llm_cfgs, platform_configs)
    # Back up the old file
if os.path.exists(MYKPY_PATH):
backup = os.path.join(PROJECT_ROOT, f'mykey.py.bak.{datetime.now().strftime("%Y%m%d_%H%M%S")}')
shutil.copy2(MYKPY_PATH, backup)
print(f"\n {C['green']}✓ 旧配置已备份至:{C['reset']} {C['dim']}{backup}{C['reset']}")
    # Write
with open(MYKPY_PATH, 'w', encoding='utf-8') as f:
f.write(content)
print(f"\n {C['green']}✓ mykey.py 已生成!{C['reset']}")
    # ── Completion notice ──
print(f"\n{C['bold']}{C['green']}╔══════════════════════════════════════╗")
print(f"║ 配置完成! ║")
print(f"╚══════════════════════════════════════╝{C['reset']}")
print()
if llm_cfgs:
print(f" {C['cyan']} 终端 REPL:{C['reset']} python agentmain.py")
if platform_configs:
for i, pc in enumerate(platform_configs, 1):
p = pc['platform']
print(f" {C['cyan']} 平台 {i} ({p['name']}):{C['reset']} python {p['file']}")
print()
    # pip dependency hint
all_deps = sorted(platform_deps)
if all_deps:
print(f" {C['yellow']}💡 提示:你需要安装以下依赖以使消息平台正常工作:{C['reset']}")
print(f" {C['cyan']}pip install {' '.join(all_deps)}{C['reset']}")
print()
    # ── Starter examples ──
print(f" {C['bold']}试试这些命令:{C['reset']}")
examples = [
"帮我在桌面创建一个 hello.txt,内容是 Hello World",
"请查看你的代码,安装所有用得上的 python 依赖",
"执行 web setup sop,解锁 web 工具",
"打开淘宝,搜索 iPhone 16,按价格排序",
"用rapidocr配置你的ocr能力并存入记忆",
"git 更新你的代码,然后看看 commit 有什么新功能",
"把这个记到你的记忆里",
]
for ex in examples:
print(f" {C['dim']}{ex}{C['reset']}")
print()
print(f" {C['green']}{C['bold']}合抱之木,生于毫末{C['reset']}\n")
def _do_llm():
    """Configure LLM models; exit if none are configured."""
cfgs = configure_llms()
if not cfgs:
print(f"\n {C['red']}✗ 至少需要配置一个模型才能使用。退出。{C['reset']}")
sys.exit(1)
return cfgs
if __name__ == '__main__':
try:
main()
except KeyboardInterrupt:
print(f"\n\n {C['yellow']}⚠ 用户中断{C['reset']}")
sys.exit(0)
================================================
FILE: assets/global_mem_insight_template.txt
================================================
# [Global Memory Insight]
需要时read L2 或 ls ../memory/ 查L3
L0(META-SOP): memory_management_sop
L2: 现空
L3: memory_cleanup_sop(记忆整理) | skill_search | ui_detect.py | ocr_utils.py | subagent | web_setup_sop | plan_sop
| procmem_scanner | keychain | ljqCtrl_sop+.py | tmwebdriver_sop | autonomous_operation_sop | scheduled_task_sop | vision_sop | adb_ui.py
L4: L4_raw_sessions/ 历史会话
浏览器特殊操作: tmwebdriver_sop(文件上传/图搜/PDF blob/物理坐标/HttpOnly Cookie/autofill突破/跨域iframe/CDP/跨tab)
键鼠: ljqCtrl_sop(禁pyautogui/先activate) 截图/视觉: ocr/vision_sop | 禁全屏截图,优先窗口
定时:scheduled_task_sop | 自主:autonomous_operation_sop | watchdog/反射:agentmain --reflect
手机:adb_ui.py
[RULES]
1. 搜索先行: 搜文件名严禁不用es(禁PS递归/禁dir遍历), 搜索一定优先使用web工具的google(严禁duckduckgo等), 优先看cwd,禁猜路径
2. 交叉验证: 禁信摘要, 数值进详情页核实
3. 编码安全: 禁PS cat/type用file_read; 改前必读; memory模块直接import(已在PATH,禁加虚假前缀)
4. 闭环: 物理模拟后确认; 3次失败请求干预; Git完整闭环
5. 进程: 禁无条件杀python(杀自己), 精确PID, 禁os.kill判活
6. 窗口: GUI状态优先win32gui枚举标题
7. web JS: 输入用原生setter+事件链, 点击前检disabled, 注意引号转义; scan空/不全先稍等再scan, 禁首扫定论
8. SOP: 读SOP禁凭印象,有utils必用 | 复杂超长程任务/用户明确提及规划模式→读plan_sop
================================================
FILE: assets/global_mem_insight_template_en.txt
================================================
# [Global Memory Insight]
Read L2 or ls ../memory/ for L3 when needed
L0(META-SOP): memory_management_sop
L2: currently empty
L3: memory_cleanup_sop(memory cleanup) | skill_search | ui_detect.py | ocr_utils.py | subagent | web_setup_sop | plan_sop
| procmem_scanner | keychain | ljqCtrl_sop+.py | tmwebdriver_sop | autonomous_operation_sop | scheduled_task_sop | vision_sop | adb_ui.py
L4: L4_raw_sessions/ historical sessions
Browser special ops: tmwebdriver_sop(file upload/image search/PDF blob/physical coords/HttpOnly Cookie/autofill bypass/cross-origin iframe/CDP/cross-tab)
Keyboard & Mouse: ljqCtrl_sop(no pyautogui/activate first) Screenshot/Vision: ocr/vision_sop | No fullscreen capture, prefer window
Scheduling: scheduled_task_sop | Autonomous: autonomous_operation_sop | watchdog/reflect: agentmain --reflect
Mobile: adb_ui.py
[RULES]
1. Search first: must use es for filename search (no PS recursion/no dir traversal), always prefer Google for web search (no duckduckgo etc), check cwd first, no guessing paths
2. Cross-verify: never trust summaries, verify numbers on detail pages
3. Encoding safety: use file_read not PS cat/type; read before modify; import memory modules directly (already in PATH, no fake prefixes)
4. Close the loop: confirm after physical simulation; request intervention after 3 failures; complete Git workflow
5. Processes: never kill python unconditionally (kills self), use exact PID, no os.kill for liveness check
6. Windows: prefer win32gui title enumeration for GUI state
7. Web JS: use native setter + event chain for input, check disabled before click, mind quote escaping; if scan empty/incomplete wait then rescan, no conclusions from first scan
8. SOP: read the SOP itself, never rely on recollection; must use utils if available | complex long-running tasks / user mentions planning -> read plan_sop
================================================
FILE: assets/insight_fixed_structure.txt
================================================
Facts(L2): ../memory/global_mem.txt | GA CodeRoot: ../ | SOPs(L3): ../memory/*.md or *.py | META-SOP(L0): ../memory/memory_management_sop.md
L1 Insight是极简索引,L2/L3变更时同步L1,索引必须极简。写记忆前先读META-SOP(L0)。
[CONSTITUTION]
1. 改自身源码先请示;./内可自主实验,允许装包和portable工具
2. 决策前查记忆,有SOP/utils必用;多次失败回看SOP;未查证不断言
3. 分步执行,控制粒度,限制失败半径;3次失败请求干预
4. 密钥文件仅引用,不读取/移动
5. 写任何记忆前读META-SOP核验,memory下文件只能patch修改(除非新建)
================================================
FILE: assets/insight_fixed_structure_en.txt
================================================
Facts(L2): ../memory/global_mem.txt | CodeRoot: ../ | SOPs(L3): ../memory/*.md or *.py | META-SOP(L0): ../memory/memory_management_sop.md
L1 Insight is a minimal index; sync L1 when L2/L3 changes; keep index minimal. Read META-SOP(L0) before writing any memory.
[CONSTITUTION]
1. Ask before modifying own source code; free to experiment within ./; installing packages and portable tools allowed
2. Check memory before decisions; always use existing SOPs/utils; revisit SOPs on repeated failures; never assert without evidence
3. Execute step by step, control granularity, limit blast radius; request intervention after 3 failures
4. Key/secret files: reference only, never read or move
5. Read META-SOP to verify before writing any memory; files under memory/ must be patched only (unless creating new)
================================================
FILE: assets/install-macos-app.sh
================================================
#!/bin/bash
# GenericAgent macOS Desktop App Installation Script
#
# Usage:
# bash assets/install-macos-app.sh [--auto]
#
# This installer creates a small .app bundle that opens Terminal and runs
# `python3 launch.pyw` from the current GenericAgent checkout.
if [ -z "${BASH_VERSION}" ]; then
if command -v bash >/dev/null 2>&1; then
exec bash -- "${0}" "$@"
else
echo "Error: This script requires bash."
exit 1
fi
fi
set -euo pipefail
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; CYAN='\033[0;36m'; NC='\033[0m'
log_info() { echo -e "${BLUE}ℹ️ $1${NC}"; }
log_success() { echo -e "${GREEN}✅ $1${NC}"; }
log_warning() { echo -e "${YELLOW}⚠️ $1${NC}"; }
log_error() { echo -e "${RED}❌ $1${NC}"; }
AUTO_MODE=false
for arg in "$@"; do
case "$arg" in
--auto) AUTO_MODE=true ;;
esac
done
APP_NAME="GenericAgent"
PRIMARY_INSTALL_DIR="/Applications"
FALLBACK_INSTALL_DIR="${HOME}/Applications"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
ICON_PATH="${PROJECT_ROOT}/assets/images/logo.jpg"
LAUNCH_SCRIPT="${PROJECT_ROOT}/launch.pyw"
echo -e "${CYAN}"
echo "╔═══════════════════════════════════════════════════════════╗"
echo "║ GenericAgent — macOS Desktop App Installer ║"
echo "╚═══════════════════════════════════════════════════════════╝"
echo -e "${NC}"
if [[ "$(uname)" != "Darwin" ]]; then
log_error "This script only supports macOS."
exit 1
fi
if ! command -v python3 >/dev/null 2>&1; then
log_error "python3 is not installed."
exit 1
fi
if [ ! -f "${LAUNCH_SCRIPT}" ]; then
log_error "launch.pyw not found at ${LAUNCH_SCRIPT}"
exit 1
fi
project_path_for_applescript="${PROJECT_ROOT}/"
project_path_for_applescript="${project_path_for_applescript//\\/\\\\}"
project_path_for_applescript="${project_path_for_applescript//\"/\\\"}"
detect_existing_app() {
if [ -d "${PRIMARY_INSTALL_DIR}/${APP_NAME}.app" ]; then
echo "${PRIMARY_INSTALL_DIR}/${APP_NAME}.app"
return
fi
if [ -d "${FALLBACK_INSTALL_DIR}/${APP_NAME}.app" ]; then
echo "${FALLBACK_INSTALL_DIR}/${APP_NAME}.app"
return
fi
}
existing_app_path="$(detect_existing_app || true)"
if [ -n "${existing_app_path}" ]; then
log_warning "${APP_NAME}.app already exists at ${existing_app_path}"
fi
if [ "${AUTO_MODE}" = false ]; then
echo ""
echo "This will install a desktop app that launches GenericAgent"
echo "from Spotlight, Launchpad, or the Applications folder."
echo ""
if [ -n "${existing_app_path}" ]; then
read -p "Reinstall ${APP_NAME}.app? (y/N) " -n 1 -r
else
read -p "Continue? (Y/n) " -n 1 -r
fi
echo
if [ -n "${existing_app_path}" ]; then
[[ ! ${REPLY:-} =~ ^[Yy]$ ]] && { echo "Aborted."; exit 0; }
else
[[ ${REPLY:-} =~ ^[Nn]$ ]] && { echo "Aborted."; exit 0; }
fi
fi
TMP_DIR="$(mktemp -d)"
trap 'rm -rf "${TMP_DIR}"' EXIT
log_info "Building ${APP_NAME}.app..."
cat > "${TMP_DIR}/${APP_NAME}.applescript" <<EOF
-- (AppleScript body elided in this dump: it opens Terminal and runs
-- python3 launch.pyw from "${project_path_for_applescript}")
EOF
osacompile -o "${TMP_DIR}/${APP_NAME}.app" "${TMP_DIR}/${APP_NAME}.applescript"
if [ -f "${ICON_PATH}" ]; then
    ICONSET_DIR="${TMP_DIR}/ga-icon.iconset"
    mkdir -p "${ICONSET_DIR}"
    sips -z 16 16 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_16x16.png" >/dev/null 2>&1
sips -z 32 32 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_16x16@2x.png" >/dev/null 2>&1
sips -z 32 32 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_32x32.png" >/dev/null 2>&1
sips -z 64 64 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_32x32@2x.png" >/dev/null 2>&1
sips -z 128 128 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_128x128.png" >/dev/null 2>&1
sips -z 256 256 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_128x128@2x.png" >/dev/null 2>&1
sips -z 256 256 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_256x256.png" >/dev/null 2>&1
sips -z 512 512 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_256x256@2x.png" >/dev/null 2>&1
sips -z 512 512 "${ICON_PATH}" --out "${ICONSET_DIR}/icon_512x512.png" >/dev/null 2>&1
cp "${ICON_PATH}" "${ICONSET_DIR}/icon_512x512@2x.png"
iconutil -c icns "${ICONSET_DIR}" -o "${TMP_DIR}/ga-icon.icns"
cp "${TMP_DIR}/ga-icon.icns" "${TMP_DIR}/${APP_NAME}.app/Contents/Resources/applet.icns"
log_success "Icon applied from assets/images/logo.jpg"
else
log_warning "Logo not found at ${ICON_PATH}, using default icon."
fi
install_bundle() {
local install_dir="$1"
local destination="${install_dir}/${APP_NAME}.app"
mkdir -p "${install_dir}"
rm -rf "${destination}"
cp -R "${TMP_DIR}/${APP_NAME}.app" "${destination}"
}
install_path=""
if install_bundle "${PRIMARY_INSTALL_DIR}" 2>/dev/null; then
install_path="${PRIMARY_INSTALL_DIR}/${APP_NAME}.app"
else
log_warning "Could not write to ${PRIMARY_INSTALL_DIR}; falling back to ${FALLBACK_INSTALL_DIR}"
install_bundle "${FALLBACK_INSTALL_DIR}"
install_path="${FALLBACK_INSTALL_DIR}/${APP_NAME}.app"
fi
log_success "Installed to: ${install_path}"
echo ""
echo -e "${CYAN}╔═══════════════════════════════════════════════════════════╗${NC}"
echo -e "${CYAN}║${NC} ✨ ${APP_NAME} Desktop App installed successfully! ${CYAN}║${NC}"
echo -e "${CYAN}╚═══════════════════════════════════════════════════════════╝${NC}"
echo ""
echo -e "${BLUE}Launch methods:${NC}"
echo " • Spotlight: Cmd + Space → type '${APP_NAME}' → Enter"
echo " • Launchpad: Find the '${APP_NAME}' icon"
echo " • Finder: Open ${install_path}"
echo ""
echo -e "${BLUE}Runtime behavior:${NC}"
echo " The app uses the current checkout path embedded at install time:"
echo " ${PROJECT_ROOT}"
echo " If you move the repo later, re-run this installer."
echo ""
echo -e "${BLUE}Uninstall:${NC}"
echo " rm -rf '${install_path}'"
echo ""
================================================
FILE: assets/install_python_windows.bat
================================================
@echo off
setlocal enabledelayedexpansion
title Python One-Click Installer
color 0A
echo.
echo ========================================
echo Python One-Click Installer (Windows)
echo ========================================
echo.
net session >nul 2>&1
if %errorlevel% neq 0 (
echo [!] Administrator privileges required. Restarting with elevation...
powershell -Command "Start-Process '%~f0' -Verb RunAs"
exit /b
)
echo [OK] Administrator privileges confirmed
echo.
python --version >nul 2>&1
if %errorlevel% equ 0 (
echo [OK] Python already installed:
python --version
echo.
choice /C YN /M "Install latest version anyway? (Y=Yes / N=Exit)"
if errorlevel 2 goto :end
)
set PYTHON_VERSION=3.12.9
set MIRROR_URL=https://npmmirror.com/mirrors/python/3.12.9/python-3.12.9-amd64.exe
set OFFICIAL_URL=https://www.python.org/ftp/python/3.12.9/python-3.12.9-amd64.exe
set INSTALLER=%TEMP%\python_installer.exe
echo [*] Preparing to download Python %PYTHON_VERSION%
echo [*] Trying mirror source first...
echo.
powershell -NoProfile -Command "[Net.ServicePointManager]::SecurityProtocol=[Net.SecurityProtocolType]::Tls12; $ProgressPreference='SilentlyContinue'; Invoke-WebRequest -Uri '%MIRROR_URL%' -OutFile '%INSTALLER%' -UseBasicParsing"
if not exist "%INSTALLER%" goto :official
for %%A in ("%INSTALLER%") do if %%~zA lss 1000000 goto :official
echo [OK] Mirror download complete
goto :install
:official
echo [!] Mirror failed, switching to official source...
powershell -NoProfile -Command "[Net.ServicePointManager]::SecurityProtocol=[Net.SecurityProtocolType]::Tls12; $ProgressPreference='SilentlyContinue'; Invoke-WebRequest -Uri '%OFFICIAL_URL%' -OutFile '%INSTALLER%' -UseBasicParsing"
if not exist "%INSTALLER%" (
echo [x] Download failed. Please check your network connection and retry.
pause
goto :end
)
for %%A in ("%INSTALLER%") do if %%~zA lss 1000000 (
echo [x] Downloaded file is incomplete. Please check your network and retry.
pause
goto :end
)
echo [OK] Official source download complete
:install
echo.
echo [*] Installing Python %PYTHON_VERSION% (this may take 2-5 minutes^)...
echo.
start /wait "" "%INSTALLER%" /passive InstallAllUsers=1 PrependPath=1 Include_test=0 Include_pip=1
set INSTALL_CODE=%errorlevel%
del /f /q "%INSTALLER%" >nul 2>&1
if %INSTALL_CODE% neq 0 (
echo [x] Installation failed with error code: %INSTALL_CODE%
pause
goto :end
)
echo [+] Installation complete!
echo.
timeout /t 3 /nobreak >nul
set "PATH=C:\Program Files\Python312;C:\Program Files\Python312\Scripts;%PATH%"
python --version >nul 2>&1
if %errorlevel% equ 0 (
echo [OK] Python installed successfully:
python --version
echo.
echo [OK] pip version:
pip --version
echo.
echo [*] Configuring pip mirror (Tsinghua^)...
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip config set global.trusted-host pypi.tuna.tsinghua.edu.cn
echo.
echo [*] Installing requests...
pip install requests
echo.
echo ========================================
echo All done! Open a new terminal to use
echo python and pip commands.
echo ========================================
) else (
echo [!] PATH not yet refreshed. Please close this window and open a new terminal.
)
:end
echo.
pause
================================================
FILE: assets/sys_prompt.txt
================================================
# Role: 物理级全能执行者
你拥有文件读写、脚本执行、用户浏览器JS注入、系统级干预的物理操作权限。禁止推诿"无法操作"——不空想,用工具探测。
## 行动原则
调用工具前先推演:当前阶段、上步结果是否符合预期、下步策略,必须在回复文本中用输出极简总结。
- 探测优先:失败时先充分获取信息(日志/状态/上下文),关键信息存入工作记忆,再决定重试或换方案。不可逆操作先询问用户。
- 失败升级:1次→读错误理解原因,2次→探测环境状态,3次→深度分析后换方案或问用户。禁止无新信息的重复操作。
================================================
FILE: assets/sys_prompt_en.txt
================================================
# Role: Physical-Level Omnipotent Executor
You have full physical access: file I/O, script execution, browser JS injection, and system-level intervention. Never deflect with "can't do it" — don't speculate, use tools to probe.
Summarize and reply in the user's language, or as the user's prompt directs.
## Action Principles
Before each tool call, reason through the current phase, whether the last result met expectations, and the next strategy; summarize this briefly in the reply text of each turn.
- Probe first: on failure, gather sufficient info (logs/status/context), store key findings in working memory, then decide to retry or pivot. Ask the user before irreversible operations.
- Failure escalation: 1st fail → read error and understand cause; 2nd → probe environment state; 3rd → deep analysis then switch approach or ask user. Never repeat an action without new information.
================================================
FILE: assets/tmwd_cdp_bridge/background.js
================================================
// background.js - Cookie + CDP Bridge
chrome.runtime.onInstalled.addListener(() => {
console.log('CDP Bridge installed');
// Strip CSP headers to allow eval/inline scripts
chrome.declarativeNetRequest.updateDynamicRules({
removeRuleIds: [9999],
addRules: [{
id: 9999, priority: 1,
action: { type: 'modifyHeaders', responseHeaders: [
{ header: 'content-security-policy', operation: 'remove' },
{ header: 'content-security-policy-report-only', operation: 'remove' }
]},
condition: { urlFilter: '*', resourceTypes: ['main_frame', 'sub_frame'] }
}]
});
});
async function handleExtMessage(msg, sender) {
if (msg.cmd === 'cookies') return await handleCookies(msg, sender);
if (msg.cmd === 'cdp') return await handleCDP(msg, sender);
if (msg.cmd === 'batch') return await handleBatch(msg, sender);
if (msg.cmd === 'tabs') {
try {
if (msg.method === 'switch') {
const tab = await chrome.tabs.update(msg.tabId, { active: true });
await chrome.windows.update(tab.windowId, { focused: true });
return { ok: true };
} else {
const tabs = (await chrome.tabs.query({})).filter(t => isScriptable(t.url));
const data = tabs.map(t => ({ id: t.id, url: t.url, title: t.title, active: t.active, windowId: t.windowId }));
return { ok: true, data };
}
} catch (e) { return { ok: false, error: e.message }; }
}
if (msg.cmd === 'management') {
try {
if (msg.method === 'list') {
const all = await chrome.management.getAll();
return { ok: true, data: all.map(e => ({ id: e.id, name: e.name, enabled: e.enabled, type: e.type, version: e.version })) };
}
if (msg.method === 'reload') {
chrome.alarms.create('tmwd-self-reload', { when: Date.now() + 200 });
return { ok: true };
}
if (msg.method === 'disable') {
await chrome.management.setEnabled(msg.extId, false);
return { ok: true };
}
if (msg.method === 'enable') {
await chrome.management.setEnabled(msg.extId, true);
return { ok: true };
}
return { ok: false, error: 'Unknown method: ' + msg.method };
} catch (e) { return { ok: false, error: e.message }; }
}
if (msg.cmd === 'contentSettings') {
try {
const type = msg.type || 'automaticDownloads';
const setting = msg.setting || 'allow';
const pattern = msg.pattern || '';
await chrome.contentSettings[type].set({
primaryPattern: pattern,
setting: setting
});
return { ok: true };
} catch (e) { return { ok: false, error: e.message }; }
}
return { ok: false, error: 'Unknown cmd: ' + msg.cmd };
}
chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
handleExtMessage(msg, sender).then(sendResponse);
return true;
});
async function handleCookies(msg, sender) {
try {
let url = msg.url || sender.tab?.url;
if (!url && msg.tabId) {
const tab = await chrome.tabs.get(msg.tabId);
url = tab.url;
}
const origin = url.match(/^https?:\/\/[^\/]+/)[0];
const all = await chrome.cookies.getAll({ url });
const part = await chrome.cookies.getAll({ url, partitionKey: { topLevelSite: origin } }).catch(() => []);
const merged = [...all];
for (const c of part) {
if (!merged.some(x => x.name === c.name && x.domain === c.domain)) merged.push(c);
}
return { ok: true, data: merged };
} catch (e) {
return { ok: false, error: e.message };
}
}
async function handleBatch(msg, sender) {
const R = [];
let attached = null;
// Resolve "$N.path" placeholders in params against earlier results R:
// "$0.data.0.id" expands to R[0].data[0].id, letting later commands chain on prior output.
const resolve$N = (params) => JSON.parse(JSON.stringify(params || {}).replace(/"\$(\d+)\.([^"]+)"/g,
(_, i, path) => { let v = R[+i]; for (const k of path.split('.')) v = v[k]; return JSON.stringify(v); }));
try {
for (const c of msg.commands) {
if (c.tabId === undefined && msg.tabId !== undefined) c.tabId = msg.tabId;
if (c.cmd === 'cookies') {
R.push(await handleCookies(c, sender));
} else if (c.cmd === 'tabs') {
const tabs = (await chrome.tabs.query({})).filter(t => isScriptable(t.url));
R.push({ ok: true, data: tabs.map(t => ({ id: t.id, url: t.url, title: t.title, active: t.active, windowId: t.windowId })) });
} else if (c.cmd === 'cdp') {
const tabId = c.tabId || msg.tabId || sender.tab?.id;
if (attached !== tabId) {
if (attached) { await chrome.debugger.detach({ tabId: attached }); attached = null; }
await chrome.debugger.attach({ tabId }, '1.3');
attached = tabId;
}
R.push(await chrome.debugger.sendCommand({ tabId }, c.method, resolve$N(c.params)));
} else {
R.push({ ok: false, error: 'unknown cmd: ' + c.cmd });
}
}
if (attached) await chrome.debugger.detach({ tabId: attached });
return { ok: true, results: R };
} catch (e) {
if (attached) try { await chrome.debugger.detach({ tabId: attached }); } catch (_) {}
return { ok: false, error: e.message, results: R };
}
}
async function handleCDP(msg, sender) {
const tabId = msg.tabId || sender.tab?.id;
if (!tabId) return { ok: false, error: 'no tabId' };
try {
await chrome.debugger.attach({ tabId }, '1.3');
const result = await chrome.debugger.sendCommand({ tabId }, msg.method, msg.params || {});
await chrome.debugger.detach({ tabId });
return { ok: true, data: result };
} catch (e) {
try { await chrome.debugger.detach({ tabId }); } catch (_) {}
return { ok: false, error: e.message };
}
}
// Filter out chrome:// and other internal tabs that can't be scripted
const isScriptable = url => url && /^https?:/.test(url);
// --- Shared page/CDP script builder core ---
function buildExecScript(code, errorHandler) {
return `(async () => {
function smartProcessResult(result) {
if (result === null || result === undefined || typeof result !== 'object') return result;
try { if (result.window === result && result.document) return '[Window: ' + (result.location?.href || 'about:blank') + ']'; } catch(_){}
if (typeof jQuery !== 'undefined' && result instanceof jQuery) {
const elements = []; for (let i = 0; i < result.length; i++) { if (result[i] && result[i].nodeType === 1) elements.push(result[i].outerHTML); } return elements;
}
if (result instanceof NodeList || result instanceof HTMLCollection) {
const elements = []; for (let i = 0; i < result.length; i++) { if (result[i] && result[i].nodeType === 1) elements.push(result[i].outerHTML); } return elements;
}
if (result.nodeType === 1) return result.outerHTML;
if (!Array.isArray(result) && typeof result === 'object' && 'length' in result && typeof result.length === 'number') {
const firstElement = result[0];
if (firstElement && firstElement.nodeType === 1) {
const elements = []; const length = Math.min(result.length, 100);
for (let i = 0; i < length; i++) { const elem = result[i]; if (elem && elem.nodeType === 1) elements.push(elem.outerHTML); } return elements;
}
}
try { return JSON.parse(JSON.stringify(result, function(key, value) { if (typeof value === 'object' && value !== null) { if (value.nodeType === 1) return value.outerHTML; if (value === window || value === document) return '[Object]'; try { if (value.window === value && value.document) return '[Window]'; } catch(_){} } return value; })); } catch (e) { return '[无法序列化: ' + e.message + ']'; }
}
try {
const jsCode = ${JSON.stringify(code)}.trim();
const lines = jsCode.split(/\\r?\\n/).filter(l => l.trim());
const lastLine = lines.length > 0 ? lines[lines.length - 1].trim() : '';
const AsyncFunction = Object.getPrototypeOf(async function(){}).constructor;
let r;
// _air: auto-insert 'return' before the last non-empty line when it looks like a bare expression, so the AsyncFunction retry yields its value
function _air(c) { const ls = c.split(/\\r?\\n/); let i = ls.length - 1; while (i >= 0 && !ls[i].trim()) i--; if (i < 0) return c; const t = ls[i].trim(); if (/^(return |return;|return$|let |const |var |if |if\\(|for |for\\(|while |while\\(|switch|try |throw |class |function |async |import |export |\\/\\/|})/.test(t)) return c; ls[i] = ls[i].match(/^(\\s*)/)[1] + 'return ' + t; return ls.join('\\n'); }
if (lastLine.startsWith('return')) {
r = await (new AsyncFunction(jsCode))();
} else {
try { r = eval(jsCode); if (r instanceof Promise) r = await r; } catch (e) {
if (e instanceof SyntaxError && (/return/i.test(e.message) || /await/i.test(e.message))) { r = await (new AsyncFunction(_air(jsCode)))(); } else throw e;
}
}
return { ok: true, data: smartProcessResult(r) };
} catch (e) {
${errorHandler}
}
})()`;
}
function buildPageScript(code) {
return buildExecScript(code, `
const errMsg = e.message || String(e);
return { ok: false, error: { name: e.name || 'Error', message: errMsg, stack: e.stack || '' },
csp: errMsg.includes('Refused to evaluate') || errMsg.includes('unsafe-eval') || errMsg.includes('Content Security Policy') };
`);
}
function buildCdpScript(code) {
return buildExecScript(code, `
return { ok: false, error: { name: e.name || 'Error', message: e.message || String(e), stack: e.stack || '' } };
`);
}
// --- WebSocket Client for TMWebDriver ---
let ws = null;
const WS_URL = 'ws://127.0.0.1:18765';
function scheduleProbe() {
// Use chrome.alarms to survive MV3 service worker suspension
chrome.alarms.create('tmwd-ws-probe', { delayInMinutes: 0.083 }); // ~5s
}
function scheduleKeepalive() {
// Keep SW alive while WS is connected (~24s, under the 30s SW timeout)
chrome.alarms.create('tmwd-ws-keepalive', { delayInMinutes: 0.4 }); // ~24s
}
async function isServerAlive() {
try {
const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 2000);
await fetch('http://127.0.0.1:18765', { signal: ctrl.signal });
return true; // Got HTTP response → port is listening
} catch (e) {
return false; // Network error (connection refused) or timeout → server not alive
}
}
chrome.alarms.onAlarm.addListener(async (alarm) => {
if (alarm.name === 'tmwd-self-reload') {
chrome.runtime.reload();
return;
}
if (alarm.name === 'tmwd-ws-keepalive') {
// Keepalive: ping to keep SW alive + detect dead connections
if (ws && ws.readyState === WebSocket.OPEN) {
try { ws.send('{"type":"ping"}'); } catch (_) {}
scheduleKeepalive();
} else {
// Connection lost, switch to probe mode
ws = null;
scheduleProbe();
}
}
if (alarm.name === 'tmwd-ws-probe') {
if (ws && ws.readyState <= 1) return; // Already connected/connecting
if (await isServerAlive()) {
console.log('[TMWD-WS] Server detected, connecting...');
connectWS();
} else {
scheduleProbe(); // Server not up, keep probing
}
}
});
async function handleWsExec(data) {
const tabId = data.tabId;
console.log('[TMWD-WS] Exec request', data.id, 'on tab', tabId);
ws.send(JSON.stringify({ type: 'ack', id: data.id }));
if (!tabId) {
ws.send(JSON.stringify({ type: 'error', id: data.id, error: 'No tabId provided' }));
return;
}
// Use onCreated listener to reliably capture new tabs (avoids race condition with query-diff)
const newTabIds = new Set();
const onCreated = (tab) => { newTabIds.add(tab.id); };
chrome.tabs.onCreated.addListener(onCreated);
try {
let res;
try {
const result = await chrome.scripting.executeScript({
target: { tabId },
world: 'MAIN',
func: async (s) => await eval(s),
args: [buildPageScript(data.code)]
});
res = result[0]?.result;
if (res === null || res === undefined) {
console.log('[TMWD-WS] executeScript returned null/undefined, treating as CSP issue');
res = { ok: false, error: { name: 'Error', message: 'executeScript returned null (possible CSP or context issue)', stack: '' }, csp: true };
}
} catch (e) {
console.log('[TMWD-WS] scripting.executeScript failed:', e.message);
res = { ok: false, error: { name: e.name || 'Error', message: e.message || String(e), stack: e.stack || '' }, csp: true };
}
// CDP fallback for CSP-restricted pages
if (res && !res.ok && res.csp) {
console.log('[TMWD-WS] CDP fallback for tab', tabId);
const wrappedCode = buildCdpScript(data.code);
try {
await chrome.debugger.attach({ tabId }, '1.3');
const cdpRes = await chrome.debugger.sendCommand({ tabId }, 'Runtime.evaluate', {
expression: wrappedCode, awaitPromise: true, returnByValue: true
});
await chrome.debugger.detach({ tabId });
if (cdpRes.exceptionDetails) {
const desc = cdpRes.exceptionDetails.exception?.description || 'CDP Error';
res = { ok: false, error: { name: 'Error', message: desc, stack: desc } };
} else {
res = cdpRes.result.value;
}
} catch (cdpErr) {
try { await chrome.debugger.detach({ tabId }); } catch (_) {}
res = { ok: false, error: { name: 'Error', message: 'CDP fallback failed: ' + cdpErr.message, stack: '' } };
}
}
// Grace period for async tab creation (e.g. link click with target=_blank)
if (newTabIds.size === 0) await new Promise(r => setTimeout(r, 200));
chrome.tabs.onCreated.removeListener(onCreated);
// Get full info for captured new tabs
const newTabs = [];
for (const id of newTabIds) {
try { const t = await chrome.tabs.get(id); newTabs.push({id: t.id, url: t.url, title: t.title}); } catch (_) {}
}
if (res?.ok) {
ws.send(JSON.stringify({ type: 'result', id: data.id, result: res.data, newTabs }));
} else {
console.log(res);
ws.send(JSON.stringify({ type: 'error', id: data.id, error: res?.error || 'Unknown error', newTabs }));
}
} catch (e) {
ws.send(JSON.stringify({ type: 'error', id: data.id, error: { name: e.name || 'Error', message: e.message || String(e), stack: e.stack || '' } }));
} finally {
chrome.tabs.onCreated.removeListener(onCreated);
}
}
function connectWS() {
if (ws && ws.readyState <= 1) return; // CONNECTING or OPEN
ws = null;
console.log('[TMWD-WS] Connecting to', WS_URL);
try {
ws = new WebSocket(WS_URL);
} catch (e) {
console.error('[TMWD-WS] Constructor error:', e);
ws = null;
scheduleProbe();
return;
}
ws.onopen = async () => {
console.log('[TMWD-WS] Connected!');
scheduleKeepalive(); // Keep SW alive while connected
const tabs = (await chrome.tabs.query({})).filter(t => isScriptable(t.url));
ws.send(JSON.stringify({
type: 'ext_ready',
tabs: tabs.map(t => ({ id: t.id, url: t.url, title: t.title }))
}));
console.log('[TMWD-WS] Sent ext_ready with', tabs.length, 'tabs');
};
ws.onmessage = async (event) => {
try {
const data = JSON.parse(event.data);
if (data.id && data.code) {
let code = data.code;
// If code is a JSON string representing an object, parse it
if (typeof code === 'string') {
try { const p = JSON.parse(code); if (p && typeof p === 'object') code = p; } catch (_) {}
}
if (typeof code === 'object' && code !== null && code.cmd) {
// Custom protocol message → route to handleExtMessage
if (code.tabId === undefined && data.tabId !== undefined) code.tabId = data.tabId;
const res = await handleExtMessage(code, {});
ws.send(JSON.stringify({ type: res.ok ? 'result' : 'error', id: data.id, result: res.data ?? res.results ?? res, error: res.error }));
} else if (typeof code === 'string') {
// Plain JS code
await handleWsExec(data);
} else if (typeof code === 'object' && code !== null) {
// Object without cmd → legacy extension message
const msg = code.tabId === undefined && data.tabId !== undefined ? { ...code, tabId: data.tabId } : code;
const res = await handleExtMessage(msg, {});
ws.send(JSON.stringify({ type: res.ok ? 'result' : 'error', id: data.id, result: res.data ?? res.results ?? res, error: res.error }));
}
}
} catch (e) {
console.error('[TMWD-WS] message parse error', e);
}
};
ws.onclose = () => {
console.log('[TMWD-WS] Disconnected');
ws = null;
scheduleProbe();
};
ws.onerror = (e) => {
console.error('[TMWD-WS] Error:', e);
// onclose will fire after this, which triggers reconnect
};
}
// Initial connect + wake-up hooks
connectWS();
chrome.runtime.onStartup.addListener(() => connectWS());
chrome.runtime.onInstalled.addListener(() => connectWS());
// Sync tab list on changes
async function sendTabsUpdate() {
if (!ws || ws.readyState !== WebSocket.OPEN) return;
const tabs = (await chrome.tabs.query({})).filter(t => isScriptable(t.url) && !/streamlit/i.test(t.title));
ws.send(JSON.stringify({
type: 'tabs_update',
tabs: tabs.map(t => ({ id: t.id, url: t.url, title: t.title }))
}));
}
chrome.tabs.onUpdated.addListener((_, changeInfo) => {
if (changeInfo.status === 'complete') sendTabsUpdate();
});
chrome.tabs.onRemoved.addListener(() => sendTabsUpdate());
chrome.tabs.onCreated.addListener(() => sendTabsUpdate());
================================================
FILE: assets/tmwd_cdp_bridge/content.js
================================================
;(function(){ if (/streamlit/i.test(document.title)) return;
// Remove meta CSP tags
document.querySelectorAll('meta[http-equiv="Content-Security-Policy"]').forEach(e => e.remove());
// Indicator badge at bottom-right (userscript style)
(function(){
if(window.self!==window.top)return;
const d=document.createElement('div');
d.id='ljq-ind';
d.innerText='ljq_driver: 已连接';
d.style.cssText='position:fixed;bottom:8px;right:8px;background:#4CAF50;color:white;padding:4px 7px;border-radius:4px;font-size:11px;font-weight:bold;z-index:99999;cursor:pointer;box-shadow:0 2px 4px rgba(0,0,0,0.2);opacity:0.5;';
d.addEventListener('click',()=>alert('会话活跃\nURL: '+location.href));
(document.body||document.documentElement).appendChild(d);
})();
// TID (the bridge element id) is defined in config.js, which the manifest loads before content.js
new MutationObserver(muts => {
for (const m of muts) for (const n of m.addedNodes) {
if (n.id === TID || (n.querySelector && n.querySelector('#' + TID))) {
const el = n.id === TID ? n : n.querySelector('#' + TID);
handle(el);
}
}
}).observe(document.documentElement, { childList: true, subtree: true });
async function handle(el) {
try {
const req = el.textContent.trim() ? JSON.parse(el.textContent) : { cmd: 'cookies' };
const cmd = req.cmd || 'cookies';
let resp;
if (cmd === 'cookies') {
resp = await chrome.runtime.sendMessage({ cmd: 'cookies', url: req.url || location.href });
} else if (cmd === 'cdp') {
resp = await chrome.runtime.sendMessage({ cmd: 'cdp', method: req.method, params: req.params || {}, tabId: req.tabId });
} else if (cmd === 'batch') {
resp = await chrome.runtime.sendMessage({ cmd: 'batch', commands: req.commands, tabId: req.tabId });
} else if (cmd === 'tabs') {
resp = await chrome.runtime.sendMessage({ cmd: 'tabs', method: req.method, tabId: req.tabId });
} else {
resp = { ok: false, error: 'unknown cmd: ' + cmd };
}
el.textContent = JSON.stringify(resp);
} catch (e) {
el.textContent = JSON.stringify({ ok: false, error: e.message });
}
}
})();
================================================
FILE: assets/tmwd_cdp_bridge/disable_dialogs.js
================================================
// Disable alert/confirm/prompt to prevent page JS from blocking extension
(function() {
const _log = console.log.bind(console);
function toast(type, msg) {
_log('[TMWD] ' + type + ' suppressed:', msg);
try {
const d = document.createElement('div');
d.textContent = '[' + type + '] ' + msg;
Object.assign(d.style, {
position:'fixed', top:'12px', right:'12px', zIndex:'2147483647',
background:'#222', color:'#fff', padding:'10px 18px', borderRadius:'8px',
fontSize:'14px', maxWidth:'420px', wordBreak:'break-all',
boxShadow:'0 4px 16px rgba(0,0,0,.3)', opacity:'1',
transition:'opacity .5s', pointerEvents:'none'
});
(document.body || document.documentElement).appendChild(d);
setTimeout(() => { d.style.opacity = '0'; }, 3000);
setTimeout(() => { d.remove(); }, 3600);
} catch(e) {}
}
window.alert = function(msg) { toast('alert', msg); };
window.confirm = function(msg) { toast('confirm', msg); return true; };
window.prompt = function(msg, def) { toast('prompt', msg); return def || null; };
})();
================================================
FILE: assets/tmwd_cdp_bridge/manifest.json
================================================
{
"manifest_version": 3,
"name": "TMWD CDP Bridge",
"version": "2.0",
"description": "Cookie viewer + CDP bridge",
"permissions": [
"cookies",
"tabs",
"activeTab",
"debugger",
"scripting",
"alarms",
"declarativeNetRequest",
"management",
"contentSettings"
],
"host_permissions": ["<all_urls>"],
"background": {
"service_worker": "background.js"
},
"content_scripts": [
{
"matches": ["<all_urls>"],
"js": ["disable_dialogs.js"],
"run_at": "document_start",
"all_frames": true,
"world": "MAIN"
},
{
"matches": ["<all_urls>"],
"js": ["config.js", "content.js"],
"run_at": "document_idle",
"all_frames": true
}
],
"action": {
"default_popup": "popup.html",
"default_title": "TMWD CDP Bridge"
}
}
================================================
FILE: assets/tmwd_cdp_bridge/popup.html
================================================
<!DOCTYPE html>
<html>
<head><meta charset="utf-8"></head>
<body>
<button id="refresh">🍪 Cookies</button>
<pre id="out">点击刷新获取 cookies...</pre>
<script src="popup.js"></script>
</body>
</html>
================================================
FILE: assets/tmwd_cdp_bridge/popup.js
================================================
document.addEventListener('DOMContentLoaded', () => {
const out = document.getElementById('out');
const btn = document.getElementById('refresh');
btn.addEventListener('click', fetchCookies);
fetchCookies();
});
async function fetchCookies() {
const out = document.getElementById('out');
try {
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
if (!tab?.url) { out.textContent = 'No active tab'; return; }
const resp = await chrome.runtime.sendMessage({ cmd: 'cookies', url: tab.url });
if (!resp?.ok) { out.textContent = 'Error: ' + (resp?.error || 'unknown'); return; }
if (!resp.data.length) { out.textContent = '(no cookies)'; return; }
// Render each cookie with httpOnly/secure/partitioned markers
out.textContent = resp.data.map(c =>
`${c.name}=${c.value}` + (c.httpOnly ? ' [H]' : '') + (c.secure ? ' [S]' : '') + (c.partitionKey ? ' [P]' : '')
).join('\n');
// Auto-copy "name=value; ..." pairs to the clipboard
const str = resp.data.map(c => `${c.name}=${c.value}`).join('; ');
await navigator.clipboard.writeText(str);
} catch (e) { out.textContent = 'Error: ' + e.message; }
}
================================================
FILE: assets/tool_usable_history.json
================================================
[{"role":"user","content":[{"type":"text","text":"code_run个helloworld,根据结果简单评价你的工具配置"}]},{"role":"assistant","content":[{"type":"text","text":"\n\n用户要求code_run helloworld并评价工具配置\n\n```python\nprint(\"Hello, World!\")\n```"},{"type":"tool_use","id":"tooluse_GlmTNw190WGgK4sBBOiDHK","name":"code_run","input":{"script":"print(\"Hello, World!\")"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"tooluse_GlmTNw190WGgK4sBBOiDHK","content":"{\"status\": \"success\", \"stdout\": \"Hello, World!\\r\\n\", \"exit_code\": 0}"},{"type":"text","text":"\n### [WORKING MEMORY]\n\n\n[USER]: code_run个helloworld,根据结果简单评价你的工具配置\n\nCurrent turn: 1\n"}]},{"role":"assistant","content":[{"type":"text","text":"\n\ncode_run helloworld成功,评价工具配置\n\n工具配置没问题。`code_run` 返回了结构化的 JSON(`status`、`stdout`、`exit_code`),执行链路通畅,cwd 默认落在 temp 目录。\n\n刚才还验证了 `inline_eval` 能正确访问 agent 内部状态并导出 history,`file_patch` 也能精准改文件。整套工具链从代码执行、文件读写、到 agent 内省都是通的。"}]}]
================================================
FILE: assets/tools_schema.json
================================================
[
{"type": "function", "function": {
"name": "code_run",
"description": "Code executor. Prefer python. Multi-call OK, use script param. Reply code block is executed if no script arg; prefer for single call to avoid escaping. No hardcoding bulk data",
"parameters": {"type": "object", "properties": {
"script": {"type": "string", "description": "[Mutually exclusive] NEVER use this param when using a reply code block."},
"type": {"type": "string", "enum": ["python", "powershell"], "description": "Code type", "default": "python"},
"timeout": {"type": "integer", "description": "in seconds", "default": 60},
"cwd": {"type": "string", "description": "Working directory, defaults to cwd"},
"inline_eval": {"type": "boolean", "description": "DO NOT USE except explicitly specified."}}}
}},
{"type": "function", "function": {
"name": "file_read",
"description": "Read file. Read before modify for latest context and line numbers",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "Relative or absolute"},
"start": {"type": "integer", "description": "Start line number (1-based)"},
"count": {"type": "integer", "description": "Number of lines to read", "default": 200},
"keyword": {"type": "string", "description": "[Optional] If provided, returns first match (case-insensitive) with context"},
"show_linenos": {"type": "boolean", "description": "Show line numbers", "default": true}}}
}},
{"type": "function", "function": {
"name": "file_patch",
"description": "Replace unique old_content with new_content. Exact match required (whitespace/indentation). On failure, file_read to recheck",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "File path"},
"old_content": {"type": "string", "description": "Original text block to replace (must be unique)"},
"new_content": {"type": "string", "description": "New content. Supports {{file:path:startLine:endLine}} to ref file lines, auto-expanded"}}}
}},
{"type": "function", "function": {
"name": "file_write",
"description": "Create/overwrite/append files. HUGE edits ONLY. Supports {{file:path:startLine:endLine}}, auto-expanded",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "File path"},
"content": {"type": "string"},
"mode": {"type": "string", "enum": ["overwrite", "append", "prepend"], "description": "Write mode", "default": "overwrite"}}}
}},
{"type": "function", "function": {
"name": "web_scan",
"description": "Get simplified HTML and tab list. Removes hidden/floating/covered elements. Call after switching pages",
"parameters": {"type": "object", "properties": {
"tabs_only": {"type": "boolean", "description": "Show tab list only, no HTML"},
"switch_tab_id": {"type": "string", "description": "[Optional] Tab ID to switch to"},
"text_only": {"type": "boolean", "description": "Plain text only, no HTML"}}}
}},
{"type": "function", "function": {
"name": "web_execute_js",
"description": "Execute JS. Multi-call OK with different switch_tab_id. No guessing. Act accurately to reduce web_scan calls. Execute JS in ```javascript blocks if no script arg, prefer to avoid escaping",
"parameters": {"type": "object", "properties": {
"script": {"type": "string", "description": "[Mutually exclusive] JS code or script path. NEVER use this param when using a reply code block"},
"save_to_file": {"type": "string", "description": "file path; **only** for long result"},
"no_monitor": {"type": "boolean", "description": "Skip page change monitoring, saves 2-3s. Only for reads, not for page actions"},
"switch_tab_id": {"type": "string", "description": "[Optional] Tab ID to switch to before executing"}}}
}},
{"type": "function", "function": {
"name": "update_working_checkpoint",
"description": "Short-term working notepad, auto-injected each turn to prevent info loss in long tasks. Call during early/mid stages, not at end. When: (1) after reading SOP, store user needs & key constraints (skip for simple 1-2 step tasks); (2) before subtask switch or context flush; (3) after repeated failures, re-read SOP and must store new findings; (4) on new task, update content, clear old progress but keep valid constraints.\n\nDon't call: simple tasks (1-2 steps), task completed (use long-term memory tool)",
"parameters": {"type": "object", "properties": {
"key_info": {"type": "string", "description": "Replaces current notepad (<200 tokens). Incremental update: review existing, keep valid, add/remove/modify. Store: pitfalls, user requirements, key params/findings, file paths, progress, next steps. Don't store: ephemeral info, obvious context, old task info when user switched tasks. Prefer over-updating over losing key info"},
"related_sop": {"type": "string", "description": "Related SOP names, tips for further re-read"}}}
}},
{"type": "function", "function": {
"name": "ask_user",
"description": "Interrupt task to ask user when needing decisions, extra info, or facing unresolvable blockers",
"parameters": {"type": "object", "properties": {
"question": {"type": "string", "description": "Question for the user"},
"candidates": {"type": "array", "items": {"type": "string"}, "description": "Optional quick-select choices for the user"}}}
}},
{"type": "function", "function": {
"name": "start_long_term_update",
"description": "Start distilling long-term memory. Call when discovering info worth remembering (env facts/user prefs/lessons learned). Skip if memory already updated or in autonomous flow. Must call when a task that took 15+ turns is completed",
"parameters": {"type": "object", "properties": {}}}
}
]
================================================
FILE: assets/tools_schema_cn.json
================================================
[
{"type": "function", "function": {
"name": "code_run",
"description": "代码执行器。优先使用python。支持Multi-call,并行时用script参数。无script参数时正文代码块会被执行,单次调用优先使用以免转义。禁硬编码大量数据",
"parameters": {"type": "object", "properties": {
"script": {"type": "string", "description": "[Optional] 要执行的代码。为免转义建议留空,改用正文代码块(与此参数互斥)"},
"type": {"type": "string", "enum": ["python", "powershell"], "description": "代码类型", "default": "python"},
"timeout": {"type": "integer", "description": "执行超时时间(秒)", "default": 60},
"cwd": {"type": "string", "description": "工作目录,默认为当前工作目录"},
"inline_eval": {"type": "boolean", "description": "不允许使用除非明确要求"}}}
}},
{"type": "function", "function": {
"name": "file_read",
"description": "读取文件内容。建议在修改文件前先读取,以确保获取最新的上下文和行号。支持分页读取或关键字搜索",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "文件相对或绝对路径"},
"start": {"type": "integer", "description": "起始行号(从 1 开始)"},
"count": {"type": "integer", "description": "读取的行数", "default": 200},
"keyword": {"type": "string", "description": "可选搜索关键字。如果提供,将返回第一个匹配项(忽略大小写)及其周边的内容"},
"show_linenos": {"type": "boolean", "description": "是否显示行号,建议开启以辅助 file_patch 定位", "default": true}}}
}},
{"type": "function", "function": {
"name": "file_patch",
"description": "精细化局部文件修改。在文件中寻找唯一的 old_content 块并替换为 new_content。要求 old_content 必须在文件中唯一存在,且空格、缩进、换行必须与原文件完全一致。如果匹配失败,请使用 file_read 重新确认文件内容",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "文件路径"},
"old_content": {"type": "string", "description": "文件中需要被替换的原始文本块(需确保唯一性)"},
"new_content": {"type": "string", "description": "替换后的新文本内容。支持 {{file:路径:起始行:结束行}} 语法引用文件内容,写入前自动展开"}}}
}},
{"type": "function", "function": {
"name": "file_write",
"description": "用于文件的新建、全量覆盖或追加写入。对于精细的代码修改,应优先使用 file_patch。写入内容支持 {{file:路径:起始行:结束行}} 语法引用文件片段,写入前自动展开",
"parameters": {"type": "object", "properties": {
"path": {"type": "string", "description": "文件路径"},
"content": {"type": "string"},
"mode": {"type": "string", "enum": ["overwrite", "append", "prepend"], "description": "写入模式覆盖、追加或在开头追加", "default": "overwrite"}}}
}},
{"type": "function", "function": {
"name": "web_scan",
"description": "获取当前页面的简化HTML内容和标签页列表。会移除隐藏/浮动/被遮盖的元素。切换页面后一般应先调用查看",
"parameters": {"type": "object", "properties": {
"tabs_only": {"type": "boolean", "description": "仅返回标签页列表和当前标签信息,不获取HTML内容"},
"switch_tab_id": {"type": "string", "description": "可选的标签页 ID。如果提供,系统将在扫描前切换到该标签页"},
"text_only": {"type": "boolean", "description": "只要纯文本不要HTML"}}}
}},
{"type": "function", "function": {
"name": "web_execute_js",
"description": "执行JS。支持Multi-call,用不同switch_tab_id并行操作多标签页。禁止猜测,准确操作以减少 web_scan 调用。无script参数时执行正文 ```javascript 块,以免转义",
"parameters": {"type": "object", "properties": {
"script": {"type": "string", "description": "[Optional] JS代码或路径。为免转义建议留空,改用正文代码块(与此参数互斥)"},
"save_to_file": {"type": "string", "description": "结果存文件,适合返回值较长时"},
"no_monitor": {"type": "boolean", "description": "跳过页面变更监控,省2-3秒。仅在纯读取信息时设置,页面操作时不要设置"},
"switch_tab_id": {"type": "string", "description": "可选的标签页 ID,切换到该标签页执行"}}}
}},
{"type": "function", "function": {
"name": "update_working_checkpoint",
"description": "短期工作便签,每轮自动注入上下文,防长任务信息丢失。前中期调用,非结束时。何时调用:(1)任务开始读SOP后,存用户需求和关键约束/参数(简单1-2步任务除外);(2)子任务切换或上下文即将被冲刷前;(3)多次重试失败后,重读SOP并必须调用存储新发现;(4)切换新任务时更新内容,清旧进度但保留仍有效的约束。\n\n何时不调用:简单任务(1-2步且无严重约束)、任务已完成时(应当用长期结算工具)",
"parameters": {"type": "object", "properties": {
"key_info": {"type": "string", "description": "替换当前便签(<200 tokens)。增量更新:先回顾现有内容,保留仍有效的,再增删改。存:要避的坑、用户原始需求、关键参数/发现、文件路径、当前进度、下一步计划。不存:马上要用用完即丢的、上下文中显而易见的、用户已换全新任务时的旧任务信息。宁多更新不丢关键"},
"related_sop": {"type": "string", "description": "相关sop名称,可以多个,必要时需要再读"}}}
}},
{"type": "function", "function": {
"name": "ask_user",
"description": "当需要用户决策、提供额外信息或遇到无法自动解决的阻碍时,调用此工具中断任务并提问",
"parameters": {"type": "object", "properties": {
"question": {"type": "string", "description": "向用户提出的明确问题"},
"candidates": {"type": "array", "items": {"type": "string"}, "description": "提供给用户的可选快捷选项列表"}}}
}},
{"type": "function", "function": {
"name": "start_long_term_update",
"description": "准备开始提炼记忆。发现值得长期记忆的信息(环境事实/用户偏好/避坑经验)时调用此工具。已记忆更新或在自主流程内时无需调用。超15轮完成的任务必须调用以沉淀经验",
"parameters": {"type": "object", "properties": {}}}
}
]
================================================
FILE: frontends/DESKTOP_PET_README.md
================================================
# Desktop Pet Skin System
## 快速开始
运行桌面宠物:
```bash
python3 desktop_pet_v2.pyw
```
## Features
### 1. Multiple Skins
- Auto-discovers all skins under the `skins/` directory
- Switch skins from the right-click menu
- Supports both sprite sheet and GIF formats
### 2. Multiple Animation States
- **idle** - idle animation
- **walk** - walking animation
- **run** - running animation
- **sprint** - sprinting animation
Animation states can be switched from the right-click menu.
### 3. Interaction
- **Click** - drag the pet
- **Double-click** - quit the program
- **Right-click** - open the menu (switch skin/animation)
### 4. HTTP Remote Control
```bash
# Show a message
curl "http://127.0.0.1:51983/?msg=Hello"
# Switch animation state
curl "http://127.0.0.1:51983/?state=run"
# POST a message
curl -X POST -d "Task finished" http://127.0.0.1:51983/
```
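The same endpoint can also be driven from Python's standard library, e.g. from a script the agent writes. A minimal sketch, assuming only the port and `?msg=` / `?state=` query parameters shown in the curl examples above; the helper names are illustrative, not part of the pet's API:

```python
from urllib.parse import quote
from urllib.request import urlopen

PET_URL = "http://127.0.0.1:51983/"  # port from the examples above

def pet_url(msg=None, state=None):
    """Build a control URL mirroring the ?msg= / ?state= query parameters."""
    if msg is not None:
        return f"{PET_URL}?msg={quote(msg)}"
    if state is not None:
        return f"{PET_URL}?state={quote(state)}"
    return PET_URL

def notify(msg):
    """Send a message to the pet; returns None quietly if the pet is not running."""
    try:
        with urlopen(pet_url(msg=msg), timeout=2) as resp:
            return resp.status
    except OSError:
        return None
```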
## Adding a New Skin
### Directory Structure
```
skins/
└── your-skin-name/
    ├── skin.json      # config file (required)
    ├── idle.png       # animation assets
    ├── walk.png
    ├── run.png
    └── sprint.png
```
### skin.json Examples
#### Sprite Sheet Format (recommended)
```json
{
"name": "My Pet",
"version": "1.0.0",
"author": "Your Name",
"description": "描述",
"format": "sprite",
"animations": {
"idle": {
"file": "idle.png",
"loop": true,
"sprite": {
"frameWidth": 44,
"frameHeight": 31,
"frameCount": 6,
"columns": 6,
"fps": 6,
"startFrame": 0
}
},
"walk": {
"file": "walk.png",
"loop": true,
"sprite": {
"frameWidth": 65,
"frameHeight": 32,
"frameCount": 8,
"columns": 8,
"fps": 8,
"startFrame": 0
}
}
}
}
```
#### GIF Format
```json
{
"name": "My Pet",
"format": "gif",
"animations": {
"idle": {
"file": "idle.gif",
"loop": true
},
"walk": {
"file": "walk.gif",
"loop": true
}
}
}
```
### Configuration Reference
- **frameWidth/frameHeight**: size of a single frame, in pixels
- **frameCount**: number of frames
- **columns**: number of columns in the sprite sheet
- **fps**: playback frame rate
- **startFrame**: index of the starting frame (0-based)
### Sprite Sheet Layout
```
+---------+---------+---------+---------+
| frame 0 | frame 1 | frame 2 | frame 3 |  ← row 1
+---------+---------+---------+---------+
| frame 4 | frame 5 | frame 6 | frame 7 |  ← row 2
+---------+---------+---------+---------+
```
With `columns=4, startFrame=2, frameCount=3`, frames 2, 3, and 4 are read.
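That frame-selection rule can be sketched as a small helper that maps the sprite config to pixel crop boxes. A minimal sketch; the function name is illustrative, not the pet's actual implementation:

```python
def frame_boxes(frame_width, frame_height, columns, start_frame, frame_count):
    """Return (left, top, right, bottom) crop boxes for each animation frame,
    reading the sprite sheet row by row starting at start_frame."""
    boxes = []
    for i in range(start_frame, start_frame + frame_count):
        row, col = divmod(i, columns)  # frames are laid out row-major
        left, top = col * frame_width, row * frame_height
        boxes.append((left, top, left + frame_width, top + frame_height))
    return boxes

# columns=4, startFrame=2, frameCount=3 selects frames 2, 3, 4:
# frames 2 and 3 sit on row 0, frame 4 wraps to row 1.
```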
## Bundled Skins
1. **Glube** - pixel-art monster (multi-file sprites)
2. **Vita** - pixel-art dinosaur (single-file sprite)
3. **Doux** - pixel-art dinosaur (single-file sprite)
## Importing More Skins from ai-bubu
The ai-bubu project ships additional skin assets that can be copied over directly:
```bash
# copy skins
cp -r ai-bubu-main/packages/app/public/skins/boy frontends/skins/
cp -r ai-bubu-main/packages/app/public/skins/dinosaur frontends/skins/
cp -r ai-bubu-main/packages/app/public/skins/line frontends/skins/
cp -r ai-bubu-main/packages/app/public/skins/mort frontends/skins/
cp -r ai-bubu-main/packages/app/public/skins/tard frontends/skins/
```
## Integration with stapp.py
Clicking the "🐱 桌面宠物" (Desktop Pet) button in `stapp.py` starts the desktop pet automatically and sends it a notification at the end of every turn.
## Troubleshooting
### Skin not showing
1. Check that `skin.json` is valid
2. Confirm the image files exist
3. Check that the sprite config parameters match the image dimensions
### Choppy animation
- Adjust the `fps` parameter
- Check that the frame count is correct
### Transparency issues
- Make sure PNG files include an alpha channel
- Use RGBA-mode images
## Technical Details
- Built on Tkinter + PIL/Pillow
- Transparent background support (#01FF01 color key)
- Always-on-top, borderless window
- HTTP server port: 51983
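The #01FF01 color key works by making exactly-matching pixels fully transparent. A minimal sketch of that mapping on raw pixel tuples, assuming exact key matching; the real window transparency goes through Tkinter/Pillow, and the function name is illustrative:

```python
COLOR_KEY = (0x01, 0xFF, 0x01)  # the green key from the list above

def apply_color_key(pixels, key=COLOR_KEY):
    """Map a flat list of RGB(A) pixels to RGBA, making key-colored
    pixels fully transparent and everything else fully opaque."""
    out = []
    for px in pixels:
        r, g, b = px[:3]
        a = 0 if (r, g, b) == key else (px[3] if len(px) > 3 else 255)
        out.append((r, g, b, a))
    return out
```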
================================================
FILE: frontends/btw_cmd.py
================================================
"""`/btw` 命令:side question — 不打断主 Agent 的临时 subagent 问答。
- 持锁 deepcopy backend.history → 后台线程 backend.raw_ask 单次拉答
- 主 agent backend.history 零写入;不入 task_queue
- 答案 → display_queue 'done'(install 路径)或同步 return(frontend 路径)
复用 backend.raw_ask + make_messages,不新建 LLM 实例。
"""
from __future__ import annotations
import copy, os, threading, time
from typing import Optional
_WRAPPER_ZH = """
这是用户的临时插问 (side question)。主 agent 仍在后台运行,**不会被打断**。
身份与边界:
- 你是一个独立的轻量 sub-agent
- 上下文里能看到主 agent 与用户的完整对话、最近的工具调用与结果
- 用户在问当前进展或顺便确认某事——基于已有信息**一次性**作答
- 没有任何工具可用:不要"让我查一下" / "我去试试" / 任何承诺动作
- 信息不足就坦白说"基于目前对话我不知道"
侧问内容如下:
{question}"""
_WRAPPER_EN = """
This is a side question from the user. The main agent is NOT interrupted — it continues in the background.
Identity & boundaries:
- You are an independent lightweight sub-agent
- You can see the full conversation between the main agent and the user, plus recent tool calls/results
- The user is asking about current progress or a quick aside — answer in **one shot** from existing info
- You have NO tools — never say "let me check" / "I'll try" / any action promise
- If info is missing, just say "based on the conversation I don't know"
Question:
{question}"""
_TIMEOUT_SEC = 120
def _wrapper(): return _WRAPPER_EN if os.environ.get('GA_LANG') == 'en' else _WRAPPER_ZH
def _strip_cmd(query):
s = (query or '').strip()
return s[len('/btw'):].strip() if s.startswith('/btw') else s
def _help_text():
return ('**/btw 用法**:side question — 临时问主 agent 当前进展,不打断主线\n\n'
'`/btw <你的问题>`\n\n'
'行为:抓取当前对话上下文 → 单轮纯文本作答(无工具)→ 主 agent 历史不变。')
def _snapshot_history(backend):
"""Lock + deepcopy: defends against concurrent compress_history_tags mutating inner blocks."""
with backend.lock:
return copy.deepcopy(list(backend.history))
def _build_wire(backend, history, sidequest_msg):
"""history + sidequest → wire-format. Dispatches: BaseSession subclasses → make_messages,
Native* → raw pairs (raw_ask runs _fix/_drop/_ensure transforms itself)."""
msgs = history + [sidequest_msg]
if hasattr(backend, 'make_messages'):
return backend.make_messages(msgs)
return [{"role": m["role"], "content": list(m.get("content", []))} for m in msgs]
def _ask(agent, question, deadline):
"""One-shot raw_ask against current backend; never mutates backend.history."""
backend = agent.llmclient.backend
user_msg = {"role": "user",
"content": [{"type": "text", "text": _wrapper().format(question=question)}]}
wire = _build_wire(backend, _snapshot_history(backend), user_msg)
text = ''
for chunk in backend.raw_ask(wire):
text += chunk
if time.time() > deadline:
return text + '\n\n⚠️ /btw 超时,仅返回部分回复。'
return text
def _format(question, body, took):
head = f'> 🟡 /btw {question}\n\n'
return head + (body.strip() or '*(空回复)*') + f'\n\n*({took:.1f}s)*'
def _run(agent, question, deadline):
"""Catches errors at the boundary so neither caller path needs its own try/except."""
try: return _ask(agent, question, deadline)
except Exception as e: return f'❌ /btw 失败: {type(e).__name__}: {e}'
def handle(agent, query, display_queue) -> Optional[str]:
"""Slash-cmd entry (server-side, install path). Spawn worker; return None to consume."""
question = _strip_cmd(query)
if not question or question in ('help', '?', '-h', '--help'):
display_queue.put({'done': _help_text(), 'source': 'system'})
return None
started = time.time()
deadline = started + _TIMEOUT_SEC
def worker():
body = _run(agent, question, deadline)
display_queue.put({'done': _format(question, body, time.time() - started), 'source': 'system'})
threading.Thread(target=worker, daemon=True, name='btw-sidequest').start()
return None
def handle_frontend_command(agent, query) -> str:
"""Sync entry for frontends wanting a string back (tg/wx/stapp/...)."""
question = _strip_cmd(query)
if not question or question in ('help', '?', '-h', '--help'):
return _help_text()
started = time.time()
body = _run(agent, question, started + _TIMEOUT_SEC)
return _format(question, body, time.time() - started)
def install(cls):
"""Idempotent monkey-patch: intercept /btw before original dispatch."""
orig = cls._handle_slash_cmd
if getattr(orig, '_btw_patched', False): return
def patched(self, raw_query, display_queue):
s = (raw_query or '').strip()
if s == '/btw' or s.startswith('/btw ') or s.startswith('/btw\t'):
r = handle(self, raw_query, display_queue)
if r is None: return None
return r
return orig(self, raw_query, display_queue)
patched._btw_patched = True
cls._handle_slash_cmd = patched
================================================
FILE: frontends/chatapp_common.py
================================================
import ast, asyncio, glob, json, os, queue as Q, re, socket, sys, time
HELP_COMMANDS = (
("/help", "显示帮助"),
("/status", "查看状态"),
("/stop", "停止当前任务"),
("/new", "开启新对话并清空当前上下文"),
("/restore", "恢复上次对话历史"),
("/continue", "列出可恢复会话"),
("/continue [n]", "恢复第 n 个会话"),
("/btw ", "side question — 临时插问主 agent 进展,不打断主线"),
("/llm", "查看当前模型列表"),
("/llm [n]", "切换到第 n 个模型"),
)
TELEGRAM_MENU_COMMANDS = (
("help", "显示帮助"),
("status", "查看状态"),
("stop", "停止当前任务"),
("new", "开启新对话并清空当前上下文"),
("restore", "恢复上次对话历史"),
("continue", "列出可恢复会话;/continue n 恢复第 n 个"),
("llm", "查看模型列表;/llm n 切换到指定模型"),
)
def build_help_text(commands=HELP_COMMANDS):
return "📖 命令列表:\n" + "\n".join(f"{cmd} - {desc}" for cmd, desc in commands)
HELP_TEXT = build_help_text()
FILE_HINT = "If you need to show files to user, use [FILE:filepath] in your response."
TAG_PATS = [r"<" + t + r">.*?" + t + r">" for t in ("thinking", "summary", "tool_use", "file_content")]
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
RESTORE_GLOBS = (
os.path.join(PROJECT_ROOT, "temp", "model_responses", "model_responses_*.txt"),
os.path.join(PROJECT_ROOT, "temp", "model_responses_*.txt"),
)
RESTORE_BLOCK_RE = re.compile(
r"^=== (Prompt|Response) ===.*?\n(.*?)(?=^=== (?:Prompt|Response) ===|\Z)",
re.DOTALL | re.MULTILINE,
)
HISTORY_RE = re.compile(r"<history>\s*(.*?)\s*</history>", re.DOTALL)
SUMMARY_RE = re.compile(r"<summary>\s*(.*?)\s*</summary>", re.DOTALL)
def clean_reply(text):
for pat in TAG_PATS:
text = re.sub(pat, "", text or "", flags=re.DOTALL)
return re.sub(r"\n{3,}", "\n\n", text).strip() or "..."
def extract_files(text):
return re.findall(r"\[FILE:([^\]]+)\]", text or "")
def strip_files(text):
return re.sub(r"\[FILE:[^\]]+\]", "", text or "").strip()
def split_text(text, limit):
text, parts = (text or "").strip() or "...", []
while len(text) > limit:
cut = text.rfind("\n", 0, limit)
if cut < limit * 0.6:
cut = limit
parts.append(text[:cut].rstrip())
text = text[cut:].lstrip()
return parts + ([text] if text else []) or ["..."]
def _restore_log_files():
files = []
for pattern in RESTORE_GLOBS:
files.extend(glob.glob(pattern))
return sorted(set(files))
def _restore_text_pairs(content):
users = re.findall(r"=== USER ===\n(.+?)(?==== |$)", content, re.DOTALL)
resps = re.findall(r"=== Response ===.*?\n(.+?)(?==== Prompt|$)", content, re.DOTALL)
restored = []
for u, r in zip(users, resps):
u, r = u.strip(), r.strip()[:500]
if u and r:
restored.extend([f"[USER]: {u}", f"[Agent] {r}"])
return restored
def _native_prompt_obj(prompt_body):
try:
prompt = json.loads(prompt_body)
except Exception:
return None
if not isinstance(prompt, dict) or prompt.get("role") != "user":
return None
if not isinstance(prompt.get("content"), list):
return None
return prompt
def _native_prompt_text(prompt):
texts = []
for block in prompt.get("content", []):
if isinstance(block, dict) and block.get("type") == "text":
text = block.get("text", "")
if isinstance(text, str) and text.strip():
texts.append(text)
return "\n".join(texts).strip()
def _native_history_lines(prompt_text):
match = HISTORY_RE.search(prompt_text or "")
if not match:
return []
restored = []
for line in match.group(1).splitlines():
line = line.strip()
if line.startswith("[USER]: ") or line.startswith("[Agent] "):
restored.append(line)
return restored
def _native_first_user_line(prompt_text):
text = (prompt_text or "").strip()
if not text or "" in text or text.startswith("### [WORKING MEMORY]"):
return ""
if text.startswith(FILE_HINT):
text = text[len(FILE_HINT):].lstrip()
if "### 用户当前消息" in text:
text = text.split("### 用户当前消息", 1)[-1].strip()
return text
def _native_response_summary(response_body):
try:
blocks = ast.literal_eval((response_body or "").strip())
except Exception:
return ""
if not isinstance(blocks, list):
return ""
text_parts = []
for block in blocks:
if isinstance(block, dict) and block.get("type") == "text":
text = block.get("text", "")
if isinstance(text, str) and text:
text_parts.append(text)
match = SUMMARY_RE.search("\n".join(text_parts))
return (match.group(1).strip() if match else "")[:500]
def _restore_native_history(content):
blocks = RESTORE_BLOCK_RE.findall(content or "")
if not blocks:
return []
pairs = []
pending_prompt = None
for label, body in blocks:
if label == "Prompt":
pending_prompt = body
elif pending_prompt is not None:
pairs.append((pending_prompt, body))
pending_prompt = None
for prompt_body, response_body in reversed(pairs):
prompt = _native_prompt_obj(prompt_body)
if prompt is None:
continue
prompt_text = _native_prompt_text(prompt)
restored = list(_native_history_lines(prompt_text))
if restored:
summary = _native_response_summary(response_body)
summary_line = f"[Agent] {summary}" if summary else ""
if summary_line and (not restored or restored[-1] != summary_line):
restored.append(summary_line)
return restored
user_text = _native_first_user_line(prompt_text)
summary = _native_response_summary(response_body)
if user_text and summary:
return [f"[USER]: {user_text}", f"[Agent] {summary}"]
return []
def format_restore():
files = _restore_log_files()
if not files:
return None, "❌ 没有找到历史记录"
latest = max(files, key=os.path.getmtime)
with open(latest, "r", encoding="utf-8") as f:
content = f.read()
restored = _restore_text_pairs(content) or _restore_native_history(content)
if not restored:
return None, "❌ 历史记录里没有可恢复内容"
count = sum(1 for line in restored if line.startswith("[USER]: "))
return (restored, os.path.basename(latest), count), None
def build_done_text(raw_text):
files = [p for p in extract_files(raw_text) if os.path.exists(p)]
body = strip_files(clean_reply(raw_text))
if files:
body = (body + "\n\n" if body else "") + "\n".join(f"生成文件: {p}" for p in files)
return body or "..."
def public_access(allowed):
return not allowed or "*" in allowed
def to_allowed_set(value):
if value is None:
return set()
if isinstance(value, str):
value = [value]
return {str(x).strip() for x in value if str(x).strip()}
def allowed_label(allowed):
return "public" if public_access(allowed) else sorted(allowed)
def ensure_single_instance(port, label):
try:
lock_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lock_sock.bind(("127.0.0.1", port))
return lock_sock
except OSError:
print(f"[{label}] Another instance is already running, skipping...")
sys.exit(1)
def require_runtime(agent, label, **required):
missing = [k for k, v in required.items() if not v]
if missing:
print(f"[{label}] ERROR: please set {', '.join(missing)} in mykey.py or mykey.json")
sys.exit(1)
if agent.llmclient is None:
print(f"[{label}] ERROR: no usable LLM backend found in mykey.py or mykey.json")
sys.exit(1)
def redirect_log(script_file, log_name, label, allowed):
log_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(script_file))), "temp")
os.makedirs(log_dir, exist_ok=True)
logf = open(os.path.join(log_dir, log_name), "a", encoding="utf-8", buffering=1)
sys.stdout = sys.stderr = logf
print(f"[NEW] {label} process starting, the above are history infos ...")
print(f"[{label}] allow list: {allowed_label(allowed)}")
class AgentChatMixin:
label = "Chat"
source = "chat"
split_limit = 1500
ping_interval = 20
def __init__(self, agent, user_tasks):
self.agent, self.user_tasks = agent, user_tasks
async def send_text(self, chat_id, content, **ctx):
raise NotImplementedError
async def send_done(self, chat_id, raw_text, **ctx):
await self.send_text(chat_id, build_done_text(raw_text), **ctx)
async def handle_command(self, chat_id, cmd, **ctx):
parts = (cmd or "").split()
op = (parts[0] if parts else "").lower()
if op == "/help":
return await self.send_text(chat_id, HELP_TEXT, **ctx)
if op == "/stop":
state = self.user_tasks.get(chat_id)
if state:
state["running"] = False
self.agent.abort()
return await self.send_text(chat_id, "⏹️ 正在停止...", **ctx)
if op == "/status":
llm = self.agent.get_llm_name() if self.agent.llmclient else "未配置"
return await self.send_text(chat_id, f"状态: {'🔴 运行中' if self.agent.is_running else '🟢 空闲'}\nLLM: [{self.agent.llm_no}] {llm}", **ctx)
if op == "/llm":
if not self.agent.llmclient:
return await self.send_text(chat_id, "❌ 当前没有可用的 LLM 配置", **ctx)
if len(parts) > 1:
try:
self.agent.next_llm(int(parts[1]))
return await self.send_text(chat_id, f"✅ 已切换到 [{self.agent.llm_no}] {self.agent.get_llm_name()}", **ctx)
except Exception:
return await self.send_text(chat_id, f"用法: /llm <0-{len(self.agent.list_llms()) - 1}>", **ctx)
lines = [f"{'→' if cur else ' '} [{i}] {name}" for i, name, cur in self.agent.list_llms()]
return await self.send_text(chat_id, "LLMs:\n" + "\n".join(lines), **ctx)
if op == "/restore":
try:
restored_info, err = format_restore()
if err:
return await self.send_text(chat_id, err, **ctx)
restored, fname, count = restored_info
self.agent.abort()
self.agent.history.extend(restored)
return await self.send_text(chat_id, f"✅ 已恢复 {count} 轮对话\n来源: {fname}\n(仅恢复上下文,请输入新问题继续)", **ctx)
except Exception as e:
return await self.send_text(chat_id, f"❌ 恢复失败: {e}", **ctx)
if op == "/continue":
return await self.send_text(chat_id, _handle_continue_frontend(self.agent, cmd), **ctx)
if op == "/new":
return await self.send_text(chat_id, _reset_conversation(self.agent), **ctx)
if op == "/btw":
answer = await asyncio.to_thread(_handle_btw_frontend, self.agent, cmd)
return await self.send_text(chat_id, answer, **ctx)
return await self.send_text(chat_id, HELP_TEXT, **ctx)
async def run_agent(self, chat_id, text, **ctx):
state = {"running": True}
self.user_tasks[chat_id] = state
try:
await self.send_text(chat_id, "思考中...", **ctx)
dq = self.agent.put_task(f"{FILE_HINT}\n\n{text}", source=self.source)
last_ping = time.time()
while state["running"]:
try:
item = await asyncio.to_thread(dq.get, True, 3)
except Q.Empty:
if self.agent.is_running and time.time() - last_ping > self.ping_interval:
await self.send_text(chat_id, "⏳ 还在处理中,请稍等...", **ctx)
last_ping = time.time()
continue
if "done" in item:
await self.send_done(chat_id, item.get("done", ""), **ctx)
break
if not state["running"]:
await self.send_text(chat_id, "⏹️ 已停止", **ctx)
except Exception as e:
import traceback
print(f"[{self.label}] run_agent error: {e}")
traceback.print_exc()
await self.send_text(chat_id, f"❌ 错误: {e}", **ctx)
finally:
self.user_tasks.pop(chat_id, None)
from agentmain import GeneraticAgent as _GA
from continue_cmd import handle_frontend_command as _handle_continue_frontend, install as _install_continue, reset_conversation as _reset_conversation
_install_continue(_GA)
from btw_cmd import handle_frontend_command as _handle_btw_frontend, install as _install_btw; _install_btw(_GA)
================================================
FILE: frontends/continue_cmd.py
================================================
"""`/continue` command: list & restore past model_responses sessions.
Pure functions + one `install(cls)` monkey-patch entry. No side effects at import.
"""
import ast, glob, json, os, re, time
_LOG_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
'temp', 'model_responses')
_LOG_GLOB = os.path.join(_LOG_DIR, 'model_responses_*.txt')
_BLOCK_RE = re.compile(r'^=== (Prompt|Response) ===.*?\n(.*?)(?=^=== (?:Prompt|Response) ===|\Z)',
re.DOTALL | re.MULTILINE)
_SUMMARY_RE = re.compile(r'<summary>\s*(.*?)\s*</summary>', re.DOTALL)
def _rel_time(mtime):
d = int(time.time() - mtime)
if d < 60: return f'{d}秒前'
if d < 3600: return f'{d // 60}分前'
if d < 86400: return f'{d // 3600}小时前'
return f'{d // 86400}天前'
def _pairs(content):
blocks, pairs, pending = _BLOCK_RE.findall(content or ''), [], None
for label, body in blocks:
if label == 'Prompt': pending = body.strip()
elif pending is not None:
pairs.append((pending, body.strip())); pending = None
return pairs
def _first_user(pairs):
for p, _ in pairs:
try: msg = json.loads(p)
except Exception: continue
if not isinstance(msg, dict): continue
for blk in msg.get('content', []) or []:
if isinstance(blk, dict) and blk.get('type') == 'text':
t = (blk.get('text') or '').strip()
if t and '<history>' not in t and not t.startswith('### [WORKING MEMORY]'):
return t
for p, _ in pairs[:1]:
for line in p.splitlines():
s = line.strip()
if s and not s.startswith('###'): return s
return ''
def _last_summary(pairs):
for _, response_body in reversed(pairs):
try:
blocks = ast.literal_eval(response_body)
except Exception:
continue
if not isinstance(blocks, list):
continue
text_parts = []
for block in blocks:
if isinstance(block, dict) and block.get('type') == 'text':
text = block.get('text', '')
if isinstance(text, str) and text:
text_parts.append(text)
match = _SUMMARY_RE.search('\n'.join(text_parts))
if match:
summary = match.group(1).strip()
if summary:
return summary
return ''
def _preview_text(pairs):
return _last_summary(pairs) or _first_user(pairs)
def _recent_context(my_pid, n=5):
"""扫描最近 n 个 model_response 文件(排除自身),提取 lastQ / lastA。"""
out = []
for f in sorted(glob.glob(_LOG_GLOB), key=os.path.getmtime, reverse=True):
m = re.search(r'model_responses_(\d+)', os.path.basename(f))
if not m or m.group(1) == str(my_pid): continue
try: c = open(f, encoding='utf-8', errors='ignore').read()
except Exception: continue
q = s = ""
for hm in re.finditer(r'<history>(.*?)</history>', c, re.DOTALL):
u = re.search(r'\[USER\]:\s*(.+?)(?:\\n|<)', hm.group(1))
if u: q = u.group(1)
sm = _SUMMARY_RE.search(c)
if sm: s = sm.group(1).strip()
q, s = q[:60].strip(), s[:60].replace('\n', ' ').strip()
out.append(f'· {m.group(1)} | lastQ: {q or "-"} | lastA: {s or "-"}')
if len(out) >= n: break
return ('[RecentContext] 近期并行会话(非当前):\n' + '\n'.join(out) + '\n[/RecentContext]') if out else ""
def _parse_native_history(pairs):
history = []
for p, r in pairs:
try: user_msg = json.loads(p)
except Exception: return None
try: blocks = ast.literal_eval(r)
except Exception: return None
if not (isinstance(user_msg, dict) and user_msg.get('role') == 'user'): return None
if not isinstance(blocks, list): return None
history.append(user_msg)
history.append({'role': 'assistant', 'content': blocks})
return history
def list_sessions(exclude_pid=None):
"""Newest-first list of (path, mtime, first_user_text, n_rounds)."""
files = glob.glob(_LOG_GLOB)
if exclude_pid is not None:
tag = f'model_responses_{exclude_pid}.txt'
files = [f for f in files if not f.endswith(tag)]
out = []
for f in files:
try:
with open(f, encoding='utf-8', errors='replace') as fh:
content = fh.read()
except Exception: continue
pairs = _pairs(content)
if not pairs: continue
out.append((f, os.path.getmtime(f), _preview_text(pairs), len(pairs)))
out.sort(key=lambda x: x[1], reverse=True)
return out
_MD_ESCAPE_RE = re.compile(r'([\\`*_\[\]])')
def _escape_md(s): return _MD_ESCAPE_RE.sub(r'\\\1', s)
def _agent_clients(agent):
clients = []
for client in getattr(agent, 'llmclients', []) or []:
if client not in clients:
clients.append(client)
current = getattr(agent, 'llmclient', None)
if current is not None and current not in clients:
clients.insert(0, current)
return clients
def _replace_backend_history(agent, history):
backend = getattr(getattr(agent, 'llmclient', None), 'backend', None)
if backend is not None and hasattr(backend, 'history'):
backend.history = list(history or [])
def _current_log_path(pid=None):
pid = os.getpid() if pid is None else pid
return os.path.join(_LOG_DIR, f'model_responses_{pid}.txt')
def _snapshot_current_log(pid=None):
"""Persist current PID log as a standalone recoverable snapshot, then clear it."""
path = _current_log_path(pid)
if not os.path.isfile(path):
return None
try:
with open(path, encoding='utf-8', errors='replace') as fh:
content = fh.read()
except Exception:
return None
if not _pairs(content):
return None
os.makedirs(_LOG_DIR, exist_ok=True)
pid = os.getpid() if pid is None else pid
stamp = time.strftime('%Y%m%d_%H%M%S')
snapshot = os.path.join(_LOG_DIR, f'model_responses_snapshot_{pid}_{stamp}_{time.time_ns() % 1_000_000_000:09d}.txt')
with open(snapshot, 'w', encoding='utf-8', errors='replace') as fh:
fh.write(content)
with open(path, 'w', encoding='utf-8', errors='replace'):
pass
return snapshot
def reset_conversation(agent, message='🆕 已开启新对话,当前上下文已清空'):
"""Abort current work and clear all known frontend-visible conversation state."""
try:
agent.abort()
except Exception:
pass
_snapshot_current_log()
if hasattr(agent, 'history'):
agent.history = []
for client in _agent_clients(agent):
backend = getattr(client, 'backend', None)
if backend is not None and hasattr(backend, 'history'):
backend.history = []
if hasattr(client, 'last_tools'):
client.last_tools = ''
if hasattr(agent, 'handler'):
agent.handler = None
return message
def format_list(sessions, limit=20):
if not sessions: return '❌ 没有可恢复的历史会话'
lines = ['**可恢复会话**(输入 `/continue N` 恢复第 N 个):', '']
for i, (_, mtime, first, n) in enumerate(sessions[:limit], 1):
preview = _escape_md((first or '(无法预览)').replace('\n', ' ')[:60])
lines.append(f'{i}. `{_rel_time(mtime)}` · **{n} 轮** · {preview}')
return '\n'.join(lines)
def restore(agent, path):
"""Restore session at path. Returns (msg, is_full)."""
try:
with open(path, encoding='utf-8', errors='replace') as fh:
content = fh.read()
except Exception as e: return f'❌ 读取失败: {e}', False
pairs = _pairs(content)
if not pairs: return f'❌ {os.path.basename(path)} 为空或格式不符', False
history = _parse_native_history(pairs)
name = os.path.basename(path)
if history is not None:
agent.abort()
_replace_backend_history(agent, history)
return f'✅ 已恢复 {len(pairs)} 轮完整对话({name})\n(已写入 backend.history,可直接继续)', True
from chatapp_common import _restore_native_history, _restore_text_pairs
summary = _restore_text_pairs(content) or _restore_native_history(content)
if not summary: return f'❌ {name} 无法解析(非 native 且无摘要可提取)', False
agent.abort()
agent.history.extend(summary)
n = sum(1 for l in summary if l.startswith('[USER]: '))
return f'⚠️ 非 native 格式,已降级恢复 {n} 轮摘要({name})\n(请输入新问题继续)', False
def handle(agent, query, display_queue):
"""Dispatch /continue or /continue N. Returns None if consumed else original query."""
s = (query or '').strip()
if s == '/continue':
display_queue.put({'done': format_list(list_sessions(exclude_pid=os.getpid())), 'source': 'system'})
return None
m = re.match(r'/continue\s+(\d+)\s*$', s)
if m:
sessions = list_sessions(exclude_pid=os.getpid())
idx = int(m.group(1)) - 1
if not (0 <= idx < len(sessions)):
display_queue.put({'done': f'❌ 索引越界(有效范围 1-{len(sessions)})', 'source': 'system'})
return None
reset_conversation(agent, message=None)
msg, _ = restore(agent, sessions[idx][0])
display_queue.put({'done': msg, 'source': 'system'})
return None
return query
def _user_text(prompt_body):
"""User-typed text from a prompt JSON; '' if this is an agent auto-continuation."""
try: msg = json.loads(prompt_body)
except Exception: return ''
if not isinstance(msg, dict): return ''
for blk in msg.get('content', []) or []:
if isinstance(blk, dict) and blk.get('type') == 'text':
t = (blk.get('text') or '').strip()
if t and not t.startswith('### [WORKING MEMORY]'): return t
return ''
def _assistant_text(response_body):
"""Joined text from a response blocks repr; '' on parse failure."""
try: blocks = ast.literal_eval(response_body)
except Exception: return ''
if not isinstance(blocks, list): return ''
return '\n'.join(b['text'] for b in blocks
if isinstance(b, dict) and b.get('type') == 'text'
and isinstance(b.get('text'), str) and b['text'].strip())
_TURN_MARK = '**LLM Running (Turn {}) ...**\n\n'
def extract_ui_messages(path):
"""Parse a model_responses log into [{role, content}, ...] for UI replay.
Auto-continuation turns are folded into one assistant bubble with Turn markers,
matching live chat rendering via fold_turns().
"""
try:
with open(path, encoding='utf-8', errors='replace') as f: content = f.read()
except Exception: return []
rounds = [] # [(user_text, [turn_text, ...]), ...]
for prompt, response in _pairs(content):
user = _user_text(prompt)
if user or not rounds: rounds.append((user, []))
rounds[-1][1].append(_assistant_text(response))
out = []
for user, turns in rounds:
if not user or not any(turns): continue
body = '\n\n'.join(t if i == 0 else _TURN_MARK.format(i + 1) + t
for i, t in enumerate(turns))
out += [{'role': 'user', 'content': user},
{'role': 'assistant', 'content': body}]
return out
def handle_frontend_command(agent, query, exclude_pid=None):
"""Frontend-friendly /continue entry that returns text directly."""
s = (query or '').strip()
exclude_pid = os.getpid() if exclude_pid is None else exclude_pid
if s == '/continue':
return format_list(list_sessions(exclude_pid=exclude_pid))
m = re.match(r'/continue\s+(\d+)\s*$', s)
if not m:
return '用法: /continue 或 /continue N'
sessions = list_sessions(exclude_pid=exclude_pid)
idx = int(m.group(1)) - 1
if not (0 <= idx < len(sessions)):
return f'❌ 索引越界(有效范围 1-{len(sessions)})'
reset_conversation(agent, message=None)
msg, _ = restore(agent, sessions[idx][0])
return msg
def install(cls):
"""Wrap cls._handle_slash_cmd so /continue is handled before original dispatch."""
orig = cls._handle_slash_cmd
if getattr(orig, '_continue_patched', False): return
def patched(self, raw_query, display_queue):
if (raw_query or '').startswith('/continue'):
r = handle(self, raw_query, display_queue)
if r is None: return None
return orig(self, raw_query, display_queue)
patched._continue_patched = True
cls._handle_slash_cmd = patched
================================================
FILE: frontends/dcapp.py
================================================
# Discord Bot Frontend for GenericAgent
# ⚠️ Requires enabling "Message Content Intent" in the Discord Developer Portal:
# Bot → Privileged Gateway Intents → MESSAGE CONTENT INTENT → enable
# pip install discord.py
import asyncio, json, os, queue as Q, re, sys, threading, time
from collections import OrderedDict
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from agentmain import GeneraticAgent
from chatapp_common import (
AgentChatMixin, build_done_text, ensure_single_instance, extract_files,
public_access, redirect_log, require_runtime, split_text, strip_files, clean_reply,
HELP_TEXT, FILE_HINT, format_restore,
_handle_continue_frontend, _reset_conversation,
)
from llmcore import mykeys
try:
import discord
except Exception:
print("Please install discord.py to use Discord: pip install discord.py")
sys.exit(1)
agent = GeneraticAgent(); agent.verbose = False
BOT_TOKEN = str(mykeys.get("discord_bot_token", "") or "").strip()
ALLOWED = {str(x).strip() for x in mykeys.get("discord_allowed_users", []) if str(x).strip()}
USER_TASKS = {}
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
TEMP_DIR = os.path.join(PROJECT_ROOT, "temp")
MEDIA_DIR = os.path.join(TEMP_DIR, "discord_media")
ACTIVE_FILE = os.path.join(TEMP_DIR, "discord_active_channels.json")
ACTIVE_TTL_SECONDS = 30 * 24 * 3600
EXIT_CHANNEL_TEXTS = {"退出该频道", "退出此频道", "退出频道"}
EXIT_THREAD_TEXTS = {"退出该子区", "退出此子区", "退出子区"}
os.makedirs(MEDIA_DIR, exist_ok=True)
def _extract_discord_progress(text):
"""Return the newest concise from a streaming transcript."""
matches = re.findall(r"\s*(.*?)\s*", text or "", flags=re.DOTALL)
if not matches:
return ""
summary = re.sub(r"\s+", " ", matches[-1]).strip()
return summary[:120]
def _strip_discord_transcript(text):
"""Hide LLM/tool transcript noise while preserving the final natural reply."""
text = text or ""
text = re.sub(r"^\s*\*?\*?LLM Running \(Turn \d+\) \.\.\.\*?\*?\s*$", "", text, flags=re.M)
text = re.sub(r"^\s*🛠️\s+.*?(?=^\s*(?:\*?\*?LLM Running||$))", "", text, flags=re.M | re.DOTALL)
text = re.sub(r"^\s*(?:✅|❌|ERR|STDOUT|PAT\b|RC\b).*?$", "", text, flags=re.M)
text = re.sub(r".*?", "", text, flags=re.DOTALL)
text = clean_reply(text)
return strip_files(text).strip()
def _display_done_text(text):
body = _strip_discord_transcript(text)
if body and body != "...":
return body
summaries = re.findall(r"<summary>\s*(.*?)\s*</summary>", text or "", flags=re.DOTALL)
if summaries:
return re.sub(r"\s+", " ", summaries[-1]).strip() or "..."
return "..."
class DiscordApp(AgentChatMixin):
label, source, split_limit = "Discord", "discord", 1900
def __init__(self):
super().__init__(agent, USER_TASKS)
intents = discord.Intents.default()
intents.message_content = True
intents.guilds = True
intents.dm_messages = True
proxy = str(mykeys.get("proxy", "") or "").strip() or None
self.client = discord.Client(intents=intents, proxy=proxy)
self.background_tasks = set()
self._channel_cache = OrderedDict() # chat_id -> channel/user object (LRU, max 500)
self._active_channels = self._load_active_channels() # guild chat_id -> {last_seen: float}
self._active_lock = threading.Lock()
self._agents = OrderedDict() # chat_id -> GeneraticAgent, each chat has isolated history
self._agent_lock = threading.Lock()
@self.client.event
async def on_ready():
print(f"[Discord] bot ready: {self.client.user} ({self.client.user.id})")
@self.client.event
async def on_message(message):
await self._handle_message(message)
def _chat_id(self, message):
"""Return a string chat_id: 'dm:' or 'ch:'."""
if isinstance(message.channel, discord.DMChannel):
return f"dm:{message.author.id}"
return f"ch:{message.channel.id}"
def _load_active_channels(self):
try:
with open(ACTIVE_FILE, "r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
return {}
now = time.time()
active = {}
for chat_id, item in data.items():
if not str(chat_id).startswith("ch:") or not isinstance(item, dict):
continue
last_seen = float(item.get("last_seen") or 0)
if now - last_seen <= ACTIVE_TTL_SECONDS:
active[str(chat_id)] = {"last_seen": last_seen}
return active
except FileNotFoundError:
return {}
except Exception as e:
print(f"[Discord] failed to load active channels: {e}")
return {}
def _save_active_channels(self):
try:
os.makedirs(os.path.dirname(ACTIVE_FILE), exist_ok=True)
tmp = ACTIVE_FILE + ".tmp"
with open(tmp, "w", encoding="utf-8") as f:
json.dump(self._active_channels, f, ensure_ascii=False, indent=2, sort_keys=True)
os.replace(tmp, ACTIVE_FILE)
except Exception as e:
print(f"[Discord] failed to save active channels: {e}")
def _is_active_channel(self, chat_id, now=None):
now = now or time.time()
with self._active_lock:
item = self._active_channels.get(chat_id)
if not item:
return False
if now - float(item.get("last_seen") or 0) > ACTIVE_TTL_SECONDS:
self._active_channels.pop(chat_id, None)
self._save_active_channels()
print(f"[Discord] channel expired: {chat_id}")
return False
return True
def _touch_active_channel(self, chat_id, now=None):
if not chat_id.startswith("ch:"):
return
with self._active_lock:
self._active_channels[chat_id] = {"last_seen": float(now or time.time())}
self._save_active_channels()
def _deactivate_channel(self, chat_id):
with self._active_lock:
changed = self._active_channels.pop(chat_id, None) is not None
self._save_active_channels()
state = self.user_tasks.get(chat_id)
if state:
state["running"] = False
try:
self._get_agent(chat_id).abort()
except Exception as e:
print(f"[Discord] deactivate abort failed for {chat_id}: {e}")
return changed
def _get_agent(self, chat_id):
with self._agent_lock:
ga = self._agents.get(chat_id)
if ga is None:
ga = GeneraticAgent()
ga.verbose = False
self._agents[chat_id] = ga
threading.Thread(target=ga.run, daemon=True, name=f"discord-agent-{chat_id}").start()
if len(self._agents) > 200:
old_chat_id, _old_agent = self._agents.popitem(last=False)
print(f"[Discord] dropped agent cache entry: {old_chat_id}")
else:
self._agents.move_to_end(chat_id)
return ga
async def _download_attachments(self, message):
"""Download attachments/images to MEDIA_DIR, return list of local paths."""
paths = []
for att in message.attachments:
safe_name = re.sub(r'[<>:"/\\|?*]', '_', att.filename or f"file_{att.id}")
local_path = os.path.join(MEDIA_DIR, f"{att.id}_{safe_name}")
try:
await att.save(local_path)
paths.append(local_path)
print(f"[Discord] saved attachment: {local_path}")
except Exception as e:
print(f"[Discord] failed to save attachment {att.filename}: {e}")
return paths
async def send_text(self, chat_id, content, **ctx):
"""Send text (and optionally files) to a chat_id."""
channel = self._channel_cache.get(chat_id)
if channel is None:
try:
if chat_id.startswith("dm:"):
user = await self.client.fetch_user(int(chat_id[3:]))
channel = await user.create_dm()
else:
channel = await self.client.fetch_channel(int(chat_id[3:]))
self._channel_cache[chat_id] = channel
if len(self._channel_cache) > 500:
self._channel_cache.popitem(last=False)
except Exception as e:
print(f"[Discord] cannot resolve channel for {chat_id}: {e}")
return
for part in split_text(content, self.split_limit):
try:
await channel.send(part)
except Exception as e:
print(f"[Discord] send error: {e}")
async def send_done(self, chat_id, raw_text, **ctx):
"""Send final reply: text parts + file attachments."""
files = [p for p in extract_files(raw_text) if os.path.exists(p)]
body = _display_done_text(raw_text)
# Send text (send_text handles splitting internally)
if body and body != "...":
await self.send_text(chat_id, body, **ctx)
# Send files as Discord attachments
if files:
channel = self._channel_cache.get(chat_id)
if channel:
for fpath in files:
try:
await channel.send(file=discord.File(fpath))
except Exception as e:
print(f"[Discord] failed to send file {fpath}: {e}")
                        await self.send_text(chat_id, f"⚠️ Failed to send file: {os.path.basename(fpath)}", **ctx)
if not body and not files:
await self.send_text(chat_id, "...", **ctx)
async def handle_command(self, chat_id, cmd, **ctx):
"""Handle slash commands against the per-chat agent, keeping Discord chats isolated."""
ga = self._get_agent(chat_id)
parts = (cmd or "").split()
op = (parts[0] if parts else "").lower()
if op == "/help":
return await self.send_text(chat_id, HELP_TEXT, **ctx)
if op == "/stop":
state = self.user_tasks.get(chat_id)
if state:
state["running"] = False
ga.abort()
            return await self.send_text(chat_id, "⏹️ Stopping...", **ctx)
if op == "/status":
            llm = ga.get_llm_name() if ga.llmclient else "not configured"
            return await self.send_text(chat_id, f"Status: {'🔴 running' if ga.is_running else '🟢 idle'}\nLLM: [{ga.llm_no}] {llm}", **ctx)
if op == "/llm":
if not ga.llmclient:
                return await self.send_text(chat_id, "❌ No usable LLM configuration available", **ctx)
if len(parts) > 1:
try:
ga.next_llm(int(parts[1]))
                    return await self.send_text(chat_id, f"✅ Switched to [{ga.llm_no}] {ga.get_llm_name()}", **ctx)
except Exception:
                    return await self.send_text(chat_id, f"Usage: /llm <0-{len(ga.list_llms()) - 1}>", **ctx)
lines = [f"{'→' if cur else ' '} [{i}] {name}" for i, name, cur in ga.list_llms()]
return await self.send_text(chat_id, "LLMs:\n" + "\n".join(lines), **ctx)
if op == "/restore":
try:
restored_info, err = format_restore()
if err:
return await self.send_text(chat_id, err, **ctx)
restored, fname, count = restored_info
ga.abort()
ga.history.extend(restored)
                return await self.send_text(chat_id, f"✅ Restored {count} conversation turns\nSource: {fname}\n(Context only; send a new message to continue)", **ctx)
except Exception as e:
                return await self.send_text(chat_id, f"❌ Restore failed: {e}", **ctx)
if op == "/continue":
return await self.send_text(chat_id, _handle_continue_frontend(ga, cmd), **ctx)
if op == "/new":
return await self.send_text(chat_id, _reset_conversation(ga), **ctx)
return await self.send_text(chat_id, HELP_TEXT, **ctx)
async def run_agent(self, chat_id, text, **ctx):
"""Run the isolated per-chat Discord agent."""
ga = self._get_agent(chat_id)
state = {"running": True}
self.user_tasks[chat_id] = state
try:
            await self.send_text(chat_id, "Thinking...", **ctx)
dq = ga.put_task(f"{FILE_HINT}\n\n{text}", source=self.source)
last_ping = time.time()
last_step = ""
step_no = 0
while state["running"]:
try:
item = await asyncio.to_thread(dq.get, True, 3)
except Q.Empty:
if ga.is_running and time.time() - last_ping > self.ping_interval:
                        await self.send_text(chat_id, "⏳ Still working, please wait...", **ctx)
last_ping = time.time()
continue
if "next" in item:
step = _extract_discord_progress(item.get("next", ""))
if step and step != last_step:
step_no += 1
                        await self.send_text(chat_id, f"Step {step_no}: {step}", **ctx)
last_step = step
last_ping = time.time()
continue
if "done" in item:
await self.send_done(chat_id, item.get("done", ""), **ctx)
break
if not state["running"]:
                await self.send_text(chat_id, "⏹️ Stopped", **ctx)
except Exception as e:
import traceback
print(f"[{self.label}] run_agent error: {e}")
traceback.print_exc()
            await self.send_text(chat_id, f"❌ Error: {e}", **ctx)
finally:
self.user_tasks.pop(chat_id, None)
async def _handle_message(self, message):
        # Ignore messages from ourselves and from other bots
if message.author == self.client.user or message.author.bot:
return
is_dm = isinstance(message.channel, discord.DMChannel)
is_guild = message.guild is not None
chat_id = self._chat_id(message)
now = time.time()
mentioned = bool(is_guild and self.client.user and self.client.user.mentioned_in(message))
self._channel_cache[chat_id] = message.channel
if len(self._channel_cache) > 500:
self._channel_cache.popitem(last=False)
user_id = str(message.author.id)
user_name = str(message.author)
if not public_access(ALLOWED) and user_id not in ALLOWED:
print(f"[Discord] unauthorized user: {user_name} ({user_id})")
return
if is_guild:
active = self._is_active_channel(chat_id, now)
if not mentioned and not active:
return
if mentioned or active:
self._touch_active_channel(chat_id, now)
# Strip bot mention from content
content = message.content or ""
if is_guild and self.client.user:
content = re.sub(rf"<@!?{self.client.user.id}>", "", content).strip()
else:
content = content.strip()
normalized = re.sub(r"\s+", "", content)
if is_guild and normalized in EXIT_CHANNEL_TEXTS | EXIT_THREAD_TEXTS:
self._deactivate_channel(chat_id)
            label = "thread" if normalized in EXIT_THREAD_TEXTS else "channel"
            await self.send_text(chat_id, f"✅ Left this {label}; I won't respond here unless you @mention me again.")
print(f"[Discord] manually deactivated {chat_id} by {user_name} ({user_id})")
return
# Download attachments
attachment_paths = await self._download_attachments(message)
# Build message text with attachment paths
if attachment_paths:
            paths_text = "\n".join(f"[Attachment: {p}]" for p in attachment_paths)
content = f"{content}\n{paths_text}" if content else paths_text
if not content:
return
print(f"[Discord] message from {user_name} ({user_id}, {'dm' if is_dm else 'guild'}): {content[:200]}")
if content.startswith("/"):
return await self.handle_command(chat_id, content)
task = asyncio.create_task(self.run_agent(chat_id, content))
self.background_tasks.add(task)
task.add_done_callback(self.background_tasks.discard)
async def start(self):
print("[Discord] bot starting...")
delay, max_delay = 5, 300
while True:
started_at = time.monotonic()
try:
await self.client.start(BOT_TOKEN)
except Exception as e:
print(f"[Discord] error: {e}")
if time.monotonic() - started_at >= 60:
delay = 5
print(f"[Discord] reconnect in {delay}s...")
await asyncio.sleep(delay)
delay = min(delay * 2, max_delay)
if __name__ == "__main__":
_LOCK_SOCK = ensure_single_instance(19532, "Discord")
require_runtime(agent, "Discord", discord_bot_token=BOT_TOKEN)
redirect_log(__file__, "dcapp.log", "Discord", ALLOWED)
asyncio.run(DiscordApp().start())
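The Discord frontend above keys all per-chat state (agents, channel cache, active-channel TTLs) on a string chat_id. A minimal standalone sketch of that scheme; `make_chat_id` is a hypothetical helper mirroring `DiscordApp._chat_id`:

```python
def make_chat_id(is_dm: bool, author_id: int, channel_id: int) -> str:
    # DMs are keyed by the user, guild messages by the channel.
    # Both prefixes are exactly three characters, so the numeric
    # id can always be recovered with chat_id[3:].
    return f"dm:{author_id}" if is_dm else f"ch:{channel_id}"
```

Keeping both prefixes the same length is what lets `send_text` resolve either kind of target with the single slice `int(chat_id[3:])`.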
================================================
FILE: frontends/desktop_pet.pyw
================================================
"""Desktop Pet with HTTP Toast — ~90 lines"""
import tkinter as tk, threading, random, os, sys
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs
PORT = 41983
GIF = os.path.join(os.path.dirname(os.path.abspath(sys.argv[0])), 'pet.gif')
class Pet:
def __init__(self):
self.root = tk.Tk()
self.root.overrideredirect(True)
self.root.wm_attributes('-topmost', True)
self.root.wm_attributes('-transparentcolor', '#01FF01')
self.root.config(bg='#01FF01')
self.root.after(50, lambda: self.root.geometry('+300+500'))
# load GIF frames
self.frames, i = [], 0
while True:
try: self.frames.append(tk.PhotoImage(file=GIF, format=f'gif -index {i}')); i += 1
except: break
if not self.frames: raise FileNotFoundError(f'No GIF: {GIF}')
self.idx = 0
self.label = tk.Label(self.root, image=self.frames[0], bg='#01FF01', bd=0)
self.label.pack()
# drag
        self.label.bind('<Button-1>', lambda e: setattr(self, '_d', (e.x, e.y)))
        self.label.bind('<B1-Motion>', self._drag)
        self.label.bind('<Double-Button-1>', lambda e: (self.root.destroy(), os._exit(0)))
# start loops
self._animate()
self._wander()
self._start_server()
self.root.mainloop()
def _drag(self, e):
x, y = self.root.winfo_x() + e.x - self._d[0], self.root.winfo_y() + e.y - self._d[1]
self.root.geometry(f'+{x}+{y}')
def _animate(self):
self.idx = (self.idx + 1) % len(self.frames)
self.label.config(image=self.frames[self.idx])
self.root.after(150, self._animate)
def _wander(self):
if random.random() < 0.25:
x = self.root.winfo_x() + random.randint(-15, 15)
y = self.root.winfo_y() + random.randint(-5, 5)
self.root.geometry(f'+{x}+{y}')
self.root.after(4000, self._wander)
def show_toast(self, msg):
"""Show a speech bubble near the pet that auto-dismisses."""
tw = tk.Toplevel(self.root)
tw.overrideredirect(True)
tw.wm_attributes('-topmost', True)
tw.config(bg='#FFFDE7')
px, py = self.root.winfo_x(), self.root.winfo_y()
tw.geometry(f'+{px + 30}+{py - 50}')
# bubble content
f = tk.Frame(tw, bg='#FFFDE7', highlightbackground='#888', highlightthickness=1, padx=8, pady=4)
f.pack()
tk.Label(f, text=msg, bg='#FFFDE7', fg='#333', font=('Segoe UI', 10), wraplength=220, justify='left').pack()
# auto dismiss
tw.after(3000, tw.destroy)
def _start_server(self):
pet = self
class H(BaseHTTPRequestHandler):
def do_GET(self):
qs = parse_qs(urlparse(self.path).query)
msg = qs.get('msg', [''])[0]
if msg:
pet.root.after(0, pet.show_toast, msg)
self.send_response(200); self.end_headers(); self.wfile.write(b'ok')
else:
self.send_response(400); self.end_headers(); self.wfile.write(b'?msg=xxx')
def do_POST(self):
body = self.rfile.read(int(self.headers.get('Content-Length', 0))).decode()
if body:
pet.root.after(0, pet.show_toast, body)
self.send_response(200); self.end_headers(); self.wfile.write(b'ok')
else:
self.send_response(400); self.end_headers(); self.wfile.write(b'empty body')
def log_message(self, *a): pass
HTTPServer.allow_reuse_address = False
srv = HTTPServer(('127.0.0.1', PORT), H)
t = threading.Thread(target=srv.serve_forever, daemon=True)
t.start()
print(f'Toast server: http://127.0.0.1:{PORT}/?msg=hello')
if __name__ == '__main__':
Pet()
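The pet's HTTP server accepts a toast either as `GET /?msg=...` or as a raw POST body. A small client sketch (hypothetical helpers, assuming the pet is running locally on PORT 41983 as defined above):

```python
from urllib.parse import quote
from urllib.request import urlopen

PORT = 41983  # matches desktop_pet.pyw

def toast_url(msg: str) -> str:
    """Build the GET URL the toast server expects (?msg=...)."""
    return f"http://127.0.0.1:{PORT}/?msg={quote(msg)}"

def send_toast(msg: str, timeout: float = 2.0) -> bytes:
    """Fire a toast at the running pet; the server replies b'ok' on success."""
    with urlopen(toast_url(msg), timeout=timeout) as resp:
        return resp.read()
```

This is how an agent (or a shell one-liner via `curl`) can surface progress messages as speech bubbles without any IPC beyond localhost HTTP.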
================================================
FILE: frontends/desktop_pet_v2.pyw
================================================
"""Desktop Pet with Skin System — Cross-platform with True Transparency"""
import os, re, sys, json, threading, io
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs
from PIL import Image, ImageDraw, ImageFont, ImageOps
PORT = 41983
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PROJECT_DIR = os.path.dirname(SCRIPT_DIR)
SKINS_DIR = os.path.join(SCRIPT_DIR, 'skins')
class SkinLoader:
"""Load and parse skin configuration"""
@staticmethod
def load_skin(skin_path):
"""Load skin.json and return skin config"""
config_file = os.path.join(skin_path, 'skin.json')
if not os.path.exists(config_file):
raise FileNotFoundError(f"skin.json not found in {skin_path}")
with open(config_file, 'r', encoding='utf-8') as f:
config = json.load(f)
if 'animations' not in config:
raise ValueError("skin.json must contain 'animations' field")
config['path'] = skin_path
return config
@staticmethod
def list_skins():
"""List all available skins"""
if not os.path.exists(SKINS_DIR):
return []
skins = []
for item in os.listdir(SKINS_DIR):
skin_path = os.path.join(SKINS_DIR, item)
if os.path.isdir(skin_path):
config_file = os.path.join(skin_path, 'skin.json')
if os.path.exists(config_file):
skins.append(item)
return skins
class AnimationLoader:
"""Load animation frames from sprite sheet"""
@staticmethod
def load_sprite_frames(skin_path, anim_config):
"""Load frames from sprite sheet"""
file_path = os.path.join(skin_path, anim_config['file'])
sprite_config = anim_config['sprite']
img = Image.open(file_path)
frames = []
frame_width = sprite_config['frameWidth']
frame_height = sprite_config['frameHeight']
frame_count = sprite_config['frameCount']
columns = sprite_config['columns']
start_frame = sprite_config.get('startFrame', 0)
for i in range(frame_count):
frame_idx = start_frame + i
row = frame_idx // columns
col = frame_idx % columns
x = col * frame_width
y = row * frame_height
frame = img.crop((x, y, x + frame_width, y + frame_height))
frames.append(frame)
return frames
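`load_sprite_frames` above walks a row-major sprite sheet: frame index maps to a (row, column) cell, which maps to a pixel box. The arithmetic in isolation (`frame_box` is a hypothetical helper restating the loop body):

```python
def frame_box(frame_idx: int, frame_w: int, frame_h: int, columns: int):
    """Pixel box (left, top, right, bottom) of a frame in a
    row-major sprite sheet, as crop() expects it."""
    row, col = divmod(frame_idx, columns)  # row-major layout
    x, y = col * frame_w, row * frame_h
    return (x, y, x + frame_w, y + frame_h)
```

With `startFrame` from skin.json added to `i`, the same formula lets several animations share one sheet by starting at different cells.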
def _load_default_font(size):
"""Load a usable font for bubble text."""
font_candidates = [
'/System/Library/Fonts/Supplemental/Arial Unicode.ttf',
'/System/Library/Fonts/PingFang.ttc',
'/System/Library/Fonts/STHeiti Light.ttc',
'C:/Windows/Fonts/msyh.ttc',
'C:/Windows/Fonts/simhei.ttf',
'C:/Windows/Fonts/arial.ttf',
'/usr/share/fonts/truetype/noto/NotoSansCJK-Regular.ttc',
'/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc',
'/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc',
'/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf',
'/usr/share/fonts/noto-cjk/NotoSansCJK-Regular.ttc',
]
for font_path in font_candidates:
if os.path.exists(font_path):
try:
return ImageFont.truetype(font_path, size=size)
except Exception:
pass
return ImageFont.load_default()
def _normalize_bubble_text(text):
"""Normalize text for fonts that cannot render some symbols."""
text = (text or '').strip()
lines = text.replace('\r\n', '\n').replace('\r', '\n').split('\n')
if lines:
turn_match = re.match(r'^\s*🔄?\s*Turn\s+(\d+)\s*$', lines[0], flags=re.IGNORECASE)
if turn_match:
rest = '\n'.join(line.strip() for line in lines[1:] if line.strip())
return f"Turn {turn_match.group(1)}: {rest}" if rest else f"Turn {turn_match.group(1)}:"
return text.replace('🔄 Turn', 'Turn').replace('🔄', '').strip()
def _wrap_text_for_width(draw, text, font, max_width):
"""Wrap text to fit inside max_width."""
text = _normalize_bubble_text(text)
if not text:
return ['']
paragraphs = text.replace('\r\n', '\n').replace('\r', '\n').split('\n')
lines = []
for paragraph in paragraphs:
if not paragraph:
lines.append('')
continue
current = ''
for ch in paragraph:
candidate = current + ch
bbox = draw.textbbox((0, 0), candidate, font=font)
width = bbox[2] - bbox[0]
if current and width > max_width:
lines.append(current)
current = ch
else:
current = candidate
if current:
lines.append(current)
return lines or ['']
def build_bubble_image(message, max_width=220):
"""Build a PIL image for the toast bubble using the user asset when available."""
message = (message or '').strip()
bubble_path = next((p for p in [os.path.join(SCRIPT_DIR, 'chat_bubble.png'),
os.path.join(SCRIPT_DIR, 'bubble.png')]
if os.path.exists(p)), None)
if bubble_path:
bubble = Image.open(bubble_path).convert('RGBA')
else:
bubble = Image.new('RGBA', (256, 128), (255, 255, 255, 0))
draw = ImageDraw.Draw(bubble)
draw.rounded_rectangle((8, 8, 247, 87), radius=12, fill=(255, 255, 255, 255), outline=(0, 0, 0, 255), width=3)
draw.polygon([(48, 87), (72, 87), (56, 112)], fill=(255, 255, 255, 255), outline=(0, 0, 0, 255))
bubble = ImageOps.contain(bubble, (max_width, max(64, int(max_width * bubble.height / bubble.width))), Image.NEAREST)
# Detect the actual opaque bubble region to position text correctly
alpha = bubble.getchannel('A')
content_box = alpha.getbbox() # (left, top, right, bottom) of opaque area
if content_box:
cb_left, cb_top, cb_right, cb_bottom = content_box
else:
cb_left, cb_top, cb_right, cb_bottom = 0, 0, bubble.width, bubble.height
content_w = cb_right - cb_left
content_h = cb_bottom - cb_top
font_size = max(12, content_h // 6)
font = _load_default_font(font_size)
draw = ImageDraw.Draw(bubble)
# Padding relative to the opaque bubble region, not the full image
inner_pad_x = max(6, content_w // 14)
inner_pad_top = max(4, content_h // 12)
inner_pad_bottom = max(12, content_h // 4)
text_area_width = max(36, content_w - inner_pad_x * 2)
lines = _wrap_text_for_width(draw, message, font, text_area_width)
ascent, descent = font.getmetrics() if hasattr(font, 'getmetrics') else (font_size, font_size // 4)
line_height = max(font_size, ascent + descent)
usable_h = content_h - inner_pad_top - inner_pad_bottom
max_lines = max(1, usable_h // line_height)
if len(lines) > max_lines:
lines = lines[:max_lines]
if lines:
last = lines[-1]
while last and draw.textbbox((0, 0), last + '…', font=font)[2] > text_area_width:
last = last[:-1]
lines[-1] = (last + '…') if last else '…'
total_text_height = len(lines) * line_height
y = cb_top + inner_pad_top + max(0, (usable_h - total_text_height) // 2) - 3
for line in lines:
bbox = draw.textbbox((0, 0), line, font=font)
text_width = bbox[2] - bbox[0]
x = cb_left + inner_pad_x + (text_area_width - text_width) / 2
draw.text((x, y), line, font=font, fill=(32, 32, 32, 255))
y += line_height
alpha = bubble.getchannel('A')
bbox = alpha.getbbox()
if bbox:
bubble = bubble.crop(bbox)
width, height = bubble.size
alpha = bubble.getchannel('A')
bottom_y = height - 1
tail_x = width // 2
for y in range(height - 1, -1, -1):
xs = [x for x in range(width) if alpha.getpixel((x, y)) > 0]
if xs:
bottom_y = y
tail_x = xs[len(xs) // 2]
break
return {
'image': bubble,
'size': bubble.size,
'tail_tip': (tail_x, bottom_y),
}
# ============================================================================
# Shared Base Class
# ============================================================================
class PetBase:
"""Shared logic for Mac and Windows pet implementations."""
def _schedule_main(self, fn):
"""Schedule fn on the GUI main thread. Subclasses must override."""
raise NotImplementedError
def set_state_safe(self, state):
"""Thread-safe wrapper for set_state."""
self._schedule_main(lambda: self.set_state(state))
def show_toast_safe(self, message):
"""Thread-safe wrapper for show_toast."""
self._schedule_main(lambda m=message: self.show_toast(m))
def _start_server(self):
"""Start HTTP control server."""
pet = self
class Handler(BaseHTTPRequestHandler):
def do_GET(self):
parsed = urlparse(self.path)
params = parse_qs(parsed.query)
if 'state' in params:
state = params['state'][0]
pet.set_state_safe(state)
self.send_response(200)
self.end_headers()
self.wfile.write(b'ok')
elif 'msg' in params:
msg = params['msg'][0]
pet.show_toast_safe(msg)
self.send_response(200)
self.end_headers()
self.wfile.write(b'ok')
else:
self.send_response(400)
self.end_headers()
self.wfile.write(b'?state=idle/walk/run/sprint or ?msg=hello')
def do_POST(self):
body = self.rfile.read(int(self.headers.get('Content-Length', 0))).decode()
if body:
pet.show_toast_safe(body)
self.send_response(200)
self.end_headers()
self.wfile.write(b'ok')
else:
self.send_response(400)
self.end_headers()
self.wfile.write(b'empty body')
def log_message(self, *a):
pass
try:
HTTPServer.allow_reuse_address = True
srv = HTTPServer(('127.0.0.1', PORT), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
print(f'✓ Server: http://127.0.0.1:{PORT}/?state=walk')
        except OSError as e:
            if e.errno in (48, 98, 10048):  # EADDRINUSE on macOS / Linux / Windows
                print(f'⚠ Port {PORT} already in use')
            else:
                raise
# ============================================================================
# macOS Implementation - Pure Cocoa with True Transparency
# ============================================================================
if sys.platform == 'darwin':
from Cocoa import (
NSApplication, NSWindow, NSImageView, NSImage, NSData, NSTimer,
NSMenu, NSMenuItem, NSApp, NSFloatingWindowLevel, NSColor,
NSBackingStoreBuffered, NSWindowStyleMaskBorderless,
NSApplicationActivationPolicyAccessory
)
from Foundation import NSMakeRect, NSMakePoint, NSMakeSize
from PyObjCTools import AppHelper
import objc
class MacPet(PetBase):
def __init__(self, skin_name=None):
self.app = NSApplication.sharedApplication()
self.app.setActivationPolicy_(NSApplicationActivationPolicyAccessory)
# Load skin
self.load_skin(skin_name)
self.available_skins = SkinLoader.list_skins()
# Get screen size
from AppKit import NSScreen, NSWindowCollectionBehaviorCanJoinAllSpaces, NSWindowCollectionBehaviorStationary
screen = NSScreen.mainScreen()
screen_frame = screen.frame()
screen_width = screen_frame.size.width
screen_height = screen_frame.size.height
# Position at right side
x_pos = screen_width - 200
y_pos = 300
# Create transparent window
self.window = NSWindow.alloc().initWithContentRect_styleMask_backing_defer_(
NSMakeRect(x_pos, y_pos, self.display_width, self.display_height),
NSWindowStyleMaskBorderless,
NSBackingStoreBuffered,
False
)
self.window.setOpaque_(False)
self.window.setBackgroundColor_(NSColor.clearColor())
self.window.setLevel_(NSFloatingWindowLevel)
self.window.setMovableByWindowBackground_(True)
self.window.setAcceptsMouseMovedEvents_(True)
# Make window sticky across spaces (stays in fixed screen position)
self.window.setCollectionBehavior_(
NSWindowCollectionBehaviorCanJoinAllSpaces |
NSWindowCollectionBehaviorStationary
)
# Create custom view for handling mouse events
from AppKit import NSView
from objc import super as objc_super
class DraggableImageView(NSView):
"""Custom view that handles dragging and double-click"""
def initWithFrame_(self, frame):
self = objc_super(DraggableImageView, self).initWithFrame_(frame)
if self is None:
return None
self.image_view = NSImageView.alloc().initWithFrame_(self.bounds())
                    self.image_view.setImageScaling_(1)  # 1 = NSImageScaleAxesIndependently
self.addSubview_(self.image_view)
# Create overlay view for toast (always on top)
# Make it non-opaque so it doesn't block the image
self.overlay_view = NSView.alloc().initWithFrame_(self.bounds())
self.overlay_view.setWantsLayer_(True)
self.addSubview_(self.overlay_view)
self.drag_start = None
return self
def mouseDown_(self, event):
"""Handle mouse down for dragging"""
if event.clickCount() == 2:
# Double-click to quit
from AppKit import NSApp
NSApp.terminate_(None)
else:
# Start dragging
self.drag_start = event.locationInWindow()
def mouseDragged_(self, event):
"""Handle mouse drag"""
if self.drag_start:
current_location = event.locationInWindow()
window_frame = self.window().frame()
dx = current_location.x - self.drag_start.x
dy = current_location.y - self.drag_start.y
new_origin = NSMakePoint(
window_frame.origin.x + dx,
window_frame.origin.y + dy
)
self.window().setFrameOrigin_(new_origin)
def acceptsFirstMouse_(self, event):
"""Accept first mouse click"""
return True
def rightMouseDown_(self, event):
from AppKit import NSMenu, NSMenuItem, NSApp
menu = NSMenu.alloc().init()
pet = getattr(self, 'mac_pet', None) or self.window().delegate()
if not pet:
return
for skin_name in pet.available_skins: # preload this in MacPet.__init__
item = NSMenuItem.alloc().initWithTitle_action_keyEquivalent_(
skin_name,
'changeSkin:',
''
)
item.setTarget_(pet)
item.setRepresentedObject_(skin_name)
menu.addItem_(item)
menu.addItem_(NSMenuItem.separatorItem())
quit_item = NSMenuItem.alloc().initWithTitle_action_keyEquivalent_('Quit', 'terminate:', '')
menu.addItem_(quit_item)
NSApp.activateIgnoringOtherApps_(True)
NSMenu.popUpContextMenu_withEvent_forView_(menu, event, self)
# Create draggable view
self.content_view = DraggableImageView.alloc().initWithFrame_(
NSMakeRect(0, 0, self.display_width, self.display_height)
)
self.content_view.mac_pet = self
self.image_view = self.content_view.image_view
self.overlay_view = self.content_view.overlay_view
self.window.setContentView_(self.content_view)
# Animation state
self.current_state = 'idle'
self.frame_idx = 0
# Toast state
self.toast_label = None
self.toast_timer = None
self.toast_image = None
self.toast_window = None
# Start animation timer
self.timer = NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
1.0 / self.animations[self.current_state]['fps'],
self,
'animate:',
None,
True
)
# Show window
self.window.makeKeyAndOrderFront_(None)
# Start HTTP server
self._start_server()
print(f"✓ macOS Pet started at ({x_pos}, {y_pos})")
print(f" Animations: {', '.join(self.animations.keys())}")
def load_skin(self, skin_name=None):
"""Load skin configuration and animations"""
available_skins = SkinLoader.list_skins()
if not available_skins:
raise FileNotFoundError(f"No skins found in {SKINS_DIR}")
if skin_name is None or skin_name not in available_skins:
skin_name = available_skins[0]
skin_path = os.path.join(SKINS_DIR, skin_name)
self.skin_config = SkinLoader.load_skin(skin_path)
# Get display size
display_size = self.skin_config.get('size', {})
self.display_width = display_size.get('width', 128)
self.display_height = display_size.get('height', 128)
# Load animations
self.animations = {}
for anim_name, anim_config in self.skin_config['animations'].items():
pil_frames = AnimationLoader.load_sprite_frames(skin_path, anim_config)
# Scale frames
scaled_frames = []
for frame in pil_frames:
if frame.mode != 'RGBA':
frame = frame.convert('RGBA')
scaled = frame.resize((self.display_width, self.display_height), Image.NEAREST)
scaled_frames.append(scaled)
# Convert to NSImage with proper alpha handling
ns_images = []
for pil_img in scaled_frames:
# Convert PIL to PNG bytes (PNG preserves alpha channel)
png_buffer = io.BytesIO()
pil_img.save(png_buffer, format='PNG')
png_data = png_buffer.getvalue()
# Create NSImage from PNG data
ns_data = NSData.dataWithBytes_length_(png_data, len(png_data))
ns_image = NSImage.alloc().initWithData_(ns_data)
ns_images.append(ns_image)
self.animations[anim_name] = {
'frames': ns_images,
'fps': anim_config.get('sprite', {}).get('fps', 6)
}
def animate_(self, timer):
"""Animation callback"""
anim = self.animations[self.current_state]
frames = anim['frames']
if frames:
self.image_view.setImage_(frames[self.frame_idx])
self.frame_idx = (self.frame_idx + 1) % len(frames)
def set_state(self, state):
"""Change animation state (must be called on main thread)"""
if state in self.animations and state != self.current_state:
self.current_state = state
self.frame_idx = 0
# Update timer interval
self.timer.invalidate()
self.timer = NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
1.0 / self.animations[self.current_state]['fps'],
self,
'animate:',
None,
True
)
print(f"→ State: {state}")
def _schedule_main(self, fn):
AppHelper.callAfter(fn)
def show_toast(self, message):
"""Show toast message above pet"""
from AppKit import NSImageView
if self.toast_window:
self.toast_window.orderOut_(None)
self.toast_window = None
self.toast_label = None
if self.toast_timer:
self.toast_timer.invalidate()
self.toast_timer = None
bubble_info = build_bubble_image(message, max_width=max(180, min(260, self.display_width * 2)))
bubble_pil = bubble_info['image']
bubble_width, bubble_height = bubble_info['size']
tail_x, tail_y = bubble_info['tail_tip']
png_buffer = io.BytesIO()
bubble_pil.save(png_buffer, format='PNG')
png_data = png_buffer.getvalue()
ns_data = NSData.dataWithBytes_length_(png_data, len(png_data))
self.toast_image = NSImage.alloc().initWithData_(ns_data)
pet_frame = self.window.frame()
anchor_x = pet_frame.origin.x + self.display_width * 0.75
anchor_y = pet_frame.origin.y + self.display_height * 1.65
toast_x = anchor_x - tail_x
toast_y = anchor_y - tail_y
self.toast_window = NSWindow.alloc().initWithContentRect_styleMask_backing_defer_(
NSMakeRect(toast_x, toast_y, bubble_width, bubble_height),
NSWindowStyleMaskBorderless,
NSBackingStoreBuffered,
False
)
self.toast_window.setOpaque_(False)
self.toast_window.setBackgroundColor_(NSColor.clearColor())
self.toast_window.setLevel_(NSFloatingWindowLevel)
self.toast_window.setIgnoresMouseEvents_(True)
self.toast_window.setHasShadow_(False)
self.toast_label = NSImageView.alloc().initWithFrame_(
NSMakeRect(0, 0, bubble_width, bubble_height)
)
self.toast_label.setImage_(self.toast_image)
self.toast_label.setImageScaling_(0)
self.toast_window.setContentView_(self.toast_label)
self.toast_window.orderFrontRegardless()
self.toast_timer = NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
3.0,
self,
'hideToast:',
None,
False
)
print(f"Toast: {message}")
def hideToast_(self, timer):
"""Hide toast message"""
if self.toast_window:
self.toast_window.orderOut_(None)
self.toast_window = None
self.toast_label = None
self.toast_image = None
self.toast_timer = None
def run(self):
"""Run the application"""
AppHelper.runEventLoop()
def changeSkin_(self, sender):
skin_name = sender.representedObject()
print(f"Changing skin to: {skin_name}")
self.load_skin(skin_name)
self.current_state = 'idle'
self.frame_idx = 0
# ============================================================================
# Windows/Linux Implementations
# ============================================================================
else:
if sys.platform.startswith('win'):
import tkinter as tk
from PIL import ImageTk
class WinPet(PetBase):
def __init__(self, skin_name=None):
self.root = tk.Tk()
self.root.wm_attributes('-topmost', True)
self.is_windows = sys.platform.startswith('win')
self.platform_name = 'Windows' if self.is_windows else 'Linux'
self.pet_bg_color = '#F0F0F0' if self.is_windows else 'black'
self.toast_bg_color = '#00ff01' if self.is_windows else 'black'
# Load skin
self.load_skin(skin_name)
# Setup window
screen_width = self.root.winfo_screenwidth()
screen_height = self.root.winfo_screenheight()
x_pos = screen_width - 200
y_pos = screen_height - 300
self.root.geometry(f'{self.display_width}x{self.display_height}+{x_pos}+{y_pos}')
self.root.overrideredirect(True)
self.root.wm_attributes('-topmost', True)
# Transparent background
if self.is_windows:
self.root.wm_attributes('-transparentcolor', self.pet_bg_color)
self.root.config(bg=self.pet_bg_color)
# Create label
self.label = tk.Label(self.root, bg=self.pet_bg_color, bd=0)
self.label.pack()
# Bind events
self.label.bind('<Button-1>', lambda e: setattr(self, '_d', (e.x, e.y)))
self.label.bind('<B1-Motion>', self._drag)
self.label.bind('<Double-Button-1>', lambda e: (self.root.destroy(), os._exit(0)))
self.label.bind('<Button-3>', self._on_right_click)
# Animation state
self.current_state = 'idle'
self.frame_idx = 0
# Toast state
self.toast_window = None
self.toast_photo = None
# Start animation
self._animate()
self._start_server()
print(f"✓ {self.platform_name} Pet started at ({x_pos}, {y_pos})")
print(f" Animations: {', '.join(self.animations.keys())}")
self.root.mainloop()
def load_skin(self, skin_name=None):
"""Load skin configuration and animations"""
available_skins = SkinLoader.list_skins()
if not available_skins:
raise FileNotFoundError(f"No skins found in {SKINS_DIR}")
if skin_name is None or skin_name not in available_skins:
skin_name = available_skins[0]
skin_path = os.path.join(SKINS_DIR, skin_name)
self.skin_config = SkinLoader.load_skin(skin_path)
# Get display size
display_size = self.skin_config.get('size', {})
self.display_width = display_size.get('width', 128)
self.display_height = display_size.get('height', 128)
# Load animations
self.animations = {}
for anim_name, anim_config in self.skin_config['animations'].items():
pil_frames = AnimationLoader.load_sprite_frames(skin_path, anim_config)
# Scale and convert frames
tk_frames = []
for frame in pil_frames:
if frame.mode != 'RGBA':
frame = frame.convert('RGBA')
scaled = frame.resize((self.display_width, self.display_height), Image.NEAREST)
tk_frames.append(ImageTk.PhotoImage(scaled))
self.animations[anim_name] = {
'frames': tk_frames,
'fps': anim_config.get('sprite', {}).get('fps', 6)
}
def set_state(self, state):
"""Change animation state"""
if state in self.animations and state != self.current_state:
self.current_state = state
self.frame_idx = 0
print(f"→ State: {state}")
def _drag(self, e):
x = self.root.winfo_x() + e.x - self._d[0]
y = self.root.winfo_y() + e.y - self._d[1]
self.root.geometry(f'+{x}+{y}')
def _animate(self):
"""Animate current state"""
if self.current_state not in self.animations:
self.root.after(100, self._animate)
return
anim = self.animations[self.current_state]
frames = anim['frames']
if frames:
self.label.config(image=frames[self.frame_idx])
self.frame_idx = (self.frame_idx + 1) % len(frames)
delay = int(1000 / anim['fps'])
self.root.after(delay, self._animate)
def show_toast(self, message):
"""Show toast message above pet"""
if self.toast_window:
try:
self.toast_window.destroy()
except Exception:
pass
self.toast_window = None
bubble_info = build_bubble_image(message, max_width=max(180, min(260, self.display_width * 2)))
bubble_pil = bubble_info['image']
bubble_width, bubble_height = bubble_info['size']
tail_x, tail_y = bubble_info['tail_tip']
self.toast_photo = ImageTk.PhotoImage(bubble_pil)
self.toast_window = tk.Toplevel(self.root)
self.toast_window.overrideredirect(True)
self.toast_window.wm_attributes('-topmost', True)
if self.is_windows:
self.toast_window.wm_attributes('-transparentcolor', self.toast_bg_color)
self.toast_window.config(bg=self.toast_bg_color)
toast_label = tk.Label(
self.toast_window,
image=self.toast_photo,
bg=self.toast_bg_color,
bd=0,
highlightthickness=0
)
toast_label.pack()
pet_x = self.root.winfo_x()
pet_y = self.root.winfo_y()
anchor_x = pet_x + int(self.display_width * 0.75)
anchor_y = pet_y
toast_x = anchor_x - tail_x
toast_y = anchor_y - bubble_height
self.toast_window.geometry(f'{bubble_width}x{bubble_height}+{toast_x}+{toast_y}')
self.root.after(3000, self._hide_toast)
print(f"Toast: {message}")
def _hide_toast(self):
"""Hide toast message"""
if self.toast_window:
try:
self.toast_window.destroy()
except Exception:
pass
self.toast_window = None
def _schedule_main(self, fn):
self.root.after(0, fn)
def run(self):
"""Run the application (already in mainloop)"""
pass
def _on_right_click(self, event):
# Build a dynamic menu of all available skins
menu = tk.Menu(self.root, tearoff=0)
for skin_name in SkinLoader.list_skins():
menu.add_command(
label=skin_name,
command=lambda name=skin_name: self._change_skin(name)
)
menu.add_separator()
menu.add_command(label="Quit", command=lambda: (self.root.destroy(), os._exit(0)))
menu.tk_popup(event.x_root, event.y_root)
def _change_skin(self, skin_name):
print(f"Changing skin to: {skin_name}")
self.load_skin(skin_name)
self.current_state = 'idle'
self.frame_idx = 0
else:
from PySide6.QtCore import Qt, QTimer, QPoint
from PySide6.QtGui import QAction, QCursor, QImage, QPixmap
from PySide6.QtWidgets import QApplication, QLabel, QMenu, QWidget
class _LinuxPetLabel(QLabel):
def __init__(self, pet):
super().__init__()
self.pet = pet
self.drag_offset = None
def mousePressEvent(self, event):
if event.button() == Qt.LeftButton:
self.drag_offset = event.globalPosition().toPoint() - self.pet.window.frameGeometry().topLeft()
event.accept()
return
if event.button() == Qt.RightButton:
self.pet._show_context_menu(event.globalPosition().toPoint())
event.accept()
return
super().mousePressEvent(event)
def mouseMoveEvent(self, event):
if self.drag_offset is not None and (event.buttons() & Qt.LeftButton):
self.pet.window.move(event.globalPosition().toPoint() - self.drag_offset)
self.pet._reposition_toast()
event.accept()
return
super().mouseMoveEvent(event)
def mouseReleaseEvent(self, event):
if event.button() == Qt.LeftButton:
self.drag_offset = None
super().mouseReleaseEvent(event)
def mouseDoubleClickEvent(self, event):
if event.button() == Qt.LeftButton:
QApplication.instance().quit()
event.accept()
return
super().mouseDoubleClickEvent(event)
class LinuxPet(PetBase):
def __init__(self, skin_name=None):
self.app = QApplication.instance() or QApplication(sys.argv)
self.available_skins = SkinLoader.list_skins()
self.load_skin(skin_name)
screen = self.app.primaryScreen()
screen_geo = screen.availableGeometry() if screen else None
if screen_geo:
x_pos = screen_geo.right() - self.display_width - 72
y_pos = screen_geo.bottom() - self.display_height - 120
else:
x_pos, y_pos = 1200, 700
self.window = QWidget()
self.window.setWindowFlags(
Qt.FramelessWindowHint |
Qt.WindowStaysOnTopHint |
Qt.Tool
)
self.window.setAttribute(Qt.WA_TranslucentBackground, True)
self.window.setAttribute(Qt.WA_ShowWithoutActivating, True)
self.window.resize(self.display_width, self.display_height)
self.window.move(x_pos, y_pos)
self.label = _LinuxPetLabel(self)
self.label.setParent(self.window)
self.label.setGeometry(0, 0, self.display_width, self.display_height)
self.label.setAttribute(Qt.WA_TranslucentBackground, True)
self.label.setStyleSheet('background: transparent;')
self.label.setScaledContents(True)
self.current_state = 'idle'
self.frame_idx = 0
self.toast_window = None
self.toast_label = None
self.toast_pixmap = None
self.anim_timer = QTimer()
self.anim_timer.timeout.connect(self._animate)
self._restart_animation_timer()
self.window.show()
self._start_server()
print(f"✓ Linux PySide6 Pet started at ({x_pos}, {y_pos})")
print(f" Animations: {', '.join(self.animations.keys())}")
def _pil_to_qpixmap(self, pil_img):
buffer = io.BytesIO()
pil_img.save(buffer, format='PNG')
qimage = QImage.fromData(buffer.getvalue(), 'PNG')
return QPixmap.fromImage(qimage)
def load_skin(self, skin_name=None):
available_skins = SkinLoader.list_skins()
if not available_skins:
raise FileNotFoundError(f"No skins found in {SKINS_DIR}")
if skin_name is None or skin_name not in available_skins:
skin_name = available_skins[0]
skin_path = os.path.join(SKINS_DIR, skin_name)
self.skin_config = SkinLoader.load_skin(skin_path)
display_size = self.skin_config.get('size', {})
self.display_width = display_size.get('width', 128)
self.display_height = display_size.get('height', 128)
self.animations = {}
for anim_name, anim_config in self.skin_config['animations'].items():
pil_frames = AnimationLoader.load_sprite_frames(skin_path, anim_config)
qt_frames = []
for frame in pil_frames:
if frame.mode != 'RGBA':
frame = frame.convert('RGBA')
scaled = frame.resize((self.display_width, self.display_height), Image.NEAREST)
qt_frames.append(self._pil_to_qpixmap(scaled))
self.animations[anim_name] = {
'frames': qt_frames,
'fps': anim_config.get('sprite', {}).get('fps', 6)
}
if hasattr(self, 'window'):
self.window.resize(self.display_width, self.display_height)
self.label.setGeometry(0, 0, self.display_width, self.display_height)
self._animate(force=True)
self._reposition_toast()
def _restart_animation_timer(self):
anim = self.animations.get(self.current_state) or next(iter(self.animations.values()))
fps = max(1, anim.get('fps', 6))
self.anim_timer.start(int(1000 / fps))
def _animate(self, force=False):
if self.current_state not in self.animations:
return
anim = self.animations[self.current_state]
frames = anim['frames']
if not frames:
return
if force:
self.frame_idx = 0
self.label.setPixmap(frames[self.frame_idx])
self.frame_idx = (self.frame_idx + 1) % len(frames)
def set_state(self, state):
if state in self.animations and state != self.current_state:
self.current_state = state
self.frame_idx = 0
self._restart_animation_timer()
print(f"→ State: {state}")
def _show_context_menu(self, global_pos):
menu = QMenu(self.window)
for skin_name in SkinLoader.list_skins():
action = QAction(skin_name, menu)
action.triggered.connect(lambda checked=False, name=skin_name: self._change_skin(name))
menu.addAction(action)
menu.addSeparator()
quit_action = QAction('Quit', menu)
quit_action.triggered.connect(QApplication.instance().quit)
menu.addAction(quit_action)
menu.popup(global_pos)
def _compute_toast_geometry(self, bubble_width, bubble_height, tail_x, tail_y):
pet_pos = self.window.frameGeometry().topLeft()
anchor_x = pet_pos.x() + int(self.display_width * 0.75)
anchor_y = pet_pos.y() + int(self.display_height * 0.15)
return anchor_x - tail_x, anchor_y - tail_y - bubble_height // 2
def show_toast(self, message):
if self.toast_window:
self.toast_window.close()
self.toast_window = None
self.toast_label = None
self.toast_pixmap = None
bubble_info = build_bubble_image(message, max_width=max(180, min(260, self.display_width * 2)))
bubble_pil = bubble_info['image']
bubble_width, bubble_height = bubble_info['size']
tail_x, tail_y = bubble_info['tail_tip']
self.toast_pixmap = self._pil_to_qpixmap(bubble_pil)
self.toast_window = QWidget()
self.toast_window.setWindowFlags(
Qt.FramelessWindowHint |
Qt.WindowStaysOnTopHint |
Qt.Tool |
Qt.WindowTransparentForInput
)
self.toast_window.setAttribute(Qt.WA_TranslucentBackground, True)
self.toast_window.setAttribute(Qt.WA_ShowWithoutActivating, True)
self.toast_window.resize(bubble_width, bubble_height)
self.toast_label = QLabel(self.toast_window)
self.toast_label.setGeometry(0, 0, bubble_width, bubble_height)
self.toast_label.setPixmap(self.toast_pixmap)
self.toast_label.setAttribute(Qt.WA_TranslucentBackground, True)
self.toast_label.setStyleSheet('background: transparent;')
toast_x, toast_y = self._compute_toast_geometry(bubble_width, bubble_height, tail_x, tail_y)
self.toast_window.move(toast_x, toast_y)
self.toast_window.show()
QTimer.singleShot(3000, self._hide_toast)
print(f"Toast: {message}")
def _reposition_toast(self):
if not self.toast_window:
return
label_pixmap = self.toast_label.pixmap() if self.toast_label else None
if label_pixmap is None:
return
bubble_width = label_pixmap.width()
bubble_height = label_pixmap.height()
toast_x, toast_y = self._compute_toast_geometry(
bubble_width,
bubble_height,
bubble_width // 2,
bubble_height
)
self.toast_window.move(toast_x, toast_y)
def _hide_toast(self):
if self.toast_window:
self.toast_window.close()
self.toast_window = None
self.toast_label = None
self.toast_pixmap = None
def _schedule_main(self, fn):
QTimer.singleShot(0, fn)
def _change_skin(self, skin_name):
print(f"Changing skin to: {skin_name}")
self.load_skin(skin_name)
self.current_state = 'idle'
self.frame_idx = 0
self._restart_animation_timer()
def run(self):
self.app.exec()
if __name__ == '__main__':
# Singleton: if port already in use, another instance is running
import socket
_s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
_s.connect(('127.0.0.1', PORT))
_s.close()
print(f'⚠ Pet already running on port {PORT}, exiting.')
sys.exit(0)
except ConnectionRefusedError:
_s.close()
if sys.platform == 'darwin':
pet = MacPet('vita')
pet.run()
elif sys.platform.startswith('win'):
pet = WinPet('vita')
else:
pet = LinuxPet('vita')
pet.run()
================================================
FILE: frontends/dingtalkapp.py
================================================
import asyncio, json, os, sys, threading, time
import requests
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from agentmain import GeneraticAgent
from chatapp_common import AgentChatMixin, ensure_single_instance, public_access, redirect_log, require_runtime, split_text
from llmcore import mykeys
try:
from dingtalk_stream import AckMessage, CallbackHandler, Credential, DingTalkStreamClient
from dingtalk_stream.chatbot import ChatbotMessage
except Exception:
print("Please install dingtalk-stream to use DingTalk: pip install dingtalk-stream")
sys.exit(1)
agent = GeneraticAgent(); agent.verbose = False
CLIENT_ID = str(mykeys.get("dingtalk_client_id", "") or "").strip()
CLIENT_SECRET = str(mykeys.get("dingtalk_client_secret", "") or "").strip()
ALLOWED = {str(x).strip() for x in mykeys.get("dingtalk_allowed_users", []) if str(x).strip()}
USER_TASKS = {}
class DingTalkApp(AgentChatMixin):
label, source, split_limit = "DingTalk", "dingtalk", 1800
def __init__(self):
super().__init__(agent, USER_TASKS)
self.client, self.access_token, self.token_expiry, self.background_tasks = None, None, 0, set()
async def _get_access_token(self):
if self.access_token and time.time() < self.token_expiry:
return self.access_token
def _fetch():
resp = requests.post("https://api.dingtalk.com/v1.0/oauth2/accessToken", json={"appKey": CLIENT_ID, "appSecret": CLIENT_SECRET}, timeout=20)
resp.raise_for_status()
return resp.json()
last_err = None
for attempt in range(2):
try:
data = await asyncio.to_thread(_fetch)
self.access_token = data.get("accessToken")
self.token_expiry = time.time() + int(data.get("expireIn", 7200)) - 60
return self.access_token
except Exception as e:
last_err = e
if attempt == 0:
await asyncio.sleep(1)
print(f"[DingTalk] token error after retry: {last_err}")
return None
async def _send_batch_message(self, chat_id, msg_key, msg_param):
token = await self._get_access_token()
if not token:
return False
headers = {"x-acs-dingtalk-access-token": token}
if chat_id.startswith("group:"):
url = "https://api.dingtalk.com/v1.0/robot/groupMessages/send"
payload = {"robotCode": CLIENT_ID, "openConversationId": chat_id[6:], "msgKey": msg_key, "msgParam": json.dumps(msg_param, ensure_ascii=False)}
else:
url = "https://api.dingtalk.com/v1.0/robot/oToMessages/batchSend"
payload = {"robotCode": CLIENT_ID, "userIds": [chat_id], "msgKey": msg_key, "msgParam": json.dumps(msg_param, ensure_ascii=False)}
def _post():
resp = requests.post(url, json=payload, headers=headers, timeout=20)
body = resp.text
if resp.status_code != 200:
raise RuntimeError(f"HTTP {resp.status_code}: {body[:300]}")
result = resp.json() if "json" in resp.headers.get("content-type", "") else {}
errcode = result.get("errcode")
if errcode not in (None, 0):
raise RuntimeError(f"API errcode={errcode}: {body[:300]}")
return True
try:
return await asyncio.to_thread(_post)
except Exception as e:
print(f"[DingTalk] send error: {e}")
return False
async def send_text(self, chat_id, content):
for part in split_text(content, self.split_limit):
await self._send_batch_message(chat_id, "sampleMarkdown", {"text": part, "title": "Agent Reply"})
async def on_message(self, content, sender_id, sender_name, conversation_type=None, conversation_id=None):
try:
if not content:
return
if not public_access(ALLOWED) and sender_id not in ALLOWED:
print(f"[DingTalk] unauthorized user: {sender_id}")
return
is_group = conversation_type == "2" and conversation_id
chat_id = f"group:{conversation_id}" if is_group else sender_id
print(f"[DingTalk] message from {sender_name} ({sender_id}): {content}")
if content.startswith("/"):
return await self.handle_command(chat_id, content)
task = asyncio.create_task(self.run_agent(chat_id, content))
self.background_tasks.add(task)
task.add_done_callback(self.background_tasks.discard)
except Exception:
import traceback
print("[DingTalk] handle_message error")
traceback.print_exc()
async def start(self):
self.client = DingTalkStreamClient(Credential(CLIENT_ID, CLIENT_SECRET))
self.client.register_callback_handler(ChatbotMessage.TOPIC, _DingTalkHandler(self))
print("[DingTalk] bot starting...")
delay, max_delay = 5, 300
while True:
started_at = time.monotonic()
try:
await self.client.start()
except Exception as e:
print(f"[DingTalk] stream error: {e}")
# any session that lived >=60s is treated as healthy -> reset backoff
if time.monotonic() - started_at >= 60:
delay = 5
print(f"[DingTalk] reconnect in {delay}s...")
await asyncio.sleep(delay)
delay = min(delay * 2, max_delay)
class _DingTalkHandler(CallbackHandler):
def __init__(self, app):
super().__init__()
self.app = app
async def process(self, message):
try:
chatbot_msg = ChatbotMessage.from_dict(message.data)
text = getattr(getattr(chatbot_msg, "text", None), "content", "") or ""
extensions = getattr(chatbot_msg, "extensions", None) or {}
recognition = ((extensions.get("content") or {}).get("recognition") or "").strip() if isinstance(extensions, dict) else ""
if not (text := text.strip()):
text = recognition or str((message.data.get("text", {}) or {}).get("content", "") or "").strip()
sender_id = str(getattr(chatbot_msg, "sender_staff_id", None) or getattr(chatbot_msg, "sender_id", None) or "unknown")
sender_name = getattr(chatbot_msg, "sender_nick", None) or "Unknown"
await self.app.on_message(text, sender_id, sender_name, message.data.get("conversationType"), message.data.get("conversationId") or message.data.get("openConversationId"))
except Exception as e:
print(f"[DingTalk] callback error: {e}")
return AckMessage.STATUS_OK, "OK"
if __name__ == "__main__":
_LOCK_SOCK = ensure_single_instance(19530, "DingTalk")
require_runtime(agent, "DingTalk", dingtalk_client_id=CLIENT_ID, dingtalk_client_secret=CLIENT_SECRET)
redirect_log(__file__, "dingtalkapp.log", "DingTalk", ALLOWED)
threading.Thread(target=agent.run, daemon=True).start()
asyncio.run(DingTalkApp().start())
================================================
FILE: frontends/fsapp.py
================================================
import glob, json, os, queue as Q, re, sys, threading, time
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, PROJECT_ROOT)
os.chdir(PROJECT_ROOT)
from agentmain import GeneraticAgent
from frontends.chatapp_common import format_restore
from frontends.continue_cmd import handle_frontend_command as handle_continue_frontend, reset_conversation
from llmcore import mykeys
import traceback
import lark_oapi as lark
from lark_oapi.api.im.v1 import *
_TAG_PATS = [r"<" + t + r">.*?</" + t + r">" for t in ("thinking", "summary", "tool_use", "file_content")]
_IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp", ".ico", ".tiff", ".tif"}
_AUDIO_EXTS = {".opus", ".mp3", ".wav", ".m4a", ".aac"}
_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}
_FILE_TYPE_MAP = {
".opus": "opus",
".mp4": "mp4",
".pdf": "pdf",
".doc": "doc",
".docx": "doc",
".xls": "xls",
".xlsx": "xls",
".ppt": "ppt",
".pptx": "ppt",
}
_MSG_TYPE_MAP = {"image": "[image]", "audio": "[audio]", "file": "[file]", "media": "[media]", "sticker": "[sticker]"}
TEMP_DIR = os.path.join(PROJECT_ROOT, "temp")
MEDIA_DIR = os.path.join(TEMP_DIR, "feishu_media")
os.makedirs(MEDIA_DIR, exist_ok=True)
_TRUNC_TAIL = 300  # number of trailing characters of the original text to keep as a truncation fallback
def _clean(text):
for pat in _TAG_PATS:
text = re.sub(pat, "", text or "", flags=re.DOTALL)
return re.sub(r"\n{3,}", "\n\n", text).strip()
def _extract_files(text):
return re.findall(r"\[FILE:([^\]]+)\]", text or "")
def _strip_files(text):
return re.sub(r"\[FILE:[^\]]+\]", "", text or "").strip()
def _display_text(text):
cleaned = _strip_files(_clean(text))
if cleaned:
return cleaned
tail = (text or "").strip()[-_TRUNC_TAIL:]
return "⚠️ Model output was truncated or empty" + (f"\n…{tail}" if tail else "")
def _to_allowed_set(value):
if value is None:
return set()
if isinstance(value, str):
value = [value]
return {str(x).strip() for x in value if str(x).strip()}
def _parse_json(raw):
if not raw:
return {}
try:
return json.loads(raw)
except Exception:
return {}
def _extract_share_card_content(content_json, msg_type):
parts = []
if msg_type == "share_chat":
parts.append(f"[shared chat: {content_json.get('chat_id', '')}]")
elif msg_type == "share_user":
parts.append(f"[shared user: {content_json.get('user_id', '')}]")
elif msg_type == "interactive":
parts.extend(_extract_interactive_content(content_json))
elif msg_type == "share_calendar_event":
parts.append(f"[shared calendar event: {content_json.get('event_key', '')}]")
elif msg_type == "system":
parts.append("[system message]")
elif msg_type == "merge_forward":
parts.append("[merged forward messages]")
return "\n".join([p for p in parts if p]).strip() or f"[{msg_type}]"
def _extract_interactive_content(content):
parts = []
if isinstance(content, str):
try:
content = json.loads(content)
except Exception:
return [content] if content.strip() else []
if not isinstance(content, dict):
return parts
title = content.get("title")
if isinstance(title, dict):
title_text = title.get("content", "") or title.get("text", "")
if title_text:
parts.append(f"title: {title_text}")
elif isinstance(title, str) and title:
parts.append(f"title: {title}")
elements = content.get("elements", [])
if isinstance(elements, list):
for row in elements:
if isinstance(row, dict):
parts.extend(_extract_element_content(row))
elif isinstance(row, list):
for el in row:
parts.extend(_extract_element_content(el))
card = content.get("card", {})
if card:
parts.extend(_extract_interactive_content(card))
header = content.get("header", {})
if isinstance(header, dict):
header_title = header.get("title", {})
if isinstance(header_title, dict):
header_text = header_title.get("content", "") or header_title.get("text", "")
if header_text:
parts.append(f"title: {header_text}")
return [p for p in parts if p]
def _extract_element_content(element):
parts = []
if not isinstance(element, dict):
return parts
tag = element.get("tag", "")
if tag in ("markdown", "lark_md"):
content = element.get("content", "")
if content:
parts.append(content)
elif tag == "div":
text = element.get("text", {})
if isinstance(text, dict):
text_content = text.get("content", "") or text.get("text", "")
if text_content:
parts.append(text_content)
elif isinstance(text, str) and text:
parts.append(text)
for field in element.get("fields", []) or []:
if isinstance(field, dict):
field_text = field.get("text", {})
if isinstance(field_text, dict):
content = field_text.get("content", "") or field_text.get("text", "")
if content:
parts.append(content)
elif tag == "a":
href = element.get("href", "")
text = element.get("text", "")
if href:
parts.append(f"link: {href}")
if text:
parts.append(text)
elif tag == "button":
text = element.get("text", {})
if isinstance(text, dict):
content = text.get("content", "") or text.get("text", "")
if content:
parts.append(content)
url = element.get("url", "") or (element.get("multi_url", {}) or {}).get("url", "")
if url:
parts.append(f"link: {url}")
elif tag == "img":
alt = element.get("alt", {})
if isinstance(alt, dict):
parts.append(alt.get("content", "[image]") or "[image]")
else:
parts.append("[image]")
for child in element.get("elements", []) or []:
parts.extend(_extract_element_content(child))
for col in element.get("columns", []) or []:
for child in (col.get("elements", []) if isinstance(col, dict) else []):
parts.extend(_extract_element_content(child))
return parts
def _extract_post_content(content_json):
def _parse_block(block):
if not isinstance(block, dict) or not isinstance(block.get("content"), list):
return None, []
texts, images = [], []
if block.get("title"):
texts.append(block.get("title"))
for row in block["content"]:
if not isinstance(row, list):
continue
for el in row:
if not isinstance(el, dict):
continue
tag = el.get("tag")
if tag in ("text", "a"):
texts.append(el.get("text", ""))
elif tag == "at":
texts.append(f"@{el.get('user_name', 'user')}")
elif tag == "img" and el.get("image_key"):
images.append(el["image_key"])
text = " ".join([t for t in texts if t]).strip()
return text or None, images
root = content_json
if isinstance(root, dict) and isinstance(root.get("post"), dict):
root = root["post"]
if not isinstance(root, dict):
return "", []
if "content" in root:
text, imgs = _parse_block(root)
if text or imgs:
return text or "", imgs
for key in ("zh_cn", "en_us", "ja_jp"):
if key in root:
text, imgs = _parse_block(root[key])
if text or imgs:
return text or "", imgs
for val in root.values():
if isinstance(val, dict):
text, imgs = _parse_block(val)
if text or imgs:
return text or "", imgs
return "", []
APP_ID = str(mykeys.get("fs_app_id", "") or "").strip()
APP_SECRET = str(mykeys.get("fs_app_secret", "") or "").strip()
ALLOWED_USERS = _to_allowed_set(mykeys.get("fs_allowed_users", []))
PUBLIC_ACCESS = not ALLOWED_USERS or "*" in ALLOWED_USERS
AGENT_TIMEOUT_SEC = 900
agent = GeneraticAgent()
threading.Thread(target=agent.run, daemon=True).start()
client, user_tasks = None, {}
def create_client():
return lark.Client.builder().app_id(APP_ID).app_secret(APP_SECRET).log_level(lark.LogLevel.INFO).build()
def _card_raw(elements):
return json.dumps({
"schema": "2.0",
"config": {"streaming_mode": False, "width_mode": "fill"},
"body": {"elements": elements},
}, ensure_ascii=False)
def _card(text):
return _card_raw([{"tag": "markdown", "content": text}])
def _send_raw(receive_id, payload, msg_type, rtype):
try:
body = CreateMessageRequest.builder().receive_id_type(rtype).request_body(
CreateMessageRequestBody.builder().receive_id(receive_id).msg_type(msg_type).content(payload).build()
).build()
r = client.im.v1.message.create(body)
if r.success():
return r.data.message_id if r.data else None
print(f"Send failed: {r.code}, {r.msg}")
except Exception as e:
print(f"[ERROR] _send_raw network error: {e}")
return None
def _patch_card(message_id, card_json):
return _patch_card_result(message_id, card_json)[0]
def _patch_card_result(message_id, card_json):
try:
body = PatchMessageRequest.builder().message_id(message_id).request_body(
PatchMessageRequestBody.builder().content(card_json).build()
).build()
r = client.im.v1.message.patch(body)
if not r.success():
print(f"[ERROR] patch_card failed: {r.code}, {r.msg}")
msg = f"{getattr(r, 'code', '')} {getattr(r, 'msg', '')}".lower()
return r.success(), ("230099" in msg or "11310" in msg or "element exceeds the limit" in msg)
except Exception as e:
print(f"[ERROR] _patch_card network error: {e}")
return False, False
def send_message(receive_id, content, msg_type="text", use_card=False, receive_id_type="open_id"):
if use_card:
return _send_raw(receive_id, _card(content), "interactive", receive_id_type)
if msg_type == "text":
return _send_raw(receive_id, json.dumps({"text": content}, ensure_ascii=False), "text", receive_id_type)
return _send_raw(receive_id, content, msg_type, receive_id_type)
def update_message(message_id, content):
return _patch_card(message_id, _card(content))
def _upload_image_sync(file_path):
try:
with open(file_path, "rb") as f:
request = CreateImageRequest.builder().request_body(
CreateImageRequestBody.builder().image_type("message").image(f).build()
).build()
response = client.im.v1.image.create(request)
if response.success():
return response.data.image_key
print(f"[ERROR] upload image failed: {response.code}, {response.msg}")
except Exception as e:
print(f"[ERROR] upload image failed {file_path}: {e}")
return None
def _upload_file_sync(file_path):
ext = os.path.splitext(file_path)[1].lower()
file_type = _FILE_TYPE_MAP.get(ext, "stream")
file_name = os.path.basename(file_path)
try:
with open(file_path, "rb") as f:
request = CreateFileRequest.builder().request_body(
CreateFileRequestBody.builder().file_type(file_type).file_name(file_name).file(f).build()
).build()
response = client.im.v1.file.create(request)
if response.success():
return response.data.file_key
print(f"[ERROR] upload file failed: {response.code}, {response.msg}")
except Exception as e:
print(f"[ERROR] upload file failed {file_path}: {e}")
return None
def _download_image_sync(message_id, image_key):
try:
request = GetMessageResourceRequest.builder().message_id(message_id).file_key(image_key).type("image").build()
response = client.im.v1.message_resource.get(request)
if response.success():
data = response.file.read() if hasattr(response.file, "read") else response.file
return data, response.file_name
print(f"[ERROR] download image failed: {response.code}, {response.msg}")
except Exception as e:
print(f"[ERROR] download image failed {image_key}: {e}")
return None, None
def _download_file_sync(message_id, file_key, resource_type="file"):
if resource_type == "audio":
resource_type = "file"
try:
request = GetMessageResourceRequest.builder().message_id(message_id).file_key(file_key).type(resource_type).build()
response = client.im.v1.message_resource.get(request)
if response.success():
data = response.file.read() if hasattr(response.file, "read") else response.file
return data, response.file_name
print(f"[ERROR] download {resource_type} failed: {response.code}, {response.msg}")
except Exception as e:
print(f"[ERROR] download {resource_type} failed {file_key}: {e}")
return None, None
def _download_and_save_media(msg_type, content_json, message_id):
data, filename = None, None
if msg_type == "image":
image_key = content_json.get("image_key")
if image_key and message_id:
data, filename = _download_image_sync(message_id, image_key)
if not filename:
filename = f"{image_key[:16]}.jpg"
elif msg_type in ("audio", "file", "media"):
file_key = content_json.get("file_key")
if file_key and message_id:
data, filename = _download_file_sync(message_id, file_key, msg_type)
if not filename:
filename = file_key[:16]
if msg_type == "audio" and filename and not filename.endswith(".opus"):
filename = f"{filename}.opus"
if data and filename:
file_path = os.path.join(MEDIA_DIR, os.path.basename(filename))
with open(file_path, "wb") as f:
f.write(data)
return file_path, filename
return None, None
def _describe_media(msg_type, file_path, filename):
if msg_type == "image":
return f"[image: {filename}]\n[Image: source: {file_path}]"
if msg_type == "audio":
return f"[audio: {filename}]\n[File: source: {file_path}]"
if msg_type in ("file", "media"):
return f"[{msg_type}: {filename}]\n[File: source: {file_path}]"
return f"[{msg_type}]\n[File: source: {file_path}]"
def _send_local_file(receive_id, file_path, receive_id_type="open_id"):
if not os.path.isfile(file_path):
send_message(receive_id, f"⚠️ File not found: {file_path}", receive_id_type=receive_id_type)
return False
ext = os.path.splitext(file_path)[1].lower()
if ext in _IMAGE_EXTS:
image_key = _upload_image_sync(file_path)
if image_key:
send_message(receive_id, json.dumps({"image_key": image_key}, ensure_ascii=False), msg_type="image", receive_id_type=receive_id_type)
return True
else:
file_key = _upload_file_sync(file_path)
if file_key:
msg_type = "media" if ext in _AUDIO_EXTS or ext in _VIDEO_EXTS else "file"
send_message(receive_id, json.dumps({"file_key": file_key}, ensure_ascii=False), msg_type=msg_type, receive_id_type=receive_id_type)
return True
send_message(receive_id, f"⚠️ Failed to send file: {os.path.basename(file_path)}", receive_id_type=receive_id_type)
return False
def _send_generated_files(receive_id, raw_text, receive_id_type="open_id"):
for file_path in _extract_files(raw_text):
_send_local_file(receive_id, file_path, receive_id_type)
def _build_user_message(message):
msg_type = message.message_type
message_id = message.message_id
content_json = _parse_json(message.content)
parts, image_paths = [], []
if msg_type == "text":
text = str(content_json.get("text", "") or "").strip()
if text:
parts.append(text)
elif msg_type == "post":
text, image_keys = _extract_post_content(content_json)
if text:
parts.append(text)
for image_key in image_keys:
file_path, filename = _download_and_save_media("image", {"image_key": image_key}, message_id)
if file_path and filename:
parts.append(_describe_media("image", file_path, filename))
image_paths.append(file_path)
else:
parts.append("[image: download failed]")
elif msg_type in ("image", "audio", "file", "media"):
file_path, filename = _download_and_save_media(msg_type, content_json, message_id)
if file_path and filename:
parts.append(_describe_media(msg_type, file_path, filename))
if msg_type == "image":
image_paths.append(file_path)
else:
parts.append(f"[{msg_type}: download failed]")
elif msg_type in ("share_chat", "share_user", "interactive", "share_calendar_event", "system", "merge_forward"):
parts.append(_extract_share_card_content(content_json, msg_type))
else:
parts.append(_MSG_TYPE_MAP.get(msg_type, f"[{msg_type}]"))
return "\n".join([p for p in parts if p]).strip(), image_paths
def _fmt_tool_call(tc):
name = tc.get('tool_name', '?')
args = {k: v for k, v in (tc.get('args') or {}).items() if not k.startswith('_')}
return f"- `{name}`({json.dumps(args, ensure_ascii=False)[:200]})"
def _build_step_detail(resp, tool_calls):
    """Assemble the expanded detail for one step from the LLM response + tool_calls (pure function)."""
parts = []
thinking = (getattr(resp, 'thinking', '') or '').strip() if resp else ''
if thinking:
parts.append(f"### 💭 Thinking\n{thinking}")
if tool_calls:
parts.append("### 🛠 Tool Calls\n" + "\n".join(_fmt_tool_call(tc) for tc in tool_calls))
content = _display_text((getattr(resp, 'content', '') or '')).strip() if resp else ''
if content and content != '...':
parts.append(f"### 📝 Output\n{content}")
return "\n\n".join(parts)
class _TaskCard:
    """Feishu task card: one card patched continuously; each step gets its own collapsible panel (header shows the summary, expand for details)."""
_DETAIL_LIMIT = 8000
_FINAL_LIMIT = 6000
def __init__(self, receive_id, rid_type):
self.rid, self.rtype = receive_id, rid_type
self.steps = [] # [(summary, detail), ...]
self.status = "🤔 思考中..."
self.final = None
self.msg_id = None
self.page_no = 1
self.turn_no = 0
self.turn_base = 1
self.note = None
def _step_panel(self, idx, summary, detail):
detail = detail or "_(无输出)_"
if len(detail) > self._DETAIL_LIMIT:
detail = detail[:self._DETAIL_LIMIT] + f"\n\n…(已截断,共 {len(detail)} 字符)"
return {
"tag": "collapsible_panel", "expanded": False,
"header": {"title": {"tag": "plain_text", "content": f"Turn {idx} · {summary}"}},
"elements": [{"tag": "markdown", "content": detail}],
}
def _build(self):
header = f"**{self.status}**"
if self.page_no > 1:
header += f"\n\n📄 工作卡片 {self.page_no}"
els = [{"tag": "markdown", "content": header}]
if self.note:
els.append({"tag": "markdown", "content": self.note})
for i, (s, d) in enumerate(self.steps, self.turn_base):
els.append(self._step_panel(i, s, d))
if self.final:
els += [{"tag": "hr"}, {"tag": "markdown", "content": self.final}]
return _card_raw(els)
def _push(self):
card = self._build()
if self.msg_id:
return _patch_card_result(self.msg_id, card)
else:
self.msg_id = _send_raw(self.rid, card, "interactive", self.rtype)
return bool(self.msg_id), False
def _rollover(self):
self.page_no += 1
self.msg_id = None
self.final = None
self.note = "⚠️ 上一张工作卡片达到飞书限制,本页继续展示后续进展。"
    # ── Public API ──
def start(self):
self._push()
def step(self, summary, detail=""):
self.turn_no += 1
step = (summary, detail)
self.steps.append(step)
self.status = f"⏳ 工作中 · Turn {self.turn_no}"
ok, limit = self._push()
if limit:
self.steps.pop()
self._rollover()
self.turn_base = self.turn_no
self.steps = [step]
self._push()
def done(self, text):
self.status = "✅ 已完成"
self.final = (text or "_(无文本输出)_")[:self._FINAL_LIMIT]
ok, limit = self._push()
if limit:
self._rollover()
self.steps = []
self.turn_base = self.turn_no + 1
self.final = (text or "_(无文本输出)_")[:self._FINAL_LIMIT]
self._push()
def fail(self, msg):
self.status = f"❌ {msg}"
self._push()
def _make_task_hook(card, done_event, on_final):
    """Feishu task hook: patches the card state each turn; on finish, triggers on_final(raw) to handle attachments."""
def hook(ctx):
try:
if ctx.get('exit_reason'):
resp = ctx.get('response')
raw = resp.content if hasattr(resp, 'content') else str(resp)
card.done(_display_text(raw))
on_final(raw)
done_event.set()
elif ctx.get('summary'):
detail = _build_step_detail(ctx.get('response'), ctx.get('tool_calls') or [])
card.step(ctx['summary'], detail)
except Exception as e:
print(f"[fs hook] error: {e}")
return hook
def handle_message(data):
event, message, sender = data.event, data.event.message, data.event.sender
open_id = sender.sender_id.open_id
chat_id = message.chat_id
if not PUBLIC_ACCESS and open_id not in ALLOWED_USERS:
        print(f"Unauthorized user: {open_id}")
return
user_input, image_paths = _build_user_message(message)
if not user_input:
if chat_id:
send_message(chat_id, f"⚠️ 暂不支持处理此类飞书消息:{message.message_type}", receive_id_type="chat_id")
else:
send_message(open_id, f"⚠️ 暂不支持处理此类飞书消息:{message.message_type}")
return
    print(f"Received message [{open_id}] ({message.message_type}, {len(image_paths)} images): {user_input[:200]}")
if message.message_type == "text" and user_input.startswith("/"):
return handle_command(open_id, user_input, chat_id)
def run_agent():
user_tasks[open_id] = {"running": True}
receive_id = chat_id or open_id
rid_type = "chat_id" if chat_id else "open_id"
done_event = threading.Event()
hook_key = f"fs_{open_id}"
card = _TaskCard(receive_id, rid_type)
card.start()
on_final = lambda raw: _send_generated_files(receive_id, raw, receive_id_type=rid_type)
if not hasattr(agent, '_turn_end_hooks'): agent._turn_end_hooks = {}
agent._turn_end_hooks[hook_key] = _make_task_hook(card, done_event, on_final)
try:
agent.put_task(user_input, source="feishu", images=image_paths)
start = time.time()
while not done_event.wait(timeout=3):
if not user_tasks.get(open_id, {}).get("running", True):
agent.abort()
card.fail("已停止")
break
if time.time() - start > AGENT_TIMEOUT_SEC:
agent.abort()
card.fail("任务超时")
break
except Exception as e:
traceback.print_exc()
card.fail(f"错误: {e}")
finally:
agent._turn_end_hooks.pop(hook_key, None)
user_tasks.pop(open_id, None)
threading.Thread(target=run_agent, daemon=True).start()
def handle_command(open_id, cmd, chat_id=None):
def _send_cmd_response(content):
if chat_id:
send_message(chat_id, content, receive_id_type="chat_id")
else:
send_message(open_id, content)
parts = (cmd or "").split()
op = (parts[0] if parts else "").lower()
if op == "/stop":
if open_id in user_tasks:
user_tasks[open_id]["running"] = False
agent.abort()
_send_cmd_response("正在停止...")
elif op == "/new":
_send_cmd_response(reset_conversation(agent))
elif op == "/help":
_send_cmd_response("命令列表:\n/stop - 停止当前任务\n/status - 查看状态\n/llm - 查看当前模型列表\n/llm [n] - 切换到第 n 个模型\n/restore - 恢复上次对话历史\n/continue - 列出可恢复会话\n/continue [n] - 恢复第 n 个会话\n/new - 开启新对话并清空当前上下文\n/help - 显示帮助")
elif op == "/status":
llm = agent.get_llm_name() if agent.llmclient else "未配置"
_send_cmd_response(f"状态: {'🔴 运行中' if agent.is_running else '🟢 空闲'}\nLLM: [{agent.llm_no}] {llm}")
elif op == "/llm":
if not agent.llmclient:
return _send_cmd_response("❌ 当前没有可用的 LLM 配置")
if len(parts) > 1:
try:
agent.next_llm(int(parts[1]))
return _send_cmd_response(f"✅ 已切换到 [{agent.llm_no}] {agent.get_llm_name()}")
except Exception:
return _send_cmd_response(f"用法: /llm <0-{len(agent.list_llms()) - 1}>")
lines = [f"{'→' if cur else ' '} [{i}] {name}" for i, name, cur in agent.list_llms()]
_send_cmd_response("LLMs:\n" + "\n".join(lines))
elif op == "/restore":
try:
restored_info, err = format_restore()
if err:
return _send_cmd_response(err.replace("❌ ", ""))
restored, fname, count = restored_info
agent.history.extend(restored)
agent.abort()
_send_cmd_response(f"已恢复 {count} 轮对话\n来源: {fname}\n(仅恢复上下文,请输入新问题继续)")
except Exception as e:
_send_cmd_response(f"恢复失败: {e}")
elif op == "/continue" or cmd.startswith("/continue"):
_send_cmd_response(handle_continue_frontend(agent, cmd))
else:
_send_cmd_response(f"未知命令: {cmd}")
def main():
global client
if not APP_ID or not APP_SECRET:
        print("Error: please configure fs_app_id and fs_app_secret in mykey.py or mykey.json")
sys.exit(1)
client = create_client()
handler = lark.EventDispatcherHandler.builder("", "").register_p2_im_message_receive_v1(handle_message).build()
    print("=" * 50 + "\nFeishu Agent started (long-connection mode)\n" + f"App ID: {APP_ID}\nWaiting for messages...\n" + "=" * 50)
retry_delay = 5
while True:
try:
cli = lark.ws.Client(APP_ID, APP_SECRET, event_handler=handler, log_level=lark.LogLevel.INFO)
cli.start()
except Exception as e:
            print(f"[WARN] Feishu long connection dropped or failed to start: {e}")
            print(f"[INFO] reconnecting in {retry_delay}s...")
time.sleep(retry_delay)
retry_delay = min(retry_delay * 2, 120)
            # Refresh the client on reconnect
try:
client = create_client()
except Exception:
pass
if __name__ == "__main__":
main()
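The `_TaskCard` pagination above rolls over to a fresh card when the platform rejects a patch, while keeping turn numbers globally continuous across pages. A minimal, self-contained sketch of that rollover idea (the character budget here is a hypothetical stand-in for the real Feishu card limit, which the file detects via the patch result rather than by counting):

```python
class PagedCard:
    """Sketch of rollover pagination: one growing page until a size limit,
    then a new page whose turn numbering continues where the last left off."""
    LIMIT = 80  # hypothetical per-card character budget

    def __init__(self):
        self.page_no = 1    # current page (card) number
        self.turn_base = 1  # global turn number of the first step on this page
        self.steps = []

    def _render(self, steps):
        return "\n".join(f"Turn {self.turn_base + i}: {s}" for i, s in enumerate(steps))

    def step(self, summary):
        candidate = self.steps + [summary]
        if len(self._render(candidate)) > self.LIMIT and self.steps:
            # Roll over: the new page keeps only the step that did not fit,
            # and turn_base advances so numbering stays globally continuous.
            self.turn_base += len(self.steps)
            self.page_no += 1
            self.steps = [summary]
        else:
            self.steps = candidate
        return self._render(self.steps)
```

The same shape appears in `_TaskCard.step()`: pop the step that overflowed, `_rollover()`, reset `turn_base`, and re-push it as the first panel of the new card.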
================================================
FILE: frontends/genericagent_acp_bridge.py
================================================
import io
import json
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# Must run BEFORE importing agentmain — it reconfigures stdout at import time,
# and its submodules may print() during init. We capture the raw binary stdout
# for ACP JSON-RPC, then redirect the text-mode stdout to stderr so any stray
# prints from agentmain/llmcore don't pollute the ACP channel.
if sys.platform == "win32":
import msvcrt
_stdout_fd = os.dup(sys.__stdout__.fileno())
msvcrt.setmode(_stdout_fd, os.O_BINARY)
_acp_stdout = os.fdopen(_stdout_fd, "wb", buffering=0)
msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
# Mark the ACP fd as non-inheritable so child processes can't write to it.
os.set_inheritable(_stdout_fd, False)
# Redirect the original stdout fd to stderr so child processes
# (tool calls) don't write into the ACP JSON-RPC channel.
os.dup2(sys.stderr.fileno(), sys.__stdout__.fileno())
else:
_stdout_fd = os.dup(sys.__stdout__.fileno())
os.set_inheritable(_stdout_fd, False)
_acp_stdout = os.fdopen(_stdout_fd, "wb", buffering=0)
os.dup2(sys.stderr.fileno(), sys.__stdout__.fileno())
class _StdoutToStderrRouter(io.TextIOBase):
"""Redirect text-mode stdout to stderr so agentmain prints don't leak."""
def writable(self): return True
def write(self, s):
if s:
sys.stderr.write(s)
sys.stderr.flush()
return len(s) if s else 0
def flush(self): sys.stderr.flush()
sys.stdout = _StdoutToStderrRouter()
import argparse
import queue
import threading
import traceback
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from agentmain import GeneraticAgent
JSONRPC_VERSION = "2.0"
ACP_PROTOCOL_VERSION = 1
def eprint(*args: Any) -> None:
print(*args, file=sys.stderr, flush=True)
def make_text_block(text: str) -> Dict[str, Any]:
return {"type": "text", "text": text}
def make_session_update(session_id: str, update: Dict[str, Any]) -> Dict[str, Any]:
return {
"jsonrpc": JSONRPC_VERSION,
"method": "session/update",
"params": {"sessionId": session_id, "update": update},
}
def compact_json(obj: Dict[str, Any]) -> str:
return json.dumps(obj, ensure_ascii=False, separators=(",", ":"))
def parse_jsonrpc_line(line: str) -> Optional[Dict[str, Any]]:
stripped = line.strip()
if not stripped:
return None
try:
obj = json.loads(stripped)
except json.JSONDecodeError:
return None
return obj if isinstance(obj, dict) else None
def content_blocks_to_text(blocks: List[Dict[str, Any]]) -> str:
parts: List[str] = []
for block in blocks:
if not isinstance(block, dict):
continue
block_type = block.get("type")
if block_type == "text":
text = block.get("text")
if isinstance(text, str) and text:
parts.append(text)
elif block_type == "resource_link":
name = block.get("name") or "resource"
uri = block.get("uri") or ""
desc = block.get("description") or ""
parts.append(f"[ResourceLink] {name}: {uri}\n{desc}".strip())
elif block_type == "resource":
uri = block.get("uri") or "resource"
text = block.get("text")
if isinstance(text, str) and text:
parts.append(f"[Resource] {uri}\n{text}")
else:
parts.append(f"[Resource] {uri}")
elif block_type == "image":
uri = block.get("uri") or "inline-image"
parts.append(f"[Image omitted] {uri}")
else:
parts.append(f"[Unsupported content block: {block_type}]")
return "\n\n".join(p for p in parts if p).strip()
def jsonrpc_error(code: int, message: str, req_id: Any = None, data: Any = None) -> Dict[str, Any]:
err: Dict[str, Any] = {"code": code, "message": message}
if data is not None:
err["data"] = data
return {"jsonrpc": JSONRPC_VERSION, "id": req_id, "error": err}
def jsonrpc_result(req_id: Any, result: Any) -> Dict[str, Any]:
return {"jsonrpc": JSONRPC_VERSION, "id": req_id, "result": result}
@dataclass
class SessionState:
session_id: str
cwd: str
agent: GeneraticAgent
current_prompt_id: Any = None
prompt_lock: threading.Lock = field(default_factory=threading.Lock)
class GenericAgentAcpBridge:
def __init__(self, llm_no: int = 0):
self.llm_no = llm_no
self._json_out = _acp_stdout
self._write_lock = threading.Lock()
self._sessions: Dict[str, SessionState] = {}
self._shutdown = False
def write_message(self, msg: Dict[str, Any]) -> None:
payload = compact_json(msg)
raw = (payload + "\n").encode("utf-8")
method = msg.get("method", msg.get("id", "?"))
eprint(f"[ACP-BRIDGE] >>> {payload[:500]}")
try:
with self._write_lock:
self._json_out.write(raw)
self._json_out.flush()
except Exception as e:
eprint(f"[ACP-BRIDGE] WRITE FAILED: {type(e).__name__}: {e}")
def new_agent(self) -> GeneraticAgent:
agent = GeneraticAgent()
agent.next_llm(self.llm_no)
agent.verbose = True
agent.inc_out = True
threading.Thread(target=agent.run, daemon=True).start()
return agent
def handle_initialize(self, req_id: Any, params: Dict[str, Any]) -> None:
        # Only ACP v1 is supported; advertise it regardless of the requested version.
        version = ACP_PROTOCOL_VERSION
result = {
"protocolVersion": version,
"agentCapabilities": {
"loadSession": False,
"mcpCapabilities": {"http": False, "sse": False},
"promptCapabilities": {
"image": False,
"audio": False,
"embeddedContext": False,
},
"sessionCapabilities": {},
},
"agentInfo": {
"name": "genericagent-acp",
"title": "GenericAgent",
"version": "0.1.0",
},
"authMethods": [],
}
self.write_message(jsonrpc_result(req_id, result))
def handle_session_new(self, req_id: Any, params: Dict[str, Any]) -> None:
cwd = params.get("cwd")
if not isinstance(cwd, str) or not cwd:
self.write_message(jsonrpc_error(-32602, "cwd is required", req_id))
return
if not os.path.isabs(cwd):
cwd = os.path.abspath(cwd)
session_id = f"ga_{uuid.uuid4().hex}"
agent = self.new_agent()
session = SessionState(session_id=session_id, cwd=cwd, agent=agent)
self._sessions[session_id] = session
self.write_message(
jsonrpc_result(
req_id,
{
"sessionId": session_id,
"modes": None,
"configOptions": None,
},
)
)
def handle_session_prompt(self, req_id: Any, params: Dict[str, Any]) -> None:
session_id = params.get("sessionId")
prompt_blocks = params.get("prompt")
session = self._sessions.get(session_id)
if session is None:
self.write_message(jsonrpc_error(-32602, "unknown sessionId", req_id))
return
if not isinstance(prompt_blocks, list):
self.write_message(jsonrpc_error(-32602, "prompt must be an array", req_id))
return
prompt_text = content_blocks_to_text(prompt_blocks)
if not prompt_text:
self.write_message(jsonrpc_error(-32602, "prompt must contain text or supported content", req_id))
return
with session.prompt_lock:
if session.current_prompt_id is not None:
self.write_message(
jsonrpc_error(-32603, "session already has an active prompt", req_id)
)
return
session.current_prompt_id = req_id
def run_prompt() -> None:
stop_reason = "end_turn"
try:
dq = session.agent.put_task(prompt_text, source="acp")
self._drain_agent_queue(session, dq)
except Exception as exc:
stop_reason = "end_turn"
self.write_message(
make_session_update(
session.session_id,
{
"sessionUpdate": "agent_message_chunk",
"content": make_text_block(
f"[Bridge error] {type(exc).__name__}: {exc}"
),
},
)
)
eprint("[GenericAgent ACP] prompt thread failed:", traceback.format_exc())
finally:
with session.prompt_lock:
finished_req_id = session.current_prompt_id
session.current_prompt_id = None
if finished_req_id is not None:
import time
time.sleep(0.1)
self.write_message(
jsonrpc_result(finished_req_id, {"stopReason": stop_reason})
)
threading.Thread(target=run_prompt, daemon=True).start()
def _drain_agent_queue(self, session: SessionState, dq: "queue.Queue[Dict[str, Any]]") -> None:
sent_any = False
while True:
item = dq.get()
if not isinstance(item, dict):
continue
# With inc_out=True, "next" items are already incremental deltas.
if "next" in item and "done" not in item:
delta = item["next"]
if isinstance(delta, str) and delta:
sent_any = True
try:
self.write_message(
make_session_update(
session.session_id,
{
"sessionUpdate": "agent_message_chunk",
"content": make_text_block(delta),
},
)
)
except Exception as e:
eprint(f"[ACP-BRIDGE] ERROR writing update: {e}")
if "done" in item:
# "done" text has post-processing (\n\n insertion)
# that shifts offsets — cannot safely compute a tail delta.
# Only use "done" content if nothing was streamed (error case).
if not sent_any:
done_text = item["done"]
if isinstance(done_text, str) and done_text:
try:
self.write_message(
make_session_update(
session.session_id,
{
"sessionUpdate": "agent_message_chunk",
"content": make_text_block(done_text),
},
)
)
except Exception as e:
eprint(f"[ACP-BRIDGE] ERROR writing done: {e}")
break
def handle_session_cancel(self, params: Dict[str, Any]) -> None:
session_id = params.get("sessionId")
session = self._sessions.get(session_id)
if session is None:
return
if session.current_prompt_id is not None:
session.agent.abort()
def handle_message(self, msg: Dict[str, Any]) -> None:
method = msg.get("method")
req_id = msg.get("id")
params = msg.get("params") or {}
try:
if method == "initialize":
self.handle_initialize(req_id, params)
elif method == "session/new":
self.handle_session_new(req_id, params)
elif method == "session/prompt":
self.handle_session_prompt(req_id, params)
elif method == "session/cancel":
self.handle_session_cancel(params)
elif method == "session/load":
self.write_message(jsonrpc_error(-32601, "session/load not supported", req_id))
elif method == "session/list":
self.write_message(jsonrpc_error(-32601, "session/list not supported", req_id))
elif method == "session/close":
self.write_message(jsonrpc_result(req_id, {}))
elif method is None:
if req_id is not None:
self.write_message(jsonrpc_error(-32600, "invalid request", req_id))
else:
if req_id is not None:
self.write_message(jsonrpc_error(-32601, f"method not found: {method}", req_id))
except Exception as exc:
eprint("[GenericAgent ACP] request handler failed:", traceback.format_exc())
if req_id is not None:
self.write_message(
jsonrpc_error(-32603, f"internal error: {type(exc).__name__}: {exc}", req_id)
)
def serve(self) -> None:
eprint("[GenericAgent ACP] bridge started")
stdin = io.TextIOWrapper(sys.stdin.buffer, encoding="utf-8", errors="replace") if hasattr(sys.stdin, 'buffer') else sys.stdin
for raw_line in stdin:
msg = parse_jsonrpc_line(raw_line)
if msg is None:
continue
self.handle_message(msg)
if self._shutdown:
break
eprint("[GenericAgent ACP] bridge stopped")
def main() -> int:
parser = argparse.ArgumentParser(description="GenericAgent ACP bridge over stdio")
parser.add_argument("--llm-no", type=int, default=0, help="LLM index for GenericAgent")
args = parser.parse_args()
bridge = GenericAgentAcpBridge(llm_no=args.llm_no)
bridge.serve()
return 0
if __name__ == "__main__":
raise SystemExit(main())
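The streaming policy in `_drain_agent_queue` above (forward incremental `"next"` deltas as they arrive; use the final `"done"` payload only as a fallback when nothing was streamed, so text is never duplicated) can be sketched in isolation. The plain dicts here stand in for the agent's output queue items:

```python
def drain(items):
    """Collect streamed deltas; fall back to the final payload only if
    no delta was ever emitted (e.g. an error before streaming started)."""
    chunks, sent_any = [], False
    for item in items:
        if "next" in item and "done" not in item:
            delta = item["next"]
            if isinstance(delta, str) and delta:
                chunks.append(delta)  # incremental delta: forward immediately
                sent_any = True
        if "done" in item:
            # The "done" text is post-processed and may not align with the
            # streamed deltas, so it is used only when nothing was streamed.
            if not sent_any and isinstance(item.get("done"), str) and item["done"]:
                chunks.append(item["done"])
            break
    return "".join(chunks)
```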
================================================
FILE: frontends/qqapp.py
================================================
import asyncio, os, sys, threading, time
from collections import deque
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from agentmain import GeneraticAgent
from chatapp_common import AgentChatMixin, ensure_single_instance, public_access, redirect_log, require_runtime, split_text
from llmcore import mykeys
try:
import botpy
from botpy.message import C2CMessage, GroupMessage
except Exception:
    print("Please install qq-botpy to use the QQ frontend: pip install qq-botpy")
sys.exit(1)
agent = GeneraticAgent(); agent.verbose = False
APP_ID = str(mykeys.get("qq_app_id", "") or "").strip()
APP_SECRET = str(mykeys.get("qq_app_secret", "") or "").strip()
ALLOWED = {str(x).strip() for x in mykeys.get("qq_allowed_users", []) if str(x).strip()}
PROCESSED_IDS, USER_TASKS = deque(maxlen=1000), {}
SEQ_LOCK, MSG_SEQ = threading.Lock(), 1
def _next_msg_seq():
global MSG_SEQ
with SEQ_LOCK:
MSG_SEQ += 1
return MSG_SEQ
def _build_intents():
try:
return botpy.Intents(public_messages=True, direct_message=True)
except Exception:
intents = botpy.Intents.none() if hasattr(botpy.Intents, "none") else botpy.Intents()
for attr in ("public_messages", "public_guild_messages", "direct_message", "direct_messages", "c2c_message", "c2c_messages", "group_at_message", "group_at_messages"):
if hasattr(intents, attr):
try:
setattr(intents, attr, True)
except Exception:
pass
return intents
def _make_bot_class(app):
class QQBot(botpy.Client):
def __init__(self):
super().__init__(intents=_build_intents(), ext_handlers=False)
async def on_ready(self):
print(f"[QQ] bot ready: {getattr(getattr(self, 'robot', None), 'name', 'QQBot')}")
async def on_c2c_message_create(self, message: C2CMessage):
await app.on_message(message, is_group=False)
async def on_group_at_message_create(self, message: GroupMessage):
await app.on_message(message, is_group=True)
async def on_direct_message_create(self, message):
await app.on_message(message, is_group=False)
return QQBot
class QQApp(AgentChatMixin):
label, source, split_limit = "QQ", "qq", 1500
def __init__(self):
super().__init__(agent, USER_TASKS)
self.client = None
async def send_text(self, chat_id, content, *, msg_id=None, is_group=False):
if not self.client:
return
api = self.client.api.post_group_message if is_group else self.client.api.post_c2c_message
key = "group_openid" if is_group else "openid"
for part in split_text(content, self.split_limit):
await api(**{key: chat_id, "msg_type": 0, "content": part, "msg_id": msg_id, "msg_seq": _next_msg_seq()})
async def on_message(self, data, is_group=False):
try:
msg_id = getattr(data, "id", None)
if msg_id in PROCESSED_IDS:
return
PROCESSED_IDS.append(msg_id)
content = (getattr(data, "content", "") or "").strip()
if not content:
return
author = getattr(data, "author", None)
user_id = str(getattr(author, "member_openid" if is_group else "user_openid", "") or getattr(author, "id", "") or "unknown")
chat_id = str(getattr(data, "group_openid", "") or user_id) if is_group else user_id
if not public_access(ALLOWED) and user_id not in ALLOWED:
print(f"[QQ] unauthorized user: {user_id}")
return
print(f"[QQ] message from {user_id} ({'group' if is_group else 'c2c'}): {content}")
if content.startswith("/"):
return await self.handle_command(chat_id, content, msg_id=msg_id, is_group=is_group)
asyncio.create_task(self.run_agent(chat_id, content, msg_id=msg_id, is_group=is_group))
except Exception:
import traceback
print("[QQ] handle_message error")
traceback.print_exc()
async def start(self):
self.client = _make_bot_class(self)()
delay, max_delay = 5, 300
while True:
started_at = time.monotonic()
try:
print(f"[QQ] bot starting... {time.strftime('%m-%d %H:%M')}")
await self.client.start(appid=APP_ID, secret=APP_SECRET)
except Exception as e:
print(f"[QQ] bot error: {e}")
if time.monotonic() - started_at >= 60:
delay = 5
print(f"[QQ] reconnect in {delay}s...")
await asyncio.sleep(delay)
delay = min(delay * 2, max_delay)
if __name__ == "__main__":
_LOCK_SOCK = ensure_single_instance(19528, "QQ")
require_runtime(agent, "QQ", qq_app_id=APP_ID, qq_app_secret=APP_SECRET)
redirect_log(__file__, "qqapp.log", "QQ", ALLOWED)
threading.Thread(target=agent.run, daemon=True).start()
asyncio.run(QQApp().start())
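The reconnect loop in `QQApp.start()` above doubles its retry delay up to a cap, but resets to the base delay whenever the previous connection stayed up long enough (60s) to count as healthy. A minimal sketch of that policy as a pure function (the constants mirror the ones in the file; the function itself is illustrative):

```python
def backoff(delay, uptime, base=5, cap=300, stable_after=60):
    """Return (sleep_for, next_delay) after a disconnect.

    delay: the current retry delay; uptime: seconds the last connection
    survived. A stable connection resets the backoff sequence."""
    if uptime >= stable_after:
        delay = base          # connection was healthy: start backoff over
    return delay, min(delay * 2, cap)
```

This keeps a flapping endpoint from being hammered (delay grows 5 → 10 → 20 → … → 300) while a one-off drop after a long-lived session recovers quickly.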
================================================
FILE: frontends/qtapp.py
================================================
"""
Desktop frontend, single-file edition – PySide6 chat panel + floating button, thanks to GaoZhiCheng
Dependencies: pip install PySide6
Optional: pip install markdown (Markdown rendering)
Usage: python frontends/qtapp.py
"""
from __future__ import annotations
import math, os, sys, json, glob, re, base64, time, threading
import queue as _queue
from datetime import datetime
from typing import Optional
from PySide6.QtWidgets import (
QWidget, QVBoxLayout, QHBoxLayout, QLabel, QPushButton,
QScrollArea, QFrame, QTextEdit, QStackedWidget,
QListWidget, QListWidgetItem, QSizePolicy, QFileDialog,
QSplitter, QTextBrowser, QApplication, QMessageBox,
QMenu, QLineEdit,
)
from PySide6.QtCore import (
Qt, QTimer, QPoint, QPointF, QByteArray, QSize,
Signal, QMetaObject, Q_ARG, QObject, QDateTime, QEvent,
)
from PySide6.QtGui import (
QPainter, QColor, QLinearGradient, QRadialGradient,
QPen, QPainterPath, QCursor, QFont, QIcon, QPixmap, QRegion,
)
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from agentmain import GeneraticAgent
from chatapp_common import FILE_HINT, HELP_TEXT, clean_reply, build_done_text, format_restore
# ══════════════════════════════════════════════════════════════════════
# FloatingButton
# ══════════════════════════════════════════════════════════════════════
class FloatingButton(QWidget):
SIZE = 60 # circle diameter
MARGIN = 14 # extra space for glow
TOTAL = SIZE + MARGIN * 2
def __init__(self, chat_panel: QWidget):
super().__init__()
self.chat_panel = chat_panel
self._drag_origin_global: QPoint | None = None
self._drag_origin_win: QPoint | None = None
self._dragged = False
self._glow = 0.5
self._glow_dir = 1
self._hovering = False
self._hover_clock = 0.0
self._hover_strength = 0.0
self._flow_phase = 0.0
self._running = False
self._last_toggle_ms = 0 # debounce timestamp
# Window flags: frameless, always on top, no taskbar entry
self.setWindowFlags(
Qt.FramelessWindowHint
| Qt.WindowStaysOnTopHint
| Qt.Tool
)
self.setAttribute(Qt.WA_TranslucentBackground)
self.setFixedSize(self.TOTAL, self.TOTAL)
self.setCursor(QCursor(Qt.PointingHandCursor))
# Smooth animation (~30 fps)
self._timer = QTimer(self)
self._timer.timeout.connect(self._tick)
self._timer.start(33)
# Default position: bottom-right of the work area
scr = QApplication.primaryScreen().availableGeometry()
self.move(scr.right() - self.TOTAL - 20, scr.bottom() - self.TOTAL - 20)
# ── Animation ────────────────────────────────────────
def _tick(self):
# running status: green when model is actively responding
self._running = bool(
getattr(self.chat_panel, "_is_streaming", False)
or getattr(getattr(self.chat_panel, "agent", None), "is_running", False)
)
self._glow += self._glow_dir * 0.04
if self._glow >= 1.0:
self._glow, self._glow_dir = 1.0, -1
elif self._glow <= 0.0:
self._glow, self._glow_dir = 0.0, 1
target = 1.0 if self._hovering else 0.0
self._hover_strength += (target - self._hover_strength) * 0.20
self._hover_clock += 0.033
self._flow_phase += 0.16 + (0.06 if self._running else 0.0) + (0.05 if self._hovering else 0.0)
self.update()
# ── Painting ──────────────────────────────────────────
def paintEvent(self, _event):
p = QPainter(self)
p.setRenderHint(QPainter.Antialiasing)
m = self.MARGIN
r = self.SIZE // 2
cx = m + r
# Rhythmic spring bounce: one main hop + one lighter rebound per beat.
beat_t = self._hover_clock % 1.18
spring = 0.0
if beat_t < 0.70:
spring += max(0.0, math.exp(-5.2 * beat_t) * math.sin(15.5 * beat_t))
if beat_t > 0.20:
rt = beat_t - 0.20
spring += 0.52 * max(0.0, math.exp(-7.0 * rt) * math.sin(21.0 * rt))
idle_sway = 0.20 * math.sin(self._hover_clock * 2.1)
bounce = int(round((spring * 7.2 + idle_sway) * self._hover_strength))
cy = m + r - bounce
if self._running:
# running: #2DFFF5 -> #FFF878
g0 = QColor(45, 255, 245, 195)
g1 = QColor(255, 248, 120, 195)
glow_rgb = (96, 255, 216)
else:
# idle: #103CE7 -> #64E9FF
g0 = QColor(16, 60, 231, 195)
g1 = QColor(100, 233, 255, 195)
glow_rgb = (74, 170, 255)
# --- Outer glow rings (3 layers) ---
base_alpha = int(45 + 25 * self._glow)
for i, gr in enumerate([r + 10, r + 6, r + 2]):
g = QRadialGradient(QPointF(cx, cy), gr)
g.setColorAt(0.0, QColor(glow_rgb[0], glow_rgb[1], glow_rgb[2], max(0, base_alpha - i * 14)))
g.setColorAt(1.0, QColor(glow_rgb[0], glow_rgb[1], glow_rgb[2], 0))
p.setBrush(g)
p.setPen(Qt.NoPen)
p.drawEllipse(int(cx - gr), int(cy - gr), int(gr * 2), int(gr * 2))
# --- Frosted glass disc behind main circle ---
frost = QRadialGradient(QPointF(cx, cy), r)
frost.setColorAt(0.0, QColor(30, 30, 45, 140))
frost.setColorAt(0.85, QColor(20, 20, 32, 160))
frost.setColorAt(1.0, QColor(14, 14, 20, 100))
p.setBrush(frost)
p.setPen(Qt.NoPen)
p.drawEllipse(cx - r, cy - r, r * 2, r * 2)
# --- Main circle (flowing state gradient) ---
spin = self._flow_phase
dx = math.cos(spin) * r
dy = math.sin(spin) * r
grad = QLinearGradient(cx - dx, cy - dy, cx + dx, cy + dy)
grad.setColorAt(0.0, g0)
grad.setColorAt(1.0, g1)
p.setBrush(grad)
p.setPen(QPen(QColor(255, 255, 255, 50), 1.5))
p.drawEllipse(cx - r, cy - r, r * 2, r * 2)
# --- Flowing glass streaks ---
clip = QPainterPath()
clip.addEllipse(float(cx - r), float(cy - r), float(r * 2), float(r * 2))
p.setClipPath(clip)
flow_shift = math.sin(self._flow_phase * 0.85) * (r * 0.7)
streak1 = QLinearGradient(cx - r + flow_shift, cy - r, cx + r + flow_shift, cy + r)
streak1.setColorAt(0.00, QColor(255, 255, 255, 0))
streak1.setColorAt(0.45, QColor(255, 255, 255, 42))
streak1.setColorAt(0.52, QColor(255, 255, 255, 78))
streak1.setColorAt(0.60, QColor(255, 255, 255, 24))
streak1.setColorAt(1.00, QColor(255, 255, 255, 0))
p.setBrush(streak1)
p.setPen(Qt.NoPen)
p.drawEllipse(cx - r, cy - r, r * 2, r * 2)
flow_shift_2 = math.cos(self._flow_phase * 1.2) * (r * 0.5)
streak2 = QLinearGradient(cx - r, cy + flow_shift_2, cx + r, cy - flow_shift_2)
streak2.setColorAt(0.00, QColor(255, 255, 255, 0))
streak2.setColorAt(0.35, QColor(255, 255, 255, 16))
streak2.setColorAt(0.50, QColor(255, 255, 255, 46))
streak2.setColorAt(0.65, QColor(255, 255, 255, 16))
streak2.setColorAt(1.00, QColor(255, 255, 255, 0))
p.setBrush(streak2)
p.drawEllipse(cx - r, cy - r, r * 2, r * 2)
# --- Top highlight ---
hl = QLinearGradient(cx, cy - r, cx, cy)
hl.setColorAt(0.0, QColor(255, 255, 255, 72))
hl.setColorAt(1.0, QColor(255, 255, 255, 0))
p.setBrush(hl)
p.drawRect(cx - r, cy - r, r * 2, r)
p.setClipping(False)
# --- Bot icon ---
p.setPen(QPen(QColor(255, 255, 255, 220), 1.8))
p.setBrush(Qt.NoBrush)
# Head
p.drawRoundedRect(cx - 9, cy - 6, 18, 12, 2, 2)
# Eyes
p.setBrush(QColor(255, 255, 255, 220))
p.setPen(Qt.NoPen)
p.drawEllipse(cx - 6, cy - 3, 4, 4)
p.drawEllipse(cx + 2, cy - 3, 4, 4)
# Antenna stem
p.setPen(QPen(QColor(255, 255, 255, 220), 1.8))
p.drawLine(cx, cy - 6, cx, cy - 10)
# Antenna tip
p.setBrush(QColor(255, 255, 255, 190))
p.setPen(Qt.NoPen)
p.drawEllipse(cx - 2, cy - 13, 4, 4)
def enterEvent(self, event):
self._hovering = True
self.update()
super().enterEvent(event)
def leaveEvent(self, event):
self._hovering = False
self.update()
super().leaveEvent(event)
# ── Mouse events (drag + click) ───────────────────────
def mousePressEvent(self, event):
if event.button() == Qt.LeftButton:
self._drag_origin_global = event.globalPosition().toPoint()
self._drag_origin_win = self.pos()
self._dragged = False
def mouseMoveEvent(self, event):
if event.buttons() == Qt.LeftButton and self._drag_origin_global:
delta = event.globalPosition().toPoint() - self._drag_origin_global
if abs(delta.x()) > 5 or abs(delta.y()) > 5:
self._dragged = True
if self._dragged:
new = self._drag_origin_win + delta
scr = QApplication.primaryScreen().availableGeometry()
new.setX(max(scr.left(), min(new.x(), scr.right() - self.width())))
new.setY(max(scr.top(), min(new.y(), scr.bottom() - self.height())))
self.move(new)
def mouseDoubleClickEvent(self, event):
# Qt sends Press→Release→DoubleClick→Release on double-click.
# The first Release already toggled the panel; swallow the DoubleClick
# so the second Release does NOT trigger a second toggle.
self._dragged = True # mark as "dragged" → Release will be ignored
event.accept()
def mouseReleaseEvent(self, event):
if event.button() == Qt.LeftButton:
if not self._dragged:
self._toggle()
self._dragged = False
self._drag_origin_global = None
# ── Toggle panel ──────────────────────────────────────
def _toggle(self):
now = QDateTime.currentMSecsSinceEpoch()
if now - self._last_toggle_ms < 500: # 500 ms debounce
return
self._last_toggle_ms = now
if self.chat_panel.isVisible():
self.chat_panel.hide()
else:
self._position_panel()
self.chat_panel.show()
self.chat_panel.raise_()
self.chat_panel.activateWindow()
def _position_panel(self):
scr = QApplication.primaryScreen().availableGeometry()
btn = self.geometry()
pw = self.chat_panel.width()
ph = self.chat_panel.height()
# Prefer left of button, bottom-aligned
x = btn.left() - pw - 12
y = btn.bottom() - ph
x = max(scr.left() + 10, min(x, scr.right() - pw - 10))
y = max(scr.top() + 10, min(y, scr.bottom() - ph - 10))
self.chat_panel.move(x, y)
# ══════════════════════════════════════════════════════════════════════
# ChatPanel
# ══════════════════════════════════════════════════════════════════════
# ── constants ─────────────────────────────────────────────────────────────────
HISTORY_FILE = "memory/chat_history.json"
TEXT_FILE_EXTS = {
".txt", ".md", ".py", ".json", ".csv", ".yaml", ".yml",
".log", ".ini", ".toml", ".xml", ".html", ".js", ".ts", ".sql",
}
MAX_INLINE_CHARS = 6000
MAX_UPLOAD_BYTES = 10 * 1024 * 1024 # 10 MB
AUTO_IDLE_THRESHOLD = 1800 # seconds before autonomous trigger
AUTO_COOLDOWN = 120 # seconds between triggers
C = {
"bg": QColor(14, 14, 18),
"panel": QColor(20, 20, 24, 248),
"border": QColor(45, 45, 50),
"accent": "#7c3aed",
"text": "#e4e4e7",
"muted": "#71717a",
"user_g0": QColor(79, 70, 229),
"user_g1": QColor(124, 58, 237),
"asst_bg": QColor(39, 39, 42, 210),
"asst_bdr": QColor(63, 63, 70),
"send_g0": QColor(220, 38, 38),
"send_g1": QColor(239, 68, 68),
"green": "#22c55e",
"hover_bg": "rgba(63,63,70,0.6)",
"accent_bg":"rgba(124,58,237,0.25)",
"accent_bdr":"rgba(124,58,237,0.5)",
}
SCROLLBAR_STYLE = """
QScrollBar:vertical { width: 5px; background: transparent; border: none; }
QScrollBar::handle:vertical {
background: rgba(255,255,255,0.12); border-radius: 2px; min-height: 20px;
}
QScrollBar::add-line:vertical, QScrollBar::sub-line:vertical { height: 0; }
QScrollBar::add-page:vertical, QScrollBar::sub-page:vertical { background: none; }
"""
_SVG_COPY = ''
_SVG_REGEN = ''
_SVG_CHAT = ''
_SVG_CLOCK = ''
_SVG_SEARCH = ''
_SVG_BOOK = ''
_SVG_GEAR = ''
_SVG_PLUS = ''
_SVG_CLIP = _SVG_PLUS
_SVG_STOP = ''
_SVG_RESET = _SVG_REGEN
_SVG_SAVE = ''
_SVG_TRASH = ''
_SVG_BOLT = ''
_SVG_PLAY = ''
_SVG_FILE = ''
_SVG_USER = ''
_SVG_BOT = ''
_SVG_SEND = ''
_MD_CSS = """
body { color: #e4e4e7; font-family: "Arial", "Microsoft YaHei", sans-serif; font-size: 13px; line-height: 1.6; font-weight: 400; }
h1 { color: #f4f4f5; font-size: 20px; font-weight: 700; border-bottom: 1px solid #3f3f46; padding-bottom: 4px; margin-top: 16px; }
h2 { color: #f4f4f5; font-size: 17px; font-weight: 700; border-bottom: 1px solid #3f3f46; padding-bottom: 3px; margin-top: 14px; }
h3 { color: #f4f4f5; font-size: 15px; font-weight: 600; margin-top: 12px; }
h4,h5,h6 { color: #d4d4d8; font-size: 13px; font-weight: 600; margin-top: 10px; }
code { background: rgba(63,63,70,0.6); color: #c4b5fd; padding: 1px 4px; border-radius: 3px;
font-family: Consolas, "Courier New", monospace; font-size: 12px; }
pre { background: rgba(24,24,30,0.95); border: 1px solid #3f3f46; border-radius: 6px;
padding: 10px 12px; margin: 8px 0; }
pre code { background: transparent; padding: 0; color: #d4d4d8; }
a { color: #818cf8; text-decoration: none; }
a:hover { text-decoration: underline; }
blockquote { border-left: 3px solid #7c3aed; margin: 8px 0 8px 0; padding: 4px 0 4px 12px; color: #a1a1aa; }
table { border-collapse: collapse; margin: 8px 0; }
th, td { border: 1px solid #3f3f46; padding: 5px 10px; }
th { background: rgba(63,63,70,0.35); color: #d4d4d8; font-weight: 700; }
hr { border: none; border-top: 1px solid #3f3f46; margin: 12px 0; }
ul, ol { padding-left: 22px; margin: 4px 0; }
li { margin: 2px 0; }
p { margin: 6px 0; }
"""
def _md_to_html(text: str) -> str:
try:
import markdown
return markdown.markdown(
text, extensions=["fenced_code", "tables", "nl2br", "sane_lists"]
)
except ImportError:
pass
    # Fallback: minimal hand-rolled Markdown-to-HTML renderer
    html, in_code, in_ul = [], False, False
    for raw in text.split("\n"):
        if raw.strip().startswith("```"):
            html.append("</pre>" if in_code else "<pre>")
            in_code = not in_code
            continue
        if in_code:
            html.append(
                raw.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
            )
            continue
        line = raw
        line = re.sub(r"`([^`]+)`", r"<code>\1</code>", line)
        line = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", line)
        line = re.sub(r"\*(.+?)\*", r"<i>\1</i>", line)
        line = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r'<a href="\2">\1</a>', line)
        if re.match(r"^#{1,6}\s", line):
            lvl = len(line.split()[0])
            line = f"<h{lvl}>{line[lvl:].strip()}</h{lvl}>"
        elif re.match(r"^-{3,}$|^_{3,}$|^\*{3,}$", line.strip()):
            line = "<hr>"
        elif re.match(r"^\s*[-*+]\s", line):
            content = re.sub(r"^\s*[-*+]\s", "", line)
            if not in_ul:
                html.append("<ul>")
                in_ul = True
            line = f"<li>{content}</li>"
        else:
            if in_ul:
                html.append("</ul>")
                in_ul = False
            line = f"<p>{line}</p>" if line.strip() else ""
        html.append(line)
    if in_code:
        html.append("</pre>")
    if in_ul:
        html.append("</ul>")
    return "\n".join(html)
_icon_cache: dict[str, QIcon] = {}
def _svg_icon(key: str, svg_template: str, color: str = "#a1a1aa",
size: int = 16) -> QIcon:
cache_key = f"{key}_{color}_{size}"
if cache_key not in _icon_cache:
try:
from PySide6.QtSvg import QSvgRenderer
except ImportError:
return QIcon()
data = QByteArray(svg_template.format(c=color).encode("utf-8"))
renderer = QSvgRenderer(data)
pixmap = QPixmap(size, size)
pixmap.fill(Qt.transparent)
painter = QPainter(pixmap)
renderer.render(painter)
painter.end()
_icon_cache[cache_key] = QIcon(pixmap)
return _icon_cache[cache_key]
# ── utilities ─────────────────────────────────────────────────────────────────
def _make_session_id() -> str:
return datetime.now().strftime("%Y%m%d_%H%M%S_%f")
def _load_history() -> list:
if os.path.exists(HISTORY_FILE):
try:
with open(HISTORY_FILE, "r", encoding="utf-8") as f:
return json.load(f)
except Exception:
pass
return []
def _save_history(history: list):
os.makedirs(os.path.dirname(HISTORY_FILE), exist_ok=True)
with open(HISTORY_FILE, "w", encoding="utf-8") as f:
json.dump(history, f, ensure_ascii=False, indent=2)
def _build_prompt_with_uploads(prompt: str, files: list) -> tuple:
"""
files: list of {'name': str, 'type': str, 'raw': bytes}
returns (full_prompt, display_prompt, display_attachments)
"""
if not files:
return prompt, prompt, []
os.makedirs("temp/uploaded", exist_ok=True)
attachment_chunks = ["\n\n[用户上传附件 — 文件已保存到本地磁盘,可用 file_read 工具读取]"]
display_attachments = []
img_count, file_names = 0, []
for f in files:
raw, name, mime = f["raw"], f["name"], f.get("type", "")
size = len(raw)
ext = os.path.splitext(name)[1].lower()
safe = re.sub(r"[^A-Za-z0-9._\-]", "_", name)
saved = os.path.join(
"temp", "uploaded",
f"{datetime.now().strftime('%Y%m%d_%H%M%S_%f')}_{safe}",
)
try:
with open(saved, "wb") as out:
out.write(raw)
except Exception:
saved = "(保存失败)"
if mime.startswith("image/"):
b64 = base64.b64encode(raw).decode()
attachment_chunks.append(
f"\n- [图片附件] {name} ({size} bytes)\n 磁盘路径: {saved}"
f"\n data:{mime};base64,{b64}"
)
display_attachments.append({"type": "image", "name": name})
img_count += 1
elif ext in TEXT_FILE_EXTS:
text = raw.decode("utf-8", errors="replace")
attachment_chunks.append(
f"\n--- 文本文件: {name} ({size} bytes) ---\n磁盘路径: {saved}\n{text[:MAX_INLINE_CHARS]}"
+ ("\n[内容已截断,请用 file_read 读取完整内容]" if len(text) > MAX_INLINE_CHARS else "")
)
display_attachments.append({"type": "file", "name": name})
file_names.append(name)
else:
attachment_chunks.append(
f"\n- 文件: {name} ({size} bytes)\n 磁盘路径: {saved}"
)
display_attachments.append({"type": "file", "name": name})
file_names.append(name)
parts = []
if img_count:
parts.append(f"{img_count} 张图片")
if file_names:
parts.append(f"{len(file_names)} 个文件({'、'.join(file_names)})")
display_prompt = f"{prompt}\n\n📎 已附带:{','.join(parts)}" if parts else prompt
return prompt + "\n".join(attachment_chunks), display_prompt, display_attachments
# ── small reusable widgets ────────────────────────────────────────────────────
class _Separator(QFrame):
def __init__(self, parent=None):
super().__init__(parent)
self.setFixedHeight(1)
self.setStyleSheet(f"background: {C['border'].name()};")
class _Badge(QLabel):
def __init__(self, text: str, parent=None):
super().__init__(text, parent)
self.setStyleSheet(
"QLabel { background: rgba(63,63,70,0.9); color: #a1a1aa;"
" border: 1px solid #3f3f46; border-radius: 9px;"
" padding: 1px 8px; font-size: 11px; }"
)
class _StreamingBadge(QLabel):
def __init__(self, parent=None):
super().__init__("处理中…", parent)
self.setStyleSheet(
"QLabel { background: rgba(124,58,237,0.18); color: #c4b5fd;"
" border: 1px solid rgba(124,58,237,0.35); border-radius: 9px;"
" padding: 1px 8px; font-size: 11px; }"
)
self.hide()
class _FoldableTextBrowser(QTextBrowser):
"""QTextBrowser subclass that reliably detects clicks on fold anchors."""
def __init__(self, parent=None):
super().__init__(parent)
self.viewport().installEventFilter(self)
def eventFilter(self, obj, event):
from PySide6.QtCore import QEvent
if obj is self.viewport() and event.type() == QEvent.MouseButtonRelease:
            href = self.anchorAt(event.position().toPoint())
if href and href.startswith("#fold_"):
from urllib.parse import unquote
title = unquote(href[6:])
p = self.parent()
while p and not isinstance(p, _MsgRow):
p = p.parent()
if p and hasattr(p, '_toggle_fold'):
p._toggle_fold(title)
return True
return super().eventFilter(obj, event)
class _MsgRow(QWidget):
"""A single message row – flat layout with avatar, inspired by ChatGPT / Qwen."""
_ACTION_BTN = """
QPushButton {
background: transparent; border: none; border-radius: 4px; padding: 3px;
}
QPushButton:hover { background: %s; }
""" % C["hover_bg"]
def __init__(self, text: str, role: str, parent=None, on_resend=None, on_delete=None, on_rewrite=None, created_at: str = None):
super().__init__(parent)
self._text = text
self._role = role
self._on_resend = on_resend
self._on_delete = on_delete
self._on_rewrite = on_rewrite
self._created_at = created_at
self._action_row = None
self._finished = True
is_user = role == "user"
self.setStyleSheet("background: transparent;")
outer = QHBoxLayout(self)
outer.setContentsMargins(12, 10, 12, 10)
outer.setSpacing(10)
outer.setAlignment(Qt.AlignTop)
# ── avatar ──
avatar = QLabel()
avatar.setFixedSize(30, 30)
avatar.setAlignment(Qt.AlignCenter)
svg_data = _SVG_USER if is_user else _SVG_BOT
avatar_color = "#c8c8d0" if is_user else "#9eb4d0"
pm = QPixmap(30, 30)
pm.fill(QColor(0, 0, 0, 0))
from PySide6.QtSvg import QSvgRenderer
renderer = QSvgRenderer(QByteArray(svg_data.replace("{c}", avatar_color).encode()))
p = QPainter(pm)
renderer.render(p)
p.end()
avatar.setPixmap(pm)
avatar.setStyleSheet(
"QLabel { background: rgba(255,255,255,0.04); border: 1px solid rgba(255,255,255,0.10);"
" border-radius: 15px; }"
)
# ── content column ──
content_col = QVBoxLayout()
content_col.setContentsMargins(0, 0, 0, 0)
content_col.setSpacing(2)
role_lbl = QLabel("你" if is_user else "助手")
role_lbl.setStyleSheet(
"color: #d4d4d8; font-size: 12px; font-weight: 700; background: transparent;"
)
if is_user:
role_lbl.setAlignment(Qt.AlignRight)
content_col.addWidget(role_lbl)
if is_user:
# ── user: right-aligned bubble ──
bubble = QWidget()
bubble.setStyleSheet(
"background: rgba(63,63,70,0.4); border-radius: 12px;"
)
bubble_ly = QVBoxLayout(bubble)
bubble_ly.setContentsMargins(12, 8, 12, 8)
bubble_ly.setSpacing(0)
label = QLabel(text)
label.setWordWrap(True)
label.setTextInteractionFlags(Qt.TextSelectableByMouse)
label.setSizePolicy(QSizePolicy.Preferred, QSizePolicy.Minimum)
label.setStyleSheet(
"QLabel { background: transparent; color: #e4e4e7;"
" padding: 0; font-size: 14px; line-height: 1.6; }"
)
bubble_ly.addWidget(label)
self._label = label
# Size bubble to text: measure longest line, cap at 420
fm = label.fontMetrics()
text_w = max((fm.horizontalAdvance(ln) for ln in text.split('\n')), default=0)
bubble.setMinimumWidth(min(text_w + 24, 420))
bubble.setMaximumWidth(420)
content_col.addWidget(bubble, 0, Qt.AlignRight)
# ── user message action row ──
self._action_row = QWidget()
self._action_row.setStyleSheet("background: transparent;")
alayout = QHBoxLayout(self._action_row)
alayout.setContentsMargins(0, 4, 0, 0)
alayout.setSpacing(4)
alayout.setAlignment(Qt.AlignRight)
icon_sz = QSize(15, 15)
copy_btn = QPushButton()
copy_btn.setIcon(_svg_icon("copy", _SVG_COPY))
copy_btn.setIconSize(icon_sz)
copy_btn.setFixedSize(26, 24)
copy_btn.setStyleSheet(self._ACTION_BTN)
copy_btn.setToolTip("复制")
copy_btn.setCursor(QCursor(Qt.PointingHandCursor))
copy_btn.clicked.connect(self._copy_text)
alayout.addWidget(copy_btn)
if on_delete:
delete_btn = QPushButton()
delete_btn.setIcon(_svg_icon("delete", _SVG_TRASH))
delete_btn.setIconSize(icon_sz)
delete_btn.setFixedSize(26, 24)
delete_btn.setStyleSheet(self._ACTION_BTN)
delete_btn.setToolTip("删除")
delete_btn.setCursor(QCursor(Qt.PointingHandCursor))
delete_btn.clicked.connect(self._do_delete)
alayout.addWidget(delete_btn)
if on_rewrite:
rewrite_btn = QPushButton()
rewrite_btn.setIcon(_svg_icon("rewrite", _SVG_RESET))
rewrite_btn.setIconSize(icon_sz)
rewrite_btn.setFixedSize(26, 24)
rewrite_btn.setStyleSheet(self._ACTION_BTN)
rewrite_btn.setToolTip("重写")
rewrite_btn.setCursor(QCursor(Qt.PointingHandCursor))
rewrite_btn.clicked.connect(self._do_rewrite)
alayout.addWidget(rewrite_btn)
alayout.addStretch()
if created_at:
from datetime import datetime
try:
dt = datetime.fromisoformat(created_at)
time_lbl = QLabel(dt.strftime("%Y-%m-%d %H:%M"))
time_lbl.setStyleSheet("color: #a1a1aa; font-size: 11px; background: transparent;")
alayout.addWidget(time_lbl)
                except Exception:
                    pass
self._action_row.hide()
content_col.addWidget(self._action_row, 0, Qt.AlignRight)
else:
# ── assistant: left-aligned, no bubble ──
browser = _FoldableTextBrowser()
browser.setReadOnly(True)
browser.setOpenExternalLinks(True)
browser.setVerticalScrollBarPolicy(Qt.ScrollBarAlwaysOff)
browser.setHorizontalScrollBarPolicy(Qt.ScrollBarAlwaysOff)
browser.setSizePolicy(QSizePolicy.Expanding, QSizePolicy.Minimum)
browser.document().setDefaultStyleSheet(_MD_CSS)
browser.setStyleSheet(
"QTextBrowser { background: transparent; color: #e4e4e7;"
" border: none; padding: 0; font-size: 14px; }"
)
            self._folded_ids = set()  # titles of currently folded blocks
self._auto_fold_new_blocks(text)
browser.setHtml(self._render_with_folds(text))
self._label = browser
content_col.addWidget(browser)
self._adjust_browser_height()
self._action_row = QWidget()
self._action_row.setStyleSheet("background: transparent;")
alayout = QHBoxLayout(self._action_row)
alayout.setContentsMargins(0, 4, 0, 0)
alayout.setSpacing(4)
icon_sz = QSize(15, 15)
copy_btn = QPushButton()
copy_btn.setIcon(_svg_icon("copy", _SVG_COPY))
copy_btn.setIconSize(icon_sz)
copy_btn.setFixedSize(26, 24)
copy_btn.setStyleSheet(self._ACTION_BTN)
copy_btn.setToolTip("复制")
copy_btn.setCursor(QCursor(Qt.PointingHandCursor))
copy_btn.clicked.connect(self._copy_text)
alayout.addWidget(copy_btn)
if on_delete:
delete_btn = QPushButton()
delete_btn.setIcon(_svg_icon("delete", _SVG_TRASH))
delete_btn.setIconSize(icon_sz)
delete_btn.setFixedSize(26, 24)
delete_btn.setStyleSheet(self._ACTION_BTN)
delete_btn.setToolTip("删除")
delete_btn.setCursor(QCursor(Qt.PointingHandCursor))
delete_btn.clicked.connect(self._do_delete)
alayout.addWidget(delete_btn)
if on_resend:
regen_btn = QPushButton()
regen_btn.setIcon(_svg_icon("regen", _SVG_REGEN))
regen_btn.setIconSize(icon_sz)
regen_btn.setFixedSize(26, 24)
regen_btn.setStyleSheet(self._ACTION_BTN)
regen_btn.setToolTip("重新生成")
regen_btn.setCursor(QCursor(Qt.PointingHandCursor))
regen_btn.clicked.connect(self._do_resend)
alayout.addWidget(regen_btn)
export_btn = QPushButton()
export_btn.setIcon(_svg_icon("save", _SVG_SAVE))
export_btn.setIconSize(icon_sz)
export_btn.setFixedSize(26, 24)
export_btn.setStyleSheet(self._ACTION_BTN)
export_btn.setToolTip("导出为md")
export_btn.setCursor(QCursor(Qt.PointingHandCursor))
export_btn.clicked.connect(self._export_as_md)
alayout.addWidget(export_btn)
alayout.addStretch()
if created_at:
from datetime import datetime
try:
dt = datetime.fromisoformat(created_at)
time_lbl = QLabel(dt.strftime("%Y-%m-%d %H:%M"))
time_lbl.setStyleSheet("color: #a1a1aa; font-size: 11px; background: transparent;")
alayout.addWidget(time_lbl)
                except Exception:
                    pass
self._action_row.hide()
content_col.addWidget(self._action_row)
# ── assemble: assistant left, user right ──
if is_user:
outer.addStretch(1)
outer.addLayout(content_col, 0)
outer.addWidget(avatar, 0, Qt.AlignTop)
else:
outer.addWidget(avatar, 0, Qt.AlignTop)
outer.addLayout(content_col, 1)
def _copy_text(self):
QApplication.clipboard().setText(self._text)
def _do_resend(self):
if self._on_resend:
self._on_resend()
def _do_delete(self):
if self._on_delete:
self._on_delete()
def _do_rewrite(self):
if self._on_rewrite:
self._on_rewrite()
def _export_as_md(self):
        from PySide6.QtWidgets import QFileDialog
        from datetime import datetime
default_name = f"msg_{datetime.now().strftime('%Y%m%d_%H%M%S')}.md"
file_path, _ = QFileDialog.getSaveFileName(
self, "导出为 Markdown", default_name, "Markdown 文件 (*.md);;所有文件 (*)"
)
if file_path:
try:
with open(file_path, "w", encoding="utf-8") as f:
f.write(self._text)
except Exception as e:
import traceback
traceback.print_exc()
def enterEvent(self, event):
if self._action_row and self._finished:
self._action_row.show()
super().enterEvent(event)
def leaveEvent(self, event):
if self._action_row:
self._action_row.hide()
super().leaveEvent(event)
def resizeEvent(self, event):
super().resizeEvent(event)
if self._role != "user" and hasattr(self, '_label'):
self._adjust_browser_height()
def set_finished(self, done: bool):
self._finished = done
if not done and self._action_row:
self._action_row.hide()
def _adjust_browser_height(self):
doc = self._label.document()
w = self._label.width()
if w < 50:
w = 460
doc.setTextWidth(w - 6)
self._label.setFixedHeight(int(doc.size().height() + 8))
def set_text(self, text: str):
self._text = text
if self._role == "user":
self._label.setText(text)
self._label.adjustSize()
else:
self._auto_fold_new_blocks(text)
self._label.setHtml(self._render_with_folds(text))
self._adjust_browser_height()
def highlight(self, keyword: str):
"""Apply highlight and return keyword's y position in document, or None."""
if not keyword or not self._text:
return None
kw_lower = keyword.lower()
text_lower = self._text.lower()
if kw_lower not in text_lower:
return None
        if self._role == "user":
            escaped = self._text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
            kw_esc = keyword.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
            # Wrap matches in an amber span (same color as the browser-path highlight);
            # QLabel auto-detects rich text
            highlighted = escaped.replace(
                kw_esc,
                f'<span style="color:#fbbf24;font-weight:700;">{kw_esc}</span>',
            )
            self._label.setText(highlighted)
            self._label.adjustSize()
            return 0  # plain text, keyword at top
else:
from PySide6.QtGui import QTextDocument, QTextCursor, QTextCharFormat
doc = self._label.document()
cursor = QTextCursor(doc)
flags = QTextDocument.FindFlags(0)
fmt = QTextCharFormat()
fmt.setBackground(QColor(251, 191, 36, 90))
fmt.setForeground(QColor(251, 191, 36))
keyword_y = None
while True:
cursor = doc.find(keyword, cursor, flags)
if cursor.isNull():
break
cursor.mergeCharFormat(fmt)
if keyword_y is None:
keyword_y = self._label.cursorRect(cursor).y()
self._adjust_browser_height()
return keyword_y
def clear_highlight(self):
if self._role == "user":
self._label.setText(self._text)
self._label.adjustSize()
else:
self._label.setHtml(self._render_with_folds(self._text))
self._adjust_browser_height()
def _parse_foldable_blocks(self, text: str):
"""解析文本为可折叠块,返回 [(type, title_or_None, content), ...]"""
import re
lines = text.split('\n')
blocks = []
current_type = "normal"
current_title = None
current_lines = []
for line in lines:
            # Check whether this line starts a foldable block
llm_match = re.match(r'^\s*\*\*LLM Running \(Turn \d+\) \.\.\.\*\*\s*$', line)
tool_match = re.match(r'^\s*🛠️\s*Tool:', line)
tool_compact_match = re.match(r'^\s*🛠️\s+\w+\(', line)
is_foldable_start = llm_match or tool_match or tool_compact_match
if is_foldable_start:
if current_lines:
blocks.append((current_type, current_title, '\n'.join(current_lines)))
title = line.strip()
if llm_match:
title = line.strip().replace('**', '')
current_type = "foldable"
current_title = title
current_lines = [line]
else:
current_lines.append(line)
if current_lines:
blocks.append((current_type, current_title, '\n'.join(current_lines)))
return blocks
def _auto_fold_new_blocks(self, text: str):
"""将新出现的折叠块加入 _folded_ids(仅在此处修改集合)"""
for _, title, _ in self._parse_foldable_blocks(text):
if title is not None and title not in self._folded_ids:
self._folded_ids.add(title)
def _render_with_folds(self, text: str) -> str:
"""渲染文本为带折叠的 HTML(纯渲染,不修改 _folded_ids)"""
from urllib.parse import quote
blocks = self._parse_foldable_blocks(text)
html_parts = []
for i, (block_type, title, content) in enumerate(blocks):
if block_type == "normal":
                html_parts.append(f"<div>{_md_to_html(content)}</div>")
else:
safe_title = quote(title, safe='')
display_title = title.replace('**', '')
if title in self._folded_ids:
                    # Folded state: show only the title plus an expand link
html_parts.append(
f'