Repository: YunQiAI/OpenManusWeb
Branch: dev_web_app
Commit: 3ceaa9182b39
Files: 107
Total size: 546.8 KB
Directory structure:
gitextract_o63q7tbk/
├── .gitattributes
├── .github/
│ ├── ISSUE_TEMPLATE/
│ │ ├── config.yaml
│ │ ├── request_new_features.md
│ │ └── show_me_the_bug.md
│ ├── PULL_REQUEST_TEMPLATE.md
│ └── workflows/
│ ├── build-package.yaml
│ ├── pre-commit.yaml
│ └── stale.yaml
├── .gitignore
├── .pre-commit-config.yaml
├── CODE_OF_CONDUCT.md
├── DEVELOPMENT_LOG.md
├── LICENSE
├── README.md
├── README_zh.md
├── app/
│ ├── __init__.py
│ ├── agent/
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── llm_wrapper.py
│ │ ├── manus.py
│ │ ├── planning.py
│ │ ├── react.py
│ │ ├── swe.py
│ │ └── toolcall.py
│ ├── config.py
│ ├── exceptions.py
│ ├── flow/
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── flow_factory.py
│ │ ├── planning.py
│ │ └── tracking_support.py
│ ├── llm.py
│ ├── logger.py
│ ├── prompt/
│ │ ├── __init__.py
│ │ ├── manus.py
│ │ ├── planning.py
│ │ ├── swe.py
│ │ └── toolcall.py
│ ├── schema.py
│ ├── tool/
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── bash.py
│ │ ├── browser_use_tool.py
│ │ ├── create_chat_completion.py
│ │ ├── file_saver.py
│ │ ├── google_search.py
│ │ ├── planning.py
│ │ ├── python_execute.py
│ │ ├── run.py
│ │ ├── str_replace_editor.py
│ │ ├── terminate.py
│ │ └── tool_collection.py
│ ├── utils/
│ │ └── log_monitor.py
│ └── web/
│ ├── README.md
│ ├── __init__.py
│ ├── api.py
│ ├── app.py
│ ├── llm_monitor.py
│ ├── log_handler.py
│ ├── log_parser.py
│ ├── static/
│ │ ├── archive/
│ │ │ ├── apiManager.js
│ │ │ ├── chatManager.js
│ │ │ ├── fileViewerManager.js
│ │ │ ├── final_interface.html
│ │ │ ├── index.html
│ │ │ ├── logManager.js
│ │ │ ├── main.js
│ │ │ ├── new_chatManager.js
│ │ │ ├── new_fileViewerManager.js
│ │ │ ├── new_index.html
│ │ │ ├── new_interface_demo.html
│ │ │ ├── new_main.js
│ │ │ ├── new_style.css
│ │ │ ├── new_thinkingManager.js
│ │ │ ├── new_websocketManager.js
│ │ │ ├── new_workspaceManager.js
│ │ │ ├── simple_test.html
│ │ │ ├── standalone.html
│ │ │ ├── style.css
│ │ │ ├── terminalManager.js
│ │ │ └── websocketManager.js
│ │ ├── connected_chatManager.js
│ │ ├── connected_fileViewerManager.js
│ │ ├── connected_interface.html
│ │ ├── connected_interface.js
│ │ ├── connected_thinkingManager.js
│ │ ├── connected_websocketManager.js
│ │ ├── connected_workspaceManager.js
│ │ └── i18n.js
│ ├── templates/
│ │ ├── archive/
│ │ │ ├── index.html
│ │ │ └── new_index.html
│ │ ├── index.html
│ │ └── job_detail.html
│ └── thinking_tracker.py
├── config/
│ └── config.example.toml
├── examples/
│ ├── japan-travel-plan/
│ │ ├── japan_travel_guide_instructions.txt
│ │ ├── japan_travel_handbook.html
│ │ ├── japan_travel_handbook_mobile.html
│ │ └── japan_travel_handbook_print.html
│ └── readme.md
├── main.py
├── pytest.ini
├── requirements.txt
├── run_flow.py
├── setup.py
├── tools/
│ └── debug_log_monitor.py
└── web_run.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitattributes
================================================
# HTML files skew the repository's language statistics, so exclude them
*.html linguist-detectable=false
# Auto detect text files and perform LF normalization
* text=auto eol=lf
# Ensure shell scripts use LF (Linux style) line endings on Windows
*.sh text eol=lf
# Treat specific binary files as binary and prevent line ending conversion
*.png binary
*.jpg binary
*.gif binary
*.ico binary
*.jpeg binary
*.mp3 binary
*.zip binary
*.bin binary
# Preserve original line endings for specific document files
*.doc text eol=crlf
*.docx text eol=crlf
*.pdf binary
# Ensure source code and script files use LF line endings
*.py text eol=lf
*.js text eol=lf
*.html text eol=lf
*.css text eol=lf
# Specify custom diff driver for specific file types
*.md diff=markdown
*.json diff=json
*.mp4 filter=lfs diff=lfs merge=lfs -text
*.mov filter=lfs diff=lfs merge=lfs -text
*.webm filter=lfs diff=lfs merge=lfs -text
================================================
FILE: .github/ISSUE_TEMPLATE/config.yaml
================================================
blank_issues_enabled: false
contact_links:
- name: "📑 Read online docs"
about: Find tutorials, use cases, and guides in the OpenManus documentation.
================================================
FILE: .github/ISSUE_TEMPLATE/request_new_features.md
================================================
---
name: "🤔 Request new features"
about: Suggest ideas or features you’d like to see implemented in OpenManus.
title: ''
labels: kind/features
assignees: ''
---
**Feature description**
**Your Feature**
================================================
FILE: .github/ISSUE_TEMPLATE/show_me_the_bug.md
================================================
---
name: "🪲 Show me the Bug"
about: Report a bug encountered while using OpenManus and seek assistance.
title: ''
labels: kind/bug
assignees: ''
---
**Bug description**
**Bug solved method**
**Environment information**
- System version:
- Python version:
- OpenManus version or branch:
- Installation method (e.g., `pip install -r requirements.txt` or `pip install -e .`):
**Screenshots or logs**
================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
**Features**
- Feature 1
- Feature 2
**Feature Docs**
**Influence**
**Result**
**Other**
================================================
FILE: .github/workflows/build-package.yaml
================================================
name: Build and upload Python package
on:
workflow_dispatch:
release:
types: [created, published]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install setuptools wheel twine
- name: Set package version
run: |
export VERSION="${GITHUB_REF#refs/tags/v}"
sed -i "s/version=.*/version=\"${VERSION}\",/" setup.py
- name: Build and publish
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
run: |
python setup.py bdist_wheel sdist
twine upload dist/*
================================================
FILE: .github/workflows/pre-commit.yaml
================================================
name: Pre-commit checks
on:
pull_request:
branches:
- '**'
push:
branches:
- '**'
jobs:
pre-commit-check:
runs-on: ubuntu-latest
steps:
- name: Checkout Source Code
uses: actions/checkout@v4
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install pre-commit and tools
run: |
python -m pip install --upgrade pip
pip install pre-commit black==23.1.0 isort==5.12.0 autoflake==2.0.1
- name: Run pre-commit hooks
run: pre-commit run --all-files
================================================
FILE: .github/workflows/stale.yaml
================================================
name: Close inactive issues
on:
schedule:
- cron: "5 0 * * *"
jobs:
close-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v5
with:
days-before-issue-stale: 30
days-before-issue-close: 14
stale-issue-label: "inactive"
stale-issue-message: "This issue has been inactive for 30 days. Please comment if you have updates."
close-issue-message: "This issue was closed due to prolonged inactivity. Reopen if still relevant."
days-before-pr-stale: -1
days-before-pr-close: -1
repo-token: ${{ secrets.GITHUB_TOKEN }}
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
config/config.toml
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
# PyPI configuration file
.pypirc
# Logs
logs/
# Data
data/
# Workspace
workspace/
================================================
FILE: .pre-commit-config.yaml
================================================
repos:
- repo: https://github.com/psf/black
rev: 23.1.0
hooks:
- id: black
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- repo: https://github.com/PyCQA/autoflake
rev: v2.0.1
hooks:
- id: autoflake
args: [
--remove-all-unused-imports,
--ignore-init-module-imports,
--expand-star-imports,
--remove-duplicate-keys,
--remove-unused-variables,
--recursive,
--in-place,
--exclude=__init__.py,
]
files: \.py$
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
- id: isort
args: [
"--profile", "black",
"--filter-files",
"--lines-after-imports=2",
]
================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people.
* Being respectful of differing opinions, viewpoints, and experiences.
* Giving and gracefully accepting constructive feedback.
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience.
* Focusing on what is best not just for us as individuals, but for the overall
community.
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind.
* Trolling, insulting or derogatory comments, and personal or political attacks.
* Public or private harassment.
* Publishing others' private information, such as a physical or email address,
without their explicit permission.
* Other conduct which could reasonably be considered inappropriate in a
professional setting.
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
mannaandpoem@gmail.com
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
### Slack and Discord Etiquettes
These Slack and Discord etiquette guidelines are designed to foster an inclusive, respectful, and productive environment
for all community members. By following these best practices, we ensure effective communication and collaboration while
minimizing disruptions. Let’s work together to build a supportive and welcoming community!
- Communicate respectfully and professionally, avoiding sarcasm or harsh language, and remember that tone can be
difficult to interpret in text.
- Use threads for specific discussions to keep channels organized and easier to follow.
- Tag others only when their input is critical or urgent, and use @here, @channel or @everyone sparingly to minimize
disruptions.
- Be patient, as open-source contributors and maintainers often have other commitments and may need time to respond.
- Post questions or discussions in the most relevant
channel ([discord - #general](https://discord.com/channels/1125308739348594758/1138430348557025341)).
- When asking for help or raising issues, include necessary details like links, screenshots, or clear explanations to
provide context.
- Keep discussions in public channels whenever possible to allow others to benefit from the conversation, unless the
matter is sensitive or private.
- Always adhere to [our standards](https://github.com/mannaandpoem/OpenManus/blob/main/CODE_OF_CONDUCT.md#our-standards)
to ensure a welcoming and collaborative environment.
- If you choose to mute a channel, consider setting up alerts for topics that still interest you to stay engaged. For
Slack, Go to Settings → Notifications → My Keywords to add specific keywords that will notify you when mentioned. For
example, if you're here for discussions about LLMs, mute the channel if it’s too busy, but set notifications to alert
you only when “LLMs” appears in messages. Also for Discord, go to the channel notifications and choose the option that
best describes your need.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
================================================
FILE: DEVELOPMENT_LOG.md
================================================
# OpenManus Development Log
## Project Overview
OpenManus is an open-source AI assistant project that aims to provide Manus-like functionality without requiring an invite code. It was rapidly prototyped by members of the MetaGPT team, and a web interface has now been added to improve the user experience.
## Development Timeline
### 2025-03-06
- Project initialization
- Implemented the basic command-line interface (CLI) version
- Integrated basic AI model functionality
### 2025-03-07
- Started designing the web interface
- Created the FastAPI application skeleton
- Implemented basic routes and templates
### 2025-03-08
- Implemented the frontend, including chat and log views
- Added WebSocket support for real-time communication
- Resolved WebSocket dependency issues
- Added automatic browser launch
- Implemented a split layout: logs on the left, conversation on the right
- Added a stop-request feature
- Implemented Manus-style task progress logging
- Refined the Manus-style progress log system
- Simplified the log display for clarity
- Improved documentation and usage instructions
## Tech Stack
- Backend: FastAPI, Python 3.12
- Frontend: HTML, CSS, vanilla JavaScript
- Communication: WebSocket, REST API
- Containerization: Docker deployment supported
- AI models: multiple large language model APIs supported
## Features
1. **Web interface**
   - Responsive design for mobile and desktop devices
   - Split layout: logs on the left, conversation on the right
   - Real-time display of processing status and logs
2. **Real-time communication**
   - WebSocket-based live log updates
   - Automatic fallback to polling when WebSocket is unavailable
3. **Logging system**
   - Multiple log levels (info, warning, error, success)
   - Processing steps shown live, in chronological order
   - Simple but reliable log capture
   - **New: Manus-style task progress log**
     - Concise, present-tense task descriptions
     - Live display of the AI's ongoing thinking and research
     - Minimal presentation without timestamps
     - Summary message on task completion
4. **UX improvements**
   - Automatic browser launch
   - Stop-request button
   - Clear-conversation feature
   - Automatic code block formatting
## Problems Encountered and Solutions
### Problem 1: WebSocket connection errors
**Description**: "Unsupported upgrade request" and "No supported WebSocket library detected" errors appeared at startup.
**Solution**:
- Added WebSocket dependency detection
- Install the `websockets` library or `uvicorn[standard]`
- Implemented graceful frontend fallback to polling
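A dependency check of this kind can be sketched as follows. This is illustrative only; the repository's actual detection logic may differ:

```python
import importlib.util


def websocket_support_available() -> bool:
    """Return True if a WebSocket backend usable by uvicorn is installed.

    uvicorn's WebSocket support relies on either the "websockets" package
    or "wsproto" (the latter ships with uvicorn[standard]).
    """
    return any(
        importlib.util.find_spec(pkg) is not None
        for pkg in ("websockets", "wsproto")
    )
```

When this returns `False`, the server can log a warning and the frontend can fall back to polling instead of attempting a WebSocket upgrade.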
### Problem 2: Log formatting errors
**Description**: "TypeError: string indices must be integers, not 'str'" occurred when trying to capture loguru logs.
**Solution**:
- Created a dedicated log-handling module
- Implemented a SimpleLogCapture class instead of relying on loguru's complex record format
- Used a custom context manager to handle logging
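A minimal sketch of such a line-based capture (hypothetical; the real module lives in `app/web/log_handler.py` and may differ). loguru accepts a plain callable as a sink, which receives the already-formatted message string and sidesteps parsing its record dict entirely:

```python
class SimpleLogCapture:
    """Collect formatted log lines in memory instead of parsing records."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def write(self, message: str) -> None:
        # loguru delivers the fully formatted line, trailing newline included.
        text = message.strip()
        if text:
            self.lines.append(text)
```

Registered with something like `logger.add(capture.write, format="{level} | {message}")`, the capture receives plain strings, so the "string indices" error cannot occur.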
### Problem 3: Layout issues on mobile devices
**Description**: The side-by-side layout did not work well on small screens.
**Solution**:
- Added media queries
- Switched to a vertical layout on small screens
- Adjusted each component's maximum width
### Problem 4: Manus-like log display
**Description**: Users expect real-time, Manus-style task progress logs rather than technical log output.
**Solution**:
- Created a dedicated thinking-step tracking system
- Translated the AI's thinking process into concise, present-tense task descriptions
- Kept the log view clean, showing only what users care about
- Added a summary message on task completion
## Manus-Style Log Implementation
To achieve Manus-like log presentation, we adopted the following approach:
1. **Task tracking system**:
   - A ThinkingTracker class records the key steps of the AI's thinking process
   - Complex background processing is translated into concise, user-friendly descriptions
   - Optional task progress percentage estimates
2. **Frontend presentation**:
   - No technical timestamps or log levels
   - Simple text lines, one per thinking step
   - Fade-in/fade-out effects for a better user experience
3. **Real-time WebSocket updates**:
   - AI processing is pushed to the frontend as it happens
   - Thinking steps for long tasks can be sent in batches
4. **Task completion summary**:
   - A concise summary is generated when a task finishes
   - Suggestions for follow-up actions are provided
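The tracking system described above could look roughly like this. This is an illustrative sketch only; the actual implementation lives in `app/web/thinking_tracker.py` and will differ:

```python
class ThinkingTracker:
    """Minimal sketch: record user-friendly thinking steps per job."""

    def __init__(self) -> None:
        self._steps: dict[str, list[str]] = {}

    def add_step(self, job_id: str, description: str) -> None:
        # Each step is a concise, present-tense description,
        # e.g. "Searching the web for travel options"
        self._steps.setdefault(job_id, []).append(description)

    def get_steps(self, job_id: str) -> list[str]:
        return list(self._steps.get(job_id, []))

    def summarize(self, job_id: str) -> str:
        # Generated when the task completes, for display in the log panel.
        count = len(self._steps.get(job_id, []))
        return f"Task finished after {count} steps."
```

New steps can then be pushed over WebSocket as they are recorded, and `summarize` provides the completion message.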
## Roadmap
1. **Feature enhancements**
   - User authentication
   - Saved conversation history
   - Multi-language support
2. **Performance**
   - More efficient WebSocket communication
   - Log pagination
   - Request queue management
3. **User experience**
   - More theme options
   - Conversation export
   - Voice input
   - **Extend the Manus-style log system** with templates for more task types
4. **Integration testing**
   - End-to-end tests
   - Automated UI tests
   - Performance benchmarks
## Contributing
Contributions to OpenManus Web are welcome! You can participate by:
1. Reporting bugs or suggesting features
2. Submitting pull requests with code improvements
3. Improving the documentation
4. Sharing your experience
Please make sure your code follows the project's style and passes all tests.
---
*Last updated: 2025-03-08*
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2025 manna_and_poem
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
English | [中文](README_zh.md)
[](https://github.com/gregpr07/browser-use/stargazers)
[](https://twitter.com/openmanus)
[](https://discord.gg/6dn7Sa3a)
[](https://opensource.org/licenses/MIT)
# 👋 OpenManus
Manus is incredible, but OpenManus can achieve any idea without an *Invite Code* 🛫!
Our team
members [@mannaandpoem](https://github.com/mannaandpoem) [@XiangJinyu](https://github.com/XiangJinyu) [@MoshiQAQ](https://github.com/MoshiQAQ) [@didiforgithub](https://github.com/didiforgithub) [@stellaHSR](https://github.com/stellaHSR)
and [@Xinyu Zhang](https://x.com/xinyzng) come from [@MetaGPT](https://github.com/geekan/MetaGPT) and elsewhere. The prototype
was launched within 3 hours, and we keep building!
It's a simple implementation, so we welcome any suggestions, contributions, and feedback!
Enjoy your own agent with OpenManus!
We're also excited to introduce [OpenManus-RL](https://github.com/OpenManus/OpenManus-RL), an open-source project dedicated to reinforcement-learning-based tuning methods (such as GRPO) for LLM agents, developed collaboratively by researchers from UIUC and OpenManus.
## Web Interface Preview

The web interface is developed by [@YunQiAI](https://github.com/YunQiAI).
For more information, please refer to [app/web/README.md](app/web/README.md).
## Project Demo
## Installation
We provide two installation methods. Method 2 (using uv) is recommended for faster installation and better dependency management.
### Method 1: Using conda
1. Create a new conda environment:
```bash
conda create -n open_manus python=3.12
conda activate open_manus
```
2. Clone the repository:
```bash
git clone https://github.com/mannaandpoem/OpenManus.git
cd OpenManus
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
### Method 2: Using uv (Recommended)
1. Install uv (A fast Python package installer and resolver):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
2. Clone the repository:
```bash
git clone https://github.com/mannaandpoem/OpenManus.git
cd OpenManus
```
3. Create a new virtual environment and activate it:
```bash
uv venv
source .venv/bin/activate # On Unix/macOS
# Or on Windows:
# .venv\Scripts\activate
```
4. Install dependencies:
```bash
uv pip install -r requirements.txt
```
## Configuration
OpenManus requires configuration for the LLM APIs it uses. Follow these steps to set up your configuration:
1. Create a `config.toml` file in the `config` directory (you can copy from the example):
```bash
cp config/config.example.toml config/config.toml
```
2. Edit `config/config.toml` to add your API keys and customize settings:
```toml
# Global LLM configuration
[llm]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..." # Replace with your actual API key
max_tokens = 4096
temperature = 0.0
# Optional configuration for specific LLM models
[llm.vision]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..." # Replace with your actual API key
```
## Quick Start
One line to run OpenManus:
```bash
python main.py --web
```
Then enter your idea in the terminal!
### Web Interface
You can also use OpenManus through a user-friendly web interface:
```bash
uvicorn app.web.app:app --reload
```
or
```bash
python web_run.py
```
Then open your browser and navigate to `http://localhost:8000` to access the web interface. The web UI allows you to:
- Interact with OpenManus using a chat-like interface
- Monitor AI thinking process in real-time
- View and access workspace files
- See execution progress visually
To try the unstable version, you can also run:
```bash
python run_flow.py
```
## How to contribute
We welcome any friendly suggestions and helpful contributions! Just create issues or submit pull requests.
Or contact @mannaandpoem via 📧email: mannaandpoem@gmail.com
## Roadmap
After comprehensively gathering feedback from community members, we have decided to adopt a 3-4 day iteration cycle to gradually implement the highly anticipated features.
- [ ] Enhance Planning capabilities, optimize task breakdown and execution logic
- [ ] Introduce standardized evaluation metrics (based on GAIA and TAU-Bench) for continuous performance assessment and optimization
- [ ] Expand model adaptation and optimize low-cost application scenarios
- [ ] Implement containerized deployment to simplify installation and usage workflows
- [ ] Enrich example libraries with more practical cases, including analysis of both successful and failed examples
- [ ] Frontend/backend development to improve user experience
## Community Group
Join our Discord group
[](https://discord.gg/jkT5udP9bw)
Join our networking group on Feishu and share your experience with other developers!
## Star History
[](https://star-history.com/#mannaandpoem/OpenManus&Date)
## Acknowledgements
Special thanks to [anthropic-computer-use](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo)
and [browser-use](https://github.com/browser-use/browser-use) for providing foundational support for this project!
OpenManus is built by contributors from the MetaGPT community. Thanks to this vibrant community of agent developers!
================================================
FILE: app/__init__.py
================================================
================================================
FILE: app/agent/__init__.py
================================================
from app.agent.base import BaseAgent
from app.agent.planning import PlanningAgent
from app.agent.react import ReActAgent
from app.agent.swe import SWEAgent
from app.agent.toolcall import ToolCallAgent
__all__ = [
"BaseAgent",
"PlanningAgent",
"ReActAgent",
"SWEAgent",
"ToolCallAgent",
]
================================================
FILE: app/agent/base.py
================================================
import asyncio
from abc import ABC, abstractmethod
from contextlib import asynccontextmanager
from typing import List, Literal, Optional
from pydantic import BaseModel, Field, model_validator
from app.llm import LLM
from app.logger import logger
from app.schema import AgentState, Memory, Message
class BaseAgent(BaseModel, ABC):
"""Abstract base class for managing agent state and execution.
Provides foundational functionality for state transitions, memory management,
and a step-based execution loop. Subclasses must implement the `step` method.
"""
# Core attributes
name: str = Field(..., description="Unique name of the agent")
description: Optional[str] = Field(None, description="Optional agent description")
# Prompts
system_prompt: Optional[str] = Field(
None, description="System-level instruction prompt"
)
next_step_prompt: Optional[str] = Field(
None, description="Prompt for determining next action"
)
# Dependencies
llm: LLM = Field(default_factory=LLM, description="Language model instance")
memory: Memory = Field(default_factory=Memory, description="Agent's memory store")
state: AgentState = Field(
default=AgentState.IDLE, description="Current agent state"
)
# Execution control
max_steps: int = Field(default=10, description="Maximum steps before termination")
current_step: int = Field(default=0, description="Current step in execution")
duplicate_threshold: int = 2
class Config:
arbitrary_types_allowed = True
extra = "allow" # Allow extra fields for flexibility in subclasses
@model_validator(mode="after")
def initialize_agent(self) -> "BaseAgent":
"""Initialize agent with default settings if not provided."""
if self.llm is None or not isinstance(self.llm, LLM):
self.llm = LLM(config_name=self.name.lower())
if not isinstance(self.memory, Memory):
self.memory = Memory()
return self
@asynccontextmanager
async def state_context(self, new_state: AgentState):
"""Context manager for safe agent state transitions.
Args:
new_state: The state to transition to during the context.
Yields:
None: Allows execution within the new state.
Raises:
ValueError: If the new_state is invalid.
"""
if not isinstance(new_state, AgentState):
raise ValueError(f"Invalid state: {new_state}")
previous_state = self.state
self.state = new_state
try:
yield
except Exception as e:
self.state = AgentState.ERROR # Transition to ERROR on failure
raise e
finally:
self.state = previous_state # Revert to previous state
def update_memory(
self,
role: Literal["user", "system", "assistant", "tool"],
content: str,
**kwargs,
) -> None:
"""Add a message to the agent's memory.
Args:
role: The role of the message sender (user, system, assistant, tool).
content: The message content.
**kwargs: Additional arguments (e.g., tool_call_id for tool messages).
Raises:
ValueError: If the role is unsupported.
"""
message_map = {
"user": Message.user_message,
"system": Message.system_message,
"assistant": Message.assistant_message,
"tool": lambda content, **kw: Message.tool_message(content, **kw),
}
if role not in message_map:
raise ValueError(f"Unsupported message role: {role}")
msg_factory = message_map[role]
msg = msg_factory(content, **kwargs) if role == "tool" else msg_factory(content)
self.memory.add_message(msg)
async def run(
        self, request: Optional[str] = None, cancel_event: Optional[asyncio.Event] = None
) -> str:
"""Execute the agent's main loop asynchronously.
Args:
request: Optional initial user request to process.
cancel_event: Optional asyncio event to signal cancellation.
Returns:
A string summarizing the execution results.
Raises:
RuntimeError: If the agent is not in IDLE state at start.
"""
if self.state != AgentState.IDLE:
raise RuntimeError(f"Cannot run agent from state: {self.state}")
if request:
self.update_memory("user", request)
results: List[str] = []
async with self.state_context(AgentState.RUNNING):
while (
self.current_step < self.max_steps and self.state != AgentState.FINISHED
):
# Check for cancellation
if cancel_event and cancel_event.is_set():
                    return "Operation was cancelled"
self.current_step += 1
logger.info(f"Executing step {self.current_step}/{self.max_steps}")
step_result = await self.step()
# Check for stuck state
if self.is_stuck():
self.handle_stuck_state()
results.append(f"Step {self.current_step}: {step_result}")
if self.current_step >= self.max_steps:
results.append(f"Terminated: Reached max steps ({self.max_steps})")
return "\n".join(results) if results else "No steps executed"
@abstractmethod
async def step(self) -> str:
"""Execute a single step in the agent's workflow.
Must be implemented by subclasses to define specific behavior.
"""
def handle_stuck_state(self):
"""Handle stuck state by adding a prompt to change strategy"""
        stuck_prompt = (
            "Observed duplicate responses. Consider new strategies and avoid "
            "repeating ineffective paths already attempted."
        )
self.next_step_prompt = f"{stuck_prompt}\n{self.next_step_prompt}"
logger.warning(f"Agent detected stuck state. Added prompt: {stuck_prompt}")
def is_stuck(self) -> bool:
"""Check if the agent is stuck in a loop by detecting duplicate content"""
if len(self.memory.messages) < 2:
return False
last_message = self.memory.messages[-1]
if not last_message.content:
return False
# Count identical content occurrences
duplicate_count = sum(
1
for msg in reversed(self.memory.messages[:-1])
if msg.role == "assistant" and msg.content == last_message.content
)
return duplicate_count >= self.duplicate_threshold
@property
def messages(self) -> List[Message]:
"""Retrieve a list of messages from the agent's memory."""
return self.memory.messages
@messages.setter
def messages(self, value: List[Message]):
"""Set the list of messages in the agent's memory."""
self.memory.messages = value
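The duplicate-detection heuristic in `is_stuck` can be exercised in isolation. This is a minimal sketch that mirrors its logic, using plain dicts as hypothetical stand-ins for `Message` objects:

```python
def is_stuck(messages, duplicate_threshold=2):
    # Mirrors BaseAgent.is_stuck: compare the last message's content
    # against earlier assistant messages and count exact duplicates.
    if len(messages) < 2:
        return False
    last = messages[-1]
    if not last["content"]:
        return False
    duplicates = sum(
        1
        for msg in reversed(messages[:-1])
        if msg["role"] == "assistant" and msg["content"] == last["content"]
    )
    return duplicates >= duplicate_threshold


msgs = [{"role": "assistant", "content": "retry"}] * 3
print(is_stuck(msgs))       # True: two earlier duplicates reach the threshold
print(is_stuck(msgs[:2]))   # False: only one earlier duplicate
```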
================================================
FILE: app/agent/llm_wrapper.py
================================================
"""
LLM callback wrapper: adds callback support to an existing LLM instance.
"""
import functools
import inspect
import os
from typing import Any, Callable, Dict
class LLMCallbackWrapper:
    """Wrapper class that adds callback support to an LLM"""

    def __init__(self, llm_instance):
        self._llm = llm_instance
        self._callbacks = {
            "before_request": [],  # Before a request is sent
            "after_request": [],  # After a response is received
            "on_error": [],  # When an error occurs
        }
self._wrap_methods()
    def _wrap_methods(self):
        """Wrap the LLM instance's methods to add callback support"""
        # Common method names exposed by LLM implementations
        method_names = ["completion", "chat", "generate", "run", "call", "__call__"]
        for name in method_names:
            if hasattr(self._llm, name) and callable(getattr(self._llm, name)):
                original_method = getattr(self._llm, name)
                # Check whether the method is a coroutine function
                is_async = inspect.iscoroutinefunction(original_method)
                if is_async:

                    @functools.wraps(original_method)
                    async def async_wrapped(*args, _orig=original_method, **kwargs):
                        # _orig is bound via a default argument so each wrapper
                        # keeps its own target (avoids late binding in the loop)
                        # Pre-request callbacks
                        request_data = {"args": args, "kwargs": kwargs}
                        self._execute_callbacks("before_request", request_data)
                        try:
                            # Call the original method
                            result = await _orig(*args, **kwargs)
                            # Post-request callbacks
                            response_data = {
                                "request": request_data,
                                "response": result,
                            }
                            self._execute_callbacks("after_request", response_data)
                            # Save the conversation when running inside the workspace
                            current_dir = os.getcwd()
                            if "workspace" in current_dir:
                                self._save_conversation_to_file(args, kwargs, result)
                            return result
                        except Exception as e:
                            # Error callbacks
                            error_data = {
                                "request": request_data,
                                "error": str(e),
                                "exception": e,
                            }
                            self._execute_callbacks("on_error", error_data)
                            raise

                    # Replace with the wrapped method
                    setattr(self, name, async_wrapped)
                else:

                    @functools.wraps(original_method)
                    def wrapped(*args, _orig=original_method, **kwargs):
                        # Pre-request callbacks
                        request_data = {"args": args, "kwargs": kwargs}
                        self._execute_callbacks("before_request", request_data)
                        try:
                            # Call the original method
                            result = _orig(*args, **kwargs)
                            # Post-request callbacks
                            response_data = {
                                "request": request_data,
                                "response": result,
                            }
                            self._execute_callbacks("after_request", response_data)
                            return result
                        except Exception as e:
                            # Error callbacks
                            error_data = {
                                "request": request_data,
                                "error": str(e),
                                "exception": e,
                            }
                            self._execute_callbacks("on_error", error_data)
                            raise

                    # Replace with the wrapped method
                    setattr(self, name, wrapped)
    def _save_conversation_to_file(self, args, kwargs, result):
        """Save the conversation to a file (when enabled)"""
        try:
            # Only save when the SAVE_LLM_CONVERSATION environment variable is set
            if os.environ.get("SAVE_LLM_CONVERSATION", "0") == "1":
                prompt = kwargs.get("prompt", "")
                if not prompt and args:
                    prompt = args[0]
                if not prompt:
                    return
                # Append to the conversation log file
                with open("llm_conversation.txt", "a", encoding="utf-8") as f:
                    f.write("\n--- LLM REQUEST ---\n")
                    f.write(str(prompt)[:2000])  # Limit length
                    f.write("\n\n--- LLM RESPONSE ---\n")
                    # Extract the response content
                    response_content = ""
                    if isinstance(result, str):
                        response_content = result
                    elif isinstance(result, dict) and "content" in result:
                        response_content = result["content"]
                    elif hasattr(result, "content"):
                        response_content = result.content
                    else:
                        response_content = str(result)
                    f.write(response_content[:2000])  # Limit length
                    f.write("\n\n--------------------\n")
        except Exception as e:
            print(f"Error saving conversation to file: {str(e)}")
    def register_callback(self, event_type: str, callback: Callable):
        """Register a callback function.

        Args:
            event_type: Event type: "before_request", "after_request", or "on_error".
            callback: Callback function invoked with the corresponding event data.
        """
if event_type in self._callbacks:
self._callbacks[event_type].append(callback)
return True
return False
    def unregister_callback(self, event_type: str, callback: Callable):
        """Unregister a specific callback function"""
if event_type in self._callbacks and callback in self._callbacks[event_type]:
self._callbacks[event_type].remove(callback)
return True
return False
    def clear_callbacks(self, event_type: str = None):
        """Clear registered callback functions"""
        if event_type is None:
            # Clear callbacks of every type
            for event in self._callbacks:
                self._callbacks[event] = []
        elif event_type in self._callbacks:
            # Clear callbacks of the given type
            self._callbacks[event_type] = []
    def _execute_callbacks(self, event_type: str, data: Dict[str, Any]):
        """Execute the callbacks registered for the given event type"""
        if event_type in self._callbacks:
            for callback in self._callbacks[event_type]:
                try:
                    callback(data)
                except Exception as e:
                    print(f"Callback execution error: {str(e)}")
    def __getattr__(self, name):
        """Forward other attribute access to the underlying LLM instance"""
return getattr(self._llm, name)
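The wrapping pattern above can be condensed into a self-contained sketch. `DummyLLM` and `MiniWrapper` are hypothetical stand-ins that mirror how `LLMCallbackWrapper` intercepts an async method and fires `before_request`/`after_request` callbacks, binding the original method as a default argument to avoid the classic late-binding closure pitfall:

```python
import asyncio
import functools


class DummyLLM:
    # Hypothetical stand-in for a real LLM client.
    async def chat(self, prompt):
        return f"echo: {prompt}"


class MiniWrapper:
    """Condensed version of LLMCallbackWrapper for a single async method."""

    def __init__(self, llm):
        self._llm = llm
        self._callbacks = {"before_request": [], "after_request": []}
        original = llm.chat

        @functools.wraps(original)
        async def wrapped(*args, _orig=original, **kwargs):
            # Fire pre-request callbacks, call through, then post-request callbacks.
            for cb in self._callbacks["before_request"]:
                cb({"args": args, "kwargs": kwargs})
            result = await _orig(*args, **kwargs)
            for cb in self._callbacks["after_request"]:
                cb({"response": result})
            return result

        self.chat = wrapped


events = []
w = MiniWrapper(DummyLLM())
w._callbacks["before_request"].append(lambda d: events.append("before"))
w._callbacks["after_request"].append(lambda d: events.append("after"))
print(asyncio.run(w.chat("hi")))  # echo: hi
print(events)  # ['before', 'after']
```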
================================================
FILE: app/agent/manus.py
================================================
from pydantic import Field
from app.agent.toolcall import ToolCallAgent
from app.prompt.manus import NEXT_STEP_PROMPT, SYSTEM_PROMPT
from app.tool import Terminate, ToolCollection
from app.tool.browser_use_tool import BrowserUseTool
from app.tool.file_saver import FileSaver
from app.tool.google_search import GoogleSearch
from app.tool.python_execute import PythonExecute
class Manus(ToolCallAgent):
"""
A versatile general-purpose agent that uses planning to solve various tasks.
    This agent extends ToolCallAgent with a comprehensive set of tools and
    capabilities, including Python execution, web browsing, file operations,
    and information retrieval, to handle a wide range of user requests.
"""
name: str = "Manus"
description: str = (
"A versatile agent that can solve various tasks using multiple tools"
)
system_prompt: str = SYSTEM_PROMPT
next_step_prompt: str = NEXT_STEP_PROMPT
# Add general-purpose tools to the tool collection
available_tools: ToolCollection = Field(
default_factory=lambda: ToolCollection(
PythonExecute(), GoogleSearch(), BrowserUseTool(), FileSaver(), Terminate()
)
)
================================================
FILE: app/agent/planning.py
================================================
import time
from typing import Dict, List, Literal, Optional
from pydantic import Field, model_validator
from app.agent.toolcall import ToolCallAgent
from app.logger import logger
from app.prompt.planning import NEXT_STEP_PROMPT, PLANNING_SYSTEM_PROMPT
from app.schema import Message, ToolCall
from app.tool import PlanningTool, Terminate, ToolCollection
class PlanningAgent(ToolCallAgent):
"""
An agent that creates and manages plans to solve tasks.
This agent uses a planning tool to create and manage structured plans,
and tracks progress through individual steps until task completion.
"""
name: str = "planning"
description: str = "An agent that creates and manages plans to solve tasks"
system_prompt: str = PLANNING_SYSTEM_PROMPT
next_step_prompt: str = NEXT_STEP_PROMPT
available_tools: ToolCollection = Field(
default_factory=lambda: ToolCollection(PlanningTool(), Terminate())
)
tool_choices: Literal["none", "auto", "required"] = "auto"
special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])
tool_calls: List[ToolCall] = Field(default_factory=list)
active_plan_id: Optional[str] = Field(default=None)
# Add a dictionary to track the step status for each tool call
step_execution_tracker: Dict[str, Dict] = Field(default_factory=dict)
current_step_index: Optional[int] = None
max_steps: int = 20
@model_validator(mode="after")
def initialize_plan_and_verify_tools(self) -> "PlanningAgent":
"""Initialize the agent with a default plan ID and validate required tools."""
self.active_plan_id = f"plan_{int(time.time())}"
if "planning" not in self.available_tools.tool_map:
self.available_tools.add_tool(PlanningTool())
return self
async def think(self) -> bool:
"""Decide the next action based on plan status."""
prompt = (
f"CURRENT PLAN STATUS:\n{await self.get_plan()}\n\n{self.next_step_prompt}"
if self.active_plan_id
else self.next_step_prompt
)
self.messages.append(Message.user_message(prompt))
# Get the current step index before thinking
self.current_step_index = await self._get_current_step_index()
result = await super().think()
# After thinking, if we decided to execute a tool and it's not a planning tool or special tool,
# associate it with the current step for tracking
if result and self.tool_calls:
            latest_tool_call = self.tool_calls[0]  # The first tool call of this round
if (
latest_tool_call.function.name != "planning"
and latest_tool_call.function.name not in self.special_tool_names
and self.current_step_index is not None
):
self.step_execution_tracker[latest_tool_call.id] = {
"step_index": self.current_step_index,
"tool_name": latest_tool_call.function.name,
"status": "pending", # Will be updated after execution
}
return result
async def act(self) -> str:
"""Execute a step and track its completion status."""
result = await super().act()
# After executing the tool, update the plan status
if self.tool_calls:
latest_tool_call = self.tool_calls[0]
# Update the execution status to completed
if latest_tool_call.id in self.step_execution_tracker:
self.step_execution_tracker[latest_tool_call.id]["status"] = "completed"
self.step_execution_tracker[latest_tool_call.id]["result"] = result
# Update the plan status if this was a non-planning, non-special tool
if (
latest_tool_call.function.name != "planning"
and latest_tool_call.function.name not in self.special_tool_names
):
await self.update_plan_status(latest_tool_call.id)
return result
async def get_plan(self) -> str:
"""Retrieve the current plan status."""
if not self.active_plan_id:
return "No active plan. Please create a plan first."
result = await self.available_tools.execute(
name="planning",
tool_input={"command": "get", "plan_id": self.active_plan_id},
)
return result.output if hasattr(result, "output") else str(result)
async def run(self, request: Optional[str] = None) -> str:
"""Run the agent with an optional initial request."""
if request:
await self.create_initial_plan(request)
return await super().run()
async def update_plan_status(self, tool_call_id: str) -> None:
"""
Update the current plan progress based on completed tool execution.
Only marks a step as completed if the associated tool has been successfully executed.
"""
if not self.active_plan_id:
return
if tool_call_id not in self.step_execution_tracker:
logger.warning(f"No step tracking found for tool call {tool_call_id}")
return
tracker = self.step_execution_tracker[tool_call_id]
if tracker["status"] != "completed":
logger.warning(f"Tool call {tool_call_id} has not completed successfully")
return
step_index = tracker["step_index"]
try:
# Mark the step as completed
await self.available_tools.execute(
name="planning",
tool_input={
"command": "mark_step",
"plan_id": self.active_plan_id,
"step_index": step_index,
"step_status": "completed",
},
)
logger.info(
f"Marked step {step_index} as completed in plan {self.active_plan_id}"
)
except Exception as e:
logger.warning(f"Failed to update plan status: {e}")
async def _get_current_step_index(self) -> Optional[int]:
"""
Parse the current plan to identify the first non-completed step's index.
Returns None if no active step is found.
"""
if not self.active_plan_id:
return None
plan = await self.get_plan()
try:
plan_lines = plan.splitlines()
steps_index = -1
# Find the index of the "Steps:" line
for i, line in enumerate(plan_lines):
if line.strip() == "Steps:":
steps_index = i
break
if steps_index == -1:
return None
# Find the first non-completed step
for i, line in enumerate(plan_lines[steps_index + 1 :], start=0):
if "[ ]" in line or "[→]" in line: # not_started or in_progress
# Mark current step as in_progress
await self.available_tools.execute(
name="planning",
tool_input={
"command": "mark_step",
"plan_id": self.active_plan_id,
"step_index": i,
"step_status": "in_progress",
},
)
return i
return None # No active step found
except Exception as e:
logger.warning(f"Error finding current step index: {e}")
return None
async def create_initial_plan(self, request: str) -> None:
"""Create an initial plan based on the request."""
logger.info(f"Creating initial plan with ID: {self.active_plan_id}")
messages = [
Message.user_message(
f"Analyze the request and create a plan with ID {self.active_plan_id}: {request}"
)
]
self.memory.add_messages(messages)
response = await self.llm.ask_tool(
messages=messages,
system_msgs=[Message.system_message(self.system_prompt)],
tools=self.available_tools.to_params(),
tool_choice="required",
)
assistant_msg = Message.from_tool_calls(
content=response.content, tool_calls=response.tool_calls
)
self.memory.add_message(assistant_msg)
plan_created = False
for tool_call in response.tool_calls:
if tool_call.function.name == "planning":
result = await self.execute_tool(tool_call)
logger.info(
f"Executed tool {tool_call.function.name} with result: {result}"
)
# Add tool response to memory
tool_msg = Message.tool_message(
content=result,
tool_call_id=tool_call.id,
name=tool_call.function.name,
)
self.memory.add_message(tool_msg)
plan_created = True
break
if not plan_created:
logger.warning("No plan created from initial request")
tool_msg = Message.assistant_message(
"Error: Parameter `plan_id` is required for command: create"
)
self.memory.add_message(tool_msg)
async def main():
# Configure and run the agent
agent = PlanningAgent(available_tools=ToolCollection(PlanningTool(), Terminate()))
result = await agent.run("Help me plan a trip to the moon")
print(result)
if __name__ == "__main__":
import asyncio
asyncio.run(main())
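The plan parsing in `_get_current_step_index` boils down to locating the `Steps:` header and the first step still marked not-started `[ ]` or in-progress `[→]`. A minimal standalone sketch of that scan (`first_active_step` is a hypothetical helper, without the `mark_step` side effect):

```python
def first_active_step(plan_text):
    # Mirrors PlanningAgent._get_current_step_index: find the "Steps:" line,
    # then the index of the first non-completed step after it.
    lines = plan_text.splitlines()
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == "Steps:")
    except StopIteration:
        return None
    for i, line in enumerate(lines[start + 1:]):
        if "[ ]" in line or "[→]" in line:  # not_started or in_progress
            return i
    return None


plan = "Plan: trip\n\nSteps:\n0. [✓] book rocket\n1. [ ] pack bags\n"
print(first_active_step(plan))  # 1
```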
================================================
FILE: app/agent/react.py
================================================
from abc import ABC, abstractmethod
from typing import Optional
from pydantic import Field
from app.agent.base import BaseAgent
from app.llm import LLM
from app.schema import AgentState, Memory
class ReActAgent(BaseAgent, ABC):
name: str
description: Optional[str] = None
system_prompt: Optional[str] = None
next_step_prompt: Optional[str] = None
llm: Optional[LLM] = Field(default_factory=LLM)
memory: Memory = Field(default_factory=Memory)
state: AgentState = AgentState.IDLE
max_steps: int = 10
current_step: int = 0
@abstractmethod
async def think(self) -> bool:
"""Process current state and decide next action"""
@abstractmethod
async def act(self) -> str:
"""Execute decided actions"""
async def step(self) -> str:
"""Execute a single step: think and act."""
should_act = await self.think()
if not should_act:
return "Thinking complete - no action needed"
return await self.act()
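The think/act contract above can be illustrated with a toy sketch. `EchoAgent` is hypothetical and mirrors `ReActAgent.step` without the Pydantic machinery: `think` decides whether to act, `act` performs the action:

```python
import asyncio


class EchoAgent:
    # Minimal ReAct-style loop: think decides, act executes.
    def __init__(self):
        self.pending = ["hello"]

    async def think(self):
        return bool(self.pending)

    async def act(self):
        return f"said {self.pending.pop()}"

    async def step(self):
        # Same shape as ReActAgent.step
        if not await self.think():
            return "Thinking complete - no action needed"
        return await self.act()


agent = EchoAgent()
print(asyncio.run(agent.step()))  # said hello
print(asyncio.run(agent.step()))  # Thinking complete - no action needed
```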
================================================
FILE: app/agent/swe.py
================================================
from typing import List
from pydantic import Field
from app.agent.toolcall import ToolCallAgent
from app.prompt.swe import NEXT_STEP_TEMPLATE, SYSTEM_PROMPT
from app.tool import Bash, StrReplaceEditor, Terminate, ToolCollection
class SWEAgent(ToolCallAgent):
"""An agent that implements the SWEAgent paradigm for executing code and natural conversations."""
name: str = "swe"
description: str = "an autonomous AI programmer that interacts directly with the computer to solve tasks."
system_prompt: str = SYSTEM_PROMPT
next_step_prompt: str = NEXT_STEP_TEMPLATE
available_tools: ToolCollection = ToolCollection(
Bash(), StrReplaceEditor(), Terminate()
)
special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])
max_steps: int = 30
bash: Bash = Field(default_factory=Bash)
working_dir: str = "."
async def think(self) -> bool:
"""Process current state and decide next action"""
# Update working directory
self.working_dir = await self.bash.execute("pwd")
self.next_step_prompt = self.next_step_prompt.format(
current_dir=self.working_dir
)
return await super().think()
================================================
FILE: app/agent/toolcall.py
================================================
import json
from typing import Any, List, Literal
from pydantic import Field
from app.agent.react import ReActAgent
from app.logger import logger
from app.prompt.toolcall import NEXT_STEP_PROMPT, SYSTEM_PROMPT
from app.schema import AgentState, Message, ToolCall
from app.tool import CreateChatCompletion, Terminate, ToolCollection
TOOL_CALL_REQUIRED = "Tool calls required but none provided"
class ToolCallAgent(ReActAgent):
"""Base agent class for handling tool/function calls with enhanced abstraction"""
name: str = "toolcall"
description: str = "an agent that can execute tool calls."
system_prompt: str = SYSTEM_PROMPT
next_step_prompt: str = NEXT_STEP_PROMPT
available_tools: ToolCollection = ToolCollection(
CreateChatCompletion(), Terminate()
)
tool_choices: Literal["none", "auto", "required"] = "auto"
special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])
tool_calls: List[ToolCall] = Field(default_factory=list)
max_steps: int = 30
async def think(self) -> bool:
"""Process current state and decide next actions using tools"""
if self.next_step_prompt:
user_msg = Message.user_message(self.next_step_prompt)
self.messages += [user_msg]
# Get response with tool options
response = await self.llm.ask_tool(
messages=self.messages,
system_msgs=[Message.system_message(self.system_prompt)]
if self.system_prompt
else None,
tools=self.available_tools.to_params(),
tool_choice=self.tool_choices,
)
self.tool_calls = response.tool_calls
# Log response info
logger.info(f"✨ {self.name}'s thoughts: {response.content}")
logger.info(
f"🛠️ {self.name} selected {len(response.tool_calls) if response.tool_calls else 0} tools to use"
)
if response.tool_calls:
logger.info(
f"🧰 Tools being prepared: {[call.function.name for call in response.tool_calls]}"
)
try:
# Handle different tool_choices modes
if self.tool_choices == "none":
if response.tool_calls:
logger.warning(
f"🤔 Hmm, {self.name} tried to use tools when they weren't available!"
)
if response.content:
self.memory.add_message(Message.assistant_message(response.content))
return True
return False
# Create and add assistant message
assistant_msg = (
Message.from_tool_calls(
content=response.content, tool_calls=self.tool_calls
)
if self.tool_calls
else Message.assistant_message(response.content)
)
self.memory.add_message(assistant_msg)
if self.tool_choices == "required" and not self.tool_calls:
return True # Will be handled in act()
# For 'auto' mode, continue with content if no commands but content exists
if self.tool_choices == "auto" and not self.tool_calls:
return bool(response.content)
return bool(self.tool_calls)
except Exception as e:
logger.error(f"🚨 Oops! The {self.name}'s thinking process hit a snag: {e}")
self.memory.add_message(
Message.assistant_message(
f"Error encountered while processing: {str(e)}"
)
)
return False
async def act(self) -> str:
"""Execute tool calls and handle their results"""
if not self.tool_calls:
if self.tool_choices == "required":
raise ValueError(TOOL_CALL_REQUIRED)
# Return last message content if no tool calls
return self.messages[-1].content or "No content or commands to execute"
results = []
for command in self.tool_calls:
result = await self.execute_tool(command)
logger.info(
f"🎯 Tool '{command.function.name}' completed its mission! Result: {result}"
)
# Add tool response to memory
tool_msg = Message.tool_message(
content=result, tool_call_id=command.id, name=command.function.name
)
self.memory.add_message(tool_msg)
results.append(result)
return "\n\n".join(results)
async def execute_tool(self, command: ToolCall) -> str:
"""Execute a single tool call with robust error handling"""
if not command or not command.function or not command.function.name:
return "Error: Invalid command format"
name = command.function.name
if name not in self.available_tools.tool_map:
return f"Error: Unknown tool '{name}'"
try:
# Parse arguments
args = json.loads(command.function.arguments or "{}")
# Execute the tool
logger.info(f"🔧 Activating tool: '{name}'...")
result = await self.available_tools.execute(name=name, tool_input=args)
# Format result for display
observation = (
f"Observed output of cmd `{name}` executed:\n{str(result)}"
if result
else f"Cmd `{name}` completed with no output"
)
# Handle special tools like `finish`
await self._handle_special_tool(name=name, result=result)
return observation
except json.JSONDecodeError:
error_msg = f"Error parsing arguments for {name}: Invalid JSON format"
logger.error(
f"📝 Oops! The arguments for '{name}' don't make sense - invalid JSON, arguments:{command.function.arguments}"
)
return f"Error: {error_msg}"
except Exception as e:
error_msg = f"⚠️ Tool '{name}' encountered a problem: {str(e)}"
logger.error(error_msg)
return f"Error: {error_msg}"
async def _handle_special_tool(self, name: str, result: Any, **kwargs):
"""Handle special tool execution and state changes"""
if not self._is_special_tool(name):
return
if self._should_finish_execution(name=name, result=result, **kwargs):
# Set agent state to finished
logger.info(f"🏁 Special tool '{name}' has completed the task!")
self.state = AgentState.FINISHED
@staticmethod
def _should_finish_execution(**kwargs) -> bool:
"""Determine if tool execution should finish the agent"""
return True
def _is_special_tool(self, name: str) -> bool:
"""Check if tool name is in special tools list"""
return name.lower() in [n.lower() for n in self.special_tool_names]
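The argument handling in `execute_tool` has two notable behaviors: missing arguments default to `{}`, and invalid JSON is reported rather than raised. A standalone sketch of just that parsing step (`parse_tool_args` is a hypothetical helper):

```python
import json


def parse_tool_args(raw):
    # Mirrors ToolCallAgent.execute_tool's argument handling:
    # empty/None arguments become {}, invalid JSON yields None so the
    # caller can return an "Invalid JSON format" error message.
    try:
        return json.loads(raw or "{}")
    except json.JSONDecodeError:
        return None


print(parse_tool_args(None))        # {}
print(parse_tool_args('{"x": 1}'))  # {'x': 1}
print(parse_tool_args("not json"))  # None
```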
================================================
FILE: app/config.py
================================================
import threading
import tomllib
from pathlib import Path
from typing import Dict
from pydantic import BaseModel, Field
def get_project_root() -> Path:
"""Get the project root directory"""
return Path(__file__).resolve().parent.parent
PROJECT_ROOT = get_project_root()
WORKSPACE_ROOT = PROJECT_ROOT / "workspace"
class LLMSettings(BaseModel):
model: str = Field(..., description="Model name")
base_url: str = Field(..., description="API base URL")
api_key: str = Field(..., description="API key")
max_tokens: int = Field(4096, description="Maximum number of tokens per request")
temperature: float = Field(1.0, description="Sampling temperature")
api_type: str = Field(..., description="AzureOpenai or Openai")
api_version: str = Field(..., description="Azure Openai version if AzureOpenai")
class AppConfig(BaseModel):
llm: Dict[str, LLMSettings]
class Config:
_instance = None
_lock = threading.Lock()
_initialized = False
def __new__(cls):
if cls._instance is None:
with cls._lock:
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
def __init__(self):
if not self._initialized:
with self._lock:
if not self._initialized:
self._config = None
self._load_initial_config()
self._initialized = True
@staticmethod
def _get_config_path() -> Path:
root = PROJECT_ROOT
config_path = root / "config" / "config.toml"
if config_path.exists():
return config_path
example_path = root / "config" / "config.example.toml"
if example_path.exists():
return example_path
raise FileNotFoundError("No configuration file found in config directory")
def _load_config(self) -> dict:
config_path = self._get_config_path()
with config_path.open("rb") as f:
return tomllib.load(f)
def _load_initial_config(self):
raw_config = self._load_config()
base_llm = raw_config.get("llm", {})
llm_overrides = {
k: v for k, v in raw_config.get("llm", {}).items() if isinstance(v, dict)
}
default_settings = {
"model": base_llm.get("model"),
"base_url": base_llm.get("base_url"),
"api_key": base_llm.get("api_key"),
"max_tokens": base_llm.get("max_tokens", 4096),
"temperature": base_llm.get("temperature", 1.0),
"api_type": base_llm.get("api_type", ""),
"api_version": base_llm.get("api_version", ""),
}
config_dict = {
"llm": {
"default": default_settings,
**{
name: {**default_settings, **override_config}
for name, override_config in llm_overrides.items()
},
}
}
self._config = AppConfig(**config_dict)
@property
def llm(self) -> Dict[str, LLMSettings]:
return self._config.llm
config = Config()
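`Config` stays a process-wide singleton via double-checked locking: the fast path skips the lock once the instance exists, and the second check inside the lock guards against a race between two first-time callers. A minimal standalone sketch of the same pattern:

```python
import threading


class Singleton:
    # Same double-checked locking pattern as app.config.Config.__new__
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:          # fast path, no lock
            with cls._lock:
                if cls._instance is None:  # re-check under the lock
                    cls._instance = super().__new__(cls)
        return cls._instance


a, b = Singleton(), Singleton()
print(a is b)  # True
```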
================================================
FILE: app/exceptions.py
================================================
class ToolError(Exception):
"""Raised when a tool encounters an error."""
    def __init__(self, message):
        super().__init__(message)
        self.message = message
================================================
FILE: app/flow/__init__.py
================================================
================================================
FILE: app/flow/base.py
================================================
import asyncio
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, List, Optional, Union
from pydantic import BaseModel
from app.agent.base import BaseAgent
class FlowType(str, Enum):
PLANNING = "planning"
class BaseFlow(BaseModel, ABC):
"""Base class for execution flows supporting multiple agents"""
agents: Dict[str, BaseAgent]
tools: Optional[List] = None
primary_agent_key: Optional[str] = None
class Config:
arbitrary_types_allowed = True
def __init__(
self, agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]], **data
):
# Handle different ways of providing agents
if isinstance(agents, BaseAgent):
agents_dict = {"default": agents}
elif isinstance(agents, list):
agents_dict = {f"agent_{i}": agent for i, agent in enumerate(agents)}
else:
agents_dict = agents
# If primary agent not specified, use first agent
primary_key = data.get("primary_agent_key")
if not primary_key and agents_dict:
primary_key = next(iter(agents_dict))
data["primary_agent_key"] = primary_key
# Set the agents dictionary
data["agents"] = agents_dict
# Initialize using BaseModel's init
super().__init__(**data)
@property
def primary_agent(self) -> Optional[BaseAgent]:
"""Get the primary agent for the flow"""
return self.agents.get(self.primary_agent_key)
def get_agent(self, key: str) -> Optional[BaseAgent]:
"""Get a specific agent by key"""
return self.agents.get(key)
def add_agent(self, key: str, agent: BaseAgent) -> None:
"""Add a new agent to the flow"""
self.agents[key] = agent
@abstractmethod
async def execute(
        self, input_text: str, job_id: Optional[str] = None, cancel_event: Optional[asyncio.Event] = None
) -> str:
"""Execute the flow with the given input text."""
raise NotImplementedError("Subclasses must implement execute method")
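`BaseFlow.__init__` normalizes three input shapes for `agents` — a single agent, a list, or a dict. A standalone sketch of that branch (`normalize_agents` is a hypothetical helper, shown with strings in place of `BaseAgent` instances):

```python
def normalize_agents(agents):
    # Mirrors BaseFlow.__init__: accept one agent, a list, or a dict,
    # and always produce a key -> agent mapping.
    if isinstance(agents, dict):
        return agents
    if isinstance(agents, list):
        return {f"agent_{i}": agent for i, agent in enumerate(agents)}
    return {"default": agents}


print(normalize_agents(["x", "y"]))  # {'agent_0': 'x', 'agent_1': 'y'}
print(normalize_agents("solo"))      # {'default': 'solo'}
```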
================================================
FILE: app/flow/flow_factory.py
================================================
from typing import Dict, List, Union
from app.agent.base import BaseAgent
from app.flow.base import BaseFlow, FlowType
class FlowFactory:
"""Factory for creating different types of flows with support for multiple agents"""
@staticmethod
def create_flow(
flow_type: FlowType,
agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]],
**kwargs,
) -> BaseFlow:
"""Create a flow of the specified type with the provided agents."""
        # Create the corresponding flow based on the flow_type argument
if flow_type == FlowType.PLANNING:
from app.flow.planning import PlanningFlow
return PlanningFlow(agents, **kwargs)
# ...other flow types...
else:
raise ValueError(f"Unknown flow type: {flow_type}")
================================================
FILE: app/flow/planning.py
================================================
import asyncio
import json
import os
import time
from typing import Dict, List, Optional, Union
from pydantic import Field
from app.agent.base import BaseAgent
from app.flow.base import BaseFlow
from app.llm import LLM
from app.logger import logger
from app.schema import AgentState, Message
from app.tool import PlanningTool
class PlanningFlow(BaseFlow):
"""A flow that manages planning and execution of tasks using agents."""
llm: LLM = Field(default_factory=lambda: LLM())
planning_tool: PlanningTool = Field(default_factory=PlanningTool)
executor_keys: List[str] = Field(default_factory=list)
active_plan_id: str = Field(default_factory=lambda: f"plan_{int(time.time())}")
current_step_index: Optional[int] = None
def __init__(
self, agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]], **data
):
# Set executor keys before super().__init__
if "executors" in data:
data["executor_keys"] = data.pop("executors")
# Set plan ID if provided
if "plan_id" in data:
data["active_plan_id"] = data.pop("plan_id")
# Initialize the planning tool if not provided
if "planning_tool" not in data:
planning_tool = PlanningTool()
data["planning_tool"] = planning_tool
# Call parent's init with the processed data
super().__init__(agents, **data)
# Set executor_keys to all agent keys if not specified
if not self.executor_keys:
self.executor_keys = list(self.agents.keys())
def get_executor(self, step_type: Optional[str] = None) -> BaseAgent:
"""
Get an appropriate executor agent for the current step.
Can be extended to select agents based on step type/requirements.
"""
# If step type is provided and matches an agent key, use that agent
if step_type and step_type in self.agents:
return self.agents[step_type]
# Otherwise use the first available executor or fall back to primary agent
for key in self.executor_keys:
if key in self.agents:
return self.agents[key]
# Fallback to primary agent
return self.primary_agent
async def execute(
        self, input_text: str, job_id: Optional[str] = None, cancel_event: Optional[asyncio.Event] = None
) -> str:
"""Execute the planning flow with agents."""
try:
if not self.primary_agent:
raise ValueError("No primary agent available")
# Create initial plan if input provided
if input_text:
await self._create_initial_plan(input_text, job_id)
# Verify plan was created successfully
if self.active_plan_id not in self.planning_tool.plans:
logger.error(
f"Plan creation failed. Plan ID {self.active_plan_id} not found in planning tool."
)
return f"Failed to create plan for: {input_text}"
result = ""
while True:
                # Check whether cancellation was requested
                if cancel_event and cancel_event.is_set():
                    logger.warning("Execution cancelled by user")
                    return result + "\nExecution was cancelled by the user"
# Get current step to execute
self.current_step_index, step_info = await self._get_current_step_info()
# Exit if no more steps or plan completed
if self.current_step_index is None:
result += await self._finalize_plan()
break
# Execute current step with appropriate agent
step_type = step_info.get("type") if step_info else None
executor = self.get_executor(step_type)
step_result = await self._execute_step(executor, step_info)
result += step_result + "\n"
# Check if agent wants to terminate
if hasattr(executor, "state") and executor.state == AgentState.FINISHED:
break
return result
except Exception as e:
logger.error(f"Error in PlanningFlow: {str(e)}")
return f"Execution failed: {str(e)}"
    async def _create_initial_plan(self, request: str, job_id: Optional[str] = None) -> None:
        """Create an initial plan based on the request using the flow's LLM and PlanningTool."""
        # Use the provided job_id if given; otherwise derive one from the request
        if not job_id:
            job_id = f"job_{request[:8].replace(' ', '_')}"
            if len(job_id) < 10:  # Fall back to a timestamp if the derived ID is too short
                job_id = f"job_{int(time.time())}"
        log_file_path = f"logs/{job_id}.log"
        os.environ["OPENMANUS_TASK_ID"] = job_id
        os.environ["OPENMANUS_LOG_FILE"] = log_file_path
        # Name the per-task log file after the job_id
        logger.add(log_file_path, rotation="100 MB")
        logger.info(f"Creating initial plan with ID: {self.active_plan_id}")
        # The original plan-creation flow continues below
# Create a system message for plan creation
system_message = Message.system_message(
"You are a planning assistant. Create a concise, actionable plan with clear steps. "
"Focus on key milestones rather than detailed sub-steps. "
"Optimize for clarity and efficiency."
)
# Create a user message with the request
user_message = Message.user_message(
f"Create a reasonable plan with clear steps to accomplish the task: {request}"
)
# Call LLM with PlanningTool
response = await self.llm.ask_tool(
messages=[user_message],
system_msgs=[system_message],
tools=[self.planning_tool.to_param()],
tool_choice="required",
)
# Process tool calls if present
if response.tool_calls:
for tool_call in response.tool_calls:
if tool_call.function.name == "planning":
# Parse the arguments
args = tool_call.function.arguments
if isinstance(args, str):
try:
args = json.loads(args)
except json.JSONDecodeError:
logger.error(f"Failed to parse tool arguments: {args}")
continue
# Ensure plan_id is set correctly and execute the tool
args["plan_id"] = self.active_plan_id
# Execute the tool via ToolCollection instead of directly
result = await self.planning_tool.execute(**args)
logger.info(f"Plan creation result: {str(result)}")
return
# If execution reached here, create a default plan
logger.warning("Creating default plan")
# Create default plan using the ToolCollection
await self.planning_tool.execute(
**{
"command": "create",
"plan_id": self.active_plan_id,
"title": f"Plan for: {request[:50]}{'...' if len(request) > 50 else ''}",
"steps": ["Analyze request", "Execute task", "Verify results"],
}
)
async def _get_current_step_info(self) -> tuple[Optional[int], Optional[dict]]:
"""
Parse the current plan to identify the first non-completed step's index and info.
Returns (None, None) if no active step is found.
"""
if (
not self.active_plan_id
or self.active_plan_id not in self.planning_tool.plans
):
logger.error(f"Plan with ID {self.active_plan_id} not found")
return None, None
try:
# Direct access to plan data from planning tool storage
plan_data = self.planning_tool.plans[self.active_plan_id]
steps = plan_data.get("steps", [])
step_statuses = plan_data.get("step_statuses", [])
# Find first non-completed step
for i, step in enumerate(steps):
if i >= len(step_statuses):
status = "not_started"
else:
status = step_statuses[i]
if status in ["not_started", "in_progress"]:
# Extract step type/category if available
step_info = {"text": step}
# Try to extract step type from the text (e.g., [SEARCH] or [CODE])
import re
type_match = re.search(r"\[([A-Z_]+)\]", step)
if type_match:
step_info["type"] = type_match.group(1).lower()
# Mark current step as in_progress
try:
await self.planning_tool.execute(
command="mark_step",
plan_id=self.active_plan_id,
step_index=i,
step_status="in_progress",
)
except Exception as e:
logger.warning(f"Error marking step as in_progress: {e}")
# Update step status directly if needed
if i < len(step_statuses):
step_statuses[i] = "in_progress"
else:
while len(step_statuses) < i:
step_statuses.append("not_started")
step_statuses.append("in_progress")
plan_data["step_statuses"] = step_statuses
return i, step_info
return None, None # No active step found
except Exception as e:
logger.warning(f"Error finding current step index: {e}")
return None, None
async def _execute_step(self, executor: BaseAgent, step_info: dict) -> str:
"""Execute the current step with the specified agent using agent.run()."""
# Prepare context for the agent with current plan status
plan_status = await self._get_plan_text()
step_text = step_info.get("text", f"Step {self.current_step_index}")
# Create a prompt for the agent to execute the current step
step_prompt = f"""
CURRENT PLAN STATUS:
{plan_status}
YOUR CURRENT TASK:
You are now working on step {self.current_step_index}: "{step_text}"
Please execute this step using the appropriate tools. When you're done, provide a summary of what you accomplished.
"""
# Use agent.run() to execute the step
try:
step_result = await executor.run(step_prompt)
# Mark the step as completed after successful execution
await self._mark_step_completed()
return step_result
except Exception as e:
logger.error(f"Error executing step {self.current_step_index}: {e}")
return f"Error executing step {self.current_step_index}: {str(e)}"
async def _mark_step_completed(self) -> None:
"""Mark the current step as completed."""
if self.current_step_index is None:
return
try:
# Mark the step as completed
await self.planning_tool.execute(
command="mark_step",
plan_id=self.active_plan_id,
step_index=self.current_step_index,
step_status="completed",
)
logger.info(
f"Marked step {self.current_step_index} as completed in plan {self.active_plan_id}"
)
# ThinkingTracker.add_thinking_step(self.active_plan_id, f"Completed step {self.current_step_index}")
except Exception as e:
logger.warning(f"Failed to update plan status: {e}")
# Update step status directly in planning tool storage
if self.active_plan_id in self.planning_tool.plans:
plan_data = self.planning_tool.plans[self.active_plan_id]
step_statuses = plan_data.get("step_statuses", [])
# Ensure the step_statuses list is long enough
while len(step_statuses) <= self.current_step_index:
step_statuses.append("not_started")
# Update the status
step_statuses[self.current_step_index] = "completed"
plan_data["step_statuses"] = step_statuses
async def _get_plan_text(self) -> str:
"""Get the current plan as formatted text."""
try:
result = await self.planning_tool.execute(
command="get", plan_id=self.active_plan_id
)
return result.output if hasattr(result, "output") else str(result)
except Exception as e:
logger.error(f"Error getting plan: {e}")
return self._generate_plan_text_from_storage()
def _generate_plan_text_from_storage(self) -> str:
"""Generate plan text directly from storage if the planning tool fails."""
try:
if self.active_plan_id not in self.planning_tool.plans:
return f"Error: Plan with ID {self.active_plan_id} not found"
plan_data = self.planning_tool.plans[self.active_plan_id]
title = plan_data.get("title", "Untitled Plan")
steps = plan_data.get("steps", [])
step_statuses = plan_data.get("step_statuses", [])
step_notes = plan_data.get("step_notes", [])
# Ensure step_statuses and step_notes match the number of steps
while len(step_statuses) < len(steps):
step_statuses.append("not_started")
while len(step_notes) < len(steps):
step_notes.append("")
# Count steps by status
status_counts = {
"completed": 0,
"in_progress": 0,
"blocked": 0,
"not_started": 0,
}
for status in step_statuses:
if status in status_counts:
status_counts[status] += 1
completed = status_counts["completed"]
total = len(steps)
progress = (completed / total) * 100 if total > 0 else 0
plan_text = f"Plan: {title} (ID: {self.active_plan_id})\n"
plan_text += "=" * len(plan_text) + "\n\n"
plan_text += (
f"Progress: {completed}/{total} steps completed ({progress:.1f}%)\n"
)
plan_text += f"Status: {status_counts['completed']} completed, {status_counts['in_progress']} in progress, "
plan_text += f"{status_counts['blocked']} blocked, {status_counts['not_started']} not started\n\n"
plan_text += "Steps:\n"
for i, (step, status, notes) in enumerate(
zip(steps, step_statuses, step_notes)
):
if status == "completed":
status_mark = "[✓]"
elif status == "in_progress":
status_mark = "[→]"
elif status == "blocked":
status_mark = "[!]"
else: # not_started
status_mark = "[ ]"
plan_text += f"{i}. {status_mark} {step}\n"
if notes:
plan_text += f" Notes: {notes}\n"
return plan_text
except Exception as e:
logger.error(f"Error generating plan text from storage: {e}")
return f"Error: Unable to retrieve plan with ID {self.active_plan_id}"
async def _finalize_plan(self) -> str:
"""Finalize the plan and provide a summary using the flow's LLM directly."""
plan_text = await self._get_plan_text()
# Create a summary using the flow's LLM directly
try:
system_message = Message.system_message(
"You are a planning assistant. Your task is to summarize the completed plan."
)
user_message = Message.user_message(
f"The plan has been completed. Here is the final plan status:\n\n{plan_text}\n\nPlease provide a summary of what was accomplished and any final thoughts."
)
response = await self.llm.ask(
messages=[user_message], system_msgs=[system_message]
)
return f"Plan completed:\n\n{response}"
except Exception as e:
logger.error(f"Error finalizing plan with LLM: {e}")
# Fallback to using an agent for the summary
try:
agent = self.primary_agent
summary_prompt = f"""
The plan has been completed. Here is the final plan status:
{plan_text}
Please provide a summary of what was accomplished and any final thoughts.
"""
summary = await agent.run(summary_prompt)
return f"Plan completed:\n\n{summary}"
except Exception as e2:
logger.error(f"Error finalizing plan with agent: {e2}")
return "Plan completed. Error generating summary."
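The first-unfinished-step scan in `_get_current_step_info` above boils down to the following standalone sketch (the function name and the plain-list inputs are illustrative, not part of the app's API):

```python
import re

def first_active_step(steps, step_statuses):
    """Return (index, info) for the first step whose status is not
    completed, treating missing statuses as "not_started"."""
    for i, step in enumerate(steps):
        status = step_statuses[i] if i < len(step_statuses) else "not_started"
        if status in ("not_started", "in_progress"):
            info = {"text": step}
            # Steps may carry an optional [TYPE] tag, e.g. "[SEARCH] find docs"
            match = re.search(r"\[([A-Z_]+)\]", step)
            if match:
                info["type"] = match.group(1).lower()
            return i, info
    return None, None  # every step is completed
```

The real method additionally marks the found step `in_progress` via the planning tool; that side effect is omitted here.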
================================================
FILE: app/flow/tracking_support.py
================================================
"""
Adds thinking-process tracking to flow execution.
"""
from functools import wraps
from app.web.thinking_tracker import ThinkingTracker
class FlowTracker:
    """Flow tracker that hooks into flow execution to record thinking steps."""
    @staticmethod
    def patch_flow(flow_obj, session_id: str):
        """Apply a tracking patch to a flow object."""
        if not hasattr(flow_obj, "_original_execute"):
            # Save the original method
            flow_obj._original_execute = flow_obj.execute
            # Attach the session ID
            flow_obj._tracker_session_id = session_id
            # Replace the execute method
            @wraps(flow_obj._original_execute)
            async def tracked_execute(prompt, *args, **kwargs):
                # Record a thinking step before execution
                ThinkingTracker.add_thinking_step(session_id, "Starting flow execution")
                # Track sub-step execution
                if hasattr(flow_obj, "_execute_step"):
                    original_step = flow_obj._execute_step
                    @wraps(original_step)
                    async def tracked_step(*args, **kwargs):
                        if hasattr(flow_obj, "current_step_description"):
                            step_desc = flow_obj.current_step_description
                            ThinkingTracker.add_thinking_step(
                                session_id, f"Executing step: {step_desc}"
                            )
                        else:
                            ThinkingTracker.add_thinking_step(session_id, "Executing flow step")
                        # Forward all arguments so the wrapper matches the
                        # original _execute_step signature
                        result = await original_step(*args, **kwargs)
                        return result
                    flow_obj._execute_step = tracked_step
                # Run the original method
                result = await flow_obj._original_execute(prompt, *args, **kwargs)
                # Record a thinking step after execution
                ThinkingTracker.add_thinking_step(session_id, "Flow execution completed")
                return result
            flow_obj.execute = tracked_execute
            return True
        return False
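FlowTracker works by stashing the original coroutine and swapping in a wrapper. A minimal self-contained version of that monkey-patching pattern, with generic before/after callbacks standing in for ThinkingTracker, looks like:

```python
import asyncio
from functools import wraps

def patch_execute(obj, before, after):
    """Wrap obj.execute so `before`/`after` callbacks fire around it.
    Idempotent: the original is stashed on first patch only."""
    if hasattr(obj, "_original_execute"):
        return False
    obj._original_execute = obj.execute

    @wraps(obj._original_execute)
    async def tracked(*args, **kwargs):
        before()
        result = await obj._original_execute(*args, **kwargs)
        after()
        return result

    # Instance attribute shadows the bound method
    obj.execute = tracked
    return True

class Flow:
    async def execute(self, prompt):
        return f"ran: {prompt}"
```

Because the wrapper is assigned on the instance, the class and other flow objects are unaffected.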
================================================
FILE: app/llm.py
================================================
from typing import Dict, List, Literal, Optional, Union
from openai import (
APIError,
AsyncAzureOpenAI,
AsyncOpenAI,
AuthenticationError,
OpenAIError,
RateLimitError,
)
from tenacity import retry, stop_after_attempt, wait_random_exponential
from app.config import LLMSettings, config
from app.logger import logger # Assuming a logger is set up in your app
from app.schema import Message
class LLM:
_instances: Dict[str, "LLM"] = {}
def __new__(
cls, config_name: str = "default", llm_config: Optional[LLMSettings] = None
):
if config_name not in cls._instances:
instance = super().__new__(cls)
instance.__init__(config_name, llm_config)
cls._instances[config_name] = instance
return cls._instances[config_name]
def __init__(
self, config_name: str = "default", llm_config: Optional[LLMSettings] = None
):
if not hasattr(self, "client"): # Only initialize if not already initialized
llm_config = llm_config or config.llm
llm_config = llm_config.get(config_name, llm_config["default"])
self.model = llm_config.model
self.max_tokens = llm_config.max_tokens
self.temperature = llm_config.temperature
self.api_type = llm_config.api_type
self.api_key = llm_config.api_key
self.api_version = llm_config.api_version
self.base_url = llm_config.base_url
if self.api_type == "azure":
self.client = AsyncAzureOpenAI(
base_url=self.base_url,
api_key=self.api_key,
api_version=self.api_version,
)
else:
self.client = AsyncOpenAI(api_key=self.api_key, base_url=self.base_url)
@staticmethod
def format_messages(messages: List[Union[dict, Message]]) -> List[dict]:
"""
Format messages for LLM by converting them to OpenAI message format.
Args:
messages: List of messages that can be either dict or Message objects
Returns:
List[dict]: List of formatted messages in OpenAI format
Raises:
ValueError: If messages are invalid or missing required fields
TypeError: If unsupported message types are provided
Examples:
>>> msgs = [
... Message.system_message("You are a helpful assistant"),
... {"role": "user", "content": "Hello"},
... Message.user_message("How are you?")
... ]
>>> formatted = LLM.format_messages(msgs)
"""
formatted_messages = []
for message in messages:
if isinstance(message, dict):
# If message is already a dict, ensure it has required fields
if "role" not in message:
raise ValueError("Message dict must contain 'role' field")
formatted_messages.append(message)
elif isinstance(message, Message):
# If message is a Message object, convert it to dict
formatted_messages.append(message.to_dict())
else:
raise TypeError(f"Unsupported message type: {type(message)}")
# Validate all messages have required fields
for msg in formatted_messages:
if msg["role"] not in ["system", "user", "assistant", "tool"]:
raise ValueError(f"Invalid role: {msg['role']}")
if "content" not in msg and "tool_calls" not in msg:
raise ValueError(
"Message must contain either 'content' or 'tool_calls'"
)
return formatted_messages
@retry(
wait=wait_random_exponential(min=1, max=60),
stop=stop_after_attempt(6),
)
async def ask(
self,
messages: List[Union[dict, Message]],
system_msgs: Optional[List[Union[dict, Message]]] = None,
stream: bool = True,
temperature: Optional[float] = None,
) -> str:
"""
Send a prompt to the LLM and get the response.
Args:
messages: List of conversation messages
system_msgs: Optional system messages to prepend
stream (bool): Whether to stream the response
temperature (float): Sampling temperature for the response
Returns:
str: The generated response
Raises:
ValueError: If messages are invalid or response is empty
OpenAIError: If API call fails after retries
Exception: For unexpected errors
"""
try:
# Format system and user messages
if system_msgs:
system_msgs = self.format_messages(system_msgs)
messages = system_msgs + self.format_messages(messages)
else:
messages = self.format_messages(messages)
if not stream:
# Non-streaming request
response = await self.client.chat.completions.create(
model=self.model,
messages=messages,
max_tokens=self.max_tokens,
temperature=temperature or self.temperature,
stream=False,
)
if not response.choices or not response.choices[0].message.content:
raise ValueError("Empty or invalid response from LLM")
return response.choices[0].message.content
# Streaming request
response = await self.client.chat.completions.create(
model=self.model,
messages=messages,
max_tokens=self.max_tokens,
temperature=temperature or self.temperature,
stream=True,
)
collected_messages = []
async for chunk in response:
chunk_message = chunk.choices[0].delta.content or ""
collected_messages.append(chunk_message)
print(chunk_message, end="", flush=True)
print() # Newline after streaming
full_response = "".join(collected_messages).strip()
if not full_response:
raise ValueError("Empty response from streaming LLM")
return full_response
except ValueError as ve:
logger.error(f"Validation error: {ve}")
raise
except OpenAIError as oe:
logger.error(f"OpenAI API error: {oe}")
raise
except Exception as e:
logger.error(f"Unexpected error in ask: {e}")
raise
@retry(
wait=wait_random_exponential(min=1, max=60),
stop=stop_after_attempt(6),
)
async def ask_tool(
self,
messages: List[Union[dict, Message]],
system_msgs: Optional[List[Union[dict, Message]]] = None,
timeout: int = 60,
tools: Optional[List[dict]] = None,
tool_choice: Literal["none", "auto", "required"] = "auto",
temperature: Optional[float] = None,
**kwargs,
):
"""
Ask LLM using functions/tools and return the response.
Args:
messages: List of conversation messages
system_msgs: Optional system messages to prepend
timeout: Request timeout in seconds
tools: List of tools to use
tool_choice: Tool choice strategy
temperature: Sampling temperature for the response
**kwargs: Additional completion arguments
Returns:
ChatCompletionMessage: The model's response
Raises:
ValueError: If tools, tool_choice, or messages are invalid
OpenAIError: If API call fails after retries
Exception: For unexpected errors
"""
try:
# Validate tool_choice
if tool_choice not in ["none", "auto", "required"]:
raise ValueError(f"Invalid tool_choice: {tool_choice}")
# Format messages
if system_msgs:
system_msgs = self.format_messages(system_msgs)
messages = system_msgs + self.format_messages(messages)
else:
messages = self.format_messages(messages)
# Validate tools if provided
if tools:
for tool in tools:
if not isinstance(tool, dict) or "type" not in tool:
raise ValueError("Each tool must be a dict with 'type' field")
# Set up the completion request
response = await self.client.chat.completions.create(
model=self.model,
messages=messages,
temperature=temperature or self.temperature,
max_tokens=self.max_tokens,
tools=tools,
tool_choice=tool_choice,
timeout=timeout,
**kwargs,
)
# Check if response is valid
            if not response.choices or not response.choices[0].message:
                logger.error(f"Invalid response object from LLM: {response}")
                raise ValueError("Invalid or empty response from LLM")
return response.choices[0].message
except ValueError as ve:
logger.error(f"Validation error in ask_tool: {ve}")
raise
except OpenAIError as oe:
if isinstance(oe, AuthenticationError):
logger.error("Authentication failed. Check API key.")
elif isinstance(oe, RateLimitError):
logger.error("Rate limit exceeded. Consider increasing retry attempts.")
elif isinstance(oe, APIError):
logger.error(f"API error: {oe}")
raise
except Exception as e:
logger.error(f"Unexpected error in ask_tool: {e}")
raise
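`LLM.format_messages` is pure normalization logic. Stripped of the `Message` class it can be sketched as below, where the `to_dict` duck-typing stands in for the `isinstance(message, Message)` branch:

```python
def format_messages(messages):
    """Normalize dicts / objects-with-to_dict into the OpenAI chat
    format, validating roles and required fields."""
    formatted = []
    for message in messages:
        if isinstance(message, dict):
            # Dicts must already carry a role
            if "role" not in message:
                raise ValueError("Message dict must contain 'role' field")
            formatted.append(message)
        elif hasattr(message, "to_dict"):
            formatted.append(message.to_dict())
        else:
            raise TypeError(f"Unsupported message type: {type(message)}")
    # Validate all messages have required fields
    for msg in formatted:
        if msg["role"] not in ("system", "user", "assistant", "tool"):
            raise ValueError(f"Invalid role: {msg['role']}")
        if "content" not in msg and "tool_calls" not in msg:
            raise ValueError("Message must contain either 'content' or 'tool_calls'")
    return formatted
```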
================================================
FILE: app/logger.py
================================================
import os
import sys
import time
from pathlib import Path
from loguru import logger
# Project root directory
project_root = Path(__file__).parent.parent
# Create the logs directory
logs_dir = project_root / "logs"
logs_dir.mkdir(exist_ok=True)
# Check whether a log file was specified
log_file = os.environ.get("OPENMANUS_LOG_FILE")
if not log_file:
    # If not, check for a task ID (from the session or the workspace directory name)
    task_id = os.environ.get("OPENMANUS_TASK_ID", "")
    # Use the task ID as the log file name instead of a date-time format
    if task_id:
        # Ensure the task ID starts with job_
        if not task_id.startswith("job_"):
            task_id = f"job_{task_id}"
        log_filename = f"{task_id}.log"
    else:
        # Without a task ID, create a job_<timestamp> log file name
        job_id = f"job_{int(time.time())}"
        log_filename = f"{job_id}.log"
    log_file = logs_dir / log_filename
else:
    # Use the specified log file
    log_file = Path(log_file)
# Configure loguru logging
logger.remove()  # Remove the default handler
# Add console output
logger.add(
sys.stderr,
format="{time:YYYY-MM-DD HH:mm:ss.SSS} | {level: <8} | {name}:{function}:{line} - {message}",
level="INFO",
)
# Add file output
logger.add(
log_file,
format="{time:YYYY-MM-DD HH:mm:ss.SSS} | {level: <8} | {name}:{function}:{line} - {message}",
level="INFO",
rotation="100 MB",
retention="10 days",
)
# Export the configured logger
__all__ = ["logger"]
if __name__ == "__main__":
logger.info("Starting application")
logger.debug("Debug message")
logger.warning("Warning message")
logger.error("Error message")
logger.critical("Critical message")
try:
raise ValueError("Test error")
except Exception as e:
logger.exception(f"An error occurred: {e}")
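The sink-selection logic above can be factored into a pure function for testing. This is a sketch, with an `env` mapping passed in instead of reading `os.environ` directly:

```python
import time
from pathlib import Path

def resolve_log_file(logs_dir, env):
    """Pick the log file the way app/logger.py does: an explicit
    OPENMANUS_LOG_FILE wins, then job_<task-id>.log, then a
    timestamped job_<epoch>.log fallback."""
    explicit = env.get("OPENMANUS_LOG_FILE")
    if explicit:
        return Path(explicit)
    task_id = env.get("OPENMANUS_TASK_ID", "")
    if task_id:
        # Normalize the task ID to the job_ prefix convention
        if not task_id.startswith("job_"):
            task_id = f"job_{task_id}"
        return Path(logs_dir) / f"{task_id}.log"
    return Path(logs_dir) / f"job_{int(time.time())}.log"
```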
================================================
FILE: app/prompt/__init__.py
================================================
================================================
FILE: app/prompt/manus.py
================================================
SYSTEM_PROMPT = "You are OpenManus, an all-capable AI assistant, aimed at solving any task presented by the user. You have various tools at your disposal that you can call upon to efficiently complete complex requests. Whether it's programming, information retrieval, file processing, or web browsing, you can handle it all."
NEXT_STEP_PROMPT = """You can interact with the computer using PythonExecute, save important content and information to files through FileSaver, open browsers with BrowserUseTool, and retrieve information using GoogleSearch.
PythonExecute: Execute Python code to interact with the computer system, data processing, automation tasks, etc.
FileSaver: Save files locally, such as txt, py, html, etc.
BrowserUseTool: Open, browse, and use web browsers. If you open a local HTML file, you must provide the absolute path to the file.
GoogleSearch: Perform web information retrieval
Based on user needs, proactively select the most appropriate tool or combination of tools. For complex tasks, you can break down the problem and use different tools step by step to solve it. After using each tool, clearly explain the execution results and suggest the next steps.
"""
================================================
FILE: app/prompt/planning.py
================================================
PLANNING_SYSTEM_PROMPT = """
You are an expert Planning Agent tasked with solving problems efficiently through structured plans.
Your job is:
1. Analyze requests to understand the task scope
2. Create a clear, actionable plan that makes meaningful progress with the `planning` tool
3. Execute steps using available tools as needed
4. Track progress and adapt plans when necessary
5. Use `finish` to conclude immediately when the task is complete
Available tools will vary by task but may include:
- `planning`: Create, update, and track plans (commands: create, update, mark_step, etc.)
- `finish`: End the task when complete
Break tasks into logical steps with clear outcomes. Avoid excessive detail or sub-steps.
Think about dependencies and verification methods.
Know when to conclude - don't continue thinking once objectives are met.
"""
NEXT_STEP_PROMPT = """
Based on the current state, what's your next action?
Choose the most efficient path forward:
1. Is the plan sufficient, or does it need refinement?
2. Can you execute the next step immediately?
3. Is the task complete? If so, use `finish` right away.
Be concise in your reasoning, then select the appropriate tool or action.
"""
================================================
FILE: app/prompt/swe.py
================================================
SYSTEM_PROMPT = """SETTING: You are an autonomous programmer, and you're working directly in the command line with a special interface.
The special interface consists of a file editor that shows you {{WINDOW}} lines of a file at a time.
In addition to typical bash commands, you can also use specific commands to help you navigate and edit files.
To call a command, you need to invoke it with a function call/tool call.
Please note that THE EDIT COMMAND REQUIRES PROPER INDENTATION.
If you'd like to add the line ' print(x)' you must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
RESPONSE FORMAT:
Your shell prompt is formatted as follows:
(Open file: <path>)
(Current directory: <cwd>)
bash-$
First, you should _always_ include a general thought about what you're going to do next.
Then, for every response, you must include exactly _ONE_ tool call/function call.
Remember, you should always include a _SINGLE_ tool call/function call and then wait for a response from the shell before continuing with more discussion and commands. Everything you include in the DISCUSSION section will be saved for future reference.
If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first tool call, and then after receiving a response you'll be able to issue the second tool call.
Note that the environment does NOT support interactive session commands (e.g. python, vim), so please do not invoke them.
"""
NEXT_STEP_TEMPLATE = """{{observation}}
(Open file: {{open_file}})
(Current directory: {{working_dir}})
bash-$
"""
================================================
FILE: app/prompt/toolcall.py
================================================
SYSTEM_PROMPT = "You are an agent that can execute tool calls"
NEXT_STEP_PROMPT = (
"If you want to stop interaction, use `terminate` tool/function call."
)
================================================
FILE: app/schema.py
================================================
from enum import Enum
from typing import Any, List, Literal, Optional, Union
from pydantic import BaseModel, Field
class AgentState(str, Enum):
"""Agent execution states"""
IDLE = "IDLE"
RUNNING = "RUNNING"
FINISHED = "FINISHED"
ERROR = "ERROR"
class Function(BaseModel):
name: str
arguments: str
class ToolCall(BaseModel):
"""Represents a tool/function call in a message"""
id: str
type: str = "function"
function: Function
class Message(BaseModel):
"""Represents a chat message in the conversation"""
role: Literal["system", "user", "assistant", "tool"] = Field(...)
content: Optional[str] = Field(default=None)
tool_calls: Optional[List[ToolCall]] = Field(default=None)
name: Optional[str] = Field(default=None)
tool_call_id: Optional[str] = Field(default=None)
def __add__(self, other) -> List["Message"]:
        """Support Message + list and Message + Message operations"""
if isinstance(other, list):
return [self] + other
elif isinstance(other, Message):
return [self, other]
else:
raise TypeError(
f"unsupported operand type(s) for +: '{type(self).__name__}' and '{type(other).__name__}'"
)
def __radd__(self, other) -> List["Message"]:
        """Support list + Message operations"""
if isinstance(other, list):
return other + [self]
else:
raise TypeError(
f"unsupported operand type(s) for +: '{type(other).__name__}' and '{type(self).__name__}'"
)
def to_dict(self) -> dict:
"""Convert message to dictionary format"""
message = {"role": self.role}
if self.content is not None:
message["content"] = self.content
if self.tool_calls is not None:
message["tool_calls"] = [tool_call.dict() for tool_call in self.tool_calls]
if self.name is not None:
message["name"] = self.name
if self.tool_call_id is not None:
message["tool_call_id"] = self.tool_call_id
return message
@classmethod
def user_message(cls, content: str) -> "Message":
"""Create a user message"""
return cls(role="user", content=content)
@classmethod
def system_message(cls, content: str) -> "Message":
"""Create a system message"""
return cls(role="system", content=content)
@classmethod
def assistant_message(cls, content: Optional[str] = None) -> "Message":
"""Create an assistant message"""
return cls(role="assistant", content=content)
@classmethod
def tool_message(cls, content: str, name, tool_call_id: str) -> "Message":
"""Create a tool message"""
return cls(role="tool", content=content, name=name, tool_call_id=tool_call_id)
@classmethod
def from_tool_calls(
cls, tool_calls: List[Any], content: Union[str, List[str]] = "", **kwargs
):
        """Create an assistant Message from raw tool calls.
Args:
tool_calls: Raw tool calls from LLM
content: Optional message content
"""
formatted_calls = [
{"id": call.id, "function": call.function.model_dump(), "type": "function"}
for call in tool_calls
]
return cls(
role="assistant", content=content, tool_calls=formatted_calls, **kwargs
)
class Memory(BaseModel):
messages: List[Message] = Field(default_factory=list)
max_messages: int = Field(default=100)
def add_message(self, message: Message) -> None:
"""Add a message to memory"""
self.messages.append(message)
# Optional: Implement message limit
if len(self.messages) > self.max_messages:
self.messages = self.messages[-self.max_messages :]
def add_messages(self, messages: List[Message]) -> None:
"""Add multiple messages to memory"""
self.messages.extend(messages)
def clear(self) -> None:
"""Clear all messages"""
self.messages.clear()
def get_recent_messages(self, n: int) -> List[Message]:
"""Get n most recent messages"""
return self.messages[-n:]
def to_dict_list(self) -> List[dict]:
"""Convert messages to list of dicts"""
return [msg.to_dict() for msg in self.messages]
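The `__add__`/`__radd__` overloads above let callers write `system_msg + user_msgs` to build message lists. A minimal stand-in class (returning `NotImplemented` instead of raising, a common variant of the same pattern) demonstrates the semantics without pydantic:

```python
class Msg:
    """Minimal stand-in for app.schema.Message showing the + semantics."""

    def __init__(self, role, content):
        self.role, self.content = role, content

    def __add__(self, other):
        # Msg + list and Msg + Msg both yield a flat list
        if isinstance(other, list):
            return [self] + other
        if isinstance(other, Msg):
            return [self, other]
        return NotImplemented

    def __radd__(self, other):
        # list + Msg appends this message
        if isinstance(other, list):
            return other + [self]
        return NotImplemented
```

Either way, the result is always a plain `list`, so the next `+` falls back to ordinary list concatenation.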
================================================
FILE: app/tool/__init__.py
================================================
from app.tool.base import BaseTool
from app.tool.bash import Bash
from app.tool.create_chat_completion import CreateChatCompletion
from app.tool.planning import PlanningTool
from app.tool.str_replace_editor import StrReplaceEditor
from app.tool.terminate import Terminate
from app.tool.tool_collection import ToolCollection
__all__ = [
"BaseTool",
"Bash",
"Terminate",
"StrReplaceEditor",
"ToolCollection",
"CreateChatCompletion",
"PlanningTool",
]
================================================
FILE: app/tool/base.py
================================================
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional
from pydantic import BaseModel, Field
class BaseTool(ABC, BaseModel):
name: str
description: str
parameters: Optional[dict] = None
class Config:
arbitrary_types_allowed = True
async def __call__(self, **kwargs) -> Any:
"""Execute the tool with given parameters."""
return await self.execute(**kwargs)
@abstractmethod
async def execute(self, **kwargs) -> Any:
"""Execute the tool with given parameters."""
def to_param(self) -> Dict:
"""Convert tool to function call format."""
return {
"type": "function",
"function": {
"name": self.name,
"description": self.description,
"parameters": self.parameters,
},
}
class ToolResult(BaseModel):
"""Represents the result of a tool execution."""
output: Any = Field(default=None)
error: Optional[str] = Field(default=None)
system: Optional[str] = Field(default=None)
class Config:
arbitrary_types_allowed = True
def __bool__(self):
return any(getattr(self, field) for field in self.__fields__)
def __add__(self, other: "ToolResult"):
def combine_fields(
field: Optional[str], other_field: Optional[str], concatenate: bool = True
):
if field and other_field:
if concatenate:
return field + other_field
raise ValueError("Cannot combine tool results")
return field or other_field
return ToolResult(
output=combine_fields(self.output, other.output),
error=combine_fields(self.error, other.error),
system=combine_fields(self.system, other.system),
)
def __str__(self):
return f"Error: {self.error}" if self.error else self.output
def replace(self, **kwargs):
"""Returns a new ToolResult with the given fields replaced."""
# return self.copy(update=kwargs)
return type(self)(**{**self.dict(), **kwargs})
class CLIResult(ToolResult):
"""A ToolResult that can be rendered as a CLI output."""
class ToolFailure(ToolResult):
"""A ToolResult that represents a failure."""
class AgentAwareTool:
    agent: Optional[Any] = None
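`ToolResult.__add__` merges results field-wise: concatenate when both sides are set, otherwise keep whichever is non-empty. A dataclass sketch of the same idea (fields trimmed to two for brevity):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    """Minimal sketch of ToolResult's field-wise combination."""
    output: Optional[str] = None
    error: Optional[str] = None

    def __add__(self, other: "Result") -> "Result":
        def combine(a, b):
            # Concatenate when both present, else take the non-empty one
            if a and b:
                return a + b
            return a or b
        return Result(
            output=combine(self.output, other.output),
            error=combine(self.error, other.error),
        )

    def __str__(self):
        return f"Error: {self.error}" if self.error else (self.output or "")
```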
================================================
FILE: app/tool/bash.py
================================================
import asyncio
import os
from typing import Optional
from app.exceptions import ToolError
from app.tool.base import BaseTool, CLIResult, ToolResult
_BASH_DESCRIPTION = """Execute a bash command in the terminal.
* Long running commands: For commands that may run indefinitely, it should be run in the background and the output should be redirected to a file, e.g. command = `python3 app.py > server.log 2>&1 &`.
* Interactive: If a bash command returns exit code `-1`, this means the process is not yet finished. The assistant must then send a second call to terminal with an empty `command` (which will retrieve any additional logs), or it can send additional text (set `command` to the text) to STDIN of the running process, or it can send command=`ctrl+c` to interrupt the process.
* Timeout: If a command execution result says "Command timed out. Sending SIGINT to the process", the assistant should retry running the command in the background.
"""
class _BashSession:
"""A session of a bash shell."""
_started: bool
_process: asyncio.subprocess.Process
command: str = "/bin/bash"
_output_delay: float = 0.2 # seconds
_timeout: float = 120.0 # seconds
_sentinel: str = "<<exit>>"  # distinctive marker; must not occur in normal command output
def __init__(self):
self._started = False
self._timed_out = False
async def start(self):
if self._started:
return
self._process = await asyncio.create_subprocess_shell(
self.command,
preexec_fn=os.setsid,
shell=True,
bufsize=0,
stdin=asyncio.subprocess.PIPE,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
self._started = True
def stop(self):
"""Terminate the bash shell."""
if not self._started:
raise ToolError("Session has not started.")
if self._process.returncode is not None:
return
self._process.terminate()
async def run(self, command: str):
"""Execute a command in the bash shell."""
if not self._started:
raise ToolError("Session has not started.")
if self._process.returncode is not None:
return ToolResult(
system="tool must be restarted",
error=f"bash has exited with returncode {self._process.returncode}",
)
if self._timed_out:
raise ToolError(
f"timed out: bash has not returned in {self._timeout} seconds and must be restarted",
)
# we know these are not None because we created the process with PIPEs
assert self._process.stdin
assert self._process.stdout
assert self._process.stderr
# send command to the process
self._process.stdin.write(
command.encode() + f"; echo '{self._sentinel}'\n".encode()
)
await self._process.stdin.drain()
# read output from the process, until the sentinel is found
try:
async with asyncio.timeout(self._timeout):
while True:
await asyncio.sleep(self._output_delay)
# if we read directly from stdout/stderr, it will wait forever for
# EOF. use the StreamReader buffer directly instead.
output = (
self._process.stdout._buffer.decode()
) # pyright: ignore[reportAttributeAccessIssue]
if self._sentinel in output:
# strip the sentinel and break
output = output[: output.index(self._sentinel)]
break
except asyncio.TimeoutError:
self._timed_out = True
raise ToolError(
f"timed out: bash has not returned in {self._timeout} seconds and must be restarted",
) from None
if output.endswith("\n"):
output = output[:-1]
error = (
self._process.stderr._buffer.decode()
) # pyright: ignore[reportAttributeAccessIssue]
if error.endswith("\n"):
error = error[:-1]
# clear the buffers so that the next output can be read correctly
self._process.stdout._buffer.clear() # pyright: ignore[reportAttributeAccessIssue]
self._process.stderr._buffer.clear() # pyright: ignore[reportAttributeAccessIssue]
return CLIResult(output=output, error=error)
class Bash(BaseTool):
"""A tool for executing bash commands"""
name: str = "bash"
description: str = _BASH_DESCRIPTION
parameters: dict = {
"type": "object",
"properties": {
"command": {
"type": "string",
"description": "The bash command to execute. Can be empty to view additional logs when previous exit code is `-1`. Can be `ctrl+c` to interrupt the currently running process.",
},
},
"required": ["command"],
}
_session: Optional[_BashSession] = None
async def execute(
self, command: str | None = None, restart: bool = False, **kwargs
) -> ToolResult:
if restart:
if self._session:
self._session.stop()
self._session = _BashSession()
await self._session.start()
return ToolResult(system="tool has been restarted.")
if self._session is None:
self._session = _BashSession()
await self._session.start()
if command is not None:
return await self._session.run(command)
raise ToolError("no command provided.")
if __name__ == "__main__":
bash = Bash()
rst = asyncio.run(bash.execute("ls -l"))
print(rst)
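The sentinel protocol in `_BashSession.run` — append an `echo` of a marker to each command, then poll the accumulated stdout buffer until the marker appears — can be sketched in isolation. `extract_until_sentinel` is a hypothetical helper illustrating just the parsing step:

```python
# The session appends `; echo '<sentinel>'` to every command, so the marker
# shows up in stdout exactly when the command has finished.
SENTINEL = "<<exit>>"

def extract_until_sentinel(buffer: str, sentinel: str = SENTINEL):
    """Return (output, done): output up to the sentinel if it has arrived."""
    if sentinel in buffer:
        # Command finished: keep everything before the marker, drop trailing newlines.
        return buffer[: buffer.index(sentinel)].rstrip("\n"), True
    return buffer, False  # keep polling; command still running

out, done = extract_until_sentinel("total 8\nfile.txt\n<<exit>>\n")
print(done)  # True
print(out)   # total 8
             # file.txt
```

This is why the sentinel must be a string that real command output is unlikely to contain: any collision would truncate the result early.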
================================================
FILE: app/tool/browser_use_tool.py
================================================
import asyncio
import json
import logging
from typing import Optional
from browser_use import Browser as BrowserUseBrowser
from browser_use import BrowserConfig
from browser_use.browser.context import BrowserContext
from browser_use.dom.service import DomService
from pydantic import Field, field_validator
from pydantic_core.core_schema import ValidationInfo
from app.tool.base import BaseTool, ToolResult
_BROWSER_DESCRIPTION = """
Interact with a web browser to perform various actions such as navigation, element interaction,
content extraction, and tab management. Supported actions include:
- 'navigate': Go to a specific URL
- 'click': Click an element by index
- 'input_text': Input text into an element
- 'screenshot': Capture a screenshot
- 'get_html': Get page HTML content
- 'get_text': Get text content of the page
- 'read_links': Get all links on the page
- 'execute_js': Execute JavaScript code
- 'scroll': Scroll the page
- 'switch_tab': Switch to a specific tab
- 'new_tab': Open a new tab
- 'close_tab': Close the current tab
- 'refresh': Refresh the current page
"""
class BrowserUseTool(BaseTool):
name: str = "browser_use"
description: str = _BROWSER_DESCRIPTION
parameters: dict = {
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": [
"navigate",
"click",
"input_text",
"screenshot",
"get_html",
"get_text",
"read_links",
"execute_js",
"scroll",
"switch_tab",
"new_tab",
"close_tab",
"refresh",
],
"description": "The browser action to perform",
},
"url": {
"type": "string",
"description": "URL for 'navigate' or 'new_tab' actions",
},
"index": {
"type": "integer",
"description": "Element index for 'click' or 'input_text' actions",
},
"text": {"type": "string", "description": "Text for 'input_text' action"},
"script": {
"type": "string",
"description": "JavaScript code for 'execute_js' action",
},
"scroll_amount": {
"type": "integer",
"description": "Pixels to scroll (positive for down, negative for up) for 'scroll' action",
},
"tab_id": {
"type": "integer",
"description": "Tab ID for 'switch_tab' action",
},
},
"required": ["action"],
"dependencies": {
"navigate": ["url"],
"click": ["index"],
"input_text": ["index", "text"],
"execute_js": ["script"],
"switch_tab": ["tab_id"],
"new_tab": ["url"],
"scroll": ["scroll_amount"],
},
}
lock: asyncio.Lock = Field(default_factory=asyncio.Lock)
browser: Optional[BrowserUseBrowser] = Field(default=None, exclude=True)
context: Optional[BrowserContext] = Field(default=None, exclude=True)
dom_service: Optional[DomService] = Field(default=None, exclude=True)
@field_validator("parameters", mode="before")
def validate_parameters(cls, v: dict, info: ValidationInfo) -> dict:
if not v:
raise ValueError("Parameters cannot be empty")
return v
async def _ensure_browser_initialized(self) -> BrowserContext:
"""Ensure browser and context are initialized."""
if self.browser is None:
self.browser = BrowserUseBrowser(BrowserConfig(headless=False))
if self.context is None:
self.context = await self.browser.new_context()
self.dom_service = DomService(await self.context.get_current_page())
return self.context
async def execute(
self,
action: str,
url: Optional[str] = None,
index: Optional[int] = None,
text: Optional[str] = None,
script: Optional[str] = None,
scroll_amount: Optional[int] = None,
tab_id: Optional[int] = None,
**kwargs,
) -> ToolResult:
"""
Execute a specified browser action.
Args:
action: The browser action to perform
url: URL for navigation or new tab
index: Element index for click or input actions
text: Text for input action
script: JavaScript code for execution
scroll_amount: Pixels to scroll for scroll action
tab_id: Tab ID for switch_tab action
**kwargs: Additional arguments
Returns:
ToolResult with the action's output or error
"""
async with self.lock:
try:
context = await self._ensure_browser_initialized()
if action == "navigate":
if not url:
return ToolResult(error="URL is required for 'navigate' action")
await context.navigate_to(url)
return ToolResult(output=f"Navigated to {url}")
elif action == "click":
if index is None:
return ToolResult(error="Index is required for 'click' action")
element = await context.get_dom_element_by_index(index)
if not element:
return ToolResult(error=f"Element with index {index} not found")
download_path = await context._click_element_node(element)
output = f"Clicked element at index {index}"
if download_path:
output += f" - Downloaded file to {download_path}"
return ToolResult(output=output)
elif action == "input_text":
if index is None or not text:
return ToolResult(
error="Index and text are required for 'input_text' action"
)
element = await context.get_dom_element_by_index(index)
if not element:
return ToolResult(error=f"Element with index {index} not found")
await context._input_text_element_node(element, text)
return ToolResult(
output=f"Input '{text}' into element at index {index}"
)
elif action == "screenshot":
screenshot = await context.take_screenshot(full_page=True)
return ToolResult(
output=f"Screenshot captured (base64 length: {len(screenshot)})",
system=screenshot,
)
elif action == "get_html":
html = await context.get_page_html()
truncated = html[:2000] + "..." if len(html) > 2000 else html
return ToolResult(output=truncated)
elif action == "get_text":
text = await context.execute_javascript("document.body.innerText")
return ToolResult(output=text)
elif action == "read_links":
links = await context.execute_javascript(
"Array.from(document.querySelectorAll('a[href]')).filter(elem => elem.innerText.trim()).map(elem => elem.innerText.trim() + ' ' + elem.href).join('\\n')"
)
return ToolResult(output=links)
elif action == "execute_js":
if not script:
return ToolResult(
error="Script is required for 'execute_js' action"
)
result = await context.execute_javascript(script)
return ToolResult(output=str(result))
elif action == "scroll":
if scroll_amount is None:
return ToolResult(
error="Scroll amount is required for 'scroll' action"
)
await context.execute_javascript(
f"window.scrollBy(0, {scroll_amount});"
)
direction = "down" if scroll_amount > 0 else "up"
return ToolResult(
output=f"Scrolled {direction} by {abs(scroll_amount)} pixels"
)
elif action == "switch_tab":
if tab_id is None:
return ToolResult(
error="Tab ID is required for 'switch_tab' action"
)
await context.switch_to_tab(tab_id)
return ToolResult(output=f"Switched to tab {tab_id}")
elif action == "new_tab":
if not url:
return ToolResult(error="URL is required for 'new_tab' action")
await context.create_new_tab(url)
return ToolResult(output=f"Opened new tab with URL {url}")
elif action == "close_tab":
await context.close_current_tab()
return ToolResult(output="Closed current tab")
elif action == "refresh":
await context.refresh_page()
return ToolResult(output="Refreshed current page")
else:
return ToolResult(error=f"Unknown action: {action}")
except Exception as e:
return ToolResult(error=f"Browser action '{action}' failed: {str(e)}")
async def get_current_state(self) -> ToolResult:
"""Get the current browser state as a ToolResult."""
async with self.lock:
try:
context = await self._ensure_browser_initialized()
state = await context.get_state()
state_info = {
"url": state.url,
"title": state.title,
"tabs": [tab.model_dump() for tab in state.tabs],
"interactive_elements": state.element_tree.clickable_elements_to_string(),
}
return ToolResult(output=json.dumps(state_info))
except Exception as e:
return ToolResult(error=f"Failed to get browser state: {str(e)}")
async def cleanup(self):
"""Clean up browser resources."""
if hasattr(self, "browser") and self.browser is not None:
try:
if (
hasattr(self, "context")
and self.context
and not self.context.is_closed()
):
await self.context.close()
if self.browser and not self.browser.is_closed():
await self.browser.close()
self.browser = None
self.context = None
self.dom_service = None
except Exception as e:
logging.error(f"Error during browser cleanup: {str(e)}")
def __del__(self):
"""Attempt to clean up resources when the object is destroyed."""
if hasattr(self, "browser") and self.browser is not None:
try:
# Check whether an event loop is currently running
try:
loop = asyncio.get_running_loop()
if loop.is_running():
# A loop is running: log a warning and skip cleanup to avoid a
# RuntimeError. This may leak browser resources.
logging.warning("Event loop is running; skipping browser cleanup. This may leak resources.")
return
except RuntimeError:
# No event loop is running; create a new one for cleanup
loop = asyncio.new_event_loop()
loop.run_until_complete(self.cleanup())
loop.close()
except Exception as e:
logging.error(f"Failed to clean up browser resources: {str(e)}")
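The `dependencies` section of `BrowserUseTool.parameters` is documentation-only: `execute` re-checks each argument by hand with per-action `if` branches. Driving that validation from the map itself could look like the sketch below; `missing_args` is a hypothetical helper, not part of the tool:

```python
# Mirrors the "dependencies" map in the parameter schema above: each action
# lists the argument names it cannot run without.
ACTION_DEPENDENCIES = {
    "navigate": ["url"],
    "click": ["index"],
    "input_text": ["index", "text"],
    "execute_js": ["script"],
    "switch_tab": ["tab_id"],
    "new_tab": ["url"],
    "scroll": ["scroll_amount"],
}

def missing_args(action: str, supplied: dict) -> list:
    """Names required by `action` that are absent (or None) in `supplied`."""
    return [
        name
        for name in ACTION_DEPENDENCIES.get(action, [])
        if supplied.get(name) is None
    ]

print(missing_args("input_text", {"index": 2}))          # ['text']
print(missing_args("navigate", {"url": "https://a.b"}))  # []
```

Actions with no entry (e.g. `refresh`, `close_tab`) simply require nothing, which matches the tool's current behavior.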
================================================
FILE: app/tool/create_chat_completion.py
================================================
from typing import Any, List, Optional, Type, Union, get_args, get_origin
from pydantic import BaseModel, Field
from app.tool import BaseTool
class CreateChatCompletion(BaseTool):
name: str = "create_chat_completion"
description: str = (
"Creates a structured completion with specified output formatting."
)
# Type mapping for JSON schema
type_mapping: dict = {
str: "string",
int: "integer",
float: "number",
bool: "boolean",
dict: "object",
list: "array",
}
response_type: Optional[Type] = None
required: List[str] = Field(default_factory=lambda: ["response"])
def __init__(self, response_type: Optional[Type] = str):
"""Initialize with a specific response type."""
super().__init__()
self.response_type = response_type
self.parameters = self._build_parameters()
def _build_parameters(self) -> dict:
"""Build parameters schema based on response type."""
if self.response_type == str:
return {
"type": "object",
"properties": {
"response": {
"type": "string",
"description": "The response text that should be delivered to the user.",
},
},
"required": self.required,
}
if isinstance(self.response_type, type) and issubclass(
self.response_type, BaseModel
):
schema = self.response_type.model_json_schema()
return {
"type": "object",
"properties": schema["properties"],
"required": schema.get("required", self.required),
}
return self._create_type_schema(self.response_type)
def _create_type_schema(self, type_hint: Type) -> dict:
"""Create a JSON schema for the given type."""
origin = get_origin(type_hint)
args = get_args(type_hint)
# Handle primitive types
if origin is None:
return {
"type": "object",
"properties": {
"response": {
"type": self.type_mapping.get(type_hint, "string"),
"description": f"Response of type {type_hint.__name__}",
}
},
"required": self.required,
}
# Handle List type
if origin is list:
item_type = args[0] if args else Any
return {
"type": "object",
"properties": {
"response": {
"type": "array",
"items": self._get_type_info(item_type),
}
},
"required": self.required,
}
# Handle Dict type
if origin is dict:
value_type = args[1] if len(args) > 1 else Any
return {
"type": "object",
"properties": {
"response": {
"type": "object",
"additionalProperties": self._get_type_info(value_type),
}
},
"required": self.required,
}
# Handle Union type
if origin is Union:
return self._create_union_schema(args)
return self._build_parameters()
def _get_type_info(self, type_hint: Type) -> dict:
"""Get type information for a single type."""
if isinstance(type_hint, type) and issubclass(type_hint, BaseModel):
return type_hint.model_json_schema()
return {
"type": self.type_mapping.get(type_hint, "string"),
"description": f"Value of type {getattr(type_hint, '__name__', 'any')}",
}
def _create_union_schema(self, types: tuple) -> dict:
"""Create schema for Union types."""
return {
"type": "object",
"properties": {
"response": {"anyOf": [self._get_type_info(t) for t in types]}
},
"required": self.required,
}
async def execute(self, required: list | None = None, **kwargs) -> Any:
"""Execute the chat completion with type conversion.
Args:
required: List of required field names or None
**kwargs: Response data
Returns:
Converted response based on response_type
"""
required = required or self.required
# Handle case when required is a list
if isinstance(required, list) and len(required) > 0:
if len(required) == 1:
required_field = required[0]
result = kwargs.get(required_field, "")
else:
# Return multiple fields as a dictionary
return {field: kwargs.get(field, "") for field in required}
else:
required_field = "response"
result = kwargs.get(required_field, "")
# Type conversion logic
if self.response_type == str:
return result
if isinstance(self.response_type, type) and issubclass(
self.response_type, BaseModel
):
return self.response_type(**kwargs)
if get_origin(self.response_type) in (list, dict):
return result # Assuming result is already in correct format
try:
return self.response_type(result)
except (ValueError, TypeError):
return result
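The `List` branch of `_create_type_schema` can be illustrated in isolation. `list_schema` below is a simplified sketch of that branch (not the tool's actual method): a generic `List[T]` becomes a JSON-schema array whose `items` use the mapped primitive type.

```python
from typing import Any, List, get_args, get_origin

# Subset of the tool's type_mapping for primitive item types.
TYPE_MAPPING = {str: "string", int: "integer", float: "number", bool: "boolean"}

def list_schema(type_hint) -> dict:
    """Build the 'response is an array' schema for a List[...] hint."""
    assert get_origin(type_hint) is list
    args = get_args(type_hint)
    item_type = args[0] if args else Any  # bare `list` falls back to Any
    return {
        "type": "object",
        "properties": {
            "response": {
                "type": "array",
                "items": {"type": TYPE_MAPPING.get(item_type, "string")},
            }
        },
        "required": ["response"],
    }

print(list_schema(List[int])["properties"]["response"]["items"])  # {'type': 'integer'}
```

The real method additionally delegates `BaseModel` item types to `model_json_schema()` via `_get_type_info`; the sketch keeps only the primitive path.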
================================================
FILE: app/tool/file_saver.py
================================================
import os
import aiofiles
from app.tool.base import BaseTool
class FileSaver(BaseTool):
name: str = "file_saver"
description: str = """Save content to a local file at a specified path.
Use this tool when you need to save text, code, or generated content to a file on the local filesystem.
The tool accepts content and a file path, and saves the content to that location.
"""
parameters: dict = {
"type": "object",
"properties": {
"content": {
"type": "string",
"description": "(required) The content to save to the file.",
},
"file_path": {
"type": "string",
"description": "(required) The path where the file should be saved, including filename and extension.",
},
"mode": {
"type": "string",
"description": "(optional) The file opening mode. Default is 'w' for write. Use 'a' for append.",
"enum": ["w", "a"],
"default": "w",
},
},
"required": ["content", "file_path"],
}
async def execute(self, content: str, file_path: str, mode: str = "w") -> str:
"""
Save content to a file at the specified path.
Args:
content (str): The content to save to the file.
file_path (str): The path where the file should be saved.
mode (str, optional): The file opening mode. Default is 'w' for write. Use 'a' for append.
Returns:
str: A message indicating the result of the operation.
"""
try:
# Ensure the directory exists
directory = os.path.dirname(file_path)
if directory:
os.makedirs(directory, exist_ok=True)
# Write directly to the file
async with aiofiles.open(file_path, mode, encoding="utf-8") as file:
await file.write(content)
return f"Content successfully saved to {file_path}"
except Exception as e:
return f"Error saving file: {str(e)}"
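`FileSaver`'s behavior — ensure the parent directory exists, then write or append the content — can be sketched synchronously with the standard library (the real tool uses `aiofiles` for non-blocking I/O):

```python
import os
import tempfile

def save_text(content: str, file_path: str, mode: str = "w") -> str:
    """Write (mode='w') or append (mode='a') content, creating parent dirs."""
    directory = os.path.dirname(file_path)
    if directory:
        os.makedirs(directory, exist_ok=True)  # idempotent; no race on re-check
    with open(file_path, mode, encoding="utf-8") as f:
        f.write(content)
    return f"Content successfully saved to {file_path}"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nested", "notes.txt")
    save_text("hello\n", path)            # creates nested/ automatically
    save_text("again\n", path, mode="a")  # appends to the existing file
    print(open(path).read())  # hello
                              # again
```

Using `exist_ok=True` avoids the check-then-create race of `if not os.path.exists(...)` followed by `os.makedirs(...)`.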
================================================
FILE: app/tool/google_search.py
================================================
import asyncio
from typing import List
from googlesearch import search
from app.tool.base import BaseTool
class GoogleSearch(BaseTool):
name: str = "google_search"
description: str = """Perform a Google search and return a list of relevant links.
Use this tool when you need to find information on the web, get up-to-date data, or research specific topics.
The tool returns a list of URLs that match the search query.
"""
parameters: dict = {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "(required) The search query to submit to Google.",
},
"num_results": {
"type": "integer",
"description": "(optional) The number of search results to return. Default is 10.",
"default": 10,
},
},
"required": ["query"],
}
async def execute(self, query: str, num_results: int = 10) -> List[str]:
"""
Execute a Google search and return a list of URLs.
Args:
query (str): The search query to submit to Google.
num_results (int, optional): The number of search results to return. Default is 10.
Returns:
List[str]: A list of URLs matching the search query.
"""
# Run the search in a thread pool to prevent blocking
loop = asyncio.get_running_loop()
links = await loop.run_in_executor(
None, lambda: list(search(query, num_results=num_results))
)
return links
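`GoogleSearch.execute` wraps the blocking `googlesearch` call with `run_in_executor` so the event loop stays responsive. A self-contained sketch of the same pattern, with a hypothetical `slow_lookup` standing in for the real library:

```python
import asyncio
import time

# slow_lookup is a stand-in for the blocking googlesearch.search call.
def slow_lookup(query: str) -> list:
    time.sleep(0.01)  # simulate network latency
    return [f"https://example.com/{query}/1", f"https://example.com/{query}/2"]

async def search_async(query: str) -> list:
    # Push the blocking call onto the default thread-pool executor so the
    # event loop can keep servicing other tasks while it runs.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, lambda: slow_lookup(query))

links = asyncio.run(search_async("openmanus"))
print(links[0])  # https://example.com/openmanus/1
```

Passing `None` as the executor uses the loop's default `ThreadPoolExecutor`, which is appropriate for I/O-bound calls like this one.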
================================================
FILE: app/tool/planning.py
================================================
# tool/planning.py
from typing import Dict, List, Literal, Optional
from app.exceptions import ToolError
from app.tool.base import BaseTool, ToolResult
_PLANNING_TOOL_DESCRIPTION = """
A planning tool that allows the agent to create and manage plans for solving complex tasks.
The tool provides functionality for creating plans, updating plan steps, and tracking progress.
"""
class PlanningTool(BaseTool):
"""
A planning tool that allows the agent to create and manage plans for solving complex tasks.
The tool provides functionality for creating plans, updating plan steps, and tracking progress.
"""
name: str = "planning"
description: str = _PLANNING_TOOL_DESCRIPTION
parameters: dict = {
"type": "object",
"properties": {
"command": {
"description": "The command to execute. Available commands: create, update, list, get, set_active, mark_step, delete.",
"enum": [
"create",
"update",
"list",
"get",
"set_active",
"mark_step",
"delete",
],
"type": "string",
},
"plan_id": {
"description": "Unique identifier for the plan. Required for create, update, set_active, and delete commands. Optional for get and mark_step (uses active plan if not specified).",
"type": "string",
},
"title": {
"description": "Title for the plan. Required for create command, optional for update command.",
"type": "string",
},
"steps": {
"description": "List of plan steps. Required for create command, optional for update command.",
"type": "array",
"items": {"type": "string"},
},
"step_index": {
"description": "Index of the step to update (0-based). Required for mark_step command.",
"type": "integer",
},
"step_status": {
"description": "Status to set for a step. Used with mark_step command.",
"enum": ["not_started", "in_progress", "completed", "blocked"],
"type": "string",
},
"step_notes": {
"description": "Additional notes for a step. Optional for mark_step command.",
"type": "string",
},
},
"required": ["command"],
"additionalProperties": False,
}
plans: dict = {} # Dictionary to store plans by plan_id
_current_plan_id: Optional[str] = None # Track the current active plan
async def execute(
self,
*,
command: Literal[
"create", "update", "list", "get", "set_active", "mark_step", "delete"
],
plan_id: Optional[str] = None,
title: Optional[str] = None,
steps: Optional[List[str]] = None,
step_index: Optional[int] = None,
step_status: Optional[
Literal["not_started", "in_progress", "completed", "blocked"]
] = None,
step_notes: Optional[str] = None,
**kwargs,
):
"""
Execute the planning tool with the given command and parameters.
Parameters:
- command: The operation to perform
- plan_id: Unique identifier for the plan
- title: Title for the plan (used with create command)
- steps: List of steps for the plan (used with create command)
- step_index: Index of the step to update (used with mark_step command)
- step_status: Status to set for a step (used with mark_step command)
- step_notes: Additional notes for a step (used with mark_step command)
"""
if command == "create":
return self._create_plan(plan_id, title, steps)
elif command == "update":
return self._update_plan(plan_id, title, steps)
elif command == "list":
return self._list_plans()
elif command == "get":
return self._get_plan(plan_id)
elif command == "set_active":
return self._set_active_plan(plan_id)
elif command == "mark_step":
return self._mark_step(plan_id, step_index, step_status, step_notes)
elif command == "delete":
return self._delete_plan(plan_id)
else:
raise ToolError(
f"Unrecognized command: {command}. Allowed commands are: create, update, list, get, set_active, mark_step, delete"
)
def _create_plan(
self, plan_id: Optional[str], title: Optional[str], steps: Optional[List[str]]
) -> ToolResult:
"""Create a new plan with the given ID, title, and steps."""
if not plan_id:
raise ToolError("Parameter `plan_id` is required for command: create")
if plan_id in self.plans:
raise ToolError(
f"A plan with ID '{plan_id}' already exists. Use 'update' to modify existing plans."
)
if not title:
raise ToolError("Parameter `title` is required for command: create")
if (
not steps
or not isinstance(steps, list)
or not all(isinstance(step, str) for step in steps)
):
raise ToolError(
"Parameter `steps` must be a non-empty list of strings for command: create"
)
# Create a new plan with initialized step statuses
plan = {
"plan_id": plan_id,
"title": title,
"steps": steps,
"step_statuses": ["not_started"] * len(steps),
"step_notes": [""] * len(steps),
}
self.plans[plan_id] = plan
self._current_plan_id = plan_id # Set as active plan
return ToolResult(
output=f"Plan created successfully with ID: {plan_id}\n\n{self._format_plan(plan)}"
)
def _update_plan(
self, plan_id: Optional[str], title: Optional[str], steps: Optional[List[str]]
) -> ToolResult:
"""Update an existing plan with new title or steps."""
if not plan_id:
raise ToolError("Parameter `plan_id` is required for command: update")
if plan_id not in self.plans:
raise ToolError(f"No plan found with ID: {plan_id}")
plan = self.plans[plan_id]
if title:
plan["title"] = title
if steps:
if not isinstance(steps, list) or not all(
isinstance(step, str) for step in steps
):
raise ToolError(
"Parameter `steps` must be a list of strings for command: update"
)
# Preserve existing step statuses for unchanged steps
old_steps = plan["steps"]
old_statuses = plan["step_statuses"]
old_notes = plan["step_notes"]
# Create new step statuses and notes
new_statuses = []
new_notes = []
for i, step in enumerate(steps):
# If the step exists at the same position in old steps, preserve status and notes
if i < len(old_steps) and step == old_steps[i]:
new_statuses.append(old_statuses[i])
new_notes.append(old_notes[i])
else:
new_statuses.append("not_started")
new_notes.append("")
plan["steps"] = steps
plan["step_statuses"] = new_statuses
plan["step_notes"] = new_notes
return ToolResult(
output=f"Plan updated successfully: {plan_id}\n\n{self._format_plan(plan)}"
)
def _list_plans(self) -> ToolResult:
"""List all available plans."""
if not self.plans:
return ToolResult(
output="No plans available. Create a plan with the 'create' command."
)
output = "Available plans:\n"
for plan_id, plan in self.plans.items():
current_marker = " (active)" if plan_id == self._current_plan_id else ""
completed = sum(
1 for status in plan["step_statuses"] if status == "completed"
)
total = len(plan["steps"])
progress = f"{completed}/{total} steps completed"
output += f"• {plan_id}{current_marker}: {plan['title']} - {progress}\n"
return ToolResult(output=output)
def _get_plan(self, plan_id: Optional[str]) -> ToolResult:
"""Get details of a specific plan."""
if not plan_id:
# If no plan_id is provided, use the current active plan
if not self._current_plan_id:
raise ToolError(
"No active plan. Please specify a plan_id or set an active plan."
)
plan_id = self._current_plan_id
if plan_id not in self.plans:
raise ToolError(f"No plan found with ID: {plan_id}")
plan = self.plans[plan_id]
return ToolResult(output=self._format_plan(plan))
def _set_active_plan(self, plan_id: Optional[str]) -> ToolResult:
"""Set a plan as the active plan."""
if not plan_id:
raise ToolError("Parameter `plan_id` is required for command: set_active")
if plan_id not in self.plans:
raise ToolError(f"No plan found with ID: {plan_id}")
self._current_plan_id = plan_id
return ToolResult(
output=f"Plan '{plan_id}' is now the active plan.\n\n{self._format_plan(self.plans[plan_id])}"
)
def _mark_step(
self,
plan_id: Optional[str],
step_index: Optional[int],
step_status: Optional[str],
step_notes: Optional[str],
) -> ToolResult:
"""Mark a step with a specific status and optional notes."""
if not plan_id:
# If no plan_id is provided, use the current active plan
if not self._current_plan_id:
raise ToolError(
"No active plan. Please specify a plan_id or set an active plan."
)
plan_id = self._current_plan_id
if plan_id not in self.plans:
raise ToolError(f"No plan found with ID: {plan_id}")
if step_index is None:
raise ToolError("Parameter `step_index` is required for command: mark_step")
plan = self.plans[plan_id]
if step_index < 0 or step_index >= len(plan["steps"]):
raise ToolError(
f"Invalid step_index: {step_index}. Valid indices range from 0 to {len(plan['steps'])-1}."
)
if step_status and step_status not in [
"not_started",
"in_progress",
"completed",
"blocked",
]:
raise ToolError(
f"Invalid step_status: {step_status}. Valid statuses are: not_started, in_progress, completed, blocked"
)
if step_status:
plan["step_statuses"][step_index] = step_status
if step_notes:
plan["step_notes"][step_index] = step_notes
return ToolResult(
output=f"Step {step_index} updated in plan '{plan_id}'.\n\n{self._format_plan(plan)}"
)
def _delete_plan(self, plan_id: Optional[str]) -> ToolResult:
"""Delete a plan."""
if not plan_id:
raise ToolError("Parameter `plan_id` is required for command: delete")
if plan_id not in self.plans:
raise ToolError(f"No plan found with ID: {plan_id}")
del self.plans[plan_id]
# If the deleted plan was the active plan, clear the active plan
if self._current_plan_id == plan_id:
self._current_plan_id = None
return ToolResult(output=f"Plan '{plan_id}' has been deleted.")
def _format_plan(self, plan: Dict) -> str:
"""Format a plan for display."""
output = f"Plan: {plan['title']} (ID: {plan['plan_id']})\n"
output += "=" * len(output) + "\n\n"
# Calculate progress statistics
total_steps = len(plan["steps"])
completed = sum(1 for status in plan["step_statuses"] if status == "completed")
in_progress = sum(
1 for status in plan["step_statuses"] if status == "in_progress"
)
blocked = sum(1 for status in plan["step_statuses"] if status == "blocked")
not_started = sum(
1 for status in plan["step_statuses"] if status == "not_started"
)
output += f"Progress: {completed}/{total_steps} steps completed "
if total_steps > 0:
percentage = (completed / total_steps) * 100
output += f"({percentage:.1f}%)\n"
else:
output += "(0%)\n"
output += f"Status: {completed} completed, {in_progress} in progress, {blocked} blocked, {not_started} not started\n\n"
output += "Steps:\n"
# Add each step with its status and notes
for i, (step, status, notes) in enumerate(
zip(plan["steps"], plan["step_statuses"], plan["step_notes"])
):
status_symbol = {
"not_started": "[ ]",
"in_progress": "[→]",
"completed": "[✓]",
"blocked": "[!]",
}.get(status, "[ ]")
output += f"{i}. {status_symbol} {step}\n"
if notes:
output += f" Notes: {notes}\n"
return output
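`_update_plan` preserves a step's status only when the step text is unchanged *and* sits at the same index; every other step resets to `"not_started"`. The rule can be sketched as a small pure function (a hypothetical `merge_statuses`, not part of the tool):

```python
def merge_statuses(old_steps, old_statuses, new_steps):
    """Keep a status only for steps unchanged at the same position."""
    return [
        old_statuses[i] if i < len(old_steps) and step == old_steps[i] else "not_started"
        for i, step in enumerate(new_steps)
    ]

old = ["research", "draft", "review"]
statuses = ["completed", "in_progress", "blocked"]
new = ["research", "draft v2", "review"]
print(merge_statuses(old, statuses, new))
# ['completed', 'not_started', 'blocked'] -- only the edited step resets
```

Note the positional check means a step that merely moves to a different index also loses its status; that matches the tool's current behavior.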
================================================
FILE: app/tool/python_execute.py
================================================
import threading
from typing import Dict
from app.tool.base import BaseTool
class PythonExecute(BaseTool):
"""A tool for executing Python code with timeout and safety restrictions."""
name: str = "python_execute"
description: str = "Executes Python code string. Note: Only print outputs are visible, function return values are not captured. Use print statements to see results."
parameters: dict = {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The Python code to execute.",
},
},
"required": ["code"],
}
async def execute(
self,
code: str,
timeout: int = 5,
) -> Dict:
"""
Executes the provided Python code with a timeout.
Args:
code (str): The Python code to execute.
timeout (int): Execution timeout in seconds.
Returns:
Dict: Contains 'observation' with execution output or error message and 'success' status.
"""
result = {"observation": "", "success": True}
def run_code():
try:
safe_globals = {"__builtins__": dict(__builtins__)}
import sys
from io import StringIO
output_buffer = StringIO()
sys.stdout = output_buffer
try:
exec(code, safe_globals, {})
finally:
sys.stdout = sys.__stdout__  # always restore stdout, even on error
result["observation"] = output_buffer.getvalue()
except Exception as e:
result["observation"] = str(e)
result["success"] = False
thread = threading.Thread(target=run_code)
thread.start()
thread.join(timeout)
if thread.is_alive():
return {
"observation": f"Execution timeout after {timeout} seconds",
"success": False,
}
return result
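`PythonExecute` runs `exec` on a worker thread and abandons it once the timeout expires. A sketch of the same pattern using `contextlib.redirect_stdout`, so stdout is restored even when the executed code raises:

```python
import contextlib
import threading
from io import StringIO

def run_with_timeout(code: str, timeout: float = 5.0) -> dict:
    """Execute `code` on a worker thread; give up after `timeout` seconds."""
    result = {"observation": "", "success": False}

    def worker():
        buf = StringIO()
        try:
            with contextlib.redirect_stdout(buf):  # capture print() output
                exec(code, {"__builtins__": __builtins__}, {})
            result["success"] = True
            result["observation"] = buf.getvalue()
        except Exception as e:
            result["observation"] = str(e)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)  # returns when the thread finishes or the timeout expires
    if t.is_alive():
        return {"observation": f"Execution timeout after {timeout} seconds", "success": False}
    return result

print(run_with_timeout("print(2 + 2)"))  # {'observation': '4\n', 'success': True}
```

One caveat the real tool shares: a thread running `exec` cannot be forcibly killed in Python, so after a timeout the worker keeps running in the background; `daemon=True` merely stops it from blocking interpreter exit.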
================================================
FILE: app/tool/run.py
================================================
"""Utility to run shell commands asynchronously with a timeout."""
import asyncio
TRUNCATED_MESSAGE: str = "To save on context only part of this file has been shown to you. You should retry this tool after you have searched inside the file with `grep -n` in order to find the line numbers of what you are looking for."
MAX_RESPONSE_LEN: int = 16000
def maybe_truncate(content: str, truncate_after: int | None = MAX_RESPONSE_LEN):
"""Truncate content and append a notice if content exceeds the specified length."""
return (
content
if not truncate_after or len(content) <= truncate_after
else content[:truncate_after] + TRUNCATED_MESSAGE
)
async def run(
cmd: str,
timeout: float | None = 120.0, # seconds
truncate_after: int | None = MAX_RESPONSE_LEN,
):
"""Run a shell command asynchronously with a timeout."""
process = await asyncio.create_subprocess_shell(
cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
)
try:
stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=timeout)
return (
process.returncode or 0,
maybe_truncate(stdout.decode(), truncate_after=truncate_after),
maybe_truncate(stderr.decode(), truncate_after=truncate_after),
)
except asyncio.TimeoutError as exc:
try:
process.kill()
except ProcessLookupError:
pass
raise TimeoutError(
f"Command '{cmd}' timed out after {timeout} seconds"
) from exc
================================================
FILE: app/tool/str_replace_editor.py
================================================
from collections import defaultdict
from pathlib import Path
from typing import Literal, get_args
from app.exceptions import ToolError
from app.tool import BaseTool
from app.tool.base import CLIResult, ToolResult
from app.tool.run import run
Command = Literal[
"view",
"create",
"str_replace",
"insert",
"undo_edit",
]
SNIPPET_LINES: int = 4
MAX_RESPONSE_LEN: int = 16000
TRUNCATED_MESSAGE: str = "To save on context only part of this file has been shown to you. You should retry this tool after you have searched inside the file with `grep -n` in order to find the line numbers of what you are looking for."
_STR_REPLACE_EDITOR_DESCRIPTION = """Custom editing tool for viewing, creating and editing files
* State is persistent across command calls and discussions with the user
* If `path` is a file, `view` displays the result of applying `cat -n`. If `path` is a directory, `view` lists non-hidden files and directories up to 2 levels deep
* The `create` command cannot be used if the specified `path` already exists as a file
* If a `command` generates a long output, it will be truncated and a truncation notice will be appended
* The `undo_edit` command will revert the last edit made to the file at `path`
Notes for using the `str_replace` command:
* The `old_str` parameter should match EXACTLY one or more consecutive lines from the original file. Be mindful of whitespaces!
* If the `old_str` parameter is not unique in the file, the replacement will not be performed. Make sure to include enough context in `old_str` to make it unique
* The `new_str` parameter should contain the edited lines that should replace the `old_str`
"""
def maybe_truncate(content: str, truncate_after: int | None = MAX_RESPONSE_LEN):
"""Truncate content and append a notice if content exceeds the specified length."""
return (
content
if not truncate_after or len(content) <= truncate_after
else content[:truncate_after] + TRUNCATED_MESSAGE
)
class StrReplaceEditor(BaseTool):
"""A tool for executing bash commands"""
name: str = "str_replace_editor"
description: str = _STR_REPLACE_EDITOR_DESCRIPTION
parameters: dict = {
"type": "object",
"properties": {
"command": {
"description": "The commands to run. Allowed options are: `view`, `create`, `str_replace`, `insert`, `undo_edit`.",
"enum": ["view", "create", "str_replace", "insert", "undo_edit"],
"type": "string",
},
"path": {
"description": "Absolute path to file or directory.",
"type": "string",
},
"file_text": {
"description": "Required parameter of `create` command, with the content of the file to be created.",
"type": "string",
},
"old_str": {
"description": "Required parameter of `str_replace` command containing the string in `path` to replace.",
"type": "string",
},
"new_str": {
"description": "Optional parameter of `str_replace` command containing the new string (if not given, no string will be added). Required parameter of `insert` command containing the string to insert.",
"type": "string",
},
"insert_line": {
"description": "Required parameter of `insert` command. The `new_str` will be inserted AFTER the line `insert_line` of `path`.",
"type": "integer",
},
"view_range": {
"description": "Optional parameter of `view` command when `path` points to a file. If none is given, the full file is shown. If provided, the file will be shown in the indicated line number range, e.g. [11, 12] will show lines 11 and 12. Indexing at 1 to start. Setting `[start_line, -1]` shows all lines from `start_line` to the end of the file.",
"items": {"type": "integer"},
"type": "array",
},
},
"required": ["command", "path"],
}
_file_history: list = defaultdict(list)
async def execute(
self,
*,
command: Command,
path: str,
file_text: str | None = None,
view_range: list[int] | None = None,
old_str: str | None = None,
new_str: str | None = None,
insert_line: int | None = None,
**kwargs,
) -> str:
_path = Path(path)
self.validate_path(command, _path)
if command == "view":
result = await self.view(_path, view_range)
elif command == "create":
if file_text is None:
raise ToolError("Parameter `file_text` is required for command: create")
self.write_file(_path, file_text)
self._file_history[_path].append(file_text)
result = ToolResult(output=f"File created successfully at: {_path}")
elif command == "str_replace":
if old_str is None:
raise ToolError(
"Parameter `old_str` is required for command: str_replace"
)
result = self.str_replace(_path, old_str, new_str)
elif command == "insert":
if insert_line is None:
raise ToolError(
"Parameter `insert_line` is required for command: insert"
)
if new_str is None:
raise ToolError("Parameter `new_str` is required for command: insert")
result = self.insert(_path, insert_line, new_str)
elif command == "undo_edit":
result = self.undo_edit(_path)
else:
raise ToolError(
f'Unrecognized command {command}. The allowed commands for the {self.name} tool are: {", ".join(get_args(Command))}'
)
return str(result)
def validate_path(self, command: str, path: Path):
"""
Check that the path/command combination is valid.
"""
# Check if its an absolute path
if not path.is_absolute():
suggested_path = Path("") / path
raise ToolError(
f"The path {path} is not an absolute path, it should start with `/`. Maybe you meant {suggested_path}?"
)
# Check if path exists
if not path.exists() and command != "create":
raise ToolError(
f"The path {path} does not exist. Please provide a valid path."
)
if path.exists() and command == "create":
raise ToolError(
f"File already exists at: {path}. Cannot overwrite files using command `create`."
)
# Check if the path points to a directory
if path.is_dir():
if command != "view":
raise ToolError(
f"The path {path} is a directory and only the `view` command can be used on directories"
)
async def view(self, path: Path, view_range: list[int] | None = None):
"""Implement the view command"""
if path.is_dir():
if view_range:
raise ToolError(
"The `view_range` parameter is not allowed when `path` points to a directory."
)
_, stdout, stderr = await run(
rf"find {path} -maxdepth 2 -not -path '*/\.*'"
)
if not stderr:
stdout = f"Here's the files and directories up to 2 levels deep in {path}, excluding hidden items:\n{stdout}\n"
return CLIResult(output=stdout, error=stderr)
file_content = self.read_file(path)
init_line = 1
if view_range:
if len(view_range) != 2 or not all(isinstance(i, int) for i in view_range):
raise ToolError(
"Invalid `view_range`. It should be a list of two integers."
)
file_lines = file_content.split("\n")
n_lines_file = len(file_lines)
init_line, final_line = view_range
if init_line < 1 or init_line > n_lines_file:
raise ToolError(
f"Invalid `view_range`: {view_range}. Its first element `{init_line}` should be within the range of lines of the file: {[1, n_lines_file]}"
)
if final_line > n_lines_file:
raise ToolError(
f"Invalid `view_range`: {view_range}. Its second element `{final_line}` should be smaller than the number of lines in the file: `{n_lines_file}`"
)
if final_line != -1 and final_line < init_line:
raise ToolError(
f"Invalid `view_range`: {view_range}. Its second element `{final_line}` should be larger or equal than its first `{init_line}`"
)
if final_line == -1:
file_content = "\n".join(file_lines[init_line - 1 :])
else:
file_content = "\n".join(file_lines[init_line - 1 : final_line])
return CLIResult(
output=self._make_output(file_content, str(path), init_line=init_line)
)
def str_replace(self, path: Path, old_str: str, new_str: str | None):
"""Implement the str_replace command, which replaces old_str with new_str in the file content"""
# Read the file content
file_content = self.read_file(path).expandtabs()
old_str = old_str.expandtabs()
new_str = new_str.expandtabs() if new_str is not None else ""
# Check if old_str is unique in the file
occurrences = file_content.count(old_str)
if occurrences == 0:
raise ToolError(
f"No replacement was performed, old_str `{old_str}` did not appear verbatim in {path}."
)
elif occurrences > 1:
file_content_lines = file_content.split("\n")
lines = [
idx + 1
for idx, line in enumerate(file_content_lines)
if old_str in line
]
raise ToolError(
f"No replacement was performed. Multiple occurrences of old_str `{old_str}` in lines {lines}. Please ensure it is unique"
)
# Replace old_str with new_str
new_file_content = file_content.replace(old_str, new_str)
# Write the new content to the file
self.write_file(path, new_file_content)
# Save the content to history
self._file_history[path].append(file_content)
# Create a snippet of the edited section
replacement_line = file_content.split(old_str)[0].count("\n")
start_line = max(0, replacement_line - SNIPPET_LINES)
end_line = replacement_line + SNIPPET_LINES + new_str.count("\n")
snippet = "\n".join(new_file_content.split("\n")[start_line : end_line + 1])
# Prepare the success message
success_msg = f"The file {path} has been edited. "
success_msg += self._make_output(
snippet, f"a snippet of {path}", start_line + 1
)
success_msg += "Review the changes and make sure they are as expected. Edit the file again if necessary."
return CLIResult(output=success_msg)
def insert(self, path: Path, insert_line: int, new_str: str):
"""Implement the insert command, which inserts new_str at the specified line in the file content."""
file_text = self.read_file(path).expandtabs()
new_str = new_str.expandtabs()
file_text_lines = file_text.split("\n")
n_lines_file = len(file_text_lines)
if insert_line < 0 or insert_line > n_lines_file:
raise ToolError(
f"Invalid `insert_line` parameter: {insert_line}. It should be within the range of lines of the file: {[0, n_lines_file]}"
)
new_str_lines = new_str.split("\n")
new_file_text_lines = (
file_text_lines[:insert_line]
+ new_str_lines
+ file_text_lines[insert_line:]
)
snippet_lines = (
file_text_lines[max(0, insert_line - SNIPPET_LINES) : insert_line]
+ new_str_lines
+ file_text_lines[insert_line : insert_line + SNIPPET_LINES]
)
new_file_text = "\n".join(new_file_text_lines)
snippet = "\n".join(snippet_lines)
self.write_file(path, new_file_text)
self._file_history[path].append(file_text)
success_msg = f"The file {path} has been edited. "
success_msg += self._make_output(
snippet,
"a snippet of the edited file",
max(1, insert_line - SNIPPET_LINES + 1),
)
success_msg += "Review the changes and make sure they are as expected (correct indentation, no duplicate lines, etc). Edit the file again if necessary."
return CLIResult(output=success_msg)
def undo_edit(self, path: Path):
"""Implement the undo_edit command."""
if not self._file_history[path]:
raise ToolError(f"No edit history found for {path}.")
old_text = self._file_history[path].pop()
self.write_file(path, old_text)
return CLIResult(
output=f"Last edit to {path} undone successfully. {self._make_output(old_text, str(path))}"
)
def read_file(self, path: Path):
"""Read the content of a file from a given path; raise a ToolError if an error occurs."""
try:
return path.read_text()
except Exception as e:
raise ToolError(f"Ran into {e} while trying to read {path}") from None
def write_file(self, path: Path, file: str):
"""Write the content of a file to a given path; raise a ToolError if an error occurs."""
try:
path.write_text(file)
except Exception as e:
raise ToolError(f"Ran into {e} while trying to write to {path}") from None
def _make_output(
self,
file_content: str,
file_descriptor: str,
init_line: int = 1,
expand_tabs: bool = True,
):
"""Generate output for the CLI based on the content of a file."""
file_content = maybe_truncate(file_content)
if expand_tabs:
file_content = file_content.expandtabs()
file_content = "\n".join(
[
f"{i + init_line:6}\t{line}"
for i, line in enumerate(file_content.split("\n"))
]
)
return (
f"Here's the result of running `cat -n` on {file_descriptor}:\n"
+ file_content
+ "\n"
)
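The uniqueness rule that `str_replace` enforces (exactly one occurrence of `old_str`, otherwise refuse) can be illustrated with a small standalone helper (`str_replace_once` is hypothetical, not part of the tool):

```python
def str_replace_once(content: str, old_str: str, new_str: str) -> str:
    """Replace old_str with new_str only if old_str occurs exactly once."""
    occurrences = content.count(old_str)
    if occurrences == 0:
        raise ValueError("old_str did not appear verbatim in the file")
    if occurrences > 1:
        # Refusing here prevents accidental multi-site edits
        raise ValueError(
            f"old_str occurs {occurrences} times; add context to make it unique"
        )
    return content.replace(old_str, new_str)
```

Requiring uniqueness is what makes the edit predictable: the caller must supply enough surrounding context to pin down a single edit site.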
================================================
FILE: app/tool/terminate.py
================================================
from app.tool.base import BaseTool
_TERMINATE_DESCRIPTION = """Terminate the interaction when the request is met OR if the assistant cannot proceed further with the task."""
class Terminate(BaseTool):
name: str = "terminate"
description: str = _TERMINATE_DESCRIPTION
parameters: dict = {
"type": "object",
"properties": {
"status": {
"type": "string",
"description": "The finish status of the interaction.",
"enum": ["success", "failure"],
}
},
"required": ["status"],
}
async def execute(self, status: str) -> str:
"""Finish the current execution"""
return f"The interaction has been completed with status: {status}"
================================================
FILE: app/tool/tool_collection.py
================================================
"""Collection classes for managing multiple tools."""
from typing import Any, Dict, List
from app.exceptions import ToolError
from app.tool.base import BaseTool, ToolFailure, ToolResult
class ToolCollection:
"""A collection of defined tools."""
def __init__(self, *tools: BaseTool):
self.tools = tools
self.tool_map = {tool.name: tool for tool in tools}
def __iter__(self):
return iter(self.tools)
def to_params(self) -> List[Dict[str, Any]]:
return [tool.to_param() for tool in self.tools]
async def execute(
self, *, name: str, tool_input: Dict[str, Any] | None = None
) -> ToolResult:
tool = self.tool_map.get(name)
if not tool:
return ToolFailure(error=f"Tool {name} is invalid")
try:
result = await tool(**(tool_input or {}))
return result
except ToolError as e:
return ToolFailure(error=e.message)
async def execute_all(self) -> List[ToolResult]:
"""Execute all tools in the collection sequentially."""
results = []
for tool in self.tools:
try:
result = await tool()
results.append(result)
except ToolError as e:
results.append(ToolFailure(error=e.message))
return results
def get_tool(self, name: str) -> BaseTool:
return self.tool_map.get(name)
def add_tool(self, tool: BaseTool):
self.tools += (tool,)
self.tool_map[tool.name] = tool
return self
def add_tools(self, *tools: BaseTool):
for tool in tools:
self.add_tool(tool)
return self
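The name-based dispatch that `ToolCollection.execute` performs can be sketched with a minimal stand-in (both classes below are illustrative, not real app tools):

```python
import asyncio


class EchoTool:
    """Stand-in with the minimal interface the collection expects:
    a `name` attribute and an awaitable call."""
    name = "echo"

    async def __call__(self, **kwargs):
        return kwargs.get("text", "")


class MiniCollection:
    """Sketch of ToolCollection's name-based dispatch."""

    def __init__(self, *tools):
        self.tool_map = {tool.name: tool for tool in tools}

    async def execute(self, *, name, tool_input=None):
        tool = self.tool_map.get(name)
        if tool is None:
            # Unknown tool names produce an error value, not an exception
            return f"Tool {name} is invalid"
        return await tool(**(tool_input or {}))


result = asyncio.run(
    MiniCollection(EchoTool()).execute(name="echo", tool_input={"text": "hi"})
)
```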
================================================
FILE: app/utils/log_monitor.py
================================================
import os
import re
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
class LogFileMonitor:
def __init__(self, job_id=None, log_dir="logs"):
# Prefer the task ID from the environment
self.job_id = job_id or os.environ.get("OPENMANUS_TASK_ID")
self.log_dir = log_dir
# Prefer the log file path from the environment
env_log_file = os.environ.get("OPENMANUS_LOG_FILE")
if env_log_file and os.path.exists(env_log_file):
self.log_file = env_log_file
else:
self.log_file = os.path.join(log_dir, f"{self.job_id}.log")
self.generated_files = []
self.log_entries = []
self.file_pattern = re.compile(r"Content successfully saved to (.+)")
self.last_update_time = 0
def start_monitoring(self):
# Ensure the log directory exists
if not os.path.exists(self.log_dir):
os.makedirs(self.log_dir)
# If the log file already exists, read its current content first
if os.path.exists(self.log_file):
try:
with open(self.log_file, "r", encoding="utf-8") as file:
for line in file:
self.parse_log_line(line.strip())
except Exception as e:
print(f"Error reading existing log file: {e}")
# Create an observer to watch the log directory for changes
event_handler = LogEventHandler(self)
observer = Observer()
observer.schedule(event_handler, self.log_dir, recursive=False)
observer.start()
return observer
def parse_log_line(self, line):
# Record the log line
self.log_entries.append(line)
self.last_update_time = time.time()
# Check whether the line mentions a generated file
file_match = self.file_pattern.search(line)
if file_match:
filename = file_match.group(1)
if filename not in self.generated_files:
self.generated_files.append(filename)
def get_generated_files(self):
return self.generated_files
def get_log_entries(self):
return self.log_entries
def get_new_entries_since(self, timestamp):
"""Return log entries added after the given timestamp."""
if not self.log_entries:
return []
# No new entries: return an empty list
if self.last_update_time <= timestamp:
return []
# Collect the newly appended entries
new_entries = []
for i in range(len(self.log_entries) - 1, -1, -1):
# Simplified: assumes all new entries were appended consecutively;
# a real implementation would attach a timestamp to each entry
if i >= len(self.log_entries) - 10:  # Return at most the 10 newest entries
new_entries.insert(0, self.log_entries[i])
else:
break
return new_entries
class LogEventHandler(FileSystemEventHandler):
def __init__(self, monitor):
self.monitor = monitor
self.last_position = 0
def on_modified(self, event):
if not event.is_directory and event.src_path == self.monitor.log_file:
try:
with open(event.src_path, "r", encoding="utf-8") as file:
file.seek(self.last_position)
for line in file:
self.monitor.parse_log_line(line.strip())
self.last_position = file.tell()
except Exception as e:
print(f"Error reading modified log file: {e}")
def on_created(self, event):
# The target log file was just created
if not event.is_directory and event.src_path == self.monitor.log_file:
try:
with open(event.src_path, "r", encoding="utf-8") as file:
for line in file:
self.monitor.parse_log_line(line.strip())
self.last_position = file.tell()
except Exception as e:
print(f"Error reading newly created log file: {e}")
================================================
FILE: app/web/README.md
================================================
# OpenManus Web Application

This is the web interface of the OpenManus project. It provides a friendly UI that lets users interact with the OpenManus AI assistant directly in the browser.

## Key Features

- 🌐 Modern web interface with real-time communication
- 💬 Intuitive chat interface for asking questions and getting AI answers
- 🧠 Visualized thinking process showing every step of the AI's reasoning
- 📁 Workspace file management for viewing and managing AI-generated files
- 📊 Detailed log tracking and monitoring
- 🚀 Support for interrupting and stopping in-flight requests

## Tech Stack

- **Backend**: FastAPI, Python, WebSocket
- **Frontend**: HTML, CSS, JavaScript
- **Communication**: real-time WebSocket messaging
- **Storage**: generated files and logs stored on the file system

## Quick Start

1. Make sure all dependencies are installed:
```bash
pip install -r requirements.txt
```
2. Start the web server:
```bash
python web_run.py
```
Or, from the project root:
```bash
python main.py --web
```
3. Open your browser at: http://localhost:8000
## Project Structure

```
app/web/
├── app.py               # Web app entry point; FastAPI application instance
├── log_handler.py       # Log handling module
├── log_parser.py        # Log parser
├── thinking_tracker.py  # Thinking-process tracker
├── static/              # Static assets (JS, CSS)
│   ├── connected_interface.html  # Main interface HTML
│   ├── connected_interface.js    # Main interface JavaScript
│   └── ...                       # Other static assets
└── templates/           # Jinja2 templates
```
## API Endpoints

### Chat
- `POST /api/chat` - Create a new chat session
- `GET /api/chat/{session_id}` - Get the result of a session
- `POST /api/chat/{session_id}/stop` - Stop processing for a session
- `WebSocket /ws/{session_id}` - Open a WebSocket connection for a session

### Files
- `GET /api/files` - List all workspace directories and files
- `GET /api/files/{file_path}` - Get the content of a specific file

### Logs
- `GET /api/logs` - List system log files
- `GET /api/logs/{log_name}` - Get the content of a specific log file
- `GET /api/logs_parsed` - List parsed log information
- `GET /api/logs_parsed/{log_name}` - Get parsed information for a specific log file
- `GET /api/latest_log` - Get parsed information for the most recent log file
- `GET /api/systemlogs/{session_id}` - Get the system logs for a session

### Thinking Process
- `GET /api/thinking/{session_id}` - Get the thinking steps of a session
- `GET /api/progress/{session_id}` - Get progress information for a session
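A minimal polling sketch for the chat endpoints above. The HTTP call itself is injected as `fetch` (any callable returning the endpoint's parsed JSON), so the sketch does not assume a particular client library or a running server:

```python
import time


def poll_session(fetch, interval=0.5, max_polls=120):
    """Poll GET /api/chat/{session_id} until the session leaves 'processing'.

    `fetch` is a callable returning the endpoint's parsed JSON dict,
    e.g. a wrapper around requests or urllib pointed at the session URL.
    """
    for _ in range(max_polls):
        session = fetch()
        if session["status"] != "processing":
            # Finished, stopped, or errored: hand the payload back
            return session
        time.sleep(interval)
    raise TimeoutError("session did not finish in time")
```

In practice the WebSocket endpoint is preferred for live updates; polling is the fallback when a persistent connection is not available.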
## Interface Overview

The OpenManus web interface has two main areas:

1. **Left panel** - AI thinking process and workspace files
- AI thinking timeline: every step of the AI's processing
- Workspace files: AI-generated files; click a file to view its content
2. **Right panel** - Conversation interface
- Conversation history between the user and the AI
- Input area for questions or instructions

## Local Development

1. Clone the repository
2. Install the dependencies
3. Start the app in development mode:
```bash
uvicorn app.web.app:app --reload
```
or
```bash
python web_run.py
```

## Contributing

Contributions, bug reports, and suggestions are welcome. Please open an Issue or submit a Pull Request.

## License

This project is released under [an open-source license]; see the LICENSE file in the project root for details.

## Support

If you have questions or need help, please open a GitHub Issue.
================================================
FILE: app/web/__init__.py
================================================
"""OpenManus Web应用模块"""
# 该模块包含OpenManus的Web界面实现
================================================
FILE: app/web/api.py
================================================
from flask import Blueprint, jsonify
from ..utils.log_monitor import LogFileMonitor
api_bp = Blueprint("api", __name__)
log_monitors = {}
@api_bp.route("/job//files", methods=["GET"])
def get_job_files(job_id):
"""获取作业生成的文件列表"""
if job_id not in log_monitors:
log_monitors[job_id] = LogFileMonitor(job_id)
log_monitors[job_id].start_monitoring()
files = log_monitors[job_id].get_generated_files()
return jsonify({"files": files})
@api_bp.route("/job//logs", methods=["GET"])
def get_job_logs(job_id):
"""获取作业的系统日志"""
if job_id not in log_monitors:
log_monitors[job_id] = LogFileMonitor(job_id)
log_monitors[job_id].start_monitoring()
logs = log_monitors[job_id].get_log_entries()
return jsonify({"logs": logs})
================================================
FILE: app/web/app.py
================================================
import asyncio
import json
import os
import threading
import time
import uuid
import webbrowser
from pathlib import Path
from typing import Dict
from fastapi import (
BackgroundTasks,
FastAPI,
HTTPException,
Request,
WebSocket,
WebSocketDisconnect,
)
from fastapi.responses import HTMLResponse
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from pydantic import BaseModel
from app.agent.manus import Manus
from app.flow.base import FlowType
from app.flow.flow_factory import FlowFactory
from app.web.log_handler import capture_session_logs, get_logs
from app.web.log_parser import get_all_logs_info, get_latest_log_info, parse_log_file
from app.web.thinking_tracker import ThinkingTracker
# Whether to open the browser automatically (read from env; defaults to True)
AUTO_OPEN_BROWSER = os.environ.get("AUTO_OPEN_BROWSER", "1") == "1"
last_opened = False  # Tracks whether the browser has already been opened
app = FastAPI(title="OpenManus Web")
# Directory containing this file
current_dir = Path(__file__).parent
# Static files directory
app.mount("/static", StaticFiles(directory=current_dir / "static"), name="static")
# Templates directory
templates = Jinja2Templates(directory=current_dir / "templates")
# Active sessions and their results
active_sessions: Dict[str, dict] = {}
# Cancellation events for running tasks
cancel_events: Dict[str, asyncio.Event] = {}
# Workspace root directory
WORKSPACE_ROOT = Path(__file__).parent.parent.parent / "workspace"
WORKSPACE_ROOT.mkdir(exist_ok=True)
# Logs directory
LOGS_DIR = Path(__file__).parent.parent.parent / "logs"
LOGS_DIR.mkdir(exist_ok=True)
# Import the log file monitor
from app.utils.log_monitor import LogFileMonitor
# Active log monitors
active_log_monitors: Dict[str, LogFileMonitor] = {}
def create_workspace(session_id: str) -> Path:
"""Create a workspace directory for a session."""
# Use a shortened session_id as the directory name
job_id = f"job_{session_id[:8]}"
workspace_dir = WORKSPACE_ROOT / job_id
workspace_dir.mkdir(exist_ok=True)
return workspace_dir
@app.on_event("startup")
async def startup_event():
"""启动事件:应用启动时自动打开浏览器"""
global last_opened
if AUTO_OPEN_BROWSER and not last_opened:
# 延迟1秒以确保服务已经启动
threading.Timer(1.0, lambda: webbrowser.open("http://localhost:8000")).start()
print("🌐 自动打开浏览器...")
last_opened = True
class SessionRequest(BaseModel):
prompt: str
@app.get("/", response_class=HTMLResponse)
async def get_home(request: Request):
"""主页入口 - 使用connected界面"""
return HTMLResponse(
content=open(
current_dir / "static" / "connected_interface.html", encoding="utf-8"
).read()
)
@app.get("/original", response_class=HTMLResponse)
async def get_original_interface(request: Request):
"""原始界面入口"""
return templates.TemplateResponse("index.html", {"request": request})
@app.get("/connected", response_class=HTMLResponse)
async def get_connected_interface(request: Request):
"""连接后端的新界面入口 (与主页相同)"""
return HTMLResponse(
content=open(
current_dir / "static" / "connected_interface.html", encoding="utf-8"
).read()
)
@app.post("/api/chat")
async def create_chat_session(
session_req: SessionRequest, background_tasks: BackgroundTasks
):
session_id = str(uuid.uuid4())
active_sessions[session_id] = {
"status": "processing",
"result": None,
"log": [],
"workspace": None,
}
# Create a cancellation event
cancel_events[session_id] = asyncio.Event()
# Create the workspace directory
workspace_dir = create_workspace(session_id)
active_sessions[session_id]["workspace"] = str(
workspace_dir.relative_to(WORKSPACE_ROOT)
)
background_tasks.add_task(process_prompt, session_id, session_req.prompt)
return {
"session_id": session_id,
"workspace": active_sessions[session_id]["workspace"],
}
@app.get("/api/chat/{session_id}")
async def get_chat_result(session_id: str):
if session_id not in active_sessions:
raise HTTPException(status_code=404, detail="Session not found")
# Fetch logs via the log handling module
session = active_sessions[session_id]
session["log"] = get_logs(session_id)
return session
@app.post("/api/chat/{session_id}/stop")
async def stop_processing(session_id: str):
if session_id not in active_sessions:
raise HTTPException(status_code=404, detail="Session not found")
if session_id in cancel_events:
cancel_events[session_id].set()
active_sessions[session_id]["status"] = "stopped"
active_sessions[session_id]["result"] = "处理已被用户停止"
return {"status": "stopped"}
@app.websocket("/ws/{session_id}")
async def websocket_endpoint(websocket: WebSocket, session_id: str):
try:
await websocket.accept()
if session_id not in active_sessions:
await websocket.send_text(json.dumps({"error": "Session not found"}))
await websocket.close()
return
session = active_sessions[session_id]
# Register the WebSocket send callback
async def ws_send(message: str):
try:
await websocket.send_text(message)
except Exception as e:
print(f"Failed to send WebSocket message: {str(e)}")
ThinkingTracker.register_ws_send_callback(session_id, ws_send)
# Include log info in the initial status notification
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"log": session["log"],
"thinking_steps": ThinkingTracker.get_thinking_steps(session_id),
"logs": ThinkingTracker.get_logs(session_id), # 添加日志信息
}
)
)
# Get the workspace name (job_id) - prefer the environment variable
job_id = None
# First check the session's associated workspace
if "workspace" in session:
job_id = session["workspace"]
# Create a log monitor for this session if one does not exist yet
if session_id not in active_log_monitors and job_id:
log_path = LOGS_DIR / f"{job_id}.log"
if log_path.exists():
log_monitor = LogFileMonitor(job_id)
log_monitor.start_monitoring()
active_log_monitors[session_id] = log_monitor
# Track log updates
last_log_entries = []
if job_id and session_id in active_log_monitors:
last_log_entries = active_log_monitors[session_id].get_log_entries()
# Wait for result updates
last_log_count = 0
last_thinking_step_count = 0
last_tracker_log_count = 0  # Counter for ThinkingTracker logs
while session["status"] == "processing":
await asyncio.sleep(0.2)  # Short polling interval for better real-time updates
# Check for system log updates
if job_id and session_id in active_log_monitors:
current_log_entries = active_log_monitors[session_id].get_log_entries()
if len(current_log_entries) > len(last_log_entries):
new_logs = current_log_entries[len(last_log_entries) :]
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"system_logs": new_logs,
# Also surface the system logs as chat messages
"chat_logs": new_logs,
}
)
)
last_log_entries = current_log_entries
# Check for log updates
current_log_count = len(session["log"])
if current_log_count > last_log_count:
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"log": session["log"][last_log_count:],
}
)
)
last_log_count = current_log_count
# Check for thinking-step updates
thinking_steps = ThinkingTracker.get_thinking_steps(session_id)
current_thinking_step_count = len(thinking_steps)
if current_thinking_step_count > last_thinking_step_count:
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"thinking_steps": thinking_steps[last_thinking_step_count:],
}
)
)
last_thinking_step_count = current_thinking_step_count
# Check for ThinkingTracker log updates
tracker_logs = ThinkingTracker.get_logs(session_id)
current_tracker_log_count = len(tracker_logs)
if current_tracker_log_count > last_tracker_log_count:
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"logs": tracker_logs[last_tracker_log_count:],
}
)
)
last_tracker_log_count = current_tracker_log_count
# Check for a result
if session["result"]:
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"result": session["result"],
"log": session["log"][last_log_count:],
"thinking_steps": ThinkingTracker.get_thinking_steps(
session_id, last_thinking_step_count
),
"system_logs": last_log_entries, # 添加系统日志
"logs": ThinkingTracker.get_logs(
session_id, last_tracker_log_count
), # 添加ThinkingTracker日志
}
)
)
break  # Result sent; exit the loop to avoid sending it twice
# Send the final payload only if the loop did not break on a result
if not session["result"]:
await websocket.send_text(
json.dumps(
{
"status": session["status"],
"result": session["result"],
"log": session["log"][last_log_count:],
"thinking_steps": ThinkingTracker.get_thinking_steps(
session_id, last_thinking_step_count
),
"system_logs": last_log_entries, # 添加系统日志
"logs": ThinkingTracker.get_logs(
session_id, last_tracker_log_count
), # 添加ThinkingTracker日志
}
)
)
# Unregister the WebSocket send callback
ThinkingTracker.unregister_ws_send_callback(session_id)
await websocket.close()
except WebSocketDisconnect:
# Client disconnected; this is normal
ThinkingTracker.unregister_ws_send_callback(session_id)
except Exception as e:
# Other exceptions: log them without crashing the app
print(f"WebSocket error: {str(e)}")
ThinkingTracker.unregister_ws_send_callback(session_id)
# LLM communication hook
from app.web.thinking_tracker import ThinkingTracker
# Communication tracker implementation
class LLMCommunicationTracker:
"""Tracks communication with the LLM, using monkey patching instead of callbacks"""
def __init__(self, session_id: str, agent=None):
self.session_id = session_id
self.agent = agent
self.original_completion = None
# Install hooks if an agent was provided
if agent and hasattr(agent, "llm") and hasattr(agent.llm, "completion"):
self.install_hooks()
def install_hooks(self):
"""安装钩子以捕获LLM通信内容"""
if not self.agent or not hasattr(self.agent, "llm"):
return False
# Save the original method
llm = self.agent.llm
if hasattr(llm, "completion"):
self.original_completion = llm.completion
# Replace it with our wrapping method
llm.completion = self._wrap_completion(self.original_completion)
return True
return False
def uninstall_hooks(self):
"""Uninstall the hooks and restore the original method"""
if self.agent and hasattr(self.agent, "llm") and self.original_completion:
self.agent.llm.completion = self.original_completion
def _wrap_completion(self, original_method):
"""Wrap the LLM's completion method to capture input and output"""
session_id = self.session_id
async def wrapped_completion(*args, **kwargs):
# Record the input
prompt = kwargs.get("prompt", "")
if not prompt and args:
prompt = args[0]
if prompt:
ThinkingTracker.add_communication(
session_id,
"Sent to LLM",
prompt[:500] + ("..." if len(prompt) > 500 else ""),
)
# Call the original method
result = await original_method(*args, **kwargs)
# Record the output
if result:
content = result
if isinstance(result, dict) and "content" in result:
content = result["content"]
elif hasattr(result, "content"):
content = result.content
if isinstance(content, str):
ThinkingTracker.add_communication(
session_id,
"Received from LLM",
content[:500] + ("..." if len(content) > 500 else ""),
)
return result
return wrapped_completion
# Import the newly created LLM wrapper
from app.agent.llm_wrapper import LLMCallbackWrapper
# File API with workspace-directory support
@app.get("/api/files")
async def get_generated_files():
"""获取所有工作区目录和文件"""
result = []
# 获取所有工作区目录
workspaces = list(WORKSPACE_ROOT.glob("job_*"))
workspaces.sort(key=lambda p: p.stat().st_mtime, reverse=True)
for workspace in workspaces:
workspace_name = workspace.name
# Gather the workspace's files for sorting by modification time
files = []
with os.scandir(workspace) as it:
for entry in it:
if entry.is_file() and entry.name.split(".")[-1] in [
"txt",
"md",
"html",
"css",
"js",
"py",
"json",
]:
files.append(entry)
# Sort by modification time, newest first
files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
# Include the workspace only if it contains files
if files:
workspace_item = {
"name": workspace_name,
"path": str(workspace.relative_to(Path(__file__).parent.parent.parent)),
"modified": workspace.stat().st_mtime,
"files": [],
}
# Add the files in this workspace
for file in sorted(files, key=lambda p: p.name):
workspace_item["files"].append(
{
"name": file.name,
"path": str(
Path(file.path).relative_to(
Path(__file__).parent.parent.parent
)
),
"type": Path(file.path).suffix[1:], # 去掉.的扩展名
"size": file.stat().st_size,
"modified": file.stat().st_mtime,
}
)
result.append(workspace_item)
return {"workspaces": result}
# Log file endpoints
@app.get("/api/logs")
async def get_system_logs(limit: int = 10):
"""获取系统日志列表"""
log_files = []
for entry in os.scandir(LOGS_DIR):
if entry.is_file() and entry.name.endswith(".log"):
log_files.append(
{
"name": entry.name,
"size": entry.stat().st_size,
"modified": entry.stat().st_mtime,
}
)
# Sort newest first and apply the limit
log_files.sort(key=lambda x: x["modified"], reverse=True)
return {"logs": log_files[:limit]}
@app.get("/api/logs/{log_name}")
async def get_log_content(log_name: str, parsed: bool = False):
"""获取特定日志文件内容"""
log_path = LOGS_DIR / log_name
# Safety check
if not log_path.exists() or not log_path.is_file():
raise HTTPException(status_code=404, detail="Log file not found")
# Return parsed log information if requested
if parsed:
log_info = parse_log_file(str(log_path))
log_info["name"] = log_name
return log_info
# Otherwise return the raw content
with open(log_path, "r", encoding="utf-8") as f:
content = f.read()
return {"name": log_name, "content": content}
@app.get("/api/logs_parsed")
async def get_parsed_logs(limit: int = 10):
"""获取解析后的日志信息列表"""
return {"logs": get_all_logs_info(str(LOGS_DIR), limit)}
@app.get("/api/logs_parsed/{log_name}")
async def get_parsed_log(log_name: str):
"""获取特定日志文件的解析信息"""
log_path = LOGS_DIR / log_name
# Safety check
if not log_path.exists() or not log_path.is_file():
raise HTTPException(status_code=404, detail="Log file not found")
log_info = parse_log_file(str(log_path))
log_info["name"] = log_name
return log_info
@app.get("/api/latest_log")
async def get_latest_log():
"""获取最新日志文件的解析信息"""
return get_latest_log_info(str(LOGS_DIR))
@app.get("/api/files/{file_path:path}")
async def get_file_content(file_path: str):
"""获取特定文件的内容"""
# Security check to prevent directory traversal attacks
root_dir = Path(__file__).parent.parent.parent.resolve()
full_path = (root_dir / file_path).resolve()
# Ensure the file stays inside the project directory. Both paths are
# resolved first because Path.relative_to() is purely lexical, so an
# unresolved path such as "../secret" would otherwise slip through.
try:
full_path.relative_to(root_dir)
except ValueError:
raise HTTPException(status_code=403, detail="Access denied")
if not full_path.exists() or not full_path.is_file():
raise HTTPException(status_code=404, detail="File not found")
# Read the file content
try:
with open(full_path, "r", encoding="utf-8") as f:
content = f.read()
# Determine the file type
file_type = full_path.suffix[1:] if full_path.suffix else "text"
return {
"name": full_path.name,
"path": file_path,
"type": file_type,
"content": content,
}
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error reading file: {str(e)}")
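The traversal guard in the endpoint above only works if both paths are resolved before calling `relative_to()`, since that method compares path components lexically. A minimal, self-contained sketch of the same guard (the root path below is hypothetical, not the project's actual layout):

```python
from pathlib import Path


def is_inside(root: Path, requested: str) -> bool:
    """Return True only if `requested` stays inside `root` after resolution."""
    root = root.resolve()
    candidate = (root / requested).resolve()
    try:
        # relative_to() raises ValueError when candidate lies outside root.
        candidate.relative_to(root)
        return True
    except ValueError:
        return False


# A plain relative path is allowed; a traversal attempt is rejected.
print(is_inside(Path("/opt/openmanus"), "workspace/job_1/notes.md"))  # True
print(is_inside(Path("/opt/openmanus"), "../../etc/passwd"))          # False
```

Without the `.resolve()` calls, `Path("/opt/openmanus/../../etc/passwd").relative_to("/opt/openmanus")` would succeed lexically and the check would pass.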
# process_prompt, extended to handle workspaces
async def process_prompt(session_id: str, prompt: str):
# Get the session's workspace
workspace_dir = None
if session_id in active_sessions and "workspace" in active_sessions[session_id]:
workspace_path = active_sessions[session_id]["workspace"]
workspace_dir = WORKSPACE_ROOT / workspace_path
os.makedirs(workspace_dir, exist_ok=True)
# Create a workspace if none exists
if not workspace_dir:
workspace_dir = create_workspace(session_id)
if session_id in active_sessions:
active_sessions[session_id]["workspace"] = str(
workspace_dir.relative_to(WORKSPACE_ROOT)
)
# Switch the current working directory to the workspace
original_cwd = os.getcwd()
os.chdir(workspace_dir)
# Use the workspace name as the log file name prefix
job_id = workspace_dir.name
# Set the log file path
task_log_path = LOGS_DIR / f"{job_id}.log"
# Create a log monitor and start watching
log_monitor = LogFileMonitor(job_id)
observer = log_monitor.start_monitoring()
active_log_monitors[session_id] = log_monitor
async def sync_logs():
"""定期从LogFileMonitor获取日志并实时更新到ThinkingTracker"""
last_count = 0
try:
while True:
if session_id not in active_log_monitors:
break
current_logs = active_log_monitors[session_id].get_log_entries()
if len(current_logs) > last_count:
# Handle the new log entries
new_logs = current_logs[last_count:]
# Process each new entry individually to keep updates real-time
for log_entry in new_logs:
# Push each entry to ThinkingTracker immediately
ThinkingTracker.add_log_entry(
session_id,
{
"level": log_entry.get("level", "INFO"),
"message": log_entry.get("message", ""),
"timestamp": log_entry.get("timestamp", time.time()),
},
)
last_count = len(current_logs)
# Short polling interval for better responsiveness
await asyncio.sleep(0.1)  # check every 0.1 s
except Exception as e:
print(f"同步日志时发生错误: {str(e)}")
# Start the log sync task
sync_task = asyncio.create_task(sync_logs())
# Tell the logger via environment variables to use this log file (set both)
os.environ["OPENMANUS_LOG_FILE"] = str(task_log_path)
os.environ["OPENMANUS_TASK_ID"] = job_id
try:
# Use the log capture context manager to parse log levels and content
with capture_session_logs(session_id) as log:
# Initialize thinking tracking
ThinkingTracker.start_tracking(session_id)
ThinkingTracker.add_thinking_step(session_id, "开始处理用户请求")
ThinkingTracker.add_thinking_step(
session_id, f"工作区目录: {workspace_dir.name}"
)
# Record the user's prompt directly
ThinkingTracker.add_communication(session_id, "用户输入", prompt)
# Initialize the agent and task flow
ThinkingTracker.add_thinking_step(session_id, "初始化AI代理和任务流程")
agent = Manus()
# Wrap the LLM with the callback wrapper
if hasattr(agent, "llm"):
original_llm = agent.llm
wrapped_llm = LLMCallbackWrapper(original_llm)
# Register callback functions
def on_before_request(data):
# Extract the request content
prompt_content = None
if data.get("args") and len(data["args"]) > 0:
prompt_content = str(data["args"][0])
elif data.get("kwargs") and "prompt" in data["kwargs"]:
prompt_content = data["kwargs"]["prompt"]
else:
prompt_content = str(data)
# Record the communication
print(f"发送到LLM: {prompt_content[:100]}...")
ThinkingTracker.add_communication(
session_id, "发送到LLM", prompt_content
)
def on_after_request(data):
# Extract the response content
response = data.get("response", "")
response_content = ""
# Try to extract text from the different response shapes
if isinstance(response, str):
response_content = response
elif isinstance(response, dict):
if "content" in response:
response_content = response["content"]
elif "text" in response:
response_content = response["text"]
else:
response_content = str(response)
elif hasattr(response, "content"):
response_content = response.content
else:
response_content = str(response)
# Record the communication
print(f"从LLM接收: {response_content[:100]}...")
ThinkingTracker.add_communication(
session_id, "从LLM接收", response_content
)
# Register the callbacks
wrapped_llm.register_callback("before_request", on_before_request)
wrapped_llm.register_callback("after_request", on_after_request)
# Replace the original LLM
agent.llm = wrapped_llm
flow = FlowFactory.create_flow(
flow_type=FlowType.PLANNING,
agents=agent,
)
# Record the start of processing
ThinkingTracker.add_thinking_step(
session_id, f"分析用户请求: {prompt[:50]}{'...' if len(prompt) > 50 else ''}"
)
log.info(f"开始执行: {prompt[:50]}{'...' if len(prompt) > 50 else ''}")
# Check whether the task has been cancelled
cancel_event = cancel_events.get(session_id)
if cancel_event and cancel_event.is_set():
log.warning("处理已被用户取消")
ThinkingTracker.mark_stopped(session_id)
active_sessions[session_id]["status"] = "stopped"
active_sessions[session_id]["result"] = "处理已被用户停止"
return
# Snapshot the workspace files that exist before execution
existing_files = set()
for ext in ["*.txt", "*.md", "*.html", "*.css", "*.js", "*.py", "*.json"]:
existing_files.update(f.name for f in workspace_dir.glob(ext))
# Track plan creation
ThinkingTracker.add_thinking_step(session_id, "创建任务执行计划")
ThinkingTracker.add_thinking_step(session_id, "开始执行任务计划")
# Get the cancel event to pass to flow.execute
cancel_event = cancel_events.get(session_id)
# Initial check: skip execution if already cancelled
if cancel_event and cancel_event.is_set():
log.warning("处理已被用户取消")
ThinkingTracker.mark_stopped(session_id)
active_sessions[session_id]["status"] = "stopped"
active_sessions[session_id]["result"] = "处理已被用户停止"
return
# Run the actual processing - pass job_id and cancel_event to flow.execute
result = await flow.execute(prompt, job_id, cancel_event)
# Check for newly generated files after execution
new_files = set()
for ext in ["*.txt", "*.md", "*.html", "*.css", "*.js", "*.py", "*.json"]:
new_files.update(f.name for f in workspace_dir.glob(ext))
newly_created = new_files - existing_files
if newly_created:
files_list = ", ".join(newly_created)
ThinkingTracker.add_thinking_step(
session_id,
f"在工作区 {workspace_dir.name} 中生成了{len(newly_created)}个文件: {files_list}",
)
# Also attach the file list to the session result
active_sessions[session_id]["generated_files"] = list(newly_created)
# Record completion
log.info("处理完成")
ThinkingTracker.add_conclusion(
session_id, f"任务处理完成!已在工作区 {workspace_dir.name} 中生成结果。"
)
active_sessions[session_id]["status"] = "completed"
active_sessions[session_id]["result"] = result
active_sessions[session_id][
"thinking_steps"
] = ThinkingTracker.get_thinking_steps(session_id)
except asyncio.CancelledError:
# Handle cancellation
print("Processing cancelled")
ThinkingTracker.mark_stopped(session_id)
active_sessions[session_id]["status"] = "stopped"
active_sessions[session_id]["result"] = "处理已被取消"
except Exception as e:
# Handle errors
error_msg = f"Processing error: {str(e)}"
print(error_msg)
ThinkingTracker.add_error(session_id, f"处理遇到错误: {str(e)}")
active_sessions[session_id]["status"] = "error"
active_sessions[session_id]["result"] = f"发生错误: {str(e)}"
finally:
# Restore the original working directory
os.chdir(original_cwd)
# Clear the log file environment variables
if "OPENMANUS_LOG_FILE" in os.environ:
del os.environ["OPENMANUS_LOG_FILE"]
if "OPENMANUS_TASK_ID" in os.environ:
del os.environ["OPENMANUS_TASK_ID"]
# Clean up resources
if (
"agent" in locals()
and hasattr(agent, "llm")
and isinstance(agent.llm, LLMCallbackWrapper)
):
try:
# Remove the callbacks cleanly
if "on_before_request" in locals():
agent.llm._callbacks["before_request"].remove(on_before_request)
if "on_after_request" in locals():
agent.llm._callbacks["after_request"].remove(on_after_request)
except Exception as e:
# ValueError (callback already removed) is covered by Exception
print(f"Error while cleaning up callbacks: {str(e)}")
# Clean up the cancel event
if session_id in cancel_events:
del cancel_events[session_id]
# Stop the monitor if one exists
if session_id in active_log_monitors:
observer.stop()
observer.join(timeout=1)
del active_log_monitors[session_id]
# Cancel the log sync task
if sync_task:
sync_task.cancel()
try:
await sync_task
except asyncio.CancelledError:
pass
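The register-in-`try`, remove-in-`finally` pattern used above can be sketched generically. The wrapper below is a hypothetical stand-in, not the real `LLMCallbackWrapper` API; it shows why removing the callback in `finally` keeps the wrapped object reusable even when a request fails:

```python
class CallbackWrapper:
    """Minimal sketch of a before/after callback registry (hypothetical API)."""

    def __init__(self, inner):
        self._inner = inner
        self._callbacks = {"before_request": [], "after_request": []}

    def register_callback(self, event, func):
        self._callbacks[event].append(func)

    def request(self, prompt):
        for cb in self._callbacks["before_request"]:
            cb(prompt)
        result = self._inner(prompt)
        for cb in self._callbacks["after_request"]:
            cb(result)
        return result


seen = []
wrapper = CallbackWrapper(lambda p: p.upper())
hook = seen.append
wrapper.register_callback("before_request", hook)
try:
    wrapper.request("hello")
finally:
    # Removal in finally guarantees the hook is gone even if request() raised.
    wrapper._callbacks["before_request"].remove(hook)

print(seen)  # ['hello']
```

After the `finally` block runs, further `wrapper.request(...)` calls no longer touch `seen`, mirroring how the session cleanup detaches the ThinkingTracker callbacks.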
# Endpoint for retrieving thinking steps
@app.get("/api/thinking/{session_id}")
async def get_thinking_steps(session_id: str, start_index: int = 0):
if session_id not in active_sessions:
raise HTTPException(status_code=404, detail="Session not found")
return {
"status": ThinkingTracker.get_status(session_id),
"thinking_steps": ThinkingTracker.get_thinking_steps(session_id, start_index),
}
# Endpoint for progress information
@app.get("/api/progress/{session_id}")
async def get_progress(session_id: str):
if session_id not in active_sessions:
raise HTTPException(status_code=404, detail="Session not found")
return ThinkingTracker.get_progress(session_id)
# Endpoint for the system logs of a specific session
@app.get("/api/systemlogs/{session_id}")
async def get_session_system_logs(session_id: str):
"""Get the system logs for a specific session."""
if session_id not in active_sessions:
raise HTTPException(status_code=404, detail="Session not found")
job_id = None
if "workspace" in active_sessions[session_id]:
workspace_path = active_sessions[session_id]["workspace"]
job_id = workspace_path
if not job_id:
return {"logs": []}
# Use the monitor if one is active
if session_id in active_log_monitors:
logs = active_log_monitors[session_id].get_log_entries()
return {"logs": logs}
# Otherwise read the log file directly
log_path = LOGS_DIR / f"{job_id}.log"
if not log_path.exists():
return {"logs": []}
try:
with open(log_path, "r", encoding="utf-8") as f:
logs = [line.strip() for line in f.readlines()]
return {"logs": logs}
except Exception as e:
return {"error": f"Error reading log file: {str(e)}"}
================================================
FILE: app/web/llm_monitor.py
================================================
"""
LLM通信监控模块,用于捕获和模拟与LLM的通信内容
"""
import asyncio
import random
import time
from functools import wraps
from typing import Any, Callable, Dict, List, Optional
class LLMMonitor:
"""LLM通信监控器,支持多种方式追踪LLM通信"""
def __init__(self):
self.interceptors = []
self.communications = []
def register_interceptor(self, func: Callable):
"""注册一个拦截器函数,该函数将在每次通信时调用"""
self.interceptors.append(func)
return func  # so it can also be used as a decorator
def record_communication(self, direction: str, content: Any):
"""记录通信内容"""
comm_record = {
"direction": direction, # "in" 或 "out"
"content": str(content)[:1000], # 限制长度
"timestamp": time.time(),
}
self.communications.append(comm_record)
# Notify all interceptors
for interceptor in self.interceptors:
try:
interceptor(comm_record)
except Exception as e:
print(f"拦截器错误: {str(e)}")
def get_communications(self, start_idx: int = 0) -> List[Dict[str, Any]]:
"""获取通信记录"""
return self.communications[start_idx:]
def clear(self):
"""清除所有通信记录"""
self.communications = []
def intercept_method(self, obj, method_name):
"""拦截对象的方法调用"""
if not hasattr(obj, method_name):
return False
original_method = getattr(obj, method_name)
@wraps(original_method)
async def wrapped_method(*args, **kwargs):
# Record the input
input_data = str(args[0]) if args else str(kwargs)
self.record_communication("in", input_data)
# Call the original method
result = await original_method(*args, **kwargs)
# Record the output
self.record_communication("out", result)
return result
# Replace the original method
setattr(obj, method_name, wrapped_method)
return True
# A global monitor instance
monitor = LLMMonitor()
# Helpers that simulate an LLM, useful for demos and tests
async def simulate_llm_thinking(
prompt: str, callback: Optional[Callable] = None, steps: int = 5, delay: float = 1.0
):
"""模拟LLM思考过程,产生一系列思考步骤"""
# Record the input
monitor.record_communication("in", prompt)
thinking_steps = ["分析问题需求", "检索相关知识", "整理和组织信息", "撰写初步答案", "检查和优化答案", "生成最终回复"]
# Adjust the thinking steps based on the prompt
if "代码" in prompt or "编程" in prompt:
thinking_steps = ["理解代码需求", "设计代码结构", "编写核心函数", "实现错误处理", "测试代码功能", "优化代码效率"]
# Keep the number of steps reasonable
actual_steps = min(steps, len(thinking_steps))
# Simulate the thinking process
for i in range(actual_steps):
step_msg = thinking_steps[i]
if callback:
callback(step_msg)
await asyncio.sleep(delay * (0.5 + random.random()))
# Generate the answer
result = f"这是对问题「{prompt[:30]}...」的回答。\n\n"
result += "根据我的分析,有以下几点建议:\n"
result += "1. 首先,确认问题的核心\n"
result += "2. 接下来,分析可能的解决方案\n"
result += "3. 最后,选择最合适的方法实施"
# Record the output
monitor.record_communication("out", result)
return result
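`intercept_method` above monkey-patches an async method on an instance so every call is recorded around the original. A self-contained sketch of that pattern, using a dummy `Echo` client in place of a real LLM object (all names here are illustrative):

```python
import asyncio
from functools import wraps


class MiniMonitor:
    """Minimal stand-in for LLMMonitor's interception pattern."""

    def __init__(self):
        self.communications = []

    def intercept_method(self, obj, method_name):
        original = getattr(obj, method_name)  # bound method

        @wraps(original)
        async def wrapped(*args, **kwargs):
            self.communications.append(("in", str(args[0]) if args else str(kwargs)))
            result = await original(*args, **kwargs)
            self.communications.append(("out", str(result)))
            return result

        # Set an instance attribute, shadowing the class method.
        setattr(obj, method_name, wrapped)


class Echo:
    async def completion(self, prompt):
        return f"echo: {prompt}"


monitor = MiniMonitor()
client = Echo()
monitor.intercept_method(client, "completion")
reply = asyncio.run(client.completion("hi"))
print(reply)                   # echo: hi
print(monitor.communications)  # [('in', 'hi'), ('out', 'echo: hi')]
```

Because `getattr` returns a bound method, the wrapper can call `original(*args)` without passing `self`, and `setattr` only affects the one instance being monitored.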
================================================
FILE: app/web/log_handler.py
================================================
"""
简单的日志处理模块,用于Web应用日志捕获
"""
import threading
from contextlib import contextmanager
from datetime import datetime
from typing import Dict, List
from loguru import logger
# Global log storage
session_logs: Dict[str, List[Dict]] = {}
_lock = threading.Lock()
# Custom log handler that stores records grouped by session ID
class SessionLogHandler:
def __init__(self, session_id: str):
self.session_id = session_id
def __call__(self, message):
# A loguru sink receives a Message object; the structured data is on .record
record = message.record
log_entry = {
"time": record["time"].strftime("%Y-%m-%d %H:%M:%S.%f"),
"level": record["level"].name,
"message": record["message"],
"timestamp": record["time"].timestamp(),
}
with _lock:
if self.session_id not in session_logs:
session_logs[self.session_id] = []
session_logs[self.session_id].append(log_entry)
class SimpleLogCapture:
"""简单的日志捕获器,提供类似logger的接口"""
def __init__(self, session_id: str):
self.session_id = session_id
def info(self, message: str) -> None:
"""记录信息级别日志"""
add_log(self.session_id, "INFO", message)
logger.info(message)
def warning(self, message: str) -> None:
"""记录警告级别日志"""
add_log(self.session_id, "WARNING", message)
logger.warning(message)
def error(self, message: str) -> None:
"""记录错误级别日志"""
add_log(self.session_id, "ERROR", message)
logger.error(message)
def debug(self, message: str) -> None:
"""记录调试级别日志"""
add_log(self.session_id, "DEBUG", message)
logger.debug(message)
def exception(self, message: str) -> None:
"""记录异常级别日志"""
add_log(self.session_id, "ERROR", message)
logger.exception(message)
@contextmanager
def capture_session_logs(session_id: str):
"""
上下文管理器,用于捕获指定会话的日志
返回一个SimpleLogCapture实例,而不是直接返回日志列表
"""
# Create log storage for this session
with _lock:
if session_id not in session_logs:
session_logs[session_id] = []
# Add a session-specific log handler
handler_id = logger.add(SessionLogHandler(session_id))
# Create a simple log capture helper
log_capture = SimpleLogCapture(session_id)
try:
# Yield the log capture helper instead of the raw log list
yield log_capture
finally:
# Remove the temporarily added handler
logger.remove(handler_id)
def add_log(session_id: str, level: str, message: str) -> None:
"""添加日志到指定会话"""
with _lock:
if session_id not in session_logs:
session_logs[session_id] = []
session_logs[session_id].append(
{
"time": datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f"),
"level": level,
"message": message,
"timestamp": datetime.now().timestamp(),
}
)
def get_logs(session_id: str) -> List[Dict]:
"""获取指定会话的日志"""
with _lock:
return session_logs.get(session_id, [])[:]
def clear_logs(session_id: str) -> None:
"""清除指定会话的日志"""
with _lock:
if session_id in session_logs:
session_logs[session_id] = []
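`capture_session_logs` follows the standard add-handler / yield / remove-handler shape. An analogous self-contained sketch using the stdlib `logging` module instead of loguru (so no third-party dependency; the logger name is arbitrary):

```python
import logging
from contextlib import contextmanager


@contextmanager
def capture_logs(logger_name: str):
    """Attach a temporary in-memory handler, yield its record list, detach on exit."""
    records = []

    class ListHandler(logging.Handler):
        def emit(self, record):
            records.append(record.getMessage())

    logger = logging.getLogger(logger_name)
    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    try:
        yield records
    finally:
        # Counterpart of logger.remove(handler_id) in the loguru version.
        logger.removeHandler(handler)


with capture_logs("demo") as captured:
    logging.getLogger("demo").info("step one")

logging.getLogger("demo").info("after exit")  # no longer captured
print(captured)  # ['step one']
```

The `finally` clause guarantees the handler is detached even if the body raises, which is the same reason the loguru version removes its handler in `finally`.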
================================================
FILE: app/web/log_parser.py
================================================
"""
Log parser module for extracting execution information from OpenManus log files.
"""
import os
import re
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List
class LogParser:
"""Parser for OpenManus log files to extract execution status and progress."""
def __init__(self, log_path: str):
"""
Initialize the log parser with a log file path.
Args:
log_path: Path to the log file to parse
"""
self.log_path = log_path
self.log_content = ""
self.plan_id = None
self.plan_title = ""
self.steps = []
self.step_statuses = []
self.current_step = 0
self.total_steps = 0
self.completed_steps = 0
self.tool_executions = []
self.errors = []
self.warnings = []
def parse(self) -> Dict[str, Any]:
"""
Parse the log file and extract execution information.
Returns:
Dict containing parsed information about the execution
"""
try:
with open(self.log_path, "r", encoding="utf-8") as f:
self.log_content = f.read()
# Extract plan information
self._extract_plan_info()
# Extract step information
self._extract_step_info()
# Extract tool executions
self._extract_tool_executions()
# Extract errors and warnings
self._extract_errors_warnings()
return {
"plan_id": self.plan_id,
"plan_title": self.plan_title,
"steps": self.steps,
"step_statuses": self.step_statuses,
"current_step": self.current_step,
"total_steps": self.total_steps,
"completed_steps": self.completed_steps,
"progress_percentage": self._calculate_progress(),
"tool_executions": self.tool_executions,
"errors": self.errors,
"warnings": self.warnings,
"timestamp": self._extract_timestamp(),
"status": self._determine_status(),
}
except Exception as e:
return {"error": f"Failed to parse log file: {str(e)}", "status": "error"}
def _extract_plan_info(self) -> None:
"""Extract plan ID and title from the log."""
# Extract plan ID
plan_id_match = re.search(
r"Creating initial plan with ID: (plan_\d+)", self.log_content
)
if plan_id_match:
self.plan_id = plan_id_match.group(1)
# Extract plan title
plan_title_match = re.search(r"Plan: (.*?) \(ID: plan_\d+\)", self.log_content)
if plan_title_match:
self.plan_title = plan_title_match.group(1)
def _extract_step_info(self) -> None:
"""Extract step information from the log."""
# Extract steps list
steps_section = re.search(
r"Steps:\n(.*?)(?:\n\n|\Z)", self.log_content, re.DOTALL
)
if steps_section:
steps_text = steps_section.group(1)
step_lines = steps_text.strip().split("\n")
for line in step_lines:
# Match step pattern: "0. [ ] Define the objective of task 11"
step_match = re.match(r"\d+\.\s+\[([ ✓→!])\]\s+(.*)", line)
if step_match:
status_symbol = step_match.group(1)
step_text = step_match.group(2)
self.steps.append(step_text)
# Convert status symbol to status text
if status_symbol == "✓":
self.step_statuses.append("completed")
self.completed_steps += 1
elif status_symbol == "→":
self.step_statuses.append("in_progress")
elif status_symbol == "!":
self.step_statuses.append("blocked")
else: # Empty space
self.step_statuses.append("not_started")
# Extract total steps
self.total_steps = len(self.steps)
# Extract current step from execution logs
current_step_matches = re.findall(
r"Executing step (\d+)/(\d+)", self.log_content
)
if current_step_matches:
# Get the latest execution step
latest_match = current_step_matches[-1]
self.current_step = int(latest_match[0])
# Update total steps if available from execution log
if int(latest_match[1]) > self.total_steps:
self.total_steps = int(latest_match[1])
# Extract completed steps from marking logs
completed_step_matches = re.findall(
r"Marked step (\d+) as completed", self.log_content
)
if completed_step_matches:
# Count unique completed steps
self.completed_steps = len(set(completed_step_matches))
def _extract_tool_executions(self) -> None:
"""Extract tool execution information from the log."""
# Match tool execution patterns
tool_patterns = [
r"🛠️ Manus selected \d+ tools to use",
r"🧰 Tools being prepared: \['([^']+)'\]",
r"🔧 Activating tool: '([^']+)'...",
r"🎯 Tool '([^']+)' completed its mission!",
]
for pattern in tool_patterns:
matches = re.finditer(pattern, self.log_content)
for match in matches:
if "'" in pattern:
tool_name = match.group(1)
self.tool_executions.append(
{
"tool": tool_name,
"timestamp": self._extract_timestamp_for_line(
match.group(0)
),
}
)
else:
self.tool_executions.append(
{
"action": match.group(0),
"timestamp": self._extract_timestamp_for_line(
match.group(0)
),
}
)
def _extract_errors_warnings(self) -> None:
"""Extract errors and warnings from the log."""
# Extract errors (ERROR level logs)
error_matches = re.finditer(
r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \| ERROR\s+\| (.*)",
self.log_content,
)
for match in error_matches:
self.errors.append(
{
"message": match.group(1),
"timestamp": self._extract_timestamp_for_line(match.group(0)),
}
)
# Extract warnings (WARNING level logs)
warning_matches = re.finditer(
r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \| WARNING\s+\| (.*)",
self.log_content,
)
for match in warning_matches:
self.warnings.append(
{
"message": match.group(1),
"timestamp": self._extract_timestamp_for_line(match.group(0)),
}
)
def _calculate_progress(self) -> int:
"""Calculate the progress percentage based on completed steps."""
if self.total_steps == 0:
return 0
return min(int((self.completed_steps / self.total_steps) * 100), 100)
def _extract_timestamp(self) -> str:
"""Extract the timestamp from the log file name or content."""
# Try to extract from filename first (format: YYYYMMDD_HHMMSS.log)
filename = os.path.basename(self.log_path)
timestamp_match = re.search(r"(\d{8}_\d{6})\.log", filename)
if timestamp_match:
timestamp_str = timestamp_match.group(1)
try:
dt = datetime.strptime(timestamp_str, "%Y%m%d_%H%M%S")
return dt.isoformat()
except ValueError:
pass
# Try to extract from first log line
first_line_match = re.search(
r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)", self.log_content
)
if first_line_match:
timestamp_str = first_line_match.group(1)
try:
dt = datetime.strptime(timestamp_str, "%Y-%m-%d %H:%M:%S.%f")
return dt.isoformat()
except ValueError:
pass
# Fallback to file modification time
try:
mtime = os.path.getmtime(self.log_path)
return datetime.fromtimestamp(mtime).isoformat()
except OSError:
return datetime.now().isoformat()
def _extract_timestamp_for_line(self, line: str) -> str:
"""Extract timestamp for a specific log line."""
timestamp_match = re.search(
r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)", line
)
if timestamp_match:
timestamp_str = timestamp_match.group(1)
try:
dt = datetime.strptime(timestamp_str, "%Y-%m-%d %H:%M:%S.%f")
return dt.isoformat()
except ValueError:
pass
return ""
def _determine_status(self) -> str:
"""Determine the overall status of the execution."""
if self.errors:
return "error"
# Check for completion markers
if (
"task processing completed" in self.log_content.lower()
or "plan completed" in self.log_content.lower()
):
return "completed"
# Check if all steps are completed
if self.completed_steps >= self.total_steps and self.total_steps > 0:
return "completed"
# Check for termination
if (
"terminate" in self.log_content.lower()
and "completed its mission" in self.log_content.lower()
):
return "completed"
# Default to in_progress
return "in_progress"
def parse_log_file(log_path: str) -> Dict[str, Any]:
"""
Parse a single log file and return the execution information.
Args:
log_path: Path to the log file
Returns:
Dict containing parsed information about the execution
"""
parser = LogParser(log_path)
return parser.parse()
def get_latest_log_info(logs_dir: str = None) -> Dict[str, Any]:
"""
Get information from the latest log file.
Args:
logs_dir: Directory containing log files (default: project's logs directory)
Returns:
Dict containing parsed information about the latest execution
"""
if logs_dir is None:
# Default to project's logs directory
logs_dir = Path(__file__).parent.parent.parent / "logs"
# Find the latest log file
log_files = []
for entry in os.scandir(logs_dir):
if entry.is_file() and entry.name.endswith(".log"):
log_files.append({"path": entry.path, "modified": entry.stat().st_mtime})
if not log_files:
return {"error": "No log files found", "status": "unknown"}
# Sort by modification time (newest first)
log_files.sort(key=lambda x: x["modified"], reverse=True)
latest_log = log_files[0]["path"]
# Parse the latest log file
return parse_log_file(latest_log)
def get_all_logs_info(logs_dir: str = None, limit: int = 10) -> List[Dict[str, Any]]:
"""
Get information from all log files, sorted by modification time (newest first).
Args:
logs_dir: Directory containing log files (default: project's logs directory)
limit: Maximum number of logs to return
Returns:
List of dicts containing parsed information about each execution
"""
if logs_dir is None:
# Default to project's logs directory
logs_dir = Path(__file__).parent.parent.parent / "logs"
# Find all log files
log_files = []
for entry in os.scandir(logs_dir):
if entry.is_file() and entry.name.endswith(".log"):
log_files.append({"path": entry.path, "modified": entry.stat().st_mtime})
if not log_files:
return [{"error": "No log files found", "status": "unknown"}]
# Sort by modification time (newest first)
log_files.sort(key=lambda x: x["modified"], reverse=True)
# Parse each log file (up to the limit)
results = []
for log_file in log_files[:limit]:
log_info = parse_log_file(log_file["path"])
log_info["file_path"] = log_file["path"]
log_info["file_name"] = os.path.basename(log_file["path"])
results.append(log_info)
return results
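The step-status matching in `_extract_step_info` can be exercised in isolation. The plan lines below are illustrative samples in the format the parser expects (`N. [symbol] text`):

```python
import re

STEP_RE = re.compile(r"\d+\.\s+\[([ ✓→!])\]\s+(.*)")
STATUS = {"✓": "completed", "→": "in_progress", "!": "blocked", " ": "not_started"}

plan_lines = [
    "0. [✓] Define the game rules",
    "1. [→] Implement the core loop",
    "2. [ ] Add sound effects",
]

parsed = []
for line in plan_lines:
    m = STEP_RE.match(line)
    if m:
        # group(1) is the status symbol, group(2) the step description
        parsed.append((STATUS[m.group(1)], m.group(2)))

print(parsed)
# [('completed', 'Define the game rules'),
#  ('in_progress', 'Implement the core loop'),
#  ('not_started', 'Add sound effects')]
```

The character class `[ ✓→!]` is what ties the log format to the four status values, so any new plan-step symbol must be added in both places.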
================================================
FILE: app/web/static/archive/apiManager.js
================================================
import { addMessage } from './chatManager.js';
import { connectWebSocket } from './websocketManager.js';
import { updateLog } from './logManager.js';
let processingRequest = false;
export async function sendRequest(prompt) {
if (!prompt || processingRequest) return;
processingRequest = true;
addMessage(prompt, 'user');
document.getElementById('user-input').value = '';
document.getElementById('send-btn').disabled = true;
document.getElementById('stop-btn').disabled = false;
document.getElementById('status-indicator').textContent = '正在处理您的请求...';
try {
const response = await fetch('/api/chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({ prompt })
});
if (!response.ok) {
throw new Error('网络请求失败');
}
const data = await response.json();
connectWebSocket(data.session_id);
pollResults(data.session_id);
} catch (error) {
console.error('Error:', error);
document.getElementById('status-indicator').textContent = '发生错误: ' + error.message;
document.getElementById('send-btn').disabled = false;
document.getElementById('stop-btn').disabled = true;
processingRequest = false;
}
}
export async function pollResults(sessionId) {
let attempts = 0;
const maxAttempts = 60;
const poll = async () => {
if (attempts >= maxAttempts || !processingRequest) {
if (attempts >= maxAttempts) {
document.getElementById('status-indicator').textContent = '请求超时';
}
document.getElementById('send-btn').disabled = false;
document.getElementById('stop-btn').disabled = true;
processingRequest = false;
return;
}
try {
const response = await fetch(`/api/chat/${sessionId}`);
if (!response.ok) {
throw new Error('获取结果失败');
}
const data = await response.json();
if (data.status === 'completed') {
if (data.result && !chatContainsResult(data.result)) {
addMessage(data.result, 'ai');
}
document.getElementById('status-indicator').textContent = '';
document.getElementById('send-btn').disabled = false;
document.getElementById('stop-btn').disabled = true;
processingRequest = false;
return;
} else if (data.status === 'error') {
document.getElementById('status-indicator').textContent = '处理请求时发生错误';
document.getElementById('send-btn').disabled = false;
document.getElementById('stop-btn').disabled = true;
processingRequest = false;
return;
} else if (data.status === 'stopped') {
document.getElementById('status-indicator').textContent = '处理已停止';
document.getElementById('send-btn').disabled = false;
document.getElementById('stop-btn').disabled = true;
processingRequest = false;
return;
}
if (data.log && data.log.length > 0) {
updateLog(data.log);
}
attempts++;
setTimeout(poll, 3000);
try {
const terminalResponse = await fetch(`/api/terminal/${sessionId}`);
if (terminalResponse.ok) {
const terminalData = await terminalResponse.json();
if (terminalData.terminal_output && terminalData.terminal_output.length > 0) {
updateTerminalOutput(terminalData.terminal_output);
}
}
} catch (terminalError) {
console.error('获取终端输出错误:', terminalError);
}
} catch (error) {
console.error('轮询错误:', error);
attempts++;
setTimeout(poll, 3000);
}
};
setTimeout(poll, 3000);
}
export function updateThinkingSteps(steps) {
if (!Array.isArray(steps) || steps.length === 0) return;
const thinkingStepsContainer = document.getElementById('thinking-steps');
steps.forEach(step => {
const existingStep = document.querySelector(`.thinking-step[data-timestamp="${step.timestamp}"]`);
if (existingStep) return;
const stepElement = document.createElement('div');
stepElement.className = `thinking-step ${step.type}`;
stepElement.dataset.timestamp = step.timestamp;
const stepContent = document.createElement('div');
stepContent.className = 'thinking-step-content';
if (step.type === 'communication') {
const headerDiv = document.createElement('div');
headerDiv.className = 'communication-header';
headerDiv.innerHTML = `${step.message}<span class="toggle-icon">▶</span>`;
headerDiv.onclick = function() {
const detailsElement = this.nextElementSibling;
const toggleIcon = this.querySelector('.toggle-icon');
if (detailsElement.style.display === 'none' || !detailsElement.style.display) {
detailsElement.style.display = 'block';
toggleIcon.textContent = '▼';
} else {
detailsElement.style.display = 'none';
toggleIcon.textContent = '▶';
}
};
const detailsElement = document.createElement('div');
detailsElement.className = 'communication-details';
detailsElement.style.display = 'none';
if (step.message.includes("发送到LLM")) {
detailsElement.innerHTML = `
7. **Sound Integration:** Incorporated sound effects and background music to enhance the gaming experience.
8. **Testing & Debugging:** Conducted thorough testing of game functionalities and resolved any identified bugs.
9. **Performance Optimization:** Improved overall game performance and user playability.
10. **Game Release:** Successfully published the game and gathered user feedback for potential improvements.
**Final Thoughts:**
The development of the网页版贪蛇游戏 (web-based Snake Game) was a comprehensive process that saw all planned steps executed effectively. The combination of clear rule definition, strategic development choices, and user-centric design contributed to a successful project outcome. Feedback from users will be valuable for future iterations and enhancements. Overall, this project demonstrates a solid execution of game development concepts and practices.
snake_game_technology_stack.txt
Technology Stack for Snake Game Development:
1. **HTML**: For structuring the game interface.
2. **CSS**: For styling the game layout and visuals.
3. **JavaScript**: For implementing game logic and interactivity.
4. **Game Libraries/Frameworks**:
- **p5.js**: A library for creative coding that simplifies graphics and interaction.
- **Phaser**: A fast and free open-source framework for Canvas and WebGL games.
- **Three.js** (if considering 3D elements): A library that makes 3D rendering easier.
Next Steps:
1. Review the Suggested Technologies.
2. Select the Final Tech Stack.
3. Document Your Choices.
================================================
FILE: app/web/static/archive/new_interface_demo.html
================================================
OpenManus Web - New Interface
7. **Sound Integration:** Incorporated sound effects and background music to enhance the gaming experience.
8. **Testing & Debugging:** Conducted thorough testing of game functionalities and resolved any identified bugs.
9. **Performance Optimization:** Improved overall game performance and playability.
10. **Game Release:** Successfully published the game and gathered user feedback for potential improvements.
**Final Thoughts:**
The development of the web-based Snake Game (网页版贪吃蛇游戏) was a comprehensive process in which all planned steps were executed effectively. The combination of clear rule definition, strategic development choices, and user-centric design contributed to a successful project outcome. User feedback will be valuable for future iterations and enhancements. Overall, this project demonstrates a solid execution of game development concepts and practices.
snake_game_technology_stack.txt
Technology Stack for Snake Game Development:
1. **HTML**: For structuring the game interface.
2. **CSS**: For styling the game layout and visuals.
3. **JavaScript**: For implementing game logic and interactivity.
4. **Game Libraries/Frameworks**:
- **p5.js**: A library for creative coding that simplifies graphics and interaction.
- **Phaser**: A fast and free open-source framework for Canvas and WebGL games.
- **Three.js** (if considering 3D elements): A library that makes 3D rendering easier.
Next Steps:
1. Review the Suggested Technologies.
2. Select the Final Tech Stack.
3. Document Your Choices.