Full Code of onebirdrocks/ebook-mcp for AI

Repository: onebirdrocks/ebook-mcp
Branch: main
Commit: d44056ec3f1d
Files: 41
Total size: 251.1 KB

Directory structure:
gitextract_aguziwn7/

├── .github/
│   └── workflows/
│       └── cli.yml
├── .gitignore
├── .kiro/
│   ├── hooks/
│   │   └── readme-translation-hook.kiro.hook
│   └── steering/
│       └── ebook-mcp-server-guide.md
├── .python-version
├── .vscode/
│   └── settings.json
├── CHANGELOG.md
├── HOW-TO-TEST.md
├── LICENSE
├── README-CN.md
├── README-DE.md
├── README-FR.md
├── README-JP.md
├── README-KR.md
├── README.md
├── pyproject.toml
├── pytest.ini
├── run_tests.sh
└── src/
    └── ebook_mcp/
        ├── __init__.py
        ├── main.py
        ├── mcp_client_example/
        │   ├── .gitignore
        │   ├── README-CN.md
        │   ├── README.md
        │   ├── anthropic_example.py
        │   ├── deepseek_example.py
        │   ├── openai_example.py
        │   ├── pyproject.toml
        │   └── requirements.txt
        ├── tests/
        │   ├── conftest.py
        │   ├── run_tests.py
        │   ├── test_azw.py
        │   ├── test_basic.py
        │   ├── test_epub_chapter_extraction.py
        │   ├── test_epub_helper.py
        │   ├── test_logger_config.py
        │   ├── test_main.py
        │   └── test_pdf_helper.py
        └── tools/
            ├── epub_helper.py
            ├── logger_config.py
            ├── pdf_helper.py
            └── pdf_helper.py.backup

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/workflows/cli.yml
================================================
name: Publish Python Package

on:
  push:
    tags:
      - 'v*.*.*'  # Trigger only when a tag is pushed

permissions:
  id-token: write  # Authenticate via OIDC
  contents: read

jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.11'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade build
    - name: Build package
      run: python -m build
    - name: Publish package to PyPI
      uses: pypa/gh-action-pypi-publish@release/v1
      with:
        packages-dir: dist/


================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so
**/.next/
.DS_Store

*.log
logs/
*__pycache__
uv.lock

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# UV
#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#uv.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

# Ruff stuff:
.ruff_cache/

# PyPI configuration file
.pypirc


================================================
FILE: .kiro/hooks/readme-translation-hook.kiro.hook
================================================
{
  "enabled": true,
  "name": "README Translation & Grammar Check",
  "description": "Automatically checks README.md for grammar and spelling errors, corrects them, and translates the corrected content into Simplified Chinese, Japanese, Korean, French, and German",
  "version": "1",
  "when": {
    "type": "fileEdited",
    "patterns": [
      "README.md"
    ]
  },
  "then": {
    "type": "askAgent",
    "prompt": "The README.md file has been modified. Please:\n\n1. First, carefully read the updated README.md content and check for any English grammar or spelling mistakes\n2. If any errors are found, correct them and update the README.md file with the corrected version\n3. Then translate the corrected README.md content into the following languages and save each translation as a separate file:\n   - Simplified Chinese → README-CN.md\n   - Japanese → README-JP.md  \n   - Korean → README-KR.md\n   - French → README-FR.md\n   - German → README-DE.md\n\nEnsure that:\n- All technical terms, code blocks, and formatting are preserved in translations\n- File paths, URLs, and command examples remain unchanged\n- The translations are accurate and maintain the original meaning\n- Each translated file follows the same structure as the original README.md"
  }
}

================================================
FILE: .kiro/steering/ebook-mcp-server-guide.md
================================================
---
inclusion: always
---

# Ebook MCP Server Development Guide

## Architecture
FastMCP-based server for EPUB/PDF processing with tools for metadata extraction, content parsing, and format conversion.

**Core Structure:**
- `src/ebook_mcp/main.py` - MCP server entry point with FastMCP app
- `src/ebook_mcp/tools/` - MCP tool implementations (epub_helper.py, pdf_helper.py, logger_config.py)
- Use `uv` package manager exclusively

## Code Conventions

**Python Standards:**
- Python 3.10+ with full type annotations
- PEP 8 compliance, English-only documentation
- Specific exception handling: `FileNotFoundError` over generic `Exception`
- Always log debug info before processing: `logger.debug(f"Calling {tool_name}: {params}")`

**MCP Tool Pattern (MANDATORY):**
```python
@mcp.tool()
def tool_name(param: str) -> ReturnType:
    """Brief description.
    
    Args:
        param: Description with format examples
        
    Returns:
        ReturnType: Description
        
    Raises:
        FileNotFoundError: When file doesn't exist
        Exception: For parsing/processing errors
    """
    logger.debug(f"Calling tool_name: {param}")
    try:
        # Implementation
        return result
    except FileNotFoundError as e:
        raise FileNotFoundError(str(e))
    except Exception as e:
        raise Exception(str(e))
```
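
As a concrete instance of the pattern above, here is a minimal sketch that uses only the standard library. The function `count_files_with_suffix` and its parameters are hypothetical and do not come from the repository. In the server it would additionally carry the `@mcp.tool()` decorator, which is left out so the snippet runs on its own. The sketch chains the original exception with `from e`, which is one way to keep the original error context.

```python
import logging
import os

logger = logging.getLogger(__name__)


def count_files_with_suffix(path: str, suffix: str = ".epub") -> int:
    """Count files in a directory that end with the given suffix.

    Args:
        path: Directory to scan, e.g. "/home/user/books"
        suffix: File extension to match, e.g. ".epub"

    Returns:
        int: Number of matching files

    Raises:
        FileNotFoundError: When the directory doesn't exist
    """
    logger.debug(f"Calling count_files_with_suffix: {path}, {suffix}")
    try:
        return sum(1 for name in os.listdir(path) if name.endswith(suffix))
    except FileNotFoundError as e:
        # Re-raise the same exception type; the original error stays
        # reachable through __cause__ because of "from e".
        raise FileNotFoundError(f"Directory not found: {path}") from e
```

Raising the same specific type keeps callers able to distinguish a missing path from a parsing failure.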

## Processing Rules

**EPUB Processing:**
- Use `ebooklib` library exclusively
- **CRITICAL:** Use `get_epub_chapter_markdown` (fixes truncation issues)
- Convert all content to Markdown format
- Handle missing TOC gracefully with fallback methods

**PDF Processing:**
- Use `PyMuPDF` (fitz) library exclusively
- Support both page-based and chapter-based extraction
- When TOC missing, fallback to page-by-page extraction
- Provide both plain text and Markdown output formats
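
The TOC fallback rule can be illustrated with a small, library-free sketch. It assumes TOC entries shaped like the `[level, title, page]` lists that PyMuPDF's `Document.get_toc()` returns (1-based page numbers). The names `plan_sections` and `TocEntry` are hypothetical, and the repository's own helper may split chapters differently.

```python
from typing import List, Tuple

# One TOC entry: (level, title, 1-based start page).
TocEntry = Tuple[int, str, int]


def plan_sections(toc: List[TocEntry], page_count: int) -> List[Tuple[str, int, int]]:
    """Return (title, first_page, last_page) ranges, 1-based and inclusive."""
    top_level = [(title, page) for level, title, page in toc if level == 1]
    if not top_level:
        # No usable TOC: fall back to one section per page.
        return [(f"Page {n}", n, n) for n in range(1, page_count + 1)]

    sections = []
    for i, (title, start) in enumerate(top_level):
        if i + 1 < len(top_level):
            # A chapter ends on the page before the next chapter starts.
            end = max(start, top_level[i + 1][1] - 1)
        else:
            # The last chapter runs to the end of the document.
            end = page_count
        sections.append((title, start, end))
    return sections
```

With an empty TOC the function yields one entry per page, so callers can iterate over the result the same way in both cases.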

**Error Handling Pattern:**
- File operations: Raise `FileNotFoundError` for missing files
- Format/parsing errors: Use specific exceptions with descriptive messages
- Wrap optional imports in try-except blocks
- Always preserve original error context in exception messages
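
One possible shape for the "wrap optional imports" rule is sketched below. The helper names `optional_import` and `require` are hypothetical, and raising `RuntimeError` is an assumption made for this illustration, not a convention taken from the repository.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def require(module: Optional[ModuleType], feature: str) -> ModuleType:
    """Fail with a descriptive message when an optional dependency is missing."""
    if module is None:
        raise RuntimeError(f"{feature} is unavailable: optional dependency not installed")
    return module
```

A caller could write `fitz = optional_import("fitz")` at import time and call `require(fitz, "PDF support")` only inside the functions that need it, so the rest of the module still loads without the dependency.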

## Available Tools Reference

**EPUB:** `get_all_epub_files`, `get_epub_metadata`, `get_epub_toc`, `get_epub_chapter_markdown_fixed`
**PDF:** `get_all_pdf_files`, `get_pdf_metadata`, `get_pdf_toc`, `get_pdf_page_text`, `get_pdf_page_markdown`, `get_pdf_chapter_content`

## Development Commands
```bash
# Test suite
uv run pytest [specific_test_file]

# Development server
uv run mcp dev src/ebook_mcp/main.py

# Debug with MCP inspector
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

**Logging:** Files in `logs/ebook-mcp_server_YYYYMMDD_HHMMSS.log`, DEBUG level, console + file output
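
A logging setup matching that description could look roughly like the sketch below. It is not the repository's `logger_config.py`, which is not shown here. The function name `setup_logger`, its parameters, and the message format are assumptions for illustration.

```python
import logging
import os
from datetime import datetime


def setup_logger(log_dir: str = "logs", name: str = "ebook-mcp") -> logging.Logger:
    """Create a DEBUG logger that writes to the console and a timestamped file."""
    os.makedirs(log_dir, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    log_path = os.path.join(log_dir, f"ebook-mcp_server_{stamp}.log")

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate output if called twice

    formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    handlers = (
        logging.StreamHandler(),  # console output (stderr by default)
        logging.FileHandler(log_path, encoding="utf-8"),
    )
    for handler in handlers:
        handler.setLevel(logging.DEBUG)
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

`StreamHandler()` writes to stderr by default, which keeps log lines out of stdout when the server talks to a client over stdio.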



================================================
FILE: .python-version
================================================
3.12


================================================
FILE: .vscode/settings.json
================================================
{
}

================================================
FILE: CHANGELOG.md
================================================
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.1.7] - 2025-08-06

### 🔧 Refactored
- **Modernized Dependency Management**: Removed `requirements.txt`, fully using `pyproject.toml` for dependency management
  - Deleted `requirements.txt` file
  - Updated installation instructions in all README files
  - Unified use of modern Python package management standards
  - Simplified installation process: `uv pip install -e .` or `pip install -e .`

- **PDF Processing Optimization**: Removed `PyPDF2` dependency, fully using `PyMuPDF`
  - Removed `PyPDF2` imports and `get_meta_pypdf2` function from `pdf_helper.py`
  - Updated `pyproject.toml`, removed `PyPDF2` dependency
  - Deleted `test_pdf_metadata_comparison.py` test file
  - Updated related tests, removed `PyPDF2` related tests

  - Enhanced PDF metadata extraction functionality, providing richer metadata information

### 🌍 Added
- **Internationalization Support**: Added multilingual README documentation
  - Added German README (`README-DE.md`)
  - Added French README (`README-FR.md`)
  - Added Japanese README (`README-JP.md`)
  - Added Korean README (`README-KR.md`)
  - Added Kiro translation tool configuration (`.kiro/hooks/readme-translation-hook.kiro.hook`)

### 🔧 Technical Improvements
- **Dependency Management**: Compliant with modern Python project standards (PEP 518/621)
- **PDF Processing**: Improved performance and stability, reduced dependency conflicts
- **Test Coverage**: All tests passing (76 passed, 5 skipped)
- **Code Quality**: Simplified code structure, improved maintainability

### 📝 Documentation
- Updated installation instructions in all README files
- Added multilingual support documentation
- Updated MCP client example documentation
- Improved project documentation accessibility

### 🗑️ Removed
- `requirements.txt` file
- `PyPDF2` dependency and related code
- `test_pdf_metadata_comparison.py` test file
- Outdated installation instruction references

### 🔄 Backward Compatibility
- ✅ Maintained API compatibility, no need to modify existing code
- ✅ All MCP tools working normally
- ✅ Functional integrity guaranteed

### 📦 Installation Instructions
```bash
# Development environment
git clone <repository-url>
cd ebook-mcp
uv pip install -e .
# or
pip install -e .

# Run tests
./run_tests.sh
# or
pytest src/ebook_mcp/tests/
```

### 🎯 Impact Assessment
- **Positive Impact**:
  - Simplified dependency management
  - Improved PDF processing performance
  - Enhanced internationalization support
  - Reduced maintenance complexity
  - Compliant with modern Python project standards

- **Potential Impact**:
  - Users need to update installation methods
  - Removed specific PyPDF2 features (replaced by PyMuPDF)

### 🔄 Migration Guide
For existing users:
1. Delete `requirements.txt` file (if exists)
2. Reinstall using `uv pip install -e .`
3. Update CI/CD configuration (if using requirements.txt)

---

## [0.1.4] - 2025-08-05

### 🔧 Fixed
- Fixed subchapter truncation issue in EPUB chapter extraction
- Added `get_epub_chapter_markdown_fixed` tool
- Improved chapter boundary detection logic
- Updated related tests and documentation

### 📝 Documentation
- Added `HOW-TO-TEST.md` testing documentation
- Updated test runner scripts
- Improved error handling and logging

## [0.1.3] - 2025-08-04

### 🌟 Added
- Added comprehensive unit test suite
- Created test configuration files and runner scripts
- Added test documentation and examples

### 🔧 Improved
- Improved error handling mechanisms
- Optimized code structure and readability
- Enhanced test coverage

## [0.1.2] - 2025-08-03

### 🌟 Added
- Added PDF chapter content extraction functionality
- Support for extracting content by chapter title
- Added Markdown format output support

### 🔧 Improved
- Optimized PDF metadata extraction
- Improved error handling
- Updated API documentation

## [0.0.1] - 2025-08-02

### 🔧 Fixed
- Fixed compatibility issues in PDF processing
- Improved EPUB metadata extraction
- Optimized file path handling

### 📝 Documentation
- Updated installation instructions
- Added usage examples
- Improved API documentation

## [1.0.0] - 2025-08-01

### 🌟 Initial Release
- EPUB and PDF format support
- Basic file processing APIs
- MCP client examples - Claude, DeepSeek, OpenAI
- Support for running server from PyPI
- Basic metadata extraction functionality
- Table of contents extraction support
- Chapter content extraction functionality

---

## Version Notes

### Semantic Versioning
- **Major version**: Incompatible API changes
- **Minor version**: Backward-compatible functionality additions
- **Patch version**: Backward-compatible bug fixes
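
The three bump rules can be expressed as a short sketch. The helpers `parse_version` and `bump` are hypothetical and are not part of the project's release tooling. They only handle plain `MAJOR.MINOR.PATCH` strings.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split "MAJOR.MINOR.PATCH" into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def bump(version: str, change: str) -> str:
    """Return the next version for a "major", "minor", or "patch" change."""
    major, minor, patch = parse_version(version)
    if change == "major":
        return f"{major + 1}.0.0"  # incompatible API change
    if change == "minor":
        return f"{major}.{minor + 1}.0"  # backward-compatible feature
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"  # backward-compatible fix
    raise ValueError(f"Unknown change type: {change}")
```

Comparing the parsed tuples rather than the raw strings orders versions numerically, so `0.1.10` sorts after `0.1.7`.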

### Change Types
- **Added**: New features
- **Changed**: Changes to existing functionality
- **Deprecated**: Features that will be removed soon
- **Removed**: Removed features
- **Fixed**: Bug fixes
- **Security**: Security improvements 

================================================
FILE: HOW-TO-TEST.md
================================================
# Ebook-MCP Unit Testing Guide

This guide explains how to run the unit tests for the server side of the ebook-mcp project.

## Test File Structure

```
src/ebook_mcp/tests/
├── conftest.py              # pytest configuration and shared fixtures
├── test_main.py             # Unit tests for main.py
├── test_epub_helper.py      # Unit tests for epub_helper.py
├── test_pdf_helper.py       # Unit tests for pdf_helper.py
├── test_azw.py              # Existing AZW tests
└── run_tests.py             # Test runner script
```

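## Test Coverage

### main.py tests
- Tests for the EPUB tool functions
  - `get_all_epub_files`
  - `get_epub_metadata`
  - `get_epub_toc`
  - `get_epub_chapter_markdown`
- Tests for the PDF tool functions
  - `get_all_pdf_files`
  - `get_pdf_metadata`
  - `get_pdf_toc`
  - `get_pdf_page_text`
  - `get_pdf_page_markdown`
  - `get_pdf_chapter_content`
- Error handling tests
  - File not found
  - Parsing errors
  - Exception handling

### epub_helper.py tests
- File operation tests
- EPUB parsing tests
- Table of contents structure handling tests
- HTML cleaning and conversion tests
- Chapter extraction tests

### pdf_helper.py tests
- PDF file operation tests
- Metadata extraction tests
- Table of contents extraction tests
- Page text extraction tests
- Chapter content extraction tests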

## Running the Tests

### Method 1: Use the test runner script

```bash
# Run all tests
python src/ebook_mcp/tests/run_tests.py

# List all test files
python src/ebook_mcp/tests/run_tests.py list

# Run a specific test file
python src/ebook_mcp/tests/run_tests.py run test_main.py
```

### Method 2: Run pytest directly

```bash
# Run all tests
pytest src/ebook_mcp/tests/ -v

# Run a specific test file
pytest src/ebook_mcp/tests/test_main.py -v

# Run a specific test class
pytest src/ebook_mcp/tests/test_main.py::TestEpubFunctions -v

# Run a specific test method
pytest src/ebook_mcp/tests/test_main.py::TestEpubFunctions::test_get_all_epub_files_empty_directory -v
```

### Method 3: Run from the project root

```bash
# Run all tests from the project root
python -m pytest src/ebook_mcp/tests/ -v

# Run a specific test file
python -m pytest src/ebook_mcp/tests/test_main.py -v
```

### Method 4: Run with uv (if uv manages your dependencies)

```bash
# Run all tests
uv run pytest src/ebook_mcp/tests/ -v

# Run a specific test file
uv run pytest src/ebook_mcp/tests/test_main.py -v
```

### Method 5: Run the basic tests (no external dependencies required)

```bash
# Run the basic tests (recommended for a quick sanity check)
pytest src/ebook_mcp/tests/test_basic.py -v

# Or
python -m pytest src/ebook_mcp/tests/test_basic.py -v
```

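## Test Environment Requirements

### Basic dependencies
Make sure the following dependencies are installed:

```bash
# With pip
pip install pytest
pip install pytest-cov  # Optional: for code coverage

# With uv
uv add --dev pytest
uv add --dev pytest-cov  # Optional: for code coverage
```

### Full dependencies (for the complete test suite)
To run all tests, including those that need external libraries, install the project dependencies:

```bash
# Install the project dependencies
pip install -e .

# Or with uv
uv sync --dev
```

### Dependency notes
- **Basic tests** (`test_basic.py`): no external dependencies, can be run immediately ✅
- **Main functionality tests** (`test_main.py`, `test_epub_helper.py`, `test_pdf_helper.py`): require the project dependencies ⚠️
- **AZW tests** (`test_azw.py`): require the AZW module (if available) ⏭️

### Test status
- ✅ **Basic tests**: all 10 tests pass
- ⏭️ **AZW tests**: 4 tests skipped (module unavailable)
- ⚠️ **Main functionality tests**: run them after installing the project dependencies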

## Example Test Output

### Successful run
```
Running ebook-mcp unit tests...
==================================================
test_main.py::TestEpubFunctions::test_get_all_epub_files_empty_directory PASSED
test_main.py::TestEpubFunctions::test_get_all_epub_files_with_epub_files PASSED
test_main.py::TestEpubFunctions::test_get_epub_metadata_success PASSED
...
test_pdf_helper.py::TestPdfHelper::test_get_all_pdf_files_empty_directory PASSED
test_pdf_helper.py::TestPdfHelper::test_get_all_pdf_files_with_pdf_files PASSED
...

==================================================
✅ All tests passed!
```

### Failed run
```
test_main.py::TestEpubFunctions::test_get_epub_metadata_file_not_found FAILED
...
AssertionError: Expected FileNotFoundError to be raised, but no exception was raised.
```

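## Testing Strategy

### Unit testing principles
1. **Isolation**: Each test is independent and does not rely on other tests
2. **Repeatability**: Tests can be rerun in any environment
3. **Speed**: Tests run quickly
4. **Completeness**: Both normal and error cases are covered

### Using mocks
- Use `unittest.mock` to simulate external dependencies
- Mock file system operations
- Mock the EPUB and PDF parsing libraries

### Test data
- Use temporary files and directories
- Use mock data rather than real files
- Clean up automatically after the tests finish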
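
The mock and test-data guidelines above can be sketched as follows. `list_epub_files` is a hypothetical helper defined here for illustration only. It is not a function from the repository, and the project's real tests may be organized differently.

```python
import os
import tempfile
from unittest.mock import patch


def list_epub_files(path: str) -> list:
    """Hypothetical helper under test: return the EPUB file names in a directory."""
    return sorted(name for name in os.listdir(path) if name.endswith(".epub"))


def test_list_epub_files_with_temp_dir():
    """Uses a real temporary directory that is removed when the with-block exits."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.epub", "b.pdf"):
            open(os.path.join(tmp, name), "w").close()
        assert list_epub_files(tmp) == ["a.epub"]


def test_list_epub_files_with_mocked_filesystem():
    """Touches no disk: os.listdir is replaced by a mock for the duration of the test."""
    with patch("os.listdir", return_value=["x.epub", "notes.txt"]) as fake_listdir:
        assert list_epub_files("/no/such/dir") == ["x.epub"]
    fake_listdir.assert_called_once_with("/no/such/dir")
```

The first test exercises the real file system in an isolated directory. The second never reads the disk, which keeps it fast and repeatable on any machine.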

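## Debugging Tests

### Viewing detailed output
```bash
# Run the tests with verbose output
pytest src/ebook_mcp/tests/ -v -s

# Stop at the first failure
pytest src/ebook_mcp/tests/ -x

# Show long-format tracebacks (add -l to also show local variables)
pytest src/ebook_mcp/tests/ --tb=long
```

### Running specific tests
```bash
# Run tests whose names match a keyword
pytest src/ebook_mcp/tests/ -k "epub" -v

# Run the tests of a specific class
pytest src/ebook_mcp/tests/ -k "TestEpubFunctions" -v

# Run the tests for a specific method
pytest src/ebook_mcp/tests/ -k "test_get_epub_metadata" -v
```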

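## Adding New Tests

### Adding tests for a new feature
1. Add a test class or method to the appropriate test file
2. Use descriptive test method names
3. Cover both normal and error cases
4. Use appropriate mocks to isolate dependencies

### Test method naming conventions
- Use the `test_` prefix
- Describe the functionality and scenario under test
- For example: `test_get_epub_metadata_success`, `test_get_epub_metadata_file_not_found`

### Test documentation
- Every test method should have a docstring
- State the purpose of the test and the expected result
- For complex tests, describe the test steps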

## Continuous Integration

These tests can be integrated into a CI/CD pipeline:

```yaml
# Example GitHub Actions configuration
- name: Run tests
  run: |
    python -m pytest src/ebook_mcp/tests/ -v --cov=src/ebook_mcp
```

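## Troubleshooting

### Common issues

1. **Import errors**: Make sure the Python path is set correctly
   ```bash
   # Run from the project root
   cd /path/to/ebook-mcp
   python -m pytest src/ebook_mcp/tests/ -v
   ```

2. **Mock issues**: Check that the mock path is correct
   ```python
   # Make sure the mock path matches the actual import path
   @patch('ebook_mcp.main.epub_helper.get_meta')
   ```

3. **File permissions**: Make sure you have permission to create temporary files
   ```bash
   # Check the permissions of the current directory
   ls -la
   ```

4. **Dependency issues**: Make sure all test dependencies are installed
   ```bash
   # Install the development dependencies
   pip install -e .[dev]
   # Or with uv
   uv sync --dev
   ```

### Getting help

If you run into problems, you can:

1. Read the error messages in the test output
2. Use the `-v` flag for verbose output
3. Use `--tb=long` to see the full traceback
4. Check the mock setup in the test files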

================================================
FILE: LICENSE
================================================
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: README-CN.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">

[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md)

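Ebook-MCP is a powerful Model Context Protocol (MCP) server for processing electronic books. Built on the [Model Context Protocol](https://github.com/modelcontextprotocol), it provides a set of standardized APIs for integrating LLM applications with e-book processing capabilities. It currently supports the EPUB and PDF formats.

## Use Cases and Value

Ebook-MCP changes how you interact with your digital books by letting you hold natural-language conversations with your reading material. It integrates with AI-powered tools such as Cursor and Claude, so you can:

- **Smart library management**: Simply ask "Show me all EPUB files in my downloads folder" or "Find books about GenAI in my library"
- **Interactive reading experience**: Have natural conversations about your books:
  - "Give me a brief introduction to 'LLM Engineer Handbook'"
  - "What's covered in Chapter 3?"
  - "Summarize the key points about RAG from this book"
- **Active learning support**: Enhance your learning through AI-powered interactions:
  - "Create a quiz based on the RAG concepts from Chapter 5"
  - "Explain the differences between the architectures discussed in this chapter"
  - "Give me practical exercises based on the concepts in this section"
- **Content navigation**: Move through your books with natural-language queries:
  - "Find all sections discussing prompt engineering"
  - "Show me the chapters about fine-tuning"
  - "Take me to the part about vector databases"

By bridging the gap between traditional e-books and AI capabilities, Ebook-MCP helps readers get more value from their digital library through intuitive, conversation-based interactions.

Quick demo, run with DeepSeek (see the `mcp_client_example` folder for the Claude and OpenAI examples):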
```
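Query: Please check all the EPUB books in my downloads folder. The user ID is onebird.
......

Query: Can you ask me a few questions to test my understanding of this book?

Answer: Sure! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your knowledge; I'll give feedback and add details after each answer.

Question 1:
How did Musk's childhood and family background shape his personality and career later in life?
(Hint: His relationship with his father, growing up in South Africa, etc.)

Question 2:
Zip2 and X.com (later PayPal) were Musk's early startups. What problems did they solve? How did Musk eventually exit these companies?
(Hint: Zip2's business model, X.com's merger with Confinity, etc.)

Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1's fourth launch, the NASA contract, Tesla's last-minute rescue, etc.)

Question 4:
What controversies surround Musk's management style? Which specific incidents in the book illustrate his "hardcore" or "extreme" work culture?
(Hint: The Fremont factory "production hell", the Twitter layoffs, etc.)

Question 5:
What is Musk's attitude toward artificial intelligence (AI)? Why did he support both OpenAI and Neuralink at the same time, when they seem to have conflicting goals?
(Hint: AI as an existential threat, the brain-machine interface vision, etc.)

Please answer each question. I'll provide feedback and additional details, or correct any misunderstandings!
(If a question is unclear, feel free to ask me.)

Query:
Question 1: His father's abusive attitude made him less empathetic towards others.
Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved online payments.
Question 3: He secured a contract with NASA. Tesla solved its production problems by building a factory in China.
Question 4: I don't know. Please answer for me.
Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

Answer: Your answers captured some key points from the book. Let me add details, correct some inaccuracies, and provide further background for Questions 4 and 5:

Question 1: Correction and supplement
Influence of his father:
Musk's father, Errol, was indeed extremely harsh. For example, he forced the young Elon to kneel and count grains of rice for hours. The book emphasizes that this emotional desensitization shaped Musk into an extremely goal-oriented person: indifferent to others' suffering but obsessively committed to his goals.

Reverse impact:
It's not that Musk is incapable of empathy; it's selective. For instance, he may treat employees coldly but personally study medical equipment to address his son X's health issues.

Question 2: Correction and supplement
Zip2:
Essentially an online Yellow Pages plus map navigation service (helping newspapers go online). Musk wrote much of the early code himself and slept in the office. Eventually, Compaq acquired Zip2 for $307 million (Musk received $22 million).

X.com:
Not just online payments: it was Musk's attempt to create a comprehensive financial platform (aiming to replace banks). It later merged with Confinity (the precursor of PayPal). Musk was ousted due to internal conflicts, but when eBay acquired PayPal, Musk earned $180 million.
(Note: Musk later bought back the X.com domain; it is now the name of Twitter's parent company.)

Question 3: Correction and supplement
SpaceX:
In 2008, Falcon 1 succeeded on its fourth launch (the first three failures had drained the company's funds). Right after that, SpaceX secured a $1.6 billion NASA contract for cargo resupply, a lifeline for the company.

Tesla:
In 2008, Tesla survived thanks to a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
Building the Gigafactory in China (2019) came much later. In the early years, survival depended on the launch of the Model S (2012) and Musk's personal gamble: selling all his real estate to fund Tesla.

Question 4: Explanation of the controversy around his management style
"Hardcore" culture:

At SpaceX/Tesla: Musk demanded that employees work 80+ hours per week, even sleeping at the factory. ("If you're not here, your replacement is.")

After acquiring Twitter: He immediately laid off 75% of the staff and asked the remaining employees to sign a "hardcore commitment", agreeing to long hours and high-intensity work.

Extreme cases:

At the Fremont factory: During the "production hell" of the Model 3, Musk slept on the production line and once yelled at employees, calling them "worse than dogs" (he later apologized).

Firing executives: Musk once fired a senior Tesla executive on the spot because the font size in their PowerPoint presentation wasn't big enough.

Question 5: Correction and supplement
AI threat theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear weapons"). Therefore:

OpenAI: He initially supported it to promote safe, open-source AI (but later left over disputes about control).

Neuralink: He aims to use brain-machine interfaces to merge humans with AI, to prevent humans from being left behind.

Contradictions:
While warning about AI risks, Musk is simultaneously pushing Full Self-Driving (FSD) technology at Tesla and has founded xAI to research Artificial General Intelligence (AGI).

Overall evaluation:
✅ You captured the key events accurately, but the details need strengthening (such as timelines and specific figures).
🔍 Suggested areas for further study: Musk's multi-threaded management style (running five companies simultaneously) and his skill in media manipulation (governing through Twitter).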
```

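## Features

### EPUB Support
- Extract metadata (title, author, publication date, etc.)
- Extract table of contents
- Extract chapter content (with Markdown output)
- Batch process EPUB files

### PDF Support
- Extract metadata (title, author, creation date, etc.)
- Extract table of contents
- Extract content by page number
- Extract content by chapter title
- Markdown output support
- Batch process PDF files

## Installation

1. Clone the repository:
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. Install dependencies using `uv`: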
```bash
uv pip install -e .
```

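## Usage

### Starting the MCP Server in Development Mode

Run the server in development mode:
```bash
uv run mcp dev src/ebook_mcp/main.py
```

You can visit http://localhost:5173/ for testing and debugging.
You can also launch the inspector directly for testing:
```bash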
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### Starting the MCP Server in Production Mode

Run the server:
```bash
uv run src/ebook_mcp/main.py
```

#### Configure the MCP in Cursor

Add the following configuration in Cursor:
```json
"ebook-mcp":{
            "command": "uv",
            "args": [
                "--directory",
                "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
                "run",
                "main.py"
            ]
        }
```
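
The fragment above is a single server entry, and the path in it is specific to the author's machine. As a minimal sketch, assuming Cursor reads its MCP servers from a JSON file with a top-level `mcpServers` key (check Cursor's documentation for the exact file location), the entry could be generated for your own checkout like this. The helper name `build_server_entry` and the example path are hypothetical.

```python
import json
from pathlib import Path


def build_server_entry(repo_root: str) -> dict:
    """Build the "ebook-mcp" entry, pointing uv at the package directory."""
    package_dir = Path(repo_root) / "src" / "ebook_mcp"
    return {
        "ebook-mcp": {
            "command": "uv",
            "args": ["--directory", str(package_dir), "run", "main.py"],
        }
    }


# Hypothetical checkout location; replace it with your own path.
config = {"mcpServers": build_server_entry("/path/to/ebook-mcp")}
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the configuration file; only the directory argument should differ between machines.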

### EPUB Processing Examples

```python
# Get all EPUB files in a directory
epub_files = get_all_epub_files("/path/to/books")

# Get EPUB metadata
metadata = get_metadata("/path/to/book.epub")

# Get the table of contents
toc = get_toc("/path/to/book.epub")

# Get the content of a specific chapter (in Markdown format)
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")
```
```
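
The `"chapter_id"` placeholder in the last call has to come from the table of contents. A minimal sketch of looking it up by title, assuming each entry returned by `get_toc` is a `(title, chapter_id)` pair (the README only documents the type `List[Tuple[str, str]]`, so the order is an assumption). The helper `find_chapter_ids` and the sample data are hypothetical.

```python
from typing import List, Tuple


def find_chapter_ids(toc: List[Tuple[str, str]], keyword: str) -> List[str]:
    """Return the IDs of all TOC entries whose title contains `keyword`."""
    needle = keyword.casefold()
    return [chapter_id for title, chapter_id in toc if needle in title.casefold()]


# Sample data standing in for the result of get_toc("/path/to/book.epub").
sample_toc = [
    ("Introduction", "intro.xhtml"),
    ("Prompt Engineering", "ch01.xhtml"),
    ("Fine-Tuning", "ch02.xhtml"),
]
matches = find_chapter_ids(sample_toc, "fine-tuning")
print(matches)
```

Each returned ID could then be passed as the second argument of `get_chapter_markdown`.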

### PDF Processing Examples

```python
# Get all PDF files in a directory
pdf_files = get_all_pdf_files("/path/to/books")

# Get PDF metadata
metadata = get_pdf_metadata("/path/to/book.pdf")

# Get the table of contents
toc = get_pdf_toc("/path/to/book.pdf")

# Get the content of a specific page
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# Get the content of a specific chapter
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```
```
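
For large PDFs it can help to walk the pages in small batches rather than requesting everything at once. A minimal sketch of generating such batches, assuming page numbers are 1-based as in the example above; the helper `page_batches` is hypothetical and not part of Ebook-MCP.

```python
from typing import Iterator, List


def page_batches(total_pages: int, batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive 1-based page numbers in chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(1, total_pages + 1, batch_size):
        end = min(start + batch_size - 1, total_pages)
        yield list(range(start, end + 1))


batches = list(page_batches(total_pages=7, batch_size=3))
print(batches)  # [[1, 2, 3], [4, 5, 6], [7]]
```

Each page number in a batch could be passed to `get_pdf_page_text` or `get_pdf_page_markdown`, so only one batch of text is held in memory at a time.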

## API Reference

### EPUB APIs

#### `get_all_epub_files(path: str) -> List[str]`
Get all EPUB files in the specified directory.

#### `get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from an EPUB file.

#### `get_toc(epub_path: str) -> List[Tuple[str, str]]`
Get the table of contents from an EPUB file.

#### `get_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Get chapter content in Markdown format.

### PDF APIs

#### `get_all_pdf_files(path: str) -> List[str]`
Get all PDF files in the specified directory.

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from a PDF file.

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
Get the table of contents from a PDF file.

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
Get plain-text content from a specific page.

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
Get Markdown-formatted content from a specific page.

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
Get chapter content and the corresponding page numbers by chapter title.

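## Dependencies

Key dependencies include:
- ebooklib: EPUB file processing
- PyPDF2: Basic PDF processing
- PyMuPDF: Advanced PDF processing
- beautifulsoup4: HTML parsing
- html2text: HTML to Markdown conversion
- pydantic: Data validation
- fastmcp: MCP server framework

## Important Notes

1. PDF processing relies on the document's table of contents. Some features may not work if the document has no table of contents.
2. For large PDF files, process by page range to avoid loading the entire file at once.
3. EPUB chapter IDs must be obtained from the table of contents structure.

## Architecture

```
           ┌────────────────────────────┐
           │         Agent Layer        │
           │  - Translation Strategy    │
           │  - Style Consistency Check │
           │  - LLM Call & Interaction  │
           └────────────▲───────────────┘
                        │ Tool Calls
           ┌────────────┴───────────────┐
           │        MCP Tool Layer      │
           │  - extract_chapter         │
           │  - write_translated_chapter│
           │  - generate_epub           │
           └────────────▲───────────────┘
                        │ System/IO Calls
           ┌────────────┴───────────────┐
           │     System Base Layer      │
           │  - File Reading            │
           │  - ebooklib Parsing        │
           │  - File Path Storage/Check │
           └────────────────────────────┘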
```

## Contributing

We welcome Issues and Pull Requests!

For detailed information about recent changes, see [CHANGELOG.md](CHANGELOG.md).

================================================
FILE: README-DE.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">

[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md)

Ebook-MCP is a powerful Model Context Protocol (MCP) server for processing electronic books. Built on the [Model Context Protocol](https://github.com/modelcontextprotocol), it provides a set of standardized APIs for integrating LLM applications with e-book processing capabilities. It currently supports the EPUB and PDF formats.

## Use Cases and Value

Ebook-MCP changes how you interact with your digital books by enabling natural-language conversations with your reading material. It integrates with AI-powered tools such as Cursor and Claude, allowing you to:

- **Smart library management**: Simply ask "Show me all EPUB files in my downloads folder" or "Find books about GenAI in my library"
- **Interactive reading experience**: Have natural conversations about your books:
  - "Give me a brief introduction to 'LLM Engineer Handbook'"
  - "What's covered in Chapter 3?"
  - "Summarize the key points about RAG from this book"
- **Active learning support**: Enhance your learning through AI-powered interactions:
  - "Create a quiz based on the RAG concepts from Chapter 5"
  - "Explain the differences between the architectures discussed in this chapter"
  - "Give me practical exercises based on the concepts in this section"
- **Content navigation**: Move through your books with natural-language queries:
  - "Find all sections discussing prompt engineering"
  - "Show me the chapters about fine-tuning"
  - "Take me to the part about vector databases"

By bridging the gap between traditional e-books and AI capabilities, Ebook-MCP helps readers get more value from their digital library through intuitive, conversation-based interactions.

Quick demo, run with DeepSeek (see the `mcp_client_example` folder for the Claude and OpenAI examples):
```
Query: Please check all the EPUB books in my download folder. The user ID is onebird.
......

Query: Can you ask me a few questions to test my understanding of this book?

Answer: Sure! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your knowledge; I will give feedback and add details after each answer.

Question 1:
How did Musk's childhood and family background shape his personality and career later in life?
(Hint: His relationship with his father, growing up in South Africa, etc.)

Question 2:
Zip2 and X.com (later PayPal) were Musk's early startups. What problems did they solve? How did Musk eventually exit these companies?
(Hint: Zip2's business model, X.com's merger with Confinity, etc.)

Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1's fourth launch, the NASA contract, Tesla's last-minute rescue, etc.)

Question 4:
What controversies surround Musk's management style? Which specific incidents in the book illustrate his "hardcore" or "extreme" work culture?
(Hint: The Fremont factory "production hell", the Twitter layoffs, etc.)

Question 5:
What is Musk's attitude toward artificial intelligence (AI)? Why did he support both OpenAI and Neuralink at the same time, when they seem to have conflicting goals?
(Hint: AI as an existential threat, the brain-machine interface vision, etc.)

Please answer each question. I will provide feedback and additional details, or correct any misunderstandings!
(If a question is unclear, feel free to ask me.)

Query:
Question 1: His father's abusive attitude made him less empathetic towards others.
Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved online payments.
Question 3: He secured a contract with NASA. Tesla solved its production problems by building a factory in China.
Question 4: I don't know. Please answer for me.
Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

Answer: Your answers captured some key points from the book. Let me add details, correct some inaccuracies, and provide further background for Questions 4 and 5:

Question 1: Correction and supplement
Influence of his father:
Musk's father, Errol, was indeed extremely harsh. For example, he forced the young Elon to kneel and count grains of rice for hours. The book emphasizes that this emotional desensitization shaped Musk into an extremely goal-oriented person: indifferent to others' suffering but obsessively committed to his goals.

Reverse impact:
It's not that Musk is incapable of empathy; it's selective. For instance, he may treat employees coldly but personally study medical equipment for his son X's health issues.

Question 2: Correction and supplement
Zip2:
Essentially an online Yellow Pages plus map navigation service (helping newspapers go online). Musk wrote much of the early code himself and slept in the office. Eventually, Compaq acquired Zip2 for $307 million (Musk received $22 million).

X.com:
Not just online payments: it was Musk's attempt to create a comprehensive financial platform (aiming to replace banks). It later merged with Confinity (the precursor of PayPal). Musk was ousted due to internal conflicts, but when eBay acquired PayPal, Musk earned $180 million.
(Note: Musk later bought back the X.com domain; it is now the name of Twitter's parent company.)

Question 3: Correction and supplement
SpaceX:
In 2008, Falcon 1 succeeded on its fourth launch (after three failures that had drained the company's funds). Right after that, SpaceX secured a $1.6 billion NASA contract for cargo resupply, a lifeline for the company.

Tesla:
In 2008, Tesla survived thanks to a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
Building the Gigafactory in China (2019) came much later. In the early years, survival depended on the launch of the Model S (2012) and Musk's personal gamble: selling all his real estate to fund Tesla.

Question 4: Explanation of the controversy around his management style
"Hardcore" culture:

At SpaceX/Tesla: Musk demanded that employees work 80+ hours per week, even sleeping at the factory. ("If you're not here, your replacement is.")

After the Twitter acquisition: He immediately laid off 75% of the staff and asked the remaining employees to sign a "hardcore commitment", agreeing to long hours and high-intensity output.

Extreme cases:

At the Fremont factory: During the "production hell" of the Model 3, Musk slept on the production line and once yelled at employees, calling them "worse than dogs" (he later apologized).

Firing executives: Musk once fired a Tesla executive on the spot because the font size in their PowerPoint presentation wasn't big enough.

Question 5: Correction and supplement
AI threat theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear weapons"). Therefore:

OpenAI: He initially supported it to promote safe, open-source AI (but later left over disputes about control).

Neuralink: He aims to use brain-machine interfaces to merge humans with AI, to prevent humans from being left behind.

Contradictions:
While warning about AI risks, Musk is simultaneously pushing Full Self-Driving (FSD) technology at Tesla and has founded xAI to research Artificial General Intelligence (AGI).

Overall evaluation:
✅ You captured the key events accurately, but the details need strengthening (such as timelines and specific figures).
🔍 Recommended areas for further study: Musk's multi-threaded management style (running five companies simultaneously) and his skill in media manipulation (governing through Twitter).
```

## Features

### EPUB Support
- Extract metadata (title, author, publication date, etc.)
- Extract table of contents
- Extract chapter content (with Markdown output)
- Batch process EPUB files

### PDF Support
- Extract metadata (title, author, creation date, etc.)
- Extract table of contents
- Extract content by page number
- Extract content by chapter title
- Markdown output support
- Batch process PDF files

## Installation

1. Clone the repository:
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. Install dependencies using `uv`:
```bash
uv pip install -e .
```

## Usage

### Starting the MCP Server in Development Mode

Run the server in development mode:
```bash
uv run mcp dev src/ebook_mcp/main.py
```

You can visit http://localhost:5173/ for testing and debugging.
You can also launch the inspector directly for testing:
```bash
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### Starting the MCP Server in Production Mode

Run the server:
```bash
uv run src/ebook_mcp/main.py
```

#### Configure the MCP in Cursor

Add the following configuration in Cursor:
```json
"ebook-mcp":{
            "command": "uv",
            "args": [
                "--directory",
                "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
                "run",
                "main.py"
            ]
        }
```

### EPUB Processing Examples

```python
# Get all EPUB files in a directory
epub_files = get_all_epub_files("/path/to/books")

# Get EPUB metadata
metadata = get_metadata("/path/to/book.epub")

# Get the table of contents
toc = get_toc("/path/to/book.epub")

# Get the content of a specific chapter (in Markdown format)
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")
```
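
`get_metadata` is documented as returning `Dict[str, Union[str, List[str]]]`, so any value may be either a single string or a list of strings. A minimal sketch of flattening such a dictionary for display, using made-up sample data rather than a real EPUB; the helper name `format_metadata` and the sample keys are hypothetical.

```python
from typing import Dict, List, Union

Metadata = Dict[str, Union[str, List[str]]]


def format_metadata(metadata: Metadata) -> str:
    """Render metadata as "key: value" lines, joining list values with commas."""
    lines = []
    for key in sorted(metadata):
        value = metadata[key]
        text = ", ".join(value) if isinstance(value, list) else value
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


# Sample data standing in for the result of get_metadata("/path/to/book.epub").
sample = {"title": "Example Book", "creator": ["A. Author", "B. Writer"]}
summary = format_metadata(sample)
print(summary)
```

Checking the value type before joining avoids iterating over a plain string character by character.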

### PDF Processing Examples

```python
# Get all PDF files in a directory
pdf_files = get_all_pdf_files("/path/to/books")

# Get PDF metadata
metadata = get_pdf_metadata("/path/to/book.pdf")

# Get the table of contents
toc = get_pdf_toc("/path/to/book.pdf")

# Get the content of a specific page
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# Get the content of a specific chapter
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```
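
Chapter-based extraction depends on knowing which pages belong to which chapter. As an illustration only (this is not necessarily how `get_pdf_chapter_content` works internally), page spans can be derived from a table of contents, assuming each entry is a `(title, start_page)` pair sorted by page. The helper `chapter_page_spans` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def chapter_page_spans(
    toc: List[Tuple[str, int]], last_page: int
) -> Dict[str, Tuple[int, int]]:
    """Map each chapter title to its inclusive (first_page, last_page) span."""
    spans: Dict[str, Tuple[int, int]] = {}
    for index, (title, start) in enumerate(toc):
        if index + 1 < len(toc):
            # A chapter ends on the page before the next chapter starts,
            # but never before its own first page.
            end = max(start, toc[index + 1][1] - 1)
        else:
            end = last_page
        spans[title] = (start, end)
    return spans


# Sample data standing in for the result of get_pdf_toc("/path/to/book.pdf").
sample_toc = [("Chapter 1", 1), ("Chapter 2", 12), ("Chapter 3", 30)]
spans = chapter_page_spans(sample_toc, last_page=45)
print(spans)
```

A PDF without a table of contents gives this approach nothing to work with, which is why some features depend on one being present.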

## API Reference

### EPUB APIs

#### `get_all_epub_files(path: str) -> List[str]`
Get all EPUB files in the specified directory.

#### `get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from an EPUB file.

#### `get_toc(epub_path: str) -> List[Tuple[str, str]]`
Get the table of contents from an EPUB file.

#### `get_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Get chapter content in Markdown format.

### PDF APIs

#### `get_all_pdf_files(path: str) -> List[str]`
Get all PDF files in the specified directory.

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from a PDF file.

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
Get the table of contents from a PDF file.

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
Get plain-text content from a specific page.

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
Get Markdown-formatted content from a specific page.

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
Get chapter content and the corresponding page numbers by chapter title.

## Dependencies

Key dependencies include:
- ebooklib: EPUB file processing
- PyPDF2: Basic PDF processing
- PyMuPDF: Advanced PDF processing
- beautifulsoup4: HTML parsing
- html2text: HTML to Markdown conversion
- pydantic: Data validation
- fastmcp: MCP server framework

## Important Notes

1. PDF processing relies on the document's table of contents. Some features may not work if the table of contents is not available.
2. For large PDF files, process by page range to avoid loading the entire file at once.
3. EPUB chapter IDs must be obtained from the table of contents structure.

## Architecture

```
           ┌────────────────────────────┐
           │         Agent Layer        │
           │  - Translation Strategy    │
           │  - Style Consistency Check │
           │  - LLM Call & Interaction  │
           └────────────▲───────────────┘
                        │ Tool Calls
           ┌────────────┴───────────────┐
           │        MCP Tool Layer      │
           │  - extract_chapter         │
           │  - write_translated_chapter│
           │  - generate_epub           │
           └────────────▲───────────────┘
                        │ System/IO Calls
           ┌────────────┴───────────────┐
           │     System Base Layer      │
           │  - File Reading            │
           │  - ebooklib Parsing        │
           │  - File Path Storage/Check │
           └────────────────────────────┘
```

## Contributing

We welcome Issues and Pull Requests!

For detailed information about recent changes, see [CHANGELOG.md](CHANGELOG.md).

================================================
FILE: README-FR.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">

[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md)

Ebook-MCP is a powerful Model Context Protocol (MCP) server for processing electronic books. Built on the [Model Context Protocol](https://github.com/modelcontextprotocol), it provides a set of standardized APIs for integrating LLM applications with e-book processing capabilities. It currently supports the EPUB and PDF formats.

## Use Cases and Value

Ebook-MCP changes how you interact with your digital books by enabling natural-language conversations with your reading material. It integrates with AI-powered tools such as Cursor and Claude, allowing you to:

- **Smart library management**: Simply ask "Show me all EPUB files in my downloads folder" or "Find books about GenAI in my library"
- **Interactive reading experience**: Have natural conversations about your books:
  - "Give me a brief introduction to 'LLM Engineer Handbook'"
  - "What does Chapter 3 cover?"
  - "Summarize the key points about RAG from this book"
- **Active learning support**: Enhance your learning through AI-powered interactions:
  - "Create a quiz based on the RAG concepts from Chapter 5"
  - "Explain the differences between the architectures discussed in this chapter"
  - "Give me practical exercises based on the concepts in this section"
- **Content navigation**: Move through your books with natural-language queries:
  - "Find all sections discussing prompt engineering"
  - "Show me the chapters about fine-tuning"
  - "Take me to the part about vector databases"

By bridging the gap between traditional e-books and AI capabilities, Ebook-MCP helps readers get more value from their digital library through intuitive, conversation-based interactions.

Quick demo, run with DeepSeek (see the `mcp_client_example` folder for the Claude and OpenAI examples):
```
Query: Please check all the EPUB books in my download folder. The user ID is onebird.
......

Query: Can you ask me a few questions to test my understanding of this book?

Answer: Of course! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your knowledge; I will give feedback and add details after each answer.

Question 1:
How did Musk's childhood and family background shape his personality and career later in life?
(Hint: His relationship with his father, growing up in South Africa, etc.)

Question 2:
Zip2 and X.com (later PayPal) were Musk's first startups. What problems did they solve? How did Musk eventually leave these companies?
(Hint: Zip2's business model, X.com's merger with Confinity, etc.)

Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1's fourth launch, the NASA contract, Tesla's last-minute rescue, etc.)

Question 4:
What controversies surround Musk's management style? Which specific incidents in the book illustrate his "hardcore" or "extreme" work culture?
(Hint: The Fremont factory "production hell", the Twitter layoffs, etc.)

Question 5:
What is Musk's attitude toward artificial intelligence (AI)? Why did he support both OpenAI and Neuralink at the same time, when they seem to have conflicting goals?
(Hint: AI as an existential threat, the brain-machine interface vision, etc.)

Please answer each question. I will provide feedback and additional details, or correct any misunderstandings!
(If a question is unclear, feel free to ask me.)

Query:
Question 1: His father's abusive attitude made him less empathetic towards others.
Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved online payments.
Question 3: He secured a contract with NASA. Tesla solved its production problems by building a factory in China.
Question 4: I don't know. Please answer for me.
Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

Answer: Your answers captured some key points from the book. Let me add details, correct some inaccuracies, and provide more context for Questions 4 and 5:

Question 1: Correction and supplement
Influence of his father:
Musk's father, Errol, was indeed extremely harsh. For example, he forced the young Elon to kneel and count grains of rice for hours. The book emphasizes that this emotional desensitization shaped Musk into an extremely goal-oriented person: indifferent to others' suffering but obsessively committed to his goals.

Reverse impact:
It's not that Musk is incapable of empathy; it's selective. For instance, he may treat employees coldly but personally study medical equipment for his son X's health issues.

Question 2: Correction and supplement
Zip2:
Essentially an online Yellow Pages plus map navigation service (helping newspapers go online). Musk wrote much of the initial code himself and slept in the office. Eventually, Compaq acquired Zip2 for $307 million (Musk received $22 million).

X.com:
Not just online payments: it was Musk's attempt to create a comprehensive financial platform (aiming to replace banks). It later merged with Confinity (the precursor of PayPal). Musk was ousted due to internal conflicts, but when eBay acquired PayPal, Musk earned $180 million.
(Note: Musk later bought back the X.com domain; it is now the name of Twitter's parent company.)

Question 3: Correction and supplement
SpaceX:
In 2008, Falcon 1 succeeded on its fourth launch (after three failures that had drained the company's funds). Right after that, SpaceX secured a $1.6 billion NASA contract for cargo resupply, a lifeline for the company.

Tesla:
In 2008, Tesla survived thanks to a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
Building the Gigafactory in China (2019) came much later. In the early years, survival depended on the launch of the Model S (2012) and Musk's personal gamble: selling all his real estate to fund Tesla.

Question 4: Explanation of the controversy around his management style
"Hardcore" culture:

At SpaceX/Tesla: Musk demanded that employees work 80+ hours per week, even sleeping at the factory. ("If you're not here, your replacement is.")

After the Twitter acquisition: He immediately laid off 75% of the staff and asked the remaining employees to sign a "hardcore commitment", accepting long hours and high-intensity output.

Extreme cases:

At the Fremont factory: During the "production hell" of the Model 3, Musk slept on the production line and once yelled at employees, calling them "worse than dogs" (he later apologized).

Firing executives: Musk once fired a senior Tesla executive on the spot because the font size in their PowerPoint presentation wasn't big enough.

Question 5: Correction and supplement
AI threat theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear weapons"). Therefore:

OpenAI: He initially supported it to promote safe, open-source AI (but later left over disputes about control).

Neuralink: He aims to use brain-machine interfaces to merge humans with AI, to prevent humans from being left behind.

Contradictions:
While warning about AI risks, Musk is simultaneously pushing Full Self-Driving (FSD) technology at Tesla and has founded xAI to research Artificial General Intelligence (AGI).

Overall evaluation:
✅ You captured the key events accurately, but the details need strengthening (such as timelines and specific figures).
🔍 Recommended areas for further study: Musk's multi-threaded management style (running five companies simultaneously) and his skill in media manipulation (governing through Twitter).
```

## Fonctionnalités

### Support EPUB
- Extraire les métadonnées (titre, auteur, date de publication, etc.)
- Extraire la table des matières
- Extraire le contenu des chapitres (avec sortie Markdown)
- Traitement par lots des fichiers EPUB

### Support PDF
- Extraire les métadonnées (titre, auteur, date de création, etc.)
- Extraire la table des matières
- Extraire le contenu par numéro de page
- Extraire le contenu par titre de chapitre
- Support de sortie Markdown
- Traitement par lots des fichiers PDF

## Installation

1. Cloner le dépôt :
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. Installer les dépendances en utilisant `uv` :
```bash
uv pip install -e .
```

## Utilisation

### Démarrer le serveur MCP en mode développement

Exécuter le serveur en mode développement :
```bash
uv run mcp dev src/ebook_mcp/main.py
```

Vous pouvez ouvrir http://localhost:5173/ pour tester et déboguer le serveur.
Vous pouvez également lancer le serveur via l'inspecteur MCP :
```bash
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### Démarrer le serveur MCP en mode production

Exécuter le serveur :
```bash
uv run src/ebook_mcp/main.py
```

#### Configurer le MCP dans Cursor

Ajouter la configuration suivante dans Cursor :
```json
"ebook-mcp": {
    "command": "uv",
    "args": [
        "--directory",
        "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
        "run",
        "main.py"
    ]
}
```

### Exemples de traitement EPUB

```python
# Obtenir tous les fichiers EPUB dans un répertoire
epub_files = get_all_epub_files("/path/to/books")

# Obtenir les métadonnées EPUB
metadata = get_metadata("/path/to/book.epub")

# Obtenir la table des matières
toc = get_toc("/path/to/book.epub")

# Obtenir le contenu d'un chapitre spécifique (au format Markdown)
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")
```

### Exemples de traitement PDF

```python
# Obtenir tous les fichiers PDF dans un répertoire
pdf_files = get_all_pdf_files("/path/to/books")

# Obtenir les métadonnées PDF
metadata = get_pdf_metadata("/path/to/book.pdf")

# Obtenir la table des matières
toc = get_pdf_toc("/path/to/book.pdf")

# Obtenir le contenu d'une page spécifique
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# Obtenir le contenu d'un chapitre spécifique
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```

## Référence API

### APIs EPUB

#### `get_all_epub_files(path: str) -> List[str]`
Obtenir tous les fichiers EPUB dans le répertoire spécifié.

#### `get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
Obtenir les métadonnées d'un fichier EPUB.

#### `get_toc(epub_path: str) -> List[Tuple[str, str]]`
Obtenir la table des matières d'un fichier EPUB.

#### `get_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Obtenir le contenu du chapitre au format Markdown.

### APIs PDF

#### `get_all_pdf_files(path: str) -> List[str]`
Obtenir tous les fichiers PDF dans le répertoire spécifié.

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
Obtenir les métadonnées d'un fichier PDF.

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
Obtenir la table des matières d'un fichier PDF.

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
Obtenir le contenu en texte brut d'une page spécifique.

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
Obtenir le contenu formaté Markdown d'une page spécifique.

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
Obtenir le contenu du chapitre et les numéros de page correspondants par titre de chapitre.

## Dépendances

Les dépendances clés incluent :
- ebooklib : Traitement des fichiers EPUB
- PyPDF2 : Traitement PDF de base
- PyMuPDF : Traitement PDF avancé
- beautifulsoup4 : Analyse HTML
- html2text : Conversion HTML vers Markdown
- pydantic : Validation des données
- fastmcp : Framework de serveur MCP

## Notes importantes

1. Le traitement PDF dépend de la table des matières du document. Certaines fonctionnalités peuvent ne pas fonctionner si la table des matières n'est pas disponible.
2. Pour les gros fichiers PDF, il est recommandé de traiter par plages de pages pour éviter de charger tout le fichier en une fois.
3. Les ID de chapitre EPUB doivent être obtenus à partir de la structure de la table des matières.

## Architecture

```
           ┌────────────────────────────┐
           │      Couche Agent          │
           │  - Stratégie de traduction │
           │  - Vérification cohérence  │
           │  - Appel LLM & Interaction │
           └────────────▲───────────────┘
                        │ Appels d'outils
           ┌────────────┴───────────────┐
           │     Couche Outils MCP      │
           │  - extract_chapter         │
           │  - write_translated_chapter│
           │  - generate_epub           │
           └────────────▲───────────────┘
                        │ Appels Système/IO
           ┌────────────┴───────────────┐
           │   Couche Base Système      │
           │  - Lecture de fichiers     │
           │  - Analyse ebooklib        │
           │  - Stockage/Vérif. chemin  │
           └────────────────────────────┘
```

## Contribution

Nous accueillons les Issues et Pull Requests !

Pour des informations détaillées sur les changements récents, veuillez consulter [CHANGELOG.md](CHANGELOG.md).

================================================
FILE: README-JP.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">

[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md)

Ebook-MCPは、電子書籍を処理するための強力なModel Context Protocol（MCP）サーバーです。[Model Context Protocol](https://github.com/modelcontextprotocol)上に構築され、LLMアプリケーションと電子書籍処理機能間のシームレスな統合のための標準化されたAPIセットを提供します。現在、EPUBおよびPDF形式をサポートしています。

## 使用例と価値

Ebook-MCPは、読書資料との自然言語による会話を可能にすることで、デジタル書籍との相互作用を変革します。CursorやClaudeなどの現代的なAI駆動IDEとシームレスに統合し、以下のことを可能にします：

- **スマートライブラリ管理**：「ダウンロードフォルダ内のすべてのEPUBファイルを表示して」や「ライブラリ内のGenAIに関する本を見つけて」と簡単に尋ねることができます
- **インタラクティブな読書体験**：本について自然な会話を行う：
  - 「『LLMエンジニアハンドブック』の簡単な紹介をして」
  - 「第3章では何が扱われていますか？」
  - 「この本からRAGに関する要点をまとめて」
- **アクティブラーニングサポート**：AI駆動のインタラクションを通じて学習を強化：
  - 「第5章のRAG概念に基づいてクイズを作成して」
  - 「この章で議論されているアーキテクチャの違いを説明して」
  - 「このセクションの概念に基づいて実践的な演習を提供して」
- **コンテンツナビゲーション**：自然言語クエリで本を簡単にナビゲート：
  - 「プロンプトエンジニアリングについて議論しているすべてのセクションを見つけて」
  - 「ファインチューニングに関する章を表示して」
  - 「ベクターデータベースに関する部分に案内して」

従来の電子書籍とAI機能の間のギャップを埋めることで、Ebook-MCPは読者が直感的で会話ベースのインタラクションを通じてデジタルライブラリからより多くの価値を抽出するのを支援します。

クイックデモ - DeepSeekで実行（Claude/OpenAIの例はすべて`src/ebook_mcp/mcp_client_example`フォルダにあります）
```
クエリ：私のダウンロードフォルダ内のすべてのepub本をチェックしてください。ユーザーIDはonebirdです。
......

クエリ：この本の理解度をテストするためにいくつか質問してもらえますか？

回答：もちろんです！ウォルター・アイザックソンの『イーロン・マスク』についてあなたの理解をテストするための5つの質問があります。あなたの知識に基づいて答えてください — 各回答の後にフィードバックと詳細を追加します。

質問1：
マスクの幼少期と家族背景は、後の人生での彼の性格とキャリアをどのように形作りましたか？
（ヒント：父親との関係、南アフリカでの成長など）

質問2：
Zip2とX.com（後のPayPal）はマスクの初期のスタートアップでした。これらはどのような問題を解決しましたか？マスクは最終的にこれらの会社からどのように退出しましたか？
（ヒント：Zip2のビジネスモデル、X.comとConfinityの合併など）

質問3：
2008年、SpaceXとTeslaの両方が破産の危機に瀕していました。マスクはこれら2つの会社をどのように救いましたか？
（ヒント：Falcon 1の4回目の打ち上げ、NASAの契約、Teslaの土壇場での救済など）

質問4：
マスクの管理スタイルを巡ってどのような論争がありますか？本の中で彼の「ハードコア」または「極端な」労働文化を示す具体的な事件は何ですか？
（ヒント：フリーモント工場の「生産地獄」、Twitterのレイオフなど）

質問5：
人工知能（AI）に対するマスクの態度は何ですか？なぜ彼は同時にOpenAIとNeuralinkの両方を支援したのですか？これらは相反する目標を持っているように見えます。
（ヒント：存在的脅威としてのAI、ブレイン・マシン・インターフェースのビジョンなど）

各質問に答えてください。フィードバックと追加の詳細を提供し、誤解があれば訂正します！
（質問が不明な場合は、お気軽にお尋ねください。）

クエリ：
質問1：父親の虐待的な態度により、彼は他人に対して共感が少なくなりました。
質問2：Zip2は電子地図とオンラインイエローページの問題を解決しました。X.comはオンライン決済を解決しました。
質問3：彼はNASAとの契約を確保しました。Teslaは中国に工場を建設することで生産問題を解決しました。
質問4：わかりません。私のために答えてください。
質問5：彼はAIが安全でオープンであることを確保するためにオープンソース技術が必要だと信じています。

回答：あなたの答えは本からいくつかの重要なポイントを捉えています。質問4と5について詳細を追加し、いくつかの不正確さを訂正し、さらなる背景を提供させてください：

質問1 — 訂正と補足
父親の影響：
マスクの父親エロルは確かに極めて厳しく — 例えば、幼いイーロンに何時間も跪いて米粒を数えることを強制しました。本書は、この感情的な鈍感化がマスクを極めて目標志向の人間に形作ったことを強調しています — 他人の苦痛には無関心だが、自分の目標に執着的にコミットしています。

逆の影響：
マスクが共感能力がないということではありません — それは選択的です。例えば、従業員を冷たく扱うかもしれませんが、息子Xの健康問題のために医療機器を個人的に研究します。

質問2 — 訂正と補足
Zip2：
本質的にはオンラインイエローページ + 地図ナビゲーションサービス（新聞のオンライン化を支援）。マスクは初期のコードの多くを自分で書き、オフィスで寝泊まりしました。最終的に、CompaqがZip2を3億700万ドルで買収（マスクは2200万ドルを受け取りました）。

X.com：
単なるオンライン決済ではありません — これは包括的な金融プラットフォームを作成するマスクの試み（銀行の置き換えを目指していました）。後にConfinity（PayPalの前身）と合併。マスクは内部対立により追放されましたが、eBayがPayPalを買収した際、マスクは1億8000万ドルを稼ぎました。
（注：マスクは後にX.comドメインを買い戻しました — 現在はTwitterの親会社名です）。

質問3 — 訂正と補足
SpaceX：
2008年、Falcon 1は4回目の打ち上げで成功しました（3回の失敗で資金が枯渇した後）。その直後、SpaceXは貨物補給のための16億ドルのNASA契約を確保 — 会社の生命線でした。

Tesla：
2008年、TeslaはDaimlerからの5000万ドルの投資と米国エネルギー省からの融資により生き残りました。
中国でのギガファクトリー建設（2019年）はずっと後のことです。初期の頃、生存はModel Sの発売（2012年）とマスクの個人的な賭け — Teslaに資金提供するためにすべての不動産を売却することに依存していました。

質問4 — 説明：彼の管理スタイルを巡る論争
「ハードコア」文化：

SpaceX/Teslaで：マスクは従業員に週80時間以上の労働を要求し、工場で寝ることさえありました。（「あなたがここにいなければ、あなたの代替者がいます。」）

Twitter買収後：彼は即座にスタッフの75%を解雇し、残りの従業員に「ハードコアコミットメント」への署名を求めました — 長時間と高強度の出力に同意することです。

極端なケース：

フリーモント工場で：Model 3の「生産地獄」の間、マスクは生産ラインで寝て、従業員を「犬より悪い」と呼んで怒鳴ったことがありました（彼は後に謝罪しました）。

幹部の解雇：マスクはPowerPointプレゼンテーションのフォントサイズが十分大きくないという理由で、Teslaの上級幹部をその場で解雇したことがあります。

質問5 — 訂正と補足
AI脅威論：
マスクはAIが人類を破壊する可能性があると信じています（「核兵器より危険」）。したがって：

OpenAI：彼は最初、安全でオープンソースのAIを促進するためにそれを支援しました（しかし後に制御権を巡る争いで離脱）。

Neuralink：彼はブレイン・マシン・インターフェースを使用して人間とAIを融合させることを目指しています — 人間が取り残されることを防ぐためです。

矛盾：
AIリスクについて警告する一方で、マスクは同時にTeslaで完全自動運転（FSD）技術を推進し、人工汎用知能（AGI）を研究するためにxAIを設立しています。

全体的な評価：
✅ あなたは重要な出来事を正確に捉えていますが、詳細を強化する必要があります（タイムラインや具体的な数字など）。
🔍 さらなる学習が推奨される分野：マスクのマルチスレッド管理スタイル（5つの会社を同時に運営）とメディア操作のスキル（Twitterを通じた統治）。
```

## 機能

### EPUBサポート
- メタデータの抽出（タイトル、著者、出版日など）
- 目次の抽出
- 章の内容の抽出（Markdown出力付き）
- EPUBファイルのバッチ処理

### PDFサポート
- メタデータの抽出（タイトル、著者、作成日など）
- 目次の抽出
- ページ番号による内容の抽出
- 章タイトルによる内容の抽出
- Markdown出力サポート
- PDFファイルのバッチ処理

## インストール

1. リポジトリをクローン：
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. `uv`を使用して依存関係をインストール：
```bash
uv pip install -e .
```

## 使用方法

### 開発モードでMCPサーバーを起動

開発モードでサーバーを実行：
```bash
uv run mcp dev src/ebook_mcp/main.py
```

テストとデバッグのために http://localhost:5173/ にアクセスできます。
テスト用にインスペクターをインストールすることもできます。
```bash
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### 本番モードでMCPサーバーを起動

サーバーを実行：
```bash
uv run src/ebook_mcp/main.py
```

#### CursorでMCPを設定

Cursorに以下の設定を追加：
```json
"ebook-mcp": {
    "command": "uv",
    "args": [
        "--directory",
        "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
        "run",
        "main.py"
    ]
}
```

### EPUB処理の例

```python
# ディレクトリ内のすべてのEPUBファイルを取得
epub_files = get_all_epub_files("/path/to/books")

# EPUBメタデータを取得
metadata = get_metadata("/path/to/book.epub")

# 目次を取得
toc = get_toc("/path/to/book.epub")

# 特定の章の内容を取得（Markdown形式）
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")
```

### PDF処理の例

```python
# ディレクトリ内のすべてのPDFファイルを取得
pdf_files = get_all_pdf_files("/path/to/books")

# PDFメタデータを取得
metadata = get_pdf_metadata("/path/to/book.pdf")

# 目次を取得
toc = get_pdf_toc("/path/to/book.pdf")

# 特定のページの内容を取得
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# 特定の章の内容を取得
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```

## APIリファレンス

### EPUB APIs

#### `get_all_epub_files(path: str) -> List[str]`
指定されたディレクトリ内のすべてのEPUBファイルを取得。

#### `get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
EPUBファイルからメタデータを取得。

#### `get_toc(epub_path: str) -> List[Tuple[str, str]]`
EPUBファイルから目次を取得。

#### `get_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Markdown形式で章の内容を取得。

### PDF APIs

#### `get_all_pdf_files(path: str) -> List[str]`
指定されたディレクトリ内のすべてのPDFファイルを取得。

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
PDFファイルからメタデータを取得。

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
PDFファイルから目次を取得。

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
特定のページからプレーンテキスト内容を取得。

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
特定のページからMarkdown形式の内容を取得。

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
章タイトルによって章の内容と対応するページ番号を取得。

## 依存関係

主要な依存関係には以下が含まれます：
- ebooklib：EPUBファイル処理
- PyPDF2：基本的なPDF処理
- PyMuPDF：高度なPDF処理
- beautifulsoup4：HTML解析
- html2text：HTMLからMarkdownへの変換
- pydantic：データ検証
- fastmcp：MCPサーバーフレームワーク

## 重要な注意事項

1. PDF処理は文書の目次に依存します。目次が利用できない場合、一部の機能が動作しない可能性があります。
2. 大きなPDFファイルの場合、ファイル全体を一度に読み込むことを避けるため、ページ範囲で処理することをお勧めします。
3. EPUB章IDは目次構造から取得する必要があります。

## アーキテクチャ

```
           ┌────────────────────────────┐
           │         エージェント層      │
           │  - 翻訳戦略                │
           │  - スタイル一貫性チェック    │
           │  - LLM呼び出しと相互作用    │
           └────────────▲─────────────┘
                        │ ツール呼び出し
           ┌────────────┴─────────────┐
           │        MCPツール層         │
           │  - extract_chapter        │
           │  - write_translated_chapter│
           │  - generate_epub          │
           └────────────▲─────────────┘
                        │ システム/IO呼び出し
           ┌────────────┴─────────────┐
           │     システム基盤層         │
           │  - ファイル読み取り        │
           │  - ebooklib解析           │
           │  - ファイルパス保存/チェック│
           └────────────────────────────┘
```

## 貢献

IssuesとPull Requestsを歓迎します！

最近の変更の詳細については、[CHANGELOG.md](CHANGELOG.md)をご覧ください。

================================================
FILE: README-KR.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">

[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md)

Ebook-MCP는 전자책 처리를 위한 강력한 Model Context Protocol(MCP) 서버입니다. [Model Context Protocol](https://github.com/modelcontextprotocol) 위에 구축되어 LLM 애플리케이션과 전자책 처리 기능 간의 원활한 통합을 위한 표준화된 API 세트를 제공합니다. 현재 EPUB 및 PDF 형식을 지원합니다.

## 사용 사례 및 가치

Ebook-MCP는 독서 자료와의 자연어 대화를 가능하게 하여 디지털 도서와의 상호작용을 변화시킵니다. Cursor 및 Claude와 같은 현대적인 AI 기반 IDE와 원활하게 통합되어 다음을 가능하게 합니다:

- **스마트 라이브러리 관리**: "다운로드 폴더의 모든 EPUB 파일을 보여줘" 또는 "내 라이브러리에서 GenAI에 관한 책을 찾아줘"라고 간단히 물어보세요
- **대화형 독서 경험**: 책에 대해 자연스러운 대화를 나누세요:
  - "'LLM 엔지니어 핸드북'에 대한 간단한 소개를 해줘"
  - "3장에서는 무엇을 다루나요?"
  - "이 책에서 RAG에 대한 핵심 포인트를 요약해줘"
- **능동적 학습 지원**: AI 기반 상호작용을 통해 학습을 향상시키세요:
  - "5장의 RAG 개념을 바탕으로 퀴즈를 만들어줘"
  - "이 장에서 논의된 아키텍처 간의 차이점을 설명해줘"
  - "이 섹션의 개념을 바탕으로 실습 문제를 제공해줘"
- **콘텐츠 내비게이션**: 자연어 쿼리로 책을 쉽게 탐색하세요:
  - "프롬프트 엔지니어링을 논의하는 모든 섹션을 찾아줘"
  - "파인튜닝에 관한 장을 보여줘"
  - "벡터 데이터베이스에 관한 부분으로 안내해줘"

전통적인 전자책과 AI 기능 간의 격차를 해소함으로써, Ebook-MCP는 독자들이 직관적이고 대화 기반의 상호작용을 통해 디지털 라이브러리에서 더 많은 가치를 추출할 수 있도록 도와줍니다.

빠른 데모 - DeepSeek으로 실행 (Claude/OpenAI 예제는 모두 `src/ebook_mcp/mcp_client_example` 폴더에 있습니다)
```
쿼리: 내 다운로드 폴더의 모든 epub 책을 확인해주세요. 사용자 ID는 onebird입니다.
......

쿼리: 이 책에 대한 이해도를 테스트하기 위해 몇 가지 질문을 해주실 수 있나요?

답변: 물론입니다! 월터 아이작슨의 『일론 머스크』에 대한 이해도를 테스트하기 위한 5가지 질문이 있습니다. 당신의 지식을 바탕으로 답변해주세요 — 각 답변 후에 피드백과 세부사항을 추가하겠습니다.

질문 1:
머스크의 어린 시절과 가족 배경이 그의 성격과 후의 경력에 어떤 영향을 미쳤나요?
(힌트: 아버지와의 관계, 남아프리카에서의 성장 등)

질문 2:
Zip2와 X.com(후의 PayPal)은 머스크의 초기 스타트업이었습니다. 이들은 어떤 문제를 해결했나요? 머스크는 결국 이 회사들에서 어떻게 퇴출했나요?
(힌트: Zip2의 비즈니스 모델, X.com과 Confinity의 합병 등)

질문 3:
2008년, SpaceX와 Tesla 모두 파산 위기에 처했습니다. 머스크는 이 두 회사를 어떻게 구했나요?
(힌트: Falcon 1의 네 번째 발사, NASA 계약, Tesla의 막판 구조 등)

질문 4:
머스크의 관리 스타일을 둘러싼 논란은 무엇인가요? 책에서 그의 "하드코어" 또는 "극단적인" 업무 문화를 보여주는 구체적인 사건은 무엇인가요?
(힌트: 프리몬트 공장의 "생산 지옥", 트위터 해고 등)

질문 5:
인공지능(AI)에 대한 머스크의 태도는 무엇인가요? 왜 그는 상충하는 목표를 가진 것처럼 보이는 OpenAI와 Neuralink를 동시에 지원했나요?
(힌트: 존재적 위협으로서의 AI, 뇌-기계 인터페이스 비전 등)

각 질문에 답변해주세요. 피드백과 추가 세부사항을 제공하거나 오해가 있다면 수정하겠습니다!
(질문이 명확하지 않다면 언제든 물어보세요.)

쿼리:
질문 1: 아버지의 학대적인 태도로 인해 그는 다른 사람들에 대한 공감이 부족해졌습니다.
질문 2: Zip2는 전자 지도와 온라인 옐로우페이지 문제를 해결했습니다. X.com은 온라인 결제를 해결했습니다.
질문 3: 그는 NASA와 계약을 확보했습니다. Tesla는 중국에 공장을 건설하여 생산 문제를 해결했습니다.
질문 4: 모르겠습니다. 저를 위해 답변해주세요.
질문 5: 그는 AI가 안전하고 개방적이도록 보장하기 위해 오픈소스 기술이 필요하다고 믿습니다.

답변: 당신의 답변은 책의 몇 가지 핵심 포인트를 잘 포착했습니다. 질문 4와 5에 대한 세부사항을 추가하고, 일부 부정확한 부분을 수정하며, 추가 배경을 제공하겠습니다:

질문 1 — 수정 및 보완
아버지의 영향:
머스크의 아버지 에롤은 확실히 극도로 가혹했습니다 — 예를 들어, 어린 일론에게 무릎을 꿇고 몇 시간 동안 쌀알을 세도록 강요했습니다. 책은 이러한 감정적 둔감화가 머스크를 극도로 목표 지향적인 사람으로 만들었다고 강조합니다 — 타인의 고통에는 무관심하지만 자신의 목표에는 강박적으로 헌신합니다.

역방향 영향:
머스크가 공감 능력이 없다는 것이 아니라 — 선택적입니다. 예를 들어, 직원들을 차갑게 대할 수 있지만 아들 X의 건강 문제를 위해 의료 장비를 개인적으로 연구합니다.

질문 2 — 수정 및 보완
Zip2:
본질적으로 온라인 옐로우페이지 + 지도 내비게이션 서비스(신문의 온라인화 지원). 머스크는 초기 코드의 대부분을 직접 작성했고 사무실에서 잠을 잤습니다. 결국 Compaq이 Zip2를 3억 700만 달러에 인수했습니다(머스크는 2200만 달러를 받았습니다).

X.com:
단순한 온라인 결제가 아니라 — 포괄적인 금융 플랫폼을 만들려는 머스크의 시도였습니다(은행을 대체하는 것을 목표로). 나중에 Confinity(PayPal의 전신)와 합병했습니다. 머스크는 내부 갈등으로 축출되었지만, eBay가 PayPal을 인수했을 때 머스크는 1억 8천만 달러를 벌었습니다.
(참고: 머스크는 나중에 X.com 도메인을 다시 구매했습니다 — 현재 트위터의 모회사 이름입니다).

질문 3 — 수정 및 보완
SpaceX:
2008년, Falcon 1은 네 번째 발사에서 성공했습니다(세 번의 실패로 자금이 고갈된 후). 바로 그 후, SpaceX는 화물 보급을 위한 16억 달러의 NASA 계약을 확보했습니다 — 회사의 생명줄이었습니다.

Tesla:
2008년, Tesla는 Daimler의 5천만 달러 투자와 미국 에너지부의 대출을 통해 생존했습니다.
중국에서의 기가팩토리 건설(2019년)은 훨씬 나중의 일입니다. 초기에는 Model S 출시(2012년)와 머스크의 개인적 도박 — Tesla에 자금을 제공하기 위해 모든 부동산을 매각하는 것에 생존이 달려 있었습니다.

질문 4 — 설명: 그의 관리 스타일을 둘러싼 논란
"하드코어" 문화:

SpaceX/Tesla에서: 머스크는 직원들에게 주당 80시간 이상 일하도록 요구했고, 심지어 공장에서 잠을 자기도 했습니다. ("당신이 여기 없다면, 당신의 대체자가 있습니다.")

트위터 인수 후: 그는 즉시 직원의 75%를 해고하고 나머지 직원들에게 "하드코어 약속"에 서명하도록 요구했습니다 — 장시간과 고강도 업무에 동의하는 것입니다.

극단적인 사례:

프리몬트 공장에서: Model 3의 "생산 지옥" 기간 동안, 머스크는 생산 라인에서 잠을 자고 직원들을 "개보다 못하다"고 소리치며 욕했습니다(그는 나중에 사과했습니다).

임원 해고: 머스크는 PowerPoint 프레젠테이션의 글꼴 크기가 충분히 크지 않다는 이유로 Tesla 고위 임원을 즉석에서 해고한 적이 있습니다.

질문 5 — 수정 및 보완
AI 위협론:
머스크는 AI가 인류를 파괴할 수 있다고 믿습니다("핵무기보다 위험"). 따라서:

OpenAI: 그는 처음에 안전하고 오픈소스인 AI를 촉진하기 위해 지원했습니다(하지만 나중에 통제권 분쟁으로 떠났습니다).

Neuralink: 그는 뇌-기계 인터페이스를 사용하여 인간과 AI를 융합시키는 것을 목표로 합니다 — 인간이 뒤처지는 것을 방지하기 위해서입니다.

모순:
AI 위험에 대해 경고하면서도, 머스크는 동시에 Tesla에서 완전 자율주행(FSD) 기술을 추진하고 있으며 인공일반지능(AGI)을 연구하기 위해 xAI를 설립했습니다.

전체 평가:
✅ 당신은 주요 사건들을 정확하게 포착했지만, 세부사항을 강화할 필요가 있습니다(타임라인과 구체적인 수치 등).
🔍 추가 학습이 권장되는 영역: 머스크의 멀티스레드 관리 스타일(5개 회사를 동시에 운영)과 미디어 조작 기술(트위터를 통한 통치).
```

## 기능

### EPUB 지원
- 메타데이터 추출 (제목, 저자, 출판일 등)
- 목차 추출
- 장 내용 추출 (Markdown 출력 포함)
- EPUB 파일 일괄 처리

### PDF 지원
- 메타데이터 추출 (제목, 저자, 생성일 등)
- 목차 추출
- 페이지 번호별 내용 추출
- 장 제목별 내용 추출
- Markdown 출력 지원
- PDF 파일 일괄 처리

## 설치

1. 저장소 클론:
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. `uv`를 사용하여 종속성 설치:
```bash
uv pip install -e .
```

## 사용법

### 개발 모드에서 MCP 서버 시작

개발 모드에서 서버 실행:
```bash
uv run mcp dev src/ebook_mcp/main.py
```

테스트 및 디버깅을 위해 http://localhost:5173/ 을 방문할 수 있습니다.
테스트를 위해 인스펙터를 설치할 수도 있습니다.
```bash
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### 프로덕션 모드에서 MCP 서버 시작

서버 실행:
```bash
uv run src/ebook_mcp/main.py
```

#### Cursor에서 MCP 구성

Cursor에 다음 구성 추가:
```json
"ebook-mcp": {
    "command": "uv",
    "args": [
        "--directory",
        "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
        "run",
        "main.py"
    ]
}
```

### EPUB 처리 예제

```python
# 디렉토리의 모든 EPUB 파일 가져오기
epub_files = get_all_epub_files("/path/to/books")

# EPUB 메타데이터 가져오기
metadata = get_metadata("/path/to/book.epub")

# 목차 가져오기
toc = get_toc("/path/to/book.epub")

# 특정 장 내용 가져오기 (Markdown 형식)
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")
```

### PDF 처리 예제

```python
# 디렉토리의 모든 PDF 파일 가져오기
pdf_files = get_all_pdf_files("/path/to/books")

# PDF 메타데이터 가져오기
metadata = get_pdf_metadata("/path/to/book.pdf")

# 목차 가져오기
toc = get_pdf_toc("/path/to/book.pdf")

# 특정 페이지 내용 가져오기
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# 특정 장 내용 가져오기
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```

## API 참조

### EPUB APIs

#### `get_all_epub_files(path: str) -> List[str]`
지정된 디렉토리의 모든 EPUB 파일을 가져옵니다.

#### `get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
EPUB 파일에서 메타데이터를 가져옵니다.

#### `get_toc(epub_path: str) -> List[Tuple[str, str]]`
EPUB 파일에서 목차를 가져옵니다.

#### `get_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Markdown 형식으로 장 내용을 가져옵니다.

### PDF APIs

#### `get_all_pdf_files(path: str) -> List[str]`
지정된 디렉토리의 모든 PDF 파일을 가져옵니다.

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
PDF 파일에서 메타데이터를 가져옵니다.

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
PDF 파일에서 목차를 가져옵니다.

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
특정 페이지에서 일반 텍스트 내용을 가져옵니다.

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
특정 페이지에서 Markdown 형식 내용을 가져옵니다.

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
장 제목으로 장 내용과 해당 페이지 번호를 가져옵니다.

## 종속성

주요 종속성은 다음과 같습니다:
- ebooklib: EPUB 파일 처리
- PyPDF2: 기본 PDF 처리
- PyMuPDF: 고급 PDF 처리
- beautifulsoup4: HTML 파싱
- html2text: HTML에서 Markdown으로 변환
- pydantic: 데이터 검증
- fastmcp: MCP 서버 프레임워크

## 중요 사항

1. PDF 처리는 문서의 목차에 의존합니다. 목차를 사용할 수 없는 경우 일부 기능이 작동하지 않을 수 있습니다.
2. 큰 PDF 파일의 경우 전체 파일을 한 번에 로드하는 것을 피하기 위해 페이지 범위별로 처리하는 것이 좋습니다.
3. EPUB 장 ID는 목차 구조에서 가져와야 합니다.

## 아키텍처

```
           ┌────────────────────────────┐
           │         에이전트 레이어     │
           │  - 번역 전략               │
           │  - 스타일 일관성 검사       │
           │  - LLM 호출 및 상호작용     │
           └────────────▲─────────────┘
                        │ 도구 호출
           ┌────────────┴─────────────┐
           │        MCP 도구 레이어    │
           │  - extract_chapter        │
           │  - write_translated_chapter│
           │  - generate_epub          │
           └────────────▲─────────────┘
                        │ 시스템/IO 호출
           ┌────────────┴─────────────┐
           │     시스템 기반 레이어    │
           │  - 파일 읽기              │
           │  - ebooklib 파싱          │
           │  - 파일 경로 저장/검사     │
           └────────────────────────────┘
```

## 기여

Issues와 Pull Requests를 환영합니다!

최근 변경사항에 대한 자세한 정보는 [CHANGELOG.md](CHANGELOG.md)를 참조하세요.

================================================
FILE: README.md
================================================
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/onebirdrocks-ebook-mcp)

# Ebook-MCP

<img src="logo.png" alt="ebook-mcp logo" width="132" height="132">


[English](https://github.com/onebirdrocks/ebook-mcp/blob/main/README.md) | [中文](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-CN.md) | [日本語](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-JP.md) | [한국어](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-KR.md) | [Français](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-FR.md) | [Deutsch](https://github.com/onebirdrocks/ebook-mcp/blob/main/README-DE.md) 

Ebook-MCP is a Model Context Protocol (MCP) server for processing electronic books. Built on the [Model Context Protocol](https://github.com/modelcontextprotocol), it provides a set of standardized APIs that connect LLM applications to e-book processing capabilities. It currently supports EPUB and PDF formats.


## Use Cases & Value

Ebook-MCP changes how you interact with your digital books by enabling natural language conversations with your reading materials. It integrates with AI-powered tools such as Cursor and Claude, allowing you to:

- **Smart Library Management**: Simply ask "Show me all EPUB files in my downloads folder" or "Find books about GenAI in my library"
- **Interactive Reading Experience**: Have natural conversations about your books:
  - "Give me a brief introduction to 'LLM Engineer Handbook'"
  - "What's covered in Chapter 3?"
  - "Summarize the key points about RAG from this book"
- **Active Learning Support**: Enhance your learning through AI-powered interactions:
  - "Create a quiz based on the RAG concepts from Chapter 5"
  - "Explain the differences between the architectures discussed in this chapter"
  - "Give me practical exercises based on the concepts in this section"
- **Content Navigation**: Easily navigate through your books with natural language queries:
  - "Find all sections discussing prompt engineering"
  - "Show me the chapters about fine-tuning"
  - "Take me to the part about vector databases"

By bridging the gap between traditional e-books and AI capabilities, Ebook-MCP helps readers extract more value from their digital library through intuitive, conversation-based interactions.

Quick demo, run with DeepSeek (the Claude and OpenAI examples are in the `src/ebook_mcp/mcp_client_example` folder):
```
Query: Please check all the EPUB books in my Downloads folder. My user ID is onebird.
......

Query: Can you ask me a few questions to test my understanding of this book?


Answer: Sure! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your knowledge — I’ll give feedback and add details after each response.

Question 1:
How did Musk’s childhood and family background shape his personality and career later in life?
(Hint: His relationship with his father, growing up in South Africa, etc.)

Question 2:
Zip2 and X.com (later PayPal) were Musk’s early startups. What problems did they solve? How did Musk eventually exit from these companies?
(Hint: Zip2’s business model, X.com’s merger with Confinity, etc.)

Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1’s fourth launch, NASA contract, Tesla’s last-minute rescue, etc.)

Question 4:
What controversies surround Musk’s management style? What specific incidents in the book illustrate his “hardcore” or “extreme” work culture?
(Hint: Fremont factory’s “production hell,” Twitter layoffs, etc.)

Question 5:
What is Musk’s attitude toward Artificial Intelligence (AI)? Why did he simultaneously support both OpenAI and Neuralink, which seem to have conflicting goals?
(Hint: AI as an existential threat, brain-machine interface vision, etc.)

Please answer each question. I’ll provide feedback and additional details or correct any misunderstandings!
(If a question isn’t clear, feel free to ask me.)

Query:
Question 1: His father’s abusive attitude made him less empathetic towards others.
Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved online payments.
Question 3: He secured a contract with NASA. Tesla solved its production problems by building a factory in China.
Question 4: I don’t know. Please answer for me.
Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

Answer: Your answers have captured some key points from the book. Let me add details, correct some inaccuracies, and provide further background for Questions 4 and 5:

Question 1 — Correction & Supplement
Influence of His Father:
Musk's father, Errol, was indeed extremely harsh — for example, forcing young Elon to kneel and count grains of rice for hours. The book emphasizes that this emotional desensitization shaped Musk into an extremely goal-oriented person — indifferent to others’ suffering but obsessively committed to his goals.

Reverse Impact:
It’s not that Musk is incapable of empathy — it’s selective. For instance, he may treat employees coldly but personally study medical equipment for his son X’s health issues.

Question 2 — Correction & Supplement
Zip2:
Essentially an online Yellow Pages + map navigation service (helping newspapers go online). Musk wrote much of the early code himself and slept in the office. Eventually, Compaq acquired Zip2 for $307 million (Musk received $22 million).

X.com:
Not just online payments — it was Musk’s attempt to create a comprehensive financial platform (aiming to replace banks). Later merged with Confinity (the precursor of PayPal). Musk was ousted due to internal conflicts, but when eBay acquired PayPal, Musk earned $180 million.
(Note: Musk later bought back the domain X.com — it’s now Twitter’s parent company name).

Question 3 — Correction & Supplement
SpaceX:
In 2008, Falcon 1 succeeded on its fourth launch (after three failures drained their funds). Right after that, SpaceX secured a $1.6 billion NASA contract for cargo resupply — a lifeline for the company.

Tesla:
In 2008, Tesla survived through a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
Building the Gigafactory in China (2019) came much later. In the early years, survival depended on the launch of Model S (2012) and Musk’s personal gamble — selling all his real estate to fund Tesla.

Question 4 — Explanation: Controversy Around His Management Style
“Hardcore” Culture:

At SpaceX/Tesla: Musk demanded employees work 80+ hours per week, even sleeping at the factory. ("If you’re not here, your replacement is.")

After acquiring Twitter: He immediately laid off 75% of staff and asked the remaining employees to sign a "hardcore commitment" — agreeing to long hours and high-intensity output.

Extreme Cases:

At the Fremont Factory: During the "production hell" of the Model 3, Musk slept on the production line and once yelled at employees calling them "worse than dogs" (he later apologized).

Firing Executives: Musk once fired a senior Tesla executive on the spot because the font size in their PowerPoint presentation wasn’t big enough.

Question 5 — Correction & Supplement
AI Threat Theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear weapons"). Therefore:

OpenAI: He initially supported it to promote safe, open-source AI (but left later due to disputes over control).

Neuralink: He aims to use brain-machine interfaces to merge humans with AI — to prevent humans from being left behind.

Contradictions:
While warning about AI risks, Musk is simultaneously pushing for Full Self-Driving (FSD) technology at Tesla and has founded xAI to research Artificial General Intelligence (AGI).

Overall Evaluation:
✅ You have captured the key events accurately, but the details need to be strengthened (such as timelines and specific figures).
🔍 Recommended areas for further study: Musk's multi-threaded management style (running five companies simultaneously) and his skill in media manipulation (governing through Twitter).
```

## Features

### EPUB Support
- Extract metadata (title, author, publication date, etc.)
- Extract table of contents
- Extract chapter content (with Markdown output)
- Batch process EPUB files

### PDF Support
- Extract metadata (title, author, creation date, etc.)
- Extract table of contents
- Extract content by page number
- Extract content by chapter title
- Markdown output support
- Batch process PDF files

## Installation

1. Clone the repository:
```bash
git clone https://github.com/onebirdrocks/ebook-mcp.git
cd ebook-mcp
```

2. Install dependencies using `uv`:
```bash
uv pip install -e .
```

## Usage

### Starting the MCP Server in development mode

Run the server in development mode:
```bash
uv run mcp dev src/ebook_mcp/main.py
```


Open http://localhost:5173/ to test and debug the server.
You can also launch the server through the MCP Inspector:
```bash
npx @modelcontextprotocol/inspector uv --directory . run src/ebook_mcp/main.py
```

### Starting the MCP Server in production mode

Run the server:
```bash
uv run src/ebook_mcp/main.py
```


#### Configure the MCP in Cursor

Add the following entry to Cursor's MCP configuration, replacing the directory with the path to your own checkout:
```json
"ebook-mcp": {
    "command": "uv",
    "args": [
        "--directory",
        "/Users/onebird/github/ebook-mcp/src/ebook_mcp/",
        "run",
        "main.py"
    ]
}
```
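
The directory in the entry above is specific to one machine. As a minimal sketch, the same entry can be generated for any checkout location with the Python standard library. The helper name `cursor_entry` and the path `/path/to/ebook-mcp` are hypothetical placeholders, not part of this project.

```python
import json
from pathlib import PurePosixPath


def cursor_entry(repo_root: str) -> dict:
    """Build the "ebook-mcp" entry for a given checkout directory (hypothetical helper)."""
    server_dir = PurePosixPath(repo_root) / "src" / "ebook_mcp"
    return {
        "ebook-mcp": {
            "command": "uv",
            "args": ["--directory", str(server_dir), "run", "main.py"],
        }
    }


# "/path/to/ebook-mcp" is a placeholder; substitute your own clone location.
print(json.dumps(cursor_entry("/path/to/ebook-mcp"), indent=2))
```

Paste the printed entry into the server section of your Cursor MCP configuration.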




### EPUB Processing Examples

```python
# Get all EPUB files in a directory
epub_files = get_all_epub_files("/path/to/books")

# Get EPUB metadata
metadata = get_epub_metadata("/path/to/book.epub")

# Get table of contents
toc = get_epub_toc("/path/to/book.epub")

# Get specific chapter content (in Markdown format); take the chapter ID from a TOC entry
chapter_content = get_epub_chapter_markdown("/path/to/book.epub", "chapter_id")
```

### PDF Processing Examples

```python
# Get all PDF files in a directory
pdf_files = get_all_pdf_files("/path/to/books")

# Get PDF metadata
metadata = get_pdf_metadata("/path/to/book.pdf")

# Get table of contents
toc = get_pdf_toc("/path/to/book.pdf")

# Get specific page content
page_text = get_pdf_page_text("/path/to/book.pdf", 1)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# Get specific chapter content
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")
```

## API Reference

### EPUB APIs

#### `get_all_epub_files(path: str) -> List[str]`
Get all EPUB files in the specified directory.

#### `get_epub_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from an EPUB file.

#### `get_epub_toc(epub_path: str) -> List[Tuple[str, str]]`
Get the table of contents from an EPUB file as a list of `(title, href)` tuples.

#### `get_epub_chapter_markdown(epub_path: str, chapter_id: str) -> str`
Get chapter content in Markdown format.
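
Chapter IDs cannot be guessed, so a typical flow reads the TOC first and then passes one of its entries to the chapter call. The sketch below shows only the lookup step, on made-up sample data. `find_chapter_id` and `SAMPLE_TOC` are hypothetical names used for illustration, and the `(title, id)` tuple layout is an assumption based on the `List[Tuple[str, str]]` return type.

```python
from typing import List, Optional, Tuple

# Sample data in the assumed TOC shape: (title, chapter_id) pairs.
# These titles and ids are invented for illustration.
SAMPLE_TOC: List[Tuple[str, str]] = [
    ("Preface", "preface.xhtml"),
    ("Chapter 1", "chapter1.xhtml"),
    ("Chapter 1.3", "chapter1.xhtml#section1_3"),
]


def find_chapter_id(toc: List[Tuple[str, str]], title: str) -> Optional[str]:
    """Return the id of the first TOC entry whose title matches, ignoring case."""
    wanted = title.strip().lower()
    for entry_title, chapter_id in toc:
        if entry_title.strip().lower() == wanted:
            return chapter_id
    return None


print(find_chapter_id(SAMPLE_TOC, "chapter 1.3"))
```

The returned ID is what would be passed as `chapter_id`; a `None` result means the title is not in the TOC and the chapter call should be skipped.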

### PDF APIs

#### `get_all_pdf_files(path: str) -> List[str]`
Get all PDF files in the specified directory.

#### `get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]`
Get metadata from a PDF file.

#### `get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]`
Get table of contents from a PDF file.

#### `get_pdf_page_text(pdf_path: str, page_number: int) -> str`
Get plain text content from a specific page.

#### `get_pdf_page_markdown(pdf_path: str, page_number: int) -> str`
Get Markdown formatted content from a specific page.

#### `get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]`
Get chapter content and corresponding page numbers by chapter title.
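
To illustrate why chapter lookup depends on the TOC, the sketch below derives a chapter's page numbers from `(title, page_number)` entries: a chapter runs from its own start page to the page before the next entry. This shows the general idea under stated assumptions and is not necessarily how `get_pdf_chapter_content` is implemented; `chapter_page_range` and the sample TOC are hypothetical.

```python
from typing import List, Tuple


def chapter_page_range(toc: List[Tuple[str, int]], title: str, total_pages: int) -> List[int]:
    """Return the 1-based pages from a chapter's start up to the page before the next TOC entry."""
    for index, (entry_title, start_page) in enumerate(toc):
        if entry_title == title:
            if index + 1 < len(toc):
                # Stop just before the next entry; never go below the start page.
                end_page = max(start_page, toc[index + 1][1] - 1)
            else:
                # The last chapter runs to the end of the document.
                end_page = total_pages
            return list(range(start_page, end_page + 1))
    raise ValueError(f"Chapter not found in TOC: {title!r}")


# Invented sample TOC for a 45-page document.
sample_toc = [("Chapter 1", 1), ("Chapter 2", 12), ("Chapter 3", 30)]
print(chapter_page_range(sample_toc, "Chapter 2", total_pages=45))
```

Without a TOC there are no start pages to work from, which is why title-based extraction cannot succeed on such files.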

## Dependencies

Key dependencies include:
- ebooklib: EPUB file processing
- PyPDF2: Basic PDF processing
- PyMuPDF: Advanced PDF processing
- beautifulsoup4: HTML parsing
- html2text: HTML to Markdown conversion
- pydantic: Data validation
- fastmcp: MCP server framework

## Important Notes

1. PDF chapter extraction relies on the document's table of contents (TOC). Some features may not work if the TOC is not available.
2. For large PDF files, process the document by page ranges to avoid loading the entire file at once.
3. EPUB chapter IDs must be obtained from the table of contents structure.
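
Processing by page ranges can be sketched as follows. The helper `page_ranges` is hypothetical and only computes the 1-based, inclusive ranges; each page in a range would then be fetched individually with one of the page-level calls.

```python
from typing import Iterator, Tuple


def page_ranges(total_pages: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, 1-based (first_page, last_page) ranges covering the document."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for first in range(1, total_pages + 1, chunk_size):
        yield first, min(first + chunk_size - 1, total_pages)


for first, last in page_ranges(total_pages=10, chunk_size=4):
    print(f"process pages {first}-{last}")
```

A smaller `chunk_size` lowers peak memory use at the cost of more calls.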

## Architecture

```
           ┌────────────────────────────┐
           │         Agent Layer        │
           │  - Translation Strategy    │
           │  - Style Consistency Check │
           │  - LLM Call & Interaction  │
           └─────────────▲──────────────┘
                         │ Tool Calls
           ┌─────────────┴──────────────┐
           │       MCP Tool Layer       │
           │  - extract_chapter         │
           │  - write_translated_chapter│
           │  - generate_epub           │
           └─────────────▲──────────────┘
                         │ System/IO Calls
           ┌─────────────┴──────────────┐
           │     System Base Layer      │
           │  - File Reading            │
           │  - ebooklib Parsing        │
           │  - File Path Storage/Check │
           └────────────────────────────┘
```



## Contributing

We welcome Issues and Pull Requests!

For detailed information about recent changes, please see [CHANGELOG.md](CHANGELOG.md).


================================================
FILE: pyproject.toml
================================================
[project]
name = "ebook-mcp"
version = "0.1.8"
description = "An MCP server for chatting with ebooks (PDF/EPUB)."
authors = [{ name = "onebird", email = "rymrocks@163.com" }]
license = "Apache-2.0"
readme = "README.md"
requires-python = ">=3.10"

dependencies = [
    "ebooklib>=0.19",
    "PyMuPDF>=1.26.3",
    "beautifulsoup4>=4.13.4",
    "html2text>=2025.4.15",
    "pydantic>=2.11.7",
    "fastmcp>=2.11.1",
    "typer>=0.16.0"
]

[project.optional-dependencies]
dev = [
    "pytest>=8.4.1",
    "uvicorn>=0.35.0,<1.0.0",
]



[project.scripts]
ebook-mcp = "ebook_mcp.main:cli_entry"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/ebook_mcp"]

================================================
FILE: pytest.ini
================================================
[pytest]
testpaths = src/ebook_mcp/tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = 
    -v
    --tb=short
    --strict-markers
    --disable-warnings
    --import-mode=importlib
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    integration: marks tests as integration tests
    unit: marks tests as unit tests 

================================================
FILE: run_tests.sh
================================================
#!/bin/bash

# Ebook-MCP test runner script

echo "=========================================="
echo "Ebook-MCP unit test runner"
echo "=========================================="

# Check whether pytest is installed
if ! command -v pytest &> /dev/null; then
    echo "❌ pytest is not installed. Install it first: pip install pytest"
    exit 1
fi

# Set the test directory
TEST_DIR="src/ebook_mcp/tests"

echo "📁 Test directory: $TEST_DIR"
echo ""

# Run the basic tests (recommended)
echo "🧪 Running basic tests (no external dependencies required)..."
python -m pytest "$TEST_DIR/test_basic.py" -v
BASIC_RESULT=$?

echo ""
echo "🧪 Running tests for the fixed EPUB chapter extraction..."
python -m pytest "$TEST_DIR/test_epub_chapter_extraction.py" -v
FIXED_RESULT=$?

echo ""
echo "🧪 Running AZW tests (if the module is available)..."
python -m pytest "$TEST_DIR/test_azw.py" -v
AZW_RESULT=$?

echo ""
echo "=========================================="
echo "Test result summary:"
echo "=========================================="

if [ $BASIC_RESULT -eq 0 ]; then
    echo "✅ Basic tests: passed"
else
    echo "❌ Basic tests: failed"
fi

if [ $FIXED_RESULT -eq 0 ]; then
    echo "✅ EPUB chapter extraction fix tests: passed"
else
    echo "❌ EPUB chapter extraction fix tests: failed"
fi

if [ $AZW_RESULT -eq 0 ]; then
    echo "✅ AZW tests: passed or skipped"
else
    echo "❌ AZW tests: failed"
fi

echo ""
echo "💡 Tips:"
echo "   - To run the full test suite, install the project dependencies first: pip install -e ."
echo "   - See the detailed testing guide: cat HOW-TO-TEST.md"
echo ""

# Return the overall result
if [ $BASIC_RESULT -eq 0 ] && [ $FIXED_RESULT -eq 0 ] && [ $AZW_RESULT -eq 0 ]; then
    echo "🎉 All available tests passed!"
    exit 0
else
    echo "⚠️  Some tests failed, please check the output above"
    exit 1
fi

================================================
FILE: src/ebook_mcp/__init__.py
================================================


================================================
FILE: src/ebook_mcp/main.py
================================================
import os
from typing import Any, List, Dict, Union, Tuple, Callable, TypeVar
from functools import wraps
from mcp.server.fastmcp import FastMCP
from ebooklib import epub
from pydantic import BaseModel
from bs4 import BeautifulSoup
from ebook_mcp.tools import epub_helper, pdf_helper
import logging
from datetime import datetime
from ebook_mcp.tools.logger_config import setup_logger  # Import logger config

# Type variable for generic function return type
T = TypeVar('T')

def handle_mcp_errors(func: Callable[..., T]) -> Callable[..., T]:
    """
    Decorator to handle common MCP tool errors uniformly.
    
    This decorator catches FileNotFoundError and other exceptions and
    re-raises them with consistent error messages.
    """
    @wraps(func)
    def wrapper(*args, **kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except FileNotFoundError as e:
            # Chain the original exception so the traceback is preserved
            raise FileNotFoundError(str(e)) from e
        except (epub_helper.EpubProcessingError, pdf_helper.PdfProcessingError):
            # Re-raise custom exceptions as-is to preserve detailed error information
            raise
        except Exception as e:
            raise Exception(str(e)) from e
    return wrapper

def handle_pdf_errors(func: Callable[..., T]) -> Callable[..., T]:
    """
    Decorator to handle PDF-specific errors.
    
    Some PDF functions don't need FileNotFoundError handling
    as they handle it internally.
    """
    @wraps(func)
    def wrapper(*args, **kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except Exception as e:
            # Chain the original exception so the traceback is preserved
            raise Exception(str(e)) from e
    return wrapper


log_dir = "logs"
# Create the log directory if it does not exist yet
os.makedirs(log_dir, exist_ok=True)

log_file = os.path.join(log_dir, f"ebook-mcp_server_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log")
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)




# Initialize FastMCP server
mcp = FastMCP("ebook-MCP")

# EPUB related tools
@mcp.tool()
@handle_mcp_errors
def get_all_epub_files(path: str) -> List[str]:
    """Get all epub files in a given path.
    """
    return epub_helper.get_all_epub_files(path)

@mcp.tool()
@handle_mcp_errors
def get_epub_metadata(epub_path:str) -> Dict[str, Union[str, List[str]]]:
    """Get metadata of a given ebook.

    Args:
        epub_path: Full path to the ebook file, e.g. "/Users/macbook/Downloads/test.epub"
    
    Returns:
        Dict[str, Union[str, List[str]]]: Dictionary mapping each metadata key to its value(s)

    Raises:
        FileNotFoundError: Raised when the EPUB file is not found
        Exception: Raised when the EPUB file cannot be parsed
    """
    logger.debug(f"Getting ebook metadata: {epub_path}")
    return epub_helper.get_meta(epub_path)


@mcp.tool()
@handle_mcp_errors
def get_epub_toc(epub_path: str) -> List[Tuple[str, str]]:
    """Get table of contents of a given EPUB file.

    Args:
        epub_path: Full path to the ebook file, e.g. "/Users/macbook/Downloads/test.epub"
    
    Returns:
        List[Tuple[str, str]]: List of TOC entries, each entry is a tuple of (title, href)

    Raises:
        FileNotFoundError: Raised when the EPUB file is not found
        Exception: Raised when the EPUB file cannot be parsed
    """
    logger.debug(f"calling get_epub_toc: {epub_path}")
    return epub_helper.get_toc(epub_path)

@mcp.tool()
@handle_mcp_errors
def get_epub_chapter_markdown(epub_path:str, chapter_id: str) -> str:
    """Get content of a given chapter using the improved extraction method.
    
    ✅ RECOMMENDED: This tool fixes the truncation issue in the original version when processing subchapters.
    It can correctly handle EPUB files with subchapters and provide complete chapter content.
    
    This function uses extract_chapter_html which properly handles subchapters
    and provides accurate chapter boundaries, fixing the issue where subchapters
    in the TOC cause premature truncation of chapter content.

    Args:
        epub_path: Full path to the ebook file, e.g. "/Users/macbook/Downloads/test.epub"
        chapter_id: Chapter id of the chapter to get content (e.g., "chapter1.xhtml#section1_3")
    
    Returns:
        str: Chapter content in markdown format
    """
    logger.debug(f"calling get_epub_chapter_markdown: {epub_path}, chapter ID: {chapter_id}")
    book = epub_helper.read_epub(epub_path)
    
    # Use the improved version
    return epub_helper.extract_chapter_markdown(book, chapter_id)

# PDF related tools
@mcp.tool()
@handle_mcp_errors
def get_all_pdf_files(path: str) -> List[str]:
    """Get all PDF files in a given path.
    """
    return pdf_helper.get_all_pdf_files(path)

@mcp.tool()
@handle_mcp_errors
def get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]:
    """Get metadata of a given PDF file.

    Args:
        pdf_path: Full path to the PDF file, e.g. "/Users/macbook/Downloads/test.pdf"
    
    Returns:
        Dict[str, Union[str, List[str]]]: Dictionary containing metadata

    Raises:
        FileNotFoundError: Raised when the PDF file is not found
        Exception: Raised when the PDF file cannot be parsed
    """
    logger.debug(f"calling get_pdf_metadata: {pdf_path}")
    return pdf_helper.get_meta(pdf_path)

@mcp.tool()
@handle_mcp_errors
def get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]:
    """Get table of contents of a given PDF file.

    Args:
        pdf_path: Full path to the PDF file, e.g. "/Users/macbook/Downloads/test.pdf"
    
    Returns:
        List[Tuple[str, int]]: List of TOC entries, each entry is a tuple of (title, page_number)

    Raises:
        FileNotFoundError: Raised when the PDF file is not found
        Exception: Raised when the PDF file cannot be parsed
    """
    logger.debug(f"calling get_pdf_toc: {pdf_path}")
    return pdf_helper.get_toc(pdf_path)

@mcp.tool()
@handle_pdf_errors
def get_pdf_page_text(pdf_path: str, page_number: int) -> str:
    """Get text content of a specific page in PDF file.

    Args:
        pdf_path: Full path to the PDF file, e.g. "/Users/macbook/Downloads/test.pdf"
        page_number: Page number to extract (1-based index)
    
    Returns:
        str: Extracted text content
    """
    logger.debug(f"calling get_pdf_page_text: {pdf_path}, page: {page_number}")
    return pdf_helper.extract_page_text(pdf_path, page_number)

@mcp.tool()
@handle_pdf_errors
def get_pdf_page_markdown(pdf_path: str, page_number: int) -> str:
    """Get markdown formatted content of a specific page in PDF file.

    Args:
        pdf_path: Full path to the PDF file, e.g. "/Users/macbook/Downloads/test.pdf"
        page_number: Page number to extract (1-based index)
    
    Returns:
        str: Markdown formatted text
    """
    logger.debug(f"calling get_pdf_page_markdown: {pdf_path}, page: {page_number}")
    return pdf_helper.extract_page_markdown(pdf_path, page_number)

@mcp.tool()
@handle_pdf_errors
def get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[str, List[int]]:
    """Get content of a specific chapter in PDF file by its title.

    Args:
        pdf_path: Full path to the PDF file, e.g. "/Users/macbook/Downloads/test.pdf"
        chapter_title: Title of the chapter to extract
    
    Returns:
        Tuple[str, List[int]]: Tuple containing (chapter_content, page_numbers)
    """
    logger.debug(f"calling get_pdf_chapter_content: {pdf_path}, chapter: {chapter_title}")
    return pdf_helper.extract_chapter_by_title(pdf_path, chapter_title)

if __name__ == "__main__":
    # Initialize and run the server
    logger.info("Server is starting.....")
    mcp.run(transport='stdio')

# CLI entry point, used after "pip install ebook-mcp"
def cli_entry():
    # Run the module-level server so the tools registered above are exposed;
    # creating a fresh FastMCP instance here would start a server with no tools.
    logger.info("Starting ebook-mcp server")
    mcp.run(transport='stdio')


================================================
FILE: src/ebook_mcp/mcp_client_example/.gitignore
================================================
.env

# Logs
logs/
*.log

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
env/
ENV/
.venv/
.env/

# IDE
.idea/
.vscode/
*.swp
*.swo
.DS_Store

# Project specific
*.bak
*.tmp

================================================
FILE: src/ebook_mcp/mcp_client_example/README-CN.md
================================================
[English](README.md) | 中文

This project provides multiple MCP (Model Context Protocol) client examples for interacting with different AI models, including OpenAI GPT-4, Anthropic Claude, and Deepseek AI.

## Installation

Install dependencies:
```bash
pip install -e .
```

## Configuration

Create a `.env` file in the project root directory and add the API keys for the models you want to use:

```env
# OpenAI API key (for GPT-4)
OPENAI_API_KEY=your_openai_api_key

# Anthropic API key (for Claude)
ANTHROPIC_API_KEY=your_anthropic_api_key

# Deepseek API key
DEEPSEEK_API_KEY=your_deepseek_api_key
```

## Usage

The project provides three client examples, each corresponding to a different AI model:

### OpenAI GPT-4 Client
```bash
python openai_example.py ../main.py
```

### Anthropic Claude Client
```bash
python anthropic_example.py ../main.py
```

### Deepseek AI Client
```bash
python deepseek_example.py ../main.py
```

## Example Conversation

Here is an example of interacting with an ebook using the MCP client (DeepSeek):

```
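## Query:
Can you ask me a few questions to test my understanding of this book?

2025-04-12 15:29:12,313 - mcp.server.lowlevel.server - INFO - Processing request of type ListToolsRequest

Sure! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your understanding, and I will give feedback and add details based on your answers.

---

### Question 1:
How did Musk's childhood and family background shape his personality and career later in life?
(Hint: his relationship with his father, growing up in South Africa, etc.)

---

### Question 2:
Zip2 and X.com (later PayPal) were Musk's early startups. What problems did each of them solve? How did Musk eventually exit from these companies?
(Hint: Zip2's business model, X.com's merger with Confinity, etc.)

---

### Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1's fourth launch, the NASA contract, Tesla's "last-ditch effort", etc.)

---

### Question 4:
What controversies surround Musk's management style? What specific incidents in the book illustrate his "hardcore" or "extreme" work culture?
(Hint: the Fremont factory's "production hell", the mass layoffs at Twitter, etc.)

---

### Question 5:
What is Musk's attitude toward artificial intelligence (AI)? Why did he support both OpenAI and Neuralink, two projects that seem to contradict each other?
(Hint: AI as an existential threat, the brain-machine interface vision, etc.)

---

Please answer the questions in order, and I will add details and corrections based on your answers.
(If a question is unclear, feel free to ask me directly.)

---

## Answers:

- Question 1: His father's harsh attitude toward him left him lacking empathy for others.
- Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved electronic payments.
- Question 3: He secured an order from NASA. Tesla solved its production capacity problem by building a factory in China.
- Question 4: I don't know, please answer it for me.
- Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

---

## Your answers have captured some key points from the book. Let me add details and correct some inaccuracies, and provide more background for Questions 4 and 5:

---

### Question 1: Correction & Supplement

#### Influence of His Father:
Musk's father, Errol, was indeed extremely harsh; for example, he once forced young Musk to kneel on the floor and count grains of rice for hours. The book emphasizes that this emotionally numbing experience shaped him into an extremely goal-oriented person: insensitive to others' suffering but obsessively committed to his goals.

#### Reverse Impact:
Musk is not entirely without empathy; his empathy is selective. For example, he is cold toward employees but personally studied medical equipment to help his sick son X.

---

### Question 2: Correction & Supplement

#### Zip2:
Essentially an online business Yellow Pages plus map navigation service (helping newspapers with their digital transformation). In the early days Musk wrote much of the code himself and even slept in the office. Zip2 was eventually acquired by Compaq for $307 million, and Musk personally received $22 million.

#### X.com:
Not just electronic payments, but Musk's attempt to build a comprehensive financial platform (to replace banks). It later merged with Confinity (the precursor of PayPal), but Musk was pushed out of management because of internal conflicts. When eBay acquired PayPal, Musk received $180 million.
> Note: Musk later bought back X.com, and it became the name of Twitter's parent company.

---

### Question 3: Correction & Supplement

#### SpaceX:
In 2008, after three failures, Falcon 1 finally succeeded on its fourth launch. Right after that, SpaceX secured a NASA cargo contract worth $1.6 billion, which became the company's lifeline.

#### Tesla:
In 2008, Tesla survived thanks to a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
> Building the factory in China (2019) came later. The keys to early survival were the successful launch of the Model S (2012) and Musk selling all of his real estate to bet everything on Tesla.

---

### Question 4: Explanation of the Controversy Around His Management Style

#### "Hardcore" Culture:

- At SpaceX and Tesla: Musk required employees to work more than 80 hours per week, even staying overnight at the factory. He once said: "If you're not here, someone will take your place."
  
- After acquiring Twitter: He immediately laid off 75% of staff and asked the remaining employees to sign an "extremely hardcore" pledge, committing to long hours and high-intensity work.

#### Extreme Cases:

- Fremont factory: During the Model 3 "production hell", Musk slept right on the production line and even berated employees as "worse than dogs" (he apologized afterwards).

- Firing executives: He once fired a Tesla executive on the spot because the font in their PowerPoint was not big enough.

---

### Question 5: Correction & Supplement

#### AI Threat Theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear bombs"). Therefore:

- OpenAI: He initially supported OpenAI to promote safe, open-source AI (but later left over disagreements about control).

- Neuralink: He hopes to use brain-machine interfaces to let humans coexist with AI and avoid being made obsolete by it.

#### Contradictions:
Despite warning about the risks of AI, Musk is simultaneously pushing hard for Full Self-Driving (FSD) at Tesla and has founded xAI to research AGI (Artificial General Intelligence).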

```


================================================
FILE: src/ebook_mcp/mcp_client_example/README.md
================================================
English | [中文](README-CN.md)
# MCP Client Examples



This project provides multiple MCP (Model Context Protocol) client examples for interacting with different AI models, including OpenAI GPT-4, Anthropic Claude, and Deepseek AI.

## Installation

Install dependencies:
```bash
pip install -e .
```

## Configuration

Create a `.env` file in the project root directory and add the corresponding API keys based on the model you want to use:

```env
# OpenAI API key (for GPT-4)
OPENAI_API_KEY=your_openai_api_key

# Anthropic API key (for Claude)
ANTHROPIC_API_KEY=your_anthropic_api_key

# Deepseek API key
DEEPSEEK_API_KEY=your_deepseek_api_key
```
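
Before launching a client, it can help to check which keys are actually set. The following is a minimal sketch using only the standard library. The mapping from script to variable name is an assumption inferred from the names above, and `runnable_clients` is a hypothetical helper. It reads the process environment, so the `.env` file must already be loaded or its variables exported.

```python
import os
from typing import List, Mapping

# Assumed pairing of each client script with the key it needs.
REQUIRED_KEYS = {
    "openai_example.py": "OPENAI_API_KEY",
    "anthropic_example.py": "ANTHROPIC_API_KEY",
    "deepseek_example.py": "DEEPSEEK_API_KEY",
}


def runnable_clients(env: Mapping[str, str]) -> List[str]:
    """Return the client scripts whose API key is present and non-empty in env."""
    return [script for script, key in REQUIRED_KEYS.items() if env.get(key, "").strip()]


print(runnable_clients(os.environ))
```

An empty list means none of the three keys is visible to the current process.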

## Usage

The project provides three client examples, each corresponding to a different AI model:

### OpenAI GPT-4 Client
```bash
python openai_example.py ../main.py
```

### Anthropic Claude Client
```bash
python anthropic_example.py ../main.py
```

### Deepseek AI Client
```bash
python deepseek_example.py ../main.py
```

## Example Conversation

Here's an example of interacting with an ebook using the MCP client with DeepSeek:

```

Query: Can you ask me a few questions to test my understanding of this book?


Sure! Here are 5 questions about Elon Musk by Walter Isaacson to test your understanding of the book. Please answer based on your knowledge — I’ll give feedback and add details after each response.

Question 1:
How did Musk’s childhood and family background shape his personality and career later in life?
(Hint: His relationship with his father, growing up in South Africa, etc.)

Question 2:
Zip2 and X.com (later PayPal) were Musk’s early startups. What problems did they solve? How did Musk eventually exit from these companies?
(Hint: Zip2’s business model, X.com’s merger with Confinity, etc.)

Question 3:
In 2008, both SpaceX and Tesla were on the verge of bankruptcy. How did Musk save these two companies?
(Hint: Falcon 1’s fourth launch, NASA contract, Tesla’s last-minute rescue, etc.)

Question 4:
What controversies surround Musk’s management style? What specific incidents in the book illustrate his “hardcore” or “extreme” work culture?
(Hint: Fremont factory’s “production hell,” Twitter layoffs, etc.)

Question 5:
What is Musk’s attitude toward Artificial Intelligence (AI)? Why did he simultaneously support both OpenAI and Neuralink, which seem to have conflicting goals?
(Hint: AI as an existential threat, brain-machine interface vision, etc.)

Please answer each question. I’ll provide feedback and additional details or correct any misunderstandings!
(If a question isn’t clear, feel free to ask me.)

Query:
Question 1: His father’s abusive attitude made him less empathetic towards others.
Question 2: Zip2 solved the problem of electronic maps and online yellow pages. X.com solved online payments.
Question 3: He secured a contract with NASA. Tesla solved its production problems by building a factory in China.
Question 4: I don’t know. Please answer for me.
Question 5: He believes open-source technology is necessary to ensure AI is safe and open.

Your answers have captured some key points from the book. Let me add details, correct some inaccuracies, and provide further background for Questions 4 and 5:

Question 1 — Correction & Supplement
Influence of His Father:
Musk's father, Errol, was indeed extremely harsh — for example, forcing young Elon to kneel and count grains of rice for hours. The book emphasizes that this emotional desensitization shaped Musk into an extremely goal-oriented person — indifferent to others’ suffering but obsessively committed to his goals.

Reverse Impact:
It’s not that Musk is incapable of empathy — it’s selective. For instance, he may treat employees coldly but personally study medical equipment for his son X’s health issues.

Question 2 — Correction & Supplement
Zip2:
Essentially an online Yellow Pages + map navigation service (helping newspapers go online). Musk wrote much of the early code himself and slept in the office. Eventually, Compaq acquired Zip2 for $307 million (Musk received $22 million).

X.com:
Not just online payments — it was Musk’s attempt to create a comprehensive financial platform (aiming to replace banks). Later merged with Confinity (the precursor of PayPal). Musk was ousted due to internal conflicts, but when eBay acquired PayPal, Musk earned $180 million.
(Note: Musk later bought back the domain X.com — it’s now Twitter’s parent company name).

Question 3 — Correction & Supplement
SpaceX:
In 2008, Falcon 1 succeeded on its fourth launch (after three failures drained their funds). Right after that, SpaceX secured a $1.6 billion NASA contract for cargo resupply — a lifeline for the company.

Tesla:
In 2008, Tesla survived through a $50 million investment from Daimler and a loan from the U.S. Department of Energy.
Building the Gigafactory in China (2019) came much later. In the early years, survival depended on the launch of Model S (2012) and Musk’s personal gamble — selling all his real estate to fund Tesla.

Question 4 — Explanation: Controversy Around His Management Style
“Hardcore” Culture:

At SpaceX/Tesla: Musk demanded employees work 80+ hours per week, even sleeping at the factory. ("If you’re not here, your replacement is.")

After acquiring Twitter: He immediately laid off 75% of staff and asked the remaining employees to sign a "hardcore commitment" — agreeing to long hours and high-intensity output.

Extreme Cases:

At the Fremont Factory: During the "production hell" of the Model 3, Musk slept on the production line and once yelled at employees calling them "worse than dogs" (he later apologized).

Firing Executives: Musk once fired a senior Tesla executive on the spot because the font size in their PowerPoint presentation wasn’t big enough.

Question 5 — Correction & Supplement
AI Threat Theory:
Musk believes that AI could destroy humanity ("more dangerous than nuclear weapons"). Therefore:

OpenAI: He initially supported it to promote safe, open-source AI (but left later due to disputes over control).

Neuralink: He aims to use brain-machine interfaces to merge humans with AI — to prevent humans from being left behind.

Contradictions:
While warning about AI risks, Musk is simultaneously pushing for Full Self-Driving (FSD) technology at Tesla and has founded xAI to research Artificial General Intelligence (AGI).




```

## Notes

- Make sure you have configured the appropriate API keys
- Each client writes log files to the `logs` directory. Use those logs to understand the conversation and tool usage.
- Use the `clear` command to clear the conversation history (not implemented in the DeepSeek version)
- Use the `quit` command to exit the program

## Contributing
Issues and suggestions for improvements are welcome!



================================================
FILE: src/ebook_mcp/mcp_client_example/anthropic_example.py
================================================
import asyncio
import logging
import os
import sys
from datetime import datetime
from typing import Optional, List, Dict
from contextlib import AsyncExitStack

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

# Configure logging
log_dir = "logs"
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

log_file = os.path.join(log_dir, f"anthropic_mcp_client_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log")
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file),
        #logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

class MCPClient:
    """
    MCP Client class that handles communication with the server and Claude API.
    Manages message history and tool execution.
    """
    def __init__(self):
        # Initialize session and client objects
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.anthropic = Anthropic()
        self.messages: List[Dict[str, str]] = []
        self.max_messages = 20  # Maximum number of messages to keep in history
        self.max_tokens = 8000  # Maximum estimated tokens kept in history (a small budget, well below Claude 3.5 Sonnet's 200k-token context window)
        logger.debug("MCPClient initialized")

    def _estimate_tokens(self, text: str) -> int:
        """
        Roughly estimate the number of tokens in a text.
        Uses a simple approximation of 4 characters per token.
        """
        return len(text) // 4

    def _trim_messages_history(self):
        """
        Clean up message history to ensure it doesn't exceed limits.
        Removes oldest messages first when exceeding count or token limits.
        """
        initial_count = len(self.messages)
        
        # Remove messages if count exceeds limit
        while len(self.messages) > self.max_messages:
            removed_msg = self.messages.pop(0)
            logger.debug(f"Removed message due to count limit: {removed_msg['role']}")
        
        # Calculate and trim based on token count
        total_tokens = sum(self._estimate_tokens(msg["content"]) for msg in self.messages)
        while total_tokens > self.max_tokens and self.messages:
            removed_msg = self.messages.pop(0)
            removed_tokens = self._estimate_tokens(removed_msg["content"])
            total_tokens -= removed_tokens
            logger.debug(f"Removed message due to token limit: {removed_msg['role']}, tokens: {removed_tokens}")
        
        if initial_count != len(self.messages):
            logger.debug(f"Trimmed messages from {initial_count} to {len(self.messages)}")

    def add_message(self, role: str, content: str):
        """
        Add a new message to the history and trim if necessary.
        """
        self.messages.append({"role": role, "content": content})
        logger.debug(f"Added message - Role: {role}, Content length: {len(content)}")
        self._trim_messages_history()

    async def connect_to_server(self, server_script_path: str):
        """
        Connect to an MCP server
        
        Args:
            server_script_path: Path to the server script (.py or .js)
        """
        logger.debug(f"Connecting to server with script: {server_script_path}")
        
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            logger.error("Invalid server script type")
            raise ValueError("Server script must be a .py or .js file")
            
        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )
        
        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
        
        await self.session.initialize()
        
        # List available tools
        response = await self.session.list_tools()
        tools = response.tools
        tools_list = [tool.name for tool in tools]
        logger.debug(f"Connected to server with tools: {tools_list}")
        print("\nConnected to server with tools:", tools_list)

    async def process_query(self, query: str) -> str:
        """
        Process a query using Claude and available tools
        """
        logger.debug(f"Processing new query: {query}")
        self.add_message("user", query)

        response = await self.session.list_tools()
        available_tools = [{ 
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema
        } for tool in response.tools]
        logger.debug(f"Available tools: {[tool['name'] for tool in available_tools]}")

        # Initial Claude API call
        response = self.anthropic.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            messages=self.messages,
            tools=available_tools
        )

        tool_results = []
        final_text = []

        for content in response.content:
            if content.type == 'text':
                final_text.append(content.text)
                self.add_message("assistant", content.text)
                logger.debug("Added assistant text response")
            elif content.type == 'tool_use':
                tool_name = content.name
                tool_args = content.input
                
                logger.debug(f"Executing tool call - Tool: {tool_name}, Args: {tool_args}")
                result = await self.session.call_tool(tool_name, tool_args)
                tool_results.append({"call": tool_name, "result": result})
                final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")
                logger.debug(f"Tool execution result: {result.content}")

                if hasattr(content, 'text') and content.text:
                    self.add_message("assistant", content.text)
                self.add_message("user", result.content)

                response = self.anthropic.messages.create(
                    model="claude-3-5-sonnet-20241022",
                    max_tokens=1000,
                    messages=self.messages,
                )

                final_text.append(response.content[0].text)
                self.add_message("assistant", response.content[0].text)
                logger.debug("Added follow-up assistant response")

        return "\n".join(final_text)

    async def chat_loop(self):
        """
        Run an interactive chat loop
        """
        logger.info("Starting chat loop")
        print("\nMCP Client Started!")
        print("Type your queries or 'quit' to exit.")
        print("Type 'clear' to clear conversation history.")
        
        while True:
            try:
                query = input("\nQuery: ").strip()
                
                if query.lower() == 'quit':
                    logger.info("Chat loop terminated by user")
                    break
                elif query.lower() == 'clear':
                    self.messages = []
                    logger.debug("Conversation history cleared")
                    print("\nConversation history cleared.")
                    continue
                    
                response = await self.process_query(query)
                print("\n" + response)
                    
            except Exception as e:
                logger.error(f"Error in chat loop: {str(e)}", exc_info=True)
                print(f"\nError: {str(e)}")
    
    async def cleanup(self):
        """
        Clean up resources
        """
        logger.debug("Cleaning up resources")
        await self.exit_stack.aclose()

async def main():
    if len(sys.argv) < 2:
        logger.error("Missing server script path argument")
        print("Usage: python anthropic_example.py <path_to_server_script>")
        sys.exit(1)
        
    client = MCPClient()
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()

if __name__ == "__main__":
    import sys
    logger.info("Starting Anthropic MCP Client application")
    asyncio.run(main())

================================================
FILE: src/ebook_mcp/mcp_client_example/deepseek_example.py
================================================
import asyncio
from typing import Optional
from contextlib import AsyncExitStack
import json
import logging
import os
from datetime import datetime

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

from openai import AsyncOpenAI
from dotenv import load_dotenv

load_dotenv()

# Setup logging
def setup_logging():
    """Configure logging settings"""
    # Create logs directory if it doesn't exist
    os.makedirs('logs', exist_ok=True)
    
    # Generate log filename with timestamp
    log_filename = f'logs/deepseek_mcp_client_{datetime.now().strftime("%Y%m%d_%H%M%S")}.log'
    
    # Configure logging
    logging.basicConfig(
        level=logging.DEBUG,
        format='%(asctime)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler(log_filename, encoding='utf-8'),
            #logging.StreamHandler()  # Also output to console
        ]
    )
    return logging.getLogger(__name__)
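

# --- Illustrative sketch (not part of the original client) -------------------
# MCPClient.trim_conversation_history() drops one message at a time, which can
# leave a "tool" message in the history without the assistant message that
# requested it. This is a minimal sketch of a trimmer that removes such
# messages together, assuming the OpenAI-style message dicts used in this file.
# The name trim_history_keeping_tool_pairs and its parameters are hypothetical,
# and the client does not call it.
def trim_history_keeping_tool_pairs(messages, max_tokens, count_tokens):
    """Return a trimmed copy of messages that fits within max_tokens.

    The first message is always kept, mirroring the pop(1) used by the client.
    count_tokens is any callable that maps a message list to a token estimate.
    """
    trimmed = list(messages)
    while len(trimmed) > 1 and count_tokens(trimmed) > max_tokens:
        removed = trimmed.pop(1)
        # An assistant message with tool_calls owns the tool messages after it,
        # so drop those as well to avoid orphaned tool results
        if removed.get("tool_calls"):
            while len(trimmed) > 1 and trimmed[1].get("role") == "tool":
                trimmed.pop(1)
    return trimmed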

class MCPClient:
    def __init__(self):
        # Initialize session and client objects
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        
        # Initialize Deepseek client using OpenAI's client with custom base URL
        self.client = AsyncOpenAI(
            api_key=os.getenv("DEEPSEEK_API_KEY"),
            base_url="https://api.deepseek.com/v1"  # Deepseek API endpoint
        )
        
        self.conversation_history = []
        self.model = "deepseek-chat"  # Using Deepseek Chat model
        self.max_tokens = 8192  # Approximate token budget for conversation history
        self.logger = setup_logging()

    def count_tokens(self, messages):
        """Calculate token count for message history - simplified for Deepseek"""
        # Note: This is a simplified token counting method
        # Deepseek uses its own tokenizer internally
        total_chars = sum(len(str(msg.get("content", ""))) for msg in messages)
        # Approximate token count (assuming average 4 chars per token)
        return total_chars // 4

    def trim_conversation_history(self):
        """Trim conversation history to ensure it doesn't exceed maximum token limit"""
        # Keep the first message; stop once only it remains so pop(1) cannot raise IndexError
        while len(self.conversation_history) > 1 and self.count_tokens(self.conversation_history) > self.max_tokens:
            removed_message = self.conversation_history.pop(1)
            self.logger.debug(f"Trimmed message from history: {removed_message}")

    async def connect_to_server(self, server_script_path: str):
        """Connect to an MCP server"""
        self.logger.info(f"Connecting to server: {server_script_path}")
        
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")
            
        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )
        
        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
        
        await self.session.initialize()
        
        response = await self.session.list_tools()
        tools = response.tools
        self.logger.info(f"Connected to server with tools: {[tool.name for tool in tools]}")
        print("\nConnected to server with tools:", [tool.name for tool in tools])

    async def process_query(self, query: str) -> str:
        """Process a query using Deepseek and available tools"""
        self.logger.debug(f"Processing user query: {query}")
        
        if query.lower() == 'clear':
            self.logger.debug("Clearing conversation history")
            self.conversation_history = []
            return "Conversation history cleared."

        self.conversation_history.append({
            "role": "user",
            "content": query
        })
        self.logger.debug("Added user message to history")

        self.trim_conversation_history()

        response = await self.session.list_tools()
        available_tools = [{ 
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.inputSchema
            }
        } for tool in response.tools]

        final_text = []
        current_tool_call_id = None  # Track current tool call ID
        
        while True:
            # Ensure all message contents are strings
            messages_for_api = []
            for msg in self.conversation_history:
                api_msg = {"role": msg["role"]}
                
                # Convert content to string if it exists
                if "content" in msg:
                    api_msg["content"] = str(msg["content"]) if msg["content"] is not None else ""
                else:
                    api_msg["content"] = ""
                
                # Handle tool calls if they exist
                if msg.get("tool_calls"):
                    api_msg["tool_calls"] = msg["tool_calls"]
                    if msg["tool_calls"]:
                        current_tool_call_id = msg["tool_calls"][0]["id"]
                
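                # Add tool_call_id for tool messages, preferring the ID stored with
                # the message so multiple tool calls in one turn keep distinct IDs
                if msg["role"] == "tool":
                    tool_call_id = msg.get("tool_call_id") or current_tool_call_id
                    if tool_call_id:
                        api_msg["tool_call_id"] = tool_call_id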
                
                messages_for_api.append(api_msg)

            self.logger.debug(f"Sending request to Deepseek with {len(messages_for_api)} messages in history")
            try:
                response = await self.client.chat.completions.create(
                    model=self.model,
                    messages=messages_for_api,
                    tools=available_tools,
                    tool_choice="auto"
                )
            except Exception as e:
                self.logger.error(f"API request failed: {str(e)}")
                self.logger.debug(f"Messages sent: {json.dumps(messages_for_api, indent=2)}")
                raise

            message = response.choices[0].message
            assistant_message = {
                "role": "assistant",
                "content": str(message.content) if message.content is not None else ""
            }
            
            # Only add tool_calls if they exist
            if message.tool_calls:
                assistant_message["tool_calls"] = [
                    {
                        "id": tool_call.id,
                        "type": "function",
                        "function": {
                            "name": tool_call.function.name,
                            "arguments": tool_call.function.arguments
                        }
                    } for tool_call in message.tool_calls
                ]
                current_tool_call_id = message.tool_calls[0].id
            
            self.conversation_history.append(assistant_message)
            self.logger.debug(f"Added assistant message to history: {message.content}")
            final_text.append(message.content or "")

            if not message.tool_calls:
                break

            for tool_call in message.tool_calls:
                tool_name = tool_call.function.name
                tool_args = json.loads(tool_call.function.arguments)
                
                self.logger.debug(f"Calling tool: {tool_name}")
                self.logger.debug(f"Tool arguments: {tool_args}")
                
                result = await self.session.call_tool(tool_name, tool_args)
                self.logger.debug(f"Tool result: {result.content}")
                
                final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")

                # Add tool response to conversation history with tool_call_id
                tool_message = {
                    "role": "tool",
                    "content": str(result.content),
                    "tool_call_id": tool_call.id  # Use current tool call ID
                }
                self.conversation_history.append(tool_message)
                self.logger.debug("Added tool result to history")

            self.trim_conversation_history()

        self.logger.debug("Final conversation history:")
        for msg in self.conversation_history:
            self.logger.debug(f"Message: {msg}")

        return "\n".join(filter(None, final_text))

    async def chat_loop(self):
        """Run an interactive chat loop"""
        self.logger.info("Starting chat loop")
        print("\nMCP Client Started!")
        print("Type your queries, 'clear' to clear history, or 'quit' to exit.")
        
        while True:
            try:
                query = input("\nQuery: ").strip()
                
                if query.lower() == 'quit':
                    self.logger.info("User requested to quit")
                    break
                    
                response = await self.process_query(query)
                print("\n" + response)
                    
            except Exception as e:
                error_msg = f"Error: {str(e)}"
                self.logger.error(error_msg, exc_info=True)
                print("\n" + error_msg)
    
    async def cleanup(self):
        """Clean up resources"""
        self.logger.info("Cleaning up resources")
        await self.exit_stack.aclose()

async def main():
    if len(sys.argv) < 2:
        print("Usage: python deepseek_example.py <path_to_server_script>")
        sys.exit(1)
        
    client = MCPClient()
    client.logger.info("Starting Deepseek MCP Client application")
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()

if __name__ == "__main__":
    import sys
    asyncio.run(main())

================================================
FILE: src/ebook_mcp/mcp_client_example/openai_example.py
================================================
import asyncio
from typing import Optional
from contextlib import AsyncExitStack
import json
import tiktoken
import logging
import os
from datetime import datetime

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

from openai import AsyncOpenAI
from dotenv import load_dotenv

load_dotenv()

# Setup logging
def setup_logging():
    """Configure logging settings"""
    # Create logs directory if it doesn't exist
    os.makedirs('logs', exist_ok=True)
    
    # Generate log filename with timestamp
    log_filename = f'logs/openai_mcp_client_{datetime.now().strftime("%Y%m%d_%H%M%S")}.log'
    
    # Configure logging
    logging.basicConfig(
        level=logging.DEBUG,
        format='%(asctime)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler(log_filename, encoding='utf-8'),
            #logging.StreamHandler()  # Also output to console
        ]
    )
    return logging.getLogger(__name__)
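

# --- Illustrative sketch (not part of the original client) -------------------
# MCP tool results arrive as a list of content blocks rather than a string,
# while chat "tool" messages carry text. This is a minimal sketch of flattening
# the blocks, assuming text blocks expose a .text attribute and that other
# block types can be summarized by their type name. tool_result_to_text is a
# hypothetical helper name, and the client does not call it.
def tool_result_to_text(blocks):
    """Join the text of all text blocks; note non-text blocks by their type."""
    parts = []
    for block in blocks:
        text = getattr(block, "text", None)
        if text is not None:
            parts.append(text)
        else:
            parts.append(f"[{getattr(block, 'type', 'unknown')} content omitted]")
    return "\n".join(parts)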

class MCPClient:
    def __init__(self):
        # Initialize session and client objects
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.client = AsyncOpenAI()
        self.conversation_history = []
        self.model = "gpt-4-turbo-preview"
        self.max_tokens = 4000
        self.encoding = tiktoken.encoding_for_model(self.model)
        self.logger = setup_logging()

    def count_tokens(self, messages):
        """Calculate token count for message history"""
        num_tokens = 0
        for message in messages:
            num_tokens += 4
            for key, value in message.items():
                if key == "tool_calls" and value:
                    for tool_call in value:
                        if isinstance(tool_call, dict):
                            function_name = tool_call.get("function", {}).get("name", "")
                            function_args = tool_call.get("function", {}).get("arguments", "")
                        else:
                            function_name = tool_call.function.name
                            function_args = tool_call.function.arguments
                        num_tokens += len(self.encoding.encode(function_name))
                        num_tokens += len(self.encoding.encode(str(function_args)))
                elif isinstance(value, str):
                    num_tokens += len(self.encoding.encode(value))
        num_tokens += 2
        return num_tokens

    def trim_conversation_history(self):
        """Trim conversation history to ensure it doesn't exceed maximum token limit"""
        # Keep the first message; stop once only it remains so pop(1) cannot raise IndexError
        while len(self.conversation_history) > 1 and self.count_tokens(self.conversation_history) > self.max_tokens:
            removed_message = self.conversation_history.pop(1)
            self.logger.debug(f"Trimmed message from history: {removed_message}")

    async def connect_to_server(self, server_script_path: str):
        """Connect to an MCP server"""
        self.logger.info(f"Connecting to server: {server_script_path}")
        
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")
            
        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )
        
        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
        
        await self.session.initialize()
        
        response = await self.session.list_tools()
        tools = response.tools
        self.logger.info(f"Connected to server with tools: {[tool.name for tool in tools]}")
        print("\nConnected to server with tools:", [tool.name for tool in tools])

    async def process_query(self, query: str) -> str:
        """Process a query using OpenAI and available tools"""
        self.logger.debug(f"Processing user query: {query}")
        
        if query.lower() == 'clear':
            self.logger.debug("Clearing conversation history")
            self.conversation_history = []
            return "Conversation history cleared."

        self.conversation_history.append({
            "role": "user",
            "content": query
        })
        self.logger.debug("Added user message to history")

        self.trim_conversation_history()

        response = await self.session.list_tools()
        available_tools = [{ 
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.inputSchema
            }
        } for tool in response.tools]

        final_text = []
        while True:
            self.logger.debug(f"Sending request to OpenAI with {len(self.conversation_history)} messages in history")
            response = await self.client.chat.completions.create(
                model=self.model,
                messages=self.conversation_history,
                tools=available_tools,
                tool_choice="auto"
            )

            message = response.choices[0].message
            assistant_message = {
                "role": "assistant",
                "content": message.content
            }
            # Only include tool_calls when present; a null value may be rejected by the API
            if message.tool_calls:
                assistant_message["tool_calls"] = [
                    {
                        "id": tool_call.id,
                        "type": "function",
                        "function": {
                            "name": tool_call.function.name,
                            "arguments": tool_call.function.arguments
                        }
                    } for tool_call in message.tool_calls
                ]
            self.conversation_history.append(assistant_message)
            self.logger.debug(f"Added assistant message to history: {message.content}")
            final_text.append(message.content or "")

            if not message.tool_calls:
                break

            for tool_call in message.tool_calls:
                tool_name = tool_call.function.name
                tool_args = json.loads(tool_call.function.arguments)
                
                self.logger.debug(f"Calling tool: {tool_name}")
                self.logger.debug(f"Tool arguments: {tool_args}")
                
                result = await self.session.call_tool(tool_name, tool_args)
                self.logger.debug(f"Tool result: {result.content}")
                
                final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")

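                # Flatten the MCP content blocks to text; tool messages carry string
                # content, and count_tokens() only counts string values
                tool_message = {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": "\n".join(
                        block.text for block in result.content if hasattr(block, "text")
                    )
                }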
                self.conversation_history.append(tool_message)
                self.logger.debug("Added tool result to history")

            self.trim_conversation_history()

        self.logger.debug("Final conversation history:")
        for msg in self.conversation_history:
            self.logger.debug(f"Message: {msg}")

        return "\n".join(filter(None, final_text))

    async def chat_loop(self):
        """Run an interactive chat loop"""
        self.logger.info("Starting chat loop")
        print("\nMCP Client Started!")
        print("Type your queries, 'clear' to clear history, or 'quit' to exit.")
        
        while True:
            try:
                query = input("\nQuery: ").strip()
                
                if query.lower() == 'quit':
                    self.logger.info("User requested to quit")
                    break
                    
                response = await self.process_query(query)
                print("\n" + response)
                    
            except Exception as e:
                error_msg = f"Error: {str(e)}"
                self.logger.error(error_msg, exc_info=True)
                print("\n" + error_msg)
    
    async def cleanup(self):
        """Clean up resources"""
        self.logger.info("Cleaning up resources")
        await self.exit_stack.aclose()

async def main():
    if len(sys.argv) < 2:
        print("Usage: python openai_example.py <path_to_server_script>")
        sys.exit(1)
        
    client = MCPClient()
    client.logger.info("Starting OpenAI MCP Client application")
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()

if __name__ == "__main__":
    import sys
    asyncio.run(main())

================================================
FILE: src/ebook_mcp/mcp_client_example/pyproject.toml
================================================
[project]
name = "demo-mcp-client"
version = "0.1.0"
description = "Example MCP clients (Anthropic, DeepSeek, OpenAI) for the ebook-mcp server"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
    "anthropic>=0.49.0",
    "mcp>=1.4.1",
    "openai>=1.69.0",
    "python-dotenv>=1.0.1",
    "tiktoken",  # imported by openai_example.py
]


================================================
FILE: src/ebook_mcp/mcp_client_example/requirements.txt
================================================
anthropic==0.49.0
python-dotenv>=1.0.0
openai>=1.0.0
fastmcp>=0.1.0
pydantic>=2.0.0
mcp>=1.4.1
tiktoken


================================================
FILE: src/ebook_mcp/tests/conftest.py
================================================
import pytest
import tempfile
import os
from unittest.mock import Mock


@pytest.fixture
def temp_dir():
    """Create a temporary directory for testing"""
    with tempfile.TemporaryDirectory() as temp_dir:
        yield temp_dir


@pytest.fixture
def mock_epub_book():
    """Create a mock EPUB book for testing"""
    mock_book = Mock()
    mock_book.get_metadata.return_value = {
        'title': [('Test Book', {})],
        'creator': [('Test Author', {})],
        'language': [('en', {})],
        'identifier': [('test-id', {})],
        'date': [('2023-01-01', {})],
        'publisher': [('Test Publisher', {})],
        'description': [('Test description', {})]
    }
    return mock_book


@pytest.fixture
def mock_pdf_reader():
    """Create a mock PDF reader for testing"""
    mock_reader = Mock()
    mock_reader.metadata = {
        '/Title': 'Test PDF',
        '/Author': 'Test Author',
        '/Subject': 'Test Subject',
        '/Creator': 'Test Creator',
        '/Producer': 'Test Producer',
        '/CreationDate': '2023-01-01',
        '/ModDate': '2023-01-02'
    }
    mock_reader.pages = [Mock(), Mock(), Mock()]  # 3 pages
    return mock_reader


@pytest.fixture
def mock_pdf_document():
    """Create a mock PDF document for testing"""
    mock_doc = Mock()
    mock_doc.get_toc.return_value = [
        (1, "Chapter 1", 1),
        (1, "Chapter 2", 5),
        (2, "Subchapter 2.1", 7)
    ]
    return mock_doc


@pytest.fixture
def sample_epub_files():
    """Create sample EPUB file names for testing"""
    return ["book1.epub", "book2.epub", "document.txt"]


@pytest.fixture
def sample_pdf_files():
    """Create sample PDF file names for testing"""
    return ["document1.pdf", "document2.pdf", "text.txt"]


@pytest.fixture
def temp_epub_file():
    """Create a temporary EPUB file for testing"""
    with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
        f.write(b"mock epub content")
        epub_path = f.name
    
    yield epub_path
    
    # Cleanup
    if os.path.exists(epub_path):
        os.unlink(epub_path)


@pytest.fixture
def temp_pdf_file():
    """Create a temporary PDF file for testing"""
    with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as f:
        f.write(b"mock pdf content")
        pdf_path = f.name
    
    yield pdf_path
    
    # Cleanup
    if os.path.exists(pdf_path):
        os.unlink(pdf_path) 

================================================
FILE: src/ebook_mcp/tests/run_tests.py
================================================
#!/usr/bin/env python3
"""
Test runner script for ebook-mcp project.
This script runs all unit tests for the server components.
"""

import sys
import os
import pytest

def run_tests():
    """Run all tests for the ebook-mcp project"""
    
    # Add the src directory to Python path
    src_path = os.path.join(os.path.dirname(__file__), '..', '..')
    sys.path.insert(0, src_path)
    
    # Get the tests directory
    tests_dir = os.path.dirname(__file__)
    
    print("Running ebook-mcp unit tests...")
    print("=" * 50)
    
    # Run tests with pytest
    try:
        # Run tests with verbose output and short tracebacks
        result = pytest.main([
            tests_dir,
            '-v',
            '--tb=short',
            '--strict-markers',
            '--disable-warnings'
        ])
        
        if result == 0:
            print("\n" + "=" * 50)
            print("✅ All tests passed!")
            return True
        else:
            print("\n" + "=" * 50)
            print("❌ Some tests failed!")
            return False
            
    except Exception as e:
        print(f"Error running tests: {e}")
        return False


def run_specific_test(test_file):
    """Run a specific test file"""
    src_path = os.path.join(os.path.dirname(__file__), '..', '..')
    sys.path.insert(0, src_path)
    
    test_path = os.path.join(os.path.dirname(__file__), test_file)
    
    print(f"Running specific test: {test_file}")
    print("=" * 50)
    
    try:
        result = pytest.main([
            test_path,
            '-v',
            '--tb=short'
        ])
        
        if result == 0:
            print("\n" + "=" * 50)
            print("✅ Test passed!")
            return True
        else:
            print("\n" + "=" * 50)
            print("❌ Test failed!")
            return False
            
    except Exception as e:
        print(f"Error running test: {e}")
        return False


def list_tests():
    """List all available test files"""
    tests_dir = os.path.dirname(__file__)
    test_files = []
    
    for file in os.listdir(tests_dir):
        if file.startswith('test_') and file.endswith('.py'):
            test_files.append(file)
    
    print("Available test files:")
    print("=" * 30)
    for test_file in sorted(test_files):
        print(f"  - {test_file}")
    
    return test_files


if __name__ == "__main__":
    # Exit with a non-zero status on failure so CI and shell scripts can detect it
    success = True
    if len(sys.argv) > 1:
        command = sys.argv[1]
        
        if command == "list":
            list_tests()
        elif command == "run":
            if len(sys.argv) > 2:
                test_file = sys.argv[2]
                success = run_specific_test(test_file)
            else:
                success = run_tests()
        else:
            print("Usage:")
            print("  python run_tests.py list          - List all test files")
            print("  python run_tests.py run           - Run all tests")
            print("  python run_tests.py run test_file - Run specific test file")
            success = False
    else:
        success = run_tests()
    sys.exit(0 if success else 1)

================================================
FILE: src/ebook_mcp/tests/test_azw.py
================================================
import os
import pytest
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

try:
    from ebook_mcp.azw import find_azw_files, get_metadata, get_toc, get_chapter_text
    AZW_AVAILABLE = True
except ImportError:
    # Skip tests if azw module is not available
    AZW_AVAILABLE = False

@pytest.mark.skipif(not AZW_AVAILABLE, reason="AZW module not available")
def test_find_azw_files(tmp_path):
    # Create test directory structure
    test_dir = tmp_path / "test_azw"
    test_dir.mkdir()

    # Create test files
    (test_dir / "book1.azw").touch()
    (test_dir / "book2.azw3").touch()
    (test_dir / "book3.txt").touch()
    
    # Create subdirectory with more files
    sub_dir = test_dir / "subdir"
    sub_dir.mkdir()
    (sub_dir / "book4.azw").touch()

    # Test file finding
    azw_files = find_azw_files(str(test_dir))
    assert len(azw_files) == 3

    # Clean up test files
    # (handled automatically by pytest)

@pytest.mark.skipif(not AZW_AVAILABLE, reason="AZW module not available")
def test_get_metadata(test_azw_path):
    """Test metadata extraction

    Args:
        test_azw_path: Path to test AZW file (should be provided when running tests)
    """
    metadata = get_metadata(test_azw_path)
    
    # Verify basic metadata fields exist
    assert "title" in metadata
    assert "author" in metadata
    assert "publisher" in metadata
    assert "publication_date" in metadata
    assert "language" in metadata
    assert "isbn" in metadata

@pytest.mark.skipif(not AZW_AVAILABLE, reason="AZW module not available")
def test_get_toc(test_azw_path):
    """Test table of contents extraction

    Args:
        test_azw_path: Path to test AZW file (should be provided when running tests)
    """
    toc = get_toc(test_azw_path)
    
    # Verify TOC structure
    assert isinstance(toc, list)
    for entry in toc:
        assert isinstance(entry, tuple)
        assert len(entry) == 2
        assert isinstance(entry[0], str)  # title
        assert isinstance(entry[1], str)  # chapter_id

@pytest.mark.skipif(not AZW_AVAILABLE, reason="AZW module not available")
def test_get_chapter_text(test_azw_path):
    """Test chapter text extraction

    Args:
        test_azw_path: Path to test AZW file (should be provided when running tests)
    """
    toc = get_toc(test_azw_path)
    if not toc:
        pytest.skip("No TOC available in test file")
    
    chapter_id = toc[0][1]
    text = get_chapter_text(test_azw_path, chapter_id)
    
    assert isinstance(text, str)
    assert len(text) > 0 

================================================
FILE: src/ebook_mcp/tests/test_basic.py
================================================
import pytest
import os
import tempfile
from unittest.mock import Mock, patch

# Test basic file operations that don't require external dependencies

def test_get_all_epub_files_basic():
    """Test basic EPUB file discovery without external dependencies"""
    with tempfile.TemporaryDirectory() as temp_dir:
        # Create mock EPUB files
        epub_files = ["book1.epub", "book2.epub", "document.txt"]
        for file in epub_files:
            with open(os.path.join(temp_dir, file), 'w') as f:
                f.write("mock content")
        
        # Test the basic file discovery logic
        result = [f for f in os.listdir(temp_dir) if f.endswith('.epub')]
        assert set(result) == {"book1.epub", "book2.epub"}

def test_get_all_pdf_files_basic():
    """Test basic PDF file discovery without external dependencies"""
    with tempfile.TemporaryDirectory() as temp_dir:
        # Create mock PDF files
        pdf_files = ["document1.pdf", "document2.pdf", "text.txt"]
        for file in pdf_files:
            with open(os.path.join(temp_dir, file), 'w') as f:
                f.write("mock content")
        
        # Test the basic file discovery logic
        result = [f for f in os.listdir(temp_dir) if f.endswith('.pdf')]
        assert set(result) == {"document1.pdf", "document2.pdf"}

def test_file_not_found_error():
    """Test file not found error handling"""
    with pytest.raises(FileNotFoundError):
        with open("/nonexistent/file.txt", 'r') as f:
            pass

def test_temp_file_operations():
    """Test temporary file operations"""
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        f.write("test content")
        temp_path = f.name
    
    try:
        # Verify file was created
        assert os.path.exists(temp_path)
        
        # Read content
        with open(temp_path, 'r') as f:
            content = f.read()
        assert content == "test content"
    finally:
        # Clean up
        if os.path.exists(temp_path):
            os.unlink(temp_path)

def test_directory_operations():
    """Test directory operations"""
    with tempfile.TemporaryDirectory() as temp_dir:
        # Create subdirectory
        sub_dir = os.path.join(temp_dir, "subdir")
        os.makedirs(sub_dir)
        
        # Create files in subdirectory
        files = ["file1.txt", "file2.txt"]
        for file in files:
            with open(os.path.join(sub_dir, file), 'w') as f:
                f.write(f"content for {file}")
        
        # List files
        result = os.listdir(sub_dir)
        assert set(result) == set(files)

@pytest.mark.parametrize("file_extension,expected_count", [
    (".epub", 2),
    (".pdf", 1),
    (".txt", 3),
])
def test_file_filtering(file_extension, expected_count):
    """Test file filtering by extension"""
    with tempfile.TemporaryDirectory() as temp_dir:
        # Create test files
        test_files = [
            "book1.epub",
            "book2.epub", 
            "document.pdf",
            "file1.txt",
            "file2.txt",
            "file3.txt"
        ]
        
        for file in test_files:
            with open(os.path.join(temp_dir, file), 'w') as f:
                f.write("content")
        
        # Filter by extension
        result = [f for f in os.listdir(temp_dir) if f.endswith(file_extension)]
        assert len(result) == expected_count

def test_mock_basic_operations():
    """Test basic mock operations"""
    mock_file = Mock()
    mock_file.read.return_value = "mock content"
    mock_file.write.return_value = None
    
    # Test mock behavior
    assert mock_file.read() == "mock content"
    mock_file.write("test")
    mock_file.write.assert_called_once_with("test")

def test_patch_basic():
    """Test basic patch functionality"""
    with patch('os.path.exists', return_value=False):
        assert not os.path.exists("/any/path")
    
    with patch('os.path.exists', return_value=True):
        assert os.path.exists("/any/path") 

================================================
FILE: src/ebook_mcp/tests/test_epub_chapter_extraction.py
================================================
import pytest
import tempfile
import os
from unittest.mock import Mock, patch, MagicMock

# Add project root to path
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

# Skip tests if dependencies are not available
try:
    from ebooklib import epub
    from bs4 import BeautifulSoup
    import html2text
    DEPENDENCIES_AVAILABLE = True
except ImportError:
    DEPENDENCIES_AVAILABLE = False

if DEPENDENCIES_AVAILABLE:
    from ebook_mcp.tools.epub_helper import (
        extract_chapter_html,
        extract_chapter_markdown,
        clean_html,
        convert_html_to_markdown
    )


class TestExtractChapterHtml:
    """Test the improved version of extract_chapter_html function"""
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_simple_chapter_extraction(self):
        """Test simple chapter extraction without subchapters"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock TOC structure
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        mock_book.toc = [mock_chapter1, mock_chapter2]
        
        # Mock HTML content
        html_content = """
        <html>
            <body>
                <h1 id="chapter1">Chapter 1</h1>
                <p>Chapter 1 content</p>
                
                <h1 id="chapter2">Chapter 2</h1>
                <p>Chapter 2 content</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Extract Chapter 1 as markdown, using a chapter ID that exists in the TOC
        result = extract_chapter_markdown(mock_book, "chapter1.xhtml")
        
        # Should include Chapter 1 content but not Chapter 2
        assert "Chapter 1 content" in result
        assert "Chapter 2 content" not in result
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_chapter_with_subchapters_bug_case(self):
        """Test the specific bug case: chapter with subchapters causing premature truncation"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock TOC structure with subchapters (the problematic case)
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_subchapter1_3 = Mock()
        mock_subchapter1_3.title = "1.3 Append-only"
        mock_subchapter1_3.href = "chapter1.xhtml#section1_3"
        
        mock_subchapter1_4 = Mock()
        mock_subchapter1_4.title = "1.4 Another Section"
        mock_subchapter1_4.href = "chapter1.xhtml#section1_4"
        
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        # Set up nested TOC
        mock_book.toc = [
            (mock_chapter1, [mock_subchapter1_3, mock_subchapter1_4]),
            mock_chapter2
        ]
        
        # Mock HTML content that matches the bug report
        html_content = """
        <html>
            <body>
                <h1 id="chapter1">Chapter 1</h1>
                <p>Chapter 1 introduction</p>
                
                <h2 id="section1_3">1.3 Append-only</h2>
                <h3 id="subsection">Safe incremental updates with logs</h3>
                <p>One way to do incremental updates is to just append the updates to a file. 
                This is called a "log" because it's append-only. It's safer than in-place updates 
                because no data is overwritten; you can always recover the old data after a crash.</p>
                
                <h2 id="section1_4">1.4 Another Section</h2>
                <p>Another section content</p>
                
                <h1 id="chapter2">Chapter 2</h1>
                <p>Chapter 2 content</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Test extracting section1_3 (the problematic case)
        result = extract_chapter_html(mock_book, "chapter1.xhtml#section1_3")
        
        # Should include the full section content
        assert "1.3 Append-only" in result
        assert "Safe incremental updates with logs" in result
        assert "One way to do incremental updates" in result
        assert "This is called a \"log\" because it's append-only" in result
        
        # Should NOT include content from other sections
        assert "1.4 Another Section" not in result
        assert "Another section content" not in result
        assert "Chapter 2 content" not in result
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_comparison_with_original_function(self):
        """Regression check: the improved function returns only the requested section"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock TOC structure
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_subchapter1_3 = Mock()
        mock_subchapter1_3.title = "1.3 Append-only"
        mock_subchapter1_3.href = "chapter1.xhtml#section1_3"
        
        mock_subchapter1_4 = Mock()
        mock_subchapter1_4.title = "1.4 Another Section"
        mock_subchapter1_4.href = "chapter1.xhtml#section1_4"
        
        mock_book.toc = [
            (mock_chapter1, [mock_subchapter1_3, mock_subchapter1_4])
        ]
        
        # Mock HTML content
        html_content = """
        <html>
            <body>
                <h2 id="section1_3">1.3 Append-only</h2>
                <p>Section 1.3 content</p>
                
                <h2 id="section1_4">1.4 Another Section</h2>
                <p>Section 1.4 content</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Extract section 1.3 with the improved function
        improved_result = extract_chapter_html(mock_book, "chapter1.xhtml#section1_3")
        
        # The result should contain all of section 1.3 and stop before section 1.4
        assert "Section 1.3 content" in improved_result
        assert "Section 1.4 content" not in improved_result
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_markdown_conversion(self):
        """Test the fixed markdown conversion function"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock TOC structure
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_subchapter1_3 = Mock()
        mock_subchapter1_3.title = "1.3 Append-only"
        mock_subchapter1_3.href = "chapter1.xhtml#section1_3"
        
        mock_book.toc = [
            (mock_chapter1, [mock_subchapter1_3])
        ]
        
        # Mock HTML content
        html_content = """
        <html>
            <body>
                <h2 id="section1_3">1.3 Append-only</h2>
                <p>This is <strong>bold</strong> content.</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Test markdown conversion
        result = extract_chapter_markdown(mock_book, "chapter1.xhtml#section1_3")
        
        # The heading text and the emphasized word should survive the markdown conversion
        assert "1.3 Append-only" in result
        assert "bold" in result
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_edge_cases(self):
        """Test edge cases and error conditions"""
        # Mock EPUB book
        mock_book = Mock()
        mock_book.toc = []
        
        # Test with non-existent chapter
        from ebook_mcp.tools.epub_helper import EpubProcessingError
        with pytest.raises(EpubProcessingError, match="not found in TOC"):
            extract_chapter_html(mock_book, "nonexistent.xhtml")
        
        # Test with non-existent anchor
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        mock_book.toc = [mock_chapter1]
        
        mock_item = Mock()
        mock_item.get_content.return_value = "<html><body><h1>Test</h1></body></html>".encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        with pytest.raises(EpubProcessingError, match="not found in"):
            extract_chapter_html(mock_book, "chapter1.xhtml#nonexistent")
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_last_chapter_extraction(self):
        """Test extracting the last chapter (no next chapter)"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock TOC structure
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        mock_book.toc = [mock_chapter1, mock_chapter2]
        
        # Mock HTML content
        html_content = """
        <html>
            <body>
                <h1 id="chapter2">Chapter 2</h1>
                <p>Chapter 2 content</p>
                <p>More content</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Test extracting the last chapter
        result = extract_chapter_html(mock_book, "chapter2.xhtml")
        
        # Should include all content (no next chapter to truncate at)
        assert "Chapter 2 content" in result
        assert "More content" in result
    
    @pytest.mark.skipif(not DEPENDENCIES_AVAILABLE, reason="Dependencies not available")
    def test_complex_nested_toc(self):
        """Test with complex nested TOC structure"""
        # Mock EPUB book
        mock_book = Mock()
        
        # Mock complex TOC structure
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        
        mock_subchapter1_1 = Mock()
        mock_subchapter1_1.title = "1.1 Introduction"
        mock_subchapter1_1.href = "chapter1.xhtml#intro"
        
        mock_subchapter1_2 = Mock()
        mock_subchapter1_2.title = "1.2 Background"
        mock_subchapter1_2.href = "chapter1.xhtml#background"
        
        mock_subchapter1_3 = Mock()
        mock_subchapter1_3.title = "1.3 Append-only"
        mock_subchapter1_3.href = "chapter1.xhtml#section1_3"
        
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        # Set up nested TOC
        mock_book.toc = [
            (mock_chapter1, [mock_subchapter1_1, mock_subchapter1_2, mock_subchapter1_3]),
            mock_chapter2
        ]
        
        # Mock HTML content
        html_content = """
        <html>
            <body>
                <h1 id="chapter1">Chapter 1</h1>
                
                <h2 id="intro">1.1 Introduction</h2>
                <p>Introduction content</p>
                
                <h2 id="background">1.2 Background</h2>
                <p>Background content</p>
                
                <h2 id="section1_3">1.3 Append-only</h2>
                <p>Section 1.3 content</p>
                <p>More content in section 1.3</p>
                
                <h1 id="chapter2">Chapter 2</h1>
                <p>Chapter 2 content</p>
            </body>
        </html>
        """
        
        # Mock book.get_item_with_href
        mock_item = Mock()
        mock_item.get_content.return_value = html_content.encode('utf-8')
        mock_book.get_item_with_href.return_value = mock_item
        
        # Test extracting section1_3
        result = extract_chapter_html(mock_book, "chapter1.xhtml#section1_3")
        
        # Should include section 1.3 content
        assert "1.3 Append-only" in result
        assert "Section 1.3 content" in result
        assert "More content in section 1.3" in result
        
        # Should NOT include content from other sections
        assert "1.1 Introduction" not in result
        assert "Introduction content" not in result
        assert "1.2 Background" not in result
        assert "Background content" not in result
        assert "Chapter 2 content" not in result 

================================================
FILE: src/ebook_mcp/tests/test_epub_helper.py
================================================
import pytest
import os
import tempfile
from unittest.mock import Mock, patch, MagicMock

# Add the source root to sys.path, then fall back to mocks for missing dependencies
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

# Mock ebooklib
try:
    from ebooklib import epub
except ImportError:
    epub = Mock()

# Mock BeautifulSoup
try:
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = Mock()

from ebook_mcp.tools.epub_helper import (
    get_all_epub_files,
    get_meta,
    get_toc,
    read_epub,
    flatten_toc,
    extract_chapter_html,
    extract_chapter_plain_text,
    convert_html_to_markdown,
    clean_html
)


class TestEpubHelper:
    """Test EPUB helper functions"""
    
    def test_get_all_epub_files_empty_directory(self):
        """Test get_all_epub_files with empty directory"""
        with tempfile.TemporaryDirectory() as temp_dir:
            result = get_all_epub_files(temp_dir)
            assert result == []
    
    def test_get_all_epub_files_with_epub_files(self):
        """Test get_all_epub_files with EPUB files present"""
        with tempfile.TemporaryDirectory() as temp_dir:
            # Create mock EPUB files
            epub_files = ["book1.epub", "book2.epub", "document.txt"]
            for file in epub_files:
                with open(os.path.join(temp_dir, file), 'w') as f:
                    f.write("mock content")
            
            result = get_all_epub_files(temp_dir)
            assert set(result) == {"book1.epub", "book2.epub"}
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_get_toc_success(self, mock_read_epub):
        """Test get_toc successful case"""
        # Mock EPUB book with TOC
        mock_book = Mock()
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        mock_book.toc = [mock_chapter1, mock_chapter2]
        mock_read_epub.return_value = mock_book
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            result = get_toc(epub_path)
            expected = [
                ("Chapter 1", "chapter1.xhtml"),
                ("Chapter 2", "chapter2.xhtml")
            ]
            assert result == expected
        finally:
            os.unlink(epub_path)
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_get_toc_nested_structure(self, mock_read_epub):
        """Test get_toc with nested TOC structure"""
        # Mock EPUB book with nested TOC
        mock_book = Mock()
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        mock_subchapter1 = Mock()
        mock_subchapter1.title = "Subchapter 1.1"
        mock_subchapter1.href = "subchapter1.1.xhtml"
        
        mock_book.toc = [(mock_chapter1, [mock_subchapter1])]
        mock_read_epub.return_value = mock_book
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            result = get_toc(epub_path)
            expected = [
                ("Chapter 1", "chapter1.xhtml"),
                ("Subchapter 1.1", "subchapter1.1.xhtml")
            ]
            assert result == expected
        finally:
            os.unlink(epub_path)
    
    def test_get_toc_file_not_found(self):
        """Test get_toc with non-existent file"""
        with pytest.raises(FileNotFoundError):
            get_toc("/path/to/nonexistent.epub")
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_get_toc_parsing_error(self, mock_read_epub):
        """Test get_toc with parsing error"""
        mock_read_epub.side_effect = Exception("EPUB parsing error")
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            with pytest.raises(Exception):
                get_toc(epub_path)
        finally:
            os.unlink(epub_path)
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_get_meta_success(self, mock_read_epub):
        """Test get_meta successful case"""
        # Mock EPUB book with metadata
        mock_book = Mock()
        
        # Set up get_metadata to return values in the expected format
        def mock_get_metadata(namespace, field):
            metadata_map = {
                'title': [('Test Book', {})],
                'creator': [('Test Author', {})],
                'language': [('en', {})],
                'identifier': [('test-id', {})],
                'date': [('2023-01-01', {})],
                'publisher': [('Test Publisher', {})],
                'description': [('Test description', {})]
            }
            return metadata_map.get(field, [])
        
        mock_book.get_metadata = mock_get_metadata
        mock_read_epub.return_value = mock_book
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            result = get_meta(epub_path)
            expected = {
                'title': 'Test Book',
                'creator': ['Test Author'],
                'language': 'en',
                'identifier': 'test-id',
                'date': '2023-01-01',
                'publisher': 'Test Publisher',
                'description': 'Test description'
            }
            assert result == expected
        finally:
            os.unlink(epub_path)
    
    def test_get_meta_file_not_found(self):
        """Test get_meta with non-existent file"""
        with pytest.raises(FileNotFoundError):
            get_meta("/path/to/nonexistent.epub")
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_get_meta_parsing_error(self, mock_read_epub):
        """Test get_meta with parsing error"""
        mock_read_epub.side_effect = Exception("EPUB parsing error")
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            with pytest.raises(Exception):
                get_meta(epub_path)
        finally:
            os.unlink(epub_path)
    
    @patch('ebook_mcp.tools.epub_helper.epub.read_epub')
    def test_read_epub_success(self, mock_read_epub):
        """Test read_epub successful case"""
        mock_book = Mock()
        mock_read_epub.return_value = mock_book
        
        with tempfile.NamedTemporaryFile(suffix='.epub', delete=False) as f:
            f.write(b"mock epub content")
            epub_path = f.name
        
        try:
            result = read_epub(epub_path)
            assert result == mock_book
            mock_read_epub.assert_called_once_with(epub_path)
        finally:
            os.unlink(epub_path)
    
    def test_flatten_toc_simple(self):
        """Test flatten_toc with simple TOC structure"""
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        mock_chapter2 = Mock()
        mock_chapter2.title = "Chapter 2"
        mock_chapter2.href = "chapter2.xhtml"
        
        toc = [mock_chapter1, mock_chapter2]
        mock_book = Mock()
        mock_book.toc = toc
        result = flatten_toc(mock_book)
        
        expected = ["chapter1.xhtml", "chapter2.xhtml"]
        assert result == expected
    
    def test_flatten_toc_nested(self):
        """Test flatten_toc with nested TOC structure"""
        mock_chapter1 = Mock()
        mock_chapter1.title = "Chapter 1"
        mock_chapter1.href = "chapter1.xhtml"
        mock_subchapter1 = Mock()
        mock_subchapter1.title = "Subchapter 1.1"
        mock_subchapter1.href = "subchapter1.1.xhtml"
        
        toc = [(mock_chapter1, [mock_subchapter1])]
        mock_book = Mock()
        mock_book.toc = toc
        result = flatten_toc(mock_book)
        
        expected = ["chapter1.xhtml", "subchapter1.1.xhtml"]
        assert result == expected
    
    def test_clean_html(self):
        """Test clean_html function"""
        html_content = """
        <html>
            <head><title>Test</title></head>
            <body>
                <h1>Title</h1>
                <p>Content</p>
                <!-- Comment -->
                <script>alert('test');</script>
            </body>
        </html>
        """
        
        result = clean_html(html_content)
        
        # Should remove comments and scripts
        assert "<!-- Comment -->" not in result
        assert "<script>" not in result
        assert "alert('test');" not in result
        # Should keep content
        assert "<h1>Title</h1>" in result
        assert "<p>Content</p>" in result
    
    def test_convert_html_to_markdown(self):
        """Test convert_html_to_markdown function"""
        html_content = "<h1>Title</h1><p>This is <strong>bold</strong> text.</p>"
        
        result = convert_html_to_markdown(html_content)
        
        # Should convert HTML to markdown
        assert "# Title" in result
        assert "**bold**" in result
    
    @patch('ebook_mcp.tools.epub_helper.extract_chapter_html')
    def test_extract_chapter_plain_text(self, mock_extract_html):
        """Test extract_chapter_plain_text function"""
        mock_extract_html.return_value = "<h1>Title</h1><p>Content</p>"
        
        mock_book = Mock()
        result = extract_chapter_plain_text(mock_book, "chapter1")
        
        mock_extract_html.assert_called_once_with(mock_book, "chapter1")
        # Should return plain text (HTML tags removed)
        assert "<h1>" not in result
        assert "<p>" not in result
        assert "Title" in result
        assert "Content" in result
    
 

================================================
FILE: src/ebook_mcp/tests/test_logger_config.py
================================================
import pytest
import json
import os
import sys
import tempfile
import logging
from unittest.mock import patch, MagicMock
from ebook_mcp.tools.logger_config import (
    StructuredFormatter, 
    StructuredLogger, 
    setup_logger, 
    get_logger, 
    log_operation
)

class TestStructuredFormatter:
    """Test the structured formatter"""
    
    def test_format_basic_log(self):
        """Test basic log formatting"""
        formatter = StructuredFormatter()
        record = logging.LogRecord(
            name="test_logger",
            level=logging.INFO,
            pathname="test.py",
            lineno=10,
            msg="Test message",
            args=(),
            exc_info=None
        )
        
        result = formatter.format(record)
        log_entry = json.loads(result)
        
        assert log_entry["level"] == "INFO"
        assert log_entry["logger"] == "test_logger"
        assert log_entry["message"] == "Test message"
        assert log_entry["module"] == "test"
        # function name might be None in test environment
        assert log_entry["function"] in ["<module>", None]
        assert log_entry["line"] == 10
        assert "timestamp" in log_entry
    
    def test_format_with_extra_fields(self):
        """Test formatting with extra context fields"""
        formatter = StructuredFormatter()
        record = logging.LogRecord(
            name="test_logger",
            level=logging.ERROR,
            pathname="test.py",
            lineno=20,
            msg="Error message",
            args=(),
            exc_info=None
        )
        
        # Add extra fields
        record.file_path = "/test/file.epub"
        record.operation = "metadata_extraction"
        record.duration_ms = 150.5
        record.error_type = "FileNotFoundError"
        
        result = formatter.format(record)
        log_entry = json.loads(result)
        
        assert log_entry["file_path"] == "/test/file.epub"
        assert log_entry["operation"] == "metadata_extraction"
        assert log_entry["duration_ms"] == 150.5
        assert log_entry["error_type"] == "FileNotFoundError"
    
    def test_format_with_exception(self):
        """Test formatting with exception info"""
        formatter = StructuredFormatter()
        
        try:
            raise ValueError("Test exception")
        except ValueError:
            record = logging.LogRecord(
                name="test_logger",
                level=logging.ERROR,
                pathname="test.py",
                lineno=30,
                msg="Exception occurred",
                args=(),
                exc_info=sys.exc_info()
            )
        
        result = formatter.format(record)
        log_entry = json.loads(result)
        
        assert "exception" in log_entry
        assert log_entry["exception"]["type"] == "ValueError"
        assert log_entry["exception"]["message"] == "Test exception"
        assert "traceback" in log_entry["exception"]

class TestStructuredLogger:
    """Test the structured logger"""
    
    def test_logger_creation(self):
        """Test logger creation"""
        logger = StructuredLogger("test_module")
        assert logger.name == "test_module"
        assert isinstance(logger.logger, logging.Logger)
    
    def test_log_with_context(self):
        """Test that the logger exposes the standard level methods"""
        # Only the interface is checked here; no log output is emitted or inspected
        logger = StructuredLogger("test_module")
        assert logger.name == "test_module"
        assert hasattr(logger, 'info')
        assert hasattr(logger, 'debug')
        assert hasattr(logger, 'warning')
        assert hasattr(logger, 'error')
        assert hasattr(logger, 'critical')

class TestLogOperationDecorator:
    """Test the log_operation decorator"""
    
    def test_log_operation_success(self):
        """Test successful operation logging"""
        @log_operation("test_operation")
        def test_function():
            return "success"
        
        result = test_function()
        assert result == "success"
    
    def test_log_operation_failure(self):
        """Test failed operation logging"""
        @log_operation("test_operation")
        def test_function():
            raise ValueError("Test error")
        
        with pytest.raises(ValueError, match="Test error"):
            test_function()

class TestSetupLogger:
    """Test logger setup"""
    
    def test_setup_logger_creates_directory(self):
        """Test that setup_logger creates logs directory"""
        with tempfile.TemporaryDirectory() as temp_dir:
            with patch('ebook_mcp.tools.logger_config.os.path.dirname', return_value=temp_dir):
                with patch('ebook_mcp.tools.logger_config.os.makedirs') as mock_makedirs:
                    with patch('ebook_mcp.tools.logger_config.logging.FileHandler') as mock_file_handler:
                        mock_handler = MagicMock()
                        mock_file_handler.return_value = mock_handler
                        setup_logger()
                        mock_makedirs.assert_called_once()
    
    def test_setup_logger_configures_handlers(self):
        """Test that setup_logger configures handlers correctly"""
        with patch('ebook_mcp.tools.logger_config.logging.getLogger') as mock_get_logger:
            mock_logger = MagicMock()
            mock_get_logger.return_value = mock_logger
            
            setup_logger()
            
            # Verify handlers were added
            mock_logger.addHandler.assert_called()
            assert mock_logger.addHandler.call_count == 2  # file and console handlers

if __name__ == "__main__":
    pytest.main([__file__])


================================================
FILE: src/ebook_mcp/tests/test_main.py
================================================
import pytest
import os
import tempfile
from unittest.mock import Mock, patch, MagicMock
from typing import List, Dict, Union, Tuple

# Mock external dependencies before importing main
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

# Mock mcp.server.fastmcp
try:
    import mcp.server.fastmcp
except ImportError:
    sys.modules['mcp.server.fastmcp'] = Mock()
    sys.modules['mcp'] = Mock()
    sys.modules['mcp.server'] = Mock()

# Import the functions to test
from ebook_mcp.main import (
    get_all_epub_files,
    get_epub_metadata,
    get_epub_toc,
    get_all_pdf_files,
    get_pdf_metadata,
    get_pdf_toc,
    get_pdf_page_text,
    get_pdf_page_markdown,
    get_pdf_chapter_content
)


class TestEpubFunctions:
    """Test EPUB related functions"""
    
    def test_get_all_epub_files_empty_directory(self):
        """Test get_all_epub_files with empty directory"""
        with tempfile.TemporaryDirectory() as temp_dir:
            result = get_all_epub_files(temp_dir)
            assert result == []
    
    def test_get_all_epub_files_with_epub_files(self):
        """Test get_all_epub_files with EPUB files present"""
        with tempfile.TemporaryDirectory() as temp_dir:
            # Create mock EPUB files
            epub_files = ["book1.epub", "book2.epub", "document.txt"]
            for file in epub_files:
                with open(os.path.join(temp_dir, file), 'w') as f:
                    f.write("mock content")
            
            result = get_all_epub_files(temp_dir)
            assert set(result) == {"book1.epub", "book2.epub"}
    
    @patch('ebook_mcp.main.epub_helper.get_meta')
    def test_get_epub_metadata_success(self, mock_get_meta):
        """Test get_epub_metadata successful case"""
        mock_metadata = {
            'title': 'Test Book',
            'author': 'Test Author',
            'language': 'en'
        }
        mock_get_meta.return_value = mock_metadata
        
        result = get_epub_metadata("/path/to/test.epub")
        assert result == mock_metadata
        mock_get_meta.assert_called_once_with("/path/to/test.epub")
    
    @patch('ebook_mcp.main.epub_helper.get_meta')
    def test_get_epub_metadata_file_not_found(self, mock_get_meta):
        """Test get_epub_metadata with file not found"""
        mock_get_meta.side_effect = FileNotFoundError("File not found")
        
        with pytest.raises(FileNotFoundError):
            get_epub_metadata("/path/to/nonexistent.epub")
    
    @patch('ebook_mcp.main.epub_helper.get_meta')
    def test_get_epub_metadata_parsing_error(self, mock_get_meta):
        """Test get_epub_metadata with parsing error"""
        mock_get_meta.side_effect = Exception("Parsing error")
        
        with pytest.raises(Exception):
            get_epub_metadata("/path/to/corrupted.epub")
    
    @patch('ebook_mcp.main.epub_helper.get_toc')
    def test_get_epub_toc_success(self, mock_get_toc):
        """Test get_epub_toc successful case"""
        mock_toc = [
            ("Chapter 1", "chapter1.xhtml"),
            ("Chapter 2", "chapter2.xhtml")
        ]
        mock_get_toc.return_value = mock_toc
        
        result = get_epub_toc("/path/to/test.epub")
        assert result == mock_toc
        mock_get_toc.assert_called_once_with("/path/to/test.epub")
    
    @patch('ebook_mcp.main.epub_helper.get_toc')
    def test_get_epub_toc_file_not_found(self, mock_get_toc):
        """Test get_epub_toc with file not found"""
        mock_get_toc.side_effect = FileNotFoundError("File not found")
        
        with pytest.raises(FileNotFoundError):
            get_epub_toc("/path/to/nonexistent.epub")


class TestPdfFunctions:
    """Test PDF related functions"""
    
    def test_get_all_pdf_files_empty_directory(self):
        """Test get_all_pdf_files with empty directory"""
        with tempfile.TemporaryDirectory() as temp_dir:
            result = get_all_pdf_files(temp_dir)
            assert result == []
    
    def test_get_all_pdf_files_with_pdf_files(self):
        """Test get_all_pdf_files with PDF files present"""
        with tempfile.TemporaryDirectory() as temp_dir:
            # Create mock PDF files
            pdf_files = ["document1.pdf", "document2.pdf", "text.txt"]
            for file in pdf_files:
                with open(os.path.join(temp_dir, file), 'w') as f:
                    f.write("mock content")
            
            result = get_all_pdf_files(temp_dir)
            assert set(result) == {"document1.pdf", "document2.pdf"}
    
    @patch('ebook_mcp.main.pdf_helper.get_meta')
    def test_get_pdf_metadata_success(self, mock_get_meta):
        """Test get_pdf_metadata successful case"""
        mock_metadata = {
            'title': 'Test PDF',
            'author': 'Test Author',
            'pages': 10
        }
        mock_get_meta.return_value = mock_metadata
        
        result = get_pdf_metadata("/path/to/test.pdf")
        assert result == mock_metadata
        mock_get_meta.assert_called_once_with("/path/to/test.pdf")
    
    @patch('ebook_mcp.main.pdf_helper.get_meta')
    def test_get_pdf_metadata_file_not_found(self, mock_get_meta):
        """Test get_pdf_metadata with file not found"""
        mock_get_meta.side_effect = FileNotFoundError("File not found")
        
        with pytest.raises(FileNotFoundError):
            get_pdf_metadata("/path/to/nonexistent.pdf")
    
    @patch('ebook_mcp.main.pdf_helper.get_meta')
    def test_get_pdf_metadata_parsing_error(self, mock_get_meta):
        """Test get_pdf_metadata with parsing error"""
        mock_get_meta.side_effect = Exception("Parsing error")
        
        with pytest.raises(Exception):
            get_pdf_metadata("/path/to/corrupted.pdf")
    
    @patch('ebook_mcp.main.pdf_helper.get_toc')
    def test_get_pdf_toc_success(self, mock_get_toc):
        """Test get_pdf_toc successful case"""
        mock_toc = [
            ("Chapter 1", 1),
            ("Chapter 2", 5)
        ]
        mock_get_toc.return_value = mock_toc
        
        result = get_pdf_toc("/path/to/test.pdf")
        assert result == mock_toc
        mock_get_toc.assert_called_once_with("/path/to/test.pdf")
    
    @patch('ebook_mcp.main.pdf_helper.get_toc')
    def test_get_pdf_toc_file_not_found(self, mock_get_toc):
        """Test get_pdf_toc with file not found"""
        mock_get_toc.side_effect = FileNotFoundError("File not found")
        
        with pytest.raises(FileNotFoundError):
            get_pdf_toc("/path/to/nonexistent.pdf")
    
    @patch('ebook_mcp.main.pdf_helper.extract_page_text')
    def test_get_pdf_page_text_success(self, mock_extract):
        """Test get_pdf_page_text successful case"""
        mock_extract.return_value = "This is page 1 content."
        
        result = get_pdf_page_text("/path/to/test.pdf", 1)
        assert result == "This is page 1 content."
        mock_extract.assert_called_once_with("/path/to/test.pdf", 1)
    
    @patch('ebook_mcp.main.pdf_helper.extract_page_text')
    def test_get_pdf_page_text_error(self, mock_extract):
        """Test get_pdf_page_text with error"""
        mock_extract.side_effect = Exception("Extraction error")
        
        with pytest.raises(Exception):
            get_pdf_page_text("/path/to/test.pdf", 1)
    
    @patch('ebook_mcp.main.pdf_helper.extract_page_markdown')
    def test_get_pdf_page_markdown_success(self, mock_extract):
        """Test get_pdf_page_markdown successful case"""
        mock_extract.return_value = "# Page 1\n\nThis is page 1 content."
        
        result = get_pdf_page_markdown("/path/to/test.pdf", 1)
        assert result == "# Page 1\n\nThis is page 1 content."
        mock_extract.assert_called_once_with("/path/to/test.pdf", 1)
    
    @patch('ebook_mcp.main.pdf_helper.extract_page_markdown')
    def test_get_pdf_page_markdown_error(self, mock_extract):
        """Test get_pdf_page_markdown with error"""
        mock_extract.side_effect = Exception("Extraction error")
        
        with pytest.raises(Exception):
            get_pdf_page_markdown("/path/to/test.pdf", 1)
    
    @patch('ebook_mcp.main.pdf_helper.extract_chapter_by_title')
    def test_get_pdf_chapter_content_success(self, mock_get_chapter):
        """Test get_pdf_chapter_content successful case"""
        mock_content = ("This is chapter content.", [1, 2, 3])
        mock_get_chapter.return_value = mock_content
        
        result = get_pdf_chapter_content("/path/to/test.pdf", "Chapter 1")
        assert result == mock_content
        mock_get_chapter.assert_called_once_with("/path/to/test.pdf", "Chapter 1")
    
    @patch('ebook_mcp.main.pdf_helper.extract_chapt

SYMBOL INDEX (189 symbols across 16 files)

FILE: src/ebook_mcp/main.py
  function handle_mcp_errors (line 16) | def handle_mcp_errors(func: Callable[..., T]) -> Callable[..., T]:
  function handle_pdf_errors (line 36) | def handle_pdf_errors(func: Callable[..., T]) -> Callable[..., T]:
  function get_all_epub_files (line 76) | def get_all_epub_files(path: str) -> List[str]:
  function get_epub_metadata (line 83) | def get_epub_metadata(epub_path:str) -> Dict[str, Union[str, List[str]]]:
  function get_epub_toc (line 102) | def get_epub_toc(epub_path: str) -> List[Tuple[str, str]]:
  function get_epub_chapter_markdown (line 120) | def get_epub_chapter_markdown(epub_path:str, chapter_id: str) -> str:
  function get_all_pdf_files (line 146) | def get_all_pdf_files(path: str) -> List[str]:
  function get_pdf_metadata (line 153) | def get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]:
  function get_pdf_toc (line 171) | def get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]:
  function get_pdf_page_text (line 189) | def get_pdf_page_text(pdf_path: str, page_number: int) -> str:
  function get_pdf_page_markdown (line 204) | def get_pdf_page_markdown(pdf_path: str, page_number: int) -> str:
  function get_pdf_chapter_content (line 219) | def get_pdf_chapter_content(pdf_path: str, chapter_title: str) -> Tuple[...
  function cli_entry (line 238) | def cli_entry():

FILE: src/ebook_mcp/mcp_client_example/anthropic_example.py
  class MCPClient (line 32) | class MCPClient:
    method __init__ (line 37) | def __init__(self):
    method _estimate_tokens (line 47) | def _estimate_tokens(self, text: str) -> int:
    method _trim_messages_history (line 54) | def _trim_messages_history(self):
    method add_message (line 77) | def add_message(self, role: str, content: str):
    method connect_to_server (line 85) | async def connect_to_server(self, server_script_path: str):
    method process_query (line 120) | async def process_query(self, query: str) -> str:
    method chat_loop (line 177) | async def chat_loop(self):
    method cleanup (line 206) | async def cleanup(self):
  function main (line 213) | async def main():

FILE: src/ebook_mcp/mcp_client_example/deepseek_example.py
  function setup_logging (line 18) | def setup_logging():
  class MCPClient (line 38) | class MCPClient:
    method __init__ (line 39) | def __init__(self):
    method count_tokens (line 55) | def count_tokens(self, messages):
    method trim_conversation_history (line 63) | def trim_conversation_history(self):
    method connect_to_server (line 69) | async def connect_to_server(self, server_script_path: str):
    method process_query (line 96) | async def process_query(self, query: str) -> str:
    method chat_loop (line 219) | async def chat_loop(self):
    method cleanup (line 241) | async def cleanup(self):
  function main (line 246) | async def main():

FILE: src/ebook_mcp/mcp_client_example/openai_example.py
  function setup_logging (line 19) | def setup_logging():
  class MCPClient (line 39) | class MCPClient:
    method __init__ (line 40) | def __init__(self):
    method count_tokens (line 51) | def count_tokens(self, messages):
    method trim_conversation_history (line 72) | def trim_conversation_history(self):
    method connect_to_server (line 78) | async def connect_to_server(self, server_script_path: str):
    method process_query (line 105) | async def process_query(self, query: str) -> str:
    method chat_loop (line 192) | async def chat_loop(self):
    method cleanup (line 214) | async def cleanup(self):
  function main (line 219) | async def main():

FILE: src/ebook_mcp/tests/conftest.py
  function temp_dir (line 8) | def temp_dir():
  function mock_epub_book (line 15) | def mock_epub_book():
  function mock_pdf_reader (line 31) | def mock_pdf_reader():
  function mock_pdf_document (line 48) | def mock_pdf_document():
  function sample_epub_files (line 60) | def sample_epub_files():
  function sample_pdf_files (line 66) | def sample_pdf_files():
  function temp_epub_file (line 72) | def temp_epub_file():
  function temp_pdf_file (line 86) | def temp_pdf_file():

FILE: src/ebook_mcp/tests/run_tests.py
  function run_tests (line 12) | def run_tests():
  function run_specific_test (line 50) | def run_specific_test(test_file):
  function list_tests (line 81) | def list_tests():

FILE: src/ebook_mcp/tests/test_azw.py
  function test_find_azw_files (line 14) | def test_find_azw_files(tmp_path):
  function test_get_metadata (line 37) | def test_get_metadata(test_azw_path):
  function test_get_toc (line 54) | def test_get_toc(test_azw_path):
  function test_get_chapter_text (line 71) | def test_get_chapter_text(test_azw_path):

FILE: src/ebook_mcp/tests/test_basic.py
  function test_get_all_epub_files_basic (line 8) | def test_get_all_epub_files_basic():
  function test_get_all_pdf_files_basic (line 21) | def test_get_all_pdf_files_basic():
  function test_file_not_found_error (line 34) | def test_file_not_found_error():
  function test_temp_file_operations (line 40) | def test_temp_file_operations():
  function test_directory_operations (line 59) | def test_directory_operations():
  function test_file_filtering (line 81) | def test_file_filtering(file_extension, expected_count):
  function test_mock_basic_operations (line 102) | def test_mock_basic_operations():
  function test_patch_basic (line 113) | def test_patch_basic():

FILE: src/ebook_mcp/tests/test_epub_chapter_extraction.py
  class TestExtractChapterHtml (line 28) | class TestExtractChapterHtml:
    method test_simple_chapter_extraction (line 32) | def test_simple_chapter_extraction(self):
    method test_chapter_with_subchapters_bug_case (line 74) | def test_chapter_with_subchapters_bug_case(self):
    method test_comparison_with_original_function (line 144) | def test_comparison_with_original_function(self):
    method test_markdown_conversion (line 192) | def test_markdown_conversion(self):
    method test_edge_cases (line 233) | def test_edge_cases(self):
    method test_last_chapter_extraction (line 258) | def test_last_chapter_extraction(self):
    method test_complex_nested_toc (line 298) | def test_complex_nested_toc(self):

FILE: src/ebook_mcp/tests/test_epub_helper.py
  class TestEpubHelper (line 36) | class TestEpubHelper:
    method test_get_all_epub_files_empty_directory (line 39) | def test_get_all_epub_files_empty_directory(self):
    method test_get_all_epub_files_with_epub_files (line 45) | def test_get_all_epub_files_with_epub_files(self):
    method test_get_toc_success (line 58) | def test_get_toc_success(self, mock_read_epub):
    method test_get_toc_nested_structure (line 87) | def test_get_toc_nested_structure(self, mock_read_epub):
    method test_get_toc_file_not_found (line 115) | def test_get_toc_file_not_found(self):
    method test_get_toc_parsing_error (line 121) | def test_get_toc_parsing_error(self, mock_read_epub):
    method test_get_meta_success (line 136) | def test_get_meta_success(self, mock_read_epub):
    method test_get_meta_file_not_found (line 176) | def test_get_meta_file_not_found(self):
    method test_get_meta_parsing_error (line 182) | def test_get_meta_parsing_error(self, mock_read_epub):
    method test_read_epub_success (line 197) | def test_read_epub_success(self, mock_read_epub):
    method test_flatten_toc_simple (line 213) | def test_flatten_toc_simple(self):
    method test_flatten_toc_nested (line 230) | def test_flatten_toc_nested(self):
    method test_clean_html (line 247) | def test_clean_html(self):
    method test_convert_html_to_markdown (line 271) | def test_convert_html_to_markdown(self):
    method test_extract_chapter_plain_text (line 282) | def test_extract_chapter_plain_text(self, mock_extract_html):

FILE: src/ebook_mcp/tests/test_logger_config.py
  class TestStructuredFormatter (line 16) | class TestStructuredFormatter:
    method test_format_basic_log (line 19) | def test_format_basic_log(self):
    method test_format_with_extra_fields (line 44) | def test_format_with_extra_fields(self):
    method test_format_with_exception (line 71) | def test_format_with_exception(self):
  class TestStructuredLogger (line 96) | class TestStructuredLogger:
    method test_logger_creation (line 99) | def test_logger_creation(self):
    method test_log_with_context (line 105) | def test_log_with_context(self):
  class TestLogOperationDecorator (line 116) | class TestLogOperationDecorator:
    method test_log_operation_success (line 119) | def test_log_operation_success(self):
    method test_log_operation_failure (line 128) | def test_log_operation_failure(self):
  class TestSetupLogger (line 137) | class TestSetupLogger:
    method test_setup_logger_creates_directory (line 140) | def test_setup_logger_creates_directory(self):
    method test_setup_logger_configures_handlers (line 151) | def test_setup_logger_configures_handlers(self):

FILE: src/ebook_mcp/tests/test_main.py
  class TestEpubFunctions (line 33) | class TestEpubFunctions:
    method test_get_all_epub_files_empty_directory (line 36) | def test_get_all_epub_files_empty_directory(self):
    method test_get_all_epub_files_with_epub_files (line 42) | def test_get_all_epub_files_with_epub_files(self):
    method test_get_epub_metadata_success (line 55) | def test_get_epub_metadata_success(self, mock_get_meta):
    method test_get_epub_metadata_file_not_found (line 69) | def test_get_epub_metadata_file_not_found(self, mock_get_meta):
    method test_get_epub_metadata_parsing_error (line 77) | def test_get_epub_metadata_parsing_error(self, mock_get_meta):
    method test_get_epub_toc_success (line 85) | def test_get_epub_toc_success(self, mock_get_toc):
    method test_get_epub_toc_file_not_found (line 98) | def test_get_epub_toc_file_not_found(self, mock_get_toc):
  class TestPdfFunctions (line 106) | class TestPdfFunctions:
    method test_get_all_pdf_files_empty_directory (line 109) | def test_get_all_pdf_files_empty_directory(self):
    method test_get_all_pdf_files_with_pdf_files (line 115) | def test_get_all_pdf_files_with_pdf_files(self):
    method test_get_pdf_metadata_success (line 128) | def test_get_pdf_metadata_success(self, mock_get_meta):
    method test_get_pdf_metadata_file_not_found (line 142) | def test_get_pdf_metadata_file_not_found(self, mock_get_meta):
    method test_get_pdf_metadata_parsing_error (line 150) | def test_get_pdf_metadata_parsing_error(self, mock_get_meta):
    method test_get_pdf_toc_success (line 158) | def test_get_pdf_toc_success(self, mock_get_toc):
    method test_get_pdf_toc_file_not_found (line 171) | def test_get_pdf_toc_file_not_found(self, mock_get_toc):
    method test_get_pdf_page_text_success (line 179) | def test_get_pdf_page_text_success(self, mock_extract):
    method test_get_pdf_page_text_error (line 188) | def test_get_pdf_page_text_error(self, mock_extract):
    method test_get_pdf_page_markdown_success (line 196) | def test_get_pdf_page_markdown_success(self, mock_extract):
    method test_get_pdf_page_markdown_error (line 205) | def test_get_pdf_page_markdown_error(self, mock_extract):
    method test_get_pdf_chapter_content_success (line 213) | def test_get_pdf_chapter_content_success(self, mock_get_chapter):
    method test_get_pdf_chapter_content_error (line 223) | def test_get_pdf_chapter_content_error(self, mock_get_chapter):
  class TestMainModule (line 231) | class TestMainModule:
    method test_main_module_imports (line 234) | def test_main_module_imports(self):
    method test_cli_entry_function (line 242) | def test_cli_entry_function(self):
  class TestDecorators (line 255) | class TestDecorators:
    method test_handle_mcp_errors_file_not_found (line 258) | def test_handle_mcp_errors_file_not_found(self):
    method test_handle_mcp_errors_general_exception (line 269) | def test_handle_mcp_errors_general_exception(self):
    method test_handle_pdf_errors (line 280) | def test_handle_pdf_errors(self):
    method test_decorator_preserves_return_value (line 291) | def test_decorator_preserves_return_value(self):
    method test_handle_mcp_errors_with_custom_exceptions (line 302) | def test_handle_mcp_errors_with_custom_exceptions(self):

FILE: src/ebook_mcp/tests/test_pdf_helper.py
  class TestPdfHelper (line 26) | class TestPdfHelper:
    method test_get_all_pdf_files_empty_directory (line 29) | def test_get_all_pdf_files_empty_directory(self):
    method test_get_all_pdf_files_with_pdf_files (line 35) | def test_get_all_pdf_files_with_pdf_files(self):
    method test_get_meta_success (line 48) | def test_get_meta_success(self, mock_fitz_open):
    method test_get_meta_no_metadata (line 99) | def test_get_meta_no_metadata(self, mock_fitz_open):
    method test_get_meta_file_not_found (line 134) | def test_get_meta_file_not_found(self):
    method test_get_meta_parsing_error (line 140) | def test_get_meta_parsing_error(self, mock_fitz_open):
    method test_get_toc_success (line 155) | def test_get_toc_success(self, mock_fitz_open):
    method test_get_toc_empty (line 182) | def test_get_toc_empty(self, mock_fitz_open):
    method test_get_toc_file_not_found (line 199) | def test_get_toc_file_not_found(self):
    method test_get_toc_parsing_error (line 205) | def test_get_toc_parsing_error(self, mock_fitz_open):
    method test_extract_page_text_success (line 220) | def test_extract_page_text_success(self, mock_fitz_open):
    method test_extract_page_text_page_not_found (line 240) | def test_extract_page_text_page_not_found(self, mock_fitz_open):
    method test_extract_page_markdown_success (line 258) | def test_extract_page_markdown_success(self, mock_fitz_open):
    method test_extract_page_markdown_with_formatting (line 294) | def test_extract_page_markdown_with_formatting(self, mock_fitz_open):
    method test_extract_chapter_by_title_success (line 328) | def test_extract_chapter_by_title_success(self, mock_fitz_open):
    method test_extract_chapter_by_title_chapter_not_found (line 362) | def test_extract_chapter_by_title_chapter_not_found(self, mock_fitz_op...
    method test_extract_chapter_by_title_single_page (line 383) | def test_extract_chapter_by_title_single_page(self, mock_fitz_open):

FILE: src/ebook_mcp/tools/epub_helper.py
  class EpubProcessingError (line 6) | class EpubProcessingError(Exception):
    method __init__ (line 8) | def __init__(self, message: str, file_path: str, operation: str, origi...
  class PdfProcessingError (line 15) | class PdfProcessingError(Exception):
    method __init__ (line 17) | def __init__(self, message: str, file_path: str, operation: str, origi...
  function get_all_epub_files (line 51) | def get_all_epub_files(path: str) -> List[str]:
  function get_toc (line 58) | def get_toc(epub_path: str) -> List[Tuple[str, str]]:
  function get_meta (line 127) | def get_meta(epub_path: str) -> Dict[str, Union[str, List[str]]]:
  function extract_chapter_from_epub (line 207) | def extract_chapter_from_epub(epub_path: str, anchor_href: str) -> str:
  function read_epub (line 263) | def read_epub(epub_path: str) -> Any:
  function flatten_toc (line 266) | def flatten_toc(book: Any) -> List[str]:
  function extract_chapter_plain_text (line 281) | def extract_chapter_plain_text(book: Any, anchor_href: str) -> str:
  function convert_html_to_markdown (line 288) | def convert_html_to_markdown(html_str: str) -> str:
  function clean_html (line 294) | def clean_html(html_str: str) -> str:
  function extract_chapter_html (line 323) | def extract_chapter_html(book: Any, anchor_href: str) -> str:
  function extract_chapter_markdown (line 404) | def extract_chapter_markdown(book: Any, anchor_href: str) -> str:
  function extract_multiple_chapters (line 410) | def extract_multiple_chapters(book: Any, anchor_list: List[str], output:...

FILE: src/ebook_mcp/tools/logger_config.py
  class StructuredFormatter (line 10) | class StructuredFormatter(logging.Formatter):
    method format (line 13) | def format(self, record: logging.LogRecord) -> str:
  class StructuredLogger (line 53) | class StructuredLogger:
    method __init__ (line 56) | def __init__(self, name: str):
    method _log_with_context (line 60) | def _log_with_context(self, level: int, message: str, **context):
    method info (line 75) | def info(self, message: str, **context):
    method debug (line 79) | def debug(self, message: str, **context):
    method warning (line 83) | def warning(self, message: str, **context):
    method error (line 87) | def error(self, message: str, **context):
    method critical (line 91) | def critical(self, message: str, **context):
  function setup_logger (line 95) | def setup_logger(level: str = "INFO", log_file: str = "ebook_mcp.log"):
  function get_logger (line 133) | def get_logger(name: str) -> StructuredLogger:
  function log_operation (line 137) | def log_operation(operation_name: str):

FILE: src/ebook_mcp/tools/pdf_helper.py
  class PdfProcessingError (line 9) | class PdfProcessingError(Exception):
    method __init__ (line 11) | def __init__(self, message: str, file_path: str, operation: str, origi...
  function get_all_pdf_files (line 21) | def get_all_pdf_files(path: str) -> List[str]:
  function get_meta (line 28) | def get_meta(pdf_path: str) -> Dict[str, Union[str, List[str]]]:
  function get_toc (line 135) | def get_toc(pdf_path: str) -> List[Tuple[str, int]]:
  function extract_page_text (line 194) | def extract_page_text(pdf_path: str, page_number: int) -> str:
  function extract_page_markdown (line 223) | def extract_page_markdown(pdf_path: str, page_number: int) -> str:
  function extract_chapter_by_title (line 275) | def extract_chapter_by_title(pdf_path: str, chapter_title: str) -> Tuple...

Condensed preview — 41 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (299K chars).
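
Each entry in the condensed preview is an object with `path`, `chars`, and `preview` keys. As a minimal sketch of how such a list might be consumed, the following sums character counts per top-level path component. The helper name `summarize_preview` is hypothetical and not part of the repository; the two sample entries are copied from the preview.

```python
import json
from typing import Dict, List, Union

# One preview entry: "path" and "preview" are strings, "chars" is an int.
Entry = Dict[str, Union[str, int]]


def summarize_preview(entries: List[Entry]) -> Dict[str, int]:
    """Sum the "chars" field per top-level path component (hypothetical helper)."""
    totals: Dict[str, int] = {}
    for entry in entries:
        # ".vscode/settings.json" -> ".vscode"; a root-level file keeps its own name.
        top = str(entry["path"]).split("/", 1)[0]
        totals[top] = totals.get(top, 0) + int(entry["chars"])
    return totals


# Two entries taken from the condensed preview, parsed from JSON text.
sample = json.loads('''[
  {"path": ".python-version", "chars": 5, "preview": "3.12\\n"},
  {"path": ".vscode/settings.json", "chars": 3, "preview": "{\\n}"}
]''')

print(summarize_preview(sample))
```

Grouping on the first path component is only one possible choice; grouping by file extension would work the same way with a different key.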

[
  {
    "path": ".github/workflows/cli.yml",
    "chars": 621,
    "preview": "name: Publish Python Package\n\non:\n  push:\n    tags:\n      - 'v*.*.*'  # 只有打 tag 才触发\n\npermissions:\n  id-token: write  # 用"
  },
  {
    "path": ".gitignore",
    "chars": 3496,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n**/.next/\n.DS_Store\n\n*.lo"
  },
  {
    "path": ".kiro/hooks/readme-translation-hook.kiro.hook",
    "chars": 1268,
    "preview": "{\n  \"enabled\": true,\n  \"name\": \"README Translation & Grammar Check\",\n  \"description\": \"Automatically checks README.md fo"
  },
  {
    "path": ".kiro/steering/ebook-mcp-server-guide.md",
    "chars": 2727,
    "preview": "---\ninclusion: always\n---\n\n# Ebook MCP Server Development Guide\n\n## Architecture\nFastMCP-based server for EPUB/PDF proce"
  },
  {
    "path": ".python-version",
    "chars": 5,
    "preview": "3.12\n"
  },
  {
    "path": ".vscode/settings.json",
    "chars": 3,
    "preview": "{\n}"
  },
  {
    "path": "CHANGELOG.md",
    "chars": 5112,
    "preview": "# Changelog\n\nAll notable changes to this project will be documented in this file.\n\nThe format is based on [Keep a Change"
  },
  {
    "path": "HOW-TO-TEST.md",
    "chars": 4873,
    "preview": "# Ebook-MCP 单元测试指南\n\n本指南说明如何运行 ebook-mcp 项目的服务器部分单元测试。\n\n## 测试文件结构\n\n```\nsrc/ebook_mcp/tests/\n├── conftest.py              "
  },
  {
    "path": "LICENSE",
    "chars": 11357,
    "preview": "                                 Apache License\n                           Version 2.0, January 2004\n                   "
  },
  {
    "path": "README-CN.md",
    "chars": 6976,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "README-DE.md",
    "chars": 14510,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "README-FR.md",
    "chars": 15211,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "README-JP.md",
    "chars": 8444,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "README-KR.md",
    "chars": 8850,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "README.md",
    "chars": 13230,
    "preview": "[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/onebirdrocks-ebook-mcp-badge.png)](https://mseep.ai/app/oneb"
  },
  {
    "path": "pyproject.toml",
    "chars": 727,
    "preview": "[project]\nname = \"ebook-mcp\"\nversion = \"0.1.8\"\ndescription = \"An MCP server for chatting with ebooks (PDF/EPUB).\"\nauthor"
  },
  {
    "path": "pytest.ini",
    "chars": 383,
    "preview": "[tool:pytest]\ntestpaths = src/ebook_mcp/tests\npython_files = test_*.py\npython_classes = Test*\npython_functions = test_*\n"
  },
  {
    "path": "run_tests.sh",
    "chars": 1372,
    "preview": "#!/bin/bash\n\n# Ebook-MCP 测试运行脚本\n\necho \"==========================================\"\necho \"Ebook-MCP 单元测试运行器\"\necho \"======"
  },
  {
    "path": "src/ebook_mcp/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/ebook_mcp/main.py",
    "chars": 8074,
    "preview": "import os\nfrom typing import Any,List,Dict,Union,Tuple, Callable, TypeVar\nfrom functools import wraps\nfrom mcp.server.fa"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/.gitignore",
    "chars": 333,
    "preview": ".env\n\n# Logs\nlogs/\n*.log\n\n# Python\n__pycache__/\n*.py[cod]\n*$py.class\n*.so\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\n"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/README-CN.md",
    "chars": 2840,
    "preview": "[English](README.md) | 中文\n\n本项目提供了多个 MCP（模型控制协议）客户端示例，用于与不同的 AI 模型进行交互，包括 OpenAI GPT-4、Anthropic Claude 和 Deepseek AI。\n\n#"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/README.md",
    "chars": 6757,
    "preview": "English | [中文](README-CN.md)\n# MCP Client Examples\n\n\n\nThis project provides multiple MCP (Model Control Protocol) client"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/anthropic_example.py",
    "chars": 8534,
    "preview": "import asyncio\nimport logging\nimport os\nfrom datetime import datetime\nfrom typing import Optional, List, Dict\nfrom conte"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/deepseek_example.py",
    "chars": 10053,
    "preview": "import asyncio\nfrom typing import Optional\nfrom contextlib import AsyncExitStack\nimport json\nimport logging\nimport os\nfr"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/openai_example.py",
    "chars": 8839,
    "preview": "import asyncio\nfrom typing import Optional\nfrom contextlib import AsyncExitStack\nimport json\nimport tiktoken\nimport logg"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/pyproject.toml",
    "chars": 255,
    "preview": "[project]\nname = \"demo-mcp-client\"\nversion = \"0.1.0\"\ndescription = \"Add your description here\"\nreadme = \"README.md\"\nrequ"
  },
  {
    "path": "src/ebook_mcp/mcp_client_example/requirements.txt",
    "chars": 84,
    "preview": "anthropic==0.49.0\npython-dotenv>=1.0.0\nopenai>=1.0.0\nfastmcp>=0.1.0\npydantic>=2.0.0\n"
  },
  {
    "path": "src/ebook_mcp/tests/conftest.py",
    "chars": 2413,
    "preview": "import pytest\nimport tempfile\nimport os\nfrom unittest.mock import Mock\n\n\n@pytest.fixture\ndef temp_dir():\n    \"\"\"Create a"
  },
  {
    "path": "src/ebook_mcp/tests/run_tests.py",
    "chars": 3028,
    "preview": "#!/usr/bin/env python3\n\"\"\"\nTest runner script for ebook-mcp project.\nThis script runs all unit tests for the server comp"
  },
  {
    "path": "src/ebook_mcp/tests/test_azw.py",
    "chars": 2567,
    "preview": "import os\nimport pytest\nimport sys\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))\n\ntry:\n    fro"
  },
  {
    "path": "src/ebook_mcp/tests/test_basic.py",
    "chars": 4010,
    "preview": "import pytest\nimport os\nimport tempfile\nfrom unittest.mock import Mock, patch\n\n# Test basic file operations that don't r"
  },
  {
    "path": "src/ebook_mcp/tests/test_epub_chapter_extraction.py",
    "chars": 13524,
    "preview": "import pytest\nimport tempfile\nimport os\nfrom unittest.mock import Mock, patch, MagicMock\n\n# Add project root to path\nimp"
  },
  {
    "path": "src/ebook_mcp/tests/test_epub_helper.py",
    "chars": 10140,
    "preview": "import pytest\nimport os\nimport tempfile\nfrom unittest.mock import Mock, patch, MagicMock\n\n# Mock external dependencies\ni"
  },
  {
    "path": "src/ebook_mcp/tests/test_logger_config.py",
    "chars": 5789,
    "preview": "import pytest\nimport json\nimport os\nimport sys\nimport tempfile\nimport logging\nfrom unittest.mock import patch, MagicMock"
  },
  {
    "path": "src/ebook_mcp/tests/test_main.py",
    "chars": 12389,
    "preview": "import pytest\nimport os\nimport tempfile\nfrom unittest.mock import Mock, patch, MagicMock\nfrom typing import List, Dict, "
  },
  {
    "path": "src/ebook_mcp/tests/test_pdf_helper.py",
    "chars": 14737,
    "preview": "import pytest\nimport os\nimport tempfile\nfrom unittest.mock import Mock, patch, MagicMock\n\n# Mock external dependencies\ni"
  },
  {
    "path": "src/ebook_mcp/tools/epub_helper.py",
    "chars": 15091,
    "preview": "from typing import List, Tuple, Dict, Union, Any, Optional\nimport os\nfrom .logger_config import get_logger, log_operatio"
  },
  {
    "path": "src/ebook_mcp/tools/logger_config.py",
    "chars": 6818,
    "preview": "import logging\nimport json\nimport os\nimport time\nfrom datetime import datetime\nfrom typing import Dict, Any, Optional\nfr"
  },
  {
    "path": "src/ebook_mcp/tools/pdf_helper.py",
    "chars": 11127,
    "preview": "from typing import List, Tuple, Dict, Union\nimport os\nfrom io import StringIO\nimport fitz  # PyMuPDF\nimport re\nfrom .log"
  },
  {
    "path": "src/ebook_mcp/tools/pdf_helper.py.backup",
    "chars": 10537,
    "preview": "from PyPDF2 import PdfReader\nfrom typing import List, Tuple, Dict, Union\nimport os\nimport logging\nfrom io import StringI"
  }
]

About this extraction

This page contains the full source code of the onebirdrocks/ebook-mcp GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 41 files (251.1 KB), approximately 66.9k tokens, and a symbol index with 189 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
