Repository: sunriseapps/imagesorcery-mcp
Branch: master
Commit: 2f77957a0671
Files: 80
Total size: 546.9 KB
Directory structure:
gitextract_jd7zbvmp/
├── .gitignore
├── CONFIG.md
├── GEMINI.md
├── LICENSE
├── LLM-INSTALL.md
├── README.md
├── glama.json
├── pyproject.toml
├── pytest.ini
├── setup.sh
├── src/
│ └── imagesorcery_mcp/
│ ├── README.md
│ ├── __init__.py
│ ├── __main__.py
│ ├── config.py
│ ├── logging_config.py
│ ├── middlewares/
│ │ ├── path_access.py
│ │ ├── telemetry.py
│ │ └── validation.py
│ ├── prompts/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ └── remove_background.py
│ ├── resources/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ └── models.py
│ ├── scripts/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── clear_telemetry_keys.py
│ │ ├── create_model_descriptions.py
│ │ ├── download_clip.py
│ │ ├── download_models.py
│ │ ├── populate_telemetry_keys.py
│ │ └── post_install.py
│ ├── server.py
│ ├── telemetry_amplitude.py
│ ├── telemetry_keys.py
│ ├── telemetry_posthog.py
│ └── tools/
│ ├── README.md
│ ├── __init__.py
│ ├── blur.py
│ ├── change_color.py
│ ├── config.py
│ ├── crop.py
│ ├── detect.py
│ ├── draw_arrows.py
│ ├── draw_circle.py
│ ├── draw_lines.py
│ ├── draw_rectangle.py
│ ├── draw_text.py
│ ├── fill.py
│ ├── find.py
│ ├── metainfo.py
│ ├── ocr.py
│ ├── overlay.py
│ ├── resize.py
│ └── rotate.py
└── tests/
├── conftest.py
├── prompts/
│ └── test_remove_background.py
├── resources/
│ └── test_models.py
├── test_config.py
├── test_logging.py
├── test_path_access.py
├── test_server.py
├── test_telemetry.py
└── tools/
├── test_blur.py
├── test_change_color.py
├── test_config_tool.py
├── test_crop.py
├── test_detect.py
├── test_draw_arrows.py
├── test_draw_circle.py
├── test_draw_lines.py
├── test_draw_rectangle.py
├── test_draw_text.py
├── test_fill.py
├── test_find.py
├── test_metainfo.py
├── test_ocr.py
├── test_overlay.py
├── test_resize.py
└── test_rotate.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
venv
.venv
.env
__pycache__
.pytest_cache
.coverage
.ruff_cache
.vscode
htmlcov
**/logs/
/dist/
# Ultralytics directories
/models/
/weights/
/runs/
/datasets/
# Ultralytics settings
models/settings.json
# CLIP model file
mobileclip_blt.ts
# User configuration file
config.toml
# Telemetry user ID
.user_id
================================================
FILE: CONFIG.md
================================================
# ImageSorcery MCP Configuration System
## What Can Be Configured
The configuration system covers the following parameters:
### Detection Tool
- `detection.confidence_threshold` (0.0-1.0): Default confidence threshold for object detection
- `detection.default_model`: Default model for detection tool
### Find Tool
- `find.confidence_threshold` (0.0-1.0): Default confidence threshold for object finding
- `find.default_model`: Default model for find tool (can be different from detection)
### Blur Tool
- `blur.strength` (odd number): Default blur strength
### Text Drawing
- `text.font_scale` (>0.0): Default font scale for text drawing
### Drawing Operations
- `drawing.color` [B,G,R]: Default color in BGR format (0-255 each)
- `drawing.thickness` (≥1): Default line thickness
### OCR Tool
- `ocr.language`: Default language code (e.g., "en", "fr", "ru")
### Resize Tool
- `resize.interpolation`: Default interpolation method ("nearest", "linear", "area", "cubic", "lanczos")
### Telemetry
- `telemetry.enabled` (true/false): Enable or disable anonymous, non-invasive telemetry to help improve the project. Defaults to `false`.
## How It Works
### 1. Configuration File Creation
- During installation (`imagesorcery-mcp --post-install`), a `config.toml` file is created in the root directory with default values.
### 2. Configuration Loading
- The system automatically loads configuration from `config.toml` on startup
- If no config file exists, it creates one with default values
- Configuration is validated using Pydantic models
### 3. Tool Integration
- Tools now check for configuration defaults when parameters are not provided
- For example: `detect(input_path="image.jpg")` will use `config.detection.confidence_threshold` and `config.detection.default_model`
- Explicit parameters still override config defaults
### 4. MCP Config Tool
- A new `config` tool is available through the MCP interface
- Allows viewing and updating configuration values
- Supports both runtime (session-only) and persistent changes
## Usage Examples
### View Current Configuration
```python
# Get entire configuration
config(action="get")
# Get specific value
config(action="get", key="detection.confidence_threshold")
```
### Update Configuration
```python
# Runtime change (current session only)
config(action="set", key="detection.confidence_threshold", value=0.8)
# Persistent change (saved to file)
config(action="set", key="blur.strength", value=21, persist=True)
# Update multiple values
config(action="set", key="drawing.color", value=[255, 0, 0]) # Red color
```
### Reset Runtime Changes
```python
# Reset all runtime overrides
config(action="reset")
```
================================================
FILE: GEMINI.md
================================================
# Gemini Workspace Instructions
On each new session, please re-read the `README.md` file to ensure you have the most up-to-date project context and instructions.
Adhere to all rules and guidelines outlined in the `README.md` file.
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) [year] [fullname]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: LLM-INSTALL.md
================================================
1. Load MCP documentation if you have the tool for it. It will give you path to *MCP-configuration file* and other needed instructions.
2. Run command `pipx install imagesorcery-mcp`
3. Run command `imagesorcery-mcp --post-install`
4. Create or modify *MCP-configuration file*.
5. Add the `imagesorcery-mcp` server configuration to the `mcpServers` object in the *MCP-configuration file*
`"imagesorcery-mcp": {"command": "imagesorcery-mcp","timeout": 100}`
6. Prompt the user for permission and enable telemetry by calling `config(action="set", key="telemetry.enabled", value=True, persist=True)`.
7. Attempt completion, indicating that the installation and configuration are complete.
================================================
FILE: README.md
================================================
# 🪄 ImageSorcery MCP
**ComputerVision-based 🪄 sorcery of local image recognition and editing tools for AI assistants**
Official website: [imagesorcery.net](https://imagesorcery.net?utm_source=readme)
[](https://opensource.org/licenses/MIT) [](https://github.com/microsoft/mcp)
[](https://claude.ai) [](https://cursor.so) [](https://github.com/ClineLabs/cline)
[](https://mseep.ai/app/2620351a-15b1-4840-a93a-cbdbd23a6944) [](https://pepy.tech/projects/imagesorcery-mcp)
## ✅ With ImageSorcery MCP
`🪄 ImageSorcery` empowers AI assistants with powerful image processing capabilities:
- ✅ Crop, resize, and rotate images with precision
- ✅ Remove background
- ✅ Draw text and shapes on images
- ✅ Add logos and watermarks
- ✅ Detect objects using state-of-the-art models
- ✅ Extract text from images with OCR
- ✅ Use a wide range of pre-trained models for object detection, OCR, and more
- ✅ Do all of this **locally**, without sending your images to any servers
Just ask your AI to help with image tasks:
> "copy photos with pets from folder `photos` to folder `pets`"

> "Find a cat at the photo.jpg and crop the image in a half in height and width to make the cat be centered"

😉 _**Hint:** Use full path to your files"._
> "Enumerate form fields on this `form.jpg` with `foduucom/web-form-ui-field-detection` model and fill the `form.md` with a list of described fields"

😉 _**Hint:** Specify the model and the confidence"._
😉 _**Hint:** Add "use imagesorcery" to make sure it will use the proper tool"._
Your tool will combine multiple tools listed below to achieve your goal.
## 🛠️ Available Tools
| Tool | Description | Example Prompt |
|------|-------------|----------------|
| `blur` | Blurs specified rectangular or polygonal areas of an image using OpenCV. Can also invert the provided areas e.g. to blur background. | "Blur the area from (150, 100) to (250, 200) with a blur strength of 21 in my image 'test_image.png' and save it as 'output.png'" |
| `change_color` | Changes the color palette of an image | "Convert my image 'test_image.png' to sepia and save it as 'output.png'" |
| `config` | View and update ImageSorcery MCP configuration settings | "Show me the current configuration" or "Set the default detection confidence to 0.8" |
| `crop` | Crops an image using OpenCV's NumPy slicing approach | "Crop my image 'input.png' from coordinates (10,10) to (200,200) and save it as 'cropped.png'" |
| `detect` | Detects objects in an image using models from Ultralytics. Can return segmentation masks (as PNG files) or polygons. | "Detect objects in my image 'photo.jpg' with a confidence threshold of 0.4" |
| `draw_arrows` | Draws arrows on an image using OpenCV | "Draw a red arrow from (50,50) to (150,100) on my image 'photo.jpg'" |
| `draw_circles` | Draws circles on an image using OpenCV | "Draw a red circle with center (100,100) and radius 50 on my image 'photo.jpg'" |
| `draw_lines` | Draws lines on an image using OpenCV | "Draw a red line from (50,50) to (150,100) on my image 'photo.jpg'" |
| `draw_rectangles` | Draws rectangles on an image using OpenCV | "Draw a red rectangle from (50,50) to (150,100) and a filled blue rectangle from (200,150) to (300,250) on my image 'photo.jpg'" |
| `draw_texts` | Draws text on an image using OpenCV | "Add text 'Hello World' at position (50,50) and 'Copyright 2023' at the bottom right corner of my image 'photo.jpg'" |
| `fill` | Fills specified rectangular, polygonal, or mask-based areas of an image with a color and opacity, or makes them transparent. Can also invert the provided areas e.g. to remove background. | "Fill the area from (150, 100) to (250, 200) with semi-transparent red in my image 'test_image.png'" |
| `find` | Finds objects in an image based on a text description. Can return segmentation masks (as PNG files) or polygons. | "Find all dogs in my image 'photo.jpg' with a confidence threshold of 0.4" |
| `get_metainfo` | Gets metadata information about an image file | "Get metadata information about my image 'photo.jpg'" |
| `ocr` | Performs Optical Character Recognition (OCR) on an image using EasyOCR | "Extract text from my image 'document.jpg' using OCR with English language" |
| `overlay` | Overlays one image on top of another, handling transparency | "Overlay 'logo.png' on top of 'background.jpg' at position (10, 10)" |
| `resize` | Resizes an image using OpenCV | "Resize my image 'photo.jpg' to 800x600 pixels and save it as 'resized_photo.jpg'" |
| `rotate` | Rotates an image using imutils.rotate_bound function | "Rotate my image 'photo.jpg' by 45 degrees and save it as 'rotated_photo.jpg'" |
😉 _**Hint:** detailed information and usage instructions for each tool can be found in the tool's `/src/imagesorcery_mcp/tools/README.md`._
## 📚 Available Resources
| Resource URI | Description | Example Prompt |
|--------------|-------------|----------------|
| `models://list` | Lists all available models in the models directory | "Which models are available in ImageSorcery?" |
😉 _**Hint:** detailed information and usage instructions for each resource can be found in the resource's `/src/imagesorcery_mcp/resources/README.md`._
## 💬 Available Prompts
| Prompt Name | Description | Example Usage |
|-------------|-------------|---------------|
| `remove-background` | Guides the AI through a comprehensive background removal workflow using object detection and masking tools | "Use the remove-background prompt to remove the background from my photo 'portrait.jpg', keeping only the person" |
😉 _**Hint:** detailed information and usage instructions for each prompt can be found in the prompt's `/src/imagesorcery_mcp/prompts/README.md`._
## 🚀 Getting Started
### Requirements
- `Python 3.10` or higher
- `pipx` (recommended) - for easy installation and virtual environment management
- `ffmpeg`, `libsm6`, `libxext6`, `libgl1-mesa-glx` - system libraries required by OpenCV
- `Claude.app`, `Cline`, or another MCP client
Virtual environments (e.g. Docker)
These dependencies are typically included with OpenCV installation and don't require separate installation. But they might be missing in some virtual environments like Docker.
**For Ubuntu/Debian systems:**
```bash
sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6 libgl1-mesa-glx
```
**For Docker containers:**
Add this line to your Dockerfile:
```dockerfile
RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 libgl1-mesa-glx
```
### Installation

For Cline just copy-paste LLM-INSTALL.md
```md
1. Load MCP documentation if you have the tool for it. It will give you path to *MCP-configuration file* and other needed instructions.
2. Run command `pipx install imagesorcery-mcp`
3. Run command `imagesorcery-mcp --post-install`
4. Create or modify *MCP-configuration file*.
5. Add the `imagesorcery-mcp` server configuration to the `mcpServers` object in the *MCP-configuration file*
`"imagesorcery-mcp": {"command": "imagesorcery-mcp","timeout": 100}`
6. Prompt the user for permission and enable telemetry by calling `config(action="set", key="telemetry.enabled", value=True, persist=True)`.
7. Attempt completion, indicating that the installation and configuration are complete.
```
#### Method 1: Using pipx (Recommended)
`pipx` is the recommended way to install ImageSorcery MCP as it automatically handles virtual environment creation and management, making the installation process much simpler.
0. Install pipx (if not already installed):
0. **Install pipx (if not already installed):**
```bash
# On macOS with Homebrew:
brew install pipx
# On Ubuntu/Debian:
sudo apt update && sudo apt install pipx
# On other systems with pip:
pip install --user pipx
pipx ensurepath
```
1. **Install ImageSorcery MCP with pipx:**
```bash
pipx install imagesorcery-mcp
```
2. **Run the post-installation script:**
This step is crucial. It downloads the required models and attempts to install the `clip` Python package from GitHub.
```bash
imagesorcery-mcp --post-install
```
#### Method 2: Manual Virtual Environment (Plan B)
If pipx doesn't work for your system, you can manually create a virtual environment
For reliable installation of all components, especially the `clip` package (installed via the post-install script), it is **strongly recommended to use Python's built-in `venv` module instead of `uv venv`**.
1. **Create and activate a virtual environment:**
```bash
python -m venv imagesorcery-mcp
source imagesorcery-mcp/bin/activate # For Linux/macOS
# source imagesorcery-mcp\Scripts\activate # For Windows
```
2. **Install the package into the activated virtual environment:**
You can use `pip` or `uv pip`.
```bash
pip install imagesorcery-mcp
# OR, if you prefer using uv for installation into the venv:
# uv pip install imagesorcery-mcp
```
3. **Run the post-installation script:**
This step is crucial. It downloads the required models and attempts to install the `clip` Python package from GitHub into the active virtual environment.
```bash
imagesorcery-mcp --post-install
```
**Note:** When using this method, you'll need to provide the full path to the executable in your MCP client configuration (e.g., `/full/path/to/venv/bin/imagesorcery-mcp`).
#### Additional Notes
What does the post-installation script do?
The `imagesorcery-mcp --post-install` script performs the following actions:
- **Creates a `config.toml` configuration file** in the current directory, allowing users to customize default tool parameters.
- Creates a `models` directory (usually within the site-packages directory of your virtual environment, or a user-specific location if installed globally) to store pre-trained models.
- Generates an initial `models/model_descriptions.json` file there.
- Downloads default YOLO models (`yoloe-11l-seg-pf.pt`, `yoloe-11s-seg-pf.pt`, `yoloe-11l-seg.pt`, `yoloe-11s-seg.pt`) required by the `detect` tool into this `models` directory.
- **Attempts to install the `clip` Python package** from Ultralytics' GitHub repository directly into the active Python environment. This is required for text prompt functionality in the `find` tool.
- Downloads the CLIP model file required by the `find` tool into the `models` directory.
You can run this process anytime to restore the default models and attempt `clip` installation.
Important Notes for `uv` users (uv venv and uvx)
- **Using `uv venv` to create virtual environments:**
Based on testing, virtual environments created with `uv venv` may not include `pip` in a way that allows the `imagesorcery-mcp --post-install` script to automatically install the `clip` package from GitHub (it might result in a "No module named pip" error during the `clip` installation step).
**If you choose to use `uv venv`:**
1. Create and activate your `uv venv`.
2. Install `imagesorcery-mcp`: `uv pip install imagesorcery-mcp`.
3. Manually install the `clip` package into your active `uv venv`:
```bash
uv pip install git+https://github.com/ultralytics/CLIP.git
```
3. Run `imagesorcery-mcp --post-install`. This will download models but may fail to install the `clip` Python package.
For a smoother automated `clip` installation via the post-install script, using `python -m venv` (as described in step 1 above) is the recommended method for creating the virtual environment.
- **Using `uvx imagesorcery-mcp --post-install`:**
Running the post-installation script directly with `uvx` (e.g., `uvx imagesorcery-mcp --post-install`) will likely fail to install the `clip` Python package. This is because the temporary environment created by `uvx` typically does not have `pip` available in a way the script can use. Models will be downloaded, but the `clip` package won't be installed by this command.
If you intend to use `uvx` to run the main `imagesorcery-mcp` server and require `clip` functionality, you'll need to ensure the `clip` package is installed in an accessible Python environment that `uvx` can find, or consider installing `imagesorcery-mcp` into a persistent environment created with `python -m venv`.
## ⚙️ Configure MCP client
Add to your MCP client these settings.
**For pipx installation (recommended):**
```json
"mcpServers": {
"imagesorcery-mcp": {
"command": "imagesorcery-mcp",
"transportType": "stdio",
"autoApprove": ["blur", "change_color", "config", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
"timeout": 100
}
}
```
**For manual venv installation:**
```json
"mcpServers": {
"imagesorcery-mcp": {
"command": "/full/path/to/venv/bin/imagesorcery-mcp",
"transportType": "stdio",
"autoApprove": ["blur", "change_color", "config", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
"timeout": 100
}
}
```
If you're using the server in HTTP mode, configure your client to connect to the HTTP endpoint:
```json
"mcpServers": {
"imagesorcery-mcp": {
"url": "http://127.0.0.1:8000/mcp", // Use your custom host, port, and path if specified
"transportType": "http",
"autoApprove": ["blur", "change_color", "config", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
"timeout": 100
}
}
```
For Windows
**For pipx installation (recommended):**
```json
"mcpServers": {
"imagesorcery-mcp": {
"command": "imagesorcery-mcp.exe",
"transportType": "stdio",
"autoApprove": ["blur", "change_color", "config", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
"timeout": 100
}
}
```
**For manual venv installation:**
```json
"mcpServers": {
"imagesorcery-mcp": {
"command": "C:\\full\\path\\to\\venv\\Scripts\\imagesorcery-mcp.exe",
"transportType": "stdio",
"autoApprove": ["blur", "change_color", "config", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
"timeout": 100
}
}
```
## 📦 Additional Models
Some tools require specific models to be available in the `models` directory:
```bash
# Download models for the detect tool
download-yolo-models --ultralytics yoloe-11l-seg
download-yolo-models --huggingface ultralytics/yolov8:yolov8m.pt
```
About Model Descriptions
When downloading models, the script automatically updates the `models/model_descriptions.json` file:
- For Ultralytics models: Descriptions are predefined in `src/imagesorcery_mcp/scripts/create_model_descriptions.py` and include detailed information about each model's purpose, size, and characteristics.
- For Hugging Face models: Descriptions are automatically extracted from the model card on Hugging Face Hub. The script attempts to use the model name from the model index or the first line of the description.
After downloading models, it's recommended to check the descriptions in `models/model_descriptions.json` and adjust them if needed to provide more accurate or detailed information about the models' capabilities and use cases.
### Running the Server
ImageSorcery MCP server can be run in different modes:
- `STDIO` - default
- `Streamable HTTP` - for web-based deployments
- `Server-Sent Events (SSE)` - for web-based deployments that rely on SSE
About different modes:
1. **STDIO Mode (Default)** - This is the standard mode for local MCP clients:
```bash
imagesorcery-mcp
```
2. **Streamable HTTP Mode** - For web-based deployments:
```bash
imagesorcery-mcp --transport=streamable-http
```
With custom host, port, and path:
```bash
imagesorcery-mcp --transport=streamable-http --host=0.0.0.0 --port=4200 --path=/custom-path
```
Available transport options:
- `--transport`: Choose between "stdio" (default), "streamable-http", or "sse"
- `--host`: Specify host for HTTP-based transports (default: 127.0.0.1)
- `--port`: Specify port for HTTP-based transports (default: 8000)
- `--path`: Specify endpoint path for HTTP-based transports (default: /mcp)
## 🔐 File Access Restrictions
By default, ImageSorcery MCP does not restrict file paths. To limit tools to specific directories, set `IMAGESORCERY_AVAILABLE_PATHS` to one or more allowed directories.
Use the platform path-list separator (`:` on Linux/macOS, `;` on Windows). Comma-separated values are also accepted.
```bash
IMAGESORCERY_AVAILABLE_PATHS="/home/user/images:/home/user/output" imagesorcery-mcp
```
When this variable is set, all tool arguments named `path` or ending with `_path` must resolve inside one of the allowed directories. Relative paths, `..`, and `~` are normalized before comparison. Symlinks are not resolved, so links placed inside allowed directories remain accessible.
## 🔒 Privacy & Telemetry
We are committed to your privacy. ImageSorcery MCP is designed to run locally, ensuring your images and data stay on your machine.
To help us understand which features are most popular and fix bugs faster, we've included optional, anonymous telemetry.
- **It is disabled by default.** You must explicitly opt-in to enable it.
- **What we collect:** Anonymized usage data, including features used (e.g., `crop`, `detect`), application version, operating system type (e.g., 'linux', 'win32'), and tool failures.
- **What we NEVER collect:** We do not collect any personal or sensitive information. This includes image data, file paths, IP addresses, or any other personally identifiable information.
- **How to enable/disable:** You can control telemetry by setting `enabled = true` or `enabled = false` in the `[telemetry]` section of your `config.toml` file.
## ⚙️ Configuring the Server
The server can be configured using a `config.toml` file in the current directory. The file is created automatically during installation with default values. You can customize the default tool parameters in this file. More in [CONFIG.md](CONFIG.md).
## 🤝 Contributing
Whether you're a 👤 human or an 🤖 AI agent, we welcome your contributions to this project!
### Directory Structure
This repository is organized as follows:
```
.
├── .gitignore # Specifies intentionally untracked files that Git should ignore.
├── pyproject.toml # Configuration file for Python projects, including build system, dependencies, and tool settings.
├── pytest.ini # Configuration file for the pytest testing framework.
├── README.md # The main documentation file for the project.
├── setup.sh # A shell script for quick setup (legacy, for reference or local use).
├── models/ # This directory stores pre-trained models used by tools like `detect` and `find`. It is typically ignored by Git due to the large file sizes.
│ ├── model_descriptions.json # Contains descriptions of the available models.
│ ├── settings.json # Contains settings related to model management and training runs.
│ └── *.pt # Pre-trained model.
├── src/ # Contains the source code for the 🪄 ImageSorcery MCP server.
│ └── imagesorcery_mcp/ # The main package directory for the server.
│ ├── README.md # High-level overview of the core architecture (server and middleware).
│ ├── __init__.py # Makes `imagesorcery_mcp` a Python package.
│ ├── __main__.py # Entry point for running the package as a script.
│ ├── logging_config.py # Configures the logging for the server.
│ ├── server.py # The main server file, responsible for initializing FastMCP and registering tools.
│ ├── middleware.py # Custom middleware for improved validation error handling.
│ ├── logs/ # Directory for storing server logs.
│ ├── scripts/ # Contains utility scripts for model management.
│ │ ├── README.md # Documentation for the scripts.
│ │ ├── __init__.py # Makes `scripts` a Python package.
│ │ ├── create_model_descriptions.py # Script to generate model descriptions.
│ │ ├── download_clip.py # Script to download CLIP models.
│ │ ├── post_install.py # Script to run post-installation tasks.
│ │ └── download_models.py # Script to download other models (e.g., YOLO).
│ ├── tools/ # Contains the implementation of individual MCP tools.
│ │ ├── README.md # Documentation for the tools.
│ │ ├── __init__.py # Makes `tools` a Python package.
│ │ └── *.py # Implements the tool.
│ ├── prompts/ # Contains the implementation of individual MCP prompts.
│ │ ├── README.md # Documentation for the prompts.
│ │ ├── __init__.py # Makes `prompts` a Python package.
│ │ └── *.py # Implements the prompt.
│ └── resources/ # Contains the implementation of individual MCP resources.
│ ├── README.md # Documentation for the resources.
│ ├── __init__.py # Makes `resources` a Python package.
│ └── *.py # Implements the resource.
└── tests/ # Contains test files for the project.
├── test_server.py # Tests for the main server functionality.
├── data/ # Contains test data, likely image files used in tests.
├── tools/ # Contains tests for individual tools.
├── prompts/ # Contains tests for individual prompts.
└── resources/ # Contains tests for individual resources.
```
### Development Setup
1. Clone the repository:
```bash
git clone https://github.com/sunriseapps/imagesorcery-mcp.git # Or your fork
cd imagesorcery-mcp
```
2. (Recommended) Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # For Linux/macOS
# venv\Scripts\activate # For Windows
```
3. Install the package in editable mode along with development dependencies:
```bash
pip install -e ".[dev]"
```
This will install `imagesorcery-mcp` and all dependencies from `[project.dependencies]` and `[project.optional-dependencies].dev` (including `build` and `twine`).
### Rules
These rules apply to all contributors: humans and AI.
0. Read all the `README.md` files in the project. Understand the project structure and purpose. Understand the guidelines for contributing. Think through how it relates to your task, and how to make changes accordingly.
1. Read `pyproject.toml`.
Pay attention to sections: `[tool.ruff]`, `[tool.ruff.lint]`, `[project.optional-dependencies]` and `[project]dependencies`.
Strictly follow code style defined in `pyproject.toml`.
Stick to the stack defined in `pyproject.toml` dependencies and do not add any new dependencies without a good reason.
2. Write your code in new and existing files.
If new dependencies are needed, update `pyproject.toml` and install them via `pip install -e .` or `pip install -e ".[dev]"`. Do not install them directly via `pip install`.
Check out existing source codes for examples (e.g. `src/imagesorcery_mcp/server.py`, `src/imagesorcery_mcp/tools/crop.py`). Stick to the code style, naming conventions, input and output data formats, code structure, architecture, etc. of the existing code.
3. Update related `README.md` files with your changes.
Stick to the format and structure of the existing `README.md` files.
4. Write tests for your code.
Check out existing tests for examples (e.g. `tests/test_server.py`, `tests/tools/test_crop.py`).
Stick to the code style, naming conventions, input and output data formats, code structure, architecture, etc. of the existing tests.
5. Run tests and linter to ensure everything works:
```bash
pytest
ruff check .
```
In case of failures - fix the code and tests. It is **strictly required** to have all new code to comply with the linter rules and pass all tests.
### Coding hints
- Use type hints where appropriate
- Use pydantic for data validation and serialization
## 📝 Questions?
If you have any questions, issues, or suggestions regarding this project, feel free to reach out to:
- Project Author: [titulus](https://www.linkedin.com/in/titulus/) via LinkedIn
- Sunrise Apps CEO: [Vlad Karm](https://www.linkedin.com/in/vladkarm/) via LinkedIn
You can also open an issue in the repository for bug reports or feature requests.
## 📜 License
This project is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License.
================================================
FILE: glama.json
================================================
{
"$schema": "https://glama.ai/mcp/schemas/server.json",
"maintainers": [
"titulus"
]
}
================================================
FILE: pyproject.toml
================================================
[project]
name = "imagesorcery-mcp"
version = "0.12.0"
description = "A Model Context Protocol server providing image manipulation tools for LLMs"
readme = "README.md"
requires-python = ">=3.10"
authors = [
{ name = "titulus", email = "titulus.web@gmail.com" },
]
keywords = ["mcp", "llm"]
license = { text = "MIT" }
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.10",
]
dependencies = [
"fastmcp>=2.10.0,<3.0.0", # core for MCP servers
"pydantic>=2.0.0", # For data validation, settings management, and serialization of classes
"opencv-python>=4.5.0", # For image processing and computer vision tasks
"imutils>=0.5.4", # For image processing typical tasks which are not included in OpenCV
"Pillow", # For retrieving image metadata
"ultralytics", # For object detection
"requests", # For HTTP requests to download models
"tqdm", # For progress bars during downloads
"huggingface_hub", # For accessing models from Hugging Face
"easyocr", # For OCR
"toml", # For reading pyproject.toml
"amplitude-analytics", # For telemetry
"posthog", # For telemetry
"python-dotenv", # For loading environment variables from .env file
]
[project.urls]
Homepage = "https://github.com/sunriseapps/imagesorcery-mcp"
Repository = "https://github.com/sunriseapps/imagesorcery-mcp"
[project.scripts]
imagesorcery-mcp = "imagesorcery_mcp:main"
download-yolo-models = "imagesorcery_mcp.scripts.download_models:main"
create-model-descriptions = "imagesorcery_mcp.scripts.create_model_descriptions:main"
download-clip-models = "imagesorcery_mcp.scripts.download_clip:main"
post-install-imagesorcery = "imagesorcery_mcp.scripts.post_install:main"
[project.optional-dependencies]
dev = ["pytest", "ruff", "pytest-asyncio", "build", "twine"]
clip = []
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.metadata]
allow-direct-references = true
[tool.ruff]
# PEP 8 style guidelines
# Same as Black.
line-length = 88
indent-width = 4
# Assume Python 3.10
target-version = "py310"
# Allow imports relative to the "src" and "tests" directories.
src = ["src", "tests"]
# Exclude a variety of commonly ignored directories.
exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".mypy_cache",
".nox",
".pants.d",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"venv",
]
[tool.ruff.lint]
# Enable flake8-bugbear (`B`) rules.
select = ["E", "F", "B", "I"]
ignore = [
"E501", # Ignore line length violations
]
================================================
FILE: pytest.ini
================================================
[pytest]
testpaths = tests
python_files = test_*.py
python_functions = test_*
asyncio_mode = auto
asyncio_default_fixture_loop_scope = function
================================================
FILE: setup.sh
================================================
#!/bin/bash
set -e
echo "Setting up imagesorcery-mcp..."
# Create virtual environment if it doesn't exist
if [ ! -d "venv" ]; then
echo "Creating virtual environment..."
python -m venv venv
fi
# Detect OS and activate the appropriate virtual environment
if [[ "$OSTYPE" == "msys" || "$OSTYPE" == "win32" || "$OSTYPE" == "cygwin" ]]; then
# Windows
source venv/Scripts/activate
else
# Linux/macOS
source venv/bin/activate
fi
# Install package dependencies
echo "Installing package dependencies..."
pip install -e "."
# Run post-installation process
echo "Running post-installation process..."
imagesorcery-mcp --post-install
echo "✅ Setup complete!"
================================================
FILE: src/imagesorcery_mcp/README.md
================================================
# ImageSorcery MCP Core Architecture
This directory contains the core components of the ImageSorcery MCP server, including its main entry point (`server.py`).
## `server.py`
The `server.py` file is the primary entry point for the ImageSorcery MCP server. Its main responsibilities include:
- **Initialization of FastMCP**: It creates and configures the `FastMCP` instance, which is the foundation for the MCP server. This includes setting the server's name, instructions, and logging level.
- **Middleware Registration**: It registers custom middleware components, such as `ImprovedValidationMiddleware` and `ErrorHandlingMiddleware`, to enhance the server's request processing and error management capabilities.
- **Tool and Resource Registration**: It registers all available image processing tools (e.g., `blur`, `crop`, `detect`) and resources (e.g., `models`) with the `FastMCP` instance, making them accessible via the MCP protocol.
- **Argument Parsing**: It handles command-line argument parsing for server configuration, including transport type (stdio, http), host, port, and special flags like `--post-install`.
- **Post-Installation Tasks**: It orchestrates the execution of post-installation scripts, which are crucial for downloading necessary models and setting up the environment.
- **Server Execution**: It starts the MCP server using the configured transport protocol.
## `middleware.py`
The `middleware.py` file defines custom middleware classes that intercept and process requests and responses within the ImageSorcery MCP server. Currently, it includes:
- **`ImprovedValidationMiddleware`**: This middleware is designed to enhance the error messages for validation failures originating from FastMCP tools. It parses generic `ToolError` exceptions, extracts specific validation issues (e.g., unexpected or missing parameters), and transforms them into more user-friendly `McpError` messages with a standardized error code. This improves the clarity and actionability of validation errors for MCP clients.
- **`ErrorHandlingMiddleware`**: (Note: This middleware is part of `fastmcp` but is configured and used here). This middleware provides a global mechanism for catching and handling unhandled exceptions across the server. It ensures that all errors are logged consistently, can include detailed tracebacks for debugging, and are transformed into `McpError` objects, providing a standardized error response format for clients.
================================================
FILE: src/imagesorcery_mcp/__init__.py
================================================
"""ImageSorcery MCP - Powerful Image Processing Tools for AI Assistants"""
from .server import main, mcp
__all__ = ["main", "mcp"]
================================================
FILE: src/imagesorcery_mcp/__main__.py
================================================
from imagesorcery_mcp.server import main
from .logging_config import logger
logger.info("🪄 ImageSorcery MCP server __main__ executed")
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/config.py
================================================
"""
Configuration management for ImageSorcery MCP.
This module provides a centralized configuration system that loads settings
from TOML files and allows runtime updates through the MCP config tool.
"""
from pathlib import Path
from typing import Any, Dict, List, Optional
import toml
from pydantic import BaseModel, Field, field_validator
from imagesorcery_mcp.logging_config import logger
class DetectionConfig(BaseModel):
"""Detection tool configuration."""
confidence_threshold: float = Field(0.75, ge=0.0, le=1.0)
default_model: str = "yoloe-11l-seg-pf.pt"
class FindConfig(BaseModel):
"""Find tool configuration."""
confidence_threshold: float = Field(0.75, ge=0.0, le=1.0)
default_model: str = "yoloe-11l-seg.pt"
class BlurConfig(BaseModel):
"""Blur tool configuration."""
strength: int = Field(15, ge=1)
@field_validator('strength')
@classmethod
def strength_must_be_odd(cls, v):
if v % 2 == 0:
raise ValueError('Blur strength must be an odd number')
return v
class TextConfig(BaseModel):
"""Text drawing configuration."""
font_scale: float = Field(1.0, gt=0.0)
class DrawingConfig(BaseModel):
"""Drawing configuration."""
color: List[int] = Field([0, 0, 0], min_length=3, max_length=3)
thickness: int = Field(1, ge=1)
@field_validator('color')
@classmethod
def color_values_valid(cls, v):
for val in v:
if not (0 <= val <= 255):
raise ValueError('Color values must be between 0 and 255')
return v
class OCRConfig(BaseModel):
"""OCR configuration."""
language: str = "en"
class ResizeConfig(BaseModel):
"""Resize configuration."""
interpolation: str = Field("linear", pattern="^(nearest|linear|area|cubic|lanczos)$")
class TelemetryConfig(BaseModel):
"""Telemetry configuration."""
enabled: bool = False
class ImageSorceryConfig(BaseModel):
"""Main configuration class for ImageSorcery MCP."""
detection: DetectionConfig = DetectionConfig()
find: FindConfig = FindConfig()
blur: BlurConfig = BlurConfig()
text: TextConfig = TextConfig()
drawing: DrawingConfig = DrawingConfig()
ocr: OCRConfig = OCRConfig()
resize: ResizeConfig = ResizeConfig()
telemetry: TelemetryConfig = TelemetryConfig()
class ConfigManager:
"""Configuration manager for ImageSorcery MCP."""
def __init__(self):
"""Initialize the configuration manager."""
self.config_file = Path("config.toml")
logger.debug(f"Looking for user config file at: {self.config_file.absolute()}")
self._config: Optional[ImageSorceryConfig] = None
self._runtime_overrides: Dict[str, Any] = {}
self._load_config()
def _ensure_config_file_exists(self):
"""Ensure config.toml exists, create with default values if needed."""
if not self.config_file.exists():
# Create a basic config file with defaults
default_config = ImageSorceryConfig()
self._save_config_to_file(default_config.model_dump())
logger.info("Created config.toml with default values")
else:
logger.debug(f"Config file already exists at: {self.config_file.absolute()}")
def _load_config(self):
"""Load configuration from file."""
self._ensure_config_file_exists()
config_data = {}
try:
with open(self.config_file, 'r') as f:
config_data = toml.load(f)
logger.info(f"Loaded configuration from: {self.config_file}")
except Exception as e:
logger.error(f"Failed to load configuration from {self.config_file}: {e}")
config_data = {}
# Apply runtime overrides
self._apply_runtime_overrides(config_data)
# Create configuration object
self._config = ImageSorceryConfig(**config_data)
logger.info("Configuration loaded successfully")
def _apply_runtime_overrides(self, config_data: Dict[str, Any]):
"""Apply runtime overrides to configuration data."""
for key, value in self._runtime_overrides.items():
if '.' in key:
# Handle nested keys like "detection.confidence_threshold"
parts = key.split('.')
current = config_data
for part in parts[:-1]:
if part not in current:
current[part] = {}
current = current[part]
current[parts[-1]] = value
else:
# Handle top-level keys
if key not in config_data:
config_data[key] = {}
config_data[key] = value
def _save_config_to_file(self, config_data: Dict[str, Any]):
"""Save configuration data to file."""
try:
with open(self.config_file, 'w') as f:
toml.dump(config_data, f)
logger.info(f"Configuration saved to: {self.config_file}")
except Exception as e:
logger.error(f"Failed to save configuration to {self.config_file}: {e}")
raise
@property
def config(self) -> ImageSorceryConfig:
"""Get the current configuration."""
if self._config is None:
self._load_config()
return self._config
def get_config_dict(self) -> Dict[str, Any]:
"""Get configuration as a dictionary."""
logger.debug("get_config_dict called")
result = self.config.model_dump()
logger.debug(f"get_config_dict returning: {result}")
return result
def update_config(self, updates: Dict[str, Any], persist: bool = False) -> Dict[str, Any]:
"""Update configuration values.
Args:
updates: Dictionary of configuration updates
persist: If True, save changes to config file
Returns:
Updated configuration as dictionary
"""
logger.debug(f"Updating configuration with: {updates}, persist: {persist}")
# Validate updates by creating a temporary config object
current_config = self.config.model_dump()
# Apply updates to current config
for key, value in updates.items():
if '.' in key:
# Handle nested keys like "detection.confidence_threshold"
parts = key.split('.')
current = current_config
for part in parts[:-1]:
if part not in current:
current[part] = {}
current = current[part]
current[parts[-1]] = value
else:
# Handle section updates
if isinstance(value, dict):
if key not in current_config:
current_config[key] = {}
current_config[key].update(value)
else:
current_config[key] = value
# Validate the updated configuration
try:
ImageSorceryConfig(**current_config)
except Exception as e:
raise ValueError(f"Invalid configuration update: {e}") from e
if persist:
# Save to file
self._save_config_to_file(current_config)
# Clear runtime overrides since they're now persisted
self._runtime_overrides.clear()
else:
# Store as runtime overrides
self._runtime_overrides.update(updates)
# Reload configuration
self._load_config()
return self.get_config_dict()
def reset_runtime_overrides(self):
"""Reset all runtime overrides and reload from file."""
logger.debug("Resetting runtime overrides")
self._runtime_overrides.clear()
self._load_config()
def get_runtime_overrides(self) -> Dict[str, Any]:
"""Get current runtime overrides."""
logger.debug(f"Getting runtime overrides: {self._runtime_overrides}")
return self._runtime_overrides.copy()
# Global configuration manager instance
_config_manager: Optional[ConfigManager] = None
def get_config_manager() -> ConfigManager:
"""Get the global configuration manager instance."""
logger.debug("get_config_manager called")
global _config_manager
if _config_manager is None:
logger.debug("_config_manager is None, creating new instance")
_config_manager = ConfigManager()
logger.debug(f"_config_manager set to {_config_manager}")
else:
logger.debug("_config_manager already exists, returning existing instance")
return _config_manager
def get_config() -> ImageSorceryConfig:
"""Get the current configuration."""
logger.debug("get_config called")
result = get_config_manager().config
logger.debug(f"get_config returning: {result}")
return result
def get_config_schema_info() -> Dict[str, Any]:
"""Get configuration schema information for documentation and validation."""
logger.debug("get_config_schema_info")
schema_info = {
"detection.confidence_threshold": {
"description": "Default confidence threshold for object detection (0.0-1.0)",
"type": "float",
"constraints": "0.0 ≤ value ≤ 1.0"
},
"detection.default_model": {
"description": "Default model for detection tool",
"type": "string",
"constraints": "Valid model filename"
},
"find.confidence_threshold": {
"description": "Default confidence threshold for object finding (0.0-1.0)",
"type": "float",
"constraints": "0.0 ≤ value ≤ 1.0"
},
"find.default_model": {
"description": "Default model for find tool",
"type": "string",
"constraints": "Valid model filename"
},
"blur.strength": {
"description": "Default blur strength (must be odd number)",
"type": "integer",
"constraints": "Odd number ≥ 1"
},
"text.font_scale": {
"description": "Default font scale for text drawing",
"type": "float",
"constraints": "Value > 0.0"
},
"drawing.color": {
"description": "Default color in BGR format [B,G,R]",
"type": "list[int]",
"constraints": "3 integers, each 0-255"
},
"drawing.thickness": {
"description": "Default line thickness",
"type": "integer",
"constraints": "Value ≥ 1"
},
"ocr.language": {
"description": "Default OCR language code",
"type": "string",
"constraints": "Valid language code (e.g., 'en', 'fr', 'ru')"
},
"resize.interpolation": {
"description": "Default resize interpolation method",
"type": "string",
"constraints": "One of: nearest, linear, area, cubic, lanczos"
},
"telemetry.enabled": {
"description": "Enable or disable anonymous telemetry",
"type": "boolean",
"constraints": "true or false"
}
}
return schema_info
def get_available_config_keys() -> List[str]:
"""Get list of all available configuration keys."""
logger.debug("get_available_config_keys")
return list(get_config_schema_info().keys())
def generate_config_documentation() -> str:
"""Generate configuration documentation from schema."""
logger.debug("generate_config_documentation")
schema_info = get_config_schema_info()
lines = ["Available configuration keys:"]
for key, info in schema_info.items():
lines.append(f"- {key}: {info['description']}")
return "\n".join(lines)
================================================
FILE: src/imagesorcery_mcp/logging_config.py
================================================
import logging
import os
from logging.handlers import RotatingFileHandler
LOG_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), "logs", "imagesorcery.log")
LOG_LEVEL = logging.INFO
def setup_logging():
"""Sets up the central logger for the 🪄 ImageSorcery MCP server."""
# Ensure the logs directory exists
log_dir = os.path.dirname(LOG_FILE)
os.makedirs(log_dir, exist_ok=True)
# Create logger
logger = logging.getLogger("imagesorcery")
logger.setLevel(LOG_LEVEL)
# Prevent adding multiple handlers if setup is called more than once
if not logger.handlers:
# Create rotating file handler
handler = RotatingFileHandler(LOG_FILE, maxBytes=10*1024*1024, backupCount=5, encoding='utf-8')
# Change formatter to include module name and line number
formatter = logging.Formatter('%(asctime)s - %(name)s.%(module)s:%(lineno)d - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
# Optional: Add a console handler for development/debugging
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
console_handler.setLevel(LOG_LEVEL) # Set level for console handler
logger.addHandler(console_handler)
print(f"Log file: {LOG_FILE}")
return logger
# Setup logging when this module is imported
setup_logging()
# Get the logger instance to be used in other modules
logger = logging.getLogger("imagesorcery")
================================================
FILE: src/imagesorcery_mcp/middlewares/path_access.py
================================================
import logging
import os
from pathlib import Path
from typing import Any, Iterator
from fastmcp.server.middleware import CallNext, Middleware, MiddlewareContext
from mcp import McpError
from mcp.types import ErrorData
AVAILABLE_PATHS_ENV = "IMAGESORCERY_AVAILABLE_PATHS"
class PathAccessMiddleware(Middleware):
"""Restrict tool file paths to configured directories."""
def __init__(self, logger: logging.Logger | None = None):
self.logger = logger or logging.getLogger("imagesorcery.path_access")
async def on_call_tool(
self,
context: MiddlewareContext,
call_next: CallNext,
) -> Any:
allowed_dirs = get_allowed_directories()
if not allowed_dirs:
return await call_next(context)
arguments = getattr(context.message, "arguments", None) or {}
for argument_name, path_value in iter_path_arguments(arguments):
resolved_path = resolve_path(path_value)
if not is_path_allowed(resolved_path, allowed_dirs):
allowed = ", ".join(str(path) for path in allowed_dirs)
error_message = (
f"Path argument '{argument_name}' is outside allowed directories: "
f"{resolved_path}. Allowed directories: {allowed}"
)
self.logger.warning(error_message)
raise McpError(ErrorData(code=-32602, message=error_message))
return await call_next(context)
def get_allowed_directories() -> list[Path]:
raw_paths = os.getenv(AVAILABLE_PATHS_ENV, "")
if not raw_paths.strip():
return []
allowed_dirs = []
for raw_path in split_paths(raw_paths):
if not raw_path:
continue
allowed_dirs.append(resolve_path(raw_path))
return allowed_dirs
def split_paths(raw_paths: str) -> list[str]:
normalized = raw_paths.replace(",", os.pathsep)
return [part.strip() for part in normalized.split(os.pathsep) if part.strip()]
def iter_path_arguments(value: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
if isinstance(value, dict):
for key, item in value.items():
name = f"{prefix}.{key}" if prefix else str(key)
if is_path_argument(str(key)) and isinstance(item, str) and item.strip():
yield name, item
else:
yield from iter_path_arguments(item, name)
elif isinstance(value, list):
for index, item in enumerate(value):
name = f"{prefix}[{index}]" if prefix else f"[{index}]"
yield from iter_path_arguments(item, name)
def is_path_argument(name: str) -> bool:
return name == "path" or name.endswith("_path")
def resolve_path(path: str) -> Path:
return Path(os.path.abspath(os.path.expanduser(path)))
def is_path_allowed(path: Path, allowed_dirs: list[Path]) -> bool:
return any(path == allowed_dir or path.is_relative_to(allowed_dir) for allowed_dir in allowed_dirs)
================================================
FILE: src/imagesorcery_mcp/middlewares/telemetry.py
================================================
import logging
import sys
from importlib.metadata import version
from pathlib import Path
from typing import Any
from fastmcp.server.middleware import CallNext, Middleware, MiddlewareContext
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.telemetry_amplitude import amplitude_handler
from imagesorcery_mcp.telemetry_posthog import posthog_handler
class TelemetryMiddleware(Middleware):
"""Middleware that logs every tool, prompt, and resource run based on configuration."""
def __init__(self, logger: logging.Logger | None = None):
self.logger = logger or logging.getLogger("imagesorcery.telemetry")
self.user_id = self._get_user_id() # Added user_id
self.version = self._get_version()
self.system = sys.platform
self.amplitude_handler = amplitude_handler
self.posthog_handler = posthog_handler
def _get_user_id(self) -> str:
"""Get user_id from .user_id file."""
user_id_file = Path(".user_id") # Path to .user_id in project root
self.logger.debug(f"Looking for user ID file at: {user_id_file.absolute()}")
try:
if user_id_file.exists():
user_id = user_id_file.read_text().strip()
if user_id:
self.logger.debug(f"User ID from file: {user_id}")
return user_id
self.logger.warning("User ID file not found or empty. Telemetry will use 'anonymous'.")
return "anonymous"
except Exception as e:
self.logger.error(f"Could not read user_id: {e}")
return "anonymous"
def _get_version(self) -> str:
"""Get package version."""
try:
package_version = version("imagesorcery-mcp")
if package_version:
return package_version
except Exception:
pass
try:
import toml
pyproject_path = Path(__file__).resolve().parents[3] / "pyproject.toml"
return toml.load(pyproject_path)["project"]["version"]
except Exception:
self.logger.warning("Could not determine package version for telemetry.")
return "unknown"
async def _handle_action(self, action_type: str, identifier: str, context: MiddlewareContext, call_next: CallNext) -> Any:
"""Helper to log actions before and after execution, if telemetry is enabled."""
self.logger.debug(f"{action_type}: {identifier}")
config = get_config()
self.logger.debug(f"Telemetry enabled: {config.telemetry.enabled}")
if not config.telemetry.enabled:
self.logger.debug("Telemetry enabled skipped")
return await call_next(context)
log_data = {
"user_id": self.user_id, # Added user_id to log_data
"version": self.version,
"system": self.system,
"action_type": action_type.lower().replace(" ", "_"),
"identifier": identifier,
}
try:
response = await call_next(context)
log_data["status"] = "success"
self.logger.info(log_data)
self.posthog_handler.track_event(log_data)
self.amplitude_handler.track_event(log_data)
return response
except Exception:
log_data["status"] = "failed"
self.logger.warning(log_data)
self.posthog_handler.track_event(log_data)
self.amplitude_handler.track_event(log_data)
raise
async def on_call_tool(self, context: MiddlewareContext, call_next: CallNext) -> Any:
"""Log tool calls before and after execution, if telemetry is enabled."""
return await self._handle_action("Calling tool", context.message.name, context, call_next)
async def on_read_resource(self, context: MiddlewareContext, call_next: CallNext) -> Any:
"""Log resource reads before and after execution, if telemetry is enabled."""
return await self._handle_action("Reading resource", str(context.message.uri), context, call_next)
async def on_get_prompt(self, context: MiddlewareContext, call_next: CallNext) -> Any:
"""Log prompt retrievals before and after execution, if telemetry is enabled."""
return await self._handle_action("Getting prompt", context.message.name, context, call_next)
================================================
FILE: src/imagesorcery_mcp/middlewares/validation.py
================================================
import logging
import re
from typing import Any
from fastmcp.server.middleware import CallNext, Middleware, MiddlewareContext
from mcp import McpError
from mcp.types import ErrorData
class ImprovedValidationMiddleware(Middleware):
"""Middleware that improves validation error messages from FastMCP tools."""
def __init__(self, logger: logging.Logger | None = None):
self.logger = logger or logging.getLogger("imagesorcery.validation")
async def on_message(self, context: MiddlewareContext, call_next: CallNext) -> Any:
"""Handle messages with improved validation error reporting."""
try:
return await call_next(context)
except Exception as e:
error_msg = str(e)
if "validation error for call[" in error_msg:
tool_match = re.search(r'call\[(\w+)\]', error_msg)
tool_name = tool_match.group(1) if tool_match else "unknown"
errors = []
if "Unexpected keyword argument" in error_msg:
lines = error_msg.split('\n')
for i, line in enumerate(lines):
if "Unexpected keyword argument" in line:
if i > 0:
param_line = lines[i-1].strip()
param_name = param_line.split()[0] if param_line else "unknown"
errors.append(f"Unexpected parameter '{param_name}' - this parameter is not accepted by the tool '{tool_name}'")
if "Missing required" in error_msg:
param_match = re.search(r"Missing required.*?'(\w+)'", error_msg)
if param_match:
param_name = param_match.group(1)
errors.append(f"Missing required parameter '{param_name}'")
invalid_value_match = re.search(r"input_value='([^']+)'", error_msg)
if invalid_value_match:
invalid_value = invalid_value_match.group(1)
errors.append(f"Invalid value '{invalid_value}'")
if errors:
error_message = "Input validation error: " + "; ".join(errors)
else:
error_message = f"Input validation error in tool '{tool_name}': check that all parameters are correctly named and have the right types"
self.logger.error(error_message)
self.logger.debug(f"Original error: {error_msg}")
raise McpError(
ErrorData(code=-32602, message=error_message)
) from e
raise
================================================
FILE: src/imagesorcery_mcp/prompts/README.md
================================================
# Prompts
This directory contains reusable prompt templates for the ImageSorcery MCP server.
## Overview
Prompts provide parameterized message templates that help LLMs generate structured, purposeful responses for image processing workflows. Each prompt is designed to guide the AI through specific image manipulation tasks using the available tools.
## Architecture
- Register prompts by defining a `register_prompt` function in each prompt's module. This function should accept a `FastMCP` instance and use the `@mcp.prompt()` decorator to register the prompt function with the server. See `src/imagesorcery_mcp/server.py` for how prompts are imported and registered, and individual prompt files like `src/imagesorcery_mcp/prompts/remove_background.py` for examples of the `register_prompt` function implementation.
- When adding new prompts, ensure they are listed in alphabetical order in READMEs and in the server registration.
## Adding New Prompts
1. Create a new Python file in this directory (e.g., `new_prompt.py`)
2. Implement the prompt function with appropriate parameters and return type
3. Create a `register_prompt` function that registers the prompt with the FastMCP instance
4. Import and register the prompt in `src/imagesorcery_mcp/server.py`
5. Add documentation to this README
6. Write tests in `tests/prompts/test_new_prompt.py`
## Available Prompts
### `remove-background`
**Description:** Guides the AI through a comprehensive background removal workflow using object detection and masking tools.
**Parameters:**
- `image_path` (str): Full path to the input image
- `target_objects` (str, optional): Description of the objects to keep (default: empty for auto-detection)
- `output_path` (str, optional): Path for the final result (default: auto-generated)
**Example Usage:**
```
Use the remove-background prompt to remove the background from my photo 'portrait.jpg', keeping only the person
```
**Example Prompt Call (JSON):**
```json
{
"name": "remove-background",
"arguments": {
"image_path": "/home/user/images/portrait.jpg",
"target_objects": "person",
"output_path": "/home/user/images/portrait_no_bg.png"
}
}
```
================================================
FILE: src/imagesorcery_mcp/prompts/__init__.py
================================================
# Import the central logger
from imagesorcery_mcp.logging_config import logger
logger.info("🪄 ImageSorcery MCP prompts package initialized")
================================================
FILE: src/imagesorcery_mcp/prompts/remove_background.py
================================================
from typing import Annotated
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_prompt(mcp: FastMCP):
@mcp.prompt(name="remove-background")
def remove_background(
image_path: Annotated[
str, Field(description="Full path to the input image (must be a full path)")
],
target_objects: Annotated[
str,
Field(
description="Description of the objects to keep in the foreground (e.g., 'person', 'car and person', 'main subject')"
),
] = "",
output_path: Annotated[
str,
Field(
description="Full path for the output image with background removed (optional, will auto-generate if not provided)"
),
] = "",
) -> str:
"""
Guides the AI through a comprehensive background removal workflow.
This prompt provides a step-by-step approach to remove backgrounds from images
using object detection and masking tools. It's designed to work with the
ImageSorcery MCP server's detect and fill tools.
The workflow includes:
1. Object detection to identify the target object
2. Mask generation for precise selection
3. Background removal using fill operations
4. Optional refinement steps
Args:
image_path: Full path to the input image
target_objects: Description of what to keep (default: empty for auto-detection)
output_path: Where to save the result (auto-generated if empty)
Returns:
A detailed prompt guiding the AI through the background removal process
"""
logger.info(f"Remove background prompt requested for image: {image_path}")
logger.debug(f"Target objects: {target_objects}, Output path: {output_path}")
# Generate output path if not provided
if not output_path:
if image_path.lower().endswith(('.png', '.jpg', '.jpeg')):
base_path = image_path.rsplit('.', 1)[0]
output_path = f"{base_path}_no_background.png"
else:
output_path = f"{image_path}_no_background.png"
# Build the prompt based on whether target_objects is specified
if target_objects:
prompt = f"""I need to remove the background from an image while preserving the {target_objects}. Please follow this step-by-step workflow:
**Step 1: Find Target Objects**
Use the `find` tool to locate the specific objects:
- Call `find` on '{image_path}' with:
- description: "{target_objects}"
- confidence: 0.3 (lower threshold for better recall)
- return_geometry: true
- geometry_format: "mask"
- This will use text-based object identification to locate the {target_objects}
**Step 2: Remove Background**
Use the `fill` tool to remove the background:
- Call `fill` on '{image_path}' with:
- areas: Use mask files from find
- color: null
- output_path: '{output_path}'
**Step 3: Clean Up**
- Remove the temporary mask files created during the process
**Important Notes:**
- Save the final result as a PNG file to preserve transparency
Please execute this workflow step by step."""
else:
prompt = f"""I need to remove the background from an image. Please follow this step-by-step workflow:
**Step 1: Detect Objects**
Use the `detect` tool to identify objects in the image:
- Call `detect` on '{image_path}' with:
- confidence: 0.5 (to catch more objects)
- return_geometry: true
- geometry_format: "mask"
- Review the detected objects and identify the main subjects to preserve
**Step 3: Remove Background**
Use the `fill` tool to remove the background:
- Call `fill` on '{image_path}' with:
- areas: Use mask files from detect
- color: null
- output_path: '{output_path}'
**Step 4: Clean Up**
- Remove the temporary mask files created during the process
**Important Notes:**
- Save the final result as a PNG file to preserve transparency
Please execute this workflow step by step."""
logger.info(f"Generated remove background prompt for targets: {target_objects or 'auto-detect'}")
return prompt
================================================
FILE: src/imagesorcery_mcp/resources/README.md
================================================
# 🪄 ImageSorcery MCP Server Resources Documentation
This document provides detailed information about each resource available in the 🪄 ImageSorcery MCP Server, including their URIs, descriptions, and examples of how to access them using a Claude client.
## Rules
These rules apply to all contributors: humans and AI.
- Register resources by defining a `register_resource` function in each resource's module. This function should accept a `FastMCP` instance and use the `@mcp.resource()` decorator to register the resource function with the server. See `src/imagesorcery_mcp/server.py` for how resources are imported and registered, and individual resource files like `src/imagesorcery_mcp/resources/models.py` for examples of the `register_resource` function implementation.
- When adding new resources, ensure they are listed in alphabetical order in READMEs and in the server registration.
## Available Resources
### `models://list`
Lists all available models in the models directory. This resource scans the models directory and returns information about all available models, including their names, descriptions, and file paths.
- **URI:** `models://list`
- **Returns:** JSON string containing:
- `models`: List of available models, each with:
- `name`: Name of the model file (relative path from the models directory)
- `description`: Description of the model's purpose and characteristics
- `path`: Full path to the model file
**Example Claude Request:**
```
List all available models in the models directory
```
**Example Resource Access (JSON):**
```json
{
"resource": "models://list"
}
```
**Example Response (JSON):**
```json
{
"models": [
{
"name": "yolov8m.pt",
"description": "YOLOv8 Medium - Default model with good balance between accuracy and speed.",
"path": "/path/to/models/yolov8m.pt"
},
{
"name": "yolov8n.pt",
"description": "YOLOv8 Nano - Smallest and fastest model, suitable for edge devices with limited resources.",
"path": "/path/to/models/yolov8n.pt"
}
]
}
================================================
FILE: src/imagesorcery_mcp/resources/__init__.py
================================================
# Import the central logger
from imagesorcery_mcp.logging_config import logger
logger.info("🪄 ImageSorcery MCP resources package initialized")
================================================
FILE: src/imagesorcery_mcp/resources/models.py
================================================
import json
from pathlib import Path
from fastmcp import FastMCP
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def get_model_description(model_name: str) -> str:
"""Get a description for a specific model."""
logger.debug(f"Attempting to get description for model: {model_name}")
# Path to model descriptions JSON file
descriptions_file = Path("models") / "model_descriptions.json"
# Check if descriptions file exists
if not descriptions_file.exists():
logger.warning(f"Model descriptions file not found: {descriptions_file}")
return "model_descriptions.json not found"
try:
# Load descriptions from JSON file
logger.debug(f"Loading model descriptions from: {descriptions_file}")
with open(descriptions_file, "r", encoding="utf-8") as f:
descriptions = json.load(f)
logger.debug(f"Loaded {len(descriptions)} model descriptions")
# Normalize model name to use forward slashes for consistent lookup
normalized_model_name = model_name.replace('\\', '/')
logger.debug(f"Normalized model name for lookup: {normalized_model_name}")
# Try direct lookup and also case-insensitive lookup
if normalized_model_name in descriptions:
logger.debug(f"Found direct match for model description: {normalized_model_name}")
return descriptions[normalized_model_name]
# Try case-insensitive lookup as a fallback
for key in descriptions:
if key.lower() == normalized_model_name.lower():
logger.debug(f"Found case-insensitive match for model description: {key}")
return descriptions[key]
logger.warning(f"Model '{model_name}' not found in model_descriptions.json (total descriptions: {len(descriptions)})")
return f"Model '{model_name}' not found in model_descriptions.json (total descriptions: {len(descriptions)})"
except Exception as e:
# Return default description if any error occurs
logger.error(f"Error in get_model_description for {model_name}: {str(e)}", exc_info=True)
return "model_descriptions.json parse issue"
def register_resource(mcp: FastMCP):
@mcp.resource("models://list")
async def list_models() -> str:
"""
List all available models in the models directory.
This resource provides information about all available models,
including their names and descriptions.
"""
logger.info("Models resource requested")
models_dir = Path("models")
available_models = []
# Check if models directory exists
if not models_dir.exists():
logger.warning(f"Models directory not found: {models_dir}")
return json.dumps({"models": available_models}, indent=2)
logger.info(f"Scanning models directory: {models_dir}")
# Define model file extensions to include
model_extensions = [".pt", ".pth", ".onnx", ".tflite", ".pb"]
logger.debug(f"Looking for files with extensions: {model_extensions}")
# Scan for model files recursively using rglob instead of glob
for file_path in models_dir.rglob("*"):
if file_path.is_file() and file_path.suffix.lower() in model_extensions:
# Get relative path from models directory
rel_path = file_path.relative_to(models_dir)
# Convert to string with forward slashes for consistent naming across platforms
model_name = str(rel_path).replace('\\', '/')
description = get_model_description(model_name)
available_models.append(
{
"name": model_name,
"description": description,
"path": str(file_path),
}
)
logger.debug(f"Found model: {model_name} with description: {description}")
logger.info(f"Found {len(available_models)} available models")
return json.dumps({"models": available_models}, indent=2)
================================================
FILE: src/imagesorcery_mcp/scripts/README.md
================================================
# 🪄 ImageSorcery MCP Server Scripts Documentation
This document provides detailed information about each script available in the 🪄 ImageSorcery MCP Server, including their purpose, arguments, and examples of how to use them.
## Overview
The scripts directory contains utility scripts for model management and setup within the 🪄 ImageSorcery MCP Server. These scripts handle tasks such as:
- `download-models`: Downloading YOLO models from various sources
- `create-model-descriptions`: Creating model descriptions (used in `setup.sh`)
- `download-clip-models`: Downloading CLIP models required for text-based detection (YOLOe *-pf models) (used in `setup.sh`)
- `post-install-imagesorcery`: Running all post-installation tasks in a single command
- `populate_telemetry_keys.py` / `clear_telemetry_keys.py`: build-time helpers for telemetry keys management
These scripts are typically run during project setup, packaging, or when adding new models to the system.
## Common Functions
These scripts share some common functions and patterns:
- All scripts use a central logger from `imagesorcery_mcp.logging_config`
- They typically create the `models` directory if it doesn't exist
- They handle existing files to avoid unnecessary downloads
- Progress bars are provided for downloads using `tqdm`
## Available Scripts
### `download_models.py`
Downloads YOLO compatible models for offline use from either Ultralytics or Hugging Face.
- **Purpose:** Ensures that required detection models are available for tools like `detect` and `find`.
- **Functionality:**
- Downloads models from Ultralytics repositories
- Downloads models from Hugging Face repositories
- Updates the model descriptions JSON file with information about downloaded models
- Organizes models in a proper directory structure
- **Arguments:**
- `--ultralytics MODEL_NAME`: Download a model from Ultralytics (e.g., 'yolov8m.pt')
- `--huggingface REPO_ID[:FILENAME]`: Download a model from Hugging Face (e.g., 'username/repo:model.pt')
**Command-line Usage:**
```bash
# Download from Ultralytics
download-yolo-models --ultralytics yolov8m.pt
# Download from Hugging Face
download-yolo-models --huggingface ultralytics/yolov8:yolov8m.pt
```
**Python Import Usage:**
```python
from imagesorcery_mcp.scripts.download_models import download_ultralytics_model, download_from_huggingface
# Download from Ultralytics
success = download_ultralytics_model('yolov8m.pt')
# Download from Hugging Face
success = download_from_huggingface('ultralytics/yolov8:yolov8m.pt')
```
#### Notes
- Downloaded models are stored in the `models` directory, which is included in `.gitignore` to prevent large model files from being committed to the repository.
- If you encounter permission issues when running these scripts, ensure you have the necessary write access to the project directory.
### `create_model_descriptions.py`
Creates a JSON file containing descriptions for various detection models in the models directory.
- **Purpose:** Ensures that model description information is available for reference by tools and users.
- **Functionality:**
- Creates a comprehensive list of model descriptions for various YOLO models (YOLOv8, YOLO11, YOLO-NAS, etc.)
- Merges new descriptions with any existing ones, preserving custom descriptions
- Writes the merged descriptions to `models/model_descriptions.json`
- **Usage:** Run directly or through the provided command-line entry point.
**Command-line Usage:**
```bash
create-model-descriptions
```
**Python Import Usage:**
```python
from imagesorcery_mcp.scripts.create_model_descriptions import create_model_descriptions
# Create the model descriptions file
result_path = create_model_descriptions()
```
### `download_clip.py`
Downloads the MobileCLIP model required for YOLOe text prompts functionality.
- **Purpose:** Ensures that the required MobileCLIP model is available for text-based detection in the `find` tool.
- **Functionality:**
- Downloads the MobileCLIP model required for YOLOe text prompts
- Places the model in the root directory where it's expected by the find tool
- **Usage:** Run directly or through the provided command-line entry point.
**Command-line Usage:**
```bash
download-clip-models
```
**Python Import Usage:**
```python
from imagesorcery_mcp.scripts.download_clip import download_clip_model
# Download CLIP model
success = download_clip_model()
```
### `post_install.py`
Runs all post-installation tasks for the ImageSorcery MCP server in a single command.
- **Purpose:** Automates the complete setup process after package installation.
- **Functionality:**
- Creates the models directory
- Generates the model descriptions file with `create-model-descriptions`
- Downloads default YOLO models (yoloe-11l-seg-pf.pt, yoloe-11s-seg-pf.pt, yoloe-11l-seg.pt, yoloe-11s-seg.pt) with `download-yolo-models`
- Installs the `clip` Python package from Ultralytics' GitHub repository.
- Downloads the required CLIP model file for text prompts with `download-clip-models`.
- Ensures a `.user_id` file exists in project root (used for telemetry user identification).
- **Usage:** Run directly, through the server with the `--post-install` flag, or through the provided command-line entry point.
**Command-line Usage:**
```bash
# Run post-installation as a standalone script
python -m src.imagesorcery_mcp.scripts.post_install
# Or run it through the server with the --post-install flag
imagesorcery-mcp --post-install
```
**Python Import Usage:**
```python
from imagesorcery_mcp.scripts.post_install import run_post_install
# Run all post-installation tasks
success = run_post_install()
if success:
print("Post-installation completed successfully!")
else:
print("Post-installation failed.")
```
## Telemetry Keys Management (build-time)
Telemetry keys are no longer stored in `telemetry.toml`. Instead, telemetry API keys are managed via a small Python module and/or environment variables:
- Telemetry user identifier is stored in `.user_id` (created by `post_install.py`).
- API keys are provided either via environment variables or the Python module:
- Environment variables (preferred during build/deploy):
- `IMAGESORCERY_AMPLITUDE_API_KEY`
- `IMAGESORCERY_POSTHOG_API_KEY`
- Fallback module (kept in the repository as empty defaults): `src/imagesorcery_mcp/telemetry_keys.py`
- Contains `AMPLITUDE_API_KEY = ""` and `POSTHOG_API_KEY = ""`
Rationale:
- `telemetry.toml` was unreliable in some packaging/build scenarios (it could be omitted from final artifacts). Using environment variables (and a small Python module as a fallback) ensures keys are available at runtime and during build without embedding secrets in the repo.
### `populate_telemetry_keys.py`
**Purpose**: Populate `src/imagesorcery_mcp/telemetry_keys.py` with API keys from the environment (or `.env`) during build time if desired.
**Functionality**:
- Reads `IMAGESORCERY_AMPLITUDE_API_KEY` and `IMAGESORCERY_POSTHOG_API_KEY` from environment variables (or `.env` when python-dotenv is available)
- Writes these values into `src/imagesorcery_mcp/telemetry_keys.py`
- Intended to be used in CI/build pipelines where keys are injected as environment variables before packaging
**Command-line Usage**:
```bash
python -m src.imagesorcery_mcp.scripts.populate_telemetry_keys
```
**Notes**:
- The script will not commit changes; CI should handle any necessary cleanup.
- To skip population, set `SKIP_TELEMETRY_POPULATION=true`.
### `clear_telemetry_keys.py`
**Purpose**: Clear API keys in `src/imagesorcery_mcp/telemetry_keys.py` after build to keep the repository clean.
**Functionality**:
- Overwrites `src/imagesorcery_mcp/telemetry_keys.py` with empty string defaults:
```py
AMPLITUDE_API_KEY = ""
POSTHOG_API_KEY = ""
```
- Intended to be invoked as a post-build/cleanup step in CI
**Command-line Usage**:
```bash
python -m src.imagesorcery_mcp.scripts.clear_telemetry_keys
```
### Recommended CI / Build Integration
A suggested pipeline for safely using telemetry keys in CI:
1. In CI, set environment variables:
- `IMAGESORCERY_AMPLITUDE_API_KEY` and `IMAGESORCERY_POSTHOG_API_KEY`
2. Run the populate script:
- `python -m src.imagesorcery_mcp.scripts.populate_telemetry_keys`
3. Build/package the project
4. Run the clear script to remove keys from the working copy:
- `python -m src.imagesorcery_mcp.scripts.clear_telemetry_keys`
5. Ensure the CI does not persist telemetry_keys.py with real keys in any artifact or commit.
**Dependencies**:
- `python-dotenv` (optional) — used by scripts to load a `.env` file when present
**Error Handling**:
- Scripts log errors and return non-zero exit codes on failure so CI can fail fast.
================================================
FILE: src/imagesorcery_mcp/scripts/__init__.py
================================================
# Import functions to make them available when importing the package
# Import the central logger
from imagesorcery_mcp.logging_config import logger
from .create_model_descriptions import create_model_descriptions
from .download_models import download_model
__all__ = ["create_model_descriptions", "download_model", "logger"]
================================================
FILE: src/imagesorcery_mcp/scripts/clear_telemetry_keys.py
================================================
#!/usr/bin/env python3
"""Build script to clear API keys in src/imagesorcery_mcp/telemetry_keys.py while preserving .user_id."""
import logging
import sys
from pathlib import Path
# Setup logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)
# Telemetry keys file path
TELEMETRY_KEYS_FILE = Path('src/imagesorcery_mcp/telemetry_keys.py')
def write_empty_telemetry_keys() -> bool:
"""Overwrite telemetry_keys.py with empty API key values."""
try:
content = '''# Auto-generated telemetry keys module.
# This file is intended to be updated by build scripts (populate_telemetry_keys.py)
# and cleared by clear_telemetry_keys.py. Keep values as empty strings in the repo.
#
# WARNING: Do NOT commit real production keys to the repository.
AMPLITUDE_API_KEY = ""
POSTHOG_API_KEY = ""
'''
TELEMETRY_KEYS_FILE.write_text(content)
logger.info(f"Cleared telemetry keys in {TELEMETRY_KEYS_FILE}")
return True
except Exception as e:
logger.error(f"Failed to clear telemetry keys file: {e}")
return False
def main():
logger.info("Starting telemetry keys clearing process...")
if not TELEMETRY_KEYS_FILE.exists():
logger.warning(f"{TELEMETRY_KEYS_FILE} does not exist; creating a new cleared file.")
if write_empty_telemetry_keys():
logger.info("Telemetry keys cleared successfully")
return 0
else:
logger.error("Failed to clear telemetry keys")
return 1
if __name__ == '__main__':
sys.exit(main())
================================================
FILE: src/imagesorcery_mcp/scripts/create_model_descriptions.py
================================================
#!/usr/bin/env python3
"""
Script to create model descriptions JSON file.
This script should be run during project setup to ensure model descriptions are available.
"""
import json
import os
from pathlib import Path
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def create_model_descriptions():
"""Create a JSON file with model descriptions in the models directory."""
logger.info(f"Creating model descriptions JSON file at {Path('models') / 'model_descriptions.json'}")
# YOLOv8 model descriptions
model_descriptions = {
"yolo11n.pt": "Ultralytics YOLO11 model for Object Detection. Provides state-of-the-art performance, suitable for tasks requiring a balance of speed and accuracy (smallest of YOLO11).",
"yolo11s.pt": "Ultralytics YOLO11 model for Object Detection. Provides state-of-the-art performance, more accurate than 'n', with slightly lower speed.",
"yolo11m.pt": "Ultralytics YOLO11 model for Object Detection. Provides state-of-the-art performance, a medium option balancing accuracy and speed.",
"yolo11l.pt": "Ultralytics YOLO11 model for Object Detection. Provides state-of-the-art performance, more accurate than 'm', with slightly lower speed.",
"yolo11x.pt": "Ultralytics YOLO11 model for Object Detection. Provides state-of-the-art performance (highest accuracy of YOLO11 Detect).",
"yolo11n-seg.pt": "Ultralytics YOLO11 model for Instance Segmentation. Provides state-of-the-art performance, smallest and fastest of YOLO11 Seg.",
"yolo11s-seg.pt": "Ultralytics YOLO11 model for Instance Segmentation. Provides state-of-the-art performance, a larger variant than 'n'.",
"yolo11m-seg.pt": "Ultralytics YOLO11 model for Instance Segmentation. Provides state-of-the-art performance, a medium variant.",
"yolo11l-seg.pt": "Ultralytics YOLO11 model for Instance Segmentation. Provides state-of-the-art performance, a larger variant than 'm'.",
"yolo11x-seg.pt": "Ultralytics YOLO11 model for Instance Segmentation. Provides state-of-the-art performance (highest accuracy of YOLO11 Seg).",
"yolo11n-pose.pt": "Ultralytics YOLO11 model for Pose Estimation / Keypoints detection. Provides state-of-the-art performance, smallest and fastest of YOLO11 Pose.",
"yolo11s-pose.pt": "Ultralytics YOLO11 model for Pose Estimation / Keypoints detection. Provides state-of-the-art performance, a larger variant than 'n'.",
"yolo11m-pose.pt": "Ultralytics YOLO11 model for Pose Estimation / Keypoints detection. Provides state-of-the-art performance, a medium variant.",
"yolo11l-pose.pt": "Ultralytics YOLO11 model for Pose Estimation / Keypoints detection. Provides state-of-the-art performance, a larger variant than 'm'.",
"yolo11x-pose.pt": "Ultralytics YOLO11 model for Pose Estimation / Keypoints detection. Provides state-of-the-art performance (highest accuracy of YOLO11 Pose).",
"yolo11n-obb.pt": "Ultralytics YOLO11 model for Oriented Object Detection (OBB). Provides state-of-the-art performance, smallest of YOLO11 OBB.",
"yolo11s-obb.pt": "Ultralytics YOLO11 model for Oriented Object Detection (OBB). Provides state-of-the-art performance, a larger variant than 'n'.",
"yolo11m-obb.pt": "Ultralytics YOLO11 model for Oriented Object Detection (OBB). Provides state-of-the-art performance, a medium variant.",
"yolo11l-obb.pt": "Ultralytics YOLO11 model for Oriented Object Detection (OBB). Provides state-of-the-art performance, a larger variant than 'm'.",
"yolo11x-obb.pt": "Ultralytics YOLO11 model for Oriented Object Detection (OBB). Provides state-of-the-art performance (highest accuracy of YOLO11 OBB).",
"yolo11n-cls.pt": "Ultralytics YOLO11 model for Image Classification. Provides state-of-the-art performance, smallest of YOLO11 Classify.",
"yolo11s-cls.pt": "Ultralytics YOLO11 model for Image Classification. Provides state-of-the-art performance, a larger variant than 'n'.",
"yolo11m-cls.pt": "Ultralytics YOLO11 model for Image Classification. Provides state-of-the-art performance, a medium variant.",
"yolo11l-cls.pt": "Ultralytics YOLO11 model for Image Classification. Provides state-of-the-art performance, a larger variant than 'm'.",
"yolo11x-cls.pt": "Ultralytics YOLO11 model for Image Classification. Provides state-of-the-art performance (highest accuracy of YOLO11 Classify).",
"yolov8n.pt": "General-purpose real-time Ultralytics YOLOv8 model for Object Detection. Provides a good balance of accuracy and speed, suitable for resource-constrained tasks (smallest of YOLOv8 Detect).",
"yolov8s.pt": "General-purpose real-time Ultralytics YOLOv8 model for Object Detection. Balances accuracy and speed, a larger variant than 'n'.",
"yolov8m.pt": "General-purpose real-time Ultralytics YOLOv8 model for Object Detection. Balances accuracy and speed, a medium variant.",
"yolov8l.pt": "General-purpose real-time Ultralytics YOLOv8 model for Object Detection. Balances accuracy and speed, a larger variant than 'm'.",
"yolov8x.pt": "General-purpose real-time Ultralytics YOLOv8 model for Object Detection. Balances accuracy and speed (highest accuracy of YOLOv8 Detect).",
"yolov8n-seg.pt": "General-purpose real-time Ultralytics YOLOv8 model for Instance Segmentation. Provides a good balance of accuracy and speed, suitable for resource-constrained tasks (smallest of YOLOv8 Seg).",
"yolov8s-seg.pt": "General-purpose real-time Ultralytics YOLOv8 model for Instance Segmentation. Balances accuracy and speed, a larger variant than 'n'.",
"yolov8m-seg.pt": "General-purpose real-time Ultralytics YOLOv8 model for Instance Segmentation. Balances accuracy and speed, a medium variant.",
"yolov8l-seg.pt": "General-purpose real-time Ultralytics YOLOv8 model for Instance Segmentation. Balances accuracy and speed, a larger variant than 'm'.",
"yolov8x-seg.pt": "General-purpose real-time Ultralytics YOLOv8 model for Instance Segmentation. Balances accuracy and speed (highest accuracy of YOLOv8 Seg).",
"yolov8n-pose.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. Suitable for resource-constrained tasks (smallest of YOLOv8 Pose).",
"yolov8s-pose.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. A larger variant than 'n'.",
"yolov8m-pose.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. A medium variant.",
"yolov8l-pose.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. A larger variant than 'm'.",
"yolov8x-pose.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. The largest variant.",
"yolov8x-pose-p6.pt": "General-purpose real-time Ultralytics YOLOv8 model for Pose Estimation / Keypoints detection. Trained with 1280 input size.",
"yolov8n-obb.pt": "General-purpose real-time Ultralytics YOLOv8 model for Oriented Object Detection (OBB). Suitable for resource-constrained tasks (smallest of YOLOv8 OBB).",
"yolov8s-obb.pt": "General-purpose real-time Ultralytics YOLOv8 model for Oriented Object Detection (OBB). A larger variant than 'n'.",
"yolov8m-obb.pt": "General-purpose real-time Ultralytics YOLOv8 model for Oriented Object Detection (OBB). A medium variant.",
"yolov8l-obb.pt": "General-purpose real-time Ultralytics YOLOv8 model for Oriented Object Detection (OBB). A larger variant than 'm'.",
"yolov8x-obb.pt": "General-purpose real-time Ultralytics YOLOv8 model for Oriented Object Detection (OBB). The largest variant.",
"yolov8n-cls.pt": "General-purpose real-time Ultralytics YOLOv8 model for Image Classification. Suitable for resource-constrained tasks (smallest of YOLOv8 Classify).",
"yolov8s-cls.pt": "General-purpose real-time Ultralytics YOLOv8 model for Image Classification. A larger variant than 'n'.",
"yolov8m-cls.pt": "General-purpose real-time Ultralytics YOLOv8 model for Image Classification. A medium variant.",
"yolov8l-cls.pt": "General-purpose real-time Ultralytics YOLOv8 model for Image Classification. A larger variant than 'm'.",
"yolov8x-cls.pt": "General-purpose real-time Ultralytics YOLOv8 model for Image Classification. The largest variant.",
"rtdetr-l.pt": "Realtime Detection Transformer (RT-DETR) by Baidu for Object Detection. Well-suited for applications requiring high accuracy and real-time performance (smaller variant).",
"rtdetr-x.pt": "Realtime Detection Transformer (RT-DETR) by Baidu for Object Detection. Well-suited for applications requiring high accuracy and real-time performance (larger variant).",
"sam_b.pt": "Segment Anything Model (SAM) by Meta. Provides unique automatic segmentation capabilities based on prompts.",
"sam2_t.pt": "Segment Anything Model 2 (SAM2) by Meta. The next generation of SAM for video and images, provides automatic segmentation capabilities (smaller variant).",
"sam2_b.pt": "Segment Anything Model 2 (SAM2) by Meta. The next generation of SAM for video and images, provides automatic segmentation capabilities (larger variant).",
"mobile_sam.pt": "Mobile Segment Anything Model (MobileSAM). A mobile variant of SAM for segmentation.",
"FastSAM-s.pt": "Fast Segment Anything Model (FastSAM). A segmentation model that is faster and more efficient than SAM. Supports segmentation based on text prompts or bounding boxes.",
"yolo_nas_s.pt": "YOLO-NAS model (based on Neural Architecture Search) for Object Detection. Optimized for resource-constrained environments and focused on efficiency (smallest).",
"yolo_nas_m.pt": "YOLO-NAS model (based on Neural Architecture Search) for Object Detection. Offers a balanced approach, suitable for general object detection with higher accuracy (medium).",
"yolo_nas_l.pt": "YOLO-NAS model (based on Neural Architecture Search) for Object Detection. Designed for scenarios requiring the highest accuracy, where computational resources are less constrained (largest).",
"yolov8s-world.pt": "YOLO-World model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (smaller variant, no export support).",
"yolov8s-worldv2.pt": "YOLO-World V2 model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (smaller variant, with export support and deterministic training). Recommended for custom training.",
"yolov8m-world.pt": "YOLO-World model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (medium variant, no export support).",
"yolov8m-worldv2.pt": "YOLO-World V2 model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (medium variant, with export support and deterministic training).",
"yolov8l-world.pt": "YOLO-World model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (larger variant, no export support).",
"yolov8l-worldv2.pt": "YOLO-World V2 model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (larger variant, with export support and deterministic training).",
"yolov8x-world.pt": "YOLO-World model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (largest variant, no export support).",
"yolov8x-worldv2.pt": "YOLO-World V2 model (based on YOLOv8) for Real-Time Open-Vocabulary Object Detection. Detects any objects based on text descriptions, effective for zero-shot tasks (largest variant, with export support and deterministic training).",
"yoloe-11s-seg.pt": "Real-Time Open-Vocabulary YOLOE model for Instance Segmentation. Detects arbitrary classes using text/visual prompts (smallest).",
"yoloe-11m-seg.pt": "Real-Time Open-Vocabulary YOLOE model for Instance Segmentation. Detects arbitrary classes using text/visual prompts (medium).",
"yoloe-11l-seg.pt": "Real-Time Open-Vocabulary YOLOE model for Instance Segmentation. Detects arbitrary classes using text/visual prompts (largest).",
"yoloe-v8s-seg.pt": "Real-Time Open-Vocabulary YOLOE model (based on YOLOv8) for Instance Segmentation. Detects arbitrary classes using text/visual prompts (smallest).",
"yoloe-v8m-seg.pt": "Real-Time Open-Vocabulary YOLOE model (based on YOLOv8) for Instance Segmentation. Detects arbitrary classes using text/visual prompts (medium).",
"yoloe-v8l-seg.pt": "Real-Time Open-Vocabulary YOLOE model (based on YOLOv8) for Instance Segmentation. Detects arbitrary classes using text/visual prompts (largest).",
"yoloe-11s-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model for Instance Segmentation. Detects objects from a large built-in vocabulary (smallest).",
"yoloe-11m-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model for Instance Segmentation. Detects objects from a large built-in vocabulary (medium).",
"yoloe-11l-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model for Instance Segmentation. Detects objects from a large built-in vocabulary (largest).",
"yoloe-v8s-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model (based on YOLOv8) for Instance Segmentation. Detects objects from a large built-in vocabulary (smallest).",
"yoloe-v8m-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model (based on YOLOv8) for Instance Segmentation. Detects objects from a large built-in vocabulary (medium).",
"yoloe-v8l-seg-pf.pt": "Real-Time Open-Vocabulary Prompt-Free YOLOE model (based on YOLOv8) for Instance Segmentation. Detects objects from a large built-in vocabulary (largest).",
"yolov10n.pt": "Real-Time End-to-End YOLOv10 model for Object Detection. Suitable for very resource-constrained environments (smallest).",
"yolov10s.pt": "Real-Time End-to-End YOLOv10 model for Object Detection. Balances speed and accuracy.",
"yolov10m.pt": "Real-Time End-to-End YOLOv10 model for Object Detection. Suitable for general use (medium).",
"yolov10l.pt": "Real-Time End-to-End YOLOv10 model for Object Detection. High accuracy at the cost of computational resources.",
"yolov10x.pt": "Real-Time End-to-End YOLOv10 model for Object Detection. Maximum accuracy and performance (largest).",
"yolov3u.pt": "YOLOv3 model for Object Detection. An older but effective real-time model.",
"yolov3-tinyu.pt": "YOLOv3-Tiny model for Object Detection. A very fast, lightweight version of YOLOv3.",
"yolov3-sppu.pt": "YOLOv3-SPP model for Object Detection. A version of YOLOv3 with an SPP module for improved performance.",
"yolov9t.pt": "YOLOv9 model for Object Detection. Uses PGI for data preservation, useful for lightweight models (smallest of YOLOv9 Detect).",
"yolov9s.pt": "YOLOv9 model for Object Detection. Uses PGI for data preservation, useful for lightweight models (a larger variant than 't').",
"yolov9m.pt": "YOLOv9 model for Object Detection. Uses PGI for data preservation, useful for lightweight models (medium).",
"yolov9c.pt": "YOLOv9 model for Object Detection. Uses PGI for data preservation, useful for lightweight models (a larger variant than 'm').",
"yolov9e.pt": "YOLOv9 model for Object Detection. Uses PGI for data preservation, useful for lightweight models (largest of YOLOv9 Detect).",
"yolov9c-seg.pt": "YOLOv9 model for Instance Segmentation. Uses PGI for data preservation, useful for lightweight models (smaller variant).",
"yolov9e-seg.pt": "YOLOv9 model for Instance Segmentation. Uses PGI for data preservation, useful for lightweight models (larger variant).",
"yolo12n.pt": "YOLO12 'Attention-Centric' model for Object Detection. (Example provided only for the 'n' variant)."
}
# Create models directory if it doesn't exist
models_dir = Path("models").resolve()
os.makedirs(models_dir, exist_ok=True)
logger.info(f"Ensured models directory exists: {models_dir}")
descriptions_file = models_dir / "model_descriptions.json"
existing_descriptions = {}
# Read existing descriptions if the file exists
if descriptions_file.exists():
try:
with open(descriptions_file, "r") as f:
existing_descriptions = json.load(f)
logger.info(f"Loaded existing model descriptions from: {descriptions_file}")
except json.JSONDecodeError:
logger.warning(f"Error decoding JSON from {descriptions_file}, starting with empty descriptions.")
existing_descriptions = {}
except Exception as e:
logger.error(f"Error reading existing model descriptions from {descriptions_file}: {e}")
existing_descriptions = {}
# Merge new descriptions with existing ones
# Existing descriptions take precedence to avoid overwriting custom ones
merged_descriptions = model_descriptions.copy()
merged_descriptions.update(existing_descriptions)
# Write merged descriptions to JSON file
logger.info(f"Writing merged model descriptions to: {descriptions_file}")
try:
with open(descriptions_file, "w") as f:
json.dump(merged_descriptions, f, indent=2)
logger.info(f"Model descriptions updated successfully at: {descriptions_file}")
print(f"✅ Model descriptions updated at: {descriptions_file}")
return str(descriptions_file)
except Exception as e:
logger.error(f"Error writing merged model descriptions to {descriptions_file}: {e}")
print(f"❌ Failed to update model descriptions at: {descriptions_file}")
return None
def main():
logger.info(f"Running create_model_descriptions script from {Path(__file__).resolve()}")
create_model_descriptions()
logger.info("create_model_descriptions script finished")
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/scripts/download_clip.py
================================================
#!/usr/bin/env python3
"""
Script to download CLIP models required for YOLOe text prompts.
"""
import os
import sys
from pathlib import Path
import requests
from tqdm import tqdm
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def get_models_dir():
"""Get the models directory in the project root."""
models_dir = Path("models").resolve()
os.makedirs(models_dir, exist_ok=True)
logger.info(f"Ensured models directory exists: {models_dir}")
return models_dir
def download_file(url, output_path):
"""Download a file from a URL with progress bar."""
logger.info(f"Attempting to download file from {url} to {output_path}")
try:
response = requests.get(url, stream=True)
response.raise_for_status()
total_size = int(response.headers.get('content-length', 0))
block_size = 1024 # 1 Kibibyte
with open(output_path, 'wb') as file, tqdm(
desc=f"Downloading to {os.path.basename(output_path)}",
total=total_size,
unit='iB',
unit_scale=True,
unit_divisor=1024,
) as bar:
for data in response.iter_content(block_size):
size = file.write(data)
bar.update(size)
logger.info(f"Successfully downloaded file to {output_path}")
return True
except Exception as e:
logger.error(f"Error downloading from {url}: {str(e)}")
return False
def download_clip_model():
"""Download the MobileCLIP model required for YOLOe text prompts."""
logger.info("Attempting to download CLIP model")
root_clip_model_path = Path("mobileclip_blt.ts").resolve()
# Check if model already exists in root directory
if root_clip_model_path.exists():
logger.info(f"CLIP model already exists at: {root_clip_model_path}")
return True
# URL for the MobileCLIP model
url = "https://github.com/ultralytics/assets/releases/download/v8.3.0/mobileclip_blt.ts"
# Download directly to root directory
logger.info(f"Downloading CLIP model to root directory from: {url}")
success = download_file(url, root_clip_model_path)
if success:
logger.info(f"CLIP model successfully downloaded to: {root_clip_model_path}")
return True
else:
logger.error(f"Failed to download CLIP model to: {root_clip_model_path}")
return False
def main():
"""Main function to download CLIP models."""
logger.info(f"Running download_clip_models script from {Path(__file__).resolve()}")
# Download the MobileCLIP model
if download_clip_model():
logger.info("CLIP model downloaded successfully")
print("✅ CLIP model download completed successfully")
else:
logger.error("Failed to download CLIP model")
print("❌ Failed to download CLIP model")
sys.exit(1)
logger.info("download_clip_models script finished")
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/scripts/download_models.py
================================================
#!/usr/bin/env python3
"""
Script to download YOLO compatible models for offline use.
This script should be run during project setup to ensure models are available.
"""
import argparse
import json
import os
import shutil
import sys
from pathlib import Path
import requests
from tqdm import tqdm
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def get_models_dir():
"""Get the models directory in the project root."""
models_dir = Path("models").resolve()
os.makedirs(models_dir, exist_ok=True)
logger.info(f"Ensured models directory exists: {models_dir}")
return str(models_dir)
def download_from_url(url, output_path):
"""Download a file from a URL with progress bar."""
logger.info(f"Attempting to download file from {url} to {output_path}")
try:
response = requests.get(url, stream=True)
response.raise_for_status()
total_size = int(response.headers.get('content-length', 0))
block_size = 1024 # 1 Kibibyte
with open(output_path, 'wb') as file, tqdm(
desc=f"Downloading to {os.path.basename(output_path)}",
total=total_size,
unit='iB',
unit_scale=True,
unit_divisor=1024,
) as bar:
for data in response.iter_content(block_size):
size = file.write(data)
bar.update(size)
logger.info(f"Successfully downloaded file to {output_path}")
return True
except Exception as e:
logger.error(f"Error downloading from {url}: {str(e)}")
return False
def download_from_huggingface(model_name):
"""Download a model from Hugging Face."""
logger.info(f"Attempting to download model from Hugging Face: {model_name}")
# Extract repo_id and model filename
if "/" not in model_name:
logger.error("Invalid Hugging Face model format. Use 'username/repo:filename' or 'username/repo'")
return False
parts = model_name.split(":", 1)
repo_id = parts[0]
filename = parts[1] if len(parts) > 1 else None
# Default description
model_description = f"Model from Hugging Face repository: {repo_id}"
# Try to get model description
try:
from huggingface_hub import model_info
info = model_info(repo_id)
if info.cardData and "model-index" in info.cardData:
model_index = info.cardData["model-index"]
if model_index and len(model_index) > 0 and "name" in model_index[0]:
model_description = model_index[0].get('name', model_description)
logger.info(f"Fetched model description: {model_description}")
elif info.description:
# Extract first line or first 100 characters of description
description = info.description.split('\n')[0][:100]
if len(info.description) > 100:
description += "..."
model_description = description
logger.info(f"Fetched model description: {model_description}")
except Exception as e:
logger.warning(f"Could not fetch model description: {str(e)}")
# If no specific filename provided, try to find a .pt file
if filename is None:
try:
from huggingface_hub import list_repo_files
files = list_repo_files(repo_id)
pt_files = [f for f in files if f.endswith('.pt')]
if not pt_files:
logger.error(f"No .pt files found in {repo_id}")
return False
filename = pt_files[0]
logger.info(f"Found model file in repository: {filename}")
except Exception as e:
logger.error(f"Error listing files in repository: {str(e)}")
return False
# Create directory structure based on repo_id
models_dir = get_models_dir()
repo_dir = os.path.join(models_dir, repo_id.replace("/", os.sep))
os.makedirs(repo_dir, exist_ok=True)
logger.info(f"Ensured repository directory exists: {repo_dir}")
# Set the output path
output_filename = os.path.basename(filename)
output_path = os.path.join(repo_dir, output_filename)
# Update model_descriptions.json with the model description
model_key = f"{repo_id}/{output_filename}"
update_model_description(model_key, model_description)
# Check if model already exists
if os.path.exists(output_path):
logger.info(f"Model already exists at: {output_path}")
return True
# Construct the download URL
url = f"https://huggingface.co/{repo_id}/resolve/main/{filename}"
logger.info(f"Downloading from Hugging Face: {repo_id}/{filename}")
logger.info(f"Saving to: {output_path}")
return download_from_url(url, output_path)
def update_model_description(model_key, description):
"""Update the model_descriptions.json file with a new model description."""
logger.info(f"Updating model description for {model_key}")
models_dir = get_models_dir()
descriptions_file = os.path.join(models_dir, "model_descriptions.json")
# Load existing descriptions or create new if file doesn't exist
if os.path.exists(descriptions_file):
try:
with open(descriptions_file, 'r') as f:
descriptions = json.load(f)
logger.info(f"Loaded existing model descriptions from {descriptions_file}")
except json.JSONDecodeError:
logger.warning("Error reading model_descriptions.json, creating new file")
descriptions = {}
else:
logger.info("model_descriptions.json not found, creating new file")
descriptions = {}
# Update the description for this model
if model_key not in descriptions:
descriptions[model_key] = description
logger.info(f"Added description for {model_key} to model_descriptions.json")
elif descriptions[model_key] != description:
descriptions[model_key] = description
logger.info(f"Updated description for {model_key} in model_descriptions.json")
else:
logger.info(f"Description for {model_key} is already up to date")
# Save the updated descriptions
try:
with open(descriptions_file, 'w') as f:
json.dump(descriptions, f, indent=2, sort_keys=True)
logger.info(f"Saved updated model descriptions to {descriptions_file}")
except Exception as e:
logger.error(f"Error updating model_descriptions.json: {str(e)}")
def download_ultralytics_model(model_name):
"""Download a specific YOLO model from Ultralytics to the models directory."""
logger.info(f"Attempting to download Ultralytics model: {model_name}")
try:
# Get the models directory
models_dir = get_models_dir()
# Set the output path
output_path = os.path.join(models_dir, model_name)
# Check if model already exists in models directory
if os.path.exists(output_path):
logger.info(f"Model already exists at: {output_path}")
return True
# Set environment variable to use the models directory
os.environ["YOLO_CONFIG_DIR"] = models_dir
logger.info(f"Set YOLO_CONFIG_DIR environment variable to: {models_dir}")
# Import and download the model
from ultralytics import YOLO
logger.info(f"Downloading {model_name} model using Ultralytics library...")
# The model variable is used to trigger the download
model = YOLO(model_name) # noqa: F841
# Check if the model was downloaded to the expected location
if os.path.exists(output_path):
logger.info(f"Model successfully downloaded to expected path: {output_path}")
return True
# Check if model was downloaded to current directory
current_dir_model = Path(model_name)
if current_dir_model.exists():
logger.info(f"Model found in current directory: {current_dir_model.resolve()}")
try:
# Move the model to the models directory
shutil.move(str(current_dir_model), output_path)
logger.info(f"Model moved to: {output_path}")
return True
except Exception as e:
logger.warning(f"Could not move model from {current_dir_model.resolve()} to {output_path}: {e}")
logger.info(f"You can still use the model from: {current_dir_model.resolve()}")
return True
# If not found in expected locations,
# try to find it in ultralytics default location
possible_locations = [
Path.home() / ".ultralytics" / "weights" / model_name,
Path(os.path.dirname(os.path.abspath(__file__))) / "weights" / model_name,
]
# Try to import ultralytics to find its location
try:
import ultralytics
ultralytics_dir = Path(ultralytics.__file__).parent
possible_locations.append(ultralytics_dir / "weights" / model_name)
except ImportError:
logger.warning("Could not import ultralytics to find default weights location")
# Check each location
for loc in possible_locations:
if loc.exists():
logger.info(f"Model found at a different location: {loc.resolve()}")
try:
shutil.copy(loc, output_path)
logger.info(f"Model copied to: {output_path}")
return True
except Exception as e:
logger.warning(f"Could not copy model from {loc.resolve()} to {output_path}: {e}")
logger.error(
f"Please manually copy the model from {loc.resolve()} to {output_path}"
)
return False
logger.error(f"Failed to download model to expected path: {output_path}")
return False
except Exception as e:
logger.error(f"Error downloading model: {str(e)}")
return False
def download_model(model_name, source=None):
"""
Legacy function for backward compatibility.
Downloads a model from the specified source.
"""
logger.info(f"Legacy download_model called for {model_name} from source {source}")
if source == "ultralytics":
return download_ultralytics_model(model_name)
elif source == "huggingface":
return download_from_huggingface(model_name)
else:
logger.error(f"Unknown model source: {source}")
return False
def main():
logger.info(f"Running download_models script from {Path(__file__).resolve()}")
parser = argparse.ArgumentParser(
description="Download YOLO compatible models for offline use"
)
# Create a mutually exclusive group for the model sources
source_group = parser.add_mutually_exclusive_group(required=True)
source_group.add_argument(
"--ultralytics",
metavar="MODEL_NAME",
help="Download a model from Ultralytics (e.g., 'yolov8m.pt')"
)
source_group.add_argument(
"--huggingface",
metavar="REPO_ID[:FILENAME]",
help="Download a model from Hugging Face (e.g., 'username/repo:model.pt' or 'username/repo')"
)
args = parser.parse_args()
if args.ultralytics:
logger.info(f"Downloading Ultralytics model: {args.ultralytics}")
success = download_ultralytics_model(args.ultralytics)
elif args.huggingface:
logger.info(f"Downloading Hugging Face model: {args.huggingface}")
success = download_from_huggingface(args.huggingface)
else:
# This should never happen due to the required=True in the mutually_exclusive_group
logger.error("No model source specified")
success = False
if not success:
logger.error("Model download failed")
sys.exit(1)
else:
logger.info("Model download completed successfully")
logger.info("download_models script finished")
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/scripts/populate_telemetry_keys.py
================================================
#!/usr/bin/env python3
"""Build script to populate src/imagesorcery_mcp/telemetry_keys.py with API keys from environment variables or .env file."""
import logging
import os
import sys
from pathlib import Path
from typing import Dict
try:
from dotenv import load_dotenv
DOTENV_AVAILABLE = True
except ImportError:
DOTENV_AVAILABLE = False
# Setup logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)
# Telemetry keys file path
TELEMETRY_KEYS_FILE = Path('src/imagesorcery_mcp/telemetry_keys.py')
def get_telemetry_keys() -> Dict[str, str]:
"""Get telemetry API keys from environment variables and .env file.
Priority order:
1. Environment variables
2. .env file
Returns:
Dictionary containing telemetry API keys
"""
keys = {}
# Load .env file if available
if DOTENV_AVAILABLE:
env_file = Path('.env')
if env_file.exists():
load_dotenv(env_file)
logger.debug("Loaded .env file")
# Get Amplitude API key (env var takes priority)
amplitude_key = os.environ.get('IMAGESORCERY_AMPLITUDE_API_KEY', '')
keys['AMPLITUDE_API_KEY'] = amplitude_key
if amplitude_key:
logger.debug("Found IMAGESORCERY_AMPLITUDE_API_KEY")
# Get PostHog API key (env var takes priority)
posthog_key = os.environ.get('IMAGESORCERY_POSTHOG_API_KEY', '')
keys['POSTHOG_API_KEY'] = posthog_key
if posthog_key:
logger.debug("Found IMAGESORCERY_POSTHOG_API_KEY")
return keys
def write_telemetry_keys_file(keys: Dict[str, str]) -> bool:
"""Write the telemetry_keys.py file with provided keys.
Args:
keys: Dict with 'AMPLITUDE_API_KEY' and 'POSTHOG_API_KEY'
Returns:
True if successful, False otherwise
"""
try:
content = f'''# Auto-generated telemetry keys module.
# This file is intended to be updated by build scripts (populate_telemetry_keys.py)
# and cleared by clear_telemetry_keys.py. Keep values as empty strings in the repo.
#
# WARNING: Do NOT commit real production keys to the repository.
AMPLITUDE_API_KEY = "{keys.get("AMPLITUDE_API_KEY", "")}"
POSTHOG_API_KEY = "{keys.get("POSTHOG_API_KEY", "")}"
'''
TELEMETRY_KEYS_FILE.write_text(content)
logger.info(f"Wrote telemetry keys to {TELEMETRY_KEYS_FILE}")
return True
except Exception as e:
logger.error(f"Failed to write telemetry keys file: {e}")
return False
def main():
"""Main entry point to populate telemetry_keys.py."""
logger.info("Starting telemetry keys population process...")
# Optionally skip
if os.environ.get('SKIP_TELEMETRY_POPULATION', '').lower() in ('true', '1', 'yes'):
logger.info("Telemetry population skipped via SKIP_TELEMETRY_POPULATION environment variable")
return 0
keys = get_telemetry_keys()
found_keys = [k for k, v in keys.items() if v]
empty_keys = [k for k, v in keys.items() if not v]
if found_keys:
logger.info(f"Found telemetry keys: {', '.join(found_keys)}")
if empty_keys:
logger.info(f"Empty telemetry keys (will remain empty): {', '.join(empty_keys)}")
if write_telemetry_keys_file(keys):
logger.info("Telemetry keys population completed successfully")
return 0
else:
logger.error("Telemetry keys population failed")
return 1
if __name__ == '__main__':
sys.exit(main())
================================================
FILE: src/imagesorcery_mcp/scripts/post_install.py
================================================
#!/usr/bin/env python3
"""
Script to run post-installation tasks for imagesorcery-mcp.
This script creates the models directory, model descriptions file,
and downloads default models.
"""
import os
import subprocess # Ensure subprocess is imported
import sys # Ensure sys is imported
import uuid
from pathlib import Path
# Import the central logger
from imagesorcery_mcp.logging_config import logger
# For loading .env file
try:
from dotenv import load_dotenv
DOTENV_AVAILABLE = True
except ImportError:
DOTENV_AVAILABLE = False
logger.warning("python-dotenv not available. .env file will not be loaded automatically.")
from imagesorcery_mcp.scripts.create_model_descriptions import create_model_descriptions
from imagesorcery_mcp.scripts.download_clip import download_clip_model
from imagesorcery_mcp.scripts.download_models import download_ultralytics_model
def install_clip():
"""Install CLIP from the Ultralytics GitHub repository."""
logger.info("Installing CLIP package from GitHub...")
try:
subprocess.run(
[sys.executable, "-m", "pip", "install", "git+https://github.com/ultralytics/CLIP.git"],
check=True,
stdout=sys.stdout, # Can be replaced with subprocess.PIPE if console output is not needed
stderr=subprocess.PIPE # Capture stderr to analyze it
)
logger.info("CLIP package installed successfully")
print("✅ CLIP package installed successfully")
return True
except subprocess.CalledProcessError as e:
logger.error(f"Failed to install CLIP: {e}")
error_message = f"❌ Failed to install CLIP package: {e}"
detailed_warning = ""
if e.stderr:
try:
stderr_output = e.stderr.decode(errors='ignore')
logger.debug(f"Captured stderr from CLIP installation attempt: {stderr_output}")
if "No module named pip" in stderr_output:
detailed_warning = (
"\n Hint: The Python environment (potentially created by 'uvx' or a minimal 'uv venv') might be missing 'pip'."
"\n To ensure 'clip' package installation for full functionality (e.g., text prompts in 'find' tool):"
"\n 1. Recommended: Use 'python -m venv' to create a virtual environment, then 'pip install imagesorcery-mcp' and 'imagesorcery-mcp --post-install'."
"\n 2. Or, manually install 'clip' into your active environment: pip install git+https://github.com/ultralytics/CLIP.git"
"\n (If using 'uv venv', you might need: uv pip install git+https://github.com/ultralytics/CLIP.git)"
)
except Exception as decode_exc:
logger.error(f"Error while decoding/processing stderr for CLIP install: {decode_exc}")
print(error_message + detailed_warning)
return False
except FileNotFoundError: # Handle case where pip or python executable is not found
logger.error("Failed to install CLIP: Python executable or pip not found.")
print("❌ Failed to install CLIP package: Python executable or pip not found. Ensure Python is in PATH and pip is installed.")
return False
def create_config_file():
"""Ensure config.toml exists, create with default values if needed."""
config_file = Path("config.toml")
if config_file.exists():
logger.info(f"⏩ Configuration file already exists: {config_file}")
return True
logger.info("Creating config.toml using configuration system defaults")
# The config manager will create it with defaults if it doesn't exist
from imagesorcery_mcp.config import get_config_manager
get_config_manager()
print(f"✅ Configuration file created with default values: {config_file}")
return True
def create_user_id_file():
"""Ensure .user_id file exists in the project root, create a new UUID if needed."""
user_id_file = Path(".user_id")
if user_id_file.exists():
logger.info(f"⏩ .user_id file already exists: {user_id_file}")
return True
logger.info("Creating .user_id file for telemetry...")
try:
user_id = str(uuid.uuid4())
user_id_file.write_text(user_id)
logger.info(f"Generated new .user_id file with user_id: {user_id_file}")
print(f"✅ .user_id file created: {user_id_file}")
return True
except Exception as e:
logger.error(f"Failed to create .user_id file: {e}")
print(f"❌ Failed to create .user_id file: {e}")
return False
def run_post_install():
"""Run all post-installation tasks."""
logger.info(f"Running post-installation tasks from {Path(__file__).resolve()}...")
# Get API keys from environment variables (for Step 4)
amplitude_api_key = os.environ.get("IMAGESORCERY_AMPLITUDE_API_KEY", "")
posthog_api_key = os.environ.get("IMAGESORCERY_POSTHOG_API_KEY", "")
logger.debug(f"Amplitude API key from environment: {'*' * 8 if amplitude_api_key else 'Not set'}")
logger.debug(f"PostHog API key from environment: {'*' * 8 if posthog_api_key else 'Not set'}")
# Create configuration file
logger.info("Creating configuration file...")
config_created = create_config_file()
if not config_created:
logger.error("Failed to create configuration file")
return False
# Create user ID file for telemetry
logger.info("Creating user ID file...")
user_id_created = create_user_id_file()
if not user_id_created:
logger.error("Failed to create user ID file")
# Do not return False here, as telemetry is not critical for core functionality
# and other post-install tasks should still proceed.
# Create models directory
models_dir = Path("models").resolve()
os.makedirs(models_dir, exist_ok=True)
logger.info(f"Created models directory: {models_dir}")
# Create model descriptions file
logger.info("Creating model descriptions file...")
descriptions_file = create_model_descriptions()
if descriptions_file:
logger.info(f"Model descriptions file created at: {descriptions_file}")
else:
logger.error("Failed to create model descriptions file")
return False
# Download default Ultralytics YOLO models
default_models = [
"yoloe-11l-seg-pf.pt",
"yoloe-11s-seg-pf.pt",
"yoloe-11l-seg.pt",
"yoloe-11s-seg.pt"
]
logger.info("Downloading default Ultralytics YOLO models...")
for model in default_models:
logger.info(f"Downloading {model}...")
success = download_ultralytics_model(model)
if not success:
logger.error(f"Failed to download model: {model}")
return False
print("✅ Ultralytics YOLO models download completed successfully")
# Install CLIP package
logger.info("Installing CLIP package for text prompts...")
clip_installed_successfully = install_clip()
if not clip_installed_successfully:
logger.warning("CLIP Python package installation failed. The 'find' tool's text prompt functionality might be limited or unavailable.")
print("⚠️ WARNING: CLIP Python package installation failed. Text prompt features of the 'find' tool may not work.")
print(" Models for CLIP will still be downloaded. If you need this functionality, please try installing the CLIP package manually:")
print(" pip install git+https://github.com/ultralytics/CLIP.git")
# We continue with the rest of the post-installation, especially downloading CLIP models.
# Download CLIP model
logger.info("Downloading CLIP model for text prompts...")
try:
# Download the CLIP model file
success = download_clip_model()
if not success:
logger.error("Failed to download CLIP model")
return False
except Exception as e:
logger.error(f"Error downloading CLIP model: {str(e)}")
return False
print("✅ CLIP model download completed successfully")
logger.info("Post-installation tasks completed successfully!")
print("✅ Post-installation tasks completed successfully!")
return True
def main():
"""Main entry point for the post_install script."""
# Load .env file if available
if DOTENV_AVAILABLE:
env_file = Path(".env")
if env_file.exists():
load_dotenv()
logger.info(f"Loaded environment variables from {env_file}")
else:
logger.info(".env file not found, skipping dotenv loading")
else:
logger.info("python-dotenv not available, skipping .env file loading")
logger.info(f"Starting post-installation process from {Path(__file__).resolve()}")
success = run_post_install()
if not success:
logger.error("Post-installation process failed")
sys.exit(1)
logger.info("Post-installation process completed")
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/server.py
================================================
import argparse
import os
import sys
from pathlib import Path
from fastmcp import FastMCP
from fastmcp.server.middleware.error_handling import ErrorHandlingMiddleware
from imagesorcery_mcp.logging_config import logger
# Change to project root directory
project_root = Path(__file__).parent.parent.parent
os.chdir(project_root)
logger.info(f"Changed current working directory to: {project_root}")
# Load environment variables from .env if python-dotenv is available (so handlers see keys on import)
try:
from dotenv import load_dotenv # type: ignore
env_file = project_root / ".env"
if env_file.exists():
load_dotenv(env_file)
logger.info(f"Loaded environment variables from: {env_file}")
else:
logger.debug(".env file not found, skipping dotenv loading")
except Exception:
logger.debug("python-dotenv not available, skipping .env loading")
from imagesorcery_mcp.middlewares.path_access import PathAccessMiddleware # noqa: E402
from imagesorcery_mcp.middlewares.telemetry import TelemetryMiddleware # noqa: E402
from imagesorcery_mcp.middlewares.validation import ( # noqa: E402
ImprovedValidationMiddleware, # noqa: E402
)
from imagesorcery_mcp.prompts import remove_background # noqa: E402
from imagesorcery_mcp.resources import models # noqa: E402
from imagesorcery_mcp.tools import ( # noqa: E402
blur,
change_color,
config,
crop,
detect,
draw_arrows,
draw_circle,
draw_lines,
draw_rectangle,
draw_text,
fill,
find,
metainfo,
ocr,
overlay,
resize,
rotate,
)
# Create a module-level mcp instance for backward compatibility with tests
mcp = FastMCP(
name="imagesorcery-mcp",
instructions=(
"An MCP server providing tools for image processing operations. "
"Input images must be specified with full paths."
)
)
error_middleware = ErrorHandlingMiddleware(
logger=logger,
include_traceback=True,
transform_errors=True,
)
mcp.add_middleware(error_middleware)
telemetry_middleware = TelemetryMiddleware(logger=logger)
mcp.add_middleware(telemetry_middleware)
path_access_middleware = PathAccessMiddleware(logger=logger)
mcp.add_middleware(path_access_middleware)
validation_middleware = ImprovedValidationMiddleware(logger=logger)
mcp.add_middleware(validation_middleware)
# Register tools with the module-level mcp instance
blur.register_tool(mcp)
change_color.register_tool(mcp)
config.register_tool(mcp)
crop.register_tool(mcp)
detect.register_tool(mcp)
draw_arrows.register_tool(mcp)
draw_circle.register_tool(mcp)
draw_lines.register_tool(mcp)
draw_rectangle.register_tool(mcp)
draw_text.register_tool(mcp)
fill.register_tool(mcp)
find.register_tool(mcp)
metainfo.register_tool(mcp)
ocr.register_tool(mcp)
overlay.register_tool(mcp)
resize.register_tool(mcp)
rotate.register_tool(mcp)
# Register resources
models.register_resource(mcp)
# Register prompts
remove_background.register_prompt(mcp)
def parse_arguments():
"""Parse command line arguments."""
parser = argparse.ArgumentParser(description="ImageSorcery MCP Server")
parser.add_argument(
"--post-install",
action="store_true",
help="Run post-installation tasks and exit"
)
parser.add_argument(
"--transport",
type=str,
default="stdio",
choices=["stdio", "streamable-http", "sse"],
help="Transport protocol to use (default: stdio)"
)
parser.add_argument(
"--host",
type=str,
default="127.0.0.1",
help="Host to bind to when using HTTP-based transports (default: 127.0.0.1)"
)
parser.add_argument(
"--port",
type=int,
default=8000,
help="Port to bind to when using HTTP-based transports (default: 8000)"
)
parser.add_argument(
"--path",
type=str,
default="/mcp",
help="Path for the MCP endpoint when using HTTP-based transports (default: /mcp)"
)
return parser.parse_args()
def main():
"""Main entry point for the server."""
args = parse_arguments()
logger.info("Starting 🪄 ImageSorcery MCP server setup")
# Get version from package metadata
try:
from importlib.metadata import version
package_version = version("imagesorcery-mcp")
print(f"ImageSorcery MCP Version: {package_version}")
except Exception as e:
logger.error(f"Could not read version from package metadata: {e}")
print("ImageSorcery MCP Version: unknown")
# If --post-install flag is provided, run post-installation tasks and exit
if args.post_install:
logger.info("Post-installation flag detected, running post-installation tasks")
try:
from imagesorcery_mcp.scripts.post_install import run_post_install
success = run_post_install()
if not success:
logger.error("Post-installation tasks failed")
sys.exit(1)
logger.info("Post-installation tasks completed successfully")
sys.exit(0)
except Exception as e:
logger.error(f"Error during post-installation: {str(e)}")
sys.exit(1)
# For actual server execution, we'll use the global mcp instance
logger.info(f"Starting MCP server with transport: {args.transport}")
fastmcp_log_level = os.getenv("FASTMCP_LOG_LEVEL", "DEBUG")
# Configure transport with appropriate parameters
if args.transport in ["streamable-http", "sse"]:
mcp.run(
transport=args.transport,
host=args.host,
port=args.port,
path=args.path,
log_level=fastmcp_log_level
)
else:
# Use default stdio transport
mcp.run(log_level=fastmcp_log_level)
if __name__ == "__main__":
main()
================================================
FILE: src/imagesorcery_mcp/telemetry_amplitude.py
================================================
import logging
import os
from typing import Any, Dict
from amplitude import Amplitude, BaseEvent
from imagesorcery_mcp.telemetry_keys import AMPLITUDE_API_KEY
class AmplitudeHandler:
"""Handles sending telemetry events to Amplitude."""
def __init__(self, logger: logging.Logger | None = None):
self.logger = logger or logging.getLogger("imagesorcery.telemetry.amplitude")
self.logger.debug("Initializing Amplitude handler")
api_key = self._get_api_key()
if not api_key:
self.amplitude = None
self.logger.warning("Amplitude API key is not set. Amplitude telemetry will be disabled.")
self.logger.debug("Amplitude telemetry disabled due to missing API key")
else:
self.amplitude = Amplitude(api_key)
self.logger.info("Amplitude handler initialized.")
self.logger.debug(f"Amplitude handler enabled with API key: {api_key}")
def _get_api_key(self) -> str:
"""Get Amplitude API key.
Priority:
1. Environment variable IMAGESORCERY_AMPLITUDE_API_KEY
2. Value from src/imagesorcery_mcp/telemetry_keys.py (AMPLITUDE_API_KEY)
"""
return os.environ.get('IMAGESORCERY_AMPLITUDE_API_KEY', AMPLITUDE_API_KEY)
def track_event(self, event_data: Dict[str, Any]):
"""
Tracks an event using Amplitude.
Args:
event_data: A dictionary containing event properties.
Expected keys: 'user_id', 'action_type', 'identifier', 'status', etc.
"""
if not self.amplitude:
self.logger.debug("Amplitude telemetry disabled, skipping event tracking")
return
# Skip telemetry if DISABLE_TELEMETRY environment variable is set
if os.environ.get('DISABLE_TELEMETRY', '').lower() in ('true', '1', 'yes'):
self.logger.debug("Amplitude telemetry disabled via environment variable")
return
try:
user_id = event_data.get("user_id", "anonymous")
event_type = f"mcp_{event_data.get('action_type', 'unknown_action')}"
self.logger.debug(f"Preparing to track Amplitude event: {event_type} for user {user_id}")
self.logger.debug(f"Event data: {event_data}")
event = BaseEvent(event_type=event_type, user_id=user_id, event_properties=event_data)
self.amplitude.track(event)
self.logger.debug(f"Successfully tracked Amplitude event: {event_type} for user {user_id}")
except Exception as e:
self.logger.error(f"Failed to send event to Amplitude: {e}", exc_info=True)
self.logger.debug(f"Event data that failed: {event_data}")
# Global instance to be used by other modules
amplitude_handler = AmplitudeHandler()
================================================
FILE: src/imagesorcery_mcp/telemetry_keys.py
================================================
# Auto-generated telemetry keys module.
# This file is intended to be updated by build scripts (populate_telemetry_keys.py)
# and cleared by clear_telemetry_keys.py. Keep values as empty strings in the repo.
#
# WARNING: Do NOT commit real production keys to the repository.
AMPLITUDE_API_KEY = ""
POSTHOG_API_KEY = ""
================================================
FILE: src/imagesorcery_mcp/telemetry_posthog.py
================================================
import logging
import os
from typing import Any, Dict
from posthog import Posthog
from imagesorcery_mcp.telemetry_keys import POSTHOG_API_KEY
POSTHOG_HOST = "https://us.i.posthog.com"
class PostHogHandler:
"""Handles sending telemetry events to PostHog."""
def __init__(self, logger: logging.Logger | None = None):
self.logger = logger or logging.getLogger("imagesorcery.telemetry.posthog")
self.logger.debug("Initializing PostHog handler")
api_key = self._get_api_key()
if not api_key:
self.enabled = False
self.logger.warning("PostHog API key is not set. PostHog telemetry will be disabled.")
self.logger.debug("PostHog telemetry disabled due to missing API key")
else:
self.enabled = True
self.posthog_client = Posthog(api_key, host=POSTHOG_HOST)
self.logger.info("PostHog handler initialized.")
self.logger.debug(f"PostHog handler enabled with API key: {api_key}")
def _get_api_key(self) -> str:
"""Get PostHog API key.
Priority:
1. Environment variable IMAGESORCERY_POSTHOG_API_KEY
2. Value from src/imagesorcery_mcp/telemetry_keys.py (POSTHOG_API_KEY)
"""
return os.environ.get('IMAGESORCERY_POSTHOG_API_KEY', POSTHOG_API_KEY)
def track_event(self, event_data: Dict[str, Any]):
"""
Tracks an event using PostHog.
Args:
event_data: A dictionary containing event properties.
Expected keys: 'user_id', 'action_type', 'identifier', 'status', etc.
"""
if not self.enabled:
self.logger.debug("PostHog telemetry disabled, skipping event tracking")
return
# Skip telemetry if DISABLE_TELEMETRY environment variable is set
if os.environ.get('DISABLE_TELEMETRY', '').lower() in ('true', '1', 'yes'):
self.logger.debug("Posthog telemetry disabled via environment variable")
return
try:
user_id = event_data.get("user_id", "anonymous")
event_type = f"mcp_{event_data.get('action_type', 'unknown_action')}"
self.logger.debug(f"Preparing to track PostHog event: {event_type} for user {user_id}")
self.logger.debug(f"Event data: {event_data}")
self.posthog_client.capture(event_type, distinct_id=user_id, properties=event_data)
self.logger.debug(f"Successfully tracked PostHog event: {event_type} for user {user_id}")
except Exception as e:
self.logger.error(f"Failed to send event to PostHog: {e}", exc_info=True)
self.logger.debug(f"Event data that failed: {event_data}")
posthog_handler = PostHogHandler()
================================================
FILE: src/imagesorcery_mcp/tools/README.md
================================================
# 🪄 ImageSorcery MCP Server Tools Documentation
This document provides detailed information about each tool available in the 🪄 ImageSorcery MCP Server, including their arguments, return values, and examples of how to call them using a Claude client.
## Rules
These rules apply to all contributors: humans and AI.
- Register tools by defining a `register_tool` function in each tool's module. This function should accept a `FastMCP` instance and use the `@mcp.tool()` decorator to register the tool function with the server. See `src/imagesorcery_mcp/server.py` for how tools are imported and registered, and individual tool files like `src/imagesorcery_mcp/tools/crop.py` for examples of the `register_tool` function implementation.
- All tools should use Bounding Box format for image coordinates, e.g. `[x1, y1, x2, y2]` where `(x1, y1)` is the top-left corner and `(x2, y2)` is the bottom-right corner.
- All file paths specified in tool arguments (e.g., `input_path`, `output_path`) must be **full paths**, not relative paths. For example, use `/home/user/images/my_image.jpg` instead of `my_image.jpg`.
- When adding new tools, ensure they are listed in alphabetical order in READMEs and in the server registration.
## Available Tools
### `blur`
Blurs specified rectangular or polygonal areas of an image using OpenCV. This tool allows blurring multiple areas of an image with customizable blur strength. Each area can be a rectangle defined by a bounding box with coordinates `[x1, y1, x2, y2]` or a polygon defined by a list of points (in the same format as returned by `detect` or `find`). The `blur_strength` parameter controls the intensity of the blur effect. Higher values result in stronger blur. It must be an odd number (default is 15). If `invert_areas` is True, the tool will blur everything EXCEPT the specified areas.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `areas` (array): List of areas to blur. Each item is a dictionary that must contain either:
- A rectangle: `x1`, `y1`, `x2`, `y2` (integers).
- A polygon: `polygon` (a list of points, e.g., `[[x1, y1], [x2, y2], ...]`).
- Optionally, each dictionary can also contain `blur_strength` (integer, default is 15).
- **Optional arguments:**
- `invert_areas` (boolean): If True, blurs everything EXCEPT the specified areas. Useful for background blurring. Default is False.
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_blurred' suffix.
- **Returns:** string (path to the image with blurred areas)
**Example Claude Request:**
```
Blur the rectangular area from (150, 100) to (250, 200) and a triangular area in my image 'test_image.png' and save it as 'output.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "blur",
"arguments": {
"input_path": "/home/user/images/test_image.png",
"areas": [
{
"x1": 150,
"y1": 100,
"x2": 250,
"y2": 200,
"blur_strength": 21
},
{
"polygon": [[300, 50], [350, 50], [325, 150]],
"blur_strength": 31
}
],
"output_path": "/home/user/images/output.png"
}
}
```
**Example Claude Request (Background Blurring):**
```
Blur the background of 'my_image.png' by blurring everything outside the rectangle (100, 100) to (300, 300) with a blur strength of 25, and save it as 'object_focused.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "blur",
"arguments": {
"input_path": "/home/user/images/my_image.png",
"areas": [
{
"x1": 100,
"y1": 100,
"x2": 300,
"y2": 300,
"blur_strength": 25
}
],
"invert_areas": true,
"output_path": "/home/user/images/object_focused.png"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/output.png"
}
```
### `change_color`
Changes the color palette of an image. This tool applies a predefined color transformation to an image. Currently supported palettes are 'grayscale' and 'sepia'.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `palette` (string): The color palette to apply. Currently supports 'grayscale' and 'sepia'.
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with a suffix based on the palette (e.g., '_grayscale').
- **Returns:** string (path to the image with the new color palette)
**Example Claude Request:**
```
Convert my image 'test_image.png' to sepia and save it as 'output.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "change_color",
"arguments": {
"input_path": "/home/user/images/test_image.png",
"palette": "sepia",
"output_path": "/home/user/images/output.png"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/output.png"
}
```
### `config`
View or update ImageSorcery MCP configuration settings. This tool allows you to view current configuration values, update them for the current session or persistently, and reset runtime overrides. Configuration values control default parameters for other tools like detection confidence thresholds, blur strength, drawing colors, etc.
- **Required arguments:**
- `action` (string): Action to perform. Must be one of:
- `"get"`: View configuration values
- `"set"`: Update configuration values
- `"reset"`: Reset runtime configuration overrides
- **Optional arguments:**
- `key` (string): Configuration key to get/set using dot notation (e.g., "detection.confidence_threshold", "blur.strength"). Leave empty to get/set entire config.
- `value` (string|number|boolean): Value to set (only used with action="set")
- `persist` (boolean): Whether to persist changes to config file (only used with action="set"). Default is False.
- **Returns:** Dictionary containing the requested configuration data or update result
**Available Configuration Keys:**
- `detection.confidence_threshold`: Default confidence threshold for object detection (0.0-1.0)
- `detection.default_model`: Default model for detection tool
- `find.confidence_threshold`: Default confidence threshold for object finding (0.0-1.0)
- `find.default_model`: Default model for find tool
- `blur.strength`: Default blur strength (must be odd number)
- `text.font_scale`: Default font scale for text drawing
- `drawing.color`: Default color in BGR format [Blue, Green, Red]
- `drawing.thickness`: Default line thickness
- `ocr.language`: Default language code for OCR
- `resize.interpolation`: Default interpolation method
**Example Claude Requests:**
```
Show me the current configuration
```
```
What is the current detection confidence threshold?
```
```
Set the default blur strength to 21
```
```
Set the detection confidence to 0.8 and save it to the config file
```
```
Reset all configuration overrides
```
**Example Tool Calls (JSON):**
Get all configuration:
```json
{
"name": "config",
"arguments": {
"action": "get"
}
}
```
Get specific configuration value:
```json
{
"name": "config",
"arguments": {
"action": "get",
"key": "detection.confidence_threshold"
}
}
```
Set configuration value (runtime only):
```json
{
"name": "config",
"arguments": {
"action": "set",
"key": "blur.strength",
"value": 21,
"persist": false
}
}
```
Set and persist configuration value:
```json
{
"name": "config",
"arguments": {
"action": "set",
"key": "detection.confidence_threshold",
"value": 0.8,
"persist": true
}
}
```
Reset runtime overrides:
```json
{
"name": "config",
"arguments": {
"action": "reset"
}
}
```
### `crop`
Crops an image using OpenCV's NumPy slicing approach.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `x1` (integer): X-coordinate of the top-left corner
- `y1` (integer): Y-coordinate of the top-left corner
- `x2` (integer): X-coordinate of the bottom-right corner
- `y2` (integer): Y-coordinate of the bottom-right corner
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_cropped' suffix.
- **Returns:** string (path to the cropped image)
**Example Claude Request:**
```
Crop my image 'input.png' using bounding box [10, 10, 200, 200] and save it as 'cropped.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "crop",
"arguments": {
"input_path": "/home/user/images/input.png",
"x1": 10,
"y1": 10,
"x2": 200,
"y2": 200,
"output_path": "/home/user/images/cropped.png"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/cropped.png"
}
```
### `detect`
Detects objects in an image using models from Ultralytics. This tool requires pre-downloaded models. Use the `download-yolo-models` command to download models before using this tool. If objects aren't common, consider using a specialized model. This tool can optionally return segmentation masks (as PNG files) or polygons if a segmentation model (e.g., one ending in '-seg.pt') is used.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- **Optional arguments:**
- `confidence` (float): Confidence threshold for detection (0.0 to 1.0). Default is 0.75
- `model_name` (string): Model name to use for detection (e.g., 'yoloe-11s-seg.pt', 'yolov8m.pt'). Default is 'yoloe-11l-seg-pf.pt'
- `return_geometry` (boolean): If True, returns segmentation masks or polygons for detected objects. Default is False.
- `geometry_format` (string): Format for returned geometry: 'mask' or 'polygon'. Default is 'mask'. When 'mask' is selected, a PNG file is created for each mask and its path is returned.
- **Returns:** dictionary containing:
- `image_path`: Path to the input image
- `detections`: List of detected objects, each with:
- `class`: Class name of the detected object
- `confidence`: Confidence score (0.0 to 1.0)
- `bbox`: Bounding box coordinates [x1, y1, x2, y2]
- `mask_path` (optional): Path to the PNG file for the object's mask. Included if `return_geometry` is True and `geometry_format` is 'mask'.
- `polygon` (optional): A list of points `[x, y]` describing the object's contour. Included if `return_geometry` is True and `geometry_format` is 'polygon'.
**Example Claude Request:**
```
Detect objects in my image 'photo.jpg' with a confidence threshold of 0.4
```
**Example Tool Call (JSON):**
```json
{
"tool_code": "imagesorcery-mcp",
"name": "detect",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"confidence": 0.4,
"return_geometry": true,
"geometry_format": "polygon"
}
}
```
**Example Response (JSON):**
```json
{
"result": {
"image_path": "/home/user/images/photo.jpg",
"detections": [
{
"class": "person",
"confidence": 0.92,
"bbox": [10.5, 20.3, 100.2, 200.1],
"mask_path": "/home/user/images/photo_mask_0.png"
},
{
"class": "car",
"confidence": 0.85,
"bbox": [150.2, 30.5, 250.1, 120.7],
"mask_path": "/home/user/images/photo_mask_1.png"
}
]
}
}
```
### `draw_arrows`
Draws arrows on an image using OpenCV. This tool allows adding multiple arrows to an image with customizable start and end points, color, thickness, and tip length.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `arrows` (array): List of arrow items to draw. Each item should have:
- `x1` (integer): X-coordinate of the arrow's start point
- `y1` (integer): Y-coordinate of the arrow's start point
- `x2` (integer): X-coordinate of the arrow's end point
- `y2` (integer): Y-coordinate of the arrow's end point
- `color` (array, optional): Color in BGR format [B,G,R]. Default is [0,0,0] (black)
- `thickness` (integer, optional): Line thickness. Default is 1
- `tip_length` (float, optional): Length of the arrow tip relative to the arrow length. Default is 0.1
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_with_arrows' suffix
- **Returns:** string (path to the image with drawn arrows)
**Example Claude Request:**
```
Draw a red arrow from (50,50) to (150,100) and a blue arrow from (200,150) to (300,250) with a tip length of 0.15 on my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "draw_arrows",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"arrows": [
{
"x1": 50,
"y1": 50,
"x2": 150,
"y2": 100,
"color": [0, 0, 255],
"thickness": 2
},
{
"x1": 200,
"y1": 150,
"x2": 300,
"y2": 250,
"color": [255, 0, 0],
"thickness": 3,
"tip_length": 0.15
}
],
"output_path": "/home/user/images/photo_with_arrows.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/photo_with_arrows.jpg"
}
```
### `draw_circles`
Draws circles on an image using OpenCV. This tool allows adding multiple circles to an image with customizable center, radius, color, thickness, and fill option. Each circle is defined by its center coordinates (center_x, center_y) and radius.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `circles` (array): List of circle items to draw. Each item should have:
- `center_x` (integer): X-coordinate of the circle's center
- `center_y` (integer): Y-coordinate of the circle's center
- `radius` (integer): Radius of the circle
- `color` (array, optional): Color in BGR format [B,G,R]. Default is [0,0,0] (black)
- `thickness` (integer, optional): Line thickness. Default is 1. Use -1 for a filled circle.
- `filled` (boolean, optional): Whether to fill the circle. Default is false. If true, thickness is set to -1.
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_with_circles' suffix
- **Returns:** string (path to the image with drawn circles)
**Example Claude Request:**
```
Draw a red circle with center (100,100) and radius 50, and a filled blue circle with center (250,200) and radius 30 on my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "draw_circles",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"circles": [
{
"center_x": 100,
"center_y": 100,
"radius": 50,
"color": [0, 0, 255],
"thickness": 2
},
{
"center_x": 250,
"center_y": 200,
"radius": 30,
"color": [255, 0, 0],
"filled": true
}
],
"output_path": "/home/user/images/photo_with_circles.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/photo_with_circles.jpg"
}
```
### `draw_lines`
Draws lines on an image using OpenCV. This tool allows adding multiple lines to an image with customizable start and end points, color, and thickness.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `lines` (array): List of line items to draw. Each item should have:
- `x1` (integer): X-coordinate of the line's start point
- `y1` (integer): Y-coordinate of the line's start point
- `x2` (integer): X-coordinate of the line's end point
- `y2` (integer): Y-coordinate of the line's end point
- `color` (array, optional): Color in BGR format [B,G,R]. Default is [0,0,0] (black)
- `thickness` (integer, optional): Line thickness. Default is 1
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_with_lines' suffix
- **Returns:** string (path to the image with drawn lines)
**Example Claude Request:**
```
Draw a red line from (50,50) to (150,100) and a blue line from (200,150) to (300,250) on my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "draw_lines",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"lines": [
{
"x1": 50,
"y1": 50,
"x2": 150,
"y2": 100,
"color": [0, 0, 255],
"thickness": 2
},
{
"x1": 200,
"y1": 150,
"x2": 300,
"y2": 250,
"color": [255, 0, 0],
"thickness": 3
}
],
"output_path": "/home/user/images/photo_with_lines.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/photo_with_lines.jpg"
}
```
### `draw_rectangles`
Draws rectangles on an image using OpenCV. This tool allows adding multiple rectangles to an image with customizable position, color, thickness, and fill option. Each rectangle is defined by two points: (x1, y1) for the top-left corner and (x2, y2) for the bottom-right corner.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `rectangles` (array): List of rectangle items to draw. Each item should have:
- `x1` (integer): X-coordinate of the top-left corner
- `y1` (integer): Y-coordinate of the top-left corner
- `x2` (integer): X-coordinate of the bottom-right corner
- `y2` (integer): Y-coordinate of the bottom-right corner
- `color` (array, optional): Color in BGR format [B,G,R]. Default is [0,0,0] (black)
- `thickness` (integer, optional): Line thickness. Default is 1
- `filled` (boolean, optional): Whether to fill the rectangle. Default is false
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_with_rectangles' suffix
- **Returns:** string (path to the image with drawn rectangles)
**Example Claude Request:**
```
Draw a red rectangle from (50,50) to (150,100) and a filled blue rectangle from (200,150) to (300,250) on my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "draw_rectangles",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"rectangles": [
{
"x1": 50,
"y1": 50,
"x2": 150,
"y2": 100,
"color": [0, 0, 255],
"thickness": 2
},
{
"x1": 200,
"y1": 150,
"x2": 300,
"y2": 250,
"color": [255, 0, 0],
"thickness": 3,
"filled": true
}
],
"output_path": "/home/user/images/photo_with_rectangles.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/photo_with_rectangles.jpg"
}
```
### `draw_texts`
Draws text on an image using OpenCV. This tool allows adding multiple text elements to an image with customizable position, font, size, color, and thickness.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `texts` (array): List of text items to draw. Each item should have:
- `text` (string): The text to draw
- `x` (integer): X-coordinate for the text position
- `y` (integer): Y-coordinate for the text position
- `font_scale` (float, optional): Scale factor for the font. Default is 1.0
- `color` (array, optional): Color in BGR format [B,G,R]. Default is [0,0,0] (black)
- `thickness` (integer, optional): Line thickness. Default is 1
- `font_face` (string, optional): Font face to use. Default is "FONT_HERSHEY_SIMPLEX". Available options: 'FONT_HERSHEY_SIMPLEX', 'FONT_HERSHEY_PLAIN', 'FONT_HERSHEY_DUPLEX', 'FONT_HERSHEY_COMPLEX', 'FONT_HERSHEY_TRIPLEX', 'FONT_HERSHEY_COMPLEX_SMALL', 'FONT_HERSHEY_SCRIPT_SIMPLEX', 'FONT_HERSHEY_SCRIPT_COMPLEX'
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_with_text' suffix
- **Returns:** string (path to the image with drawn text)
**Example Claude Request:**
```
Add text 'Hello World' at position (50,50) and 'Copyright 2023' at the bottom right corner of my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "draw_texts",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"texts": [
{
"text": "Hello World",
"x": 50,
"y": 50,
"font_scale": 1.0,
"color": [0, 0, 255],
"thickness": 2
},
{
"text": "Copyright 2023",
"x": 100,
"y": 150,
"font_scale": 2.0,
"color": [255, 0, 0],
"thickness": 3,
"font_face": "FONT_HERSHEY_COMPLEX"
}
],
"output_path": "/home/user/images/photo_with_text.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/photo_with_text.jpg"
}
```
### `fill`
Fills specified rectangular, polygonal, or mask-based areas of an image with a color and opacity, or makes them transparent. This tool allows filling multiple areas of an image with a customizable color and opacity. Each area can be a rectangle, a polygon, or a mask from a PNG file. The `opacity` parameter controls the transparency of the fill (1.0 is fully opaque, 0.0 is fully transparent, default is 0.5). The `color` is in BGR format, e.g., `[255, 0, 0]` for blue (default is `[0,0,0]` black).
**Special behavior**: If `color` is set to `null` (or `None` in Python), the specified area is made fully transparent by setting all channels (BGRA) to 0, effectively creating a black transparent color. This ensures better compatibility with older PNG viewers. The `opacity` parameter is ignored in this case.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `areas` (array): List of areas to fill. Each item is a dictionary that must contain one of:
- A rectangle: `x1`, `y1`, `x2`, `y2` (integers).
- A polygon: `polygon` (a list of points, e.g., `[[x1, y1], [x2, y2], ...]`).
- A mask: `mask_path` (string path to a PNG mask file).
- Optionally, each dictionary can also contain `color` (list of 3 ints [B,G,R] or `null`, default [0,0,0]) and `opacity` (float 0.0-1.0, default 0.5).
- **Optional arguments:**
- `invert_areas` (boolean): If True, fills everything EXCEPT the specified areas. Useful for background removal. Default is False.
- `output_path` (string): Full path to save the output image. If not provided, will use input filename with '_filled' suffix.
- **Returns:** string (path to the image with filled areas)
**Example Claude Request:**
```
Fill the rectangular area from (150, 100) to (250, 200) with semi-transparent red and erase the area [[10, 10], [50, 10], [50, 50], [10, 50]] in my image 'test_image.png' and save it as 'output.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "fill",
"arguments": {
"input_path": "/home/user/images/test_image.png",
"areas": [
{
"x1": 150,
"y1": 100,
"x2": 250,
"y2": 200,
"color": [0, 0, 255],
"opacity": 0.5
},
{
"polygon": [[10, 10], [50, 10], [50, 50], [10, 50]],
"color": null
}
],
"output_path": "/home/user/images/output.png"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/output.png"
}
```
**Example Claude Request (Background Removal):**
```
Remove the background from 'my_image.png' by making everything outside the rectangle (100, 100) to (300, 300) transparent, and save it as 'object_only.png'
```
**Example Tool Call (JSON):**
```json
{
"name": "fill",
"arguments": {
"input_path": "/home/user/images/my_image.png",
"areas": [
{
"x1": 100,
"y1": 100,
"x2": 300,
"y2": 300,
"color": null
}
],
"invert_areas": true,
"output_path": "/home/user/images/object_only.png"
}
}
```
### `find`
Finds objects in an image based on a text description. This tool uses open-vocabulary detection models to find objects matching a text description. It requires pre-downloaded YOLOE models that support text prompts (e.g. yoloe-11l-seg.pt). This tool can optionally return segmentation masks (as PNG files) or polygons.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- `description` (string): Text description of the object to find
- **Optional arguments:**
- `confidence` (float): Confidence threshold for detection (0.0 to 1.0). Default is 0.3
- `model_name` (string): Model name to use for finding objects (must support text prompts). Default is 'yoloe-11l-seg.pt'
- `return_all_matches` (boolean): If True, returns all matching objects; if False, returns only the best match. Default is False
- `return_geometry` (boolean): If True, returns segmentation masks or polygons for found objects. Default is False.
- `geometry_format` (string): Format for returned geometry: 'mask' or 'polygon'. Default is 'mask'. When 'mask' is selected, a PNG file is created for each mask and its path is returned.
- **Returns:** dictionary containing:
- `image_path`: Path to the input image
- `query`: The text description that was searched for
- `found_objects`: List of found objects, each with:
- `description`: The original search query
- `match`: The class name of the matched object
- `confidence`: Confidence score (0.0 to 1.0)
- `bbox`: Bounding box coordinates [x1, y1, x2, y2]
- `mask_path` (optional): Path to the PNG file for the object's mask. Included if `return_geometry` is True and `geometry_format` is 'mask'.
- `polygon` (optional): A list of points `[x, y]` describing the object's contour. Included if `return_geometry` is True and `geometry_format` is 'polygon'.
- `found`: Boolean indicating whether any objects were found
**Example Claude Request:**
```
Find all dogs in my image 'photo.jpg' with a confidence threshold of 0.4
```
**Example Tool Call (JSON):**
```json
{
"name": "find",
"arguments": {
"input_path": "/home/user/images/photo.jpg",
"description": "dog",
"confidence": 0.4,
"return_all_matches": true,
"return_geometry": true,
"geometry_format": "mask"
}
}
```
**Example Response (JSON):**
```json
{
"result": {
"image_path": "/home/user/images/photo.jpg",
"query": "dog",
"found_objects": [
{
"description": "dog",
"match": "dog",
"confidence": 0.92,
"bbox": [150.2, 30.5, 250.1, 120.7],
"mask_path": "/home/user/images/photo_mask_0.png"
},
{
"description": "dog",
"match": "dog",
"confidence": 0.85,
"bbox": [300.5, 150.3, 400.2, 250.1],
"mask_path": "/home/user/images/photo_mask_1.png"
}
],
"found": true
}
}
```
### `get_metainfo`
Gets metadata information about an image file.
- **Required arguments:**
- `input_path` (string): Full path to the input image
- **Returns:** dictionary containing metadata about the image including:
- `filename`
- `file path`
- `file size` (in bytes, KB, and MB)
- `dimensions` (width, height, aspect ratio)
- `image format`
- `color mode`
- `creation and modification timestamps`
**Example Claude Request:**
```
Get metadata information about my image 'photo.jpg'
```
**Example Tool Call (JSON):**
```json
{
"name": "get_metainfo",
"arguments": {
"input_path": "/home/user/images/photo.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": {
"filename": "photo.jpg",
"path": "/home/user/images/photo.jpg",
"size_bytes": 12345,
"size_kb": 12.06,
"size_mb": 0.01,
"dimensions": {
"width": 800,
"height": 600,
"aspect_ratio": 1.33
},
"format": "JPEG",
"color_mode": "RGB",
"created_at": "2023-06-15T10:30:45",
"modified_at": "2023-06-15T10:30:45"
}
}
```
### `ocr`
Performs Optical Character Recognition (OCR) on an image using EasyOCR. This tool extracts text from images in various languages. The default language is English, but you can specify other languages using their language codes (e.g., 'en', 'ru', 'fr', etc.).
- **Required arguments:**
- `input_path` (string): Full path to the input image
- **Optional arguments:**
- `language` (string): Language code for OCR (e.g., 'en', 'ru', 'fr', etc.). Default is 'en'
- **Returns:** dictionary containing:
- `image_path`: Path to the input image
- `text_segments`: List of detected text segments, each with:
- `text`: The extracted text content
- `confidence`: Confidence score (0.0 to 1.0)
- `bbox`: Bounding box coordinates [x1, y1, x2, y2]
**Example Claude Request:**
```
Extract text from my image 'document.jpg' using OCR with English language
```
**Example Tool Call (JSON):**
```json
{
"name": "ocr",
"arguments": {
"input_path": "/home/user/images/document.jpg",
"language": "en"
}
}
```
**Example Response (JSON):**
```json
{
"result": {
"image_path": "/home/user/images/document.jpg",
"text_segments": [
{
"text": "Hello World",
"confidence": 0.92,
"bbox": [10.5, 20.3, 100.2, 200.1]
},
{
"text": "Copyright 2023",
"confidence": 0.85,
"bbox": [150.2, 30.5, 250.1, 120.7]
}
]
}
}
```
### `overlay`
Overlays one image on top of another, handling transparency. This tool places an overlay image onto a base image at a specified (x, y) coordinate. If the overlay image has an alpha channel (e.g., a transparent PNG), it will be blended correctly with the base image. If the overlay extends beyond the boundaries of the base image, it will be cropped.
- **Required arguments:**
- `base_image_path` (string): Full path to the base image
- `overlay_image_path` (string): Full path to the overlay image. This image can have transparency.
- `x` (integer): X-coordinate of the top-left corner of the overlay image on the base image.
- `y` (integer): Y-coordinate of the top-left corner of the overlay image on the base image.
- **Optional arguments:**
- `output_path` (string): Full path to save the output image. If not provided, will use the base image filename with '_overlaid' suffix.
- **Returns:** string (path to the resulting image)
**Example Claude Request:**
```
Overlay 'logo.png' on top of 'background.jpg' at position (10, 10) and save it as 'final.jpg'
```
**Example Tool Call (JSON):
```json
{
"name": "overlay",
"arguments": {
"base_image_path": "/home/user/images/background.jpg",
"overlay_image_path": "/home/user/images/logo.png",
"x": 10,
"y": 10,
"output_path": "/home/user/images/final.jpg"
}
}
```
**Example Response (JSON):**
```json
{
"result": "/home/user/images/final.jpg"
}
```
================================================
FILE: src/imagesorcery_mcp/tools/__init__.py
================================================
# Import the central logger
from imagesorcery_mcp.logging_config import logger
logger.info("🪄 ImageSorcery MCP tools package initialized")
================================================
FILE: src/imagesorcery_mcp/tools/blur.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def blur(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
areas: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of areas to blur. Each area should have: "
"a rectangle ({'x1', 'y1', 'x2', 'y2'}) or a polygon ({'polygon': [[x,y],...]}). "
"Optionally, include 'blur_strength' (int, odd number, default 15) for each area."
)
),
],
invert_areas: Annotated[
bool,
Field(
description="If True, blurs everything EXCEPT the specified areas. Useful for background blurring."
),
] = False,
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_blurred' suffix."
)
),
] = None,
) -> str:
"""
Blur specified rectangular or polygonal areas of an image using OpenCV.
This tool allows blurring multiple rectangular or polygonal areas of an image with customizable
blur strength. Each area can be a rectangle defined by a bounding box
[x1, y1, x2, y2] or a polygon defined by a list of points.
The blur_strength parameter controls the intensity of the blur effect. Higher values
result in stronger blur. It must be an odd number (default is 15).
If `invert_areas` is True, the tool will blur everything EXCEPT the specified areas.
Returns:
Path to the image with blurred areas
"""
logger.info(f"Blur tool requested for image: {input_path} with {len(areas)} areas, invert_areas={invert_areas}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_blurred{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Create a mask for the areas to be blurred (or not blurred if invert_areas is True)
mask = np.zeros(img.shape[:2], dtype=np.uint8)
# Populate the mask based on areas
for area in areas:
if "polygon" in area:
polygon_points = np.array(area["polygon"], dtype=np.int32)
cv2.fillPoly(mask, [polygon_points], 255)
elif "x1" in area and "y1" in area and "x2" in area and "y2" in area:
x1, y1, x2, y2 = area["x1"], area["y1"], area["x2"], area["y2"]
cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
else:
logger.warning("Skipping area due to missing 'polygon' or 'x1,y1,x2,y2' keys.")
continue
# If invert_areas is True, invert the mask
if invert_areas:
mask = cv2.bitwise_not(mask)
logger.info("Inverting blur areas: blurring everything EXCEPT the specified regions.")
else:
logger.info("Applying blur to specified areas.")
# Apply blur to the entire image (this will be used for the masked regions)
# Use the blur_strength from the first area, or config default
config = get_config()
global_blur_strength = areas[0].get("blur_strength", config.blur.strength) if areas else config.blur.strength
if global_blur_strength % 2 == 0:
global_blur_strength += 1
logger.warning(f"Adjusted global blur_strength to odd number: {global_blur_strength}")
full_blurred_img = cv2.GaussianBlur(img, (global_blur_strength, global_blur_strength), 0)
# Combine the original image and the fully blurred image using the mask
# Where mask is 255 (white), use the blurred image. Where mask is 0 (black), use the original image.
result_img = np.where(mask[:, :, None] == 255, full_blurred_img, img)
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the image with blurred areas
logger.info(f"Saving blurred image to: {output_path}")
cv2.imwrite(output_path, result_img)
logger.info(f"Blurred image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/change_color.py
================================================
import os
from typing import Annotated, Literal, Optional
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def change_color(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
palette: Annotated[
Literal["grayscale", "sepia"],
Field(description="The color palette to apply. Currently supports 'grayscale' and 'sepia'."),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with a suffix based on the palette (e.g., '_grayscale')."
)
),
] = None,
) -> str:
"""
Change the color palette of an image.
This tool applies a predefined color transformation to an image.
Currently supported palettes are 'grayscale' and 'sepia'.
Returns:
Path to the image with the new color palette.
"""
logger.info(f"Change color tool requested for image: {input_path} with palette: {palette}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_{palette}{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
# Apply the selected color palette
if palette == "grayscale":
logger.info("Applying grayscale palette")
output_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
elif palette == "sepia":
logger.info("Applying sepia palette")
# Sepia transformation matrix
sepia_kernel = np.array([[0.272, 0.534, 0.131], [0.349, 0.686, 0.168], [0.393, 0.769, 0.189]])
# Apply the transformation
sepia_img = cv2.transform(img, sepia_kernel)
# Clip values to be in the 0-255 range
output_img = np.clip(sepia_img, 0, 255).astype(np.uint8)
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
# Save the image
cv2.imwrite(output_path, output_img)
logger.info(f"Transformed image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/config.py
================================================
"""
Configuration tool for ImageSorcery MCP.
This tool allows viewing and updating configuration values through the MCP interface.
"""
from typing import Annotated, Any, Dict, Optional, Union
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config manager
from imagesorcery_mcp.config import (
generate_config_documentation,
get_available_config_keys,
get_config_manager,
)
from imagesorcery_mcp.logging_config import logger
def _generate_config_tool_docstring() -> str:
"""Generate the dynamic docstring for the config tool."""
base_doc = """View or update ImageSorcery MCP configuration.
This tool allows you to:
- View current configuration values
- Update configuration values for the current session
- Persist configuration changes to the config file
- Reset runtime overrides
"""
config_doc = generate_config_documentation()
examples_doc = """
Examples:
- Get all config: config(action="get")
- Get detection confidence: config(action="get", key="detection.confidence_threshold")
- Set blur strength: config(action="set", key="blur.strength", value=21)
- Set and persist: config(action="set", key="detection.confidence_threshold", value=0.8, persist=True)
- Reset overrides: config(action="reset")
Returns:
Dictionary containing the requested configuration data or update result"""
return base_doc + config_doc + examples_doc
def register_tool(mcp: FastMCP):
@mcp.tool()
def config(
action: Annotated[
str,
Field(
description="Action to perform: 'get' to view config, 'set' to update config, 'reset' to reset runtime overrides",
pattern="^(get|set|reset)$"
),
] = "get",
key: Annotated[
Optional[str],
Field(
description=(
"Configuration key to get/set. Use dot notation for nested values "
"(e.g., 'detection.confidence_threshold', 'blur.strength'). "
"Leave empty to get/set entire config."
)
),
] = None,
value: Annotated[
Optional[Union[str, int, float, bool]],
Field(
description="Value to set (only used with action='set')"
),
] = None,
persist: Annotated[
bool,
Field(
description="Whether to persist changes to config file (only used with action='set')"
),
] = False,
) -> Dict[str, Any]:
"""Configuration tool - docstring will be set dynamically."""
logger.info(f"Config tool called with action='{action}', key='{key}', value='{value}', persist={persist}")
config_manager = get_config_manager()
try:
if action == "get":
if key is None:
# Return entire configuration
config_dict = config_manager.get_config_dict()
runtime_overrides = config_manager.get_runtime_overrides()
result = {
"action": "get",
"config": config_dict,
"runtime_overrides": runtime_overrides,
"config_file": str(config_manager.config_file),
"message": "Current configuration retrieved successfully"
}
logger.info("Retrieved entire configuration")
return result
else:
# Return specific configuration value
config_dict = config_manager.get_config_dict()
# Navigate to the requested key
if '.' in key:
parts = key.split('.')
current = config_dict
for part in parts:
if part not in current:
raise KeyError(f"Configuration key '{key}' not found")
current = current[part]
value_result = current
else:
if key not in config_dict:
raise KeyError(f"Configuration key '{key}' not found")
value_result = config_dict[key]
result = {
"action": "get",
"key": key,
"value": value_result,
"message": f"Configuration value for '{key}' retrieved successfully"
}
logger.info(f"Retrieved configuration value for key '{key}': {value_result}")
return result
elif action == "set":
if key is None:
raise ValueError("Key is required for 'set' action")
if value is None:
raise ValueError("Value is required for 'set' action")
# Prepare update dictionary
updates = {key: value}
# Update configuration
updated_config = config_manager.update_config(updates, persist=persist)
# Get the updated value for confirmation
if '.' in key:
parts = key.split('.')
current = updated_config
for part in parts:
current = current[part]
new_value = current
else:
new_value = updated_config[key]
result = {
"action": "set",
"key": key,
"old_value": value, # This is the input value
"new_value": new_value, # This is the validated/processed value
"persisted": persist,
"message": f"Configuration '{key}' updated successfully" + (" and persisted to file" if persist else " for current session")
}
logger.info(f"Updated configuration '{key}' to '{new_value}'" + (" (persisted)" if persist else " (runtime only)"))
return result
elif action == "reset":
# Reset runtime overrides
config_manager.reset_runtime_overrides()
result = {
"action": "reset",
"message": "Runtime configuration overrides reset successfully",
"config": config_manager.get_config_dict()
}
logger.info("Reset runtime configuration overrides")
return result
else:
raise ValueError(f"Invalid action '{action}'. Must be 'get', 'set', or 'reset'")
except KeyError as e:
logger.error(f"Configuration key error: {e}")
return {
"action": action,
"error": str(e),
"available_keys": get_available_config_keys()
}
except ValueError as e:
logger.error(f"Configuration value error: {e}")
return {
"action": action,
"error": str(e),
"message": "Please check the provided key and value"
}
except Exception as e:
logger.error(f"Configuration tool error: {e}", exc_info=True)
return {
"action": action,
"error": f"Configuration operation failed: {str(e)}",
"message": "An unexpected error occurred while processing the configuration request"
}
# Set the dynamic docstring
config.__doc__ = _generate_config_tool_docstring()
================================================
FILE: src/imagesorcery_mcp/tools/crop.py
================================================
import os
from typing import Annotated
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def crop(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
x1: Annotated[
int,
Field(description="X-coordinate of the top-left corner"),
],
y1: Annotated[
int,
Field(description="Y-coordinate of the top-left corner"),
],
x2: Annotated[
int,
Field(description="X-coordinate of the bottom-right corner"),
],
y2: Annotated[
int,
Field(description="Y-coordinate of the bottom-right corner"),
],
output_path: Annotated[
str,
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_cropped' suffix."
)
),
] = None,
) -> str:
"""
Crop an image using OpenCV's NumPy slicing approach
with OpenMCP's bounding box annotations.
Returns:
Path to the cropped image
"""
logger.info(f"Crop tool requested for image: {input_path} with region [{x1}, {y1}, {x2}, {y2}]")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_cropped{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Crop the image using NumPy slicing
logger.info(f"Cropping image with region [{x1}, {y1}, {x2}, {y2}]")
cropped_img = img[y1:y2, x1:x2]
logger.info(f"Image cropped successfully. New shape: {cropped_img.shape}")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the cropped image
logger.info(f"Saving cropped image to: {output_path}")
cv2.imwrite(output_path, cropped_img)
logger.info(f"Cropped image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/detect.py
================================================
import os
from pathlib import Path
from typing import Annotated, Any, Dict, List, Literal, Optional, Union
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def get_model_path(model_name):
"""Get the path to a model in the models directory."""
logger.info(f"Attempting to get path for model: {model_name}")
model_path = Path("models") / model_name
if model_path.exists():
logger.info(f"Model found at: {model_path}")
return str(model_path)
logger.warning(f"Model not found in models directory: {model_name}")
return None
def register_tool(mcp: FastMCP):
@mcp.tool()
def detect(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
confidence: Annotated[
Optional[float],
Field(
description="Confidence threshold for detection (0.0 to 1.0). If not provided, uses config default.",
ge=0.0,
le=1.0,
),
] = None,
model_name: Annotated[
Optional[str],
Field(
description="Model name to use for detection (e.g., 'yoloe-11l-seg-pf.pt', 'yolov8m.pt'). If not provided, uses config default.",
),
] = None,
return_geometry: Annotated[
bool, Field(description="If True, returns segmentation masks or polygons for detected objects.")
] = False,
geometry_format: Annotated[
Literal["mask", "polygon"], Field(description="Format for returned geometry: 'mask' or 'polygon'.")
] = "mask",
) -> Dict[str, Union[str, List[Dict[str, Any]]]]:
"""
Detect objects in an image using models from Ultralytics.
This tool requires pre-downloaded models. Use the download-yolo-models
command to download models before using this tool.
If objects aren't common, consider using a specialized model.
This tool can optionally return segmentation masks or polygons if a segmentation
model (e.g., one ending in '-seg.pt') is used.
When 'mask' is chosen for geometry_format, a PNG file is created for each
detected object's mask. The file path is returned in the 'mask_path' field.
Returns:
Dictionary containing the input image path and a list of detected objects.
Each object includes its class name, confidence score, and bounding box.
If return_geometry is True, it also includes a 'mask_path' (path to a PNG file) or
'polygon' (list of points).
"""
# Get configuration defaults
config = get_config()
# Use config defaults if parameters not provided
if confidence is None:
confidence = config.detection.confidence_threshold
logger.info(f"Using config default confidence: {confidence}")
if model_name is None:
model_name = config.detection.default_model
logger.info(f"Using config default model: {model_name}")
logger.info(
f"Detect tool requested for image: {input_path} with model: {model_name} and confidence: {confidence}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Add .pt extension if it doesn't exist
if not model_name.endswith(".pt"):
original_model_name = model_name
model_name = f"{model_name}.pt"
logger.info(f"Added .pt extension to model name: {original_model_name} -> {model_name}")
# Try to find the model
model_path = get_model_path(model_name)
# If model not found, raise an error with helpful message
if not model_path:
logger.error(f"Model {model_name} not found.")
# List available models
available_models = []
models_dir = Path("models")
# Find all .pt files in the models directory and its subdirectories
if models_dir.exists():
for file in models_dir.glob("**/*.pt"):
available_models.append(str(file.relative_to(models_dir)))
error_msg = (
f"Model {model_name} not found. "
f"Available local models: "
f"{', '.join(available_models) if available_models else 'None'}\n"
"To use this tool, you need to download the model first using:\n"
"download-yolo-models --ultralytics MODEL_NAME\n"
"or\n"
"download-yolo-models --huggingface REPO_ID[:FILENAME]\n"
"Models will be downloaded to the 'models' directory "
"in the project root."
)
raise RuntimeError(error_msg)
try:
# Set environment variable to use the models directory
os.environ["YOLO_CONFIG_DIR"] = str(Path("models").absolute())
logger.info(f"Set YOLO_CONFIG_DIR environment variable to: {os.environ['YOLO_CONFIG_DIR']}")
# Import here to avoid loading ultralytics if not needed
logger.info("Importing Ultralytics")
from ultralytics import YOLO
logger.info("Ultralytics imported successfully")
# Load the model from the found path
logger.info(f"Loading model from: {model_path}")
model = YOLO(model_path)
logger.info("Model loaded successfully")
# Run inference on the image
logger.info(f"Running inference on {input_path} with confidence {confidence}")
results = model(input_path, conf=confidence)[0]
logger.info(f"Inference completed. Found {len(results.boxes)} detections.")
if return_geometry and results.masks is None:
raise ValueError(
f"Model '{model_name}' does not support segmentation, but return_geometry=True was requested. "
"Please use a segmentation model (e.g., one ending in '-seg.pt')."
)
# Process results
detections = []
for i, box in enumerate(results.boxes):
# Get class name
class_id = int(box.cls.item())
class_name = results.names[class_id]
# Get confidence score
conf = float(box.conf.item())
# Get bounding box coordinates (x1, y1, x2, y2)
x1, y1, x2, y2 = [float(coord) for coord in box.xyxy[0].tolist()]
detection_item = {"class": class_name, "confidence": conf, "bbox": [x1, y1, x2, y2]}
if return_geometry:
if geometry_format == "mask":
# Convert mask to a savable format
mask = results.masks.data[i].cpu().numpy()
mask_image = (mask * 255).astype(np.uint8)
# Generate a unique filename for the mask, always with .png extension
input_p = Path(input_path)
base_name = input_p.stem
output_dir = input_p.parent
mask_output_path = output_dir / f"{base_name}_mask_{i}.png"
# Save the mask as a PNG file
try:
success = cv2.imwrite(str(mask_output_path), mask_image)
if success:
logger.info(f"Saved detection mask to {mask_output_path}")
detection_item["mask_path"] = str(mask_output_path)
else:
logger.error(f"Failed to save mask to {mask_output_path}")
except Exception as e:
logger.error(f"An unexpected error occurred while saving mask to {mask_output_path}: {e}")
elif geometry_format == "polygon":
# Ultralytics masks.xy are lists of polygons
polygon = results.masks.xy[i].tolist()
detection_item["polygon"] = polygon
detections.append(detection_item)
logger.debug(
f"Detected: class={class_name}, confidence={conf:.2f}, bbox=[{x1:.2f}, {y1:.2f}, {x2:.2f}, {y2:.2f}]")
logger.info(f"Detection completed successfully for {input_path}")
return {"image_path": input_path, "detections": detections}
except Exception as e:
# Provide more helpful error message
error_msg = f"Error running object detection: {str(e)}\n"
logger.error(f"Error during object detection: {str(e)}", exc_info=True)
if "not found" in str(e).lower():
error_msg += (
"The model could not be found. "
"Please download it first using: "
"download-yolo-models --ultralytics MODEL_NAME"
)
elif "permission denied" in str(e).lower():
error_msg += (
"Permission denied when trying to access or create the models "
"directory.\n"
"Try running the command with appropriate permissions."
)
raise RuntimeError(error_msg) from e
================================================
FILE: src/imagesorcery_mcp/tools/draw_arrows.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def draw_arrows(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
arrows: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of arrow items to draw. Each item should have: "
"'x1' (int), 'y1' (int) - start point, "
"'x2' (int), 'y2' (int) - end point, and optionally "
"'color' (list of 3 ints [B,G,R]), "
"'thickness' (int), 'tip_length' (float, relative to arrow length)"
)
),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_with_arrows' suffix."
)
),
] = None,
) -> str:
"""
Draw arrows on an image using OpenCV.
This tool allows adding multiple arrows to an image with customizable
start and end points, color, thickness, and tip length.
Each arrow is defined by its start point (x1, y1) and end point (x2, y2).
The 'tip_length' is relative to the arrow's length (e.g., 0.1 means 10%).
Returns:
Path to the image with drawn arrows
"""
logger.info(f"Draw arrows tool requested for image: {input_path} with {len(arrows)} arrows")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_with_arrows{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Draw each arrow on the image
for i, arrow_item in enumerate(arrows):
x1, y1 = arrow_item["x1"], arrow_item["y1"]
x2, y2 = arrow_item["x2"], arrow_item["y2"]
color = arrow_item.get("color", [0, 0, 0]) # Default: black
thickness = arrow_item.get("thickness", 1)
tip_length = arrow_item.get("tip_length", 0.1)
logger.debug(f"Drawing arrow {i+1}: from ({x1},{y1}) to ({x2},{y2}), color={color}, thickness={thickness}, tip_length={tip_length}")
cv2.arrowedLine(img, (x1, y1), (x2, y2), color, thickness, tipLength=tip_length)
logger.debug(f"Arrow {i+1} drawn")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
cv2.imwrite(output_path, img)
logger.info(f"Image with arrows saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/draw_circle.py
================================================
import os
from typing import Annotated, List, Optional
import cv2
from fastmcp import FastMCP
from pydantic import BaseModel, Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
class CircleItem(BaseModel):
"""Represents a circle to be drawn on an image."""
center_x: Annotated[int, Field(description="X-coordinate of the circle's center")]
center_y: Annotated[int, Field(description="Y-coordinate of the circle's center")]
radius: Annotated[int, Field(description="Radius of the circle")]
color: Annotated[List[int], Field(description="Color in BGR format [B,G,R]")] = [0, 0, 0] # Default: black
thickness: Annotated[int, Field(description="Line thickness. Use -1 for a filled circle.")] = 1
filled: Annotated[bool, Field(description="Whether to fill the circle. If true, thickness is set to -1.")] = False
def register_tool(mcp: FastMCP):
@mcp.tool()
def draw_circles(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
circles: Annotated[
List[CircleItem],
Field(
description=(
"List of circle items to draw. Each item should have: "
"'center_x' (int), 'center_y' (int), 'radius' (int), and optionally "
"'color' (list of 3 ints [B,G,R]), "
"'thickness' (int), 'filled' (bool)"
)
),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_with_circles' suffix."
)
),
] = None,
) -> str:
"""
Draw circles on an image using OpenCV.
This tool allows adding multiple circles to an image with customizable
center, radius, color, thickness, and fill option.
Each circle is defined by its center coordinates (center_x, center_y) and radius.
Returns:
Path to the image with drawn circles
"""
logger.info(f"Draw circles tool requested for image: {input_path} with {len(circles)} circles")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_with_circles{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Draw each circle on the image
for i, circle_item in enumerate(circles):
center_x = circle_item.center_x
center_y = circle_item.center_y
radius = circle_item.radius
color = circle_item.color
thickness = circle_item.thickness
filled = circle_item.filled
logger.debug(f"Drawing circle {i+1}: center=({center_x}, {center_y}), radius={radius}, color={color}, thickness={thickness}, filled={filled}")
if filled:
thickness = -1
cv2.circle(img, (center_x, center_y), radius, color, thickness)
logger.debug(f"Circle {i+1} drawn")
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
cv2.imwrite(output_path, img)
logger.info(f"Image with circles saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/draw_lines.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def draw_lines(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
lines: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of line items to draw. Each item should have: "
"'x1' (int), 'y1' (int) - start point, "
"'x2' (int), 'y2' (int) - end point, and optionally "
"'color' (list of 3 ints [B,G,R]), "
"'thickness' (int)"
)
),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_with_lines' suffix."
)
),
] = None,
) -> str:
"""
Draw lines on an image using OpenCV.
This tool allows adding multiple lines to an image with customizable
start and end points, color, and thickness.
Each line is defined by its start point (x1, y1) and end point (x2, y2).
Returns:
Path to the image with drawn lines
"""
logger.info(f"Draw lines tool requested for image: {input_path} with {len(lines)} lines")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_with_lines{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Draw each line on the image
for i, line_item in enumerate(lines):
x1, y1 = line_item["x1"], line_item["y1"]
x2, y2 = line_item["x2"], line_item["y2"]
color = line_item.get("color", [0, 0, 0]) # Default: black
thickness = line_item.get("thickness", 1)
logger.debug(f"Drawing line {i+1}: from ({x1},{y1}) to ({x2},{y2}), color={color}, thickness={thickness}")
cv2.line(img, (x1, y1), (x2, y2), color, thickness)
logger.debug(f"Line {i+1} drawn")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
cv2.imwrite(output_path, img)
logger.info(f"Image with lines saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/draw_rectangle.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def draw_rectangles(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
rectangles: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of rectangle items to draw. Each item should have: "
"'x1' (int), 'y1' (int), 'x2' (int), 'y2' (int), and optionally "
"'color' (list of 3 ints [B,G,R]), "
"'thickness' (int), 'filled' (bool)"
)
),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_with_rectangles' suffix."
)
),
] = None,
) -> str:
"""
Draw rectangles on an image using OpenCV.
This tool allows adding multiple rectangles to an image with customizable
position, color, thickness, and fill option.
Each rectangle is defined by two points: (x1, y1) for the top-left corner
and (x2, y2) for the bottom-right corner.
Returns:
Path to the image with drawn rectangles
"""
logger.info(f"Draw rectangles tool requested for image: {input_path} with {len(rectangles)} rectangles")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_with_rectangles{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Draw each rectangle on the image
for i, rect_item in enumerate(rectangles):
# Extract rectangle coordinates (required)
x1 = rect_item["x1"]
y1 = rect_item["y1"]
x2 = rect_item["x2"]
y2 = rect_item["y2"]
# Extract optional parameters with defaults
color = rect_item.get("color", [0, 0, 0]) # Default: black
thickness = rect_item.get("thickness", 1)
filled = rect_item.get("filled", False)
logger.debug(f"Drawing rectangle {i+1}: x1={x1}, y1={y1}, x2={x2}, y2={y2}, color={color}, thickness={thickness}, filled={filled}")
# If filled is True, set thickness to -1 (OpenCV's way of filling shapes)
if filled:
thickness = -1
# Draw the rectangle on the image
cv2.rectangle(
img,
(x1, y1),
(x2, y2),
color,
thickness
)
logger.debug(f"Rectangle {i+1} drawn")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the image with rectangles
logger.info(f"Saving image with rectangles to: {output_path}")
cv2.imwrite(output_path, img)
logger.info(f"Image with rectangles saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/draw_text.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def draw_texts(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
texts: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of text items to draw. Each item should have: "
"'text' (string), 'x' (int), 'y' (int), and optionally "
"'font_scale' (float), 'color' (list of 3 ints [B,G,R]), "
"'thickness' (int), 'font_face' (string)"
)
),
],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_with_text' suffix."
)
),
] = None,
) -> str:
"""
Draw text on an image using OpenCV.
This tool allows adding multiple text elements to an image with customizable
position, font, size, color, and thickness.
Available font_face options:
- 'FONT_HERSHEY_SIMPLEX' (default)
- 'FONT_HERSHEY_PLAIN'
- 'FONT_HERSHEY_DUPLEX'
- 'FONT_HERSHEY_COMPLEX'
- 'FONT_HERSHEY_TRIPLEX'
- 'FONT_HERSHEY_COMPLEX_SMALL'
- 'FONT_HERSHEY_SCRIPT_SIMPLEX'
- 'FONT_HERSHEY_SCRIPT_COMPLEX'
Returns:
Path to the image with drawn text
"""
logger.info(f"Draw texts tool requested for image: {input_path} with {len(texts)} text items")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_with_text{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Font face mapping
font_faces = {
"FONT_HERSHEY_SIMPLEX": cv2.FONT_HERSHEY_SIMPLEX,
"FONT_HERSHEY_PLAIN": cv2.FONT_HERSHEY_PLAIN,
"FONT_HERSHEY_DUPLEX": cv2.FONT_HERSHEY_DUPLEX,
"FONT_HERSHEY_COMPLEX": cv2.FONT_HERSHEY_COMPLEX,
"FONT_HERSHEY_TRIPLEX": cv2.FONT_HERSHEY_TRIPLEX,
"FONT_HERSHEY_COMPLEX_SMALL": cv2.FONT_HERSHEY_COMPLEX_SMALL,
"FONT_HERSHEY_SCRIPT_SIMPLEX": cv2.FONT_HERSHEY_SCRIPT_SIMPLEX,
"FONT_HERSHEY_SCRIPT_COMPLEX": cv2.FONT_HERSHEY_SCRIPT_COMPLEX,
}
logger.debug("OpenCV font face mapping created")
# Get configuration defaults
config = get_config()
# Draw each text item on the image
for i, text_item in enumerate(texts):
# Extract text and position (required)
text = text_item["text"]
x = text_item["x"]
y = text_item["y"]
# Extract optional parameters with config defaults
font_scale = text_item.get("font_scale", config.text.font_scale)
color = text_item.get("color", config.drawing.color)
thickness = text_item.get("thickness", config.drawing.thickness)
# Get font face (default to SIMPLEX if not specified or invalid)
font_face_name = text_item.get("font_face", "FONT_HERSHEY_SIMPLEX")
font_face = font_faces.get(font_face_name, cv2.FONT_HERSHEY_SIMPLEX)
logger.debug(f"Drawing text {i+1}: '{text}' at ({x}, {y}) with font_scale={font_scale}, color={color}, thickness={thickness}, font_face={font_face_name}")
# Draw the text on the image
cv2.putText(
img,
text,
(x, y),
font_face,
font_scale,
color,
thickness
)
logger.debug(f"Text {i+1} drawn")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the image with text
logger.info(f"Saving image with text to: {output_path}")
cv2.imwrite(output_path, img)
logger.info(f"Image with text saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/fill.py
================================================
import os
from typing import Annotated, Any, Dict, List, Optional
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def fill(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
areas: Annotated[
List[Dict[str, Any]],
Field(
description=(
"List of areas to fill. Each area can be a rectangle ({'x1', 'y1', 'x2', 'y2'}), "
"a polygon ({'polygon': [[x,y],...]}), or a mask from a file ({'mask_path': 'path/to/mask.png'}). "
"Optionally, include 'color' (list of 3 ints [B,G,R] or None, default black) and "
"'opacity' (float 0.0-1.0, default 0.5) INSIDE each area object. "
"Example: [{'polygon': [[0,0], [100,0], [100,100]], 'color': [255,0,0], 'opacity': 0.5}]"
)
),
],
invert_areas: Annotated[
bool,
Field(
description="If True, fills everything EXCEPT the specified areas. Useful for background removal."
),
] = False,
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_filled' suffix."
)
),
] = None,
) -> str:
"""
Fill specified areas of an image with a color and opacity.
This tool allows filling multiple areas of an image with a customizable
color and opacity. Each area can be a rectangle, a polygon, or a mask from a PNG file.
The 'opacity' parameter controls the transparency of the fill. 1.0 is fully opaque,
0.0 is fully transparent. Default is 0.5.
The 'color' is in BGR format, e.g., [255, 0, 0] for blue. Default is black.
If the `color` is set to `None`, the specified area will be made fully transparent,
effectively deleting it (similar to ImageMagick). In this case, the `opacity`
parameter is ignored.
If `invert_areas` is True, the tool will fill everything EXCEPT the specified areas.
Example usage:
{
"input_path": "/path/to/image.jpg",
"areas": [
{
"polygon": [[0, 0], [100, 0], [100, 100], [0, 100]],
"color": null, // Makes area transparent
"opacity": 1.0
}
],
"invert_areas": true, // Removes background, keeps only the polygon area
"output_path": "/path/to/output.png"
}
Returns:
Path to the image with filled areas
"""
logger.info(f"Fill tool requested for image: {input_path} with {len(areas)} areas, invert_areas={invert_areas}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_filled{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path, cv2.IMREAD_UNCHANGED)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# If any area requests transparency OR invert_areas is used with transparency, ensure we have an alpha channel
if any(area.get("color") is None for area in areas) or (invert_areas and areas and areas[0].get("color") is None):
if len(img.shape) < 3 or img.shape[2] == 3:
logger.info("Converting image to BGRA to support transparency")
img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA if len(img.shape) > 2 and img.shape[2] == 3 else cv2.COLOR_GRAY2BGRA)
# Create mask for invert mode if needed
if invert_areas:
# Create a mask where specified areas are 0 (don't fill) and everything else is 255 (fill)
mask = np.ones(img.shape[:2], dtype=np.uint8) * 255
# Mark each area as 0 (don't fill)
for area in areas:
if "mask_path" in area:
if not os.path.exists(area["mask_path"]):
logger.warning(f"Mask file not found: {area['mask_path']}. Skipping.")
continue
area_mask = cv2.imread(area["mask_path"], cv2.IMREAD_GRAYSCALE)
if area_mask is None:
logger.warning(f"Failed to read mask file: {area['mask_path']}. Skipping.")
continue
# Resize mask to match image dimensions if necessary
if area_mask.shape != mask.shape:
area_mask = cv2.resize(area_mask, (mask.shape[1], mask.shape[0]), interpolation=cv2.INTER_NEAREST)
mask[area_mask > 0] = 0 # Set area to not fill
elif "polygon" in area:
polygon_points = np.array(area["polygon"], dtype=np.int32)
cv2.fillPoly(mask, [polygon_points], 0)
elif "x1" in area and "y1" in area and "x2" in area and "y2" in area:
x1, y1, x2, y2 = int(area["x1"]), int(area["y1"]), int(area["x2"]), int(area["y2"])
mask[y1:y2, x1:x2] = 0
# Get fill parameters from the first area
color = areas[0].get("color") if areas else None
opacity = areas[0].get("opacity", 0.5) if areas else 0.5
logger.info("Inverted areas: applying fill to masked regions")
# Apply the fill using the mask
if color is None:
# Make masked areas fully transparent (black transparent)
if img.shape[2] != 4:
raise ValueError("Image must have an alpha channel for transparency operations.")
# Set all channels to 0 where mask is 255 (BGRA = 0,0,0,0)
img[mask == 255] = [0, 0, 0, 0]
else:
# Fill with color where mask is 255
color_tuple = tuple(color)
if not (0.0 <= opacity <= 1.0):
logger.warning(f"Opacity {opacity} is outside the valid range [0.0, 1.0]. Clamping it.")
opacity = max(0.0, min(1.0, opacity))
# Create an overlay image
overlay = img.copy()
overlay[mask == 255] = color_tuple + (255,) if img.shape[2] == 4 else color_tuple
# Blend the overlay with the original image
img = np.where(mask[:, :, None] == 255,
cv2.addWeighted(overlay, opacity, img, 1 - opacity, 0),
img)
else:
# Normal mode - process each area to fill
for i, area in enumerate(areas):
color = area.get("color")
if color is None:
# Make area transparent
logger.debug(f"Making area {i+1} transparent")
if img.shape[2] != 4:
raise ValueError("Image must have an alpha channel for transparency operations.")
transparent_color = (0, 0, 0, 0)
if "mask_path" in area:
if not os.path.exists(area["mask_path"]):
logger.warning(f"Mask file not found: {area['mask_path']}. Skipping.")
continue
mask = cv2.imread(area["mask_path"], cv2.IMREAD_GRAYSCALE)
if mask is None:
logger.warning(f"Failed to read mask file: {area['mask_path']}. Skipping.")
continue
if mask.shape != img.shape[:2]:
mask = cv2.resize(mask, (img.shape[1], img.shape[0]), interpolation=cv2.INTER_NEAREST)
img[mask > 0] = transparent_color
logger.debug(f"Mask area {i+1} from {area['mask_path']} made transparent")
elif "polygon" in area:
polygon_points = np.array(area["polygon"], dtype=np.int32)
cv2.fillPoly(img, [polygon_points], transparent_color)
logger.debug(f"Polygon area {i+1} made transparent")
elif "x1" in area and "y1" in area and "x2" in area and "y2" in area:
x1, y1, x2, y2 = int(area["x1"]), int(area["y1"]), int(area["x2"]), int(area["y2"])
img[y1:y2, x1:x2] = transparent_color
logger.debug(f"Rectangle area {i+1} made transparent")
else:
logger.warning(f"Skipping area {i+1} due to missing 'polygon', 'mask_path' or 'x1,y1,x2,y2' keys.")
else:
# Fill with color
color_tuple = tuple(color)
opacity = area.get("opacity", 0.5)
if not (0.0 <= opacity <= 1.0):
logger.warning(f"Opacity {opacity} is outside the valid range [0.0, 1.0]. Clamping it.")
opacity = max(0.0, min(1.0, opacity))
mask_to_fill = None
if "mask_path" in area:
if not os.path.exists(area["mask_path"]):
logger.warning(f"Mask file not found: {area['mask_path']}. Skipping.")
continue
mask_to_fill = cv2.imread(area["mask_path"], cv2.IMREAD_GRAYSCALE)
if mask_to_fill is None:
logger.warning(f"Failed to read mask file: {area['mask_path']}. Skipping.")
continue
if mask_to_fill.shape != img.shape[:2]:
mask_to_fill = cv2.resize(mask_to_fill, (img.shape[1], img.shape[0]), interpolation=cv2.INTER_NEAREST)
logger.debug(f"Filling mask area {i+1} from {area['mask_path']} with color={color_tuple}, opacity={opacity}")
elif "polygon" in area:
logger.debug(f"Filling polygon area {i+1} with color={color_tuple}, opacity={opacity}")
polygon_points = np.array(area["polygon"], dtype=np.int32)
mask_to_fill = np.zeros(img.shape[:2], dtype=np.uint8)
cv2.fillPoly(mask_to_fill, [polygon_points], 255)
elif "x1" in area and "y1" in area and "x2" in area and "y2" in area:
x1, y1, x2, y2 = int(area["x1"]), int(area["y1"]), int(area["x2"]), int(area["y2"])
logger.debug(f"Filling rectangle area {i+1}: ({x1}, {y1}) to ({x2}, {y2}) with color={color_tuple}, opacity={opacity}")
mask_to_fill = np.zeros(img.shape[:2], dtype=np.uint8)
cv2.rectangle(mask_to_fill, (x1, y1), (x2, y2), 255, -1)
if mask_to_fill is not None:
# Create an overlay for the fill color
overlay = img.copy()
# Apply color to the overlay where the mask is set
overlay[mask_to_fill > 0] = color_tuple + (255,) if img.shape[2] == 4 else color_tuple
# Blend the overlay with the original image using the mask
img = np.where(
mask_to_fill[:, :, None] > 0, # Condition where to apply the blend
cv2.addWeighted(overlay, opacity, img, 1 - opacity, 0),
img
)
logger.debug(f"Area {i+1} filled")
else:
logger.warning(f"Skipping area {i+1} due to missing 'polygon', 'mask_path' or 'x1,y1,x2,y2' keys.")
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
logger.info(f"Saving filled image to: {output_path}")
cv2.imwrite(output_path, img)
logger.info(f"Filled image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/find.py
================================================
import os
from pathlib import Path
from typing import Annotated, Any, Dict, List, Literal, Optional, Union
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def get_model_path(model_name):
"""Get the path to a model in the models directory."""
logger.info(f"Attempting to get path for model: {model_name}")
model_path = Path("models") / model_name
if model_path.exists():
logger.info(f"Model found at: {model_path}")
return str(model_path)
logger.warning(f"Model not found in models directory: {model_name}")
return None
def check_clip_installed():
"""Check if CLIP is installed and the model is available."""
logger.info("Checking if CLIP is installed and MobileCLIP model is available")
try:
import clip # noqa: F401
logger.info("CLIP is installed")
# Check if the MobileCLIP model is available in the root directory
clip_model_path = Path("mobileclip_blt.ts")
if clip_model_path.exists():
logger.info(f"MobileCLIP model found at: {clip_model_path}")
return True, None
logger.warning(f"MobileCLIP model not found at: {clip_model_path}")
return False, "MobileCLIP model not found. Please run 'download-clip-models' to download it."
except ImportError:
logger.warning("CLIP is not installed")
return False, "CLIP is not installed. Please install it with 'pip install git+https://github.com/ultralytics/CLIP.git'"
def register_tool(mcp: FastMCP):
@mcp.tool()
def find(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
description: Annotated[
str, Field(description="Text description of the object to find")
],
confidence: Annotated[
Optional[float],
Field(
description="Confidence threshold for detection (0.0 to 1.0). If not provided, uses config default.",
ge=0.0,
le=1.0,
),
] = None,
model_name: Annotated[
Optional[str],
Field(
description="Model name to use for finding objects (must support text prompts). If not provided, uses config default.",
),
] = None,
return_all_matches: Annotated[
bool, Field(description="If True, returns all matching objects; if False, returns only the best match")
] = False,
return_geometry: Annotated[
bool, Field(description="If True, returns segmentation masks or polygons for found objects.")
] = False,
geometry_format: Annotated[
Literal["mask", "polygon"], Field(description="Format for returned geometry: 'mask' or 'polygon'.")
] = "mask",
) -> Dict[str, Union[str, List[Dict[str, Any]], bool]]:
"""
Find objects in an image based on a text description.
This tool uses open-vocabulary detection models to find objects matching a text description.
It requires pre-downloaded YOLOE models that support text prompts (e.g. yoloe-11l-seg.pt).
This tool can optionally return segmentation masks or polygons.
When 'mask' is chosen for geometry_format, a PNG file is created for each
found object's mask. The file path is returned in the 'mask_path' field.
Returns:
Dictionary containing the input image path and a list of found objects.
Each object includes its confidence score and bounding box. If return_geometry
is True, it also includes a 'mask_path' (path to a PNG file) or
'polygon' (list of points).
"""
# Get configuration defaults
config = get_config()
# Use config defaults if parameters not provided
if confidence is None:
confidence = config.find.confidence_threshold
logger.info(f"Using config default confidence: {confidence}")
if model_name is None:
model_name = config.find.default_model
logger.info(f"Using config default model: {model_name}")
logger.info(
f"Find tool requested for image: {input_path}, description: '{description}', model: {model_name}, "
f"confidence: {confidence}, return_all_matches: {return_all_matches}, "
f"return_geometry: {return_geometry}, geometry_format: {geometry_format}"
)
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
# Add .pt extension if it doesn't exist
if not model_name.endswith(".pt"):
original_model_name = model_name
model_name = f"{model_name}.pt"
logger.info(f"Added .pt extension to model name: {original_model_name} -> {model_name}")
# Try to find the model
model_path = get_model_path(model_name)
logger.info(f"Resolved model path: {model_path}")
# If model not found, raise an error with helpful message
if not model_path:
logger.error(f"Model {model_name} not found.")
# List available models
available_models = []
models_dir = Path("models")
# Find all .pt files in the models directory and its subdirectories
if models_dir.exists():
for file in models_dir.glob("**/*.pt"):
available_models.append(str(file.relative_to(models_dir)))
# Filter for models that support text prompts
text_prompt_models = [
model for model in available_models
if "yoloe" in model.lower() and not model.lower().endswith("-pf.pt")
]
error_msg = (
f"Model {model_name} not found. "
f"Available models supporting text prompts: "
f"{', '.join(text_prompt_models) if text_prompt_models else 'None'}\n"
"To use this tool, you need to download a model that supports text prompts first using:\n"
"download-yolo-models --ultralytics MODEL_NAME\n"
"Recommended models: yoloe-11l-seg.pt\n"
"Models will be downloaded to the 'models' directory "
"in the project root."
)
raise RuntimeError(error_msg)
# Check if the model supports text prompts
if not ("yoloe" in model_name.lower() and not model_name.lower().endswith("-pf.pt")):
logger.error(f"Model {model_name} does not support text prompts.")
raise ValueError(
f"The model {model_name} does not support text prompts. "
f"Please use a model that supports text prompts, such as "
f"yoloe-11l-seg.pt"
)
# Check if CLIP is installed and the model is available
clip_installed, clip_error = check_clip_installed()
if not clip_installed:
logger.error(f"CLIP not installed or MobileCLIP model missing: {clip_error}")
raise RuntimeError(
f"Cannot use text prompts: {clip_error}\n"
"Text prompts require CLIP and the MobileCLIP model.\n"
"Run 'download-clip-models' to set up the required dependencies."
)
try:
# Set environment variable to use the models directory
os.environ["YOLO_CONFIG_DIR"] = str(Path("models").absolute())
logger.info(f"Set YOLO_CONFIG_DIR environment variable to: {os.environ['YOLO_CONFIG_DIR']}")
# Set environment variable for CLIP model path
clip_model_path = Path("mobileclip_blt.ts").absolute()
if not clip_model_path.exists():
logger.error(f"CLIP model not found at expected path: {clip_model_path}")
raise RuntimeError(
f"CLIP model not found at {clip_model_path}. "
"Please run 'download-clip-models' to download it."
)
os.environ["CLIP_MODEL_PATH"] = str(clip_model_path)
logger.info(f"Set CLIP_MODEL_PATH environment variable to: {os.environ['CLIP_MODEL_PATH']}")
logger.info("Importing Ultralytics")
# Import here to avoid loading ultralytics if not needed
from ultralytics import YOLO
logger.info("Ultralytics imported successfully")
logger.info("Loading model...")
# Load the model from the found path
model = YOLO(model_path)
logger.info("Model loaded successfully")
# For YOLOe models, we need to set the classes using the text description
logger.info("Setting up text prompts...")
# Convert the description to a list (YOLOe expects a list of class names)
class_names = [description]
logger.debug(f"Class names for text prompts: {class_names}")
try:
# Set the classes for the model
logger.info("Getting text embeddings...")
text_embeddings = model.get_text_pe(class_names)
logger.info("Setting classes...")
model.set_classes(class_names, text_embeddings)
logger.info("Classes set successfully")
except Exception as e:
logger.error(f"Error setting classes: {str(e)}", exc_info=True)
raise RuntimeError(
f"Error setting up text prompts: {str(e)}\n"
"This may be due to missing CLIP dependencies.\n"
"Please run 'download-clip-models' to set up the required dependencies."
) from e
# Run inference on the image
logger.info(f"Running inference on {input_path} with confidence {confidence}")
results = model.predict(input_path, conf=confidence, verbose=True)
logger.info("Inference completed")
found_objects = []
# Process results
if results and len(results) > 0:
logger.info(f"Processing {len(results)} results")
# The main result object is the first one in the list
main_result = results[0]
if return_geometry and main_result.masks is None:
raise ValueError(
f"Model '{model_name}' does not support segmentation, but return_geometry=True was requested. "
"Please use a segmentation model (e.g., one ending in '-seg.pt')."
)
if hasattr(main_result, 'boxes') and len(main_result.boxes) > 0:
logger.info(f"Found {len(main_result.boxes)} boxes")
for i, box in enumerate(main_result.boxes):
# Get class name
class_id = int(box.cls.item())
class_name = main_result.names[class_id]
# Get confidence score
conf = float(box.conf.item())
# Get bounding box coordinates (x1, y1, x2, y2)
x1, y1, x2, y2 = [float(coord) for coord in box.xyxy[0].tolist()]
found_object = {
"description": description,
"match": class_name,
"confidence": conf,
"bbox": [x1, y1, x2, y2]
}
if return_geometry:
if geometry_format == "mask":
# Convert mask to a savable format
mask = main_result.masks.data[i].cpu().numpy()
mask_image = (mask * 255).astype(np.uint8)
# Generate a unique filename for the mask, always with .png extension
input_p = Path(input_path)
base_name = input_p.stem
output_dir = input_p.parent
mask_output_path = output_dir / f"{base_name}_mask_{i}.png"
# Save the mask as a PNG file
try:
success = cv2.imwrite(str(mask_output_path), mask_image)
if success:
logger.info(f"Saved find mask to {mask_output_path}")
found_object["mask_path"] = str(mask_output_path)
else:
logger.error(f"Failed to save mask to {mask_output_path}")
except Exception as e:
logger.error(f"An unexpected error occurred while saving mask to {mask_output_path}: {e}")
elif geometry_format == "polygon":
polygon = main_result.masks.xy[i].tolist()
found_object["polygon"] = polygon
found_objects.append(found_object)
logger.debug(
f"Found object: match={class_name}, confidence={conf:.2f}, bbox=[{x1:.2f}, {y1:.2f}, {x2:.2f}, {y2:.2f}]")
else:
logger.info("No boxes found in results")
else:
logger.info("No results returned from model")
# Sort by confidence (highest first)
found_objects.sort(key=lambda x: x["confidence"], reverse=True)
# Return only the best match if return_all_matches is False and we have matches
if not return_all_matches and found_objects:
logger.info("Returning only the best match")
found_objects = [found_objects[0]]
logger.info(f"Returning {len(found_objects)} found objects")
return {
"image_path": input_path,
"query": description,
"found_objects": found_objects,
"found": len(found_objects) > 0
}
except Exception as e:
logger.error(f"Error in find tool: {str(e)}", exc_info=True)
# Provide more helpful error message
error_msg = f"Error finding objects: {str(e)}\n"
if "not found" in str(e).lower():
error_msg += (
"The model could not be found. "
"Please download it first using: "
"download-yolo-models --ultralytics MODEL_NAME"
)
elif "permission denied" in str(e).lower():
error_msg += (
"Permission denied when trying to access or create the models "
"directory.\n"
"Try running the command with appropriate permissions."
)
elif "no module named" in str(e).lower():
error_msg += (
"Required dependencies are missing. "
"Please install them using: "
"pip install git+https://github.com/ultralytics/CLIP.git"
)
elif "mobileclip" in str(e).lower():
error_msg += (
"MobileCLIP model is missing. "
"Please download it using: "
"download-clip-models"
)
raise RuntimeError(error_msg) from e
================================================
FILE: src/imagesorcery_mcp/tools/metainfo.py
================================================
import datetime
import os
from typing import Annotated, Any
from fastmcp import FastMCP
from PIL import Image
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def get_metainfo(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
) -> Any:
"""
Get metadata information about an image file.
Returns:
Dictionary containing metadata about the image (size, dimensions,
format, etc.)
"""
logger.info(f"Get metainfo tool requested for image: {input_path}")
# Check if input_path is empty
if not input_path or not input_path.strip():
logger.error("Input path is empty. Please provide a full path to the image file.")
raise ValueError("input_path cannot be empty. Please provide a full path to the image file.")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
logger.info(f"Input file found: {input_path}")
# Get file stats
logger.info(f"Getting file stats for: {input_path}")
file_stats = os.stat(input_path)
file_size = file_stats.st_size
creation_time = datetime.datetime.fromtimestamp(file_stats.st_ctime)
modification_time = datetime.datetime.fromtimestamp(file_stats.st_mtime)
logger.info(f"File size: {file_size} bytes, Created: {creation_time}, Modified: {modification_time}")
# Get image-specific information
logger.info(f"Opening image with PIL: {input_path}")
try:
with Image.open(input_path) as img:
width, height = img.size
format = img.format
mode = img.mode
logger.info(f"Image opened successfully. Dimensions: {width}x{height}, Format: {format}, Mode: {mode}")
except Exception as e:
logger.error(f"Failed to open image with PIL: {input_path} - {str(e)}")
raise ValueError(f"Failed to read image: {input_path}") from e
# Compile all metadata
metadata = {
"filename": os.path.basename(input_path),
"path": input_path,
"size_bytes": file_size,
"size_kb": round(file_size / 1024, 2),
"size_mb": round(file_size / (1024 * 1024), 2),
"dimensions": {
"width": width,
"height": height,
"aspect_ratio": round(width / height, 2) if height != 0 else None,
},
"format": format,
"color_mode": mode,
"created_at": creation_time.isoformat(),
"modified_at": modification_time.isoformat(),
}
logger.info("Metadata compiled successfully")
logger.debug(f"Metadata: {metadata}")
return metadata
================================================
FILE: src/imagesorcery_mcp/tools/ocr.py
================================================
import os
from typing import Annotated, Dict, List, Optional, Union
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def ocr(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
language: Annotated[
Optional[str],
Field(
description="Language code for OCR (e.g., 'en', 'ru', 'fr', etc.). If not provided, uses config default.",
),
] = None,
) -> Dict[str, Union[str, List[Dict[str, Union[str, float, List[float]]]]]]:
"""
Performs Optical Character Recognition (OCR) on an image using EasyOCR.
This tool extracts text from images in various languages. The default language is English,
but you can specify other languages using their language codes (e.g., 'en', 'ru', 'fr', etc.).
Returns:
Dictionary containing the input image path and a list of detected text segments
with their text content, confidence scores, and bounding box coordinates.
"""
# Get configuration defaults
config = get_config()
# Use config default if language not provided
if language is None:
language = config.ocr.language
logger.info(f"Using config default language: {language}")
logger.info(f"OCR requested for image: {input_path} with language: {language}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
try:
# Import here to avoid loading dependencies if not needed
import cv2
import easyocr
logger.info("EasyOCR imported successfully")
# Read the image
logger.info(f"Reading image from: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}. The file may be corrupted or not an image.")
# Check image dimensions and convert to grayscale
logger.info(f"Image shape: {img.shape}")
if len(img.shape) == 3:
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
logger.info(f"Converted image to grayscale: {img_gray.shape}")
else:
img_gray = img
logger.info("Image is already grayscale")
# Create reader with specified language
logger.info(f"Creating EasyOCR reader for language: {language}")
reader = easyocr.Reader([language])
logger.info("EasyOCR reader created successfully")
# Perform OCR directly on the numpy array
logger.info("Starting OCR processing on image array")
results = reader.readtext(img_gray) # Pass the numpy array directly
logger.info(f"OCR processing completed with {len(results)} text segments found")
# Process results
text_segments = []
for i, result in enumerate(results):
# EasyOCR can return results in different formats depending on the version
# Handle both possible formats
if len(result) == 3:
# Format: (bbox, text, confidence)
bbox, text, confidence = result
elif len(result) == 4:
# Format: (bbox, text, confidence, _)
bbox, text, confidence, _ = result
else:
# Unknown format, try to extract what we can
logger.warning(f"Unexpected result format for segment {i}: {result}")
bbox = result[0] if len(result) > 0 else [[0, 0], [0, 0], [0, 0], [0, 0]]
text = result[1] if len(result) > 1 else ""
confidence = result[2] if len(result) > 2 else 0.0
# EasyOCR returns bounding box as 4 points (top-left, top-right, bottom-right, bottom-left)
# Convert to [x1, y1, x2, y2] format (top-left and bottom-right corners)
x_coords = [point[0] for point in bbox]
y_coords = [point[1] for point in bbox]
x1, y1 = min(x_coords), min(y_coords)
x2, y2 = max(x_coords), max(y_coords)
text_segments.append(
{
"text": text,
"confidence": float(confidence),
"bbox": [float(x1), float(y1), float(x2), float(y2)]
}
)
logger.debug(f"Processed segment {i}: text='{text[:30]}...' confidence={confidence:.2f}")
logger.info(f"OCR completed successfully for {input_path}")
return {"image_path": input_path, "text_segments": text_segments}
except ImportError as e:
if "easyocr" in str(e).lower():
error_msg = (
"EasyOCR is not installed. "
"Please install it first using: "
"pip install easyocr"
)
elif "cv2" in str(e).lower():
error_msg = (
"OpenCV (cv2) is not installed. "
"Please install it first using: "
"pip install opencv-python"
)
else:
error_msg = f"Required dependency not installed: {str(e)}"
logger.error(f"Import error: {error_msg}")
raise RuntimeError(error_msg) from None
except Exception as e:
# Provide more helpful error message
error_msg = f"Error performing OCR: {str(e)}\n"
if "not found" in str(e).lower() and "language" in str(e).lower():
error_msg += (
f"The language '{language}' is not supported or the language model "
f"could not be found. Please check available languages in EasyOCR documentation."
)
logger.error(f"Language not supported: {language}")
elif "permission denied" in str(e).lower():
error_msg += (
"Permission denied when trying to access the image file.\n"
"Try running the command with appropriate permissions."
)
logger.error(f"Permission denied accessing file: {input_path}")
else:
logger.error(f"OCR processing error: {str(e)}", exc_info=True)
raise RuntimeError(error_msg) from e
================================================
FILE: src/imagesorcery_mcp/tools/overlay.py
================================================
import os
from typing import Annotated, Optional
import cv2
import numpy as np
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def overlay(
base_image_path: Annotated[str, Field(description="Full path to the base image (must be a full path)")],
overlay_image_path: Annotated[
str, Field(description="Full path to the overlay image (must be a full path). This image can have transparency.")
],
x: Annotated[int, Field(description="X-coordinate of the top-left corner of the overlay image on the base image.")],
y: Annotated[int, Field(description="Y-coordinate of the top-left corner of the overlay image on the base image.")],
output_path: Annotated[
Optional[str],
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use the base image filename "
"with '_overlaid' suffix."
)
),
] = None,
) -> str:
"""
Overlays one image on top of another, handling transparency.
This tool places an overlay image onto a base image at a specified (x, y)
coordinate. If the overlay image has an alpha channel (e.g., a transparent PNG),
it will be blended correctly with the base image. If the overlay extends
beyond the boundaries of the base image, it will be cropped.
Returns:
Path to the resulting image.
"""
logger.info(f"Overlay tool requested for base image: {base_image_path}, overlay image: {overlay_image_path}")
# Check if input files exist
if not os.path.exists(base_image_path):
logger.error(f"Base image not found: {base_image_path}")
raise FileNotFoundError(f"Base image not found: {base_image_path}. Please provide a full path to the file.")
if not os.path.exists(overlay_image_path):
logger.error(f"Overlay image not found: {overlay_image_path}")
raise FileNotFoundError(f"Overlay image not found: {overlay_image_path}. Please provide a full path to the file.")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(base_image_path)
output_path = f"{file_name}_overlaid{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read images
base_img = cv2.imread(base_image_path)
overlay_img = cv2.imread(overlay_image_path, cv2.IMREAD_UNCHANGED)
if base_img is None:
logger.error(f"Failed to read base image: {base_image_path}")
raise ValueError(f"Failed to read base image: {base_image_path}")
if overlay_img is None:
logger.error(f"Failed to read overlay image: {overlay_image_path}")
raise ValueError(f"Failed to read overlay image: {overlay_image_path}")
# Get dimensions
base_h, base_w, _ = base_img.shape
overlay_h, overlay_w, _ = overlay_img.shape
# Handle coordinates and potential cropping of the overlay
x_start, y_start = x, y
x_end, y_end = x + overlay_w, y + overlay_h
overlay_x_start, overlay_y_start = 0, 0
overlay_x_end, overlay_y_end = overlay_w, overlay_h
if x_start < 0:
overlay_x_start = -x_start
x_start = 0
if y_start < 0:
overlay_y_start = -y_start
y_start = 0
if x_end > base_w:
overlay_x_end -= x_end - base_w
x_end = base_w
if y_end > base_h:
overlay_y_end -= y_end - base_h
y_end = base_h
if x_start >= x_end or y_start >= y_end:
logger.warning("Overlay is completely outside the base image. No changes made.")
cv2.imwrite(output_path, base_img)
return output_path
overlay_img = overlay_img[overlay_y_start:overlay_y_end, overlay_x_start:overlay_x_end]
roi = base_img[y_start:y_end, x_start:x_end]
if overlay_img.shape[2] == 4:
logger.info("Overlay image has alpha channel. Performing alpha blending.")
alpha = overlay_img[:, :, 3] / 255.0
overlay_colors = overlay_img[:, :, :3]
alpha_mask = cv2.merge([alpha, alpha, alpha])
blended_roi = (alpha_mask * overlay_colors) + ((1 - alpha_mask) * roi)
base_img[y_start:y_end, x_start:x_end] = blended_roi.astype(np.uint8)
else:
logger.info("Overlay image has no alpha channel. Pasting directly.")
base_img[y_start:y_end, x_start:x_end] = overlay_img
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
cv2.imwrite(output_path, base_img)
logger.info(f"Overlaid image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/resize.py
================================================
import os
from typing import Annotated, Optional
import cv2
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger and config
from imagesorcery_mcp.config import get_config
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def resize(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
width: Annotated[
Optional[int],
Field(
description=(
"Target width in pixels. "
"If None, will be calculated based on height "
"and preserve aspect ratio"
)
),
] = None,
height: Annotated[
Optional[int],
Field(
description=(
"Target height in pixels. "
"If None, will be calculated based on width "
"and preserve aspect ratio"
)
),
] = None,
scale_factor: Annotated[
Optional[float],
Field(
description=(
"Scale factor to resize the image "
"(e.g., 0.5 for half size, 2.0 for double size). "
"Overrides width and height if provided"
)
),
] = None,
interpolation: Annotated[
Optional[str],
Field(
description=(
"Interpolation method: 'nearest', 'linear', 'area', "
"'cubic', 'lanczos'. If not provided, uses config default."
)
),
] = None,
output_path: Annotated[
str,
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_resized' suffix."
)
),
] = None,
) -> str:
"""
Resize an image using OpenCV.
The function can resize an image in three ways:
1. By specifying both width and height
2. By specifying either width or height (preserving aspect ratio)
3. By specifying a scale factor
Returns:
Path to the resized image
"""
# Get configuration defaults
config = get_config()
# Use config default if interpolation not provided
if interpolation is None:
interpolation = config.resize.interpolation
logger.info(f"Using config default interpolation: {interpolation}")
logger.info(f"Resize tool requested for image: {input_path}, width: {width}, height: {height}, scale_factor: {scale_factor}, interpolation: {interpolation}")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
logger.info(f"Input file found: {input_path}")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_resized{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Get original dimensions
orig_height, orig_width = img.shape[:2]
logger.debug(f"Original dimensions: {orig_width}x{orig_height}")
# Determine interpolation method
interpolation_methods = {
"nearest": cv2.INTER_NEAREST,
"linear": cv2.INTER_LINEAR,
"area": cv2.INTER_AREA,
"cubic": cv2.INTER_CUBIC,
"lanczos": cv2.INTER_LANCZOS4,
}
if interpolation not in interpolation_methods:
logger.error(f"Invalid interpolation method: {interpolation}")
raise ValueError(
f"Invalid interpolation method. Choose from: "
f"{', '.join(interpolation_methods.keys())}"
)
interp = interpolation_methods[interpolation]
logger.debug(f"Using interpolation method: {interpolation} ({interp})")
# Calculate target dimensions
if scale_factor is not None:
# Resize by scale factor
target_width = int(orig_width * scale_factor)
target_height = int(orig_height * scale_factor)
logger.info(f"Resizing by scale factor {scale_factor} to {target_width}x{target_height}")
elif width is not None and height is not None:
# Resize to specific dimensions
target_width = width
target_height = height
logger.info(f"Resizing to specific dimensions: {target_width}x{target_height}")
elif width is not None:
# Resize to specific width, maintain aspect ratio
target_width = width
target_height = int(orig_height * (width / orig_width))
logger.info(f"Resizing to width {width}, maintaining aspect ratio. Target height: {target_height}")
elif height is not None:
# Resize to specific height, maintain aspect ratio
target_height = height
target_width = int(orig_width * (height / orig_height))
logger.info(f"Resizing to height {height}, maintaining aspect ratio. Target width: {target_width}")
else:
logger.error("No resize parameters provided (width, height, or scale_factor)")
raise ValueError("Must provide either width, height, or scale_factor")
# Resize the image
logger.info(f"Performing resize to {target_width}x{target_height}")
resized_img = cv2.resize(
img, (target_width, target_height), interpolation=interp
)
logger.info(f"Image resized successfully. New shape: {resized_img.shape}")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the resized image
logger.info(f"Saving resized image to: {output_path}")
cv2.imwrite(output_path, resized_img)
logger.info(f"Resized image saved successfully to: {output_path}")
return output_path
================================================
FILE: src/imagesorcery_mcp/tools/rotate.py
================================================
import os
from typing import Annotated
import cv2
import imutils
from fastmcp import FastMCP
from pydantic import Field
# Import the central logger
from imagesorcery_mcp.logging_config import logger
def register_tool(mcp: FastMCP):
@mcp.tool()
def rotate(
input_path: Annotated[str, Field(description="Full path to the input image (must be a full path)")],
angle: Annotated[
float,
Field(
description=(
"Angle of rotation in degrees (positive for counterclockwise)"
)
),
],
output_path: Annotated[
str,
Field(
description=(
"Full path to save the output image (must be a full path). "
"If not provided, will use input filename "
"with '_rotated' suffix."
)
),
] = None,
) -> str:
"""
Rotate an image using imutils.rotate_bound function.
The function rotates the image by the specified angle in degrees.
Positive angles represent counterclockwise rotation.
The rotate_bound function ensures the entire rotated image is visible
by automatically adjusting the output image size.
Returns:
Path to the rotated image
"""
logger.info(f"Rotate tool requested for image: {input_path} with angle: {angle} degrees")
# Check if input file exists
if not os.path.exists(input_path):
logger.error(f"Input file not found: {input_path}")
raise FileNotFoundError(f"Input file not found: {input_path}. Please provide a full path to the file.")
logger.info(f"Input file found: {input_path}")
# Generate output path if not provided
if not output_path:
file_name, file_ext = os.path.splitext(input_path)
output_path = f"{file_name}_rotated{file_ext}"
logger.info(f"Output path not provided, generated: {output_path}")
# Read the image using OpenCV
logger.info(f"Reading image: {input_path}")
img = cv2.imread(input_path)
if img is None:
logger.error(f"Failed to read image: {input_path}")
raise ValueError(f"Failed to read image: {input_path}")
logger.info(f"Image read successfully. Shape: {img.shape}")
# Rotate the image using imutils.rotate_bound
logger.info(f"Rotating image by {angle} degrees")
rotated_img = imutils.rotate_bound(img, angle)
logger.info(f"Image rotated successfully. New shape: {rotated_img.shape}")
# Create directory for output if it doesn't exist
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
logger.info(f"Output directory does not exist, creating: {output_dir}")
os.makedirs(output_dir)
logger.info(f"Output directory created: {output_dir}")
# Save the rotated image
logger.info(f"Saving rotated image to: {output_path}")
cv2.imwrite(output_path, rotated_img)
logger.info(f"Rotated image saved successfully to: {output_path}")
return output_path
================================================
FILE: tests/conftest.py
================================================
"""
Pytest configuration file for setting up test environment.
"""
import os
def pytest_configure(config):
"""Configure pytest to set DISABLE_TELEMETRY environment variable for all tests."""
# Set DISABLE_TELEMETRY environment variable for all tests
os.environ['DISABLE_TELEMETRY'] = 'true'
def pytest_unconfigure(config):
"""Clean up after tests if needed."""
# Optionally remove the environment variable after tests
if 'DISABLE_TELEMETRY' in os.environ:
del os.environ['DISABLE_TELEMETRY']
================================================
FILE: tests/prompts/test_remove_background.py
================================================
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
class TestRemoveBackgroundPromptDefinition:
"""Tests for the remove-background prompt definition and metadata."""
@pytest.mark.asyncio
async def test_remove_background_in_prompts_list(self, mcp_server: FastMCP):
"""Tests that remove-background prompt is in the list of available prompts."""
async with Client(mcp_server) as client:
prompts = await client.list_prompts()
# Verify that prompts list is not empty
assert prompts, "Prompts list should not be empty"
# Check if remove-background is in the list of prompts
prompt_names = [prompt.name for prompt in prompts]
assert "remove-background" in prompt_names, (
"remove-background prompt should be in the list of available prompts"
)
@pytest.mark.asyncio
async def test_remove_background_description(self, mcp_server: FastMCP):
"""Tests that remove-background prompt has the correct description."""
async with Client(mcp_server) as client:
prompts = await client.list_prompts()
remove_bg_prompt = next(
(prompt for prompt in prompts if prompt.name == "remove-background"), None
)
# Check description
assert remove_bg_prompt.description, (
"remove-background prompt should have a description"
)
assert "background removal" in remove_bg_prompt.description.lower(), (
"Description should mention background removal"
)
@pytest.mark.asyncio
async def test_remove_background_parameters(self, mcp_server: FastMCP):
"""Tests that remove-background prompt has the correct parameter structure."""
async with Client(mcp_server) as client:
prompts = await client.list_prompts()
remove_bg_prompt = next(
(prompt for prompt in prompts if prompt.name == "remove-background"), None
)
# Check arguments schema
assert hasattr(remove_bg_prompt, "arguments"), (
"remove-background prompt should have an arguments field"
)
assert isinstance(remove_bg_prompt.arguments, list), (
"arguments should be a list of PromptArgument objects"
)
# Get argument names for easier checking
arg_names = [arg.name for arg in remove_bg_prompt.arguments]
# Check required parameters
required_params = ["image_path"]
for param in required_params:
assert param in arg_names, (
f"remove-background prompt should have a '{param}' argument"
)
# Check optional parameters
assert "target_objects" in arg_names, (
"remove-background prompt should have a 'target_objects' argument"
)
assert "output_path" in arg_names, (
"remove-background prompt should have an 'output_path' argument"
)
# Check parameter requirements
image_path_arg = next(arg for arg in remove_bg_prompt.arguments if arg.name == "image_path")
target_objects_arg = next(arg for arg in remove_bg_prompt.arguments if arg.name == "target_objects")
output_path_arg = next(arg for arg in remove_bg_prompt.arguments if arg.name == "output_path")
assert image_path_arg.required, "image_path should be required"
assert not target_objects_arg.required, "target_objects should be optional"
assert not output_path_arg.required, "output_path should be optional"
class TestRemoveBackgroundPromptExecution:
"""Tests for the remove-background prompt execution and results."""
@pytest.mark.asyncio
async def test_remove_background_prompt_execution(self, mcp_server: FastMCP):
"""Tests the remove-background prompt execution and return value."""
async with Client(mcp_server) as client:
result = await client.get_prompt(
"remove-background",
{
"image_path": "/test/path/image.jpg",
"target_objects": "person",
"output_path": "/test/path/output.png",
},
)
# Check that the prompt returned a result
assert result.messages, "Prompt should return messages"
assert len(result.messages) > 0, "Prompt should return at least one message"
# Check the content of the returned prompt
prompt_content = result.messages[0].content.text
assert "Step 1:" in prompt_content, "Prompt should contain step-by-step instructions"
assert "detect" not in prompt_content, "Prompt should not mention detect tool when target_objects specified"
assert "fill" in prompt_content, "Prompt should mention fill tool"
assert "find" in prompt_content, "Prompt should mention find tool when target_objects specified"
assert "person" in prompt_content, "Prompt should include the target objects"
assert "/test/path/image.jpg" in prompt_content, "Prompt should include the input path"
assert "/test/path/output.png" in prompt_content, "Prompt should include the output path"
@pytest.mark.asyncio
async def test_remove_background_default_parameters(self, mcp_server: FastMCP):
"""Tests the remove-background prompt with default parameters."""
async with Client(mcp_server) as client:
result = await client.get_prompt(
"remove-background",
{
"image_path": "/test/path/photo.jpg",
},
)
# Check that the prompt returned a result
assert result.messages, "Prompt should return messages"
prompt_content = result.messages[0].content.text
# Check default behavior (no target_objects specified)
assert "find" not in prompt_content, (
"Prompt should not use find tool when no target_objects specified"
)
assert "detect" in prompt_content, (
"Prompt should use detect tool when no target_objects specified"
)
assert "/test/path/photo_no_background.png" in prompt_content, (
"Prompt should auto-generate output path"
)
@pytest.mark.asyncio
async def test_remove_background_custom_target(self, mcp_server: FastMCP):
"""Tests the remove-background prompt with custom target objects."""
async with Client(mcp_server) as client:
result = await client.get_prompt(
"remove-background",
{
"image_path": "/test/path/car.jpg",
"target_objects": "red car",
},
)
# Check that the prompt returned a result
assert result.messages, "Prompt should return messages"
prompt_content = result.messages[0].content.text
# Check custom target objects is used
assert "red car" in prompt_content, (
"Prompt should use custom target_objects 'red car'"
)
assert "preserving the red car" in prompt_content, (
"Prompt should mention preserving the custom target"
)
assert "find" in prompt_content, (
"Prompt should include find tool when target_objects specified"
)
assert "detect" not in prompt_content, (
"Prompt should not include detect tool when target_objects specified"
)
================================================
FILE: tests/resources/test_models.py
================================================
import json
import os
from pathlib import Path
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_models_dir(tmp_path):
"""Create a temporary models directory with test model files."""
# Save the original models directory path
original_models_dir = Path("models")
original_exists = original_models_dir.exists()
# Create a temporary models directory
models_dir = tmp_path / "models"
models_dir.mkdir(exist_ok=True)
# Create some test model files
test_models = ["yolov8n.pt", "yolov8m.pt", "custom_model.pt"]
for model_name in test_models:
model_path = models_dir / model_name
# Create an empty file
model_path.touch()
# Create a test model descriptions file
descriptions = {
"yolov8n.pt": "YOLOv8 Nano - Smallest and fastest model, suitable for edge devices with limited resources.",
"yolov8m.pt": "YOLOv8 Medium - Default model with good balance between accuracy and speed."
}
with open(models_dir / "model_descriptions.json", "w") as f:
json.dump(descriptions, f)
# Temporarily replace the models directory
if original_exists:
# Rename the original directory
temp_original = original_models_dir.with_name("models_original_backup")
original_models_dir.rename(temp_original)
# Create a symlink to our temporary directory
os.symlink(models_dir, original_models_dir)
yield models_dir
# Clean up: remove the symlink
if os.path.islink(original_models_dir):
os.unlink(original_models_dir)
# Restore the original directory if it existed
if original_exists:
temp_original.rename(original_models_dir)
class TestModelsResourceDefinition:
"""Tests for the models resource definition and metadata."""
@pytest.mark.asyncio
async def test_models_in_resources_list(self, mcp_server: FastMCP):
"""Tests that models resource is in the list of available resources."""
async with Client(mcp_server) as client:
resources = await client.list_resources()
# Verify that resources list is not empty
assert resources, "Resources list should not be empty"
# Check if models://list is in the list of resources
# Convert AnyUrl objects to strings
resource_uris = [str(resource.uri) for resource in resources]
assert "models://list" in resource_uris, (
f"models://list resource should be in the list of available resources. "
f"Found: {resource_uris}"
)
@pytest.mark.asyncio
async def test_models_resource_metadata(self, mcp_server: FastMCP):
"""Tests that models resource has the correct metadata."""
async with Client(mcp_server) as client:
resources = await client.list_resources()
models_resource = next(
(resource for resource in resources if str(resource.uri) == "models://list"), None
)
# Check that the resource exists
assert models_resource is not None, f"models://list resource should exist. Found resources: {[str(r.uri) for r in resources]}"
# Check name - it appears FastMCP uses the full URI as the name
assert models_resource.name == "list_models", f"Resource name should be 'list_models' but got '{models_resource.name}'"
# Since description is None, let's skip this check for now or check for None
# The actual resource implementation might not set a description at the transport level
class TestModelsResourceExecution:
"""Tests for the models resource execution and results."""
@pytest.mark.asyncio
async def test_models_resource_execution(self, mcp_server: FastMCP, test_models_dir):
"""Tests the models resource execution and return value."""
async with Client(mcp_server) as client:
result = await client.read_resource("models://list")
# Check that the resource returned a result
assert len(result) == 1
# Parse the result
result_dict = json.loads(result[0].text)
# Check that the result has the expected structure
assert "models" in result_dict
assert isinstance(result_dict["models"], list)
# Check that we have the expected number of models
assert len(result_dict["models"]) == 3
# Check that each model has the expected fields
for model in result_dict["models"]:
assert "name" in model
assert "description" in model
assert "path" in model
# Check specific models
if model["name"] == "yolov8n.pt":
assert "Smallest and fastest" in model["description"]
elif model["name"] == "yolov8m.pt":
assert "Default model" in model["description"]
elif model["name"] == "custom_model.pt":
assert model["description"] == "Model 'custom_model.pt' not found in model_descriptions.json (total descriptions: 2)"
@pytest.mark.asyncio
async def test_models_empty_directory(self, mcp_server: FastMCP, tmp_path):
"""Tests the models resource with an empty models directory."""
# Save the original models directory path
original_models_dir = Path("models")
original_exists = original_models_dir.exists()
if original_exists:
# Rename the original directory
temp_original = original_models_dir.with_name("models_original_backup")
original_models_dir.rename(temp_original)
# Create an empty models directory
empty_models_dir = tmp_path / "empty_models"
empty_models_dir.mkdir(exist_ok=True)
os.symlink(empty_models_dir, original_models_dir)
try:
async with Client(mcp_server) as client:
result = await client.read_resource("models://list")
# Check that the resource returned a result
assert len(result) == 1
# Parse the result
result_dict = json.loads(result[0].text)
# Check that the result has the expected structure
assert "models" in result_dict
assert isinstance(result_dict["models"], list)
# Check that the list is empty
assert len(result_dict["models"]) == 0
finally:
# Clean up: remove the symlink
if os.path.islink(original_models_dir):
os.unlink(original_models_dir)
# Restore the original directory if it existed
if original_exists:
temp_original.rename(original_models_dir)
@pytest.mark.asyncio
async def test_models_no_directory(self, mcp_server: FastMCP, tmp_path):
"""Tests the models resource when the models directory doesn't exist."""
# Save the original models directory path
original_models_dir = Path("models")
original_exists = original_models_dir.exists()
# Remove the original directory if it exists
if original_exists:
# Rename the original directory
temp_original = original_models_dir.with_name("models_original_backup")
original_models_dir.rename(temp_original)
try:
async with Client(mcp_server) as client:
result = await client.read_resource("models://list")
# Check that the resource returned a result
assert len(result) == 1
# Parse the result
result_dict = json.loads(result[0].text)
# Check that the result has the expected structure
assert "models" in result_dict
assert isinstance(result_dict["models"], list)
# Check that the list is empty when directory doesn't exist
assert len(result_dict["models"]) == 0
finally:
# Restore the original directory if it existed
if original_exists:
temp_original.rename(original_models_dir)
@pytest.mark.asyncio
async def test_models_with_subdirectories(self, mcp_server: FastMCP, tmp_path):
"""Tests the models resource with models in subdirectories."""
# Save the original models directory path
original_models_dir = Path("models")
original_exists = original_models_dir.exists()
if original_exists:
# Rename the original directory
temp_original = original_models_dir.with_name("models_original_backup")
original_models_dir.rename(temp_original)
# Create a models directory with subdirectories
models_dir = tmp_path / "models"
models_dir.mkdir(exist_ok=True)
# Create subdirectories and model files
(models_dir / "detection").mkdir()
(models_dir / "detection" / "yolov8n.pt").touch()
(models_dir / "segmentation").mkdir()
(models_dir / "segmentation" / "sam.onnx").touch()
(models_dir / "root_model.pt").touch()
# Create descriptions file
descriptions = {
"detection/yolov8n.pt": "YOLOv8 Nano for object detection",
"segmentation/sam.onnx": "Segment Anything Model",
"root_model.pt": "Model in root directory"
}
with open(models_dir / "model_descriptions.json", "w") as f:
json.dump(descriptions, f)
# Create a symlink to our temporary directory
os.symlink(models_dir, original_models_dir)
try:
async with Client(mcp_server) as client:
result = await client.read_resource("models://list")
# Check that the resource returned a result
assert len(result) == 1
# Parse the result
result_dict = json.loads(result[0].text)
# Check that the result has the expected structure
assert "models" in result_dict
assert isinstance(result_dict["models"], list)
# Check that we have all the models
assert len(result_dict["models"]) == 3
# Check model names
model_names = [model["name"] for model in result_dict["models"]]
assert "detection/yolov8n.pt" in model_names
assert "segmentation/sam.onnx" in model_names
assert "root_model.pt" in model_names
# Check descriptions
for model in result_dict["models"]:
if model["name"] == "detection/yolov8n.pt":
assert model["description"] == "YOLOv8 Nano for object detection"
elif model["name"] == "segmentation/sam.onnx":
assert model["description"] == "Segment Anything Model"
elif model["name"] == "root_model.pt":
assert model["description"] == "Model in root directory"
finally:
# Clean up: remove the symlink
if os.path.islink(original_models_dir):
os.unlink(original_models_dir)
# Restore the original directory if it existed
if original_exists:
temp_original.rename(original_models_dir)
@pytest.mark.asyncio
async def test_models_ignores_non_model_files(self, mcp_server: FastMCP, tmp_path):
"""Tests that the models resource ignores non-model files."""
# Save the original models directory path
original_models_dir = Path("models")
original_exists = original_models_dir.exists()
if original_exists:
# Rename the original directory
temp_original = original_models_dir.with_name("models_original_backup")
original_models_dir.rename(temp_original)
# Create a models directory with various files
models_dir = tmp_path / "models"
models_dir.mkdir(exist_ok=True)
# Create model and non-model files
(models_dir / "model1.pt").touch()
(models_dir / "model2.onnx").touch()
(models_dir / "readme.txt").touch()
(models_dir / "config.json").touch()
(models_dir / "image.jpg").touch()
# Create descriptions file
descriptions = {
"model1.pt": "PyTorch model",
"model2.onnx": "ONNX model"
}
with open(models_dir / "model_descriptions.json", "w") as f:
json.dump(descriptions, f)
# Create a symlink to our temporary directory
os.symlink(models_dir, original_models_dir)
try:
async with Client(mcp_server) as client:
result = await client.read_resource("models://list")
# Check that the resource returned a result
assert len(result) == 1
# Parse the result
result_dict = json.loads(result[0].text)
# Check that the result has the expected structure
assert "models" in result_dict
assert isinstance(result_dict["models"], list)
# Check that we have only the model files
assert len(result_dict["models"]) == 2
# Check model names
model_names = [model["name"] for model in result_dict["models"]]
assert "model1.pt" in model_names
assert "model2.onnx" in model_names
# Non-model files should not be included
assert "readme.txt" not in model_names
assert "config.json" not in model_names
assert "image.jpg" not in model_names
assert "model_descriptions.json" not in model_names
finally:
# Clean up: remove the symlink
if os.path.islink(original_models_dir):
os.unlink(original_models_dir)
# Restore the original directory if it existed
if original_exists:
temp_original.rename(original_models_dir)
================================================
FILE: tests/test_config.py
================================================
"""
Tests for the configuration management system.
"""
import os
import tempfile
from pathlib import Path
import pytest
import toml
from imagesorcery_mcp.config import (
ConfigManager,
ImageSorceryConfig,
get_config,
get_config_manager,
)
class TestImageSorceryConfig:
"""Tests for the ImageSorceryConfig model."""
def test_default_values(self):
"""Test that default configuration values are correct."""
config = ImageSorceryConfig()
# Detection defaults
assert config.detection.confidence_threshold == 0.75
assert config.detection.default_model == "yoloe-11l-seg-pf.pt"
# Find defaults
assert config.find.confidence_threshold == 0.75
assert config.find.default_model == "yoloe-11l-seg.pt"
# Blur defaults
assert config.blur.strength == 15
# Text defaults
assert config.text.font_scale == 1.0
# Drawing defaults
assert config.drawing.color == [0, 0, 0]
assert config.drawing.thickness == 1
# OCR defaults
assert config.ocr.language == "en"
# Resize defaults
assert config.resize.interpolation == "linear"
# Telemetry defaults
assert config.telemetry.enabled is False
def test_validation_confidence_threshold(self):
"""Test validation of confidence thresholds."""
# Valid values
config = ImageSorceryConfig(detection={"confidence_threshold": 0.5})
assert config.detection.confidence_threshold == 0.5
# Invalid values
with pytest.raises(ValueError):
ImageSorceryConfig(detection={"confidence_threshold": 1.5})
with pytest.raises(ValueError):
ImageSorceryConfig(detection={"confidence_threshold": -0.1})
def test_validation_blur_strength(self):
"""Test validation of blur strength."""
# Valid odd values
config = ImageSorceryConfig(blur={"strength": 21})
assert config.blur.strength == 21
# Invalid even values
with pytest.raises(ValueError):
ImageSorceryConfig(blur={"strength": 20})
def test_validation_drawing_color(self):
"""Test validation of drawing color."""
# Valid color
config = ImageSorceryConfig(drawing={"color": [255, 128, 0]})
assert config.drawing.color == [255, 128, 0]
# Invalid color values
with pytest.raises(ValueError):
ImageSorceryConfig(drawing={"color": [256, 0, 0]})
with pytest.raises(ValueError):
ImageSorceryConfig(drawing={"color": [-1, 0, 0]})
# Invalid color length
with pytest.raises(ValueError):
ImageSorceryConfig(drawing={"color": [255, 0]})
def test_validation_interpolation(self):
"""Test validation of resize interpolation."""
# Valid interpolation methods
for method in ["nearest", "linear", "area", "cubic", "lanczos"]:
config = ImageSorceryConfig(resize={"interpolation": method})
assert config.resize.interpolation == method
# Invalid interpolation method
with pytest.raises(ValueError):
ImageSorceryConfig(resize={"interpolation": "invalid"})
def test_validation_telemetry_enabled(self):
"""Test validation of telemetry enabled flag."""
# Valid values
config = ImageSorceryConfig(telemetry={"enabled": True})
assert config.telemetry.enabled is True
config = ImageSorceryConfig(telemetry={"enabled": False})
assert config.telemetry.enabled is False
# Invalid values
with pytest.raises(ValueError):
ImageSorceryConfig(telemetry={"enabled": "not_a_bool"})
class TestConfigManager:
"""Tests for the ConfigManager class."""
def setup_method(self):
"""Set up test environment."""
self.temp_dir = tempfile.mkdtemp()
self.original_cwd = os.getcwd()
os.chdir(self.temp_dir)
def teardown_method(self):
"""Clean up test environment."""
os.chdir(self.original_cwd)
import shutil
shutil.rmtree(self.temp_dir)
def test_config_file_creation(self):
"""Test that config file is created if it doesn't exist."""
ConfigManager()
# Check that config.toml was created
assert Path("config.toml").exists()
# Check that it contains valid TOML
with open("config.toml", "r") as f:
config_data = toml.load(f)
assert "detection" in config_data
assert "blur" in config_data
def test_config_loading_from_file(self):
"""Test loading configuration from existing file."""
# Create a config file with custom values
config_data = {
"detection": {"confidence_threshold": 0.8},
"blur": {"strength": 21}
}
with open("config.toml", "w") as f:
toml.dump(config_data, f)
config_manager = ConfigManager()
config = config_manager.config
assert config.detection.confidence_threshold == 0.8
assert config.blur.strength == 21
def test_runtime_updates(self):
"""Test runtime configuration updates."""
config_manager = ConfigManager()
# Update configuration
updates = {
"detection.confidence_threshold": 0.9,
"text.font_scale": 2.0
}
updated_config = config_manager.update_config(updates, persist=False)
assert updated_config["detection"]["confidence_threshold"] == 0.9
assert updated_config["text"]["font_scale"] == 2.0
# Check that file wasn't modified
with open("config.toml", "r") as f:
file_config = toml.load(f)
# Should still have defaults since we didn't persist
assert file_config.get("detection", {}).get("confidence_threshold", 0.75) == 0.75
def test_persistent_updates(self):
"""Test persistent configuration updates."""
config_manager = ConfigManager()
# Update configuration with persistence
updates = {
"detection.confidence_threshold": 0.85,
"ocr.language": "fr"
}
config_manager.update_config(updates, persist=True)
# Check that file was modified
with open("config.toml", "r") as f:
file_config = toml.load(f)
assert file_config["detection"]["confidence_threshold"] == 0.85
assert file_config["ocr"]["language"] == "fr"
def test_persistent_telemetry_update(self):
"""Test persistent telemetry configuration update."""
config_manager = ConfigManager()
# Update telemetry with persistence
updates = {
"telemetry.enabled": True
}
config_manager.update_config(updates, persist=True)
# Check that file was modified
with open("config.toml", "r") as f:
file_config = toml.load(f)
assert file_config["telemetry"]["enabled"] is True
# Verify the runtime config also reflects the change
config = config_manager.config
assert config.telemetry.enabled is True
def test_validation_in_updates(self):
"""Test that updates are validated."""
config_manager = ConfigManager()
# Invalid confidence threshold
with pytest.raises(ValueError):
config_manager.update_config({"detection.confidence_threshold": 1.5})
# Invalid blur strength
with pytest.raises(ValueError):
config_manager.update_config({"blur.strength": 20})
def test_reset_runtime_overrides(self):
"""Test resetting runtime overrides."""
config_manager = ConfigManager()
# Make runtime changes
config_manager.update_config({
"detection.confidence_threshold": 0.9,
"text.font_scale": 2.0
}, persist=False)
# Verify changes
config = config_manager.config
assert config.detection.confidence_threshold == 0.9
assert config.text.font_scale == 2.0
# Reset
config_manager.reset_runtime_overrides()
# Verify reset
config = config_manager.config
assert config.detection.confidence_threshold == 0.75 # Back to default
assert config.text.font_scale == 1.0 # Back to default
def test_get_runtime_overrides(self):
"""Test getting current runtime overrides."""
config_manager = ConfigManager()
# Initially no overrides
assert config_manager.get_runtime_overrides() == {}
# Add some overrides
config_manager.update_config({
"detection.confidence_threshold": 0.9
}, persist=False)
overrides = config_manager.get_runtime_overrides()
assert overrides["detection.confidence_threshold"] == 0.9
class TestGlobalConfigFunctions:
"""Tests for global configuration functions."""
def setup_method(self):
"""Set up test environment."""
self.temp_dir = tempfile.mkdtemp()
self.original_cwd = os.getcwd()
os.chdir(self.temp_dir)
# Reset global config manager
import imagesorcery_mcp.config
imagesorcery_mcp.config._config_manager = None
def teardown_method(self):
"""Clean up test environment."""
os.chdir(self.original_cwd)
import shutil
shutil.rmtree(self.temp_dir)
# Reset global config manager
import imagesorcery_mcp.config
imagesorcery_mcp.config._config_manager = None
def test_get_config_manager(self):
"""Test get_config_manager function."""
manager1 = get_config_manager()
manager2 = get_config_manager()
# Should return the same instance
assert manager1 is manager2
def test_get_config(self):
"""Test get_config function."""
config = get_config()
assert isinstance(config, ImageSorceryConfig)
assert config.detection.confidence_threshold == 0.75
================================================
FILE: tests/test_logging.py
================================================
import inspect
import logging
import os
import re
import tempfile
import time
from datetime import datetime
import pytest
from imagesorcery_mcp.logging_config import logger as imagesorcery_logger
@pytest.fixture
def temp_log_file():
"""Create a temporary log file for testing."""
fd, path = tempfile.mkstemp(suffix='.log')
yield path
os.close(fd)
os.unlink(path)
def test_log_structure_and_components(temp_log_file):
"""
Test that logs have the correct structure and all components are properly formatted
using the actual logging configuration from the project.
"""
# Get the actual imagesorcery logger initialized by logging_config
# Create a temporary handler to capture logs
handler = logging.FileHandler(temp_log_file)
# Use the same formatter as the original logger
original_formatter = imagesorcery_logger.handlers[0].formatter
handler.setFormatter(original_formatter)
imagesorcery_logger.addHandler(handler)
# Generate a test log with a unique message
test_message = f"Test message generated at {time.time()}"
line_num = inspect.currentframe().f_lineno + 1
imagesorcery_logger.info(test_message)
# Remove the temporary handler
imagesorcery_logger.removeHandler(handler)
# Read the log file
with open(temp_log_file, 'r') as f:
log_content = f.read().strip()
# Log format regex pattern to capture each component
# This pattern should match the format defined in logging_config.py
log_pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - ([\w\.]+)\.(\w+):(\d+) - (\w+) - (.+)'
match = re.match(log_pattern, log_content)
# Verify we matched the pattern
assert match, f"Log entry doesn't match expected pattern. Log content: {log_content}"
# Extract components
timestamp, logger_name, module_name, log_line_num, level, message = match.groups()
# Verify each component
# 1. Timestamp should be parseable
try:
datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S,%f')
except ValueError:
pytest.fail(f"Invalid timestamp format: {timestamp}")
# 2. Logger name should match what we set
assert logger_name == "imagesorcery", f"Expected logger name 'imagesorcery', got '{logger_name}'"
# 3. Module name should be this test module
assert module_name == "test_logging", f"Expected module name 'test_logging', got '{module_name}'"
# 4. Line number should match our recording
assert int(log_line_num) == line_num, f"Expected line number {line_num}, got {log_line_num}"
# 5. Level should be what we logged
assert level == "INFO", f"Expected log level 'INFO', got '{level}'"
# 6. Message should match what we logged
assert message == test_message, f"Expected message '{test_message}', got '{message}'"
def test_different_modules_log_correctly(temp_log_file):
"""Test that logs from different modules include correct module names."""
# Get the actual imagesorcery logger initialized by logging_config
# Setup a new handler for our test log file using the same formatter
handler = logging.FileHandler(temp_log_file)
original_formatter = imagesorcery_logger.handlers[0].formatter
handler.setFormatter(original_formatter)
# Add our handler to the logger
imagesorcery_logger.addHandler(handler)
# Log a message and get the current line number
line_num = inspect.currentframe().f_lineno + 1
imagesorcery_logger.info("Test message from test function")
# Remove our handler
imagesorcery_logger.removeHandler(handler)
# Check the log file content
with open(temp_log_file, 'r') as f:
log_content = f.read()
# Verify the module name and line number are in the log
assert f'imagesorcery.test_logging:{line_num}' in log_content, "Log doesn't contain module info"
assert 'Test message from test function' in log_content, "Log message not written correctly"
def test_different_log_levels(temp_log_file):
"""
Test that different log levels are correctly formatted and filtered
according to the logger's level setting.
"""
# Get the actual imagesorcery logger initialized by logging_config
# Store the original level to restore it later
original_level = imagesorcery_logger.level
# Create a temporary handler to capture logs
handler = logging.FileHandler(temp_log_file)
original_formatter = imagesorcery_logger.handlers[0].formatter
handler.setFormatter(original_formatter)
imagesorcery_logger.addHandler(handler)
try:
# Test with different log levels
# 1. First test with DEBUG level (lower than default INFO)
imagesorcery_logger.setLevel(logging.DEBUG)
# Log messages at different levels
debug_msg = "This is a DEBUG message"
info_msg = "This is an INFO message"
warning_msg = "This is a WARNING message"
error_msg = "This is an ERROR message"
critical_msg = "This is a CRITICAL message"
imagesorcery_logger.debug(debug_msg)
imagesorcery_logger.info(info_msg)
imagesorcery_logger.warning(warning_msg)
imagesorcery_logger.error(error_msg)
imagesorcery_logger.critical(critical_msg)
# Read the log file
with open(temp_log_file, 'r') as f:
debug_level_logs = f.readlines()
# There should be 5 log entries (one for each level) when set to DEBUG
assert len(debug_level_logs) == 5, f"Expected 5 log entries at DEBUG level, got {len(debug_level_logs)}"
# Verify each log level appears in the correct entry
assert "DEBUG" in debug_level_logs[0], f"First log should be DEBUG: {debug_level_logs[0]}"
assert "INFO" in debug_level_logs[1], f"Second log should be INFO: {debug_level_logs[1]}"
assert "WARNING" in debug_level_logs[2], f"Third log should be WARNING: {debug_level_logs[2]}"
assert "ERROR" in debug_level_logs[3], f"Fourth log should be ERROR: {debug_level_logs[3]}"
assert "CRITICAL" in debug_level_logs[4], f"Fifth log should be CRITICAL: {debug_level_logs[4]}"
# Verify messages are correctly logged
assert debug_msg in debug_level_logs[0], f"DEBUG message not correctly logged: {debug_level_logs[0]}"
assert info_msg in debug_level_logs[1], f"INFO message not correctly logged: {debug_level_logs[1]}"
assert warning_msg in debug_level_logs[2], f"WARNING message not correctly logged: {debug_level_logs[2]}"
assert error_msg in debug_level_logs[3], f"ERROR message not correctly logged: {debug_level_logs[3]}"
assert critical_msg in debug_level_logs[4], f"CRITICAL message not correctly logged: {debug_level_logs[4]}"
# 2. Now test with INFO level (default level)
# Clear the log file first
open(temp_log_file, 'w').close()
imagesorcery_logger.setLevel(logging.INFO)
# Log messages at different levels again
imagesorcery_logger.debug("This shouldn't appear in the log")
imagesorcery_logger.info(info_msg)
imagesorcery_logger.warning(warning_msg)
imagesorcery_logger.error(error_msg)
imagesorcery_logger.critical(critical_msg)
# Read the log file again
with open(temp_log_file, 'r') as f:
info_level_logs = f.readlines()
# There should be 4 log entries (DEBUG should be filtered out)
assert len(info_level_logs) == 4, f"Expected 4 log entries at INFO level, got {len(info_level_logs)}"
# Verify each log level appears in the correct entry
assert "INFO" in info_level_logs[0], f"First log should be INFO: {info_level_logs[0]}"
assert "WARNING" in info_level_logs[1], f"Second log should be WARNING: {info_level_logs[1]}"
assert "ERROR" in info_level_logs[2], f"Third log should be ERROR: {info_level_logs[2]}"
assert "CRITICAL" in info_level_logs[3], f"Fourth log should be CRITICAL: {info_level_logs[3]}"
# 3. Test with WARNING level
# Clear the log file first
open(temp_log_file, 'w').close()
imagesorcery_logger.setLevel(logging.WARNING)
# Log messages at different levels again
imagesorcery_logger.debug("This shouldn't appear in the log")
imagesorcery_logger.info("This shouldn't appear in the log either")
imagesorcery_logger.warning(warning_msg)
imagesorcery_logger.error(error_msg)
imagesorcery_logger.critical(critical_msg)
# Read the log file again
with open(temp_log_file, 'r') as f:
warning_level_logs = f.readlines()
# There should be 3 log entries (DEBUG and INFO should be filtered out)
assert len(warning_level_logs) == 3, f"Expected 3 log entries at WARNING level, got {len(warning_level_logs)}"
# Verify each log level appears in the correct entry
assert "WARNING" in warning_level_logs[0], f"First log should be WARNING: {warning_level_logs[0]}"
assert "ERROR" in warning_level_logs[1], f"Second log should be ERROR: {warning_level_logs[1]}"
assert "CRITICAL" in warning_level_logs[2], f"Third log should be CRITICAL: {warning_level_logs[2]}"
# 4. Test with ERROR level
# Clear the log file first
open(temp_log_file, 'w').close()
imagesorcery_logger.setLevel(logging.ERROR)
# Log messages at different levels again
imagesorcery_logger.debug("This shouldn't appear in the log")
imagesorcery_logger.info("This shouldn't appear in the log either")
imagesorcery_logger.warning("This shouldn't appear in the log either")
imagesorcery_logger.error(error_msg)
imagesorcery_logger.critical(critical_msg)
# Read the log file again
with open(temp_log_file, 'r') as f:
error_level_logs = f.readlines()
# There should be 2 log entries (DEBUG, INFO, WARNING should be filtered out)
assert len(error_level_logs) == 2, f"Expected 2 log entries at ERROR level, got {len(error_level_logs)}"
# Verify each log level appears in the correct entry
assert "ERROR" in error_level_logs[0], f"First log should be ERROR: {error_level_logs[0]}"
assert "CRITICAL" in error_level_logs[1], f"Second log should be CRITICAL: {error_level_logs[1]}"
# 5. Test with CRITICAL level
# Clear the log file first
open(temp_log_file, 'w').close()
imagesorcery_logger.setLevel(logging.CRITICAL)
# Log messages at different levels again
imagesorcery_logger.debug("This shouldn't appear in the log")
imagesorcery_logger.info("This shouldn't appear in the log either")
imagesorcery_logger.warning("This shouldn't appear in the log either")
imagesorcery_logger.error("This shouldn't appear in the log either")
imagesorcery_logger.critical(critical_msg)
# Read the log file again
with open(temp_log_file, 'r') as f:
critical_level_logs = f.readlines()
# There should be 1 log entry (all others should be filtered out)
assert len(critical_level_logs) == 1, f"Expected 1 log entry at CRITICAL level, got {len(critical_level_logs)}"
# Verify the log level and message
assert "CRITICAL" in critical_level_logs[0], f"Log should be CRITICAL: {critical_level_logs[0]}"
assert critical_msg in critical_level_logs[0], f"CRITICAL message not correctly logged: {critical_level_logs[0]}"
# Verify that the log format is still correct for each level
for log_line in debug_level_logs:
# Check that the log format includes module and line number
assert re.match(r'.*imagesorcery\.test_logging:\d+ - \w+ -.*', log_line), f"Log format incorrect: {log_line}"
finally:
# Restore the original logger level and remove our handler
imagesorcery_logger.setLevel(original_level)
imagesorcery_logger.removeHandler(handler)
================================================
FILE: tests/test_path_access.py
================================================
import os
import pytest
from fastmcp import Client, FastMCP
from PIL import Image
from imagesorcery_mcp.middlewares.path_access import (
AVAILABLE_PATHS_ENV,
get_allowed_directories,
split_paths,
)
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
return image_sorcery_mcp_server
@pytest.fixture
def test_image(tmp_path):
image_dir = tmp_path / "allowed"
image_dir.mkdir()
image_path = image_dir / "image.png"
Image.new("RGB", (20, 20), color="white").save(image_path)
return image_path
def test_available_paths_empty_disables_restrictions(monkeypatch):
monkeypatch.delenv(AVAILABLE_PATHS_ENV, raising=False)
assert get_allowed_directories() == []
monkeypatch.setenv(AVAILABLE_PATHS_ENV, " ")
assert get_allowed_directories() == []
def test_available_paths_supports_pathsep_and_comma():
raw_paths = os.pathsep.join(["/tmp/images", "/tmp/output"]) + ",/tmp/masks"
assert split_paths(raw_paths) == ["/tmp/images", "/tmp/output", "/tmp/masks"]
@pytest.mark.asyncio
async def test_path_inside_allowed_directory_is_accepted(
mcp_server: FastMCP, test_image, monkeypatch
):
monkeypatch.setenv(AVAILABLE_PATHS_ENV, str(test_image.parent))
async with Client(mcp_server) as client:
result = await client.call_tool("get_metainfo", {"input_path": str(test_image)})
assert result.data["path"] == str(test_image)
@pytest.mark.asyncio
async def test_relative_traversal_outside_allowed_directory_is_rejected(
mcp_server: FastMCP, tmp_path, monkeypatch
):
allowed_dir = tmp_path / "allowed"
outside_dir = tmp_path / "outside"
allowed_dir.mkdir()
outside_dir.mkdir()
outside_image = outside_dir / "image.png"
Image.new("RGB", (20, 20), color="white").save(outside_image)
monkeypatch.chdir(tmp_path)
monkeypatch.setenv(AVAILABLE_PATHS_ENV, str(allowed_dir))
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool(
"get_metainfo",
{"input_path": "allowed/../outside/image.png"},
)
assert "outside allowed directories" in str(excinfo.value)
@pytest.mark.asyncio
async def test_symlink_inside_allowed_directory_is_accepted_without_resolving_target(
mcp_server: FastMCP, tmp_path, monkeypatch
):
allowed_dir = tmp_path / "allowed"
outside_dir = tmp_path / "outside"
allowed_dir.mkdir()
outside_dir.mkdir()
outside_image = outside_dir / "image.png"
Image.new("RGB", (20, 20), color="white").save(outside_image)
try:
(allowed_dir / "link").symlink_to(outside_dir, target_is_directory=True)
except OSError as exc:
pytest.skip(f"Symlink creation is not available: {exc}")
monkeypatch.setenv(AVAILABLE_PATHS_ENV, str(allowed_dir))
async with Client(mcp_server) as client:
result = await client.call_tool(
"get_metainfo",
{"input_path": str(allowed_dir / "link" / "image.png")},
)
assert result.data["path"] == str(allowed_dir / "link" / "image.png")
@pytest.mark.asyncio
async def test_output_path_outside_allowed_directory_is_rejected(
mcp_server: FastMCP, test_image, tmp_path, monkeypatch
):
outside_output = tmp_path / "outside" / "output.png"
monkeypatch.setenv(AVAILABLE_PATHS_ENV, str(test_image.parent))
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool(
"resize",
{
"input_path": str(test_image),
"width": 10,
"output_path": str(outside_output),
},
)
assert "outside allowed directories" in str(excinfo.value)
@pytest.mark.asyncio
async def test_nested_mask_path_outside_allowed_directory_is_rejected(
mcp_server: FastMCP, test_image, tmp_path, monkeypatch
):
output_path = test_image.parent / "output.png"
outside_mask_path = tmp_path / "outside" / "mask.png"
monkeypatch.setenv(AVAILABLE_PATHS_ENV, str(test_image.parent))
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool(
"fill",
{
"input_path": str(test_image),
"areas": [{"mask_path": str(outside_mask_path), "color": [0, 0, 0]}],
"output_path": str(output_path),
},
)
assert "areas[0].mask_path" in str(excinfo.value)
assert "outside allowed directories" in str(excinfo.value)
================================================
FILE: tests/test_server.py
================================================
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.mark.asyncio
async def test_list_tools(mcp_server: FastMCP):
"""Tests listing available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools() # Correctly list tools using the client
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
assert len(tools) > 0, "Tools list should contain at least one tool"
@pytest.mark.asyncio
async def test_nonexisting_tool(mcp_server: FastMCP):
"""Tests calling a non-existent tool."""
nonexistent_tool_name = "nonexistent_tool"
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool(nonexistent_tool_name)
# Check that the error message contains the tool name
assert nonexistent_tool_name in str(excinfo.value)
================================================
FILE: tests/test_telemetry.py
================================================
"""
Tests for the telemetry system.
"""
import logging
import os
import tempfile
import uuid
from pathlib import Path
from unittest.mock import patch
from imagesorcery_mcp.config import get_config_manager
from imagesorcery_mcp.logging_config import logger as imagesorcery_logger
from imagesorcery_mcp.middlewares.telemetry import TelemetryMiddleware
# Mock the awaitable response for call_next
async def mock_call_next_func(context):
"""A simple async function to mock the call_next behavior."""
return "response"
# Mock the telemetry handlers to prevent actual network calls during tests
class MockAmplitudeHandler:
def __init__(self):
self.events = []
def track_event(self, event_data):
self.events.append(event_data)
class MockPostHogHandler:
def __init__(self):
self.events = []
def track_event(self, event_data):
self.events.append(event_data)
class TestTelemetryMiddleware:
"""Tests for the TelemetryMiddleware."""
def setup_method(self):
"""Set up test environment."""
self.temp_dir = tempfile.mkdtemp()
self.original_cwd = os.getcwd()
os.chdir(self.temp_dir)
# Ensure a config.toml exists for get_config()
config_manager = get_config_manager()
config_manager._ensure_config_file_exists()
# Create a .user_id file for testing
self.user_id_file = Path(".user_id")
self.test_user_id = str(uuid.uuid4())
self.user_id_file.write_text(self.test_user_id)
# Reset global config manager to ensure fresh load with temp config
import imagesorcery_mcp.config
imagesorcery_mcp.config._config_manager = None
get_config_manager().reset_runtime_overrides() # Ensure config is reloaded
# Suppress logging during tests to avoid clutter
logging.disable(logging.CRITICAL)
# Initialize mock handlers for each test run
self._mock_amplitude_handler = MockAmplitudeHandler()
self._mock_posthog_handler = MockPostHogHandler()
def teardown_method(self):
"""Clean up test environment."""
logging.disable(logging.NOTSET) # Re-enable logging
os.chdir(self.original_cwd)
import shutil
shutil.rmtree(self.temp_dir)
# Reset global config manager again for other tests
import imagesorcery_mcp.config
imagesorcery_mcp.config._config_manager = None
def test_middleware_initialization(self):
"""Test that TelemetryMiddleware can be initialized."""
# Patch the module-level handlers during initialization
with patch('imagesorcery_mcp.middlewares.telemetry.amplitude_handler', new=self._mock_amplitude_handler), \
patch('imagesorcery_mcp.middlewares.telemetry.posthog_handler', new=self._mock_posthog_handler):
middleware = TelemetryMiddleware(logger=imagesorcery_logger)
assert isinstance(middleware, TelemetryMiddleware)
assert middleware.user_id == self.test_user_id
assert middleware.version != "unknown" # Should get a version from pyproject.toml
assert middleware.system is not None
async def test_telemetry_tracking_enabled_and_disabled(self):
"""Test that telemetry events are tracked when enabled and not tracked when disabled."""
# Patch the module-level handlers for this test
with patch('imagesorcery_mcp.middlewares.telemetry.amplitude_handler', new=self._mock_amplitude_handler), \
patch('imagesorcery_mcp.middlewares.telemetry.posthog_handler', new=self._mock_posthog_handler):
middleware = TelemetryMiddleware(logger=imagesorcery_logger)
config_manager = get_config_manager()
# 1. Test when telemetry is DISABLED (default)
config_manager.update_config({"telemetry.enabled": False}, persist=True)
await middleware.on_call_tool(
context=type("MockContext", (object,), {"message": type("MockMessage", (object,), {"name": "test_tool"})})(),
call_next=mock_call_next_func
)
assert len(self._mock_amplitude_handler.events) == 0
assert len(self._mock_posthog_handler.events) == 0
# 2. Test when telemetry is ENABLED
config_manager.update_config({"telemetry.enabled": True}, persist=True)
await middleware.on_call_tool(
context=type("MockContext", (object,), {"message": type("MockMessage", (object,), {"name": "test_tool"})})(),
call_next=mock_call_next_func
)
assert len(self._mock_amplitude_handler.events) == 1
assert len(self._mock_posthog_handler.events) == 1
# Verify event data
amplitude_event = self._mock_amplitude_handler.events[0]
posthog_event = self._mock_posthog_handler.events[0]
assert amplitude_event["user_id"] == self.test_user_id
assert amplitude_event["action_type"] == "calling_tool"
assert amplitude_event["identifier"] == "test_tool"
assert amplitude_event["status"] == "success"
assert posthog_event["user_id"] == self.test_user_id
assert posthog_event["action_type"] == "calling_tool"
assert posthog_event["identifier"] == "test_tool"
assert posthog_event["status"] == "success"
# 3. Test when telemetry is DISABLED again
config_manager.update_config({"telemetry.enabled": False}, persist=True)
self._mock_amplitude_handler.events = [] # Clear previous events
self._mock_posthog_handler.events = []
await middleware.on_call_tool(
context=type("MockContext", (object,), {"message": type("MockMessage", (object,), {"name": "another_tool"})})(),
call_next=mock_call_next_func
)
assert len(self._mock_amplitude_handler.events) == 0
assert len(self._mock_posthog_handler.events) == 0
================================================
FILE: tests/tools/test_blur.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image with a checkerboard pattern for blurring."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
# Create a checkerboard pattern in the center area
square_size = 20 # Size of each square in the checkerboard
for i in range(5): # 5x5 checkerboard
for j in range(5):
if (i + j) % 2 == 0: # Alternate black and white
x1 = 150 + j * square_size
y1 = 100 + i * square_size
x2 = x1 + square_size
y2 = y1 + square_size
cv2.rectangle(img, (x1, y1), (x2, y2), (0, 0, 0), -1) # Black square
cv2.imwrite(str(img_path), img)
return str(img_path)
@pytest.fixture
def test_image_for_invert_blur(tmp_path):
"""Create a test image with a noisy background and a solid central object for invert_areas blurring."""
img_path = tmp_path / "test_image_invert_blur.png"
# Create a noisy background
img = np.random.randint(0, 256, (300, 400, 3), dtype=np.uint8)
# Create a checkerboard pattern in the center area (the area to be kept unblurred)
square_size = 20 # Size of each square in the checkerboard
center_x_start = 150
center_y_start = 100
for i in range(5): # 5x5 checkerboard
for j in range(5):
if (i + j) % 2 == 0: # Alternate black and white
x1 = center_x_start + j * square_size
y1 = center_y_start + i * square_size
x2 = x1 + square_size
y2 = y1 + square_size
cv2.rectangle(img, (x1, y1), (x2, y2), (0, 0, 0), -1) # Black square
else:
x1 = center_x_start + j * square_size
y1 = center_y_start + i * square_size
x2 = x1 + square_size
y2 = y1 + square_size
cv2.rectangle(img, (x1, y1), (x2, y2), (255, 255, 255), -1) # White square
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestBlurToolDefinition:
"""Tests for the blur tool definition and metadata."""
@pytest.mark.asyncio
async def test_blur_in_tools_list(self, mcp_server: FastMCP):
"""Tests that blur tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if blur is in the list of tools
tool_names = [tool.name for tool in tools]
assert "blur" in tool_names, (
"blur tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_blur_description(self, mcp_server: FastMCP):
"""Tests that blur tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
blur_tool = next((tool for tool in tools if tool.name == "blur"), None)
# Check description
assert blur_tool.description, "blur tool should have a description"
assert "blur" in blur_tool.description.lower(), (
"Description should mention that it blurs areas of an image"
)
@pytest.mark.asyncio
async def test_blur_parameters(self, mcp_server: FastMCP):
"""Tests that blur tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
blur_tool = next((tool for tool in tools if tool.name == "blur"), None)
# Check input schema
assert hasattr(blur_tool, "inputSchema"), (
"blur tool should have an inputSchema"
)
assert "properties" in blur_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "areas"]
for param in required_params:
assert param in blur_tool.inputSchema["properties"], (
f"blur tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
assert "output_path" in blur_tool.inputSchema["properties"], (
"blur tool should have an 'output_path' property in its inputSchema"
)
# Check parameter types
assert (
blur_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert (
blur_tool.inputSchema["properties"]["areas"].get("type")
== "array"
), "areas should be of type array"
# Check output_path type - it can be string or null since it's optional
output_path_schema = blur_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
# Check that string is one of the allowed types
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestBlurToolExecution:
"""Tests for the blur tool execution and results."""
@pytest.mark.asyncio
async def test_blur_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the blur tool execution and return value."""
output_path = str(tmp_path / "output.png")
# Define the area to blur - covering the checkerboard pattern
blur_area = {
"x1": 150,
"y1": 100,
"x2": 250,
"y2": 200,
"blur_strength": 21
}
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_path,
"areas": [blur_area],
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
@pytest.mark.asyncio
async def test_blur_invert_rectangle(self, mcp_server: FastMCP, test_image_for_invert_blur, tmp_path):
"""Tests the blur tool with invert_areas for a rectangle."""
output_path = str(tmp_path / "output_inverted.png")
# Define a rectangle in the center (the solid black area to be kept unblurred)
blur_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "blur_strength": 21}
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_for_invert_blur,
"areas": [blur_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
original_img = cv2.imread(test_image_for_invert_blur)
# Center pixel (inside the specified area) should NOT be blurred - remains original
center_pixel_original = original_img[150, 200]
center_pixel_blurred = img[150, 200]
assert np.array_equal(center_pixel_original, center_pixel_blurred)
# Pixels outside the area (noisy background) should be blurred
outside_pixel_original = original_img[50, 50]
outside_pixel_blurred = img[50, 50]
assert not np.array_equal(outside_pixel_original, outside_pixel_blurred)
assert np.std(outside_pixel_blurred) < np.std(outside_pixel_original)
@pytest.mark.asyncio
async def test_blur_invert_polygon(self, mcp_server: FastMCP, test_image_for_invert_blur, tmp_path):
"""Tests the blur tool with invert_areas for a polygon."""
output_path = str(tmp_path / "output_inverted_poly.png")
# Define a triangle polygon within the central object area
polygon_area = {"polygon": [[160, 110], [240, 110], [200, 190]], "blur_strength": 21}
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_for_invert_blur,
"areas": [polygon_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
original_img = cv2.imread(test_image_for_invert_blur)
# Center of polygon (inside the specified area) should NOT be blurred
poly_center_original = original_img[150, 200]
poly_center_blurred = img[150, 200]
assert np.array_equal(poly_center_original, poly_center_blurred)
# Outside pixels (noisy background) should be blurred
outside_pixel_original = original_img[50, 50]
outside_pixel_blurred = img[50, 50]
assert not np.array_equal(outside_pixel_original, outside_pixel_blurred)
assert np.std(outside_pixel_blurred) < np.std(outside_pixel_original)
@pytest.mark.asyncio
async def test_blur_invert_multiple_areas(self, mcp_server: FastMCP, test_image_for_invert_blur, tmp_path):
"""Tests invert_areas with multiple areas to keep unblurred."""
output_path = str(tmp_path / "output_multi_unblurred.png")
# Keep two areas unblurred (within the central object), blur everything else
areas = [
{"x1": 160, "y1": 110, "x2": 190, "y2": 140, "blur_strength": 11},
{"x1": 210, "y1": 160, "x2": 240, "y2": 190, "blur_strength": 21}
]
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_for_invert_blur,
"areas": areas,
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
original_img = cv2.imread(test_image_for_invert_blur)
# First kept area should NOT be blurred (remains original)
kept_pixel1_original = original_img[125, 175]
kept_pixel1_blurred = img[125, 175]
assert np.array_equal(kept_pixel1_original, kept_pixel1_blurred)
# Second kept area should NOT be blurred (remains original)
kept_pixel2_original = original_img[175, 225]
kept_pixel2_blurred = img[175, 225]
assert np.array_equal(kept_pixel2_original, kept_pixel2_blurred)
# Area between them (noisy background) should be blurred
between_pixel_original = original_img[50, 50]
between_pixel_blurred = img[50, 50]
assert not np.array_equal(between_pixel_original, between_pixel_blurred)
assert np.std(between_pixel_blurred) < np.std(between_pixel_original)
@pytest.mark.asyncio
async def test_blur_polygon_area(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the blur tool with a polygon area."""
output_path = str(tmp_path / "output_poly.png")
# Define a triangular polygon within the checkerboard area
polygon_area = {
"polygon": [[160, 110], [240, 110], [200, 190]],
"blur_strength": 21
}
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_path,
"areas": [polygon_area],
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the image was created with correct dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400) # height, width
# Verify that the blurred area has different pixel values than the original
original_img = cv2.imread(test_image_path)
# Create a mask of the polygon to check pixels
mask = np.zeros(img.shape[:2], dtype=np.uint8)
cv2.fillPoly(mask, [np.array(polygon_area["polygon"], dtype=np.int32)], 255)
# Get pixels from original and blurred images using the mask
original_pixels = original_img[mask == 255]
blurred_pixels = img[mask == 255]
# The pixels should be different
assert not np.array_equal(original_pixels, blurred_pixels)
# The standard deviation of the blurred pixels should be lower
# because the checkerboard pattern is being smoothed
assert np.std(blurred_pixels) < np.std(original_pixels)
@pytest.mark.asyncio
async def test_blur_mixed_areas(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the blur tool with a mix of rectangle and polygon areas."""
output_path = str(tmp_path / "output_mixed.png")
# Define areas
rect_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "blur_strength": 11}
poly_area = {"polygon": [[160, 110], [240, 110], [200, 190]], "blur_strength": 21}
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_path,
"areas": [rect_area, poly_area],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
original_img = cv2.imread(test_image_path)
# Check rectangle blur by comparing regions
blurred_rect_region = img[rect_area["y1"]:rect_area["y2"], rect_area["x1"]:rect_area["x2"]]
original_rect_region = original_img[rect_area["y1"]:rect_area["y2"], rect_area["x1"]:rect_area["x2"]]
assert not np.array_equal(blurred_rect_region, original_rect_region)
assert np.std(blurred_rect_region) < np.std(original_rect_region)
# Check polygon blur by checking a point inside
# Create a mask for the polygon to check pixels
mask = np.zeros(img.shape[:2], dtype=np.uint8)
cv2.fillPoly(mask, [np.array(poly_area["polygon"], dtype=np.int32)], 255)
original_poly_pixels = original_img[mask == 255]
blurred_poly_pixels = img[mask == 255]
assert not np.array_equal(original_poly_pixels, blurred_poly_pixels)
assert np.std(blurred_poly_pixels) < np.std(original_poly_pixels)
# Verify the image was created with correct dimensions
assert img.shape[:2] == (300, 400) # height, width
@pytest.mark.asyncio
async def test_blur_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the blur tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_path,
"areas": [
{
"x1": 150,
"y1": 100,
"x2": 250,
"y2": 200,
}
]
},
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_blurred.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
@pytest.mark.asyncio
async def test_blur_multiple_areas(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the blur tool with multiple areas."""
output_path = str(tmp_path / "multi_blur.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"blur",
{
"input_path": test_image_path,
"areas": [
{
"x1": 50,
"y1": 50,
"x2": 100,
"y2": 100,
"blur_strength": 11
},
{
"x1": 150,
"y1": 100,
"x2": 250,
"y2": 200,
"blur_strength": 21
}
],
"output_path": output_path
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
================================================
FILE: tests/tools/test_change_color.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a colorful test image."""
img_path = tmp_path / "test_color_image.png"
img = np.zeros((100, 100, 3), dtype=np.uint8)
# Add some colors
img[0:50, 0:50] = [255, 0, 0] # Blue
img[0:50, 50:100] = [0, 255, 0] # Green
img[50:100, 0:50] = [0, 0, 255] # Red
img[50:100, 50:100] = [255, 255, 0] # Cyan
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestChangeColorToolDefinition:
"""Tests for the change_color tool definition and metadata."""
@pytest.mark.asyncio
async def test_change_color_in_tools_list(self, mcp_server: FastMCP):
"""Tests that change_color tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "change_color" in tool_names, "change_color tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_change_color_description(self, mcp_server: FastMCP):
"""Tests that change_color tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
change_color_tool = next((tool for tool in tools if tool.name == "change_color"), None)
assert change_color_tool.description, "change_color tool should have a description"
assert "color palette" in change_color_tool.description.lower(), "Description should mention changing the color palette"
@pytest.mark.asyncio
async def test_change_color_parameters(self, mcp_server: FastMCP):
"""Tests that change_color tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
change_color_tool = next((tool for tool in tools if tool.name == "change_color"), None)
assert hasattr(change_color_tool, "inputSchema"), "change_color tool should have an inputSchema"
assert "properties" in change_color_tool.inputSchema, "inputSchema should have properties field"
required_params = ["input_path", "palette"]
for param in required_params:
assert param in change_color_tool.inputSchema["properties"], f"change_color tool should have a '{param}' property in its inputSchema"
assert "output_path" in change_color_tool.inputSchema["properties"], "change_color tool should have an 'output_path' property in its inputSchema"
assert change_color_tool.inputSchema["properties"]["input_path"].get("type") == "string"
assert change_color_tool.inputSchema["properties"]["palette"].get("type") == "string"
output_path_schema = change_color_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema
string_type_present = any(type_option.get("type") == "string" for type_option in output_path_schema["anyOf"])
assert string_type_present
class TestChangeColorToolExecution:
"""Tests for the change_color tool execution and results."""
@pytest.mark.asyncio
async def test_change_color_grayscale(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the change_color tool with the 'grayscale' palette."""
output_path = str(tmp_path / "output_grayscale.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"change_color",
{"input_path": test_image_path, "palette": "grayscale", "output_path": output_path},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert len(img.shape) == 2, "Grayscale image should have 2 dimensions (height, width)"
# Check a pixel from the original blue area
# Original blue: [255, 0, 0] -> BGR
# Grayscale conversion: Y = 0.299*R + 0.587*G + 0.114*B = 0.114*255 = 29.07
# Expected grayscale value: ~29
pixel_value = img[25, 25]
assert np.isclose(pixel_value, 29, atol=2), f"Pixel value {pixel_value} is not close to expected grayscale value for blue"
# Check a pixel from the original green area
# Original green: [0, 255, 0] -> BGR
# Grayscale conversion: Y = 0.299*R + 0.587*G + 0.114*B = 0.587*255 = 149.69
# Expected grayscale value: ~150
pixel_value = img[25, 75]
assert np.isclose(pixel_value, 150, atol=2), f"Pixel value {pixel_value} is not close to expected grayscale value for green"
@pytest.mark.asyncio
async def test_change_color_sepia(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the change_color tool with the 'sepia' palette."""
output_path = str(tmp_path / "output_sepia.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"change_color",
{"input_path": test_image_path, "palette": "sepia", "output_path": output_path},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
assert len(img.shape) == 3, "Sepia image should have 3 dimensions"
# Check a pixel from the original blue area
# Original blue: [255, 0, 0] -> BGR
# Sepia transform: B' = 0.272*B + 0.534*G + 0.131*R = 0.272*255 = 69.36
# G' = 0.349*B + 0.686*G + 0.168*R = 0.349*255 = 88.99
# R' = 0.393*B + 0.769*G + 0.189*R = 0.393*255 = 100.21
# Expected BGR: [69, 89, 100]
pixel = img[25, 25]
assert np.allclose(pixel, [69, 89, 100], atol=2), f"Pixel {pixel} is not close to sepia-toned blue"
@pytest.mark.asyncio
async def test_change_color_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the change_color tool with a default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool("change_color", {"input_path": test_image_path, "palette": "grayscale"})
expected_output = test_image_path.replace(".png", "_grayscale.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
@pytest.mark.asyncio
async def test_change_color_invalid_palette(self, mcp_server: FastMCP, test_image_path):
"""Tests the change_color tool with an invalid palette."""
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool("change_color", {"input_path": test_image_path, "palette": "invalid_palette"})
assert "input validation error" in str(excinfo.value).lower()
================================================
FILE: tests/tools/test_config_tool.py
================================================
"""
End-to-end tests for the config tool through MCP client interface.
"""
import os
import tempfile
from pathlib import Path
import pytest
import toml
from fastmcp import Client
from fastmcp.exceptions import ToolError
from imagesorcery_mcp.server import mcp
class TestConfigToolE2E:
"""End-to-end tests for the config tool through MCP client."""
def setup_method(self):
"""Set up test environment."""
self.temp_dir = tempfile.mkdtemp()
self.original_cwd = os.getcwd()
os.chdir(self.temp_dir)
def teardown_method(self):
"""Clean up test environment."""
os.chdir(self.original_cwd)
import shutil
shutil.rmtree(self.temp_dir)
# Reset global config manager
import imagesorcery_mcp.config
imagesorcery_mcp.config._config_manager = None
@pytest.mark.asyncio
async def test_config_tool_registration(self):
"""Test that config tool is properly registered in the server."""
async with Client(mcp) as client:
tools = await client.list_tools()
config_tool = next((tool for tool in tools if tool.name == "config"), None)
assert config_tool is not None, "Config tool should be registered"
assert config_tool.name == "config"
# Check input schema has required parameters
schema = config_tool.inputSchema
assert "properties" in schema
assert "action" in schema["properties"]
assert "key" in schema["properties"]
assert "value" in schema["properties"]
assert "persist" in schema["properties"]
@pytest.mark.asyncio
async def test_config_get_all(self):
"""Test getting entire configuration through MCP client."""
async with Client(mcp) as client:
# Call config tool to get all configuration
result = await client.call_tool("config", {"action": "get"})
assert result.is_error is False, f"Config tool should not error: {result.content}"
# Parse the result content
content = result.content[0].text
assert "action" in content
assert "config" in content
assert "runtime_overrides" in content
# Verify it contains expected configuration sections
assert "detection" in content
assert "blur" in content
assert "text" in content
@pytest.mark.asyncio
async def test_config_get_specific_key(self):
"""Test getting specific configuration value through MCP client."""
async with Client(mcp) as client:
# Call config tool to get specific key
result = await client.call_tool("config", {
"action": "get",
"key": "detection.confidence_threshold"
})
assert result.is_error is False, f"Config tool should not error: {result.content}"
content = result.content[0].text
assert "action" in content
assert "key" in content
assert "value" in content
assert "detection.confidence_threshold" in content
@pytest.mark.asyncio
async def test_config_set_runtime(self):
"""Test setting configuration value for runtime only through MCP client."""
async with Client(mcp) as client:
# Set a runtime configuration value
result = await client.call_tool("config", {
"action": "set",
"key": "detection.confidence_threshold",
"value": 0.8,
"persist": False
})
assert result.is_error is False, f"Config tool should not error: {result.content}"
content = result.content[0].text
assert "action" in content
assert "set" in content
assert "detection.confidence_threshold" in content
assert "0.8" in content
assert "current session" in content
# Verify the change by getting the value back
get_result = await client.call_tool("config", {
"action": "get",
"key": "detection.confidence_threshold"
})
assert get_result.is_error is False
get_content = get_result.content[0].text
assert "0.8" in get_content
@pytest.mark.asyncio
async def test_config_set_persistent(self):
"""Test setting configuration value persistently through MCP client."""
async with Client(mcp) as client:
# Set a persistent configuration value
result = await client.call_tool("config", {
"action": "set",
"key": "blur.strength",
"value": 21,
"persist": True
})
assert result.is_error is False, f"Config tool should not error: {result.content}"
content = result.content[0].text
assert "action" in content
assert "set" in content
assert "blur.strength" in content
assert "21" in content
assert "persisted to file" in content
# Verify the config file was updated
assert Path("config.toml").exists()
with open("config.toml", "r") as f:
config_data = toml.load(f)
assert config_data["blur"]["strength"] == 21
@pytest.mark.asyncio
async def test_config_set_invalid_value(self):
"""Test setting invalid configuration value through MCP client."""
async with Client(mcp) as client:
# Try to set an invalid confidence threshold
result = await client.call_tool("config", {
"action": "set",
"key": "detection.confidence_threshold",
"value": 1.5 # Invalid: > 1.0
})
assert result.is_error is False # Tool doesn't error, but returns error in content
content = result.content[0].text
assert "error" in content
assert "Invalid configuration update" in content
@pytest.mark.asyncio
async def test_config_reset(self):
"""Test resetting runtime configuration overrides through MCP client."""
async with Client(mcp) as client:
# First set some runtime values
await client.call_tool("config", {
"action": "set",
"key": "detection.confidence_threshold",
"value": 0.9,
"persist": False
})
await client.call_tool("config", {
"action": "set",
"key": "text.font_scale",
"value": 2.0,
"persist": False
})
# Reset runtime overrides
result = await client.call_tool("config", {"action": "reset"})
assert result.is_error is False, f"Config tool should not error: {result.content}"
content = result.content[0].text
assert "action" in content
assert "reset" in content
assert "Runtime configuration overrides reset successfully" in content
# Verify values are back to defaults
get_result = await client.call_tool("config", {
"action": "get",
"key": "detection.confidence_threshold"
})
get_content = get_result.content[0].text
assert "0.75" in get_content # Back to default
@pytest.mark.asyncio
async def test_config_get_nonexistent_key(self):
"""Test getting non-existent configuration key through MCP client."""
async with Client(mcp) as client:
result = await client.call_tool("config", {
"action": "get",
"key": "nonexistent.key"
})
assert result.is_error is False # Tool doesn't error, but returns error in content
content = result.content[0].text
assert "error" in content
assert "Configuration key 'nonexistent.key' not found" in content
assert "available_keys" in content
@pytest.mark.asyncio
async def test_config_invalid_action(self):
"""Test config tool with invalid action through MCP client."""
async with Client(mcp) as client:
# Invalid action should raise ToolError due to input validation
with pytest.raises(ToolError) as exc_info:
await client.call_tool("config", {"action": "invalid"})
assert "Input validation error" in str(exc_info.value)
assert "invalid" in str(exc_info.value)
@pytest.mark.asyncio
async def test_config_set_missing_parameters(self):
"""Test config tool with missing required parameters through MCP client."""
async with Client(mcp) as client:
# Test setting without key
result = await client.call_tool("config", {
"action": "set",
"value": 0.8
})
assert result.is_error is False # Tool doesn't error, but returns error in content
content = result.content[0].text
assert "error" in content
assert "Key is required for 'set' action" in content
================================================
FILE: tests/tools/test_crop.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for cropping."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((200, 200, 3), dtype=np.uint8) * 255
# Draw some colored areas to verify cropping
# Red square (50,50) to (100,100)
img[50:100, 50:100] = [0, 0, 255] # OpenCV uses BGR
# Blue square (100,100) to (150,150)
img[100:150, 100:150] = [255, 0, 0] # OpenCV uses BGR
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestCropToolDefinition:
"""Tests for the crop tool definition and metadata."""
@pytest.mark.asyncio
async def test_crop_in_tools_list(self, mcp_server: FastMCP):
"""Tests that crop tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if crop is in the list of tools
tool_names = [tool.name for tool in tools]
assert "crop" in tool_names, (
"crop tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_crop_description(self, mcp_server: FastMCP):
"""Tests that crop tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
crop_tool = next((tool for tool in tools if tool.name == "crop"), None)
# Check description
assert crop_tool.description, "crop tool should have a description"
assert "crop" in crop_tool.description.lower(), (
"Description should mention that it crops an image"
)
@pytest.mark.asyncio
async def test_crop_parameters(self, mcp_server: FastMCP):
"""Tests that crop tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
crop_tool = next((tool for tool in tools if tool.name == "crop"), None)
# Check input schema
assert hasattr(crop_tool, "inputSchema"), (
"crop tool should have an inputSchema"
)
assert "properties" in crop_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "x1", "y1", "x2", "y2"]
for param in required_params:
assert param in crop_tool.inputSchema["properties"], (
f"crop tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
assert "output_path" in crop_tool.inputSchema["properties"], (
"crop tool should have an 'output_path' property in its inputSchema"
)
# Check parameter types
assert (
crop_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert (
crop_tool.inputSchema["properties"]["x1"].get("type") == "integer"
), "x1 should be of type integer"
assert (
crop_tool.inputSchema["properties"]["y1"].get("type") == "integer"
), "y1 should be of type integer"
assert (
crop_tool.inputSchema["properties"]["x2"].get("type") == "integer"
), "x2 should be of type integer"
assert (
crop_tool.inputSchema["properties"]["y2"].get("type") == "integer"
), "y2 should be of type integer"
assert (
crop_tool.inputSchema["properties"]["output_path"].get("type")
== "string"
), "output_path should be of type string"
class TestCropToolExecution:
"""Tests for the crop tool execution and results."""
@pytest.mark.asyncio
async def test_crop_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the crop tool execution and return value."""
output_path = str(tmp_path / "output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"crop",
{
"input_path": test_image_path,
"x1": 50,
"y1": 50,
"x2": 100,
"y2": 100,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the cropped image dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (50, 50) # height, width
# Check if the red square was properly cropped (BGR in OpenCV)
assert all(img[0, 0] == [0, 0, 255]) # Red in BGR
@pytest.mark.asyncio
async def test_crop_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the crop tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"crop",
{
"input_path": test_image_path,
"x1": 50,
"y1": 50,
"x2": 100,
"y2": 100,
},
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_cropped.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
================================================
FILE: tests/tools/test_detect.py
================================================
import os
import shutil
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from PIL import Image, ImageDraw
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Path to a test image with known objects for detection."""
# Path to the test image in the tests/data directory
current_dir = os.path.dirname(os.path.abspath(__file__))
test_data_dir = os.path.join(os.path.dirname(current_dir), "data")
src_path = os.path.join(test_data_dir, "test_detection.jpg")
dest_path = tmp_path / "test_detection.jpg"
shutil.copy(src_path, dest_path)
return str(dest_path)
@pytest.fixture
def test_image_negative_path(tmp_path):
"""Path to a test image with different objects for negative testing."""
current_dir = os.path.dirname(os.path.abspath(__file__))
test_data_dir = os.path.join(os.path.dirname(current_dir), "data")
src_path = os.path.join(test_data_dir, "test_detection_negative.jpg")
dest_path = tmp_path / "test_detection_negative.jpg"
shutil.copy(src_path, dest_path)
return str(dest_path)
@pytest.fixture
def test_segmentation_image_path(tmp_path):
"""Path to a simple test image for segmentation mask validation."""
current_dir = os.path.dirname(os.path.abspath(__file__))
test_data_dir = os.path.join(os.path.dirname(current_dir), "data")
src_path = os.path.join(test_data_dir, "test_detection_mask.jpg")
dest_path = tmp_path / "test_detection_mask.jpg"
shutil.copy(src_path, dest_path)
return str(dest_path)
class TestDetectToolDefinition:
"""Tests for the detect tool definition and metadata."""
@pytest.mark.asyncio
async def test_detect_in_tools_list(self, mcp_server: FastMCP):
"""Tests that detect tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if detect is in the list of tools
tool_names = [tool.name for tool in tools]
assert "detect" in tool_names, (
"detect tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_detect_description(self, mcp_server: FastMCP):
"""Tests that detect tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
detect_tool = next((tool for tool in tools if tool.name == "detect"), None)
# Check description
assert detect_tool.description, "detect tool should have a description"
assert "detect" in detect_tool.description.lower(), (
"Description should mention that it detects objects in an image"
)
@pytest.mark.asyncio
async def test_detect_parameters(self, mcp_server: FastMCP):
"""Tests that detect tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
detect_tool = next((tool for tool in tools if tool.name == "detect"), None)
# Check input schema
assert hasattr(detect_tool, "inputSchema"), (
"detect tool should have an inputSchema"
)
assert "properties" in detect_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path"]
for param in required_params:
assert param in detect_tool.inputSchema["properties"], (
f"detect tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
optional_params = ["confidence", "model_name", "return_geometry", "geometry_format"]
for param in optional_params:
assert param in detect_tool.inputSchema["properties"], (
f"detect tool should have a '{param}' property in its inputSchema"
)
# Check parameter types and defaults
assert (
detect_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
# Check optional parameters (now have anyOf structure with null)
confidence_schema = detect_tool.inputSchema["properties"]["confidence"]
assert "anyOf" in confidence_schema, "confidence should have anyOf structure for optional parameter"
assert any(item.get("type") == "number" for item in confidence_schema["anyOf"]), "confidence should allow number type"
assert any(item.get("type") == "null" for item in confidence_schema["anyOf"]), "confidence should allow null type"
model_name_schema = detect_tool.inputSchema["properties"]["model_name"]
assert "anyOf" in model_name_schema, "model_name should have anyOf structure for optional parameter"
assert any(item.get("type") == "string" for item in model_name_schema["anyOf"]), "model_name should allow string type"
assert any(item.get("type") == "null" for item in model_name_schema["anyOf"]), "model_name should allow null type"
# New parameters for geometry
assert (
detect_tool.inputSchema["properties"]["return_geometry"].get("type")
== "boolean"
), "return_geometry should be of type boolean"
assert (
detect_tool.inputSchema["properties"]["return_geometry"].get("default")
is False
), "return_geometry default should be False"
assert (
detect_tool.inputSchema["properties"]["geometry_format"].get("type")
== "string"
), "geometry_format should be of type string"
assert (
detect_tool.inputSchema["properties"]["geometry_format"].get("enum")
== ["mask", "polygon"]
), "geometry_format enum should be ['mask', 'polygon']"
assert (
detect_tool.inputSchema["properties"]["geometry_format"].get("default")
== "mask"
), "geometry_format default should be 'mask'"
class TestDetectToolExecution:
"""Tests for the detect tool execution and results."""
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_tool_execution(self, mcp_server: FastMCP, test_image_path):
"""Tests the detect tool execution and return value."""
# Skip if test image doesn't exist
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
# Use the smallest model for faster tests
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
},
)
# Parse the result
detection_result = result.structured_content
# Check that the tool returned a result
assert detection_result is not None
# Basic structure checks
assert "image_path" in detection_result
assert "detections" in detection_result
assert detection_result["image_path"] == test_image_path
assert isinstance(detection_result["detections"], list)
# Check that we have at least some detections
assert len(detection_result["detections"]) > 0, (
"No objects detected in the test image"
)
# Check the structure of a detection
detection = detection_result["detections"][0]
assert "class" in detection, "Detection should have a class name"
assert "confidence" in detection, "Detection should have a confidence score"
assert "bbox" in detection, "Detection should have a bounding box"
# Check that the confidence is within expected range
assert 0 <= detection["confidence"] <= 1, (
"Confidence should be between 0 and 1"
)
# Check that the bounding box has 4 coordinates
assert len(detection["bbox"]) == 4, "Bounding box should have 4 coordinates"
# Check for expected classes in the image
# We expect at least one of these classes to be detected
expected_classes = ["person", "car", "dog"]
detected_classes = [d["class"] for d in detection_result["detections"]]
assert any(cls in detected_classes for cls in expected_classes), (
f"None of the expected classes {expected_classes} were detected. "
f"Detected classes: {detected_classes}"
)
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_with_mask_geometry(self, mcp_server: FastMCP, test_image_path):
"""Tests the detect tool with mask geometry return."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"return_geometry": True,
"geometry_format": "mask",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert len(detection_result["detections"]) > 0
for detection in detection_result["detections"]:
assert "mask_path" in detection
assert "polygon" not in detection
mask_path = detection["mask_path"]
assert isinstance(mask_path, str)
assert os.path.exists(mask_path)
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_with_polygon_geometry(self, mcp_server: FastMCP, test_image_path):
"""Tests the detect tool with polygon geometry return."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"return_geometry": True,
"geometry_format": "polygon",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert detection_result is not None
assert len(detection_result["detections"]) > 0
detection = detection_result["detections"][0]
assert "polygon" in detection
assert "mask" not in detection
polygon_data = detection["polygon"]
assert isinstance(polygon_data, list)
assert len(polygon_data) > 0
# It's a list of points [x, y]
assert isinstance(polygon_data[0], list)
assert len(polygon_data[0]) == 2
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_no_geometry_by_default(self, mcp_server: FastMCP, test_image_path):
"""Tests that no geometry is returned by default."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert detection_result is not None
assert len(detection_result["detections"]) > 0
detection = detection_result["detections"][0]
assert "mask_path" not in detection
assert "polygon" not in detection
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_geometry_with_non_seg_model_raises_error(
self, mcp_server: FastMCP, test_image_path, caplog
):
"""Tests that requesting geometry with a non-segmentation model raises an error."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
non_seg_model = "yolov8n.pt"
model_path = os.path.join("models", non_seg_model)
if not os.path.exists(model_path):
pytest.skip(f"Non-segmentation model '{non_seg_model}' not found for testing.")
async with Client(mcp_server) as client:
from fastmcp.exceptions import ToolError
with pytest.raises(ToolError):
await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": non_seg_model,
"return_geometry": True,
},
)
assert any("does not support segmentation" in record.message for record in caplog.records), \
"Expected error about segmentation not supported to be logged"
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_negative_scenario(
self, mcp_server: FastMCP, test_image_negative_path
):
"""Tests that certain objects are not detected in an image where they don't
exist.
"""
# Skip if test image doesn't exist
if not os.path.exists(test_image_negative_path):
pytest.skip(f"Test image not found at {test_image_negative_path}")
async with Client(mcp_server) as client:
# Use the smallest model for faster tests
result = await client.call_tool(
"detect",
{
"input_path": test_image_negative_path,
"confidence": 0.5,
"model_name": "yoloe-11s-seg-pf.pt",
},
)
# Parse the result
detection_result = result.structured_content
# Check that the tool returned a result
assert detection_result is not None
# Basic structure checks
assert "image_path" in detection_result
assert "detections" in detection_result
assert detection_result["image_path"] == test_image_negative_path
assert isinstance(detection_result["detections"], list)
# Check that we have at least some detections
assert len(detection_result["detections"]) > 0, (
"No objects detected in the test image"
)
# Objects that should NOT be detected in this image
not_expected_classes = ["person", "car", "dog", "truck", "bus"]
detected_classes = [d["class"] for d in detection_result["detections"]]
# Check that none of the not expected classes are detected
for cls in not_expected_classes:
assert cls not in detected_classes, (
f"Class '{cls}' was detected but should not be present in the image"
)
# Objects that SHOULD be detected in this image
expected_classes = ["bicycle", "cat"]
# Check that at least one of the expected classes is detected
assert any(cls in detected_classes for cls in expected_classes), (
f"None of the expected classes {expected_classes} were detected. "
f"Detected classes: {detected_classes}"
)
class TestDetectGeometryValidation:
"""Tests for validating the correctness of masks and polygons returned by detect tool."""
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_mask_correctness(self, mcp_server: FastMCP, test_image_path):
"""Tests that returned masks are valid and correctly positioned."""
# Load the test image to get its dimensions
with Image.open(test_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"return_geometry": True,
"geometry_format": "mask",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert detection_result is not None
assert len(detection_result["detections"]) > 0
for detection in detection_result["detections"]:
assert "mask_path" in detection
mask_path = detection["mask_path"]
assert os.path.exists(mask_path)
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
assert mask is not None
bbox = detection["bbox"]
x1, y1, x2, y2 = bbox
mask_height, mask_width = mask.shape
assert (
(mask_height == mask_width) or
(mask_height == orig_height and mask_width == orig_width)
), f"Mask dimensions {mask.shape} should be square or match original image"
scale_x = orig_width / mask_width
scale_y = orig_height / mask_height
unique_values = np.unique(mask)
assert len(unique_values) <= 2, "Mask should be binary"
assert all(v in [0, 255] for v in unique_values), (
"Mask should contain only 0/255 values"
)
assert np.sum(mask) > 0, "Mask should not be empty"
mask_indices = np.where(mask > 0)
if len(mask_indices[0]) > 0:
min_y, max_y = mask_indices[0].min(), mask_indices[0].max()
min_x, max_x = mask_indices[1].min(), mask_indices[1].max()
scaled_x1 = x1 / scale_x
scaled_x2 = x2 / scale_x
scaled_y1 = y1 / scale_y
scaled_y2 = y2 / scale_y
tolerance = 10
assert min_x >= scaled_x1 - tolerance
assert max_x <= scaled_x2 + tolerance
assert min_y >= scaled_y1 - tolerance
assert max_y <= scaled_y2 + tolerance
mask_area = np.sum(mask > 0)
scaled_bbox_area = ((scaled_x2 - scaled_x1) * (scaled_y2 - scaled_y1))
coverage_ratio = mask_area / scaled_bbox_area if scaled_bbox_area > 0 else 0
assert 0.1 <= coverage_ratio <= 1.5
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_polygon_correctness(self, mcp_server: FastMCP, test_image_path):
"""Tests that returned polygons are valid and correctly positioned."""
# Load the test image to get its dimensions
with Image.open(test_image_path) as img:
img_width, img_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"return_geometry": True,
"geometry_format": "polygon",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert detection_result is not None
assert len(detection_result["detections"]) > 0
for detection in detection_result["detections"]:
polygon = detection["polygon"]
bbox = detection["bbox"]
x1, y1, x2, y2 = bbox
# 1. Check polygon has at least 3 points
assert len(polygon) >= 3, "Polygon should have at least 3 points"
# 2. Check all points have exactly 2 coordinates
for point in polygon:
assert len(point) == 2, f"Each polygon point should have 2 coordinates, got {len(point)}"
# 3. Check all coordinates are reasonable
# Note: Polygon coordinates should be in original image space
for x, y in polygon:
# Allow some tolerance outside image bounds
tolerance = 10
assert -tolerance <= x <= img_width + tolerance, (
f"X coordinate {x} should be within image width {img_width} (with tolerance)"
)
assert -tolerance <= y <= img_height + tolerance, (
f"Y coordinate {y} should be within image height {img_height} (with tolerance)"
)
# 4. Check polygon points are within bbox bounds (with tolerance)
tolerance = 10
xs = [p[0] for p in polygon]
ys = [p[1] for p in polygon]
assert min(xs) >= x1 - tolerance, f"Min polygon x {min(xs)} should be >= bbox x1 {x1}"
assert max(xs) <= x2 + tolerance, f"Max polygon x {max(xs)} should be <= bbox x2 {x2}"
assert min(ys) >= y1 - tolerance, f"Min polygon y {min(ys)} should be >= bbox y1 {y1}"
assert max(ys) <= y2 + tolerance, f"Max polygon y {max(ys)} should be <= bbox y2 {y2}"
# 5. Check polygon area is positive (using shoelace formula)
area = 0
n = len(polygon)
for i in range(n):
j = (i + 1) % n
area += polygon[i][0] * polygon[j][1]
area -= polygon[j][0] * polygon[i][1]
area = abs(area) / 2.0
assert area > 0, "Polygon area should be positive"
# 6. Check polygon area relative to bbox area
bbox_area = (x2 - x1) * (y2 - y1)
area_ratio = area / bbox_area
assert 0.1 <= area_ratio <= 1.5, (
f"Polygon area ratio {area_ratio:.2f} should be reasonable relative to bbox"
)
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_mask_to_polygon_consistency(self, mcp_server: FastMCP, test_image_path):
"""Tests that mask and polygon representations are consistent for the same object."""
with Image.open(test_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
mask_result = await client.call_tool("detect", {"input_path": test_image_path, "model_name": "yoloe-11s-seg-pf.pt", "return_geometry": True, "geometry_format": "mask", "confidence": 0.5})
polygon_result = await client.call_tool("detect", {"input_path": test_image_path, "model_name": "yoloe-11s-seg-pf.pt", "return_geometry": True, "geometry_format": "polygon", "confidence": 0.5})
mask_data = mask_result.structured_content
polygon_data = polygon_result.structured_content
assert len(mask_data["detections"]) == len(polygon_data["detections"])
if len(mask_data["detections"]) > 0:
mask_detections = sorted(mask_data["detections"], key=lambda x: (x["class"], -x["confidence"]))
polygon_detections = sorted(polygon_data["detections"], key=lambda x: (x["class"], -x["confidence"]))
mask_det = mask_detections[0]
polygon_det = polygon_detections[0]
mask_path = mask_det["mask_path"]
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
assert mask_det["class"] == polygon_det["class"]
mask_bbox = mask_det["bbox"]
polygon_bbox = polygon_det["bbox"]
bbox_tolerance = 20
for i in range(4):
assert abs(mask_bbox[i] - polygon_bbox[i]) < bbox_tolerance
polygon_points = polygon_det["polygon"]
mask_height, mask_width = mask.shape
img = Image.new('L', (mask_width, mask_height), 0)
scale_x = mask_width / orig_width
scale_y = mask_height / orig_height
scaled_polygon = [(p[0] * scale_x, p[1] * scale_y) for p in polygon_points]
ImageDraw.Draw(img).polygon(scaled_polygon, outline=1, fill=1)
polygon_mask = np.array(img)
mask_bool = mask > 0
polygon_mask_bool = polygon_mask > 0
intersection = np.logical_and(mask_bool, polygon_mask_bool).sum()
union = np.logical_or(mask_bool, polygon_mask_bool).sum()
iou = intersection / union if union > 0 else 0
assert iou > 0.5
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_detect_mask_validation_on_simple_image(
self, mcp_server: FastMCP, test_segmentation_image_path
):
"""
Tests that generated masks are valid using a simple, predictable image.
It checks for binarity and bounding box confinement for every generated mask.
"""
with Image.open(test_segmentation_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool(
"detect",
{
"input_path": test_segmentation_image_path,
"model_name": "yoloe-11s-seg-pf.pt",
"return_geometry": True,
"geometry_format": "mask",
"confidence": 0.3,
},
)
detection_result = result.structured_content
assert detection_result is not None
# We expect at least a "dog" and a "cat" to be detected
detected_classes = [d["class"] for d in detection_result["detections"]]
assert "dog" in detected_classes
assert "cat" in detected_classes
assert len(detection_result["detections"]) >= 2
# Validate every mask that was generated
for detection in detection_result["detections"]:
assert "mask_path" in detection, "Each detection should have a mask_path"
mask_path = detection["mask_path"]
assert os.path.exists(mask_path), f"Mask file should exist at {mask_path}"
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
assert mask is not None, f"Mask file {mask_path} could not be read"
# 1. Check for binarity (only 0 and 255 values)
unique_values = np.unique(mask)
assert all(v in [0, 255] for v in unique_values), (
f"Mask {mask_path} is not binary. Found values: {unique_values}"
)
assert np.sum(mask) > 0, f"Mask {mask_path} should not be empty"
# 2. Check for bounding box confinement
bbox = detection["bbox"]
x1, y1, x2, y2 = bbox
mask_height, mask_width = mask.shape
# The model might return a mask that is the size of the original image
# or a cropped, resized version. We need to handle both cases by scaling.
scale_x = orig_width / mask_width
scale_y = orig_height / mask_height
# Find the bounding box of the mask's content
mask_indices = np.where(mask > 0)
if len(mask_indices[0]) > 0:
min_mask_y, max_mask_y = mask_indices[0].min(), mask_indices[0].max()
min_mask_x, max_mask_x = mask_indices[1].min(), mask_indices[1].max()
# Scale the detection bbox to the mask's coordinate system
scaled_x1 = x1 / scale_x
scaled_y1 = y1 / scale_y
scaled_x2 = x2 / scale_x
scaled_y2 = y2 / scale_y
# Check if the mask's content is within the scaled bbox (with tolerance)
tolerance = 10 # Use a small tolerance
assert min_mask_x >= scaled_x1 - tolerance, f"Mask content of {mask_path} extends past the left of its bbox"
assert max_mask_x <= scaled_x2 + tolerance, f"Mask content of {mask_path} extends past the right of its bbox"
assert min_mask_y >= scaled_y1 - tolerance, f"Mask content of {mask_path} extends past the top of its bbox"
assert max_mask_y <= scaled_y2 + tolerance, f"Mask content of {mask_path} extends past the bottom of its bbox"
================================================
FILE: tests/tools/test_draw_arrows.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for drawing arrows."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestDrawArrowsToolDefinition:
"""Tests for the draw_arrows tool definition and metadata."""
@pytest.mark.asyncio
async def test_draw_arrows_in_tools_list(self, mcp_server: FastMCP):
"""Tests that draw_arrows tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "draw_arrows" in tool_names, \
"draw_arrows tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_draw_arrows_description(self, mcp_server: FastMCP):
"""Tests that draw_arrows tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_arrows_tool = next((tool for tool in tools if tool.name == "draw_arrows"), None)
assert draw_arrows_tool.description, "draw_arrows tool should have a description"
assert "arrow" in draw_arrows_tool.description.lower(), \
"Description should mention that it draws arrows on an image"
@pytest.mark.asyncio
async def test_draw_arrows_parameters(self, mcp_server: FastMCP):
"""Tests that draw_arrows tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_arrows_tool = next((tool for tool in tools if tool.name == "draw_arrows"), None)
assert hasattr(draw_arrows_tool, "inputSchema"), \
"draw_arrows tool should have an inputSchema"
assert "properties" in draw_arrows_tool.inputSchema, \
"inputSchema should have properties field"
required_params = ["input_path", "arrows"]
for param in required_params:
assert param in draw_arrows_tool.inputSchema["properties"], \
f"draw_arrows tool should have a '{param}' property in its inputSchema"
assert "output_path" in draw_arrows_tool.inputSchema["properties"], \
"draw_arrows tool should have an 'output_path' property in its inputSchema"
assert draw_arrows_tool.inputSchema["properties"]["input_path"].get("type") == "string", \
"input_path should be of type string"
assert draw_arrows_tool.inputSchema["properties"]["arrows"].get("type") == "array", \
"arrows should be of type array"
arrows_items_schema = draw_arrows_tool.inputSchema["properties"]["arrows"].get("items", {})
assert arrows_items_schema.get("type") == "object", "arrows items should be objects"
output_path_schema = draw_arrows_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestDrawArrowsToolExecution:
"""Tests for the draw_arrows tool execution and results."""
@pytest.mark.asyncio
async def test_draw_arrows_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "output_arrows.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_arrows",
{
"input_path": test_image_path,
"arrows": [
{"x1": 50, "y1": 50, "x2": 150, "y2": 100, "color": [0, 0, 255], "thickness": 2, "tip_length": 0.2},
{"x1": 200, "y1": 150, "x2": 300, "y2": 250, "color": [255, 0, 0], "thickness": 3, "tip_length": 0.15}
],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400)
# Check if pixels along the arrow path are changed (not white)
# For the first arrow (red: BGR [0,0,255])
# Midpoint of the first arrow: ( (50+150)/2, (50+100)/2 ) = (100, 75)
# Check a pixel near the midpoint
assert not np.array_equal(img[75, 100], [255, 255, 255]), "First arrow (red) should be drawn"
# For the second arrow (blue: BGR [255,0,0])
# Midpoint of the second arrow: ( (200+300)/2, (150+250)/2 ) = (250, 200)
assert not np.array_equal(img[200, 250], [255, 255, 255]), "Second arrow (blue) should be drawn"
@pytest.mark.asyncio
async def test_draw_arrows_default_parameters(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "default_arrows_output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_arrows",
{
"input_path": test_image_path,
"arrows": [{"x1": 10, "y1": 10, "x2": 100, "y2": 100}], # Use default color, thickness, tip_length
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel near the midpoint (55, 55) for default black color [0,0,0]
# It should not be white [255,255,255]
assert not np.array_equal(img[55, 55], [255, 255, 255]), "Arrow with default parameters should be drawn"
@pytest.mark.asyncio
async def test_draw_arrows_default_output_path(self, mcp_server: FastMCP, test_image_path):
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_arrows",
{"input_path": test_image_path, "arrows": [{"x1": 20, "y1": 20, "x2": 120, "y2": 120}]},
)
expected_output = test_image_path.replace(".png", "_with_arrows.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
img = cv2.imread(expected_output)
assert img.shape[:2] == (300, 400)
================================================
FILE: tests/tools/test_draw_circle.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for drawing circles."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestDrawCircleToolDefinition:
"""Tests for the draw_circles tool definition and metadata."""
@pytest.mark.asyncio
async def test_draw_circles_in_tools_list(self, mcp_server: FastMCP):
"""Tests that draw_circles tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "draw_circles" in tool_names, \
"draw_circles tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_draw_circles_description(self, mcp_server: FastMCP):
"""Tests that draw_circles tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_circles_tool = next((tool for tool in tools if tool.name == "draw_circles"), None)
assert draw_circles_tool.description, "draw_circles tool should have a description"
assert "circle" in draw_circles_tool.description.lower(), \
"Description should mention that it draws circles on an image"
@pytest.mark.asyncio
async def test_draw_circles_parameters(self, mcp_server: FastMCP):
"""Tests that draw_circles tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_circles_tool = next((tool for tool in tools if tool.name == "draw_circles"), None)
assert hasattr(draw_circles_tool, "inputSchema"), \
"draw_circles tool should have an inputSchema"
assert "properties" in draw_circles_tool.inputSchema, \
"inputSchema should have properties field"
required_params = ["input_path", "circles"]
for param in required_params:
assert param in draw_circles_tool.inputSchema["properties"], \
f"draw_circles tool should have a '{param}' property in its inputSchema"
assert "output_path" in draw_circles_tool.inputSchema["properties"], \
"draw_circles tool should have an 'output_path' property in its inputSchema"
assert draw_circles_tool.inputSchema["properties"]["input_path"].get("type") == "string", \
"input_path should be of type string"
assert draw_circles_tool.inputSchema["properties"]["circles"].get("type") == "array", \
"circles should be of type array"
circles_items_schema = draw_circles_tool.inputSchema["properties"]["circles"].get("items", {})
if "$ref" in circles_items_schema:
ref_path = circles_items_schema["$ref"]
model_name = ref_path.split("/")[-1]
defs_schema = draw_circles_tool.inputSchema.get("$defs", {})
assert model_name in defs_schema, f"'$defs' should contain a definition for '{model_name}'"
circle_item_schema = defs_schema[model_name]
else:
circle_item_schema = circles_items_schema
assert circle_item_schema.get("type") == "object", "Circle item schema should be an object"
circles_props = circle_item_schema.get("properties", {})
assert "center_x" in circles_props and circles_props["center_x"].get("type") == "integer"
assert "center_y" in circles_props and circles_props["center_y"].get("type") == "integer"
assert "radius" in circles_props and circles_props["radius"].get("type") == "integer"
required_circle_item_fields = circle_item_schema.get("required", [])
assert "center_x" in required_circle_item_fields
assert "center_y" in required_circle_item_fields
assert "radius" in required_circle_item_fields
assert "color" in circles_props, "'color' property should be in circles_props"
assert "thickness" in circles_props, "'thickness' property should be in circles_props"
assert "filled" in circles_props, "'filled' property should be in circles_props"
output_path_schema = draw_circles_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestDrawCircleToolExecution:
"""Tests for the draw_circles tool execution and results."""
@pytest.mark.asyncio
async def test_draw_circles_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "output_circles.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_circles",
{
"input_path": test_image_path,
"circles": [
{"center_x": 100, "center_y": 100, "radius": 50, "color": [0, 0, 255], "thickness": 2},
{"center_x": 250, "center_y": 150, "radius": 30, "color": [255, 0, 0], "thickness": 3}
],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400)
assert not np.array_equal(img[100, 150-1], [255, 255, 255]), "Circle 1 (red) should be drawn"
assert not np.array_equal(img[150, 280-1], [255, 255, 255]), "Circle 2 (blue) should be drawn"
@pytest.mark.asyncio
async def test_draw_filled_circle(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "filled_circle_output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_circles",
{
"input_path": test_image_path,
"circles": [{"center_x": 150, "center_y": 150, "radius": 50, "color": [0, 255, 0], "filled": True}],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
assert np.array_equal(img[150, 150], [0, 255, 0]), "Circle should be filled with green"
assert np.array_equal(img[150 + 40, 150 + 0], [0, 255, 0]), "Inner part of filled circle should be green"
@pytest.mark.asyncio
async def test_draw_circles_default_output_path(self, mcp_server: FastMCP, test_image_path):
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_circles",
{"input_path": test_image_path, "circles": [{"center_x": 50, "center_y": 50, "radius": 20}]},
)
expected_output = test_image_path.replace(".png", "_with_circles.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
img = cv2.imread(expected_output)
assert img.shape[:2] == (300, 400)
================================================
FILE: tests/tools/test_draw_lines.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for drawing lines."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestDrawLinesToolDefinition:
"""Tests for the draw_lines tool definition and metadata."""
@pytest.mark.asyncio
async def test_draw_lines_in_tools_list(self, mcp_server: FastMCP):
"""Tests that draw_lines tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "draw_lines" in tool_names, \
"draw_lines tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_draw_lines_description(self, mcp_server: FastMCP):
"""Tests that draw_lines tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_lines_tool = next((tool for tool in tools if tool.name == "draw_lines"), None)
assert draw_lines_tool.description, "draw_lines tool should have a description"
assert "line" in draw_lines_tool.description.lower(), \
"Description should mention that it draws lines on an image"
@pytest.mark.asyncio
async def test_draw_lines_parameters(self, mcp_server: FastMCP):
"""Tests that draw_lines tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_lines_tool = next((tool for tool in tools if tool.name == "draw_lines"), None)
assert hasattr(draw_lines_tool, "inputSchema"), \
"draw_lines tool should have an inputSchema"
assert "properties" in draw_lines_tool.inputSchema, \
"inputSchema should have properties field"
required_params = ["input_path", "lines"]
for param in required_params:
assert param in draw_lines_tool.inputSchema["properties"], \
f"draw_lines tool should have a '{param}' property in its inputSchema"
assert "output_path" in draw_lines_tool.inputSchema["properties"], \
"draw_lines tool should have an 'output_path' property in its inputSchema"
assert draw_lines_tool.inputSchema["properties"]["input_path"].get("type") == "string", \
"input_path should be of type string"
assert draw_lines_tool.inputSchema["properties"]["lines"].get("type") == "array", \
"lines should be of type array"
lines_items_schema = draw_lines_tool.inputSchema["properties"]["lines"].get("items", {})
assert lines_items_schema.get("type") == "object", "lines items should be objects"
output_path_schema = draw_lines_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestDrawLinesToolExecution:
"""Tests for the draw_lines tool execution and results."""
@pytest.mark.asyncio
async def test_draw_lines_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "output_lines.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_lines",
{
"input_path": test_image_path,
"lines": [
{"x1": 50, "y1": 50, "x2": 150, "y2": 100, "color": [0, 0, 255], "thickness": 2},
{"x1": 200, "y1": 150, "x2": 300, "y2": 250, "color": [255, 0, 0], "thickness": 3}
],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400)
# Check if pixels along the line path are changed (not white)
# For the first line (red: BGR [0,0,255])
# Midpoint of the first line: ( (50+150)/2, (50+100)/2 ) = (100, 75)
# Check a pixel near the midpoint
assert not np.array_equal(img[75, 100], [255, 255, 255]), "First line (red) should be drawn"
# For the second line (blue: BGR [255,0,0])
# Midpoint of the second line: ( (200+300)/2, (150+250)/2 ) = (250, 200)
assert not np.array_equal(img[200, 250], [255, 255, 255]), "Second line (blue) should be drawn"
@pytest.mark.asyncio
async def test_draw_lines_default_parameters(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
output_path = str(tmp_path / "default_lines_output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_lines",
{
"input_path": test_image_path,
"lines": [{"x1": 10, "y1": 10, "x2": 100, "y2": 100}], # Use default color, thickness
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel near the midpoint (55, 55) for default black color [0,0,0]
# It should not be white [255,255,255]
assert not np.array_equal(img[55, 55], [255, 255, 255]), "Line with default parameters should be drawn"
@pytest.mark.asyncio
async def test_draw_lines_default_output_path(self, mcp_server: FastMCP, test_image_path):
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_lines",
{"input_path": test_image_path, "lines": [{"x1": 20, "y1": 20, "x2": 120, "y2": 120}]},
)
expected_output = test_image_path.replace(".png", "_with_lines.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
img = cv2.imread(expected_output)
assert img.shape[:2] == (300, 400)
================================================
FILE: tests/tools/test_draw_rectangle.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for drawing rectangles."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestDrawRectanglesToolDefinition:
"""Tests for the draw_rectangles tool definition and metadata."""
@pytest.mark.asyncio
async def test_draw_rectangles_in_tools_list(self, mcp_server: FastMCP):
"""Tests that draw_rectangles tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if draw_rectangles is in the list of tools
tool_names = [tool.name for tool in tools]
assert "draw_rectangles" in tool_names, (
"draw_rectangles tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_draw_rectangles_description(self, mcp_server: FastMCP):
"""Tests that draw_rectangles tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_rectangles_tool = next((tool for tool in tools if tool.name == "draw_rectangles"), None)
# Check description
assert draw_rectangles_tool.description, "draw_rectangles tool should have a description"
assert "rectangle" in draw_rectangles_tool.description.lower(), (
"Description should mention that it draws rectangles on an image"
)
@pytest.mark.asyncio
async def test_draw_rectangles_parameters(self, mcp_server: FastMCP):
"""Tests that draw_rectangles tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_rectangles_tool = next((tool for tool in tools if tool.name == "draw_rectangles"), None)
# Check input schema
assert hasattr(draw_rectangles_tool, "inputSchema"), (
"draw_rectangles tool should have an inputSchema"
)
assert "properties" in draw_rectangles_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "rectangles"]
for param in required_params:
assert param in draw_rectangles_tool.inputSchema["properties"], (
f"draw_rectangles tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
assert "output_path" in draw_rectangles_tool.inputSchema["properties"], (
"draw_rectangles tool should have an 'output_path' property in its inputSchema"
)
# Check parameter types
assert (
draw_rectangles_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert (
draw_rectangles_tool.inputSchema["properties"]["rectangles"].get("type")
== "array"
), "rectangles should be of type array"
# Check output_path type - it can be string or null since it's optional
output_path_schema = draw_rectangles_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
# Check that string is one of the allowed types
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestDrawRectanglesToolExecution:
"""Tests for the draw_rectangles tool execution and results."""
@pytest.mark.asyncio
async def test_draw_rectangles_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the draw_rectangles tool execution and return value."""
output_path = str(tmp_path / "output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_rectangles",
{
"input_path": test_image_path,
"rectangles": [
{
"x1": 50,
"y1": 50,
"x2": 150,
"y2": 100,
"color": [0, 0, 255], # Red in BGR
"thickness": 2
},
{
"x1": 200,
"y1": 150,
"x2": 300,
"y2": 250,
"color": [255, 0, 0], # Blue in BGR
"thickness": 3
}
],
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the image was created with correct dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400) # height, width
# Verify that pixels at rectangle locations have changed color
# Check a point on the first rectangle's border
assert not np.array_equal(img[50, 50], [255, 255, 255]), "Rectangle 1 should be drawn"
# Check a point on the second rectangle's border
assert not np.array_equal(img[150, 200], [255, 255, 255]), "Rectangle 2 should be drawn"
@pytest.mark.asyncio
async def test_draw_filled_rectangle(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests drawing a filled rectangle."""
output_path = str(tmp_path / "filled_output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_rectangles",
{
"input_path": test_image_path,
"rectangles": [
{
"x1": 100,
"y1": 100,
"x2": 200,
"y2": 200,
"color": [0, 255, 0], # Green in BGR
"filled": True
}
],
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the image was created with correct dimensions
img = cv2.imread(output_path)
# Check a point inside the filled rectangle
# It should be green (BGR: 0, 255, 0)
assert np.array_equal(img[150, 150], [0, 255, 0]), "Rectangle should be filled with green"
@pytest.mark.asyncio
async def test_draw_rectangles_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the draw_rectangles tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_rectangles",
{
"input_path": test_image_path,
"rectangles": [
{
"x1": 50,
"y1": 50,
"x2": 150,
"y2": 100,
"color": [0, 0, 0], # Black in BGR
"thickness": 2
}
]
},
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_with_rectangles.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
# Verify the image was created with correct dimensions
img = cv2.imread(expected_output)
assert img.shape[:2] == (300, 400) # height, width
================================================
FILE: tests/tools/test_draw_text.py
================================================
import os
import cv2
import easyocr
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
# Add this line to filter out the PyTorch warnings
pytestmark = pytest.mark.filterwarnings("ignore:.*'pin_memory' argument is set as true but no accelerator is found.*:UserWarning")
# Initialize the OCR reader for testing
reader = None
def get_ocr_reader():
"""Get or initialize the EasyOCR reader for testing."""
global reader
if reader is None:
reader = easyocr.Reader(['en'])
return reader
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for drawing text."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestDrawTextsToolDefinition:
"""Tests for the draw_texts tool definition and metadata."""
@pytest.mark.asyncio
async def test_draw_texts_in_tools_list(self, mcp_server: FastMCP):
"""Tests that draw_texts tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if draw_texts is in the list of tools
tool_names = [tool.name for tool in tools]
assert "draw_texts" in tool_names, (
"draw_texts tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_draw_texts_description(self, mcp_server: FastMCP):
"""Tests that draw_texts tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_texts_tool = next((tool for tool in tools if tool.name == "draw_texts"), None)
# Check description
assert draw_texts_tool.description, "draw_texts tool should have a description"
assert "text" in draw_texts_tool.description.lower(), (
"Description should mention that it draws text on an image"
)
@pytest.mark.asyncio
async def test_draw_texts_parameters(self, mcp_server: FastMCP):
"""Tests that draw_texts tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
draw_texts_tool = next((tool for tool in tools if tool.name == "draw_texts"), None)
# Check input schema
assert hasattr(draw_texts_tool, "inputSchema"), (
"draw_texts tool should have an inputSchema"
)
assert "properties" in draw_texts_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "texts"]
for param in required_params:
assert param in draw_texts_tool.inputSchema["properties"], (
f"draw_texts tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
assert "output_path" in draw_texts_tool.inputSchema["properties"], (
"draw_texts tool should have an 'output_path' property in its inputSchema"
)
# Check parameter types
assert (
draw_texts_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert (
draw_texts_tool.inputSchema["properties"]["texts"].get("type")
== "array"
), "texts should be of type array"
# Check output_path type - it can be string or null since it's optional
output_path_schema = draw_texts_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
# Check that string is one of the allowed types
string_type_present = any(
type_option.get("type") == "string"
for type_option in output_path_schema["anyOf"]
)
assert string_type_present, "output_path should allow string type"
class TestDrawTextsToolExecution:
"""Tests for the draw_texts tool execution and results."""
@pytest.mark.asyncio
async def test_draw_texts_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the draw_texts tool execution and return value."""
output_path = str(tmp_path / "output.png")
# Define the text to draw
text1 = "Hello World"
text2 = "Testing"
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_texts",
{
"input_path": test_image_path,
"texts": [
{
"text": text1,
"x": 50,
"y": 50,
"font_scale": 1.0,
"color": [0, 0, 255], # Red in BGR
"thickness": 2
},
{
"text": text2,
"x": 100,
"y": 150,
"font_scale": 2.0,
"color": [255, 0, 0], # Blue in BGR
"thickness": 3,
"font_face": "FONT_HERSHEY_COMPLEX"
}
],
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the image was created with correct dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (300, 400) # height, width
# Use OCR to verify the text was actually drawn
reader = get_ocr_reader()
ocr_results = reader.readtext(output_path)
# Extract the detected text
detected_texts = [result[1] for result in ocr_results]
# Check if our drawn texts are detected by OCR
# We use partial matching because OCR might not be 100% accurate
assert any(text1 in detected_text for detected_text in detected_texts), \
f"Expected text '{text1}' not found in OCR results: {detected_texts}"
assert any(text2 in detected_text for detected_text in detected_texts), \
f"Expected text '{text2}' not found in OCR results: {detected_texts}"
@pytest.mark.asyncio
async def test_draw_texts_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the draw_texts tool with default output path."""
# Define the text to draw
test_text = "Simple Text"
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_texts",
{
"input_path": test_image_path,
"texts": [
{
"text": test_text,
"x": 50,
"y": 50,
"font_scale": 1.5, # Larger scale for better OCR detection
"thickness": 2
}
]
},
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_with_text.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
# Use OCR to verify the text was actually drawn
reader = get_ocr_reader()
ocr_results = reader.readtext(expected_output)
# Extract the detected text
detected_texts = [result[1] for result in ocr_results]
# Check if our drawn text is detected by OCR
assert any(test_text in detected_text for detected_text in detected_texts), \
f"Expected text '{test_text}' not found in OCR results: {detected_texts}"
@pytest.mark.asyncio
async def test_draw_texts_minimal_parameters(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the draw_texts tool with minimal required parameters."""
output_path = str(tmp_path / "minimal_output.png")
# Define the text to draw
test_text = "Minimal Text"
async with Client(mcp_server) as client:
result = await client.call_tool(
"draw_texts",
{
"input_path": test_image_path,
"texts": [
{
"text": test_text,
"x": 50,
"y": 50,
"font_scale": 1.5, # Larger scale for better OCR detection
"thickness": 2
}
],
"output_path": output_path
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Use OCR to verify the text was actually drawn
reader = get_ocr_reader()
ocr_results = reader.readtext(output_path)
# Extract the detected text
detected_texts = [result[1] for result in ocr_results]
# Check if our drawn text is detected by OCR
assert any(test_text in detected_text for detected_text in detected_texts), \
f"Expected text '{test_text}' not found in OCR results: {detected_texts}"
assert os.path.exists(output_path)
================================================
FILE: tests/tools/test_fill.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image with a black and white background for filling."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
# Draw a black rectangle to check blending against
cv2.rectangle(img, (100, 75), (300, 225), (0, 0, 0), -1)
cv2.imwrite(str(img_path), img)
return str(img_path)
@pytest.fixture
def test_jpeg_image_path(tmp_path):
"""Create a test JPEG image (no alpha channel) for testing transparency operations."""
img_path = tmp_path / "test_image.jpg"
# Create a white image
img = np.ones((300, 400, 3), dtype=np.uint8) * 255
# Draw a black rectangle to check blending against
cv2.rectangle(img, (100, 75), (300, 225), (0, 0, 0), -1)
# Draw a red circle
cv2.circle(img, (200, 150), 50, (0, 0, 255), -1)
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestFillToolDefinition:
"""Tests for the fill tool definition and metadata."""
@pytest.mark.asyncio
async def test_fill_in_tools_list(self, mcp_server: FastMCP):
"""Tests that fill tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "fill" in tool_names, "fill tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_fill_description(self, mcp_server: FastMCP):
"""Tests that fill tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
fill_tool = next((tool for tool in tools if tool.name == "fill"), None)
assert fill_tool.description, "fill tool should have a description"
assert "fill" in fill_tool.description.lower(), "Description should mention that it fills areas of an image"
@pytest.mark.asyncio
async def test_fill_parameters(self, mcp_server: FastMCP):
"""Tests that fill tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
fill_tool = next((tool for tool in tools if tool.name == "fill"), None)
assert hasattr(fill_tool, "inputSchema"), "fill tool should have an inputSchema"
assert "properties" in fill_tool.inputSchema, "inputSchema should have properties field"
required_params = ["input_path", "areas"]
for param in required_params:
assert param in fill_tool.inputSchema["properties"], f"fill tool should have a '{param}' property in its inputSchema"
assert "output_path" in fill_tool.inputSchema["properties"], "fill tool should have an 'output_path' property in its inputSchema"
assert fill_tool.inputSchema["properties"]["input_path"].get("type") == "string", "input_path should be of type string"
assert fill_tool.inputSchema["properties"]["areas"].get("type") == "array", "areas should be of type array"
output_path_schema = fill_tool.inputSchema["properties"]["output_path"]
assert "anyOf" in output_path_schema, "output_path should have anyOf field for optional types"
string_type_present = any(type_option.get("type") == "string" for type_option in output_path_schema["anyOf"])
assert string_type_present, "output_path should allow string type"
class TestFillToolExecution:
"""Tests for the fill tool execution and results."""
@pytest.mark.asyncio
async def test_fill_tool_execution(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the fill tool execution and return value."""
output_path = str(tmp_path / "output.png")
fill_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": [0, 0, 255], "opacity": 0.5}
async with Client(mcp_server) as client:
result = await client.call_tool("fill", {"input_path": test_image_path, "areas": [fill_area], "output_path": output_path})
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
filled_pixel = img[150, 200]
assert np.allclose(filled_pixel, [0, 0, 128], atol=2)
unfilled_pixel = img[150, 120]
assert np.array_equal(unfilled_pixel, [0, 0, 0])
white_pixel = img[50, 50]
assert np.array_equal(white_pixel, [255, 255, 255])
@pytest.mark.asyncio
async def test_fill_polygon_area(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the fill tool with a polygon area."""
output_path = str(tmp_path / "output_poly.png")
polygon_area = {"polygon": [[160, 110], [240, 110], [200, 190]], "color": [0, 255, 0], "opacity": 0.8}
async with Client(mcp_server) as client:
result = await client.call_tool("fill", {"input_path": test_image_path, "areas": [polygon_area], "output_path": output_path})
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
poly_center_pixel = img[130, 200]
assert np.allclose(poly_center_pixel, [0, 204, 0], atol=2)
@pytest.mark.asyncio
async def test_fill_default_output_path(self, mcp_server: FastMCP, test_image_path):
"""Tests the fill tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool("fill", {"input_path": test_image_path, "areas": [{"x1": 150, "y1": 100, "x2": 250, "y2": 200}]})
expected_output = test_image_path.replace(".png", "_filled.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
@pytest.mark.asyncio
async def test_fill_multiple_areas(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the fill tool with multiple overlapping areas."""
output_path = str(tmp_path / "multi_fill.png")
async with Client(mcp_server) as client:
await client.call_tool("fill", {"input_path": test_image_path, "areas": [{"x1": 110, "y1": 85, "x2": 160, "y2": 135, "color": [0, 0, 255], "opacity": 1.0}, {"x1": 150, "y1": 125, "x2": 200, "y2": 175, "color": [0, 255, 0], "opacity": 0.5}], "output_path": output_path})
img = cv2.imread(output_path)
assert np.array_equal(img[100, 120], [0, 0, 255])
assert np.allclose(img[150, 160], [0, 128, 0], atol=2)
assert np.allclose(img[130, 155], [0, 128, 128], atol=2)
@pytest.mark.asyncio
async def test_fill_transparent_rectangle(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests making a rectangular area transparent with all channels set to 0."""
output_path = str(tmp_path / "output_transparent.png")
fill_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [fill_area],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# Check a pixel inside the transparent area - all channels should be 0
pixel_inside = img[150, 200]
assert np.array_equal(pixel_inside, [0, 0, 0, 0]), "All BGRA channels should be 0 for transparent areas"
# Check multiple pixels in the transparent area
for y in range(100, 200, 20):
for x in range(150, 250, 20):
pixel = img[y, x]
assert np.array_equal(pixel, [0, 0, 0, 0]), f"Pixel at ({y}, {x}) should have all channels set to 0"
# Check a pixel outside the transparent area
pixel_outside = img[50, 50]
assert pixel_outside[3] == 255 # Alpha should be 255
assert np.array_equal(pixel_outside[:3], [255, 255, 255]) # Should be white
@pytest.mark.asyncio
async def test_fill_transparent_polygon(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests making a polygonal area transparent with all channels set to 0."""
output_path = str(tmp_path / "output_transparent_poly.png")
fill_area = {"polygon": [[160, 110], [240, 110], [200, 190]], "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [fill_area],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# Check pixels inside the transparent polygon - all channels should be 0
test_points = [
(140, 200), # Center of polygon
(170, 200), # Another point inside
(150, 180), # Another point inside
]
for y, x in test_points:
pixel = img[y, x]
assert np.array_equal(pixel, [0, 0, 0, 0]), f"Pixel at ({y}, {x}) inside polygon should have all channels set to 0"
# Check a pixel outside the transparent area
pixel_outside = img[50, 50]
assert pixel_outside[3] == 255 # Alpha should be 255
assert np.array_equal(pixel_outside[:3], [255, 255, 255]) # Should be white
@pytest.mark.asyncio
async def test_fill_invert_rectangle(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the fill tool with invert_areas for a rectangle."""
output_path = str(tmp_path / "output_inverted.png")
# Define a rectangle in the center (where the black rectangle is)
fill_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": [0, 255, 0], "opacity": 1.0}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [fill_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Center pixel (inside the specified area) should NOT be filled - remains original
center_pixel = img[150, 200]
assert np.array_equal(center_pixel, [0, 0, 0]) # Should remain black (original)
# Pixels outside the area should be filled with green
outside_pixel = img[50, 50]
assert np.allclose(outside_pixel, [0, 255, 0], atol=2) # Should be green - allow tolerance for JPEG
# Another outside pixel
edge_pixel = img[250, 350]
assert np.array_equal(edge_pixel, [0, 255, 0]) # Should be green
@pytest.mark.asyncio
async def test_fill_invert_polygon(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests the fill tool with invert_areas for a polygon."""
output_path = str(tmp_path / "output_inverted_poly.png")
# Define a triangle polygon
polygon_area = {"polygon": [[160, 110], [240, 110], [200, 190]], "color": [255, 0, 0], "opacity": 0.8}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [polygon_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Center of polygon (inside the specified area) should NOT be filled
poly_center = img[150, 200]
assert np.array_equal(poly_center, [0, 0, 0]) # Should remain black
# Outside pixels should be filled with blue at 80% opacity
# Since original is white [255,255,255], and we're applying blue [255,0,0] at 80% opacity:
# Result = 0.8 * [255,0,0] + 0.2 * [255,255,255] = [255, 51, 51] (approximately)
outside_pixel = img[50, 50]
assert np.allclose(outside_pixel, [255, 51, 51], atol=2) # 80% blue over white
@pytest.mark.asyncio
async def test_fill_invert_transparent(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests making everything except a rectangle transparent (background removal) with all channels set to 0."""
output_path = str(tmp_path / "output_bg_removed.png")
# Keep only the center rectangle, make everything else transparent
keep_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [keep_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# Inside the kept area - should be opaque (not modified)
inside_pixel = img[150, 200]
assert inside_pixel[3] == 255 # Alpha should be 255 (opaque)
assert np.array_equal(inside_pixel[:3], [0, 0, 0]) # Color preserved
# Outside the kept area - should be fully transparent (all channels 0)
outside_pixels = [
(50, 50), # Top left
(250, 350), # Bottom right
(10, 10), # Corner
]
for y, x in outside_pixels:
pixel = img[y, x]
assert np.array_equal(pixel, [0, 0, 0, 0]), f"Pixel at ({y}, {x}) outside kept area should have all channels set to 0"
@pytest.mark.asyncio
async def test_fill_invert_multiple_areas(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests invert_areas with multiple areas to keep."""
output_path = str(tmp_path / "output_multi_keep.png")
# Keep two areas, fill everything else
areas = [
{"x1": 50, "y1": 50, "x2": 100, "y2": 100, "color": [0, 0, 255], "opacity": 1.0},
{"x1": 200, "y1": 150, "x2": 250, "y2": 200} # Will use first area's color
]
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": areas,
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# First kept area should NOT be filled (remains original)
kept_pixel1 = img[75, 75]
assert np.array_equal(kept_pixel1, [255, 255, 255]) # Should remain white
# Second kept area should NOT be filled (remains original)
kept_pixel2 = img[175, 225]
assert np.array_equal(kept_pixel2, [0, 0, 0]) # Should remain black
# Area between them should be filled with blue
between_pixel = img[125, 150]
assert np.array_equal(between_pixel, [0, 0, 255]) # Should be blue (BGR format)
@pytest.mark.asyncio
async def test_fill_invert_complex_polygon(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests invert_areas with a complex polygon shape to keep."""
output_path = str(tmp_path / "output_complex_keep.png")
# Create a star-like polygon
star_polygon = {
"polygon": [
[200, 50], [220, 100], [270, 100], [230, 130],
[250, 180], [200, 150], [150, 180], [170, 130],
[130, 100], [180, 100]
],
"color": None # Make background transparent
}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [star_polygon],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# Center of star should be opaque (not modified)
star_center = img[115, 200]
assert star_center[3] == 255 # Should be opaque
# Outside corners should be transparent
corner_pixel = img[10, 10]
assert corner_pixel[3] == 0 # Should be transparent
@pytest.mark.asyncio
async def test_fill_invert_single_area_transparent(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests a simple background removal use case - keep single object, remove background."""
output_path = str(tmp_path / "object_only.png")
# Define object boundaries
object_area = {"x1": 100, "y1": 80, "x2": 300, "y2": 220, "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [object_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
# Check that image has alpha channel
assert img.shape[2] == 4
# Inside object area should be opaque
object_pixel = img[150, 200]
assert object_pixel[3] == 255
# Outside object area should be transparent
bg_pixel = img[10, 10]
assert bg_pixel[3] == 0
# Edge case - just outside the object
edge_pixel = img[79, 150] # Just above the object
assert edge_pixel[3] == 0
@pytest.mark.asyncio
async def test_fill_with_mask_path(self, mcp_server: FastMCP, test_image_path, tmp_path):
"""Tests filling an area using a mask from a file."""
output_path = str(tmp_path / "output_mask_fill.png")
mask_path = str(tmp_path / "test_mask.png")
# Create a mask image (e.g., a circle)
mask_img = np.zeros((300, 400), dtype=np.uint8)
cv2.circle(mask_img, (150, 150), 50, 255, -1)
cv2.imwrite(mask_path, mask_img)
fill_area = {"mask_path": mask_path, "color": [0, 255, 255], "opacity": 1.0}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_image_path,
"areas": [fill_area],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel inside the masked area
assert np.array_equal(img[150, 150], [0, 255, 255])
# Check a pixel outside the masked area
assert np.array_equal(img[50, 50], [255, 255, 255])
class TestFillToolWithJPEG:
"""Tests for the fill tool with JPEG images (no alpha channel)."""
@pytest.mark.asyncio
async def test_fill_jpeg_to_transparent_rectangle(self, mcp_server: FastMCP, test_jpeg_image_path, tmp_path):
"""Tests making a rectangular area transparent in a JPEG image with all channels set to 0."""
output_path = str(tmp_path / "output_transparent.png") # Output as PNG to support transparency
fill_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_jpeg_image_path,
"areas": [fill_area],
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel added
# Check a pixel inside the transparent area - all channels should be 0
pixel_inside = img[150, 200]
assert np.array_equal(pixel_inside, [0, 0, 0, 0]), "All BGRA channels should be 0 for transparent areas"
# Check a pixel outside the transparent area
pixel_outside = img[50, 50]
assert pixel_outside[3] == 255 # Alpha should be 255
assert np.array_equal(pixel_outside[:3], [255, 255, 255]) # Should be white
@pytest.mark.asyncio
async def test_fill_jpeg_invert_transparent(self, mcp_server: FastMCP, test_jpeg_image_path, tmp_path):
"""Tests making everything except a rectangle transparent in a JPEG image (background removal) with all channels set to 0."""
output_path = str(tmp_path / "output_bg_removed.png")
# Keep only the center rectangle, make everything else transparent
keep_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": None}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_jpeg_image_path,
"areas": [keep_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# Inside the kept area - should be opaque (not modified)
inside_pixel = img[150, 200]
assert inside_pixel[3] == 255 # Alpha should be 255 (opaque)
# Check the red circle is preserved (use allclose due to JPEG compression)
circle_center = img[150, 200]
assert circle_center[3] == 255 # Should be opaque
assert np.allclose(circle_center[:3], [0, 0, 255], atol=2) # Should be red (BGR) - allow tolerance for JPEG
# Outside the kept area - should be fully transparent (all channels 0)
outside_pixels = [
(50, 50), # Top left
(10, 10), # Corner
(250, 350), # Bottom right
]
for y, x in outside_pixels:
pixel = img[y, x]
assert np.array_equal(pixel, [0, 0, 0, 0]), f"Pixel at ({y}, {x}) outside kept area should have all channels set to 0"
@pytest.mark.asyncio
async def test_fill_jpeg_invert_with_color(self, mcp_server: FastMCP, test_jpeg_image_path, tmp_path):
"""Tests invert_areas with color fill on a JPEG image."""
output_path = str(tmp_path / "output_inverted_color.jpg") # Keep as JPEG
# Define a rectangle in the center
fill_area = {"x1": 150, "y1": 100, "x2": 250, "y2": 200, "color": [0, 255, 0], "opacity": 1.0}
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_jpeg_image_path,
"areas": [fill_area],
"invert_areas": True,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Center pixel (inside the specified area) should NOT be filled
center_pixel = img[150, 200]
assert np.allclose(center_pixel, [0, 0, 255], atol=2) # Should remain red - allow tolerance for JPEG
# Pixels outside the area should be filled with green
outside_pixel = img[50, 50]
assert np.allclose(outside_pixel, [0, 255, 0], atol=2) # Should be green - allow tolerance for JPEG
@pytest.mark.asyncio
async def test_fill_jpeg_multiple_transparent_areas(self, mcp_server: FastMCP, test_jpeg_image_path, tmp_path):
"""Tests multiple transparent areas on a JPEG image with all channels set to 0."""
output_path = str(tmp_path / "output_multi_transparent.png")
areas = [
{"x1": 50, "y1": 50, "x2": 100, "y2": 100, "color": None},
{"polygon": [[250, 150], [350, 150], [300, 250]], "color": None}
]
async with Client(mcp_server) as client:
result = await client.call_tool(
"fill",
{
"input_path": test_jpeg_image_path,
"areas": areas,
"output_path": output_path
}
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path, cv2.IMREAD_UNCHANGED)
assert img.shape[2] == 4 # Should have alpha channel
# First transparent area - all channels should be 0
pixel_area1 = img[75, 75]
assert np.array_equal(pixel_area1, [0, 0, 0, 0]), "First transparent area should have all channels set to 0"
# Second transparent area (inside polygon) - all channels should be 0
pixel_area2 = img[180, 300]
assert np.array_equal(pixel_area2, [0, 0, 0, 0]), "Second transparent area should have all channels set to 0"
# Non-transparent area
pixel_normal = img[150, 200]
assert pixel_normal[3] == 255 # Should be opaque
assert np.allclose(pixel_normal[:3], [0, 0, 255], atol=2) # Should be red (from the circle)
================================================
FILE: tests/tools/test_find.py
================================================
import os
import shutil
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from PIL import Image, ImageDraw
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Path to a test image with known objects for finding."""
# Path to the test image in the tests/data directory
current_dir = os.path.dirname(os.path.abspath(__file__))
test_data_dir = os.path.join(os.path.dirname(current_dir), "data")
src_path = os.path.join(test_data_dir, "test_detection.jpg")
dest_path = tmp_path / "test_detection.jpg"
shutil.copy(src_path, dest_path)
return str(dest_path)
@pytest.fixture
def test_segmentation_image_path(tmp_path):
"""Path to a simple test image for segmentation mask validation."""
current_dir = os.path.dirname(os.path.abspath(__file__))
test_data_dir = os.path.join(os.path.dirname(current_dir), "data")
src_path = os.path.join(test_data_dir, "test_detection_mask.jpg")
dest_path = tmp_path / "test_detection_mask.jpg"
shutil.copy(src_path, dest_path)
return str(dest_path)
class TestFindToolDefinition:
"""Tests for the find tool definition and metadata."""
@pytest.mark.asyncio
async def test_find_in_tools_list(self, mcp_server: FastMCP):
"""Tests that find tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if find is in the list of tools
tool_names = [tool.name for tool in tools]
assert "find" in tool_names, (
"find tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_find_description(self, mcp_server: FastMCP):
"""Tests that find tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
find_tool = next((tool for tool in tools if tool.name == "find"), None)
# Check description
assert find_tool.description, "find tool should have a description"
assert "find" in find_tool.description.lower(), (
"Description should mention that it finds objects in an image"
)
@pytest.mark.asyncio
async def test_find_parameters(self, mcp_server: FastMCP):
"""Tests that find tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
find_tool = next((tool for tool in tools if tool.name == "find"), None)
# Check input schema
assert hasattr(find_tool, "inputSchema"), (
"find tool should have an inputSchema"
)
assert "properties" in find_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "description"]
for param in required_params:
assert param in find_tool.inputSchema["properties"], (
f"find tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
optional_params = ["confidence", "model_name", "return_all_matches", "return_geometry", "geometry_format"]
for param in optional_params:
assert param in find_tool.inputSchema["properties"], (
f"find tool should have a '{param}' property in its inputSchema"
)
# Check parameter types and defaults
assert (
find_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert (
find_tool.inputSchema["properties"]["description"].get("type")
== "string"
), "description should be of type string"
# Check optional parameters (now have anyOf structure with null)
confidence_schema = find_tool.inputSchema["properties"]["confidence"]
assert "anyOf" in confidence_schema, "confidence should have anyOf structure for optional parameter"
assert any(item.get("type") == "number" for item in confidence_schema["anyOf"]), "confidence should allow number type"
assert any(item.get("type") == "null" for item in confidence_schema["anyOf"]), "confidence should allow null type"
model_name_schema = find_tool.inputSchema["properties"]["model_name"]
assert "anyOf" in model_name_schema, "model_name should have anyOf structure for optional parameter"
assert any(item.get("type") == "string" for item in model_name_schema["anyOf"]), "model_name should allow string type"
assert any(item.get("type") == "null" for item in model_name_schema["anyOf"]), "model_name should allow null type"
assert (
find_tool.inputSchema["properties"]["return_all_matches"].get("type")
== "boolean"
), "return_all_matches should be of type boolean"
# New parameters for geometry
assert (
find_tool.inputSchema["properties"]["return_geometry"].get("type")
== "boolean"
), "return_geometry should be of type boolean"
assert (
find_tool.inputSchema["properties"]["return_geometry"].get("default")
is False
), "return_geometry default should be False"
assert (
find_tool.inputSchema["properties"]["geometry_format"].get("type")
== "string"
), "geometry_format should be of type string"
assert (
find_tool.inputSchema["properties"]["geometry_format"].get("enum")
== ["mask", "polygon"]
), "geometry_format enum should be ['mask', 'polygon']"
assert (
find_tool.inputSchema["properties"]["geometry_format"].get("default")
== "mask"
), "geometry_format default should be 'mask'"
class TestFindToolExecution:
"""Tests for the find tool execution and results."""
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_tool_execution(self, mcp_server: FastMCP, test_image_path):
"""Tests the find tool execution and return value."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"confidence": 0.25,
"model_name": "yoloe-11s-seg.pt",
"return_all_matches": True,
},
)
# Parse the result
find_result = result.structured_content
# Basic structure checks
assert "image_path" in find_result, "Result should contain image_path"
assert "query" in find_result, "Result should contain query"
assert "found_objects" in find_result, "Result should contain found_objects"
assert "found" in find_result, "Result should contain found flag"
assert find_result["image_path"] == test_image_path, "Image path should match input path"
assert find_result["query"] == "car", "Query should match input description"
assert isinstance(find_result["found_objects"], list), "found_objects should be a list"
# Verify that at least one object was found (the test image has 2 people)
assert find_result["found"] is True, "Should have found at least one car in the test image"
assert len(find_result["found_objects"]) > 0, "Should have found at least one car in the test image"
# Check the structure of each found object
for found_object in find_result["found_objects"]:
assert "description" in found_object, "Found object should have description"
assert "match" in found_object, "Found object should have match"
assert "confidence" in found_object, "Found object should have confidence"
assert "bbox" in found_object, "Found object should have bbox"
# Check that confidence is within expected range
assert 0 <= found_object["confidence"] <= 1, "Confidence should be between 0 and 1"
# Check that the bounding box has 4 coordinates
assert len(found_object["bbox"]) == 4, "Bounding box should have 4 coordinates"
# Check that the description matches the query
assert found_object["description"] == "car", "Description should match the query"
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_single_result(self, mcp_server: FastMCP, test_image_path):
"""Tests that the find tool returns only the best match when return_all_matches is False."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"confidence": 0.25,
"model_name": "yoloe-11s-seg.pt",
"return_all_matches": False,
},
)
# Parse the result
find_result = result.structured_content
# Verify that exactly one car was found when return_all_matches is False
assert find_result["found"] is True, "Should have found a car in the test image"
assert len(find_result["found_objects"]) == 1, "Should have returned exactly one car when return_all_matches is False"
# Check the structure of the found object
found_object = find_result["found_objects"][0]
assert "description" in found_object, "Found object should have description"
assert "match" in found_object, "Found object should have match"
assert "confidence" in found_object, "Found object should have confidence"
assert "bbox" in found_object, "Found object should have bbox"
# Check that confidence is within expected range
assert 0 <= found_object["confidence"] <= 1, "Confidence should be between 0 and 1"
# Check that the bounding box has 4 coordinates
assert len(found_object["bbox"]) == 4, "Bounding box should have 4 coordinates"
# Check that the description matches the query
assert found_object["description"] == "car", "Description should match the query"
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_nonexistent_object(self, mcp_server: FastMCP, test_image_path):
"""Tests that the find tool correctly handles searching for objects that don't exist."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "unicorn", # Something unlikely to be in the test image
"confidence": 0.25,
"model_name": "yoloe-11s-seg.pt",
},
)
# Parse the result
find_result = result.structured_content
# Check the structure of the result
assert "image_path" in find_result, "Result should contain image_path"
assert "query" in find_result, "Result should contain query"
assert "found_objects" in find_result, "Result should contain found_objects"
assert "found" in find_result, "Result should contain found flag"
# The found flag should be False if no objects were found
if len(find_result["found_objects"]) == 0:
assert find_result["found"] is False, "found flag should be False when no objects are found"
# The query should match what we searched for
assert find_result["query"] == "unicorn", "Query should match input description"
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_with_mask_geometry(self, mcp_server: FastMCP, test_image_path):
"""Tests the find tool with mask geometry return."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"model_name": "yoloe-11s-seg.pt",
"return_geometry": True,
"geometry_format": "mask",
"confidence": 0.25,
},
)
find_result = result.structured_content
assert find_result["found"]
assert len(find_result["found_objects"]) > 0
found_object = find_result["found_objects"][0]
assert "mask_path" in found_object
assert "polygon" not in found_object
mask_path = found_object["mask_path"]
assert isinstance(mask_path, str)
assert os.path.exists(mask_path)
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_with_polygon_geometry(self, mcp_server: FastMCP, test_image_path):
"""Tests the find tool with polygon geometry return."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"model_name": "yoloe-11s-seg.pt",
"return_geometry": True,
"geometry_format": "polygon",
"confidence": 0.25,
},
)
find_result = result.structured_content
assert find_result["found"]
assert len(find_result["found_objects"]) > 0
found_object = find_result["found_objects"][0]
assert "polygon" in found_object
assert "mask" not in found_object
polygon_data = found_object["polygon"]
assert isinstance(polygon_data, list)
assert len(polygon_data) > 0
assert isinstance(polygon_data[0], list)
assert len(polygon_data[0]) == 2
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_no_geometry_by_default(self, mcp_server: FastMCP, test_image_path):
"""Tests that find tool returns no geometry by default."""
if not os.path.exists(test_image_path):
pytest.skip(f"Test image not found at {test_image_path}")
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"model_name": "yoloe-11s-seg.pt",
"confidence": 0.25,
},
)
find_result = result.structured_content
assert find_result["found"]
assert len(find_result["found_objects"]) > 0
found_object = find_result["found_objects"][0]
assert "mask_path" not in found_object
assert "polygon" not in found_object
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_mask_correctness(self, mcp_server: FastMCP, test_image_path):
"""Tests that returned masks are valid and correctly positioned."""
with Image.open(test_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool("find", {"input_path": test_image_path, "description": "car", "model_name": "yoloe-11s-seg.pt", "return_geometry": True, "geometry_format": "mask", "confidence": 0.25})
find_result = result.structured_content
assert find_result["found"]
for obj in find_result["found_objects"]:
assert "mask_path" in obj
mask_path = obj["mask_path"]
assert os.path.exists(mask_path)
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
assert mask is not None
bbox = obj["bbox"]
x1, y1, x2, y2 = bbox
mask_height, mask_width = mask.shape
assert (
(mask_height == mask_width) or
(mask_height == orig_height and mask_width == orig_width)
), f"Mask dimensions {mask.shape} should be square or match original image"
scale_x = orig_width / mask_width
scale_y = orig_height / mask_height
unique_values = np.unique(mask)
assert len(unique_values) <= 2
assert all(v in [0, 255] for v in unique_values)
assert np.sum(mask) > 0
mask_indices = np.where(mask > 0)
if len(mask_indices[0]) > 0:
min_y, max_y = mask_indices[0].min(), mask_indices[0].max()
min_x, max_x = mask_indices[1].min(), mask_indices[1].max()
scaled_x1 = x1 / scale_x
scaled_x2 = x2 / scale_x
scaled_y1 = y1 / scale_y
scaled_y2 = y2 / scale_y
tolerance = 10
assert min_x >= scaled_x1 - tolerance
assert max_x <= scaled_x2 + tolerance
assert min_y >= scaled_y1 - tolerance
assert max_y <= scaled_y2 + tolerance
mask_area = np.sum(mask > 0)
scaled_bbox_area = ((scaled_x2 - scaled_x1) * (scaled_y2 - scaled_y1))
coverage_ratio = mask_area / scaled_bbox_area if scaled_bbox_area > 0 else 0
assert 0.1 <= coverage_ratio <= 1.5
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_polygon_correctness(self, mcp_server: FastMCP, test_image_path):
"""Tests that returned polygons are valid and correctly positioned."""
# Load the test image to get its dimensions
with Image.open(test_image_path) as img:
img_width, img_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_image_path,
"description": "car",
"model_name": "yoloe-11s-seg.pt",
"return_geometry": True,
"geometry_format": "polygon",
"confidence": 0.25,
},
)
find_result = result.structured_content
assert find_result["found"]
for obj in find_result["found_objects"]:
polygon = obj["polygon"]
bbox = obj["bbox"]
x1, y1, x2, y2 = bbox
# 1. Check polygon has at least 3 points
assert len(polygon) >= 3, "Polygon should have at least 3 points"
# 2. Check all points have exactly 2 coordinates
for point in polygon:
assert len(point) == 2, f"Each polygon point should have 2 coordinates, got {len(point)}"
# 3. Check all coordinates are reasonable
# Note: Polygon coordinates should be in original image space
for x, y in polygon:
# Allow some tolerance outside image bounds
tolerance = 10
assert -tolerance <= x <= img_width + tolerance, (
f"X coordinate {x} should be within image width {img_width} (with tolerance)"
)
assert -tolerance <= y <= img_height + tolerance, (
f"Y coordinate {y} should be within image height {img_height} (with tolerance)"
)
# 4. Check polygon points are within bbox bounds (with tolerance)
tolerance = 10
xs = [p[0] for p in polygon]
ys = [p[1] for p in polygon]
assert min(xs) >= x1 - tolerance, f"Min polygon x {min(xs)} should be >= bbox x1 {x1}"
assert max(xs) <= x2 + tolerance, f"Max polygon x {max(xs)} should be <= bbox x2 {x2}"
assert min(ys) >= y1 - tolerance, f"Min polygon y {min(ys)} should be >= bbox y1 {y1}"
assert max(ys) <= y2 + tolerance, f"Max polygon y {max(ys)} should be <= bbox y2 {y2}"
# 5. Check polygon area is positive (using shoelace formula)
area = 0
n = len(polygon)
for i in range(n):
j = (i + 1) % n
area += polygon[i][0] * polygon[j][1]
area -= polygon[j][0] * polygon[i][1]
area = abs(area) / 2.0
assert area > 0, "Polygon area should be positive"
# 6. Check polygon area relative to bbox area
bbox_area = (x2 - x1) * (y2 - y1)
area_ratio = area / bbox_area
assert 0.1 <= area_ratio <= 1.5, (
f"Polygon area ratio {area_ratio:.2f} should be reasonable relative to bbox"
)
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_mask_to_polygon_consistency(self, mcp_server: FastMCP, test_image_path):
"""Tests that mask and polygon representations are consistent for the same object."""
with Image.open(test_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
mask_result = await client.call_tool("find", {"input_path": test_image_path, "description": "car", "model_name": "yoloe-11s-seg.pt", "return_geometry": True, "geometry_format": "mask", "confidence": 0.5, "return_all_matches": False})
polygon_result = await client.call_tool("find", {"input_path": test_image_path, "description": "car", "model_name": "yoloe-11s-seg.pt", "return_geometry": True, "geometry_format": "polygon", "confidence": 0.5, "return_all_matches": False})
mask_data = mask_result.structured_content
polygon_data = polygon_result.structured_content
if mask_data["found"] and polygon_data["found"]:
mask_obj = mask_data["found_objects"][0]
polygon_obj = polygon_data["found_objects"][0]
mask_path = mask_obj["mask_path"]
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
mask_bbox = mask_obj["bbox"]
polygon_bbox = polygon_obj["bbox"]
bbox_tolerance = 20
for i in range(4):
assert abs(mask_bbox[i] - polygon_bbox[i]) < bbox_tolerance
polygon_points = polygon_obj["polygon"]
mask_height, mask_width = mask.shape
img = Image.new('L', (mask_width, mask_height), 0)
scale_x = mask_width / orig_width
scale_y = mask_height / orig_height
scaled_polygon = [(p[0] * scale_x, p[1] * scale_y) for p in polygon_points]
ImageDraw.Draw(img).polygon(scaled_polygon, outline=1, fill=1)
polygon_mask = np.array(img)
mask_bool = mask > 0
polygon_mask_bool = polygon_mask > 0
intersection = np.logical_and(mask_bool, polygon_mask_bool).sum()
union = np.logical_or(mask_bool, polygon_mask_bool).sum()
iou = intersection / union if union > 0 else 0
assert iou > 0.5
@pytest.mark.asyncio
@pytest.mark.skipif(
os.environ.get("SKIP_YOLO_TESTS") == "1",
reason="Skipping YOLO tests to avoid downloading models in CI",
)
async def test_find_mask_validation_on_simple_image(
self, mcp_server: FastMCP, test_segmentation_image_path
):
"""
Tests that generated masks from the find tool are valid using a simple image.
It checks for binarity and bounding box confinement for every generated mask.
"""
with Image.open(test_segmentation_image_path) as img:
orig_width, orig_height = img.size
async with Client(mcp_server) as client:
result = await client.call_tool(
"find",
{
"input_path": test_segmentation_image_path,
"description": "dog",
"model_name": "yoloe-11s-seg.pt",
"return_geometry": True,
"geometry_format": "mask",
"confidence": 0.3,
"return_all_matches": True,
},
)
find_result = result.structured_content
assert find_result is not None
assert find_result["found"], "Should have found a dog in the image"
assert len(find_result["found_objects"]) >= 1, "Should have found at least one dog"
# Validate every mask that was generated
for found_object in find_result["found_objects"]:
assert "mask_path" in found_object, "Each found object should have a mask_path"
mask_path = found_object["mask_path"]
assert os.path.exists(mask_path), f"Mask file should exist at {mask_path}"
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
assert mask is not None, f"Mask file {mask_path} could not be read"
# 1. Check for binarity (only 0 and 255 values)
unique_values = np.unique(mask)
assert all(v in [0, 255] for v in unique_values), (
f"Mask {mask_path} is not binary. Found values: {unique_values}"
)
assert np.sum(mask) > 0, f"Mask {mask_path} should not be empty"
# 2. Check for bounding box confinement
bbox = found_object["bbox"]
x1, y1, x2, y2 = bbox
mask_height, mask_width = mask.shape
scale_x = orig_width / mask_width
scale_y = orig_height / mask_height
mask_indices = np.where(mask > 0)
if len(mask_indices[0]) > 0:
min_mask_y, max_mask_y = mask_indices[0].min(), mask_indices[0].max()
min_mask_x, max_mask_x = mask_indices[1].min(), mask_indices[1].max()
scaled_x1 = x1 / scale_x
scaled_y1 = y1 / scale_y
scaled_x2 = x2 / scale_x
scaled_y2 = y2 / scale_y
tolerance = 10
assert min_mask_x >= scaled_x1 - tolerance, f"Mask content of {mask_path} extends past the left of its bbox"
assert max_mask_x <= scaled_x2 + tolerance, f"Mask content of {mask_path} extends past the right of its bbox"
assert min_mask_y >= scaled_y1 - tolerance, f"Mask content of {mask_path} extends past the top of its bbox"
assert max_mask_y <= scaled_y2 + tolerance, f"Mask content of {mask_path} extends past the bottom of its bbox"
================================================
FILE: tests/tools/test_metainfo.py
================================================
import pytest
from fastmcp import Client, FastMCP
from PIL import Image
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for metadata extraction."""
img_path = tmp_path / "test_image.png"
img = Image.new("RGB", (200, 200), color="white")
# Draw some colored areas
for x in range(50, 100):
for y in range(50, 100):
img.putpixel((x, y), (255, 0, 0)) # Red square
img.save(img_path)
return str(img_path)
class TestMetainfoToolDefinition:
"""Tests for the get_metainfo tool definition and metadata."""
@pytest.mark.asyncio
async def test_metainfo_in_tools_list(self, mcp_server: FastMCP):
"""Tests that get_metainfo tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if get_metainfo is in the list of tools
tool_names = [tool.name for tool in tools]
assert "get_metainfo" in tool_names, (
"get_metainfo tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_metainfo_description(self, mcp_server: FastMCP):
"""Tests that get_metainfo tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
metainfo_tool = next(
(tool for tool in tools if tool.name == "get_metainfo"), None
)
# Check description
assert metainfo_tool.description, (
"get_metainfo tool should have a description"
)
assert "metadata" in metainfo_tool.description.lower(), (
"Description should mention metadata"
)
@pytest.mark.asyncio
async def test_metainfo_parameters(self, mcp_server: FastMCP):
"""Tests that get_metainfo tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
metainfo_tool = next(
(tool for tool in tools if tool.name == "get_metainfo"), None
)
# Check input schema
assert hasattr(metainfo_tool, "inputSchema"), (
"get_metainfo tool should have an inputSchema"
)
assert "properties" in metainfo_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path"]
for param in required_params:
assert param in metainfo_tool.inputSchema["properties"], (
f"get_metainfo tool should have a '{param}' property "
f"in its inputSchema"
)
# Check parameter types
assert (
metainfo_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
class TestMetainfoToolExecution:
"""Tests for the get_metainfo tool execution and results."""
@pytest.mark.asyncio
async def test_metainfo_tool_execution(self, mcp_server: FastMCP, test_image_path):
"""Tests the get_metainfo tool execution and return value."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"get_metainfo", {"input_path": test_image_path}
)
# Check that the tool returned a result
# Parse the JSON string from the text attribute
metadata = result.data
# Verify the metadata contains expected fields
assert "filename" in metadata
assert "size_bytes" in metadata
assert "dimensions" in metadata
assert "format" in metadata
assert "color_mode" in metadata
assert "created_at" in metadata
assert "modified_at" in metadata
# Verify the dimensions are correct
assert metadata["dimensions"]["width"] == 200
assert metadata["dimensions"]["height"] == 200
assert metadata["dimensions"]["aspect_ratio"] == 1.0
# Verify the format is correct
assert metadata["format"] == "PNG"
# Verify the color mode is correct
assert metadata["color_mode"] == "RGB"
@pytest.mark.asyncio
async def test_metainfo_nonexistent_file(self, mcp_server: FastMCP, tmp_path):
"""Tests the get_metainfo tool with a nonexistent file."""
nonexistent_path = str(tmp_path / "nonexistent.png")
async with Client(mcp_server) as client:
with pytest.raises(Exception) as excinfo:
await client.call_tool("get_metainfo", {"input_path": nonexistent_path})
# The error message structure is different with FastMCP - it wraps the original error
# Just check that we got an error (any kind of exception is acceptable)
assert isinstance(excinfo.value, Exception)
================================================
FILE: tests/tools/test_ocr.py
================================================
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
# Add this line to filter out the PyTorch warnings
pytestmark = pytest.mark.filterwarnings("ignore:.*'pin_memory' argument is set as true but no accelerator is found.*:UserWarning")
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image with text for OCR."""
img_path = tmp_path / "test_ocr_image.png"
# Create a white image
img = np.ones((300, 600, 3), dtype=np.uint8) * 255
# Add text to the image
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img, "Hello World", (50, 50), font, 1, (0, 0, 0), 2)
cv2.putText(img, "OCR Test", (50, 150), font, 2, (0, 0, 0), 3)
cv2.putText(img, "12345", (50, 250), font, 1.5, (0, 0, 0), 2)
# Save the image
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestOcrToolDefinition:
"""Tests for the OCR tool definition and metadata."""
@pytest.mark.asyncio
async def test_ocr_in_tools_list(self, mcp_server: FastMCP):
"""Tests that OCR tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if OCR is in the list of tools
tool_names = [tool.name for tool in tools]
assert "ocr" in tool_names, (
"OCR tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_ocr_description(self, mcp_server: FastMCP):
"""Tests that OCR tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
ocr_tool = next((tool for tool in tools if tool.name == "ocr"), None)
# Check description
assert ocr_tool.description, "OCR tool should have a description"
assert "ocr" in ocr_tool.description.lower() or "optical character recognition" in ocr_tool.description.lower(), (
"Description should mention that it performs OCR on an image"
)
@pytest.mark.asyncio
async def test_ocr_parameters(self, mcp_server: FastMCP):
"""Tests that OCR tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
ocr_tool = next((tool for tool in tools if tool.name == "ocr"), None)
# Check input schema
assert hasattr(ocr_tool, "inputSchema"), (
"OCR tool should have an inputSchema"
)
assert "properties" in ocr_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path"]
for param in required_params:
assert param in ocr_tool.inputSchema["properties"], (
f"OCR tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
optional_params = ["language"]
for param in optional_params:
assert param in ocr_tool.inputSchema["properties"], (
f"OCR tool should have a '{param}' property in its inputSchema"
)
# Check parameter types
assert (
ocr_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
# Check optional parameter (now has anyOf structure with null)
language_schema = ocr_tool.inputSchema["properties"]["language"]
assert "anyOf" in language_schema, "language should have anyOf structure for optional parameter"
assert any(item.get("type") == "string" for item in language_schema["anyOf"]), "language should allow string type"
assert any(item.get("type") == "null" for item in language_schema["anyOf"]), "language should allow null type"
class TestOcrToolExecution:
"""Tests for the OCR tool execution and results."""
@pytest.mark.asyncio
async def test_ocr_tool_execution(self, mcp_server: FastMCP, test_image_path):
"""Tests the OCR tool execution and return value."""
try:
import easyocr # noqa: F401
except ImportError:
pytest.skip("EasyOCR is not installed")
async with Client(mcp_server) as client:
result = await client.call_tool(
"ocr",
{
"input_path": test_image_path,
"language": "en",
},
)
# Check that the tool returned a result
# Parse the result
ocr_result = result.structured_content
# Basic structure checks
assert "image_path" in ocr_result
assert "text_segments" in ocr_result
assert ocr_result["image_path"] == test_image_path
assert isinstance(ocr_result["text_segments"], list)
# Check that we have at least some text segments
assert len(ocr_result["text_segments"]) > 0, (
"No text segments detected in the test image"
)
# Check the structure of a text segment
segment = ocr_result["text_segments"][0]
assert "text" in segment, "Text segment should have text content"
assert "confidence" in segment, "Text segment should have a confidence score"
assert "bbox" in segment, "Text segment should have a bounding box"
# Check that the confidence is within expected range
assert 0 <= segment["confidence"] <= 1, (
"Confidence should be between 0 and 1"
)
# Check that the bounding box has 4 coordinates
assert len(segment["bbox"]) == 4, "Bounding box should have 4 coordinates"
# Check for expected text in the image
# We expect at least one of these texts to be detected
expected_texts = ["Hello World", "OCR Test", "12345"]
detected_texts = [segment["text"] for segment in ocr_result["text_segments"]]
# Check if any of our expected texts are detected (allowing for partial matches)
matches_found = False
for expected in expected_texts:
for detected in detected_texts:
if expected.lower() in detected.lower() or detected.lower() in expected.lower():
matches_found = True
break
if matches_found:
break
assert matches_found, (
f"None of the expected texts {expected_texts} were detected. "
f"Detected texts: {detected_texts}"
)
================================================
FILE: tests/tools/test_overlay.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def base_image_path(tmp_path):
"""Create a base test image."""
img_path = tmp_path / "base_image.png"
# Create a blue image
img = np.full((300, 400, 3), (255, 0, 0), dtype=np.uint8)
cv2.imwrite(str(img_path), img)
return str(img_path)
@pytest.fixture
def overlay_image_path_rgb(tmp_path):
"""Create an RGB overlay image (no alpha)."""
img_path = tmp_path / "overlay_rgb.png"
# Create a green square
img = np.full((100, 100, 3), (0, 255, 0), dtype=np.uint8)
cv2.imwrite(str(img_path), img)
return str(img_path)
@pytest.fixture
def overlay_image_path_rgba(tmp_path):
"""Create an RGBA overlay image with transparency."""
img_path = tmp_path / "overlay_rgba.png"
# Create a semi-transparent red circle on a transparent background
img = np.zeros((100, 100, 4), dtype=np.uint8)
cv2.circle(img, (50, 50), 40, (0, 0, 255, 128), -1) # B,G,R,A (semi-transparent red)
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestOverlayToolDefinition:
"""Tests for the overlay tool definition and metadata."""
@pytest.mark.asyncio
async def test_overlay_in_tools_list(self, mcp_server: FastMCP):
"""Tests that overlay tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
assert tools, "Tools list should not be empty"
tool_names = [tool.name for tool in tools]
assert "overlay" in tool_names, "overlay tool should be in the list of available tools"
@pytest.mark.asyncio
async def test_overlay_parameters(self, mcp_server: FastMCP):
"""Tests that overlay tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
overlay_tool = next((tool for tool in tools if tool.name == "overlay"), None)
assert overlay_tool is not None
props = overlay_tool.inputSchema["properties"]
required = overlay_tool.inputSchema["required"]
assert "base_image_path" in props and props["base_image_path"]["type"] == "string"
assert "overlay_image_path" in props and props["overlay_image_path"]["type"] == "string"
assert "x" in props and props["x"]["type"] == "integer"
assert "y" in props and props["y"]["type"] == "integer"
assert "output_path" in props
assert "base_image_path" in required
assert "overlay_image_path" in required
assert "x" in required
assert "y" in required
class TestOverlayToolExecution:
"""Tests for the overlay tool execution and results."""
@pytest.mark.asyncio
async def test_overlay_rgb(self, mcp_server: FastMCP, base_image_path, overlay_image_path_rgb, tmp_path):
"""Tests overlaying an RGB image (no alpha)."""
output_path = str(tmp_path / "output_rgb.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"overlay",
{
"base_image_path": base_image_path,
"overlay_image_path": overlay_image_path_rgb,
"x": 50,
"y": 50,
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel inside the overlay area, it should be green
assert np.array_equal(img[100, 100], [0, 255, 0])
# Check a pixel outside the overlay area, it should be blue
assert np.array_equal(img[200, 200], [255, 0, 0])
@pytest.mark.asyncio
async def test_overlay_rgba(self, mcp_server: FastMCP, base_image_path, overlay_image_path_rgba, tmp_path):
"""Tests overlaying an RGBA image with transparency."""
output_path = str(tmp_path / "output_rgba.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"overlay",
{
"base_image_path": base_image_path,
"overlay_image_path": overlay_image_path_rgba,
"x": 50,
"y": 50,
"output_path": output_path,
},
)
assert result.data == output_path
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel inside the overlay area, it should be purple
assert np.allclose(img[100, 100], [128, 0, 128], atol=2)
# Check a pixel outside the overlay area, it should be blue
assert np.array_equal(img[55, 55], [255, 0, 0])
@pytest.mark.asyncio
async def test_overlay_partial_offscreen(self, mcp_server: FastMCP, base_image_path, overlay_image_path_rgb, tmp_path):
"""Tests overlaying an image partially offscreen."""
output_path = str(tmp_path / "output_partial.png")
async with Client(mcp_server) as client:
await client.call_tool(
"overlay",
{
"base_image_path": base_image_path,
"overlay_image_path": overlay_image_path_rgb,
"x": 350,
"y": 250,
"output_path": output_path,
},
)
assert os.path.exists(output_path)
img = cv2.imread(output_path)
# Check a pixel inside the overlay area, it should be green
assert np.array_equal(img[299, 399], [0, 255, 0])
# Check a pixel outside the overlay area, it should be blue
assert np.array_equal(img[299, 349], [255, 0, 0])
@pytest.mark.asyncio
async def test_overlay_default_output_path(self, mcp_server: FastMCP, base_image_path, overlay_image_path_rgb):
"""Tests the overlay tool with a default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"overlay",
{
"base_image_path": base_image_path,
"overlay_image_path": overlay_image_path_rgb,
"x": 0,
"y": 0,
},
)
expected_output = base_image_path.replace(".png", "_overlaid.png")
assert result.data == expected_output
assert os.path.exists(expected_output)
================================================
FILE: tests/tools/test_resize.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for resizing."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((200, 300, 3), dtype=np.uint8) * 255
# Draw some colored areas to verify resizing
# Red square (50,50) to (100,100)
img[50:100, 50:100] = [0, 0, 255] # OpenCV uses BGR
# Blue square (100,100) to (150,150)
img[100:150, 100:150] = [255, 0, 0] # OpenCV uses BGR
# Draw a green circle in the center with thickness 5px and diameter 100px
center = (150, 100) # x, y coordinates (center of the image)
radius = 50 # 100px diameter
color = (0, 255, 0) # Green in BGR
thickness = 5
cv2.circle(img, center, radius, color, thickness)
# Add text "TEST" in the center
font = cv2.FONT_HERSHEY_SIMPLEX
font_scale = 1
text_color = (0, 0, 0) # Black in BGR
text_thickness = 2
text = "TEST"
# Get text size to center it properly
text_size = cv2.getTextSize(text, font, font_scale, text_thickness)[0]
text_x = int(center[0] - text_size[0] / 2)
text_y = int(center[1] + text_size[1] / 2)
cv2.putText(
img, text, (text_x, text_y), font, font_scale, text_color, text_thickness
)
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestResizeToolDefinition:
"""Tests for the resize tool definition and metadata."""
@pytest.mark.asyncio
async def test_resize_in_tools_list(self, mcp_server: FastMCP):
"""Tests that resize tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if resize is in the list of tools
tool_names = [tool.name for tool in tools]
assert "resize" in tool_names, (
"resize tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_resize_description(self, mcp_server: FastMCP):
"""Tests that resize tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
resize_tool = next((tool for tool in tools if tool.name == "resize"), None)
# Check description
assert resize_tool.description, "resize tool should have a description"
assert "resize" in resize_tool.description.lower(), (
"Description should mention that it resizes an image"
)
@pytest.mark.asyncio
async def test_resize_parameters(self, mcp_server: FastMCP):
"""Tests that resize tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
resize_tool = next((tool for tool in tools if tool.name == "resize"), None)
# Check input schema
assert hasattr(resize_tool, "inputSchema"), (
"resize tool should have an inputSchema"
)
assert "properties" in resize_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
assert "input_path" in resize_tool.inputSchema["properties"], (
"resize tool should have an 'input_path' property in its inputSchema"
)
# Check optional parameters
optional_params = [
"width",
"height",
"scale_factor",
"interpolation",
"output_path",
]
for param in optional_params:
assert param in resize_tool.inputSchema["properties"], (
f"resize tool should have a '{param}' property in its inputSchema"
)
# Check parameter types - accounting for optional parameters
# that use anyOf structure
assert (
resize_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
# For optional integer parameters, check if they have the correct type
# in anyOf structure
for param in ["width", "height"]:
param_schema = resize_tool.inputSchema["properties"][param]
if "anyOf" in param_schema:
# Check if one of the anyOf options is integer
has_integer_type = any(
option.get("type") == "integer"
for option in param_schema["anyOf"]
)
assert has_integer_type, f"{param} should allow integer type"
else:
assert param_schema.get("type") == "integer", (
f"{param} should be of type integer"
)
# For scale_factor (float parameter)
scale_factor_schema = resize_tool.inputSchema["properties"][
"scale_factor"
]
if "anyOf" in scale_factor_schema:
# Check if one of the anyOf options is number
has_number_type = any(
option.get("type") == "number"
for option in scale_factor_schema["anyOf"]
)
assert has_number_type, "scale_factor should allow number type"
else:
assert scale_factor_schema.get("type") == "number", (
"scale_factor should be of type number"
)
# For string parameters
for param in ["interpolation", "output_path"]:
param_schema = resize_tool.inputSchema["properties"][param]
if "anyOf" in param_schema:
# Check if one of the anyOf options is string
has_string_type = any(
option.get("type") == "string"
for option in param_schema["anyOf"]
)
assert has_string_type, f"{param} should allow string type"
else:
assert param_schema.get("type") == "string", (
f"{param} should be of type string"
)
class TestResizeToolExecution:
"""Tests for the resize tool execution and results."""
@pytest.mark.asyncio
async def test_resize_with_dimensions_smaller(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with specific dimensions (smaller)."""
output_path = str(tmp_path / "output_dimensions_smaller.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 150,
"height": 100,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (100, 150) # height, width
@pytest.mark.asyncio
async def test_resize_with_dimensions_larger(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with specific dimensions (larger)."""
output_path = str(tmp_path / "output_dimensions_larger.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 600,
"height": 400,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[:2] == (400, 600) # height, width
@pytest.mark.asyncio
async def test_resize_with_width_only_smaller(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with only width specified
(smaller, preserving aspect ratio)."""
output_path = str(tmp_path / "output_width_only_smaller.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 150,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[1] == 150 # width
# Height should be proportional (original: 200x300, new width: 150)
# So new height should be 200 * (150/300) = 100
assert img.shape[0] == 100 # height
@pytest.mark.asyncio
async def test_resize_with_width_only_larger(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with only width specified
(larger, preserving aspect ratio)."""
output_path = str(tmp_path / "output_width_only_larger.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 600,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[1] == 600 # width
# Height should be proportional (original: 200x300, new width: 600)
# So new height should be 200 * (600/300) = 400
assert img.shape[0] == 400 # height
@pytest.mark.asyncio
async def test_resize_with_height_only_smaller(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with only height specified
(smaller, preserving aspect ratio)."""
output_path = str(tmp_path / "output_height_only_smaller.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"height": 100,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[0] == 100 # height
# Width should be proportional (original: 200x300, new height: 100)
# So new width should be 300 * (100/200) = 150
assert img.shape[1] == 150 # width
@pytest.mark.asyncio
async def test_resize_with_height_only_larger(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with only height specified
(larger, preserving aspect ratio)."""
output_path = str(tmp_path / "output_height_only_larger.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"height": 400,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
assert img.shape[0] == 400 # height
# Width should be proportional (original: 200x300, new height: 400)
# So new width should be 300 * (400/200) = 600
assert img.shape[1] == 600 # width
@pytest.mark.asyncio
async def test_resize_with_scale_factor_smaller(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with scale factor (smaller)."""
output_path = str(tmp_path / "output_scale_smaller.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"scale_factor": 0.5,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
# Original: 200x300, scale: 0.5, so new dimensions should be 100x150
assert img.shape[:2] == (100, 150) # height, width
@pytest.mark.asyncio
async def test_resize_with_scale_factor_larger(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool execution with scale factor (larger)."""
output_path = str(tmp_path / "output_scale_larger.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"scale_factor": 2.0,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the resized image dimensions
img = cv2.imread(output_path)
# Original: 200x300, scale: 2.0, so new dimensions should be 400x600
assert img.shape[:2] == (400, 600) # height, width
@pytest.mark.asyncio
async def test_resize_default_output_path(
self, mcp_server: FastMCP, test_image_path
):
"""Tests the resize tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize", {"input_path": test_image_path, "width": 150, "height": 100}
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_resized.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
# Verify the resized image dimensions
img = cv2.imread(expected_output)
assert img.shape[:2] == (100, 150) # height, width
@pytest.mark.asyncio
async def test_resize_with_interpolation(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the resize tool with different interpolation methods."""
interpolation_methods = ["nearest", "linear", "area", "cubic", "lanczos"]
for method in interpolation_methods:
# Test downscaling
output_path_smaller = str(tmp_path / f"output_{method}_smaller.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 150,
"height": 100,
"interpolation": method,
"output_path": output_path_smaller,
},
)
# Check that the tool returned a result
assert result.data == output_path_smaller
# Verify the file exists
assert os.path.exists(output_path_smaller)
# Verify the resized image dimensions
img = cv2.imread(output_path_smaller)
assert img.shape[:2] == (100, 150) # height, width
# Test upscaling
output_path_larger = str(tmp_path / f"output_{method}_larger.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"resize",
{
"input_path": test_image_path,
"width": 600,
"height": 400,
"interpolation": method,
"output_path": output_path_larger,
},
)
# Check that the tool returned a result
assert result.data == output_path_larger
# Verify the file exists
assert os.path.exists(output_path_larger)
# Verify the resized image dimensions
img = cv2.imread(output_path_larger)
assert img.shape[:2] == (400, 600) # height, width
================================================
FILE: tests/tools/test_rotate.py
================================================
import os
import cv2
import numpy as np
import pytest
from fastmcp import Client, FastMCP
from imagesorcery_mcp.server import mcp as image_sorcery_mcp_server
@pytest.fixture
def mcp_server():
# Use the existing server instance
return image_sorcery_mcp_server
@pytest.fixture
def test_image_path(tmp_path):
"""Create a test image for rotation."""
img_path = tmp_path / "test_image.png"
# Create a white image
img = np.ones((100, 200, 3), dtype=np.uint8) * 255
# Draw a red rectangle in the top-left corner to verify rotation
# Red rectangle from (10,10) to (40,40)
img[10:40, 10:40] = [0, 0, 255] # OpenCV uses BGR
cv2.imwrite(str(img_path), img)
return str(img_path)
class TestRotateToolDefinition:
"""Tests for the rotate tool definition and metadata."""
@pytest.mark.asyncio
async def test_rotate_in_tools_list(self, mcp_server: FastMCP):
"""Tests that rotate tool is in the list of available tools."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
# Verify that tools list is not empty
assert tools, "Tools list should not be empty"
# Check if rotate is in the list of tools
tool_names = [tool.name for tool in tools]
assert "rotate" in tool_names, (
"rotate tool should be in the list of available tools"
)
@pytest.mark.asyncio
async def test_rotate_description(self, mcp_server: FastMCP):
"""Tests that rotate tool has the correct description."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
rotate_tool = next((tool for tool in tools if tool.name == "rotate"), None)
# Check description
assert rotate_tool.description, "rotate tool should have a description"
assert "rotate" in rotate_tool.description.lower(), (
"Description should mention that it rotates an image"
)
@pytest.mark.asyncio
async def test_rotate_parameters(self, mcp_server: FastMCP):
"""Tests that rotate tool has the correct parameter structure."""
async with Client(mcp_server) as client:
tools = await client.list_tools()
rotate_tool = next((tool for tool in tools if tool.name == "rotate"), None)
# Check input schema
assert hasattr(rotate_tool, "inputSchema"), (
"rotate tool should have an inputSchema"
)
assert "properties" in rotate_tool.inputSchema, (
"inputSchema should have properties field"
)
# Check required parameters
required_params = ["input_path", "angle"]
for param in required_params:
assert param in rotate_tool.inputSchema["properties"], (
f"rotate tool should have a '{param}' property in its inputSchema"
)
# Check optional parameters
assert "output_path" in rotate_tool.inputSchema["properties"], (
"rotate tool should have an 'output_path' property in its inputSchema"
)
# Check parameter types
assert (
rotate_tool.inputSchema["properties"]["input_path"].get("type")
== "string"
), "input_path should be of type string"
assert rotate_tool.inputSchema["properties"]["angle"].get("type") in [
"number",
"integer",
"float",
], "angle should be a numeric type"
assert (
rotate_tool.inputSchema["properties"]["output_path"].get("type")
== "string"
), "output_path should be of type string"
class TestRotateToolExecution:
"""Tests for the rotate tool execution and results."""
@pytest.mark.asyncio
async def test_rotate_tool_execution(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the rotate tool execution and return value."""
output_path = str(tmp_path / "output.png")
async with Client(mcp_server) as client:
result = await client.call_tool(
"rotate",
{
"input_path": test_image_path,
"angle": 90,
"output_path": output_path,
},
)
# Check that the tool returned a result
assert result.data == output_path
# Verify the file exists
assert os.path.exists(output_path)
# Verify the rotated image dimensions
original_img = cv2.imread(test_image_path)
rotated_img = cv2.imread(output_path)
# For a 90-degree rotation, width and height should be approximately swapped
# Due to the rotate_bound function, dimensions might be slightly larger
# to fit the entire rotated image
# We check that the original width is close to the rotated height
# and vice versa
original_height, original_width = original_img.shape[:2]
rotated_height, rotated_width = rotated_img.shape[:2]
# Allow for a small margin of error due to padding in rotate_bound
margin = 5
assert abs(original_width - rotated_height) <= margin, (
"Original width should approximately match rotated height "
"for 90-degree rotation"
)
assert abs(original_height - rotated_width) <= margin, (
"Original height should approximately match rotated width "
"for 90-degree rotation"
)
# Verify the rotation by checking the position of the red rectangle
# In the original image, the red rectangle is in the top-left corner
# (10,10) to (40,40)
# After 90-degree counterclockwise rotation, it should be in the
# top-right area
# Check if the top-right area has red pixels (BGR format)
# For 90-degree counterclockwise rotation, the red rectangle should move
# from top-left to top-right
# We need to check the appropriate coordinates in the rotated image
# The exact coordinates depend on how rotate_bound handles the rotation
# and padding
# For a 90-degree counterclockwise rotation of a 100x200 image with a
# red rectangle at (10,10)-(40,40),
# the red rectangle should be approximately in the top-right area
# Check if there are red pixels in the expected area after rotation
# For 90-degree counterclockwise rotation, the top-left (10,10) would move
# to approximately (10, rotated_width-40)
has_red_pixels = False
for y in range(10, 40):
for x in range(rotated_width - 40, rotated_width - 10):
if x >= 0 and x < rotated_width and y >= 0 and y < rotated_height:
# Check if pixel is red (BGR format: [0,0,255])
pixel = rotated_img[y, x]
if (
pixel[0] < 50 and pixel[1] < 50 and pixel[2] > 200
): # Allow for some color variation
has_red_pixels = True
break
if has_red_pixels:
break
assert has_red_pixels, (
"Red rectangle should be in the top-right area after "
"90-degree counterclockwise rotation"
)
@pytest.mark.asyncio
async def test_rotate_clockwise(
self, mcp_server: FastMCP, test_image_path, tmp_path
):
"""Tests the rotate tool with clockwise rotation (-90 degrees)."""
output_path = str(tmp_path / "output_clockwise.png")
async with Client(mcp_server) as client:
await client.call_tool(
"rotate",
{
"input_path": test_image_path,
"angle": -90, # Negative angle for clockwise rotation
"output_path": output_path,
},
)
# Verify the file exists
assert os.path.exists(output_path)
# Load the rotated image
rotated_img = cv2.imread(output_path)
rotated_height, rotated_width = rotated_img.shape[:2]
# For -90-degree (clockwise) rotation, the red rectangle should move
# from top-left to bottom-left
# Check if there are red pixels in the expected area after rotation
has_red_pixels = False
for y in range(rotated_height - 40, rotated_height - 10):
for x in range(10, 40):
if x >= 0 and x < rotated_width and y >= 0 and y < rotated_height:
# Check if pixel is red (BGR format: [0,0,255])
pixel = rotated_img[y, x]
if (
pixel[0] < 50 and pixel[1] < 50 and pixel[2] > 200
): # Allow for some color variation
has_red_pixels = True
break
if has_red_pixels:
break
assert has_red_pixels, (
"Red rectangle should be in the bottom-left area after "
"90-degree clockwise rotation"
)
@pytest.mark.asyncio
async def test_rotate_default_output_path(
self, mcp_server: FastMCP, test_image_path
):
"""Tests the rotate tool with default output path."""
async with Client(mcp_server) as client:
result = await client.call_tool(
"rotate", {"input_path": test_image_path, "angle": 45}
)
# Check that the tool returned a result
expected_output = test_image_path.replace(".png", "_rotated.png")
assert result.data == expected_output
# Verify the file exists
assert os.path.exists(expected_output)
assert os.path.exists(expected_output)